diff --git a/external/gpl2/gawk/README b/external/gpl2/gawk/README
new file mode 100644
index 000000000000..70ab9c917db2
--- /dev/null
+++ b/external/gpl2/gawk/README
@@ -0,0 +1,5 @@
+$NetBSD: README,v 1.1 2010/12/13 06:21:53 mrg Exp $
+
+The awk.texi in this directory comes from GNU awk 3.1.3, but we build
+it as part of NetBSD with nawk.  We keep it here so that it is in
+the right license subdirectory.
diff --git a/external/gpl2/gawk/dist/awk.texi b/external/gpl2/gawk/dist/awk.texi
new file mode 100644
index 000000000000..5762651c2897
--- /dev/null
+++ b/external/gpl2/gawk/dist/awk.texi
@@ -0,0 +1,28235 @@
+\input texinfo   @c -*-texinfo-*-
+@c $NetBSD: awk.texi,v 1.1 2010/12/13 06:21:53 mrg Exp $
+@c %**start of header (This is for running Texinfo on a region.)
+@setfilename awk.info
+@settitle The GNU Awk User's Guide
+@c %**end of header (This is for running Texinfo on a region.)
+
+@dircategory Text creation and manipulation
+@direntry
+* Gawk: (awk).                 A text scanning and processing language.
+@end direntry
+@dircategory Individual utilities
+@direntry
+* awk: (awk)Invoking gawk.                     Text scanning and processing.
+@end direntry
+
+@set xref-automatic-section-title
+
+@c The following information should be updated here only!
+@c This sets the edition of the document, the version of gawk it
+@c applies to and all the info about who's publishing this edition
+
+@c These apply across the board.
+@set UPDATE-MONTH June, 2003
+@set VERSION 3.1
+@set PATCHLEVEL 3
+
+@set FSF
+
+@set TITLE GAWK: Effective AWK Programming
+@set SUBTITLE A User's Guide for GNU Awk
+@set EDITION 3
+
+@iftex
+@set DOCUMENT book
+@set CHAPTER chapter
+@set APPENDIX appendix
+@set SECTION section
+@set SUBSECTION subsection
+@set DARKCORNER @inmargin{@image{lflashlight,1cm}, @image{rflashlight,1cm}}
+@end iftex
+@ifinfo
+@set DOCUMENT Info file
+@set CHAPTER major node
+@set APPENDIX major node
+@set SECTION minor node
+@set SUBSECTION node
+@set DARKCORNER (d.c.)
+@end ifinfo
+@ifhtml
+@set DOCUMENT Web page
+@set CHAPTER chapter
+@set APPENDIX appendix
+@set SECTION section
+@set SUBSECTION subsection
+@set DARKCORNER (d.c.)
+@end ifhtml
+@ifxml
+@set DOCUMENT book
+@set CHAPTER chapter
+@set APPENDIX appendix
+@set SECTION section
+@set SUBSECTION subsection
+@set DARKCORNER (d.c.)
+@end ifxml
+
+@c some special symbols
+@iftex
+@set LEQ @math{@leq}
+@end iftex
+@ifnottex
+@set LEQ <=
+@end ifnottex
+
+@set FN file name
+@set FFN File Name
+@set DF data file
+@set DDF Data File
+@set PVERSION version
+@set CTL Ctrl
+
+@ignore
+Some comments on the layout for TeX.
+1. Use at least texinfo.tex 2000-09-06.09
+2. I have done A LOT of work to make this look good. There are `@page' commands
+   and use of `@group ... @end group' in a number of places. If you muck
+   with anything, it's your responsibility not to break the layout.
+@end ignore
+
+@c merge the function and variable indexes into the concept index
+@ifinfo
+@synindex fn cp
+@synindex vr cp
+@end ifinfo
+@iftex
+@syncodeindex fn cp
+@syncodeindex vr cp
+@end iftex
+@ifxml
+@syncodeindex fn cp
+@syncodeindex vr cp
+@end ifxml
+
+@c If "finalout" is commented out, the printed output will show
+@c black boxes that mark lines that are too long.  Thus, it is
+@c unwise to comment it out when running a master in case there are
+@c overfulls which are deemed okay.
+
+@iftex
+@finalout
+@end iftex
+
+@copying
+Copyright @copyright{} 1989, 1991, 1992, 1993, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
+@sp 2
+
+This is Edition @value{EDITION} of @cite{@value{TITLE}: @value{SUBTITLE}},
+for the @value{VERSION}.@value{PATCHLEVEL} (or later) version of the GNU
+implementation of AWK.
+
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with the
+Invariant Sections being ``GNU General Public License'', the Front-Cover
+texts being (a) (see below), and with the Back-Cover Texts being (b)
+(see below).  A copy of the license is included in the section entitled
+``GNU Free Documentation License''.
+
+@enumerate a
+@item
+``A GNU Manual''
+
+@item
+``You have freedom to copy and modify this GNU Manual, like GNU
+software.  Copies published by the Free Software Foundation raise
+funds for GNU development.''
+@end enumerate
+@end copying
+
+@c Comment out the "smallbook" for technical review.  Saves
+@c considerable paper.  Remember to turn it back on *before*
+@c starting the page-breaking work.
+
+@c 4/2002: Karl Berry recommends commenting out this and the
+@c `@setchapternewpage odd', and letting users use `texi2dvi -t'
+@c if they want to waste paper.
+@c @smallbook
+
+
+@c Uncomment this for the release.  Leaving it off saves paper
+@c during editing and review.
+@c @setchapternewpage odd
+
+@titlepage
+@title @value{TITLE}
+@subtitle @value{SUBTITLE}
+@subtitle Edition @value{EDITION}
+@subtitle @value{UPDATE-MONTH}
+@author Arnold D. Robbins
+
+@c Include the Distribution inside the titlepage environment so
+@c that headings are turned off.  Headings on and off do not work.
+
+@page
+@vskip 0pt plus 1filll
+@ignore
+The programs and applications presented in this book have been
+included for their instructional value.  They have been tested with care
+but are not guaranteed for any particular purpose.  The publisher does not
+offer any warranties or representations, nor does it accept any
+liabilities with respect to the programs or applications.
+So there.
+@sp 2
+UNIX is a registered trademark of The Open Group in the United States and other countries. @*
+Microsoft, MS and MS-DOS are registered trademarks, and Windows is a
+trademark of Microsoft Corporation in the United States and other
+countries. @*
+Atari, 520ST, 1040ST, TT, STE, Mega and Falcon are registered trademarks
+or trademarks of Atari Corporation. @*
+DEC, Digital, OpenVMS, ULTRIX and VMS are trademarks of Digital Equipment
+Corporation. @*
+@end ignore
+``To boldly go where no man has gone before'' is a
+Registered Trademark of Paramount Pictures Corporation. @*
+@c sorry, i couldn't resist
+@sp 3
+Published by:
+@sp 1
+
+Free Software Foundation @*
+59 Temple Place --- Suite 330 @*
+Boston, MA  02111-1307 USA @*
+Phone: +1-617-542-5942 @*
+Fax: +1-617-542-2652 @*
+Email: @email{gnu@@gnu.org} @*
+URL: @uref{http://www.gnu.org/} @*
+
+@c This one is correct for gawk 3.1.0 from the FSF
+ISBN 1-882114-28-0 @*
+@sp 2
+@insertcopying
+@sp 2
+Cover art by Etienne Suvasa.
+@end titlepage
+
+@c Thanks to Bob Chassell for directions on doing dedications.
+@iftex
+@headings off
+@page
+@w{ }
+@sp 9
+@center @i{To Miriam, for making me complete.}
+@sp 1
+@center @i{To Chana, for the joy you bring us.}
+@sp 1
+@center @i{To Rivka, for the exponential increase.}
+@sp 1
+@center @i{To Nachum, for the added dimension.}
+@sp 1
+@center @i{To Malka, for the new beginning.}
+@w{ }
+@page
+@w{ }
+@page
+@headings on
+@end iftex
+
+@iftex
+@headings off
+@evenheading @thispage@ @ @ @strong{@value{TITLE}} @| @|
+@oddheading  @| @| @strong{@thischapter}@ @ @ @thispage
+@end iftex
+
+@ifnottex
+@ifnotxml
+@node Top
+@top General Introduction
+@c Preface node should come right after the Top
+@c node, in `unnumbered' sections, then the chapter, `What is gawk'.
+@c Licensing nodes are appendices, they're not central to AWK.
+
+This file documents @command{awk}, a program that you can use to select
+particular records in a file and perform operations upon them.
+
+@insertcopying
+
+@end ifnotxml
+@end ifnottex
+
+@menu
+* Foreword::                       Some nice words about this
+                                   @value{DOCUMENT}.
+* Preface::                        What this @value{DOCUMENT} is about; brief
+                                   history and acknowledgments.
+* Getting Started::                A basic introduction to using
+                                   @command{awk}. How to run an @command{awk}
+                                   program. Command-line syntax.
+* Regexp::                         All about matching things using regular
+                                   expressions.
+* Reading Files::                  How to read files and manipulate fields.
+* Printing::                       How to print using @command{awk}. Describes
+                                   the @code{print} and @code{printf}
+                                   statements. Also describes redirection of
+                                   output.
+* Expressions::                    Expressions are the basic building blocks
+                                   of statements.
+* Patterns and Actions::           Overviews of patterns and actions.
+* Arrays::                         The description and use of arrays. Also
+                                   includes array-oriented control statements.
+* Functions::                      Built-in and user-defined functions.
+* Internationalization::           Getting @command{gawk} to speak your
+                                   language.
+* Advanced Features::              Stuff for advanced users, specific to
+                                   @command{gawk}.
+* Invoking Gawk::                  How to run @command{gawk}.
+* Library Functions::              A Library of @command{awk} Functions.
+* Sample Programs::                Many @command{awk} programs with complete
+                                   explanations.
+* Language History::               The evolution of the @command{awk}
+                                   language.
+* Installation::                   Installing @command{gawk} under various
+                                   operating systems.
+* Notes::                          Notes about @command{gawk} extensions and
+                                   possible future work.
+* Basic Concepts::                 A very quick introduction to programming
+                                   concepts.
+* Glossary::                       An explanation of some unfamiliar terms.
+* Copying::                        Your right to copy and distribute
+                                   @command{gawk}.
+* GNU Free Documentation License:: The license for this @value{DOCUMENT}.
+* Index::                          Concept and Variable Index.
+
+@detailmenu
+* History::                        The history of @command{gawk} and
+                                   @command{awk}.
+* Names::                          What name to use to find @command{awk}.
+* This Manual::                    Using this @value{DOCUMENT}. Includes
+                                   sample input files that you can use.
+* Conventions::                    Typographical Conventions.
+* Manual History::                 Brief history of the GNU project and this
+                                   @value{DOCUMENT}.
+* How To Contribute::              Helping to save the world.
+* Acknowledgments::                Acknowledgments.
+* Running gawk::                   How to run @command{gawk} programs;
+                                   includes command-line syntax.
+* One-shot::                       Running a short throwaway @command{awk}
+                                   program.
+* Read Terminal::                  Using no input files (input from terminal
+                                   instead).
+* Long::                           Putting permanent @command{awk} programs in
+                                   files.
+* Executable Scripts::             Making self-contained @command{awk}
+                                   programs.
+* Comments::                       Adding documentation to @command{gawk}
+                                   programs.
+* Quoting::                        More discussion of shell quoting issues.
+* Sample Data Files::              Sample data files for use in the
+                                   @command{awk} programs illustrated in this
+                                   @value{DOCUMENT}.
+* Very Simple::                    A very simple example.
+* Two Rules::                      A less simple one-line example using two
+                                   rules.
+* More Complex::                   A more complex example.
+* Statements/Lines::               Subdividing or combining statements into
+                                   lines.
+* Other Features::                 Other Features of @command{awk}.
+* When::                           When to use @command{gawk} and when to use
+                                   other things.
+* Regexp Usage::                   How to Use Regular Expressions.
+* Escape Sequences::               How to write nonprinting characters.
+* Regexp Operators::               Regular Expression Operators.
+* Character Lists::                What can go between @samp{[...]}.
+* GNU Regexp Operators::           Operators specific to GNU software.
+* Case-sensitivity::               How to do case-insensitive matching.
+* Leftmost Longest::               How much text matches.
+* Computed Regexps::               Using Dynamic Regexps.
+* Locales::                        How the locale affects things.
+* Records::                        Controlling how data is split into records.
+* Fields::                         An introduction to fields.
+* Nonconstant Fields::             Nonconstant Field Numbers.
+* Changing Fields::                Changing the Contents of a Field.
+* Field Separators::               The field separator and how to change it.
+* Regexp Field Splitting::         Using regexps as the field separator.
+* Single Character Fields::        Making each character a separate field.
+* Command Line Field Separator::   Setting @code{FS} from the command-line.
+* Field Splitting Summary::        Some final points and a summary table.
+* Constant Size::                  Reading constant width data.
+* Multiple Line::                  Reading multi-line records.
+* Getline::                        Reading files under explicit program
+                                   control using the @code{getline} function.
+* Plain Getline::                  Using @code{getline} with no arguments.
+* Getline/Variable::               Using @code{getline} into a variable.
+* Getline/File::                   Using @code{getline} from a file.
+* Getline/Variable/File::          Using @code{getline} into a variable from a
+                                   file.
+* Getline/Pipe::                   Using @code{getline} from a pipe.
+* Getline/Variable/Pipe::          Using @code{getline} into a variable from a
+                                   pipe.
+* Getline/Coprocess::              Using @code{getline} from a coprocess.
+* Getline/Variable/Coprocess::     Using @code{getline} into a variable from a
+                                   coprocess.
+* Getline Notes::                  Important things to know about
+                                   @code{getline}.
+* Getline Summary::                Summary of @code{getline} Variants.
+* Print::                          The @code{print} statement.
+* Print Examples::                 Simple examples of @code{print} statements.
+* Output Separators::              The output separators and how to change
+                                   them.
+* OFMT::                           Controlling Numeric Output With
+                                   @code{print}.
+* Printf::                         The @code{printf} statement.
+* Basic Printf::                   Syntax of the @code{printf} statement.
+* Control Letters::                Format-control letters.
+* Format Modifiers::               Format-specification modifiers.
+* Printf Examples::                Several examples.
+* Redirection::                    How to redirect output to multiple files
+                                   and pipes.
+* Special Files::                  File name interpretation in @command{gawk}.
+                                   @command{gawk} allows access to inherited
+                                   file descriptors.
+* Special FD::                     Special files for I/O.
+* Special Process::                Special files for process information.
+* Special Network::                Special files for network communications.
+* Special Caveats::                Things to watch out for.
+* Close Files And Pipes::          Closing Input and Output Files and Pipes.
+* Constants::                      String, numeric and regexp constants.
+* Scalar Constants::               Numeric and string constants.
+* Nondecimal-numbers::             What are octal and hex numbers.
+* Regexp Constants::               Regular Expression constants.
+* Using Constant Regexps::         When and how to use a regexp constant.
+* Variables::                      Variables give names to values for later
+                                   use.
+* Using Variables::                Using variables in your programs.
+* Assignment Options::             Setting variables on the command-line and a
+                                   summary of command-line syntax. This is an
+                                   advanced method of input.
+* Conversion::                     The conversion of strings to numbers and
+                                   vice versa.
+* Arithmetic Ops::                 Arithmetic operations (@samp{+}, @samp{-},
+                                   etc.)
+* Concatenation::                  Concatenating strings.
+* Assignment Ops::                 Changing the value of a variable or a
+                                   field.
+* Increment Ops::                  Incrementing the numeric value of a
+                                   variable.
+* Truth Values::                   What is ``true'' and what is ``false''.
+* Typing and Comparison::          How variables acquire types and how this
+                                   affects comparison of numbers and strings
+                                   with @samp{<}, etc.
+* Boolean Ops::                    Combining comparison expressions using
+                                   boolean operators @samp{||} (``or''),
+                                   @samp{&&} (``and'') and @samp{!} (``not'').
+* Conditional Exp::                Conditional expressions select between two
+                                   subexpressions under control of a third
+                                   subexpression.
+* Function Calls::                 A function call is an expression.
+* Precedence::                     How various operators nest.
+* Pattern Overview::               What goes into a pattern.
+* Regexp Patterns::                Using regexps as patterns.
+* Expression Patterns::            Any expression can be used as a pattern.
+* Ranges::                         Pairs of patterns specify record ranges.
+* BEGIN/END::                      Specifying initialization and cleanup
+                                   rules.
+* Using BEGIN/END::                How and why to use BEGIN/END rules.
+* I/O And BEGIN/END::              I/O issues in BEGIN/END rules.
+* Empty::                          The empty pattern, which matches every
+                                   record.
+* Using Shell Variables::          How to use shell variables with
+                                   @command{awk}.
+* Action Overview::                What goes into an action.
+* Statements::                     Describes the various control statements in
+                                   detail.
+* If Statement::                   Conditionally execute some @command{awk}
+                                   statements.
+* While Statement::                Loop until some condition is satisfied.
+* Do Statement::                   Do specified action while looping until
+                                   some condition is satisfied.
+* For Statement::                  Another looping statement, that provides
+                                   initialization and increment clauses.
+* Switch Statement::               Switch/case evaluation for conditional
+                                   execution of statements based on a value.
+* Break Statement::                Immediately exit the innermost enclosing
+                                   loop.
+* Continue Statement::             Skip to the end of the innermost enclosing
+                                   loop.
+* Next Statement::                 Stop processing the current input record.
+* Nextfile Statement::             Stop processing the current file.
+* Exit Statement::                 Stop execution of @command{awk}.
+* Built-in Variables::             Summarizes the built-in variables.
+* User-modified::                  Built-in variables that you change to
+                                   control @command{awk}.
+* Auto-set::                       Built-in variables where @command{awk}
+                                   gives you information.
+* ARGC and ARGV::                  Ways to use @code{ARGC} and @code{ARGV}.
+* Array Intro::                    Introduction to Arrays.
+* Reference to Elements::          How to examine one element of an array.
+* Assigning Elements::             How to change an element of an array.
+* Array Example::                  Basic Example of an Array.
+* Scanning an Array::              A variation of the @code{for} statement. It
+                                   loops through the indices of an array's
+                                   existing elements.
+* Delete::                         The @code{delete} statement removes an
+                                   element from an array.
+* Numeric Array Subscripts::       How to use numbers as subscripts in
+                                   @command{awk}.
+* Uninitialized Subscripts::       Using Uninitialized variables as
+                                   subscripts.
+* Multi-dimensional::              Emulating multidimensional arrays in
+                                   @command{awk}.
+* Multi-scanning::                 Scanning multidimensional arrays.
+* Array Sorting::                  Sorting array values and indices.
+* Built-in::                       Summarizes the built-in functions.
+* Calling Built-in::               How to call built-in functions.
+* Numeric Functions::              Functions that work with numbers, including
+                                   @code{int}, @code{sin} and @code{rand}.
+* String Functions::               Functions for string manipulation, such as
+                                   @code{split}, @code{match} and
+                                   @code{sprintf}.
+* Gory Details::                   More than you want to know about @samp{\}
+                                   and @samp{&} with @code{sub}, @code{gsub},
+                                   and @code{gensub}.
+* I/O Functions::                  Functions for files and shell commands.
+* Time Functions::                 Functions for dealing with timestamps.
+* Bitwise Functions::              Functions for bitwise operations.
+* I18N Functions::                 Functions for string translation.
+* User-defined::                   Describes User-defined functions in detail.
+* Definition Syntax::              How to write definitions and what they
+                                   mean.
+* Function Example::               An example function definition and what it
+                                   does.
+* Function Caveats::               Things to watch out for.
+* Return Statement::               Specifying the value a function returns.
+* Dynamic Typing::                 How variable types can change at runtime.
+* I18N and L10N::                  Internationalization and Localization.
+* Explaining gettext::             How GNU @code{gettext} works.
+* Programmer i18n::                Features for the programmer.
+* Translator i18n::                Features for the translator.
+* String Extraction::              Extracting marked strings.
+* Printf Ordering::                Rearranging @code{printf} arguments.
+* I18N Portability::               @command{awk}-level portability issues.
+* I18N Example::                   A simple i18n example.
+* Gawk I18N::                      @command{gawk} is also internationalized.
+* Nondecimal Data::                Allowing nondecimal input data.
+* Two-way I/O::                    Two-way communications with another
+                                   process.
+* TCP/IP Networking::              Using @command{gawk} for network
+                                   programming.
+* Portal Files::                   Using @command{gawk} with BSD portals.
+* Profiling::                      Profiling your @command{awk} programs.
+* Command Line::                   How to run @command{awk}.
+* Options::                        Command-line options and their meanings.
+* Other Arguments::                Input file names and variable assignments.
+* AWKPATH Variable::               Searching directories for @command{awk}
+                                   programs.
+* Obsolete::                       Obsolete Options and/or features.
+* Undocumented::                   Undocumented Options and Features.
+* Known Bugs::                     Known Bugs in @command{gawk}.
+* Library Names::                  How to best name private global variables
+                                   in library functions.
+* General Functions::              Functions that are of general use.
+* Nextfile Function::              Two implementations of a @code{nextfile}
+                                   function.
+* Assert Function::                A function for assertions in @command{awk}
+                                   programs.
+* Round Function::                 A function for rounding if @code{sprintf}
+                                   does not do it correctly.
+* Cliff Random Function::          The Cliff Random Number Generator.
+* Ordinal Functions::              Functions for using characters as numbers
+                                   and vice versa.
+* Join Function::                  A function to join an array into a string.
+* Gettimeofday Function::          A function to get formatted times.
+* Data File Management::           Functions for managing command-line data
+                                   files.
+* Filetrans Function::             A function for handling data file
+                                   transitions.
+* Rewind Function::                A function for rereading the current file.
+* File Checking::                  Checking that data files are readable.
+* Empty Files::                    Checking for zero-length files.
+* Ignoring Assigns::               Treating assignments as file names.
+* Getopt Function::                A function for processing command-line
+                                   arguments.
+* Passwd Functions::               Functions for getting user information.
+* Group Functions::                Functions for getting group information.
+* Running Examples::               How to run these examples.
+* Clones::                         Clones of common utilities.
+* Cut Program::                    The @command{cut} utility.
+* Egrep Program::                  The @command{egrep} utility.
+* Id Program::                     The @command{id} utility.
+* Split Program::                  The @command{split} utility.
+* Tee Program::                    The @command{tee} utility.
+* Uniq Program::                   The @command{uniq} utility.
+* Wc Program::                     The @command{wc} utility.
+* Miscellaneous Programs::         Some interesting @command{awk} programs.
+* Dupword Program::                Finding duplicated words in a document.
+* Alarm Program::                  An alarm clock.
+* Translate Program::              A program similar to the @command{tr}
+                                   utility.
+* Labels Program::                 Printing mailing labels.
+* Word Sorting::                   A program to produce a word usage count.
+* History Sorting::                Eliminating duplicate entries from a
+                                   history file.
+* Extract Program::                Pulling out programs from Texinfo source
+                                   files.
+* Simple Sed::                     A Simple Stream Editor.
+* Igawk Program::                  A wrapper for @command{awk} that includes
+                                   files.
+* V7/SVR3.1::                      The major changes between V7 and System V
+                                   Release 3.1.
+* SVR4::                           Minor changes between System V Releases 3.1
+                                   and 4.
+* POSIX::                          New features from the POSIX standard.
+* BTL::                            New features from the Bell Laboratories
+                                   version of @command{awk}.
+* POSIX/GNU::                      The extensions in @command{gawk} not in
+                                   POSIX @command{awk}.
+* Contributors::                   The major contributors to @command{gawk}.
+* Gawk Distribution::              What is in the @command{gawk} distribution.
+* Getting::                        How to get the distribution.
+* Extracting::                     How to extract the distribution.
+* Distribution contents::          What is in the distribution.
+* Unix Installation::              Installing @command{gawk} under various
+                                   versions of Unix.
+* Quick Installation::             Compiling @command{gawk} under Unix.
+* Additional Configuration Options:: Other compile-time options.
+* Configuration Philosophy::       How it's all supposed to work.
+* Non-Unix Installation::          Installation on Other Operating Systems.
+* Amiga Installation::             Installing @command{gawk} on an Amiga.
+* BeOS Installation::              Installing @command{gawk} on BeOS.
+* PC Installation::                Installing and Compiling @command{gawk} on
+                                   MS-DOS and OS/2.
+* PC Binary Installation::         Installing a prepared distribution.
+* PC Compiling::                   Compiling @command{gawk} for MS-DOS, Windows32,
+                                   and OS/2.
+* PC Using::                       Running @command{gawk} on MS-DOS, Windows32 and
+                                   OS/2.
+* PC Dynamic::                     Compiling @command{gawk} for dynamic
+                                   libraries.
+* Cygwin::                         Building and running @command{gawk} for
+                                   Cygwin.
+* VMS Installation::               Installing @command{gawk} on VMS.
+* VMS Compilation::                How to compile @command{gawk} under VMS.
+* VMS Installation Details::       How to install @command{gawk} under VMS.
+* VMS Running::                    How to run @command{gawk} under VMS.
+* VMS POSIX::                      Alternate instructions for VMS POSIX.
+* Unsupported::                    Systems whose ports are no longer
+                                   supported.
+* Atari Installation::             Installing @command{gawk} on the Atari ST.
+* Atari Compiling::                Compiling @command{gawk} on Atari.
+* Atari Using::                    Running @command{gawk} on Atari.
+* Tandem Installation::            Installing @command{gawk} on a Tandem.
+* Bugs::                           Reporting Problems and Bugs.
+* Other Versions::                 Other freely available @command{awk}
+                                   implementations.
+* Compatibility Mode::             How to disable certain @command{gawk}
+                                   extensions.
+* Additions::                      Making Additions To @command{gawk}.
+* Adding Code::                    Adding code to the main body of
+                                   @command{gawk}.
+* New Ports::                      Porting @command{gawk} to a new operating
+                                   system.
+* Dynamic Extensions::             Adding new built-in functions to
+                                   @command{gawk}.
+* Internals::                      A brief look at some @command{gawk}
+                                   internals.
+* Sample Library::                 An example of new functions.
+* Internal File Description::      What the new functions will do.
+* Internal File Ops::              The code for internal file operations.
+* Using Internal File Ops::        How to use an external extension.
+* Future Extensions::              New features that may be implemented one
+                                   day.
+* Basic High Level::               The high level view.
+* Basic Data Typing::              A very quick intro to data types.
+* Floating Point Issues::          Stuff to know about floating-point numbers.
+@end detailmenu
+@end menu
+
+@c dedication for Info file
+@ifinfo
+@center To Miriam, for making me complete.
+@sp 1
+@center To Chana, for the joy you bring us.
+@sp 1
+@center To Rivka, for the exponential increase.
+@sp 1
+@center To Nachum, for the added dimension.
+@sp 1
+@center To Malka, for the new beginning.
+@end ifinfo
+
+@summarycontents
+@contents
+
+@node Foreword
+@unnumbered Foreword
+
+Arnold Robbins and I are good friends. We were introduced 11 years ago
+by circumstances---and our favorite programming language, AWK.
+The circumstances started a couple of years
+earlier. I was working at a new job and noticed an unplugged
+Unix computer sitting in the corner.  No one knew how to use it,
+and neither did I.  However,
+a couple of days later it was running, and
+I was @code{root} and the one-and-only user.
+That day, I began the transition from statistician to Unix programmer.
+
+On one of many trips to the library or bookstore in search of
+books on Unix, I found the gray AWK book, a.k.a. Aho, Kernighan and
+Weinberger, @cite{The AWK Programming Language}, Addison-Wesley,
+1988.  AWK's simple programming paradigm---find a pattern in the
+input and then perform an action---often reduced complex or tedious
+data manipulations to a few lines of code.  I was excited to try my
+hand at programming in AWK.
+
+Alas, the @command{awk} on my computer was a limited version of the
+language described in the AWK book.  I discovered that my computer
+had ``old @command{awk}'' and the AWK book described ``new @command{awk}.''
+I learned that this was typical; the old version refused to step
+aside or relinquish its name.  If a system had a new @command{awk}, it was
+invariably called @command{nawk}, and few systems had it.
+The best way to get a new @command{awk} was to @command{ftp} the source code for
+@command{gawk} from @code{prep.ai.mit.edu}.  @command{gawk} was a version of
+new @command{awk} written by David Trueman and Arnold, and available under
+the GNU General Public License.
+
+(Incidentally,
+it's no longer difficult to find a new @command{awk}. @command{gawk} ships with
+Linux, and you can download binaries or source code for almost
+any system; my wife uses @command{gawk} on her VMS box.)
+
+My Unix system started out unplugged from the wall; it certainly was not
+plugged into a network.  So, oblivious to the existence of @command{gawk}
+and the Unix community in general, and desiring a new @command{awk}, I wrote
+my own, called @command{mawk}.
+Before I was finished I knew about @command{gawk},
+but it was too late to stop, so I eventually posted
+to a @code{comp.sources} newsgroup.
+
+A few days after my posting, I got a friendly email
+from Arnold introducing
+himself.   He suggested we share design and algorithms and
+attached a draft of the POSIX standard so
+that I could update @command{mawk} to support language extensions added
+after publication of the AWK book.
+
+Frankly, if our roles had
+been reversed, I would not have been so open and we probably would
+have never met.  I'm glad we did meet.
+He is an AWK expert's AWK expert and a genuinely nice person.
+Arnold contributes significant amounts of his
+expertise and time to the Free Software Foundation.
+
+This book is the @command{gawk} reference manual, but at its core it
+is a book about AWK programming that
+will appeal to a wide audience.
+It is a definitive reference to the AWK language as defined by the
+1987 Bell Labs release and codified in the 1992 POSIX Utilities
+standard.
+
+On the other hand, the novice AWK programmer can study
+a wealth of practical programs that emphasize
+the power of AWK's basic idioms:
+data-driven control flow, pattern matching with regular expressions,
+and associative arrays.
+Those looking for something new can try out @command{gawk}'s
+interface to network protocols via special @file{/inet} files.
+
+The programs in this book make clear that an AWK program is
+typically much smaller and faster to develop than
+a counterpart written in C.
+Consequently, there is often a payoff to prototyping an
+algorithm or design in AWK to get it running quickly and expose
+problems early. Often, the interpreted performance is adequate
+and the AWK prototype becomes the product.
+
+The new @command{pgawk} (profiling @command{gawk}) produces
+program execution counts.
+I recently experimented with an algorithm that, for
+@math{n} lines of input, exhibited
+@tex
+$\sim\! Cn^2$
+@end tex
+@ifnottex
+~ C n^2
+@end ifnottex
+performance, while
+theory predicted
+@tex
+$\sim\! Cn\log n$
+@end tex
+@ifnottex
+~ C n log n
+@end ifnottex
+behavior. A few minutes poring
+over the @file{awkprof.out} profile pinpointed the problem to
+a single line of code.  @command{pgawk} is a welcome addition to
+my programmer's toolbox.
+
+Arnold has distilled over a decade of experience writing and
+using AWK programs, and developing @command{gawk}, into this book.  If you use
+AWK or want to learn how, then read this book.
+
+@display
+Michael Brennan
+Author of @command{mawk}
+@end display
+
+@node Preface
+@unnumbered Preface
+@c I saw a comment somewhere that the preface should describe the book itself,
+@c and the introduction should describe what the book covers.
+@c
+@c 12/2000: Chuck wants the preface & intro combined.
+
+Several kinds of tasks occur repeatedly
+when working with text files.
+You might want to extract certain lines and discard the rest.
+Or you may need to make changes wherever certain patterns appear,
+but leave the rest of the file alone.
+Writing single-use programs for these tasks in languages such as C, C++, or Pascal
+is time-consuming and inconvenient.
+Such jobs are often easier with @command{awk}.
+The @command{awk} utility interprets a special-purpose programming language
+that makes it easy to handle simple data-reformatting jobs.
+
+The GNU implementation of @command{awk} is called @command{gawk}; it is fully
+compatible with the System V Release 4 version of
+@command{awk}.  @command{gawk} is also compatible with the POSIX
+specification of the @command{awk} language.  This means that all
+properly written @command{awk} programs should work with @command{gawk}.
+Thus, we usually don't distinguish between @command{gawk} and other
+@command{awk} implementations.
+
+@cindex @command{awk}, POSIX and, See Also POSIX @command{awk}
+@cindex @command{awk}, POSIX and
+@cindex POSIX, @command{awk} and
+@cindex @command{gawk}, @command{awk} and
+@cindex @command{awk}, @command{gawk} and
+@cindex @command{awk}, uses for
+Using @command{awk} allows you to:
+
+@itemize @bullet
+@item
+Manage small, personal databases
+
+@item
+Generate reports
+
+@item
+Validate data
+
+@item
+Produce indexes and perform other document preparation tasks
+
+@item
+Experiment with algorithms that you can adapt later to other computer
+languages
+@end itemize
+
+@cindex @command{awk}, See Also @command{gawk}
+@cindex @command{gawk}, See Also @command{awk}
+@cindex @command{gawk}, uses for
+In addition,
+@command{gawk}
+provides facilities that make it easy to:
+
+@itemize @bullet
+@item
+Extract bits and pieces of data for processing
+
+@item
+Sort data
+
+@item
+Perform simple network communications
+@end itemize
+
+This @value{DOCUMENT} teaches you about the @command{awk} language and
+how you can use it effectively.  You should already be familiar with basic
+system commands, such as @command{cat} and @command{ls},@footnote{These commands
+are available on POSIX-compliant systems, as well as on traditional
+Unix-based systems. If you are using some other operating system, you still need to
+be familiar with the ideas of I/O redirection and pipes.} as well as basic shell
+facilities, such as input/output (I/O) redirection and pipes.
+
+@cindex GNU @command{awk}, See @command{gawk}
+Implementations of the @command{awk} language are available for many
+different computing environments.  This @value{DOCUMENT}, while describing
+the @command{awk} language in general, also describes the particular
+implementation of @command{awk} called @command{gawk} (which stands for
+``GNU awk'').  @command{gawk} runs on a broad range of Unix systems,
+from 80386 PC-based computers up through large-scale systems,
+such as Crays. @command{gawk} has also been ported to Mac OS X,
+MS-DOS, Microsoft Windows (all versions) and OS/2 PCs, Atari and Amiga
+microcomputers, BeOS, Tandem D20, and VMS.
+
+@menu
+* History::                     The history of @command{gawk} and
+                                @command{awk}.
+* Names::                       What name to use to find @command{awk}.
+* This Manual::                 Using this @value{DOCUMENT}. Includes sample
+                                input files that you can use.
+* Conventions::                 Typographical Conventions.
+* Manual History::              Brief history of the GNU project and this
+                                @value{DOCUMENT}.
+* How To Contribute::           Helping to save the world.
+* Acknowledgments::             Acknowledgments.
+@end menu
+
+@node History
+@unnumberedsec History of @command{awk} and @command{gawk}
+@cindex recipe for a programming language
+@cindex programming language, recipe for
+@center Recipe For A Programming Language
+
+@multitable {2 parts} {1 part  @code{egrep}} {1 part  @code{snobol}}
+@item @tab 1 part  @code{egrep} @tab 1 part  @code{snobol}
+@item @tab 2 parts @code{ed} @tab 3 parts C
+@end multitable
+
+@quotation
+Blend all parts well using @code{lex} and @code{yacc}.
+Document minimally and release.
+
+After eight years, add another part @code{egrep} and two
+more parts C.  Document very well and release.
+@end quotation
+
+@cindex Aho, Alfred
+@cindex Weinberger, Peter
+@cindex Kernighan, Brian
+@cindex @command{awk}, history of
+The name @command{awk} comes from the initials of its designers: Alfred V.@:
+Aho, Peter J.@: Weinberger and Brian W.@: Kernighan.  The original version of
+@command{awk} was written in 1977 at AT&T Bell Laboratories.
+In 1985, a new version made the programming
+language more powerful, introducing user-defined functions, multiple input
+streams, and computed regular expressions.
+This new version became widely available with Unix System V
+Release 3.1 (SVR3.1).
+The version in SVR4 added some new features and cleaned
+up the behavior in some of the ``dark corners'' of the language.
+The specification for @command{awk} in the POSIX Command Language
+and Utilities standard further clarified the language.
+Both the @command{gawk} designers and the original Bell Laboratories @command{awk}
+designers provided feedback for the POSIX specification.
+
+@cindex Rubin, Paul
+@cindex Fenlason, Jay
+@cindex Trueman, David
+Paul Rubin wrote the GNU implementation, @command{gawk}, in 1986.
+Jay Fenlason completed it, with advice from Richard Stallman.  John Woods
+contributed parts of the code as well.  In 1988 and 1989, David Trueman, with
+help from me, thoroughly reworked @command{gawk} for compatibility
+with the newer @command{awk}.
+Circa 1995, I became the primary maintainer.
+Current development focuses on bug fixes,
+performance improvements, standards compliance, and occasionally, new features.
+
+In May of 1997, J@"urgen Kahrs felt the need for network access
+from @command{awk}, and with a little help from me, set about adding
+features to do this for @command{gawk}.  At that time, he also
+wrote the bulk of
+@cite{TCP/IP Internetworking with @command{gawk}}
+(a separate document, available as part of the @command{gawk} distribution).
+His code finally became part of the main @command{gawk} distribution
+with @command{gawk} @value{PVERSION} 3.1.
+
+@xref{Contributors},
+for a complete list of those who made important contributions to @command{gawk}.
+
+@node Names
+@section A Rose by Any Other Name
+
+@cindex @command{awk}, new vs. old
+The @command{awk} language has evolved over the years. Full details are
+provided in @ref{Language History}.
+The language described in this @value{DOCUMENT}
+is often referred to as ``new @command{awk}'' (@command{nawk}).
+
+@cindex @command{awk}, versions of
+Because of this, many systems have multiple
+versions of @command{awk}.
+Some systems have an @command{awk} utility that implements the
+original version of the @command{awk} language and a @command{nawk} utility
+for the new
+version.
+Others have an @command{oawk} version for the ``old @command{awk}''
+language and plain @command{awk} for the new one.  Still others only
+have one version, which is usually the new one.@footnote{Often, these systems
+use @command{gawk} for their @command{awk} implementation!}
+
+@cindex @command{nawk} utility
+@cindex @command{oawk} utility
+All in all, this makes it difficult for you to know which version of
+@command{awk} you should run when writing your programs.  The best advice
+I can give here is to check your local documentation. Look for @command{awk},
+@command{oawk}, and @command{nawk}, as well as for @command{gawk}.
+It is likely that you already
+have some version of new @command{awk} on your system, which is what
+you should use when running your programs.  (Of course, if you're reading
+this @value{DOCUMENT}, chances are good that you have @command{gawk}!)
+
+Throughout this @value{DOCUMENT}, whenever we refer to a language feature
+that should be available in any complete implementation of POSIX @command{awk},
+we simply use the term @command{awk}.  When referring to a feature that is
+specific to the GNU implementation, we use the term @command{gawk}.
+
+@node This Manual
+@section Using This Book
+@cindex @command{awk}, terms describing
+
+The term @command{awk} refers to a particular program as well as to the language you
+use to tell this program what to do.  When we need to be careful, we call
+the language ``the @command{awk} language,''
+and the program ``the @command{awk} utility.''
+This @value{DOCUMENT} explains
+both the @command{awk} language and how to run the @command{awk} utility.
+The term @dfn{@command{awk} program} refers to a program written by you in
+the @command{awk} programming language.
+
+@cindex @command{gawk}, @command{awk} and
+@cindex @command{awk}, @command{gawk} and
+@cindex POSIX @command{awk}
+Primarily, this @value{DOCUMENT} explains the features of @command{awk},
+as defined in the POSIX standard.  It does so in the context of the
+@command{gawk} implementation.  While doing so, it also
+attempts to describe important differences between @command{gawk}
+and other @command{awk} implementations.@footnote{All such differences
+appear in the index under the
+entry ``differences in @command{awk} and @command{gawk}.''}
+Finally, any @command{gawk} features that are not in
+the POSIX standard for @command{awk} are noted.
+
+@ifnotinfo
+This @value{DOCUMENT} has the difficult task of being both a tutorial and a reference.
+If you are a novice, feel free to skip over details that seem too complex.
+You should also ignore the many cross-references; they are for the
+expert user and for the online Info version of the document.
+@end ifnotinfo
+
+There are
+subsections labelled
+as @strong{Advanced Notes}
+scattered throughout the @value{DOCUMENT}.
+They add a more complete explanation of points that are relevant, but not likely
+to be of interest on first reading.
+All appear in the index, under the heading ``advanced features.''
+
+Most of the time, the examples use complete @command{awk} programs.
+In some of the more advanced sections, only the part of the @command{awk}
+program that illustrates the concept currently being described is shown.
+
+While this @value{DOCUMENT} is aimed principally at people who have not been
+exposed
+to @command{awk}, there is a lot of information here that even the @command{awk}
+expert should find useful.  In particular, the description of POSIX
+@command{awk} and the example programs in
+@ref{Library Functions}, and in
+@ref{Sample Programs},
+should be of interest.
+
+@ref{Getting Started},
+provides the essentials you need to know to begin using @command{awk}.
+
+@ref{Regexp},
+introduces regular expressions in general, and in particular the flavors
+supported by POSIX @command{awk} and @command{gawk}.
+
+@ref{Reading Files},
+describes how @command{awk} reads your data.
+It introduces the concepts of records and fields, as well
+as the @code{getline} command.
+I/O redirection is first described here.
+
+@ref{Printing},
+describes how @command{awk} programs can produce output with
+@code{print} and @code{printf}.
+
+@ref{Expressions},
+describes expressions, which are the basic building blocks
+for getting most things done in a program.
+
+@ref{Patterns and Actions},
+describes how to write patterns for matching records, actions for
+doing something when a record is matched, and the built-in variables
+@command{awk} and @command{gawk} use.
+
+@ref{Arrays},
+covers @command{awk}'s one-and-only data structure: associative arrays.
+Deleting array elements and whole arrays is also described, as well as
+sorting arrays in @command{gawk}.
+
+@ref{Functions},
+describes the built-in functions @command{awk} and
+@command{gawk} provide, as well as how to define
+your own functions.
+
+@ref{Internationalization},
+describes special features in @command{gawk} for translating program
+messages into different languages at runtime.
+
+@ref{Advanced Features},
+describes a number of @command{gawk}-specific advanced features.
+Of particular note
+are the abilities to have two-way communications with another process,
+perform TCP/IP networking, and
+profile your @command{awk} programs.
+
+@ref{Invoking Gawk},
+describes how to run @command{gawk}, the meaning of its
+command-line options, and how it finds @command{awk}
+program source files.
+
+@ref{Library Functions}, and
+@ref{Sample Programs},
+provide many sample @command{awk} programs.
+Reading them allows you to see @command{awk}
+solving real problems.
+
+@ref{Language History},
+describes how the @command{awk} language has evolved from its
+first release to the present.  It also describes how @command{gawk}
+has acquired features over time.
+
+@ref{Installation},
+describes how to get @command{gawk}, how to compile it
+under Unix, and how to compile and use it on different
+non-Unix systems.  It also describes how to report bugs
+in @command{gawk} and where to get three other freely
+available implementations of @command{awk}.
+
+@ref{Notes},
+describes how to disable @command{gawk}'s extensions, as
+well as how to contribute new code to @command{gawk},
+how to write extension libraries, and some possible
+future directions for @command{gawk} development.
+
+@ref{Basic Concepts},
+provides some very cursory background material for those who
+are completely unfamiliar with computer programming.
+Also centralized there is a discussion of some of the issues
+surrounding floating-point numbers.
+
+The
+@ref{Glossary},
+defines most, if not all, of the significant terms used
+throughout the book.
+If you find terms that you aren't familiar with, try looking them up here.
+
+@ref{Copying}, and
+@ref{GNU Free Documentation License},
+present the licenses that cover the @command{gawk} source code
+and this @value{DOCUMENT}, respectively.
+
+@node Conventions
+@section Typographical Conventions
+
+@cindex Texinfo
+This @value{DOCUMENT} is written using Texinfo, the GNU documentation
+formatting language.
+A single Texinfo source file is used to produce both the printed and online
+versions of the documentation.
+@ifnotinfo
+Because of this, the typographical conventions
+are slightly different from those in other books you may have read.
+@end ifnotinfo
+@ifinfo
+This @value{SECTION} briefly documents the typographical conventions used in Texinfo.
+@end ifinfo
+
+Examples you would type at the command line are preceded by the common
+shell primary and secondary prompts, @samp{$} and @samp{>}.
+Output from the command is preceded by the glyph ``@print{}''.
+This typically represents the command's standard output.
+Error messages, and other output on the command's standard error, are preceded
+by the glyph ``@error{}''.  For example:
+
+@example
+$ echo hi on stdout
+@print{} hi on stdout
+$ echo hello on stderr 1>&2
+@error{} hello on stderr
+@end example
+
+@ifnotinfo
+In the text, command names appear in @code{this font}, while code segments
+appear in the same font and quoted, @samp{like this}.  Some things are
+emphasized @emph{like this}, and if a point needs to be made
+strongly, it is done @strong{like this}.  The first occurrence of
+a new term is usually its @dfn{definition} and appears in the same
+font as the previous occurrence of ``definition'' in this sentence.
+@value{FN}s are indicated like this: @file{/path/to/ourfile}.
+@end ifnotinfo
+
+Characters that you type at the keyboard look @kbd{like this}.  In particular,
+there are special characters called ``control characters.''  These are
+characters that you type by holding down both the @kbd{CONTROL} key and
+another key, at the same time.  For example, a @kbd{@value{CTL}-d} is typed
+by first pressing and holding the @kbd{CONTROL} key, next
+pressing the @kbd{d} key and finally releasing both keys.
+
+@c fakenode --- for prepinfo
+@subsubheading Dark Corners
+@cindex Kernighan, Brian
+@quotation
+@i{Dark corners are basically fractal --- no matter how much
+you illuminate, there's always a smaller but darker one.}@*
+Brian Kernighan
+@end quotation
+
+@cindex d.c., See dark corner
+@cindex dark corner
+Until the POSIX standard (and @cite{The Gawk Manual}),
+many features of @command{awk} were either poorly documented or not
+documented at all.  Descriptions of such features
+(often called ``dark corners'') are noted in this @value{DOCUMENT} with
+@iftex
+the picture of a flashlight in the margin, as shown here.
+@value{DARKCORNER}
+@end iftex
+@ifnottex
+``(d.c.)''.
+@end ifnottex
+They also appear in the index under the heading ``dark corner.''
+
+As noted by the opening quote, though, any
+coverage of dark corners
+is, by definition, incomplete.
+
+@node Manual History
+@unnumberedsec The GNU Project and This Book
+
+@cindex FSF (Free Software Foundation)
+@cindex Free Software Foundation (FSF)
+@cindex Stallman, Richard
+The Free Software Foundation (FSF) is a nonprofit organization dedicated
+to the production and distribution of freely distributable software.
+It was founded by Richard M.@: Stallman, the author of the original
+Emacs editor.  GNU Emacs is the most widely used version of Emacs today.
+
+@cindex GNU Project
+@cindex GPL (General Public License)
+@cindex General Public License, See GPL
+@cindex documentation, online
+The GNU@footnote{GNU stands for ``GNU's not Unix.''}
+Project is an ongoing effort on the part of the Free Software
+Foundation to create a complete, freely distributable, POSIX-compliant
+computing environment.
+The FSF uses the ``GNU General Public License'' (GPL) to ensure that
+their software's
+source code is always available to the end user. A
+copy of the GPL is included
+@ifnotinfo
+in this @value{DOCUMENT}
+@end ifnotinfo
+for your reference
+(@pxref{Copying}).
+The GPL applies to the C language source code for @command{gawk}.
+To find out more about the FSF and the GNU Project online,
+see @uref{http://www.gnu.org, the GNU Project's home page}.
+This @value{DOCUMENT} may also be read from
+@uref{http://www.gnu.org/manual/gawk/, their web site}.
+
+A shell, an editor (Emacs), highly portable optimizing C, C++, and
+Objective-C compilers, a symbolic debugger and dozens of large and
+small utilities (such as @command{gawk}), have all been completed and are
+freely available.  The GNU operating
+system kernel (the HURD) has been released but is still in an early
+stage of development.
+
+@cindex Linux
+@cindex GNU/Linux
+@cindex operating systems, BSD-based
+@cindex Alpha (DEC)
+Until the GNU operating system is more fully developed, you should
+consider using GNU/Linux, a freely distributable, Unix-like operating
+system for Intel 80386, DEC Alpha, Sun SPARC, IBM S/390, and other
+systems.@footnote{The terminology ``GNU/Linux'' is explained
+in the @ref{Glossary}.}
+There are
+many books on GNU/Linux. One that is freely available is @cite{Linux
+Installation and Getting Started}, by Matt Welsh.
+GNU/Linux distributions are often available in computer stores or
+bundled on CD-ROMs with books about Linux.
+(There are three other freely available, Unix-like operating systems for
+80386 and other systems: NetBSD, FreeBSD, and OpenBSD. All are based on the
+4.4-Lite Berkeley Software Distribution, and they use recent versions
+of @command{gawk} for their versions of @command{awk}.)
+
+@ifnotinfo
+The @value{DOCUMENT} you are reading is actually free---at least, the
+information in it is free to anyone.  The machine-readable
+source code for the @value{DOCUMENT} comes with @command{gawk}; anyone
+may take this @value{DOCUMENT} to a copying machine and make as many
+copies as they like.  (Take a moment to check the Free Documentation
+License in @ref{GNU Free Documentation License}.)
+
+Although you could just print it out yourself, bound books are much
+easier to read and use.  Furthermore,
+the proceeds from sales of this book go back to the FSF
+to help fund development of more free software.
+@end ifnotinfo
+
+@ignore
+@cindex Close, Diane
+The @value{DOCUMENT} itself has gone through several previous,
+preliminary editions.
+Paul Rubin wrote the very first draft of @cite{The GAWK Manual};
+it was around 40 pages in size.
+Diane Close and Richard Stallman improved it, yielding the
+version which I started working with in the fall of 1988.
+It was around 90 pages long and barely described the original, ``old''
+version of @command{awk}. After substantial revision, the first version of
+the @cite{The GAWK Manual} to be released was Edition 0.11 Beta in
+October of 1989.  The manual then underwent more substantial revision
+for Edition 0.13 of December 1991.
+David Trueman, Pat Rankin and Michal Jaegermann contributed sections
+of the manual for Edition 0.13.
+That edition was published by the
+FSF as a bound book early in 1992.  Since then there were several
+minor revisions, notably Edition 0.14 of November 1992 that was published
+by the FSF in January of 1993 and Edition 0.16 of August 1993.
+
+Edition 1.0 of @cite{GAWK: The GNU Awk User's Guide} represented a significant re-working
+of @cite{The GAWK Manual}, with much additional material.
+The FSF and I agreed that I was now the primary author.
+@c I also felt that the manual needed a more descriptive title.
+
+In January 1996, SSC published Edition 1.0 under the title @cite{Effective AWK Programming}.
+In February 1997, they published Edition 1.0.3 which had minor changes
+as a ``second edition.''
+In 1999, the FSF published this same version as Edition 2
+of @cite{GAWK: The GNU Awk User's Guide}.
+
+Edition @value{EDITION} maintains the basic structure of Edition 1.0,
+but with significant additional material, reflecting the host of new features
+in @command{gawk} @value{PVERSION} @value{VERSION}.
+Of particular note is
+@ref{Array Sorting},
+@ref{Bitwise Functions},
+@ref{Internationalization},
+@ref{Advanced Features},
+and
+@ref{Dynamic Extensions}.
+@end ignore
+
+@cindex Close, Diane
+The @value{DOCUMENT} itself has gone through a number of previous editions.
+Paul Rubin wrote the very first draft of @cite{The GAWK Manual};
+it was around 40 pages in size.
+Diane Close and Richard Stallman improved it, yielding a
+version that was
+around 90 pages long and barely described the original, ``old''
+version of @command{awk}.
+
+I started working with that version in the fall of 1988.
+As work on it progressed,
+the FSF published several preliminary versions (numbered 0.@var{x}).
+In 1996, Edition 1.0 was released with @command{gawk} 3.0.0.
+The FSF published the first two editions under
+the title @cite{The GNU Awk User's Guide}.
+
+This edition maintains the basic structure of Edition 1.0,
+but with significant additional material, reflecting the host of new features
+in @command{gawk} @value{PVERSION} @value{VERSION}.
+Of particular note are
+@ref{Array Sorting},
+as well as
+@ref{Bitwise Functions},
+@ref{Internationalization},
+and also
+@ref{Advanced Features},
+and
+@ref{Dynamic Extensions}.
+
+@cite{@value{TITLE}} will undoubtedly continue to evolve.
+An electronic version
+comes with the @command{gawk} distribution from the FSF.
+If you find an error in this @value{DOCUMENT}, please report it!
+@xref{Bugs}, for information on submitting
+problem reports electronically, or write to me in care of the publisher.
+
+@node How To Contribute
+@unnumberedsec How to Contribute
+
+As the maintainer of GNU @command{awk},
+I am starting a collection of publicly available @command{awk}
+programs.
+For more information,
+see @uref{ftp://ftp.freefriends.org/arnold/Awkstuff}.
+If you have written an interesting @command{awk} program, or have written a
+@command{gawk} extension that you would like to
+share with the rest of the world, please contact me (@email{arnold@@gnu.org}).
+Making things available on the Internet helps keep the
+@command{gawk} distribution down to manageable size.
+
+@node Acknowledgments
+@unnumberedsec Acknowledgments
+
+The initial draft of @cite{The GAWK Manual} had the following acknowledgments:
+
+@quotation
+Many people need to be thanked for their assistance in producing this
+manual.  Jay Fenlason contributed many ideas and sample programs.  Richard
+Mlynarik and Robert Chassell gave helpful comments on drafts of this
+manual.  The paper @cite{A Supplemental Document for @command{awk}} by John W.@:
+Pierce of the Chemistry Department at UC San Diego, pinpointed several
+issues relevant both to @command{awk} implementation and to this manual, that
+would otherwise have escaped us.
+@end quotation
+
+@cindex Stallman, Richard
+I would like to acknowledge Richard M.@: Stallman, for his vision of a
+better world and for his courage in founding the FSF and starting the
+GNU Project.
+
+The following people (in alphabetical order)
+provided helpful comments on various
+versions of this book, up to and including this edition:
+Rick Adams,
+Nelson H.F. Beebe,
+Karl Berry,
+Dr.@: Michael Brennan,
+Rich Burridge,
+Claire Cloutier,
+Diane Close,
+Scott Deifik,
+Christopher (``Topher'') Eliot,
+Jeffrey Friedl,
+Dr.@: Darrel Hankerson,
+Michal Jaegermann,
+Dr.@: Richard J.@: LeBlanc,
+Michael Lijewski,
+Pat Rankin,
+Miriam Robbins,
+Mary Sheehan,
+and
+Chuck Toporek.
+
+@cindex Berry, Karl
+@cindex Chassell, Robert J.@:
+@c @cindex Texinfo
+Robert J.@: Chassell provided much valuable advice on
+the use of Texinfo.
+He also deserves special thanks for
+convincing me @emph{not} to title this @value{DOCUMENT}
+@cite{How To Gawk Politely}.
+Karl Berry helped significantly with the @TeX{} part of Texinfo.
+
+@cindex Hartholz, Marshall
+@cindex Hartholz, Elaine
+@cindex Schreiber, Bert
+@cindex Schreiber, Rita
+I would like to thank Marshall and Elaine Hartholz of Seattle and
+Dr.@: Bert and Rita Schreiber of Detroit for large amounts of quiet vacation
+time in their homes, which allowed me to make significant progress on
+this @value{DOCUMENT} and on @command{gawk} itself.
+
+@cindex Hughes, Phil
+Phil Hughes of SSC
+contributed in a very important way by loaning me his laptop GNU/Linux
+system, not once, but twice, which allowed me to do a lot of work while
+away from home.
+
+@cindex Trueman, David
+David Trueman deserves special credit; he has done a yeoman job
+of evolving @command{gawk} so that it performs well and without bugs.
+Although he is no longer involved with @command{gawk},
+working with him on this project was a significant pleasure.
+
+@cindex Drepper, Ulrich
+@cindex GNITS mailing list
+@cindex mailing list, GNITS
+The intrepid members of the GNITS mailing list, and most notably Ulrich
+Drepper, provided invaluable help and feedback for the design of the
+internationalization features.
+
+@cindex Beebe, Nelson
+@cindex Brown, Martin
+@cindex Buening, Andreas
+@cindex Deifik, Scott
+@cindex Hankerson, Darrel
+@cindex Hasegawa, Isamu
+@cindex Jaegermann, Michal
+@cindex Kahrs, J@"urgen
+@cindex Rankin, Pat
+@cindex Rommel, Kai Uwe
+@cindex Zaretskii, Eli
+Nelson Beebe,
+Martin Brown,
+Andreas Buening,
+Scott Deifik,
+Darrel Hankerson,
+Isamu Hasegawa,
+Michal Jaegermann,
+J@"urgen Kahrs,
+Pat Rankin,
+Kai Uwe Rommel,
+and Eli Zaretskii
+(in alphabetical order)
+make up the
+@command{gawk} ``crack portability team.''  Without their hard work and
+help, @command{gawk} would not be nearly the fine program it is today.  It
+has been and continues to be a pleasure working with this team of fine
+people.
+
+@cindex Kernighan, Brian
+David and I would like to thank Brian Kernighan of Bell Laboratories for
+invaluable assistance during the testing and debugging of @command{gawk}, and for
+help in clarifying numerous points about the language.  We could not have
+done nearly as good a job on either @command{gawk} or its documentation without
+his help.
+
+Chuck Toporek, Mary Sheehan, and Claire Cloutier of O'Reilly & Associates contributed
+significant editorial help for this @value{DOCUMENT} for the
+3.1 release of @command{gawk}.
+
+@cindex Robbins, Miriam
+@cindex Robbins, Jean
+@cindex Robbins, Harry
+@cindex G-d
+I must thank my wonderful wife, Miriam, for her patience through
+the many versions of this project, for her proofreading,
+and for sharing me with the computer.
+I would like to thank my parents for their love, and for the grace with
+which they raised and educated me.
+Finally, I also must acknowledge my gratitude to G-d, for the many opportunities
+He has sent my way, as well as for the gifts He has given me with which to
+take advantage of those opportunities.
+@sp 2
+@noindent
+Arnold Robbins @*
+Nof Ayalon @*
+ISRAEL @*
+March, 2001
+
+@ignore
+@c Try this
+@iftex
+@page
+@headings off
+@majorheading I@ @ @ @ The @command{awk} Language and @command{gawk}
+Part I describes the @command{awk} language and @command{gawk} program in detail.
+It starts with the basics, and continues through all of the features of @command{awk}
+and @command{gawk}.  It contains the following chapters:
+
+@itemize @bullet
+@item
+@ref{Getting Started}.
+
+@item
+@ref{Regexp}.
+
+@item
+@ref{Reading Files}.
+
+@item
+@ref{Printing}.
+
+@item
+@ref{Expressions}.
+
+@item
+@ref{Patterns and Actions}.
+
+@item
+@ref{Arrays}.
+
+@item
+@ref{Functions}.
+
+@item
+@ref{Internationalization}.
+
+@item
+@ref{Advanced Features}.
+
+@item
+@ref{Invoking Gawk}.
+@end itemize
+
+@page
+@evenheading @thispage@ @ @ @strong{@value{TITLE}} @| @|
+@oddheading  @| @| @strong{@thischapter}@ @ @ @thispage
+@end iftex
+@end ignore
+
+@node Getting Started
+@chapter Getting Started with @command{awk}
+@c @cindex script, definition of
+@c @cindex rule, definition of
+@c @cindex program, definition of
+@c @cindex basic function of @command{awk}
+@cindex @command{awk}, function of
+
+The basic function of @command{awk} is to search files for lines (or other
+units of text) that contain certain patterns.  When a line matches one
+of the patterns, @command{awk} performs specified actions on that line.
+@command{awk} keeps processing input lines in this way until it reaches
+the end of the input files.
+
+@cindex @command{awk}, uses for
+@c comma here is NOT for secondary
+@cindex programming languages, data-driven vs. procedural
+@cindex @command{awk} programs
+Programs in @command{awk} are different from programs in most other languages,
+because @command{awk} programs are @dfn{data-driven}; that is, you describe
+the data you want to work with and then what to do when you find it.
+Most other languages are @dfn{procedural}; you have to describe, in great
+detail, every step the program is to take.  When working with procedural
+languages, it is usually much
+harder to clearly describe the data your program will process.
+For this reason, @command{awk} programs are often refreshingly easy to
+read and write.
+
+@cindex program, definition of
+@cindex rule, definition of
+When you run @command{awk}, you specify an @command{awk} @dfn{program} that
+tells @command{awk} what to do.  The program consists of a series of
+@dfn{rules}.  (It may also contain @dfn{function definitions},
+an advanced feature that we will ignore for now.
+@xref{User-defined}.)  Each rule specifies one
+pattern to search for and one action to perform
+upon finding the pattern.
+
+Syntactically, a rule consists of a pattern followed by an action.  The
+action is enclosed in curly braces to separate it from the pattern.
+Newlines usually separate rules.  Therefore, an @command{awk}
+program looks like this:
+
+@example
+@var{pattern} @{ @var{action} @}
+@var{pattern} @{ @var{action} @}
+@dots{}
+@end example
+
+@menu
+* Running gawk::                How to run @command{gawk} programs; includes
+                                command-line syntax.
+* Sample Data Files::           Sample data files for use in the @command{awk}
+                                programs illustrated in this @value{DOCUMENT}.
+* Very Simple::                 A very simple example.
+* Two Rules::                   A less simple one-line example using two
+                                rules.
+* More Complex::                A more complex example.
+* Statements/Lines::            Subdividing or combining statements into
+                                lines.
+* Other Features::              Other Features of @command{awk}.
+* When::                        When to use @command{gawk} and when to use
+                                other things.
+@end menu
+
+@node Running gawk
+@section How to Run @command{awk} Programs
+
+@cindex @command{awk} programs, running
+There are several ways to run an @command{awk} program.  If the program is
+short, it is easiest to include it in the command that runs @command{awk},
+like this:
+
+@example
+awk '@var{program}' @var{input-file1} @var{input-file2} @dots{}
+@end example
+
+@cindex command line, formats
+When the program is long, it is usually more convenient to put it in a file
+and run it with a command like this:
+
+@example
+awk -f @var{program-file} @var{input-file1} @var{input-file2} @dots{}
+@end example
+
+This @value{SECTION} discusses both mechanisms, along with several
+variations of each.
+
+@menu
+* One-shot::                    Running a short throwaway @command{awk}
+                                program.
+* Read Terminal::               Using no input files (input from terminal
+                                instead).
+* Long::                        Putting permanent @command{awk} programs in
+                                files.
+* Executable Scripts::          Making self-contained @command{awk} programs.
+* Comments::                    Adding documentation to @command{gawk}
+                                programs.
+* Quoting::                     More discussion of shell quoting issues.
+@end menu
+
+@node One-shot
+@subsection One-Shot Throwaway @command{awk} Programs
+
+Once you are familiar with @command{awk}, you will often type in simple
+programs the moment you want to use them.  In that case, you can write the
+program as the first argument of the @command{awk} command, like this:
+
+@example
+awk '@var{program}' @var{input-file1} @var{input-file2} @dots{}
+@end example
+
+@noindent
+where @var{program} consists of a series of @var{patterns} and
+@var{actions}, as described earlier.
+
+@cindex single quote (@code{'})
+@cindex @code{'} (single quote)
+This command format instructs the @dfn{shell}, or command interpreter,
+to start @command{awk} and use the @var{program} to process records in the
+input file(s).  There are single quotes around @var{program} so
+the shell won't interpret any @command{awk} characters as special shell
+characters.  The quotes also cause the shell to treat all of @var{program} as
+a single argument for @command{awk}, and allow @var{program} to be more
+than one line long.
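+
+As a minimal sketch of a quoted program that spans several lines
+(the @samp{>} shown is the secondary prompt of a Bourne-style shell,
+the messages are arbitrary, and @code{BEGIN} is a feature described
+later):
+
+@example
+$ awk 'BEGIN @{
+>     print "first line"
+>     print "second line"
+> @}'
+@print{} first line
+@print{} second line
+@end example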
+
+@cindex shells, scripts
+@cindex @command{awk} programs, running, from shell scripts
+This format is also useful for running short or medium-sized @command{awk}
+programs from shell scripts, because it avoids the need for a separate
+file for the @command{awk} program.  A self-contained shell script is more
+reliable because there are no other files to misplace.
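+
+For instance, a shell script along the following lines keeps its
+@command{awk} program inline.  This is only a sketch: the script name
+@file{count-lines} is hypothetical, and @code{END} and @code{NR} are
+features described later.
+
+@example
+#! /bin/sh
+# count-lines: print the total number of input lines
+awk 'END @{ print NR @}' "$@@"
+@end example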
+
+@ref{Very Simple},
+@ifnotinfo
+later in this @value{CHAPTER},
+@end ifnotinfo
+presents several short,
+self-contained programs.
+
+@c Removed for gawk 3.1, doesn't really add anything here.
+@ignore
+As an interesting side point, the command
+
+@example
+awk '/foo/' @var{files} @dots{}
+@end example
+
+@noindent
+is essentially the same as
+
+@cindex @command{egrep} utility
+@example
+egrep foo @var{files} @dots{}
+@end example
+@end ignore
+
+@node Read Terminal
+@subsection Running @command{awk} Without Input Files
+
+@cindex standard input
+@cindex input, standard
+@cindex input files, running @command{awk} without
+You can also run @command{awk} without any input files.  If you type the
+following command line:
+
+@example
+awk '@var{program}'
+@end example
+
+@noindent
+@command{awk} applies the @var{program} to the @dfn{standard input},
+which usually means whatever you type on the terminal.  This continues
+until you indicate end-of-file by typing @kbd{@value{CTL}-d}.
+(On other operating systems, the end-of-file character may be different.
+For example, on OS/2 and MS-DOS, it is @kbd{@value{CTL}-z}.)
+
+@cindex files, input, See input files
+@cindex input files, running @command{awk} without
+@cindex @command{awk} programs, running, without input files
+As an example, the following program prints a friendly piece of advice
+(from Douglas Adams's @cite{The Hitchhiker's Guide to the Galaxy}),
+to keep you from worrying about the complexities of computer programming
+(@code{BEGIN} is a feature we haven't discussed yet):
+
+@example
+$ awk "BEGIN @{ print \"Don't Panic!\" @}"
+@print{} Don't Panic!
+@end example
+
+@cindex quoting
+@cindex double quote (@code{"})
+@cindex @code{"} (double quote)
+@cindex @code{\} (backslash)
+@cindex backslash (@code{\})
+This program does not read any input.  The @samp{\} before each of the
+inner double quotes is necessary because of the shell's quoting
+rules, in particular because the command mixes both single quotes and
+double quotes.@footnote{Although we generally recommend the use of single
+quotes around the program text, double quotes are needed here in order to
+put the single quote into the message.}
+
+This next simple @command{awk} program
+emulates the @command{cat} utility; it copies whatever you type on the
+keyboard to its standard output (why this works is explained shortly).
+
+@example
+$ awk '@{ print @}'
+Now is the time for all good men
+@print{} Now is the time for all good men
+to come to the aid of their country.
+@print{} to come to the aid of their country.
+Four score and seven years ago, ...
+@print{} Four score and seven years ago, ...
+What, me worry?
+@print{} What, me worry?
+@kbd{@value{CTL}-d}
+@end example
+
+@node Long
+@subsection Running Long Programs
+
+@cindex @command{awk} programs, running
+@cindex @command{awk} programs, lengthy
+@cindex files, @command{awk} programs in
+Sometimes your @command{awk} programs can be very long.  In this case, it is
+more convenient to put the program into a separate file.  In order to tell
+@command{awk} to use that file for its program, you type:
+
+@example
+awk -f @var{source-file} @var{input-file1} @var{input-file2} @dots{}
+@end example
+
+@cindex @code{-f} option
+@cindex command line, options
+@cindex options, command-line
+The @option{-f} instructs the @command{awk} utility to get the @command{awk} program
+from the file @var{source-file}.  Any @value{FN} can be used for
+@var{source-file}.  For example, you could put the program:
+
+@example
+BEGIN @{ print "Don't Panic!" @}
+@end example
+
+@noindent
+into the file @file{advice}.  Then this command:
+
+@example
+awk -f advice
+@end example
+
+@noindent
+does the same thing as this one:
+
+@example
+awk "BEGIN @{ print \"Don't Panic!\" @}"
+@end example
+
+@cindex quoting
+@noindent
+This was explained earlier
+(@pxref{Read Terminal}).
+Note that you don't usually need single quotes around the @value{FN} that you
+specify with @option{-f}, because most @value{FN}s don't contain any of the shell's
+special characters.  Notice that in @file{advice}, the @command{awk}
+program did not have single quotes around it.  The quotes are only needed
+for programs that are provided on the @command{awk} command line.
+
+@c STARTOFRANGE sq1x
+@cindex single quote (@code{'})
+@c STARTOFRANGE qs2x
+@cindex @code{'} (single quote)
+If you want to identify your @command{awk} program files clearly as such,
+you can add the extension @file{.awk} to the @value{FN}.  This doesn't
+affect the execution of the @command{awk} program but it does make
+``housekeeping'' easier.
+
+@node Executable Scripts
+@subsection Executable @command{awk} Programs
+@cindex @command{awk} programs
+@cindex @code{#} (number sign), @code{#!} (executable scripts)
+@cindex number sign (@code{#}), @code{#!} (executable scripts)
+@cindex Unix, @command{awk} scripts and
+@cindex @code{#} (number sign), @code{#!} (executable scripts), portability issues with
+@cindex number sign (@code{#}), @code{#!} (executable scripts), portability issues with
+
+Once you have learned @command{awk}, you may want to write self-contained
+@command{awk} scripts, using the @samp{#!} script mechanism.  You can do
+this on many Unix systems@footnote{The @samp{#!} mechanism works on
+Linux systems,
+systems derived from the 4.4-Lite Berkeley Software Distribution,
+and most commercial Unix systems.} as well as on the GNU system.
+For example, you could update the file @file{advice} to look like this:
+
+@example
+#! /bin/awk -f
+
+BEGIN @{ print "Don't Panic!" @}
+@end example
+
+@noindent
+After making this file executable (with the @command{chmod} utility),
+simply type @samp{advice}
+at the shell and the system arranges to run @command{awk}@footnote{The
+line beginning with @samp{#!} lists the full @value{FN} of an interpreter
+to run and an optional initial command-line argument to pass to that
+interpreter.  The operating system then runs the interpreter with the given
+argument and the full argument list of the executed program.  The first argument
+in the list is the full @value{FN} of the @command{awk} program.  The rest of the
+argument list contains either options to @command{awk}, or @value{DF}s,
+or both.} as if you had
+typed @samp{awk -f advice}:
+
+@example
+$ chmod +x advice
+$ advice
+@print{} Don't Panic!
+@end example
+
+@noindent
+(We assume you have the current directory in your shell's search
+path variable (typically @code{$PATH}).  If not, you may need
+to type @samp{./advice} at the shell.)
+
+Self-contained @command{awk} scripts are useful when you want to write a
+program that users can invoke without their having to know that the program is
+written in @command{awk}.
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Portability Issues with @samp{#!}
+@cindex portability, @code{#!} (executable scripts)
+
+Some systems limit the length of the interpreter name to 32 characters.
+Often, this can be dealt with by using a symbolic link.
+
+You should not put more than one argument on the @samp{#!}
+line after the path to @command{awk}. It does not work. The operating system
+treats the rest of the line as a single argument and passes it to @command{awk}.
+Doing this leads to confusing behavior---most likely a usage diagnostic
+of some sort from @command{awk}.
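+
+One way around this restriction, sketched here under the assumption
+that the extra arguments were only meant to set variables, is to make
+the assignments in a @code{BEGIN} rule instead (the variable name
+@code{greeting} is purely illustrative):
+
+@example
+#! /bin/awk -f
+
+BEGIN @{
+    greeting = "Don't Panic!"
+    print greeting
+@}
+@end example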
+
+@cindex @code{ARGC}/@code{ARGV} variables, portability and
+@cindex portability, @code{ARGV} variable
+Finally,
+the value of @code{ARGV[0]}
+(@pxref{Built-in Variables})
+varies depending upon your operating system.
+Some systems put @samp{awk} there, some put the full pathname
+of @command{awk} (such as @file{/bin/awk}), and some put the name
+of your script (@samp{advice}).  Don't rely on the value of @code{ARGV[0]}
+to provide your script name.
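+
+To see what your own system stores there, a quick check such as the
+following may help.  No output is shown, precisely because it varies
+from system to system:
+
+@example
+awk 'BEGIN @{ print ARGV[0] @}'
+@end example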
+
+@node Comments
+@subsection Comments in @command{awk} Programs
+@cindex @code{#} (number sign), commenting
+@cindex number sign (@code{#}), commenting
+@cindex commenting
+@cindex @command{awk} programs, documenting
+
+A @dfn{comment} is some text that is included in a program for the sake
+of human readers; it is not really an executable part of the program.  Comments
+can explain what the program does and how it works.  Nearly all
+programming languages have provisions for comments, as programs are
+typically hard to understand without them.
+
+In the @command{awk} language, a comment starts with the sharp sign
+character (@samp{#}) and continues to the end of the line.
+The @samp{#} does not have to be the first character on the line. The
+@command{awk} language ignores the rest of a line following a sharp sign.
+For example, we could have put the following into @file{advice}:
+
+@example
+# This program prints a nice friendly message.  It helps
+# keep novice users from being afraid of the computer.
+BEGIN    @{ print "Don't Panic!" @}
+@end example
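+
+As a small illustration of a comment that follows code on the same
+line, consider this variation on @file{advice}.  Note that a @samp{#}
+inside a string constant is ordinary text and does not start a comment:
+
+@example
+BEGIN @{ print "# Don't Panic!" @}   # only this part is a comment
+@end example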
+
+You can put comment lines into keyboard-composed throwaway @command{awk}
+programs, but this usually isn't very useful; the purpose of a
+comment is to help you or another person understand the program
+when reading it at a later time.
+
+@cindex quoting
+@cindex single quote (@code{'}), vs. apostrophe
+@cindex @code{'} (single quote), vs. apostrophe
+@strong{Caution:} As mentioned in
+@ref{One-shot},
+you can enclose small to medium programs in single quotes, in order to keep
+your shell scripts self-contained.  When doing so, @emph{don't} put
+an apostrophe (i.e., a single quote) into a comment (or anywhere else
+in your program). The shell interprets the quote as the closing
+quote for the entire program. As a result, usually the shell
+prints a message about mismatched quotes, and if @command{awk} actually
+runs, it will probably print strange messages about syntax errors.
+For example, look at the following:
+
+@example
+$ awk '@{ print "hello" @} # let's be cute'
+>
+@end example
+
+The shell sees that the first two quotes match, and that
+a new quoted object begins at the end of the command line.
+It therefore prompts with the secondary prompt, waiting for more input.
+With Unix @command{awk}, closing the quoted string produces this result:
+
+@example
+$ awk '@{ print "hello" @} # let's be cute'
+> '
+@error{} awk: can't open file be
+@error{}  source line number 1
+@end example
+
+@cindex @code{\} (backslash)
+@cindex backslash (@code{\})
+Putting a backslash before the single quote in @samp{let's} wouldn't help,
+since backslashes are not special inside single quotes.
+The next @value{SUBSECTION} describes the shell's quoting rules.
+
+@node Quoting
+@subsection Shell-Quoting Issues
+@cindex quoting, rules for
+
+For short to medium length @command{awk} programs, it is most convenient
+to enter the program on the @command{awk} command line.
+This is best done by enclosing the entire program in single quotes.
+This is true whether you are entering the program interactively at
+the shell prompt, or writing it as part of a larger shell script:
+
+@example
+awk '@var{program text}' @var{input-file1} @var{input-file2} @dots{}
+@end example
+
+@cindex shells, quoting, rules for
+@cindex Bourne shell, quoting rules for
+Once you are working with the shell, it is helpful to have a basic
+knowledge of shell quoting rules.  The following rules apply only to
+POSIX-compliant, Bourne-style shells (such as @command{bash}, the GNU Bourne-Again
+Shell).  If you use @command{csh}, you're on your own.
+
+@itemize @bullet
+@item
+Quoted items can be concatenated with nonquoted items as well as with other
+quoted items.  The shell turns everything into one argument for
+the command.
+
+@item
+Preceding any single character with a backslash (@samp{\}) quotes
+that character.  The shell removes the backslash and passes the quoted
+character on to the command.
+
+@item
+@cindex @code{\} (backslash)
+@cindex backslash (@code{\})
+@cindex single quote (@code{'})
+@cindex @code{'} (single quote)
+Single quotes protect everything between the opening and closing quotes.
+The shell does no interpretation of the quoted text, passing it on verbatim
+to the command.
+It is @emph{impossible} to embed a single quote inside single-quoted text.
+Refer back to
+@ref{Comments},
+for an example of what happens if you try.
+
+@item
+@cindex double quote (@code{"})
+@cindex @code{"} (double quote)
+Double quotes protect most things between the opening and closing quotes.
+The shell does at least variable and command substitution on the quoted text.
+Different shells may do additional kinds of processing on double-quoted text.
+
+Since certain characters within double-quoted text are processed by the shell,
+they must be @dfn{escaped} within the text.  Of note are the characters
+@samp{$}, @samp{`}, @samp{\}, and @samp{"}, all of which must be preceded by
+a backslash within double-quoted text if they are to be passed on literally
+to the program.  (The leading backslash is stripped first.)
+Thus, the example seen
+@ifnotinfo
+previously
+@end ifnotinfo
+in @ref{Read Terminal},
+is applicable:
+
+@example
+$ awk "BEGIN @{ print \"Don't Panic!\" @}"
+@print{} Don't Panic!
+@end example
+
+@cindex single quote (@code{'}), with double quotes
+@cindex @code{'} (single quote), with double quotes
+Note that the single quote is not special within double quotes.
+
+@item
+Null strings are removed when they occur as part of a non-null
+command-line argument, while explicit null objects are kept.
+For example, to specify that the field separator @code{FS} should
+be set to the null string, use:
+
+@example
+awk -F "" '@var{program}' @var{files} # correct
+@end example
+
+@noindent
+@cindex null strings, quoting and
+Don't use this:
+
+@example
+awk -F"" '@var{program}' @var{files}  # wrong!
+@end example
+
+@noindent
+In the second case, @command{awk} will attempt to use the text of the program
+as the value of @code{FS}, and the first @value{FN} as the text of the program!
+This results in syntax errors at best, and confusing behavior at worst.
+@end itemize
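+
+The difference between the two kinds of quotes can be seen in the
+following sketch, which assumes a Bourne-style shell and uses a
+hypothetical shell variable named @code{greeting}:
+
+@example
+$ greeting=hello
+$ awk "BEGIN @{ print \"$greeting\" @}"
+@print{} hello
+$ awk 'BEGIN @{ print "$greeting" @}'
+@print{} $greeting
+@end example
+
+@noindent
+Inside double quotes the shell replaces @samp{$greeting} before
+@command{awk} runs; inside single quotes the text reaches
+@command{awk} untouched.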
+
+@cindex quoting, tricks for
+Mixing single and double quotes is difficult.  You have to resort
+to shell quoting tricks, like this:
+
+@example
+$ awk 'BEGIN @{ print "Here is a single quote <'"'"'>" @}'
+@print{} Here is a single quote <'>
+@end example
+
+@noindent
+This program consists of three concatenated quoted strings.  The first and the
+third are single-quoted, the second is double-quoted.
+
+This can be ``simplified'' to:
+
+@example
+$ awk 'BEGIN @{ print "Here is a single quote <'\''>" @}'
+@print{} Here is a single quote <'>
+@end example
+
+@noindent
+Judge for yourself which of these two is the more readable.
+
+Another option is to use double quotes, escaping the embedded, @command{awk}-level
+double quotes:
+
+@example
+$ awk "BEGIN @{ print \"Here is a single quote <'>\" @}"
+@print{} Here is a single quote <'>
+@end example
+
+@noindent
+@c ENDOFRANGE sq1x
+@c ENDOFRANGE qs2x
+This option is also painful, because double quotes, backslashes, and dollar signs
+are very common in @command{awk} programs.
+
+If you really need both single and double quotes in your @command{awk}
+program, it is probably best to move it into a separate file, where
+the shell won't be part of the picture, and you can say what you mean.
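+
+A further possibility, sketched here on the assumption that your
+@command{awk} accepts the POSIX @option{-v} option for assigning
+variables, is to pass the quote character in through a variable
+(the name @code{sq} is arbitrary):
+
+@example
+$ awk -v sq="'" 'BEGIN @{ print "Here is a single quote <" sq ">" @}'
+@print{} Here is a single quote <'>
+@end example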
+
+@node Sample Data Files
+@section @value{DDF}s for the Examples
+@c For gawk >= 3.2, update these data files. No-one has such slow modems!
+
+@cindex input files, examples
+@cindex @code{BBS-list} file
+Many of the examples in this @value{DOCUMENT} take their input from two sample
+@value{DF}s.  The first, @file{BBS-list}, represents a list of
+computer bulletin board systems together with information about those systems.
+The second @value{DF}, called @file{inventory-shipped}, contains
+information about monthly shipments.  In both files,
+each line is considered to be one @dfn{record}.
+
+In the @value{DF} @file{BBS-list}, each record contains the name of a computer
+bulletin board, its phone number, the board's baud rate(s), and a code for
+the number of hours it is operational.  An @samp{A} in the last column
+means the board operates 24 hours a day.  A @samp{B} in the last
+column means the board operates only during evening and weekend hours.
+A @samp{C} means the board operates only on weekends:
+
+@c 2e: Update the baud rates to reflect today's faster modems
+@example
+@c system if test ! -d eg      ; then mkdir eg      ; fi
+@c system if test ! -d eg/lib  ; then mkdir eg/lib  ; fi
+@c system if test ! -d eg/data ; then mkdir eg/data ; fi
+@c system if test ! -d eg/prog ; then mkdir eg/prog ; fi
+@c system if test ! -d eg/misc ; then mkdir eg/misc ; fi
+@c file eg/data/BBS-list
+aardvark     555-5553     1200/300          B
+alpo-net     555-3412     2400/1200/300     A
+barfly       555-7685     1200/300          A
+bites        555-1675     2400/1200/300     A
+camelot      555-0542     300               C
+core         555-2912     1200/300          C
+fooey        555-1234     2400/1200/300     B
+foot         555-6699     1200/300          B
+macfoo       555-6480     1200/300          A
+sdace        555-3430     2400/1200/300     A
+sabafoo      555-2127     1200/300          C
+@c endfile
+@end example
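+
+To illustrate the layout, the following sketch prints the names of the
+boards that operate 24 hours a day.  It relies on field references
+such as @code{$1} and @code{$4}, which are described later:
+
+@example
+$ awk '$4 == "A" @{ print $1 @}' BBS-list
+@print{} alpo-net
+@print{} barfly
+@print{} bites
+@print{} macfoo
+@print{} sdace
+@end example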
+
+@cindex @code{inventory-shipped} file
+The @value{DF} @file{inventory-shipped} represents
+information about shipments during the year.
+Each record contains the month, the number
+of green crates shipped, the number of red boxes shipped, the number of
+orange bags shipped, and the number of blue packages shipped,
+respectively.  There are 16 entries, covering the 12 months of last year
+and the first four months of the current year.
+
+@example
+@c file eg/data/inventory-shipped
+Jan  13  25  15 115
+Feb  15  32  24 226
+Mar  15  24  34 228
+Apr  31  52  63 420
+May  16  34  29 208
+Jun  31  42  75 492
+Jul  24  34  67 436
+Aug  15  34  47 316
+Sep  13  55  37 277
+Oct  29  54  68 525
+Nov  20  87  82 577
+Dec  17  35  61 401
+
+Jan  21  36  64 620
+Feb  26  58  80 652
+Mar  24  75  70 495
+Apr  21  70  74 514
+@c endfile
+@end example
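+
+As a quick check on the count of entries, this sketch counts the
+records that have at least one field, which skips the blank line
+separating the two years:
+
+@example
+$ awk 'NF > 0 @{ n++ @} END @{ print n @}' inventory-shipped
+@print{} 16
+@end example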
+
+@ifinfo
+If you are reading this in GNU Emacs using Info, you can copy the regions
+of text showing these sample files into your own test files.  This way you
+can try out the examples shown in the remainder of this document.  You do
+this by using the command @kbd{M-x write-region} to copy text from the Info
+file into a file for use with @command{awk}
+(@pxref{Misc File Ops, , Miscellaneous File Operations, emacs, GNU Emacs Manual},
+for more information).  Using this information, create your own
+@file{BBS-list} and @file{inventory-shipped} files and practice what you
+learn in this @value{DOCUMENT}.
+
+@cindex Texinfo
+If you are using the stand-alone version of Info,
+see @ref{Extract Program},
+for an @command{awk} program that extracts these @value{DF}s from
+@file{gawk.texi}, the Texinfo source file for this Info file.
+@end ifinfo
+
+@node Very Simple
+@section Some Simple Examples
+
+The following command runs a simple @command{awk} program that searches the
+input file @file{BBS-list} for the character string @samp{foo} (a
+grouping of characters is usually called a @dfn{string};
+the term @dfn{string} is based on similar usage in English, such
+as ``a string of pearls,'' or ``a string of cars in a train''):
+
+@example
+awk '/foo/ @{ print $0 @}' BBS-list
+@end example
+
+@noindent
+When lines containing @samp{foo} are found, they are printed because
+@w{@samp{print $0}} means print the current line.  (Just @samp{print} by
+itself means the same thing, so we could have written that
+instead.)
+
+You will notice that slashes (@samp{/}) surround the string @samp{foo}
+in the @command{awk} program.  The slashes indicate that @samp{foo}
+is the pattern to search for.  This type of pattern is called a
+@dfn{regular expression}, which is covered in more detail later
+(@pxref{Regexp}).
+The pattern is allowed to match parts of words.
+There are
+single quotes around the @command{awk} program so that the shell won't
+interpret any of it as special shell characters.
+
+Here is what this program prints:
+
+@example
+$ awk '/foo/ @{ print $0 @}' BBS-list
+@print{} fooey        555-1234     2400/1200/300     B
+@print{} foot         555-6699     1200/300          B
+@print{} macfoo       555-6480     1200/300          A
+@print{} sabafoo      555-2127     1200/300          C
+@end example
+
+@cindex actions, default
+@cindex patterns, default
+In an @command{awk} rule, either the pattern or the action can be omitted,
+but not both.  If the pattern is omitted, then the action is performed
+for @emph{every} input line.  If the action is omitted, the default
+action is to print all lines that match the pattern.
+
+@cindex actions, empty
+Thus, we could leave out the action (the @code{print} statement and the curly
+braces) in the previous example and the result would be the same: all
+lines matching the pattern @samp{foo} are printed.  By comparison,
+omitting the @code{print} statement but retaining the curly braces makes an
+empty action that does nothing (i.e., no lines are printed).
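+
+The two cases can be compared directly.  The first command below
+prints the same four lines as before; the second, with its empty
+action, prints nothing:
+
+@example
+$ awk '/foo/' BBS-list
+@print{} fooey        555-1234     2400/1200/300     B
+@print{} foot         555-6699     1200/300          B
+@print{} macfoo       555-6480     1200/300          A
+@print{} sabafoo      555-2127     1200/300          C
+$ awk '/foo/ @{ @}' BBS-list
+@end example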
+
+@cindex @command{awk} programs, one-line examples
+Many practical @command{awk} programs are just a line or two.  Following is a
+collection of useful, short programs to get you started.  Some of these
+programs contain constructs that haven't been covered yet. (The description
+of the program will give you a good idea of what is going on, but please
+read the rest of the @value{DOCUMENT} to become an @command{awk} expert!)
+Most of the examples use a @value{DF} named @file{data}.  This is just a
+placeholder; if you use these programs yourself, substitute
+your own @value{FN}s for @file{data}.
+For future reference, note that there is often more than
+one way to do things in @command{awk}.  At some point, you may want
+to look back at these examples and see if
+you can come up with different ways to do the same things shown here:
+
+@itemize @bullet
+@item
+Print the length of the longest input line:
+
+@example
+awk '@{ if (length($0) > max) max = length($0) @}
+     END @{ print max @}' data
+@end example
+
+@item
+Print every line that is longer than 80 characters:
+
+@example
+awk 'length($0) > 80' data
+@end example
+
+The sole rule has a relational expression as its pattern and it has no
+action---so the default action, printing the record, is used.
+
+@cindex @command{expand} utility
+@item
+Print the length of the longest line in @file{data}:
+
+@example
+expand data | awk '@{ if (x < length()) x = length() @}
+              END @{ print "maximum line length is " x @}'
+@end example
+
+The input is processed by the @command{expand} utility to change tabs
+into spaces, so the widths compared are actually the right-margin columns.
+
+@item
+Print every line that has at least one field:
+
+@example
+awk 'NF > 0' data
+@end example
+
+This is an easy way to delete blank lines from a file (or rather, to
+create a new file similar to the old file but from which the blank lines
+have been removed).
+
+@item
+Print seven random numbers from 0 to 100, inclusive:
+
+@example
+awk 'BEGIN @{ for (i = 1; i <= 7; i++)
+                 print int(101 * rand()) @}'
+@end example
+
+@item
+Print the total number of bytes used by @var{files}:
+
+@example
+ls -l @var{files} | awk '@{ x += $5 @}
+                  END @{ print "total bytes: " x @}'
+@end example
+
+@item
+Print the total number of kilobytes used by @var{files}:
+
+@c Don't use \ continuation, not discussed yet
+@example
+ls -l @var{files} | awk '@{ x += $5 @}
+   END @{ print "total K-bytes: " (x + 1023)/1024 @}'
+@end example
+
+@item
+Print a sorted list of the login names of all users:
+
+@example
+awk -F: '@{ print $1 @}' /etc/passwd | sort
+@end example
+
+@item
+Count the lines in a file:
+
+@example
+awk 'END @{ print NR @}' data
+@end example
+
+@item
+Print the even-numbered lines in the @value{DF}:
+
+@example
+awk 'NR % 2 == 0' data
+@end example
+
+If you use the expression @samp{NR % 2 == 1} instead,
+the program would print the odd-numbered lines.
+@end itemize
+
+@node Two Rules
+@section An Example with Two Rules
+@cindex @command{awk} programs
+
+The @command{awk} utility reads the input files one line at a
+time.  For each line, @command{awk} tries the patterns of each of the rules.
+If several patterns match, then several actions are run in the order in
+which they appear in the @command{awk} program.  If no patterns match, then
+no actions are run.
+
+After processing all the rules that match the line (and perhaps there are none),
+@command{awk} reads the next line.  (However,
+@pxref{Next Statement},
+and also @pxref{Nextfile Statement}).
+This continues until the program reaches the end of the input.
+For example, the following @command{awk} program contains two rules:
+
+@example
+/12/  @{ print $0 @}
+/21/  @{ print $0 @}
+@end example
+
+@noindent
+The first rule has the string @samp{12} as the
+pattern and @samp{print $0} as the action.  The second rule has the
+string @samp{21} as the pattern and also has @samp{print $0} as the
+action.  Each rule's action is enclosed in its own pair of braces.
+
+This program prints every line that contains the string
+@samp{12} @emph{or} the string @samp{21}.  If a line contains both
+strings, it is printed twice, once by each rule.
+
+This is what happens if we run this program on our two sample @value{DF}s,
+@file{BBS-list} and @file{inventory-shipped}:
+
+@example
+$ awk '/12/ @{ print $0 @}
+>      /21/ @{ print $0 @}' BBS-list inventory-shipped
+@print{} aardvark     555-5553     1200/300          B
+@print{} alpo-net     555-3412     2400/1200/300     A
+@print{} barfly       555-7685     1200/300          A
+@print{} bites        555-1675     2400/1200/300     A
+@print{} core         555-2912     1200/300          C
+@print{} fooey        555-1234     2400/1200/300     B
+@print{} foot         555-6699     1200/300          B
+@print{} macfoo       555-6480     1200/300          A
+@print{} sdace        555-3430     2400/1200/300     A
+@print{} sabafoo      555-2127     1200/300          C
+@print{} sabafoo      555-2127     1200/300          C
+@print{} Jan  21  36  64 620
+@print{} Apr  21  70  74 514
+@end example
+
+@noindent
+Note how the line beginning with @samp{sabafoo}
+in @file{BBS-list} was printed twice, once for each rule.
+
+@node More Complex
+@section A More Complex Example
+
+Now that we've mastered some simple tasks, let's look at
+what typical @command{awk}
+programs do.  This example shows how @command{awk} can be used to
+summarize, select, and rearrange the output of another utility.  It uses
+features that haven't been covered yet, so don't worry if you don't
+understand all the details:
+
+@example
+ls -l | awk '$6 == "Nov" @{ sum += $5 @}
+             END @{ print sum @}'
+@end example
+
+@cindex @command{csh} utility, backslash continuation and
+@cindex @command{ls} utility
+@cindex backslash (@code{\}), continuing lines and, in @command{csh}
+@cindex @code{\} (backslash), continuing lines and, in @command{csh}
+This command prints the total number of bytes in all the files in the
+current directory that were last modified in November (of any year).
+@footnote{In the C shell (@command{csh}), you need to type
+a semicolon and then a backslash at the end of the first line; see
+@ref{Statements/Lines}, for an
+explanation.  In a POSIX-compliant shell, such as the Bourne
+shell or @command{bash}, you can type the example as shown.  If the command
+@samp{echo $path} produces an empty output line, you are most likely
+using a POSIX-compliant shell.  Otherwise, you are probably using the
+C shell or a shell derived from it.}
+The @w{@samp{ls -l}} part of this example is a system command that gives
+you a listing of the files in a directory, including each file's size and the date
+the file was last modified. Its output looks like this:
+
+@example
+-rw-r--r--  1 arnold   user   1933 Nov  7 13:05 Makefile
+-rw-r--r--  1 arnold   user  10809 Nov  7 13:03 awk.h
+-rw-r--r--  1 arnold   user    983 Apr 13 12:14 awk.tab.h
+-rw-r--r--  1 arnold   user  31869 Jun 15 12:20 awk.y
+-rw-r--r--  1 arnold   user  22414 Nov  7 13:03 awk1.c
+-rw-r--r--  1 arnold   user  37455 Nov  7 13:03 awk2.c
+-rw-r--r--  1 arnold   user  27511 Dec  9 13:07 awk3.c
+-rw-r--r--  1 arnold   user   7989 Nov  7 13:03 awk4.c
+@end example
+
+@noindent
+@cindex line continuations, with C shell
+The first field contains read-write permissions, the second field contains
+the number of links to the file, and the third field identifies the owner of
+the file. The fourth field identifies the group of the file.
+The fifth field contains the size of the file in bytes.  The
+sixth, seventh, and eighth fields contain the month, day, and time,
+respectively, that the file was last modified.  Finally, the ninth field
+contains the name of the file.@footnote{On some
+very old systems, you may need to use @samp{ls -lg} to get this output.}
+
+@c @cindex automatic initialization
+@cindex initialization, automatic
+The @samp{$6 == "Nov"} in our @command{awk} program is an expression that
+tests whether the sixth field of the output from @w{@samp{ls -l}}
+matches the string @samp{Nov}.  Each time a line has the string
+@samp{Nov} for its sixth field, the action @samp{sum += $5} is
+performed.  This adds the fifth field (the file's size) to the variable
+@code{sum}.  As a result, when @command{awk} has finished reading all the
+input lines, @code{sum} is the total of the sizes of the files whose
+lines matched the pattern.  (This works because @command{awk} variables
+are automatically initialized to zero.)
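+
+A minimal sketch of this automatic initialization, using a made-up
+variable name @code{n} that is never assigned before its first use:
+
+@example
+$ awk 'BEGIN @{ n += 5; print n @}'
+@print{} 5
+@end example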
+
+After the last line of output from @command{ls} has been processed, the
+@code{END} rule executes and prints the value of @code{sum}.
+In this example, the value of @code{sum} is 80600.
+
+These more advanced @command{awk} techniques are covered in later sections
+(@pxref{Action Overview}).  Before you can move on to more
+advanced @command{awk} programming, you have to know how @command{awk} interprets
+your input and displays your output.  By manipulating fields and using
+@code{print} statements, you can produce some very useful and
+impressive-looking reports.
+
+@node Statements/Lines
+@section @command{awk} Statements Versus Lines
+@cindex line breaks
+@cindex newlines
+
+Most often, each line in an @command{awk} program is a separate statement or
+separate rule, like this:
+
+@example
+awk '/12/  @{ print $0 @}
+     /21/  @{ print $0 @}' BBS-list inventory-shipped
+@end example
+
+@cindex @command{gawk}, newlines in
+However, @command{gawk} ignores newlines after any of the following
+symbols and keywords:
+
+@example
+,    @{    ?    :    ||    &&    do    else
+@end example
+
+@noindent
+A newline at any other point is considered the end of the
+statement.@footnote{The @samp{?} and @samp{:} referred to here are the
+two parts of the three-operand conditional expression described in
+@ref{Conditional Exp}.
+Splitting lines after @samp{?} and @samp{:} is a minor @command{gawk}
+extension; if @option{--posix} is specified
+(@pxref{Options}), then this extension is disabled.}
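+
+As a small sketch, the following program breaks lines after
+@samp{@{}, after @samp{&&}, and after a comma without ending the
+statement (the variable name @code{ok} is made up for illustration):
+
+@example
+$ awk 'BEGIN @{
+>     ok = 1 < 2 &&
+>          2 < 3
+>     print "result:",
+>           ok
+> @}'
+@print{} result: 1
+@end example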
+
+@cindex @code{\} (backslash), continuing lines and
+@cindex backslash (@code{\}), continuing lines and
+If you would like to split a single statement into two lines at a point
+where a newline would terminate it, you can @dfn{continue} it by ending the
+first line with a backslash character (@samp{\}).  The backslash must be
+the final character on the line in order to be recognized as a continuation
+character.  A backslash followed by a newline is allowed anywhere in the
+statement, even in the middle of a string or regular expression.  For example:
+
+@example
+awk '/This regular expression is too long, so continue it\
+ on the next line/ @{ print $1 @}'
+@end example
+
+@noindent
+@cindex portability, backslash continuation and
+We have generally not used backslash continuation in the sample programs
+in this @value{DOCUMENT}.  In @command{gawk}, there is no limit on the
+length of a line, so backslash continuation is never strictly necessary;
+it just makes programs more readable.  For this same reason, as well as
+for clarity, we have kept most statements short in the sample programs
+presented throughout the @value{DOCUMENT}.  Backslash continuation is
+most useful when your @command{awk} program is in a separate source file
+instead of entered from the command line.  You should also note that
+many @command{awk} implementations are more particular about where you
+may use backslash continuation. For example, they may not allow you to
+split a string constant using backslash continuation.  Thus, for maximum
+portability of your @command{awk} programs, it is best not to split your
+lines in the middle of a regular expression or a string.
+@c 10/2000: gawk, mawk, and current bell labs awk allow it,
+@c solaris 2.7 nawk does not. Solaris /usr/xpg4/bin/awk does though!  sigh.
+
+@cindex @command{csh} utility
+@cindex backslash (@code{\}), continuing lines and, in @command{csh}
+@cindex @code{\} (backslash), continuing lines and, in @command{csh}
+@strong{Caution:} @emph{Backslash continuation does not work as described
+with the C shell.}  It works for @command{awk} programs in files and
+for one-shot programs, @emph{provided} you are using a POSIX-compliant
+shell, such as the Unix Bourne shell or @command{bash}.  But the C shell behaves
+differently!  There, you must use two backslashes in a row, followed by
+a newline.  Note also that when using the C shell, @emph{every} newline
+in your @command{awk} program must be escaped with a backslash. To illustrate:
+
+@example
+% awk 'BEGIN @{ \
+?   print \\
+?       "hello, world" \
+? @}'
+@print{} hello, world
+@end example
+
+@noindent
+Here, the @samp{%} and @samp{?} are the C shell's primary and secondary
+prompts, analogous to the standard shell's @samp{$} and @samp{>}.
+
+Compare the previous example to how it is done with a POSIX-compliant shell:
+
+@example
+$ awk 'BEGIN @{
+>   print \
+>       "hello, world"
+> @}'
+@print{} hello, world
+@end example
+
+@command{awk} is a line-oriented language.  Each rule's action has to
+begin on the same line as the pattern.  To have the pattern and action
+on separate lines, you @emph{must} use backslash continuation; there
+is no other option.
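+
+The following sketch shows the difference on two made-up input lines,
+assuming a POSIX-compliant shell.  In the first command, the backslash
+joins the action to the pattern.  In the second, the newline ends the
+first rule, so @command{awk} sees two rules: a pattern with the default
+action, and an action that runs for every line:
+
+@example
+$ printf 'foo\nbar\n' | awk '/foo/ \
+> @{ print "found" @}'
+@print{} found
+$ printf 'foo\nbar\n' | awk '/foo/
+> @{ print "found" @}'
+@print{} foo
+@print{} found
+@print{} found
+@end example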
+
+@cindex backslash (@code{\}), continuing lines and, comments and
+@cindex @code{\} (backslash), continuing lines and, comments and
+@cindex commenting, backslash continuation and
+Another thing to keep in mind is that backslash continuation and
+comments do not mix. As soon as @command{awk} sees the @samp{#} that
+starts a comment, it ignores @emph{everything} on the rest of the
+line. For example:
+
+@example
+$ gawk 'BEGIN @{ print "dont panic" # a friendly \
+>                                    BEGIN rule
+> @}'
+@error{} gawk: cmd. line:2:                BEGIN rule
+@error{} gawk: cmd. line:2:                ^ parse error
+@end example
+
+@noindent
+In this case, it looks like the backslash would continue the comment onto the
+next line. However, the backslash-newline combination is never even
+noticed because it is ``hidden'' inside the comment. Thus, the
+@code{BEGIN} is noted as a syntax error.
+
+@cindex statements, multiple
+@cindex @code{;} (semicolon)
+@cindex semicolon (@code{;})
+When @command{awk} statements within one rule are short, you might want to put
+more than one of them on a line.  This is accomplished by separating the statements
+with a semicolon (@samp{;}).
+This also applies to the rules themselves.
+Thus, the program shown at the start of this @value{SECTION}
+could also be written this way:
+
+@example
+/12/ @{ print $0 @} ; /21/ @{ print $0 @}
+@end example
+
+@noindent
+@strong{Note:} The requirement that rules on the same line be
+separated with a semicolon was not in the original @command{awk}
+language; it was added for consistency with the treatment of statements
+within an action.
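+
+The same separator works between statements inside a single action.
+A brief sketch, with made-up variable names @code{x} and @code{y}:
+
+@example
+$ awk 'BEGIN @{ x = 1; y = 2; print x + y @}'
+@print{} 3
+@end example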
+
+@node Other Features
+@section Other Features of @command{awk}
+
+@cindex variables
+The @command{awk} language provides a number of predefined, or
+@dfn{built-in}, variables that your programs can use to get information
+from @command{awk}.  There are other variables your program can set
+as well to control how @command{awk} processes your data.
+
+In addition, @command{awk} provides a number of built-in functions for doing
+common computational and string-related operations.
+@command{gawk} provides built-in functions for working with timestamps,
+performing bit manipulation, and translating strings at runtime.
+
+As we develop our presentation of the @command{awk} language, we introduce
+most of the variables and many of the functions. They are defined
+systematically in @ref{Built-in Variables}, and
+@ref{Built-in}.
+
+@node When
+@section When to Use @command{awk}
+
+@cindex @command{awk}, uses for
+Now that you've seen some of what @command{awk} can do,
+you might wonder how @command{awk} could be useful for you.  By using
+utility programs, advanced patterns, field separators, arithmetic
+statements, and other selection criteria, you can produce much more
+complex output.  The @command{awk} language is very useful for producing
+reports from large amounts of raw data, such as summarizing information
+from the output of other utility programs like @command{ls}.
+(@xref{More Complex}.)
+
+Programs written with @command{awk} are usually much smaller than they would
+be in other languages.  This makes @command{awk} programs easy to compose and
+use.  Often, @command{awk} programs can be quickly composed at your terminal,
+used once, and thrown away.  Because @command{awk} programs are interpreted, you
+can avoid the (usually lengthy) compilation part of the typical
+edit-compile-test-debug cycle of software development.
+
+Complex programs have been written in @command{awk}, including a complete
+retargetable assembler for eight-bit microprocessors (@pxref{Glossary}, for
+more information), and a microcode assembler for a special-purpose Prolog
+computer.  However, @command{awk}'s capabilities are strained by tasks of
+such complexity.
+
+@cindex @command{awk} programs, complex
+If you find yourself writing @command{awk} scripts of more than, say, a few
+hundred lines, you might consider using a different programming
+language.  Emacs Lisp is a good choice if you need sophisticated string
+or pattern matching capabilities.  The shell is also good at string and
+pattern matching; in addition, it allows powerful use of the system
+utilities.  More conventional languages, such as C, C++, and Java, offer
+better facilities for system programming and for managing the complexity
+of large programs.  Programs in these languages may require more lines
+of source code than the equivalent @command{awk} programs, but they are
+easier to maintain and usually run more efficiently.
+
+@node Regexp
+@chapter Regular Expressions
+@cindex regexp, See regular expressions
+@c STARTOFRANGE regexp
+@cindex regular expressions
+
+A @dfn{regular expression}, or @dfn{regexp}, is a way of describing a
+set of strings.
+Because regular expressions are such a fundamental part of @command{awk}
+programming, their format and use deserve a separate @value{CHAPTER}.
+
+@cindex forward slash (@code{/})
+@cindex @code{/} (forward slash)
+A regular expression enclosed in slashes (@samp{/})
+is an @command{awk} pattern that matches every input record whose text
+belongs to that set.
+The simplest regular expression is a sequence of letters, numbers, or
+both.  Such a regexp matches any string that contains that sequence.
+Thus, the regexp @samp{foo} matches any string containing @samp{foo}.
+Therefore, the pattern @code{/foo/} matches any input record containing
+the three characters @samp{foo} @emph{anywhere} in the record.  Other
+kinds of regexps let you specify more complicated classes of strings.
+
+@ifnotinfo
+Initially, the examples in this @value{CHAPTER} are simple.
+As we explain more about how
+regular expressions work, we will present more complicated instances.
+@end ifnotinfo
+
+@menu
+* Regexp Usage::                How to Use Regular Expressions.
+* Escape Sequences::            How to write nonprinting characters.
+* Regexp Operators::            Regular Expression Operators.
+* Character Lists::             What can go between @samp{[...]}.
+* GNU Regexp Operators::        Operators specific to GNU software.
+* Case-sensitivity::            How to do case-insensitive matching.
+* Leftmost Longest::            How much text matches.
+* Computed Regexps::            Using Dynamic Regexps.
+* Locales::                     How the locale affects things.
+@end menu
+
+@node Regexp Usage
+@section How to Use Regular Expressions
+
+@cindex regular expressions, as patterns
+A regular expression can be used as a pattern by enclosing it in
+slashes.  Then the regular expression is tested against the
+entire text of each record.  (Normally, it only needs
+to match some part of the text in order to succeed.)  For example, the
+following prints the second field of each record that contains the string
+@samp{foo} anywhere in it:
+
+@example
+$ awk '/foo/ @{ print $2 @}' BBS-list
+@print{} 555-1234
+@print{} 555-6699
+@print{} 555-6480
+@print{} 555-2127
+@end example
+
+@cindex regular expressions, operators
+@cindex operators, string-matching
+@c @cindex operators, @code{~}
+@cindex string-matching operators
+@cindex @code{~} (tilde), @code{~} operator
+@cindex tilde (@code{~}), @code{~} operator
+@cindex @code{!} (exclamation point), @code{!~} operator
+@cindex exclamation point (@code{!}), @code{!~} operator
+@c @cindex operators, @code{!~}
+@cindex @code{if} statement
+@cindex @code{while} statement
+@cindex @code{do}-@code{while} statement
+@c @cindex statements, @code{if}
+@c @cindex statements, @code{while}
+@c @cindex statements, @code{do}
+Regular expressions can also be used in matching expressions.  These
+expressions allow you to specify the string to match against; it need
+not be the entire current input record.  The two operators @samp{~}
+and @samp{!~} perform regular expression comparisons.  Expressions
+using these operators can be used as patterns, or in @code{if},
+@code{while}, @code{for}, and @code{do} statements.
+(@xref{Statements}.)
+For example:
+
+@example
+@var{exp} ~ /@var{regexp}/
+@end example
+
+@noindent
+is true if the expression @var{exp} (taken as a string)
+matches @var{regexp}.  The following example matches, or selects,
+all input records with the uppercase letter @samp{J} somewhere in the
+first field:
+
+@example
+$ awk '$1 ~ /J/' inventory-shipped
+@print{} Jan  13  25  15 115
+@print{} Jun  31  42  75 492
+@print{} Jul  24  34  67 436
+@print{} Jan  21  36  64 620
+@end example
+
+So does this:
+
+@example
+awk '@{ if ($1 ~ /J/) print @}' inventory-shipped
+@end example
+
+This next example is true if the expression @var{exp}
+(taken as a character string)
+does @emph{not} match @var{regexp}:
+
+@example
+@var{exp} !~ /@var{regexp}/
+@end example
+
+The following example matches,
+or selects, all input records whose first field @emph{does not} contain
+the uppercase letter @samp{J}:
+
+@example
+$ awk '$1 !~ /J/' inventory-shipped
+@print{} Feb  15  32  24 226
+@print{} Mar  15  24  34 228
+@print{} Apr  31  52  63 420
+@print{} May  16  34  29 208
+@dots{}
+@end example
+
+@cindex regexp constants
+@cindex regular expressions, constants, See regexp constants
+When a regexp is enclosed in slashes, such as @code{/foo/}, we call it
+a @dfn{regexp constant}, much like @code{5.27} is a numeric constant and
+@code{"foo"} is a string constant.
+
+@node Escape Sequences
+@section Escape Sequences
+
+@cindex escape sequences
+@cindex backslash (@code{\}), in escape sequences
+@cindex @code{\} (backslash), in escape sequences
+Some characters cannot be included literally in string constants
+(@code{"foo"}) or regexp constants (@code{/foo/}).
+Instead, they should be represented with @dfn{escape sequences},
+which are character sequences beginning with a backslash (@samp{\}).
+One use of an escape sequence is to include a double-quote character in
+a string constant.  Because a plain double quote ends the string, you
+must use @samp{\"} to represent an actual double-quote character as a
+part of the string.  For example:
+
+@example
+$ awk 'BEGIN @{ print "He said \"hi!\" to her." @}'
+@print{} He said "hi!" to her.
+@end example
+
+The backslash character itself is another character that cannot be
+included normally; you must write @samp{\\} to put one backslash in the
+string or regexp.  Thus, the string whose contents are the two characters
+@samp{"} and @samp{\} must be written @code{"\"\\"}.
+
+Backslash also represents unprintable characters
+such as TAB or newline.  While there is nothing to stop you from entering most
+unprintable characters directly in a string constant or regexp constant,
+they may look ugly.
+
+The following table lists
+all the escape sequences used in @command{awk} and
+what they represent. Unless noted otherwise, all these escape
+sequences apply to both string constants and regexp constants:
+
+@table @code
+@item \\
+A literal backslash, @samp{\}.
+
+@c @cindex @command{awk} language, V.4 version
+@cindex @code{\} (backslash), @code{\a} escape sequence
+@cindex backslash (@code{\}), @code{\a} escape sequence
+@item \a
+The ``alert'' character, @kbd{@value{CTL}-g}, ASCII code 7 (BEL).
+(This usually makes some sort of audible noise.)
+
+@cindex @code{\} (backslash), @code{\b} escape sequence
+@cindex backslash (@code{\}), @code{\b} escape sequence
+@item \b
+Backspace, @kbd{@value{CTL}-h}, ASCII code 8 (BS).
+
+@cindex @code{\} (backslash), @code{\f} escape sequence
+@cindex backslash (@code{\}), @code{\f} escape sequence
+@item \f
+Formfeed, @kbd{@value{CTL}-l}, ASCII code 12 (FF).
+
+@cindex @code{\} (backslash), @code{\n} escape sequence
+@cindex backslash (@code{\}), @code{\n} escape sequence
+@item \n
+Newline, @kbd{@value{CTL}-j}, ASCII code 10 (LF).
+
+@cindex @code{\} (backslash), @code{\r} escape sequence
+@cindex backslash (@code{\}), @code{\r} escape sequence
+@item \r
+Carriage return, @kbd{@value{CTL}-m}, ASCII code 13 (CR).
+
+@cindex @code{\} (backslash), @code{\t} escape sequence
+@cindex backslash (@code{\}), @code{\t} escape sequence
+@item \t
+Horizontal TAB, @kbd{@value{CTL}-i}, ASCII code 9 (HT).
+
+@c @cindex @command{awk} language, V.4 version
+@cindex @code{\} (backslash), @code{\v} escape sequence
+@cindex backslash (@code{\}), @code{\v} escape sequence
+@item \v
+Vertical tab, @kbd{@value{CTL}-k}, ASCII code 11 (VT).
+
+@cindex @code{\} (backslash), @code{\}@var{nnn} escape sequence
+@cindex backslash (@code{\}), @code{\}@var{nnn} escape sequence
+@item \@var{nnn}
+The octal value @var{nnn}, where @var{nnn} stands for 1 to 3 digits
+between @samp{0} and @samp{7}.  For example, the code for the ASCII ESC
+(escape) character is @samp{\033}.
+
+@c @cindex @command{awk} language, V.4 version
+@c @cindex @command{awk} language, POSIX version
+@cindex @code{\} (backslash), @code{\x} escape sequence
+@cindex backslash (@code{\}), @code{\x} escape sequence
+@item \x@var{hh}@dots{}
+The hexadecimal value @var{hh}, where @var{hh} stands for a sequence
+of hexadecimal digits (@samp{0}--@samp{9}, and either @samp{A}--@samp{F}
+or @samp{a}--@samp{f}).  Like the same construct
+in ISO C, the escape sequence continues until the first nonhexadecimal
+digit is seen.  However, using more than two hexadecimal digits produces
+undefined results. (The @samp{\x} escape sequence is not allowed in
+POSIX @command{awk}.)
+
+@cindex @code{\} (backslash), @code{\/} escape sequence
+@cindex backslash (@code{\}), @code{\/} escape sequence
+@item \/
+A literal slash (necessary for regexp constants only).
+This expression is used when you want to write a regexp
+constant that contains a slash. Because the regexp is delimited by
+slashes, you need to escape the slash that is part of the pattern,
+in order to tell @command{awk} to keep processing the rest of the regexp.
+
+@cindex @code{\} (backslash), @code{\"} escape sequence
+@cindex backslash (@code{\}), @code{\"} escape sequence
+@item \"
+A literal double quote (necessary for string constants only).
+This expression is used when you want to write a string
+constant that contains a double quote. Because the string is delimited by
+double quotes, you need to escape the quote that is part of the string,
+in order to tell @command{awk} to keep processing the rest of the string.
+@end table
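+
+As a short sketch of two entries from the table, the first command
+below builds a string from octal escapes (assuming an ASCII-based
+character set, where octal 101 through 103 are @samp{A} through
+@samp{C}), and the second uses @samp{\/} inside a regexp constant.
+The input text is made up for illustration:
+
+@example
+$ awk 'BEGIN @{ print "\101\102\103" @}'
+@print{} ABC
+$ echo 'a/b' | awk '/a\/b/ @{ print "matched" @}'
+@print{} matched
+@end example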
+
+In @command{gawk}, a number of additional two-character sequences that begin
+with a backslash have special meaning in regexps.
+@xref{GNU Regexp Operators}.
+
+In a regexp, a backslash before any character that is not in the previous list
+and not listed in
+@ref{GNU Regexp Operators},
+means that the next character should be taken literally, even if it would
+normally be a regexp operator.  For example, @code{/a\+b/} matches the three
+characters @samp{a+b}.
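+
+To see the effect, the following sketch runs both the escaped and the
+unescaped regexp over two made-up lines.  With the backslash, only the
+literal text @samp{a+b} matches; without it, @samp{+} acts as an
+operator and the regexp matches one or more @samp{a} characters
+followed by @samp{b}:
+
+@example
+$ printf 'a+b\naab\n' | awk '/a\+b/'
+@print{} a+b
+$ printf 'a+b\naab\n' | awk '/a+b/'
+@print{} aab
+@end example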
+
+@cindex backslash (@code{\}), in escape sequences
+@cindex @code{\} (backslash), in escape sequences
+@cindex portability
+For complete portability, do not use a backslash before any character not
+shown in the previous list.
+
+To summarize:
+
+@itemize @bullet
+@item
+The escape sequences in the table above are always processed first,
+for both string constants and regexp constants. This happens very early,
+as soon as @command{awk} reads your program.
+
+@item
+@command{gawk} processes both regexp constants and dynamic regexps
+(@pxref{Computed Regexps}),
+for the special operators listed in
+@ref{GNU Regexp Operators}.
+
+@item
+A backslash before any other character means to treat that character
+literally.
+@end itemize
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Backslash Before Regular Characters
+@cindex portability, backslash in escape sequences
+@cindex POSIX @command{awk}, backslashes in string constants
+@cindex backslash (@code{\}), in escape sequences, POSIX and
+@cindex @code{\} (backslash), in escape sequences, POSIX and
+
+@cindex troubleshooting, backslash before nonspecial character
+If you place a backslash in a string constant before something that is
+not one of the characters previously listed, POSIX @command{awk} purposely
+leaves what happens as undefined.  There are two choices:
+
+@c @cindex automatic warnings
+@c @cindex warnings, automatic
+@table @asis
+@item Strip the backslash out
+This is what Unix @command{awk} and @command{gawk} both do.
+For example, @code{"a\qc"} is the same as @code{"aqc"}.
+(Because this is such an easy bug both to introduce and to miss,
+@command{gawk} warns you about it.
+Consider @samp{FS = @w{"[ \t]+\|[ \t]+"}} to use vertical bars
+surrounded by whitespace as the field separator. There should be
+two backslashes in the string: @samp{FS = @w{"[ \t]+\\|[ \t]+"}}.)
+@c I did this!  This is why I added the warning.
+
+@cindex @command{gawk}, escape sequences
+@cindex Unix @command{awk}, backslashes in escape sequences
+@item Leave the backslash alone
+Some other @command{awk} implementations do this.
+In such implementations, typing @code{"a\qc"} is the same as typing
+@code{"a\\qc"}.
+@end table
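+
+As a sketch of the portable spelling, the following command doubles
+the backslash in the string constant so that the regexp receives
+@samp{\|}, a literal vertical bar.  The input line is made up for
+illustration:
+
+@example
+$ echo 'a | b' | awk 'BEGIN @{ FS = "[ \t]+\\|[ \t]+" @} @{ print $2 @}'
+@print{} b
+@end example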
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Escape Sequences for Metacharacters
+@cindex metacharacters, escape sequences for
+
+Suppose you use an octal or hexadecimal
+escape to represent a regexp metacharacter.
+(See @ref{Regexp Operators}.)
+Does @command{awk} treat the character as a literal character or as a regexp
+operator?
+
+@cindex dark corner, escape sequences, for metacharacters
+Historically, such characters were taken literally.
+@value{DARKCORNER}
+However, the POSIX standard indicates that they should be treated
+as real metacharacters, which is what @command{gawk} does.
+In compatibility mode (@pxref{Options}),
+@command{gawk} treats the characters represented by octal and hexadecimal
+escape sequences literally when used in regexp constants. Thus,
+@code{/a\52b/} is equivalent to @code{/a\*b/}.
+
+@node Regexp Operators
+@section Regular Expression Operators
+@c STARTOFRANGE regexpo
+@cindex regular expressions, operators
+
+You can combine regular expressions with special characters,
+called @dfn{regular expression operators} or @dfn{metacharacters}, to
+increase the power and versatility of regular expressions.
+
+The escape sequences described
+@ifnotinfo
+earlier
+@end ifnotinfo
+in @ref{Escape Sequences},
+are valid inside a regexp.  They are introduced by a @samp{\} and
+are recognized and converted into corresponding real characters as
+the very first step in processing regexps.
+
+Here is a list of metacharacters.  All characters that are not escape
+sequences and that are not listed in the table stand for themselves:
+
+@table @code
+@cindex backslash (@code{\})
+@cindex @code{\} (backslash)
+@item \
+This is used to suppress the special meaning of a character when
+matching.  For example, @samp{\$}
+matches the character @samp{$}.
+
+@cindex regular expressions, anchors in
+@cindex Texinfo, chapter beginnings in files
+@cindex @code{^} (caret)
+@cindex caret (@code{^})
+@item ^
+This matches the beginning of a string.  For example, @samp{^@@chapter}
+matches @samp{@@chapter} at the beginning of a string and can be used
+to identify chapter beginnings in Texinfo source files.
+The @samp{^} is known as an @dfn{anchor}, because it anchors the pattern to
+match only at the beginning of the string.
+
+It is important to realize that @samp{^} does not match the beginning of
+a line embedded in a string.
+The condition is not true in the following example:
+
+@example
+if ("line1\nLINE 2" ~ /^L/) @dots{}
+@end example
+
+@cindex @code{$} (dollar sign)
+@cindex dollar sign (@code{$})
+@item $
+This is similar to @samp{^}, but it matches only at the end of a string.
+For example, @samp{p$}
+matches a record that ends with a @samp{p}.  The @samp{$} is an anchor
+and does not match the end of a line embedded in a string.
+The condition in the following example is not true:
+
+@example
+if ("line1\nLINE 2" ~ /1$/) @dots{}
+@end example
+
+@cindex @code{.} (period)
+@cindex period (@code{.})
+@item .
+This matches any single character,
+@emph{including} the newline character.  For example, @samp{.P}
+matches any single character followed by a @samp{P} in a string.  Using
+concatenation, we can make a regular expression such as @samp{U.A}, which
+matches any three-character sequence that begins with @samp{U} and ends
+with @samp{A}.
+
+@c comma before using does NOT do tertiary
+@cindex POSIX @command{awk}, period (@code{.}), using
+In strict POSIX mode (@pxref{Options}),
+@samp{.} does not match the @sc{nul}
+character, which is a character with all bits equal to zero.
+Otherwise, @sc{nul} is just another character. Other versions of @command{awk}
+may not be able to match the @sc{nul} character.
+
+@cindex @code{[]} (square brackets)
+@cindex square brackets (@code{[]})
+@cindex character lists
+@cindex character sets, See Also character lists
+@cindex bracket expressions, See character lists
+@item [@dots{}]
+This is called a @dfn{character list}.@footnote{In other literature,
+you may see a character list referred to as either a
+@dfn{character set}, a @dfn{character class}, or a @dfn{bracket expression}.}
+It matches any @emph{one} of the characters that are enclosed in
+the square brackets.  For example, @samp{[MVX]} matches any one of
+the characters @samp{M}, @samp{V}, or @samp{X} in a string.  A full
+discussion of what can be inside the square brackets of a character list
+is given in
+@ref{Character Lists}.
+
+@cindex character lists, complemented
+@item [^ @dots{}]
+This is a @dfn{complemented character list}.  The first character after
+the @samp{[} @emph{must} be a @samp{^}.  It matches any characters
+@emph{except} those in the square brackets.  For example, @samp{[^awk]}
+matches any character that is not an @samp{a}, @samp{w},
+or @samp{k}.
+
+@cindex @code{|} (vertical bar)
+@cindex vertical bar (@code{|})
+@item |
+This is the @dfn{alternation operator} and it is used to specify
+alternatives.
+The @samp{|} has the lowest precedence of all the regular
+expression operators.
+For example, @samp{^P|[[:digit:]]}
+matches any string that matches either @samp{^P} or @samp{[[:digit:]]}.  This
+means it matches any string that starts with @samp{P} or contains a digit.
+
+The alternation applies to the largest possible regexps on either side.
+
+@cindex @code{()} (parentheses)
+@cindex parentheses (@code{()})
+@item (@dots{})
+Parentheses are used for grouping in regular expressions, as in
+arithmetic.  They can be used to concatenate regular expressions
+containing the alternation operator, @samp{|}.  For example,
+@samp{@@(samp|code)\@{[^@}]+\@}} matches both @samp{@@code@{foo@}} and
+@samp{@@samp@{bar@}}.
+(These are Texinfo formatting control sequences. The @samp{+} is
+explained further on in this list.)
+
+@cindex @code{*} (asterisk), @code{*} operator, as regexp operator
+@cindex asterisk (@code{*}), @code{*} operator, as regexp operator
+@item *
+This symbol means that the preceding regular expression should be
+repeated as many times as necessary to find a match.  For example, @samp{ph*}
+applies the @samp{*} symbol to the preceding @samp{h} and looks for matches
+of one @samp{p} followed by any number of @samp{h}s.  This also matches
+just @samp{p} if no @samp{h}s are present.
+
+The @samp{*} repeats the @emph{smallest} possible preceding expression.
+(Use parentheses if you want to repeat a larger expression.)  It finds
+as many repetitions as possible.  For example,
+@samp{awk '/\(c[ad][ad]*r x\)/ @{ print @}' sample}
+prints every record in @file{sample} containing a string of the form
+@samp{(car x)}, @samp{(cdr x)}, @samp{(cadr x)}, and so on.
+Notice the escaping of the parentheses by preceding them
+with backslashes.
+
+@cindex @code{+} (plus sign)
+@cindex plus sign (@code{+})
+@item +
+This symbol is similar to @samp{*}, except that the preceding expression must be
+matched at least once.  This means that @samp{wh+y}
+would match @samp{why} and @samp{whhy}, but not @samp{wy}, whereas
+@samp{wh*y} would match all three of these strings.
+The following is a simpler
+way of writing the last @samp{*} example:
+
+@example
+awk '/\(c[ad]+r x\)/ @{ print @}' sample
+@end example
+
+@cindex @code{?} (question mark)
+@cindex question mark (@code{?})
+@item ?
+This symbol is similar to @samp{*}, except that the preceding expression can be
+matched either once or not at all.  For example, @samp{fe?d}
+matches @samp{fed} and @samp{fd}, but nothing else.
+
+@cindex interval expressions
+@item @{@var{n}@}
+@itemx @{@var{n},@}
+@itemx @{@var{n},@var{m}@}
+One or two numbers inside braces denote an @dfn{interval expression}.
+If there is one number in the braces, the preceding regexp is repeated
+@var{n} times.
+If there are two numbers separated by a comma, the preceding regexp is
+repeated @var{n} to @var{m} times.
+If there is one number followed by a comma, then the preceding regexp
+is repeated at least @var{n} times:
+
+@table @code
+@item wh@{3@}y
+Matches @samp{whhhy}, but not @samp{why} or @samp{whhhhy}.
+
+@item wh@{3,5@}y
+Matches @samp{whhhy}, @samp{whhhhy}, or @samp{whhhhhy}, only.
+
+@item wh@{2,@}y
+Matches @samp{whhy} or @samp{whhhy}, and so on.
+@end table
+
+@cindex POSIX @command{awk}, interval expressions in
+Interval expressions were not traditionally available in @command{awk}.
+They were added as part of the POSIX standard to make @command{awk}
+and @command{egrep} consistent with each other.
+
+@cindex @command{gawk}, interval expressions and
+However, because old programs may use @samp{@{} and @samp{@}} in regexp
+constants, by default @command{gawk} does @emph{not} match interval expressions
+in regexps.  If either @option{--posix} or @option{--re-interval} is specified
+(@pxref{Options}), then interval expressions
+are allowed in regexps.
+
+For new programs that use @samp{@{} and @samp{@}} in regexp constants,
+it is good practice to always escape them with a backslash.  Then the
+regexp constants are valid and work the way you want them to, using
+any version of @command{awk}.@footnote{Use two backslashes if you're
+using a string constant with a regexp operator or function.}
+@end table
+
+@cindex precedence, regexp operators
+@cindex regular expressions, operators, precedence of
+In regular expressions, the @samp{*}, @samp{+}, and @samp{?} operators,
+as well as the braces @samp{@{} and @samp{@}},
+have
+the highest precedence, followed by concatenation, and finally by @samp{|}.
+As in arithmetic, parentheses can change how operators are grouped.
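+
+As a brief illustration of these precedence rules (the three input lines
+are made up for this sketch, and an @command{awk} that supports POSIX
+character classes is assumed), compare an ungrouped alternation with a
+grouped one.  In the first command the @samp{^} anchors only the
+@samp{P} alternative; in the second, the parentheses make it apply to
+both alternatives:
+
+@example
+$ printf 'Paris\n42nd\nx7\n' | awk '/^P|[[:digit:]]/'
+@print{} Paris
+@print{} 42nd
+@print{} x7
+$ printf 'Paris\n42nd\nx7\n' | awk '/^(P|[[:digit:]])/'
+@print{} Paris
+@print{} 42nd
+@end example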
+
+@cindex POSIX @command{awk}, regular expressions and
+@cindex @command{gawk}, regular expressions, precedence
+In POSIX @command{awk} and @command{gawk}, the @samp{*}, @samp{+}, and @samp{?} operators
+stand for themselves when there is nothing in the regexp that precedes them.
+For example, @samp{/+/} matches a literal plus sign.  However, many other versions of
+@command{awk} treat such a usage as a syntax error.
+
+If @command{gawk} is in compatibility mode
+(@pxref{Options}),
+POSIX character classes and interval expressions are not available in
+regular expressions.
+@c ENDOFRANGE regexpo
+
+@node Character Lists
+@section Using Character Lists
+@c STARTOFRANGE charlist
+@cindex character lists
+@cindex character lists, range expressions
+@cindex range expressions
+
+Within a character list, a @dfn{range expression} consists of two
+characters separated by a hyphen.  It matches any single character that
+sorts between the two characters, using the locale's
+collating sequence and character set.  For example, in the default C
+locale, @samp{[a-dx-z]} is equivalent to @samp{[abcdxyz]}.  Many locales
+sort characters in dictionary order, and in these locales,
+@samp{[a-dx-z]} is typically not equivalent to @samp{[abcdxyz]}; instead it
+might be equivalent to @samp{[aBbCcDdxXyYz]}, for example.  To obtain
+the traditional interpretation of bracket expressions, you can use the C
+locale by setting the @env{LC_ALL} environment variable to the value
+@samp{C}.
+
+@cindex @code{\} (backslash), in character lists
+@cindex backslash (@code{\}), in character lists
+@cindex @code{^} (caret), in character lists
+@cindex caret (@code{^}), in character lists
+@cindex @code{-} (hyphen), in character lists
+@cindex hyphen (@code{-}), in character lists
+To include one of the characters @samp{\}, @samp{]}, @samp{-}, or @samp{^} in a
+character list, put a @samp{\} in front of it.  For example:
+
+@example
+[d\]]
+@end example
+
+@noindent
+matches either @samp{d} or @samp{]}.
+
+@cindex POSIX @command{awk}, character lists and
+@cindex Extended Regular Expressions (EREs)
+@cindex EREs (Extended Regular Expressions)
+@cindex @command{egrep} utility
+This treatment of @samp{\} in character lists
+is compatible with other @command{awk}
+implementations and is also mandated by POSIX.
+The regular expressions in @command{awk} are a superset
+of the POSIX specification for Extended Regular Expressions (EREs).
+POSIX EREs are based on the regular expressions accepted by the
+traditional @command{egrep} utility.
+
+@cindex character lists, character classes
+@cindex POSIX @command{awk}, character lists and, character classes
+@dfn{Character classes} are a new feature introduced in the POSIX standard.
+A character class is a special notation for describing
+lists of characters that have a specific attribute, but the
+actual characters can vary from country to country and/or
+from character set to character set.  For example, the notion of what
+is an alphabetic character differs between the United States and France.
+
+A character class is only valid in a regexp @emph{inside} the
+brackets of a character list.  Character classes consist of @samp{[:},
+a keyword denoting the class, and @samp{:]}.  Here are the character
+classes defined by the POSIX standard.
+
+@c the regular table is commented out while trying out the multitable.
+@c leave it here in case we need to go back, but make sure the text
+@c still corresponds!
+
+@ignore
+@table @code
+@item [:alnum:]
+Alphanumeric characters.
+
+@item [:alpha:]
+Alphabetic characters.
+
+@item [:blank:]
+Space and TAB characters.
+
+@item [:cntrl:]
+Control characters.
+
+@item [:digit:]
+Numeric characters.
+
+@item [:graph:]
+Characters that are printable and visible.
+(A space is printable but not visible, whereas an @samp{a} is both.)
+
+@item [:lower:]
+Lowercase alphabetic characters.
+
+@item [:print:]
+Printable characters (characters that are not control characters).
+
+@item [:punct:]
+Punctuation characters (characters that are not letters, digits,
+control characters, or space characters).
+
+@item [:space:]
+Space characters (such as space, TAB, and formfeed, to name a few).
+
+@item [:upper:]
+Uppercase alphabetic characters.
+
+@item [:xdigit:]
+Characters that are hexadecimal digits.
+@end table
+@end ignore
+
+@multitable {@code{[:xdigit:]}} {Characters that are both printable and visible.  (A space is}
+@item @code{[:alnum:]} @tab Alphanumeric characters.
+@item @code{[:alpha:]} @tab Alphabetic characters.
+@item @code{[:blank:]} @tab Space and TAB characters.
+@item @code{[:cntrl:]} @tab Control characters.
+@item @code{[:digit:]} @tab Numeric characters.
+@item @code{[:graph:]} @tab Characters that are both printable and visible.
+(A space is printable but not visible, whereas an @samp{a} is both.)
+@item @code{[:lower:]} @tab Lowercase alphabetic characters.
+@item @code{[:print:]} @tab Printable characters (characters that are not control characters).
+@item @code{[:punct:]} @tab Punctuation characters (characters that are not letters, digits,
+control characters, or space characters).
+@item @code{[:space:]} @tab Space characters (such as space, TAB, and formfeed, to name a few).
+@item @code{[:upper:]} @tab Uppercase alphabetic characters.
+@item @code{[:xdigit:]} @tab Characters that are hexadecimal digits.
+@end multitable
+
+For example, before the POSIX standard, you had to write @code{/[A-Za-z0-9]/}
+to match alphanumeric characters.  If your
+character set had other alphabetic characters in it, this would not
+match them, and if your character set collated differently from
+ASCII, this might not even match the ASCII alphanumeric characters.
+With the POSIX character classes, you can write
+@code{/[[:alnum:]]/} to match the alphabetic
+and numeric characters in your character set.
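+
+As a small sketch (the input lines are invented for illustration, and
+the output shown assumes the C locale), the following command prints
+only those records that consist entirely of alphanumeric characters:
+
+@example
+$ printf 'abc123\n***\nfoo bar\n' | awk '/^[[:alnum:]]+$/'
+@print{} abc123
+@end example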
+
+@cindex character lists, collating elements
+@cindex character lists, non-ASCII
+@cindex collating elements
+Two additional special sequences can appear in character lists.
+These apply to non-ASCII character sets, which can have single symbols
+(called @dfn{collating elements}) that are represented with more than one
+character. They can also have several characters that are equivalent for
+@dfn{collating}, or sorting, purposes.  (For example, in French, a plain ``e''
+and a grave-accented ``@`e'' are equivalent.)
+These sequences are:
+
+@table @asis
+@cindex character lists, collating symbols
+@cindex collating symbols
+@item Collating symbols
+Multicharacter collating elements enclosed between
+@samp{[.} and @samp{.]}.  For example, if @samp{ch} is a collating element,
+then @code{[[.ch.]]} is a regexp that matches this collating element, whereas
+@code{[ch]} is a regexp that matches either @samp{c} or @samp{h}.
+
+@cindex character lists, equivalence classes
+@item Equivalence classes
+Locale-specific names for a list of
+characters that are equivalent.  The name is enclosed between
+@samp{[=} and @samp{=]}.
+For example, the name @samp{e} might be used to represent all of
+``e,'' ``@`e,'' and ``@'e.'' In this case, @code{[[=e=]]} is a regexp
+that matches any of @samp{e}, @samp{@'e}, or @samp{@`e}.
+@end table
+
+These features are very valuable in non-English-speaking locales.
+
+@cindex internationalization, localization, character classes
+@cindex @command{gawk}, character classes and
+@cindex POSIX @command{awk}, character lists and, character classes
+@strong{Caution:} The library functions that @command{gawk} uses for regular
+expression matching currently recognize only POSIX character classes;
+they do not recognize collating symbols or equivalence classes.
+@c maybe one day ...
+@c ENDOFRANGE charlist
+
+@node GNU Regexp Operators
+@section @command{gawk}-Specific Regexp Operators
+
+@c This section adapted (long ago) from the regex-0.12 manual
+
+@c STARTOFRANGE regexpg
+@cindex regular expressions, operators, @command{gawk}
+@c STARTOFRANGE gregexp
+@cindex @command{gawk}, regular expressions, operators
+@cindex operators, GNU-specific
+@cindex regular expressions, operators, for words
+@cindex word, regexp definition of
+GNU software that deals with regular expressions provides a number of
+additional regexp operators.  These operators are described in this
+@value{SECTION} and are specific to @command{gawk};
+they are not available in other @command{awk} implementations.
+Most of the additional operators deal with word matching.
+For our purposes, a @dfn{word} is a sequence of one or more letters, digits,
+or underscores (@samp{_}):
+
+@table @code
+@c @cindex operators, @code{\w} (@command{gawk})
+@cindex backslash (@code{\}), @code{\w} operator (@command{gawk})
+@cindex @code{\} (backslash), @code{\w} operator (@command{gawk})
+@item \w
+Matches any word-constituent character---that is, it matches any
+letter, digit, or underscore. Think of it as shorthand for
+@w{@code{[[:alnum:]_]}}.
+
+@c @cindex operators, @code{\W} (@command{gawk})
+@cindex backslash (@code{\}), @code{\W} operator (@command{gawk})
+@cindex @code{\} (backslash), @code{\W} operator (@command{gawk})
+@item \W
+Matches any character that is not word-constituent.
+Think of it as shorthand for
+@w{@code{[^[:alnum:]_]}}.
+
+@c @cindex operators, @code{\<} (@command{gawk})
+@cindex backslash (@code{\}), @code{\<} operator (@command{gawk})
+@cindex @code{\} (backslash), @code{\<} operator (@command{gawk})
+@item \<
+Matches the empty string at the beginning of a word.
+For example, @code{/\<away/} matches @samp{away} but not
+@samp{stowaway}.
+
+@c @cindex operators, @code{\>} (@command{gawk})
+@cindex backslash (@code{\}), @code{\>} operator (@command{gawk})
+@cindex @code{\} (backslash), @code{\>} operator (@command{gawk})
+@item \>
+Matches the empty string at the end of a word.
+For example, @code{/stow\>/} matches @samp{stow} but not @samp{stowaway}.
+
+@c @cindex operators, @code{\y} (@command{gawk})
+@cindex backslash (@code{\}), @code{\y} operator (@command{gawk})
+@cindex @code{\} (backslash), @code{\y} operator (@command{gawk})
+@c comma before using does NOT do secondary
+@cindex word boundaries, matching
+@item \y
+Matches the empty string at either the beginning or the
+end of a word (i.e., the word boundar@strong{y}).  For example, @samp{\yballs?\y}
+matches either @samp{ball} or @samp{balls}, as a separate word.
+
+@c @cindex operators, @code{\B} (@command{gawk})
+@cindex backslash (@code{\}), @code{\B} operator (@command{gawk})
+@cindex @code{\} (backslash), @code{\B} operator (@command{gawk})
+@item \B
+Matches the empty string that occurs between two
+word-constituent characters. For example,
+@code{/\Brat\B/} matches @samp{crate} but it does not match @samp{dirty rat}.
+@samp{\B} is essentially the opposite of @samp{\y}.
+@end table
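+
+The following sketch, which uses made-up input and the @code{gsub}
+function (@pxref{String Functions}), shows how @samp{\<} and @samp{\y}
+restrict a match to word edges.  It assumes @command{gawk} running
+without @option{--posix} or @option{--traditional}:
+
+@example
+$ echo 'stowaway away' | gawk '@{ gsub(/\<away/, "[&]"); print @}'
+@print{} stowaway [away]
+$ echo 'ball balls football' | gawk '@{ gsub(/\yballs?\y/, "X"); print @}'
+@print{} X X football
+@end example
+
+@noindent
+In the replacement text, @samp{&} stands for the matched text.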
+
+@cindex buffers, operators for
+@cindex regular expressions, operators, for buffers
+@cindex operators, string-matching, for buffers
+There are two other operators that work on buffers.  In Emacs, a
+@dfn{buffer} is, naturally, an Emacs buffer.  For other programs,
+@command{gawk}'s regexp library routines consider the entire
+string to match as the buffer.
+The operators are:
+
+@table @code
+@item \`
+@c @cindex operators, @code{\`} (@command{gawk})
+@cindex backslash (@code{\}), @code{\`} operator (@command{gawk})
+@cindex @code{\} (backslash), @code{\`} operator (@command{gawk})
+Matches the empty string at the
+beginning of a buffer (string).
+
+@c @cindex operators, @code{\'} (@command{gawk})
+@cindex backslash (@code{\}), @code{\'} operator (@command{gawk})
+@cindex @code{\} (backslash), @code{\'} operator (@command{gawk})
+@item \'
+Matches the empty string at the
+end of a buffer (string).
+@end table
+
+@cindex @code{^} (caret)
+@cindex caret (@code{^})
+@cindex @code{$} (dollar sign)
+@cindex dollar sign (@code{$})
+Because @samp{^} and @samp{$} always work in terms of the beginning
+and end of strings, these operators don't add any new capabilities
+for @command{awk}.  They are provided for compatibility with other
+GNU software.
+
+@cindex @command{gawk}, word-boundary operator
+@cindex word-boundary operator (@command{gawk})
+@cindex operators, word-boundary (@command{gawk})
+In other GNU software, the word-boundary operator is @samp{\b}. However,
+that conflicts with the @command{awk} language's definition of @samp{\b}
+as backspace, so @command{gawk} uses a different letter.
+An alternative method would have been to require two backslashes in the
+GNU operators, but this was deemed too confusing. The current
+method of using @samp{\y} for the GNU @samp{\b} appears to be the
+lesser of two evils.
+
+@c NOTE!!! Keep this in sync with the same table in the summary appendix!
+@c
+@c Should really do this with file inclusion.
+@cindex regular expressions, @command{gawk}, command-line options
+@cindex @command{gawk}, command-line options
+The various command-line options
+(@pxref{Options})
+control how @command{gawk} interprets characters in regexps:
+
+@table @asis
+@item No options
+In the default case, @command{gawk} provides all the facilities of
+POSIX regexps and the
+@ifnotinfo
+previously described
+GNU regexp operators.
+@end ifnotinfo
+@ifnottex
+GNU regexp operators described
+in @ref{Regexp Operators}.
+@end ifnottex
+However, interval expressions are not supported.
+
+@item @code{--posix}
+Only POSIX regexps are supported; the GNU operators are not special
+(e.g., @samp{\w} matches a literal @samp{w}).  Interval expressions
+are allowed.
+
+@item @code{--traditional}
+Traditional Unix @command{awk} regexps are matched. The GNU operators
+are not special, interval expressions are not available, nor
+are the POSIX character classes (@code{[[:alnum:]]}, etc.).
+Characters described by octal and hexadecimal escape sequences are
+treated literally, even if they represent regexp metacharacters.
+
+@item @code{--re-interval}
+Allow interval expressions in regexps, even if @option{--traditional}
+has been provided.  (@option{--posix} automatically enables
+interval expressions, so @option{--re-interval} is redundant
+when @option{--posix} is used.)
+@end table
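+
+For instance, with the behavior described in this table, an interval
+expression takes effect only when one of the enabling options is given.
+The input here is invented for illustration:
+
+@example
+$ echo whhhy | gawk --re-interval '/wh@{3@}y/'
+@print{} whhhy
+@end example
+
+@noindent
+Without @option{--re-interval} (or @option{--posix}), the version of
+@command{gawk} described here treats the braces as ordinary characters,
+so the same regexp does not match this record.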
+@c ENDOFRANGE gregexp
+@c ENDOFRANGE regexpg
+
+@node Case-sensitivity
+@section Case Sensitivity in Matching
+
+@c STARTOFRANGE regexpcs
+@cindex regular expressions, case sensitivity
+@c STARTOFRANGE csregexp
+@cindex case sensitivity, regexps and
+Case is normally significant in regular expressions, both when matching
+ordinary characters (i.e., not metacharacters) and inside character
+lists.  Thus, a @samp{w} in a regular expression matches only a lowercase
+@samp{w} and not an uppercase @samp{W}.
+
+The simplest way to do a case-independent match is to use a character
+list---for example, @samp{[Ww]}.  However, this can be cumbersome if
+you need to use it often, and it can make the regular expressions harder
+to read.  There are two alternatives that you might prefer.
+
+One way to perform a case-insensitive match at a particular point in the
+program is to convert the data to a single case, using the
+@code{tolower} or @code{toupper} built-in string functions (which we
+haven't discussed yet;
+@pxref{String Functions}).
+For example:
+
+@example
+tolower($1) ~ /foo/  @{ @dots{} @}
+@end example
+
+@noindent
+converts the first field to lowercase before matching against it.
+This works in any POSIX-compliant @command{awk}.
+
+@cindex @command{gawk}, regular expressions, case sensitivity
+@cindex case sensitivity, @command{gawk}
+@cindex differences in @command{awk} and @command{gawk}, regular expressions
+@cindex @code{~} (tilde), @code{~} operator
+@cindex tilde (@code{~}), @code{~} operator
+@cindex @code{!} (exclamation point), @code{!~} operator
+@cindex exclamation point (@code{!}), @code{!~} operator
+@cindex @code{IGNORECASE} variable
+@c @cindex variables, @code{IGNORECASE}
+Another method, specific to @command{gawk}, is to set the variable
+@code{IGNORECASE} to a nonzero value (@pxref{Built-in Variables}).
+When @code{IGNORECASE} is not zero, @emph{all} regexp and string
+operations ignore case.  Changing the value of
+@code{IGNORECASE} dynamically controls the case-sensitivity of the
+program as it runs.  Case is significant by default because
+@code{IGNORECASE} (like most variables) is initialized to zero:
+
+@example
+x = "aB"
+if (x ~ /ab/) @dots{}   # this test will fail
+
+IGNORECASE = 1
+if (x ~ /ab/) @dots{}   # now it will succeed
+@end example
+
+In general, you cannot use @code{IGNORECASE} to make certain rules
+case-insensitive and other rules case-sensitive, because there is no
+straightforward way
+to set @code{IGNORECASE} just for the pattern of
+a particular rule.@footnote{Experienced C and C++ programmers will note
+that it is possible, using something like
+@samp{IGNORECASE = 1 && /foObAr/ @{ @dots{} @}}
+and
+@samp{IGNORECASE = 0 || /foobar/ @{ @dots{} @}}.
+However, this is somewhat obscure and we don't recommend it.}
+To do this, use either character lists or @code{tolower}.  However, one
+thing you can do only with @code{IGNORECASE} is dynamically turn
+case-sensitivity on or off for all the rules at once.
+
+@code{IGNORECASE} can be set on the command line or in a @code{BEGIN} rule
+(@pxref{Other Arguments}; also
+@pxref{Using BEGIN/END}).
+Setting @code{IGNORECASE} from the command line is a way to make
+a program case-insensitive without having to edit it.
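+
+As a minimal sketch (the input is made up, and the @option{-v} option
+is described in @ref{Options}), the following assigns
+@code{IGNORECASE} on the command line without touching the program text:
+
+@example
+$ echo FooBar | gawk -v IGNORECASE=1 '/foobar/'
+@print{} FooBar
+@end example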
+
+Prior to @command{gawk} 3.0, the value of @code{IGNORECASE}
+affected regexp operations only. It did not affect string comparison
+with @samp{==}, @samp{!=}, and so on.
+Beginning with @value{PVERSION} 3.0, both regexp and string comparison
+operations are affected by @code{IGNORECASE}.
+
+@c @cindex ISO 8859-1
+@c @cindex ISO Latin-1
+Beginning with @command{gawk} 3.0,
+the equivalences between upper-
+and lowercase characters are based on the ISO-8859-1 (ISO Latin-1)
+character set. This character set is a superset of the traditional 128
+ASCII characters and also provides a number of characters suitable
+for use with European languages.
+
+The value of @code{IGNORECASE} has no effect if @command{gawk} is in
+compatibility mode (@pxref{Options}).
+Case is always significant in compatibility mode.
+@c ENDOFRANGE csregexp
+@c ENDOFRANGE regexpcs
+
+@node Leftmost Longest
+@section How Much Text Matches?
+
+@cindex regular expressions, leftmost longest match
+@c @cindex matching, leftmost longest
+Consider the following:
+
+@example
+echo aaaabcd | awk '@{ sub(/a+/, "<A>"); print @}'
+@end example
+
+This example uses the @code{sub} function (which we haven't discussed yet;
+@pxref{String Functions})
+to make a change to the input record. Here, the regexp @code{/a+/}
+indicates ``one or more @samp{a} characters,'' and the replacement
+text is @samp{<A>}.
+
+The input contains four @samp{a} characters.
+@command{awk} (and POSIX) regular expressions always match
+the leftmost, @emph{longest} sequence of input characters that can
+match.  Thus, all four @samp{a} characters are
+replaced with @samp{<A>} in this example:
+
+@example
+$ echo aaaabcd | awk '@{ sub(/a+/, "<A>"); print @}'
+@print{} <A>bcd
+@end example
+
+For simple match/no-match tests, this is not so important. But when doing
+text matching and substitutions with the @code{match}, @code{sub}, @code{gsub},
+and @code{gensub} functions, it is very important.
+@ifinfo
+@xref{String Functions},
+for more information on these functions.
+@end ifinfo
+Understanding this principle is also important for regexp-based record
+and field splitting (@pxref{Records},
+and also @pxref{Field Separators}).
+
+@node Computed Regexps
+@section Using Dynamic Regexps
+
+@c STARTOFRANGE dregexp
+@cindex regular expressions, computed
+@c STARTOFRANGE regexpd
+@cindex regular expressions, dynamic
+@cindex @code{~} (tilde), @code{~} operator
+@cindex tilde (@code{~}), @code{~} operator
+@cindex @code{!} (exclamation point), @code{!~} operator
+@cindex exclamation point (@code{!}), @code{!~} operator
+@c @cindex operators, @code{~}
+@c @cindex operators, @code{!~}
+The righthand side of a @samp{~} or @samp{!~} operator need not be a
+regexp constant (i.e., a string of characters between slashes).  It may
+be any expression.  The expression is evaluated and converted to a string
+if necessary; the contents of the string are used as the
+regexp.  A regexp that is computed in this way is called a @dfn{dynamic
+regexp}:
+
+@example
+BEGIN @{ digits_regexp = "[[:digit:]]+" @}
+$0 ~ digits_regexp    @{ print @}
+@end example
+
+@noindent
+This sets @code{digits_regexp} to a regexp that describes one or more digits,
+and tests whether the input record matches this regexp.
+
+@c @strong{Caution:}
+@strong{Caution:} When using the @samp{~} and @samp{!~}
+operators, there is a difference between a regexp constant
+enclosed in slashes and a string constant enclosed in double quotes.
+If you are going to use a string constant, you have to understand that
+the string is, in essence, scanned @emph{twice}: the first time when
+@command{awk} reads your program, and the second time when it goes to
+match the string on the lefthand side of the operator with the pattern
+on the right.  This is true of any string-valued expression (such as
+@code{digits_regexp}, shown previously), not just string constants.
+
+@cindex regexp constants, slashes vs. quotes
+@cindex @code{\} (backslash), regexp constants
+@cindex backslash (@code{\}), regexp constants
+@cindex @code{"} (double quote), regexp constants
+@cindex double quote (@code{"}), regexp constants
+What difference does it make if the string is
+scanned twice? The answer has to do with escape sequences, and particularly
+with backslashes.  To get a backslash into a regular expression inside a
+string, you have to type two backslashes.
+
+For example, @code{/\*/} is a regexp constant for a literal @samp{*}.
+Only one backslash is needed.  To do the same thing with a string,
+you have to type @code{"\\*"}.  The first backslash escapes the
+second one so that the string actually contains the
+two characters @samp{\} and @samp{*}.
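+
+The two forms can be compared directly.  In this sketch (the input
+line is arbitrary), both commands match the literal @samp{*} and so
+both print the record:
+
+@example
+$ echo 'a*b' | awk '$0 ~ /\*/'
+@print{} a*b
+$ echo 'a*b' | awk '$0 ~ "\\*"'
+@print{} a*b
+@end example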
+
+@cindex troubleshooting, regexp constants vs. string constants
+@cindex regexp constants, vs. string constants
+@cindex string constants, vs. regexp constants
+Given that you can use both regexp and string constants to describe
+regular expressions, which should you use?  The answer is ``regexp
+constants,'' for several reasons:
+
+@itemize @bullet
+@item
+String constants are more complicated to write and
+more difficult to read. Using regexp constants makes your programs
+less error-prone.  Not understanding the difference between the two
+kinds of constants is a common source of errors.
+
+@item
+It is more efficient to use regexp constants. @command{awk} can note
+that you have supplied a regexp and store it internally in a form that
+makes pattern matching more efficient.  When using a string constant,
+@command{awk} must first convert the string into this internal form and
+then perform the pattern matching.
+
+@item
+Using regexp constants is better form; it shows clearly that you
+intend a regexp match.
+@end itemize
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Using @code{\n} in Character Lists of Dynamic Regexps
+@cindex regular expressions, dynamic, with embedded newlines
+@cindex newlines, in dynamic regexps
+
+Some commercial versions of @command{awk} do not allow the newline
+character to be used inside a character list for a dynamic regexp:
+
+@example
+$ awk '$0 ~ "[ \t\n]"'
+@error{} awk: newline in character class [
+@error{} ]...
+@error{}  source line number 1
+@error{}  context is
+@error{}          >>>  <<<
+@end example
+
+@cindex newlines, in regexp constants
+But a newline in a regexp constant works with no problem:
+
+@example
+$ awk '$0 ~ /[ \t\n]/'
+here is a sample line
+@print{} here is a sample line
+@kbd{@value{CTL}-d}
+@end example
+
+@command{gawk} does not have this problem, and it isn't likely to
+occur often in practice, but it's worth noting for future reference.
+@c ENDOFRANGE dregexp
+@c ENDOFRANGE regexpd
+@c ENDOFRANGE regexp
+
+@node Locales
+@section Where You Are Makes A Difference
+
+Modern systems support the notion of @dfn{locales}: a way to tell
+the system about the local character set and language.  The current
+locale setting can affect the way regexp matching works, often
+in surprising ways.  In particular, many locales do case-insensitive
+matching, even when you may have specified characters of only
+one particular case.
+
+The following example uses the @code{sub} function, which
+does text replacement
+(@pxref{String Functions}).
+Here, the intent is to remove trailing uppercase characters:
+
+@example
+$ echo something1234abc | gawk '@{ sub("[A-Z]*$", ""); print @}'
+@print{} something1234
+@end example
+
+@noindent
+This output is unexpected, since the @samp{abc} at the end of @samp{something1234abc}
+should not normally match @samp{[A-Z]*}.  This result is due to the
+locale setting (and thus you may not see it on your system).
+There are two fixes.  The first is to use the POSIX character
+class @samp{[[:upper:]]}, instead of @samp{[A-Z]}.
+The second is to change the locale setting in the environment,
+before running @command{gawk},
+by using the shell statements:
+
+@example
+LANG=C LC_ALL=C
+export LANG LC_ALL
+@end example
+
+The setting @samp{C} forces @command{gawk} to behave in the traditional
+Unix manner, where case distinctions do matter.
+You may wish to put these statements into your shell startup file,
+e.g., @file{$HOME/.profile}.
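+
+As a sketch of the first fix, the same command written with
+@samp{[[:upper:]]} should leave the lowercase letters alone regardless
+of the locale.  The locale can also be set for a single command by
+prefixing the assignment, as in the second command:
+
+@example
+$ echo something1234abc | gawk '@{ sub("[[:upper:]]*$", ""); print @}'
+@print{} something1234abc
+$ echo something1234abc | LC_ALL=C gawk '@{ sub("[A-Z]*$", ""); print @}'
+@print{} something1234abc
+@end example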
+
+Similar considerations apply to other ranges.  For example,
+@samp{["-/]} is perfectly valid in ASCII, but is not valid in many
+Unicode locales, such as @samp{en_US.UTF-8}.  (In general, such
+ranges should be avoided; either list the characters individually,
+or use a POSIX character class such as @samp{[[:punct:]]}.)
+
+For the normal case of @samp{RS = "\n"}, the locale is largely irrelevant.
+For other single byte record separators, using @samp{LC_ALL=C} will give you
+much better performance when reading records.  Otherwise, @command{gawk} has
+to make several function calls @emph{per input character} to find the record
+terminator.
+
+@node Reading Files
+@chapter Reading Input Files
+
+@c STARTOFRANGE infir
+@cindex input files, reading
+@cindex input files
+@cindex @code{FILENAME} variable
+In the typical @command{awk} program, all input is read either from the
+standard input (by default, this is the keyboard, but often it is a pipe from another
+command) or from files whose names you specify on the @command{awk}
+command line.  If you specify input files, @command{awk} reads them
+in order, processing all the data from one before going on to the next.
+The name of the current input file can be found in the built-in variable
+@code{FILENAME}
+(@pxref{Built-in Variables}).
+
+@cindex records
+@cindex fields
+The input is read in units called @dfn{records}, and is processed by the
+rules of your program one record at a time.
+By default, each record is one line.  Each
+record is automatically split into chunks called @dfn{fields}.
+This makes it more convenient for programs to work on the parts of a record.
+
+@cindex @code{getline} command
+On rare occasions, you may need to use the @code{getline} command.
+The @code{getline} command is valuable, both because it
+can do explicit input from any number of files, and because the files
+used with it do not have to be named on the @command{awk} command line
+(@pxref{Getline}).
+
+@menu
+* Records::                     Controlling how data is split into records.
+* Fields::                      An introduction to fields.
+* Nonconstant Fields::          Nonconstant Field Numbers.
+* Changing Fields::             Changing the Contents of a Field.
+* Field Separators::            The field separator and how to change it.
+* Constant Size::               Reading constant width data.
+* Multiple Line::               Reading multi-line records.
+* Getline::                     Reading files under explicit program control
+                                using the @code{getline} function.
+@end menu
+
+@node Records
+@section How Input Is Split into Records
+
+@c STARTOFRANGE inspl
+@cindex input, splitting into records
+@c STARTOFRANGE recspl
+@cindex records, splitting input into
+@cindex @code{NR} variable
+@cindex @code{FNR} variable
+The @command{awk} utility divides the input for your @command{awk}
+program into records and fields.
+@command{awk} keeps track of the number of records that have
+been read so far from the current input file.  This value is stored in a
+built-in variable called @code{FNR}.  It is reset to zero when a new
+file is started.  Another built-in variable, @code{NR}, is the total
+number of input records read so far from all @value{DF}s.  It starts at zero,
+but is never automatically reset to zero.
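+
+The difference between the two counters shows up as soon as there is
+more than one input file.  The following is a minimal sketch; the
+scratch directory, the file names @file{one} and @file{two}, and their
+contents are invented for this illustration:
+
```shell
# Two small scratch files; names and contents are made up for this sketch.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/one"
printf 'c\n' > "$tmp/two"
# Print the file name, the per-file count, and the overall count.
counts=$(cd "$tmp" && awk '{ print FILENAME, FNR, NR }' one two)
echo "$counts"
rm -rf "$tmp"
```
+
+Each output line shows the file name, @code{FNR}, and @code{NR}.
+@code{FNR} should start again at one for the second file, while
+@code{NR} keeps counting.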
+
+@cindex separators, for records
+@cindex record separators
+Records are separated by a character called the @dfn{record separator}.
+By default, the record separator is the newline character.
+This is why records are, by default, single lines.
+A different character can be used for the record separator by
+assigning the character to the built-in variable @code{RS}.
+
+@cindex newlines, as record separators
+@cindex @code{RS} variable
+Like any other variable,
+the value of @code{RS} can be changed in the @command{awk} program
+with the assignment operator, @samp{=}
+(@pxref{Assignment Ops}).
+The new record-separator character should be enclosed in quotation marks,
+which indicate a string constant.  Often the right time to do this is
+at the beginning of execution, before any input is processed,
+so that the very first record is read with the proper separator.
+To do this, use the special @code{BEGIN} pattern
+(@pxref{BEGIN/END}).
+For example:
+
+@cindex @code{BEGIN} pattern
+@example
+awk 'BEGIN @{ RS = "/" @}
+     @{ print $0 @}' BBS-list
+@end example
+
+@noindent
+changes the value of @code{RS} to @code{"/"}, before reading any input.
+This is a string whose first character is a slash; as a result, records
+are separated by slashes.  Then the input file is read, and the second
+rule in the @command{awk} program (the action with no pattern) prints each
+record.  Because each @code{print} statement adds a newline at the end of
+its output, this @command{awk} program copies the input
+with each slash changed to a newline.  Here are the results of running
+the program on @file{BBS-list}:
+
+@example
+$ awk 'BEGIN @{ RS = "/" @}
+>      @{ print $0 @}' BBS-list
+@print{} aardvark     555-5553     1200
+@print{} 300          B
+@print{} alpo-net     555-3412     2400
+@print{} 1200
+@print{} 300     A
+@print{} barfly       555-7685     1200
+@print{} 300          A
+@print{} bites        555-1675     2400
+@print{} 1200
+@print{} 300     A
+@print{} camelot      555-0542     300               C
+@print{} core         555-2912     1200
+@print{} 300          C
+@print{} fooey        555-1234     2400
+@print{} 1200
+@print{} 300     B
+@print{} foot         555-6699     1200
+@print{} 300          B
+@print{} macfoo       555-6480     1200
+@print{} 300          A
+@print{} sdace        555-3430     2400
+@print{} 1200
+@print{} 300     A
+@print{} sabafoo      555-2127     1200
+@print{} 300          C
+@print{}
+@end example
+
+@noindent
+Note that the entry for the @samp{camelot} BBS is not split.
+In the original @value{DF}
+(@pxref{Sample Data Files}),
+the line looks like this:
+
+@example
+camelot      555-0542     300               C
+@end example
+
+@noindent
+It has one baud rate only, so there are no slashes in the record,
+unlike the others, which have two or more baud rates.
+In fact, this record is treated as part of the record
+for the @samp{core} BBS; the newline separating them in the output
+is the original newline in the @value{DF}, not the one added by
+@command{awk} when it printed the record!
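+
+The same effect can be seen on a smaller scale.  This sketch uses
+made-up input rather than @file{BBS-list}; the shell variable
+@code{records} is only there to hold the output:
+
```shell
# Three records: "x", then "y<newline>z", then "w<newline>".
records=$(printf 'x/y\nz/w\n' | awk 'BEGIN { RS = "/" } { print NR ": " $0 }')
echo "$records"
```
+
+The second record should contain an embedded newline, because only the
+slash ends a record here.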
+
+@cindex record separators, changing
+@cindex separators, for records
+Another way to change the record separator is on the command line,
+using the variable-assignment feature
+(@pxref{Other Arguments}):
+
+@example
+awk '@{ print $0 @}' RS="/" BBS-list
+@end example
+
+@noindent
+This sets @code{RS} to @samp{/} before processing @file{BBS-list}.
+
+Using an unusual character such as @samp{/} for the record separator
+produces correct behavior in the vast majority of cases.  However,
+the following (extreme) pipeline prints a surprising @samp{1}:
+
+@example
+$ echo | awk 'BEGIN @{ RS = "a" @} ; @{ print NF @}'
+@print{} 1
+@end example
+
+There is one field, consisting of a newline.  The value of the built-in
+variable @code{NF} is the number of fields in the current record.
+
+@cindex dark corner, input files
+Reaching the end of an input file terminates the current input record,
+even if the last character in the file is not the character in @code{RS}.
+@value{DARKCORNER}
+
+@cindex null strings
+@cindex strings, empty, See null strings
+The empty string @code{""} (a string without any characters)
+has a special meaning
+as the value of @code{RS}. It means that records are separated
+by one or more blank lines and nothing else.
+@xref{Multiple Line}, for more details.
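+
+As a small illustration of this special case, the following sketch
+feeds @command{awk} two groups of lines separated by blank lines.  The
+input text and the shell variable @code{paragraphs} are invented for
+this example, and the @code{gsub} call only makes the newlines inside
+each record visible:
+
```shell
# Blank lines separate the records; each record may span several lines.
paragraphs=$(printf 'a\nb\n\n\nc\n' |
    awk 'BEGIN { RS = "" } { gsub(/\n/, "|"); print NR ": " $0 }')
echo "$paragraphs"
```
+
+Two records should be reported, even though the input has two blank
+lines in a row between them.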
+
+If you change the value of @code{RS} in the middle of an @command{awk} run,
+the new value is used to delimit subsequent records, but the record
+currently being processed, as well as records already processed, are not
+affected.
+
+@cindex @code{RT} variable
+@cindex records, terminating
+@cindex terminating records
+@cindex differences in @command{awk} and @command{gawk}, record separators
+@cindex regular expressions, as record separators
+@cindex record separators, regular expressions as
+@cindex separators, for records, regular expressions as
+After the end of the record has been determined, @command{gawk}
+sets the variable @code{RT} to the text in the input that matched
+@code{RS}.
+When using @command{gawk},
+the value of @code{RS} is not limited to a one-character
+string.  It can be any regular expression
+(@pxref{Regexp}).
+In general, each record
+ends at the next string that matches the regular expression; the next
+record starts at the end of the matching string.  This general rule is
+actually at work in the usual case, where @code{RS} contains just a
+newline: a record ends at the beginning of the next matching string (the
+next newline in the input), and the following record starts just after
+the end of this string (at the first character of the following line).
+The newline, because it matches @code{RS}, is not part of either record.
+
+When @code{RS} is a single character, @code{RT}
+contains the same single character. However, when @code{RS} is a
+regular expression, @code{RT} contains
+the actual input text that matched the regular expression.
+
+The following example illustrates both of these features.
+It sets @code{RS} equal to a regular expression that
+matches either a newline or a series of one or more uppercase letters
+with optional leading and/or trailing whitespace:
+
+@example
+$ echo record 1 AAAA record 2 BBBB record 3 |
+> gawk 'BEGIN @{ RS = "\n|( *[[:upper:]]+ *)" @}
+>             @{ print "Record =", $0, "and RT =", RT @}'
+@print{} Record = record 1 and RT =  AAAA
+@print{} Record = record 2 and RT =  BBBB
+@print{} Record = record 3 and RT =
+@print{}
+@end example
+
+@noindent
+The output ends with an extra blank line.  This is because the
+value of @code{RT} is a newline, and the @code{print} statement
+supplies its own terminating newline.
+@xref{Simple Sed}, for a more useful example
+of @code{RS} as a regexp and @code{RT}.
+
+If you set @code{RS} to a regular expression that allows optional
+trailing text, such as @samp{RS = "abc(XYZ)?"}, it is possible, due
+to implementation constraints, that @command{gawk} may match the leading
+part of the regular expression, but not the trailing part, particularly
+if the input text that could match the trailing part is fairly long.
+@command{gawk} attempts to avoid this problem, but currently, there's
+no guarantee that this will never happen.
+
+@cindex differences in @command{awk} and @command{gawk}, @code{RS}/@code{RT} variables
+The use of @code{RS} as a regular expression and the @code{RT}
+variable are @command{gawk} extensions; they are not available in
+compatibility mode
+(@pxref{Options}).
+In compatibility mode, only the first character of the value of
+@code{RS} is used to determine the end of the record.
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: @code{RS = "\0"} Is Not Portable
+
+@cindex advanced features, @value{DF}s as single record
+@cindex portability, @value{DF}s as single record
+There are times when you might want to treat an entire @value{DF} as a
+single record.  The only way to make this happen is to give @code{RS}
+a value that you know doesn't occur in the input file.  This is hard
+to do in a general way, such that a program always works for arbitrary
+input files.
+@c can you say `understatement' boys and girls?
+
+You might think that for text files, the @sc{nul} character, which
+is the character with all bits equal to zero, is a good
+value to use for @code{RS} in this case:
+
+@example
+BEGIN @{ RS = "\0" @}  # whole file becomes one record?
+@end example
+
+@cindex differences in @command{awk} and @command{gawk}, strings, storing
+@command{gawk} in fact accepts this, and uses the @sc{nul}
+character for the record separator.
+However, this usage is @emph{not} portable
+to other @command{awk} implementations.
+
+@cindex dark corner, strings, storing
+All other @command{awk} implementations@footnote{At least that we know
+about.} store strings internally as C-style strings.  C strings use the
+@sc{nul} character as the string terminator.  In effect, this means that
+@samp{RS = "\0"} is the same as @samp{RS = ""}.
+@value{DARKCORNER}
+
+@cindex records, treating files as
+@cindex files, as single records
+The best way to treat a whole file as a single record is to
+simply read the file in, one record at a time, concatenating each
+record onto the end of the previous ones.
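+
+A minimal sketch of this approach follows.  It assumes the default
+record separator, so that every record ended with a newline in the
+input; the variable name @code{whole} and the sample text are
+hypothetical:
+
```shell
# Append every record, plus the newline that awk removed, to one string.
size=$(printf 'line one\nline two\n' |
    awk '{ whole = whole $0 "\n" } END { print length(whole) }')
echo "$size"
```
+
+The @code{END} rule sees the complete text in @code{whole}.  Here it
+merely prints the length, which should equal the size of the input.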
+@c ENDOFRANGE inspl
+@c ENDOFRANGE recspl
+
+@node Fields
+@section Examining Fields
+
+@cindex examining fields
+@cindex fields
+@cindex accessing fields
+@c STARTOFRANGE fiex
+@cindex fields, examining
+@cindex POSIX @command{awk}, field separators and
+@cindex field separators, POSIX and
+@cindex separators, field, POSIX and
+When @command{awk} reads an input record, the record is
+automatically @dfn{parsed} or separated by the interpreter into chunks
+called @dfn{fields}.  By default, fields are separated by @dfn{whitespace},
+like words in a line.
+Whitespace in @command{awk} means any string of one or more spaces,
+tabs, or newlines;@footnote{In POSIX @command{awk}, newlines are not
+considered whitespace for separating fields.} other characters, such as
+formfeed, vertical tab, etc.@: that are
+considered whitespace by other languages, are @emph{not} considered
+whitespace by @command{awk}.
+
+The purpose of fields is to make it more convenient for you to refer to
+these pieces of the record.  You don't have to use them---you can
+operate on the whole record if you want---but fields are what make
+simple @command{awk} programs so powerful.
+
+@cindex @code{$} field operator
+@cindex field operator @code{$}
+@cindex @code{$} (dollar sign), @code{$} field operator
+@cindex dollar sign (@code{$}), @code{$} field operator
+@c The comma here does NOT mark a secondary term:
+@cindex field operators, dollar sign as
+A dollar-sign (@samp{$}) is used
+to refer to a field in an @command{awk} program,
+followed by the number of the field you want.  Thus, @code{$1}
+refers to the first field, @code{$2} to the second, and so on.
+(Unlike the Unix shells, the field numbers are not limited to single digits.
+@code{$127} is the one hundred twenty-seventh field in the record.)
+For example, suppose the following is a line of input:
+
+@example
+This seems like a pretty nice example.
+@end example
+
+@noindent
+Here the first field, or @code{$1}, is @samp{This}, the second field, or
+@code{$2}, is @samp{seems}, and so on.  Note that the last field,
+@code{$7}, is @samp{example.}.  Because there is no space between the
+@samp{e} and the @samp{.}, the period is considered part of the seventh
+field.
+
+@cindex @code{NF} variable
+@cindex fields, number of
+@code{NF} is a built-in variable whose value is the number of fields
+in the current record.  @command{awk} automatically updates the value
+of @code{NF} each time it reads a record.  No matter how many fields
+there are, the last field in a record can be represented by @code{$NF}.
+So, @code{$NF} is the same as @code{$7}, which is @samp{example.}.
+If you try to reference a field beyond the last
+one (such as @code{$8} when the record has only seven fields), you get
+the empty string.  (If used in a numeric operation, you get zero.)
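+
+These rules can be checked against the sample line used above.  In
+this sketch the square brackets are added only to make the empty
+string visible, and the shell variable @code{fields} is hypothetical:
+
```shell
# Print the field count, the last field, and field 8 as string and number.
fields=$(echo 'This seems like a pretty nice example.' |
    awk '{ print NF; print $NF; print "[" $8 "]"; print $8 + 0 }')
echo "$fields"
```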
+
+The use of @code{$0}, which looks like a reference to the ``zero-th'' field, is
+a special case: it represents the whole input record
+when you are not interested in specific fields.
+Here are some more examples:
+
+@example
+$ awk '$1 ~ /foo/ @{ print $0 @}' BBS-list
+@print{} fooey        555-1234     2400/1200/300     B
+@print{} foot         555-6699     1200/300          B
+@print{} macfoo       555-6480     1200/300          A
+@print{} sabafoo      555-2127     1200/300          C
+@end example
+
+@noindent
+This example prints each record in the file @file{BBS-list} whose first
+field contains the string @samp{foo}.  The operator @samp{~} is called a
+@dfn{matching operator}
+(@pxref{Regexp Usage});
+it tests whether a string (here, the field @code{$1}) matches a given regular
+expression.
+
+By contrast, the following example
+looks for @samp{foo} in @emph{the entire record} and prints the first
+field and the last field for each matching input record:
+
+@example
+$ awk '/foo/ @{ print $1, $NF @}' BBS-list
+@print{} fooey B
+@print{} foot B
+@print{} macfoo A
+@print{} sabafoo C
+@end example
+@c ENDOFRANGE fiex
+
+@node Nonconstant Fields
+@section Nonconstant Field Numbers
+@cindex fields, numbers
+@cindex field numbers
+
+The number of a field does not need to be a constant.  Any expression in
+the @command{awk} language can be used after a @samp{$} to refer to a
+field.  The value of the expression specifies the field number.  If the
+value is a string, rather than a number, it is converted to a number.
+Consider this example:
+
+@example
+awk '@{ print $NR @}'
+@end example
+
+@noindent
+Recall that @code{NR} is the number of records read so far: one in the
+first record, two in the second, etc.  So this example prints the first
+field of the first record, the second field of the second record, and so
+on.  For the twentieth record, field number 20 is printed; most likely,
+the record has fewer than 20 fields, so this prints a blank line.
+Here is another example of using expressions as field numbers:
+
+@example
+awk '@{ print $(2*2) @}' BBS-list
+@end example
+
+@command{awk} evaluates the expression @samp{(2*2)} and uses
+its value as the number of the field to print.  The @samp{*} sign
+represents multiplication, so the expression @samp{2*2} evaluates to four.
+The parentheses are used so that the multiplication is done before the
+@samp{$} operation; they are necessary whenever there is a binary
+operator in the field-number expression.  This example, then, prints the
+hours of operation (the fourth field) for every line of the file
+@file{BBS-list}.  (All of the @command{awk} operators are listed, in
+order of decreasing precedence, in
+@ref{Precedence}.)
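+
+The effect of the parentheses is easy to see with numeric fields.  The
+following sketch uses made-up input; the shell variable @code{result}
+is hypothetical:
+
```shell
# $(NF-1) is the next-to-last field; $NF-1 is the last field minus one.
result=$(echo '10 20 30' | awk '{ print $(NF-1), $NF-1 }')
echo "$result"
```
+
+Without parentheses, @samp{$} is applied to @code{NF} first and the
+subtraction is applied to the field's value.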
+
+If the field number you compute is zero, you get the entire record.
+Thus, @samp{$(2-2)} has the same value as @code{$0}.  Negative field
+numbers are not allowed; trying to reference one usually terminates
+the program.  (The POSIX standard does not define
+what happens when you reference a negative field number.  @command{gawk}
+notices this and terminates your program.  Other @command{awk}
+implementations may behave differently.)
+
+As mentioned in @ref{Fields},
+@command{awk} stores the current record's number of fields in the built-in
+variable @code{NF} (also @pxref{Built-in Variables}).  The expression
+@code{$NF} is not a special feature---it is the direct consequence of
+evaluating @code{NF} and using its value as a field number.
+
+@node Changing Fields
+@section Changing the Contents of a Field
+
+@c STARTOFRANGE ficon
+@cindex fields, changing contents of
+The contents of a field, as seen by @command{awk}, can be changed within an
+@command{awk} program; this changes what @command{awk} perceives as the
+current input record.  (The actual input is untouched; @command{awk} @emph{never}
+modifies the input file.)
+Consider the following example and its output:
+
+@example
+$ awk '@{ nboxes = $3 ; $3 = $3 - 10
+>        print nboxes, $3 @}' inventory-shipped
+@print{} 25 15
+@print{} 32 22
+@print{} 24 14
+@dots{}
+@end example
+
+@noindent
+The program first saves the original value of field three in the variable
+@code{nboxes}.
+The @samp{-} sign represents subtraction, so this program reassigns
+field three, @code{$3}, as the original value of field three minus ten:
+@samp{$3 - 10}.  (@xref{Arithmetic Ops}.)
+Then it prints the original and new values for field three.
+(Someone in the warehouse made a consistent mistake while inventorying
+the red boxes.)
+
+For this to work, the text in field @code{$3} must make sense
+as a number; the string of characters must be converted to a number
+for the computer to do arithmetic on it.  The number resulting
+from the subtraction is converted back to a string of characters that
+then becomes field three.
+@xref{Conversion}.
+
+When the value of a field is changed (as perceived by @command{awk}), the
+text of the input record is recalculated to contain the new field where
+the old one was.  In other words, @code{$0} changes to reflect the altered
+field.  Thus, this program
+prints a copy of the input file, with 10 subtracted from the second
+field of each line:
+
+@example
+$ awk '@{ $2 = $2 - 10; print $0 @}' inventory-shipped
+@print{} Jan 3 25 15 115
+@print{} Feb 5 32 24 226
+@print{} Mar 5 24 34 228
+@dots{}
+@end example
+
+It is also possible to assign contents to fields that are out
+of range.  For example:
+
+@example
+$ awk '@{ $6 = ($5 + $4 + $3 + $2)
+>        print $6 @}' inventory-shipped
+@print{} 168
+@print{} 297
+@print{} 301
+@dots{}
+@end example
+
+@cindex adding, fields
+@cindex fields, adding
+@noindent
+We've just created @code{$6}, whose value is the sum of fields
+@code{$2}, @code{$3}, @code{$4}, and @code{$5}.  The @samp{+} sign
+represents addition.  For the file @file{inventory-shipped}, @code{$6}
+represents the total number of parcels shipped for a particular month.
+
+Creating a new field changes @command{awk}'s internal copy of the current
+input record, which is the value of @code{$0}.  Thus, if you do @samp{print $0}
+after adding a field, the record printed includes the new field, with
+the appropriate number of field separators between it and the previously
+existing fields.
+
+@cindex @code{OFS} variable
+@cindex output field separator, See @code{OFS} variable
+@cindex field separators, See Also @code{OFS}
+This recomputation affects and is affected by
+@code{NF} (the number of fields; @pxref{Fields}).
+For example, the value of @code{NF} is set to the number of the highest
+field you create.
+The exact format of @code{$0} is also affected by a feature that has not been discussed yet:
+the @dfn{output field separator}, @code{OFS},
+used to separate the fields (@pxref{Output Separators}).
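+
+A short sketch of both effects, using made-up input and a hypothetical
+shell variable @code{rebuilt}.  Here @code{OFS} is set to a hyphen so
+that the rebuilt record is easy to tell apart from the original:
+
```shell
# Creating field 4 rebuilds $0 with OFS between fields and raises NF.
rebuilt=$(echo 'a b c' | awk 'BEGIN { OFS = "-" } { $4 = "d"; print; print NF }')
echo "$rebuilt"
```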
+
+Note, however, that merely @emph{referencing} an out-of-range field
+does @emph{not} change the value of either @code{$0} or @code{NF}.
+Referencing an out-of-range field only produces an empty string.  For
+example:
+
+@example
+if ($(NF+1) != "")
+    print "can't happen"
+else
+    print "everything is normal"
+@end example
+
+@noindent
+should print @samp{everything is normal}, because @code{NF+1} is certain
+to be out of range.  (@xref{If Statement},
+for more information about @command{awk}'s @code{if-else} statements.
+@xref{Typing and Comparison},
+for more information about the @samp{!=} operator.)
+
+It is important to note that making an assignment to an existing field
+changes the
+value of @code{$0} but does not change the value of @code{NF},
+even when you assign the empty string to a field.  For example:
+
+@example
+$ echo a b c d | awk '@{ OFS = ":"; $2 = ""
+>                       print $0; print NF @}'
+@print{} a::c:d
+@print{} 4
+@end example
+
+@noindent
+The field is still there; it just has an empty value, denoted by
+the two colons between @samp{a} and @samp{c}.
+This example shows what happens if you create a new field:
+
+@example
+$ echo a b c d | awk '@{ OFS = ":"; $2 = ""; $6 = "new"
+>                       print $0; print NF @}'
+@print{} a::c:d::new
+@print{} 6
+@end example
+
+@noindent
+The intervening field, @code{$5}, is created with an empty value
+(indicated by the second pair of adjacent colons),
+and @code{NF} is updated with the value six.
+
+@c FIXME: Verify that this is in POSIX
+@cindex dark corner, @code{NF} variable, decrementing
+@cindex @code{NF} variable, decrementing
+Decrementing @code{NF} throws away the values of the fields
+after the new value of @code{NF} and recomputes @code{$0}.
+@value{DARKCORNER}
+Here is an example:
+
+@example
+$ echo a b c d e f | awk '@{ print "NF =", NF;
+>                            NF = 3; print $0 @}'
+@print{} NF = 6
+@print{} a b c
+@end example
+
+@c the comma before decrementing does NOT represent a tertiary entry
+@cindex portability, @code{NF} variable, decrementing
+@strong{Caution:} Some versions of @command{awk} don't
+rebuild @code{$0} when @code{NF} is decremented. Caveat emptor.
+
+Finally, there are times when it is convenient to force
+@command{awk} to rebuild the entire record, using the current
+value of the fields and @code{OFS}.  To do this, use the
+seemingly innocuous assignment:
+
+@example
+$1 = $1   # force record to be reconstituted
+print $0  # or whatever else with $0
+@end example
+
+@noindent
+This forces @command{awk} to rebuild the record.  It does help
+to add a comment, as we've shown here.
+
+There is a flip side to the relationship between @code{$0} and
+the fields.  Any assignment to @code{$0} causes the record to be
+reparsed into fields using the @emph{current} value of @code{FS}.
+This also applies to any built-in function that updates @code{$0},
+such as @code{sub} and @code{gsub}
+(@pxref{String Functions}).
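+
+The following sketch shows the reparsing that happens when @code{sub}
+changes @code{$0}.  The input, the @command{awk} variable
+@code{before}, and the shell variable @code{reparsed} are invented for
+this illustration:
+
```shell
# The record starts with two fields; replacing the comma with a space
# changes $0, so the record is split again and has three fields.
reparsed=$(echo 'a,b c' |
    awk '{ before = NF; sub(/,/, " "); print before, NF, $2 }')
echo "$reparsed"
```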
+@c ENDOFRANGE ficon
+
+@node Field Separators
+@section Specifying How Fields Are Separated
+
+@menu
+* Regexp Field Splitting::       Using regexps as the field separator.
+* Single Character Fields::      Making each character a separate field.
+* Command Line Field Separator:: Setting @code{FS} from the command line.
+* Field Splitting Summary::      Some final points and a summary table.
+@end menu
+
+@cindex @code{FS} variable
+@cindex fields, separating
+@c STARTOFRANGE fisepr
+@cindex field separators
+@c STARTOFRANGE fisepg
+@cindex fields, separating
+The @dfn{field separator}, which is either a single character or a regular
+expression, controls the way @command{awk} splits an input record into fields.
+@command{awk} scans the input record for character sequences that
+match the separator; the fields themselves are the text between the matches.
+
+In the examples that follow, we use the bullet symbol (@bullet{}) to
+represent spaces in the output.
+If the field separator is @samp{oo}, then the following line:
+
+@example
+moo goo gai pan
+@end example
+
+@noindent
+is split into three fields: @samp{m}, @samp{@bullet{}g}, and
+@samp{@bullet{}gai@bullet{}pan}.
+Note the leading spaces in the values of the second and third fields.
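+
+This split can be reproduced with a short sketch.  The square brackets
+are added only to make the leading spaces visible, and the shell
+variable @code{pieces} is hypothetical:
+
```shell
# Split on "oo", then print the field count and each field in brackets.
pieces=$(echo 'moo goo gai pan' |
    awk 'BEGIN { FS = "oo" }
         { print NF; for (i = 1; i <= NF; i++) print "[" $i "]" }')
echo "$pieces"
```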
+
+@cindex troubleshooting, @command{awk} uses @code{FS} not @code{IFS}
+The field separator is represented by the built-in variable @code{FS}.
+Shell programmers take note:  @command{awk} does @emph{not} use the
+name @code{IFS} that is used by the POSIX-compliant shells (such as
+the Unix Bourne shell, @command{sh}, or @command{bash}).
+
+@cindex @code{FS} variable, changing value of
+The value of @code{FS} can be changed in the @command{awk} program with the
+assignment operator, @samp{=} (@pxref{Assignment Ops}).
+Often the right time to do this is at the beginning of execution
+before any input has been processed, so that the very first record
+is read with the proper separator.  To do this, use the special
+@code{BEGIN} pattern
+(@pxref{BEGIN/END}).
+For example, here we set the value of @code{FS} to the string
+@code{","}:
+
+@example
+awk 'BEGIN @{ FS = "," @} ; @{ print $2 @}'
+@end example
+
+@cindex @code{BEGIN} pattern
+@noindent
+Given the input line:
+
+@example
+John Q. Smith, 29 Oak St., Walamazoo, MI 42139
+@end example
+
+@noindent
+this @command{awk} program extracts and prints the string
+@samp{@bullet{}29@bullet{}Oak@bullet{}St.}.
+
+@cindex field separators, choice of
+@cindex regular expressions as field separators
+@cindex field separators, regular expressions as
+Sometimes the input data contains separator characters that don't
+separate fields the way you thought they would.  For instance, the
+person's name in the example we just used might have a title or
+suffix attached, such as:
+
+@example
+John Q. Smith, LXIX, 29 Oak St., Walamazoo, MI 42139
+@end example
+
+@noindent
+The same program would extract @samp{@bullet{}LXIX}, instead of
+@samp{@bullet{}29@bullet{}Oak@bullet{}St.}.
+If you were expecting the program to print the
+address, you would be surprised.  The moral is to choose your data layout and
+separator characters carefully to prevent such problems.
+(If the data is not in a form that is easy to process, perhaps you
+can massage it first with a separate @command{awk} program.)
+
+@cindex newlines, as field separators
+@cindex whitespace, as field separators
+Fields are normally separated by whitespace sequences
+(spaces, tabs, and newlines), not by single spaces.  Two spaces in a row do not
+delimit an empty field.  The default value of the field separator @code{FS}
+is a string containing a single space, @w{@code{" "}}.  If @command{awk}
+interpreted this value in the usual way, each space character would separate
+fields, so two spaces in a row would make an empty field between them.
+The reason this does not happen is that a single space as the value of
+@code{FS} is a special case---it is taken to specify the default manner
+of delimiting fields.
+
+If @code{FS} is any other single character, such as @code{","}, then
+each occurrence of that character separates two fields.  Two consecutive
+occurrences delimit an empty field.  If the character occurs at the
+beginning or the end of the line, that too delimits an empty field.  The
+space character is the only single character that does not follow these
+rules.
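+
+As a rough check of these rules, the following sketch counts the
+fields in two made-up lines, one split on commas and one split in the
+default manner.  The shell variables @code{commas} and @code{spaces}
+are hypothetical:
+
```shell
# Leading, trailing, and doubled commas each delimit an empty field.
commas=$(echo ',a,,b,' | awk 'BEGIN { FS = "," } { print NF }')
# With the default FS, runs of spaces separate fields and none are empty.
spaces=$(echo '  a   b  ' | awk '{ print NF }')
echo "$commas $spaces"
```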
+
+@node Regexp Field Splitting
+@subsection Using Regular Expressions to Separate Fields
+
+@c STARTOFRANGE regexpfs
+@cindex regular expressions, as field separators
+@c STARTOFRANGE fsregexp
+@cindex field separators, regular expressions as
+The previous @value{SUBSECTION}
+discussed the use of single characters or simple strings as the
+value of @code{FS}.
+More generally, the value of @code{FS} may be a string containing any
+regular expression.  In this case, each match in the record for the regular
+expression separates fields.  For example, the assignment:
+
+@example
+FS = ", \t"
+@end example
+
+@noindent
+makes every area of an input line that consists of a comma followed by a
+space and a TAB into a field separator.
+@ifinfo
+(@samp{\t}
+is an @dfn{escape sequence} that stands for a TAB;
+@pxref{Escape Sequences},
+for the complete list of similar escape sequences.)
+@end ifinfo
+
+For a less trivial example of a regular expression, try using
+single spaces to separate fields the way single commas are used.
+@code{FS} can be set to @w{@code{"[@ ]"}} (left bracket, space, right
+bracket).  This regular expression matches a single space and nothing else
+(@pxref{Regexp}).
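+
+A minimal sketch of this setting, using made-up input with two spaces
+between the letters; the shell variable @code{single} is hypothetical
+and the square brackets only make the empty field visible:
+
```shell
# Each single space separates fields, so two spaces enclose an empty field.
single=$(echo 'a  b' | awk 'BEGIN { FS = "[ ]" } { print NF, "[" $2 "]" }')
echo "$single"
```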
+
+There is an important difference between the two cases of @samp{FS = @w{" "}}
+(a single space) and @samp{FS = @w{"[ \t\n]+"}}
+(a regular expression matching one or more spaces, tabs, or newlines).
+For both values of @code{FS}, fields are separated by @dfn{runs}
+(multiple adjacent occurrences) of spaces, tabs,
+and/or newlines.  However, when the value of @code{FS} is @w{@code{" "}},
+@command{awk} first strips leading and trailing whitespace from
+the record and then decides where the fields are.
+For example, the following pipeline prints @samp{b}:
+
+@example
+$ echo ' a b c d ' | awk '@{ print $2 @}'
+@print{} b
+@end example
+
+@noindent
+However, this pipeline prints @samp{a} (note the extra spaces around
+each letter):
+
+@example
+$ echo ' a  b  c  d ' | awk 'BEGIN @{ FS = "[ \t\n]+" @}
+>                                  @{ print $2 @}'
+@print{} a
+@end example
+
+@noindent
+@cindex null strings
+@cindex strings, null
+@cindex empty strings, See null strings
+In this case, the first field is @dfn{null} or empty.
+
+The stripping of leading and trailing whitespace also comes into
+play whenever @code{$0} is recomputed.  For instance, study this pipeline:
+
+@example
+$ echo '   a b c d' | awk '@{ print; $2 = $2; print @}'
+@print{}    a b c d
+@print{} a b c d
+@end example
+
+@noindent
+The first @code{print} statement prints the record as it was read,
+with leading whitespace intact.  The assignment to @code{$2} rebuilds
+@code{$0} by concatenating @code{$1} through @code{$NF} together,
+separated by the value of @code{OFS}.  Because the leading whitespace
+was ignored when finding @code{$1}, it is not part of the new @code{$0}.
+Finally, the last @code{print} statement prints the new @code{$0}.
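+
+A common idiom that relies on this behavior is a dummy assignment such as
+@samp{$1 = $1}, which forces @code{$0} to be rebuilt.  The following is a
+minimal sketch with made-up input; with the default @code{FS} and @code{OFS},
+it squeezes each run of whitespace down to a single space:
+
+@example
+$ echo '  a   b  ' | awk '@{ $1 = $1; print @}'
+@print{} a b
+@end example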
+@c ENDOFRANGE regexpfs
+@c ENDOFRANGE fsregexp
+
+@node Single Character Fields
+@subsection Making Each Character a Separate Field
+
+@cindex differences in @command{awk} and @command{gawk}, single-character fields
+@cindex single-character fields
+@cindex fields, single-character
+There are times when you may want to examine each character
+of a record separately.  This can be done in @command{gawk} by
+simply assigning the null string (@code{""}) to @code{FS}. In this case,
+each individual character in the record becomes a separate field.
+For example:
+
+@example
+$ echo a b | gawk 'BEGIN @{ FS = "" @}
+>                  @{
+>                      for (i = 1; i <= NF; i = i + 1)
+>                          print "Field", i, "is", $i
+>                  @}'
+@print{} Field 1 is a
+@print{} Field 2 is
+@print{} Field 3 is b
+@end example
+
+@cindex dark corner, @code{FS} as null string
+@cindex FS variable, as null string
+Traditionally, the behavior of @code{FS} equal to @code{""} was not defined.
+In this case, most versions of Unix @command{awk} simply treat the entire record
+as only having one field.
+@value{DARKCORNER}
+In compatibility mode
+(@pxref{Options}),
+if @code{FS} is the null string, then @command{gawk} also
+behaves this way.
+
+@node Command Line Field Separator
+@subsection Setting @code{FS} from the Command Line
+@cindex @code{-F} option
+@cindex options, command-line
+@cindex command line, options
+@cindex field separators, on command line
+@c The comma before "setting" does NOT represent a tertiary
+@cindex command line, @code{FS} on, setting
+@cindex @code{FS} variable, setting from command line
+
+@code{FS} can be set on the command line.  Use the @option{-F} option to
+do so.  For example:
+
+@example
+awk -F, '@var{program}' @var{input-files}
+@end example
+
+@noindent
+sets @code{FS} to the @samp{,} character.  Notice that the option uses
+an uppercase @samp{F} instead of a lowercase @samp{f}. The latter
+option (@option{-f}) specifies a file
+containing an @command{awk} program.  Case is significant in command-line
+options:
+the @option{-F} and @option{-f} options have nothing to do with each other.
+You can use both options at the same time to set the @code{FS} variable
+@emph{and} get an @command{awk} program from a file.
+
+The value used for the argument to @option{-F} is processed in exactly the
+same way as assignments to the built-in variable @code{FS}.
+Any special characters in the field separator must be escaped
+appropriately.  For example, to use a @samp{\} as the field separator
+on the command line, you would have to type:
+
+@example
+# same as FS = "\\"
+awk -F\\\\ '@dots{}' files @dots{}
+@end example
+
+@noindent
+@cindex @code{\} (backslash), as field separators
+@cindex backslash (@code{\}), as field separators
+Because @samp{\} is used for quoting in the shell, @command{awk} sees
+@samp{-F\\}.  Then @command{awk} processes the @samp{\\} for escape
+characters (@pxref{Escape Sequences}), finally yielding
+a single @samp{\} to use for the field separator.
+
+@c @cindex historical features
+As a special case, in compatibility mode
+(@pxref{Options}),
+if the argument to @option{-F} is @samp{t}, then @code{FS} is set to
+the TAB character.  If you type @samp{-F\t} at the
+shell, without any quotes, the @samp{\} gets deleted, so @command{awk}
+figures that you really want your fields to be separated with tabs and
+not @samp{t}s.  Use @samp{-v FS="t"} or @samp{-F"[t]"} on the command line
+if you really do want to separate your fields with @samp{t}s.
+
+For example, let's use an @command{awk} program file called @file{baud.awk}
+that contains the pattern @code{/300/} and the action @samp{print $1}:
+
+@example
+/300/   @{ print $1 @}
+@end example
+
+Let's also set @code{FS} to be the @samp{-} character and run the
+program on the file @file{BBS-list}.  The following command prints a
+list of the names of the bulletin boards that operate at 300 baud and
+the first three digits of their phone numbers:
+
+@c tweaked to make the tex output look better in @smallbook
+@example
+$ awk -F- -f baud.awk BBS-list
+@print{} aardvark     555
+@print{} alpo
+@print{} barfly       555
+@print{} bites        555
+@print{} camelot      555
+@print{} core         555
+@print{} fooey        555
+@print{} foot         555
+@print{} macfoo       555
+@print{} sdace        555
+@print{} sabafoo      555
+@end example
+
+@noindent
+Note the second line of output.  The second line
+in the original file looked like this:
+
+@example
+alpo-net     555-3412     2400/1200/300     A
+@end example
+
+The @samp{-} as part of the system's name was used as the field
+separator, instead of the @samp{-} in the phone number that was
+originally intended.  This demonstrates why you have to be careful in
+choosing your field and record separators.
+
+@c The comma after "password files" does NOT start a tertiary
+@cindex Unix @command{awk}, password files, field separators and
+Perhaps the most common use of a single character as the field
+separator occurs when processing the Unix system password file.
+On many Unix systems, each user has a separate entry in the system password
+file, one line per user.  The information in these lines is separated
+by colons.  The first field is the user's logon name and the second is
+the user's (encrypted or shadow) password.  A password file entry might look
+like this:
+
+@cindex Robbins, Arnold
+@example
+arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/bash
+@end example
+
+The following program searches the system password file and prints
+the entries for users who have no password:
+
+@example
+awk -F: '$2 == ""' /etc/passwd
+@end example
+
+@node Field Splitting Summary
+@subsection Field-Splitting Summary
+
+It is important to remember that when you assign a string constant
+as the value of @code{FS}, it undergoes normal @command{awk} string
+processing.  For example, with Unix @command{awk} and @command{gawk},
+the assignment @samp{FS = "\.."} assigns the character string @code{".."}
+to @code{FS} (the backslash is stripped).  This creates a regexp meaning
+``fields are separated by occurrences of any two characters.''
+If instead you want fields to be separated by a literal period followed
+by any single character, use @samp{FS = "\\.."}.
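+
+For instance, in the following sketch (the input is invented for
+illustration), each separator is a period together with the character
+that follows it, so only @samp{a}, @samp{c}, and @samp{e} remain as fields:
+
+@example
+$ echo 'a.bc.de' | awk 'BEGIN @{ FS = "\\.." @} @{ print $1, $2, $3 @}'
+@print{} a c e
+@end example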
+
+The following table summarizes how fields are split, based on the value
+of @code{FS} (@samp{==} means ``is equal to''):
+
+@table @code
+@item FS == " "
+Fields are separated by runs of whitespace.  Leading and trailing
+whitespace are ignored.  This is the default.
+
+@item FS == @var{any other single character}
+Fields are separated by each occurrence of the character.  Multiple
+successive occurrences delimit empty fields, as do leading and
+trailing occurrences.
+The character can even be a regexp metacharacter; it does not need
+to be escaped.
+
+@item FS == @var{regexp}
+Fields are separated by occurrences of characters that match @var{regexp}.
+Leading and trailing matches of @var{regexp} delimit empty fields.
+
+@item FS == ""
+Each individual character in the record becomes a separate field.
+(This is a @command{gawk} extension; it is not specified by the
+POSIX standard.)
+@end table
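+
+As a small illustration of the second case (the sample input is an
+assumption), a @samp{|} used as a single-character @code{FS} is taken
+literally rather than as the regexp alternation operator, and the two
+adjacent occurrences delimit an empty field:
+
+@example
+$ echo 'a|b||c' | awk -F'|' '@{ print NF @}'
+@print{} 4
+@end example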
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Changing @code{FS} Does Not Affect the Fields
+
+@cindex POSIX @command{awk}, field separators and
+@cindex field separators, POSIX and
+According to the POSIX standard, @command{awk} is supposed to behave
+as if each record is split into fields at the time it is read.
+In particular, this means that if you change the value of @code{FS}
+after a record is read, the value of the fields (i.e., how they were split)
+should reflect the old value of @code{FS}, not the new one.
+
+@cindex dark corner, field separators
+@cindex @command{sed} utility
+@cindex stream editors
+However, many implementations of @command{awk} do not work this way.  Instead,
+they defer splitting the fields until a field is actually
+referenced.  The fields are split
+using the @emph{current} value of @code{FS}!
+@value{DARKCORNER}
+This behavior can be difficult
+to diagnose. The following example illustrates the difference
+between the two methods.
+(The @command{sed}@footnote{The @command{sed} utility is a ``stream editor.''
+Its behavior is also defined by the POSIX standard.}
+command prints just the first line of @file{/etc/passwd}.)
+
+@example
+sed 1q /etc/passwd | awk '@{ FS = ":" ; print $1 @}'
+@end example
+
+@noindent
+which usually prints:
+
+@example
+root
+@end example
+
+@noindent
+on an incorrect implementation of @command{awk}, while @command{gawk}
+prints something like:
+
+@example
+root:nSijPlPhZZwgE:0:0:Root:/:
+@end example
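+
+One way to sidestep the question entirely is to set @code{FS} before any
+record is read, either in a @code{BEGIN} rule or with @option{-F}.  A
+minimal sketch, using an invented input line in place of @file{/etc/passwd}:
+
+@example
+$ echo 'root:x:0:0' | awk 'BEGIN @{ FS = ":" @} @{ print $1 @}'
+@print{} root
+@end example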
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: @code{FS} and @code{IGNORECASE}
+
+The @code{IGNORECASE} variable
+(@pxref{User-modified})
+affects field splitting @emph{only} when the value of @code{FS} is a regexp.
+It has no effect when @code{FS} is a single character, even if
+that character is a letter.  Thus, in the following code:
+
+@example
+FS = "c"
+IGNORECASE = 1
+$0 = "aCa"
+print $1
+@end example
+
+@noindent
+the output is @samp{aCa}.  If you really want to split fields on an
+alphabetic character while ignoring case, use a regexp that will
+do it for you.  E.g., @samp{FS = "[c]"}.  In this case, @code{IGNORECASE}
+will take effect.
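+
+Here is the earlier fragment reworked along those lines, as a sketch for
+@command{gawk} (other versions of @command{awk} do not have
+@code{IGNORECASE}).  With the bracket expression in place, the uppercase
+@samp{C} now acts as a separator:
+
+@example
+$ gawk 'BEGIN @{ IGNORECASE = 1; FS = "[c]"; $0 = "aCa"; print $1 @}'
+@print{} a
+@end example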
+
+@c ENDOFRANGE fisepr
+@c ENDOFRANGE fisepg
+
+@node Constant Size
+@section Reading Fixed-Width Data
+
+@ifnotinfo
+@strong{Note:} This @value{SECTION} discusses an advanced
+feature of @command{gawk}.  If you are a novice @command{awk} user,
+you might want to skip it on the first reading.
+@end ifnotinfo
+
+@ifinfo
+(This @value{SECTION} discusses an advanced feature of @command{gawk}.
+If you are a novice @command{awk} user, you might want to skip it on
+the first reading.)
+@end ifinfo
+
+@cindex data, fixed-width
+@cindex fixed-width data
+@cindex advanced features, fixed-width data
+@command{gawk} @value{PVERSION} 2.13 introduced a facility for dealing with
+fixed-width fields with no distinctive field separator.  For example,
+data of this nature arises in the input for old Fortran programs where
+numbers are run together, or in the output of programs that did not
+anticipate the use of their output as input for other programs.
+
+An example of the latter is a table where all the columns are lined up by
+the use of a variable number of spaces and @emph{empty fields are just
+spaces}.  Clearly, @command{awk}'s normal field splitting based on @code{FS}
+does not work well in this case.  Although a portable @command{awk} program
+can use a series of @code{substr} calls on @code{$0}
+(@pxref{String Functions}),
+this is awkward and inefficient for a large number of fields.
+
+@c comma before specifying is part of tertiary
+@cindex troubleshooting, fatal errors, field widths, specifying
+@cindex @command{w} utility
+@cindex @code{FIELDWIDTHS} variable
+The splitting of an input record into fixed-width fields is specified by
+assigning a string containing space-separated numbers to the built-in
+variable @code{FIELDWIDTHS}.  Each number specifies the width of the field,
+@emph{including} columns between fields.  If you want to ignore the columns
+between fields, you can specify the width as a separate field that is
+subsequently ignored.
+It is a fatal error to supply a field width that is not a positive number.
+The following data is the output of the Unix @command{w} utility.  It is useful
+to illustrate the use of @code{FIELDWIDTHS}:
+
+@example
+@group
+ 10:06pm  up 21 days, 14:04,  23 users
+User     tty       login@  idle   JCPU   PCPU  what
+hzuo     ttyV0     8:58pm            9      5  vi p24.tex
+hzang    ttyV3     6:37pm    50                -csh
+eklye    ttyV5     9:53pm            7      1  em thes.tex
+dportein ttyV6     8:17pm  1:47                -csh
+gierd    ttyD3    10:00pm     1                elm
+dave     ttyD4     9:47pm            4      4  w
+brent    ttyp0    26Jun91  4:46  26:46   4:41  bash
+dave     ttyq4    26Jun9115days     46     46  wnewmail
+@end group
+@end example
+
+The following program takes the above input, converts the idle time to
+number of seconds, and prints out the first two fields and the calculated
+idle time:
+
+@strong{Note:}
+This program uses a number of @command{awk} features that
+haven't been introduced yet.
+
+@example
+BEGIN  @{ FIELDWIDTHS = "9 6 10 6 7 7 35" @}
+NR > 2 @{
+    idle = $4
+    sub(/^  */, "", idle)   # strip leading spaces
+    if (idle == "")
+        idle = 0
+    if (idle ~ /:/) @{
+        split(idle, t, ":")
+        idle = t[1] * 60 + t[2]
+    @}
+    if (idle ~ /days/)
+        idle *= 24 * 60 * 60
+
+    print $1, $2, idle
+@}
+@end example
+
+Running the program on the data produces the following results:
+
+@example
+hzuo      ttyV0  0
+hzang     ttyV3  50
+eklye     ttyV5  0
+dportein  ttyV6  107
+gierd     ttyD3  1
+dave      ttyD4  0
+brent     ttyp0  286
+dave      ttyq4  1296000
+@end example
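+
+A much smaller sketch may make the mechanics easier to see.  The input
+string and the widths are chosen purely for illustration:
+
+@example
+$ echo 'abcdefgh' | gawk 'BEGIN @{ FIELDWIDTHS = "3 2 3" @}
+>                          @{ print $1, $2, $3 @}'
+@print{} abc de fgh
+@end example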
+
+Another (possibly more practical) example of fixed-width input data
+is the input from a deck of balloting cards.  In some parts of
+the United States, voters mark their choices by punching holes in computer
+cards.  These cards are then processed to count the votes for any particular
+candidate or on any particular issue.  Because a voter may choose not to
+vote on some issue, any column on the card may be empty.  An @command{awk}
+program for processing such data could use the @code{FIELDWIDTHS} feature
+to simplify reading the data.  (Of course, getting @command{gawk} to run on
+a system with card readers is another story!)
+
+@ignore
+Exercise: Write a ballot card reading program
+@end ignore
+
+@cindex @command{gawk}, splitting fields and
+Assigning a value to @code{FS} causes @command{gawk} to use
+@code{FS} for field splitting again.  Use @samp{FS = FS} to make this happen,
+without having to know the current value of @code{FS}.
+In order to tell which kind of field splitting is in effect,
+use @code{PROCINFO["FS"]}
+(@pxref{Auto-set}).
+The value is @code{"FS"} if regular field splitting is being used,
+or it is @code{"FIELDWIDTHS"} if fixed-width field splitting is being used:
+
+@example
+if (PROCINFO["FS"] == "FS")
+    @var{regular field splitting} @dots{}
+else
+    @var{fixed-width field splitting} @dots{}
+@end example
+
+This information is useful when writing a function
+that needs to temporarily change @code{FS} or @code{FIELDWIDTHS},
+read some records, and then restore the original settings
+(@pxref{Passwd Functions},
+for an example of such a function).
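+
+The general shape of such a function might look like the following sketch.
+The function name @code{read_colon_line} and its parameters are
+hypothetical, and the colon separator is only an example:
+
+@example
+# read_colon_line --- read one line from file, return its first
+# colon-separated field, and leave field splitting as it was found
+function read_colon_line(file,    using_fw, oldfs, olddol0, result)
+@{
+    using_fw = (PROCINFO["FS"] == "FIELDWIDTHS")
+    oldfs = FS
+    olddol0 = $0
+
+    FS = ":"
+    result = ""
+    if ((getline < file) > 0)
+        result = $1
+
+    FS = oldfs                         # back to FS-based splitting
+    if (using_fw)
+        FIELDWIDTHS = FIELDWIDTHS      # reinstate fixed-width splitting
+    $0 = olddol0                       # resplit the original record
+    return result
+@}
+@end example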
+
+@node Multiple Line
+@section Multiple-Line Records
+
+@c STARTOFRANGE recm
+@cindex records, multiline
+@c STARTOFRANGE imr
+@cindex input, multiline records
+@c STARTOFRANGE frm
+@cindex files, reading, multiline records
+@cindex input, files, See input files
+In some databases, a single line cannot conveniently hold all the
+information in one entry.  In such cases, you can use multiline
+records.  The first step in doing this is to choose your data format.
+
+@cindex record separators, with multiline records
+One technique is to use an unusual character or string to separate
+records.  For example, you could use the formfeed character (written
+@samp{\f} in @command{awk}, as in C) to separate them, making each record
+a page of the file.  To do this, just set the variable @code{RS} to
+@code{"\f"} (a string containing the formfeed character).  Any
+other character could equally well be used, as long as it won't be part
+of the data in a record.
+
+@cindex @code{RS} variable, multiline records and
+Another technique is to have blank lines separate records.  By a special
+dispensation, an empty string as the value of @code{RS} indicates that
+records are separated by one or more blank lines.  When @code{RS} is set
+to the empty string, each record always ends at the first blank line
+encountered.  The next record doesn't start until the first nonblank
+line that follows.  No matter how many blank lines appear in a row, they
+all act as one record separator.
+(Blank lines must be completely empty; lines that contain only
+whitespace do not count.)
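+
+For example, in the following sketch (with made-up input), the two
+consecutive blank lines act as a single separator, so there are only
+two records:
+
+@example
+$ printf 'a\nb\n\n\nc\n' | awk 'BEGIN @{ RS = "" @} @{ print NR ": " $1 @}'
+@print{} 1: a
+@print{} 2: c
+@end example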
+
+@cindex leftmost longest match
+@cindex matching, leftmost longest
+You can achieve the same effect as @samp{RS = ""} by assigning the
+string @code{"\n\n+"} to @code{RS}. This regexp matches the newline
+at the end of the record and one or more blank lines after the record.
+In addition, a regular expression always matches the longest possible
+sequence when there is a choice
+(@pxref{Leftmost Longest}).
+So the next record doesn't start until
+the first nonblank line that follows---no matter how many blank lines
+appear in a row, they are considered one record separator.
+
+@cindex dark corner, multiline records
+There is an important difference between @samp{RS = ""} and
+@samp{RS = "\n\n+"}. In the first case, leading newlines in the input
+@value{DF} are ignored, and if a file ends without extra blank lines
+after the last record, the final newline is removed from the record.
+In the second case, this special processing is not done.
+@value{DARKCORNER}
+
+@cindex field separators, in multiline records
+Now that the input is separated into records, the second step is to
+separate the fields in the record.  One way to do this is to divide each
+of the lines into fields in the normal manner.  This happens by default
+as the result of a special feature.  When @code{RS} is set to the empty
+string, @emph{and} @code{FS} is set to a single character,
+the newline character @emph{always} acts as a field separator.
+This is in addition to whatever field separations result from
+@code{FS}.@footnote{When @code{FS} is the null string (@code{""})
+or a regexp, this special feature of @code{RS} does not apply.
+It does apply to the default field separator of a single space:
+@samp{FS = " "}.}
+
+The original motivation for this special exception was probably to provide
+useful behavior in the default case (i.e., @code{FS} is equal
+to @w{@code{" "}}).  This feature can be a problem if you really don't
+want the newline character to separate fields, because there is no way to
+prevent it.  However, you can work around this by using the @code{split}
+function to break up the record manually
+(@pxref{String Functions}).
+If you have a single character field separator, you can work around
+the special feature in a different way, by making @code{FS} into a
+regexp for that single character.  For example, if the field
+separator is a percent character, instead of
+@samp{FS = "%"}, use @samp{FS = "[%]"}.
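+
+The following sketch, using invented input, shows the effect in
+@command{gawk}'s default mode.  With @samp{FS = "%"}, the newline also
+separates fields, giving four fields; with @samp{FS = "[%]"}, it does not,
+and the middle field contains an embedded newline:
+
+@example
+$ printf 'a%%b\nc%%d\n' | gawk 'BEGIN @{ RS = ""; FS = "%" @} @{ print NF @}'
+@print{} 4
+$ printf 'a%%b\nc%%d\n' | gawk 'BEGIN @{ RS = ""; FS = "[%]" @} @{ print NF @}'
+@print{} 3
+@end example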
+
+Another way to separate fields is to
+put each field on a separate line: to do this, just set the
+variable @code{FS} to the string @code{"\n"}.  (This single
+character separator matches a single newline.)
+A practical example of a @value{DF} organized this way might be a mailing
+list, where each entry is separated by blank lines.  Consider a mailing
+list in a file named @file{addresses}, which looks like this:
+
+@example
+Jane Doe
+123 Main Street
+Anywhere, SE 12345-6789
+
+John Smith
+456 Tree-lined Avenue
+Smallville, MW 98765-4321
+@dots{}
+@end example
+
+@noindent
+A simple program to process this file is as follows:
+
+@example
+# addrs.awk --- simple mailing list program
+
+# Records are separated by blank lines.
+# Each line is one field.
+BEGIN @{ RS = "" ; FS = "\n" @}
+
+@{
+      print "Name is:", $1
+      print "Address is:", $2
+      print "City and State are:", $3
+      print ""
+@}
+@end example
+
+Running the program produces the following output:
+
+@example
+$ awk -f addrs.awk addresses
+@print{} Name is: Jane Doe
+@print{} Address is: 123 Main Street
+@print{} City and State are: Anywhere, SE 12345-6789
+@print{}
+@print{} Name is: John Smith
+@print{} Address is: 456 Tree-lined Avenue
+@print{} City and State are: Smallville, MW 98765-4321
+@print{}
+@dots{}
+@end example
+
+@xref{Labels Program}, for a more realistic
+program that deals with address lists.
+The following table summarizes how records are split, based on the
+value of
+@ifinfo
+@code{RS}.
+(@samp{==} means ``is equal to.'')
+@end ifinfo
+@ifnotinfo
+@code{RS}:
+@end ifnotinfo
+
+@table @code
+@item RS == "\n"
+Records are separated by the newline character (@samp{\n}).  In effect,
+every line in the @value{DF} is a separate record, including blank lines.
+This is the default.
+
+@item RS == @var{any single character}
+Records are separated by each occurrence of the character.  Multiple
+successive occurrences delimit empty records.
+
+@item RS == ""
+Records are separated by runs of blank lines.  The newline character
+always serves as a field separator, in addition to whatever value
+@code{FS} may have. Leading and trailing newlines in a file are ignored.
+
+@item RS == @var{regexp}
+Records are separated by occurrences of characters that match @var{regexp}.
+Leading and trailing matches of @var{regexp} delimit empty records.
+(This is a @command{gawk} extension; it is not specified by the
+POSIX standard.)
+@end table
+
+@cindex @code{RT} variable
+In all cases, @command{gawk} sets @code{RT} to the input text that matched the
+value specified by @code{RS}.
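+
+As a brief sketch (the input and the regexp are arbitrary), the following
+prints each record next to the text that terminated it.  The last record
+has no terminator, so @code{RT} is empty there:
+
+@example
+$ printf 'a1b22c' | gawk 'BEGIN @{ RS = "[0-9]+" @} @{ print $0 "|" RT @}'
+@print{} a|1
+@print{} b|22
+@print{} c|
+@end example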
+@c ENDOFRANGE recm
+@c ENDOFRANGE imr
+@c ENDOFRANGE frm
+
+@node Getline
+@section Explicit Input with @code{getline}
+
+@c STARTOFRANGE getl
+@cindex @code{getline} command, explicit input with
+@cindex input, explicit
+So far we have been getting our input data from @command{awk}'s main
+input stream---either the standard input (usually your terminal, sometimes
+the output from another program) or from the
+files specified on the command line.  The @command{awk} language has a
+special built-in command called @code{getline} that
+can be used to read input under your explicit control.
+
+The @code{getline} command is used in several different ways and should
+@emph{not} be used by beginners.
+The examples that follow the explanation of the @code{getline} command
+include material that has not been covered yet.  Therefore, come back
+and study the @code{getline} command @emph{after} you have reviewed the
+rest of this @value{DOCUMENT} and have a good knowledge of how @command{awk} works.
+
+@cindex @code{ERRNO} variable
+@cindex differences in @command{awk} and @command{gawk}, @code{getline} command
+@cindex @code{getline} command, return values
+The @code{getline} command returns one if it finds a record and zero if
+it encounters the end of the file.  If there is some error in getting
+a record, such as a file that cannot be opened, then @code{getline}
+returns @minus{}1.  In this case, @command{gawk} sets the variable
+@code{ERRNO} to a string describing the error that occurred.
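+
+Because of the @minus{}1 return value, a loop written as
+@samp{while (getline line < file)} never terminates when @code{file}
+cannot be opened.  A more defensive pattern is sketched below; the
+@value{FN} is hypothetical, and @code{ERRNO} is meaningful only in
+@command{gawk}:
+
+@example
+BEGIN @{
+    file = "no-such-file"      # hypothetical file name
+    while ((r = (getline line < file)) > 0)
+        print line
+    if (r < 0)
+        print "cannot read " file ": " ERRNO > "/dev/stderr"
+    close(file)
+@}
+@end example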
+
+In the following examples, @var{command} stands for a string value that
+represents a shell command.
+
+@menu
+* Plain Getline::               Using @code{getline} with no arguments.
+* Getline/Variable::            Using @code{getline} into a variable.
+* Getline/File::                Using @code{getline} from a file.
+* Getline/Variable/File::       Using @code{getline} into a variable from a
+                                file.
+* Getline/Pipe::                Using @code{getline} from a pipe.
+* Getline/Variable/Pipe::       Using @code{getline} into a variable from a
+                                pipe.
+* Getline/Coprocess::           Using @code{getline} from a coprocess.
+* Getline/Variable/Coprocess::  Using @code{getline} into a variable from a
+                                coprocess.
+* Getline Notes::               Important things to know about @code{getline}.
+* Getline Summary::             Summary of @code{getline} Variants.
+@end menu
+
+@node Plain Getline
+@subsection Using @code{getline} with No Arguments
+
+The @code{getline} command can be used without arguments to read input
+from the current input file.  All it does in this case is read the next
+input record and split it up into fields.  This is useful if you've
+finished processing the current record, but want to do some special
+processing on the next record @emph{right now}.  For example:
+
+@example
+@{
+     if ((t = index($0, "/*")) != 0) @{
+          # value of `tmp' will be "" if t is 1
+          tmp = substr($0, 1, t - 1)
+          u = index(substr($0, t + 2), "*/")
+          while (u == 0) @{
+               if (getline <= 0) @{
+                    m = "unexpected EOF or error"
+                    m = (m ": " ERRNO)
+                    print m > "/dev/stderr"
+                    exit
+               @}
+               t = -1
+               u = index($0, "*/")
+          @}
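+          # t + u + 3 is the position just past the closing */
+          # (t is -1 if the comment ended on a later line);
+          # the substr expression will be "" if */ ended the line
+          $0 = tmp substr($0, t + u + 3)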
+     @}
+     print $0
+@}
+@end example
+
+This @command{awk} program deletes all C-style comments (@samp{/* @dots{}
+*/}) from the input.  By replacing the @samp{print $0} with other
+statements, you could perform more complicated processing on the
+decommented input, such as searching for matches of a regular
+expression.  (This program has a subtle problem---it does not work if one
+comment ends and another begins on the same line.)
+
+@ignore
+Exercise,
+write a program that does handle multiple comments on the line.
+@end ignore
+
+This form of the @code{getline} command sets @code{NF},
+@code{NR}, @code{FNR}, and the value of @code{$0}.
+
+@strong{Note:} The new value of @code{$0} is used to test
+the patterns of any subsequent rules.  The original value
+of @code{$0} that triggered the rule that executed @code{getline}
+is lost.
+By contrast, the @code{next} statement reads a new record
+but immediately begins processing it normally, starting with the first
+rule in the program.  @xref{Next Statement}.
+
+@node Getline/Variable
+@subsection Using @code{getline} into a Variable
+@c comma before using is NOT for tertiary
+@cindex variables, @code{getline} command into, using
+
+You can use @samp{getline @var{var}} to read the next record from
+@command{awk}'s input into the variable @var{var}.  No other processing is
+done.
+For example, suppose the next line is a comment or a special string,
+and you want to read it without triggering
+any rules.  This form of @code{getline} allows you to read that line
+and store it in a variable so that the main
+read-a-line-and-check-each-rule loop of @command{awk} never sees it.
+The following example swaps every two lines of input:
+
+@example
+@{
+     if ((getline tmp) > 0) @{
+          print tmp
+          print $0
+     @} else
+          print $0
+@}
+@end example
+
+@noindent
+It takes the following list:
+
+@example
+wan
+tew
+free
+phore
+@end example
+
+@noindent
+and produces these results:
+
+@example
+tew
+wan
+phore
+free
+@end example
+
+The @code{getline} command used in this way sets only the variables
+@code{NR} and @code{FNR} (and of course, @var{var}).  The record is not
+split into fields, so the values of the fields (including @code{$0}) and
+the value of @code{NF} do not change.
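+
+A short sketch with invented input makes this visible.  After the
+@code{getline}, @code{NR} has advanced to 2, but @code{NF} and @code{$0}
+still describe the first line:
+
+@example
+$ printf 'a b\nc d e\n' | awk '@{ getline tmp; print NF, NR, $0 @}'
+@print{} 2 2 a b
+@end example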
+
+@node Getline/File
+@subsection Using @code{getline} from a File
+
+@cindex input redirection
+@cindex redirection of input
+@cindex @code{<} (left angle bracket), @code{<} operator (I/O)
+@cindex left angle bracket (@code{<}), @code{<} operator (I/O)
+@cindex operators, input/output
+Use @samp{getline < @var{file}} to read the next record from @var{file}.
+Here @var{file} is a string-valued expression that
+specifies the @value{FN}.  @samp{< @var{file}} is called a @dfn{redirection}
+because it directs input to come from a different place.
+For example, the following
+program reads its input record from the file @file{secondary.input} when it
+encounters a first field with a value equal to 10 in the current input
+file:
+
+@example
+@{
+    if ($1 == 10) @{
+         getline < "secondary.input"
+         print
+    @} else
+         print
+@}
+@end example
+
+Because the main input stream is not used, the values of @code{NR} and
+@code{FNR} are not changed. However, the record it reads is split into fields in
+the normal manner, so the values of @code{$0} and the other fields are
+changed, resulting in a new value of @code{NF}.
+
+@cindex POSIX @command{awk}, @code{<} operator and
+@c Thanks to Paul Eggert for initial wording here
+According to POSIX, @samp{getline < @var{expression}} is ambiguous if
+@var{expression} contains unparenthesized operators other than
+@samp{$}; for example, @samp{getline < dir "/" file} is ambiguous
+because the concatenation operator is not parenthesized.  You should
+write it as @samp{getline < (dir "/" file)} if you want your program
+to be portable to other @command{awk} implementations.
+
+@node Getline/Variable/File
+@subsection Using @code{getline} into a Variable from a File
+@c comma before using is NOT for tertiary
+@cindex variables, @code{getline} command into, using
+
+Use @samp{getline @var{var} < @var{file}} to read input
+from the file
+@var{file}, and put it in the variable @var{var}.  As above, @var{file}
+is a string-valued expression that specifies the file from which to read.
+
+In this version of @code{getline}, none of the built-in variables are
+changed and the record is not split into fields.  The only variable
+changed is @var{var}.
+For example, the following program copies all the input files to the
+output, except for records that say @w{@samp{@@include @var{filename}}}.
+Such a record is replaced by the contents of the file
+@var{filename}:
+
+@example
+@{
+     if (NF == 2 && $1 == "@@include") @{
+          while ((getline line < $2) > 0)
+               print line
+          close($2)
+     @} else
+          print
+@}
+@end example
+
+Note here how the name of the extra input file is not built into
+the program; it is taken directly from the data, specifically from the second field on
+the @samp{@@include} line.
+
+@cindex @code{close} function
+The @code{close} function is called to ensure that if two identical
+@samp{@@include} lines appear in the input, the entire specified file is
+included twice.
+@xref{Close Files And Pipes}.
+
+One deficiency of this program is that it does not process nested
+@samp{@@include} statements
+(i.e., @samp{@@include} statements in included files)
+the way a true macro preprocessor would.
+@xref{Igawk Program}, for a program
+that does handle nested @samp{@@include} statements.
+
+@node Getline/Pipe
+@subsection Using @code{getline} from a Pipe
+
+@cindex @code{|} (vertical bar), @code{|} operator (I/O)
+@cindex vertical bar (@code{|}), @code{|} operator (I/O)
+@cindex input pipeline
+@cindex pipes, input
+@cindex operators, input/output
+The output of a command can also be piped into @code{getline}, using
+@samp{@var{command} | getline}.  In
+this case, the string @var{command} is run as a shell command and its output
+is piped into @command{awk} to be used as input.  This form of @code{getline}
+reads one record at a time from the pipe.
+For example, the following program copies its input to its output, except for
+lines that begin with @samp{@@execute}, which are replaced by the output
+produced by running the rest of the line as a shell command:
+
+@example
+@{
+     if ($1 == "@@execute") @{
+          tmp = substr($0, 10)
+          while ((tmp | getline) > 0)
+               print
+          close(tmp)
+     @} else
+          print
+@}
+@end example
+
+@noindent
+@cindex @code{close} function
+The @code{close} function is called to ensure that if two identical
+@samp{@@execute} lines appear in the input, the command is run for
+each one.
+@ifnottex
+@xref{Close Files And Pipes}.
+@end ifnottex
+@c Exercise!!
+@c This example is unrealistic, since you could just use system
+Given the input:
+
+@example
+foo
+bar
+baz
+@@execute who
+bletch
+@end example
+
+@noindent
+the program might produce:
+
+@cindex Robbins, Bill
+@cindex Robbins, Miriam
+@cindex Robbins, Arnold
+@example
+foo
+bar
+baz
+arnold     ttyv0   Jul 13 14:22
+miriam     ttyp0   Jul 13 14:23     (murphy:0)
+bill       ttyp1   Jul 13 14:23     (murphy:0)
+bletch
+@end example
+
+@noindent
+Notice that this program ran the command @command{who} and printed its output
+in place of the @samp{@@execute} line.
+(If you try this program yourself, you will of course get different results,
+depending upon who is logged in on your system.)
+
+This variation of @code{getline} splits the record into fields, sets the
+value of @code{NF}, and recomputes the value of @code{$0}.  The values of
+@code{NR} and @code{FNR} are not changed.
+
+@cindex POSIX @command{awk}, @code{|} I/O operator and
+@c Thanks to Paul Eggert for initial wording here
+According to POSIX, @samp{@var{expression} | getline} is ambiguous if
+@var{expression} contains unparenthesized operators other than
+@samp{$}---for example, @samp{@w{"echo "} "date" | getline} is ambiguous
+because the concatenation operator is not parenthesized.  You should
+write it as @samp{(@w{"echo "} "date") | getline} if you want your program
+to be portable to other @command{awk} implementations.
+
+@node Getline/Variable/Pipe
+@subsection Using @code{getline} into a Variable from a Pipe
+@c comma before using is NOT for tertiary
+@cindex variables, @code{getline} command into, using
+
+When you use @samp{@var{command} | getline @var{var}}, the
+output of @var{command} is sent through a pipe to
+@code{getline} and into the variable @var{var}.  For example, the
+following program reads the current date and time into the variable
+@code{current_time}, using the @command{date} utility, and then
+prints it:
+
+@example
+BEGIN @{
+     "date" | getline current_time
+     close("date")
+     print "Report printed on " current_time
+@}
+@end example
+
+In this version of @code{getline}, none of the built-in variables are
+changed and the record is not split into fields.
+
+@ifinfo
+@c Thanks to Paul Eggert for initial wording here
+According to POSIX, @samp{@var{expression} | getline @var{var}} is ambiguous if
+@var{expression} contains unparenthesized operators other than
+@samp{$}; for example, @samp{@w{"echo "} "date" | getline @var{var}} is ambiguous
+because the concatenation operator is not parenthesized. You should
+write it as @samp{(@w{"echo "} "date") | getline @var{var}} if you want your
+program to be portable to other @command{awk} implementations.
+@end ifinfo
+
+@node Getline/Coprocess
+@subsection Using @code{getline} from a Coprocess
+@cindex coprocesses, @code{getline} from
+@c comma before using is NOT for tertiary
+@cindex @code{getline} command, coprocesses, using from
+@cindex @code{|} (vertical bar), @code{|&} operator (I/O)
+@cindex vertical bar (@code{|}), @code{|&} operator (I/O)
+@cindex operators, input/output
+@cindex differences in @command{awk} and @command{gawk}, input/output operators
+
+Input into @code{getline} from a pipe is a one-way operation.
+The command that is started with @samp{@var{command} | getline} only
+sends data @emph{to} your @command{awk} program.
+
+On occasion, you might want to send data to another program
+for processing and then read the results back.
+@command{gawk} allows you to start a @dfn{coprocess}, with which two-way
+communications are possible.  This is done with the @samp{|&}
+operator.
+Typically, you write data to the coprocess first and then
+read results back, as shown in the following:
+
+@example
+print "@var{some query}" |& "db_server"
+"db_server" |& getline
+@end example
+
+@noindent
+which sends a query to @command{db_server} and then reads the results.
+
+The values of @code{NR} and
+@code{FNR} are not changed,
+because the main input stream is not used.
+However, the record is split into fields in
+the normal manner, thus changing the values of @code{$0}, of the other fields,
+and of @code{NF}.
+
+Coprocesses are an advanced feature. They are discussed here only because
+this is the @value{SECTION} on @code{getline}.
+@xref{Two-way I/O},
+where coprocesses are discussed in more detail.
+
+@node Getline/Variable/Coprocess
+@subsection Using @code{getline} into a Variable from a Coprocess
+@c comma before using is NOT for tertiary
+@cindex variables, @code{getline} command into, using
+
+When you use @samp{@var{command} |& getline @var{var}}, the output from
+the coprocess @var{command} is sent through a two-way pipe to @code{getline}
+and into the variable @var{var}.
+
+In this version of @code{getline}, none of the built-in variables are
+changed and the record is not split into fields.  The only variable
+changed is @var{var}.
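+
+As a minimal sketch of this form (it assumes that a @command{sort} utility
+is available, and it is not taken from the original text), the following
+@command{gawk} program sends two lines to a coprocess and reads the sorted
+results back into the variable @code{line}, leaving @code{$0} and
+@code{NF} untouched:
+
+@example
+BEGIN @{
+     cmd = "sort"
+     print "pear" |& cmd
+     print "apple" |& cmd
+     close(cmd, "to")      # tell sort that its input is complete
+     while ((cmd |& getline line) > 0)
+          print "got: " line
+     close(cmd)
+@}
+@end example
+
+@noindent
+The two-argument form of @code{close} used here is described in
+@ref{Two-way I/O}.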
+
+@ifinfo
+Coprocesses are an advanced feature. They are discussed here only because
+this is the @value{SECTION} on @code{getline}.
+@xref{Two-way I/O},
+where coprocesses are discussed in more detail.
+@end ifinfo
+
+@node Getline Notes
+@subsection Points to Remember About @code{getline}
+Here are some miscellaneous points about @code{getline} that
+you should bear in mind:
+
+@itemize @bullet
+@item
+When @code{getline} changes the value of @code{$0} and @code{NF},
+@command{awk} does @emph{not} automatically jump to the start of the
+program and start testing the new record against every pattern.
+However, the new record is tested against any subsequent rules.
+
+@cindex differences in @command{awk} and @command{gawk}, implementation limitations
+@cindex implementation issues, @command{gawk}, limits
+@cindex @command{awk}, implementations, limits
+@cindex @command{gawk}, implementation issues, limits
+@item
+Many @command{awk} implementations limit the number of pipelines that an @command{awk}
+program may have open to just one.  In @command{gawk}, there is no such limit.
+You can open as many pipelines (and coprocesses) as the underlying operating
+system permits.
+
+@cindex side effects, @code{FILENAME} variable
+@c The comma before "setting with" does NOT represent a tertiary
+@cindex @code{FILENAME} variable, @code{getline}, setting with
+@cindex dark corner, @code{FILENAME} variable
+@cindex @code{getline} command, @code{FILENAME} variable and
+@cindex @code{BEGIN} pattern, @code{getline} and
+@item
+An interesting side effect occurs if you use @code{getline} without a
+redirection inside a @code{BEGIN} rule. Because an unredirected @code{getline}
+reads from the command-line @value{DF}s, the first @code{getline} command
+causes @command{awk} to set the value of @code{FILENAME}. Normally,
+@code{FILENAME} does not have a value inside @code{BEGIN} rules, because you
+have not yet started to process the command-line @value{DF}s.
+@value{DARKCORNER}
+(@xref{BEGIN/END},
+also @pxref{Auto-set}.)
+
+@item
+Using @code{FILENAME} with @code{getline}
+(@samp{getline < FILENAME})
+is likely to be a source of
+confusion.  @command{awk} opens a separate input stream from the
+current input file.  However, because no variable is used, @code{$0}
+and @code{NF} are still updated.  If you're doing this, it's
+probably by accident, and you should reconsider what it is you're
+trying to accomplish.
+@end itemize
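+
+The first point in the list above can be sketched as follows (the input
+is supplied by @command{printf} purely for illustration).  After
+@code{getline} replaces @code{$0}, the second rule is tested against the
+new record, but processing does not return to the first rule:
+
+@example
+$ printf 'a\nb\n' | awk '/a/ @{ getline; print "after getline:", $0 @}
+>                        /b/ @{ print "matched b:", $0 @}'
+@print{} after getline: b
+@print{} matched b: b
+@end example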
+
+@node Getline Summary
+@subsection Summary of @code{getline} Variants
+@cindex @code{getline} command, variants
+
+The following table summarizes the eight variants of @code{getline},
+listing which built-in variables are set by each one.
+
+@multitable {@var{command} @code{|& getline} @var{var}} {1234567890123456789012345678901234567890}
+@item @code{getline} @tab Sets @code{$0}, @code{NF}, @code{FNR}, and @code{NR}
+
+@item @code{getline} @var{var} @tab Sets @var{var}, @code{FNR}, and @code{NR}
+
+@item @code{getline <} @var{file} @tab Sets @code{$0} and @code{NF}
+
+@item @code{getline @var{var} < @var{file}} @tab Sets @var{var}
+
+@item @var{command} @code{| getline} @tab Sets @code{$0} and @code{NF}
+
+@item @var{command} @code{| getline} @var{var} @tab Sets @var{var}
+
+@item @var{command} @code{|& getline} @tab Sets @code{$0} and @code{NF}.
+This is a @command{gawk} extension
+
+@item @var{command} @code{|& getline} @var{var} @tab Sets @var{var}.
+This is a @command{gawk} extension
+@end multitable
+@c ENDOFRANGE getl
+@c ENDOFRANGE inex
+@c ENDOFRANGE infir
+
+@node Printing
+@chapter Printing Output
+
+@c STARTOFRANGE prnt
+@cindex printing
+@cindex output, printing, See printing
+One of the most common programming actions is to @dfn{print}, or output,
+some or all of the input.  Use the @code{print} statement
+for simple output, and the @code{printf} statement
+for fancier formatting.
+The @code{print} statement is not limited when
+computing @emph{which} values to print. However, with two exceptions,
+you cannot specify @emph{how} to print them---how many
+columns, whether to use exponential notation or not, and so on.
+(For the exceptions, @pxref{Output Separators}, and
+@ref{OFMT}.)
+For printing with specifications, you need the @code{printf} statement
+(@pxref{Printf}).
+
+@c STARTOFRANGE prnts
+@cindex @code{print} statement
+@cindex @code{printf} statement
+Besides basic and formatted printing, this @value{CHAPTER}
+also covers I/O redirections to files and pipes, introduces
+the special @value{FN}s that @command{gawk} processes internally,
+and discusses the @code{close} built-in function.
+
+@menu
+* Print::                       The @code{print} statement.
+* Print Examples::              Simple examples of @code{print} statements.
+* Output Separators::           The output separators and how to change them.
+* OFMT::                        Controlling Numeric Output With @code{print}.
+* Printf::                      The @code{printf} statement.
+* Redirection::                 How to redirect output to multiple files and
+                                pipes.
+* Special Files::               File name interpretation in @command{gawk}.
+                                @command{gawk} allows access to inherited file
+                                descriptors.
+* Close Files And Pipes::       Closing Input and Output Files and Pipes.
+@end menu
+
+@node Print
+@section The @code{print} Statement
+
+The @code{print} statement is used to produce output with simple, standardized
+formatting.  Specify only the strings or numbers to print, in a
+list separated by commas.  They are output, separated by single spaces,
+followed by a newline.  The statement looks like this:
+
+@example
+print @var{item1}, @var{item2}, @dots{}
+@end example
+
+@noindent
+The entire list of items may be optionally enclosed in parentheses.  The
+parentheses are necessary if any of the item expressions uses the @samp{>}
+relational operator; otherwise it could be confused with a redirection
+(@pxref{Redirection}).
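+
+A short sketch of the difference (the expressions are arbitrary and were
+chosen only for illustration):
+
+@example
+print (1 > 2)      # prints 0, the value of the comparison
+print 1 > 2        # writes 1 into a file named "2"
+@end example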
+
+The items to print can be constant strings or numbers, fields of the
+current record (such as @code{$1}), variables, or any @command{awk}
+expression.  Numeric values are converted to strings and then printed.
+
+@cindex records, printing
+@cindex lines, blank, printing
+@cindex text, printing
+The simple statement @samp{print} with no items is equivalent to
+@samp{print $0}: it prints the entire current record.  To print a blank
+line, use @samp{print ""}, where @code{""} is the empty string.
+To print a fixed piece of text, use a string constant, such as
+@w{@code{"Don't Panic"}}, as one item.  If you forget to use the
+double-quote characters, your text is taken as an @command{awk}
+expression, and you will probably get an error.  Keep in mind that a
+space is printed between any two items.
+
+@node Print Examples
+@section Examples of @code{print} Statements
+
+Each @code{print} statement makes at least one line of output.  However, it
+isn't limited to only one line.  If an item value is a string that contains a
+newline, the newline is output along with the rest of the string.  A
+single @code{print} statement can make any number of lines this way.
+
+@cindex newlines, printing
+The following is an example of printing a string that contains embedded newlines
+(the @samp{\n} is an escape sequence, used to represent the newline
+character; @pxref{Escape Sequences}):
+
+@example
+$ awk 'BEGIN @{ print "line one\nline two\nline three" @}'
+@print{} line one
+@print{} line two
+@print{} line three
+@end example
+
+@cindex fields, printing
+The next example, which is run on the @file{inventory-shipped} file,
+prints the first two fields of each input record, with a space between
+them:
+
+@example
+$ awk '@{ print $1, $2 @}' inventory-shipped
+@print{} Jan 13
+@print{} Feb 15
+@print{} Mar 15
+@dots{}
+@end example
+
+@cindex @code{print} statement, commas, omitting
+@c comma does NOT start tertiary
+@cindex troubleshooting, @code{print} statement, omitting commas
+A common mistake in using the @code{print} statement is to omit the comma
+between two items.  This often has the effect of making the items run
+together in the output, with no space.  The reason for this is that
+juxtaposing two string expressions in @command{awk} means to concatenate
+them.  Here is the same program, without the comma:
+
+@example
+$ awk '@{ print $1 $2 @}' inventory-shipped
+@print{} Jan13
+@print{} Feb15
+@print{} Mar15
+@dots{}
+@end example
+
+@c comma does NOT start tertiary
+@cindex @code{BEGIN} pattern, headings, adding
+To someone unfamiliar with the @file{inventory-shipped} file, neither
+example's output makes much sense.  A heading line at the beginning
+would make it clearer.  Let's add some headings to our table of months
+(@code{$1}) and green crates shipped (@code{$2}).  We do this using the
+@code{BEGIN} pattern
+(@pxref{BEGIN/END})
+so that the headings are only printed once:
+
+@example
+awk 'BEGIN @{  print "Month Crates"
+              print "----- ------" @}
+           @{  print $1, $2 @}' inventory-shipped
+@end example
+
+@noindent
+When run, the program prints the following:
+
+@example
+Month Crates
+----- ------
+Jan 13
+Feb 15
+Mar 15
+@dots{}
+@end example
+
+@noindent
+The only problem, however, is that the headings and the table data
+don't line up!  We can fix this by printing some spaces between the
+two fields:
+
+@example
+@group
+awk 'BEGIN @{ print "Month Crates"
+             print "----- ------" @}
+           @{ print $1, "     ", $2 @}' inventory-shipped
+@end group
+@end example
+
+@c comma does NOT start tertiary
+@cindex @code{printf} statement, columns, aligning
+@cindex columns, aligning
+Lining up columns this way can get pretty
+complicated when there are many columns to fix.  Counting spaces for two
+or three columns is simple, but any more than this can take up
+a lot of time. This is why the @code{printf} statement was
+created (@pxref{Printf});
+one of its specialties is lining up columns of data.
+
+@cindex line continuations, in @code{print} statement
+@cindex @code{print} statement, line continuations and
+@strong{Note:} You can continue either a @code{print} or
+@code{printf} statement simply by putting a newline after any comma
+(@pxref{Statements/Lines}).
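+
+For instance, this small sketch (not from the original text) splits one
+@code{print} statement across two lines at the comma:
+
+@example
+$ awk 'BEGIN @{ print "one",
+>              "two" @}'
+@print{} one two
+@end example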
+@c ENDOFRANGE prnts
+
+@node Output Separators
+@section Output Separators
+
+@cindex @code{OFS} variable
+As mentioned previously, a @code{print} statement contains a list
+of items separated by commas.  In the output, the items are normally
+separated by single spaces.  However, this doesn't need to be the case;
+a single space is only the default.  Any string of
+characters may be used as the @dfn{output field separator} by setting the
+built-in variable @code{OFS}.  The initial value of this variable
+is the string @w{@code{" "}}---that is, a single space.
+
+The output from an entire @code{print} statement is called an
+@dfn{output record}.  Each @code{print} statement outputs one output
+record, and then outputs a string called the @dfn{output record separator}
+(or @code{ORS}).  The initial
+value of @code{ORS} is the string @code{"\n"}; i.e., a newline
+character.  Thus, each @code{print} statement normally makes a separate line.
+
+@cindex output, records
+@cindex output record separator, See @code{ORS} variable
+@cindex @code{ORS} variable
+@cindex @code{BEGIN} pattern, @code{OFS}/@code{ORS} variables, assigning values to
+In order to change how output fields and records are separated, assign
+new values to the variables @code{OFS} and @code{ORS}.  The usual
+place to do this is in the @code{BEGIN} rule
+(@pxref{BEGIN/END}), so
+that it happens before any input is processed.  It can also be done
+with assignments on the command line, before the names of the input
+files, or using the @option{-v} command-line option
+(@pxref{Options}).
+The following example prints the first and second fields of each input
+record, separated by a semicolon, with a blank line added after each
+output record:
+
+@ignore
+Exercise,
+Rewrite the
+@example
+awk 'BEGIN @{ print "Month Crates"
+             print "----- ------" @}
+           @{ print $1, "     ", $2 @}' inventory-shipped
+@end example
+program by using a new value of @code{OFS}.
+@end ignore
+
+@example
+$ awk 'BEGIN @{ OFS = ";"; ORS = "\n\n" @}
+>            @{ print $1, $2 @}' BBS-list
+@print{} aardvark;555-5553
+@print{}
+@print{} alpo-net;555-3412
+@print{}
+@print{} barfly;555-7685
+@dots{}
+@end example
+
+If the value of @code{ORS} does not contain a newline, the program's output
+is run together on a single line.
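+
+For example, in this sketch the three input lines are supplied by
+@command{printf} (an assumption made so that the example does not depend
+on a data file), and every output record ends with a semicolon instead
+of a newline:
+
+@example
+$ printf 'a\nb\nc\n' | awk 'BEGIN @{ ORS = ";" @} @{ print @}'
+@print{} a;b;c;
+@end example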
+
+@node OFMT
+@section Controlling Numeric Output with @code{print}
+@cindex numeric, output format
+@c the comma does NOT start a secondary
+@cindex formats, numeric output
+When the @code{print} statement is used to print numeric values,
+@command{awk} internally converts the number to a string of characters
+and prints that string.  @command{awk} uses the @code{sprintf} function
+to do this conversion
+(@pxref{String Functions}).
+For now, it suffices to say that the @code{sprintf}
+function accepts a @dfn{format specification} that tells it how to format
+numbers (or strings), and that there are a number of different ways in which
+numbers can be formatted.  The different format specifications are discussed
+more fully in
+@ref{Control Letters}.
+
+@cindex @code{sprintf} function
+@cindex @code{OFMT} variable
+@c the comma before OFMT does NOT start a tertiary
+@cindex output, format specifier, @code{OFMT}
+The built-in variable @code{OFMT} contains the default format specification
+that @code{print} uses with @code{sprintf} when it wants to convert a
+number to a string for printing.
+The default value of @code{OFMT} is @code{"%.6g"}.
+The way @code{print} prints numbers can be changed
+by supplying different format specifications
+as the value of @code{OFMT}, as shown in the following example:
+
+@example
+$ awk 'BEGIN @{
+>   OFMT = "%.0f"  # print numbers as integers (rounds)
+>   print 17.23, 17.54 @}'
+@print{} 17 18
+@end example
+
+@noindent
+@cindex dark corner, @code{OFMT} variable
+@cindex POSIX @command{awk}, @code{OFMT} variable and
+@cindex @code{OFMT} variable, POSIX @command{awk} and
+According to the POSIX standard, @command{awk}'s behavior is undefined
+if @code{OFMT} contains anything but a floating-point conversion specification.
+@value{DARKCORNER}
+
+@node Printf
+@section Using @code{printf} Statements for Fancier Printing
+
+@c STARTOFRANGE printfs
+@cindex @code{printf} statement
+@cindex output, formatted
+@cindex formatting output
+For more precise control over the output format than what is
+normally provided by @code{print}, use @code{printf}.
+@code{printf} can be used to
+specify the width to use for each item, as well as various
+formatting choices for numbers (such as what output base to use, whether to
+print an exponent, whether to print a sign, and how many digits to print
+after the decimal point).  This is done by supplying a string, called
+the @dfn{format string}, that controls how and where to print the other
+arguments.
+
+@menu
+* Basic Printf::                Syntax of the @code{printf} statement.
+* Control Letters::             Format-control letters.
+* Format Modifiers::            Format-specification modifiers.
+* Printf Examples::             Several examples.
+@end menu
+
+@node Basic Printf
+@subsection Introduction to the @code{printf} Statement
+
+@cindex @code{printf} statement, syntax of
+A simple @code{printf} statement looks like this:
+
+@example
+printf @var{format}, @var{item1}, @var{item2}, @dots{}
+@end example
+
+@noindent
+The entire list of arguments may optionally be enclosed in parentheses.  The
+parentheses are necessary if any of the item expressions use the @samp{>}
+relational operator; otherwise, it can be confused with a redirection
+(@pxref{Redirection}).
+
+@cindex format strings
+The difference between @code{printf} and @code{print} is the @var{format}
+argument.  This is an expression whose value is taken as a string; it
+specifies how to output each of the other arguments.  It is called the
+@dfn{format string}.
+
+The format string is very similar to that in the ISO C library function
+@code{printf}.  Most of @var{format} is text to output verbatim.
+Scattered among this text are @dfn{format specifiers}---one per item.
+Each format specifier says to output the next item in the argument list
+at that place in the format.
+
+The @code{printf} statement does not automatically append a newline
+to its output.  It outputs only what the format string specifies.
+So if a newline is needed, you must include one in the format string.
+The output separator variables @code{OFS} and @code{ORS} have no effect
+on @code{printf} statements. For example:
+
+@example
+$ awk 'BEGIN @{
+>    ORS = "\nOUCH!\n"; OFS = "+"
+>    msg = "Dont Panic!"
+>    printf "%s\n", msg
+> @}'
+@print{} Dont Panic!
+@end example
+
+@noindent
+Here, neither the @samp{+} nor the @samp{OUCH} appears when
+the message is printed.
+
+@node Control Letters
+@subsection Format-Control Letters
+@cindex @code{printf} statement, format-control characters
+@cindex format specifiers, @code{printf} statement
+
+A format specifier starts with the character @samp{%} and ends with
+a @dfn{format-control letter}---it tells the @code{printf} statement
+how to output one item.  The format-control letter specifies what @emph{kind}
+of value to print.  The rest of the format specifier is made up of
+optional @dfn{modifiers} that control @emph{how} to print the value, such as
+the field width.  Here is a list of the format-control letters:
+
+@table @code
+@item %c
+This prints a number as an ASCII character; thus, @samp{printf "%c",
+65} outputs the letter @samp{A}. (The output for a string value is
+the first character of the string.)
+
+@item %d@r{,} %i
+These are equivalent; they both print a decimal integer.
+(The @samp{%i} specification is for compatibility with ISO C.)
+
+@item %e@r{,} %E
+These print a number in scientific (exponential) notation;
+for example:
+
+@example
+printf "%4.3e\n", 1950
+@end example
+
+@noindent
+prints @samp{1.950e+03}, with a total of four significant figures, three of
+which follow the decimal point.
+(The @samp{4.3} represents two modifiers,
+discussed in the next @value{SUBSECTION}.)
+@samp{%E} uses @samp{E} instead of @samp{e} in the output.
+
+@item %f
+This prints a number in floating-point notation.
+For example:
+
+@example
+printf "%4.3f", 1950
+@end example
+
+@noindent
+prints @samp{1950.000}, with three digits following the decimal point.
+The width of four is only a minimum, so the eight-character result is not truncated.
+(The @samp{4.3} represents two modifiers,
+discussed in the next @value{SUBSECTION}.)
+
+@item %g@r{,} %G
+These print a number in either scientific notation or in floating-point
+notation, whichever uses fewer characters; if the result is printed in
+scientific notation, @samp{%G} uses @samp{E} instead of @samp{e}.
+
+@item %o
+This prints an unsigned octal integer.
+
+@item %s
+This prints a string.
+
+@item %u
+This prints an unsigned decimal integer.
+(This format is of marginal use, because all numbers in @command{awk}
+are floating-point; it is provided primarily for compatibility with C.)
+
+@item %x@r{,} %X
+These print an unsigned hexadecimal integer;
+@samp{%X} uses the letters @samp{A} through @samp{F}
+instead of @samp{a} through @samp{f}.
+
+@item %%
+This isn't a format-control letter, but it does have meaning---the
+sequence @samp{%%} outputs one @samp{%}; it does not consume an
+argument and it ignores any modifiers.
+@end table
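+
+The following sketch (with arbitrarily chosen values) exercises several
+of these control letters in a single statement:
+
+@example
+$ awk 'BEGIN @{ printf "%c %d %o %x %X %e\n", 65, 42, 8, 255, 255, 1950 @}'
+@print{} A 42 10 ff FF 1.950000e+03
+@end example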
+
+@cindex dark corner, format-control characters
+@cindex @command{gawk}, format-control characters
+@strong{Note:}
+When using the integer format-control letters for values that are
+outside the range of the widest C integer type, @command{gawk} switches to
+the @samp{%g} format specifier. If @option{--lint} is provided on the
+command line (@pxref{Options}), @command{gawk}
+warns about this.  Other versions of @command{awk} may print invalid
+values or do something else entirely.
+@value{DARKCORNER}
+
+@node Format Modifiers
+@subsection Modifiers for @code{printf} Formats
+
+@c STARTOFRANGE pfm
+@cindex @code{printf} statement, modifiers
+@c the comma here does NOT start a secondary
+@cindex modifiers, in format specifiers
+A format specification can also include @dfn{modifiers} that can control
+how much of the item's value is printed, as well as how much space it gets.
+The modifiers come between the @samp{%} and the format-control letter.
+We will use the bullet symbol ``@bullet{}'' in the following examples to
+represent
+spaces in the output. Here are the possible modifiers, in the order in
+which they may appear:
+
+@table @code
+@cindex differences in @command{awk} and @command{gawk}, @code{print}/@code{printf} statements
+@cindex @code{printf} statement, positional specifiers
+@c the comma does NOT start a secondary
+@cindex positional specifiers, @code{printf} statement
+@item @var{N}$
+An integer constant followed by a @samp{$} is a @dfn{positional specifier}.
+Normally, format specifications are applied to arguments in the order
+given in the format string.  With a positional specifier, the format
+specification is applied to a specific argument, instead of what
+would be the next argument in the list.  Positional specifiers begin
+counting with one. Thus:
+
+@example
+printf "%s %s\n", "don't", "panic"
+printf "%2$s %1$s\n", "panic", "don't"
+@end example
+
+@noindent
+prints the famous friendly message twice.
+
+At first glance, this feature doesn't seem to be of much use.
+It is in fact a @command{gawk} extension, intended for use in translating
+messages at runtime.
+@xref{Printf Ordering},
+which describes how and why to use positional specifiers.
+For now, we will not use them.
+
+@item -
+The minus sign, used before the width modifier (see later on in
+this table),
+says to left-justify
+the argument within its specified width.  Normally, the argument
+is printed right-justified in the specified width.  Thus:
+
+@example
+printf "%-4s", "foo"
+@end example
+
+@noindent
+prints @samp{foo@bullet{}}.
+
+@item @var{space}
+For numeric conversions, prefix positive values with a space and
+negative values with a minus sign.
+
+@item +
+The plus sign, used before the width modifier (see later on in
+this table),
+says to always supply a sign for numeric conversions, even if the data
+to format is positive. The @samp{+} overrides the space modifier.
+
+@item #
+Use an ``alternate form'' for certain control letters.
+For @samp{%o}, supply a leading zero.
+For @samp{%x} and @samp{%X}, supply a leading @samp{0x} or @samp{0X} for
+a nonzero result.
+For @samp{%e}, @samp{%E}, and @samp{%f}, the result always contains a
+decimal point.
+For @samp{%g} and @samp{%G}, trailing zeros are not removed from the result.
+
+@cindex dark corner
+@item 0
+A leading @samp{0} (zero) acts as a flag that indicates that output should be
+padded with zeros instead of spaces.
+This applies even to non-numeric output formats.
+@value{DARKCORNER}
+This flag only has an effect when the field width is wider than the
+value to print.
+
+@item @var{width}
+This is a number specifying the desired minimum width of a field.  Inserting any
+number between the @samp{%} sign and the format-control character forces the
+field to expand to this width.  The default way to do this is to
+pad with spaces on the left.  For example:
+
+@example
+printf "%4s", "foo"
+@end example
+
+@noindent
+prints @samp{@bullet{}foo}.
+
+The value of @var{width} is a minimum width, not a maximum.  If the item
+value requires more than @var{width} characters, it can be as wide as
+necessary.  Thus, the following:
+
+@example
+printf "%4s", "foobar"
+@end example
+
+@noindent
+prints @samp{foobar}.
+
+Preceding the @var{width} with a minus sign causes the output to be
+padded with spaces on the right, instead of on the left.
+
+@item .@var{prec}
+A period followed by an integer constant
+specifies the precision to use when printing.
+The meaning of the precision varies by control letter:
+
+@table @asis
+@item @code{%e}, @code{%E}, @code{%f}
+Number of digits to the right of the decimal point.
+
+@item @code{%g}, @code{%G}
+Maximum number of significant digits.
+
+@item @code{%d}, @code{%i}, @code{%o}, @code{%u}, @code{%x}, @code{%X}
+Minimum number of digits to print.
+
+@item @code{%s}
+Maximum number of characters from the string that should print.
+@end table
+
+Thus, the following:
+
+@example
+printf "%.4s", "foobar"
+@end example
+
+@noindent
+prints @samp{foob}.
+@end table
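+
+As a brief sketch of several flags working together (the values are
+arbitrary, @samp{|} merely marks the edges of each field, and
+``@bullet{}'' again stands for a space in the output):
+
+@example
+$ awk 'BEGIN @{ printf "%+d|% d|%#o|%#x|%05d|%-5d|\n", 5, 5, 8, 255, 42, 42 @}'
+@print{} +5|@bullet{}5|010|0xff|00042|42@bullet{}@bullet{}@bullet{}|
+@end example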
+
+The C library @code{printf}'s dynamic @var{width} and @var{prec}
+capability (for example, @code{"%*.*s"}) is supported.  Instead of
+supplying explicit @var{width} and/or @var{prec} values in the format
+string, they are passed in the argument list.  For example:
+
+@example
+w = 5
+p = 3
+s = "abcdefg"
+printf "%*.*s\n", w, p, s
+@end example
+
+@noindent
+is exactly equivalent to:
+
+@example
+s = "abcdefg"
+printf "%5.3s\n", s
+@end example
+
+@noindent
+Both programs output @samp{@w{@bullet{}@bullet{}abc}}.
+Earlier versions of @command{awk} did not support this capability.
+If you must use such a version, you may simulate this feature by using
+concatenation to build up the format string, like so:
+
+@example
+w = 5
+p = 3
+s = "abcdefg"
+printf "%" w "." p "s\n", s
+@end example
+
+@noindent
+This is not particularly easy to read but it does work.
+
+@c @cindex lint checks
+@cindex troubleshooting, fatal errors, @code{printf} format strings
+@cindex POSIX @command{awk}, @code{printf} format strings and
+C programmers may be used to supplying additional
+@samp{l}, @samp{L}, and @samp{h}
+modifiers in @code{printf} format strings. These are not valid in @command{awk}.
+Most @command{awk} implementations silently ignore these modifiers.
+If @option{--lint} is provided on the command line
+(@pxref{Options}),
+@command{gawk} warns about their use. If @option{--posix} is supplied,
+their use is a fatal error.
+@c ENDOFRANGE pfm
+
+@node Printf Examples
+@subsection Examples Using @code{printf}
+
+The following is a simple example of
+how to use @code{printf} to make an aligned table:
+
+@example
+awk '@{ printf "%-10s %s\n", $1, $2 @}' BBS-list
+@end example
+
+@noindent
+This command
+prints the names of the bulletin boards (@code{$1}) in the file
+@file{BBS-list} as a string of 10 characters that are left-justified.  It also
+prints the phone numbers (@code{$2}) next on the line.  This
+produces an aligned two-column table of names and phone numbers,
+as shown here:
+
+@example
+$ awk '@{ printf "%-10s %s\n", $1, $2 @}' BBS-list
+@print{} aardvark   555-5553
+@print{} alpo-net   555-3412
+@print{} barfly     555-7685
+@print{} bites      555-1675
+@print{} camelot    555-0542
+@print{} core       555-2912
+@print{} fooey      555-1234
+@print{} foot       555-6699
+@print{} macfoo     555-6480
+@print{} sdace      555-3430
+@print{} sabafoo    555-2127
+@end example
+
+In this case, the phone numbers had to be printed as strings because
+the numbers are separated by a dash.  Printing the phone numbers as
+numbers would have produced just the first three digits: @samp{555}.
+This would have been pretty confusing.
+
+It wasn't necessary to specify a width for the phone numbers because
+they are last on their lines.  They don't need to have spaces
+after them.
+
+The table could be made to look even nicer by adding headings to the
+tops of the columns.  This is done using the @code{BEGIN} pattern
+(@pxref{BEGIN/END})
+so that the headers are only printed once, at the beginning of
+the @command{awk} program:
+
+@example
+awk 'BEGIN @{ print "Name      Number"
+             print "----      ------" @}
+     @{ printf "%-10s %s\n", $1, $2 @}' BBS-list
+@end example
+
+The above example mixed @code{print} and @code{printf} statements in
+the same program.  Using just @code{printf} statements can produce the
+same results:
+
+@example
+awk 'BEGIN @{ printf "%-10s %s\n", "Name", "Number"
+             printf "%-10s %s\n", "----", "------" @}
+     @{ printf "%-10s %s\n", $1, $2 @}' BBS-list
+@end example
+
+@noindent
+Printing each column heading with the same format specification
+used for the column elements ensures that the headings
+are aligned just like the columns.
+
+The fact that the same format specification is used three times can be
+emphasized by storing it in a variable, like this:
+
+@example
+awk 'BEGIN @{ format = "%-10s %s\n"
+             printf format, "Name", "Number"
+             printf format, "----", "------" @}
+     @{ printf format, $1, $2 @}' BBS-list
+@end example
+
+@c !!! exercise
+At this point, it would be a worthwhile exercise to use the
+@code{printf} statement to line up the headings and table data for the
+@file{inventory-shipped} example that was covered earlier in the @value{SECTION}
+on the @code{print} statement
+(@pxref{Print}).
+@c ENDOFRANGE printfs
+
+@node Redirection
+@section Redirecting Output of @code{print} and @code{printf}
+
+@cindex output redirection
+@cindex redirection of output
+So far, the output from @code{print} and @code{printf} has gone
+to the standard
+output, usually the terminal.  Both @code{print} and @code{printf} can
+also send their output to other places.
+This is called @dfn{redirection}.
+
+A redirection appears after the @code{print} or @code{printf} statement.
+Redirections in @command{awk} are written just like redirections in shell
+commands, except that they are written inside the @command{awk} program.
+
+@c the commas here are part of the see also
+@cindex @code{print} statement, See Also redirection, of output
+@cindex @code{printf} statement, See Also redirection, of output
+There are four forms of output redirection: output to a file, output
+appended to a file, output through a pipe to another command, and output
+to a coprocess.  They are all shown for the @code{print} statement,
+but they work identically for @code{printf}:
+
+@table @code
+@cindex @code{>} (right angle bracket), @code{>} operator (I/O)
+@cindex right angle bracket (@code{>}), @code{>} operator (I/O)
+@cindex operators, input/output
+@item print @var{items} > @var{output-file}
+This type of redirection prints the items into the output file named
+@var{output-file}.  The @value{FN} @var{output-file} can be any
+expression.  Its value is converted to a string and then used as a
+@value{FN} (@pxref{Expressions}).
+
+When this type of redirection is used, the @var{output-file} is erased
+before the first output is written to it.  Subsequent writes to the same
+@var{output-file} do not erase @var{output-file}, but append to it.
+(This is different from how you use redirections in shell scripts.)
+If @var{output-file} does not exist, it is created.  For example, here
+is how an @command{awk} program can write a list of BBS names to one
+file named @file{name-list}, and a list of phone numbers to another file
+named @file{phone-list}:
+
+@example
+$ awk '@{ print $2 > "phone-list"
+>        print $1 > "name-list" @}' BBS-list
+$ cat phone-list
+@print{} 555-5553
+@print{} 555-3412
+@dots{}
+$ cat name-list
+@print{} aardvark
+@print{} alpo-net
+@dots{}
+@end example
+
+@noindent
+Each output file contains one name or number per line.
+
+@cindex @code{>} (right angle bracket), @code{>>} operator (I/O)
+@cindex right angle bracket (@code{>}), @code{>>} operator (I/O)
+@item print @var{items} >> @var{output-file}
+This type of redirection prints the items into the output file
+named @var{output-file}.  The difference between this and the
+single-@samp{>} redirection is that the old contents (if any) of
+@var{output-file} are not erased.  Instead, the @command{awk} output is
+appended to the file.
+If @var{output-file} does not exist, then it is created.
+
+@cindex @code{|} (vertical bar), @code{|} operator (I/O)
+@cindex pipes, output
+@cindex output, pipes
+@item print @var{items} | @var{command}
+It is also possible to send output to another program through a pipe
+instead of into a file.   This type of redirection opens a pipe to
+@var{command}, and writes the values of @var{items} through this pipe
+to another process created to execute @var{command}.
+
+The redirection argument @var{command} is actually an @command{awk}
+expression.  Its value is converted to a string whose contents give
+the shell command to be run.  For example, the following produces two
+files, one unsorted list of BBS names, and one list sorted in reverse
+alphabetical order:
+
+@ignore
+10/2000:
+This isn't the best style, since COMMAND is assigned for each
+record.  It's done to avoid overfull hboxes in TeX.  Leave it
+alone for now and let's hope no-one notices.
+@end ignore
+
+@example
+awk '@{ print $1 > "names.unsorted"
+       command = "sort -r > names.sorted"
+       print $1 | command @}' BBS-list
+@end example
+
+The unsorted list is written with an ordinary redirection, while
+the sorted list is written by piping through the @command{sort} utility.
+
+The next example uses redirection to mail a message to the mailing
+list @samp{bug-system}.  This might be useful when trouble is encountered
+in an @command{awk} script run periodically for system maintenance:
+
+@example
+report = "mail bug-system"
+print "Awk script failed:", $0 | report
+m = ("at record number " FNR " of " FILENAME)
+print m | report
+close(report)
+@end example
+
+The message is built using string concatenation and saved in the variable
+@code{m}.  It's then sent down the pipeline to the @command{mail} program.
+(The parentheses group the items to concatenate---see
+@ref{Concatenation}.)
+
+The @code{close} function is called here because it's a good idea to close
+the pipe as soon as all the intended output has been sent to it.
+@xref{Close Files And Pipes},
+for more information.
+
+This example also illustrates the use of a variable to represent
+a @var{file} or @var{command}---it is not necessary to always
+use a string constant.  Using a variable is generally a good idea,
+because @command{awk} requires that the string value be spelled identically
+every time.
+
+@cindex coprocesses
+@cindex @code{|} (vertical bar), @code{|&} operator (I/O)
+@cindex operators, input/output
+@cindex differences in @command{awk} and @command{gawk}, input/output operators
+@item print @var{items} |& @var{command}
+This type of redirection prints the items to the input of @var{command}.
+The difference between this and the
+single-@samp{|} redirection is that the output from @var{command}
+can be read with @code{getline}.
+Thus @var{command} is a @dfn{coprocess}, which works together with,
+but subsidiary to, the @command{awk} program.
+
+This feature is a @command{gawk} extension, and is not available in
+POSIX @command{awk}.
+@xref{Two-way I/O},
+for a more complete discussion.
+@end table
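+
+As a minimal sketch of the coprocess form (a @command{gawk} extension,
+and assuming a @command{sort} utility is available on the system),
+the following program sends two lines to @command{sort} and reads the
+sorted result back.  The variable name @code{cmd} is chosen here for
+illustration only:
+
+@example
+BEGIN @{
+    cmd = "sort"
+    print "pear"  |& cmd      # write to the coprocess
+    print "apple" |& cmd
+    close(cmd, "to")          # close the write end only
+    while ((cmd |& getline line) > 0)
+        print line            # read the sorted lines back
+    close(cmd)                # shut down the coprocess
+@}
+@end example
+
+@noindent
+Closing the @code{"to"} end first matters here, because @command{sort}
+cannot produce any output until it has seen the end of its input.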
+
+Redirecting output using @samp{>}, @samp{>>}, @samp{|}, or @samp{|&}
+asks the system to open a file, pipe, or coprocess only if the particular
+@var{file} or @var{command} you specify has not already been written
+to by your program or if it has been closed since it was last written to.
+
+@cindex troubleshooting, printing
+It is a common error to use @samp{>} redirection for the first @code{print}
+to a file, and then to use @samp{>>} for subsequent output:
+
+@example
+# clear the file
+print "Don't panic" > "guide.txt"
+@dots{}
+# append
+print "Avoid improbability generators" >> "guide.txt"
+@end example
+
+@noindent
+This is indeed how redirections must be used from the shell.  But in
+@command{awk}, it isn't necessary.  In this kind of case, a program should
+use @samp{>} for all the @code{print} statements, since the output file
+is only opened once.
+
+@cindex differences in @command{awk} and @command{gawk}, implementation limitations
+@c the comma here does NOT start a secondary
+@cindex implementation issues, @command{gawk}, limits
+@cindex @command{awk}, implementation issues, pipes
+@cindex @command{gawk}, implementation issues, pipes
+@ifnotinfo
+As mentioned earlier
+(@pxref{Getline Notes}),
+many
+@end ifnotinfo
+@ifnottex
+Many
+@end ifnottex
+@command{awk} implementations limit the number of pipelines that an @command{awk}
+program may have open to just one!  In @command{gawk}, there is no such limit.
+@command{gawk} allows a program to
+open as many pipelines as the underlying operating system permits.
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Piping into @command{sh}
+@cindex advanced features, piping into @command{sh}
+@cindex shells, piping commands into
+
+A particularly powerful way to use redirection is to build command lines
+and pipe them into the shell, @command{sh}.  For example, suppose you
+have a list of files brought over from a system where all the @value{FN}s
+are stored in uppercase, and you wish to rename them to have names in
+all lowercase.  The following program is both simple and efficient:
+
+@c @cindex @command{mv} utility
+@example
+@{ printf("mv %s %s\n", $0, tolower($0)) | "sh" @}
+
+END @{ close("sh") @}
+@end example
+
+The @code{tolower} function returns its argument string with all
+uppercase characters converted to lowercase
+(@pxref{String Functions}).
+The program builds up a list of command lines,
+using the @command{mv} utility to rename the files.
+It then sends the list to the shell for execution.
+@c ENDOFRANGE outre
+@c ENDOFRANGE reout
+
+@node Special Files
+@section Special @value{FFN}s in @command{gawk}
+@c STARTOFRANGE gfn
+@cindex @command{gawk}, @value{FN}s in
+
+@command{gawk} provides a number of special @value{FN}s that it interprets
+internally.  These @value{FN}s provide access to standard file descriptors,
+process-related information, and TCP/IP networking.
+
+@menu
+* Special FD::                  Special files for I/O.
+* Special Process::             Special files for process information.
+* Special Network::             Special files for network communications.
+* Special Caveats::             Things to watch out for.
+@end menu
+
+@node Special FD
+@subsection Special Files for Standard Descriptors
+@cindex standard input
+@cindex input, standard
+@cindex standard output
+@cindex output, standard
+@cindex error output
+@cindex file descriptors
+@cindex files, descriptors, See file descriptors
+
+Running programs conventionally have three input and output streams
+already available to them for reading and writing.  These are known as
+the @dfn{standard input}, @dfn{standard output}, and @dfn{standard error
+output}.  These streams are, by default, connected to your terminal, but
+they are often redirected with the shell, via the @samp{<}, @samp{<<},
+@samp{>}, @samp{>>}, @samp{>&}, and @samp{|} operators.  Standard error
+is typically used for writing error messages; the reason there are two separate
+streams, standard output and standard error, is so that they can be
+redirected separately.
+
+@cindex differences in @command{awk} and @command{gawk}, error messages
+@cindex error handling
+In other implementations of @command{awk}, the only way to write an error
+message to standard error in an @command{awk} program is as follows:
+
+@example
+print "Serious error detected!" | "cat 1>&2"
+@end example
+
+@noindent
+This works by opening a pipeline to a shell command that can access the
+standard error stream that it inherits from the @command{awk} process.
+This is far from elegant, and it is also inefficient, because it requires a
+separate process.  So people writing @command{awk} programs often
+don't do this.  Instead, they send the error messages to the
+terminal, like this:
+
+@example
+print "Serious error detected!" > "/dev/tty"
+@end example
+
+@noindent
+This usually has the same effect but not always: although the
+standard error stream is usually the terminal, it can be redirected; when
+that happens, writing to the terminal is not correct.  In fact, if
+@command{awk} is run from a background job, it may not have a terminal at all.
+Then opening @file{/dev/tty} fails.
+
+@command{gawk} provides special @value{FN}s for accessing the three standard
+streams, as well as any other inherited open files.  If the @value{FN} matches
+one of these special names when @command{gawk} redirects input or output,
+then it directly uses the stream that the @value{FN} stands for.
+These special @value{FN}s work for all operating systems that @command{gawk}
+has been ported to, not just those that are POSIX-compliant:
+
+@cindex @value{FN}s, standard streams in @command{gawk}
+@cindex @code{/dev/@dots{}} special files (@command{gawk})
+@cindex files, @code{/dev/@dots{}} special files
+@c @cindex @code{/dev/stdin} special file
+@c @cindex @code{/dev/stdout} special file
+@c @cindex @code{/dev/stderr} special file
+@c @cindex @code{/dev/fd} special files
+@table @file
+@item /dev/stdin
+The standard input (file descriptor 0).
+
+@item /dev/stdout
+The standard output (file descriptor 1).
+
+@item /dev/stderr
+The standard error output (file descriptor 2).
+
+@item /dev/fd/@var{N}
+The file associated with file descriptor @var{N}.  Such a file must
+be opened by the program initiating the @command{awk} execution (typically
+the shell).  Unless special pains are taken in the shell from which
+@command{gawk} is invoked, only descriptors 0, 1, and 2 are available.
+@end table
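+
+As an illustration of the ``special pains'' mentioned for
+@file{/dev/fd/@var{N}}, a Bourne-style shell can open an extra
+descriptor before starting @command{gawk}.  This is only a sketch;
+the @value{FN} @file{extra.log} is a made-up name:
+
+@example
+gawk 'BEGIN @{ print "to fd 3" > "/dev/fd/3" @}' 3> extra.log
+@end example
+
+@noindent
+Here the shell's @samp{3>} redirection opens @file{extra.log} on file
+descriptor 3, and @command{gawk} inherits the open descriptor.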
+
+The @value{FN}s @file{/dev/stdin}, @file{/dev/stdout}, and @file{/dev/stderr}
+are aliases for @file{/dev/fd/0}, @file{/dev/fd/1}, and @file{/dev/fd/2},
+respectively. However, they are more self-explanatory.
+The proper way to write an error message in a @command{gawk} program
+is to use @file{/dev/stderr}, like this:
+
+@example
+print "Serious error detected!" > "/dev/stderr"
+@end example
+
+@cindex troubleshooting, quotes with @value{FN}s
+Note the use of quotes around the @value{FN}.
+Like any other redirection, the value must be a string.
+It is a common error to omit the quotes, which leads
+to confusing results.
+@c Exercise: What does it do?  :-)
+
+@node Special Process
+@subsection Special Files for Process-Related Information
+
+@cindex files, for process information
+@cindex process information, files for
+@command{gawk} also provides special @value{FN}s that give access to information
+about the running @command{gawk} process.  Each of these ``files'' provides
+a single record of information.  To read them more than once, they must
+first be closed with the @code{close} function
+(@pxref{Close Files And Pipes}).
+The @value{FN}s are:
+
+@c @cindex @code{/dev/pid} special file
+@c @cindex @code{/dev/pgrpid} special file
+@c @cindex @code{/dev/ppid} special file
+@c @cindex @code{/dev/user} special file
+@table @file
+@item /dev/pid
+Reading this file returns the process ID of the current process,
+in decimal form, terminated with a newline.
+
+@item /dev/ppid
+Reading this file returns the parent process ID of the current process,
+in decimal form, terminated with a newline.
+
+@item /dev/pgrpid
+Reading this file returns the process group ID of the current process,
+in decimal form, terminated with a newline.
+
+@item /dev/user
+Reading this file returns a single record terminated with a newline.
+The fields are separated with spaces.  The fields represent the
+following information:
+
+@table @code
+@item $1
+The return value of the @code{getuid} system call
+(the real user ID number).
+
+@item $2
+The return value of the @code{geteuid} system call
+(the effective user ID number).
+
+@item $3
+The return value of the @code{getgid} system call
+(the real group ID number).
+
+@item $4
+The return value of the @code{getegid} system call
+(the effective group ID number).
+@end table
+
+If there are any additional fields, they are the group IDs returned by
+the @code{getgroups} system call.
+(Multiple groups may not be supported on all systems.)
+@end table
+
+These special @value{FN}s may be used on the command line as @value{DF}s,
+as well as for I/O redirections within an @command{awk} program.
+They may not be used as source files with the @option{-f} option.
+
+@c @cindex automatic warnings
+@c @cindex warnings, automatic
+@strong{Note:}
+The special files that provide process-related information are now considered
+obsolete and will disappear entirely
+in the next release of @command{gawk}.
+@command{gawk} prints a warning message every time you use one of
+these files.
+To obtain process-related information, use the @code{PROCINFO} array.
+@xref{Auto-set}.
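+
+As a sketch of the replacement, the following prints the values that
+@file{/dev/pid}, @file{/dev/ppid}, and @file{/dev/pgrpid} provide,
+assuming the @code{PROCINFO} elements @code{"pid"}, @code{"ppid"}, and
+@code{"pgrpid"}:
+
+@example
+BEGIN @{
+    print "process ID:       ", PROCINFO["pid"]
+    print "parent process ID:", PROCINFO["ppid"]
+    print "process group ID: ", PROCINFO["pgrpid"]
+@}
+@end example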
+
+@node Special Network
+@subsection Special Files for Network Communications
+@cindex networks, support for
+@cindex TCP/IP, support for
+
+Starting with @value{PVERSION} 3.1 of @command{gawk}, @command{awk} programs
+can open a two-way
+TCP/IP connection, acting as either a client or a server.
+This is done using a special @value{FN} of the form:
+
+@example
+@file{/inet/@var{protocol}/@var{local-port}/@var{remote-host}/@var{remote-port}}
+@end example
+
+The @var{protocol} is one of @samp{tcp}, @samp{udp}, or @samp{raw},
+and the other fields represent the other essential pieces of information
+for making a networking connection.
+These @value{FN}s are used with the @samp{|&} operator for communicating
+with a coprocess
+(@pxref{Two-way I/O}).
+This is an advanced feature, mentioned here only for completeness.
+A full discussion is deferred until
+@ref{TCP/IP Networking}.
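+
+As a minimal sketch, and assuming that the local machine runs a
+@samp{daytime} service (many systems do not enable one by default),
+a client could read one line from it like this.  The variable name
+@code{service} is illustrative only:
+
+@example
+BEGIN @{
+    service = "/inet/tcp/0/localhost/daytime"
+    if ((service |& getline line) > 0)
+        print line            # one line sent by the server
+    close(service)
+@}
+@end example
+
+@noindent
+A @var{local-port} of @samp{0} lets the system pick the local port.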
+
+@node Special Caveats
+@subsection Special @value{FFN} Caveats
+
+Here is a list of things to bear in mind when using the
+special @value{FN}s that @command{gawk} provides:
+
+@itemize @bullet
+@cindex compatibility mode (@command{gawk}), @value{FN}s
+@cindex @value{FN}s, in compatibility mode
+@item
+Recognition of these special @value{FN}s is disabled if @command{gawk} is in
+compatibility mode (@pxref{Options}).
+
+@c @cindex automatic warnings
+@c @cindex warnings, automatic
+@cindex @code{PROCINFO} array
+@item
+@ifnottex
+The
+@end ifnottex
+@ifnotinfo
+As mentioned earlier, the
+@end ifnotinfo
+special files that provide process-related information are now considered
+obsolete and will disappear entirely
+in the next release of @command{gawk}.
+@command{gawk} prints a warning message every time you use one of
+these files.
+@ifnottex
+To obtain process-related information, use the @code{PROCINFO} array.
+@xref{Built-in Variables}.
+@end ifnottex
+
+@item
+Starting with @value{PVERSION} 3.1, @command{gawk} @emph{always}
+interprets these special @value{FN}s.@footnote{Older versions of
+@command{gawk} would interpret these names internally only if the system
+did not actually have a @file{/dev/fd} directory or any of the other
+special files listed earlier.  Usually this didn't make a difference,
+but sometimes it did; thus, it was decided to make @command{gawk}'s
+behavior consistent on all systems and to have it always interpret
+the special @value{FN}s itself.}
+For example, using @samp{/dev/fd/4}
+for output actually writes on file descriptor 4, and not on a new
+file descriptor that is @code{dup}'ed from file descriptor 4.  Most of
+the time this does not matter; however, it is important to @emph{not}
+close any of the files related to file descriptors 0, 1, and 2.
+Doing so results in unpredictable behavior.
+@end itemize
+@c ENDOFRANGE gfn
+
+@node Close Files And Pipes
+@section Closing Input and Output Redirections
+@cindex files, output, See output files
+@c STARTOFRANGE ifc
+@cindex input files, closing
+@c comma before closing is NOT start of tertiary
+@c STARTOFRANGE ofc
+@cindex output, files, closing
+@c STARTOFRANGE pc
+@cindex pipes, closing
+@c STARTOFRANGE cc
+@cindex coprocesses, closing
+@c comma before using is NOT start of tertiary
+@cindex @code{getline} command, coprocesses, using from
+
+If the same @value{FN} or the same shell command is used with @code{getline}
+more than once during the execution of an @command{awk} program
+(@pxref{Getline}),
+the file is opened (or the command is executed) the first time only.
+At that time, the first record of input is read from that file or command.
+The next time the same file or command is used with @code{getline},
+another record is read from it, and so on.
+
+Similarly, when a file or pipe is opened for output, the @value{FN} or
+command associated with it is remembered by @command{awk}, and subsequent
+writes to the same file or command are appended to the previous writes.
+The file or pipe stays open until @command{awk} exits.
+
+@cindex @code{close} function
+This implies that special steps are necessary in order to read the same
+file again from the beginning, or to rerun a shell command (rather than
+reading more output from the same command).  The @code{close} function
+makes these things possible:
+
+@example
+close(@var{filename})
+@end example
+
+@noindent
+or:
+
+@example
+close(@var{command})
+@end example
+
+The argument @var{filename} or @var{command} can be any expression.  Its
+value must @emph{exactly} match the string that was used to open the file or
+start the command (spaces and other ``irrelevant'' characters
+included). For example, if you open a pipe with this:
+
+@example
+"sort -r names" | getline foo
+@end example
+
+@noindent
+then you must close it with this:
+
+@example
+close("sort -r names")
+@end example
+
+Once this function call is executed, the next @code{getline} from that
+file or command, or the next @code{print} or @code{printf} to that
+file or command, reopens the file or reruns the command.
+Because the expression that you use to close a file or pipeline must
+exactly match the expression used to open the file or run the command,
+it is good practice to use a variable to store the @value{FN} or command.
+The previous example becomes the following:
+
+@example
+sortcom = "sort -r names"
+sortcom | getline foo
+@dots{}
+close(sortcom)
+@end example
+
+@noindent
+This helps avoid hard-to-find typographical errors in your @command{awk}
+programs.  Here are some of the reasons for closing an output file or pipe:
+
+@itemize @bullet
+@item
+To write a file and read it back later on in the same @command{awk}
+program.  Close the file after writing it, then
+begin reading it with @code{getline}.
+
+@item
+To write numerous files, successively, in the same @command{awk}
+program.  If the files aren't closed, eventually @command{awk} may exceed a
+system limit on the number of open files in one process.  It is best to
+close each one when the program has finished writing it.
+
+@item
+To make a command finish.  When output is redirected through a pipe,
+the command reading the pipe normally continues to try to read input
+as long as the pipe is open.  Often this means the command cannot
+really do its work until the pipe is closed.  For example, if
+output is redirected to the @command{mail} program, the message is not
+actually sent until the pipe is closed.
+
+@item
+To run the same program a second time, with the same arguments.
+This is not the same thing as giving more input to the first run!
+
+For example, suppose a program pipes output to the @command{mail} program.
+If it outputs several lines redirected to this pipe without closing
+it, they make a single message of several lines.  By contrast, if the
+program closes the pipe after each line of output, then each line makes
+a separate message.
+@end itemize
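+
+The first reason in the list can be sketched as follows; the @value{FN}
+@file{scratch.txt} is a made-up name:
+
+@example
+BEGIN @{
+    tmp = "scratch.txt"
+    print "first line"  > tmp
+    print "second line" > tmp
+    close(tmp)                        # finish writing
+    while ((getline line < tmp) > 0)  # reopen the file for reading
+        print line
+    close(tmp)
+@}
+@end example
+
+@noindent
+Without the first call to @code{close}, the file would still be open
+for output when @code{getline} tries to read it.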
+
+@cindex differences in @command{awk} and @command{gawk}, @code{close} function
+@cindex portability, @code{close} function and
+If you use more files than the system allows you to have open,
+@command{gawk} attempts to multiplex the available open files among
+your @value{DF}s.  @command{gawk}'s ability to do this depends upon the
+facilities of your operating system, so it may not always work.  It is
+therefore both good practice and good portability advice to always
+use @code{close} on your files when you are done with them.
+In fact, if you are using a lot of pipes, it is essential that
+you close commands when done. For example, consider something like this:
+
+@example
+@{
+    @dots{}
+    command = ("grep " $1 " /some/file | my_prog -q " $3)
+    while ((command | getline) > 0) @{
+        @var{process output of} command
+    @}
+    # need close(command) here
+@}
+@end example
+
+This example creates a new pipeline based on data in @emph{each} record.
+Without the call to @code{close} indicated in the comment, @command{awk}
+creates child processes to run the commands, until it eventually
+runs out of file descriptors for more pipelines.
+
+Even though each command has finished (as indicated by the end-of-file
+return status from @code{getline}), the child process is not
+terminated;@footnote{The technical terminology is rather morbid.
+The finished child is called a ``zombie,'' and cleaning up after
+it is referred to as ``reaping.''}
+@c Good old UNIX: give the marketing guys fits, that's the ticket
+more importantly, the file descriptor for the pipe
+is not closed and released until @code{close} is called or
+@command{awk} exits.
+
+@code{close} will silently do nothing if given an argument that
+does not represent a file, pipe or coprocess that was opened with
+a redirection.
+
+Note also that @samp{close(FILENAME)} has no
+``magic'' effects on the implicit loop that reads through the
+files named on the command line.  It is, more likely, a close
+of a file that was never opened, so @command{awk} silently
+does nothing.
+
+@c comma is part of tertiary
+@cindex @code{|} (vertical bar), @code{|&} operator (I/O), pipes, closing
+When using the @samp{|&} operator to communicate with a coprocess,
+it is occasionally useful to be able to close one end of the two-way
+pipe without closing the other.
+This is done by supplying a second argument to @code{close}.
+As in any other call to @code{close},
+the first argument is the name of the command or special file used
+to start the coprocess.
+The second argument should be a string, with either of the values
+@code{"to"} or @code{"from"}.  Case does not matter.
+As this is an advanced feature, a full discussion, including an example,
+is deferred until
+@ref{Two-way I/O}.
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Using @code{close}'s Return Value
+@cindex advanced features, @code{close} function
+@cindex dark corner, @code{close} function
+@cindex @code{close} function, return values
+@c comma does NOT start secondary
+@cindex return values, @code{close} function
+@cindex differences in @command{awk} and @command{gawk}, @code{close} function
+@cindex Unix @command{awk}, @code{close} function and
+
+In many versions of Unix @command{awk}, the @code{close} function
+is actually a statement.  It is a syntax error to try to use the return
+value from @code{close}:
+@value{DARKCORNER}
+
+@example
+command = "@dots{}"
+command | getline info
+retval = close(command)  # syntax error in most Unix awks
+@end example
+
+@command{gawk} treats @code{close} as a function.
+The return value is @minus{}1 if the argument names something
+that was never opened with a redirection, or if there is
+a system problem closing the file or process.
+In these cases, @command{gawk} sets the built-in variable
+@code{ERRNO} to a string describing the problem.
+
+In @command{gawk},
+when closing a pipe or coprocess,
+the return value is the exit status of the command.@footnote{
+This is a full 16-bit value as returned by the @code{wait}
+system call. See the system manual pages for information on
+how to decode this value.}
+Otherwise, it is the return value from the system's @code{close} or
+@code{fclose} C functions when closing input or output
+files, respectively.
+This value is zero if the close succeeds, or @minus{}1 if
+it fails.
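+
+As a sketch of checking the return value in @command{gawk} (this is not
+portable to Unix @command{awk} versions where @code{close} is a
+statement), using a made-up @value{FN}:
+
+@example
+BEGIN @{
+    out = "results.txt"
+    print "some data" > out
+    if (close(out) != 0)
+        print "close failed: " ERRNO > "/dev/stderr"
+@}
+@end example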
+
+The POSIX standard is very vague; it says that @code{close}
+returns zero on success and non-zero otherwise.  In general,
+different implementations vary in what they report when closing
+pipes; thus the return value cannot be used portably.
+@value{DARKCORNER}
+
+@ignore
+@c 4/27/2003: Commenting this out for now, given the above
+@c return of 16-bit value
+The return value for closing a pipeline is particularly useful.
+It allows you to get the output from a command as well as its
+exit status.
+@c 8/21/2002, FIXME: Maybe the code and this doc should be adjusted to
+@c create values indicating death-by-signal?  Sigh.
+
+@cindex pipes, closing
+@c comma does NOT start tertiary
+@cindex POSIX @command{awk}, pipes, closing
+For POSIX-compliant systems,
+if the exit status is a number above 128, then the program
+was terminated by a signal.  Subtract 128 to get the signal number:
+
+@example
+exit_val = close(command)
+if (exit_val > 128)
+    print command, "died with signal", exit_val - 128
+else
+    print command, "exited with code", exit_val
+@end example
+
+Currently, in @command{gawk}, this only works for commands
+piping into @code{getline}.  For commands piped into
+from @code{print} or @code{printf}, the
+return value from @code{close} is that of the library's
+@code{pclose} function.
+@end ignore
+@c ENDOFRANGE ifc
+@c ENDOFRANGE ofc
+@c ENDOFRANGE pc
+@c ENDOFRANGE cc
+@c ENDOFRANGE prnt
+
+@node Expressions
+@chapter Expressions
+@c STARTOFRANGE exps
+@cindex expressions
+
+Expressions are the basic building blocks of @command{awk} patterns
+and actions.  An expression evaluates to a value that you can print, test,
+or pass to a function.  Additionally, an expression
+can assign a new value to a variable or a field by using an assignment operator.
+
+An expression can serve as a pattern or action statement on its own.
+Most other kinds of
+statements contain one or more expressions that specify the data on which to
+operate.  As in other languages, expressions in @command{awk} include
+variables, array references, constants, and function calls, as well as
+combinations of these with various operators.
+
+@menu
+* Constants::                   String, numeric and regexp constants.
+* Using Constant Regexps::      When and how to use a regexp constant.
+* Variables::                   Variables give names to values for later use.
+* Conversion::                  The conversion of strings to numbers and vice
+                                versa.
+* Arithmetic Ops::              Arithmetic operations (@samp{+}, @samp{-},
+                                etc.)
+* Concatenation::               Concatenating strings.
+* Assignment Ops::              Changing the value of a variable or a field.
+* Increment Ops::               Incrementing the numeric value of a variable.
+* Truth Values::                What is ``true'' and what is ``false''.
+* Typing and Comparison::       How variables acquire types and how this
+                                affects comparison of numbers and strings with
+                                @samp{<}, etc.
+* Boolean Ops::                 Combining comparison expressions using boolean
+                                operators @samp{||} (``or''), @samp{&&}
+                                (``and'') and @samp{!} (``not'').
+* Conditional Exp::             Conditional expressions select between two
+                                subexpressions under control of a third
+                                subexpression.
+* Function Calls::              A function call is an expression.
+* Precedence::                  How various operators nest.
+@end menu
+
+@node Constants
+@section Constant Expressions
+@cindex constants, types of
+
+The simplest type of expression is the @dfn{constant}, which always has
+the same value.  There are three types of constants: numeric,
+string, and regular expression.
+
+Each is used in the appropriate context when you need a data
+value that isn't going to change.  Numeric constants can
+have different forms, but are stored identically internally.
+
+@menu
+* Scalar Constants::            Numeric and string constants.
+* Nondecimal-numbers::          What are octal and hex numbers.
+* Regexp Constants::            Regular Expression constants.
+@end menu
+
+@node Scalar Constants
+@subsection Numeric and String Constants
+
+@cindex numeric, constants
+A @dfn{numeric constant} stands for a number.  This number can be an
+integer, a decimal fraction, or a number in scientific (exponential)
+notation.@footnote{The internal representation of all numbers,
+including integers, uses double-precision
+floating-point numbers.
+On most modern systems, these are in IEEE 754 standard format.}
+Here are some examples of numeric constants that all
+have the same value:
+
+@example
+105
+1.05e+2
+1050e-1
+@end example
+
+@cindex string constants
+A string constant consists of a sequence of characters enclosed in
+double-quotation marks.  For example:
+
+@example
+"parrot"
+@end example
+
+@noindent
+@cindex differences in @command{awk} and @command{gawk}, strings
+@cindex strings, length of
+represents the string whose contents are @samp{parrot}.  Strings in
+@command{gawk} can be of any length, and they can contain any of the possible
+eight-bit character values, including ASCII @sc{nul} (character code zero).
+Other @command{awk}
+implementations may have difficulty with some character codes.
+
+@node Nondecimal-numbers
+@subsection Octal and Hexadecimal Numbers
+@cindex octal numbers
+@cindex hexadecimal numbers
+@cindex numbers, octal
+@cindex numbers, hexadecimal
+
+In @command{awk}, all numbers are in decimal; i.e., base 10.  Many other
+programming languages allow you to specify numbers in other bases, often
+octal (base 8) and hexadecimal (base 16).
+In octal, the numbers go 0, 1, 2, 3, 4, 5, 6, 7, 10, 11, 12, etc.
+Just as @samp{11}, in decimal, is 1 times 10 plus 1, so
+@samp{11}, in octal, is 1 times 8 plus 1, which equals 9 in decimal.
+In hexadecimal, there are 16 digits. Since the everyday decimal
+number system only has ten digits (@samp{0}--@samp{9}), the letters
+@samp{a} through @samp{f} are used to represent the rest.
+(Case in the letters is usually irrelevant; hexadecimal @samp{a} and @samp{A}
+have the same value.)
+Thus, @samp{11}, in
+hexadecimal, is 1 times 16 plus 1, which equals 17 in decimal.
+
+Just by looking at plain @samp{11}, you can't tell what base it's in.
+So, in C, C++, and other languages derived from C,
+@c such as PERL, but we won't mention that....
+there is a special notation to help signify the base.
+Octal numbers start with a leading @samp{0},
+and hexadecimal numbers start with a leading @samp{0x} or @samp{0X}:
+
+@table @code
+@item 11
+Decimal value 11.
+
+@item 011
+Octal 11, decimal value 9.
+
+@item 0x11
+Hexadecimal 11, decimal value 17.
+@end table
+
+This example shows the difference:
+
+@example
+$ gawk 'BEGIN @{ printf "%d, %d, %d\n", 011, 11, 0x11 @}'
+@print{} 9, 11, 17
+@end example
+
+Being able to use octal and hexadecimal constants in your programs is most
+useful when working with data that cannot be represented conveniently as
+characters or as regular numbers, such as binary data of various sorts.
+
+@cindex @command{gawk}, octal numbers and
+@cindex @command{gawk}, hexadecimal numbers and
+@command{gawk} allows the use of octal and hexadecimal
+constants in your program text.  However, such numbers in the input data
+are not treated differently; doing so by default would break old
+programs.
+(If you really need to do this, use the @option{--non-decimal-data}
+command-line option;
+@pxref{Nondecimal Data}.)
+If you have octal or hexadecimal data,
+you can use the @code{strtonum} function
+(@pxref{String Functions})
+to convert the data into a number.
+Most of the time, you will want to use octal or hexadecimal constants
+when working with the built-in bit manipulation functions;
+see @ref{Bitwise Functions},
+for more information.
+
+Unlike some early C implementations, @samp{8} and @samp{9} are not valid
+in octal constants; e.g., @command{gawk} treats @samp{018} as decimal 18:
+
+@example
+$ gawk 'BEGIN @{ print "021 is", 021 ; print 018 @}'
+@print{} 021 is 17
+@print{} 18
+@end example
+
+@cindex compatibility mode (@command{gawk}), octal numbers
+@cindex compatibility mode (@command{gawk}), hexadecimal numbers
+Octal and hexadecimal source code constants are a @command{gawk} extension.
+If @command{gawk} is in compatibility mode
+(@pxref{Options}),
+they are not available.
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: A Constant's Base Does Not Affect Its Value
+@c comma before values does NOT start tertiary
+@cindex advanced features, constants, values of
+
+Once a numeric constant has
+been converted internally into a number,
+@command{gawk} no longer remembers
+what the original form of the constant was; the internal value is
+always used.  This has particular consequences for conversion of
+numbers to strings:
+
+@example
+$ gawk 'BEGIN @{ printf "0x11 is <%s>\n", 0x11 @}'
+@print{} 0x11 is <17>
+@end example
+
+@node Regexp Constants
+@subsection Regular Expression Constants
+
+@c STARTOFRANGE rec
+@cindex regexp constants
+@cindex @code{~} (tilde), @code{~} operator
+@cindex tilde (@code{~}), @code{~} operator
+@cindex @code{!} (exclamation point), @code{!~} operator
+@cindex exclamation point (@code{!}), @code{!~} operator
+A regexp constant is a regular expression description enclosed in
+slashes, such as @code{@w{/^beginning and end$/}}.  Most regexps used in
+@command{awk} programs are constant, but the @samp{~} and @samp{!~}
+matching operators can also match computed or ``dynamic'' regexps
+(which are just ordinary strings or variables that contain a regexp).
+@c ENDOFRANGE cnst
+
+@node Using Constant Regexps
+@section Using Regular Expression Constants
+
+@cindex dark corner, regexp constants
+When used on the righthand side of the @samp{~} or @samp{!~}
+operators, a regexp constant merely stands for the regexp that is to be
+matched.
+However, regexp constants (such as @code{/foo/}) may be used like simple expressions.
+When a
+regexp constant appears by itself, it has the same meaning as if it appeared
+in a pattern, i.e., @samp{($0 ~ /foo/)}.
+@value{DARKCORNER}
+@xref{Expression Patterns}.
+This means that the following two code segments:
+
+@example
+if ($0 ~ /barfly/ || $0 ~ /camelot/)
+    print "found"
+@end example
+
+@noindent
+and:
+
+@example
+if (/barfly/ || /camelot/)
+    print "found"
+@end example
+
+@noindent
+are exactly equivalent.
+One rather bizarre consequence of this rule is that the following
+Boolean expression is valid, but does not do what the user probably
+intended:
+
+@example
+# note that /foo/ is on the left of the ~
+if (/foo/ ~ $1) print "found foo"
+@end example
+
+@c @cindex automatic warnings
+@c @cindex warnings, automatic
+@cindex @command{gawk}, regexp constants and
+@cindex regexp constants, in @command{gawk}
+@noindent
+This code is ``obviously'' testing @code{$1} for a match against the regexp
+@code{/foo/}.  But in fact, the expression @samp{/foo/ ~ $1} actually means
+@samp{($0 ~ /foo/) ~ $1}.  In other words, first match the input record
+against the regexp @code{/foo/}.  The result is either one or zero,
+depending upon the success or failure of the match.  That result
+is then matched against the first field in the record.
+Because it is unlikely that you would ever really want to make this kind of
+test, @command{gawk} issues a warning when it sees this construct in
+a program.
+Another consequence of this rule is that the assignment statement:
+
+@example
+matches = /foo/
+@end example
+
+@noindent
+assigns either zero or one to the variable @code{matches}, depending
+upon the contents of the current input record.
+This feature of the language was never well documented until the
+POSIX specification.
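+
+A minimal sketch of this behavior, assuming a one-line input record
+supplied by @command{echo}; the assignment yields one when the record
+matches @code{/foo/} and zero otherwise:
+
+@example
+$ echo "some food" | awk '@{ matches = /foo/; print matches @}'
+@print{} 1
+$ echo "a bar" | awk '@{ matches = /foo/; print matches @}'
+@print{} 0
+@end example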
+
+@cindex differences in @command{awk} and @command{gawk}, regexp constants
+@cindex dark corner, regexp constants, as arguments to user-defined functions
+@cindex @code{gensub} function (@command{gawk})
+@cindex @code{sub} function
+@cindex @code{gsub} function
+Constant regular expressions are also used as the first argument for
+the @code{gensub}, @code{sub}, and @code{gsub} functions, and as the
+second argument of the @code{match} function
+(@pxref{String Functions}).
+Modern implementations of @command{awk}, including @command{gawk}, allow
+the third argument of @code{split} to be a regexp constant, but some
+older implementations do not.
+@value{DARKCORNER}
+This can lead to confusion when attempting to use regexp constants
+as arguments to user-defined functions
+(@pxref{User-defined}).
+For example:
+
+@example
+function mysub(pat, repl, str, global)
+@{
+    if (global)
+        gsub(pat, repl, str)
+    else
+        sub(pat, repl, str)
+    return str
+@}
+
+@{
+    @dots{}
+    text = "hi! hi yourself!"
+    mysub(/hi/, "howdy", text, 1)
+    @dots{}
+@}
+@end example
+
+@c @cindex automatic warnings
+@c @cindex warnings, automatic
+In this example, the programmer wants to pass a regexp constant to the
+user-defined function @code{mysub}, which in turn passes it on to
+either @code{sub} or @code{gsub}.  However, what really happens is that
+the @code{pat} parameter is either one or zero, depending upon whether
+or not @code{$0} matches @code{/hi/}.
+@command{gawk} issues a warning when it sees a regexp constant used as
+a parameter to a user-defined function, since passing a truth value in
+this way is probably not what was intended.
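+
+One way to get the intended effect, shown here only as a sketch, is to
+pass the regexp as a string (a dynamic regexp) and to use the value
+that the function returns, since @code{str} is a local copy of the
+caller's string.  The @code{BEGIN} rule is an assumption made so that
+the program runs without input:
+
+@example
+function mysub(pat, repl, str, global)
+@{
+    if (global)
+        gsub(pat, repl, str)
+    else
+        sub(pat, repl, str)
+    return str
+@}
+
+BEGIN @{
+    text = "hi! hi yourself!"
+    # "hi" is a string, so it reaches gsub() as a dynamic regexp
+    text = mysub("hi", "howdy", text, 1)
+    print text
+@}
+@end example
+
+@noindent
+This prints @samp{howdy! howdy yourself!}.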
+@c ENDOFRANGE rec
+
+@node Variables
+@section Variables
+
+@cindex variables, user-defined
+@cindex user-defined, variables
+Variables are ways of storing values at one point in your program for
+use later in another part of your program.  They can be manipulated
+entirely within the program text, and they can also be assigned values
+on the @command{awk} command line.
+
+@menu
+* Using Variables::             Using variables in your programs.
+* Assignment Options::          Setting variables on the command-line and a
+                                summary of command-line syntax. This is an
+                                advanced method of input.
+@end menu
+
+@node Using Variables
+@subsection Using Variables in a Program
+
+Variables let you give names to values and refer to them later.  Variables
+have already been used in many of the examples.  The name of a variable
+must be a sequence of letters, digits, or underscores, and it may not begin
+with a digit.  Case is significant in variable names; @code{a} and @code{A}
+are distinct variables.
+
+A variable name is a valid expression by itself; it represents the
+variable's current value.  Variables are given new values with
+@dfn{assignment operators}, @dfn{increment operators}, and
+@dfn{decrement operators}.
+@xref{Assignment Ops}.
+@c NEXT ED: Can also be changed by sub, gsub, split
+
+@cindex variables, built-in
+@cindex variables, initializing
+A few variables have special built-in meanings, such as @code{FS} (the
+field separator), and @code{NF} (the number of fields in the current input
+record).  @xref{Built-in Variables}, for a list of the built-in variables.
+These built-in variables can be used and assigned just like all other
+variables, but their values are also used or changed automatically by
+@command{awk}.  All built-in variables' names are entirely uppercase.
+
+Variables in @command{awk} can be assigned either numeric or string values.
+The kind of value a variable holds can change over the life of a program.
+By default, variables are initialized to the empty string, which
+is zero if converted to a number.  There is no need to explicitly
+``initialize'' each variable in @command{awk}, as you would
+have to do in C and in most other traditional languages.
+
+@node Assignment Options
+@subsection Assigning Variables on the Command Line
+@cindex variables, assigning on command line
+@c comma before assigning does NOT start tertiary
+@cindex command line, variables, assigning on
+
+Any @command{awk} variable can be set by including a @dfn{variable assignment}
+among the arguments on the command line when @command{awk} is invoked
+(@pxref{Other Arguments}).
+Such an assignment has the following form:
+
+@example
+@var{variable}=@var{text}
+@end example
+
+@c comma before assigning does NOT start tertiary
+@cindex @code{-v} option, variables, assigning
+@noindent
+With it, a variable is set either at the beginning of the
+@command{awk} run or in between input files.
+When the assignment is preceded with the @option{-v} option,
+as in the following:
+
+@example
+-v @var{variable}=@var{text}
+@end example
+
+@noindent
+the variable is set at the very beginning, even before the
+@code{BEGIN} rules are run.  The @option{-v} option and its assignment
+must precede all the @value{FN} arguments, as well as the program text.
+(@xref{Options}, for more information about
+the @option{-v} option.)
+Otherwise, the variable assignment is performed at a time determined by
+its position among the input file arguments---after the processing of the
+preceding input file argument.  For example:
+
+@example
+awk '@{ print $n @}' n=4 inventory-shipped n=2 BBS-list
+@end example
+
+@noindent
+prints the value of field number @code{n} for all input records.  Before
+the first file is read, the command line sets the variable @code{n}
+equal to four.  This causes the fourth field to be printed in lines from
+the file @file{inventory-shipped}.  After the first file has finished,
+but before the second file is started, @code{n} is set to two, so that the
+second field is printed in lines from @file{BBS-list}:
+
+@example
+$ awk '@{ print $n @}' n=4 inventory-shipped n=2 BBS-list
+@print{} 15
+@print{} 24
+@dots{}
+@print{} 555-5553
+@print{} 555-3412
+@dots{}
+@end example
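+
+The difference in timing can be seen in a @code{BEGIN} rule.  The
+following sketch relies only on the behavior described above; the
+angle brackets merely make an empty value visible:
+
+@example
+$ awk -v n=2 'BEGIN @{ print "n is <" n ">" @}'
+@print{} n is <2>
+$ awk 'BEGIN @{ print "n is <" n ">" @}' n=2
+@print{} n is <>
+@end example
+
+@noindent
+In the second command the assignment has not been performed when the
+@code{BEGIN} rule runs, so @code{n} is still empty.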
+
+@cindex dark corner, command-line arguments
+Command-line arguments are made available for explicit examination by
+the @command{awk} program in the @code{ARGV} array
+(@pxref{ARGC and ARGV}).
+@command{awk} processes the values of command-line assignments for escape
+sequences
+(@pxref{Escape Sequences}).
+@value{DARKCORNER}
+
+@node Conversion
+@section Conversion of Strings and Numbers
+
+@cindex converting, strings to numbers
+@cindex strings, converting
+@cindex numbers, converting
+@cindex converting, numbers
+Strings are converted to numbers and numbers are converted to strings, if the context
+of the @command{awk} program demands it.  For example, if the value of
+either @code{foo} or @code{bar} in the expression @samp{foo + bar}
+happens to be a string, it is converted to a number before the addition
+is performed.  If numeric values appear in string concatenation, they
+are converted to strings.  Consider the following:
+
+@example
+two = 2; three = 3
+print (two three) + 4
+@end example
+
+@noindent
+This prints the (numeric) value 27.  The numeric values of
+the variables @code{two} and @code{three} are converted to strings and
+concatenated together.  The resulting string is converted back to the
+number 23, to which 4 is then added.
+
+@cindex null strings, converting numbers to strings
+@cindex type conversion
+If, for some reason, you need to force a number to be converted to a
+string, concatenate the empty string, @code{""}, with that number.
+To force a string to be converted to a number, add zero to that string.
+A string is converted to a number by interpreting any numeric prefix
+of the string as numerals:
+@code{"2.5"} converts to 2.5, @code{"1e3"} converts to 1000, and @code{"25fix"}
+has a numeric value of 25.
+Strings that can't be interpreted as valid numbers convert to zero.
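+
+As a quick check of these rules, the following sketch adds zero to
+several strings; the particular strings are only illustrations:
+
+@example
+$ awk 'BEGIN @{ print "2.5" + 0, "1e3" + 0, "25fix" + 0, "fix25" + 0 @}'
+@print{} 2.5 1000 25 0
+@end example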
+
+@cindex @code{CONVFMT} variable
+The exact manner in which numbers are converted into strings is controlled
+by the @command{awk} built-in variable @code{CONVFMT} (@pxref{Built-in Variables}).
+Numbers are converted using the @code{sprintf} function
+with @code{CONVFMT} as the format
+specifier
+(@pxref{String Functions}).
+
+@code{CONVFMT}'s default value is @code{"%.6g"}, which prints a value with
+at most six significant digits.  For some applications, you might want to
+change it to specify more precision.
+On most modern machines,
+17 digits is enough to capture a floating-point number's
+value exactly,
+most of the time.@footnote{Pathological cases can require up to
+752 digits (!), but we doubt that you need to worry about this.}
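+
+For instance, raising the precision changes the string that results
+from a conversion.  This is only a sketch; the value and the
+@code{"%.10g"} format are arbitrary choices:
+
+@example
+$ awk 'BEGIN @{ x = 3.14159265358979; print (x "") @}'
+@print{} 3.14159
+$ awk 'BEGIN @{ CONVFMT = "%.10g"; x = 3.14159265358979; print (x "") @}'
+@print{} 3.141592654
+@end example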
+
+@cindex dark corner, @code{CONVFMT} variable
+Strange results can occur if you set @code{CONVFMT} to a string that doesn't
+tell @code{sprintf} how to format floating-point numbers in a useful way.
+For example, if you forget the @samp{%} in the format, @command{awk} converts
+all numbers to the same constant string.
+As a special case, if a number is an integer, then the result of converting
+it to a string is @emph{always} an integer, no matter what the value of
+@code{CONVFMT} may be.  Given the following code fragment:
+
+@example
+CONVFMT = "%2.2f"
+a = 12
+b = a ""
+@end example
+
+@noindent
+@code{b} has the value @code{"12"}, not @code{"12.00"}.
+@value{DARKCORNER}
+
+@cindex POSIX @command{awk}, @code{OFMT} variable and
+@cindex @code{OFMT} variable
+@cindex portability, new @command{awk} vs. old @command{awk}
+@cindex @command{awk}, new vs. old, @code{OFMT} variable
+Prior to the POSIX standard, @command{awk} used the value
+of @code{OFMT} for converting numbers to strings.  @code{OFMT}
+specifies the output format to use when printing numbers with @code{print}.
+@code{CONVFMT} was introduced in order to separate the semantics of
+conversion from the semantics of printing.  Both @code{CONVFMT} and
+@code{OFMT} have the same default value: @code{"%.6g"}.  In the vast majority
+of cases, old @command{awk} programs do not change their behavior.
+However, these semantics for @code{OFMT} are something to keep in mind if you must
+port your new-style program to older implementations of @command{awk}.
+We recommend
+that, instead of changing your programs, you just port @command{gawk} itself.
+@xref{Print},
+for more information on the @code{print} statement.
+
+Finally, once again, where you are can matter when it comes to
+converting between numbers and strings.  In
+@ref{Locales}, we mentioned that the
+local character set and language (the locale) can affect how @command{gawk} matches
+characters.  The locale also affects numeric formats.  In particular, for @command{awk}
+programs, it affects the decimal point character.  The @code{"C"} locale, and most
+English-language locales, use the period character (@samp{.}) as the decimal point.
+However, many (if not most) European and non-English locales use the comma (@samp{,})
+as the decimal point character.
+
+The POSIX standard says that @command{awk} always uses the period as the decimal
+point when reading the @command{awk} program source code, and for command-line
+variable assignments (@pxref{Other Arguments}).
+However, when interpreting input data, for @code{print} and @code{printf} output,
+and for number to string conversion, the local decimal point character is used.
+As of @value{PVERSION} 3.1.3, @command{gawk} fully complies with this aspect
+of the standard.  Here are some examples indicating the difference in behavior,
+on a GNU/Linux system:
+
+@example
+$ gawk 'BEGIN @{ printf "%g\n", 3.1415927 @}'
+@print{} 3.14159
+$ LC_ALL=en_DK gawk 'BEGIN @{ printf "%g\n", 3.1415927 @}'
+@print{} 3,14159
+$ echo 4,321 | gawk '@{ print $1 + 1 @}'
+@print{} 5
+$ echo 4,321 | LC_ALL=en_DK gawk '@{ print $1 + 1 @}'
+@print{} 5,321
+@end example
+
+@noindent
+The @samp{en_DK} locale is for English in Denmark, where the comma acts as
+the decimal point separator.  In the normal @code{"C"} locale, @command{gawk}
+treats @samp{4,321} as @samp{4}, while in the Danish locale, it's treated
+as the full number, @samp{4.321}.
+
+@node Arithmetic Ops
+@section Arithmetic Operators
+@cindex arithmetic operators
+@cindex operators, arithmetic
+@c @cindex addition
+@c @cindex subtraction
+@c @cindex multiplication
+@c @cindex division
+@c @cindex remainder
+@c @cindex quotient
+@c @cindex exponentiation
+
+The @command{awk} language uses the common arithmetic operators when
+evaluating expressions.  All of these arithmetic operators follow normal
+precedence rules and work as you would expect them to.
+
+The following example uses a file named @file{grades}, which contains
+a list of student names as well as three test scores per student (it's
+a small class):
+
+@example
+Pat   100 97 58
+Sandy  84 72 93
+Chris  72 92 89
+@end example
+
+@noindent
+This program takes the file @file{grades} and prints the average
+of the scores:
+
+@example
+$ awk '@{ sum = $2 + $3 + $4 ; avg = sum / 3
+>        print $1, avg @}' grades
+@print{} Pat 85
+@print{} Sandy 83
+@print{} Chris 84.3333
+@end example
+
+The following list provides the arithmetic operators in @command{awk}, in order from
+the highest precedence to the lowest:
+
+@table @code
+@cindex POSIX @command{awk}, arithmetic operators and
+@item @var{x} ^ @var{y}
+@itemx @var{x} ** @var{y}
+Exponentiation; @var{x} raised to the @var{y} power.  @samp{2 ^ 3} has
+the value eight; the character sequence @samp{**} is equivalent to
+@samp{^}.
+
+@item - @var{x}
+Negation.
+
+@item + @var{x}
+Unary plus; the expression is converted to a number.
+
+@item @var{x} * @var{y}
+Multiplication.
+
+@cindex troubleshooting, division
+@cindex division
+@item @var{x} / @var{y}
+Division;  because all numbers in @command{awk} are floating-point
+numbers, the result is @emph{not} rounded to an integer---@samp{3 / 4} has
+the value 0.75.  (It is a common mistake, especially for C programmers,
+to forget that @emph{all} numbers in @command{awk} are floating-point,
+and that division of integer-looking constants produces a real number,
+not an integer.)
+
+@item @var{x} % @var{y}
+Remainder; further discussion is provided in the text, just
+after this list.
+
+@item @var{x} + @var{y}
+Addition.
+
+@item @var{x} - @var{y}
+Subtraction.
+@end table
+
+Unary plus and minus have the same precedence,
+the multiplication operators all have the same precedence, and
+addition and subtraction have the same precedence.
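+
+The following sketch illustrates two points that are easy to get
+wrong: exponentiation binds more tightly than unary minus and groups
+from right to left, and division does not truncate:
+
+@example
+$ awk 'BEGIN @{ print -2 ^ 2, 2 ^ 3 ^ 2, 3 / 4 @}'
+@print{} -4 512 0.75
+@end example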
+
+@cindex differences in @command{awk} and @command{gawk}, trunc-mod operation
+@cindex trunc-mod operation
+When computing the remainder of @code{@var{x} % @var{y}},
+the quotient is rounded toward zero to an integer and
+multiplied by @var{y}. This result is subtracted from @var{x};
+this operation is sometimes known as ``trunc-mod.''  The following
+relation always holds:
+
+@example
+b * int(a / b) + (a % b) == a
+@end example
+
+One possibly undesirable effect of this definition of remainder is that
+@code{@var{x} % @var{y}} is negative if @var{x} is negative.  Thus:
+
+@example
+-17 % 8 = -1
+@end example
+
+In other @command{awk} implementations, the signedness of the remainder
+may be machine-dependent.
+@c !!! what does posix say?
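+
+The relation can be checked directly.  In this sketch the values
+@minus{}17 and 8 are taken from the example above, and the variable
+names @code{a} and @code{b} match those in the relation:
+
+@example
+$ gawk 'BEGIN @{ a = -17; b = 8
+>               print a % b, int(a / b), b * int(a / b) + (a % b) @}'
+@print{} -1 -2 -17
+@end example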
+
+@cindex portability, @code{**} operator and
+@cindex @code{*} (asterisk), @code{**} operator
+@cindex asterisk (@code{*}), @code{**} operator
+@strong{Note:}
+The POSIX standard only specifies the use of @samp{^}
+for exponentiation.
+For maximum portability, do not use the @samp{**} operator.
+
+@node Concatenation
+@section String Concatenation
+@cindex Kernighan, Brian
+@quotation
+@i{It seemed like a good idea at the time.}@*
+Brian Kernighan
+@end quotation
+
+@cindex string operators
+@cindex operators, string
+@cindex concatenating
+There is only one string operation: concatenation.  It does not have a
+specific operator to represent it.  Instead, concatenation is performed by
+writing expressions next to one another, with no operator.  For example:
+
+@example
+$ awk '@{ print "Field number one: " $1 @}' BBS-list
+@print{} Field number one: aardvark
+@print{} Field number one: alpo-net
+@dots{}
+@end example
+
+Without the space in the string constant after the @samp{:}, the line
+runs together.  For example:
+
+@example
+$ awk '@{ print "Field number one:" $1 @}' BBS-list
+@print{} Field number one:aardvark
+@print{} Field number one:alpo-net
+@dots{}
+@end example
+
+@cindex troubleshooting, string concatenation
+Because string concatenation does not have an explicit operator, it is
+often necessary to ensure that it happens at the right time by using
+parentheses to enclose the items to concatenate.  For example, the
+following code fragment does not concatenate @code{file} and @code{name}
+as you might expect:
+
+@example
+file = "file"
+name = "name"
+print "something meaningful" > file name
+@end example
+
+@noindent
+It is necessary to use the following:
+
+@example
+print "something meaningful" > (file name)
+@end example
+
+@cindex order of evaluation, concatenation
+@cindex evaluation order, concatenation
+@cindex side effects
+Parentheses should be used around concatenation in all but the
+most common contexts, such as on the righthand side of @samp{=}.
+Be careful about the kinds of expressions used in string concatenation.
+In particular, the order of evaluation of expressions used for concatenation
+is undefined in the @command{awk} language.  Consider this example:
+
+@example
+BEGIN @{
+    a = "don't"
+    print (a " " (a = "panic"))
+@}
+@end example
+
+@noindent
+It is not defined whether the assignment to @code{a} happens
+before or after the value of @code{a} is retrieved for producing the
+concatenated value.  The result could be either @samp{don't panic},
+or @samp{panic panic}.
+@c see test/nasty.awk for a worse example
+The precedence of concatenation, when mixed with other operators, is often
+counter-intuitive.  Consider this example:
+
+@ignore
+> To: bug-gnu-utils@@gnu.org
+> CC: arnold@gnu.org
+> Subject: gawk 3.0.4 bug with {print -12 " " -24}
+> From: Russell Schulz <Russell_Schulz@locutus.ofB.ORG>
+> Date: Tue, 8 Feb 2000 19:56:08 -0700
+>
+> gawk 3.0.4 on NT gives me:
+>
+> prompt> cat bad.awk
+> BEGIN { print -12 " " -24; }
+>
+> prompt> gawk -f bad.awk
+> -12-24
+>
+> when I would expect
+>
+> -12 -24
+>
+> I have not investigated the source, or other implementations.  The
+> bug is there on my NT and DOS versions 2.15.6 .
+@end ignore
+
+@example
+$ awk 'BEGIN @{ print -12 " " -24 @}'
+@print{} -12-24
+@end example
+
+This ``obviously'' is concatenating @minus{}12, a space, and @minus{}24.
+But where did the space disappear to?
+The answer lies in the combination of operator precedences and
+@command{awk}'s automatic conversion rules.  To get the desired result,
+write the program in the following manner:
+
+@example
+$ awk 'BEGIN @{ print -12 " " (-24) @}'
+@print{} -12 -24
+@end example
+
+This forces @command{awk} to treat the @samp{-} on the @samp{-24} as unary.
+Otherwise, it's parsed as follows:
+
+@display
+    @minus{}12 (@code{"@ "} @minus{} 24)
+@result{} @minus{}12 (0 @minus{} 24)
+@result{} @minus{}12 (@minus{}24)
+@result{} @minus{}12@minus{}24
+@end display
+
+As mentioned earlier,
+when doing concatenation, @emph{parenthesize}.  Otherwise,
+you're never quite sure what you'll get.
+
+@node Assignment Ops
+@section Assignment Expressions
+@c STARTOFRANGE asop
+@cindex assignment operators
+@c STARTOFRANGE opas
+@cindex operators, assignment
+@c STARTOFRANGE exas
+@cindex expressions, assignment
+@cindex @code{=} (equals sign), @code{=} operator
+@cindex equals sign (@code{=}), @code{=} operator
+An @dfn{assignment} is an expression that stores a (usually different)
+value into a variable.  For example, let's assign the value one to the variable
+@code{z}:
+
+@example
+z = 1
+@end example
+
+After this expression is executed, the variable @code{z} has the value one.
+Whatever old value @code{z} had before the assignment is forgotten.
+
+Assignments can also store string values.  For example, the
+following stores
+the value @code{"this food is good"} in the variable @code{message}:
+
+@example
+thing = "food"
+predicate = "good"
+message = "this " thing " is " predicate
+@end example
+
+@noindent
+@cindex side effects, assignment expressions
+This also illustrates string concatenation.
+The @samp{=} sign is called an @dfn{assignment operator}.  It is the
+simplest assignment operator because the value of the righthand
+operand is stored unchanged.
+Most operators (addition, concatenation, and so on) have no effect
+except to compute a value.  If the value isn't used, there's no reason to
+use the operator.  An assignment operator is different; it does
+produce a value, but even if you ignore it, the assignment still
+makes itself felt through the alteration of the variable.  We call this
+a @dfn{side effect}.
+
+@cindex lvalues/rvalues
+@cindex rvalues/lvalues
+@cindex assignment operators, lvalues/rvalues
+@cindex operators, assignment
+The lefthand operand of an assignment need not be a variable
+(@pxref{Variables}); it can also be a field
+(@pxref{Changing Fields}) or
+an array element (@pxref{Arrays}).
+These are all called @dfn{lvalues},
+which means they can appear on the lefthand side of an assignment operator.
+The righthand operand may be any expression; it produces the new value
+that the assignment stores in the specified variable, field, or array
+element. (Such values are called @dfn{rvalues}.)
+
+@cindex variables, types of
+It is important to note that variables do @emph{not} have permanent types.
+A variable's type is simply the type of whatever value it happens
+to hold at the moment.  In the following program fragment, the variable
+@code{foo} has a numeric value at first, and a string value later on:
+
+@example
+foo = 1
+print foo
+foo = "bar"
+print foo
+@end example
+
+@noindent
+When the second assignment gives @code{foo} a string value, the fact that
+it previously had a numeric value is forgotten.
+
+String values that do not begin with a numeric prefix have a numeric value of
+zero. After executing the following code, the value of @code{foo} is five:
+
+@example
+foo = "a string"
+foo = foo + 5
+@end example
+
+@noindent
+@strong{Note:} Using a variable as a number and then later as a string
+can be confusing and is poor programming style.  The previous two examples
+illustrate how @command{awk} works, @emph{not} how you should write your
+programs!
+
+An assignment is an expression, so it has a value---the same value that
+is assigned.  Thus, @samp{z = 1} is an expression with the value one.
+One consequence of this is that you can write multiple assignments together,
+such as:
+
+@example
+x = y = z = 5
+@end example
+
+@noindent
+This example stores the value five in all three variables
+(@code{x}, @code{y}, and @code{z}).
+It does so because the
+value of @samp{z = 5}, which is five, is stored into @code{y} and then
+the value of @samp{y = z = 5}, which is five, is stored into @code{x}.
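+
+A short sketch confirming both points, namely that chained assignments
+set every variable and that an assignment has a value that can be used
+in a larger expression; the variables @code{w} and @code{v} are
+introduced only for illustration:
+
+@example
+$ awk 'BEGIN @{ x = y = z = 5; v = (w = 3) + 1; print x, y, z, w, v @}'
+@print{} 5 5 5 3 4
+@end example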
+
+Assignments may be used anywhere an expression is called for.  For
+example, it is valid to write @samp{x != (y = 1)} to set @code{y} to one,
+and then test whether @code{x} equals one.  But this style tends to make
+programs hard to read; such nesting of assignments should be avoided,
+except perhaps in a one-shot program.
+
+@cindex @code{+} (plus sign), @code{+=} operator
+@cindex plus sign (@code{+}), @code{+=} operator
+Aside from @samp{=}, there are several other assignment operators that
+do arithmetic with the old value of the variable.  For example, the
+operator @samp{+=} computes a new value by adding the righthand value
+to the old value of the variable.  Thus, the following assignment adds
+five to the value of @code{foo}:
+
+@example
+foo += 5
+@end example
+
+@noindent
+This is equivalent to the following:
+
+@example
+foo = foo + 5
+@end example
+
+@noindent
+Use whichever makes the meaning of your program clearer.
+
+There are situations where using @samp{+=} (or any assignment operator)
+is @emph{not} the same as simply repeating the lefthand operand in the
+righthand expression.  For example:
+
+@cindex Rankin, Pat
+@example
+# Thanks to Pat Rankin for this example
+BEGIN  @{
+    foo[rand()] += 5
+    for (x in foo)
+       print x, foo[x]
+
+    bar[rand()] = bar[rand()] + 5
+    for (x in bar)
+       print x, bar[x]
+@}
+@end example
+
+@cindex operators, assignment, evaluation order
+@cindex assignment operators, evaluation order
+@noindent
+The indices of @code{bar} are practically guaranteed to be different, because
+@code{rand} returns different values each time it is called.
+(Arrays and the @code{rand} function haven't been covered yet.
+@xref{Arrays},
+and see @ref{Numeric Functions}, for more information.)
+This example illustrates an important fact about assignment
+operators: the lefthand expression is only evaluated @emph{once}.
+It is up to the implementation as to which expression is evaluated
+first, the lefthand or the righthand.
+Consider this example:
+
+@example
+i = 1
+a[i += 2] = i + 1
+@end example
+
+@noindent
+The value of @code{a[3]} could be either two or four.
+
+Here is a table of the arithmetic assignment operators.  In each
+case, the righthand operand is an expression whose value is converted
+to a number.
+
+@ignore
+@table @code
+@item @var{lvalue} += @var{increment}
+Adds @var{increment} to the value of @var{lvalue}.
+
+@item @var{lvalue} -= @var{decrement}
+Subtracts @var{decrement} from the value of @var{lvalue}.
+
+@item @var{lvalue} *= @var{coefficient}
+Multiplies the value of @var{lvalue} by @var{coefficient}.
+
+@item @var{lvalue} /= @var{divisor}
+Divides the value of @var{lvalue} by @var{divisor}.
+
+@item @var{lvalue} %= @var{modulus}
+Sets @var{lvalue} to its remainder by @var{modulus}.
+
+@cindex @command{awk} language, POSIX version
+@cindex POSIX @command{awk}
+@item @var{lvalue} ^= @var{power}
+@itemx @var{lvalue} **= @var{power}
+Raises @var{lvalue} to the power @var{power}.
+(Only the @samp{^=} operator is specified by POSIX.)
+@end table
+@end ignore
+
+@cindex @code{-} (hyphen), @code{-=} operator
+@cindex hyphen (@code{-}), @code{-=} operator
+@cindex @code{*} (asterisk), @code{*=} operator
+@cindex asterisk (@code{*}), @code{*=} operator
+@cindex @code{/} (forward slash), @code{/=} operator
+@cindex forward slash (@code{/}), @code{/=} operator
+@cindex @code{%} (percent sign), @code{%=} operator
+@cindex percent sign (@code{%}), @code{%=} operator
+@cindex @code{^} (caret), @code{^=} operator
+@cindex caret (@code{^}), @code{^=} operator
+@cindex @code{*} (asterisk), @code{**=} operator
+@cindex asterisk (@code{*}), @code{**=} operator
+@multitable {@var{lvalue} *= @var{coefficient}} {Subtracts @var{decrement} from the value of @var{lvalue}.}
+@item @var{lvalue} @code{+=} @var{increment} @tab Adds @var{increment} to the value of @var{lvalue}.
+
+@item @var{lvalue} @code{-=} @var{decrement} @tab Subtracts @var{decrement} from the value of @var{lvalue}.
+
+@item @var{lvalue} @code{*=} @var{coefficient} @tab Multiplies the value of @var{lvalue} by @var{coefficient}.
+
+@item @var{lvalue} @code{/=} @var{divisor} @tab Divides the value of @var{lvalue} by @var{divisor}.
+
+@item @var{lvalue} @code{%=} @var{modulus} @tab Sets @var{lvalue} to its remainder by @var{modulus}.
+
+@cindex @command{awk} language, POSIX version
+@cindex POSIX @command{awk}
+@item @var{lvalue} @code{^=} @var{power} @tab
+@item @var{lvalue} @code{**=} @var{power} @tab Raises @var{lvalue} to the power @var{power}.
+@end multitable
+
+@cindex POSIX @command{awk}, @code{**=} operator and
+@cindex portability, @code{**=} operator and
+@strong{Note:}
+Only the @samp{^=} operator is specified by POSIX.
+For maximum portability, do not use the @samp{**=} operator.
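+
+As a quick illustration of the table, the following sketch applies each
+of the portable operators in turn to a single variable.  The variable
+name @code{n} and the starting value are arbitrary choices made for this
+example; the comments show the value printed at each step:
+
+@example
+BEGIN @{
+    n = 10
+    n += 5;  print n    # 15
+    n -= 3;  print n    # 12
+    n *= 2;  print n    # 24
+    n /= 8;  print n    # 3
+    n ^= 2;  print n    # 9
+    n %= 4;  print n    # 1
+@}
+@end example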
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Syntactic Ambiguities Between @samp{/=} and Regular Expressions
+@cindex advanced features, regexp constants
+@cindex dark corner, regexp constants, @code{/=} operator and
+@cindex @code{/} (forward slash), @code{/=} operator, vs. @code{/=@dots{}/} regexp constant
+@cindex forward slash (@code{/}), @code{/=} operator, vs. @code{/=@dots{}/} regexp constant
+@cindex regexp constants, @code{/=@dots{}/}, @code{/=} operator and
+
+@c derived from email from  "Nelson H. F. Beebe" <beebe@math.utah.edu>
+@c Date: Mon, 1 Sep 1997 13:38:35 -0600 (MDT)
+
+@cindex dark corner
+@cindex ambiguity, syntactic: @code{/=} operator vs. @code{/=@dots{}/} regexp constant
+@cindex syntactic ambiguity: @code{/=} operator vs. @code{/=@dots{}/} regexp constant
+@cindex @code{/=} operator vs. @code{/=@dots{}/} regexp constant
+There is a syntactic ambiguity between the @samp{/=} assignment
+operator and regexp constants whose first character is an @samp{=}.
+@value{DARKCORNER}
+This is most notable in commercial @command{awk} versions.
+For example:
+
+@example
+$ awk /==/ /dev/null
+@error{} awk: syntax error at source line 1
+@error{}  context is
+@error{}         >>> /= <<<
+@error{} awk: bailing out at source line 1
+@end example
+
+@noindent
+A workaround is:
+
+@example
+awk '/[=]=/' /dev/null
+@end example
+
+@command{gawk} does not have this problem,
+nor do the other
+freely available versions described in
+@ref{Other Versions}.
+@c ENDOFRANGE exas
+@c ENDOFRANGE opas
+@c ENDOFRANGE asop
+
+@node Increment Ops
+@section Increment and Decrement Operators
+
+@c STARTOFRANGE inop
+@cindex increment operators
+@c STARTOFRANGE opde
+@cindex operators, decrement/increment
+@dfn{Increment} and @dfn{decrement operators} increase or decrease the value of
+a variable by one.  An assignment operator can do the same thing, so
+the increment operators add no power to the @command{awk} language; however, they
+are convenient abbreviations for very common operations.
+
+@cindex side effects
+@cindex @code{+} (plus sign), decrement/increment operators
+@cindex plus sign (@code{+}), decrement/increment operators
+@cindex side effects, decrement/increment operators
+The operator used for adding one is written @samp{++}.  It can be used to increment
+a variable either before or after taking its value.
+To pre-increment a variable @code{v}, write @samp{++v}.  This adds
+one to the value of @code{v}---that new value is also the value of the
+expression. (The assignment expression @samp{v += 1} is completely
+equivalent.)
+Writing the @samp{++} after the variable specifies post-increment.  This
+increments the variable value just the same; the difference is that the
+value of the increment expression itself is the variable's @emph{old}
+value.  Thus, if @code{foo} has the value four, then the expression @samp{foo++}
+has the value four, but it changes the value of @code{foo} to five.
+In other words, the operator returns the old value of the variable,
+but with the side effect of incrementing it.
+
+The post-increment @samp{foo++} is nearly the same as writing @samp{(foo
++= 1) - 1}.  It is not perfectly equivalent because all numbers in
+@command{awk} are floating-point---in floating-point, @samp{foo + 1 - 1} does
+not necessarily equal @code{foo}.  But the difference is minute as
+long as you stick to numbers that are fairly small (less than 10e12).
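+
+The difference between the two forms can be seen by saving the value of
+each increment expression.  This is only a small sketch; the variable
+names @code{old} and @code{new} are made up for the illustration:
+
+@example
+BEGIN @{
+    foo = 4
+    old = foo++       # old gets 4, foo becomes 5
+    print old, foo    # prints 4 5
+    new = ++foo       # foo becomes 6, new gets 6
+    print new, foo    # prints 6 6
+@}
+@end example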
+
+@cindex @code{$} (dollar sign), incrementing fields and arrays
+@cindex dollar sign (@code{$}), incrementing fields and arrays
+Fields and array elements are incremented
+just like variables.  (Use @samp{$(i++)} when you want to do a field reference
+and a variable increment at the same time.  The parentheses are necessary
+because of the precedence of the field reference operator @samp{$}.)
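+
+For instance, the following one-line sketch (the input @samp{a b c} is
+just sample data) prints the first field and then shows that @code{i}
+has moved on to two:
+
+@example
+$ echo a b c | awk '@{ i = 1; print $(i++); print i @}'
+@print{} a
+@print{} 2
+@end example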
+
+@cindex decrement operators
+The decrement operator @samp{--} works just like @samp{++}, except that
+it subtracts one instead of adding it.  As with @samp{++}, it can be used before
+the lvalue to pre-decrement or after it to post-decrement.
+Following is a summary of increment and decrement expressions:
+
+@table @code
+@cindex @code{+} (plus sign), @code{++} operator
+@cindex plus sign (@code{+}), @code{++} operator
+@item ++@var{lvalue}
+This expression increments @var{lvalue}, and the new value becomes the
+value of the expression.
+
+@item @var{lvalue}++
+This expression increments @var{lvalue}, but
+the value of the expression is the @emph{old} value of @var{lvalue}.
+
+@cindex @code{-} (hyphen), @code{--} operator
+@cindex hyphen (@code{-}), @code{--} operator
+@item --@var{lvalue}
+This expression is
+like @samp{++@var{lvalue}}, but instead of adding, it subtracts.  It
+decrements @var{lvalue} and delivers the value that is the result.
+
+@item @var{lvalue}--
+This expression is
+like @samp{@var{lvalue}++}, but instead of adding, it subtracts.  It
+decrements @var{lvalue}.  The value of the expression is the @emph{old}
+value of @var{lvalue}.
+@end table
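+
+A common use of post-decrement is a countdown loop.  In the following
+sketch (the variable @code{n} and the messages are invented for this
+example), the test compares the @emph{old} value of @code{n}, so the
+body runs three times and @code{n} ends up at minus one:
+
+@example
+BEGIN @{
+    n = 3
+    while (n-- > 0)
+        print "n is now", n
+    print "after the loop, n is", n
+@}
+@end example
+
+@noindent
+This prints @samp{n is now 2}, @samp{n is now 1}, @samp{n is now 0},
+and finally @samp{after the loop, n is -1}.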
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Operator Evaluation Order
+@c comma before precedence does NOT start tertiary
+@cindex advanced features, operators, precedence
+@cindex precedence
+@cindex operators, precedence
+@cindex portability, operators
+@cindex evaluation order
+@cindex Marx, Groucho
+@quotation
+@i{Doctor, doctor!  It hurts when I do this!@*
+So don't do that!}@*
+Groucho Marx
+@end quotation
+
+@noindent
+What happens for something like the following?
+
+@example
+b = 6
+print b += b++
+@end example
+
+@noindent
+Or something even stranger?
+
+@example
+b = 6
+b += ++b + b++
+print b
+@end example
+
+@cindex side effects
+In other words, when do the various side effects prescribed by the
+postfix operators (@samp{b++}) take effect?
+When side effects happen is @dfn{implementation defined}.
+In other words, it is up to the particular version of @command{awk}.
+The result for the first example may be 12 or 13, and for the second, it
+may be 22 or 23.
+
+In short, doing things like this is not recommended and definitely
+not anything that you can rely upon for portability.
+You should avoid such things in your own programs.
+@c You'll sleep better at night and be able to look at yourself
+@c in the mirror in the morning.
+@c ENDOFRANGE inop
+@c ENDOFRANGE opde
+@c ENDOFRANGE deop
+
+@node Truth Values
+@section True and False in @command{awk}
+@cindex truth values
+@cindex logical false/true
+@cindex false, logical
+@cindex true, logical
+
+@cindex null strings
+Many programming languages have a special representation for the concepts
+of ``true'' and ``false.''  Such languages usually use the special
+constants @code{true} and @code{false}, or perhaps their uppercase
+equivalents.
+However, @command{awk} is different.
+It borrows a very simple concept of true and
+false from C.  In @command{awk}, any nonzero numeric value @emph{or} any
+nonempty string value is true.  Any other value (zero or the null
+string @code{""}) is false.  The following program prints @samp{A strange
+truth value} three times:
+
+@example
+BEGIN @{
+   if (3.1415927)
+       print "A strange truth value"
+   if ("Four Score And Seven Years Ago")
+       print "A strange truth value"
+   if (j = 57)
+       print "A strange truth value"
+@}
+@end example
+
+@cindex dark corner
+There is a surprising consequence of the ``nonzero or non-null'' rule:
+the string constant @code{"0"} is actually true, because it is non-null.
+@value{DARKCORNER}
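+
+The following sketch contrasts the three cases.  The labels are only
+for display; the conditional expression used to pick the word is
+described in @ref{Conditional Exp}:
+
+@example
+BEGIN @{
+    print "numeric zero:", (0 ? "true" : "false")
+    print "null string:", ("" ? "true" : "false")
+    print "string \"0\":", ("0" ? "true" : "false")
+@}
+@end example
+
+@noindent
+The first two lines of output end in @samp{false}; the third ends in
+@samp{true}.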
+
+@node Typing and Comparison
+@section Variable Typing and Comparison Expressions
+@quotation
+@i{The Guide is definitive. Reality is frequently inaccurate.}@*
+The Hitchhiker's Guide to the Galaxy
+@end quotation
+
+@c STARTOFRANGE comex
+@cindex comparison expressions
+@c STARTOFRANGE excom
+@cindex expressions, comparison
+@cindex expressions, matching, See comparison expressions
+@cindex matching, expressions, See comparison expressions
+@cindex relational operators, See comparison operators
+@c comma is part of See
+@cindex operators, relational, See operators, comparison
+@c STARTOFRANGE varting
+@cindex variable typing
+@c STARTOFRANGE vartypc
+@cindex variables, types of, comparison expressions and
+Unlike variables in many other programming languages, @command{awk}
+variables do not have a fixed type.  Instead, they can be either a number
+or a string, depending upon the value that is assigned to them.
+
+@cindex numeric, strings
+@cindex strings, numeric
+@cindex POSIX @command{awk}, numeric strings and
+The 1992 POSIX standard introduced
+the concept of a @dfn{numeric string}, which is simply a string that looks
+like a number---for example, @code{@w{" +2"}}.  This concept is used
+for determining the type of a variable.
+The type of the variable is important because the types of two variables
+determine how they are compared.
+In @command{gawk}, variable typing follows these rules:
+
+@itemize @bullet
+@item
+A numeric constant or the result of a numeric operation has the @var{numeric}
+attribute.
+
+@item
+A string constant or the result of a string operation has the @var{string}
+attribute.
+
+@item
+Fields, @code{getline} input, @code{FILENAME}, @code{ARGV} elements,
+@code{ENVIRON} elements, and the
+elements of an array created by @code{split} that are numeric strings
+have the @var{strnum} attribute.  Otherwise, they have the @var{string}
+attribute.
+Uninitialized variables also have the @var{strnum} attribute.
+
+@item
+Attributes propagate across assignments but are not changed by
+any use.
+@c (Although a use may cause the entity to acquire an additional
+@c value such that it has both a numeric and string value, this leaves the
+@c attribute unchanged.)
+@c This is important but not relevant
+@end itemize
+
+The last rule is particularly important. In the following program,
+@code{a} has numeric type, even though it is later used in a string
+operation:
+
+@example
+BEGIN @{
+         a = 12.345
+         b = a " is a cute number"
+         print b
+@}
+@end example
+
+When two operands are compared, either string comparison or numeric comparison
+may be used. This depends upon the attributes of the operands, according to the
+following symmetric matrix:
+
+@c thanks to Karl Berry, kb@cs.umb.edu, for major help with TeX tables
+@tex
+\centerline{
+\vbox{\bigskip % space above the table (about 1 linespace)
+% Because we have vertical rules, we can't let TeX insert interline space
+% in its usual way.
+\offinterlineskip
+%
+% Define the table template. & separates columns, and \cr ends the
+% template (and each row). # is replaced by the text of that entry on
+% each row. The template for the first column breaks down like this:
+%   \strut -- a way to make each line have the height and depth
+%             of a normal line of type, since we turned off interline spacing.
+%   \hfil -- infinite glue; has the effect of right-justifying in this case.
+%   #     -- replaced by the text (for instance, `STRNUM', in the last row).
+%   \quad -- about the width of an `M'. Just separates the columns.
+%
+% The second column (\vrule#) is what generates the vertical rule that
+% spans table rows.
+%
+% The doubled && before the next entry means `repeat the following
+% template as many times as necessary on each line' -- in our case, twice.
+%
+% The template itself, \quad#\hfil, left-justifies with a little space before.
+%
+\halign{\strut\hfil#\quad&\vrule#&&\quad#\hfil\cr
+	&&STRING	&NUMERIC	&STRNUM\cr
+% The \omit tells TeX to skip inserting the template for this column on
+% this particular row. In this case, we only want a little extra space
+% to separate the heading row from the rule below it.  the depth 2pt --
+% `\vrule depth 2pt' is that little space.
+\omit	&depth 2pt\cr
+% This is the horizontal rule below the heading. Since it has nothing to
+% do with the columns of the table, we use \noalign to get it in there.
+\noalign{\hrule}
+% Like above, this time a little more space.
+\omit	&depth 4pt\cr
+% The remaining rows have nothing special about them.
+STRING	&&string	&string		&string\cr
+NUMERIC	&&string	&numeric	&numeric\cr
+STRNUM  &&string	&numeric	&numeric\cr
+}}}
+@end tex
+@ifnottex
+@display
+        +----------------------------------------------
+        |       STRING          NUMERIC         STRNUM
+--------+----------------------------------------------
+        |
+STRING  |       string          string          string
+        |
+NUMERIC |       string          numeric         numeric
+        |
+STRNUM  |       string          numeric         numeric
+--------+----------------------------------------------
+@end display
+@end ifnottex
+
+The basic idea is that user input that looks numeric---and @emph{only}
+user input---should be treated as numeric, even though it is actually
+made of characters and is therefore also a string.
+Thus, for example, the string constant @w{@code{" +3.14"}}
+is a string, even though it looks numeric,
+and is @emph{never} treated as a number for comparison
+purposes.
+
+In short, when one operand is a ``pure'' string, such as a string
+constant, then a string comparison is performed.  Otherwise, a
+numeric comparison is performed.@footnote{The POSIX standard is under
+revision.  The revised standard's rules for typing and comparison are
+the same as just described for @command{gawk}.}
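+
+The distinction can be seen by comparing the same characters twice,
+once as user input and once as a string constant.  This is a sketch
+that assumes the typing rules just described for @command{gawk}; the
+input text is sample data:
+
+@example
+$ echo ' +3.14' | awk '@{ print ($0 == 3.14), (" +3.14" == 3.14) @}'
+@print{} 1 0
+@end example
+
+@noindent
+The record @code{$0} is user input that looks numeric, so it is
+compared as a number.  The string constant is compared as a string,
+and @code{" +3.14"} is not the same string as @code{"3.14"}.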
+
+@dfn{Comparison expressions} compare strings or numbers for
+relationships such as equality.  They are written using @dfn{relational
+operators}, which are a superset of those in C.  Here is a table of
+them:
+
+@cindex @code{<} (left angle bracket), @code{<} operator
+@cindex left angle bracket (@code{<}), @code{<} operator
+@cindex @code{<} (left angle bracket), @code{<=} operator
+@cindex left angle bracket (@code{<}), @code{<=} operator
+@cindex @code{>} (right angle bracket), @code{>=} operator
+@cindex right angle bracket (@code{>}), @code{>=} operator
+@cindex @code{>} (right angle bracket), @code{>} operator
+@cindex right angle bracket (@code{>}), @code{>} operator
+@cindex @code{=} (equals sign), @code{==} operator
+@cindex equals sign (@code{=}), @code{==} operator
+@cindex @code{!} (exclamation point), @code{!=} operator
+@cindex exclamation point (@code{!}), @code{!=} operator
+@cindex @code{~} (tilde), @code{~} operator
+@cindex tilde (@code{~}), @code{~} operator
+@cindex @code{!} (exclamation point), @code{!~} operator
+@cindex exclamation point (@code{!}), @code{!~} operator
+@cindex @code{in} operator
+@table @code
+@item @var{x} < @var{y}
+True if @var{x} is less than @var{y}.
+
+@item @var{x} <= @var{y}
+True if @var{x} is less than or equal to @var{y}.
+
+@item @var{x} > @var{y}
+True if @var{x} is greater than @var{y}.
+
+@item @var{x} >= @var{y}
+True if @var{x} is greater than or equal to @var{y}.
+
+@item @var{x} == @var{y}
+True if @var{x} is equal to @var{y}.
+
+@item @var{x} != @var{y}
+True if @var{x} is not equal to @var{y}.
+
+@item @var{x} ~ @var{y}
+True if the string @var{x} matches the regexp denoted by @var{y}.
+
+@item @var{x} !~ @var{y}
+True if the string @var{x} does not match the regexp denoted by @var{y}.
+
+@item @var{subscript} in @var{array}
+True if the array @var{array} has an element with the subscript @var{subscript}.
+@end table
+
+Comparison expressions have the value one if true and zero if false.
+When comparing operands of mixed types, numeric operands are converted
+to strings using the value of @code{CONVFMT}
+(@pxref{Conversion}).
+
+Strings are compared
+by comparing the first character of each, then the second character of each,
+and so on.  Thus, @code{"10"} is less than @code{"9"}.  If there are two
+strings where one is a prefix of the other, the shorter string is less than
+the longer one.  Thus, @code{"abc"} is less than @code{"abcd"}.
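+
+A short sketch of these rules, using constants so that the kind of
+comparison is unambiguous:
+
+@example
+BEGIN @{
+    print ("10" < "9")       # 1: strings, and "1" sorts before "9"
+    print (10 < 9)           # 0: numbers
+    print ("abc" < "abcd")   # 1: a prefix is less than the longer string
+@}
+@end example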
+
+@cindex troubleshooting, @code{==} operator
+It is very easy to accidentally mistype the @samp{==} operator and
+leave off one of the @samp{=} characters.  The result is still valid @command{awk}
+code, but the program does not do what is intended:
+
+@example
+if (a = b)   # oops! should be a == b
+   @dots{}
+else
+   @dots{}
+@end example
+
+@noindent
+Unless @code{b} happens to be zero or the null string, the @code{if}
+part of the test always succeeds.  Because the operators are
+so similar, this kind of error is very difficult to spot when
+scanning the source code.
+
+@cindex @command{gawk}, comparison operators and
+The following table of expressions illustrates the kind of comparison
+@command{gawk} performs, as well as what the result of the comparison is:
+
+@table @code
+@item 1.5 <= 2.0
+numeric comparison (true)
+
+@item "abc" >= "xyz"
+string comparison (false)
+
+@item 1.5 != " +2"
+string comparison (true)
+
+@item "1e2" < "3"
+string comparison (true)
+
+@item a = 2; b = "2"
+@itemx a == b
+string comparison (true)
+
+@item a = 2; b = " +2"
+@itemx a == b
+string comparison (false)
+@end table
+
+In the next example:
+
+@example
+$ echo 1e2 3 | awk '@{ print ($1 < $2) ? "true" : "false" @}'
+@print{} false
+@end example
+
+@cindex comparison expressions, string vs. regexp
+@c @cindex string comparison vs. regexp comparison
+@c @cindex regexp comparison vs. string comparison
+@noindent
+the result is @samp{false} because both @code{$1} and @code{$2}
+are user input.  They are numeric strings---therefore both have
+the @var{strnum} attribute, dictating a numeric comparison.
+The purpose of the comparison rules and the use of numeric strings is
+to attempt to produce the behavior that is ``least surprising,'' while
+still ``doing the right thing.''
+
+String comparisons and regular expression comparisons are very different.
+For example:
+
+@example
+x == "foo"
+@end example
+
+@noindent
+has the value one (that is, it is true) if the variable @code{x}
+is precisely @samp{foo}.  By contrast:
+
+@example
+x ~ /foo/
+@end example
+
+@noindent
+has the value one if @code{x} contains @samp{foo}, such as
+@code{"Oh, what a fool am I!"}.
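+
+The two tests can be put side by side.  In this sketch, the first
+@code{print} statement produces @samp{0 1} and the second produces
+@samp{1 1}:
+
+@example
+BEGIN @{
+    x = "Oh, what a fool am I!"
+    print (x == "foo"), (x ~ /foo/)
+    x = "foo"
+    print (x == "foo"), (x ~ /foo/)
+@}
+@end example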
+
+@cindex @code{~} (tilde), @code{~} operator
+@cindex tilde (@code{~}), @code{~} operator
+@cindex @code{!} (exclamation point), @code{!~} operator
+@cindex exclamation point (@code{!}), @code{!~} operator
+The righthand operand of the @samp{~} and @samp{!~} operators may be
+either a regexp constant (@code{/@dots{}/}) or an ordinary
+expression. In the latter case, the value of the expression as a string is used as a
+dynamic regexp (@pxref{Regexp Usage}; also
+@pxref{Computed Regexps}).
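+
+As a small sketch of the second form, the following program keeps a
+regexp in a variable and matches against it.  The variable name
+@code{digits} and the test strings are invented for this example:
+
+@example
+BEGIN @{
+    digits = "^[0-9]+$"
+    print ("2400" ~ digits)    # 1
+    print ("24oo" ~ digits)    # 0
+@}
+@end example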
+
+@cindex @command{awk}, regexp constants and
+@cindex regexp constants
+In modern implementations of @command{awk}, a constant regular
+expression in slashes by itself is also an expression.  The regexp
+@code{/@var{regexp}/} is an abbreviation for the following comparison expression:
+
+@example
+$0 ~ /@var{regexp}/
+@end example
+
+One special place where @code{/foo/} is @emph{not} an abbreviation for
+@samp{$0 ~ /foo/} is when it is the righthand operand of @samp{~} or
+@samp{!~}.
+@xref{Using Constant Regexps},
+where this is discussed in more detail.
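+
+Because a regexp constant standing alone is an expression, its value
+can be stored like any other.  In this sketch (the variable name
+@code{found} is made up), the assignment records whether the current
+record matches:
+
+@example
+$ echo foobar | awk '@{ found = /foo/; print found @}'
+@print{} 1
+@end example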
+@c ENDOFRANGE comex
+@c ENDOFRANGE excom
+@c ENDOFRANGE vartypc
+@c ENDOFRANGE varting
+
+@node Boolean Ops
+@section Boolean Expressions
+@cindex and Boolean-logic operator
+@cindex or Boolean-logic operator
+@cindex not Boolean-logic operator
+@c STARTOFRANGE exbo
+@cindex expressions, Boolean
+@c STARTOFRANGE boex
+@cindex Boolean expressions
+@cindex operators, Boolean, See Boolean expressions
+@cindex Boolean operators, See Boolean expressions
+@cindex logical operators, See Boolean expressions
+@cindex operators, logical, See Boolean expressions
+
+A @dfn{Boolean expression} is a combination of comparison expressions or
+matching expressions, using the Boolean operators ``or''
+(@samp{||}), ``and'' (@samp{&&}), and ``not'' (@samp{!}), along with
+parentheses to control nesting.  The truth value of the Boolean expression is
+computed by combining the truth values of the component expressions.
+Boolean expressions are also referred to as @dfn{logical expressions}.
+The terms are equivalent.
+
+Boolean expressions can be used wherever comparison and matching
+expressions can be used.  They can be used in @code{if}, @code{while},
+@code{do}, and @code{for} statements
+(@pxref{Statements}).
+They have numeric values (one if true, zero if false) that come into play
+if the result of the Boolean expression is stored in a variable or
+used in arithmetic.
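+
+One way to put the numeric value to work is to count matching records
+by adding the result of a Boolean expression to a total.  The
+following is only a sketch; the variable name @code{big} and the
+limits are arbitrary:
+
+@example
+@{ big += ($1 > 10 && $1 < 100) @}
+END @{ print big + 0 @}
+@end example
+
+@noindent
+Given the three input lines @samp{5}, @samp{12}, and @samp{30}, this
+program prints @samp{2}.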
+
+In addition, every Boolean expression is also a valid pattern, so
+you can use one as a pattern to control the execution of rules.
+The Boolean operators are:
+
+@table @code
+@item @var{boolean1} && @var{boolean2}
+True if both @var{boolean1} and @var{boolean2} are true.  For example,
+the following statement prints the current input record if it contains
+both @samp{2400} and @samp{foo}:
+
+@example
+if ($0 ~ /2400/ && $0 ~ /foo/) print
+@end example
+
+@cindex side effects, Boolean operators
+The subexpression @var{boolean2} is evaluated only if @var{boolean1}
+is true.  This can make a difference when @var{boolean2} contains
+expressions that have side effects. In the case of @samp{$0 ~ /foo/ &&
+($2 == bar++)}, the variable @code{bar} is not incremented if there is
+no substring @samp{foo} in the record.
+
+@item @var{boolean1} || @var{boolean2}
+True if at least one of @var{boolean1} or @var{boolean2} is true.
+For example, the following statement prints all records in the input
+that contain @emph{either} @samp{2400} or
+@samp{foo} or both:
+
+@example
+if ($0 ~ /2400/ || $0 ~ /foo/) print
+@end example
+
+The subexpression @var{boolean2} is evaluated only if @var{boolean1}
+is false.  This can make a difference when @var{boolean2} contains
+expressions that have side effects.
+
+@item ! @var{boolean}
+True if @var{boolean} is false.  For example,
+the following program prints @samp{no home!} in
+the unusual event that the @env{HOME} environment
+variable is not defined:
+
+@example
+BEGIN @{ if (! ("HOME" in ENVIRON))
+               print "no home!" @}
+@end example
+
+(The @code{in} operator is described in
+@ref{Reference to Elements}.)
+@end table
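+
+Since every Boolean expression is also a valid pattern, the operators
+can be combined directly in a rule with no action.  The following
+sketch, run on made-up input, prints only records that contain
+@samp{2400} and do not contain @samp{bar}:
+
+@example
+$ printf '2400 foo\n2400 bar\nbaz\n' | awk '/2400/ && ! /bar/'
+@print{} 2400 foo
+@end example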
+
+@cindex short-circuit operators
+@cindex operators, short-circuit
+@cindex @code{&} (ampersand), @code{&&} operator
+@cindex ampersand (@code{&}), @code{&&} operator
+@cindex @code{|} (vertical bar), @code{||} operator
+@cindex vertical bar (@code{|}), @code{||} operator
+The @samp{&&} and @samp{||} operators are called @dfn{short-circuit}
+operators because of the way they work.  Evaluation of the full expression
+is ``short-circuited'' if the result can be determined part way through
+its evaluation.
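+
+The effect is easy to observe when the righthand operand has a side
+effect.  In this sketch, neither @samp{n++} is ever evaluated, so the
+program prints @samp{n is still 0}:
+
+@example
+BEGIN @{
+    n = 0
+    if (0 && n++)
+        print "not reached"
+    if (1 || n++)
+        print "n is still", n
+@}
+@end example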
+
+@cindex line continuations
+Statements that use @samp{&&} or @samp{||} can be continued simply
+by putting a newline after them.  But you cannot put a newline in front
+of either of these operators without using backslash continuation
+(@pxref{Statements/Lines}).
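+
+The following sketch shows both forms.  The rules themselves, with
+their field tests and messages, are invented for illustration:
+
+@example
+# Newline after the operator: no backslash needed
+$1 == "START" &&
+$2 > 10          @{ print "ok", $2 @}
+
+# Newline before the operator: backslash continuation required
+$1 == "END" \
+    && $2 > 10   @{ print "done", $2 @}
+@end example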
+
+@cindex @code{!} (exclamation point), @code{!}  operator
+@cindex exclamation point (@code{!}), @code{!} operator
+@cindex newlines
+@cindex variables, flag
+@cindex flag variables
+The actual value of an expression using the @samp{!} operator is
+either one or zero, depending upon the truth value of the expression it
+is applied to.
+The @samp{!} operator is often useful for changing the sense of a flag
+variable from false to true and back again. For example, the following
+program is one way to print lines in between special bracketing lines:
+
+@example
+$1 == "START"   @{ interested = ! interested; next @}
+interested == 1 @{ print @}
+$1 == "END"     @{ interested = ! interested; next @}
+@end example
+
+@noindent
+The variable @code{interested}, as with all @command{awk} variables, starts
+out initialized to zero, which is also false.  When a line is seen whose
+first field is @samp{START}, the value of @code{interested} is toggled
+to true, using @samp{!}. The next rule prints lines as long as
+@code{interested} is true.  When a line is seen whose first field is
+@samp{END}, @code{interested} is toggled back to false.
+
+@ignore
+Scott Deifik points out that this program isn't robust against
+bogus input data, but the point is to illustrate the use of `!',
+so we'll leave well enough alone.
+@end ignore
+
+@cindex @code{next} statement
+@strong{Note:} The @code{next} statement is discussed in
+@ref{Next Statement}.
+@code{next} tells @command{awk} to skip the rest of the rules, get the
+next record, and start processing the rules over again at the top.
+The reason it's there is to avoid printing the bracketing
+@samp{START} and @samp{END} lines.
+@c ENDOFRANGE exbo
+@c ENDOFRANGE boex
+
+@node Conditional Exp
+@section Conditional Expressions
+@cindex conditional expressions
+@cindex expressions, conditional
+@cindex expressions, selecting
+
+A @dfn{conditional expression} is a special kind of expression that has
+three operands.  It allows you to use one expression's value to select
+one of two other expressions.
+The conditional expression is the same as in the C language,
+as shown here:
+
+@example
+@var{selector} ? @var{if-true-exp} : @var{if-false-exp}
+@end example
+
+@noindent
+There are three subexpressions.  The first, @var{selector}, is always
+computed first.  If it is ``true'' (nonzero or non-null), then
+@var{if-true-exp} is computed next and its value becomes the value of
+the whole expression.  Otherwise, @var{if-false-exp} is computed next
+and its value becomes the value of the whole expression.
+For example, the following expression produces the absolute value of @code{x}:
+
+@example
+x >= 0 ? x : -x
+@end example
+
+@cindex side effects, conditional expressions
+Each time the conditional expression is computed, only one of
+@var{if-true-exp} and @var{if-false-exp} is used; the other is ignored.
+This is important when the expressions have side effects.  For example,
+this conditional expression examines element @code{i} of either array
+@code{a} or array @code{b}, and increments @code{i}:
+
+@example
+x == y ? a[i++] : b[i++]
+@end example
+
+@noindent
+This is guaranteed to increment @code{i} exactly once, because each time
+only one of the two increment expressions is executed
+and the other is not.
+@xref{Arrays},
+for more information about arrays.
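+
+A complete program built around that expression might look like the
+following sketch.  The array contents and the values of @code{x} and
+@code{y} are made up so that the second branch is taken:
+
+@example
+BEGIN @{
+    a[0] = "from a"; b[0] = "from b"
+    i = 0; x = 1; y = 2
+    print (x == y ? a[i++] : b[i++])
+    print i
+@}
+@end example
+
+@noindent
+It prints @samp{from b} and then @samp{1}, showing that @code{i} was
+incremented only once.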
+
+@cindex differences in @command{awk} and @command{gawk}, line continuations
+@cindex line continuations, @command{gawk}
+@cindex @command{gawk}, line continuation in
+As a minor @command{gawk} extension,
+a statement that uses @samp{?:} can be continued simply
+by putting a newline after either character.
+However, putting a newline in front
+of either character does not work without using backslash continuation
+(@pxref{Statements/Lines}).
+If @option{--posix} is specified
+(@pxref{Options}), then this extension is disabled.
+
+@node Function Calls
+@section Function Calls
+@cindex function calls
+
+A @dfn{function} is a name for a particular calculation.
+This enables you to
+ask for it by name at any point in the program.  For
+example, the function @code{sqrt} computes the square root of a number.
+
+@cindex functions, built-in
+A fixed set of functions are @dfn{built-in}, which means they are
+available in every @command{awk} program.  The @code{sqrt} function is one
+of these.  @xref{Built-in}, for a list of built-in
+functions and their descriptions.  In addition, you can define
+functions for use in your program.
+@xref{User-defined},
+for instructions on how to do this.
+
+@cindex arguments, in function calls
+The way to use a function is with a @dfn{function call} expression,
+which consists of the function name followed immediately by a list of
+@dfn{arguments} in parentheses.  The arguments are expressions that
+provide the raw materials for the function's calculations.
+When there is more than one argument, they are separated by commas.  If
+there are no arguments, just write @samp{()} after the function name.
+The following examples show function calls with and without arguments:
+
+@example
+sqrt(x^2 + y^2)        @i{one argument}
+atan2(y, x)            @i{two arguments}
+rand()                 @i{no arguments}
+@end example
+
+@cindex troubleshooting, function call syntax
+@strong{Caution:}
+Do not put any space between the function name and the open-parenthesis!
+A user-defined function name looks just like the name of a
+variable---a space would make the expression look like concatenation of
+a variable with an expression inside parentheses.
+
+With built-in functions, space before the parenthesis is harmless, but
+to avoid mistakes with user-defined functions, it is best not to get
+into the habit of using space.  Each function expects a particular number
+of arguments.  For example, the @code{sqrt} function must be called with
+a single argument, the number whose square root is to be taken:
+
+@example
+sqrt(@var{argument})
+@end example
+
+Some of the built-in functions have one or
+more optional arguments.
+If those arguments are not supplied, the functions
+use a reasonable default value.
+@xref{Built-in}, for full details.  If arguments
+are omitted in calls to user-defined functions, then those arguments are
+treated as local variables and initialized to the empty string
+(@pxref{User-defined}).
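+
+As a brief illustration of the last point (a sketch only; the function
+name @code{show} and its parameters are invented for this example),
+the following call omits the second argument, so @code{b} starts out
+as the empty string:
+
+@example
+$ awk 'function show(a, b) @{ print "a=" a, "b=[" b "]" @}
+>      BEGIN @{ show("x") @}'
+@print{} a=x b=[]
+@end example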
+
+@cindex side effects, function calls
+Like every other expression, the function call has a value, which is
+computed by the function based on the arguments you give it.  In this
+example, the value of @samp{sqrt(@var{argument})} is the square root of
+@var{argument}.  A function can also have side effects, such as assigning
+values to certain variables or doing I/O.
+The following program reads numbers, one number per line, and prints the
+square root of each one:
+
+@example
+$ awk '@{ print "The square root of", $1, "is", sqrt($1) @}'
+1
+@print{} The square root of 1 is 1
+3
+@print{} The square root of 3 is 1.73205
+5
+@print{} The square root of 5 is 2.23607
+@kbd{@value{CTL}-d}
+@end example
+
+@node Precedence
+@section Operator Precedence (How Operators Nest)
+@c STARTOFRANGE prec
+@cindex precedence
+@c STARTOFRANGE oppr
+@cindex operators, precedence
+
+@dfn{Operator precedence} determines how operators are grouped when
+different operators appear close by in one expression.  For example,
+@samp{*} has higher precedence than @samp{+}; thus, @samp{a + b * c}
+means to multiply @code{b} and @code{c}, and then add @code{a} to the
+product (i.e., @samp{a + (b * c)}).
+
+The normal precedence of the operators can be overruled by using parentheses.
+Think of the precedence rules as saying where the
+parentheses are assumed to be.  In
+fact, it is wise to always use parentheses whenever there is an unusual
+combination of operators, because other people who read the program may
+not remember what the precedence is in this case.
+Even experienced programmers occasionally forget the exact rules,
+which leads to mistakes.
+Explicit parentheses help prevent
+any such mistakes.
+
+When operators of equal precedence are used together, the leftmost
+operator groups first, except for the assignment, conditional, and
+exponentiation operators, which group in the opposite order.
+Thus, @samp{a - b + c} groups as @samp{(a - b) + c} and
+@samp{a = b = c} groups as @samp{a = (b = c)}.
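+
+A quick check of these grouping rules (an illustrative sketch; the
+variable names are arbitrary):
+
+@example
+$ awk 'BEGIN @{ print 2 - 3 + 4, 2 ^ 3 ^ 2; a = b = 5; print a, b @}'
+@print{} 3 512
+@print{} 5 5
+@end example
+
+@noindent
+Here @samp{2 - 3 + 4} groups as @samp{(2 - 3) + 4}, whereas
+@samp{2 ^ 3 ^ 2} groups as @samp{2 ^ (3 ^ 2)}.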
+
+The precedence of prefix unary operators does not matter as long as only
+unary operators are involved, because there is only one way to interpret
+them: innermost first.  Thus, @samp{$++i} means @samp{$(++i)} and
+@samp{++$x} means @samp{++($x)}.  However, when another operator follows
+the operand, then the precedence of the unary operators can matter.
+@samp{$x^2} means @samp{($x)^2}, but @samp{-x^2} means
+@samp{-(x^2)}, because @samp{-} has lower precedence than @samp{^},
+whereas @samp{$} has higher precedence.
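+
+The same distinction shows up in practice (a small sketch; the
+variable @code{x} is arbitrary and simply holds a copy of the first field):
+
+@example
+$ echo 3 | awk '@{ x = $1; print -x^2, $1^2 @}'
+@print{} -9 9
+@end example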
+
+This table presents @command{awk}'s operators, in order of highest
+to lowest precedence:
+
+@c use @code in the items, looks better in TeX w/o all the quotes
+@table @code
+@item (@dots{})
+Grouping.
+
+@cindex @code{$} (dollar sign), @code{$} field operator
+@cindex dollar sign (@code{$}), @code{$} field operator
+@item $
+Field.
+
+@cindex @code{+} (plus sign), @code{++} operator
+@cindex plus sign (@code{+}), @code{++} operator
+@cindex @code{-} (hyphen), @code{--} (decrement/increment) operator
+@cindex hyphen (@code{-}), @code{--} (decrement/increment) operators
+@item ++ --
+Increment, decrement.
+
+@cindex @code{^} (caret), @code{^} operator
+@cindex caret (@code{^}), @code{^} operator
+@cindex @code{*} (asterisk), @code{**} operator
+@cindex asterisk (@code{*}), @code{**} operator
+@item ^ **
+Exponentiation.  These operators group right-to-left.
+
+@cindex @code{+} (plus sign), @code{+} operator
+@cindex plus sign (@code{+}), @code{+} operator
+@cindex @code{-} (hyphen), @code{-} operator
+@cindex hyphen (@code{-}), @code{-} operator
+@cindex @code{!} (exclamation point), @code{!} operator
+@cindex exclamation point (@code{!}), @code{!} operator
+@item + - !
+Unary plus, minus, logical ``not.''
+
+@cindex @code{*} (asterisk), @code{*} operator, as multiplication operator
+@cindex asterisk (@code{*}), @code{*} operator, as multiplication operator
+@cindex @code{/} (forward slash), @code{/} operator
+@cindex forward slash (@code{/}), @code{/} operator
+@cindex @code{%} (percent sign), @code{%} operator
+@cindex percent sign (@code{%}), @code{%} operator
+@item * / %
+Multiplication, division, modulus.
+
+@cindex @code{+} (plus sign), @code{+} operator
+@cindex plus sign (@code{+}), @code{+} operator
+@cindex @code{-} (hyphen), @code{-} operator
+@cindex hyphen (@code{-}), @code{-} operator
+@item + -
+Addition, subtraction.
+
+@item @r{String Concatenation}
+No special symbol is used to indicate concatenation.
+The operands are simply written side by side
+(@pxref{Concatenation}).
+
+@cindex @code{<} (left angle bracket), @code{<} operator
+@cindex left angle bracket (@code{<}), @code{<} operator
+@cindex @code{<} (left angle bracket), @code{<=} operator
+@cindex left angle bracket (@code{<}), @code{<=} operator
+@cindex @code{>} (right angle bracket), @code{>=} operator
+@cindex right angle bracket (@code{>}), @code{>=} operator
+@cindex @code{>} (right angle bracket), @code{>} operator
+@cindex right angle bracket (@code{>}), @code{>} operator
+@cindex @code{=} (equals sign), @code{==} operator
+@cindex equals sign (@code{=}), @code{==} operator
+@cindex @code{!} (exclamation point), @code{!=} operator
+@cindex exclamation point (@code{!}), @code{!=} operator
+@cindex @code{>} (right angle bracket), @code{>>} operator (I/O)
+@cindex right angle bracket (@code{>}), @code{>>} operator (I/O)
+@cindex operators, input/output
+@cindex @code{|} (vertical bar), @code{|} operator (I/O)
+@cindex vertical bar (@code{|}), @code{|} operator (I/O)
+@cindex operators, input/output
+@cindex @code{|} (vertical bar), @code{|&} operator (I/O)
+@cindex vertical bar (@code{|}), @code{|&} operator (I/O)
+@cindex operators, input/output
+@item < <= == !=
+@itemx > >= >> | |&
+Relational and redirection.
+The relational operators and the redirections have the same precedence
+level.  Characters such as @samp{>} serve both as relationals and as
+redirections; the context distinguishes between the two meanings.
+
+@cindex @code{print} statement, I/O operators in
+@cindex @code{printf} statement, I/O operators in
+Note that the I/O redirection operators in @code{print} and @code{printf}
+statements belong to the statement level, not to expressions.  The
+redirection does not produce an expression that could be the operand of
+another operator.  As a result, it does not make sense to use a
+redirection operator near another operator of lower precedence without
+parentheses.  Such combinations (for example, @samp{print foo > a ? b : c})
+result in syntax errors.
+The correct way to write this statement is @samp{print foo > (a ? b : c)}.
+
+@cindex @code{~} (tilde), @code{~} operator
+@cindex tilde (@code{~}), @code{~} operator
+@cindex @code{!} (exclamation point), @code{!~} operator
+@cindex exclamation point (@code{!}), @code{!~} operator
+@item ~ !~
+Matching, nonmatching.
+
+@cindex @code{in} operator
+@item in
+Array membership.
+
+@cindex @code{&} (ampersand), @code{&&} operator
+@cindex ampersand (@code{&}), @code{&&} operator
+@item &&
+Logical ``and''.
+
+@cindex @code{|} (vertical bar), @code{||} operator
+@cindex vertical bar (@code{|}), @code{||} operator
+@item ||
+Logical ``or''.
+
+@cindex @code{?} (question mark), @code{?:} operator
+@cindex question mark (@code{?}), @code{?:} operator
+@item ?:
+Conditional.  This operator groups right-to-left.
+
+@cindex @code{+} (plus sign), @code{+=} operator
+@cindex plus sign (@code{+}), @code{+=} operator
+@cindex @code{-} (hyphen), @code{-=} operator
+@cindex hyphen (@code{-}), @code{-=} operator
+@cindex @code{*} (asterisk), @code{*=} operator
+@cindex asterisk (@code{*}), @code{*=} operator
+@cindex @code{*} (asterisk), @code{**=} operator
+@cindex asterisk (@code{*}), @code{**=} operator
+@cindex @code{/} (forward slash), @code{/=} operator
+@cindex forward slash (@code{/}), @code{/=} operator
+@cindex @code{%} (percent sign), @code{%=} operator
+@cindex percent sign (@code{%}), @code{%=} operator
+@cindex @code{^} (caret), @code{^=} operator
+@cindex caret (@code{^}), @code{^=} operator
+@item = += -= *=
+@itemx /= %= ^= **=
+Assignment.  These operators group right-to-left.
+@end table
+
+@cindex portability, operators, not in POSIX @command{awk}
+@strong{Note:}
+The @samp{|&}, @samp{**}, and @samp{**=} operators are not specified by POSIX.
+For maximum portability, do not use them.
+@c ENDOFRANGE prec
+@c ENDOFRANGE oppr
+@c ENDOFRANGE exps
+
+@node Patterns and Actions
+@chapter Patterns, Actions, and Variables
+@c STARTOFRANGE pat
+@cindex patterns
+
+As you have already seen, each @command{awk} rule consists of
+a pattern with an associated action.  This @value{CHAPTER} describes how
+you build patterns and actions, what kinds of things you can do within
+actions, and @command{awk}'s built-in variables.
+
+The pattern-action rules and the statements available for use
+within actions form the core of @command{awk} programming.
+In a sense, everything covered
+up to here has been the foundation
+that programs are built on top of.  Now it's time to start
+building something useful.
+
+@menu
+* Pattern Overview::            What goes into a pattern.
+* Using Shell Variables::       How to use shell variables with @command{awk}.
+* Action Overview::             What goes into an action.
+* Statements::                  Describes the various control statements in
+                                detail.
+* Built-in Variables::          Summarizes the built-in variables.
+@end menu
+
+@node Pattern Overview
+@section Pattern Elements
+
+@menu
+* Regexp Patterns::             Using regexps as patterns.
+* Expression Patterns::         Any expression can be used as a pattern.
+* Ranges::                      Pairs of patterns specify record ranges.
+* BEGIN/END::                   Specifying initialization and cleanup rules.
+* Empty::                       The empty pattern, which matches every record.
+@end menu
+
+@cindex patterns, types of
+Patterns in @command{awk} control the execution of rules---a rule is
+executed when its pattern matches the current input record.
+The following is a summary of the types of @command{awk} patterns:
+
+@table @code
+@item /@var{regular expression}/
+A regular expression. It matches when the text of the
+input record fits the regular expression.
+(@xref{Regexp}.)
+
+@item @var{expression}
+A single expression.  It matches when its value
+is nonzero (if a number) or non-null (if a string).
+(@xref{Expression Patterns}.)
+
+@item @var{pat1}, @var{pat2}
+A pair of patterns separated by a comma, specifying a range of records.
+The range includes both the initial record that matches @var{pat1} and
+the final record that matches @var{pat2}.
+(@xref{Ranges}.)
+
+@item BEGIN
+@itemx END
+Special patterns for you to supply startup or cleanup actions for your
+@command{awk} program.
+(@xref{BEGIN/END}.)
+
+@item @var{empty}
+The empty pattern matches every input record.
+(@xref{Empty}.)
+@end table
+
+@node Regexp Patterns
+@subsection Regular Expressions as Patterns
+@cindex patterns, expressions as
+@cindex regular expressions, as patterns
+
+Regular expressions are one of the first kinds of patterns presented
+in this book.
+This kind of pattern is simply a regexp constant in the pattern part of
+a rule.  Its meaning is @samp{$0 ~ /@var{pattern}/}.
+The pattern matches when the input record matches the regexp.
+For example:
+
+@example
+/foo|bar|baz/  @{ buzzwords++ @}
+END            @{ print buzzwords, "buzzwords seen" @}
+@end example
+
+@node Expression Patterns
+@subsection Expressions as Patterns
+@cindex expressions, as patterns
+
+Any @command{awk} expression is valid as an @command{awk} pattern.
+The pattern matches if the expression's value is nonzero (if a
+number) or non-null (if a string).
+The expression is reevaluated each time the rule is tested against a new
+input record.  If the expression uses fields such as @code{$1}, the
+value depends directly on the new input record's text; otherwise, it
+depends on only what has happened so far in the execution of the
+@command{awk} program.
+
+@cindex comparison expressions, as patterns
+@cindex patterns, comparison expressions as
+Comparison expressions, using the comparison operators described in
+@ref{Typing and Comparison},
+are a very common kind of pattern.
+Regexp matching and nonmatching are also very common expressions.
+The left operand of the @samp{~} and @samp{!~} operators is a string.
+The right operand is either a constant regular expression enclosed in
+slashes (@code{/@var{regexp}/}), or any expression whose string value
+is used as a dynamic regular expression
+(@pxref{Computed Regexps}).
+The following example prints the second field of each input record
+whose first field is precisely @samp{foo}:
+
+@cindex @code{/} (forward slash), patterns and
+@cindex forward slash (@code{/}), patterns and
+@cindex @code{~} (tilde), @code{~} operator
+@cindex tilde (@code{~}), @code{~} operator
+@cindex @code{!} (exclamation point), @code{!~} operator
+@cindex exclamation point (@code{!}), @code{!~} operator
+@example
+$ awk '$1 == "foo" @{ print $2 @}' BBS-list
+@end example
+
+@noindent
+(There is no output, because there is no BBS site with the exact name @samp{foo}.)
+Contrast this with the following regular expression match, which
+accepts any record with a first field that contains @samp{foo}:
+
+@example
+$ awk '$1 ~ /foo/ @{ print $2 @}' BBS-list
+@print{} 555-1234
+@print{} 555-6699
+@print{} 555-6480
+@print{} 555-2127
+@end example
+
+@cindex regexp constants, as patterns
+@cindex patterns, regexp constants as
+A regexp constant as a pattern is also a special case of an expression
+pattern.  The expression @code{/foo/} has the value one if @samp{foo}
+appears in the current input record, and zero otherwise.  Thus, as a pattern, @code{/foo/}
+matches any record containing @samp{foo}.
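+
+To see this value directly (a minimal sketch; the variable names
+@code{hit} and @code{miss} are made up for the illustration):
+
+@example
+$ echo "a foo b" | awk '@{ hit = /foo/; miss = /bar/; print hit, miss @}'
+@print{} 1 0
+@end example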
+
+@cindex Boolean expressions, as patterns
+Boolean expressions are also commonly used as patterns.
+Whether the pattern
+matches an input record depends on whether its subexpressions match.
+For example, the following command prints all the records in
+@file{BBS-list} that contain both @samp{2400} and @samp{foo}:
+
+@example
+$ awk '/2400/ && /foo/' BBS-list
+@print{} fooey        555-1234     2400/1200/300     B
+@end example
+
+The following command prints all records in
+@file{BBS-list} that contain @emph{either} @samp{2400} or @samp{foo}
+(or both, of course):
+
+@example
+$ awk '/2400/ || /foo/' BBS-list
+@print{} alpo-net     555-3412     2400/1200/300     A
+@print{} bites        555-1675     2400/1200/300     A
+@print{} fooey        555-1234     2400/1200/300     B
+@print{} foot         555-6699     1200/300          B
+@print{} macfoo       555-6480     1200/300          A
+@print{} sdace        555-3430     2400/1200/300     A
+@print{} sabafoo      555-2127     1200/300          C
+@end example
+
+The following command prints all records in
+@file{BBS-list} that do @emph{not} contain the string @samp{foo}:
+
+@example
+$ awk '! /foo/' BBS-list
+@print{} aardvark     555-5553     1200/300          B
+@print{} alpo-net     555-3412     2400/1200/300     A
+@print{} barfly       555-7685     1200/300          A
+@print{} bites        555-1675     2400/1200/300     A
+@print{} camelot      555-0542     300               C
+@print{} core         555-2912     1200/300          C
+@print{} sdace        555-3430     2400/1200/300     A
+@end example
+
+@cindex @code{BEGIN} pattern, Boolean patterns and
+@cindex @code{END} pattern, Boolean patterns and
+The subexpressions of a Boolean operator in a pattern can be constant regular
+expressions, comparisons, or any other @command{awk} expressions.  Range
+patterns are not expressions, so they cannot appear inside Boolean
+patterns.  Likewise, the special patterns @code{BEGIN} and @code{END},
+which never match any input record, are not expressions and cannot
+appear inside Boolean patterns.
+
+@node Ranges
+@subsection Specifying Record Ranges with Patterns
+
+@cindex range patterns
+@cindex patterns, ranges in
+@cindex lines, matching ranges of
+@cindex @code{,} (comma), in range patterns
+@cindex comma (@code{,}), in range patterns
+A @dfn{range pattern} is made of two patterns separated by a comma, in
+the form @samp{@var{begpat}, @var{endpat}}.  It is used to match ranges of
+consecutive input records.  The first pattern, @var{begpat}, controls
+where the range begins, while @var{endpat} controls where
+the range ends.  For example, the following:
+
+@example
+awk '$1 == "on", $1 == "off"' myfile
+@end example
+
+@noindent
+prints every record in @file{myfile} between @samp{on}/@samp{off} pairs, inclusive.
+
+A range pattern starts out by matching @var{begpat} against every
+input record.  When a record matches @var{begpat}, the range pattern is
+@dfn{turned on} and the range pattern matches this record as well.  As long as
+the range pattern stays turned on, it automatically matches every input
+record read.  The range pattern also matches @var{endpat} against every
+input record; when this succeeds, the range pattern is turned off again
+for the following record.  Then the range pattern goes back to checking
+@var{begpat} against each record.
+
+@c last comma does NOT start a tertiary
+@cindex @code{if} statement, actions, changing
+The record that turns on the range pattern and the one that turns it
+off both match the range pattern.  If you don't want to operate on
+these records, you can write @code{if} statements in the rule's action
+to distinguish them from the records you are interested in.
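+
+For instance, assuming input in which the markers @samp{on} and
+@samp{off} each appear alone on a line (the sample data here is
+invented for illustration), such a test might be sketched as follows:
+
+@example
+$ printf 'a\non\nb\noff\nc\n' |
+> awk '$1 == "on", $1 == "off" @{ if ($1 != "on" && $1 != "off") print @}'
+@print{} b
+@end example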
+
+It is possible for a pattern to be turned on and off by the same
+record. If the record satisfies both conditions, then the action is
+executed for just that record.
+For example, suppose there is text between two identical markers (e.g.,
+the @samp{%} symbol), each on its own line, that should be ignored.
+A first attempt would be to
+combine a range pattern that describes the delimited text with the
+@code{next} statement
+(not discussed yet, @pxref{Next Statement}).
+This causes @command{awk} to skip any further processing of the current
+record and start over again with the next input record. Such a program
+looks like this:
+
+@example
+/^%$/,/^%$/    @{ next @}
+               @{ print @}
+@end example
+
+@noindent
+@cindex lines, skipping between markers
+@c @cindex flag variables
+This program fails because the range pattern is both turned on and turned off
+by the first line, which just has a @samp{%} on it.  To accomplish this task,
+write the program in the following manner, using a flag:
+
+@cindex @code{!} operator
+@example
+/^%$/     @{ skip = ! skip; next @}
+skip == 1 @{ next @} # skip lines with `skip' set
+          @{ print @} # print all other lines
+@end example
+
+In a range pattern, the comma (@samp{,}) has the lowest precedence of
+all the operators (i.e., it is evaluated last).  Thus, the following
+program attempts to combine a range pattern with another, simpler test:
+
+@example
+echo Yes | awk '/1/,/2/ || /Yes/'
+@end example
+
+The intent of this program is @samp{(/1/,/2/) || /Yes/}.
+However, @command{awk} interprets this as @samp{/1/, (/2/ || /Yes/)}.
+This cannot be changed or worked around; range patterns do not combine
+with other patterns:
+
+@example
+$ echo Yes | gawk '(/1/,/2/) || /Yes/'
+@error{} gawk: cmd. line:1: (/1/,/2/) || /Yes/
+@error{} gawk: cmd. line:1:           ^ parse error
+@error{} gawk: cmd. line:2: (/1/,/2/) || /Yes/
+@error{} gawk: cmd. line:2:                   ^ unexpected newline
+@end example
+
+@node BEGIN/END
+@subsection The @code{BEGIN} and @code{END} Special Patterns
+
+@c STARTOFRANGE beg
+@cindex @code{BEGIN} pattern
+@c STARTOFRANGE end
+@cindex @code{END} pattern
+All the patterns described so far are for matching input records.
+The @code{BEGIN} and @code{END} special patterns are different.
+They supply startup and cleanup actions for @command{awk} programs.
+@code{BEGIN} and @code{END} rules must have actions; there is no default
+action for these rules because there is no current record when they run.
+@code{BEGIN} and @code{END} rules are often referred to as
+``@code{BEGIN} and @code{END} blocks'' by long-time @command{awk}
+programmers.
+
+@menu
+* Using BEGIN/END::             How and why to use BEGIN/END rules.
+* I/O And BEGIN/END::           I/O issues in BEGIN/END rules.
+@end menu
+
+@node Using BEGIN/END
+@subsubsection Startup and Cleanup Actions
+
+A @code{BEGIN} rule is executed once only, before the first input record
+is read. Likewise, an @code{END} rule is executed once only, after all the
+input is read.  For example:
+
+@example
+$ awk '
+> BEGIN @{ print "Analysis of \"foo\"" @}
+> /foo/ @{ ++n @}
+> END   @{ print "\"foo\" appears", n, "times." @}' BBS-list
+@print{} Analysis of "foo"
+@print{} "foo" appears 4 times.
+@end example
+
+@cindex @code{BEGIN} pattern, operators and
+@cindex @code{END} pattern, operators and
+This program finds the number of records in the input file @file{BBS-list}
+that contain the string @samp{foo}.  The @code{BEGIN} rule prints a title
+for the report.  There is no need to use the @code{BEGIN} rule to
+initialize the counter @code{n} to zero, since @command{awk} does this
+automatically (@pxref{Variables}).
+The second rule increments the variable @code{n} every time a
+record containing the pattern @samp{foo} is read.  The @code{END} rule
+prints the value of @code{n} at the end of the run.
+
+The special patterns @code{BEGIN} and @code{END} cannot be used in ranges
+or with Boolean operators (indeed, they cannot be used with any operators).
+An @command{awk} program may have multiple @code{BEGIN} and/or @code{END}
+rules.  They are executed in the order in which they appear: all the @code{BEGIN}
+rules at startup and all the @code{END} rules at termination.
+@code{BEGIN} and @code{END} rules may be intermixed with other rules.
+This feature was added in the 1987 version of @command{awk} and is included
+in the POSIX standard.
+The original (1978) version of @command{awk}
+required the @code{BEGIN} rule to be placed at the beginning of the
+program, the @code{END} rule to be placed at the end, and only allowed one of
+each.
+This is no longer required, but it is a good idea to follow this template
+in terms of program organization and readability.
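+
+The execution order can be observed with a small experiment (a sketch;
+@file{/dev/null} serves as an empty input file):
+
+@example
+$ awk 'BEGIN @{ print "first" @}
+>      END   @{ print "third" @}
+>      BEGIN @{ print "second" @}
+>      END   @{ print "fourth" @}' /dev/null
+@print{} first
+@print{} second
+@print{} third
+@print{} fourth
+@end example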
+
+Multiple @code{BEGIN} and @code{END} rules are useful for writing
+library functions, because each library file can have its own @code{BEGIN} and/or
+@code{END} rule to do its own initialization and/or cleanup.
+The order in which library functions are named on the command line
+controls the order in which their @code{BEGIN} and @code{END} rules are
+executed.  Therefore, you have to be careful when writing such rules in
+library files so that the order in which they are executed doesn't matter.
+@xref{Options}, for more information on
+using library functions.
+@xref{Library Functions},
+for a number of useful library functions.
+
+If an @command{awk} program has only a @code{BEGIN} rule and no
+other rules, then the program exits after the @code{BEGIN} rule is
+run.@footnote{The original version of @command{awk} used to keep
+reading and ignoring input until the end of the file was seen.}  However, if an
+@code{END} rule exists, then the input is read, even if there are
+no other rules in the program.  This is necessary in case the @code{END}
+rule checks the @code{FNR} and @code{NR} variables.
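+
+The difference can be seen by printing @code{NR} from each kind of
+rule (an illustrative sketch; the two-line input is invented).  With
+only an @code{END} rule the input is consumed, whereas with only a
+@code{BEGIN} rule no record is ever read:
+
+@example
+$ printf 'a\nb\n' | awk 'END @{ print NR @}'
+@print{} 2
+$ printf 'a\nb\n' | awk 'BEGIN @{ print NR @}'
+@print{} 0
+@end example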
+
+@node I/O And BEGIN/END
+@subsubsection Input/Output from @code{BEGIN} and @code{END} Rules
+
+@cindex input/output, from @code{BEGIN} and @code{END}
+There are several (sometimes subtle) points to remember when doing I/O
+from a @code{BEGIN} or @code{END} rule.
+The first has to do with the value of @code{$0} in a @code{BEGIN}
+rule.  Because @code{BEGIN} rules are executed before any input is read,
+there simply is no input record, and therefore no fields, when
+executing @code{BEGIN} rules.  References to @code{$0} and the fields
+yield a null string or zero, depending upon the context.  One way
+to give @code{$0} a real value is to execute a @code{getline} command
+without a variable (@pxref{Getline}).
+Another way is simply to assign a value to @code{$0}.
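+
+A short sketch of the second approach (the sample string is arbitrary).
+@code{NF} is zero until @code{$0} receives a value, after which the
+record is split into fields as usual:
+
+@example
+$ awk 'BEGIN @{ print NF; $0 = "a b c"; print NF, $2 @}'
+@print{} 0
+@print{} 3 b
+@end example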
+
+@cindex differences in @command{awk} and @command{gawk}, @code{BEGIN}/@code{END} patterns
+@cindex POSIX @command{awk}, @code{BEGIN}/@code{END} patterns
+@cindex @code{print} statement, @code{BEGIN}/@code{END} patterns and
+@cindex @code{BEGIN} pattern, @code{print} statement and
+@cindex @code{END} pattern, @code{print} statement and
+The second point is similar to the first but from the other direction.
+Traditionally, due largely to implementation issues, @code{$0} and
+@code{NF} were @emph{undefined} inside an @code{END} rule.
+The POSIX standard specifies that @code{NF} is available in an @code{END}
+rule. It contains the number of fields from the last input record.
+Most probably due to an oversight, the standard does not say that @code{$0}
+is also preserved, although logically one would think that it should be.
+In fact, @command{gawk} does preserve the value of @code{$0} for use in
+@code{END} rules.  Be aware, however, that Unix @command{awk}, and possibly
+other implementations, do not.
+
+The third point follows from the first two.  The meaning of @samp{print}
+inside a @code{BEGIN} or @code{END} rule is the same as always:
+@samp{print $0}.  If @code{$0} is the null string, then this prints an
+empty line.  Many long-time @command{awk} programmers use an unadorned
+@samp{print} in @code{BEGIN} and @code{END} rules, to mean @samp{@w{print ""}},
+relying on @code{$0} being null.  Although one might generally get away with
+this in @code{BEGIN} rules, it is a very bad idea in @code{END} rules,
+at least in @command{gawk}.  It is also poor style, since if an empty
+line is needed in the output, the program should print one explicitly.
+
+@cindex @code{next} statement, @code{BEGIN}/@code{END} patterns and
+@cindex @code{nextfile} statement, @code{BEGIN}/@code{END} patterns and
+@cindex @code{BEGIN} pattern, @code{next}/@code{nextfile} statements and
+@cindex @code{END} pattern, @code{next}/@code{nextfile} statements and
+Finally, the @code{next} and @code{nextfile} statements are not allowed
+in a @code{BEGIN} rule, because the implicit
+read-a-record-and-match-against-the-rules loop has not started yet.  Similarly, those statements
+are not valid in an @code{END} rule, since all the input has been read.
+(@xref{Next Statement}, and see
+@ref{Nextfile Statement}.)
+@c ENDOFRANGE beg
+@c ENDOFRANGE end
+
+@node Empty
+@subsection The Empty Pattern
+
+@cindex empty pattern
+@cindex patterns, empty
+An empty (i.e., nonexistent) pattern is considered to match @emph{every}
+input record.  For example, the program:
+
+@example
+awk '@{ print $1 @}' BBS-list
+@end example
+
+@noindent
+prints the first field of every record.
+@c ENDOFRANGE pat
+
+@node Using Shell Variables
+@section Using Shell Variables in Programs
+@cindex shells, variables
+@cindex @command{awk} programs, shell variables in
+@c @cindex shell and @command{awk} interaction
+
+@command{awk} programs are often used as components in larger
+programs written in shell.
+For example, it is very common to use a shell variable to
+hold a pattern that the @command{awk} program searches for.
+There are two ways to get the value of the shell variable
+into the body of the @command{awk} program.
+
+@cindex shells, quoting
+The most common method is to use shell quoting to substitute
+the variable's value into the program inside the script.
+For example, in the following program:
+
+@example
+echo -n "Enter search pattern: "
+read pattern
+awk "/$pattern/ "'@{ nmatches++ @}
+     END @{ print nmatches, "found" @}' /path/to/data
+@end example
+
+@noindent
+the @command{awk} program consists of two pieces of quoted text
+that are concatenated together to form the program.
+The first part is double-quoted, which allows substitution of
+the @code{pattern} variable inside the quotes.
+The second part is single-quoted.
+
+Variable substitution via quoting works, but it can be
+messy.  It requires a good understanding of the shell's quoting rules
+(@pxref{Quoting}),
+and it's often difficult to correctly
+match up the quotes when reading the program.
+
+A better method is to use @command{awk}'s variable assignment feature
+(@pxref{Assignment Options})
+to assign the shell variable's value to an @command{awk} variable.
+Then use dynamic regexps to match the pattern
+(@pxref{Computed Regexps}).
+The following shows how to redo the
+previous example using this technique:
+
+@example
+echo -n "Enter search pattern: "
+read pattern
+awk -v pat="$pattern" '$0 ~ pat @{ nmatches++ @}
+       END @{ print nmatches, "found" @}' /path/to/data
+@end example
+
+@noindent
+Now, the @command{awk} program is just one single-quoted string.
+The assignment @samp{-v pat="$pattern"} still requires double quotes,
+in case there is whitespace in the value of @code{$pattern}.
+The @command{awk} variable @code{pat} could be named @code{pattern}
+too, but that would be more confusing.  Using a variable also
+provides more flexibility, since the variable can be used anywhere inside
+the program---for printing, as an array subscript, or for any other
+use---without requiring the quoting tricks at every point in the program.
+
+@node Action Overview
+@section Actions
+@c @cindex action, definition of
+@c @cindex curly braces
+@c @cindex action, curly braces
+@c @cindex action, separating statements
+@cindex actions
+
+An @command{awk} program or script consists of a series of
+rules and function definitions interspersed.  (Functions are
+described later.  @xref{User-defined}.)
+A rule contains a pattern and an action, either of which (but not
+both) may be omitted.  The purpose of the @dfn{action} is to tell
+@command{awk} what to do once a match for the pattern is found.  Thus,
+in outline, an @command{awk} program generally looks like this:
+
+@example
+@r{[}@var{pattern}@r{]} @r{[}@{ @var{action} @}@r{]}
+@r{[}@var{pattern}@r{]} @r{[}@{ @var{action} @}@r{]}
+@dots{}
+function @var{name}(@var{args}) @{ @dots{} @}
+@dots{}
+@end example
+
+@cindex @code{@{@}} (braces), actions and
+@cindex braces (@code{@{@}}), actions and
+@cindex separators, for statements in actions
+@cindex newlines, separating statements in actions
+@cindex @code{;} (semicolon), separating statements in actions
+@cindex semicolon (@code{;}), separating statements in actions
+An action consists of one or more @command{awk} @dfn{statements}, enclosed
+in curly braces (@samp{@{@dots{}@}}).  Each statement specifies one
+thing to do.  The statements are separated by newlines or semicolons.
+The curly braces around an action must be used even if the action
+contains only one statement, or if it contains no statements at
+all.  However, if you omit the action entirely, omit the curly braces as
+well.  An omitted action is equivalent to @samp{@{ print $0 @}}:
+
+@example
+/foo/  @{ @}     @i{match @code{foo}, do nothing --- empty action}
+/foo/          @i{match @code{foo}, print the record --- omitted action}
+@end example
+
+The following types of statements are supported in @command{awk}:
+
+@table @asis
+@cindex side effects, statements
+@item Expressions
+Call functions or assign values to variables
+(@pxref{Expressions}).  Executing
+this kind of statement simply computes the value of the expression.
+This is useful when the expression has side effects
+(@pxref{Assignment Ops}).
+
+@item Control statements
+Specify the control flow of @command{awk}
+programs.  The @command{awk} language gives you C-like constructs
+(@code{if}, @code{for}, @code{while}, and @code{do}) as well as a few
+special ones (@pxref{Statements}).
+
+@item Compound statements
+Consist of one or more statements enclosed in
+curly braces.  A compound statement is used in order to put several
+statements together in the body of an @code{if}, @code{while}, @code{do},
+or @code{for} statement.
+
+@item Input statements
+Use the @code{getline} command
+(@pxref{Getline}).
+Also supplied in @command{awk} are the @code{next}
+statement (@pxref{Next Statement}),
+and the @code{nextfile} statement
+(@pxref{Nextfile Statement}).
+
+@item Output statements
+Such as @code{print} and @code{printf}.
+@xref{Printing}.
+
+@item Deletion statements
+For deleting array elements.
+@xref{Delete}.
+@end table
+
+@node Statements
+@section Control Statements in Actions
+@c STARTOFRANGE csta
+@cindex control statements
+@c STARTOFRANGE acs
+@cindex statements, control, in actions
+@c STARTOFRANGE accs
+@cindex actions, control statements in
+
+@dfn{Control statements}, such as @code{if}, @code{while}, and so on,
+control the flow of execution in @command{awk} programs.  Most of the
+control statements in @command{awk} are patterned on similar statements in C.
+
+@c the comma here does NOT start a secondary
+@cindex compound statements, control statements and
+@c the second comma here does NOT start a tertiary
+@cindex statements, compound, control statements and
+@cindex body, in actions
+@cindex @code{@{@}} (braces), statements, grouping
+@cindex braces (@code{@{@}}), statements, grouping
+@cindex newlines, separating statements in actions
+@cindex @code{;} (semicolon), separating statements in actions
+@cindex semicolon (@code{;}), separating statements in actions
+All the control statements start with special keywords, such as @code{if}
+and @code{while}, to distinguish them from simple expressions.
+Many control statements contain other statements.  For example, the
+@code{if} statement contains another statement that may or may not be
+executed.  The contained statement is called the @dfn{body}.
+To include more than one statement in the body, group them into a
+single @dfn{compound statement} with curly braces, separating them with
+newlines or semicolons.
+
+@menu
+* If Statement::                Conditionally execute some @command{awk}
+                                statements.
+* While Statement::             Loop until some condition is satisfied.
+* Do Statement::                Do specified action while looping until some
+                                condition is satisfied.
+* For Statement::               Another looping statement, that provides
+                                initialization and increment clauses.
+* Switch Statement::            Switch/case evaluation for conditional
+                                execution of statements based on a value.
+* Break Statement::             Immediately exit the innermost enclosing loop.
+* Continue Statement::          Skip to the end of the innermost enclosing
+                                loop.
+* Next Statement::              Stop processing the current input record.
+* Nextfile Statement::          Stop processing the current file.
+* Exit Statement::              Stop execution of @command{awk}.
+@end menu
+
+@node If Statement
+@subsection The @code{if}-@code{else} Statement
+
+@cindex @code{if} statement
+The @code{if}-@code{else} statement is @command{awk}'s decision-making
+statement.  It looks like this:
+
+@example
+if (@var{condition}) @var{then-body} @r{[}else @var{else-body}@r{]}
+@end example
+
+@noindent
+The @var{condition} is an expression that controls what the rest of the
+statement does.  If the @var{condition} is true, @var{then-body} is
+executed; otherwise, @var{else-body} is executed.
+The @code{else} part of the statement is
+optional.  The condition is considered false if its value is zero or
+the null string; otherwise, the condition is true.
+Refer to the following:
+
+@example
+if (x % 2 == 0)
+    print "x is even"
+else
+    print "x is odd"
+@end example
+
+In this example, if the expression @samp{x % 2 == 0} is true (that is,
+if the value of @code{x} is evenly divisible by two), then the first
+@code{print} statement is executed; otherwise, the second @code{print}
+statement is executed.
+If the @code{else} keyword appears on the same line as @var{then-body} and
+@var{then-body} is not a compound statement (i.e., not surrounded by
+curly braces), then a semicolon must separate @var{then-body} from
+the @code{else}.
+To illustrate this, the previous example can be rewritten as:
+
+@example
+if (x % 2 == 0) print "x is even"; else
+        print "x is odd"
+@end example
+
+@noindent
+If the @samp{;} is left out, @command{awk} can't interpret the statement and
+it produces a syntax error.  Don't actually write programs this way,
+because a human reader might fail to see the @code{else} if it is not
+the first thing on its line.
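+
+When a branch needs more than one statement, enclose the body in curly
+braces; the semicolon rule above then no longer applies.  The following
+sketch is an illustration only (the counter @code{big} and the thresholds
+are assumptions, not part of any earlier example).  It chains a second
+@code{if} after the @code{else} to classify the first field of each record:
+
+@example
+@{
+    if ($1 > 100) @{
+        big++
+        print $1, "is big"
+    @} else if ($1 > 10)
+        print $1, "is medium"
+    else
+        print $1, "is small"
+@}
+@end example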
+
+@node While Statement
+@subsection The @code{while} Statement
+@cindex @code{while} statement
+@cindex loops
+@cindex loops, See Also @code{while} statement
+
+In programming, a @dfn{loop} is a part of a program that can
+be executed two or more times in succession.
+The @code{while} statement is the simplest looping statement in
+@command{awk}.  It repeatedly executes a statement as long as a condition is
+true.  It looks like this:
+
+@example
+while (@var{condition})
+  @var{body}
+@end example
+
+@cindex body, in loops
+@noindent
+@var{body} is a statement called the @dfn{body} of the loop,
+and @var{condition} is an expression that controls how long the loop
+keeps running.
+The first thing the @code{while} statement does is test the @var{condition}.
+If the @var{condition} is true, it executes the statement @var{body}.
+@ifinfo
+(The @var{condition} is true when the value
+is not zero and not a null string.)
+@end ifinfo
+After @var{body} has been executed,
+@var{condition} is tested again, and if it is still true, @var{body} is
+executed again.  This process repeats until the @var{condition} is no longer
+true.  If the @var{condition} is initially false, the body of the loop is
+never executed and @command{awk} continues with the statement following
+the loop.
+This example prints the first three fields of each record, one per line:
+
+@example
+awk '@{ i = 1
+       while (i <= 3) @{
+           print $i
+           i++
+       @}
+@}' inventory-shipped
+@end example
+
+@noindent
+The body of this loop is a compound statement enclosed in braces,
+containing two statements.
+The loop works in the following manner: first, the value of @code{i} is set to one.
+Then, the @code{while} statement tests whether @code{i} is less than or equal to
+three.  This is true when @code{i} equals one, so the @code{i}-th
+field is printed.  Then the @samp{i++} increments the value of @code{i}
+and the loop repeats.  The loop terminates when @code{i} reaches four.
+
+A newline is not required between the condition and the
+body; however, using one makes the program clearer unless the body is a
+compound statement or else is very simple.  The newline after the open-brace
+that begins the compound statement is not required either, but the
+program is harder to read without it.
+
+@node Do Statement
+@subsection The @code{do}-@code{while} Statement
+@cindex @code{do}-@code{while} statement
+
+The @code{do} loop is a variation of the @code{while} looping statement.
+The @code{do} loop executes the @var{body} once and then repeats the
+@var{body} as long as the @var{condition} is true.  It looks like this:
+
+@example
+do
+  @var{body}
+while (@var{condition})
+@end example
+
+Even if the @var{condition} is false at the start, the @var{body} is
+executed at least once (and only once, unless executing @var{body}
+makes @var{condition} true).  Contrast this with the corresponding
+@code{while} statement:
+
+@example
+while (@var{condition})
+  @var{body}
+@end example
+
+@noindent
+This statement does not execute @var{body} even once if the @var{condition}
+is false to begin with.
+The following is an example of a @code{do} statement:
+
+@example
+@{      i = 1
+       do @{
+          print $0
+          i++
+       @} while (i <= 10)
+@}
+@end example
+
+@noindent
+This program prints each input record 10 times.  However, it isn't a very
+realistic example, since in this case an ordinary @code{while} would do
+just as well.  This situation reflects actual experience; only
+occasionally is there a real use for a @code{do} statement.
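+
+The at-least-once behavior is easiest to see with a condition that is
+false from the start.  In this small sketch (the starting value 5 is
+arbitrary, chosen only for illustration), the body still runs one time,
+printing @samp{body ran with i = 5}, and the loop then ends because
+@samp{i < 3} is false the first time it is tested:
+
+@example
+BEGIN @{
+    i = 5
+    do @{
+        print "body ran with i =", i
+        i++
+    @} while (i < 3)
+@}
+@end example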
+
+@node For Statement
+@subsection The @code{for} Statement
+@cindex @code{for} statement
+
+The @code{for} statement makes it more convenient to count iterations of a
+loop.  The general form of the @code{for} statement looks like this:
+
+@example
+for (@var{initialization}; @var{condition}; @var{increment})
+  @var{body}
+@end example
+
+@noindent
+The @var{initialization}, @var{condition}, and @var{increment} parts are
+arbitrary @command{awk} expressions, and @var{body} stands for any
+@command{awk} statement.
+
+The @code{for} statement starts by executing @var{initialization}.
+Then, as long
+as the @var{condition} is true, it repeatedly executes @var{body} and then
+@var{increment}.  Typically, @var{initialization} sets a variable to
+either zero or one, @var{increment} adds one to it, and @var{condition}
+compares it against the desired number of iterations.
+For example:
+
+@example
+awk '@{ for (i = 1; i <= 3; i++)
+          print $i
+@}' inventory-shipped
+@end example
+
+@noindent
+This prints the first three fields of each input record, with one field per
+line.
+
+It isn't possible to
+set more than one variable in the
+@var{initialization} part without using a multiple assignment statement
+such as @samp{x = y = 0}. This makes sense only if all the initial values
+are equal.  (But it is possible to initialize additional variables by writing
+their assignments as separate statements preceding the @code{for} loop.)
+
+@c @cindex comma operator, not supported
+The same is true of the @var{increment} part. Incrementing additional
+variables requires separate statements at the end of the loop.
+The C compound expression, using C's comma operator, is useful in
+this context but it is not supported in @command{awk}.
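+
+As a sketch of this workaround, the following loop counts @code{i} upward
+while a second variable, @code{j}, counts downward.  The variable names and
+starting values are assumptions made for the illustration:
+
+@example
+BEGIN @{
+    j = 10                      # second variable, set before the loop
+    for (i = 1; i <= 3; i++) @{
+        print i, j
+        j--                     # second "increment", done in the body
+    @}
+@}
+@end example
+
+@noindent
+Keep in mind that a @code{continue} statement in the body would skip the
+@samp{j--} but not the @samp{i++}.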
+
+Most often, @var{increment} is an increment expression, as in the previous
+example.  But this is not required; it can be any expression
+whatsoever.  For example, the following statement prints all the powers of two
+between 1 and 100:
+
+@example
+for (i = 1; i <= 100; i *= 2)
+  print i
+@end example
+
+If there is nothing to be done, any of the three expressions in the
+parentheses following the @code{for} keyword may be omitted.  Thus,
+@w{@samp{for (; x > 0;)}} is equivalent to @w{@samp{while (x > 0)}}.  If the
+@var{condition} is omitted, it is treated as true, effectively
+yielding an @dfn{infinite loop} (i.e., a loop that never terminates).
+
+In most cases, a @code{for} loop is an abbreviation for a @code{while}
+loop, as shown here:
+
+@example
+@var{initialization}
+while (@var{condition}) @{
+  @var{body}
+  @var{increment}
+@}
+@end example
+
+@cindex loops, @code{continue} statements and
+@noindent
+The only exception is when the @code{continue} statement
+(@pxref{Continue Statement}) is used
+inside the loop. Changing a @code{for} statement to a @code{while}
+statement in this way can change the effect of the @code{continue}
+statement inside the loop.
+
+The @command{awk} language has a @code{for} statement in addition to a
+@code{while} statement because a @code{for} loop is often both less work to
+type and more natural to think of.  Counting the number of iterations is
+very common in loops.  It can be easier to think of this counting as part
+of looping rather than as something to do inside the loop.
+
+@ifinfo
+@cindex @code{in} operator
+There is an alternate version of the @code{for} loop, for iterating over
+all the indices of an array:
+
+@example
+for (i in array)
+    @var{do something with} array[i]
+@end example
+
+@noindent
+@xref{Scanning an Array},
+for more information on this version of the @code{for} loop.
+@end ifinfo
+
+@node Switch Statement
+@subsection The @code{switch} Statement
+@cindex @code{switch} statement
+@cindex @code{case} keyword
+@cindex @code{default} keyword
+
+@strong{NOTE:} This @value{SUBSECTION} describes an experimental feature
+added in @command{gawk} 3.1.3.  It is @emph{not} enabled by default. To
+enable it, use the @option{--enable-switch} option to @command{configure}
+when @command{gawk} is being configured and built.
+@xref{Additional Configuration Options},
+for more information.
+
+The @code{switch} statement allows the evaluation of an expression and
+the execution of statements based on a @code{case} match. Case statements
+are checked for a match in the order they are defined.  If no suitable
+@code{case} is found, the @code{default} section is executed, if supplied. The
+general form of the @code{switch} statement looks like this:
+
+@example
+switch (@var{expression}) @{
+case @var{value or regular expression}:
+    @var{case-body}
+default:
+    @var{default-body}
+@}
+@end example
+
+The @code{switch} statement works as it does in C. Once a match to a given
+case is made, case statement bodies are executed until a @code{break},
+@code{continue}, @code{next}, @code{nextfile}, or @code{exit} is encountered,
+or the end of the @code{switch} statement itself. For example:
+
+@example
+switch (NR * 2 + 1) @{
+case 3:
+case "11":
+    print NR - 1
+    break
+
+case /2[[:digit:]]+/:
+    print NR
+
+default:
+    print NR + 1
+
+case -1:
+    print NR * -1
+@}
+@end example
+
+Note that if none of the statements specified above halts execution
+of a matched @code{case} statement, execution falls through to the
+next @code{case} until execution halts.  In the above example, for
+any value containing @samp{2} followed by one or more digits,
+the @code{print} statement is executed and execution then falls through
+into the @code{default} section, executing its @code{print} statement.
+In turn, the @minus{}1 case is also executed since the @code{default} does
+not halt execution.
+
+@node Break Statement
+@subsection The @code{break} Statement
+@cindex @code{break} statement
+@cindex loops, exiting
+
+The @code{break} statement jumps out of the innermost @code{for},
+@code{while}, or @code{do} loop that encloses it.  The following example
+finds the smallest divisor of any integer, and also identifies prime
+numbers:
+
+@example
+# find smallest divisor of num
+@{
+   num = $1
+   for (div = 2; div*div <= num; div++)
+     if (num % div == 0)
+       break
+   if (num % div == 0)
+     printf "Smallest divisor of %d is %d\n", num, div
+   else
+     printf "%d is prime\n", num
+@}
+@end example
+
+When the remainder is zero in the first @code{if} statement, @command{awk}
+immediately @dfn{breaks out} of the containing @code{for} loop.  This means
+that @command{awk} proceeds immediately to the statement following the loop
+and continues processing.  (This is very different from the @code{exit}
+statement, which stops the entire @command{awk} program.
+@xref{Exit Statement}.)
+
+The following program illustrates how the @var{condition} of a @code{for}
+or @code{while} statement could be replaced with a @code{break} inside
+an @code{if}:
+
+@example
+# find smallest divisor of num
+@{
+  num = $1
+  for (div = 2; ; div++) @{
+    if (num % div == 0) @{
+      printf "Smallest divisor of %d is %d\n", num, div
+      break
+    @}
+    if (div*div > num) @{
+      printf "%d is prime\n", num
+      break
+    @}
+  @}
+@}
+@end example
+
+@c @cindex @code{break}, outside of loops
+@c @cindex historical features
+@c @cindex @command{awk} language, POSIX version
+@cindex POSIX @command{awk}, @code{break} statement and
+@cindex dark corner, @code{break} statement
+@cindex @command{gawk}, @code{break} statement in
+The @code{break} statement has no meaning when
+used outside the body of a loop.  However, although it was never documented,
+historical implementations of @command{awk} treated the @code{break}
+statement outside of a loop as if it were a @code{next} statement
+(@pxref{Next Statement}).
+Recent versions of Unix @command{awk} no longer allow this usage.
+@command{gawk} supports this use of @code{break} only
+if @option{--traditional}
+has been specified on the command line
+(@pxref{Options}).
+Otherwise, it is treated as an error, since the POSIX standard
+specifies that @code{break} should only be used inside the body of a
+loop.
+@value{DARKCORNER}
+
+@node Continue Statement
+@subsection The @code{continue} Statement
+
+@cindex @code{continue} statement
+As with @code{break}, the @code{continue} statement is used only inside
+@code{for}, @code{while}, and @code{do} loops.  It skips
+over the rest of the loop body, causing the next cycle around the loop
+to begin immediately.  Contrast this with @code{break}, which jumps out
+of the loop altogether.
+
+The @code{continue} statement in a @code{for} loop directs @command{awk} to
+skip the rest of the body of the loop and resume execution with the
+increment-expression of the @code{for} statement.  The following program
+illustrates this fact:
+
+@example
+BEGIN @{
+     for (x = 0; x <= 20; x++) @{
+         if (x == 5)
+             continue
+         printf "%d ", x
+     @}
+     print ""
+@}
+@end example
+
+@noindent
+This program prints all the numbers from 0 to 20---except for 5, for
+which the @code{printf} is skipped.  Because the increment @samp{x++}
+is not skipped, @code{x} does not remain stuck at 5.  Contrast the
+@code{for} loop from the previous example with the following @code{while} loop:
+
+@example
+BEGIN @{
+     x = 0
+     while (x <= 20) @{
+         if (x == 5)
+             continue
+         printf "%d ", x
+         x++
+     @}
+     print ""
+@}
+@end example
+
+@noindent
+This program loops forever once @code{x} reaches 5.
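+
+One way to make the @code{while} version terminate is to perform the
+increment before the @code{continue}.  The following is only a sketch of
+one possible fix; it prints the same numbers as the @code{for} loop:
+
+@example
+BEGIN @{
+     x = 0
+     while (x <= 20) @{
+         if (x == 5) @{
+             x++          # advance first, so x does not stay at 5
+             continue
+         @}
+         printf "%d ", x
+         x++
+     @}
+     print ""
+@}
+@end example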
+
+@c @cindex @code{continue}, outside of loops
+@c @cindex historical features
+@c @cindex @command{awk} language, POSIX version
+@cindex POSIX @command{awk}, @code{continue} statement and
+@cindex dark corner, @code{continue} statement
+@cindex @command{gawk}, @code{continue} statement in
+The @code{continue} statement has no meaning when used outside the body of
+a loop.  Historical versions of @command{awk} treated a @code{continue}
+statement outside a loop the same way they treated a @code{break}
+statement outside a loop: as if it were a @code{next}
+statement
+(@pxref{Next Statement}).
+Recent versions of Unix @command{awk} no longer work this way, and
+@command{gawk} allows it only if @option{--traditional} is specified on
+the command line (@pxref{Options}).  Just like the
+@code{break} statement, the POSIX standard specifies that @code{continue}
+should only be used inside the body of a loop.
+@value{DARKCORNER}
+
+@node Next Statement
+@subsection The @code{next} Statement
+@cindex @code{next} statement
+
+The @code{next} statement forces @command{awk} to immediately stop processing
+the current record and go on to the next record.  This means that no
+further rules are executed for the current record, and the rest of the
+current rule's action isn't executed.
+
+Contrast this with the effect of the @code{getline} command
+(@pxref{Getline}).  That also causes
+@command{awk} to read the next record immediately, but it does not alter the
+flow of control in any way (i.e., the rest of the current action executes
+with a new input record).
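+
+The difference can be seen in a small two-rule program.  This is a sketch
+only; the marker value @samp{b} is an assumption made for the illustration:
+
+@example
+$1 == "b" @{ getline; print "after getline:", $0 @}
+@{ print "second rule:", $0 @}
+@end example
+
+@noindent
+Given the four input lines @samp{a}, @samp{b}, @samp{c}, and @samp{d}, the
+first rule matches @samp{b}, @code{getline} replaces that record with
+@samp{c}, and both the rest of that action and the second rule then run
+on @samp{c}.  If @code{getline} were replaced with @code{next}, the
+@code{print} in the first rule would never run, and the second rule would
+see @samp{a}, @samp{c}, and @samp{d}.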
+
+@cindex @command{awk} programs, execution of
+At the highest level, @command{awk} program execution is a loop that reads
+an input record and then tests each rule's pattern against it.  If you
+think of this loop as a @code{for} statement whose body contains the
+rules, then the @code{next} statement is analogous to a @code{continue}
+statement. It skips to the end of the body of this implicit loop and
+executes the increment (which reads another record).
+
+For example, suppose an @command{awk} program works only on records
+with four fields, and it shouldn't fail when given bad input.  To avoid
+complicating the rest of the program, write a ``weed out'' rule near
+the beginning, in the following manner:
+
+@example
+NF != 4 @{
+  err = sprintf("%s:%d: skipped: NF != 4\n", FILENAME, FNR)
+  print err > "/dev/stderr"
+  next
+@}
+@end example
+
+@noindent
+Because of the @code{next} statement,
+the program's subsequent rules won't see the bad record.  The error
+message is redirected to the standard error output stream, as error
+messages should be.
+For more detail see
+@ref{Special Files}.
+
+@c @cindex @command{awk} language, POSIX version
+@c @cindex @code{next}, inside a user-defined function
+@cindex @code{BEGIN} pattern, @code{next}/@code{nextfile} statements and
+@cindex @code{END} pattern, @code{next}/@code{nextfile} statements and
+@cindex POSIX @command{awk}, @code{next}/@code{nextfile} statements and
+@cindex @code{next} statement, user-defined functions and
+@cindex functions, user-defined, @code{next}/@code{nextfile} statements and
+According to the POSIX standard, the behavior is undefined if
+the @code{next} statement is used in a @code{BEGIN} or @code{END} rule.
+@command{gawk} treats it as a syntax error.
+Although POSIX permits it,
+some other @command{awk} implementations don't allow the @code{next}
+statement inside function bodies
+(@pxref{User-defined}).
+Just as with any other @code{next} statement, a @code{next} statement inside a
+function body reads the next record and starts processing it with the
+first rule in the program.
+If the @code{next} statement causes the end of the input to be reached,
+then the code in any @code{END} rules is executed.
+@xref{BEGIN/END}.
+
+@node Nextfile Statement
+@subsection Using @command{gawk}'s @code{nextfile} Statement
+@cindex @code{nextfile} statement
+@cindex differences in @command{awk} and @command{gawk}, @code{next}/@code{nextfile} statements
+
+@command{gawk} provides the @code{nextfile} statement,
+which is similar to the @code{next} statement.
+However, instead of abandoning processing of the current record, the
+@code{nextfile} statement instructs @command{gawk} to stop processing the
+current @value{DF}.
+
+The @code{nextfile} statement is a @command{gawk} extension.
+In most other @command{awk} implementations,
+or if @command{gawk} is in compatibility mode
+(@pxref{Options}),
+@code{nextfile} is not special.
+
+Upon execution of the @code{nextfile} statement, @code{FILENAME} is
+updated to the name of the next @value{DF} listed on the command line,
+@code{FNR} is reset to one, @code{ARGIND} is incremented, and processing
+starts over with the first rule in the program.
+(@code{ARGIND} hasn't been introduced yet. @xref{Built-in Variables}.)
+If the @code{nextfile} statement causes the end of the input to be reached,
+then the code in any @code{END} rules is executed.
+@xref{BEGIN/END}.
+
+The @code{nextfile} statement is useful when there are many @value{DF}s
+to process but it isn't necessary to process every record in every file.
+Normally, in order to move on to the next @value{DF}, a program
+has to continue scanning the unwanted records.  The @code{nextfile}
+statement accomplishes this much more efficiently.
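+
+For instance, printing only the first record of each file needs nothing
+beyond that record.  The following is a minimal sketch, assuming
+@command{gawk} or another @command{awk} that supports @code{nextfile};
+the @value{DF} names are placeholders:
+
+@example
+gawk 'FNR == 1 @{ print FILENAME ": " $0; nextfile @}' @var{file1} @var{file2}
+@end example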
+
+While one might think that @samp{close(FILENAME)} would accomplish
+the same as @code{nextfile}, this isn't true.  @code{close} is
+reserved for closing files, pipes, and coprocesses that are
+opened with redirections.  It is not related to the main processing that
+@command{awk} does with the files listed in @code{ARGV}.
+
+If it's necessary to use an @command{awk} version that doesn't support
+@code{nextfile}, see
+@ref{Nextfile Function},
+for a user-defined function that simulates the @code{nextfile}
+statement.
+
+@cindex functions, user-defined, @code{next}/@code{nextfile} statements and
+@cindex @code{nextfile} statement, user-defined functions and
+The current version of the Bell Laboratories @command{awk}
+(@pxref{Other Versions})
+also supports @code{nextfile}.  However, it doesn't allow the @code{nextfile}
+statement inside function bodies
+(@pxref{User-defined}).
+@command{gawk} does; a @code{nextfile} inside a
+function body reads the first record of the next file and starts processing
+it with the first rule in the program, just as any other @code{nextfile}
+statement does.
+
+@cindex @code{next file} statement, in @command{gawk}
+@cindex @command{gawk}, @code{next file} statement in
+@cindex @code{nextfile} statement, in @command{gawk}
+@cindex @command{gawk}, @code{nextfile} statement in
+@strong{Caution:}  Versions of @command{gawk} prior to 3.0 used two
+words (@samp{next file}) for the @code{nextfile} statement.
+In @value{PVERSION} 3.0, this was changed
+to one word, because the treatment of @samp{file} was
+inconsistent. When it appeared after @code{next}, @samp{file} was a keyword;
+otherwise, it was a regular identifier.  The old usage is no longer
+accepted; @samp{next file} generates a syntax error.
+
+@node Exit Statement
+@subsection The @code{exit} Statement
+
+@cindex @code{exit} statement
+The @code{exit} statement causes @command{awk} to immediately stop
+executing the current rule and to stop processing input; any remaining input
+is ignored.  The @code{exit} statement is written as follows:
+
+@example
+exit @r{[}@var{return code}@r{]}
+@end example
+
+@cindex @code{BEGIN} pattern, @code{exit} statement and
+@cindex @code{END} pattern, @code{exit} statement and
+When an @code{exit} statement is executed from a @code{BEGIN} rule, the
+program stops processing everything immediately.  No input records are
+read.  However, if an @code{END} rule is present,
+as part of executing the @code{exit} statement,
+the @code{END} rule is executed
+(@pxref{BEGIN/END}).
+If @code{exit} is used as part of an @code{END} rule, it causes
+the program to stop immediately.
+
+An @code{exit} statement that is not part of a @code{BEGIN} or @code{END}
+rule stops the execution of any further automatic rules for the current
+record, skips reading any remaining input records, and executes the
+@code{END} rule if there is one.
+
+In such a case,
+if you don't want the @code{END} rule to do its job, set a variable
+to nonzero before the @code{exit} statement and check that variable in
+the @code{END} rule.
+@xref{Assert Function},
+for an example that does this.
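+
+A minimal sketch of this technique follows.  The variable name
+@code{failed} and the rule that rejects negative numbers are assumptions
+chosen for the illustration:
+
+@example
+$1 < 0 @{ failed = 1; exit 1 @}
+       @{ sum += $1 @}
+END    @{
+    if (failed)
+        exit 1      # skip the summary; keep the nonzero status
+    print "sum:", sum
+@}
+@end example
+
+@noindent
+When every record is nonnegative, the @code{END} rule prints the total.
+As soon as a negative value is seen, the program stops reading input and
+the @code{END} rule exits without printing anything.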
+
+@cindex dark corner, @code{exit} statement
+If an argument is supplied to @code{exit}, its value is used as the exit
+status code for the @command{awk} process.  If no argument is supplied,
+@code{exit} returns status zero (success).  In the case where an argument
+is supplied to a first @code{exit} statement, and then @code{exit} is
+called a second time from an @code{END} rule with no argument,
+@command{awk} uses the previously supplied exit value.
+@value{DARKCORNER}
+
+@cindex programming conventions, @code{exit} statement
+For example, suppose an error condition occurs that is difficult or
+impossible to handle.  Conventionally, programs report this by
+exiting with a nonzero status.  An @command{awk} program can do this
+using an @code{exit} statement with a nonzero argument, as shown
+in the following example:
+
+@example
+BEGIN @{
+       if (("date" | getline date_now) <= 0) @{
+         print "Can't get system date" > "/dev/stderr"
+         exit 1
+       @}
+       print "current date is", date_now
+       close("date")
+@}
+@end example
+@c ENDOFRANGE csta
+@c ENDOFRANGE acs
+@c ENDOFRANGE accs
+
+@node Built-in Variables
+@section Built-in Variables
+@c STARTOFRANGE bvar
+@cindex built-in variables
+@c STARTOFRANGE varb
+@cindex variables, built-in
+
+Most @command{awk} variables are available to use for your own
+purposes; they never change unless your program assigns values to
+them, and they never affect anything unless your program examines them.
+However, a few variables in @command{awk} have special built-in meanings.
+@command{awk} examines some of these automatically, so that they enable you
+to tell @command{awk} how to do certain things.  Others are set
+automatically by @command{awk}, so that they carry information from the
+internal workings of @command{awk} to your program.
+
+@cindex @command{gawk}, built-in variables and
+This @value{SECTION} documents all the built-in variables of
+@command{gawk}, most of which are also documented in the chapters
+describing their areas of activity.
+
+@menu
+* User-modified::               Built-in variables that you change to control
+                                @command{awk}.
+* Auto-set::                    Built-in variables where @command{awk} gives
+                                you information.
+* ARGC and ARGV::               Ways to use @code{ARGC} and @code{ARGV}.
+@end menu
+
+@node User-modified
+@subsection Built-in Variables That Control @command{awk}
+@c STARTOFRANGE bvaru
+@cindex built-in variables, user-modifiable
+@c STARTOFRANGE nmbv
+@cindex user-modifiable variables
+
+The following is an alphabetical list of variables that you can change to
+control how @command{awk} does certain things. The variables that are
+specific to @command{gawk} are marked with a pound sign@w{ (@samp{#}).}
+
+@table @code
+@cindex @code{BINMODE} variable
+@cindex binary input/output
+@cindex input/output, binary
+@item BINMODE #
+On non-POSIX systems, this variable specifies use of binary mode for all I/O.
+Numeric values of one, two, or three specify that input files, output files, or
+all files, respectively, should use binary I/O.
+Alternatively,
+string values of @code{"r"} or @code{"w"} specify that input files and
+output files, respectively, should use binary I/O.
+A string value of @code{"rw"} or @code{"wr"} indicates that all
+files should use binary I/O.
+Any other string value is equivalent to @code{"rw"}, but @command{gawk}
+generates a warning message.
+@code{BINMODE} is described in more detail in
+@ref{PC Using}.
+
+@cindex differences in @command{awk} and @command{gawk}, @code{BINMODE} variable
+This variable is a @command{gawk} extension.
+In other @command{awk} implementations
+(except @command{mawk},
+@pxref{Other Versions}),
+or if @command{gawk} is in compatibility mode
+(@pxref{Options}),
+it is not special.
+
+@cindex @code{CONVFMT} variable
+@cindex POSIX @command{awk}, @code{CONVFMT} variable and
+@cindex numbers, converting, to strings
+@cindex strings, converting, numbers to
+@item CONVFMT
+This string controls conversion of numbers to
+strings (@pxref{Conversion}).
+It works by being passed, in effect, as the first argument to the
+@code{sprintf} function
+(@pxref{String Functions}).
+Its default value is @code{"%.6g"}.
+@code{CONVFMT} was introduced by the POSIX standard.
+
+@cindex @code{FIELDWIDTHS} variable
+@cindex differences in @command{awk} and @command{gawk}, @code{FIELDWIDTHS} variable
+@cindex field separators, @code{FIELDWIDTHS} variable and
+@cindex separators, field, @code{FIELDWIDTHS} variable and
+@item FIELDWIDTHS #
+This is a space-separated list of columns that tells @command{gawk}
+how to split input with fixed columnar boundaries.
+Assigning a value to @code{FIELDWIDTHS}
+overrides the use of @code{FS} for field splitting.
+@xref{Constant Size}, for more information.
+
+@cindex @command{gawk}, @code{FIELDWIDTHS} variable in
+If @command{gawk} is in compatibility mode
+(@pxref{Options}), then @code{FIELDWIDTHS}
+has no special meaning, and field-splitting operations occur based
+exclusively on the value of @code{FS}.
+
+@cindex @code{FS} variable
+@cindex separators, field
+@cindex field separators
+@item FS
+This is the input field separator
+(@pxref{Field Separators}).
+The value is a single-character string or a multi-character regular
+expression that matches the separations between fields in an input
+record.  If the value is the null string (@code{""}), then each
+character in the record becomes a separate field.
+(This behavior is a @command{gawk} extension. POSIX @command{awk} does not
+specify the behavior when @code{FS} is the null string.)
+@c NEXT ED: Mark as common extension
+
+@cindex POSIX @command{awk}, @code{FS} variable and
+The default value is @w{@code{" "}}, a string consisting of a single
+space.  As a special exception, this value means that any
+sequence of spaces, tabs, and/or newlines is a single separator.@footnote{In
+POSIX @command{awk}, newline does not count as whitespace.}  It also causes
+spaces, tabs, and newlines at the beginning and end of a record to be ignored.
+
+You can set the value of @code{FS} on the command line using the
+@option{-F} option:
+
+@example
+awk -F, '@var{program}' @var{input-files}
+@end example
+
+@cindex @command{gawk}, field separators and
+If @command{gawk} is using @code{FIELDWIDTHS} for field splitting,
+assigning a value to @code{FS} causes @command{gawk} to return to
+the normal, @code{FS}-based field splitting. An easy way to do this
+is to simply say @samp{FS = FS}, perhaps with an explanatory comment.
+
+@cindex @code{IGNORECASE} variable
+@cindex differences in @command{awk} and @command{gawk}, @code{IGNORECASE} variable
+@cindex case sensitivity, string comparisons and
+@cindex case sensitivity, regexps and
+@cindex regular expressions, case sensitivity
+@item IGNORECASE #
+If @code{IGNORECASE} is nonzero or non-null, then all string comparisons
+and all regular expression matching are case independent.  Thus, regexp
+matching with @samp{~} and @samp{!~}, as well as the @code{gensub},
+@code{gsub}, @code{index}, @code{match}, @code{split}, and @code{sub}
+functions, record termination with @code{RS}, and field splitting with
+@code{FS}, all ignore case when doing their particular regexp operations.
+However, the value of @code{IGNORECASE} does @emph{not} affect array subscripting
+and it does not affect field splitting when using a single-character
+field separator.
+@xref{Case-sensitivity}.
+
+@cindex @command{gawk}, @code{IGNORECASE} variable in
+If @command{gawk} is in compatibility mode
+(@pxref{Options}),
+then @code{IGNORECASE} has no special meaning.  Thus, string
+and regexp operations are always case-sensitive.
+
+@cindex @code{LINT} variable
+@cindex differences in @command{awk} and @command{gawk}, @code{LINT} variable
+@cindex lint checking
+@item LINT #
+When this variable is true (nonzero or non-null), @command{gawk}
+behaves as if the @option{--lint} command-line option is in effect
+(@pxref{Options}).
+With a value of @code{"fatal"}, lint warnings become fatal errors.
+With a value of @code{"invalid"}, only warnings about things that are
+actually invalid are issued. (This is not fully implemented yet.)
+Any other true value prints nonfatal warnings.
+Assigning a false value to @code{LINT} turns off the lint warnings.
+
+@cindex @command{gawk}, @code{LINT} variable in
+This variable is a @command{gawk} extension.  It is not special
+in other @command{awk} implementations.  Unlike the other special variables,
+changing @code{LINT} does affect the production of lint warnings,
+even if @command{gawk} is in compatibility mode.  Much as
+the @option{--lint} and @option{--traditional} options independently
+control different aspects of @command{gawk}'s behavior, the control
+of lint warnings during program execution is independent of the flavor
+of @command{awk} being executed.
+
+@cindex @code{OFMT} variable
+@cindex numbers, converting, to strings
+@cindex strings, converting, numbers to
+@item OFMT
+This string controls conversion of numbers to
+strings (@pxref{Conversion}) for
+printing with the @code{print} statement.  It works by being passed
+as the first argument to the @code{sprintf} function
+(@pxref{String Functions}).
+Its default value is @code{"%.6g"}.  Earlier versions of @command{awk}
+also used @code{OFMT} to specify the format for converting numbers to
+strings in general expressions; this is now done by @code{CONVFMT}.
+
+@cindex @code{sprintf} function, @code{OFMT} variable and
+@cindex @code{print} statement, @code{OFMT} variable and
+@cindex @code{OFS} variable
+@cindex separators, field
+@cindex field separators
+@item OFS
+This is the output field separator (@pxref{Output Separators}).  It is
+output between the fields printed by a @code{print} statement.  Its
+default value is @w{@code{" "}}, a string consisting of a single space.
+
+@cindex @code{ORS} variable
+@item ORS
+This is the output record separator.  It is output at the end of every
+@code{print} statement.  Its default value is @code{"\n"}, the newline
+character.  (@xref{Output Separators}.)
+
+@cindex @code{RS} variable
+@cindex separators, record
+@cindex record separators
+@item RS
+This is @command{awk}'s input record separator.  Its default value is a string
+containing a single newline character, which means that an input record
+consists of a single line of text.
+It can also be the null string, in which case records are separated by
+runs of blank lines.
+If it is a regexp, records are separated by
+matches of the regexp in the input text.
+(@xref{Records}.)
+
+The ability for @code{RS} to be a regular expression
+is a @command{gawk} extension.
+In most other @command{awk} implementations,
+or if @command{gawk} is in compatibility mode
+(@pxref{Options}),
+just the first character of @code{RS}'s value is used.
+
+@cindex @code{SUBSEP} variable
+@cindex separators, subscript
+@cindex subscript separators
+@item SUBSEP
+This is the subscript separator.  It has the default value of
+@code{"\034"} and is used to separate the parts of the indices of a
+multidimensional array.  Thus, the expression @code{@w{foo["A", "B"]}}
+really accesses @code{foo["A\034B"]}
+(@pxref{Multi-dimensional}).
+
+@cindex @code{TEXTDOMAIN} variable
+@cindex differences in @command{awk} and @command{gawk}, @code{TEXTDOMAIN} variable
+@cindex internationalization, localization
+@item TEXTDOMAIN #
+This variable is used for internationalization of programs at the
+@command{awk} level.  It sets the default text domain for specially
+marked string constants in the source text, as well as for the
+@code{dcgettext}, @code{dcngettext} and @code{bindtextdomain} functions
+(@pxref{Internationalization}).
+The default value of @code{TEXTDOMAIN} is @code{"messages"}.
+
+This variable is a @command{gawk} extension.
+In other @command{awk} implementations,
+or if @command{gawk} is in compatibility mode
+(@pxref{Options}),
+it is not special.
+@end table
+@c ENDOFRANGE bvar
+@c ENDOFRANGE varb
+@c ENDOFRANGE bvaru
+@c ENDOFRANGE nmbv
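+
+The difference between @code{CONVFMT} and @code{OFMT} described above
+can be seen in a short experiment.  The following is only an
+illustrative sketch; the variable names @code{x} and @code{s} and the
+format values are arbitrary choices, not defaults:
+
+@example
+$ awk 'BEGIN @{
+>     CONVFMT = "%.2f"; OFMT = "%.3f"
+>     x = 3.14159
+>     s = x ""     # concatenation converts x to a string with CONVFMT
+>     print s      # s is already a string, so OFMT is not applied
+>     print x      # x is still a number, so print converts it with OFMT
+> @}'
+@print{} 3.14
+@print{} 3.142
+@end example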
+
+@node Auto-set
+@subsection Built-in Variables That Convey Information
+
+@c STARTOFRANGE bvconi
+@cindex built-in variables, conveying information
+@c STARTOFRANGE vbconi
+@cindex variables, built-in, conveying information
+The following is an alphabetical list of variables that @command{awk}
+sets automatically on certain occasions in order to provide
+information to your program.  The variables that are specific to
+@command{gawk} are marked with a pound sign@w{ (@samp{#}).}
+
+@table @code
+@cindex @code{ARGC}/@code{ARGV} variables
+@cindex arguments, command-line
+@cindex command line, arguments
+@item ARGC@r{,} ARGV
+The command-line arguments available to @command{awk} programs are stored in
+an array called @code{ARGV}.  @code{ARGC} is the number of command-line
+arguments present.  @xref{Other Arguments}.
+Unlike most @command{awk} arrays,
+@code{ARGV} is indexed from 0 to @code{ARGC} @minus{} 1.
+In the following example:
+
+@example
+$ awk 'BEGIN @{
+>         for (i = 0; i < ARGC; i++)
+>             print ARGV[i]
+>      @}' inventory-shipped BBS-list
+@print{} awk
+@print{} inventory-shipped
+@print{} BBS-list
+@end example
+
+@noindent
+@code{ARGV[0]} contains @code{"awk"}, @code{ARGV[1]}
+contains @code{"inventory-shipped"}, and @code{ARGV[2]} contains
+@code{"BBS-list"}.  The value of @code{ARGC} is three, one more than the
+index of the last element in @code{ARGV}, because the elements are numbered
+from zero.
+
+@cindex programming conventions, @code{ARGC}/@code{ARGV} variables
+The names @code{ARGC} and @code{ARGV}, as well as the convention of indexing
+the array from 0 to @code{ARGC} @minus{} 1, are derived from the C language's
+method of accessing command-line arguments.
+
+The value of @code{ARGV[0]} can vary from system to system.
+Also, you should note that the program text is @emph{not} included in
+@code{ARGV}, nor are any of @command{awk}'s command-line options.
+@xref{ARGC and ARGV}, for information
+about how @command{awk} uses these variables.
+
+@cindex @code{ARGIND} variable
+@cindex differences in @command{awk} and @command{gawk}, @code{ARGIND} variable
+@item ARGIND #
+The index in @code{ARGV} of the current file being processed.
+Every time @command{gawk} opens a new @value{DF} for processing, it sets
+@code{ARGIND} to the index in @code{ARGV} of the @value{FN}.
+When @command{gawk} is processing the input files,
+@samp{FILENAME == ARGV[ARGIND]} is always true.
+
+@c comma before ARGIND does NOT mark a tertiary
+@cindex files, processing, @code{ARGIND} variable and
+This variable is useful in file processing; it allows you to tell how far
+along you are in the list of @value{DF}s as well as to distinguish between
+successive instances of the same @value{FN} on the command line.
+
+@cindex @value{FN}s, distinguishing
+While you can change the value of @code{ARGIND} within your @command{awk}
+program, @command{gawk} automatically sets it to a new value when the
+next file is opened.
+
+This variable is a @command{gawk} extension.
+In other @command{awk} implementations,
+or if @command{gawk} is in compatibility mode
+(@pxref{Options}),
+it is not special.
+
+@cindex @code{ENVIRON} variable
+@cindex environment variables
+@item ENVIRON
+An associative array that contains the values of the environment.  The array
+indices are the environment variable names; the elements are the values of
+the particular environment variables.  For example,
+@code{ENVIRON["HOME"]} might be @file{/home/arnold}.  Changing this array
+does not affect the environment passed on to any programs that
+@command{awk} may spawn via redirection or the @code{system} function.
+@c (In a future version of @command{gawk}, it may do so.)
+
+Some operating systems may not have environment variables.
+On such systems, the @code{ENVIRON} array is empty (except for
+@w{@code{ENVIRON["AWKPATH"]}},
+@pxref{AWKPATH Variable}).
+
+@cindex @code{ERRNO} variable
+@cindex differences in @command{awk} and @command{gawk}, @code{ERRNO} variable
+@cindex error handling, @code{ERRNO} variable and
+@item ERRNO #
+If a system error occurs during a redirection for @code{getline},
+during a read for @code{getline}, or during a @code{close} operation,
+then @code{ERRNO} contains a string describing the error.
+
+This variable is a @command{gawk} extension.
+In other @command{awk} implementations,
+or if @command{gawk} is in compatibility mode
+(@pxref{Options}),
+it is not special.
+
+@cindex @code{FILENAME} variable
+@cindex dark corner, @code{FILENAME} variable
+@item FILENAME
+The name of the file that @command{awk} is currently reading.
+When no @value{DF}s are listed on the command line, @command{awk} reads
+from the standard input and @code{FILENAME} is set to @code{"-"}.
+@code{FILENAME} is changed each time a new file is read
+(@pxref{Reading Files}).
+Inside a @code{BEGIN} rule, the value of @code{FILENAME} is
+@code{""}, since there are no input files being processed
+yet.@footnote{Some early implementations of Unix @command{awk} initialized
+@code{FILENAME} to @code{"-"}, even if there were @value{DF}s to be
+processed. This behavior was incorrect and should not be relied
+upon in your programs.}
+@value{DARKCORNER}
+Note, though, that using @code{getline}
+(@pxref{Getline})
+inside a @code{BEGIN} rule can give
+@code{FILENAME} a value.
+
+@cindex @code{FNR} variable
+@item FNR
+The current record number in the current file.  @code{FNR} is
+incremented each time a new record is read
+(@pxref{Getline}).  It is reinitialized
+to zero each time a new input file is started.
+
+@cindex @code{NF} variable
+@item NF
+The number of fields in the current input record.
+@code{NF} is set each time a new record is read, when a new field is
+created or when @code{$0} changes (@pxref{Fields}).
+
+Unlike most of the variables described in this
+@ifnotinfo
+section,
+@end ifnotinfo
+@ifinfo
+node,
+@end ifinfo
+assigning a value to @code{NF} has the potential to affect
+@command{awk}'s internal workings.  In particular, assignments
+to @code{NF} can be used to create or remove fields from the
+current record.  @xref{Changing Fields}.
+
+@cindex @code{NR} variable
+@item NR
+The number of input records @command{awk} has processed since
+the beginning of the program's execution
+(@pxref{Records}).
+@code{NR} is incremented each time a new record is read.
+
+@cindex @code{PROCINFO} array
+@cindex differences in @command{awk} and @command{gawk}, @code{PROCINFO} array
+@item PROCINFO #
+The elements of this array provide access to information about the
+running @command{awk} program.
+The following elements (listed alphabetically)
+are guaranteed to be available:
+
+@table @code
+@item PROCINFO["egid"]
+The value of the @code{getegid} system call.
+
+@item PROCINFO["euid"]
+The value of the @code{geteuid} system call.
+
+@item PROCINFO["FS"]
+This is
+@code{"FS"} if field splitting with @code{FS} is in effect, or it is
+@code{"FIELDWIDTHS"} if field splitting with @code{FIELDWIDTHS} is in effect.
+
+@item PROCINFO["gid"]
+The value of the @code{getgid} system call.
+
+@item PROCINFO["pgrpid"]
+The process group ID of the current process.
+
+@item PROCINFO["pid"]
+The process ID of the current process.
+
+@item PROCINFO["ppid"]
+The parent process ID of the current process.
+
+@item PROCINFO["uid"]
+The value of the @code{getuid} system call.
+@end table
+
+On some systems, there may be elements in the array, @code{"group1"}
+through @code{"group@var{N}"} for some @var{N}. @var{N} is the number of
+supplementary groups that the process has.  Use the @code{in} operator
+to test for these elements
+(@pxref{Reference to Elements}).
+
+This array is a @command{gawk} extension.
+In other @command{awk} implementations,
+or if @command{gawk} is in compatibility mode
+(@pxref{Options}),
+it is not special.
+
+@cindex @code{RLENGTH} variable
+@item RLENGTH
+The length of the substring matched by the
+@code{match} function
+(@pxref{String Functions}).
+@code{RLENGTH} is set by invoking the @code{match} function.  Its value
+is the length of the matched string, or @minus{}1 if no match is found.
+
+@cindex @code{RSTART} variable
+@item RSTART
+The start-index in characters of the substring that is matched by the
+@code{match} function
+(@pxref{String Functions}).
+@code{RSTART} is set by invoking the @code{match} function.  Its value
+is the position of the string where the matched substring starts, or zero
+if no match was found.
+
+@cindex @code{RT} variable
+@cindex differences in @command{awk} and @command{gawk}, @code{RT} variable
+@item RT #
+This is set each time a record is read. It contains the input text
+that matched the text denoted by @code{RS}, the record separator.
+
+This variable is a @command{gawk} extension.
+In other @command{awk} implementations,
+or if @command{gawk} is in compatibility mode
+(@pxref{Options}),
+it is not special.
+@end table
+@c ENDOFRANGE bvconi
+@c ENDOFRANGE vbconi
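+
+As a brief illustration of how @code{RSTART} and @code{RLENGTH} are
+set by @code{match}, consider the following sketch.  The string and
+the regexps are arbitrary examples:
+
+@example
+$ awk 'BEGIN @{
+>     match("foobar", /o+/); print RSTART, RLENGTH   # match found
+>     match("foobar", /z/);  print RSTART, RLENGTH   # no match
+> @}'
+@print{} 2 2
+@print{} 0 -1
+@end example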
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Changing @code{NR} and @code{FNR}
+@cindex @code{NR} variable, changing
+@cindex @code{FNR} variable, changing
+@cindex advanced features, @code{FNR}/@code{NR} variables
+@cindex dark corner, @code{FNR}/@code{NR} variables
+@command{awk} increments @code{NR} and @code{FNR}
+each time it reads a record, instead of setting them to the absolute
+value of the number of records read.  This means that a program can
+change these variables and their new values are incremented for
+each record.
+@value{DARKCORNER}
+This is demonstrated in the following example:
+
+@example
+$ echo '1
+> 2
+> 3
+> 4' | awk 'NR == 2 @{ NR = 17 @}
+> @{ print NR @}'
+@print{} 1
+@print{} 17
+@print{} 18
+@print{} 19
+@end example
+
+@noindent
+Before @code{FNR} was added to the @command{awk} language
+(@pxref{V7/SVR3.1}),
+many @command{awk} programs used this feature to track the number of
+records in a file by resetting @code{NR} to zero when @code{FILENAME}
+changed.
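+
+Such a program might have looked something like the following sketch.
+The variable name @code{prev} is hypothetical, and the program assumes
+that at least one @value{DF} is named on the command line.  Because
+the first record of a new file has already been counted when the rule
+runs, the sketch sets @code{NR} to one rather than zero:
+
+@example
+# restart the record count whenever a new file begins
+FILENAME != prev @{
+    NR = 1
+    prev = FILENAME
+@}
+
+# NR now plays the role that FNR plays in modern awk
+@{ print FILENAME ": " NR ": " $0 @}
+@end example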
+
+@node ARGC and ARGV
+@subsection Using @code{ARGC} and @code{ARGV}
+@cindex @code{ARGC}/@code{ARGV} variables
+@cindex arguments, command-line
+@cindex command line, arguments
+
+@ref{Auto-set},
+presented the following program describing the information contained in @code{ARGC}
+and @code{ARGV}:
+
+@example
+$ awk 'BEGIN @{
+>        for (i = 0; i < ARGC; i++)
+>            print ARGV[i]
+>      @}' inventory-shipped BBS-list
+@print{} awk
+@print{} inventory-shipped
+@print{} BBS-list
+@end example
+
+@noindent
+In this example, @code{ARGV[0]} contains @samp{awk}, @code{ARGV[1]}
+contains @samp{inventory-shipped}, and @code{ARGV[2]} contains
+@samp{BBS-list}.
+Notice that the @command{awk} program is not entered in @code{ARGV}.  The
+other special command-line options, with their arguments, are also not
+entered.  This includes variable assignments done with the @option{-v}
+option (@pxref{Options}).
+Normal variable assignments on the command line @emph{are}
+treated as arguments and do show up in the @code{ARGV} array:
+
+@example
+$ cat showargs.awk
+@print{} BEGIN @{
+@print{}     printf "A=%d, B=%d\n", A, B
+@print{}     for (i = 0; i < ARGC; i++)
+@print{}         printf "\tARGV[%d] = %s\n", i, ARGV[i]
+@print{} @}
+@print{} END   @{ printf "A=%d, B=%d\n", A, B @}
+$ awk -v A=1 -f showargs.awk B=2 /dev/null
+@print{} A=1, B=0
+@print{}        ARGV[0] = awk
+@print{}        ARGV[1] = B=2
+@print{}        ARGV[2] = /dev/null
+@print{} A=1, B=2
+@end example
+
+A program can alter @code{ARGC} and the elements of @code{ARGV}.
+Each time @command{awk} reaches the end of an input file, it uses the next
+element of @code{ARGV} as the name of the next input file.  By storing a
+different string there, a program can change which files are read.
+Use @code{"-"} to represent the standard input.  Storing
+additional elements and incrementing @code{ARGC} causes
+additional files to be read.
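+
+For instance, a @code{BEGIN} rule along the following lines arranges
+for one more file to be read after those named on the command line.
+This is a minimal sketch; the @value{FN} @file{extra.data} is
+hypothetical:
+
+@example
+BEGIN @{
+    ARGV[ARGC] = "extra.data"   # store the name in the next free slot
+    ARGC++                      # tell awk that the slot is in use
+@}
+
+@{ print FILENAME ": " $0 @}
+@end example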
+
+If the value of @code{ARGC} is decreased, that eliminates input files
+from the end of the list.  By recording the old value of @code{ARGC}
+elsewhere, a program can treat the eliminated arguments as
+something other than @value{FN}s.
+
+To eliminate a file from the middle of the list, store the null string
+(@code{""}) into @code{ARGV} in place of the file's name.  As a
+special feature, @command{awk} ignores @value{FN}s that have been
+replaced with the null string.
+Another option is to
+use the @code{delete} statement to remove elements from
+@code{ARGV} (@pxref{Delete}).
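+
+Both approaches can be sketched as follows.  The @value{FN}s
+@file{skip-me} and @file{drop-me} are hypothetical placeholders:
+
+@example
+BEGIN @{
+    for (i = 1; i < ARGC; i++) @{
+        if (ARGV[i] == "skip-me")
+            ARGV[i] = ""        # awk ignores null file names
+        else if (ARGV[i] == "drop-me")
+            delete ARGV[i]      # remove the element entirely
+    @}
+@}
+@end example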
+
+All of these actions are typically done in the @code{BEGIN} rule,
+before actual processing of the input begins.
+@xref{Split Program}, and see
+@ref{Tee Program}, for examples
+of each way of removing elements from @code{ARGV}.
+The following fragment processes @code{ARGV} in order to examine, and
+then remove, command-line options:
+@c NEXT ED: Add xref to rewind() function
+
+@example
+BEGIN @{
+    for (i = 1; i < ARGC; i++) @{
+        if (ARGV[i] == "-v")
+            verbose = 1
+        else if (ARGV[i] == "-d")
+            debug = 1
+        else if (ARGV[i] ~ /^-./) @{
+            e = sprintf("%s: unrecognized option -- %c",
+                    ARGV[0], substr(ARGV[i], 2, 1))
+            print e > "/dev/stderr"
+        @} else
+            break
+        delete ARGV[i]
+    @}
+@}
+@end example
+
+To actually get the options into the @command{awk} program,
+end the @command{awk} options with @option{--} and then supply
+the @command{awk} program's options, in the following manner:
+
+@example
+awk -f myprog -- -v -d file1 file2 @dots{}
+@end example
+
+@cindex differences in @command{awk} and @command{gawk}, @code{ARGC}/@code{ARGV} variables
+This is not necessary in @command{gawk}. Unless @option{--posix} has
+been specified, @command{gawk} silently puts any unrecognized options
+into @code{ARGV} for the @command{awk} program to deal with.  As soon
+as it sees an unknown option, @command{gawk} stops looking for other
+options that it might otherwise recognize.  The previous example with
+@command{gawk} would be:
+
+@example
+gawk -f myprog -d -v file1 file2 @dots{}
+@end example
+
+@noindent
+Because @option{-d} is not a valid @command{gawk} option,
+it and the following @option{-v}
+are passed on to the @command{awk} program.
+
+@node Arrays
+@chapter Arrays in @command{awk}
+@c STARTOFRANGE arrs
+@cindex arrays
+
+An @dfn{array} is a table of values called @dfn{elements}.  The
+elements of an array are distinguished by their indices.  @dfn{Indices}
+may be either numbers or strings.
+
+This @value{CHAPTER} describes how arrays work in @command{awk},
+how to use array elements, how to scan through every element in an array,
+and how to remove array elements.
+It also describes how @command{awk} simulates multidimensional
+arrays, as well as some of the less obvious points about array usage.
+The @value{CHAPTER} finishes with a discussion of @command{gawk}'s facility
+for sorting an array based on its indices.
+
+@cindex variables, names of
+@cindex functions, names of
+@cindex arrays, names of
+@cindex names, arrays/variables
+@cindex namespace issues
+@command{awk} maintains a single set
+of names that may be used for naming variables, arrays, and functions
+(@pxref{User-defined}).
+Thus, you cannot have a variable and an array with the same name in the
+same @command{awk} program.
+
+@menu
+* Array Intro::                 Introduction to Arrays
+* Reference to Elements::       How to examine one element of an array.
+* Assigning Elements::          How to change an element of an array.
+* Array Example::               Basic Example of an Array
+* Scanning an Array::           A variation of the @code{for} statement. It
+                                loops through the indices of an array's
+                                existing elements.
+* Delete::                      The @code{delete} statement removes an element
+                                from an array.
+* Numeric Array Subscripts::    How to use numbers as subscripts in
+                                @command{awk}.
+* Uninitialized Subscripts::    Using uninitialized variables as subscripts.
+* Multi-dimensional::           Emulating multidimensional arrays in
+                                @command{awk}.
+* Multi-scanning::              Scanning multidimensional arrays.
+* Array Sorting::               Sorting array values and indices.
+@end menu
+
+@node Array Intro
+@section Introduction to Arrays
+
+The @command{awk} language provides one-dimensional arrays
+for storing groups of related strings or numbers.
+Every @command{awk} array must have a name.  Array names have the same
+syntax as variable names; any valid variable name would also be a valid
+array name.  But one name cannot be used in both ways (as an array and
+as a variable) in the same @command{awk} program.
+
+Arrays in @command{awk} superficially resemble arrays in other programming
+languages, but there are fundamental differences.  In @command{awk}, it
+isn't necessary to specify the size of an array before starting to use it.
+Additionally, any number or string in @command{awk}, not just consecutive integers,
+may be used as an array index.
+
+In most other languages, arrays must be @dfn{declared} before use,
+including a specification of
+how many elements or components they contain.  In such languages, the
+declaration causes a contiguous block of memory to be allocated for that
+many elements.  Usually, an index in the array must be a nonnegative integer.
+For example, the index zero specifies the first element in the array, which is
+actually stored at the beginning of the block of memory.  Index one
+specifies the second element, which is stored in memory right after the
+first element, and so on.  It is impossible to add more elements to the
+array, because it has room only for as many elements as given in
+the declaration.
+(Some languages allow arbitrary starting and ending
+indices---e.g., @samp{15 .. 27}---but the size of the array is still fixed when
+the array is declared.)
+
+A contiguous array of four elements might look like the following example,
+conceptually, if the element values are 8, @code{"foo"},
+@code{""}, and 30:
+
+@c NEXT ED: Use real images here
+@iftex
+@c from Karl Berry, much thanks for the help.
+@tex
+\bigskip % space above the table (about 1 linespace)
+\offinterlineskip
+\newdimen\width \width = 1.5cm
+\newdimen\hwidth \hwidth = 4\width \advance\hwidth by 2pt % 5 * 0.4pt
+\centerline{\vbox{
+\halign{\strut\hfil\ignorespaces#&&\vrule#&\hbox to\width{\hfil#\unskip\hfil}\cr
+\noalign{\hrule width\hwidth}
+	&&{\tt 8} &&{\tt "foo"} &&{\tt ""} &&{\tt 30} &&\quad Value\cr
+\noalign{\hrule width\hwidth}
+\noalign{\smallskip}
+	&\omit&0&\omit &1   &\omit&2 &\omit&3 &\omit&\quad Index\cr
+}
+}}
+@end tex
+@end iftex
+@ifinfo
+@example
++---------+---------+--------+---------+
+|    8    |  "foo"  |   ""   |    30   |    @r{Value}
++---------+---------+--------+---------+
+     0         1         2         3        @r{Index}
+@end example
+@end ifinfo
+@ifxml
+@example
++---------+---------+--------+---------+
+|    8    |  "foo"  |   ""   |    30   |    @r{Value}
++---------+---------+--------+---------+
+     0         1         2         3        @r{Index}
+@end example
+@end ifxml
+
+@noindent
+Only the values are stored; the indices are implicit from the order of
+the values. Here, 8 is the value at index zero, because 8 appears in the
+position with zero elements before it.
+
+@c STARTOFRANGE arrin
+@cindex arrays, indexing
+@c STARTOFRANGE inarr
+@cindex indexing arrays
+@cindex associative arrays
+@cindex arrays, associative
+Arrays in @command{awk} are different---they are @dfn{associative}.  This means
+that each array is a collection of pairs: an index and its corresponding
+array element value:
+
+@example
+@r{Element} 3     @r{Value} 30
+@r{Element} 1     @r{Value} "foo"
+@r{Element} 0     @r{Value} 8
+@r{Element} 2     @r{Value} ""
+@end example
+
+@noindent
+The pairs are shown in jumbled order because their order is irrelevant.
+
+One advantage of associative arrays is that new pairs can be added
+at any time.  For example, suppose an element with index 10 and value
+@w{@code{"number ten"}} is added to the array.  The result is:
+
+@example
+@r{Element} 10    @r{Value} "number ten"
+@r{Element} 3     @r{Value} 30
+@r{Element} 1     @r{Value} "foo"
+@r{Element} 0     @r{Value} 8
+@r{Element} 2     @r{Value} ""
+@end example
+
+@noindent
+@cindex sparse arrays
+@cindex arrays, sparse
+Now the array is @dfn{sparse}, which just means some indices are missing.
+It has elements 0--3 and 10, but doesn't have elements 4, 5, 6, 7, 8, or 9.
+
+Another consequence of associative arrays is that the indices don't
+have to be positive integers.  Any number, or even a string, can be
+an index.  For example, the following is an array that translates words from
+English to French:
+
+@example
+@r{Element} "dog" @r{Value} "chien"
+@r{Element} "cat" @r{Value} "chat"
+@r{Element} "one" @r{Value} "un"
+@r{Element} 1     @r{Value} "un"
+@end example
+
+@noindent
+Here we decided to translate the number one in both spelled-out and
+numeric form---thus illustrating that a single array can have both
+numbers and strings as indices.
+In fact, array subscripts are always strings; this is discussed
+in more detail in
+@ref{Numeric Array Subscripts}.
+Here, the number @code{1} isn't double-quoted, since @command{awk}
+automatically converts it to a string.
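+
+A short experiment, given here only as an illustrative sketch with the
+hypothetical array name @code{fr}, shows that the numeric subscript
+@code{1} and the string subscript @code{"1"} refer to the same element:
+
+@example
+$ awk 'BEGIN @{
+>     fr["dog"] = "chien"
+>     fr[1] = "un"                 # numeric subscript, stored as "1"
+>     print fr["dog"], fr["1"]     # string subscript finds the same element
+> @}'
+@print{} chien un
+@end example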
+
+@cindex case sensitivity, array indices and
+@cindex arrays, @code{IGNORECASE} variable and
+@cindex @code{IGNORECASE} variable, array subscripts and
+The value of @code{IGNORECASE} has no effect upon array subscripting.
+The identical string value used to store an array element must be used
+to retrieve it.
+When @command{awk} creates an array (e.g., with the @code{split}
+built-in function),
+that array's indices are consecutive integers starting at one.
+(@xref{String Functions}.)
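+
+For example, the following sketch (the array name @code{parts} and the
+sample string are arbitrary) prints the indices that @code{split}
+assigns:
+
+@example
+$ awk 'BEGIN @{
+>     n = split("a:b:c", parts, ":")
+>     for (i = 1; i <= n; i++)
+>         print i, parts[i]
+> @}'
+@print{} 1 a
+@print{} 2 b
+@print{} 3 c
+@end example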
+
+@command{awk}'s arrays are efficient---the time to access an element
+is independent of the number of elements in the array.
+@c ENDOFRANGE arrin
+@c ENDOFRANGE inarr
+
+@node Reference to Elements
+@section Referring to an Array Element
+@cindex arrays, elements, referencing
+@cindex elements in arrays
+
+The principal way to use an array is to refer to one of its elements.
+An array reference is an expression as follows:
+
+@example
+@var{array}[@var{index}]
+@end example
+
+@noindent
+Here, @var{array} is the name of an array.  The expression @var{index} is
+the index of the desired element of the array.
+
+The value of the array reference is the current value of that array
+element.  For example, @code{foo[4.3]} is an expression for the element
+of array @code{foo} at index @samp{4.3}.
+
+A reference to an array element that has no recorded value yields a value of
+@code{""}, the null string.  This includes elements
+that have not been assigned any value as well as elements that have been
+deleted (@pxref{Delete}).  Such a reference
+automatically creates that array element, with the null string as its value.
+(In some cases, this is unfortunate, because it might waste memory inside
+@command{awk}.)
+
+@c @cindex arrays, @code{in} operator and
+@cindex @code{in} operator, arrays and
+To determine whether an element exists in an array at a certain index, use
+the following expression:
+
+@example
+@var{index} in @var{array}
+@end example
+
+@cindex side effects, array indexing
+@noindent
+This expression tests whether the particular index exists,
+without the side effect of creating that element if it is not present.
+The expression has the value one (true) if @code{@var{array}[@var{index}]}
+exists and zero (false) if it does not exist.
+For example, this statement tests whether the array @code{frequencies}
+contains the index @samp{2}:
+
+@example
+if (2 in frequencies)
+    print "Subscript 2 is present."
+@end example
+
+Note that this is @emph{not} a test of whether the array
+@code{frequencies} contains an element whose @emph{value} is two.
+There is no way to do that except to scan all the elements.  Also, this
+@emph{does not} create @code{frequencies[2]}, while the following
+(incorrect) alternative does:
+
+@example
+if (frequencies[2] != "")
+    print "Subscript 2 is present."
+@end example
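+
+The difference can be observed directly.  In the following sketch, the
+array name @code{a} and the index @code{"x"} are arbitrary:
+
+@example
+$ awk 'BEGIN @{
+>     print (("x" in a) ? "present" : "absent")   # in does not create a["x"]
+>     if (a["x"] != "")                           # this reference creates it
+>         print "nonempty"
+>     print (("x" in a) ? "present" : "absent")
+> @}'
+@print{} absent
+@print{} present
+@end example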
+
+@node Assigning Elements
+@section Assigning Array Elements
+@cindex arrays, elements, assigning
+@cindex elements in arrays, assigning
+
+Array elements can be assigned values just like
+@command{awk} variables:
+
+@example
+@var{array}[@var{subscript}] = @var{value}
+@end example
+
+@noindent
+@var{array} is the name of an array.  The expression
+@var{subscript} is the index of the element of the array that is
+assigned a value.  The expression @var{value} is the value to
+assign to that element of the array.
+
+@node Array Example
+@section Basic Array Example
+
+The following program takes a list of lines, each beginning with a line
+number, and prints them out in order of line number.  The line numbers
+are not in order when they are first read---instead they
+are scrambled.  This program sorts the lines by making an array using
+the line numbers as subscripts.  The program then prints out the lines
+in sorted order of their numbers.  It is a very simple program and gets
+confused upon encountering repeated numbers, gaps, or lines that don't
+begin with a number:
+
+@example
+@c file eg/misc/arraymax.awk
+@{
+  if ($1 > max)
+    max = $1
+  arr[$1] = $0
+@}
+
+END @{
+  for (x = 1; x <= max; x++)
+    print arr[x]
+@}
+@c endfile
+@end example
+
+The first rule keeps track of the largest line number seen so far;
+it also stores each line into the array @code{arr}, at an index that
+is the line's number.
+The second rule runs after all the input has been read, to print out
+all the lines.
+When this program is run with the following input:
+
+@example
+@c file eg/misc/arraymax.data
+5  I am the Five man
+2  Who are you?  The new number two!
+4  . . . And four on the floor
+1  Who is number one?
+3  I three you.
+@c endfile
+@end example
+
+@noindent
+Its output is:
+
+@example
+1  Who is number one?
+2  Who are you?  The new number two!
+3  I three you.
+4  . . . And four on the floor
+5  I am the Five man
+@end example
+
+If a line number is repeated, the last line with a given number overrides
+the others.
+Gaps in the line numbers can be handled with an easy improvement to the
+program's @code{END} rule, as follows:
+
+@example
+END @{
+  for (x = 1; x <= max; x++)
+    if (x in arr)
+      print arr[x]
+@}
+@end example
+
+@node Scanning an Array
+@section Scanning All Elements of an Array
+@cindex elements in arrays, scanning
+@cindex arrays, scanning
+
+In programs that use arrays, it is often necessary to use a loop that
+executes once for each element of an array.  In other languages, where
+arrays are contiguous and indices are limited to positive integers,
+this is easy: all the valid indices can be found by counting from
+the lowest index up to the highest.  This technique won't do the job
+in @command{awk}, because any number or string can be an array index.
+So @command{awk} has a special kind of @code{for} statement for scanning
+an array:
+
+@example
+for (@var{var} in @var{array})
+  @var{body}
+@end example
+
+@noindent
+@cindex @code{in} operator, arrays and
+This loop executes @var{body} once for each index in @var{array} that the
+program has previously used, with the variable @var{var} set to that index.
+
+@cindex arrays, @code{for} statement and
+@cindex @code{for} statement, in arrays
+The following program uses this form of the @code{for} statement.  The
+first rule scans the input records and notes which words appear (at
+least once) in the input, by storing a one into the array @code{used} with
+the word as index.  The second rule scans the elements of @code{used} to
+find all the distinct words that appear in the input.  It prints each
+word that is more than 10 characters long and also prints the number of
+such words.
+@xref{String Functions},
+for more information on the built-in function @code{length}.
+
+@example
+# Record a 1 for each word that is used at least once
+@{
+    for (i = 1; i <= NF; i++)
+        used[$i] = 1
+@}
+
+# Find number of distinct words more than 10 characters long
+END @{
+    for (x in used)
+        if (length(x) > 10) @{
+            ++num_long_words
+            print x
+        @}
+    print num_long_words, "words longer than 10 characters"
+@}
+@end example
+
+@noindent
+@xref{Word Sorting},
+for a more detailed example of this type.
+
+@cindex arrays, elements, order of
+@cindex elements in arrays, order of
+The order in which elements of the array are accessed by this statement
+is determined by the internal arrangement of the array elements within
+@command{awk} and cannot be controlled or changed.  This can lead to
+problems if new elements are added to @var{array} by statements in
+the loop body; it is not predictable whether the @code{for} loop will
+reach them.  Similarly, changing @var{var} inside the loop may produce
+strange results.  It is best to avoid such things.
+
+@node Delete
+@section The @code{delete} Statement
+@cindex @code{delete} statement
+@cindex deleting elements in arrays
+@cindex arrays, elements, deleting
+@cindex elements in arrays, deleting
+
+To remove an individual element of an array, use the @code{delete}
+statement:
+
+@example
+delete @var{array}[@var{index}]
+@end example
+
+Once an array element has been deleted, any value the element once
+had is no longer available.  It is as if the element had never
+been referred to or been given a value.
+The following is an example of deleting elements in an array:
+
+@example
+for (i in frequencies)
+  delete frequencies[i]
+@end example
+
+@noindent
+This example removes all the elements from the array @code{frequencies}.
+Once an element is deleted, a subsequent @code{for} statement to scan the array
+does not report that element and the @code{in} operator to check for
+the presence of that element returns zero (i.e., false):
+
+@example
+delete foo[4]
+if (4 in foo)
+    print "This will never be printed"
+@end example
+
+@cindex null strings, array elements and
+It is important to note that deleting an element is @emph{not} the
+same as assigning it a null value (the empty string, @code{""}).
+For example:
+
+@example
+foo[4] = ""
+if (4 in foo)
+  print "This is printed, even though foo[4] is empty"
+@end example
+
+@cindex lint checking, array elements
+It is not an error to delete an element that does not exist.
+If @option{--lint} is provided on the command line
+(@pxref{Options}),
+@command{gawk} issues a warning message when an element that
+is not in the array is deleted.
+
+@cindex arrays, deleting entire contents
+@cindex deleting entire arrays
+@cindex differences in @command{awk} and @command{gawk}, array elements, deleting
+All the elements of an array may be deleted with a single statement
+by leaving off the subscript in the @code{delete} statement,
+as follows:
+
+@example
+delete @var{array}
+@end example
+
+This ability is a @command{gawk} extension; it is not available in
+compatibility mode (@pxref{Options}).
+
+Using this version of the @code{delete} statement is about three times
+more efficient than the equivalent loop that deletes each element one
+at a time.
+
+@cindex portability, deleting array elements
+@cindex Brennan, Michael
+The following statement provides a portable but nonobvious way to clear
+out an array:@footnote{Thanks to Michael Brennan for pointing this out.}
+
+@example
+split("", array)
+@end example
+
+@c comma before deleting does NOT start a tertiary
+@cindex @code{split} function, array elements, deleting
+The @code{split} function
+(@pxref{String Functions})
+clears out the target array first. This call asks it to split
+apart the null string. Because there is no data to split out, the
+function simply clears the array and then returns.
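+
+A minimal sketch of this idiom follows; the array name @code{a} and its
+contents are arbitrary, chosen only for illustration.  It counts the
+elements before and after the call to @code{split}:
+
+@example
+$ awk 'BEGIN @{
+>     a[1] = "x"; a["y"] = 2
+>     n = 0; for (i in a) n++
+>     print n
+>     split("", a)
+>     n = 0; for (i in a) n++
+>     print n
+> @}'
+@print{} 2
+@print{} 0
+@end example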
+
+@strong{Caution:} Deleting an array does not change its type; you cannot
+delete an array and then use the array's name as a scalar
+(i.e., a regular variable). For example, the following does not work:
+
+@example
+a[1] = 3; delete a; a = 3
+@end example
+
+@node Numeric Array Subscripts
+@section Using Numbers to Subscript Arrays
+
+@cindex numbers, as array subscripts
+@cindex arrays, subscripts
+@cindex subscripts in arrays, numbers as
+@cindex @code{CONVFMT} variable, array subscripts and
+An important aspect of arrays to remember is that @emph{array subscripts
+are always strings}.  When a numeric value is used as a subscript,
+it is converted to a string value before being used for subscripting
+(@pxref{Conversion}).
+This means that the value of the built-in variable @code{CONVFMT} can
+affect how your program accesses elements of an array.  For example:
+
+@example
+xyz = 12.153
+data[xyz] = 1
+CONVFMT = "%2.2f"
+if (xyz in data)
+    printf "%s is in data\n", xyz
+else
+    printf "%s is not in data\n", xyz
+@end example
+
+@noindent
+This prints @samp{12.15 is not in data}.  The first statement gives
+@code{xyz} a numeric value.  Assigning to
+@code{data[xyz]} subscripts @code{data} with the string value @code{"12.153"}
+(using the default conversion value of @code{CONVFMT}, @code{"%.6g"}).
+Thus, the array element @code{data["12.153"]} is assigned the value one.
+The program then changes
+the value of @code{CONVFMT}.  The test @samp{(xyz in data)} generates a new
+string value from @code{xyz}---this time @code{"12.15"}---because the value of
+@code{CONVFMT} allows only two digits after the decimal point.  This test fails,
+since @code{"12.15"} is a different string from @code{"12.153"}.
+
+@cindex converting, during subscripting
+According to the rules for conversions
+(@pxref{Conversion}), integer
+values are always converted to strings as integers, no matter what the
+value of @code{CONVFMT} may happen to be.  So the usual case of
+the following works:
+
+@example
+for (i = 1; i <= maxsub; i++)
+    @i{do something with} array[i]
+@end example
+
+The ``integer values always convert to strings as integers'' rule
+has an additional consequence for array indexing.
+Octal and hexadecimal constants
+(@pxref{Nondecimal-numbers})
+are converted internally into numbers, and their original form
+is forgotten.
+This means, for example, that
+@code{array[17]},
+@code{array[021]},
+and
+@code{array[0x11]}
+all refer to the same element!
+
+As with many things in @command{awk}, the majority of the time
+things work as one would expect them to.  But it is useful to have precise
+knowledge of the actual rules, which can sometimes have a subtle
+effect on your programs.
+
+@node Uninitialized Subscripts
+@section Using Uninitialized Variables as Subscripts
+
+@c last comma does NOT start a tertiary
+@cindex variables, uninitialized, as array subscripts
+@cindex uninitialized variables, as array subscripts
+@cindex subscripts in arrays, uninitialized variables as
+@cindex arrays, subscripts, uninitialized variables as
+Suppose it's necessary to write a program
+to print the input data in reverse order.
+A reasonable attempt to do so (with some test
+data) might look like this:
+
+@example
+$ echo 'line 1
+> line 2
+> line 3' | awk '@{ l[lines] = $0; ++lines @}
+> END @{
+>     for (i = lines-1; i >= 0; --i)
+>        print l[i]
+> @}'
+@print{} line 3
+@print{} line 2
+@end example
+
+Unfortunately, the very first line of input data did not come out in the
+output!
+
+At first glance, this program should have worked.  The variable @code{lines}
+is uninitialized, and uninitialized variables have the numeric value zero.
+So, @command{awk} should have printed the value of @code{l[0]}.
+
+The issue here is that subscripts for @command{awk} arrays are @emph{always}
+strings. Uninitialized variables, when used as strings, have the
+value @code{""}, not zero.  Thus, @samp{line 1} ends up stored in
+@code{l[""]}.
+The following version of the program works correctly:
+
+@example
+@{ l[lines++] = $0 @}
+END @{
+    for (i = lines - 1; i >= 0; --i)
+       print l[i]
+@}
+@end example
+
+Here, the @samp{++} forces @code{lines} to be numeric, thus making
+the ``old value'' numeric zero. This is then converted to @code{"0"}
+as the array subscript.
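+
+To see where the first line went in the original, incorrect version,
+one can test for the null-string subscript directly.  This is only a
+sketch for checking the explanation above, using a single line of
+test input:
+
+@example
+$ echo 'line 1' | awk '@{ l[lines] = $0; ++lines @}
+> END @{ if ("" in l) print "stored under the null string:", l[""] @}'
+@print{} stored under the null string: line 1
+@end example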
+
+@cindex null strings, as array subscripts
+@cindex dark corner, array subscripts
+@cindex lint checking, array subscripts
+Even though it is somewhat unusual, the null string
+(@code{""}) is a valid array subscript.
+@value{DARKCORNER}
+@command{gawk} warns about the use of the null string as a subscript
+if @option{--lint} is provided
+on the command line (@pxref{Options}).
+
+@node Multi-dimensional
+@section Multidimensional Arrays
+
+@cindex subscripts in arrays, multidimensional
+@cindex arrays, multidimensional
+A multidimensional array is an array in which an element is identified
+by a sequence of indices instead of a single index.  For example, a
+two-dimensional array requires two indices.  The usual way (in most
+languages, including @command{awk}) to refer to an element of a
+two-dimensional array named @code{grid} is with
+@code{grid[@var{x},@var{y}]}.
+
+@cindex @code{SUBSEP} variable, multidimensional arrays
+Multidimensional arrays are supported in @command{awk} through
+concatenation of indices into one string.
+@command{awk} converts the indices into strings
+(@pxref{Conversion}) and
+concatenates them together, with a separator between them.  This creates
+a single string that describes the values of the separate indices.  The
+combined string is used as a single index into an ordinary,
+one-dimensional array.  The separator used is the value of the built-in
+variable @code{SUBSEP}.
+
+For example, suppose we evaluate the expression @samp{foo[5,12] = "value"}
+when the value of @code{SUBSEP} is @code{"@@"}.  The numbers 5 and 12 are
+converted to strings and
+concatenated with an @samp{@@} between them, yielding @code{"5@@12"}; thus,
+the array element @code{foo["5@@12"]} is set to @code{"value"}.
+
+Once the element's value is stored, @command{awk} has no record of whether
+it was stored with a single index or a sequence of indices.  The two
+expressions @samp{foo[5,12]} and @w{@samp{foo[5 SUBSEP 12]}} are always
+equivalent.
+
+The default value of @code{SUBSEP} is the string @code{"\034"},
+which contains a nonprinting character that is unlikely to appear in an
+@command{awk} program or in most input data.
+The usefulness of choosing an unlikely character comes from the fact
+that index values that contain a string matching @code{SUBSEP} can lead to
+combined strings that are ambiguous.  Suppose that @code{SUBSEP} is
+@code{"@@"}; then @w{@samp{foo["a@@b", "c"]}} and @w{@samp{foo["a",
+"b@@c"]}} are indistinguishable because both are actually
+stored as @samp{foo["a@@b@@c"]}.
+
+To test whether a particular index sequence exists in a
+multidimensional array, use the same operator (@samp{in}) that is
+used for single-dimensional arrays.  Write the whole sequence of indices
+in parentheses, separated by commas, as the left operand:
+
+@example
+(@var{subscript1}, @var{subscript2}, @dots{}) in @var{array}
+@end example
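+
+As a brief sketch (reusing the element @code{foo[5,12]} from the earlier
+discussion), the following suggests that the parenthesized index sequence
+and the explicitly concatenated string find the same element:
+
+@example
+$ awk 'BEGIN @{
+>     foo[5,12] = "value"
+>     if ((5,12) in foo) print "found by index sequence"
+>     if ((5 SUBSEP 12) in foo) print "found by combined string"
+> @}'
+@print{} found by index sequence
+@print{} found by combined string
+@end example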
+
+The following example treats its input as a two-dimensional array of
+fields; it rotates this array 90 degrees clockwise and prints the
+result.  It assumes that all lines have the same number of
+elements:
+
+@example
+@{
+     if (max_nf < NF)
+          max_nf = NF
+     max_nr = NR
+     for (x = 1; x <= NF; x++)
+          vector[x, NR] = $x
+@}
+
+END @{
+     for (x = 1; x <= max_nf; x++) @{
+          for (y = max_nr; y >= 1; --y)
+               printf("%s ", vector[x, y])
+          printf("\n")
+     @}
+@}
+@end example
+
+@noindent
+When given the input:
+
+@example
+1 2 3 4 5 6
+2 3 4 5 6 1
+3 4 5 6 1 2
+4 5 6 1 2 3
+@end example
+
+@noindent
+the program produces the following output:
+
+@example
+4 3 2 1
+5 4 3 2
+6 5 4 3
+1 6 5 4
+2 1 6 5
+3 2 1 6
+@end example
+
+@node Multi-scanning
+@section Scanning Multidimensional Arrays
+
+There is no special @code{for} statement for scanning a
+``multidimensional'' array. There cannot be one, because, in truth, there
+are no multidimensional arrays or elements---there is only a
+multidimensional @emph{way of accessing} an array.
+
+@cindex subscripts in arrays, multidimensional, scanning
+@cindex arrays, multidimensional, scanning
+However, if your program has an array that is always accessed as
+multidimensional, you can get the effect of scanning it by combining
+the scanning @code{for} statement
+(@pxref{Scanning an Array}) with the
+built-in @code{split} function
+(@pxref{String Functions}).
+It works in the following manner:
+
+@example
+for (combined in array) @{
+    split(combined, separate, SUBSEP)
+    @dots{}
+@}
+@end example
+
+@noindent
+This sets the variable @code{combined} to
+each concatenated combined index in the array, and splits it
+into the individual indices by breaking it apart where the value of
+@code{SUBSEP} appears.  The individual indices then become the elements of
+the array @code{separate}.
+
+Thus, if a value is previously stored in @code{array[1, "foo"]}, then
+an element with index @code{"1\034foo"} exists in @code{array}.  (Recall
+that the default value of @code{SUBSEP} is the character with code 034.)
+Sooner or later, the @code{for} statement finds that index and does an
+iteration with the variable @code{combined} set to @code{"1\034foo"}.
+Then the @code{split} function is called as follows:
+
+@example
+split("1\034foo", separate, "\034")
+@end example
+
+@noindent
+The result is to set @code{separate[1]} to @code{"1"} and
+@code{separate[2]} to @code{"foo"}.  Presto! The original sequence of
+separate indices is recovered.
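+
+Putting these steps together, a complete sketch might look like the
+following.  The stored value @code{"x"} and the variable @code{n} are
+arbitrary and serve only to make the example self-contained:
+
+@example
+$ awk 'BEGIN @{
+>     array[1, "foo"] = "x"
+>     for (combined in array) @{
+>         n = split(combined, separate, SUBSEP)
+>         print n, separate[1], separate[2]
+>     @}
+> @}'
+@print{} 2 1 foo
+@end example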
+
+@node Array Sorting
+@section Sorting Array Values and Indices with @command{gawk}
+
+@cindex arrays, sorting
+@cindex @code{asort} function (@command{gawk})
+@c last comma does NOT start a tertiary
+@cindex @code{asort} function (@command{gawk}), arrays, sorting
+@cindex sort function, arrays, sorting
+The order in which an array is scanned with a @samp{for (i in array)}
+loop is essentially arbitrary.
+In most @command{awk} implementations, sorting an array requires
+writing a @code{sort} function.
+While this can be educational for exploring different sorting algorithms,
+usually that's not the point of the program.
+@command{gawk} provides the built-in @code{asort}
+and @code{asorti} functions
+(@pxref{String Functions})
+for sorting arrays.  For example:
+
+@example
+@var{populate the array} data
+n = asort(data)
+for (i = 1; i <= n; i++)
+    @var{do something with} data[i]
+@end example
+
+After the call to @code{asort}, the array @code{data} is indexed from 1
+to some number @var{n}, the total number of elements in @code{data}.
+(This count is @code{asort}'s return value.)
+@code{data[1]} @value{LEQ} @code{data[2]} @value{LEQ} @code{data[3]}, and so on.
+The comparison of array elements is done
+using @command{gawk}'s usual comparison rules
+(@pxref{Typing and Comparison}).
+
+@cindex side effects, @code{asort} function
+An important side effect of calling @code{asort} is that
+@emph{the array's original indices are irrevocably lost}.
+As this isn't always desirable, @code{asort} accepts a
+second argument:
+
+@example
+@var{populate the array} source
+n = asort(source, dest)
+for (i = 1; i <= n; i++)
+    @var{do something with} dest[i]
+@end example
+
+In this case, @command{gawk} copies the @code{source} array into the
+@code{dest} array and then sorts @code{dest}, destroying its indices.
+However, the @code{source} array is not affected.
+
+Often, what's needed is to sort on the values of the @emph{indices}
+instead of the values of the elements.
+To do that, starting with @command{gawk} 3.1.2, use the
+@code{asorti} function.  The interface is identical to that of
+@code{asort}, except that the index values are used for sorting, and
+become the values of the result array:
+
+@example
+@{ source[$0] = some_func($0) @}
+
+END @{
+    n = asorti(source, dest)
+    for (i = 1; i <= n; i++)
+        @var{do something with} dest[i]
+@}
+@end example
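+
+As a concrete sketch, assuming @command{gawk} 3.1.2 or later is
+installed under the name @command{gawk} (the array contents here are
+invented for illustration), the sorted indices in @code{dest} can be
+used to walk @code{source} in index order:
+
+@example
+$ gawk 'BEGIN @{
+>     source["pear"] = 1; source["apple"] = 2
+>     n = asorti(source, dest)
+>     for (i = 1; i <= n; i++)
+>         print i, dest[i], source[dest[i]]
+> @}'
+@print{} 1 apple 2
+@print{} 2 pear 1
+@end example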
+
+If your version of @command{gawk} is 3.1.0 or 3.1.1, you don't
+have @code{asorti}. Instead, use a helper array
+to hold the sorted index values, and then access the original array's
+elements.  It works in the following way:
+
+@example
+@var{populate the array} data
+# copy indices
+j = 1
+for (i in data) @{
+    ind[j] = i    # index value becomes element value
+    j++
+@}
+n = asort(ind)    # index values are now sorted
+for (i = 1; i <= n; i++)
+    @var{do something with} data[ind[i]]
+@end example
+
+Sorting the array by replacing the indices provides maximal flexibility.
+To traverse the elements in decreasing order, use a loop that goes from
+@var{n} down to 1, either over the elements or over the indices.
+
+@cindex reference counting, sorting arrays
+Copying array indices and elements isn't expensive in terms of memory.
+Internally, @command{gawk} maintains @dfn{reference counts} to data.
+For example, when @code{asort} copies the first array to the second one,
+there is only one copy of the original array elements' data, even though
+both arrays use the values.  Similarly, when copying the indices from
+@code{data} to @code{ind}, there is only one copy of the actual index
+strings.
+
+@c Document It And Call It A Feature. Sigh.
+@cindex arrays, sorting, @code{IGNORECASE} variable and
+@cindex @code{IGNORECASE} variable, array sorting and
+We said previously that comparisons are done using @command{gawk}'s
+``usual comparison rules.''  Because @code{IGNORECASE} affects
+string comparisons, the value of @code{IGNORECASE} also
+affects sorting for both @code{asort} and @code{asorti}.
+Caveat Emptor.
+@c ENDOFRANGE arrs
+
+@node Functions
+@chapter Functions
+
+@c STARTOFRANGE funcbi
+@cindex functions, built-in
+@c STARTOFRANGE bifunc
+@cindex built-in functions
+This @value{CHAPTER} describes @command{awk}'s built-in functions,
+which fall into three categories: numeric, string, and I/O.
+@command{gawk} provides additional groups of functions
+to work with values that represent time, do
+bit manipulation, and internationalize and localize programs.
+
+Besides the built-in functions, @command{awk} has provisions for
+writing new functions that the rest of a program can use.
+The second half of this @value{CHAPTER} describes these
+@dfn{user-defined} functions.
+
+@menu
+* Built-in::                    Summarizes the built-in functions.
+* User-defined::                Describes User-defined functions in detail.
+@end menu
+
+@node Built-in
+@section Built-in Functions
+
+@c 2e: USE TEXINFO-2 FUNCTION DEFINITION STUFF!!!!!!!!!!!!!
+@dfn{Built-in} functions are always available for
+your @command{awk} program to call.  This @value{SECTION} defines all
+the built-in
+functions in @command{awk}; some of these are mentioned in other sections
+but are summarized here for your convenience.
+
+@menu
+* Calling Built-in::            How to call built-in functions.
+* Numeric Functions::           Functions that work with numbers, including
+                                @code{int}, @code{sin} and @code{rand}.
+* String Functions::            Functions for string manipulation, such as
+                                @code{split}, @code{match} and @code{sprintf}.
+* I/O Functions::               Functions for files and shell commands.
+* Time Functions::              Functions for dealing with timestamps.
+* Bitwise Functions::           Functions for bitwise operations.
+* I18N Functions::              Functions for string translation.
+@end menu
+
+@node Calling Built-in
+@subsection Calling Built-in Functions
+
+To call one of @command{awk}'s built-in functions, write the name of
+the function followed
+by arguments in parentheses.  For example, @samp{atan2(y + z, 1)}
+is a call to the function @code{atan2} and has two arguments.
+
+@cindex programming conventions, functions, calling
+@c last comma does NOT start a tertiary
+@cindex whitespace, functions, calling
+Whitespace is ignored between the built-in function name and the
+open parenthesis, but it is nonetheless good practice to avoid using whitespace
+there.  User-defined functions do not permit whitespace in this way, and
+it is easier to avoid mistakes by following a simple
+convention that always works---no whitespace after a function name.
+
+@c last comma is part of tertiary
+@cindex troubleshooting, @command{gawk}, fatal errors, function arguments
+@cindex @command{gawk}, function arguments and
+@cindex differences in @command{awk} and @command{gawk}, function arguments (@command{gawk})
+Each built-in function accepts a certain number of arguments.
+In some cases, arguments can be omitted. The defaults for omitted
+arguments vary from function to function and are described under the
+individual functions.  In some @command{awk} implementations, extra
+arguments given to built-in functions are ignored.  However, in @command{gawk},
+it is a fatal error to give extra arguments to a built-in function.
+
+When a function is called, expressions that create the function's actual
+parameters are evaluated completely before the call is performed.
+For example, in the following code fragment:
+
+@example
+i = 4
+j = sqrt(i++)
+@end example
+
+@cindex evaluation order, functions
+@cindex functions, built-in, evaluation order
+@cindex built-in functions, evaluation order
+@noindent
+the variable @code{i} is incremented to the value five before @code{sqrt}
+is called with a value of four for its actual parameter.
+The order of evaluation of the expressions used for the function's
+parameters is undefined.  Thus, avoid writing programs that
+assume that parameters are evaluated from left to right or from
+right to left.  For example:
+
+@example
+i = 5
+j = atan2(++i, i *= 2)
+@end example
+
+If the order of evaluation is left to right, then @code{i} first becomes
+6, and then 12, and @code{atan2} is called with the two arguments 6
+and 12.  But if the order of evaluation is right to left, @code{i}
+first becomes 10, then 11, and @code{atan2} is called with the
+two arguments 11 and 10.
+
+@node Numeric Functions
+@subsection Numeric Functions
+
+The following list describes all of
+the built-in functions that work with numbers.
+Optional parameters are enclosed in square brackets@w{ ([ ]):}
+
+@table @code
+@item int(@var{x})
+@cindex @code{int} function
+This returns the nearest integer to @var{x}, located between @var{x} and zero and
+truncated toward zero.
+
+For example, @code{int(3)} is 3, @code{int(3.9)} is 3, @code{int(-3.9)}
+is @minus{}3, and @code{int(-3)} is @minus{}3 as well.
+
+@item sqrt(@var{x})
+@cindex @code{sqrt} function
+This returns the positive square root of @var{x}.
+@command{gawk} reports an error
+if @var{x} is negative.  Thus, @code{sqrt(4)} is 2.
+
+@item exp(@var{x})
+@cindex @code{exp} function
+This returns the exponential of @var{x} (@code{e ^ @var{x}}) or reports
+an error if @var{x} is out of range.  The range of values @var{x} can have
+depends on your machine's floating-point representation.
+
+@item log(@var{x})
+@cindex @code{log} function
+This returns the natural logarithm of @var{x}, if @var{x} is positive;
+otherwise, it reports an error.
+
+@item sin(@var{x})
+@cindex @code{sin} function
+This returns the sine of @var{x}, with @var{x} in radians.
+
+@item cos(@var{x})
+@cindex @code{cos} function
+This returns the cosine of @var{x}, with @var{x} in radians.
+
+@item atan2(@var{y}, @var{x})
+@cindex @code{atan2} function
+This returns the arctangent of @code{@var{y} / @var{x}} in radians.
+
+@item rand()
+@cindex @code{rand} function
+@cindex random numbers, @code{rand}/@code{srand} functions
+This returns a random number.  The values of @code{rand} are
+uniformly distributed between zero and one.
+The value could be zero but is never one.@footnote{The C version of @code{rand}
+is known to produce fairly poor sequences of random numbers.
+However, nothing requires that an @command{awk} implementation use the C
+@code{rand} to implement the @command{awk} version of @code{rand}.
+In fact, @command{gawk} uses the BSD @code{random} function, which is
+considerably better than @code{rand}, to produce random numbers.}
+
+Often random integers are needed instead.  Following is a user-defined function
+that can be used to obtain a random non-negative integer less than @var{n}:
+
+@example
+function randint(n) @{
+     return int(n * rand())
+@}
+@end example
+
+@noindent
+The multiplication produces a random number greater than or equal to
+zero and less than @code{n}.  Using @code{int}, this result is made into
+an integer between zero and @code{n} @minus{} 1, inclusive.
+
+The following example uses a similar function to produce random integers
+between one and @var{n}.  This program prints a new random number for
+each input record:
+
+@example
+# Function to roll a simulated die.
+function roll(n) @{ return 1 + int(rand() * n) @}
+
+# Roll 3 six-sided dice and
+# print total number of points.
+@{
+      printf("%d points\n",
+             roll(6)+roll(6)+roll(6))
+@}
+@end example
+
+@cindex numbers, random
+@cindex random numbers, seed of
+@c MAWK uses a different seed each time.
+@strong{Caution:} In most @command{awk} implementations, including @command{gawk},
+@code{rand} starts generating numbers from the same
+starting number, or @dfn{seed}, each time you run @command{awk}.  Thus,
+a program generates the same results each time you run it.
+The numbers are random within one @command{awk} run but predictable
+from run to run.  This is convenient for debugging, but if you want
+a program to do different things each time it is used, you must change
+the seed to a value that is different in each run.  To do this,
+use @code{srand}.
+
+@item srand(@r{[}@var{x}@r{]})
+@cindex @code{srand} function
+The function @code{srand} sets the starting point, or seed,
+for generating random numbers to the value @var{x}.
+
+Each seed value leads to a particular sequence of random
+numbers.@footnote{Computer-generated random numbers really are not truly
+random.  They are technically known as ``pseudorandom.''  This means
+that while the numbers in a sequence appear to be random, you can in
+fact generate the same sequence of random numbers over and over again.}
+Thus, if the seed is set to the same value a second time,
+the same sequence of random numbers is produced again.
+
+Different @command{awk} implementations use different random-number
+generators internally.  Don't expect the same @command{awk} program
+to produce the same series of random numbers when executed by
+different versions of @command{awk}.
+
+If the argument @var{x} is omitted, as in @samp{srand()}, then the current
+date and time of day are used for a seed.  This is the way to get random
+numbers that are unpredictable from run to run.
+
+The return value of @code{srand} is the previous seed.  This makes it
+easy to keep track of the seeds in case you need to consistently reproduce
+sequences of random numbers.
+@end table
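+
+The behavior of @code{srand} described above can be checked with a short
+sketch such as the following.  The seed values 42 and 7 are arbitrary;
+the actual random numbers are not printed, because they differ among
+@command{awk} implementations:
+
+@example
+$ awk 'BEGIN @{
+>     srand(42)
+>     prev = srand(7)
+>     print prev
+>     srand(7); a = rand()
+>     srand(7); b = rand()
+>     print (a == b)
+> @}'
+@print{} 42
+@print{} 1
+@end example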
+
+@node String Functions
+@subsection String-Manipulation Functions
+
+The functions in this @value{SECTION} look at or change the text of one or more
+strings.
+Optional parameters are enclosed in square brackets@w{ ([ ]).}
+Those functions that are
+specific to @command{gawk} are marked with a pound sign@w{ (@samp{#}):}
+
+@menu
+* Gory Details::                More than you want to know about @samp{\} and
+                                @samp{&} with @code{sub}, @code{gsub}, and
+                                @code{gensub}.
+@end menu
+
+@table @code
+@item asort(@var{source} @r{[}, @var{dest}@r{]}) #
+@cindex arrays, elements, retrieving number of
+@cindex @code{asort} function (@command{gawk})
+@code{asort} is a @command{gawk}-specific extension, returning the number of
+elements in the array @var{source}.  The contents of @var{source} are
+sorted using @command{gawk}'s normal rules for comparing values
+(in particular, @code{IGNORECASE} affects the sorting)
+and the indices
+of the sorted values of @var{source} are replaced with sequential
+integers starting with one. If the optional array @var{dest} is specified,
+then @var{source} is duplicated into @var{dest}.  @var{dest} is then
+sorted, leaving the indices of @var{source} unchanged.
+For example, if the contents of @code{a} are as follows:
+
+@example
+a["last"] = "de"
+a["first"] = "sac"
+a["middle"] = "cul"
+@end example
+
+@noindent
+A call to @code{asort}:
+
+@example
+asort(a)
+@end example
+
+@noindent
+results in the following contents of @code{a}:
+
+@example
+a[1] = "cul"
+a[2] = "de"
+a[3] = "sac"
+@end example
+
+The @code{asort} function is described in more detail in
+@ref{Array Sorting}.
+@code{asort} is a @command{gawk} extension; it is not available
+in compatibility mode (@pxref{Options}).
+
+@item asorti(@var{source} @r{[}, @var{dest}@r{]}) #
+@cindex @code{asorti} function (@command{gawk})
+@code{asorti} is a @command{gawk}-specific extension, returning the number of
+elements in the array @var{source}.
+It works similarly to @code{asort}; however, the @emph{indices}
+are sorted, instead of the values.  As array indices are always strings,
+the comparison performed is always a string comparison.  (Here too,
+@code{IGNORECASE} affects the sorting.)
+
+The @code{asorti} function is described in more detail in
+@ref{Array Sorting}.
+It was added in @command{gawk} 3.1.2.
+@code{asorti} is a @command{gawk} extension; it is not available
+in compatibility mode (@pxref{Options}).
+
+@item index(@var{in}, @var{find})
+@cindex @code{index} function
+@cindex searching
+This searches the string @var{in} for the first occurrence of the string
+@var{find}, and returns the position in characters where that occurrence
+begins in the string @var{in}.  Consider the following example:
+
+@example
+$ awk 'BEGIN @{ print index("peanut", "an") @}'
+@print{} 3
+@end example
+
+@noindent
+If @var{find} is not found, @code{index} returns zero.
+(Remember that string indices in @command{awk} start at one.)
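+
+The two outcomes described above (a found position and the zero return)
+can be checked side by side.  This is a minimal sketch, assuming a POSIX
+shell and a POSIX-conformant @command{awk} on the search path; the shell
+variable names @code{found} and @code{missing} are illustrative only:
+

```shell
# Position of "an" in "peanut" (counting from one).
found=$(awk 'BEGIN { print index("peanut", "an") }')
# A string that does not occur yields zero.
missing=$(awk 'BEGIN { print index("peanut", "xyz") }')
echo "found=$found missing=$missing"
```

+
+Because zero is never a valid string position, the return value can be
+used directly as a truth value in a condition.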
+
+@item length(@r{[}@var{string}@r{]})
+@cindex @code{length} function
+This returns the number of characters in @var{string}.  If
+@var{string} is a number, the length of the digit string representing
+that number is returned.  For example, @code{length("abcde")} is 5.  By
+contrast, @code{length(15 * 35)} works out to 3. In this example, 15 * 35 =
+525, and 525 is then converted to the string @code{"525"}, which has
+three characters.
+
+If no argument is supplied, @code{length} returns the length of @code{$0}.
+
+@c @cindex historical features
+@cindex portability, @code{length} function
+@cindex POSIX @command{awk}, functions and, @code{length}
+@strong{Note:}
+In older versions of @command{awk}, the @code{length} function could
+be called
+without any parentheses.  Doing so is marked as ``deprecated'' in the
+POSIX standard.  This means that while a program can do this,
+it is a feature that can eventually be removed from a future
+version of the standard.  Therefore, for programs to be maximally portable,
+always supply the parentheses.
+
+@item match(@var{string}, @var{regexp} @r{[}, @var{array}@r{]})
+@cindex @code{match} function
+The @code{match} function searches @var{string} for the
+longest, leftmost substring matched by the regular expression,
+@var{regexp}.  It returns the character position, or @dfn{index},
+at which that substring begins (one, if it starts at the beginning of
+@var{string}).  If no match is found, it returns zero.
+
+The @var{regexp} argument may be either a regexp constant
+(@samp{/@dots{}/}) or a string constant (@code{"@dots{}"}).
+In the latter case, the string is treated as a regexp to be matched.
+@xref{Computed Regexps}, for a
+discussion of the difference between the two forms, and the
+implications for writing your program correctly.
+
+The order of the first two arguments is backwards from most other string
+functions that work with regular expressions, such as
+@code{sub} and @code{gsub}.  It might help to remember that
+for @code{match}, the order is the same as for the @samp{~} operator:
+@samp{@var{string} ~ @var{regexp}}.
+
+@cindex @code{RSTART} variable, @code{match} function and
+@cindex @code{RLENGTH} variable, @code{match} function and
+@cindex @code{match} function, @code{RSTART}/@code{RLENGTH} variables
+The @code{match} function sets the built-in variable @code{RSTART} to
+the index.  It also sets the built-in variable @code{RLENGTH} to the
+length in characters of the matched substring.  If no match is found,
+@code{RSTART} is set to zero, and @code{RLENGTH} to @minus{}1.
+
+For example:
+
+@example
+@c file eg/misc/findpat.awk
+@{
+       if ($1 == "FIND")
+         regex = $2
+       else @{
+         where = match($0, regex)
+         if (where != 0)
+           print "Match of", regex, "found at",
+                     where, "in", $0
+       @}
+@}
+@c endfile
+@end example
+
+@noindent
+This program looks for lines that match the regular expression stored in
+the variable @code{regex}.  This regular expression can be changed.  If the
+first word on a line is @samp{FIND}, @code{regex} is changed to be the
+second word on that line.  Therefore, if given:
+
+@example
+@c file eg/misc/findpat.data
+FIND ru+n
+My program runs
+but not very quickly
+FIND Melvin
+JF+KM
+This line is property of Reality Engineering Co.
+Melvin was here.
+@c endfile
+@end example
+
+@noindent
+@command{awk} prints:
+
+@example
+Match of ru+n found at 12 in My program runs
+Match of Melvin found at 1 in Melvin was here.
+@end example
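+
+@code{RSTART} and @code{RLENGTH} are often combined with @code{substr}
+to pull out the matched text itself.  The following is a sketch only,
+assuming a POSIX shell and a POSIX-conformant @command{awk}; the shell
+variable names @code{hit} and @code{nomatch} are illustrative:
+

```shell
# On a match, RSTART and RLENGTH delimit the matched substring.
hit=$(echo "My program runs" | awk '{
    if (match($0, /ru+n/))
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
}')
# With no match, RSTART is 0 and RLENGTH is -1.
nomatch=$(echo "but not very quickly" | awk '{
    match($0, /ru+n/)
    print RSTART, RLENGTH
}')
echo "$hit"
echo "$nomatch"
```

+
+This idiom works in any POSIX @command{awk}, since it does not rely on
+the optional third argument to @code{match}.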
+
+@cindex differences in @command{awk} and @command{gawk}, @code{match} function
+If @var{array} is present, it is cleared, and then the 0th element
+of @var{array} is set to the entire portion of @var{string}
+matched by @var{regexp}.  If @var{regexp} contains parentheses,
+the integer-indexed elements of @var{array} are set to contain the
+portion of @var{string} matching the corresponding parenthesized
+subexpression.
+For example:
+
+@example
+$ echo foooobazbarrrrr |
+> gawk '@{ match($0, /(fo+).+(bar*)/, arr)
+>           print arr[1], arr[2] @}'
+@print{} foooo barrrrr
+@end example
+
+In addition,
+beginning with @command{gawk} 3.1.2,
+multidimensional subscripts are available providing
+the start index and length of each matched subexpression:
+
+@example
+$ echo foooobazbarrrrr |
+> gawk '@{ match($0, /(fo+).+(bar*)/, arr)
+>           print arr[1], arr[2]
+>           print arr[1, "start"], arr[1, "length"]
+>           print arr[2, "start"], arr[2, "length"]
+> @}'
+@print{} foooo barrrrr
+@print{} 1 5
+@print{} 9 7
+@end example
+
+There may not be subscripts for the start and length of every parenthesized
+subexpression, since they may not all have matched text; thus they
+should be tested for with the @code{in} operator
+(@pxref{Reference to Elements}).
+
+@cindex troubleshooting, @code{match} function
+The @var{array} argument to @code{match} is a
+@command{gawk} extension.  In compatibility mode
+(@pxref{Options}),
+using a third argument is a fatal error.
+
+@item split(@var{string}, @var{array} @r{[}, @var{fieldsep}@r{]})
+@cindex @code{split} function
+This function divides @var{string} into pieces separated by @var{fieldsep}
+and stores the pieces in @var{array}.  The first piece is stored in
+@code{@var{array}[1]}, the second piece in @code{@var{array}[2]}, and so
+forth.  The string value of the third argument, @var{fieldsep}, is
+a regexp describing where to split @var{string} (much as @code{FS} can
+be a regexp describing where to split input records).  If
+@var{fieldsep} is omitted, the value of @code{FS} is used.
+@code{split} returns the number of elements created.
+
+The @code{split} function splits strings into pieces in a
+manner similar to the way input lines are split into fields.  For example:
+
+@example
+split("cul-de-sac", a, "-")
+@end example
+
+@noindent
+@cindex strings, splitting
+splits the string @samp{cul-de-sac} into three fields using @samp{-} as the
+separator.  It sets the contents of the array @code{a} as follows:
+
+@example
+a[1] = "cul"
+a[2] = "de"
+a[3] = "sac"
+@end example
+
+@noindent
+The value returned by this call to @code{split} is three.
+
+@cindex differences in @command{awk} and @command{gawk}, @code{split} function
+As with input field-splitting, when the value of @var{fieldsep} is
+@w{@code{" "}}, leading and trailing whitespace is ignored, and the elements
+are separated by runs of whitespace.
+Also as with input field-splitting, if @var{fieldsep} is the null string, each
+individual character in the string is split into its own array element.
+(This is a @command{gawk}-specific extension.)
+
+Note, however, that @code{RS} has no effect on the way @code{split}
+works. Even though @samp{RS = ""} causes newline to also be an input
+field separator, this does not affect how @code{split} splits strings.
+
+@cindex dark corner, @code{split} function
+Modern implementations of @command{awk}, including @command{gawk}, allow
+the third argument to be a regexp constant (@code{/abc/}) as well as a
+string.
+@value{DARKCORNER}
+The POSIX standard allows this as well.
+@xref{Computed Regexps}, for a
+discussion of the difference between using a string constant or a regexp constant,
+and the implications for writing your program correctly.
+
+Before splitting the string, @code{split} deletes any previously existing
+elements in the array @var{array}.
+
+If @var{string} is null, the array has no elements. (So this is a portable
+way to delete an entire array with one statement.
+@xref{Delete}.)
+
+If @var{string} does not match @var{fieldsep} at all (but is not null),
+@var{array} has one element only. The value of that element is the original
+@var{string}.
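+
+The special cases described above can be seen together in one short
+program.  This is a hedged sketch, assuming a POSIX shell and a
+POSIX-conformant @command{awk}; the array names @code{a}, @code{b}, and
+@code{c} and the other variable names are illustrative only:
+

```shell
out=$(awk 'BEGIN {
    # A single space as separator: runs of whitespace separate the
    # elements, and leading and trailing whitespace is ignored.
    n1 = split("  cul   de sac ", a, " ")
    first = a[1]
    # The separator never matches: one element holding the whole string.
    n2 = split("cul-de-sac", b, ":")
    whole = b[1]
    # A null string: the array ends up with no elements.
    n3 = split("", c)
    print n1, first, n2, whole, n3
}')
echo "$out"
```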
+
+@item sprintf(@var{format}, @var{expression1}, @dots{})
+@cindex @code{sprintf} function
+This returns (without printing) the string that @code{printf} would
+have printed out with the same arguments
+(@pxref{Printf}).
+For example:
+
+@example
+pival = sprintf("pi = %.2f (approx.)", 22/7)
+@end example
+
+@noindent
+assigns the string @w{@code{"pi = 3.14 (approx.)"}} to the variable @code{pival}.
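+
+Because @code{sprintf} returns its result, it is convenient for building
+a formatted string piece by piece before printing it.  The following
+sketch assumes a POSIX shell and a POSIX-conformant @command{awk}; the
+format string and the variable names @code{row} and @code{s} are
+illustrative choices, not part of any standard interface:
+

```shell
row=$(awk 'BEGIN {
    # %-6s  left-justifies in six columns,
    # %5.1f right-justifies in five columns with one decimal place,
    # %03d  pads with leading zeros to three digits.
    s = sprintf("%-6s|%5.1f|%03d", "pi", 22/7, 7)
    print s
}')
echo "$row"
```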
+
+@cindex differences in @command{awk} and @command{gawk}, @code{strtonum} function (@command{gawk})
+@cindex @code{strtonum} function (@command{gawk})
+@item strtonum(@var{str}) #
+Examines @var{str} and returns its numeric value.  If @var{str}
+begins with @samp{0}, @code{strtonum} assumes that @var{str}
+is an octal number.  If @var{str} begins with @samp{0x} or
+@samp{0X}, @code{strtonum} assumes that @var{str} is a hexadecimal number.
+For example:
+
+@example
+$ echo 0x11 |
+> gawk '@{ printf "%d\n", strtonum($1) @}'
+@print{} 17
+@end example
+
+Using the @code{strtonum} function is @emph{not} the same as adding zero
+to a string value; the automatic coercion of strings to numbers
+works only for decimal data, not for octal or hexadecimal.@footnote{Unless
+you use the @option{--non-decimal-data} option, which isn't recommended.
+@xref{Nondecimal Data}, for more information.}
+
+@cindex differences in @command{awk} and @command{gawk}, @code{strtonum} function (@command{gawk})
+@code{strtonum} is a @command{gawk} extension; it is not available
+in compatibility mode (@pxref{Options}).
+
+@item sub(@var{regexp}, @var{replacement} @r{[}, @var{target}@r{]})
+@cindex @code{sub} function
+The @code{sub} function alters the value of @var{target}.
+It searches this value, which is treated as a string, for the
+leftmost, longest substring matched by the regular expression @var{regexp}.
+Then the entire string is
+changed by replacing the matched text with @var{replacement}.
+The modified string becomes the new value of @var{target}.
+
+The @var{regexp} argument may be either a regexp constant
+(@samp{/@dots{}/}) or a string constant (@code{"@dots{}"}).
+In the latter case, the string is treated as a regexp to be matched.
+@xref{Computed Regexps}, for a
+discussion of the difference between the two forms, and the
+implications for writing your program correctly.
+
+This function is peculiar because @var{target} is not simply
+used to compute a value, and not just any expression will do---it
+must be a variable, field, or array element so that @code{sub} can
+store a modified value there.  If this argument is omitted, then the
+default is to use and alter @code{$0}.@footnote{Note that this means
+that the record will first be regenerated using the value of @code{OFS} if
+any fields have been changed, and that the fields will be updated
+after the substitution, even if the operation is a ``no-op'' such
+as @samp{sub(/^/, "")}.}
+For example:
+
+@example
+str = "water, water, everywhere"
+sub(/at/, "ith", str)
+@end example
+
+@noindent
+sets @code{str} to @w{@code{"wither, water, everywhere"}}, by replacing the
+leftmost longest occurrence of @samp{at} with @samp{ith}.
+
+The @code{sub} function returns the number of substitutions made (either
+one or zero).
+
+If the special character @samp{&} appears in @var{replacement}, it
+stands for the precise substring that was matched by @var{regexp}.  (If
+the regexp can match more than one string, then this precise substring
+may vary.)  For example:
+
+@example
+@{ sub(/candidate/, "& and his wife"); print @}
+@end example
+
+@noindent
+changes the first occurrence of @samp{candidate} to @samp{candidate
+and his wife} on each input line.
+Here is another example:
+
+@example
+$ awk 'BEGIN @{
+>         str = "daabaaa"
+>         sub(/a+/, "C&C", str)
+>         print str
+> @}'
+@print{} dCaaCbaaa
+@end example
+
+@noindent
+This shows how @samp{&} can represent a nonconstant string and also
+illustrates the ``leftmost, longest'' rule in regexp matching
+(@pxref{Leftmost Longest}).
+
+The effect of this special character (@samp{&}) can be turned off by putting a
+backslash before it in the string.  As usual, to insert one backslash in
+the string, you must write two backslashes.  Therefore, write @samp{\\&}
+in a string constant to include a literal @samp{&} in the replacement.
+For example, the following shows how to replace the first @samp{|} on each line with
+an @samp{&}:
+
+@example
+@{ sub(/\|/, "\\&"); print @}
+@end example
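+
+The difference between a bare @samp{&} and an escaped one can be seen by
+running both forms on the same input.  This is a sketch only, assuming a
+POSIX shell and a POSIX-conformant @command{awk}; it uses the bracket
+expression @samp{[|]} in place of @samp{\|} simply to keep the quoting
+easy to read, and the shell variable names are illustrative:
+

```shell
# "\\&" in the awk source becomes \& at run time: a literal ampersand.
literal=$(echo 'a|b|c' | awk '{ sub(/[|]/, "\\&"); print }')
# A bare & stands for the matched text, here the "|" itself.
matched=$(echo 'a|b|c' | awk '{ sub(/[|]/, "<&>"); print }')
echo "$literal"
echo "$matched"
```

+
+Note that the @command{awk} program is enclosed in single quotes, so
+the shell passes the backslashes through unchanged.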
+
+@cindex @code{sub} function, arguments of
+@cindex @code{gsub} function, arguments of
+As mentioned, the third argument to @code{sub} must
+be a variable, field or array reference.
+Some versions of @command{awk} allow the third argument to
+be an expression that is not an lvalue.  In such a case, @code{sub}
+still searches for the pattern and returns zero or one, but the result of
+the substitution (if any) is thrown away because there is no place
+to put it.  Such versions of @command{awk} accept expressions
+such as the following:
+
+@example
+sub(/USA/, "United States", "the USA and Canada")
+@end example
+
+@noindent
+@cindex troubleshooting, @code{gsub}/@code{sub} functions
+For historical compatibility, @command{gawk} accepts erroneous code,
+such as in the previous example. However, using any other nonchangeable
+object as the third parameter causes a fatal error and your program
+will not run.
+
+Finally, if the @var{regexp} is not a regexp constant, it is converted into a
+string, and then the value of that string is treated as the regexp to match.
+
+@item gsub(@var{regexp}, @var{replacement} @r{[}, @var{target}@r{]})
+@cindex @code{gsub} function
+This is similar to the @code{sub} function, except @code{gsub} replaces
+@emph{all} of the longest, leftmost, @emph{nonoverlapping} matching
+substrings it can find.  The @samp{g} in @code{gsub} stands for
+``global,'' which means replace everywhere.  For example:
+
+@example
+@{ gsub(/Britain/, "United Kingdom"); print @}
+@end example
+
+@noindent
+replaces all occurrences of the string @samp{Britain} with @samp{United
+Kingdom} for all input records.
+
+The @code{gsub} function returns the number of substitutions made.  If
+the variable to search and alter (@var{target}) is
+omitted, then the entire input record (@code{$0}) is used.
+As in @code{sub}, the characters @samp{&} and @samp{\} are special,
+and the third argument must be assignable.
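+
+The return value makes @code{gsub} a simple way to count matches while
+replacing them.  A minimal sketch, assuming a POSIX shell and a
+POSIX-conformant @command{awk}; the variable names @code{s}, @code{n},
+and @code{out} are illustrative:
+

```shell
out=$(awk 'BEGIN {
    s = "banana"
    # gsub modifies s in place and returns the number of replacements.
    n = gsub(/a/, "A", s)
    print n, s
}')
echo "$out"
```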
+
+@item gensub(@var{regexp}, @var{replacement}, @var{how} @r{[}, @var{target}@r{]}) #
+@cindex @code{gensub} function (@command{gawk})
+@code{gensub} is a general substitution function.  Like @code{sub} and
+@code{gsub}, it searches the target string @var{target} for matches of
+the regular expression @var{regexp}.  Unlike @code{sub} and @code{gsub},
+the modified string is returned as the result of the function and the
+original target string is @emph{not} changed.  If @var{how} is a string
+beginning with @samp{g} or @samp{G}, then it replaces all matches of
+@var{regexp} with @var{replacement}.  Otherwise, @var{how} is treated
+as a number that indicates which match of @var{regexp} to replace. If
+no @var{target} is supplied, @code{$0} is used.
+
+@code{gensub} provides an additional feature that is not available
+in @code{sub} or @code{gsub}: the ability to specify components of a
+regexp in the replacement text.  This is done by using parentheses in
+the regexp to mark the components and then specifying @samp{\@var{N}}
+in the replacement text, where @var{N} is a digit from 1 to 9.
+For example:
+
+@example
+$ gawk '
+> BEGIN @{
+>      a = "abc def"
+>      b = gensub(/(.+) (.+)/, "\\2 \\1", "g", a)
+>      print b
+> @}'
+@print{} def abc
+@end example
+
+@noindent
+As with @code{sub}, you must type two backslashes in order
+to get one into the string.
+In the replacement text, the sequence @samp{\0} represents the entire
+matched text, as does the character @samp{&}.
+
+The following example shows how you can use the third argument to control
+which match of the regexp should be changed:
+
+@example
+$ echo a b c a b c |
+> gawk '@{ print gensub(/a/, "AA", 2) @}'
+@print{} a b c AA b c
+@end example
+
+In this case, @code{$0} is used as the default target string.
+@code{gensub} returns the new string as its result, which is
+passed directly to @code{print} for printing.
+
+@c @cindex automatic warnings
+@c @cindex warnings, automatic
+If the @var{how} argument is a string that does not begin with @samp{g} or
+@samp{G}, or if it is a number that is less than or equal to zero, only one
+substitution is performed.  If @var{how} is zero, @command{gawk} issues
+a warning message.
+
+If @var{regexp} does not match @var{target}, @code{gensub}'s return value
+is the original unchanged value of @var{target}.
+
+@code{gensub} is a @command{gawk} extension; it is not available
+in compatibility mode (@pxref{Options}).
+
+@item substr(@var{string}, @var{start} @r{[}, @var{length}@r{]})
+@cindex @code{substr} function
+This returns a @var{length}-character-long substring of @var{string},
+starting at character number @var{start}.  The first character of a
+string is character number one.@footnote{This is different from
+C and C++, in which the first character is number zero.}
+For example, @code{substr("washington", 5, 3)} returns @code{"ing"}.
+
+If @var{length} is not present, this function returns the whole suffix of
+@var{string} that begins at character number @var{start}.  For example,
+@code{substr("washington", 5)} returns @code{"ington"}.  The whole
+suffix is also returned
+if @var{length} is greater than the number of characters remaining
+in the string, counting from character @var{start}.
+
+If @var{start} is less than one, @code{substr} treats it as
+if it were one. (POSIX doesn't specify what to do in this case:
+Unix @command{awk} acts this way, and therefore @command{gawk}
+does too.)
+If @var{start} is greater than the number of characters
+in the string, @code{substr} returns the null string.
+Similarly, if @var{length} is present but less than or equal to zero,
+the null string is returned.
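+
+The boundary cases above can be checked with a short program.  This is
+a hedged sketch, assuming a POSIX shell and an @command{awk} that
+follows the Unix behavior described here; it deliberately avoids
+combining a @var{start} below one with an explicit @var{length}, since
+that case is where implementations are most likely to differ.  The
+variable names are illustrative:
+

```shell
out=$(awk 'BEGIN {
    s = "washington"
    # In order: start below one with no length, a length that runs past
    # the end, a start beyond the end, and a length of zero.
    printf "[%s][%s][%s][%s]\n",
           substr(s, 0), substr(s, 8, 100), substr(s, 11), substr(s, 5, 0)
}')
echo "$out"
```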
+
+@cindex troubleshooting, @code{substr} function
+The string returned by @code{substr} @emph{cannot} be
+assigned.  Thus, it is a mistake to attempt to change a portion of
+a string, as shown in the following example:
+
+@example
+string = "abcdef"
+# try to get "abCDEf", won't work
+substr(string, 3, 3) = "CDE"
+@end example
+
+@noindent
+It is also a mistake to use @code{substr} as the third argument
+of @code{sub} or @code{gsub}:
+
+@example
+gsub(/xyz/, "pdq", substr($0, 5, 20))  # WRONG
+@end example
+
+@cindex portability, @code{substr} function
+(Some commercial versions of @command{awk} do in fact let you use
+@code{substr} this way, but doing so is not portable.)
+
+If you need to replace bits and pieces of a string, combine @code{substr}
+with string concatenation, in the following manner:
+
+@example
+string = "abcdef"
+@dots{}
+string = substr(string, 1, 2) "CDE" substr(string, 6)
+@end example
+
+@cindex case sensitivity, converting case
+@cindex converting, case
+@item tolower(@var{string})
+@cindex @code{tolower} function
+This returns a copy of @var{string}, with each uppercase character
+in the string replaced with its corresponding lowercase character.
+Nonalphabetic characters are left unchanged.  For example,
+@code{tolower("MiXeD cAsE 123")} returns @code{"mixed case 123"}.
+
+@item toupper(@var{string})
+@cindex @code{toupper} function
+This returns a copy of @var{string}, with each lowercase character
+in the string replaced with its corresponding uppercase character.
+Nonalphabetic characters are left unchanged.  For example,
+@code{toupper("MiXeD cAsE 123")} returns @code{"MIXED CASE 123"}.
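+
+A common use of these two functions is comparing strings without
+regard to case, by converting one side before the comparison.  The
+following sketch assumes a POSIX shell and a POSIX-conformant
+@command{awk}; the sample input and the variable names @code{n} and
+@code{count} are made up for illustration:
+

```shell
# Count the lines equal to "apple", ignoring case.
count=$(printf 'Apple\nBANANA\napple\n' |
        awk 'tolower($0) == "apple" { n++ } END { print n }')
echo "$count"
```

+
+Unlike setting @code{IGNORECASE}, this approach does not depend on
+@command{gawk}.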
+@end table
+
+@node Gory Details
+@subsubsection More About @samp{\} and @samp{&} with @code{sub}, @code{gsub}, and @code{gensub}
+
+@cindex escape processing, @code{gsub}/@code{gensub}/@code{sub} functions
+@cindex @code{sub} function, escape processing
+@cindex @code{gsub} function, escape processing
+@cindex @code{gensub} function (@command{gawk}), escape processing
+@cindex @code{\} (backslash), @code{gsub}/@code{gensub}/@code{sub} functions and
+@cindex backslash (@code{\}), @code{gsub}/@code{gensub}/@code{sub} functions and
+@cindex @code{&} (ampersand), @code{gsub}/@code{gensub}/@code{sub} functions and
+@cindex ampersand (@code{&}), @code{gsub}/@code{gensub}/@code{sub} functions and
+When using @code{sub}, @code{gsub}, or @code{gensub}, and trying to get literal
+backslashes and ampersands into the replacement text, you need to remember
+that there are several levels of @dfn{escape processing} going on.
+
+First, there is the @dfn{lexical} level, which is when @command{awk} reads
+your program
+and builds an internal copy of it that can be executed.
+Then there is the runtime level, which is when @command{awk} actually scans the
+replacement string to determine what to generate.
+
+At both levels, @command{awk} looks for a defined set of characters that
+can come after a backslash.  At the lexical level, it looks for the
+escape sequences listed in @ref{Escape Sequences}.
+Thus, for every @samp{\} that @command{awk} processes at the runtime
+level, type two backslashes at the lexical level.
+When a character that is not valid for an escape sequence follows the
+@samp{\}, Unix @command{awk} and @command{gawk} both simply remove the initial
+@samp{\} and put the next character into the string. Thus, for
+example, @code{"a\qb"} is treated as @code{"aqb"}.
+
+At the runtime level, the various functions handle sequences of
+@samp{\} and @samp{&} differently.  The situation is (sadly) somewhat complex.
+Historically, the @code{sub} and @code{gsub} functions treated the two
+character sequence @samp{\&} specially; this sequence was replaced in
+the generated text with a single @samp{&}.  Any other @samp{\} within
+the @var{replacement} string that did not precede an @samp{&} was passed
+through unchanged.  To illustrate with a table:
+
+@c Thanks to Karl Berry for help with the TeX stuff.
+@tex
+\vbox{\bigskip
+% This table has lots of &'s and \'s, so unspecialize them.
+\catcode`\& = \other \catcode`\\ = \other
+% But then we need character for escape and tab.
+@catcode`! = 4
+@halign{@hfil#!@qquad@hfil#!@qquad#@hfil@cr
+    You type!@code{sub} sees!@code{sub} generates@cr
+@hrulefill!@hrulefill!@hrulefill@cr
+   @code{\&}!       @code{&}!the matched text@cr
+  @code{\\&}!      @code{\&}!a literal @samp{&}@cr
+ @code{\\\&}!      @code{\&}!a literal @samp{&}@cr
+@code{\\\\&}!     @code{\\&}!a literal @samp{\&}@cr
+@code{\\\\\&}!     @code{\\&}!a literal @samp{\&}@cr
+@code{\\\\\\&}!     @code{\\\&}!a literal @samp{\\&}@cr
+  @code{\\q}!      @code{\q}!a literal @samp{\q}@cr
+}
+@bigskip}
+@end tex
+@ifnottex
+@display
+ You type         @code{sub} sees          @code{sub} generates
+ --------         ----------          ---------------
+     @code{\&}              @code{&}            the matched text
+    @code{\\&}             @code{\&}            a literal @samp{&}
+   @code{\\\&}             @code{\&}            a literal @samp{&}
+  @code{\\\\&}            @code{\\&}            a literal @samp{\&}
+ @code{\\\\\&}            @code{\\&}            a literal @samp{\&}
+@code{\\\\\\&}           @code{\\\&}            a literal @samp{\\&}
+    @code{\\q}             @code{\q}            a literal @samp{\q}
+@end display
+@end ifnottex
+
+@noindent
+This table shows both the lexical-level processing, where
+an odd number of backslashes becomes an even number at the runtime level,
+as well as the runtime processing done by @code{sub}.
+(For the sake of simplicity, the rest of the following tables only show the
+case of even numbers of backslashes entered at the lexical level.)
+
+The problem with the historical approach is that there is no way to get
+a literal @samp{\} followed by the matched text.
+
+@c @cindex @command{awk} language, POSIX version
+@cindex POSIX @command{awk}, functions and, @code{gsub}/@code{sub}
+The 1992 POSIX standard attempted to fix this problem. The standard
+says that @code{sub} and @code{gsub} look for either a @samp{\} or an @samp{&}
+after the @samp{\}. If either one follows a @samp{\}, that character is
+output literally.  The interpretation of @samp{\} and @samp{&} then becomes:
+
+@c thanks to Karl Berry for formatting this table
+@tex
+\vbox{\bigskip
+% This table has lots of &'s and \'s, so unspecialize them.
+\catcode`\& = \other \catcode`\\ = \other
+% But then we need character for escape and tab.
+@catcode`! = 4
+@halign{@hfil#!@qquad@hfil#!@qquad#@hfil@cr
+    You type!@code{sub} sees!@code{sub} generates@cr
+@hrulefill!@hrulefill!@hrulefill@cr
+    @code{&}!       @code{&}!the matched text@cr
+  @code{\\&}!      @code{\&}!a literal @samp{&}@cr
+@code{\\\\&}!     @code{\\&}!a literal @samp{\}, then the matched text@cr
+@code{\\\\\\&}!  @code{\\\&}!a literal @samp{\&}@cr
+}
+@bigskip}
+@end tex
+@ifnottex
+@display
+ You type         @code{sub} sees          @code{sub} generates
+ --------         ----------          ---------------
+      @code{&}              @code{&}            the matched text
+    @code{\\&}             @code{\&}            a literal @samp{&}
+  @code{\\\\&}            @code{\\&}            a literal @samp{\}, then the matched text
+@code{\\\\\\&}           @code{\\\&}            a literal @samp{\&}
+@end display
+@end ifnottex
+
+@noindent
+This appears to solve the problem.
+Unfortunately, the phrasing of the standard is unusual. It
+says, in effect, that @samp{\} turns off the special meaning of any
+following character, but for anything other than @samp{\} and @samp{&},
+such special meaning is undefined.  This wording leads to two problems:
+
+@itemize @bullet
+@item
+Backslashes must now be doubled in the @var{replacement} string, breaking
+historical @command{awk} programs.
+
+@item
+To make sure that an @command{awk} program is portable, @emph{every} character
+in the @var{replacement} string must be preceded with a
+backslash.@footnote{This consequence was certainly unintended.}
+@c I can say that, 'cause I was involved in making this change
+@end itemize
+
+The POSIX standard is under revision.
+Because of the problems just listed, proposed text for the revised standard
+reverts to rules that correspond more closely to the original existing
+practice. The proposed rules have special cases that make it possible
+to produce a @samp{\} preceding the matched text:
+
+@tex
+\vbox{\bigskip
+% This table has lots of &'s and \'s, so unspecialize them.
+\catcode`\& = \other \catcode`\\ = \other
+% But then we need character for escape and tab.
+@catcode`! = 4
+@halign{@hfil#!@qquad@hfil#!@qquad#@hfil@cr
+    You type!@code{sub} sees!@code{sub} generates@cr
+@hrulefill!@hrulefill!@hrulefill@cr
+@code{\\\\\\&}!     @code{\\\&}!a literal @samp{\&}@cr
+@code{\\\\&}!     @code{\\&}!a literal @samp{\}, followed by the matched text@cr
+  @code{\\&}!      @code{\&}!a literal @samp{&}@cr
+  @code{\\q}!      @code{\q}!a literal @samp{\q}@cr
+}
+@bigskip}
+@end tex
+@ifnottex
+@display
+ You type         @code{sub} sees          @code{sub} generates
+ --------         ----------          ---------------
+@code{\\\\\\&}           @code{\\\&}            a literal @samp{\&}
+  @code{\\\\&}            @code{\\&}            a literal @samp{\}, followed by the matched text
+    @code{\\&}             @code{\&}            a literal @samp{&}
+    @code{\\q}             @code{\q}            a literal @samp{\q}
+@end display
+@end ifnottex
+
+In a nutshell, at the runtime level, there are now three special sequences
+of characters (@samp{\\\&}, @samp{\\&} and @samp{\&}) whereas historically
+there was only one.  However, as in the historical case, any @samp{\} that
+is not part of one of these three sequences is not special and appears
+in the output literally.
+
+@command{gawk} 3.0 and 3.1 follow these proposed POSIX rules for @code{sub} and
+@code{gsub}.
+@c As much as we think it's a lousy idea. You win some, you lose some. Sigh.
+Whether these proposed rules will actually become codified into the
+standard is unknown at this point. Subsequent @command{gawk} releases will
+track the standard and implement whatever the final version specifies;
+this @value{DOCUMENT} will be updated as
+well.@footnote{As this @value{DOCUMENT} was being finalized,
+we learned that the POSIX standard will not use these rules.
+However, it was too late to change @command{gawk} for the 3.1 release.
+@command{gawk} behaves as described here.}
+
+The rules for @code{gensub} are considerably simpler. At the runtime
+level, whenever @command{gawk} sees a @samp{\}, if the following character
+is a digit, then the text that matched the corresponding parenthesized
+subexpression is placed in the generated output.  Otherwise,
+no matter what character follows the @samp{\}, it
+appears in the generated text and the @samp{\} does not:
+
+@tex
+\vbox{\bigskip
+% This table has lots of &'s and \'s, so unspecialize them.
+\catcode`\& = \other \catcode`\\ = \other
+% But then we need character for escape and tab.
+@catcode`! = 4
+@halign{@hfil#!@qquad@hfil#!@qquad#@hfil@cr
+    You type!@code{gensub} sees!@code{gensub} generates@cr
+@hrulefill!@hrulefill!@hrulefill@cr
+      @code{&}!           @code{&}!the matched text@cr
+    @code{\\&}!          @code{\&}!a literal @samp{&}@cr
+   @code{\\\\}!          @code{\\}!a literal @samp{\}@cr
+  @code{\\\\&}!         @code{\\&}!a literal @samp{\}, then the matched text@cr
+@code{\\\\\\&}!        @code{\\\&}!a literal @samp{\&}@cr
+    @code{\\q}!          @code{\q}!a literal @samp{q}@cr
+}
+@bigskip}
+@end tex
+@ifnottex
+@display
+  You type          @code{gensub} sees         @code{gensub} generates
+  --------          -------------         ------------------
+      @code{&}                    @code{&}            the matched text
+    @code{\\&}                   @code{\&}            a literal @samp{&}
+   @code{\\\\}                   @code{\\}            a literal @samp{\}
+  @code{\\\\&}                  @code{\\&}            a literal @samp{\}, then the matched text
+@code{\\\\\\&}                 @code{\\\&}            a literal @samp{\&}
+    @code{\\q}                   @code{\q}            a literal @samp{q}
+@end display
+@end ifnottex
+
+Because of the complexity of the lexical and runtime level processing
+and the special cases for @code{sub} and @code{gsub},
+we recommend the use of @command{gawk} and @code{gensub} when you have
+to do substitutions.
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Matching the Null String
+@c last comma does NOT start tertiary
+@cindex advanced features, null strings, matching
+@cindex matching, null strings
+@cindex null strings, matching
+@c last comma in next two is part of tertiary
+@cindex @code{*} (asterisk), @code{*} operator, null strings, matching
+@cindex asterisk (@code{*}), @code{*} operator, null strings, matching
+
+In @command{awk}, the @samp{*} operator can match the null string.
+This is particularly important for the @code{sub}, @code{gsub},
+and @code{gensub} functions.  For example:
+
+@example
+$ echo abc | awk '@{ gsub(/m*/, "X"); print @}'
+@print{} XaXbXcX
+@end example
+
+@noindent
+Although this makes a certain amount of sense, it can be surprising.
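+
+If the intent is to replace only actual occurrences of the character,
+one possible remedy (a sketch, not the only approach) is to use the
+@samp{+} operator, which requires at least one occurrence and so never
+matches the null string:
+
+@example
+$ echo abc | awk '@{ gsub(/m+/, "X"); print @}'
+@print{} abc
+$ echo ammc | awk '@{ gsub(/m+/, "X"); print @}'
+@print{} aXc
+@end example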
+
+@node I/O Functions
+@subsection Input/Output Functions
+
+The following functions relate to input/output (I/O).
+Optional parameters are enclosed in square brackets ([ ]):
+
+@table @code
+@item close(@var{filename} @r{[}, @var{how}@r{]})
+@cindex @code{close} function
+@cindex files, closing
+Close the file @var{filename} for input or output. Alternatively, the
+argument may be a shell command that was used for creating a coprocess, or
+for redirecting to or from a pipe; then the coprocess or pipe is closed.
+@xref{Close Files And Pipes},
+for more information.
+
+When closing a coprocess, it is occasionally useful to first close
+one end of the two-way pipe and then to close the other.  This is done
+by providing a second argument to @code{close}.  This second argument
+should be one of the two string values @code{"to"} or @code{"from"},
+indicating which end of the pipe to close.  Case in the string does
+not matter.
+@xref{Two-way I/O},
+which discusses this feature in more detail and gives an example.
+
+@item fflush(@r{[}@var{filename}@r{]})
+@cindex @code{fflush} function
+Flush any buffered output associated with @var{filename}, which is either a
+file opened for writing or a shell command for redirecting output to
+a pipe or coprocess.
+
+@cindex portability, @code{fflush} function and
+@cindex buffers, flushing
+@cindex output, buffering
+Many utility programs @dfn{buffer} their output; i.e., they hold data
+destined for a disk file or terminal in memory until enough has
+accumulated to make sending it to the output device worthwhile.
+This is often more efficient than writing
+every little bit of information as soon as it is ready.  However, sometimes
+it is necessary to force a program to @dfn{flush} its buffers; that is,
+write the information to its destination, even if a buffer is not full.
+This is the purpose of the @code{fflush} function---@command{gawk} also
+buffers its output and the @code{fflush} function forces
+@command{gawk} to flush its buffers.
+
+@code{fflush} was added to the Bell Laboratories research
+version of @command{awk} in 1994; it is not part of the POSIX standard and is
+not available if @option{--posix} has been specified on the
+command line (@pxref{Options}).
+
+@cindex @command{gawk}, @code{fflush} function in
+@command{gawk} extends the @code{fflush} function in two ways.  The first
+is to allow no argument at all. In this case, the buffer for the
+standard output is flushed.  The second is to allow the null string
+(@w{@code{""}}) as the argument. In this case, the buffers for
+@emph{all} open output files and pipes are flushed.
+
+@c @cindex automatic warnings
+@c @cindex warnings, automatic
+@cindex troubleshooting, @code{fflush} function
+@code{fflush} returns zero if the buffer is successfully flushed;
+otherwise, it returns @minus{}1.
+When all buffers are flushed, the return value is zero
+only if every buffer was flushed successfully.  Otherwise, it is
+@minus{}1, and @command{gawk} warns about the @var{filename} that
+caused the problem.
+
+@command{gawk} also issues a warning message if you attempt to flush
+a file or pipe that was opened for reading (such as with @code{getline}),
+or if @var{filename} is not an open file, pipe, or coprocess.
+In such a case, @code{fflush} returns @minus{}1, as well.
+
+@item system(@var{command})
+@cindex @code{system} function
+@cindex interacting with other programs
+Executes the operating-system command given by the string
+@var{command} and then returns to the @command{awk} program.
+The return value of @code{system} is the status returned by the
+command that was executed.
+
+For example, if the following fragment of code is put in your @command{awk}
+program:
+
+@example
+END @{
+     system("date | mail -s 'awk run done' root")
+@}
+@end example
+
+@noindent
+the system administrator is sent mail when the @command{awk} program
+finishes processing input and begins its end-of-input processing.
+
+Note that redirecting @code{print} or @code{printf} into a pipe is often
+enough to accomplish your task.  If you need to run many commands, it
+is more efficient to simply print them down a pipeline to the shell:
+
+@example
+while (@var{more stuff to do})
+    print @var{command} | "/bin/sh"
+close("/bin/sh")
+@end example
+
+@noindent
+@cindex troubleshooting, @code{system} function
+However, if your @command{awk}
+program is interactive, @code{system} is useful for cranking up large
+self-contained programs, such as a shell or an editor.
+Some operating systems cannot implement the @code{system} function.
+@code{system} causes a fatal error if it is not supported.
+@end table
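+
+The following fragment is a minimal sketch of checking the return
+values of @code{fflush} and @code{system}.  The file name
+@file{/tmp/report.txt} and the @command{test} command are assumptions
+chosen for illustration; they are not part of @command{gawk}:
+
+@example
+BEGIN @{
+    out = "/tmp/report.txt"          # hypothetical output file
+    print "partial results" > out
+    if (fflush(out) != 0)            # nonzero means the flush failed
+        print "could not flush", out > "/dev/stderr"
+
+    status = system("test -s " out)  # status 0 if the file is nonempty
+    if (status == 0)
+        print out, "contains data"
+@}
+@end example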
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Interactive Versus Noninteractive Buffering
+@cindex advanced features, buffering
+@cindex buffering, interactive vs. noninteractive
+
+Buffering can be even more confusing because its behavior depends
+upon whether your program is @dfn{interactive}, i.e., communicating
+with a user sitting at a keyboard.@footnote{A program is interactive
+if the standard output is connected
+to a terminal device.}
+
+@c Thanks to Walter.Mecky@dresdnerbank.de for this example, and for
+@c motivating me to write this section.
+Interactive programs generally @dfn{line buffer} their output; i.e., they
+write out each line as soon as it is complete.  Noninteractive programs
+wait until they have a full buffer, which may be many lines of output.
+Here is an example of the difference:
+
+@example
+$ awk '@{ print $1 + $2 @}'
+1 1
+@print{} 2
+2 3
+@print{} 5
+@kbd{@value{CTL}-d}
+@end example
+
+@noindent
+Each line of output is printed immediately. Compare that behavior
+with this example:
+
+@example
+$ awk '@{ print $1 + $2 @}' | cat
+1 1
+2 3
+@kbd{@value{CTL}-d}
+@print{} 2
+@print{} 5
+@end example
+
+@noindent
+Here, no output is printed until after the @kbd{@value{CTL}-d} is typed, because
+it is all buffered and sent down the pipe to @command{cat} in one shot.
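+
+One way to get each result as soon as it is computed, even when the
+output goes to a pipe, is to flush after every @code{print}.  This
+sketch assumes @command{gawk}, since calling @code{fflush} with no
+argument is a @command{gawk} extension:
+
+@example
+$ gawk '@{ print $1 + $2; fflush() @}' | cat
+1 1
+@print{} 2
+2 3
+@print{} 5
+@kbd{@value{CTL}-d}
+@end example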
+
+@c fakenode --- for prepinfo
+@subheading Advanced Notes: Controlling Output Buffering with @code{system}
+@cindex advanced features, buffering
+@cindex buffers, flushing
+@cindex buffering, input/output
+@cindex output, buffering
+
+The @code{fflush} function provides explicit control over output buffering for
+individual files and pipes.  However, its use is not portable to many other
+@command{awk} implementations.  An alternative method to flush output
+buffers is to call @code{system} with a null string as its argument:
+
+@example
+system("")   # flush output
+@end example
+
+@noindent
+@command{gawk} treats this use of the @code{system} function as a special
+case and is smart enough not to run a shell (or other command
+interpreter) with the empty command.  Therefore, with @command{gawk}, this
+idiom is not only useful, it is also efficient.  While this method should work
+with other @command{awk} implementations, it does not necessarily avoid
+starting an unnecessary shell.  (Other implementations may only
+flush the buffer associated with the standard output and not necessarily
+all buffered output.)
+
+If you think about what a programmer expects, it makes sense that
+@code{system} should flush any pending output.  The following program:
+
+@example
+BEGIN @{
+     print "first print"
+     system("echo system echo")
+     print "second print"
+@}
+@end example
+
+@noindent
+must print:
+
+@example
+first print
+system echo
+second print
+@end example
+
+@noindent
+and not:
+
+@example
+system echo
+first print
+second print
+@end example
+
+If @command{awk} did not flush its buffers before calling @code{system},
+you would see the latter (undesirable) output.
+
+@node Time Functions
+@subsection Using @command{gawk}'s Timestamp Functions
+
+@c STARTOFRANGE tst
+@cindex timestamps
+@c STARTOFRANGE logftst
+@cindex log files, timestamps in
+@c last comma does NOT start tertiary
+@c STARTOFRANGE filogtst
+@cindex files, log, timestamps in
+@c STARTOFRANGE gawtst
+@cindex @command{gawk}, timestamps
+@cindex POSIX @command{awk}, timestamps and
+@command{awk} programs are commonly used to process log files
+containing timestamp information, indicating when a
+particular log record was written.  Many programs log their timestamp
+in the form returned by the @code{time} system call, which is the
+number of seconds since a particular epoch.  On POSIX-compliant systems,
+it is the number of seconds since
+1970-01-01 00:00:00 UTC, not counting leap seconds.@footnote{@xref{Glossary},
+especially the entries ``Epoch'' and ``UTC.''}
+All known POSIX-compliant systems support timestamps from 0 through
+@math{2^31 - 1}, which is sufficient to represent times through
+2038-01-19 03:14:07 UTC.  Many systems support a wider range of timestamps,
+including negative timestamps that represent times before the
+epoch.
+
+@cindex @command{date} utility, GNU
+@cindex time, retrieving
+In order to make it easier to process such log files and to produce
+useful reports, @command{gawk} provides the following functions for
+working with timestamps.  They are @command{gawk} extensions; they are
+not specified in the POSIX standard, nor are they in any other known
+version of @command{awk}.@footnote{The GNU @command{date} utility can
+also do many of the things described here.  Its use may be preferable
+for simple time-related operations in shell scripts.}
+Optional parameters are enclosed in square brackets ([ ]):
+
+@table @code
+@item systime()
+@cindex @code{systime} function (@command{gawk})
+@cindex timestamps
+This function returns the current time as the number of seconds since
+the system epoch.  On POSIX systems, this is the number of seconds
+since 1970-01-01 00:00:00 UTC, not counting leap seconds.
+It may be a different number on
+other systems.
+
+@item mktime(@var{datespec})
+@cindex @code{mktime} function (@command{gawk})
+This function turns @var{datespec} into a timestamp in the same form
+as is returned by @code{systime}.  It is similar to the function of the
+same name in ISO C.  The argument, @var{datespec}, is a string of the form
+@w{@code{"@var{YYYY} @var{MM} @var{DD} @var{HH} @var{MM} @var{SS} [@var{DST}]"}}.
+The string consists of six or seven numbers representing, respectively,
+the full year including century, the month from 1 to 12, the day of the month
+from 1 to 31, the hour of the day from 0 to 23, the minute from 0 to
+59, the second from 0 to 60,@footnote{Occasionally there are
+minutes in a year with a leap second, which is why the
+seconds can go up to 60.}
+and an optional daylight-savings flag.
+
+The values of these numbers need not be within the ranges specified;
+for example, an hour of @minus{}1 means 1 hour before midnight.
+The origin-zero Gregorian calendar is assumed, with year 0 preceding
+year 1 and year @minus{}1 preceding year 0.
+The time is assumed to be in the local timezone.
+If the daylight-savings flag is positive, the time is assumed to be
+daylight savings time; if zero, the time is assumed to be standard
+time; and if negative (the default), @code{mktime} attempts to determine
+whether daylight savings time is in effect for the specified time.
+
+If @var{datespec} does not contain enough elements or if the resulting time
+is out of range, @code{mktime} returns @minus{}1.
+
+@item strftime(@r{[}@var{format} @r{[}, @var{timestamp}@r{]]})
+@c STARTOFRANGE strf
+@cindex @code{strftime} function (@command{gawk})
+This function returns a string.  It is similar to the function of the
+same name in ISO C.  The time specified by @var{timestamp} is used to
+produce a string, based on the contents of the @var{format} string.
+The @var{timestamp} is in the same format as the value returned by the
+@code{systime} function.  If no @var{timestamp} argument is supplied,
+@command{gawk} uses the current time of day as the timestamp.
+If no @var{format} argument is supplied, @code{strftime} uses
+@code{@w{"%a %b %d %H:%M:%S %Z %Y"}}.  This format string produces
+output that is (almost) equivalent to that of the @command{date} utility.
+(Versions of @command{gawk} prior to 3.0 require the @var{format} argument.)
+@end table
+
+The @code{systime} function allows you to compare a timestamp from a
+log file with the current time of day.  In particular, it is easy to
+determine how long ago a particular record was logged.  It also allows
+you to produce log records using the ``seconds since the epoch'' format.
+
+@cindex converting, dates to timestamps
+@cindex dates, converting to timestamps
+@cindex timestamps, converting dates to
+The @code{mktime} function allows you to convert a textual representation
+of a date and time into a timestamp.   This makes it easy to do before/after
+comparisons of dates and times, particularly when dealing with date and
+time data coming from an external source, such as a log file.
+
+The @code{strftime} function allows you to easily turn a timestamp
+into human-readable information.  It is similar in nature to the @code{sprintf}
+function
+(@pxref{String Functions}),
+in that it copies nonformat specification characters verbatim to the
+returned string, while substituting date and time values for format
+specifications in the @var{format} string.
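+
+As a sketch of how the three functions fit together, the following
+program converts a date to a timestamp, computes its age, and formats
+it again.  The date used is an arbitrary choice for this example:
+
+@example
+BEGIN @{
+    then = mktime("2003 06 01 12 00 00")    # noon, local time
+    now  = systime()
+    printf "%d seconds have elapsed\n", now - then
+    print strftime("%Y-%m-%d %H:%M:%S", then)
+
+    # an hour of -1 is one hour before midnight of the given day
+    early = mktime("2003 06 01 -1 00 00")
+    print strftime("%Y-%m-%d %H:%M:%S", early)
+@}
+@end example
+
+@noindent
+Because both @code{mktime} and @code{strftime} work in the local
+timezone, the last two @code{print} statements are expected to produce
+@samp{2003-06-01 12:00:00} and @samp{2003-05-31 23:00:00}.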
+
+@cindex format specifiers, @code{strftime} function (@command{gawk})
+@code{strftime} is guaranteed by the 1999 ISO C standard@footnote{As this
+is a recent standard, not every system's @code{strftime} necessarily
+supports all of the conversions listed here.}
+to support the following date format specifications:
+
+@table @code
+@item %a
+The locale's abbreviated weekday name.
+
+@item %A
+The locale's full weekday name.
+
+@item %b
+The locale's abbreviated month name.
+
+@item %B
+The locale's full month name.
+
+@item %c
+The locale's ``appropriate'' date and time representation.
+(This is @samp{%a %b %e %T %Y} in the @code{"C"} locale.)
+
+@item %C
+The century.  This is the year divided by 100 and truncated to the next
+lower integer.
+
+@item %d
+The day of the month as a decimal number (01--31).
+
+@item %D
+Equivalent to specifying @samp{%m/%d/%y}.
+
+@item %e
+The day of the month, padded with a space if it is only one digit.
+
+@item %F
+Equivalent to specifying @samp{%Y-%m-%d}.
+This is the ISO 8601 date format.
+
+@item %g
+The year modulo 100 of the ISO week number, as a decimal number (00--99).
+For example, January 1, 1993 is in week 53 of 1992. Thus, the year
+of its ISO week number is 1992, even though its year is 1993.
+Similarly, December 31, 1973 is in week 1 of 1974. Thus, the year
+of its ISO week number is 1974, even though its year is 1973.
+
+@item %G
+The full year of the ISO week number, as a decimal number.
+
+@item %h
+Equivalent to @samp{%b}.
+
+@item %H
+The hour (24-hour clock) as a decimal number (00--23).
+
+@item %I
+The hour (12-hour clock) as a decimal number (01--12).
+
+@item %j
+The day of the year as a decimal number (001--366).
+
+@item %m
+The month as a decimal number (01--12).
+
+@item %M
+The minute as a decimal number (00--59).
+
+@item %n
+A newline character (ASCII LF).
+
+@item %p
+The locale's equivalent of the AM/PM designations associated
+with a 12-hour clock.
+
+@item %r
+The locale's 12-hour clock time.
+(This is @samp{%I:%M:%S %p} in the @code{"C"} locale.)
+
+@item %R
+Equivalent to specifying @samp{%H:%M}.
+
+@item %S
+The second as a decimal number (00--60).
+
+@item %t
+A TAB character.
+
+@item %T
+Equivalent to specifying @samp{%H:%M:%S}.
+
+@item %u
+The weekday as a decimal number (1--7).  Monday is day one.
+
+@item %U
+The week number of the year (the first Sunday as the first day of week one)
+as a decimal number (00--53).
+
+@c @cindex ISO 8601
+@item %V
+The week number of the year (the first Monday as the first
+day of week one) as a decimal number (01--53).
+The method for determining the week number is as specified by ISO 8601.
+(To wit: if the week containing January 1 has four or more days in the
+new year, then it is week one; otherwise it is week 53 of the previous year
+and the next week is week one.)
+
+@item %w
+The weekday as a decimal number (0--6).  Sunday is day zero.
+
+@item %W
+The week number of the year (the first Monday as the first day of week one)
+as a decimal number (00--53).
+
+@item %x
+The locale's ``appropriate'' date representation.
+(This is @samp{%m/%d/%y} in the @code{"C"} locale.)
+
+@item %X
+The locale's ``appropriate'' time representation.
+(This is @samp{%T} in the @code{"C"} locale.)
+
+@item %y
+The year modulo 100 as a decimal number (00--99).
+
+@item %Y
+The full year as a decimal number (e.g., 1995).
+
+@c @cindex RFC 822
+@c @cindex RFC 1036
+@item %z
+The timezone offset in a +HHMM format (e.g., the format necessary to
+produce RFC 822/RFC 1036 date headers).
+
+@item %Z
+The time zone name or abbreviation; no characters if
+no time zone is determinable.
+
+@item %Ec %EC %Ex %EX %Ey %EY %Od %Oe %OH
+@itemx %OI %Om %OM %OS %Ou %OU %OV %Ow %OW %Oy
+``Alternate representations'' for the specifications
+that use only the second letter (@samp{%c}, @samp{%C},
+and so on).@footnote{If you don't understand any of this, don't worry about
+it; these facilities are meant to make it easier to ``internationalize''
+programs.
+Other internationalization features are described in
+@ref{Internationalization}.}
+(These facilitate compliance with the POSIX @command{date} utility.)
+
+@item %%
+A literal @samp{%}.
+@end table
+
+If a conversion specifier is not one of the above, the behavior is
+undefined.@footnote{This is because ISO C leaves the
+behavior of the C version of @code{strftime} undefined and @command{gawk}
+uses the system's version of @code{strftime} if it's there.
+Typically, the conversion specifier either does not appear in the
+returned string or appears literally.}
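+
+For instance, the ISO week-number specifications can be combined as in
+the following sketch.  It assumes that the system's @code{strftime}
+supports the 1999 ISO C conversions; the date is the one used in the
+description of @samp{%g}:
+
+@example
+BEGIN @{
+    t = mktime("1993 01 01 12 00 00")    # Friday, January 1, 1993
+    print strftime("%G-W%V-%u", t)
+@}
+@end example
+
+@noindent
+On such a system, this prints @samp{1992-W53-5}: the date is day five
+(Friday) of week 53 of ISO year 1992.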
+
+@c @cindex locale, definition of
+Informally, a @dfn{locale} is the geographic place in which a program
+is meant to run.  For example, a common way to abbreviate the date
+September 4, 1991 in the United States is ``9/4/91.''
+In many countries in Europe, however, it is abbreviated ``4.9.91.''
+Thus, the @samp{%x} specification in a @code{"US"} locale might produce
+@samp{9/4/91}, while in a @code{"EUROPE"} locale, it might produce
+@samp{4.9.91}.  The ISO C standard defines a default @code{"C"}
+locale, which is an environment that is typical of what most C programmers
+are used to.
+
+A public-domain C version of @code{strftime} is supplied with @command{gawk}
+for systems that are not yet fully standards-compliant.
+It supports all of the just listed format specifications.
+If that version is
+used to compile @command{gawk} (@pxref{Installation}),
+then the following additional format specifications are available:
+
+@table @code
+@item %k
+The hour (24-hour clock) as a decimal number (0--23).
+Single-digit numbers are padded with a space.
+
+@item %l
+The hour (12-hour clock) as a decimal number (1--12).
+Single-digit numbers are padded with a space.
+
+@item %N
+The ``Emperor/Era'' name.
+Equivalent to @code{%C}.
+
+@item %o
+The ``Emperor/Era'' year.
+Equivalent to @code{%y}.
+
+@item %s
+The time as a decimal timestamp in seconds since the epoch.
+
+@item %v
+The date in VMS format (e.g., @samp{20-JUN-1991}).
+@end table
+@c ENDOFRANGE strf
+
+Additionally, the alternate representations (the @samp{%E} and
+@samp{%O} forms) are recognized, but they produce the same output as
+the corresponding normal specifications.
+
+@cindex @code{date} utility, POSIX
+@cindex POSIX @command{awk}, @code{date} utility and
+This example is an @command{awk} implementation of the POSIX
+@command{date} utility.  Normally, the @command{date} utility prints the
+current date and time of day in a well-known format.  However, if you
+provide an argument to it that begins with a @samp{+}, @command{date}
+copies nonformat specifier characters to the standard output and
+interprets the current time according to the format specifiers in
+the string.  For example:
+
+@example
+$ date '+Today is %A, %B %d, %Y.'
+@print{} Today is Thursday, September 14, 2000.
+@end example
+
+Here is the @command{gawk} version of the @command{date} utility.
+It has a shell ``wrapper'' to handle the @option{-u} option,
+which requires that @command{date} run as if the time zone
+is set to UTC:
+
+@example
+#! /bin/sh
+#
+# date --- approximate the P1003.2 'date' command
+
+case $1 in
+-u)  TZ=UTC0     # use UTC
+     export TZ
+     shift ;;
+esac
+
+@c FIXME: One day, change %d to %e, when C 99 is common.
+gawk 'BEGIN  @{
+    format = "%a %b %d %H:%M:%S %Z %Y"
+    exitval = 0
+
+    if (ARGC > 2)
+        exitval = 1
+    else if (ARGC == 2) @{
+        format = ARGV[1]
+        if (format ~ /^\+/)
+            format = substr(format, 2)   # remove leading +
+    @}
+    print strftime(format)
+    exit exitval
+@}' "$@@"
+@end example
+@c ENDOFRANGE tst
+@c ENDOFRANGE logftst
+@c ENDOFRANGE filogtst
+@c ENDOFRANGE gawtst
+
+@node Bitwise Functions
+@subsection Bit-Manipulation Functions of @command{gawk}
+@c STARTOFRANGE bit
+@cindex bitwise, operations
+@c STARTOFRANGE and
+@cindex AND bitwise operation
+@c STARTOFRANGE oro
+@cindex OR bitwise operation
+@c STARTOFRANGE xor
+@cindex XOR bitwise operation
+@c STARTOFRANGE opbit
+@cindex operations, bitwise
+@quotation
+@i{I can explain it for you, but I can't understand it for you.}@*
+Anonymous
+@end quotation
+
+Many languages provide the ability to perform @dfn{bitwise} operations
+on two integer numbers.  In other words, the operation is performed on
+each successive pair of bits in the operands.
+Three common operations are bitwise AND, OR, and XOR.
+The operations are described by the following table:
+
+@ifnottex
+@display
+                Bit Operator
+          |  AND  |   OR  |  XOR
+          |---+---+---+---+---+---
+Operands  | 0 | 1 | 0 | 1 | 0 | 1
+----------+---+---+---+---+---+---
+    0     | 0   0 | 0   1 | 0   1
+    1     | 0   1 | 1   1 | 1   0
+@end display
+@end ifnottex
+@tex
+\centerline{
+\vbox{\bigskip % space above the table (about 1 linespace)
+% Because we have vertical rules, we can't let TeX insert interline space
+% in its usual way.
+\offinterlineskip
+\halign{\strut\hfil#\quad\hfil  % operands
+        &\vrule#&\quad#\quad    % rule, 0 (of and)
+        &\vrule#&\quad#\quad    % rule, 1 (of and)
+        &\vrule#                % rule between and and or
+        &\quad#\quad            % 0 (of or)
+        &\vrule#&\quad#\quad    % rule, 1 (of or)
+        &\vrule#                % rule between or and xor
+        &\quad#\quad            % 0 of xor
+        &\vrule#&\quad#\quad    % rule, 1 of xor
+        \cr
+&\omit&\multispan{11}\hfil\bf Bit operator\hfil\cr
+\noalign{\smallskip}
+&     &\multispan3\hfil AND\hfil&&\multispan3\hfil  OR\hfil
+                           &&\multispan3\hfil XOR\hfil\cr
+\bf Operands&&0&&1&&0&&1&&0&&1\cr
+\noalign{\hrule}
+\omit&height 2pt&&\omit&&&&\omit&&&&\omit\cr
+\noalign{\hrule height0pt}% without this the rule does not extend; why?
+0&&0&\omit&0&&0&\omit&1&&0&\omit&1\cr
+1&&0&\omit&1&&1&\omit&1&&1&\omit&0\cr
+}}}
+@end tex
+
+@cindex bitwise, complement
+@cindex complement, bitwise
+As you can see, the result of an AND operation is 1 only when @emph{both}
+bits are 1.
+The result of an OR operation is 1 if @emph{either} bit is 1.
+The result of an XOR operation is 1 if either bit is 1,
+but not both.
+The next operation is the @dfn{complement}; the complement of 1 is 0 and
+the complement of 0 is 1. Thus, this operation ``flips'' all the bits
+of a given value.
+
+@cindex bitwise, shift
+@cindex left shift, bitwise
+@cindex right shift, bitwise
+@cindex shift, bitwise
+Finally, two other common operations are to shift the bits left or right.
+For example, if you have a bit string @samp{10111001} and you shift it
+right by three bits, you end up with @samp{00010111}.@footnote{This example
+shows that 0's come in on the left side. For @command{gawk}, this is
+always true, but in some languages, it's possible to have the left side
+fill with 1's. Caveat emptor.}
+@c Purposely decided to use   0's   and   1's   here.  2/2001.
+If you start over
+again with @samp{10111001} and shift it left by three bits, you end up
+with @samp{11001000}.
+@command{gawk} provides built-in functions that implement the
+bitwise operations just described. They are:
+
+@ignore
+@table @code
+@cindex @code{and} function (@command{gawk})
+@item and(@var{v1}, @var{v2})
+Return the bitwise AND of the values provided by @var{v1} and @var{v2}.
+
+@cindex @code{or} function (@command{gawk})
+@item or(@var{v1}, @var{v2})
+Return the bitwise OR of the values provided by @var{v1} and @var{v2}.
+
+@cindex @code{xor} function (@command{gawk})
+@item xor(@var{v1}, @var{v2})
+Return the bitwise XOR of the values provided by @var{v1} and @var{v2}.
+
+@cindex @code{compl} function (@command{gawk})
+@item compl(@var{val})
+Return the bitwise complement of @var{val}.
+
+@cindex @code{lshift} function (@command{gawk})
+@item lshift(@var{val}, @var{count})
+Return the value of @var{val}, shifted left by @var{count} bits.
+
+@cindex @code{rshift} function (@command{gawk})
+@item rshift(@var{val}, @var{count})
+Return the value of @var{val}, shifted right by @var{count} bits.
+@end table
+@end ignore
+
+@cindex @command{gawk}, bitwise operations in
+@multitable {@code{rshift(@var{val}, @var{count})}} {Return the value of @var{val}, shifted right by @var{count} bits.}
+@cindex @code{and} function (@command{gawk})
+@item @code{and(@var{v1}, @var{v2})}
+@tab Returns the bitwise AND of the values provided by @var{v1} and @var{v2}.
+
+@cindex @code{or} function (@command{gawk})
+@item @code{or(@var{v1}, @var{v2})}
+@tab Returns the bitwise OR of the values provided by @var{v1} and @var{v2}.
+
+@cindex @code{xor} function (@command{gawk})
+@item @code{xor(@var{v1}, @var{v2})}
+@tab Returns the bitwise XOR of the values provided by @var{v1} and @var{v2}.
+
+@cindex @code{compl} function (@command{gawk})
+@item @code{compl(@var{val})}
+@tab Returns the bitwise complement of @var{val}.
+
+@cindex @code{lshift} function (@command{gawk})
+@item @code{lshift(@var{val}, @var{count})}
+@tab Returns the value of @var{val}, shifted left by @var{count} bits.
+
+@cindex @code{rshift} function (@command{gawk})
+@item @code{rshift(@var{val}, @var{count})}
+@tab Returns the value of @var{val}, shifted right by @var{count} bits.
+@end multitable
+
+For all of these functions, first the double-precision floating-point value is
+converted to the widest C unsigned integer type, then the bitwise operation is
+performed and then the result is converted back into a C @code{double}. (If
+you don't understand this paragraph, don't worry about it.)
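+
+As a quick check of the truth table and the shift examples above, the
+following sketch applies the functions to small values.  The operands
+are arbitrary; 185 is the decimal value of the bit string
+@samp{10111001}:
+
+@example
+BEGIN @{
+    print and(12, 10)                # 1100 AND 1010 = 1000
+    print or(12, 10)                 # 1100 OR  1010 = 1110
+    print xor(12, 10)                # 1100 XOR 1010 = 0110
+    print rshift(185, 3)             # 10111001 -> 00010111
+    print and(lshift(185, 3), 255)   # keep low 8 bits: 11001000
+@}
+@end example
+
+@noindent
+This prints 8, 14, 6, 23, and 200, one value per line.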
+
+Here is a user-defined function
+(@pxref{User-defined})
+that illustrates the use of these functions:
+
+@cindex @code{bits2str} user-defined function
+@cindex @code{testbits.awk} program
+@smallexample
+@group
+@c file eg/lib/bits2str.awk
+# bits2str --- turn a byte into readable 1's and 0's
+
+function bits2str(bits,        data, mask)
+@{
+    if (bits == 0)
+        return "0"
+
+    mask = 1
+    for (; bits != 0; bits = rshift(bits, 1))
+        data = (and(bits, mask) ? "1" : "0") data
+
+    while ((length(data) % 8) != 0)
+        data = "0" data
+
+    return data
+@}
+@c endfile
+@end group
+
+@c this is a hack to make testbits.awk self-contained
+@ignore
+@c file eg/prog/testbits.awk
+# bits2str --- turn a byte into readable 1's and 0's
+
+function bits2str(bits,        data, mask)
+@{
+    if (bits == 0)
+        return "0"
+
+    mask = 1
+    for (; bits != 0; bits = rshift(bits, 1))
+        data = (and(bits, mask) ? "1" : "0") data
+
+    while ((length(data) % 8) != 0)
+        data = "0" data
+
+    return data
+@}
+@c endfile
+@end ignore
+@c file eg/prog/testbits.awk
+BEGIN @{
+    printf "123 = %s\n", bits2str(123)
+    printf "0123 = %s\n", bits2str(0123)
+    printf "0x99 = %s\n", bits2str(0x99)
+    comp = compl(0x99)
+    printf "compl(0x99) = %#x = %s\n", comp, bits2str(comp)
+    shift = lshift(0x99, 2)
+    printf "lshift(0x99, 2) = %#x = %s\n", shift, bits2str(shift)
+    shift = rshift(0x99, 2)
+    printf "rshift(0x99, 2) = %#x = %s\n", shift, bits2str(shift)
+@}
+@c endfile
+@end smallexample
+
+@noindent
+This program produces the following output when run:
+
+@smallexample
+$ gawk -f testbits.awk
+@print{} 123 = 01111011
+@print{} 0123 = 01010011
+@print{} 0x99 = 10011001
+@print{} compl(0x99) = 0xffffff66 = 11111111111111111111111101100110
+@print{} lshift(0x99, 2) = 0x264 = 0000001001100100
+@print{} rshift(0x99, 2) = 0x26 = 00100110
+@end smallexample
+
+@cindex numbers, converting, to strings
+@cindex strings, converting, numbers to
+@cindex converting, numbers, to strings
+The @code{bits2str} function turns a binary number into a string.
+The number @code{1} represents a binary value where the rightmost bit
+is set to 1.  Using this mask,
+the function repeatedly checks the rightmost bit.
+ANDing the mask with the value indicates whether the
+rightmost bit is 1 or not. If so, a @code{"1"} is concatenated onto the front
+of the string.
+Otherwise, a @code{"0"} is added.
+The value is then shifted right by one bit and the loop continues
+until there are no more 1 bits.
+
+If the initial value is zero, the function simply returns @code{"0"}.
+Otherwise, at the end, it pads the string with leading zeros so that
+its length is a multiple of eight bits, the byte size typical of
+modern computers.
+
+The main code in the @code{BEGIN} rule shows how the same digits
+yield different values when written as a decimal constant (@code{123})
+and as an octal constant (@code{0123})
+(@pxref{Nondecimal-numbers}),
+prints a hexadecimal constant, and then demonstrates the
+results of the @code{compl}, @code{lshift}, and @code{rshift} functions.
+@c ENDOFRANGE bit
+@c ENDOFRANGE and
+@c ENDOFRANGE oro
+@c ENDOFRANGE xor
+@c ENDOFRANGE opbit
+
+@node I18N Functions
+@subsection Using @command{gawk}'s String-Translation Functions
+@cindex @command{gawk}, string-translation functions
+@cindex functions, string-translation
+@cindex internationalization
+@cindex @command{awk} programs, internationalizing
+
+@command{gawk} provides facilities for internationalizing @command{awk} programs.
+These include the functions described in the following list.
+The descriptions here are purposely brief.
+@xref{Internationalization},
+for the full story.
+Optional parameters are enclosed in square brackets ([ ]):
+
+@table @code
+@cindex @code{dcgettext} function (@command{gawk})
+@item dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @var{category}@r{]]})
+This function returns the translation of @var{string} in
+text domain @var{domain} for locale category @var{category}.
+The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
+The default value for @var{category} is @code{"LC_MESSAGES"}.
+
+@cindex @code{dcngettext} function (@command{gawk})
+@item dcngettext(@var{string1}, @var{string2}, @var{number} @r{[}, @var{domain} @r{[}, @var{category}@r{]]})
+This function returns the plural form used for @var{number} of the
+translation of @var{string1} and @var{string2} in text domain
+@var{domain} for locale category @var{category}. @var{string1} is the
+English singular variant of a message, and @var{string2} the English plural
+variant of the same message.
+The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
+The default value for @var{category} is @code{"LC_MESSAGES"}.
+
+@cindex @code{bindtextdomain} function (@command{gawk})
+@item bindtextdomain(@var{directory} @r{[}, @var{domain}@r{]})
+This function allows you to specify the directory in which
+@command{gawk} will look for message translation files, in case they
+will not or cannot be placed in the ``standard'' locations
+(e.g., during testing).
+It returns the directory in which @var{domain} is ``bound.''
+
+The default @var{domain} is the value of @code{TEXTDOMAIN}.
+If @var{directory} is the null string (@code{""}), then
+@code{bindtextdomain} returns the current binding for the
+given @var{domain}.
+@end table
+@c ENDOFRANGE funcbi
+@c ENDOFRANGE bifunc
+
+@node User-defined
+@section User-Defined Functions
+
+@c STARTOFRANGE udfunc
+@cindex user-defined, functions
+@c STARTOFRANGE funcud
+@cindex functions, user-defined
+Complicated @command{awk} programs can often be simplified by defining
+your own functions.  User-defined functions can be called just like
+built-in ones (@pxref{Function Calls}), but it is up to you to define
+them, i.e., to tell @command{awk} what they should do.
+
+@menu
+* Definition Syntax::           How to write definitions and what they mean.
+* Function Example::            An example function definition and what it
+                                does.
+* Function Caveats::            Things to watch out for.
+* Return Statement::            Specifying the value a function returns.
+* Dynamic Typing::              How variable types can change at runtime.
+@end menu
+
+@node Definition Syntax
+@subsection Function Definition Syntax
+
+@c STARTOFRANGE fdef
+@cindex functions, defining
+Definitions of functions can appear anywhere between the rules of an
+@command{awk} program.  Thus, the general form of an @command{awk} program is
+extended to include sequences of rules @emph{and} user-defined function
+definitions.
+There is no need to put the definition of a function
+before all uses of the function.  This is because @command{awk} reads the
+entire program before starting to execute any of it.
+
+The definition of a function named @var{name} looks like this:
+@c NEXT ED: put [ ] around parameter list
+
+@example
+function @var{name}(@var{parameter-list})
+@{
+     @var{body-of-function}
+@}
+@end example
+
+@cindex names, functions
+@cindex functions, names of
+@cindex namespace issues, functions
+@noindent
+@var{name} is the name of the function to define.  A valid function
+name is like a valid variable name: a sequence of letters, digits, and
+underscores that doesn't start with a digit.
+Within a single @command{awk} program, any particular name can only be
+used as a variable, array, or function.
+
+@c NEXT ED: parameter-list is an OPTIONAL list of ...
+@var{parameter-list} is an optional list of the function's arguments and local
+variable names, separated by commas.  When the function is called,
+the argument names are used to hold the argument values given in
+the call.  The local variables are initialized to the empty string.
+A function cannot have two parameters with the same name, nor may it
+have a parameter with the same name as the function itself.
+
+The @var{body-of-function} consists of @command{awk} statements.  It is the
+most important part of the definition, because it says what the function
+should actually @emph{do}.  The argument names exist to give the body a
+way to talk about the arguments; local variables exist to give the body
+places to keep temporary values.
+
+Argument names are not distinguished syntactically from local variable
+names. Instead, the number of arguments supplied when the function is
+called determines how many argument variables there are.  Thus, if three
+argument values are given, the first three names in @var{parameter-list}
+are arguments and the rest are local variables.
+
+It follows that if the number of arguments is not the same in all calls
+to the function, some of the names in @var{parameter-list} may be
+arguments on some occasions and local variables on others.  Another
+way to think of this is that omitted arguments default to the
+null string.
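+
+As a minimal sketch of this behavior (the function name @code{show} and the
+values used are invented for illustration), the following program calls
+the same function first with two arguments and then with one.  In the second
+call, @code{b} acts as a local variable that starts out as the null string:
+
+@example
+function show(a, b)
+@{
+    printf "a = <%s>, b = <%s>\n", a, b
+    b = "changed"       # affects only this function's own b
+@}
+
+BEGIN @{
+    b = "global"
+    show("one", "two")  # a and b are both arguments
+    show("one")         # b is omitted, so it is a null local
+    print b             # the global b is untouched
+@}
+@end example
+
+@noindent
+This should print @samp{a = <one>, b = <two>}, then @samp{a = <one>, b = <>},
+and finally @samp{global}.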
+
+@cindex programming conventions, functions, writing
+Usually when you write a function, you know how many names you intend to
+use for arguments and how many you intend to use as local variables.  It is
+conventional to place some extra space between the arguments and
+the local variables, in order to document how your function is supposed to be used.
+
+@cindex variables, shadowing
+During execution of the function body, the argument and local variable
+names hide, or @dfn{shadow}, any variables of the same names used in the
+rest of the program.  The shadowed variables are not accessible in the
+function definition, because there is no way to name them while their
+names have been taken away for the local variables.  All other variables
+used in the @command{awk} program can be referenced or set normally in the
+function's body.
+
+The arguments and local variables last only as long as the function body
+is executing.  Once the body finishes, you can once again access the
+variables that were shadowed while the function was running.
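+
+The following short sketch illustrates shadowing.  The names @code{twice}
+and @code{tmp} are hypothetical and chosen only for this example:
+
+@example
+function twice(x,    tmp)
+@{
+    tmp = x * 2     # the local tmp shadows the global one
+    return tmp
+@}
+
+BEGIN @{
+    tmp = "outer"
+    print twice(21)
+    print tmp       # the global value is accessible again
+@}
+@end example
+
+@noindent
+This should print @samp{42} followed by @samp{outer}, because the assignment
+inside @code{twice} never touches the global @code{tmp}.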
+
+@cindex recursive functions
+@cindex functions, recursive
+The function body can contain expressions that call functions.  They
+can even call this function, either directly or by way of another
+function.  When this happens, we say the function is @dfn{recursive}.
+The act of a function calling itself is called @dfn{recursion}.
+
+@c @cindex @command{awk} language, POSIX version
+@c @cindex POSIX @command{awk}
+@cindex POSIX @command{awk}, @code{function} keyword in
+In many @command{awk} implementations, including @command{gawk},
+the keyword @code{function} may be
+abbreviated @code{func}.  However, POSIX only specifies the use of
+the keyword @code{function}.  This actually has some practical implications.
+If @command{gawk} is in POSIX-compatibility mode
+(@pxref{Options}), then the following
+statement does @emph{not} define a function:
+
+@example
+func foo() @{ a = sqrt($1) ; print a @}
+@end example
+
+@noindent
+Instead it defines a rule that, for each record, concatenates the value
+of the variable @samp{func} with the return value of the function @samp{foo}.
+If the resulting string is non-null, the action is executed.
+This is probably not what is desired.  (@command{awk} accepts this input as
+syntactically valid, because functions may be used before they are defined
+in @command{awk} programs.  It fails as soon as the pattern is evaluated,
+though, because @code{foo} is never defined.)
+@c NEXT ED: This won't actually run, since foo() is undefined ...
+
+@c last comma does NOT start tertiary
+@cindex portability, functions, defining
+To ensure that your @command{awk} programs are portable, always use the
+keyword @code{function} when defining a function.
+
+@node Function Example
+@subsection Function Definition Examples
+
+Here is an example of a user-defined function, called @code{myprint}, that
+takes a number and prints it in a specific format:
+
+@example
+function myprint(num)
+@{
+     printf "%6.3g\n", num
+@}
+@end example
+
+@noindent
+To illustrate, here is an @command{awk} rule that uses our @code{myprint}
+function:
+
+@example
+$3 > 0     @{ myprint($3) @}
+@end example
+
+@noindent
+This program prints, in our special format, all the third fields that
+contain a positive number in our input.  Therefore, when given the following:
+
+@example
+ 1.2   3.4    5.6   7.8
+ 9.10 11.12 -13.14 15.16
+17.18 19.20  21.22 23.24
+@end example
+
+@noindent
+this program, using our function to format the results, prints:
+
+@example
+   5.6
+  21.2
+@end example
+
+This function deletes all the elements in an array:
+
+@example
+function delarray(a,    i)
+@{
+    for (i in a)
+       delete a[i]
+@}
+@end example
+
+When working with arrays, it is often necessary to delete all the elements
+in an array and start over with a new list of elements
+(@pxref{Delete}).
+Instead of having
+to repeat this loop everywhere that you need to clear out
+an array, your program can just call @code{delarray}.
+(This guarantees portability.  The use of @samp{delete @var{array}} to delete
+the contents of an entire array is a nonstandard extension.)
+
+The following is an example of a recursive function.  It takes a string
+as an input parameter and returns the string in backwards order.
+Recursive functions must always have a test that stops the recursion.
+In this case, the recursion terminates when the starting position
+is zero, i.e., when there are no more characters left in the string.
+
+@cindex @code{rev} user-defined function
+@example
+function rev(str, start)
+@{
+    if (start == 0)
+        return ""
+
+    return (substr(str, start, 1) rev(str, start - 1))
+@}
+@end example
+
+If this function is in a file named @file{rev.awk}, it can be tested
+this way:
+
+@example
+$ echo "Don't Panic!" |
+> gawk --source '@{ print rev($0, length($0)) @}' -f rev.awk
+@print{} !cinaP t'noD
+@end example
+
+The C @code{ctime} function takes a timestamp and returns it in a string,
+formatted in a well-known fashion.
+The following example uses the built-in @code{strftime} function
+(@pxref{Time Functions})
+to create an @command{awk} version of @code{ctime}:
+
+@cindex @code{ctime} user-defined function
+@c FIXME: One day, change %d to %e, when C 99 is common.
+@example
+@c file eg/lib/ctime.awk
+# ctime.awk
+#
+# awk version of C ctime(3) function
+
+function ctime(ts,    format)
+@{
+    format = "%a %b %d %H:%M:%S %Z %Y"
+    if (ts == 0)
+        ts = systime()       # use current time as default
+    return strftime(format, ts)
+@}
+@c endfile
+@end example
+@c ENDOFRANGE fdef
+
+@node Function Caveats
+@subsection Calling User-Defined Functions
+
+@c STARTOFRANGE fudc
+@cindex functions, user-defined, calling
+@dfn{Calling a function} means causing the function to run and do its job.
+A function call is an expression and its value is the value returned by
+the function.
+
+A function call consists of the function name followed by the arguments
+in parentheses.  @command{awk} expressions are what you write in the
+call for the arguments.  Each time the call is executed, these
+expressions are evaluated, and the values are the actual arguments.  For
+example, here is a call to @code{foo} with three arguments (the first
+being a string concatenation):
+
+@example
+foo(x y, "lose", 4 * z)
+@end example
+
+@strong{Caution:} Whitespace characters (spaces and tabs) are not allowed
+between the function name and the open-parenthesis of the argument list.
+If you write whitespace by mistake, @command{awk} might think that you mean
+to concatenate a variable with an expression in parentheses.  However, it
+notices that you used a function name and not a variable name, and reports
+an error.
+
+@cindex call by value
+When a function is called, it is given a @emph{copy} of the values of
+its arguments.  This is known as @dfn{call by value}.  The caller may use
+a variable as the expression for the argument, but the called function
+does not know this---it only knows what value the argument had.  For
+example, if you write the following code:
+
+@example
+foo = "bar"
+z = myfunc(foo)
+@end example
+
+@noindent
+then you should not think of the argument to @code{myfunc} as being
+``the variable @code{foo}.''  Instead, think of the argument as the
+string value @code{"bar"}.
+If the function @code{myfunc} alters the values of its local variables,
+this has no effect on any other variables.  Thus, if @code{myfunc}
+does this:
+
+@example
+function myfunc(str)
+@{
+  print str
+  str = "zzz"
+  print str
+@}
+@end example
+
+@noindent
+to change its first argument variable @code{str}, it does @emph{not}
+change the value of @code{foo} in the caller.  The role of @code{foo} in
+calling @code{myfunc} ended when its value (@code{"bar"}) was computed.
+If @code{str} also exists outside of @code{myfunc}, the function body
+cannot alter this outer value, because it is shadowed during the
+execution of @code{myfunc} and cannot be seen or changed from there.
+
+@cindex call by reference
+@cindex arrays, as parameters to functions
+@cindex functions, arrays as parameters to
+However, when arrays are the parameters to functions, they are @emph{not}
+copied.  Instead, the array itself is made available for direct manipulation
+by the function.  This is usually called @dfn{call by reference}.
+Changes made to an array parameter inside the body of a function @emph{are}
+visible outside that function.
+
+@strong{Note:} Changing an array parameter inside a function
+can be very dangerous if you do not watch what you are doing.
+For example:
+
+@example
+function changeit(array, ind, nvalue)
+@{
+     array[ind] = nvalue
+@}
+
+BEGIN @{
+    a[1] = 1; a[2] = 2; a[3] = 3
+    changeit(a, 2, "two")
+    printf "a[1] = %s, a[2] = %s, a[3] = %s\n",
+            a[1], a[2], a[3]
+@}
+@end example
+
+@noindent
+prints @samp{a[1] = 1, a[2] = two, a[3] = 3}, because
+@code{changeit} stores @code{"two"} in the second element of @code{a}.
+
+@cindex undefined functions
+@cindex functions, undefined
+Some @command{awk} implementations allow you to call a function that
+has not been defined. They only report a problem at runtime when the
+program actually tries to call the function. For example:
+
+@example
+BEGIN @{
+    if (0)
+        foo()
+    else
+        bar()
+@}
+function bar() @{ @dots{} @}
+# note that `foo' is not defined
+@end example
+
+@noindent
+Because the condition of the @samp{if} statement is never true, @code{foo}
+is never called, so it is not really a problem that it has not been
+defined.  Usually, though, it is a problem if a program calls an undefined
+function.
+
+@cindex lint checking, undefined functions
+If @option{--lint} is specified
+(@pxref{Options}),
+@command{gawk} reports calls to undefined functions.
+
+@cindex portability, @code{next} statement in user-defined functions
+Some @command{awk} implementations generate a runtime
+error if you use the @code{next} statement
+(@pxref{Next Statement})
+inside a user-defined function.
+@command{gawk} does not have this limitation.
+@c ENDOFRANGE fudc
+
+@node Return Statement
+@subsection The @code{return} Statement
+@c comma does NOT start a secondary
+@cindex @code{return} statement, user-defined functions
+
+The body of a user-defined function can contain a @code{return} statement.
+This statement returns control to the calling part of the @command{awk} program.  It
+can also be used to return a value for use in the rest of the @command{awk}
+program.  It looks like this:
+
+@example
+return @r{[}@var{expression}@r{]}
+@end example
+
+The @var{expression} part is optional.  If it is omitted, then the returned
+value is undefined, and therefore, unpredictable.
+
+A @code{return} statement with no value expression is assumed at the end of
+every function definition.  So if control reaches the end of the function
+body, then the function returns an unpredictable value.  @command{awk}
+does @emph{not} warn you if you use the return value of such a function.
+
+Sometimes, you want to write a function for what it does, not for
+what it returns.  Such a function corresponds to a @code{void} function
+in C or to a @code{procedure} in Pascal.  Thus, it may be appropriate to not
+return any value; simply bear in mind that if you use the return
+value of such a function, you do so at your own risk.
+
+The following is an example of a user-defined function that returns
+the largest number among the elements of an array:
+
+@example
+function maxelt(vec,   i, ret)
+@{
+     for (i in vec) @{
+          if (ret == "" || vec[i] > ret)
+               ret = vec[i]
+     @}
+     return ret
+@}
+@end example
+
+@cindex programming conventions, function parameters
+@noindent
+You call @code{maxelt} with one argument, which is an array name.  The local
+variables @code{i} and @code{ret} are not intended to be arguments;
+while there is nothing to stop you from passing more than one argument
+to @code{maxelt}, the results would be strange.  The extra space before
+@code{i} in the function parameter list indicates that @code{i} and
+@code{ret} are not supposed to be arguments.
+You should follow this convention when defining functions.
+
+The following program uses the @code{maxelt} function.  It loads an
+array, calls @code{maxelt}, and then reports the maximum number in that
+array:
+
+@example
+function maxelt(vec,   i, ret)
+@{
+     for (i in vec) @{
+          if (ret == "" || vec[i] > ret)
+               ret = vec[i]
+     @}
+     return ret
+@}
+
+# Load all fields of each record into nums.
+@{
+     for(i = 1; i <= NF; i++)
+          nums[NR, i] = $i
+@}
+
+END @{
+     print maxelt(nums)
+@}
+@end example
+
+Given the following input:
+
+@example
+ 1 5 23 8 16
+44 3 5 2 8 26
+256 291 1396 2962 100
+-6 467 998 1101
+99385 11 0 225
+@end example
+
+@noindent
+the program reports (predictably) that @code{99385} is the largest number
+in the array.
+
+@node Dynamic Typing
+@subsection Functions and Their Effects on Variable Typing
+
+@command{awk} is a very fluid language.
+It is possible that @command{awk} can't tell if an identifier
+represents a regular variable or an array until runtime.
+Here is an annotated sample program:
+
+@example
+function foo(a)
+@{
+    a[1] = 1   # parameter is an array
+@}
+
+BEGIN @{
+    b = 1
+    foo(b)  # invalid: fatal type mismatch
+
+    foo(x)  # x uninitialized, becomes an array dynamically
+    x = 1   # now not allowed, runtime error
+@}
+@end example
+
+Usually, such things aren't a big issue, but it's worth
+being aware of them.
+@c ENDOFRANGE udfunc
+@c ENDOFRANGE funcud
+
+@node Internationalization
+@chapter Internationalization with @command{gawk}
+
+Once upon a time, computer makers
+wrote software that worked only in English.
+Eventually, hardware and software vendors noticed that if their
+systems worked in the native languages of non-English-speaking
+countries, they were able to sell more systems.
+As a result, internationalization and localization
+of programs and software systems became a common practice.
+
+@c STARTOFRANGE inloc
+@cindex internationalization, localization
+@cindex @command{gawk}, internationalization and, See internationalization
+@cindex internationalization, localization, @command{gawk} and
+Until recently, the ability to provide internationalization
+was largely restricted to programs written in C and C++.
+This @value{CHAPTER} describes the underlying library @command{gawk}
+uses for internationalization, as well as how
+@command{gawk} makes internationalization
+features available at the @command{awk} program level.
+Having internationalization available at the @command{awk} level
+gives software developers additional flexibility---they are no
+longer required to write in C when internationalization is
+a requirement.
+
+@menu
+* I18N and L10N::               Internationalization and Localization.
+* Explaining gettext::          How GNU @code{gettext} works.
+* Programmer i18n::             Features for the programmer.
+* Translator i18n::             Features for the translator.
+* I18N Example::                A simple i18n example.
+* Gawk I18N::                   @command{gawk} is also internationalized.
+@end menu
+
+@node I18N and L10N
+@section Internationalization and Localization
+
+@cindex internationalization
+@c comma is part of see
+@cindex localization, See internationalization, localization
+@cindex localization
+@dfn{Internationalization} means writing (or modifying) a program once,
+in such a way that it can use multiple languages without requiring
+further source-code changes.
+@dfn{Localization} means providing the data necessary for an
+internationalized program to work in a particular language.
+Most typically, these terms refer to features such as the language
+used for printing error messages, the language used to read
+responses, and information related to how numerical and
+monetary values are printed and read.
+
+@node Explaining gettext
+@section GNU @code{gettext}
+
+@cindex internationalizing a program
+@c STARTOFRANGE gettex
+@cindex @code{gettext} library
+The facilities in GNU @code{gettext} focus on messages: strings printed
+by a program, either directly or via formatting with @code{printf} or
+@code{sprintf}.@footnote{For some operating systems, the @command{gawk}
+port doesn't support GNU @code{gettext}.  This applies most notably to
+the PC operating systems.  As such, these features are not available
+if you are using one of those operating systems.  Sorry.}
+
+@cindex portability, @code{gettext} library and
+When using GNU @code{gettext}, each application has its own
+@dfn{text domain}.  This is a unique name, such as @samp{kpilot} or @samp{gawk},
+that identifies the application.
+A complete application may have multiple components---programs written
+in C or C++, as well as scripts written in @command{sh} or @command{awk}.
+All of the components use the same text domain.
+
+To make the discussion concrete, assume we're writing an application
+named @command{guide}.  Internationalization consists of the
+following steps, in this order:
+
+@enumerate
+@item
+The programmer goes
+through the source for all of @command{guide}'s components
+and marks each string that is a candidate for translation.
+For example, @code{"`-F': option required"} is a good candidate for translation.
+A table with strings of option names is not (e.g., @command{gawk}'s
+@option{--profile} option should remain the same, no matter what the local
+language).
+
+@cindex @code{textdomain} function (C library)
+@item
+The programmer indicates the application's text domain
+(@code{"guide"}) to the @code{gettext} library,
+by calling the @code{textdomain} function.
+
+@item
+Messages from the application are extracted from the source code and
+collected into a portable object file (@file{guide.po}),
+which lists the strings and their translations.
+The translations are initially empty.
+The original (usually English) messages serve as the key for
+lookup of the translations.
+
+@cindex @code{.po} files
+@cindex files, @code{.po}
+@cindex portable object files
+@cindex files, portable object
+@item
+For each language with a translator, @file{guide.po}
+is copied and translations are created and shipped with the application.
+
+@cindex @code{.mo} files
+@cindex files, @code{.mo}
+@cindex message object files
+@cindex files, message object
+@item
+Each language's @file{.po} file is converted into a binary
+message object (@file{.mo}) file.
+A message object file contains the original messages and their
+translations in a binary format that allows fast lookup of translations
+at runtime.
+
+@item
+When @command{guide} is built and installed, the binary translation files
+are installed in a standard place.
+
+@cindex @code{bindtextdomain} function (C library)
+@item
+For testing and development, it is possible to tell @code{gettext}
+to use @file{.mo} files in a different directory than the standard
+one by using the @code{bindtextdomain} function.
+
+@cindex @code{.mo} files, specifying directory of
+@cindex files, @code{.mo}, specifying directory of
+@cindex message object files, specifying directory of
+@cindex files, message object, specifying directory of
+@item
+At runtime, @command{guide} looks up each string via a call
+to @code{gettext}.  The returned string is the translated string
+if available, or the original string if not.
+
+@item
+If necessary, it is possible to access messages from a different
+text domain than the one belonging to the application, without
+having to switch the application's default text domain back
+and forth.
+@end enumerate
+
+@cindex @code{gettext} function (C library)
+In C (or C++), the string marking and dynamic translation lookup
+are accomplished by wrapping each string in a call to @code{gettext}:
+
+@example
+printf(gettext("Don't Panic!\n"));
+@end example
+
+The tools that extract messages from source code pull out all
+strings enclosed in calls to @code{gettext}.
+
+@cindex @code{_} (underscore), @code{_} C macro
+@cindex underscore (@code{_}), @code{_} C macro
+The GNU @code{gettext} developers, recognizing that typing
+@samp{gettext} over and over again is both painful and ugly to look
+at, use the macro @samp{_} (an underscore) to make things easier:
+
+@example
+/* In the standard header file: */
+#define _(str) gettext(str)
+
+/* In the program text: */
+printf(_("Don't Panic!\n"));
+@end example
+
+@cindex internationalization, localization, locale categories
+@cindex @code{gettext} library, locale categories
+@cindex locale categories
+@noindent
+This reduces the typing overhead to just three extra characters per string
+and is considerably easier to read as well.
+There are locale @dfn{categories}
+for different types of locale-related information.
+The defined locale categories that @code{gettext} knows about are:
+
+@table @code
+@cindex @code{LC_MESSAGES} locale category
+@item LC_MESSAGES
+Text messages.  This is the default category for @code{gettext}
+operations, but it is possible to supply a different one explicitly,
+if necessary.  (It is almost never necessary to supply a different category.)
+
+@cindex sorting characters in different languages
+@cindex @code{LC_COLLATE} locale category
+@item LC_COLLATE
+Text-collation information; i.e., how different characters
+and/or groups of characters sort in a given language.
+
+@cindex @code{LC_CTYPE} locale category
+@item LC_CTYPE
+Character-type information (alphabetic, digit, upper- or lowercase, and
+so on).
+This information is accessed via the
+POSIX character classes in regular expressions,
+such as @code{/[[:alnum:]]/}
+(@pxref{Regexp Operators}).
+
+@cindex monetary information, localization
+@cindex currency symbols, localization
+@cindex @code{LC_MONETARY} locale category
+@item LC_MONETARY
+Monetary information, such as the currency symbol, and whether the
+symbol goes before or after a number.
+
+@cindex @code{LC_NUMERIC} locale category
+@item LC_NUMERIC
+Numeric information, such as which characters to use for the decimal
+point and the thousands separator.@footnote{Americans
+use a comma every three digits and a period for the decimal
+point, while many Europeans do exactly the opposite:
+@code{1,234.56} versus @code{1.234,56}.}
+
+@cindex @code{LC_RESPONSE} locale category
+@item LC_RESPONSE
+Response information, such as how ``yes'' and ``no'' appear in the
+local language, and possibly other information as well.
+
+@cindex time, localization and
+@c last comma does NOT start a tertiary
+@cindex dates, information related to, localization
+@cindex @code{LC_TIME} locale category
+@item LC_TIME
+Time- and date-related information, such as 12- or 24-hour clock, month printed
+before or after day in a date, local month abbreviations, and so on.
+
+@cindex @code{LC_ALL} locale category
+@item LC_ALL
+All of the above.  (Not too useful in the context of @code{gettext}.)
+@end table
+@c ENDOFRANGE gettex
+
+@node Programmer i18n
+@section Internationalizing @command{awk} Programs
+@c STARTOFRANGE inap
+@cindex @command{awk} programs, internationalizing
+
+@command{gawk} provides the following variables and functions for
+internationalization:
+
+@table @code
+@cindex @code{TEXTDOMAIN} variable
+@item TEXTDOMAIN
+This variable indicates the application's text domain.
+For compatibility with GNU @code{gettext}, the default
+value is @code{"messages"}.
+
+@cindex internationalization, localization, marked strings
+@cindex strings, for localization
+@item _"your message here"
+String constants marked with a leading underscore
+are candidates for translation at runtime.
+String constants without a leading underscore are not translated.
+
+@cindex @code{dcgettext} function (@command{gawk})
+@item dcgettext(@var{string} @r{[}, @var{domain} @r{[}, @var{category}@r{]]})
+This built-in function returns the translation of @var{string} in
+text domain @var{domain} for locale category @var{category}.
+The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
+The default value for @var{category} is @code{"LC_MESSAGES"}.
+
+If you supply a value for @var{category}, it must be a string equal to
+one of the known locale categories described in
+@ifnotinfo
+the previous @value{SECTION}.
+@end ifnotinfo
+@ifinfo
+@ref{Explaining gettext}.
+@end ifinfo
+You must also supply a text domain.  Use @code{TEXTDOMAIN} if
+you want to use the current domain.
+
+@strong{Caution:} The order of arguments to the @command{awk} version
+of the @code{dcgettext} function is purposely different from the order for
+the C version.  The @command{awk} version's order was
+chosen to be simple and to allow for reasonable @command{awk}-style
+default arguments.
+
+@cindex @code{dcngettext} function (@command{gawk})
+@item dcngettext(@var{string1}, @var{string2}, @var{number} @r{[}, @var{domain} @r{[}, @var{category}@r{]]})
+This built-in function returns the plural form used for @var{number} of the
+translation of @var{string1} and @var{string2} in text domain
+@var{domain} for locale category @var{category}. @var{string1} is the
+English singular variant of a message, and @var{string2} the English plural
+variant of the same message.
+The default value for @var{domain} is the current value of @code{TEXTDOMAIN}.
+The default value for @var{category} is @code{"LC_MESSAGES"}.
+
+The same remarks as for the @code{dcgettext} function apply.
+
+@cindex @code{.mo} files, specifying directory of
+@cindex files, @code{.mo}, specifying directory of
+@cindex message object files, specifying directory of
+@cindex files, message object, specifying directory of
+@cindex @code{bindtextdomain} function (@command{gawk})
+@item bindtextdomain(@var{directory} @r{[}, @var{domain}@r{]})
+This built-in function allows you to specify the directory in which
+@code{gettext} looks for @file{.mo} files, in case they
+will not or cannot be placed in the standard locations
+(e.g., during testing).
+It returns the directory in which @var{domain} is ``bound.''
+
+The default @var{domain} is the value of @code{TEXTDOMAIN}.
+If @var{directory} is the null string (@code{""}), then
+@code{bindtextdomain} returns the current binding for the
+given @var{domain}.
+@end table
+
+To use these facilities in your @command{awk} program, follow the steps
+outlined in
+@ifnotinfo
+the previous @value{SECTION},
+@end ifnotinfo
+@ifinfo
+@ref{Explaining gettext},
+@end ifinfo
+like so:
+
+@enumerate
+@cindex @code{BEGIN} pattern, @code{TEXTDOMAIN} variable and
+@cindex @code{TEXTDOMAIN} variable, @code{BEGIN} pattern and
+@item
+Set the variable @code{TEXTDOMAIN} to the text domain of
+your program.  This is best done in a @code{BEGIN} rule
+(@pxref{BEGIN/END}),
+or it can also be done via the @option{-v} command-line
+option (@pxref{Options}):
+
+@example
+BEGIN @{
+    TEXTDOMAIN = "guide"
+    @dots{}
+@}
+@end example
+
+@cindex @code{_} (underscore), translatable string
+@cindex underscore (@code{_}), translatable string
+@item
+Mark all translatable strings with a leading underscore (@samp{_})
+character.  It @emph{must} be adjacent to the opening
+quote of the string.  For example:
+
+@example
+print _"hello, world"
+x = _"you goofed"
+printf(_"Number of users is %d\n", nusers)
+@end example
+
+@item
+If you are creating strings dynamically, you can
+still translate them, using the @code{dcgettext}
+built-in function:
+
+@example
+message = nusers " users logged in"
+message = dcgettext(message, "adminprog")
+print message
+@end example
+
+Here, the call to @code{dcgettext} supplies a different
+text domain (@code{"adminprog"}) in which to find the
+message, but it uses the default @code{"LC_MESSAGES"} category.
+
+@cindex @code{LC_MESSAGES} locale category, @code{bindtextdomain} function (@command{gawk})
+@item
+During development, you might want to put the @file{.mo}
+file in a private directory for testing.  This is done
+with the @code{bindtextdomain} built-in function:
+
+@example
+BEGIN @{
+   TEXTDOMAIN = "guide"   # our text domain
+   if (Testing) @{
+       # where to find our files
+       bindtextdomain("testdir")
+       # joe is in charge of adminprog
+       bindtextdomain("../joe/testdir", "adminprog")
+   @}
+   @dots{}
+@}
+@end example
+
+@end enumerate
+
+@xref{I18N Example},
+for an example program showing the steps to create
+and use translations from @command{awk}.
+
+@node Translator i18n
+@section Translating @command{awk} Programs
+
+@cindex @code{.po} files
+@cindex files, @code{.po}
+@cindex portable object files
+@cindex files, portable object
+Once a program's translatable strings have been marked, they must
+be extracted to create the initial @file{.po} file.
+As part of translation, it is often helpful to rearrange the order
+in which arguments to @code{printf} are output.
+
+@command{gawk}'s @option{--gen-po} command-line option extracts
+the messages and is discussed next.
+After that, @code{printf}'s ability to
+rearrange the order for @code{printf} arguments at runtime
+is covered.
+
+@menu
+* String Extraction::           Extracting marked strings.
+* Printf Ordering::             Rearranging @code{printf} arguments.
+* I18N Portability::            @command{awk}-level portability issues.
+@end menu
+
+@node String Extraction
+@subsection Extracting Marked Strings
+@cindex strings, extracting
+@c comma does NOT start secondary
+@cindex marked strings, extracting
+@cindex @code{--gen-po} option
+@cindex command-line options, string extraction
+@cindex string extraction (internationalization)
+@cindex marked string extraction (internationalization)
+@cindex extraction, of marked strings (internationalization)
+
+@cindex @code{--gen-po} option
+Once your @command{awk} program is working, and all the strings have
+been marked and you've set (and perhaps bound) the text domain,
+it is time to produce translations.
+First, use the @option{--gen-po} command-line option to create
+the initial @file{.po} file:
+
+@example
+$ gawk --gen-po -f guide.awk > guide.po
+@end example
+
+@cindex @code{xgettext} utility
+When run with @option{--gen-po}, @command{gawk} does not execute your
+program.  Instead, it parses it as usual and prints all marked strings
+to standard output in the format of a GNU @code{gettext} Portable Object
+file.  Also included in the output are any constant strings that
+appear as the first argument to @code{dcgettext} or as the first and
+second arguments to @code{dcngettext}.@footnote{Starting with @code{gettext}
+version 0.11.5, the @command{xgettext} utility that comes with GNU
+@code{gettext} can handle @file{.awk} files.}
+@xref{I18N Example},
+for the full list of steps to go through to create and test
+translations for @command{guide}.
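+
+The following sketch illustrates which strings the rules above select
+for extraction.  The text domain, the messages, and the variable
+@code{nusers} are made up for illustration:
+
+@example
+BEGIN @{
+    TEXTDOMAIN = "adminprog"
+    nusers = 3
+    print _"Current status"           # marked: extracted
+    print dcgettext("All is well")    # constant first argument: extracted
+    # both constant strings given to dcngettext are extracted
+    printf(dcngettext("%d user\n", "%d users\n", nusers), nusers)
+    print "debug: done"               # unmarked: not extracted
+@}
+@end example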
+
+@node Printf Ordering
+@subsection Rearranging @code{printf} Arguments
+
+@cindex @code{printf} statement, positional specifiers
+@c comma does NOT start secondary
+@cindex positional specifiers, @code{printf} statement
+Format strings for @code{printf} and @code{sprintf}
+(@pxref{Printf})
+present a special problem for translation.
+Consider the following:@footnote{This example is borrowed
+from the GNU @code{gettext} manual.}
+
+@c line broken here only for smallbook format
+@example
+printf(_"String `%s' has %d characters\n",
+          string, length(string))
+@end example
+
+A possible German translation for this might be:
+
+@example
+"%d Zeichen lang ist die Zeichenkette `%s'\n"
+@end example
+
+The problem should be obvious: the order of the format
+specifications is different from the original!
+Even though @code{gettext} can return the translated string
+at runtime,
+it cannot change the argument order in the call to @code{printf}.
+
+To solve this problem, @code{printf} format specifiers may have
+an additional optional element, which we call a @dfn{positional specifier}.
+For example:
+
+@example
+"%2$d Zeichen lang ist die Zeichenkette `%1$s'\n"
+@end example
+
+Here, the positional specifier consists of an integer count, which indicates which
+argument to use, and a @samp{$}. Counts are one-based, and the
+format string itself is @emph{not} included.  Thus, in the following
+example, @samp{string} is the first argument and @samp{length(string)} is the second:
+
+@example
+$ gawk 'BEGIN @{
+>     string = "Dont Panic"
+>     printf _"%2$d characters live in \"%1$s\"\n",
+>                         string, length(string)
+> @}'
+@print{} 10 characters live in "Dont Panic"
+@end example
+
+If present, positional specifiers come first in the format specification,
+before the flags, the field width, and/or the precision.
+
+Positional specifiers can be used with the dynamic field width and
+precision capability:
+
+@example
+$ gawk 'BEGIN @{
+>    printf("%*.*s\n", 10, 20, "hello")
+>    printf("%3$*2$.*1$s\n", 20, 10, "hello")
+> @}'
+@print{}      hello
+@print{}      hello
+@end example
+
+@noindent
+@strong{Note:} When using @samp{*} with a positional specifier, the @samp{*}
+comes first, then the integer position, and then the @samp{$}.
+This is somewhat counterintuitive.
+
+@cindex @code{printf} statement, positional specifiers, mixing with regular formats
+@c first comma is part of primary
+@cindex positional specifiers, @code{printf} statement, mixing with regular formats
+@cindex format specifiers, mixing regular with positional specifiers
+@command{gawk} does not allow you to mix regular format specifiers
+and those with positional specifiers in the same string:
+
+@smallexample
+$ gawk 'BEGIN @{ printf _"%d %3$s\n", 1, 2, "hi" @}'
+@error{} gawk: cmd. line:1: fatal: must use `count$' on all formats or none
+@end smallexample
+
+@strong{Note:} There are some pathological cases that @command{gawk} may fail to
+diagnose.  In such cases, the output may not be what you expect.
+It's still a bad idea to try mixing them, even if @command{gawk}
+doesn't detect it.
+
+Although positional specifiers can be used directly in @command{awk} programs,
+their primary purpose is to help in producing correct translations of
+format strings into languages different from the one in which the program
+is first written.
+
+@node I18N Portability
+@subsection @command{awk} Portability Issues
+
+@cindex portability, internationalization and
+@cindex internationalization, localization, portability and
+@command{gawk}'s internationalization features were purposely chosen to
+have as little impact as possible on the portability of @command{awk}
+programs that use them to other versions of @command{awk}.
+Consider this program:
+
+@example
+BEGIN @{
+    TEXTDOMAIN = "guide"
+    if (Test_Guide)   # set with -v
+        bindtextdomain("/test/guide/messages")
+    print _"don't panic!"
+@}
+@end example
+
+@noindent
+As written, it won't work on other versions of @command{awk}.
+However, it is actually almost portable, requiring very little
+change:
+
+@itemize @bullet
+@cindex @code{TEXTDOMAIN} variable, portability and
+@item
+Assignments to @code{TEXTDOMAIN} won't have any effect,
+since @code{TEXTDOMAIN} is not special in other @command{awk} implementations.
+
+@item
+Non-GNU versions of @command{awk} treat marked strings
+as the concatenation of a variable named @code{_} with the string
+following it.@footnote{This is good fodder for an ``Obfuscated
+@command{awk}'' contest.} Typically, the variable @code{_} has
+the null string (@code{""}) as its value, leaving the original string constant as
+the result.
+
+@item
+By defining ``dummy'' functions to replace @code{dcgettext}, @code{dcngettext}
+and @code{bindtextdomain}, the @command{awk} program can be made to run, but
+all the messages are output in the original language.
+For example:
+
+@cindex @code{bindtextdomain} function (@command{gawk}), portability and
+@cindex @code{dcgettext} function (@command{gawk}), portability and
+@cindex @code{dcngettext} function (@command{gawk}), portability and
+@example
+@c file eg/lib/libintl.awk
+function bindtextdomain(dir, domain)
+@{
+    return dir
+@}
+
+function dcgettext(string, domain, category)
+@{
+    return string
+@}
+
+function dcngettext(string1, string2, number, domain, category)
+@{
+    return (number == 1 ? string1 : string2)
+@}
+@c endfile
+@end example
+
+@item
+The use of positional specifications in @code{printf} or
+@code{sprintf} is @emph{not} portable.
+To support @code{gettext} at the C level, many systems' C versions of
+@code{sprintf} do support positional specifiers.  But it works only if
+enough arguments are supplied in the function call.  Many versions of
+@command{awk} pass @code{printf} formats and arguments unchanged to the
+underlying C library version of @code{sprintf}, but only one format and
+argument at a time.  What happens if a positional specification is
+used is anybody's guess.
+However, since the positional specifications are primarily for use in
+@emph{translated} format strings, and since non-GNU @command{awk}s never
+retrieve the translated string, this should not be a problem in practice.
+@end itemize
+@c ENDOFRANGE inap
+
+@node I18N Example
+@section A Simple Internationalization Example
+
+Now let's look at a step-by-step example of how to internationalize and
+localize a simple @command{awk} program, using @file{guide.awk} as our
+original source:
+
+@example
+@c file eg/prog/guide.awk
+BEGIN @{
+    TEXTDOMAIN = "guide"
+    bindtextdomain(".")  # for testing
+    print _"Don't Panic"
+    print _"The Answer Is", 42
+    print "Pardon me, Zaphod who?"
+@}
+@c endfile
+@end example
+
+@noindent
+Run @samp{gawk --gen-po} to create the @file{.po} file:
+
+@example
+$ gawk --gen-po -f guide.awk > guide.po
+@end example
+
+@noindent
+This produces:
+
+@example
+@c file eg/data/guide.po
+#: guide.awk:4
+msgid "Don't Panic"
+msgstr ""
+
+#: guide.awk:5
+msgid "The Answer Is"
+msgstr ""
+
+@c endfile
+@end example
+
+This original portable object file is saved and reused for each language
+into which the application is translated.  The @code{msgid}
+is the original string and the @code{msgstr} is the translation.
+
+@strong{Note:} Strings not marked with a leading underscore do not
+appear in the @file{guide.po} file.
+
+Next, the messages must be translated.
+Here is a translation to a hypothetical dialect of English,
+called ``Mellow'':@footnote{Perhaps it would be better if it were
+called ``Hippy.'' Ah, well.}
+
+@example
+@group
+$ cp guide.po guide-mellow.po
+@var{Add translations to} guide-mellow.po @dots{}
+@end group
+@end example
+
+@noindent
+Following are the translations:
+
+@example
+@c file eg/data/guide-mellow.po
+#: guide.awk:4
+msgid "Don't Panic"
+msgstr "Hey man, relax!"
+
+#: guide.awk:5
+msgid "The Answer Is"
+msgstr "Like, the scoop is"
+
+@c endfile
+@end example
+
+@cindex Linux
+@cindex GNU/Linux
+The next step is to make the directory to hold the binary message object
+file and then to create the @file{guide.mo} file.
+The directory layout shown here is standard for GNU @code{gettext} on
+GNU/Linux systems.  Other versions of @code{gettext} may use a different
+layout:
+
+@example
+$ mkdir en_US en_US/LC_MESSAGES
+@end example
+
+@cindex @code{.po} files, converting to @code{.mo}
+@cindex files, @code{.po}, converting to @code{.mo}
+@cindex @code{.mo} files, converting from @code{.po}
+@cindex files, @code{.mo}, converting from @code{.po}
+@cindex portable object files, converting to message object files
+@cindex files, portable object, converting to message object files
+@cindex message object files, converting from portable object files
+@cindex files, message object, converting from portable object files
+@cindex @command{msgfmt} utility
+The @command{msgfmt} utility does the conversion from human-readable
+@file{.po} file to machine-readable @file{.mo} file.
+By default, @command{msgfmt} creates a file named @file{messages}.
+This file must be renamed and placed in the proper directory so that
+@command{gawk} can find it:
+
+@example
+$ msgfmt guide-mellow.po
+$ mv messages en_US/LC_MESSAGES/guide.mo
+@end example
+
+Finally, we run the program to test it:
+
+@example
+$ gawk -f guide.awk
+@print{} Hey man, relax!
+@print{} Like, the scoop is 42
+@print{} Pardon me, Zaphod who?
+@end example
+
+If the three replacement functions for @code{dcgettext}, @code{dcngettext}
+and @code{bindtextdomain}
+(@pxref{I18N Portability})
+are in a file named @file{libintl.awk},
+then we can run @file{guide.awk} unchanged as follows:
+
+@example
+$ gawk --posix -f guide.awk -f libintl.awk
+@print{} Don't Panic
+@print{} The Answer Is 42
+@print{} Pardon me, Zaphod who?
+@end example
+
+@node Gawk I18N
+@section @command{gawk} Can Speak Your Language
+
+As of @value{PVERSION} 3.1, @command{gawk} itself has been internationalized
+using the GNU @code{gettext} package.
+@ifinfo
+(GNU @code{gettext} is described in
+complete detail in
+@ref{Top}.)
+@end ifinfo
+@ifnotinfo
+(GNU @code{gettext} is described in
+complete detail in
+@cite{GNU gettext tools}.)
+@end ifnotinfo
+As of this writing, the latest version of GNU @code{gettext} is
+@uref{ftp://ftp.gnu.org/gnu/gettext/gettext-0.11.5.tar.gz, @value{PVERSION} 0.11.5}.
+
+If a translation of @command{gawk}'s messages exists,
+then @command{gawk} produces usage messages, warnings,
+and fatal errors in the local language.
+
+@cindex @code{--with-included-gettext} configuration option
+@cindex configuration option, @code{--with-included-gettext}
+On systems that do not use @value{PVERSION} 2 (or later) of the GNU C library, you should
+configure @command{gawk} with the @option{--with-included-gettext} option
+before compiling and installing it.
+@xref{Additional Configuration Options},
+for more information.
+@c ENDOFRANGE inloc
+
+@node Advanced Features
+@chapter Advanced Features of @command{gawk}
+@cindex advanced features, network connections, See Also networks, connections
+@c STARTOFRANGE gawadv
+@cindex @command{gawk}, features, advanced
+@c STARTOFRANGE advgaw
+@cindex advanced features, @command{gawk}
+@ignore
+Contributed by: Peter Langston <pud!psl@bellcore.bellcore.com>
+
+    Found in Steve English's "signature" line:
+
+"Write documentation as if whoever reads it is a violent psychopath
+who knows where you live."
+@end ignore
+@quotation
+@i{Write documentation as if whoever reads it is
+a violent psychopath who knows where you live.}@*
+Steve English, as quoted by Peter Langston
+@end quotation
+
+This @value{CHAPTER} discusses advanced features in @command{gawk}.
+It's a bit of a ``grab bag'' of items that are otherwise unrelated
+to each other.
+First, a command-line option allows @command{gawk} to recognize
+nondecimal numbers in input data, not just in @command{awk}
+programs.  Next, two-way I/O, discussed briefly in earlier parts of this
+@value{DOCUMENT}, is described in full detail, along with the basics
+of TCP/IP networking and BSD portal files.  Finally, @command{gawk}
+can @dfn{profile} an @command{awk} program, making it possible to tune
+it for performance.
+
+@ref{Dynamic Extensions},
+discusses the ability to dynamically add new built-in functions to
+@command{gawk}.  As this feature is still immature and likely to change,
+its description is relegated to an appendix.
+
+@menu
+* Nondecimal Data::             Allowing nondecimal input data.
+* Two-way I/O::                 Two-way communications with another process.
+* TCP/IP Networking::           Using @command{gawk} for network programming.
+* Portal Files::                Using @command{gawk} with BSD portals.
+* Profiling::                   Profiling your @command{awk} programs.
+@end menu
+
+@node Nondecimal Data
+@section Allowing Nondecimal Input Data
+@cindex @code{--non-decimal-data} option
+@cindex advanced features, @command{gawk}, nondecimal input data
+@c last comma does NOT start tertiary
+@cindex input, data, nondecimal
+@cindex constants, nondecimal
+
+If you run @command{gawk} with the @option{--non-decimal-data} option,
+you can have nondecimal constants in your input data:
+
+@c line break here for small book format
+@example
+$ echo 0123 123 0x123 |
+> gawk --non-decimal-data '@{ printf "%d, %d, %d\n",
+>                                         $1, $2, $3 @}'
+@print{} 83, 123, 291
+@end example
+
+For this feature to work, write your program so that
+@command{gawk} treats your data as numeric:
+
+@example
+$ echo 0123 123 0x123 | gawk '@{ print $1, $2, $3 @}'
+@print{} 0123 123 0x123
+@end example
+
+@noindent
+The @code{print} statement treats its expressions as strings.
+Although the fields can act as numbers when necessary,
+they are still strings, so @code{print} does not try to treat them
+numerically.  You may need to add zero to a field to force it to
+be treated as a number.  For example:
+
+@example
+$ echo 0123 123 0x123 | gawk --non-decimal-data '
+> @{ print $1, $2, $3
+>   print $1 + 0, $2 + 0, $3 + 0 @}'
+@print{} 0123 123 0x123
+@print{} 83 123 291
+@end example
+
+Because it is common to have decimal data with leading zeros, and because
+using it could lead to surprising results, the default is to leave this
+facility disabled.  If you want it, you must explicitly request it.
+
+@cindex programming conventions, @code{--non-decimal-data} option
+@cindex @code{--non-decimal-data} option, @code{strtonum} function and
+@cindex @code{strtonum} function (@command{gawk}), @code{--non-decimal-data} option and
+@strong{Caution:}
+@emph{Use of this option is not recommended.}
+It can break old programs very badly.
+Instead, use the @code{strtonum} function to convert your data
+(@pxref{Nondecimal-numbers}).
+This makes your programs easier to write and easier to read, and
+leads to less surprising results.
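+
+For instance, the earlier example might be written with
+@code{strtonum} as follows.  This sketch relies on @code{strtonum}
+interpreting a leading @samp{0} as octal and a leading @samp{0x} as
+hexadecimal:
+
+@example
+$ echo 0123 123 0x123 |
+> gawk '@{ print strtonum($1), strtonum($2), strtonum($3) @}'
+@print{} 83 123 291
+@end example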
+
+@node Two-way I/O
+@section Two-Way Communications with Another Process
+@cindex Brennan, Michael
+@cindex programmers, attractiveness of
+@smallexample
+@c Path: cssun.mathcs.emory.edu!gatech!newsxfer3.itd.umich.edu!news-peer.sprintlink.net!news-sea-19.sprintlink.net!news-in-west.sprintlink.net!news.sprintlink.net!Sprint!204.94.52.5!news.whidbey.com!brennan
+From: brennan@@whidbey.com (Mike Brennan)
+Newsgroups: comp.lang.awk
+Subject: Re: Learn the SECRET to Attract Women Easily
+Date: 4 Aug 1997 17:34:46 GMT
+@c Organization: WhidbeyNet
+@c Lines: 12
+Message-ID: <5s53rm$eca@@news.whidbey.com>
+@c References: <5s20dn$2e1@chronicle.concentric.net>
+@c Reply-To: brennan@whidbey.com
+@c NNTP-Posting-Host: asn202.whidbey.com
+@c X-Newsreader: slrn (0.9.4.1 UNIX)
+@c Xref: cssun.mathcs.emory.edu comp.lang.awk:5403
+
+On 3 Aug 1997 13:17:43 GMT, Want More Dates???
+<tracy78@@kilgrona.com> wrote:
+>Learn the SECRET to Attract Women Easily
+>
+>The SCENT(tm)  Pheromone Sex Attractant For Men to Attract Women
+
+The scent of awk programmers is a lot more attractive to women than
+the scent of perl programmers.
+--
+Mike Brennan
+@c brennan@@whidbey.com
+@end smallexample
+
+@c final comma is part of tertiary
+@cindex advanced features, @command{gawk}, processes, communicating with
+@cindex processes, two-way communications with
+It is often useful to be able to
+send data to a separate program for
+processing and then read the result.  This can always be
+done with temporary files:
+
+@example
+# write the data for processing
+tempfile = ("mydata." PROCINFO["pid"])
+while (@var{not done with data})
+    print @var{data} | ("subprogram > " tempfile)
+close("subprogram > " tempfile)
+
+# read the results, remove tempfile when done
+while ((getline newdata < tempfile) > 0)
+    @var{process} newdata @var{appropriately}
+close(tempfile)
+system("rm " tempfile)
+@end example
+
+@noindent
+This works, but not elegantly.  Among other things, it requires that
+the program be run in a directory that cannot be shared among users;
+for example, @file{/tmp} will not do, as another user might happen
+to be using a temporary file with the same name.
+
+@cindex coprocesses
+@cindex input/output, two-way
+@cindex @code{|} (vertical bar), @code{|&} operator (I/O)
+@cindex vertical bar (@code{|}), @code{|&} operator (I/O)
+@cindex @command{csh} utility, @code{|&} operator, comparison with
+Starting with @value{PVERSION} 3.1 of @command{gawk}, it is possible to
+open a @emph{two-way} pipe to another process.  The second process is
+termed a @dfn{coprocess}, since it runs in parallel with @command{gawk}.
+The two-way connection is created using the new @samp{|&} operator
+(borrowed from the Korn shell, @command{ksh}):@footnote{This is very
+different from the same operator in the C shell, @command{csh}.}
+
+@example
+do @{
+    print @var{data} |& "subprogram"
+    "subprogram" |& getline results
+@} while (@var{data left to process})
+close("subprogram")
+@end example
+
+The first time an I/O operation is executed using the @samp{|&}
+operator, @command{gawk} creates a two-way pipeline to a child process
+that runs the other program.  Output created with @code{print}
+or @code{printf} is written to the program's standard input, and
+output from the program's standard output can be read by the @command{gawk}
+program using @code{getline}.
+As is the case with processes started by @samp{|}, the subprogram
+can be any program, or pipeline of programs, that can be started by
+the shell.
+
+There are some cautionary items to be aware of:
+
+@itemize @bullet
+@item
+As the code inside @command{gawk} currently stands, the coprocess's
+standard error goes to the same place that the parent @command{gawk}'s
+standard error goes. It is not possible to read the child's
+standard error separately.
+
+@cindex deadlocks
+@cindex buffering, input/output
+@cindex @code{getline} command, deadlock and
+@item
+I/O buffering may be a problem.  @command{gawk} automatically
+flushes all output down the pipe to the child process.
+However, if the coprocess does not flush its output,
+@command{gawk} may hang when doing a @code{getline} in order to read
+the coprocess's results.  This could lead to a situation
+known as @dfn{deadlock}, where each process is waiting for the
+other one to do something.
+@end itemize
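+
+As a sketch of how a coprocess can cooperate, the following program
+starts a second @command{gawk} as the coprocess and has it call
+@code{fflush} after every line it prints, so that each result becomes
+available to the parent right away.  The command string is made up
+for illustration:
+
+@example
+BEGIN @{
+    # the coprocess flushes its output after each line it prints
+    cmd = "gawk '@{ print toupper($0); fflush() @}'"
+    print "hello" |& cmd
+    if ((cmd |& getline result) > 0)
+        print "got", result
+    close(cmd)
+@}
+@end example
+
+@noindent
+Without the call to @code{fflush}, the coprocess might hold its output
+in a buffer, and the @code{getline} in the parent could wait forever.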
+
+@cindex @code{close} function, two-way pipes and
+It is possible to close just one end of the two-way pipe to
+a coprocess, by supplying a second argument to the @code{close}
+function of either @code{"to"} or @code{"from"}
+(@pxref{Close Files And Pipes}).
+These strings tell @command{gawk} to close the end of the pipe
+that sends data to the process or the end that reads from it,
+respectively.
+
+@cindex @command{sort} utility, coprocesses and
+This is particularly necessary in order to use
+the system @command{sort} utility as part of a coprocess;
+@command{sort} must read @emph{all} of its input
+data before it can produce any output.
+The @command{sort} program does not receive an end-of-file indication
+until @command{gawk} closes the write end of the pipe.
+
+When you have finished writing data to the @command{sort}
+utility, you can close the @code{"to"} end of the pipe, and
+then start reading sorted data via @code{getline}.
+For example:
+
+@example
+BEGIN @{
+    command = "LC_ALL=C sort"
+    n = split("abcdefghijklmnopqrstuvwxyz", a, "")
+
+    for (i = n; i > 0; i--)
+        print a[i] |& command
+    close(command, "to")
+
+    while ((command |& getline line) > 0)
+        print "got", line
+    close(command)
+@}
+@end example
+
+This program writes the letters of the alphabet in reverse order, one
+per line, down the two-way pipe to @command{sort}.  It then closes the
+write end of the pipe, so that @command{sort} receives an end-of-file
+indication.  This causes @command{sort} to sort the data and write the
+sorted data back to the @command{gawk} program.  Once all of the data
+has been read, @command{gawk} terminates the coprocess and exits.
+
+As a side note, the assignment @samp{LC_ALL=C} in the @command{sort}
+command ensures traditional Unix (ASCII) sorting from @command{sort}.
+
+Beginning with @command{gawk} 3.1.2, you may use pseudo-ttys (ptys) for
+two-way communication instead of pipes, if your system supports them.
+This is done on a per-command basis, by setting a special element
+in the @code{PROCINFO} array
+(@pxref{Auto-set}),
+like so:
+
+@example
+command = "sort -nr"           # command, saved in variable for convenience
+PROCINFO[command, "pty"] = 1   # update PROCINFO
+print @dots{} |& command       # start two-way pipe
+@dots{}
+@end example
+
+@noindent
+Using ptys avoids the buffer deadlock issues described earlier, at some
+loss in performance.  If your system does not have ptys, or if all the
+system's ptys are in use, @command{gawk} automatically falls back to
+using regular pipes.
+
+@node TCP/IP Networking
+@section Using @command{gawk} for Network Programming
+@cindex advanced features, @command{gawk}, network programming
+@cindex networks, programming
+@c STARTOFRANGE tcpip
+@cindex TCP/IP
+@cindex @code{/inet/} files (@command{gawk})
+@cindex files, @code{/inet/} (@command{gawk})
+@cindex @code{EMISTERED}
+@quotation
+@code{EMISTERED}: @i{A host is a host from coast to coast,@*
+and no-one can talk to host that's close,@*
+unless the host that isn't close@*
+is busy hung or dead.}
+@end quotation
+
+In addition to being able to open a two-way pipeline to a coprocess
+on the same system
+(@pxref{Two-way I/O}),
+it is possible to make a two-way connection to
+another process on another system across an IP networking connection.
+
+You can think of this as just a @emph{very long} two-way pipeline to
+a coprocess.
+The way @command{gawk} decides that you want to use TCP/IP networking is
+by recognizing special @value{FN}s that begin with @samp{/inet/}.
+
+The full syntax of the special @value{FN} is
+@file{/inet/@var{protocol}/@var{local-port}/@var{remote-host}/@var{remote-port}}.
+The components are:
+
+@table @var
+@item protocol
+The protocol to use over IP.  This must be either @samp{tcp},
+@samp{udp}, or @samp{raw}, for a TCP, UDP, or raw IP connection,
+respectively.  The use of TCP is recommended for most applications.
+
+@cindex raw sockets
+@cindex sockets
+@strong{Caution:} The use of raw sockets is not currently supported
+in @value{PVERSION} 3.1 of @command{gawk}.
+
+@item local-port
+@cindex @code{getservbyname} function (C library)
+The local TCP or UDP port number to use.  Use a port number of @samp{0}
+when you want the system to pick a port. This is what you should do
+when writing a TCP or UDP client.
+You may also use a well-known service name, such as @samp{smtp}
+or @samp{http}, in which case @command{gawk} attempts to determine
+the predefined port number using the C @code{getservbyname} function.
+
+@item remote-host
+The IP address or fully-qualified domain name of the Internet
+host to which you want to connect.
+
+@item remote-port
+The TCP or UDP port number to use on the given @var{remote-host}.
+Again, use @samp{0} if you don't care, or else a well-known
+service name.
+@end table
+
+Consider the following very simple example:
+
+@example
+BEGIN @{
+  Service = "/inet/tcp/0/localhost/daytime"
+  Service |& getline
+  print $0
+  close(Service)
+@}
+@end example
+
+This program reads the current date and time from the local system's
+TCP @samp{daytime} server.
+It then prints the results and closes the connection.
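+
+A request/response exchange follows the same pattern.  The following
+is a minimal sketch of an HTTP client; it assumes that a web server is
+listening on port 80 of @samp{localhost}, which may not be true on
+your system:
+
+@example
+BEGIN @{
+  RS = ORS = "\r\n"                # HTTP lines end in CR-LF
+  Server = "/inet/tcp/0/localhost/80"
+  print "GET / HTTP/1.0" |& Server
+  print "" |& Server               # empty line ends the request
+  while ((Server |& getline line) > 0)
+      print line
+  close(Server)
+@}
+@end example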
+
+Because this topic is extensive, the use of @command{gawk} for
+TCP/IP programming is documented separately.
+@ifinfo
+@xref{Top},
+@end ifinfo
+@ifnotinfo
+See @cite{TCP/IP Internetworking with @command{gawk}},
+which comes as part of the @command{gawk} distribution,
+@end ifnotinfo
+for a much more complete introduction and discussion, as well as
+extensive examples.
+
+@node Portal Files
+@section Using @command{gawk} with BSD Portals
+@cindex advanced features, @command{gawk}, BSD portals
+@cindex portal files
+@cindex files, portal
+@cindex BSD portals
+@cindex @code{/p} files (@command{gawk})
+@cindex files, @code{/p} (@command{gawk})
+@cindex @code{--enable-portals} configuration option
+@cindex operating systems, BSD-based
+
+Similar to the @file{/inet} special files, if @command{gawk}
+is configured with the @option{--enable-portals} option
+(@pxref{Quick Installation}),
+then @command{gawk} treats
+files whose pathnames begin with @code{/p} as 4.4 BSD-style portals.
+
+@cindex @code{|} (vertical bar), @code{|&} operator (I/O), two-way communications
+@cindex vertical bar (@code{|}), @code{|&} operator (I/O), two-way communications
+When used with the @samp{|&} operator, @command{gawk} opens the file
+for two-way communications.  The operating system's portal mechanism
+then manages creating the process associated with the portal and
+the corresponding communications with the portal's process.
+@c ENDOFRANGE tcpip
+
+@node Profiling
+@section Profiling Your @command{awk} Programs
+@c STARTOFRANGE awkp
+@cindex @command{awk} programs, profiling
+@c STARTOFRANGE proawk
+@cindex profiling @command{awk} programs
+@c STARTOFRANGE pgawk
+@cindex @command{pgawk} program
+@cindex profiling @command{gawk}, See @command{pgawk} program
+
+Beginning with @value{PVERSION} 3.1 of @command{gawk}, you may produce execution
+traces of your @command{awk} programs.
+This is done with a specially compiled version of @command{gawk},
+called @command{pgawk} (``profiling @command{gawk}'').
+
+@cindex @code{awkprof.out} file
+@cindex files, @code{awkprof.out}
+@cindex @command{pgawk} program, @code{awkprof.out} file
+@command{pgawk} is identical in every way to @command{gawk}, except that when
+it has finished running, it creates a profile of your program in a file
+named @file{awkprof.out}.
+Because it is profiling, it also executes up to 45% slower than
+@command{gawk} normally does.
+
+@cindex @code{--profile} option
+As shown in the following example,
+the @option{--profile} option can be used to change the name of the file
+where @command{pgawk} will write the profile:
+
+@example
+$ pgawk --profile=myprog.prof -f myprog.awk data1 data2
+@end example
+
+@noindent
+In the above example, @command{pgawk} places the profile in
+@file{myprog.prof} instead of in @file{awkprof.out}.
+
+Regular @command{gawk} also accepts this option.  When called with just
+@option{--profile}, @command{gawk} ``pretty prints'' the program into
+@file{awkprof.out}, without any execution counts.  You may supply an
+option to @option{--profile} to change the @value{FN}.  Here is a sample
+session showing a simple @command{awk} program, its input data, and the
+results from running @command{pgawk}.  First, the @command{awk} program:
+
+@example
+BEGIN @{ print "First BEGIN rule" @}
+
+END @{ print "First END rule" @}
+
+/foo/ @{
+    print "matched /foo/, gosh"
+    for (i = 1; i <= 3; i++)
+        sing()
+@}
+
+@{
+    if (/foo/)
+        print "if is true"
+    else
+        print "else is true"
+@}
+
+BEGIN @{ print "Second BEGIN rule" @}
+
+END @{ print "Second END rule" @}
+
+function sing(    dummy)
+@{
+    print "I gotta be me!"
+@}
+@end example
+
+Following is the input data:
+
+@example
+foo
+bar
+baz
+foo
+junk
+@end example
+
+Here is the @file{awkprof.out} that results from running @command{pgawk}
+on this program and data (this example also illustrates that @command{awk}
+programmers sometimes have to work late):
+
+@cindex @code{BEGIN} pattern, @command{pgawk} program
+@cindex @code{END} pattern, @command{pgawk} program
+@example
+        # gawk profile, created Sun Aug 13 00:00:15 2000
+
+        # BEGIN block(s)
+
+        BEGIN @{
+     1          print "First BEGIN rule"
+     1          print "Second BEGIN rule"
+        @}
+
+        # Rule(s)
+
+     5  /foo/   @{ # 2
+     2          print "matched /foo/, gosh"
+     6          for (i = 1; i <= 3; i++) @{
+     6                  sing()
+                @}
+        @}
+
+     5  @{
+     5          if (/foo/) @{ # 2
+     2                  print "if is true"
+     3          @} else @{
+     3                  print "else is true"
+                @}
+        @}
+
+        # END block(s)
+
+        END @{
+     1          print "First END rule"
+     1          print "Second END rule"
+        @}
+
+        # Functions, listed alphabetically
+
+     6  function sing(dummy)
+        @{
+     6          print "I gotta be me!"
+        @}
+@end example
+
+This example illustrates many of the basic rules for profiling output.
+The rules are as follows:
+
+@itemize @bullet
+@item
+The program is printed in the order @code{BEGIN} rule,
+pattern/action rules, @code{END} rule, and functions; the functions
+are listed alphabetically.
+Multiple @code{BEGIN} and @code{END} rules are merged together.
+
+@cindex patterns, counts
+@item
+Pattern/action rules have two counts.
+The first count, to the left of the rule, shows how many times
+the rule's pattern was @emph{tested}.
+The second count, to the right of the rule's opening left brace
+in a comment,
+shows how many times the rule's action was @emph{executed}.
+The difference between the two indicates how many times the rule's
+pattern evaluated to false.
+
+@item
+Similarly,
+the count for an @code{if}-@code{else} statement shows how many times
+the condition was tested.
+To the right of the opening left brace for the @code{if}'s body
+is a count showing how many times the condition was true.
+The count for the @code{else}
+indicates how many times the test failed.
+
+@cindex loops, count for header
+@item
+The count for a loop header (such as @code{for}
+or @code{while}) shows how many times the loop test was executed.
+(Because of this, you can't just look at the count on the first
+statement in a rule to determine how many times the rule was executed.
+If the first statement is a loop, the count is misleading.)
+
+@cindex functions, user-defined, counts
+@cindex user-defined, functions, counts
+@item
+For user-defined functions, the count next to the @code{function}
+keyword indicates how many times the function was called.
+The counts next to the statements in the body show how many times
+those statements were executed.
+
+@cindex @code{@{@}} (braces), @command{pgawk} program
+@cindex braces (@code{@{@}}), @command{pgawk} program
+@item
+The layout uses ``K&R'' style with tabs.
+Braces are used everywhere, even when
+the body of an @code{if}, @code{else}, or loop is only a single statement.
+
+@cindex @code{()} (parentheses), @command{pgawk} program
+@cindex parentheses (@code{()}), @command{pgawk} program
+@item
+Parentheses are used only where needed, as indicated by the structure
+of the program and the precedence rules.
+@c extra verbiage here satisfies the copyeditor. ugh.
+For example, @samp{(3 + 5) * 4} means add three plus five, then multiply
+the total by four.  However, @samp{3 + 5 * 4} has no parentheses, and
+means @samp{3 + (5 * 4)}.
+
+@item
+All string concatenations are parenthesized too.
+(This could be made a bit smarter.)
+
+@item
+Parentheses are used around the arguments to @code{print}
+and @code{printf} only when
+the @code{print} or @code{printf} statement is followed by a redirection.
+Similarly, if
+the target of a redirection isn't a scalar, it gets parenthesized.
+
+@item
+@command{pgawk} supplies leading comments in
+front of the @code{BEGIN} and @code{END} rules,
+the pattern/action rules, and the functions.
+
+@end itemize
+
+The profiled version of your program may not look exactly like what you
+typed when you wrote it.  This is because @command{pgawk} creates the
+profiled version by ``pretty printing'' its internal representation of
+the program.  The advantage to this is that @command{pgawk} can produce
+a standard representation.  The disadvantage is that all source-code
+comments are lost, as are the distinctions among multiple @code{BEGIN}
+and @code{END} rules.  Also, things such as:
+
+@example
+/foo/
+@end example
+
+@noindent
+come out as:
+
+@example
+/foo/   @{
+    print $0
+@}
+@end example
+
+@noindent
+which is correct, but possibly surprising.
+
+@cindex profiling @command{awk} programs, dynamically
+@cindex @command{pgawk} program, dynamic profiling
+Besides creating profiles when a program has completed,
+@command{pgawk} can produce a profile while it is running.
+This is useful if your @command{awk} program goes into an
+infinite loop and you want to see what has been executed.
+To use this feature, run @command{pgawk} in the background:
+
+@example
+$ pgawk -f myprog &
+[1] 13992
+@end example
+
+@c comma does NOT start secondary
+@cindex @command{kill} command, dynamic profiling
+@cindex @code{USR1} signal
+@cindex signals, @code{USR1}/@code{SIGUSR1}
+@noindent
+The shell prints a job number and a process ID; in this case, the
+process ID is 13992.
+Use the @command{kill} command to send the @code{USR1} signal
+to @command{pgawk}:
+
+@example
+$ kill -USR1 13992
+@end example
+
+@noindent
+As usual, the profiled version of the program is written to
+@file{awkprof.out}, or to a different file if you named one with the
+@option{--profile} option.
+
+Along with the regular profile, as shown earlier, the profile
+includes a trace of any active functions:
+
+@example
+# Function Call Stack:
+
+#   3. baz
+#   2. bar
+#   1. foo
+# -- main --
+@end example
+
+You may send @command{pgawk} the @code{USR1} signal as many times as you like.
+Each time, the profile and function call trace are appended to the output
+profile file.
+
+@cindex @code{HUP} signal
+@cindex signals, @code{HUP}/@code{SIGHUP}
+If you use the @code{HUP} signal instead of the @code{USR1} signal,
+@command{pgawk} produces the profile and the function call trace and then exits.
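+
+For example, continuing the session shown earlier (where the process ID
+was 13992), the following command asks @command{pgawk} to write its
+profile and function call trace and then exit:
+
+@example
+$ kill -HUP 13992
+@end example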
+
+@cindex @code{INT} signal (MS-DOS)
+@cindex signals, @code{INT}/@code{SIGINT} (MS-DOS)
+@cindex @code{QUIT} signal (MS-DOS)
+@cindex signals, @code{QUIT}/@code{SIGQUIT} (MS-DOS)
+When @command{pgawk} runs on MS-DOS or MS-Windows, it uses the
+@code{INT} and @code{QUIT} signals for producing the profile and, in
+the case of the @code{INT} signal, @command{pgawk} exits.  This is
+because these systems don't support the @command{kill} command, so the
+only signals you can deliver to a program are those generated by the
+keyboard.  The @code{INT} signal is generated by the
+@kbd{@value{CTL}-@key{C}} or @kbd{@value{CTL}-@key{BREAK}} key, while the
+@code{QUIT} signal is generated by the @kbd{@value{CTL}-@key{\}} key.
+@c ENDOFRANGE advgaw
+@c ENDOFRANGE gawadv
+@c ENDOFRANGE pgawk
+@c ENDOFRANGE awkp
+@c ENDOFRANGE proawk
+
+@node Invoking Gawk
+@chapter Running @command{awk} and @command{gawk}
+
+This @value{CHAPTER} covers how to run @command{awk}, both POSIX-standard
+and @command{gawk}-specific command-line options, and what
+@command{awk} and
+@command{gawk} do with non-option arguments.
+It then proceeds to cover how @command{gawk} searches for source files,
+obsolete options and/or features, and known bugs in @command{gawk}.
+This @value{CHAPTER} rounds out the discussion of @command{awk}
+as a program and as a language.
+
+While a number of the options and features described here were
+discussed in passing earlier in the book, this @value{CHAPTER} provides the
+full details.
+
+@menu
+* Command Line::                How to run @command{awk}.
+* Options::                     Command-line options and their meanings.
+* Other Arguments::             Input file names and variable assignments.
+* AWKPATH Variable::            Searching directories for @command{awk}
+                                programs.
+* Obsolete::                    Obsolete Options and/or features.
+* Undocumented::                Undocumented Options and Features.
+* Known Bugs::                  Known Bugs in @command{gawk}.
+@end menu
+
+@node Command Line
+@section Invoking @command{awk}
+@cindex command line, invoking @command{awk} from
+@cindex @command{awk}, invoking
+@cindex arguments, command-line, invoking @command{awk}
+@cindex options, command-line, invoking @command{awk}
+
+There are two ways to run @command{awk}---with an explicit program or with
+one or more program files.  Here are templates for both of them; items
+enclosed in [@dots{}] in these templates are optional:
+
+@example
+awk @r{[@var{options}]} -f progfile @r{[@code{--}]} @var{file} @dots{}
+awk @r{[@var{options}]} @r{[@code{--}]} '@var{program}' @var{file} @dots{}
+@end example
+
+@cindex GNU long options
+@cindex long options
+@cindex options, long
+Besides traditional one-letter POSIX-style options, @command{gawk} also
+supports GNU long options.
+
+@cindex dark corner, invoking @command{awk}
+@cindex lint checking, empty programs
+It is possible to invoke @command{awk} with an empty program:
+
+@example
+awk '' datafile1 datafile2
+@end example
+
+@cindex @code{--lint} option
+@noindent
+Doing so makes little sense, though; @command{awk} exits
+silently when given an empty program.
+@value{DARKCORNER}
+If @option{--lint} has
+been specified on the command line, @command{gawk} issues a
+warning that the program is empty.
+
+@node Options
+@section Command-Line Options
+@c STARTOFRANGE ocl
+@cindex options, command-line
+@c STARTOFRANGE clo
+@cindex command line, options
+@c STARTOFRANGE gnulo
+@cindex GNU long options
+@c STARTOFRANGE longo
+@cindex options, long
+
+Options begin with a dash and consist of a single character.
+GNU-style long options consist of two dashes and a keyword.
+The keyword can be abbreviated, as long as the abbreviation allows the option
+to be uniquely identified.  If the option takes an argument, then the
+keyword is either immediately followed by an equals sign (@samp{=}) and the
+argument's value, or the keyword and the argument's value are separated
+by whitespace.
+If a particular option with a value is given more than once, it is the
+last value that counts.
+
+@cindex POSIX @command{awk}, GNU long options and
+Each long option for @command{gawk} has a corresponding
+POSIX-style option.
+The long and short options are
+interchangeable in all contexts.
+The options and their meanings are as follows:
+
+@table @code
+@item -F @var{fs}
+@itemx --field-separator @var{fs}
+@cindex @code{-F} option
+@cindex @code{--field-separator} option
+@cindex @code{FS} variable, @code{--field-separator} option and
+Sets the @code{FS} variable to @var{fs}
+(@pxref{Field Separators}).
+
+@item -f @var{source-file}
+@itemx --file @var{source-file}
+@cindex @code{-f} option
+@cindex @code{--file} option
+@cindex @command{awk} programs, location of
+Indicates that the @command{awk} program is to be found in @var{source-file}
+instead of in the first non-option argument.
+
+@item -v @var{var}=@var{val}
+@itemx --assign @var{var}=@var{val}
+@cindex @code{-v} option
+@cindex @code{--assign} option
+@cindex variables, setting
+Sets the variable @var{var} to the value @var{val} @emph{before}
+execution of the program begins.  Such variable values are available
+inside the @code{BEGIN} rule
+(@pxref{Other Arguments}).
+
+The @option{-v} option can only set one variable, but it can be used
+more than once, setting another variable each time, like this:
+@samp{awk @w{-v foo=1} @w{-v bar=2} @dots{}}.
+
+@c last comma is part of secondary
+@cindex built-in variables, @code{-v} option, setting with
+@c last comma is part of tertiary
+@cindex variables, built-in, @code{-v} option, setting with
+@strong{Caution:}  Using @option{-v} to set the values of the built-in
+variables may lead to surprising results.  @command{awk} will reset the
+values of those variables as it needs to, possibly ignoring any
+predefined value you may have given.
+
+@item -mf @var{N}
+@itemx -mr @var{N}
+@cindex @code{-mf}/@code{-mr} options
+@cindex memory, setting limits
+Sets various memory limits to the value @var{N}.  The @samp{f} flag sets
+the maximum number of fields and the @samp{r} flag sets the maximum
+record size.  These two flags and the @option{-m} option are from the
+Bell Laboratories research version of Unix @command{awk}.  They are provided
+for compatibility but otherwise ignored by
+@command{gawk}, since @command{gawk} has no predefined limits.
+(The Bell Laboratories @command{awk} no longer needs these options;
+it continues to accept them to avoid breaking old programs.)
+
+@item -W @var{gawk-opt}
+@cindex @code{-W} option
+Following the POSIX standard, implementation-specific
+options are supplied as arguments to the @option{-W} option.  These options
+also have corresponding GNU-style long options.
+Note that the long options may be abbreviated, as long as
+the abbreviations remain unique.
+The full list of @command{gawk}-specific options is provided next.
+
+@item --
+@cindex command line, options, end of
+@cindex options, command-line, end of
+Signals the end of the command-line options.  The following arguments
+are not treated as options even if they begin with @samp{-}.  This
+interpretation of @option{--} follows the POSIX argument parsing
+conventions.
+
+@cindex @code{-} (hyphen), filenames beginning with
+@cindex hyphen (@code{-}), filenames beginning with
+This is useful if you have @value{FN}s that start with @samp{-},
+or in shell scripts, if you have @value{FN}s that will be specified
+by the user that could start with @samp{-}.
+@end table
+@c ENDOFRANGE gnulo
+@c ENDOFRANGE longo
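+
+As an illustration of two of these options, the first command below uses
+@option{-v} twice to set two variables before the @code{BEGIN} rule runs,
+and the second uses @option{--} so that a @value{DF} whose name starts
+with @samp{-} is not mistaken for an option.  The names @file{count.awk}
+and @file{-odd-name.txt} are hypothetical:
+
+@example
+$ awk -v foo=1 -v bar=2 'BEGIN @{ print foo + bar @}'
+@print{} 3
+$ awk -f count.awk -- -odd-name.txt
+@end example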
+
+The previous list described options mandated by the POSIX standard,
+as well as options available in the Bell Laboratories version of @command{awk}.
+The following list describes @command{gawk}-specific options:
+
+@table @code
+@item -W compat
+@itemx -W traditional
+@itemx --compat
+@itemx --traditional
+@cindex @code{--compat} option
+@cindex @code{--traditional} option
+@cindex compatibility mode (@command{gawk}), specifying
+Specifies @dfn{compatibility mode}, in which the GNU extensions to
+the @command{awk} language are disabled, so that @command{gawk} behaves just
+like the Bell Laboratories research version of Unix @command{awk}.
+@option{--traditional} is the preferred form of this option.
+@xref{POSIX/GNU},
+which summarizes the extensions.  Also see
+@ref{Compatibility Mode}.
+
+@item -W copyright
+@itemx --copyright
+@cindex @code{--copyright} option
+@cindex GPL (General Public License), printing
+Print the short version of the General Public License and then exit.
+
+@item -W copyleft
+@itemx --copyleft
+@cindex @code{--copyleft} option
+Just like @option{--copyright}.
+This option may disappear in a future version of @command{gawk}.
+
+@cindex @code{--dump-variables} option
+@cindex @code{awkvars.out} file
+@cindex files, @code{awkvars.out}
+@cindex variables, global, printing list of
+@item -W dump-variables@r{[}=@var{file}@r{]}
+@itemx --dump-variables@r{[}=@var{file}@r{]}
+Prints a sorted list of global variables, their types, and final values
+to @var{file}.  If no @var{file} is provided, @command{gawk} prints this
+list to the file named @file{awkvars.out} in the current directory.
+
+@c last comma is part of secondary
+@cindex troubleshooting, typographical errors, global variables
+Having a list of all global variables is a good way to look for
+typographical errors in your programs.
+You would also use this option if you have a large program with a lot of
+functions, and you want to be sure that your functions don't
+inadvertently use global variables that you meant to be local.
+(This is a particularly easy mistake to make with simple variable
+names like @code{i}, @code{j}, etc.)
+
+@item -W gen-po
+@itemx --gen-po
+@cindex @code{--gen-po} option
+@cindex portable object files, generating
+@cindex files, portable object, generating
+Analyzes the source program and
+generates a GNU @code{gettext} Portable Object file on standard
+output for all string constants that have been marked for translation.
+@xref{Internationalization},
+for information about this option.
+
+@item -W help
+@itemx -W usage
+@itemx --help
+@itemx --usage
+@cindex @code{--help} option
+@cindex @code{--usage} option
+@cindex GNU long options, printing list of
+@cindex options, printing list of
+@cindex printing, list of options
+Prints a ``usage'' message summarizing the short and long style options
+that @command{gawk} accepts and then exits.
+
+@item -W lint@r{[}=fatal@r{]}
+@itemx --lint@r{[}=fatal@r{]}
+@cindex @code{--lint} option
+@cindex lint checking, issuing warnings
+@cindex warnings, issuing
+Warns about constructs that are dubious or nonportable to
+other @command{awk} implementations.
+Some warnings are issued when @command{gawk} first reads your program.  Others
+are issued at runtime, as your program executes.
+With an optional argument of @samp{fatal},
+lint warnings become fatal errors.
+This may be drastic, but its use will certainly encourage the
+development of cleaner @command{awk} programs.
+With an optional argument of @samp{invalid}, only warnings about things that are
+actually invalid are issued. (This is not fully implemented yet.)
+
+@item -W lint-old
+@itemx --lint-old
+@cindex @code{--lint-old} option
+Warns about constructs that are not available in the original version of
+@command{awk} from Version 7 Unix
+(@pxref{V7/SVR3.1}).
+
+@item -W non-decimal-data
+@itemx --non-decimal-data
+@cindex @code{--non-decimal-data} option
+@cindex hexadecimal, values, enabling interpretation of
+@c comma is part of primary
+@cindex octal values, enabling interpretation of
+Enable automatic interpretation of octal and hexadecimal
+values in input data
+(@pxref{Nondecimal Data}).
+
+@cindex troubleshooting, @code{--non-decimal-data} option
+@strong{Caution:} This option can severely break old programs.
+Use with care.
+
+@item -W posix
+@itemx --posix
+@cindex @code{--posix} option
+@cindex POSIX mode
+@c last comma is part of tertiary
+@cindex @command{gawk}, extensions, disabling
+Operates in strict POSIX mode.  This disables all @command{gawk}
+extensions (just like @option{--traditional}) and adds the following additional
+restrictions:
+
+@c IMPORTANT! Keep this list in sync with the one in node POSIX
+
+@itemize @bullet
+@cindex escape sequences, unrecognized
+@item
+@code{\x} escape sequences are not recognized
+(@pxref{Escape Sequences}).
+
+@cindex newlines
+@cindex whitespace, newlines as
+@item
+Newlines do not act as whitespace to separate fields when @code{FS} is
+equal to a single space
+(@pxref{Fields}).
+
+@item
+Newlines are not allowed after @samp{?} or @samp{:}
+(@pxref{Conditional Exp}).
+
+@item
+The synonym @code{func} for the keyword @code{function} is not
+recognized (@pxref{Definition Syntax}).
+
+@cindex @code{*} (asterisk), @code{**} operator
+@cindex asterisk (@code{*}), @code{**} operator
+@cindex @code{*} (asterisk), @code{**=} operator
+@cindex asterisk (@code{*}), @code{**=} operator
+@cindex @code{^} (caret), @code{^} operator
+@cindex caret (@code{^}), @code{^} operator
+@cindex @code{^} (caret), @code{^=} operator
+@cindex caret (@code{^}), @code{^=} operator
+@item
+The @samp{**} and @samp{**=} operators cannot be used in
+place of @samp{^} and @samp{^=} (@pxref{Arithmetic Ops},
+and also @pxref{Assignment Ops}).
+
+@cindex @code{FS} variable, as TAB character
+@item
+Specifying @samp{-Ft} on the command line does not set the value
+of @code{FS} to be a single TAB character
+(@pxref{Field Separators}).
+
+@c comma does not start secondary
+@cindex @code{fflush} function, unsupported
+@item
+The @code{fflush} built-in function is not supported
+(@pxref{I/O Functions}).
+@end itemize
+
+@c @cindex automatic warnings
+@c @cindex warnings, automatic
+@cindex @code{--traditional} option, @code{--posix} option and
+@cindex @code{--posix} option, @code{--traditional} option and
+If you supply both @option{--traditional} and @option{--posix} on the
+command line, @option{--posix} takes precedence. @command{gawk}
+also issues a warning if both options are supplied.
+
+@item -W profile@r{[}=@var{file}@r{]}
+@itemx --profile@r{[}=@var{file}@r{]}
+@cindex @code{--profile} option
+@cindex @command{awk} programs, profiling, enabling
+Enable profiling of @command{awk} programs
+(@pxref{Profiling}).
+By default, profiles are created in a file named @file{awkprof.out}.
+The optional @var{file} argument allows you to specify a different
+@value{FN} for the profile file.
+
+When run with @command{gawk}, the profile is just a ``pretty printed'' version
+of the program.  When run with @command{pgawk}, the profile contains execution
+counts for each statement in the program in the left margin, and function
+call counts for each function.
+
+@item -W re-interval
+@itemx --re-interval
+@cindex @code{--re-interval} option
+@cindex regular expressions, interval expressions and
+Allows interval expressions
+(@pxref{Regexp Operators})
+in regexps.
+Because interval expressions were traditionally not available in @command{awk},
+@command{gawk} does not provide them by default. This prevents old @command{awk}
+programs from breaking.
+
+@item -W source @var{program-text}
+@itemx --source @var{program-text}
+@cindex @code{--source} option
+@cindex source code, mixing
+Allows you to mix source code in files with source
+code that you enter on the command line.
+Program source code is taken from the @var{program-text}.
+This is particularly useful
+when you have library functions that you want to use from your command-line
+programs (@pxref{AWKPATH Variable}).
+
+@item -W version
+@itemx --version
+@cindex @code{--version} option
+@c last comma is part of tertiary
+@cindex @command{gawk}, versions of, information about, printing
+Prints version information for this particular copy of @command{gawk}.
+This allows you to determine if your copy of @command{gawk} is up to date
+with respect to whatever the Free Software Foundation is currently
+distributing.
+It is also useful for bug reports
+(@pxref{Bugs}).
+@end table
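+
+As a sketch of how @option{--dump-variables} can expose a misspelled
+variable name, consider the following one-line program, in which
+@code{totl} is a deliberate typographical error for @code{total}.
+(The program and the @value{FN} @file{vars.out} are made up for this
+illustration.)
+
+@example
+$ gawk --dump-variables=vars.out 'BEGIN @{ total = 3; totl = total + 1 @}'
+$ grep tot vars.out
+@end example
+
+@noindent
+Both @code{total} and @code{totl} should appear in the list, which makes
+the stray name easy to spot.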
+
+As long as program text has been supplied,
+any other options are flagged as invalid with a warning message but
+are otherwise ignored.
+
+@cindex @code{-F} option, @code{-Ft} sets @code{FS} to TAB
+In compatibility mode, as a special case, if the value of @var{fs} supplied
+to the @option{-F} option is @samp{t}, then @code{FS} is set to the TAB
+character (@code{"\t"}).  This is true only for @option{--traditional} and not
+for @option{--posix}
+(@pxref{Field Separators}).
+
+@cindex @code{-f} option, on command line
+The @option{-f} option may be used more than once on the command line.
+If it is, @command{awk} reads its program source from all of the named files, as
+if they had been concatenated together into one big file.  This is
+useful for creating libraries of @command{awk} functions.  These functions
+can be written once and then retrieved from a standard place, instead
+of having to be included into each individual program.
+(As mentioned in
+@ref{Definition Syntax},
+function names must be unique.)
+
+Library functions can still be used, even if the program is entered at the terminal,
+by specifying @samp{-f /dev/tty}.  After typing your program,
+type @kbd{@value{CTL}-d} (the end-of-file character) to terminate it.
+(You may also use @samp{-f -} to read program source from the standard
+input but then you will not be able to also use the standard input as a
+source of data.)
+
+Because it is clumsy using the standard @command{awk} mechanisms to mix source
+file and command-line @command{awk} programs, @command{gawk} provides the
+@option{--source} option.  This does not require you to pre-empt the standard
+input for your source code; it allows you to easily mix command-line
+and library source code
+(@pxref{AWKPATH Variable}).
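+
+For instance, assuming a library file named @file{mylib.awk} that defines
+a function called @code{twice} (both names are hypothetical), a
+command-line program could call that function like this:
+
+@example
+gawk -f mylib.awk --source 'BEGIN @{ print twice(21) @}'
+@end example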
+
+@cindex @code{--source} option
+If no @option{-f} or @option{--source} option is specified, then @command{gawk}
+uses the first non-option command-line argument as the text of the
+program source code.
+
+@cindex @code{POSIXLY_CORRECT} environment variable
+@cindex lint checking, @code{POSIXLY_CORRECT} environment variable
+@cindex POSIX mode
+If the environment variable @env{POSIXLY_CORRECT} exists,
+then @command{gawk} behaves in strict POSIX mode, exactly as if
+you had supplied the @option{--posix} command-line option.
+Many GNU programs look for this environment variable to turn on
+strict POSIX mode. If @option{--lint} is supplied on the command line
+and @command{gawk} turns on POSIX mode because of @env{POSIXLY_CORRECT},
+then it issues a warning message indicating that POSIX
+mode is in effect.
+You would typically set this variable in your shell's startup file.
+For a Bourne-compatible shell (such as @command{bash}), you would add these
+lines to the @file{.profile} file in your home directory:
+
+@example
+POSIXLY_CORRECT=true
+export POSIXLY_CORRECT
+@end example
+
+@cindex @command{csh} utility, @code{POSIXLY_CORRECT} environment variable
+For a @command{csh}-compatible
+shell,@footnote{Not recommended.}
+you would add this line to the @file{.login} file in your home directory:
+
+@example
+setenv POSIXLY_CORRECT true
+@end example
+
+@cindex portability, @code{POSIXLY_CORRECT} environment variable
+Having @env{POSIXLY_CORRECT} set is not recommended for daily use,
+but it is good for testing the portability of your programs to other
+environments.
+@c ENDOFRANGE ocl
+@c ENDOFRANGE clo
+
+@node Other Arguments
+@section Other Command-Line Arguments
+@cindex command line, arguments
+@cindex arguments, command-line
+
+Any additional arguments on the command line are normally treated as
+input files to be processed in the order specified.   However, an
+argument that has the form @code{@var{var}=@var{value}} assigns
+the value @var{value} to the variable @var{var}---it does not specify a
+file at all.
+(This was discussed earlier in
+@ref{Assignment Options}.)
+
+@cindex @code{ARGIND} variable, command-line arguments
+@cindex @code{ARGC}/@code{ARGV} variables, command-line arguments
+All these arguments are made available to your @command{awk} program in the
+@code{ARGV} array (@pxref{Built-in Variables}).  Command-line options
+and the program text (if present) are omitted from @code{ARGV}.
+All other arguments, including variable assignments, are
+included.   As each element of @code{ARGV} is processed, @command{gawk}
+sets the variable @code{ARGIND} to the index in @code{ARGV} of the
+current element.
+
+@cindex input files, variable assignments and
+The distinction between @value{FN} arguments and variable-assignment
+arguments is made when @command{awk} is about to open the next input file.
+At that point in execution, it checks the @value{FN} to see whether
+it is really a variable assignment; if so, @command{awk} sets the variable
+instead of reading a file.
+
+Therefore, the variables actually receive the given values after all
+previously specified files have been read.  In particular, the values of
+variables assigned in this fashion are @emph{not} available inside a
+@code{BEGIN} rule
+(@pxref{BEGIN/END}),
+because such rules are run before @command{awk} begins scanning the argument list.
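+
+The following sketch contrasts a command-line variable assignment with
+the @option{-v} option.  The variable name @code{n} is arbitrary, and
+because no @value{FN} arguments are given, @command{awk} reads its
+standard input:
+
+@example
+$ echo data | awk 'BEGIN @{ print "BEGIN: [" n "]" @}
+>                       @{ print "main: [" n "]" @}' n=5
+@print{} BEGIN: []
+@print{} main: [5]
+$ echo data | awk -v n=5 'BEGIN @{ print "BEGIN: [" n "]" @}
+>                             @{ print "main: [" n "]" @}'
+@print{} BEGIN: [5]
+@print{} main: [5]
+@end example
+
+@noindent
+In the first command, @code{n} is still unset when the @code{BEGIN} rule
+runs; in the second, @option{-v} assigns it beforehand.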
+
+@cindex dark corner, escape sequences
+The variable values given on the command line are processed for escape
+sequences (@pxref{Escape Sequences}).
+@value{DARKCORNER}
+
+In some earlier implementations of @command{awk}, when a variable assignment
+occurred before any @value{FN}s, the assignment would happen @emph{before}
+the @code{BEGIN} rule was executed.  @command{awk}'s behavior was thus
+inconsistent; some command-line assignments were available inside the
+@code{BEGIN} rule, while others were not.  Unfortunately,
+some applications came to depend
+upon this ``feature.''  When @command{awk} was changed to be more consistent,
+the @option{-v} option was added to accommodate applications that depended
+upon the old behavior.
+
+The variable assignment feature is most useful for assigning to variables
+such as @code{RS}, @code{OFS}, and @code{ORS}, which control input and
+output formats before scanning the @value{DF}s.  It is also useful for
+controlling state if multiple passes are needed over a @value{DF}.  For
+example:
+
+@cindex files, multiple passes over
+@example
+awk 'pass == 1  @{ @var{pass 1 stuff} @}
+     pass == 2  @{ @var{pass 2 stuff} @}' pass=1 mydata pass=2 mydata
+@end example
+
+Given the variable assignment feature, the @option{-F} option for setting
+the value of @code{FS} is not
+strictly necessary.  It remains for historical compatibility.
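+
+For example, the following two commands should print the same first
+fields (@file{/etc/passwd} is used here only as a convenient
+colon-separated file):
+
+@example
+awk -F: '@{ print $1 @}' /etc/passwd
+awk '@{ print $1 @}' FS=: /etc/passwd
+@end example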
+
+@node AWKPATH Variable
+@section The @env{AWKPATH} Environment Variable
+@cindex @env{AWKPATH} environment variable
+@cindex directories, searching
+@cindex search paths, for source files
+@cindex differences in @command{awk} and @command{gawk}, @code{AWKPATH} environment variable
+@ifinfo
+The previous @value{SECTION} described how @command{awk} program files can be named
+on the command line with the @option{-f} option.
+@end ifinfo
+In most @command{awk}
+implementations, you must supply a precise path name for each program
+file, unless the file is in the current directory.
+But in @command{gawk}, if the @value{FN} supplied to the @option{-f} option
+does not contain a @samp{/}, then @command{gawk} searches a list of
+directories (called the @dfn{search path}), one by one, looking for a
+file with the specified name.
+
+The search path is a string consisting of directory names
+separated by colons.  @command{gawk} gets its search path from the
+@env{AWKPATH} environment variable.  If that variable does not exist,
+@command{gawk} uses a default path,
+@samp{.:/usr/local/share/awk}.@footnote{Your version of @command{gawk}
+may use a different directory; it
+will depend upon how @command{gawk} was built and installed. The actual
+directory is the value of @samp{$(datadir)} generated when
+@command{gawk} was configured.  You probably don't need to worry about this,
+though.} (Programs written for use by
+system administrators should use an @env{AWKPATH} variable that
+does not include the current directory, @file{.}.)
+
+The search path feature is particularly useful for building libraries
+of useful @command{awk} functions.  The library files can be placed in a
+standard directory in the default path and then specified on
+the command line with a short @value{FN}.  Otherwise, the full @value{FN}
+would have to be typed for each file.
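+
+As a sketch, assume a personal library directory @file{$HOME/awklib}
+containing a file named @file{mylib.awk}, and a program
+@file{myprog.awk} in the current directory (all of these names are
+hypothetical).  In a Bourne-compatible shell:
+
+@example
+AWKPATH=.:$HOME/awklib
+export AWKPATH
+gawk -f mylib.awk -f myprog.awk datafile
+@end example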
+
+By using both the @option{--source} and @option{-f} options, your command-line
+@command{awk} programs can use facilities in @command{awk} library files
+(@pxref{Library Functions}).
+Path searching is not done if @command{gawk} is in compatibility mode.
+This is true for both @option{--traditional} and @option{--posix}.
+@xref{Options}.
+
+@strong{Note:} If you want files in the current directory to be found,
+you must include the current directory in the path, either by including
+@file{.} explicitly in the path or by writing a null entry in the
+path.  (A null entry is indicated by starting or ending the path with a
+colon or by placing two colons next to each other (@samp{::}).)  If the
+current directory is not included in the path, then files cannot be
+found in the current directory.  This path search mechanism is identical
+to the shell's.
+@c someday, @cite{The Bourne Again Shell}....
+
+Starting with @value{PVERSION} 3.0, if @env{AWKPATH} is not defined in the
+environment, @command{gawk} places its default search path into
+@code{ENVIRON["AWKPATH"]}. This makes it easy to determine
+the actual search path that @command{gawk} will use
+from within an @command{awk} program.
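+
+A one-line program along these lines prints the path; the output depends
+on how @command{gawk} was built and on your environment, so none is shown:
+
+@example
+gawk 'BEGIN @{ print ENVIRON["AWKPATH"] @}'
+@end example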
+
+While you can change @code{ENVIRON["AWKPATH"]} within your @command{awk}
+program, this has no effect on the running program's behavior.  This makes
+sense: the @env{AWKPATH} environment variable is used to find the program
+source files.  Once your program is running, all the files have been
+found, and @command{gawk} no longer needs to use @env{AWKPATH}.
+
+@node Obsolete
+@section Obsolete Options and/or Features
+
+@cindex features, advanced, See advanced features
+@cindex options, deprecated
+@cindex features, deprecated
+@cindex obsolete features
+This @value{SECTION} describes features and/or command-line options from
+previous releases of @command{gawk} that are either not available in the
+current version or that are still supported but deprecated (meaning that
+they will @emph{not} be in the next release).
+
+@c update this section for each release!
+
+@cindex @code{next file} statement, deprecated
+@cindex @code{nextfile} statement, @code{next file} statement and
+For @value{PVERSION} @value{VERSION} of @command{gawk}, there are no
+deprecated command-line options
+@c or other deprecated features
+from the previous version of @command{gawk}.
+The use of @samp{next file} (two words) for @code{nextfile} was deprecated
+in @command{gawk} 3.0 but still worked.  Starting with @value{PVERSION} 3.1, the
+two-word usage is no longer accepted.
+
+The process-related special files described in
+@ref{Special Process},
+work as described, but
+are now considered deprecated.
+@command{gawk} prints a warning message every time they are used.
+(Use @code{PROCINFO} instead; see
+@ref{Auto-set}.)
+They will be removed from the next release of @command{gawk}.
+
+@ignore
+This @value{SECTION}
+is thus essentially a place holder,
+in case some option becomes obsolete in a future version of @command{gawk}.
+@end ignore
+
+@node Undocumented
+@section Undocumented Options and Features
+@cindex undocumented features
+@cindex features, undocumented
+@cindex Skywalker, Luke
+@cindex Kenobi, Obi-Wan
+@cindex Jedi knights
+@cindex Knights, jedi
+@quotation
+@i{Use the Source, Luke!}@*
+Obi-Wan
+@end quotation
+
+This @value{SECTION} intentionally left
+blank.
+
+@ignore
+@c If these came out in the Info file or TeX document, then they wouldn't
+@c be undocumented, would they?
+
+@command{gawk} has one undocumented option:
+
+@table @code
+@item -W nostalgia
+@itemx --nostalgia
+Print the message @code{"awk: bailing out near line 1"} and dump core.
+This option was inspired by the common behavior of very early versions of
+Unix @command{awk} and by a t--shirt.
+The message is @emph{not} subject to translation in non-English locales.
+@c so there! nyah, nyah.
+@end table
+
+Early versions of @command{awk} used to not require any separator (either
+a newline or @samp{;}) between the rules in @command{awk} programs.  Thus,
+it was common to see one-line programs like:
+
+@example
+awk '@{ sum += $1 @} END @{ print sum @}'
+@end example
+
+@command{gawk} actually supports this but it is purposely undocumented
+because it is considered bad style.  The correct way to write such a program
+is either
+
+@example
+awk '@{ sum += $1 @} ; END @{ print sum @}'
+@end example
+
+@noindent
+or
+
+@example
+awk '@{ sum += $1 @}
+     END @{ print sum @}' data
+@end example
+
+@noindent
+@xref{Statements/Lines}, for a fuller
+explanation.
+
+You can insert newlines after the @samp{;} in @code{for} loops.
+This seems to have been a long-undocumented feature in Unix @command{awk}.
+
+Similarly, you may use @code{print} or @code{printf} statements in the
+@var{init} and @var{increment} parts of a @code{for} loop.  This is another
+long-undocumented ``feature'' of Unix @code{awk}.
+
+If the environment variable @env{WHINY_USERS} exists
+when @command{gawk} is run,
+then the associative @code{for} loop will go through the array
+indices in sorted order.
+The comparison used for sorting is simple string comparison;
+any non-English or non-ASCII locales are not taken into account.
+@code{IGNORECASE} does not affect the comparison either.
+
+In addition, if @env{WHINY_USERS} is set, the profiled version of a
+program generated by @option{--profile} will print all 8-bit characters
+verbatim, instead of using the octal equivalent.
+
+@end ignore
+
+@node Known Bugs
+@section Known Bugs in @command{gawk}
+@cindex @command{gawk}, debugging
+@cindex debugging @command{gawk}
+@cindex troubleshooting, @command{gawk}
+
+@itemize @bullet
+@cindex troubleshooting, @code{-F} option
+@cindex @code{-F} option, troubleshooting
+@cindex @code{FS} variable, changing value of
+@item
+The @option{-F} option for changing the value of @code{FS}
+(@pxref{Options})
+is not necessary given the command-line variable
+assignment feature; it remains only for backward compatibility.
+
+@item
+Syntactically invalid single-character programs tend to overflow
+the parse stack, generating a rather unhelpful message.  Such programs
+are surprisingly difficult to diagnose in the completely general case,
+and the effort to do so really is not worth it.
+@end itemize
+
+@ignore
+@c Try this
+@iftex
+@page
+@headings off
+@majorheading II@ @ @ Using @command{awk} and @command{gawk}
+Part II shows how to use @command{awk} and @command{gawk} for problem solving.
+There is lots of code here for you to read and learn from.
+It contains the following chapters:
+
+@itemize @bullet
+@item
+@ref{Library Functions}.
+
+@item
+@ref{Sample Programs}.
+
+@end itemize
+
+@page
+@evenheading @thispage@ @ @ @strong{@value{TITLE}} @| @|
+@oddheading  @| @| @strong{@thischapter}@ @ @ @thispage
+@end iftex
+@end ignore
+
+@node Library Functions
+@chapter A Library of @command{awk} Functions
+@c STARTOFRANGE libf
+@cindex libraries of @command{awk} functions
+@c STARTOFRANGE flib
+@cindex functions, library
+@c STARTOFRANGE fudlib
+@cindex functions, user-defined, library of
+
+@ref{User-defined}, describes how to write
+your own @command{awk} functions.  Writing functions is important, because
+it allows you to encapsulate algorithms and program tasks in a single
+place.  It simplifies programming, making program development more
+manageable, and making programs more readable.
+
+One valuable way to learn a new programming language is to @emph{read}
+programs in that language.  To that end, this @value{CHAPTER}
+and @ref{Sample Programs},
+provide a good-sized body of code for you to read,
+and hopefully, to learn from.
+
+@c 2e: USE TEXINFO-2 FUNCTION DEFINITION STUFF!!!!!!!!!!!!!
+This @value{CHAPTER} presents a library of useful @command{awk} functions.
+Many of the sample programs presented later in this @value{DOCUMENT}
+use these functions.
+The functions are presented here in a progression from simple to complex.
+
+@cindex Texinfo
+@ref{Extract Program},
+presents a program that you can use to extract the source code for
+these example library functions and programs from the Texinfo source
+for this @value{DOCUMENT}.
+(This has already been done as part of the @command{gawk} distribution.)
+
+If you have written one or more useful, general-purpose @command{awk} functions
+and would like to contribute them to the author's collection of @command{awk}
+programs, see
+@ref{How To Contribute}, for more information.
+
+@cindex portability, example programs
+The programs in this @value{CHAPTER} and in
+@ref{Sample Programs},
+freely use features that are @command{gawk}-specific.
+Rewriting these programs for different implementations of @command{awk} is pretty straightforward.
+
+Diagnostic error messages are sent to @file{/dev/stderr}.
+Use @samp{| "cat 1>&2"} instead of @samp{> "/dev/stderr"} if your system
+does not have a @file{/dev/stderr}, or if you cannot use @command{gawk}.
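+
+The two forms can be compared side by side in the following sketch.
+The message text is only illustrative, and a real program would use
+one form or the other, not both:
+
+@example
+BEGIN @{
+    # with gawk, or on a system that has /dev/stderr
+    print "mylib: cannot open data file" > "/dev/stderr"
+    # portable alternative that pipes through the shell
+    print "mylib: cannot open data file" | "cat 1>&2"
+@}
+@end example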
+
+A number of programs use @code{nextfile}
+(@pxref{Nextfile Statement})
+to skip any remaining input in the input file.
+@ref{Nextfile Function},
+shows you how to write a function that does the same thing.
+
+@c 12/2000: Thanks to Nelson Beebe for pointing out the output issue.
+@cindex case sensitivity, example programs
+@cindex @code{IGNORECASE} variable, in example programs
+Finally, some of the programs choose to ignore upper- and lowercase
+distinctions in their input. They do so by assigning one to @code{IGNORECASE}.
+You can achieve almost the same effect@footnote{The effects are
+not identical.  Output of the transformed
+record will be in all lowercase, while @code{IGNORECASE} preserves the original
+contents of the input record.} by adding the following rule to the
+beginning of the program:
+
+@example
+# ignore case
+@{ $0 = tolower($0) @}
+@end example
+
+@noindent
+Also, verify that all regexp and string constants used in
+comparisons use only lowercase letters.
+
+@menu
+* Library Names::               How to best name private global variables in
+                                library functions.
+* General Functions::           Functions that are of general use.
+* Data File Management::        Functions for managing command-line data
+                                files.
+* Getopt Function::             A function for processing command-line
+                                arguments.
+* Passwd Functions::            Functions for getting user information.
+* Group Functions::             Functions for getting group information.
+@end menu
+
+@node Library Names
+@section Naming Library Function Global Variables
+
+@cindex names, arrays/variables
+@cindex names, functions
+@cindex namespace issues
+@cindex @command{awk} programs, documenting
+@cindex documentation, of @command{awk} programs
+Due to the way the @command{awk} language evolved, variables are either
+@dfn{global} (usable by the entire program) or @dfn{local} (usable just by
+a specific function).  There is no intermediate state analogous to
+@code{static} variables in C.
+
+@cindex variables, global, for library functions
+@cindex private variables
+@cindex variables, private
+Library functions often need to have global variables that they can use to
+preserve state information between calls to the function---for example,
+@code{getopt}'s variable @code{_opti}
+(@pxref{Getopt Function}).
+Such variables are called @dfn{private}, since the only functions that need to
+use them are the ones in the library.
+
+When writing a library function, you should try to choose names for your
+private variables that will not conflict with any variables used by
+either another library function or a user's main program.  For example, a
+name like @samp{i} or @samp{j} is not a good choice, because user programs
+often use variable names like these for their own purposes.
+
+@cindex programming conventions, private variable names
+The example programs shown in this @value{CHAPTER} all start the names of their
+private variables with an underscore (@samp{_}).  Users generally don't use
+leading underscores in their variable names, so this convention immediately
+decreases the chances that the variable name will be accidentally shared
+with the user's program.
+
+@cindex @code{_} (underscore), in names of private variables
+@cindex underscore (@code{_}), in names of private variables
+In addition, several of the library functions use a prefix that helps
+indicate what function or set of functions use the variables---for example,
+@code{_pw_byname} in the user database routines
+(@pxref{Passwd Functions}).
+This convention is recommended, since it even further decreases the
+chance of inadvertent conflict among variable names.  Note that this
+convention works equally well for variable names and for private
+function names.@footnote{While all the library routines could have
+been rewritten to use this convention, this was not done, in order to
+show how my own @command{awk} programming style has evolved and to
+provide some basis for this discussion.}
+
+As a final note on variable naming, if a function makes global variables
+available for use by a main program, it is a good convention to start that
+variable's name with a capital letter---for
+example, @code{getopt}'s @code{Opterr} and @code{Optind} variables
+(@pxref{Getopt Function}).
+The leading capital letter indicates that it is global, while the fact that
+the variable name is not all capital letters indicates that the variable is
+not one of @command{awk}'s built-in variables, such as @code{FS}.
+
+@cindex @code{--dump-variables} option
+It is also important that @emph{all} variables in library
+functions that do not need to save state are, in fact, declared
+local.@footnote{@command{gawk}'s @option{--dump-variables} command-line
+option is useful for verifying this.} If this is not done, the variable
+could accidentally be used in the user's program, leading to bugs that
+are very difficult to track down:
+
+@example
+function lib_func(x, y,    l1, l2)
+@{
+    @dots{}
+    @var{use variable} some_var   # some_var should be local
+    @dots{}                   # but is not by oversight
+@}
+@end example
+
+@cindex arrays, associative, library functions and
+@cindex libraries of @command{awk} functions, associative arrays and
+@cindex functions, library, associative arrays and
+@cindex Tcl
+A different convention, common in the Tcl community, is to use a single
+associative array to hold the values needed by the library function(s), or
+``package.''  This significantly decreases the number of actual global names
+in use.  For example, the functions described in
+@ref{Passwd Functions},
+might have used array elements @code{@w{PW_data["inited"]}}, @code{@w{PW_data["total"]}},
+@code{@w{PW_data["count"]}}, and @code{@w{PW_data["awklib"]}}, instead of
+@code{@w{_pw_inited}}, @code{@w{_pw_total}}, @code{@w{_pw_count}},
+and @code{@w{_pw_awklib}}.
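+
+A minimal sketch of the single-array style follows.  The function name
+@code{pw_setup} is hypothetical; this is not how the routines in
+@ref{Passwd Functions}, are actually written:
+
+@example
+# hypothetical: keep all private state in one global array
+function pw_setup()
+@{
+    if (PW_data["inited"])
+        return
+    PW_data["inited"] = 1
+    PW_data["total"] = 0
+    PW_data["count"] = 0
+@}
+@end example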
+
+The conventions presented in this @value{SECTION} are exactly
+that: conventions. You are not required to write your programs this
+way---we merely recommend that you do so.
+
+@node General Functions
+@section General Programming
+
+This @value{SECTION} presents a number of functions that are of general
+programming use.
+
+@menu
+* Nextfile Function::           Two implementations of a @code{nextfile}
+                                function.
+* Assert Function::             A function for assertions in @command{awk}
+                                programs.
+* Round Function::              A function for rounding if @code{sprintf} does
+                                not do it correctly.
+* Cliff Random Function::       The Cliff Random Number Generator.
+* Ordinal Functions::           Functions for using characters as numbers and
+                                vice versa.
+* Join Function::               A function to join an array into a string.
+* Gettimeofday Function::       A function to get formatted times.
+@end menu
+
+@node Nextfile Function
+@subsection Implementing @code{nextfile} as a Function
+
+@cindex input files, skipping
+@c STARTOFRANGE libfnex
+@cindex libraries of @command{awk} functions, @code{nextfile} statement
+@c STARTOFRANGE flibnex
+@cindex functions, library, @code{nextfile} statement
+@c STARTOFRANGE nexim
+@cindex @code{nextfile} statement, implementing
+@cindex @command{gawk}, @code{nextfile} statement in
+The @code{nextfile} statement, presented in
+@ref{Nextfile Statement},
+is a @command{gawk}-specific extension---it is not available in most other
+implementations of @command{awk}.  This @value{SECTION} shows two versions of a
+@code{nextfile} function that you can use to simulate @command{gawk}'s
+@code{nextfile} statement if you cannot use @command{gawk}.
+
+A first attempt at writing a @code{nextfile} function is as follows:
+
+@example
+# nextfile --- skip remaining records in current file
+# this should be read in before the "main" awk program
+
+function nextfile()    @{ _abandon_ = FILENAME; next @}
+_abandon_ == FILENAME  @{ next @}
+@end example
+
+@cindex programming conventions, @code{nextfile} statement
+Because it supplies a rule that must be executed first, this file should
+be included before the main program. This rule compares the current
+@value{DF}'s name (which is always in the @code{FILENAME} variable) to
+a private variable named @code{_abandon_}.  If the @value{FN} matches,
+then the action part of the rule executes a @code{next} statement to
+go on to the next record.  (The use of @samp{_} in the variable name is
+a convention.  It is discussed more fully in
+@ref{Library Names}.)
+
+The use of the @code{next} statement effectively creates a loop that reads
+all the records from the current @value{DF}.
+The end of the file is eventually reached and
+a new @value{DF} is opened, changing the value of @code{FILENAME}.
+Once this happens, the comparison of @code{_abandon_} to @code{FILENAME}
+fails, and execution continues with the first rule of the ``real'' program.
+
+The @code{nextfile} function itself simply sets the value of @code{_abandon_}
+and then executes a @code{next} statement to start the
+loop.
+@ignore
+@c If the function can't be used on other versions of awk, this whole
+@c section is pointless, no?  Sigh.
+@footnote{@command{gawk} is the only known @command{awk} implementation
+that allows you to
+execute @code{next} from within a function body. Some other workaround
+is necessary if you are not using @command{gawk}.}
+@end ignore
+
+@cindex @code{nextfile} user-defined function
+This initial version has a subtle problem.
+If the same @value{DF} is listed @emph{twice} on the command line,
+one right after the other
+or even with just a variable assignment between them,
+this code skips right through the file a second time, even though
+it should stop when it gets to the end of the first occurrence.
+A second version of @code{nextfile} that remedies this problem
+is shown here:
+
+@example
+@c file eg/lib/nextfile.awk
+# nextfile --- skip remaining records in current file
+# correctly handle successive occurrences of the same file
+@c endfile
+@ignore
+@c file eg/lib/nextfile.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May, 1993
+
+@c endfile
+@end ignore
+@c file eg/lib/nextfile.awk
+# this should be read in before the "main" awk program
+
+function nextfile()   @{ _abandon_ = FILENAME; next @}
+
+_abandon_ == FILENAME @{
+      if (FNR == 1)
+          _abandon_ = ""
+      else
+          next
+@}
+@c endfile
+@end example
+
+The @code{nextfile} function has not changed.  It makes @code{_abandon_}
+equal to the current @value{FN} and then executes a @code{next} statement.
+The @code{next} statement reads the next record and increments @code{FNR}
+so that @code{FNR} is guaranteed to have a value of at least two.
+However, if @code{nextfile} is called for the last record in the file,
+then @command{awk} closes the current @value{DF} and moves on to the next
+one.  Upon doing so, @code{FILENAME} is set to the name of the new file
+and @code{FNR} is reset to one.  If this next file is the same as
+the previous one, @code{_abandon_} is still equal to @code{FILENAME}.
+However, @code{FNR} is equal to one, telling us that this is a new
+occurrence of the file and not the one we were reading when the
+@code{nextfile} function was executed.  In that case, @code{_abandon_}
+is reset to the empty string, so that further executions of this rule
+fail (until the next time that @code{nextfile} is called).
+
+If @code{FNR} is not one, then we are still in the original @value{DF}
+and the program executes a @code{next} statement to skip through it.
+
+An important question to ask at this point is: given that the
+functionality of @code{nextfile} can be provided with a library file,
+why is it built into @command{gawk}?  Adding
+features for little reason leads to larger, slower programs that are
+harder to maintain.
+The answer is that building @code{nextfile} into @command{gawk} provides
+significant gains in efficiency.  If the @code{nextfile} function is executed
+at the beginning of a large @value{DF}, @command{awk} still has to scan the entire
+file, splitting it up into records,
+@c at least conceptually
+just to skip over it.  The built-in
+@code{nextfile} can simply close the file immediately and proceed to the
+next one, which saves a lot of time.  This is particularly important in
+@command{awk}, because @command{awk} programs are generally I/O-bound (i.e.,
+they spend most of their time doing input and output, instead of performing
+computations).
+@c ENDOFRANGE libfnex
+@c ENDOFRANGE flibnex
+@c ENDOFRANGE nexim
+
+@node Assert Function
+@subsection Assertions
+
+@c STARTOFRANGE asse
+@cindex assertions
+@c STARTOFRANGE assef
+@cindex @code{assert} function (C library)
+@c STARTOFRANGE libfass
+@cindex libraries of @command{awk} functions, assertions
+@c STARTOFRANGE flibass
+@cindex functions, library, assertions
+@cindex @command{awk} programs, lengthy, assertions
+When writing large programs, it is often useful to know
+that a condition or set of conditions is true.  Before proceeding with a
+particular computation, you make a statement about what you believe to be
+the case.  Such a statement is known as an
+@dfn{assertion}.  The C language provides an @code{<assert.h>} header file
+and corresponding @code{assert} macro that the programmer can use to make
+assertions.  If an assertion fails, the @code{assert} macro arranges to
+print a diagnostic message describing the condition that should have
+been true but was not, and then it kills the program.  In C, using
+@code{assert} looks like this:
+
+@example
+#include <assert.h>
+
+int myfunc(int a, double b)
+@{
+     assert(a <= 5 && b >= 17.1);
+     @dots{}
+@}
+@end example
+
+If the assertion fails, the program prints a message similar to this:
+
+@example
+prog.c:5: assertion failed: a <= 5 && b >= 17.1
+@end example
+
+@cindex @code{assert} user-defined function
+The C language makes it possible to turn the condition into a string for use
+in printing the diagnostic message.  This is not possible in @command{awk}, so
+this @code{assert} function also requires a string version of the condition
+that is being tested.
+Following is the function:
+
+@example
+@c file eg/lib/assert.awk
+# assert --- assert that a condition is true. Otherwise exit.
+@c endfile
+@ignore
+@c file eg/lib/assert.awk
+
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May, 1993
+
+@c endfile
+@end ignore
+@c file eg/lib/assert.awk
+function assert(condition, string)
+@{
+    if (! condition) @{
+        printf("%s:%d: assertion failed: %s\n",
+            FILENAME, FNR, string) > "/dev/stderr"
+        _assert_exit = 1
+        exit 1
+    @}
+@}
+
+@group
+END @{
+    if (_assert_exit)
+        exit 1
+@}
+@end group
+@c endfile
+@end example
+
+The @code{assert} function tests the @code{condition} parameter. If it
+is false, it prints a message to standard error, using the @code{string}
+parameter to describe the failed condition.  It then sets the variable
+@code{_assert_exit} to one and executes the @code{exit} statement.
+The @code{exit} statement jumps to the @code{END} rule. If the @code{END}
+rule finds @code{_assert_exit} to be true, it exits immediately.
+
+The purpose of the test in the @code{END} rule is to
+keep any other @code{END} rules from running.  When an assertion fails, the
+program should exit immediately.
+If no assertions fail, then @code{_assert_exit} is still
+false when the @code{END} rule is run normally, and the rest of the
+program's @code{END} rules execute.
+For all of this to work correctly, @file{assert.awk} must be the
+first source file read by @command{awk}.
+The function can be used in a program in the following way:
+
+@example
+function myfunc(a, b)
+@{
+     assert(a <= 5 && b >= 17.1, "a <= 5 && b >= 17.1")
+     @dots{}
+@}
+@end example
+
+@noindent
+If the assertion fails, you see a message similar to the following:
+
+@example
+mydata:1357: assertion failed: a <= 5 && b >= 17.1
+@end example
+
+@cindex @code{END} pattern, @code{assert} user-defined function and
+There is a small problem with this version of @code{assert}.
+An @code{END} rule is automatically added
+to the program calling @code{assert}.  Normally, if a program consists
+of just a @code{BEGIN} rule, the input files and/or standard input are
+not read. However, now that the program has an @code{END} rule, @command{awk}
+attempts to read the input @value{DF}s or standard input
+(@pxref{Using BEGIN/END}),
+most likely causing the program to hang as it waits for input.
+
+@cindex @code{BEGIN} pattern, @code{assert} user-defined function and
+There is a simple workaround to this:
+make sure the @code{BEGIN} rule always ends
+with an @code{exit} statement.
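+
+For instance, a program made up of only a @code{BEGIN} rule might look
+like the following sketch.  The condition being checked is just an
+illustration, and @file{assert.awk} is assumed to have been read first:
+
+@example
+BEGIN @{
+    assert(ARGC > 1, "ARGC > 1")
+    print "found", ARGC - 1, "argument(s)"
+    exit    # do not wait for input on account of assert's END rule
+@}
+@end example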
+@c ENDOFRANGE asse
+@c ENDOFRANGE assef
+@c ENDOFRANGE flibass
+@c ENDOFRANGE libfass
+
+@node Round Function
+@subsection Rounding Numbers
+
+@cindex rounding
+@cindex rounding numbers
+@cindex numbers, rounding
+@cindex libraries of @command{awk} functions, rounding numbers
+@cindex functions, library, rounding numbers
+@cindex @code{print} statement, @code{sprintf} function and
+@cindex @code{printf} statement, @code{sprintf} function and
+@cindex @code{sprintf} function, @code{print}/@code{printf} statements and
+The way @code{printf} and @code{sprintf}
+(@pxref{Printf})
+perform rounding often depends upon the system's C @code{sprintf}
+subroutine.  On many machines, @code{sprintf} rounding is ``unbiased,''
+which means it doesn't always round a trailing @samp{.5} up, contrary
+to naive expectations.  In unbiased rounding, @samp{.5} rounds to even,
+rather than always up, so 1.5 rounds to 2 but 4.5 rounds to 4.  This means
+that if you are using a format that does rounding (e.g., @code{"%.0f"}),
+you should check what your system does.  The following function does
+traditional rounding; it might be useful if your @command{awk}'s @code{printf}
+does unbiased rounding:
+
+@cindex @code{round} user-defined function
+@example
+@c file eg/lib/round.awk
+# round.awk --- do normal rounding
+@c endfile
+@ignore
+@c file eg/lib/round.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# August, 1996
+
+@c endfile
+@end ignore
+@c file eg/lib/round.awk
+function round(x,   ival, aval, fraction)
+@{
+   ival = int(x)    # integer part, int() truncates
+
+   # see if fractional part
+   if (ival == x)   # no fraction
+      return x
+
+   if (x < 0) @{
+      aval = -x     # absolute value
+      ival = int(aval)
+      fraction = aval - ival
+      if (fraction >= .5)
+         return int(x) - 1   # -2.5 --> -3
+      else
+         return int(x)       # -2.3 --> -2
+   @} else @{
+      fraction = x - ival
+      if (fraction >= .5)
+         return ival + 1
+      else
+         return ival
+   @}
+@}
+
+# test harness
+@{ print $0, round($0) @}
+@c endfile
+@end example
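+
+To see which kind of rounding your system does, you might run a
+one-line test such as this sketch:
+
+@example
+BEGIN @{ printf("%.0f %.0f %.0f\n", 0.5, 1.5, 2.5) @}
+@end example
+
+@noindent
+With unbiased rounding the output is @samp{0 2 2}; with traditional
+rounding it is @samp{1 2 3}.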
+
+@node Cliff Random Function
+@subsection The Cliff Random Number Generator
+@cindex random numbers, Cliff
+@cindex Cliff random numbers
+@cindex numbers, Cliff random
+@cindex functions, library, Cliff random numbers
+
+The Cliff random number
+generator@footnote{@uref{http://mathworld.wolfram.com/CliffRandomNumberGenerator.html}}
+is a very simple random number generator that ``passes the noise sphere test
+for randomness by showing no structure.''
+It is easily programmed, in less than 10 lines of @command{awk} code:
+
+@cindex @code{cliff_rand} user-defined function
+@example
+@c file eg/lib/cliff_rand.awk
+# cliff_rand.awk --- generate Cliff random numbers
+@c endfile
+@ignore
+@c file eg/lib/cliff_rand.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# December 2000
+
+@c endfile
+@end ignore
+@c file eg/lib/cliff_rand.awk
+BEGIN @{ _cliff_seed = 0.1 @}
+
+function cliff_rand()
+@{
+    _cliff_seed = (100 * log(_cliff_seed)) % 1
+    if (_cliff_seed < 0)
+        _cliff_seed = - _cliff_seed
+    return _cliff_seed
+@}
+@c endfile
+@end example
+
+This algorithm requires an initial ``seed'' of 0.1.  Each new value
+uses the current seed as input for the calculation.
+If the built-in @code{rand} function
+(@pxref{Numeric Functions})
+isn't random enough, you might try using this function instead.
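+
+As a usage sketch, assuming the library file is read before the
+program so that its @code{BEGIN} rule sets the seed first, the
+following prints five values:
+
+@example
+BEGIN @{
+    for (i = 1; i <= 5; i++)
+        print cliff_rand()
+@}
+@end example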
+
+@node Ordinal Functions
+@subsection Translating Between Characters and Numbers
+
+@cindex libraries of @command{awk} functions, character values as numbers
+@cindex functions, library, character values as numbers
+@cindex characters, values of as numbers
+@cindex numbers, as values of characters
+One commercial implementation of @command{awk} supplies a built-in function,
+@code{ord}, which takes a character and returns the numeric value for that
+character in the machine's character set.  If the string passed to
+@code{ord} has more than one character, only the first one is used.
+
+The inverse of this function is @code{chr} (from the function of the same
+name in Pascal), which takes a number and returns the corresponding character.
+Both functions are written very nicely in @command{awk}; there is no real
+reason to build them into the @command{awk} interpreter:
+
+@cindex @code{ord} user-defined function
+@cindex @code{chr} user-defined function
+@example
+@c file eg/lib/ord.awk
+# ord.awk --- do ord and chr
+
+# Global identifiers:
+#    _ord_:        numerical values indexed by characters
+#    _ord_init:    function to initialize _ord_
+@c endfile
+@ignore
+@c file eg/lib/ord.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# 16 January, 1992
+# 20 July, 1992, revised
+
+@c endfile
+@end ignore
+@c file eg/lib/ord.awk
+BEGIN    @{ _ord_init() @}
+
+function _ord_init(    low, high, i, t)
+@{
+    low = sprintf("%c", 7) # BEL is ascii 7
+    if (low == "\a") @{    # regular ascii
+        low = 0
+        high = 127
+    @} else if (sprintf("%c", 128 + 7) == "\a") @{
+        # ascii, mark parity
+        low = 128
+        high = 255
+    @} else @{        # ebcdic(!)
+        low = 0
+        high = 255
+    @}
+
+    for (i = low; i <= high; i++) @{
+        t = sprintf("%c", i)
+        _ord_[t] = i
+    @}
+@}
+@c endfile
+@end example
+
+@cindex character sets
+@cindex character encodings
+@cindex ASCII
+@cindex EBCDIC
+@cindex mark parity
+Some explanation of the numbers used by @code{chr} is worthwhile.
+The most prominent character set in use today is ASCII. Although an
+8-bit byte can hold 256 distinct values (from 0 to 255), ASCII only
+defines characters that use the values from 0 to 127.@footnote{ASCII
+has been extended in many countries to use the values from 128 to 255
+for country-specific characters.  If your system uses these extensions,
+you can simplify @code{_ord_init} to loop from 0 to 255.}
+In the now distant past,
+at least one minicomputer manufacturer
+@c Pr1me, blech
+used ASCII, but with mark parity, meaning that the leftmost bit in the byte
+is always 1.  This means that on those systems, characters
+have numeric values from 128 to 255.
+Finally, large mainframe systems use the EBCDIC character set, which
+uses all 256 values.
+While there are other character sets in use on some older systems,
+they are not really worth worrying about:
+
+@example
+@c file eg/lib/ord.awk
+function ord(str,    c)
+@{
+    # only first character is of interest
+    c = substr(str, 1, 1)
+    return _ord_[c]
+@}
+
+function chr(c)
+@{
+    # force c to be numeric by adding 0
+    return sprintf("%c", c + 0)
+@}
+@c endfile
+
+#### test code ####
+# BEGIN    \
+# @{
+#    for (;;) @{
+#        printf("enter a character: ")
+#        if (getline var <= 0)
+#            break
+#        printf("ord(%s) = %d\n", var, ord(var))
+#    @}
+# @}
+@c endfile
+@end example
+
+An obvious improvement to these functions is to move the code for the
+@code{@w{_ord_init}} function into the body of the @code{BEGIN} rule.  It was
+written this way initially for ease of development.
+There is a ``test program'' in a @code{BEGIN} rule, to test the
+function.  It is commented out for production use.
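+
+A short usage sketch follows.  It assumes an ASCII system and that
+@file{ord.awk} is read first, so that its @code{BEGIN} rule has already
+filled in @code{_ord_}:
+
+@example
+BEGIN @{
+    print ord("A")      # 65 in ASCII
+    print chr(97)       # "a" in ASCII
+    print chr(ord("z")) # round trip gives back "z"
+@}
+@end example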
+
+@node Join Function
+@subsection Merging an Array into a String
+
+@cindex libraries of @command{awk} functions, merging arrays into strings
+@cindex functions, library, merging arrays into strings
+@cindex strings, merging arrays into
+@cindex arrays, merging into strings
+When doing string processing, it is often useful to be able to join
+all the strings in an array into one long string.  The following function,
+@code{join}, accomplishes this task.  It is used later in several of
+the application programs
+(@pxref{Sample Programs}).
+
+Good function design is important; this function needs to be general but it
+should also have a reasonable default behavior.  It is called with an array
+as well as the beginning and ending indices of the elements in the array to be
+merged.  This assumes that the array indices are numeric---a reasonable
+assumption since the array was likely created with @code{split}
+(@pxref{String Functions}):
+
+@cindex @code{join} user-defined function
+@example
+@c file eg/lib/join.awk
+# join.awk --- join an array into a string
+@c endfile
+@ignore
+@c file eg/lib/join.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+
+@c endfile
+@end ignore
+@c file eg/lib/join.awk
+function join(array, start, end, sep,    result, i)
+@{
+    if (sep == "")
+       sep = " "
+    else if (sep == SUBSEP) # magic value
+       sep = ""
+    result = array[start]
+    for (i = start + 1; i <= end; i++)
+        result = result sep array[i]
+    return result
+@}
+@c endfile
+@end example
+
+An optional additional argument is the separator to use when joining the
+strings back together.  If the caller supplies a nonempty value,
+@code{join} uses it; if the caller omits it, the parameter has a null
+value, and @code{join} uses a single blank as the default
+separator for the strings.  If the value is equal to @code{SUBSEP},
+then @code{join} joins the strings with no separator between them.
+@code{SUBSEP} serves as a ``magic'' value to indicate that there should
+be no separation between the component strings.@footnote{It would
+be nice if @command{awk} had an assignment operator for concatenation.
+The lack of an explicit operator for concatenation makes string operations
+more difficult than they really need to be.}
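+
+As a quick check of the three separator cases, the following shell sketch
+embeds a copy of @code{join} in a one-off @command{awk} program.  The
+wrapper name @code{join_demo} and the sample words are illustrative
+assumptions, not part of the library:
+
```shell
# join_demo: hypothetical wrapper that exercises join() three ways
join_demo() {
    awk '
    function join(array, start, end, sep,    result, i)
    {
        if (sep == "")
            sep = " "
        else if (sep == SUBSEP) # magic value
            sep = ""
        result = array[start]
        for (i = start + 1; i <= end; i++)
            result = result sep array[i]
        return result
    }

    BEGIN {
        n = split("alpha beta gamma", words, " ")
        print join(words, 1, n)           # default: alpha beta gamma
        print join(words, 1, n, ":")      # explicit: alpha:beta:gamma
        print join(words, 2, n, SUBSEP)   # no separator: betagamma
    }'
}

join_demo
```
+
+Note that the last call starts at index 2, which shows that @code{start}
+and @code{end} select a subrange rather than the whole array.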
+
+@node Gettimeofday Function
+@subsection Managing the Time of Day
+
+@cindex libraries of @command{awk} functions, managing, time
+@cindex functions, library, managing time
+@cindex timestamps, formatted
+@cindex time, managing
+The @code{systime} and @code{strftime} functions described in
+@ref{Time Functions},
+provide the minimum functionality necessary for dealing with the time of day
+in human-readable form.  While @code{strftime} is extensive, the control
+formats are not necessarily easy to remember or intuitively obvious when
+reading a program.
+
+The following function, @code{gettimeofday}, populates a user-supplied array
+with preformatted time information.  It returns a string with the current
+time formatted in the same way as the @command{date} utility:
+
+@cindex @code{gettimeofday} user-defined function
+@example
+@c file eg/lib/gettime.awk
+# gettimeofday.awk --- get the time of day in a usable format
+@c endfile
+@ignore
+@c file eg/lib/gettime.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain, May 1993
+#
+@c endfile
+@end ignore
+@c file eg/lib/gettime.awk
+
+# Returns a string in the format of output of date(1)
+# Populates the array argument time with individual values:
+#    time["second"]       -- seconds (0 - 59)
+#    time["minute"]       -- minutes (0 - 59)
+#    time["hour"]         -- hours (0 - 23)
+#    time["althour"]      -- hours (1 - 12)
+#    time["monthday"]     -- day of month (1 - 31)
+#    time["month"]        -- month of year (1 - 12)
+#    time["monthname"]    -- name of the month
+#    time["shortmonth"]   -- short name of the month
+#    time["year"]         -- year modulo 100 (0 - 99)
+#    time["fullyear"]     -- full year
+#    time["weekday"]      -- day of week (Sunday = 0)
+#    time["altweekday"]   -- day of week (Monday = 1)
+#    time["dayname"]      -- name of weekday
+#    time["shortdayname"] -- short name of weekday
+#    time["yearday"]      -- day of year (1 - 366)
+#    time["timezone"]     -- abbreviation of timezone name
+#    time["ampm"]         -- AM or PM designation
+#    time["weeknum"]      -- week number, Sunday first day
+#    time["altweeknum"]   -- week number, Monday first day
+
+function gettimeofday(time,    ret, now, i)
+@{
+    # get time once, avoids unnecessary system calls
+    now = systime()
+
+    # return date(1)-style output
+    ret = strftime("%a %b %d %H:%M:%S %Z %Y", now)
+
+    # clear out target array
+    delete time
+
+    # fill in values, force numeric values to be
+    # numeric by adding 0
+    time["second"]       = strftime("%S", now) + 0
+    time["minute"]       = strftime("%M", now) + 0
+    time["hour"]         = strftime("%H", now) + 0
+    time["althour"]      = strftime("%I", now) + 0
+    time["monthday"]     = strftime("%d", now) + 0
+    time["month"]        = strftime("%m", now) + 0
+    time["monthname"]    = strftime("%B", now)
+    time["shortmonth"]   = strftime("%b", now)
+    time["year"]         = strftime("%y", now) + 0
+    time["fullyear"]     = strftime("%Y", now) + 0
+    time["weekday"]      = strftime("%w", now) + 0
+    time["altweekday"]   = strftime("%u", now) + 0
+    time["dayname"]      = strftime("%A", now)
+    time["shortdayname"] = strftime("%a", now)
+    time["yearday"]      = strftime("%j", now) + 0
+    time["timezone"]     = strftime("%Z", now)
+    time["ampm"]         = strftime("%p", now)
+    time["weeknum"]      = strftime("%U", now) + 0
+    time["altweeknum"]   = strftime("%W", now) + 0
+
+    return ret
+@}
+@c endfile
+@end example
+
+The string indices are easier to use and read than the various formats
+required by @code{strftime}.  The @code{alarm} program presented in
+@ref{Alarm Program},
+uses this function.
+A more general design for the @code{gettimeofday} function would have
+allowed the user to supply an optional timestamp value to use instead
+of the current time.
+
+@node Data File Management
+@section @value{DDF} Management
+
+@c STARTOFRANGE dataf
+@cindex files, managing
+@c STARTOFRANGE libfdataf
+@cindex libraries of @command{awk} functions, managing, @value{DF}s
+@c STARTOFRANGE flibdataf
+@cindex functions, library, managing @value{DF}s
+This @value{SECTION} presents functions that are useful for managing
+command-line @value{DF}s.
+
+@menu
+* Filetrans Function::          A function for handling data file transitions.
+* Rewind Function::             A function for rereading the current file.
+* File Checking::               Checking that data files are readable.
+* Empty Files::                 Checking for zero-length files.
+* Ignoring Assigns::            Treating assignments as file names.
+@end menu
+
+@node Filetrans Function
+@subsection Noting @value{DDF} Boundaries
+
+@cindex files, managing, @value{DF} boundaries
+@cindex files, initialization and cleanup
+The @code{BEGIN} and @code{END} rules are each executed exactly once at
+the beginning and end of your @command{awk} program, respectively
+(@pxref{BEGIN/END}).
+We (the @command{gawk} authors) once had a user who mistakenly thought that the
+@code{BEGIN} rule is executed at the beginning of each @value{DF} and the
+@code{END} rule is executed at the end of each @value{DF}.  When informed
+that this was not the case, the user requested that we add new special
+patterns to @command{gawk}, named @code{BEGIN_FILE} and @code{END_FILE}, that
+would have the desired behavior.  He even supplied us the code to do so.
+
+Adding these special patterns to @command{gawk} wasn't necessary;
+the job can be done cleanly in @command{awk} itself, as illustrated
+by the following library program.
+It arranges to call two user-supplied functions, @code{beginfile} and
+@code{endfile}, at the beginning and end of each @value{DF}.
+Besides solving the problem in only nine(!) lines of code, it does so
+@emph{portably}; this works with any implementation of @command{awk}:
+
+@example
+# transfile.awk
+#
+# Give the user a hook for filename transitions
+#
+# The user must supply functions beginfile() and endfile()
+# that each take the name of the file being started or
+# finished, respectively.
+@c #
+@c # Arnold Robbins, arnold@@gnu.org, Public Domain
+@c # January 1992
+
+FILENAME != _oldfilename \
+@{
+    if (_oldfilename != "")
+        endfile(_oldfilename)
+    _oldfilename = FILENAME
+    beginfile(FILENAME)
+@}
+
+END   @{ endfile(FILENAME) @}
+@end example
+
+This file must be loaded before the user's ``main'' program, so that the
+rule it supplies is executed first.
+
+This rule relies on @command{awk}'s @code{FILENAME} variable that
+automatically changes for each new @value{DF}.  The current @value{FN} is
+saved in a private variable, @code{_oldfilename}.  If @code{FILENAME} does
+not equal @code{_oldfilename}, then a new @value{DF} is being processed and
+it is necessary to call @code{endfile} for the old file.  Because
+@code{endfile} should only be called if a file has been processed, the
+program first checks to make sure that @code{_oldfilename} is not the null
+string.  The program then assigns the current @value{FN} to
+@code{_oldfilename} and calls @code{beginfile} for the file.
+Because, like all @command{awk} variables, @code{_oldfilename} is
+initialized to the null string, this rule executes correctly even for the
+first @value{DF}.
+
+The program also supplies an @code{END} rule to do the final processing for
+the last file.  Because this @code{END} rule comes before any @code{END} rules
+supplied in the ``main'' program, @code{endfile} is called first.  Once
+again the value of multiple @code{BEGIN} and @code{END} rules should be clear.
+
+@cindex @code{beginfile} user-defined function
+@cindex @code{endfile} user-defined function
+This version has the same problem as the first version of @code{nextfile}
+(@pxref{Nextfile Function}).
+If the same @value{DF} occurs twice in a row on the command line, then
+@code{endfile} and @code{beginfile} are not executed at the end of the
+first pass and at the beginning of the second pass.
+The following version solves the problem:
+
+@example
+@c file eg/lib/ftrans.awk
+# ftrans.awk --- handle data file transitions
+#
+# user supplies beginfile() and endfile() functions
+@c endfile
+@ignore
+@c file eg/lib/ftrans.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# November 1992
+
+@c endfile
+@end ignore
+@c file eg/lib/ftrans.awk
+FNR == 1 @{
+    if (_filename_ != "")
+        endfile(_filename_)
+    _filename_ = FILENAME
+    beginfile(FILENAME)
+@}
+
+END  @{ endfile(_filename_) @}
+@c endfile
+@end example
+
+@ref{Wc Program},
+shows how this library function can be used and
+how it simplifies writing the main program.
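+
+To see the hooks fire, including for a file named twice in a row, here is
+a small shell sketch built around the @samp{FNR == 1} rule.  The wrapper
+@code{ftrans_demo}, the temporary file names, and the bodies of
+@code{beginfile} and @code{endfile} are assumptions made up for this
+illustration:
+
```shell
# ftrans_demo: hypothetical driver for the FNR == 1 transition rule
ftrans_demo() {
    dir=$(mktemp -d) || return 1
    printf 'a\nb\n' > "$dir/one.txt"
    printf 'c\n'    > "$dir/two.txt"
    (
        cd "$dir" &&
        awk '
        # transition rule and END rule as in ftrans.awk
        FNR == 1 {
            if (_filename_ != "")
                endfile(_filename_)
            _filename_ = FILENAME
            beginfile(FILENAME)
        }
        END { endfile(_filename_) }

        # illustrative user-supplied hooks: count records per file
        function beginfile(name) { print "begin " name; count = 0 }
        function endfile(name)   { print "end " name " records=" count }

        { count++ }
        ' one.txt one.txt two.txt
    )
    rm -rf "$dir"
}

ftrans_demo
```
+
+Each of the three command-line arguments should produce its own
+@samp{begin}/@samp{end} pair, even though @file{one.txt} appears twice
+in succession.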
+
+@node Rewind Function
+@subsection Rereading the Current File
+
+@cindex files, reading
+Another request for a new built-in function was for a @code{rewind}
+function that would make it possible to reread the current file.
+The requesting user didn't want to have to use @code{getline}
+(@pxref{Getline})
+inside a loop.
+
+However, as long as you are not in the @code{END} rule, it is
+quite easy to arrange to immediately close the current input file
+and then start over with it from the top.
+For lack of a better name, we'll call it @code{rewind}:
+
+@cindex @code{rewind} user-defined function
+@example
+@c file eg/lib/rewind.awk
+# rewind.awk --- rewind the current file and start over
+@c endfile
+@ignore
+@c file eg/lib/rewind.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# September 2000
+
+@c endfile
+@end ignore
+@c file eg/lib/rewind.awk
+function rewind(    i)
+@{
+    # shift remaining arguments up
+    for (i = ARGC; i > ARGIND; i--)
+        ARGV[i] = ARGV[i-1]
+
+    # make sure gawk knows to keep going
+    ARGC++
+
+    # make current file next to get done
+    ARGV[ARGIND+1] = FILENAME
+
+    # do it
+    nextfile
+@}
+@c endfile
+@end example
+
+This code relies on the @code{ARGIND} variable
+(@pxref{Auto-set}),
+which is specific to @command{gawk}.
+If you are not using
+@command{gawk}, you can use ideas presented in
+@ifnotinfo
+the previous @value{SECTION}
+@end ifnotinfo
+@ifinfo
+@ref{Filetrans Function},
+@end ifinfo
+to either update @code{ARGIND} on your own
+or modify this code as appropriate.
+
+The @code{rewind} function also relies on the @code{nextfile} keyword
+(@pxref{Nextfile Statement}).
+@xref{Nextfile Function},
+for a function version of @code{nextfile}.
+
+@node File Checking
+@subsection Checking for Readable @value{DDF}s
+
+@cindex troubleshooting, readable @value{DF}s
+@c comma is part of primary
+@cindex readable @value{DF}s, checking
+@cindex files, skipping
+Normally, if you give @command{awk} a @value{DF} that isn't readable,
+it stops with a fatal error.  There are times when you
+might want to just ignore such files and keep going.  You can
+do this by prepending the following program to your @command{awk}
+program:
+
+@cindex @code{readable.awk} program
+@example
+@c file eg/lib/readable.awk
+# readable.awk --- library file to skip over unreadable files
+@c endfile
+@ignore
+@c file eg/lib/readable.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# October 2000
+
+@c endfile
+@end ignore
+@c file eg/lib/readable.awk
+BEGIN @{
+    for (i = 1; i < ARGC; i++) @{
+        if (ARGV[i] ~ /^[A-Za-z_][A-Za-z0-9_]*=.*/ \
+            || ARGV[i] == "-")
+            continue    # assignment or standard input
+        else if ((getline junk < ARGV[i]) < 0) # unreadable
+            delete ARGV[i]
+        else
+            close(ARGV[i])
+    @}
+@}
+@c endfile
+@end example
+
+@cindex troubleshooting, @code{getline} function
+In @command{gawk}, the @code{getline} won't be fatal (unless
+@option{--posix} is in force).
+Removing the element from @code{ARGV} with @code{delete}
+skips the file (since it's no longer in the list).
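+
+The effect can be tried with a deliberately missing file.  In this shell
+sketch, @code{readable_demo}, @file{missing.txt}, and @file{good.txt} are
+hypothetical names chosen for the example; it assumes an @command{awk}
+in which a failed @code{getline} redirection returns @minus{}1 instead of
+aborting:
+
```shell
# readable_demo: hypothetical driver; missing.txt is never created
readable_demo() {
    dir=$(mktemp -d) || return 1
    printf 'hello\n' > "$dir/good.txt"
    (
        cd "$dir" &&
        awk '
        # same BEGIN rule as readable.awk
        BEGIN {
            for (i = 1; i < ARGC; i++) {
                if (ARGV[i] ~ /^[A-Za-z_][A-Za-z0-9_]*=.*/ \
                    || ARGV[i] == "-")
                    continue    # assignment or standard input
                else if ((getline junk < ARGV[i]) < 0) # unreadable
                    delete ARGV[i]
                else
                    close(ARGV[i])
            }
        }

        # main program: only readable files should reach this rule
        { print FILENAME ": " $0 }
        ' missing.txt good.txt
    )
    rm -rf "$dir"
}

readable_demo
```
+
+Without the @code{BEGIN} rule, the same command line would normally stop
+with an error as soon as @command{awk} tried to open @file{missing.txt}.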
+
+@c This doesn't handle /dev/stdin etc.  Not worth the hassle to mention or fix.
+
+@node Empty Files
+@subsection Checking for Zero-Length Files
+
+All known @command{awk} implementations silently skip over zero-length files.
+This is a by-product of @command{awk}'s implicit
+read-a-record-and-match-against-the-rules loop: when @command{awk}
+tries to read a record from an empty file, it immediately receives an
+end of file indication, closes the file, and proceeds on to the next
+command-line @value{DF}, @emph{without} executing any user-level
+@command{awk} program code.
+
+Using @command{gawk}'s @code{ARGIND} variable
+(@pxref{Built-in Variables}), it is possible to detect when an empty
+@value{DF} has been skipped.  Similar to the library file presented
+in @ref{Filetrans Function}, the following library file calls a function named
+@code{zerofile} that the user must provide.  The arguments passed are
+the @value{FN} and the position in @code{ARGV} where it was found:
+
+@cindex @code{zerofile.awk} program
+@example
+@c file eg/lib/zerofile.awk
+# zerofile.awk --- library file to process empty input files
+@c endfile
+@ignore
+@c file eg/lib/zerofile.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# June 2003
+
+@c endfile
+@end ignore
+@c file eg/lib/zerofile.awk
+BEGIN @{ Argind = 0 @}
+
+ARGIND > Argind + 1 @{
+    for (Argind++; Argind < ARGIND; Argind++)
+        zerofile(ARGV[Argind], Argind)
+@}
+
+ARGIND != Argind @{ Argind = ARGIND @}
+
+END @{
+    if (ARGIND > Argind)
+        for (Argind++; Argind <= ARGIND; Argind++)
+            zerofile(ARGV[Argind], Argind)
+@}
+@c endfile
+@end example
+
+The user-level variable @code{Argind} allows the @command{awk} program
+to track its progress through @code{ARGV}.  Whenever the program detects
+that @code{ARGIND} is greater than @samp{Argind + 1}, it means that one or
+more empty files were skipped.  The action then calls @code{zerofile} for
+each such file, incrementing @code{Argind} along the way.
+
+The @samp{ARGIND != Argind} rule simply keeps @code{Argind} up to date
+in the normal case.
+
+Finally, the @code{END} rule catches the case of any empty files at
+the end of the command-line arguments.  Note that the test in the
+condition of the @code{for} loop uses the @samp{<=} operator,
+not @samp{<}.
+
+As an exercise, you might consider whether this same problem can
+be solved without relying on @command{gawk}'s @code{ARGIND} variable.
+
+As a second exercise, revise this code to handle the case where
+an intervening value in @code{ARGV} is a variable assignment.
+
+@ignore
+# zerofile2.awk --- same thing, portably
+BEGIN @{
+    ARGIND = Argind = 0
+    for (i = 1; i < ARGC; i++)
+        Fnames[ARGV[i]]++
+
+@}
+FNR == 1 @{
+    while (ARGV[ARGIND] != FILENAME)
+        ARGIND++
+    Seen[FILENAME]++
+    if (Seen[FILENAME] == Fnames[FILENAME])
+        do
+            ARGIND++
+        while (ARGV[ARGIND] != FILENAME)
+@}
+ARGIND > Argind + 1 @{
+    for (Argind++; Argind < ARGIND; Argind++)
+        zerofile(ARGV[Argind], Argind)
+@}
+ARGIND != Argind @{
+    Argind = ARGIND
+@}
+END @{
+    if (ARGIND < ARGC - 1)
+        ARGIND = ARGC - 1 
+    if (ARGIND > Argind)
+        for (Argind++; Argind <= ARGIND; Argind++)
+            zerofile(ARGV[Argind], Argind)
+@}
+@end ignore
+
+@node Ignoring Assigns
+@subsection Treating Assignments as @value{FFN}s
+
+@cindex assignments as filenames
+@cindex filenames, assignments as
+Occasionally, you might not want @command{awk} to process command-line
+variable assignments
+(@pxref{Assignment Options}).
+In particular, if you have @value{FN}s that contain an @samp{=} character,
+@command{awk} treats the @value{FN} as an assignment, and does not process it.
+
+Some users have suggested an additional command-line option for @command{gawk}
+to disable command-line assignments.  However, some simple programming with
+a library file does the trick:
+
+@cindex @code{noassign.awk} program
+@example
+@c file eg/lib/noassign.awk
+# noassign.awk --- library file to avoid the need for a
+# special option that disables command-line assignments
+@c endfile
+@ignore
+@c file eg/lib/noassign.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# October 1999
+
+@c endfile
+@end ignore
+@c file eg/lib/noassign.awk
+function disable_assigns(argc, argv,    i)
+@{
+    for (i = 1; i < argc; i++)
+        if (argv[i] ~ /^[A-Za-z_][A-Za-z_0-9]*=.*/)
+            argv[i] = ("./" argv[i])
+@}
+
+BEGIN @{
+    if (No_command_assign)
+        disable_assigns(ARGC, ARGV)
+@}
+@c endfile
+@end example
+
+You then run your program this way:
+
+@example
+awk -v No_command_assign=1 -f noassign.awk -f yourprog.awk *
+@end example
+
+The function works by looping through the arguments.
+It prepends @samp{./} to
+any argument that matches the form
+of a variable assignment, turning that argument into a @value{FN}.
+
+The use of @code{No_command_assign} allows you to disable command-line
+assignments at invocation time, by giving the variable a true value.
+When not set, it is initially zero (i.e., false), so the command-line arguments
+are left alone.
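+
+The following shell sketch shows the effect on a file whose name looks
+like an assignment.  The wrapper @code{noassign_demo} and the file
+@file{key=value} are assumptions invented for the illustration:
+
```shell
# noassign_demo: hypothetical driver; the data file is named "key=value"
noassign_demo() {
    dir=$(mktemp -d) || return 1
    printf 'payload\n' > "$dir/key=value"
    (
        cd "$dir" &&
        awk -v No_command_assign=1 '
        # same function and BEGIN rule as noassign.awk
        function disable_assigns(argc, argv,    i)
        {
            for (i = 1; i < argc; i++)
                if (argv[i] ~ /^[A-Za-z_][A-Za-z_0-9]*=.*/)
                    argv[i] = ("./" argv[i])
        }

        BEGIN {
            if (No_command_assign)
                disable_assigns(ARGC, ARGV)
        }

        # main program: show which file each record came from
        { print FILENAME ": " $0 }
        ' key=value
    )
    rm -rf "$dir"
}

noassign_demo
```
+
+The record is reported as coming from @file{./key=value}, since the
+rewritten argument no longer has the form of an assignment.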
+@c ENDOFRANGE dataf
+@c ENDOFRANGE flibdataf
+@c ENDOFRANGE libfdataf
+
+@node Getopt Function
+@section Processing Command-Line Options
+
+@c STARTOFRANGE libfclo
+@cindex libraries of @command{awk} functions, command-line options
+@c STARTOFRANGE flibclo
+@cindex functions, library, command-line options
+@c STARTOFRANGE clop
+@cindex command-line options, processing
+@c STARTOFRANGE oclp
+@cindex options, command-line, processing
+@c STARTOFRANGE clibf
+@cindex functions, library, C library
+@cindex arguments, processing
+Most utilities on POSIX-compatible systems take options, or ``switches,'' on
+the command line that can be used to change the way a program behaves.
+@command{awk} is an example of such a program
+(@pxref{Options}).
+Often, options take @dfn{arguments}; i.e., data that the program needs to
+correctly obey the command-line option.  For example, @command{awk}'s
+@option{-F} option requires a string to use as the field separator.
+The first occurrence on the command line of either @option{--} or a
+string that does not begin with @samp{-} ends the options.
+
+@cindex @code{getopt} function (C library)
+Modern Unix systems provide a C function named @code{getopt} for processing
+command-line arguments.  The programmer provides a string describing the
+one-letter options. If an option requires an argument, it is followed in the
+string with a colon.  @code{getopt} is also passed the
+count and values of the command-line arguments and is called in a loop.
+@code{getopt} processes the command-line arguments for option letters.
+Each time around the loop, it returns a single character representing the
+next option letter that it finds, or @samp{?} if it finds an invalid option.
+When it returns @minus{}1, there are no options left on the command line.
+
+When using @code{getopt}, options that do not take arguments can be
+grouped together.  Furthermore, options that take arguments require that the
+argument is present.  The argument can immediately follow the option letter,
+or it can be a separate command-line argument.
+
+Given a hypothetical program that takes
+three command-line options, @option{-a}, @option{-b}, and @option{-c}, where
+@option{-b} requires an argument, all of the following are valid ways of
+invoking the program:
+
+@example
+prog -a -b foo -c data1 data2 data3
+prog -ac -bfoo -- data1 data2 data3
+prog -acbfoo data1 data2 data3
+@end example
+
+Notice that when the argument is grouped with its option, the rest of
+the argument is considered to be the option's argument.
+In this example, @option{-acbfoo} indicates that all of the
+@option{-a}, @option{-b}, and @option{-c} options were supplied,
+and that @samp{foo} is the argument to the @option{-b} option.
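+
+These conventions can be tried without writing any C.  The sketch below
+uses the POSIX shell's @code{getopts} builtin, which follows the same
+option syntax, as a stand-in for the C function; it is not the C
+@code{getopt} itself.  The function name @code{parse_abc} and its output
+format are assumptions for this example:
+
```shell
# parse_abc: hypothetical parser for -a, -b ARG, and -c using getopts
parse_abc() {
    OPTIND=1            # restart option parsing on every call
    seen="" barg=""
    while getopts "ab:c" opt; do
        case $opt in
            a|c) seen="$seen$opt" ;;
            b)   seen="${seen}b"; barg=$OPTARG ;;
            *)   return 2 ;;          # invalid option
        esac
    done
    shift $((OPTIND - 1))             # drop the options, keep operands
    echo "options=$seen b=$barg operands=$*"
}

parse_abc -a -b foo -c data1 data2 data3
parse_abc -ac -bfoo -- data1 data2 data3
parse_abc -acbfoo data1 data2 data3
```
+
+All three calls should report the same option argument and the same
+three operands; only the order in which the option letters are seen
+differs.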
+
+@code{getopt} provides four external variables that the programmer can use:
+
+@table @code
+@item optind
+The index in the argument value array (@code{argv}) where the first
+nonoption command-line argument can be found.
+
+@item optarg
+The string value of the argument to an option.
+
+@item opterr
+Usually @code{getopt} prints an error message when it finds an invalid
+option.  Setting @code{opterr} to zero disables this feature.  (An
+application might want to print its own error message.)
+
+@item optopt
+The letter representing the command-line option.
+@c While not usually documented, most versions supply this variable.
+@end table
+
+The following C fragment shows how @code{getopt} might process command-line
+arguments for @command{awk}:
+
+@example
+int
+main(int argc, char *argv[])
+@{
+    @dots{}
+    /* print our own message */
+    opterr = 0;
+    while ((c = getopt(argc, argv, "v:f:F:W:")) != -1) @{
+        switch (c) @{
+        case 'f':    /* file */
+            @dots{}
+            break;
+        case 'F':    /* field separator */
+            @dots{}
+            break;
+        case 'v':    /* variable assignment */
+            @dots{}
+            break;
+        case 'W':    /* extension */
+            @dots{}
+            break;
+        case '?':
+        default:
+            usage();
+            break;
+        @}
+    @}
+    @dots{}
+@}
+@end example
+
+As a side point, @command{gawk} actually uses the GNU @code{getopt_long}
+function to process both normal and GNU-style long options
+(@pxref{Options}).
+
+The abstraction provided by @code{getopt} is very useful and is quite
+handy in @command{awk} programs as well.  Following is an @command{awk}
+version of @code{getopt}.  This function highlights one of the
+greatest weaknesses in @command{awk}, which is that it is very poor at
+manipulating single characters.  Repeated calls to @code{substr} are
+necessary for accessing individual characters
+(@pxref{String Functions}).@footnote{This
+function was written before @command{gawk} acquired the ability to
+split strings into single characters using @code{""} as the separator.
+We have left it alone, since using @code{substr} is more portable.}
+
+The discussion that follows walks through the code a bit at a time:
+
+@cindex @code{getopt} user-defined function
+@example
+@c file eg/lib/getopt.awk
+# getopt.awk --- do C library getopt(3) function in awk
+@c endfile
+@ignore
+@c file eg/lib/getopt.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+#
+# Initial version: March, 1991
+# Revised: May, 1993
+
+@c endfile
+@end ignore
+@c file eg/lib/getopt.awk
+# External variables:
+#    Optind -- index in ARGV of first nonoption argument
+#    Optarg -- string value of argument to current option
+#    Opterr -- if nonzero, print our own diagnostic
+#    Optopt -- current option letter
+
+# Returns:
+#    -1     at end of options
+#    ?      for unrecognized option
+#    <c>    a character representing the current option
+
+# Private Data:
+#    _opti  -- index in multi-flag option, e.g., -abc
+@c endfile
+@end example
+
+The function starts out with
+a list of the global variables it uses,
+what the return values are, what they mean, and any global variables that
+are ``private'' to this library function.  Such documentation is essential
+for any program, and particularly for library functions.
+
+The @code{getopt} function first checks that it was indeed called with a string of options
+(the @code{options} parameter).  If @code{options} has a zero length,
+@code{getopt} immediately returns @minus{}1:
+
+@cindex @code{getopt} user-defined function
+@example
+@c file eg/lib/getopt.awk
+function getopt(argc, argv, options,    thisopt, i)
+@{
+    if (length(options) == 0)    # no options given
+        return -1
+
+@group
+    if (argv[Optind] == "--") @{  # all done
+        Optind++
+        _opti = 0
+        return -1
+@end group
+    @} else if (argv[Optind] !~ /^-[^: \t\n\f\r\v\b]/) @{
+        _opti = 0
+        return -1
+    @}
+@c endfile
+@end example
+
+The next thing to check for is the end of the options.  A @option{--}
+ends the command-line options, as does any command-line argument that
+does not begin with a @samp{-}.  @code{Optind} is used to step through
+the array of command-line arguments; it retains its value across calls
+to @code{getopt}, because it is a global variable.
+
+The regular expression that is used, @code{@w{/^-[^: \t\n\f\r\v\b]/}}, is
+perhaps a bit of overkill; it checks for a @samp{-} followed by anything
+that is not whitespace and not a colon.
+If the current command-line argument does not match this pattern,
+it is not an option, and it ends option processing.  Continuing on:
+
+@example
+@c file eg/lib/getopt.awk
+    if (_opti == 0)
+        _opti = 2
+    thisopt = substr(argv[Optind], _opti, 1)
+    Optopt = thisopt
+    i = index(options, thisopt)
+    if (i == 0) @{
+        if (Opterr)
+            printf("%c -- invalid option\n",
+                                  thisopt) > "/dev/stderr"
+        if (_opti >= length(argv[Optind])) @{
+            Optind++
+            _opti = 0
+        @} else
+            _opti++
+        return "?"
+    @}
+@c endfile
+@end example
+
+The @code{_opti} variable tracks the position in the current command-line
+argument (@code{argv[Optind]}).  If multiple options are
+grouped together with one @samp{-} (e.g., @option{-abx}), it is necessary
+to return them to the user one at a time.
+
+If @code{_opti} is equal to zero, it is set to two, which is the index in
+the string of the next character to look at (we skip the @samp{-}, which
+is at position one).  The variable @code{thisopt} holds the character,
+obtained with @code{substr}.  It is saved in @code{Optopt} for the main
+program to use.
+
+If @code{thisopt} is not in the @code{options} string, then it is an
+invalid option.  If @code{Opterr} is nonzero, @code{getopt} prints an error
+message on the standard error that is similar to the message from the C
+version of @code{getopt}.
+
+Because the option is invalid, it is necessary to skip it and move on to the
+next option character.  If @code{_opti} is greater than or equal to the
+length of the current command-line argument, it is necessary to move on
+to the next argument, so @code{Optind} is incremented and @code{_opti} is reset
+to zero. Otherwise, @code{Optind} is left alone and @code{_opti} is merely
+incremented.
+
+In any case, because the option is invalid, @code{getopt} returns @samp{?}.
+The main program can examine @code{Optopt} if it needs to know what the
+invalid option letter actually is. Continuing on:
+
+@example
+@c file eg/lib/getopt.awk
+    if (substr(options, i + 1, 1) == ":") @{
+        # get option argument
+        if (length(substr(argv[Optind], _opti + 1)) > 0)
+            Optarg = substr(argv[Optind], _opti + 1)
+        else
+            Optarg = argv[++Optind]
+        _opti = 0
+    @} else
+        Optarg = ""
+@c endfile
+@end example
+
+If the option requires an argument, the option letter is followed by a colon
+in the @code{options} string.  If there are remaining characters in the
+current command-line argument (@code{argv[Optind]}), then the rest of that
+string is assigned to @code{Optarg}.  Otherwise, the next command-line
+argument is used (@samp{-xFOO} versus @samp{@w{-x FOO}}). In either case,
+@code{_opti} is reset to zero, because there are no more characters left to
+examine in the current command-line argument. Continuing:
+
+@example
+@c file eg/lib/getopt.awk
+    if (_opti == 0 || _opti >= length(argv[Optind])) @{
+        Optind++
+        _opti = 0
+    @} else
+        _opti++
+    return thisopt
+@}
+@c endfile
+@end example
+
+Finally, if @code{_opti} is either zero or greater than or equal to the
+length of the
+current command-line argument, it means this element in @code{argv} is
+through being processed, so @code{Optind} is incremented to point to the
+next element in @code{argv}.  If neither condition is true, then only
+@code{_opti} is incremented, so that the next option letter can be processed
+on the next call to @code{getopt}.
+
+The @code{BEGIN} rule initializes both @code{Opterr} and @code{Optind} to one.
+@code{Opterr} is set to one, since the default behavior is for @code{getopt}
+to print a diagnostic message upon seeing an invalid option.  @code{Optind}
+is set to one, since there's no reason to look at the program name, which is
+in @code{ARGV[0]}:
+
+@example
+@c file eg/lib/getopt.awk
+BEGIN @{
+    Opterr = 1    # default is to diagnose
+    Optind = 1    # skip ARGV[0]
+
+    # test program
+    if (_getopt_test) @{
+        while ((_go_c = getopt(ARGC, ARGV, "ab:cd")) != -1)
+            printf("c = <%c>, optarg = <%s>\n",
+                                       _go_c, Optarg)
+        printf("non-option arguments:\n")
+        for (; Optind < ARGC; Optind++)
+            printf("\tARGV[%d] = <%s>\n",
+                                    Optind, ARGV[Optind])
+    @}
+@}
+@c endfile
+@end example
+
+The rest of the @code{BEGIN} rule is a simple test program.  Here is the
+result of two sample runs of the test program:
+
+@example
+$ awk -f getopt.awk -v _getopt_test=1 -- -a -cbARG bax -x
+@print{} c = <a>, optarg = <>
+@print{} c = <c>, optarg = <>
+@print{} c = <b>, optarg = <ARG>
+@print{} non-option arguments:
+@print{}         ARGV[3] = <bax>
+@print{}         ARGV[4] = <-x>
+
+$ awk -f getopt.awk -v _getopt_test=1 -- -a -x -- xyz abc
+@print{} c = <a>, optarg = <>
+@error{} x -- invalid option
+@print{} c = <?>, optarg = <>
+@print{} non-option arguments:
+@print{}         ARGV[4] = <xyz>
+@print{}         ARGV[5] = <abc>
+@end example
+
+In both runs,
+the first @option{--} terminates the arguments to @command{awk}, so that it does
+not try to interpret the @option{-a}, etc., as its own options.
+Several of the sample programs presented in
+@ref{Sample Programs},
+use @code{getopt} to process their arguments.
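+
+As a rough idea of what such a main program looks like, the following
+shell sketch assembles the pieces of @code{getopt} shown above into a
+temporary file and adds a small option-dispatch loop.  The wrapper
+@code{getopt_demo}, the options @option{-v} and @option{-o}, and the
+variables @code{verbose} and @code{outfile} are hypothetical and exist
+only for this illustration:
+
```shell
# getopt_demo: hypothetical program taking -v and -o FILE
getopt_demo() {
    prog=$(mktemp) || return 1
    cat > "$prog" <<'EOF'
# getopt() assembled from the fragments discussed in the text
function getopt(argc, argv, options,    thisopt, i)
{
    if (length(options) == 0)    # no options given
        return -1
    if (argv[Optind] == "--") {  # all done
        Optind++
        _opti = 0
        return -1
    } else if (argv[Optind] !~ /^-[^: \t\n\f\r\v\b]/) {
        _opti = 0
        return -1
    }
    if (_opti == 0)
        _opti = 2
    thisopt = substr(argv[Optind], _opti, 1)
    Optopt = thisopt
    i = index(options, thisopt)
    if (i == 0) {
        if (Opterr)
            printf("%c -- invalid option\n", thisopt) > "/dev/stderr"
        if (_opti >= length(argv[Optind])) {
            Optind++
            _opti = 0
        } else
            _opti++
        return "?"
    }
    if (substr(options, i + 1, 1) == ":") {
        # get option argument
        if (length(substr(argv[Optind], _opti + 1)) > 0)
            Optarg = substr(argv[Optind], _opti + 1)
        else
            Optarg = argv[++Optind]
        _opti = 0
    } else
        Optarg = ""
    if (_opti == 0 || _opti >= length(argv[Optind])) {
        Optind++
        _opti = 0
    } else
        _opti++
    return thisopt
}

# illustrative main program: dispatch on each option letter
BEGIN {
    Opterr = 1
    Optind = 1
    while ((c = getopt(ARGC, ARGV, "vo:")) != -1) {
        if (c == "v")
            verbose = 1
        else if (c == "o")
            outfile = Optarg
        else {
            print "usage: demo [-v] [-o file] [arg ...]" > "/dev/stderr"
            exit 1
        }
    }
    printf("verbose=%d outfile=%s\n", verbose, outfile)
    for (; Optind < ARGC; Optind++)
        printf("operand: %s\n", ARGV[Optind])
    exit 0
}
EOF
    # the "--" keeps awk from treating -v and -o as its own options
    awk -f "$prog" -- "$@"
    rm -f "$prog"
}

getopt_demo -v -o out.txt input.dat
```
+
+After the loop ends, @code{Optind} indexes the first operand, so the
+remaining elements of @code{ARGV} can be handled as ordinary arguments.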
+@c ENDOFRANGE libfclo
+@c ENDOFRANGE flibclo
+@c ENDOFRANGE clop
+@c ENDOFRANGE oclp
+
+@node Passwd Functions
+@section Reading the User Database
+
+@c STARTOFRANGE libfudata
+@cindex libraries of @command{awk} functions, user database, reading
+@c STARTOFRANGE flibudata
+@cindex functions, library, user database, reading
+@c last comma is part of primary
+@c STARTOFRANGE udatar
+@cindex user database, reading
+@c last comma is part of secondary
+@c STARTOFRANGE dataur
+@cindex database, users, reading
+@cindex @code{PROCINFO} array
+The @code{PROCINFO} array
+(@pxref{Built-in Variables})
+provides access to the current user's real and effective user and group ID
+numbers, and if available, the user's supplementary group set.
+However, because these are numbers, they do not provide very useful
+information to the average user.  There needs to be some way to find the
+user information associated with the user and group ID numbers.  This
+@value{SECTION} presents a suite of functions for retrieving information from the
+user database.  @xref{Group Functions},
+for a similar suite that retrieves information from the group database.
+
+@cindex @code{getpwent} function (C library)
+@cindex @code{getpwent} user-defined function
+@cindex users, information about, retrieving
+@cindex login information
+@cindex account information
+@cindex password file
+@cindex files, password
+The POSIX standard does not define the file where user information is
+kept.  Instead, it provides the @code{<pwd.h>} header file
+and several C language subroutines for obtaining user information.
+The primary function is @code{getpwent}, for ``get password entry.''
+The ``password'' comes from the original user database file,
+@file{/etc/passwd}, which stores user information, along with the
+encrypted passwords (hence the name).
+
+@cindex @command{pwcat} program
+While an @command{awk} program could simply read @file{/etc/passwd}
+directly, this file may not contain complete information about the
+system's set of users.@footnote{It is often the case that password
+information is stored in a network database.} To be sure you are able to
+produce a readable and complete version of the user database, it is necessary
+to write a small C program that calls @code{getpwent}.  @code{getpwent}
+is defined as returning a pointer to a @code{struct passwd}.  Each time it
+is called, it returns the next entry in the database.  When there are
+no more entries, it returns @code{NULL}, the null pointer.  When this
+happens, the C program should call @code{endpwent} to close the database.
+Following is @command{pwcat}, a C program that ``cats'' the password database:
+
+@c Use old style function header for portability to old systems (SunOS, HP/UX).
+
+@example
+@c file eg/lib/pwcat.c
+/*
+ * pwcat.c
+ *
+ * Generate a printable version of the password database
+ */
+@c endfile
+@ignore
+@c file eg/lib/pwcat.c
+/*
+ * Arnold Robbins, arnold@@gnu.org, May 1993
+ * Public Domain
+ */
+
+#if HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+@c endfile
+@end ignore
+@c file eg/lib/pwcat.c
+#include <stdio.h>
+#include <pwd.h>
+
+@c endfile
+@ignore
+@c file eg/lib/pwcat.c
+#if defined (STDC_HEADERS)
+#include <stdlib.h>
+#endif
+
+@c endfile
+@end ignore
+@c file eg/lib/pwcat.c
+int
+main(argc, argv)
+int argc;
+char **argv;
+@{
+    struct passwd *p;
+
+    while ((p = getpwent()) != NULL)
+        printf("%s:%s:%ld:%ld:%s:%s:%s\n",
+            p->pw_name, p->pw_passwd, (long) p->pw_uid,
+            (long) p->pw_gid, p->pw_gecos, p->pw_dir, p->pw_shell);
+
+    endpwent();
+    return 0;
+@}
+@c endfile
+@end example
+
+If you don't understand C, don't worry about it.
+The output from @command{pwcat} is the user database, in the traditional
+@file{/etc/passwd} format of colon-separated fields.  The fields are:
+
+@ignore
+@table @asis
+@item Login name
+The user's login name.
+
+@item Encrypted password
+The user's encrypted password.  This may not be available on some systems.
+
+@item User-ID
+The user's numeric user ID number.
+(On some systems it's a C @code{long}, and not an @code{int}.  Thus
+we cast it to @code{long} for all cases.)
+
+@item Group-ID
+The user's numeric group ID number.
+(Similar comments about @code{long} vs.@: @code{int} apply here.)
+
+@item Full name
+The user's full name, and perhaps other information associated with the
+user.
+
+@item Home directory
+The user's login (or ``home'') directory (familiar to shell programmers as
+@code{$HOME}).
+
+@item Login shell
+The program that is run when the user logs in.  This is usually a
+shell, such as @command{bash}.
+@end table
+@end ignore
+
+@multitable {Encrypted password} {1234567890123456789012345678901234567890123456}
+@item Login name @tab The user's login name.
+
+@item Encrypted password @tab The user's encrypted password.  This may not be available on some systems.
+
+@item User-ID @tab The user's numeric user ID number.
+
+@item Group-ID @tab The user's numeric group ID number.
+
+@item Full name @tab The user's full name, and perhaps other information associated with the
+user.
+
+@item Home directory @tab The user's login (or ``home'') directory (familiar to shell programmers as
+@code{$HOME}).
+
+@item Login shell @tab The program that is run when the user logs in.  This is usually a
+shell, such as @command{bash}.
+@end multitable
+
+A few lines representative of @command{pwcat}'s output are as follows:
+
+@cindex Jacobs, Andrew
+@cindex Robbins, Arnold
+@cindex Robbins, Miriam
+@example
+$ pwcat
+@print{} root:3Ov02d5VaUPB6:0:1:Operator:/:/bin/sh
+@print{} nobody:*:65534:65534::/:
+@print{} daemon:*:1:1::/:
+@print{} sys:*:2:2::/:/bin/csh
+@print{} bin:*:3:3::/bin:
+@print{} arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh
+@print{} miriam:yxaay:112:10:Miriam Robbins:/home/miriam:/bin/sh
+@print{} andy:abcca2:113:10:Andy Jacobs:/home/andy:/bin/sh
+@dots{}
+@end example
+
+With that introduction, following is a group of functions for getting user
+information.  There are several functions here, corresponding to the C
+functions of the same names:
+
+@c Exercise: simplify all these functions that return values.
+@c Answer: return foo[key] returns "" if key not there, no need to check with `in'.
+
+@cindex @code{_pw_init} user-defined function
+@example
+@c file eg/lib/passwdawk.in
+# passwd.awk --- access password file information
+@c endfile
+@ignore
+@c file eg/lib/passwdawk.in
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+# Revised October 2000
+
+@c endfile
+@end ignore
+@c file eg/lib/passwdawk.in
+BEGIN @{
+    # tailor this to suit your system
+    _pw_awklib = "/usr/local/libexec/awk/"
+@}
+
+function _pw_init(    oldfs, oldrs, olddol0, pwcat, using_fw)
+@{
+    if (_pw_inited)
+        return
+
+    oldfs = FS
+    oldrs = RS
+    olddol0 = $0
+    using_fw = (PROCINFO["FS"] == "FIELDWIDTHS")
+    FS = ":"
+    RS = "\n"
+
+    pwcat = _pw_awklib "pwcat"
+    while ((pwcat | getline) > 0) @{
+        _pw_byname[$1] = $0
+        _pw_byuid[$3] = $0
+        _pw_bycount[++_pw_total] = $0
+    @}
+    close(pwcat)
+    _pw_count = 0
+    _pw_inited = 1
+    FS = oldfs
+    if (using_fw)
+        FIELDWIDTHS = FIELDWIDTHS
+    RS = oldrs
+    $0 = olddol0
+@}
+@c endfile
+@end example
+
+@cindex @code{BEGIN} pattern, @code{pwcat} program
+The @code{BEGIN} rule sets a private variable to the directory where
+@command{pwcat} is stored.  Because it is used to help out an @command{awk} library
+routine, we have chosen to put it in @file{/usr/local/libexec/awk};
+however, you might want it to be in a different directory on your system.
+
+The function @code{_pw_init} keeps three copies of the user information
+in three associative arrays.  The arrays are indexed by username
+(@code{_pw_byname}), by user ID number (@code{_pw_byuid}), and by order of
+occurrence (@code{_pw_bycount}).
+The variable @code{_pw_inited} is used for efficiency; @code{_pw_init}
+needs only to be called once.
+
+@cindex @code{getline} command, @code{_pw_init} function
+Because this function uses @code{getline} to read information from
+@command{pwcat}, it first saves the values of @code{FS}, @code{RS}, and @code{$0}.
+It notes in the variable @code{using_fw} whether field splitting
+with @code{FIELDWIDTHS} is in effect or not.
+Doing so is necessary, since these functions could be called
+from anywhere within a user's program, and the user may have his
+or her
+own way of splitting records and fields.
+
+The @code{using_fw} variable records whether @code{PROCINFO["FS"]}
+is @code{"FIELDWIDTHS"}, which is the case when field splitting is being
+done with @code{FIELDWIDTHS}.  This makes it possible to restore the correct
+field-splitting mechanism later.  The test can only be true for
+@command{gawk}.  It is false if using @code{FS} or on some other
+@command{awk} implementation.
+
+The main part of the function uses a loop to read database lines, split
+the line into fields, and then store the line into each array as necessary.
+When the loop is done, @code{@w{_pw_init}} cleans up by closing the pipeline,
+setting @code{@w{_pw_inited}} to one, and restoring @code{FS} (and @code{FIELDWIDTHS}
+if necessary), @code{RS}, and @code{$0}.
+The use of @code{@w{_pw_count}} is explained shortly.
+
+@c NEXT ED: All of these functions don't need the ... in ... test.  Just
+@c return the array element, which will be "" if not already there.  Duh.
+@cindex @code{getpwnam} function (C library)
+The @code{getpwnam} function takes a username as a string argument. If that
+user is in the database, it returns the appropriate line. Otherwise, it
+returns the null string:
+
+@cindex @code{getpwnam} user-defined function
+@example
+@group
+@c file eg/lib/passwdawk.in
+function getpwnam(name)
+@{
+    _pw_init()
+    if (name in _pw_byname)
+        return _pw_byname[name]
+    return ""
+@}
+@c endfile
+@end group
+@end example
+
+@cindex @code{getpwuid} function (C library)
+Similarly,
+the @code{getpwuid} function takes a user ID number argument. If that
+user number is in the database, it returns the appropriate line. Otherwise, it
+returns the null string:
+
+@cindex @code{getpwuid} user-defined function
+@example
+@c file eg/lib/passwdawk.in
+function getpwuid(uid)
+@{
+    _pw_init()
+    if (uid in _pw_byuid)
+        return _pw_byuid[uid]
+    return ""
+@}
+@c endfile
+@end example
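+
+As a brief usage sketch (not part of the library file), the record returned
+by @code{getpwuid} can be pulled apart with @code{split}, using the
+colon-separated layout described earlier.  This assumes @command{gawk},
+since @code{PROCINFO} is a @command{gawk} extension, that the library
+is loaded with @samp{-f passwd.awk}, and that @command{pwcat} is installed
+where @code{_pw_awklib} points; the array name @code{pw} is arbitrary:
+
+@example
+BEGIN @{
+    # look up the invoking user's record by real user ID (gawk only)
+    rec = getpwuid(PROCINFO["uid"])
+    if (rec == "")
+        print "no entry for this user ID" > "/dev/stderr"
+    else @{
+        split(rec, pw, ":")    # pw[1] is the login name, pw[6] the home
+        printf("%s lives in %s\n", pw[1], pw[6])
+    @}
+@}
+@end example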
+
+@cindex @code{getpwent} function (C library)
+The @code{getpwent} function simply steps through the database, one entry at
+a time.  It uses @code{_pw_count} to track its current position in the
+@code{_pw_bycount} array:
+
+@cindex @code{getpwent} user-defined function
+@example
+@c file eg/lib/passwdawk.in
+function getpwent()
+@{
+    _pw_init()
+    if (_pw_count < _pw_total)
+        return _pw_bycount[++_pw_count]
+    return ""
+@}
+@c endfile
+@end example
+
+@cindex @code{endpwent} function (C library)
+The @code{@w{endpwent}} function resets @code{@w{_pw_count}} to zero, so that
+subsequent calls to @code{getpwent} start over again:
+
+@cindex @code{endpwent} user-defined function
+@example
+@c file eg/lib/passwdawk.in
+function endpwent()
+@{
+    _pw_count = 0
+@}
+@c endfile
+@end example
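+
+Taken together, @code{getpwent} and @code{endpwent} allow a simple scan of
+the whole database.  The following sketch, which assumes the library has
+been loaded with @samp{-f passwd.awk}, prints each login name with its
+login shell; the variable names are only illustrative:
+
+@example
+BEGIN @{
+    # getpwent returns "" once every entry has been handed out
+    while ((rec = getpwent()) != "") @{
+        split(rec, pw, ":")
+        printf("%-10s %s\n", pw[1], pw[7])
+    @}
+    endpwent()    # rewind, so that a later scan starts from the top
+@}
+@end example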
+
+A conscious design decision in this suite is that each subroutine calls
+@code{@w{_pw_init}} to initialize the database arrays.  The overhead of running
+a separate process to generate the user database, and the I/O to scan it,
+are incurred only if the user's main program actually calls one of these
+functions.  If this library file is loaded along with a user's program, but
+none of the routines are ever called, then there is no extra runtime overhead.
+(The alternative is to move the body of @code{@w{_pw_init}} into a
+@code{BEGIN} rule, which always runs @command{pwcat}.  This simplifies the
+code but runs an extra process that may never be needed.)
+
+In turn, calling @code{_pw_init} is not too expensive, because the
+@code{_pw_inited} variable keeps the program from reading the data more than
+once.  If you are worried about squeezing every last cycle out of your
+@command{awk} program, the check of @code{_pw_inited} could be moved out of
+@code{_pw_init} and duplicated in all the other functions.  In practice,
+this is not necessary, since most @command{awk} programs are I/O-bound, and
+duplicating the check clutters up the code.
+
+The @command{id} program in @ref{Id Program},
+uses these functions.
+@c ENDOFRANGE libfudata
+@c ENDOFRANGE flibudata
+@c ENDOFRANGE udatar
+@c ENDOFRANGE dataur
+
+@node Group Functions
+@section Reading the Group Database
+
+@c STARTOFRANGE libfgdata
+@cindex libraries of @command{awk} functions, group database, reading
+@c STARTOFRANGE flibgdata
+@cindex functions, library, group database, reading
+@c STARTOFRANGE gdatar
+@cindex group database, reading
+@c STARTOFRANGE datagr
+@cindex database, group, reading
+@cindex @code{PROCINFO} array
+@cindex @code{getgrent} function (C library)
+@cindex @code{getgrent} user-defined function
+@c comma is part of primary
+@cindex groups, information about
+@cindex account information
+@cindex group file
+@cindex files, group
+Much of the discussion presented in
+@ref{Passwd Functions},
+applies to the group database as well.  Although there has traditionally
+been a well-known file (@file{/etc/group}) in a well-known format, the POSIX
+standard only provides a set of C library routines
+(@code{<grp.h>} and @code{getgrent})
+for accessing the information.
+Even though this file may exist, it likely does not have
+complete information.  Therefore, as with the user database, it is necessary
+to have a small C program that generates the group database as its output.
+
+@cindex @command{grcat} program
+@command{grcat}, a C program that ``cats'' the group database,
+is as follows:
+
+@example
+@c file eg/lib/grcat.c
+/*
+ * grcat.c
+ *
+ * Generate a printable version of the group database
+ */
+@c endfile
+@ignore
+@c file eg/lib/grcat.c
+/*
+ * Arnold Robbins, arnold@@gnu.org, May 1993
+ * Public Domain
+ */
+
+/* For OS/2, do nothing. */
+#if HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+#if defined (STDC_HEADERS)
+#include <stdlib.h>
+#endif
+
+#ifndef HAVE_GETGRENT
+int main() { return 0; }
+#else
+@c endfile
+@end ignore
+@c file eg/lib/grcat.c
+#include <stdio.h>
+#include <grp.h>
+
+int
+main(argc, argv)
+int argc;
+char **argv;
+@{
+    struct group *g;
+    int i;
+
+    while ((g = getgrent()) != NULL) @{
+        printf("%s:%s:%ld:", g->gr_name, g->gr_passwd,
+                                     (long) g->gr_gid);
+        for (i = 0; g->gr_mem[i] != NULL; i++) @{
+            printf("%s", g->gr_mem[i]);
+@group
+            if (g->gr_mem[i+1] != NULL)
+                putchar(',');
+        @}
+@end group
+        putchar('\n');
+    @}
+    endgrent();
+    return 0;
+@}
+@c endfile
+@end example
+@ignore
+@c file eg/lib/grcat.c
+#endif /* HAVE_GETGRENT */
+@c endfile
+@end ignore
+
+Each line in the group database represents one group.  The fields are
+separated with colons and represent the following information:
+
+@ignore
+@table @asis
+@item Group Name
+The name of the group.
+
+@item Group Password
+The encrypted group password. In practice, this field is never used. It is
+usually empty or set to @samp{*}.
+
+@item Group ID Number
+The numeric group ID number. This number is unique within the file.
+(On some systems it's a C @code{long}, and not an @code{int}.  Thus
+we cast it to @code{long} for all cases.)
+
+@item Group Member List
+A comma-separated list of usernames.  These users are members of the group.
+Modern Unix systems allow users to be members of several groups
+simultaneously.  If your system does, then there are elements
+@code{"group1"} through @code{"group@var{N}"} in @code{PROCINFO}
+for those group ID numbers.
+(Note that @code{PROCINFO} is a @command{gawk} extension;
+@pxref{Built-in Variables}.)
+@end table
+@end ignore
+
+@multitable {Encrypted password} {1234567890123456789012345678901234567890123456}
+@item Group name @tab The group's name.
+
+@item Group password @tab The group's encrypted password. In practice, this field is never used;
+it is usually empty or set to @samp{*}.
+
+@item Group-ID @tab
+The group's numeric group ID number; this number should be unique within the file.
+
+@item Group member list @tab
+A comma-separated list of usernames.  These users are members of the group.
+Modern Unix systems allow users to be members of several groups
+simultaneously.  If your system does, then there are elements
+@code{"group1"} through @code{"group@var{N}"} in @code{PROCINFO}
+for those group ID numbers.
+(Note that @code{PROCINFO} is a @command{gawk} extension;
+@pxref{Built-in Variables}.)
+@end multitable
+
+Here is what running @command{grcat} might produce:
+
+@example
+$ grcat
+@print{} wheel:*:0:arnold
+@print{} nogroup:*:65534:
+@print{} daemon:*:1:
+@print{} kmem:*:2:
+@print{} staff:*:10:arnold,miriam,andy
+@print{} other:*:20:
+@dots{}
+@end example
+
+Here are the functions for obtaining information from the group database.
+There are several, modeled after the C library functions of the same names:
+
+@cindex @code{getline} command, @code{_gr_init} user-defined function
+@cindex @code{_gr_init} user-defined function
+@example
+@c file eg/lib/groupawk.in
+# group.awk --- functions for dealing with the group file
+@c endfile
+@ignore
+@c file eg/lib/groupawk.in
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+# Revised October 2000
+
+@c endfile
+@end ignore
+@c line break on _gr_init for smallbook
+@c file eg/lib/groupawk.in
+BEGIN    \
+@{
+    # Change to suit your system
+    _gr_awklib = "/usr/local/libexec/awk/"
+@}
+
+function _gr_init(    oldfs, oldrs, olddol0, grcat,
+                             using_fw, n, a, i)
+@{
+    if (_gr_inited)
+        return
+
+    oldfs = FS
+    oldrs = RS
+    olddol0 = $0
+    using_fw = (PROCINFO["FS"] == "FIELDWIDTHS")
+    FS = ":"
+    RS = "\n"
+
+    grcat = _gr_awklib "grcat"
+    while ((grcat | getline) > 0) @{
+        if ($1 in _gr_byname)
+            _gr_byname[$1] = _gr_byname[$1] "," $4
+        else
+            _gr_byname[$1] = $0
+        if ($3 in _gr_bygid)
+            _gr_bygid[$3] = _gr_bygid[$3] "," $4
+        else
+            _gr_bygid[$3] = $0
+
+        n = split($4, a, "[ \t]*,[ \t]*")
+        for (i = 1; i <= n; i++)
+            if (a[i] in _gr_groupsbyuser)
+                _gr_groupsbyuser[a[i]] = \
+                    _gr_groupsbyuser[a[i]] " " $1
+            else
+                _gr_groupsbyuser[a[i]] = $1
+
+        _gr_bycount[++_gr_count] = $0
+    @}
+    close(grcat)
+    _gr_count = 0
+    _gr_inited++
+    FS = oldfs
+    if (using_fw)
+        FIELDWIDTHS = FIELDWIDTHS
+    RS = oldrs
+    $0 = olddol0
+@}
+@c endfile
+@end example
+
+The @code{BEGIN} rule sets a private variable to the directory where
+@command{grcat} is stored.  Because it is used to help out an @command{awk} library
+routine, we have chosen to put it in @file{/usr/local/libexec/awk}.  You might
+want it to be in a different directory on your system.
+
+These routines follow the same general outline as the user database routines
+(@pxref{Passwd Functions}).
+The @code{@w{_gr_inited}} variable is used to
+ensure that the database is scanned no more than once.
+The @code{@w{_gr_init}} function first saves @code{FS}, @code{RS}, and
+@code{$0}, notes whether @code{FIELDWIDTHS} is in use, and then sets
+@code{FS} and @code{RS} to the correct values for
+scanning the group information.
+
+The group information is stored in several associative arrays.
+The arrays are indexed by group name (@code{@w{_gr_byname}}), by group ID number
+(@code{@w{_gr_bygid}}), and by position in the database (@code{@w{_gr_bycount}}).
+There is an additional array indexed by username (@code{@w{_gr_groupsbyuser}}),
+which is a space-separated list of groups to which each user belongs.
+
+Unlike the user database, it is possible to have multiple records in the
+database for the same group.  This is common when a group has a large number
+of members.  A pair of such entries might look like the following:
+
+@example
+tvpeople:*:101:johnny,jay,arsenio
+tvpeople:*:101:david,conan,tom,joan
+@end example
+
+For this reason, @code{_gr_init} checks whether a group name or
+group ID number has already been seen.  If it has, then the usernames are
+simply concatenated onto the previous list of users.  (There is actually a
+subtle problem with the code just presented.  Suppose that
+the first record for a group listed no names.  This code then adds the
+later names with a leading comma.  It also doesn't check that there is
+a @code{$4}.)
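+
+One possible repair, offered only as a sketch, is to route the
+concatenation through a small helper that adds the comma only when both
+sides actually have names.  The name @code{_gr_append} is hypothetical;
+it is not part of the library as distributed.  The helper assumes that
+every stored record carries the colon that @command{grcat} always prints
+after the group ID:
+
+@example
+# _gr_append --- add members to a stored group record (hypothetical)
+function _gr_append(entry, members)
+@{
+    if (members == "")     # nothing to add
+        return entry
+    if (entry ~ /:$/)      # stored record has no members yet
+        return entry members
+    return entry "," members
+@}
+@end example
+
+@noindent
+With it, the assignment in @code{_gr_init} would become
+@samp{_gr_byname[$1] = _gr_append(_gr_byname[$1], $4)}, and
+similarly for @code{_gr_bygid}.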
+
+Finally, @code{_gr_init} closes the pipeline to @command{grcat}, restores
+@code{FS} (and @code{FIELDWIDTHS} if necessary), @code{RS}, and @code{$0},
+initializes @code{_gr_count} to zero
+(it is used later), and makes @code{_gr_inited} nonzero.
+
+@cindex @code{getgrnam} function (C library)
+The @code{getgrnam} function takes a group name as its argument, and if that
+group exists, it is returned. Otherwise, @code{getgrnam} returns the null
+string:
+
+@cindex @code{getgrnam} user-defined function
+@example
+@c file eg/lib/groupawk.in
+function getgrnam(group)
+@{
+    _gr_init()
+    if (group in _gr_byname)
+        return _gr_byname[group]
+    return ""
+@}
+@c endfile
+@end example
+
+@cindex @code{getgrgid} function (C library)
+The @code{getgrgid} function is similar; it takes a numeric group ID and
+looks up the information associated with that group ID:
+
+@cindex @code{getgrgid} user-defined function
+@example
+@c file eg/lib/groupawk.in
+function getgrgid(gid)
+@{
+    _gr_init()
+    if (gid in _gr_bygid)
+        return _gr_bygid[gid]
+    return ""
+@}
+@c endfile
+@end example
+
+The @code{getgruser} function does not have a C counterpart. It takes a
+username and returns the list of groups that have the user as a member:
+
+@cindex @code{getgruser} function, user-defined
+@example
+@c file eg/lib/groupawk.in
+function getgruser(user)
+@{
+    _gr_init()
+    if (user in _gr_groupsbyuser)
+        return _gr_groupsbyuser[user]
+    return ""
+@}
+@c endfile
+@end example
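+
+Because the result is a space-separated list, it splits naturally on
+@code{@w{" "}}.  A short sketch, assuming the library is loaded with
+@samp{-f group.awk} and using the login name @samp{arnold} from the
+sample output shown earlier purely as an illustration:
+
+@example
+BEGIN @{
+    # n is zero if the user is not listed as a member of any group
+    n = split(getgruser("arnold"), grp, " ")
+    for (i = 1; i <= n; i++)
+        print grp[i]
+@}
+@end example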
+
+@cindex @code{getgrent} function (C library)
+The @code{getgrent} function steps through the database one entry at a time.
+It uses @code{_gr_count} to track its position in the list:
+
+@cindex @code{getgrent} user-defined function
+@example
+@c file eg/lib/groupawk.in
+function getgrent()
+@{
+    _gr_init()
+    if (++_gr_count in _gr_bycount)
+        return _gr_bycount[_gr_count]
+    return ""
+@}
+@c endfile
+@end example
+@c ENDOFRANGE clibf
+
+@cindex @code{endgrent} function (C library)
+The @code{endgrent} function resets @code{_gr_count} to zero so that @code{getgrent} can
+start over again:
+
+@cindex @code{endgrent} user-defined function
+@example
+@c file eg/lib/groupawk.in
+function endgrent()
+@{
+    _gr_count = 0
+@}
+@c endfile
+@end example
+
+As with the user database routines, each function calls @code{_gr_init} to
+initialize the arrays.  Doing so only incurs the extra overhead of running
+@command{grcat} if these functions are used (as opposed to moving the body of
+@code{_gr_init} into a @code{BEGIN} rule).
+
+Most of the work is in scanning the database and building the various
+associative arrays.  The functions that the user calls are themselves very
+simple, relying on @command{awk}'s associative arrays to do the work.
+
+The @command{id} program in @ref{Id Program},
+uses these functions.
+@c ENDOFRANGE libfgdata
+@c ENDOFRANGE flibgdata
+@c ENDOFRANGE gdatar
+@c ENDOFRANGE libf
+@c ENDOFRANGE flib
+@c ENDOFRANGE fudlib
+@c ENDOFRANGE datagr
+
+@node Sample Programs
+@chapter Practical @command{awk} Programs
+@c STARTOFRANGE awkpex
+@cindex @command{awk} programs, examples of
+
+@ref{Library Functions},
+presents the idea that reading programs in a language contributes to
+learning that language.  This @value{CHAPTER} continues that theme,
+presenting a potpourri of @command{awk} programs for your reading
+enjoyment.
+@ifnotinfo
+There are three sections.
+The first describes how to run the programs presented
+in this @value{CHAPTER}.
+
+The second presents @command{awk}
+versions of several common POSIX utilities.
+These are programs that you are probably already familiar with,
+so the problems they solve are already understood.
+By reimplementing these programs in @command{awk},
+you can focus on the @command{awk}-related aspects of solving
+the programming problem.
+
+The third is a grab bag of interesting programs.
+These solve a number of different data-manipulation and management
+problems.  Many of the programs are short, which emphasizes @command{awk}'s
+ability to do a lot in just a few lines of code.
+@end ifnotinfo
+
+Many of these programs use the library functions presented in
+@ref{Library Functions}.
+
+@menu
+* Running Examples::            How to run these examples.
+* Clones::                      Clones of common utilities.
+* Miscellaneous Programs::      Some interesting @command{awk} programs.
+@end menu
+
+@node Running Examples
+@section Running the Example Programs
+
+To run a given program, you would typically do something like this:
+
+@example
+awk -f @var{program} -- @var{options} @var{files}
+@end example
+
+@noindent
+Here, @var{program} is the name of the @command{awk} program (such as
+@file{cut.awk}), @var{options} are any command-line options for the
+program that start with a @samp{-}, and @var{files} are the actual @value{DF}s.
+
+If your system supports the @samp{#!} executable interpreter mechanism
+(@pxref{Executable Scripts}),
+you can instead run your program directly:
+
+@example
+cut.awk -c1-8 myfiles > results
+@end example
+
+If your @command{awk} is not @command{gawk}, you may instead need to use this:
+
+@example
+cut.awk -- -c1-8 myfiles > results
+@end example
+
+@node Clones
+@section Reinventing Wheels for Fun and Profit
+@c last comma is part of secondary
+@c STARTOFRANGE posimawk
+@cindex POSIX, programs, implementing in @command{awk}
+
+This @value{SECTION} presents a number of POSIX utilities that are implemented in
+@command{awk}.  Reinventing these programs in @command{awk} is often enjoyable,
+because the algorithms can be very clearly expressed, and the code is usually
+very concise and simple.  This is true because @command{awk} does so much for you.
+
+It should be noted that these programs are not necessarily intended to
+replace the installed versions on your system.  Instead, their
+purpose is to illustrate @command{awk} language programming for ``real world''
+tasks.
+
+The programs are presented in alphabetical order.
+
+@menu
+* Cut Program::                 The @command{cut} utility.
+* Egrep Program::               The @command{egrep} utility.
+* Id Program::                  The @command{id} utility.
+* Split Program::               The @command{split} utility.
+* Tee Program::                 The @command{tee} utility.
+* Uniq Program::                The @command{uniq} utility.
+* Wc Program::                  The @command{wc} utility.
+@end menu
+
+@node Cut Program
+@subsection Cutting out Fields and Columns
+
+@cindex @command{cut} utility
+@c STARTOFRANGE cut
+@cindex @command{cut} utility
+@c STARTOFRANGE ficut
+@cindex fields, cutting
+@c STARTOFRANGE colcut
+@cindex columns, cutting
+The @command{cut} utility selects, or ``cuts,'' characters or fields
+from its standard input and sends them to its standard output.
+Fields are separated by tabs by default,
+but you may supply a command-line option to change the field
+@dfn{delimiter} (i.e., the field-separator character). @command{cut}'s
+definition of fields is less general than @command{awk}'s.
+
+A common use of @command{cut} might be to pull out just the login name of
+logged-on users from the output of @command{who}.  For example, the following
+pipeline generates a sorted, unique list of the logged-on users:
+
+@example
+who | cut -c1-8 | sort | uniq
+@end example
+
+The options for @command{cut} are:
+
+@table @code
+@item -c @var{list}
+Use @var{list} as the list of characters to cut out.  Items within the list
+may be separated by commas, and ranges of characters can be separated with
+dashes.  The list @samp{1-8,15,22-35} specifies characters 1 through
+8, 15, and 22 through 35.
+
+@item -f @var{list}
+Use @var{list} as the list of fields to cut out.
+
+@item -d @var{delim}
+Use @var{delim} as the field-separator character instead of the tab
+character.
+
+@item -s
+Suppress printing of lines that do not contain the field delimiter.
+@end table
+
+The @command{awk} implementation of @command{cut} uses the @code{getopt} library
+function (@pxref{Getopt Function})
+and the @code{join} library function
+(@pxref{Join Function}).
+
+The program begins with a comment describing the options, the library
+functions needed, and a @code{usage} function that prints out a usage
+message and exits.  @code{usage} is called if invalid arguments are
+supplied:
+
+@cindex @code{cut.awk} program
+@example
+@c file eg/prog/cut.awk
+# cut.awk --- implement cut in awk
+@c endfile
+@ignore
+@c file eg/prog/cut.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+
+@c endfile
+@end ignore
+@c file eg/prog/cut.awk
+# Options:
+#    -f list     Cut fields
+#    -d c        Field delimiter character
+#    -c list     Cut characters
+#
+#    -s          Suppress lines without the delimiter
+#
+# Requires getopt and join library functions
+
+@group
+function usage(    e1, e2)
+@{
+    e1 = "usage: cut [-f list] [-d c] [-s] [files...]"
+    e2 = "usage: cut [-c list] [files...]"
+    print e1 > "/dev/stderr"
+    print e2 > "/dev/stderr"
+    exit 1
+@}
+@end group
+@c endfile
+@end example
+
+@noindent
+The variables @code{e1} and @code{e2} are used so that the function
+fits nicely on the
+@ifnotinfo
+page.
+@end ifnotinfo
+@ifnottex
+screen.
+@end ifnottex
+
+@cindex @code{BEGIN} pattern, running @command{awk} programs and
+@cindex @code{FS} variable, running @command{awk} programs and
+Next comes a @code{BEGIN} rule that parses the command-line options.
+It sets @code{FS} to a single TAB character, because that is @command{cut}'s
+default field separator.  The output field separator is also set to be the
+same as the input field separator.  Then @code{getopt} is used to step
+through the command-line options.  Exactly one of the variables
+@code{by_fields} or @code{by_chars} is set to true, to indicate that
+processing should be done by fields or by characters, respectively.
+When cutting by characters, the output field separator is set to the null
+string:
+
+@example
+@c file eg/prog/cut.awk
+BEGIN    \
+@{
+    FS = "\t"    # default
+    OFS = FS
+    while ((c = getopt(ARGC, ARGV, "sf:c:d:")) != -1) @{
+        if (c == "f") @{
+            by_fields = 1
+            fieldlist = Optarg
+        @} else if (c == "c") @{
+            by_chars = 1
+            fieldlist = Optarg
+            OFS = ""
+        @} else if (c == "d") @{
+            if (length(Optarg) > 1) @{
+                printf("Using first character of %s" \
+                " for delimiter\n", Optarg) > "/dev/stderr"
+                Optarg = substr(Optarg, 1, 1)
+            @}
+            FS = Optarg
+            OFS = FS
+            if (FS == " ")    # defeat awk semantics
+                FS = "[ ]"
+        @} else if (c == "s")
+            suppress++
+        else
+            usage()
+    @}
+
+    for (i = 1; i < Optind; i++)
+        ARGV[i] = ""
+@c endfile
+@end example
+
+@cindex field separators, spaces as
+Special care is taken when the field delimiter is a space.  Using
+a single space (@code{@w{" "}}) for the value of @code{FS} is
+incorrect---@command{awk} would separate fields with runs of spaces,
+tabs, and/or newlines, and we want them to be separated with individual
+spaces.  Also, note that after @code{getopt} is through, we have to
+clear out all the elements of @code{ARGV} from 1 up to, but not
+including, @code{Optind},
+so that @command{awk} does not try to process the command-line options
+as @value{FN}s.
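+
+The difference between the two settings of @code{FS} is easy to see on a
+line containing two adjacent spaces.  In this small illustration (not part
+of @file{cut.awk}), a single space as the separator yields two fields,
+whereas the bracket expression yields an empty field between the spaces:
+
+@example
+$ echo 'a  b' | awk 'BEGIN @{ FS = " " @} @{ print NF @}'
+@print{} 2
+$ echo 'a  b' | awk 'BEGIN @{ FS = "[ ]" @} @{ print NF @}'
+@print{} 3
+@end example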
+
+After dealing with the command-line options, the program verifies that the
+options make sense.  Only one or the other of @option{-c} and @option{-f}
+should be used, and both require a field list.  Then the program calls
+either @code{set_fieldlist} or @code{set_charlist} to pull apart the
+list of fields or characters:
+
+@example
+@c file eg/prog/cut.awk
+    if (by_fields && by_chars)
+        usage()
+
+    if (by_fields == 0 && by_chars == 0)
+        by_fields = 1    # default
+
+    if (fieldlist == "") @{
+        print "cut: needs list for -c or -f" > "/dev/stderr"
+        exit 1
+    @}
+
+    if (by_fields)
+        set_fieldlist()
+    else
+        set_charlist()
+@}
+@c endfile
+@end example
+
+@code{set_fieldlist} splits the field list apart at the commas
+into an array.  Then, for each element of the array, it looks to
+see if it is actually a range, and if so, splits it apart. The range
+is verified to make sure the first number is smaller than the second.
+Each number in the list is added to the @code{flist} array, which
+simply lists the fields that will be printed.  Normal field splitting
+is used; that is, the program lets @command{awk} handle the job of doing the
+field splitting:
+
+@example
+@c file eg/prog/cut.awk
+function set_fieldlist(        n, m, i, j, k, f, g)
+@{
+    n = split(fieldlist, f, ",")
+    j = 1    # index in flist
+    for (i = 1; i <= n; i++) @{
+        if (index(f[i], "-") != 0) @{ # a range
+            m = split(f[i], g, "-")
+@group
+            if (m != 2 || g[1] >= g[2]) @{
+                printf("bad field list: %s\n",
+                                  f[i]) > "/dev/stderr"
+                exit 1
+            @}
+@end group
+            for (k = g[1]; k <= g[2]; k++)
+                flist[j++] = k
+        @} else
+            flist[j++] = f[i]
+    @}
+    nfields = j - 1
+@}
+@c endfile
+@end example
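+
+As a rough, standalone illustration of the expansion (the list
+@samp{1,3-5} is made up for this example, and the error checking is left
+out), the same splitting logic can be tried directly from the shell:
+
+@example
+$ awk 'BEGIN @{
+>     n = split("1,3-5", f, ",")
+>     for (i = 1; i <= n; i++) @{
+>         if (split(f[i], g, "-") == 2) @{    # a range
+>             for (k = g[1]; k <= g[2]; k++)
+>                 printf "%d ", k
+>         @} else
+>             printf "%d ", f[i]
+>     @}
+>     print ""
+> @}'
+@print{} 1 3 4 5
+@end example
+
+@noindent
+These are the values that @code{set_fieldlist} would store in
+@code{flist[1]} through @code{flist[4]}.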
+
+The @code{set_charlist} function is more complicated than @code{set_fieldlist}.
+The idea here is to use @command{gawk}'s @code{FIELDWIDTHS} variable
+(@pxref{Constant Size}),
+which describes constant-width input.  When using a character list, that is
+exactly what we have.
+
+Setting up @code{FIELDWIDTHS} is more complicated than simply listing the
+fields that need to be printed.  We have to keep track of the fields to
+print and also the intervening characters that have to be skipped.
+For example, suppose you wanted characters 1 through 8, 15, and
+22 through 35.  You would use @samp{-c 1-8,15,22-35}.  The necessary value
+for @code{FIELDWIDTHS} is @code{@w{"8 6 1 6 14"}}.  This yields five
+fields, and the fields to print
+are @code{$1}, @code{$3}, and @code{$5}.
+The intermediate fields are @dfn{filler},
+which is stuff in between the desired data.
+@code{flist} lists the fields to print, and @code{t} tracks the
+complete field list, including filler fields:
+
+@example
+@c file eg/prog/cut.awk
+function set_charlist(    field, i, j, f, g, n, m, t,
+                          filler, last, len)
+@{
+    field = 1   # count total fields
+    n = split(fieldlist, f, ",")
+    j = 1       # index in flist
+    for (i = 1; i <= n; i++) @{
+        if (index(f[i], "-") != 0) @{ # range
+            m = split(f[i], g, "-")
+            if (m != 2 || g[1] >= g[2]) @{
+                printf("bad character list: %s\n",
+                               f[i]) > "/dev/stderr"
+                exit 1
+            @}
+            len = g[2] - g[1] + 1
+            if (g[1] > 1)  # compute length of filler
+                filler = g[1] - last - 1
+            else
+                filler = 0
+@group
+            if (filler)
+                t[field++] = filler
+@end group
+            t[field++] = len  # length of field
+            last = g[2]
+            flist[j++] = field - 1
+        @} else @{
+            if (f[i] > 1)
+                filler = f[i] - last - 1
+            else
+                filler = 0
+            if (filler)
+                t[field++] = filler
+            t[field++] = 1
+            last = f[i]
+            flist[j++] = field - 1
+        @}
+    @}
+    FIELDWIDTHS = join(t, 1, field - 1)
+    nfields = j - 1
+@}
+@c endfile
+@end example
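+
+The @code{FIELDWIDTHS} value from the @samp{-c 1-8,15,22-35} example can
+be checked by hand.  This sketch assumes @command{gawk} (other
+@command{awk} versions do not treat @code{FIELDWIDTHS} specially), and the
+input line is made up so that the character positions are easy to count:
+
+@example
+$ echo 'abcdefghijklmnopqrstuvwxyz0123456789' |
+> gawk 'BEGIN @{ FIELDWIDTHS = "8 6 1 6 14" @} @{ print $1, $3, $5 @}'
+@print{} abcdefgh o vwxyz012345678
+@end example
+
+@noindent
+Fields @code{$2} and @code{$4} hold the filler characters and are simply
+never printed.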
+
+Next is the rule that actually processes the data.  If the @option{-s} option
+is given, then @code{suppress} is true.  The first @code{if} statement
+makes sure that the input record does have the field separator.  If
+@command{cut} is processing fields, @code{suppress} is true, and the field
+separator character is not in the record, then the record is skipped.
+
+If the record is valid, then @command{gawk} has split the data
+into fields, either using the character in @code{FS} or using fixed-length
+fields and @code{FIELDWIDTHS}.  The loop goes through the list of fields
+that should be printed.  The corresponding field is printed if it contains data.
+If the next field also has data, then the separator character is
+written out between the fields:
+
+@example
+@c file eg/prog/cut.awk
+@{
+    if (by_fields && suppress && index($0, FS) == 0)
+        next
+
+    for (i = 1; i <= nfields; i++) @{
+        if ($flist[i] != "") @{
+            printf "%s", $flist[i]
+            if (i < nfields && $flist[i+1] != "")
+                printf "%s", OFS
+        @}
+    @}
+    print ""
+@}
+@c endfile
+@end example
+
+This version of @command{cut} relies on @command{gawk}'s @code{FIELDWIDTHS}
+variable to do the character-based cutting.  While it is possible in
+other @command{awk} implementations to use @code{substr}
+(@pxref{String Functions}),
+it is also extremely painful.
+The @code{FIELDWIDTHS} variable supplies an elegant solution to the problem
+of picking the input line apart by characters.
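+
+For comparison, here is roughly what the @code{substr} approach looks
+like when the positions are written out by hand for the same
+@samp{-c 1-8,15,22-35} list.  This is only an illustration with made-up
+input; a general version would have to compute every start position and
+length from the list, which is where the pain comes in:
+
+@example
+$ echo 'abcdefghijklmnopqrstuvwxyz0123456789' |
+> awk '@{ print substr($0, 1, 8) substr($0, 15, 1) substr($0, 22, 14) @}'
+@print{} abcdefghovwxyz012345678
+@end example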
+@c ENDOFRANGE cut
+@c ENDOFRANGE ficut
+@c ENDOFRANGE colcut
+
+@c Exercise: Rewrite using split with "".
+
+@node Egrep Program
+@subsection Searching for Regular Expressions in Files
+
+@c STARTOFRANGE regexps
+@cindex regular expressions, searching for
+@c STARTOFRANGE sfregexp
+@cindex searching, files for regular expressions
+@c STARTOFRANGE fsregexp
+@cindex files, searching for regular expressions
+@cindex @command{egrep} utility
+The @command{egrep} utility searches files for patterns.  It uses regular
+expressions that are almost identical to those available in @command{awk}
+(@pxref{Regexp}).
+It is used in the following manner:
+
+@example
+egrep @r{[} @var{options} @r{]} '@var{pattern}' @var{files} @dots{}
+@end example
+
+The @var{pattern} is a regular expression.  In typical usage, the regular
+expression is quoted to prevent the shell from expanding any of the
+special characters as @value{FN} wildcards.  Normally, @command{egrep}
+prints the lines that matched.  If multiple @value{FN}s are provided on
+the command line, each output line is preceded by the name of the file
+and a colon.
+
+The options to @command{egrep} are as follows:
+
+@table @code
+@item -c
+Print out a count of the lines that matched the pattern, instead of the
+lines themselves.
+
+@item -s
+Be silent.  No output is produced and the exit value indicates whether
+the pattern was matched.
+
+@item -v
+Invert the sense of the test. @command{egrep} prints the lines that do
+@emph{not} match the pattern and exits successfully if the pattern is not
+matched.
+
+@item -i
+Ignore case distinctions in both the pattern and the input data.
+
+@item -l
+Only print (list) the names of the files that matched, not the lines that matched.
+
+@item -e @var{pattern}
+Use @var{pattern} as the regexp to match.  The purpose of the @option{-e}
+option is to allow patterns that start with a @samp{-}.
+@end table
+
+This version uses the @code{getopt} library function
+(@pxref{Getopt Function})
+and the file transition library program
+(@pxref{Filetrans Function}).
+
+The program begins with a descriptive comment and then a @code{BEGIN} rule
+that processes the command-line arguments with @code{getopt}.  The @option{-i}
+(ignore case) option is particularly easy with @command{gawk}; we just use the
+@code{IGNORECASE} built-in variable
+(@pxref{Built-in Variables}):
+
+@cindex @code{egrep.awk} program
+@example
+@c file eg/prog/egrep.awk
+# egrep.awk --- simulate egrep in awk
+@c endfile
+@ignore
+@c file eg/prog/egrep.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+
+@c endfile
+@end ignore
+@c file eg/prog/egrep.awk
+# Options:
+#    -c    count of lines
+#    -s    silent - use exit value
+#    -v    invert test, success if no match
+#    -i    ignore case
+#    -l    print filenames only
+#    -e    argument is pattern
+#
+# Requires getopt and file transition library functions
+
+BEGIN @{
+    while ((c = getopt(ARGC, ARGV, "ce:svil")) != -1) @{
+        if (c == "c")
+            count_only++
+        else if (c == "s")
+            no_print++
+        else if (c == "v")
+            invert++
+        else if (c == "i")
+            IGNORECASE = 1
+        else if (c == "l")
+            filenames_only++
+        else if (c == "e")
+            pattern = Optarg
+        else
+            usage()
+    @}
+@c endfile
+@end example
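+
+The effect of @code{IGNORECASE} is easy to try in isolation.  This
+one-liner is just an illustration and requires @command{gawk}:
+
+@example
+$ echo 'Hello World' | gawk 'BEGIN @{ IGNORECASE = 1 @} /hello/ @{ print "matched" @}'
+@print{} matched
+@end example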
+
+Next comes the code that handles the @command{egrep}-specific behavior. If no
+pattern is supplied with @option{-e}, the first nonoption on the
+command line is used.  The @command{awk} command-line arguments up to @code{ARGV[Optind]}
+are cleared, so that @command{awk} won't try to process them as files.  If no
+files are specified, the standard input is used, and if multiple files are
+specified, we make sure to note this so that the @value{FN}s can precede the
+matched lines in the output:
+
+@example
+@c file eg/prog/egrep.awk
+    if (pattern == "")
+        pattern = ARGV[Optind++]
+
+    for (i = 1; i < Optind; i++)
+        ARGV[i] = ""
+    if (Optind >= ARGC) @{
+        ARGV[1] = "-"
+        ARGC = 2
+    @} else if (ARGC - Optind > 1)
+        do_filenames++
+
+#    if (IGNORECASE)
+#        pattern = tolower(pattern)
+@}
+@c endfile
+@end example
+
+The last two lines are commented out, since they are not needed in
+@command{gawk}.  They should be uncommented if you have to use another version
+of @command{awk}.
+
+The next set of lines should be uncommented if you are not using
+@command{gawk}.  This rule translates all the characters in the input line
+into lowercase if the @option{-i} option is specified.@footnote{It
+also introduces a subtle bug;
+if a match happens, we output the translated line, not the original.}
+The rule is
+commented out since it is not necessary with @command{gawk}:
+
+@c Exercise: Fix this, w/array and new line as key to original line
+
+@example
+@c file eg/prog/egrep.awk
+#@{
+#    if (IGNORECASE)
+#        $0 = tolower($0)
+#@}
+@c endfile
+@end example
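+
+One possible way to avoid the bug mentioned in the footnote, offered only
+as a sketch and not as part of @file{egrep.awk}, is to match against a
+lowercased @emph{copy} of the record and leave @code{$0} alone.  The
+pattern is assumed to be in lowercase already:
+
+@example
+$ echo 'Hello World' | awk 'tolower($0) ~ /hello/'
+@print{} Hello World
+@end example
+
+@noindent
+The line is printed with its original capitalization, because only the
+value returned by @code{tolower} was used in the comparison.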
+
+The @code{beginfile} function is called by the rule in @file{ftrans.awk}
+when each new file is processed.  In this case, it is very simple; all it
+does is initialize a variable @code{fcount} to zero. @code{fcount} tracks
+how many lines in the current file matched the pattern
+(naming the parameter @code{junk} shows we know that @code{beginfile}
+is called with a parameter, but that we're not interested in its value):
+
+@example
+@c file eg/prog/egrep.awk
+function beginfile(junk)
+@{
+    fcount = 0
+@}
+@c endfile
+@end example
+
+The @code{endfile} function is called after each file has been processed.
+It affects the output only when the user wants a count of the number of lines that
+matched.  @code{no_print} is true only if the exit status is desired.
+@code{count_only} is true if line counts are desired.  @command{egrep}
+therefore only prints line counts if printing and counting are enabled.
+The output format must be adjusted depending upon the number of files to
+process.  Finally, @code{fcount} is added to @code{total}, so that we
+know the total number of lines that matched the pattern:
+
+@example
+@c file eg/prog/egrep.awk
+function endfile(file)
+@{
+    if (! no_print && count_only)
+        if (do_filenames)
+            print file ":" fcount
+        else
+            print fcount
+
+    total += fcount
+@}
+@c endfile
+@end example
+
+The following rule does most of the work of matching lines. The variable
+@code{matches} is true if the line matched the pattern. If the user
+wants lines that did not match, the sense of @code{matches} is inverted
+using the @samp{!} operator. @code{fcount} is incremented with the value of
+@code{matches}, which is either one or zero, depending upon a
+successful or unsuccessful match.  If the line does not match, the
+@code{next} statement just moves on to the next record.
+
+@cindex @code{!} (exclamation point), @code{!} operator
+@cindex exclamation point (@code{!}), @code{!} operator
+A number of additional tests are made, but they are only done if we
+are not counting lines.  First, if the user only wants exit status
+(@code{no_print} is true), then it is enough to know that @emph{one}
+line in this file matched, and we can skip on to the next file with
+@code{nextfile}.  Similarly, if we are only printing @value{FN}s, we can
+print the @value{FN}, and then skip to the next file with @code{nextfile}.
+Finally, each line is printed, with a leading @value{FN} and colon
+if necessary:
+
+@cindex @code{!} operator
+@example
+@c file eg/prog/egrep.awk
+@{
+    matches = ($0 ~ pattern)
+    if (invert)
+        matches = ! matches
+
+    fcount += matches    # 1 or 0
+
+    if (! matches)
+        next
+
+    if (! count_only) @{
+        if (no_print)
+            nextfile
+
+        if (filenames_only) @{
+            print FILENAME
+            nextfile
+        @}
+
+        if (do_filenames)
+            print FILENAME ":" $0
+        else
+            print
+    @}
+@}
+@c endfile
+@end example
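+
+The rule depends on a comparison expression having the value one or zero.
+A quick check from the shell (an illustration only) shows both the match
+result and its negation:
+
+@example
+$ echo foo | awk '@{ m = ($0 ~ /fo+/); print m, ! m @}'
+@print{} 1 0
+@end example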
+
+The @code{END} rule takes care of producing the correct exit status. If
+there are no matches, the exit status is one; otherwise it is zero:
+
+@example
+@c file eg/prog/egrep.awk
+END    \
+@{
+    if (total == 0)
+        exit 1
+    exit 0
+@}
+@c endfile
+@end example
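+
+The exit status can be inspected from the shell with @samp{$?}.  The
+following cut-down stand-in for the program (not the real
+@file{egrep.awk}) finds no match, so it exits with a status of one:
+
+@example
+$ echo foo | awk '/bar/ @{ total++ @} END @{ if (total == 0) exit 1; exit 0 @}'
+$ echo $?
+@print{} 1
+@end example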
+
+The @code{usage} function prints a usage message in case of invalid options,
+and then exits:
+
+@example
+@c file eg/prog/egrep.awk
+function usage(    e)
+@{
+    e = "Usage: egrep [-csvil] [-e pat] [files ...]"
+    e = e "\n\tegrep [-csvil] pat [files ...]"
+    print e > "/dev/stderr"
+    exit 1
+@}
+@c endfile
+@end example
+
+The variable @code{e} is used so that the function fits nicely
+on the printed page.
+
+@cindex @code{END} pattern, backslash continuation and
+@cindex @code{\} (backslash), continuing lines and
+@cindex backslash (@code{\}), continuing lines and
+Just a note on programming style: you may have noticed that the @code{END}
+rule uses backslash continuation, with the open brace on a line by
+itself.  This is so that it more closely resembles the way functions
+are written.  Many of the examples
+in this @value{CHAPTER}
+use this style. You can decide for yourself if you like writing
+your @code{BEGIN} and @code{END} rules this way
+or not.
+@c ENDOFRANGE regexps
+@c ENDOFRANGE sfregexp
+@c ENDOFRANGE fsregexp
+
+@node Id Program
+@subsection Printing out User Information
+
+@cindex printing, user information
+@cindex users, information about, printing
+@cindex @command{id} utility
+The @command{id} utility lists a user's real and effective user ID numbers,
+real and effective group ID numbers, and the user's group set, if any.
+@command{id} only prints the effective user ID and group ID if they are
+different from the real ones.  If possible, @command{id} also supplies the
+corresponding user and group names.  The output might look like this:
+
+@example
+$ id
+@print{} uid=2076(arnold) gid=10(staff) groups=10(staff),4(tty)
+@end example
+
+This information is part of what is provided by @command{gawk}'s
+@code{PROCINFO} array (@pxref{Built-in Variables}).
+However, the @command{id} utility provides a more palatable output than just
+individual numbers.
+
+Here is a simple version of @command{id} written in @command{awk}.
+It uses the user database library functions
+(@pxref{Passwd Functions})
+and the group database library functions
+(@pxref{Group Functions}).
+
+The program is fairly straightforward.  All the work is done in the
+@code{BEGIN} rule.  The user and group ID numbers are obtained from
+@code{PROCINFO}.
+The code is repetitive.  The entry in the user database for the real user ID
+number is split into parts at the @samp{:}. The name is the first field.
+Similar code is used for the effective user ID number and the group
+numbers:
+
+@cindex @code{id.awk} program
+@example
+@c file eg/prog/id.awk
+# id.awk --- implement id in awk
+#
+# Requires user and group library functions
+@c endfile
+@ignore
+@c file eg/prog/id.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+# Revised February 1996
+
+@c endfile
+@end ignore
+@c file eg/prog/id.awk
+# output is:
+# uid=12(foo) euid=34(bar) gid=3(baz) \
+#             egid=5(blat) groups=9(nine),2(two),1(one)
+
+@group
+BEGIN    \
+@{
+    uid = PROCINFO["uid"]
+    euid = PROCINFO["euid"]
+    gid = PROCINFO["gid"]
+    egid = PROCINFO["egid"]
+@end group
+
+    printf("uid=%d", uid)
+    pw = getpwuid(uid)
+    if (pw != "") @{
+        split(pw, a, ":")
+        printf("(%s)", a[1])
+    @}
+
+    if (euid != uid) @{
+        printf(" euid=%d", euid)
+        pw = getpwuid(euid)
+        if (pw != "") @{
+            split(pw, a, ":")
+            printf("(%s)", a[1])
+        @}
+    @}
+
+    printf(" gid=%d", gid)
+    pw = getgrgid(gid)
+    if (pw != "") @{
+        split(pw, a, ":")
+        printf("(%s)", a[1])
+    @}
+
+    if (egid != gid) @{
+        printf(" egid=%d", egid)
+        pw = getgrgid(egid)
+        if (pw != "") @{
+            split(pw, a, ":")
+            printf("(%s)", a[1])
+        @}
+    @}
+
+    for (i = 1; ("group" i) in PROCINFO; i++) @{
+        if (i == 1)
+            printf(" groups=")
+        group = PROCINFO["group" i]
+        printf("%d", group)
+        pw = getgrgid(group)
+        if (pw != "") @{
+            split(pw, a, ":")
+            printf("(%s)", a[1])
+        @}
+        if (("group" (i+1)) in PROCINFO)
+            printf(",")
+    @}
+
+    print ""
+@}
+@c endfile
+@end example
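+
+The repeated split-and-print step can be tried on its own.  The record
+below is invented for the illustration; in the real program it comes from
+@code{getpwuid}:
+
+@example
+$ awk 'BEGIN @{
+>     pw = "arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh"
+>     split(pw, a, ":")
+>     printf("uid=%d(%s)\n", a[3], a[1])
+> @}'
+@print{} uid=2076(arnold)
+@end example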
+
+@cindex @code{in} operator
+The test in the @code{for} loop is worth noting.
+Any supplementary groups in the @code{PROCINFO} array have the
+indices @code{"group1"} through @code{"group@var{N}"} for some
+@var{N}, i.e., the total number of supplementary groups.
+However, we don't know in advance how many of these groups
+there are.
+
+This loop works by starting at one, concatenating the value with
+@code{"group"}, and then using @code{in} to see if that value is
+in the array.  Eventually, @code{i} is incremented past
+the last group in the array and the loop exits.
+
+The loop is also correct if there are @emph{no} supplementary
+groups; then the condition is false the first time it's
+tested, and the loop body never executes.
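+
+The same loop can be tried with any @command{awk} by substituting an
+ordinary array for @code{PROCINFO}.  The array name @code{info} and its
+contents are made up for this sketch:
+
+@example
+$ awk 'BEGIN @{
+>     info["group1"] = 10; info["group2"] = 4
+>     for (i = 1; ("group" i) in info; i++)
+>         printf("%s%d", (i > 1 ? "," : ""), info["group" i])
+>     print ""
+> @}'
+@print{} 10,4
+@end example
+
+@noindent
+Testing with @code{in} has the additional benefit of not creating the
+element @code{info["group3"]} as a side effect.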
+
+@c exercise!!!
+@ignore
+The POSIX version of @command{id} takes arguments that control which
+information is printed.  Modify this version to accept the same
+arguments and perform in the same way.
+@end ignore
+
+@node Split Program
+@subsection Splitting a Large File into Pieces
+
+@c STARTOFRANGE filspl
+@cindex files, splitting
+@cindex @code{split} utility
+The @code{split} program splits large text files into smaller pieces.
+Usage is as follows:
+
+@example
+split @r{[}-@var{count}@r{]} file @r{[} @var{prefix} @r{]}
+@end example
+
+By default,
+the output files are named @file{xaa}, @file{xab}, and so on. Each file has
+1000 lines in it, with the likely exception of the last file. To change the
+number of lines in each file, supply a number on the command line
+preceded with a minus; e.g., @samp{-500} for files with 500 lines in them
+instead of 1000.  To change the name of the output files to something like
+@file{myfileaa}, @file{myfileab}, and so on, supply an additional
+argument that specifies the @value{FN} prefix.
+
+Here is a version of @code{split} in @command{awk}. It uses the @code{ord} and
+@code{chr} functions presented in
+@ref{Ordinal Functions}.
+
+The program first sets its defaults, and then tests to make sure there are
+not too many arguments.  It then looks at each argument in turn.  The
+first argument could be a minus sign followed by a number. If it is, this happens
+to look like a negative number, so it is made positive, and that is the
+count of lines.  The data @value{FN} is skipped over and the final argument
+is used as the prefix for the output @value{FN}s:
+
+@cindex @code{split.awk} program
+@example
+@c file eg/prog/split.awk
+# split.awk --- do split in awk
+#
+# Requires ord and chr library functions
+@c endfile
+@ignore
+@c file eg/prog/split.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+
+@c endfile
+@end ignore
+@c file eg/prog/split.awk
+# usage: split [-num] [file] [outname]
+
+BEGIN @{
+    outfile = "x"    # default
+    count = 1000
+    if (ARGC > 4)
+        usage()
+
+    i = 1
+    if (ARGV[i] ~ /^-[0-9]+$/) @{
+        count = -ARGV[i]
+        ARGV[i] = ""
+        i++
+    @}
+    # test argv in case reading from stdin instead of file
+    if (i in ARGV)
+        i++    # skip data file name
+    if (i in ARGV) @{
+        outfile = ARGV[i]
+        ARGV[i] = ""
+    @}
+
+    s1 = s2 = "a"
+    out = (outfile s1 s2)
+@}
+@c endfile
+@end example
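+
+The conversion of the count argument relies on @command{awk} treating a
+string that looks like a number as a number when it is negated.  A small
+check, using a made-up argument value:
+
+@example
+$ awk 'BEGIN @{ arg = "-500"; if (arg ~ /^-[0-9]+$/) print -arg @}'
+@print{} 500
+@end example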
+
+The next rule does most of the work. @code{tcount} (temporary count) tracks
+how many lines have been printed to the output file so far. If it is greater
+than @code{count}, it is time to close the current file and start a new one.
+@code{s1} and @code{s2} track the current suffixes for the @value{FN}.
+Normally, @code{s2} simply moves to the next letter in the alphabet.  If
+@code{s2} is already @samp{z}, @code{s1} moves to the next letter and
+@code{s2} starts over again at @samp{a}.  If they are both @samp{z}, the
+file is just too big:
+
+@c else on separate line here for page breaking
+@example
+@c file eg/prog/split.awk
+@{
+    if (++tcount > count) @{
+        close(out)
+        if (s2 == "z") @{
+            if (s1 == "z") @{
+                printf("split: %s is too large to split\n",
+                       FILENAME) > "/dev/stderr"
+                exit 1
+            @}
+            s1 = chr(ord(s1) + 1)
+            s2 = "a"
+        @}
+@group
+        else
+            s2 = chr(ord(s2) + 1)
+@end group
+        out = (outfile s1 s2)
+        tcount = 1
+    @}
+    print > out
+@}
+@c endfile
+@end example
+
+@c Exercise: do this with just awk builtin functions, index("abc..."), substr, etc.
+
+@noindent
+The @code{usage} function simply prints an error message and exits:
+
+@example
+@c file eg/prog/split.awk
+function usage(   e)
+@{
+    e = "usage: split [-num] [file] [outname]"
+    print e > "/dev/stderr"
+    exit 1
+@}
+@c endfile
+@end example
+
+@noindent
+The variable @code{e} is used so that the function
+fits nicely on the
+@ifinfo
+screen.
+@end ifinfo
+@ifnotinfo
+page.
+@end ifnotinfo
+
+This program is a bit sloppy; it relies on @command{awk} to automatically close the last file
+instead of doing it in an @code{END} rule.
+It also assumes that letters are contiguous in the character set,
+which isn't true for EBCDIC systems.
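+
+One way to remove the dependence on the character set, shown here only as
+a sketch, is to look the letter up in a string instead of doing arithmetic
+on character codes.  The function name @code{next_letter} is hypothetical;
+it is not part of @file{split.awk}:
+
+@example
+# next_letter: return the letter after c, or "" if c is "z"
+function next_letter(c,    letters)
+@{
+    letters = "abcdefghijklmnopqrstuvwxyz"
+    return substr(letters, index(letters, c) + 1, 1)
+@}
+@end example
+
+@noindent
+With such a function, @samp{s2 = next_letter(s2)} could take the place of
+@samp{s2 = chr(ord(s2) + 1)}, and the @code{ord} and @code{chr} library
+functions would no longer be needed.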
+@c BFD...
+@c ENDOFRANGE filspl
+
+@node Tee Program
+@subsection Duplicating Output into Multiple Files
+
+@c last comma is part of secondary
+@cindex files, multiple, duplicating output into
+@cindex output, duplicating into files
+@cindex @code{tee} utility
+The @code{tee} program is known as a ``pipe fitting.''  @code{tee} copies
+its standard input to its standard output and also duplicates it to the
+files named on the command line.  Its usage is as follows:
+
+@example
+tee @r{[}-a@r{]} file @dots{}
+@end example
+
+The @option{-a} option tells @code{tee} to append to the named files, instead of
+truncating them and starting over.
+
+The @code{BEGIN} rule first makes a copy of all the command-line arguments
+into an array named @code{copy}.
+@code{ARGV[0]} is not copied, since it is not needed.
+@code{tee} cannot use @code{ARGV} directly, since @command{awk} attempts to
+process each @value{FN} in @code{ARGV} as input data.
+
+@cindex flag variables
+If the first argument is @option{-a}, then the flag variable
+@code{append} is set to true, and both @code{ARGV[1]} and
+@code{copy[1]} are deleted. If @code{ARGC} is less than two, then no
+@value{FN}s were supplied and @code{tee} prints a usage message and exits.
+Finally, @command{awk} is forced to read the standard input by setting
+@code{ARGV[1]} to @code{"-"} and @code{ARGC} to two:
+
+@c NEXT ED: Add more leading commentary in this program
+@cindex @code{tee.awk} program
+@example
+@c file eg/prog/tee.awk
+# tee.awk --- tee in awk
+@c endfile
+@ignore
+@c file eg/prog/tee.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+# Revised December 1995
+
+@c endfile
+@end ignore
+@c file eg/prog/tee.awk
+BEGIN    \
+@{
+    for (i = 1; i < ARGC; i++)
+        copy[i] = ARGV[i]
+
+    if (ARGV[1] == "-a") @{
+        append = 1
+        delete ARGV[1]
+        delete copy[1]
+        ARGC--
+    @}
+    if (ARGC < 2) @{
+        print "usage: tee [-a] file ..." > "/dev/stderr"
+        exit 1
+    @}
+    ARGV[1] = "-"
+    ARGC = 2
+@}
+@c endfile
+@end example
+
+The single rule does all the work.  Since there is no pattern, it is
+executed for each line of input.  The body of the rule simply prints the
+line into each file on the command line, and then to the standard output:
+
+@example
+@c file eg/prog/tee.awk
+@{
+    # moving the if outside the loop makes it run faster
+    if (append)
+        for (i in copy)
+            print >> copy[i]
+    else
+        for (i in copy)
+            print > copy[i]
+    print
+@}
+@c endfile
+@end example
+
+@noindent
+It is also possible to write the loop this way:
+
+@example
+for (i in copy)
+    if (append)
+        print >> copy[i]
+    else
+        print > copy[i]
+@end example
+
+@noindent
+This is more concise but it is also less efficient.  The @samp{if} is
+tested for each record and for each output file.  By duplicating the loop
+body, the @samp{if} is only tested once for each input record.  If there are
+@var{N} input records and @var{M} output files, the first method only
+executes @var{N} @samp{if} statements, while the second executes
+@var{N}@code{*}@var{M} @samp{if} statements.
+
+Finally, the @code{END} rule cleans up by closing all the output files:
+
+@example
+@c file eg/prog/tee.awk
+END    \
+@{
+    for (i in copy)
+        close(copy[i])
+@}
+@c endfile
+@end example
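+
+One detail of the main rule is worth illustrating.  Within a single
+@command{awk} run, @samp{>} truncates a file only when the file is first
+opened; later @code{print} statements to the same file add to it.  This is
+why the non-append case can use @samp{>} for every input line.  The
+@value{FN} @file{/tmp/tee.demo} is made up for this illustration:
+
+@example
+$ echo old > /tmp/tee.demo
+$ awk 'BEGIN @{ print "one" > "/tmp/tee.demo"; print "two" > "/tmp/tee.demo" @}'
+$ cat /tmp/tee.demo
+@print{} one
+@print{} two
+@end example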
+
+@node Uniq Program
+@subsection Printing Nonduplicated Lines of Text
+
+@c STARTOFRANGE prunt
+@cindex printing, unduplicated lines of text
+@c first comma is part of primary
+@c STARTOFRANGE tpul
+@cindex text, printing, unduplicated lines of
+@cindex @command{uniq} utility
+The @command{uniq} utility reads sorted lines of data on its standard
+input, and by default removes duplicate lines.  In other words, it only
+prints unique lines---hence the name.  @command{uniq} has a number of
+options. The usage is as follows:
+
+@example
+uniq @r{[}-udc @r{[}-@var{n}@r{]]} @r{[}+@var{n}@r{]} @r{[} @var{input file} @r{[} @var{output file} @r{]]}
+@end example
+
+The options for @command{uniq} are:
+
+@table @code
+@item -d
+Print only repeated lines.
+
+@item -u
+Print only nonrepeated lines.
+
+@item -c
+Count lines. This option overrides @option{-d} and @option{-u}.  Both repeated
+and nonrepeated lines are counted.
+
+@item -@var{n}
+Skip @var{n} fields before comparing lines.  The definition of fields
+is similar to @command{awk}'s default: nonwhitespace characters separated
+by runs of spaces and/or tabs.
+
+@item +@var{n}
+Skip @var{n} characters before comparing lines.  Any fields specified with
+@samp{-@var{n}} are skipped first.
+
+@item @var{input file}
+Data is read from the input file named on the command line, instead of from
+the standard input.
+
+@item @var{output file}
+The generated output is sent to the named output file, instead of to the
+standard output.
+@end table
+
+Normally @command{uniq} behaves as if both the @option{-d} and
+@option{-u} options are provided.
+
+@command{uniq} uses the
+@code{getopt} library function
+(@pxref{Getopt Function})
+and the @code{join} library function
+(@pxref{Join Function}).
+
+The program begins with a @code{usage} function and then a brief outline of
+the options and their meanings in a comment.
+The @code{BEGIN} rule deals with the command-line arguments and options. It
+uses a trick to get @code{getopt} to handle options of the form @samp{-25},
+treating such an option as the option letter @samp{2} with an argument of
+@samp{5}. If indeed two or more digits are supplied (@code{Optarg} looks
+like a number), @code{Optarg} is
+concatenated with the option digit and then the result is added to zero to make
+it into a number.  If there is only one digit in the option, then
+@code{Optarg} is not needed. In this case, @code{Optind} must be decremented so that
+@code{getopt} processes it next time.  This code is admittedly a bit
+tricky.
+
+If no options are supplied, then the default is taken, to print both
+repeated and nonrepeated lines.  The output file, if provided, is assigned
+to @code{outputfile}.  Early on, @code{outputfile} is initialized to the
+standard output, @file{/dev/stdout}:
+
+@cindex @code{uniq.awk} program
+@example
+@c file eg/prog/uniq.awk
+@group
+# uniq.awk --- do uniq in awk
+#
+# Requires getopt and join library functions
+@end group
+@c endfile
+@ignore
+@c file eg/prog/uniq.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+
+@c endfile
+@end ignore
+@c file eg/prog/uniq.awk
+function usage(    e)
+@{
+    e = "Usage: uniq [-udc [-n]] [+n] [ in [ out ]]"
+    print e > "/dev/stderr"
+    exit 1
+@}
+
+# -c    count lines. overrides -d and -u
+# -d    only repeated lines
+# -u    only non-repeated lines
+# -n    skip n fields
+# +n    skip n characters, skip fields first
+
+BEGIN   \
+@{
+    count = 1
+    outputfile = "/dev/stdout"
+    opts = "udc0:1:2:3:4:5:6:7:8:9:"
+    while ((c = getopt(ARGC, ARGV, opts)) != -1) @{
+        if (c == "u")
+            non_repeated_only++
+        else if (c == "d")
+            repeated_only++
+        else if (c == "c")
+            do_count++
+        else if (index("0123456789", c) != 0) @{
+            # getopt requires args to options
+            # this messes us up for things like -5
+            if (Optarg ~ /^[0-9]+$/)
+                fcount = (c Optarg) + 0
+            else @{
+                fcount = c + 0
+                Optind--
+            @}
+        @} else
+            usage()
+    @}
+
+    if (ARGV[Optind] ~ /^\+[0-9]+$/) @{
+        charcount = substr(ARGV[Optind], 2) + 0
+        Optind++
+    @}
+
+    for (i = 1; i < Optind; i++)
+        ARGV[i] = ""
+
+    if (repeated_only == 0 && non_repeated_only == 0)
+        repeated_only = non_repeated_only = 1
+
+    if (ARGC - Optind == 2) @{
+        outputfile = ARGV[ARGC - 1]
+        ARGV[ARGC - 1] = ""
+    @}
+@}
+@c endfile
+@end example
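+
+The digit-handling trick comes down to string concatenation followed by a
+conversion to a number.  For an option such as @samp{-25}, @code{getopt}
+returns @code{"2"} and sets @code{Optarg} to @code{"5"}; the values are
+set by hand in this illustration:
+
+@example
+$ awk 'BEGIN @{ c = "2"; Optarg = "5"; print (c Optarg) + 0 @}'
+@print{} 25
+@end example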
+
+The following function, @code{are_equal}, compares the current line,
+@code{$0}, to the
+previous line, @code{last}.  It handles skipping fields and characters.
+If no field count and no character count are specified, @code{are_equal}
+simply returns one or zero depending upon the result of a simple string
+comparison of @code{last} and @code{$0}.  Otherwise, things get more
+complicated.
+If fields have to be skipped, each line is broken into an array using
+@code{split}
+(@pxref{String Functions});
+the desired fields are then joined back into a line using @code{join}.
+The joined lines are stored in @code{clast} and @code{cline}.
+If no fields are skipped, @code{clast} and @code{cline} are set to
+@code{last} and @code{$0}, respectively.
+Finally, if characters are skipped, @code{substr} is used to strip off the
+leading @code{charcount} characters in @code{clast} and @code{cline}.  The
+two strings are then compared and @code{are_equal} returns the result:
+
+@example
+@c file eg/prog/uniq.awk
+function are_equal(    n, m, clast, cline, alast, aline)
+@{
+    if (fcount == 0 && charcount == 0)
+        return (last == $0)
+
+    if (fcount > 0) @{
+        n = split(last, alast)
+        m = split($0, aline)
+        clast = join(alast, fcount+1, n)
+        cline = join(aline, fcount+1, m)
+    @} else @{
+        clast = last
+        cline = $0
+    @}
+    if (charcount) @{
+        clast = substr(clast, charcount + 1)
+        cline = substr(cline, charcount + 1)
+    @}
+
+    return (clast == cline)
+@}
+@c endfile
+@end example
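+
+The character-skipping step can be checked by itself.  In this
+illustration the two strings are made up; they differ only in their first
+two characters, so with a @code{charcount} of two they compare equal:
+
+@example
+$ awk 'BEGIN @{
+>     charcount = 2
+>     print (substr("aXfoo", charcount + 1) == substr("bYfoo", charcount + 1))
+> @}'
+@print{} 1
+@end example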
+
+The following two rules are the body of the program.  The first one is
+executed only for the very first line of data.  It sets @code{last} equal to
+@code{$0}, so that subsequent lines of text have something to be compared to.
+
+The second rule does the work. The variable @code{equal} is one or zero,
+depending upon the results of @code{are_equal}'s comparison. If @command{uniq}
+is counting lines (@option{-c}) and the lines are equal, then it increments the @code{count} variable.
+Otherwise, it prints the count and the previous line and resets @code{count},
+since the two lines are not equal.
+
+If @command{uniq} is not counting, and if the lines are equal, @code{count} is incremented.
+Nothing is printed, since the point is to remove duplicates.
+Otherwise, if @command{uniq} is printing repeated lines and more than
+one copy of the line was seen, or if @command{uniq} is printing nonrepeated lines
+and only one copy was seen, then the previous line is printed, and @code{count}
+is reset.
+
+Finally, similar logic is used in the @code{END} rule to print the final
+line of input data:
+
+@example
+@c file eg/prog/uniq.awk
+NR == 1 @{
+    last = $0
+    next
+@}
+
+@{
+    equal = are_equal()
+
+    if (do_count) @{    # overrides -d and -u
+        if (equal)
+            count++
+        else @{
+            printf("%4d %s\n", count, last) > outputfile
+            last = $0
+            count = 1    # reset
+        @}
+        next
+    @}
+
+    if (equal)
+        count++
+    else @{
+        if ((repeated_only && count > 1) ||
+            (non_repeated_only && count == 1))
+                print last > outputfile
+        last = $0
+        count = 1
+    @}
+@}
+
+END @{
+    if (do_count)
+        printf("%4d %s\n", count, last) > outputfile
+    else if ((repeated_only && count > 1) ||
+            (non_repeated_only && count == 1))
+        print last > outputfile
+@}
+@c endfile
+@end example
+@c ENDOFRANGE prunt
+@c ENDOFRANGE tpul
+
+@node Wc Program
+@subsection Counting Things
+
+@c STARTOFRANGE count
+@cindex counting
+@c STARTOFRANGE infco
+@cindex input files, counting elements in
+@c STARTOFRANGE woco
+@cindex words, counting
+@c STARTOFRANGE chco
+@cindex characters, counting
+@c STARTOFRANGE lico
+@cindex lines, counting
+@cindex @command{wc} utility
+The @command{wc} (word count) utility counts lines, words, and characters in
+one or more input files. Its usage is as follows:
+
+@example
+wc @r{[}-lwc@r{]} @r{[} @var{files} @dots{} @r{]}
+@end example
+
+If no files are specified on the command line, @command{wc} reads its standard
+input. If there are multiple files, it also prints total counts for all
+the files.  The options and their meanings are shown in the following list:
+
+@table @code
+@item -l
+Count only lines.
+
+@item -w
+Count only words.
+A ``word'' is a contiguous sequence of nonwhitespace characters, separated
+by spaces and/or tabs.  Luckily, this is the normal way @command{awk} separates
+fields in its input data.
+
+@item -c
+Count only characters.
+@end table
+
+Implementing @command{wc} in @command{awk} is particularly elegant,
+since @command{awk} does a lot of the work for us; it splits lines into
+words (i.e., fields) and counts them, it counts lines (i.e., records),
+and it can easily tell us how long a line is.
+
+This program uses the @code{getopt} library function
+(@pxref{Getopt Function})
+and the file-transition functions
+(@pxref{Filetrans Function}).
+
+This version has one notable difference from traditional versions of
+@command{wc}: it always prints the counts in the order lines, words,
+and characters.  Traditional versions note the order of the @option{-l},
+@option{-w}, and @option{-c} options on the command line, and print the
+counts in that order.
+
+The @code{BEGIN} rule does the argument processing.  The variable
+@code{print_total} is true if more than one file is named on the
+command line:
+
+@cindex @code{wc.awk} program
+@example
+@c file eg/prog/wc.awk
+# wc.awk --- count lines, words, characters
+@c endfile
+@ignore
+@c file eg/prog/wc.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+@c endfile
+@end ignore
+@c file eg/prog/wc.awk
+
+# Options:
+#    -l    only count lines
+#    -w    only count words
+#    -c    only count characters
+#
+# Default is to count lines, words, characters
+#
+# Requires getopt and file transition library functions
+
+BEGIN @{
+    # let getopt print a message about
+    # invalid options. we ignore them
+    while ((c = getopt(ARGC, ARGV, "lwc")) != -1) @{
+        if (c == "l")
+            do_lines = 1
+        else if (c == "w")
+            do_words = 1
+        else if (c == "c")
+            do_chars = 1
+    @}
+    for (i = 1; i < Optind; i++)
+        ARGV[i] = ""
+
+    # if no options, do all
+    if (! do_lines && ! do_words && ! do_chars)
+        do_lines = do_words = do_chars = 1
+
+    print_total = (ARGC - i > 1)    # more than one file named
+@}
+@c endfile
+@end example
+
+The @code{beginfile} function is simple; it just resets the counts of lines,
+words, and characters to zero, and saves the current @value{FN} in
+@code{fname}:
+
+@c NEXT ED: make it lines = words = chars = 0
+@example
+@c file eg/prog/wc.awk
+function beginfile(file)
+@{
+    chars = lines = words = 0
+    fname = FILENAME
+@}
+@c endfile
+@end example
+
+The @code{endfile} function adds the current file's numbers to the running
+totals of lines, words, and characters.@footnote{@command{wc} can't just use the value of
+@code{FNR} in @code{endfile}. If you examine
+the code in
+@ref{Filetrans Function}
+you will see that
+@code{FNR} has already been reset by the time
+@code{endfile} is called.}  It then prints out those numbers
+for the file that was just read. It relies on @code{beginfile} to reset the
+numbers for the following @value{DF}:
+@c ONE DAY: make the above footnote an exercise, instead of giving away the answer.
+
+@c NEXT ED: make order for += be lines, words, chars
+@example
+@c file eg/prog/wc.awk
+function endfile(file)
+@{
+    tchars += chars
+    tlines += lines
+    twords += words
+    if (do_lines)
+        printf "\t%d", lines
+@group
+    if (do_words)
+        printf "\t%d", words
+@end group
+    if (do_chars)
+        printf "\t%d", chars
+    printf "\t%s\n", fname
+@}
+@c endfile
+@end example
+
+There is one rule that is executed for each line. It adds the length of
+the record, plus one, to @code{chars}.  The extra one is needed because
+the newline character separating records (the value
+of @code{RS}) is not part of the record itself, and thus not included
+in its length.  Next, @code{lines} is incremented for each line read,
+and @code{words} is incremented by the value of @code{NF}, which is the
+number of ``words'' on this line:
+
+@example
+@c file eg/prog/wc.awk
+# do per line
+@{
+    chars += length($0) + 1    # get newline
+    lines++
+    words += NF
+@}
+@c endfile
+@end example
+
+Finally, the @code{END} rule simply prints the totals for all the files:
+
+@example
+@c file eg/prog/wc.awk
+END @{
+    if (print_total) @{
+        if (do_lines)
+            printf "\t%d", tlines
+        if (do_words)
+            printf "\t%d", twords
+        if (do_chars)
+            printf "\t%d", tchars
+        print "\ttotal"
+    @}
+@}
+@c endfile
+@end example
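+
+As a quick check of the per-line accounting, the heart of the program can
+be tried as a one-liner on two short lines of input.  This is only a
+sketch of the counting rule, not a replacement for @file{wc.awk}; it
+skips option handling and per-file totals:
+
+@example
+$ printf 'one two\nthree\n' |
+> awk '@{ chars += length($0) + 1; lines++; words += NF @}
+>      END @{ printf "\t%d\t%d\t%d\n", lines, words, chars @}'
+@print{}         2       3       14
+@end example
+
+@noindent
+The first line contributes 7 characters plus the newline, and the second
+contributes 5 plus the newline, for a total of 14.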
+@c ENDOFRANGE count
+@c ENDOFRANGE infco
+@c ENDOFRANGE lico
+@c ENDOFRANGE woco
+@c ENDOFRANGE chco
+@c ENDOFRANGE posimawk
+
+@node Miscellaneous Programs
+@section A Grab Bag of @command{awk} Programs
+
+This @value{SECTION} is a large ``grab bag'' of miscellaneous programs.
+We hope you find them both interesting and enjoyable.
+
+@menu
+* Dupword Program::             Finding duplicated words in a document.
+* Alarm Program::               An alarm clock.
+* Translate Program::           A program similar to the @command{tr} utility.
+* Labels Program::              Printing mailing labels.
+* Word Sorting::                A program to produce a word usage count.
+* History Sorting::             Eliminating duplicate entries from a history
+                                file.
+* Extract Program::             Pulling out programs from Texinfo source
+                                files.
+* Simple Sed::                  A Simple Stream Editor.
+* Igawk Program::               A wrapper for @command{awk} that includes
+                                files.
+@end menu
+
+@node Dupword Program
+@subsection Finding Duplicated Words in a Document
+
+@c last comma is part of secondary
+@cindex words, duplicate, searching for
+@cindex searching, for words
+@c first comma is part of primary
+@cindex documents, searching
+A common error when writing large amounts of prose is to accidentally
+duplicate words.  Typically you will see this in text as something like ``the
+the program does the following@dots{}''  When the text is online, often
+the duplicated words occur at the end of one line and the beginning of
+another, making them very difficult to spot.
+@c as here!
+
+This program, @file{dupword.awk}, scans through a file one line at a time
+and looks for adjacent occurrences of the same word.  It also saves the last
+word on a line (in the variable @code{prev}) for comparison with the first
+word on the next line.
+
+@cindex Texinfo
+The first statement makes sure that the line is all lowercase,
+so that, for example, ``The'' and ``the'' compare equal to each other.
+The next statement replaces nonalphanumeric and nonwhitespace characters
+with spaces, so that punctuation does not affect the comparison either.
+The characters are replaced with spaces so that formatting controls
+don't create nonsense words (e.g., the Texinfo @samp{@@code@{NF@}}
+becomes @samp{codeNF} if punctuation is simply deleted).  The record is
+then resplit into fields, yielding just the actual words on the line,
+and ensuring that there are no empty fields.
+
+If there are no fields left after removing all the punctuation, the
+current record is skipped.  Otherwise, the program loops through each
+word, comparing it to the previous one:
+
+@cindex @code{dupword.awk} program
+@example
+@c file eg/prog/dupword.awk
+# dupword.awk --- find duplicate words in text
+@c endfile
+@ignore
+@c file eg/prog/dupword.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# December 1991
+# Revised October 2000
+
+@c endfile
+@end ignore
+@c file eg/prog/dupword.awk
+@{
+    $0 = tolower($0)
+    gsub(/[^[:alnum:][:blank:]]/, " ");
+    $0 = $0         # re-split
+    if (NF == 0)
+        next
+    if ($1 == prev)
+        printf("%s:%d: duplicate %s\n",
+            FILENAME, FNR, $1)
+    for (i = 2; i <= NF; i++)
+        if ($i == $(i-1))
+            printf("%s:%d: duplicate %s\n",
+                FILENAME, FNR, $i)
+    prev = $NF
+@}
+@c endfile
+@end example
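+
+For illustration, suppose a hypothetical file named @file{sample.txt}
+contains two lines, with one word repeated across the line break and
+another repeated within a line.  A run might look like this:
+
+@example
+$ cat sample.txt
+@print{} This is the
+@print{} the test test.
+$ awk -f dupword.awk sample.txt
+@print{} sample.txt:2: duplicate the
+@print{} sample.txt:2: duplicate test
+@end example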
+
+@node Alarm Program
+@subsection An Alarm Clock Program
+@cindex insomnia, cure for
+@cindex Robbins, Arnold
+@quotation
+@i{Nothing cures insomnia like a ringing alarm clock.}@*
+Arnold Robbins
+@end quotation
+
+@c STARTOFRANGE tialarm
+@cindex time, alarm clock example program
+@c STARTOFRANGE alaex
+@cindex alarm clock example program
+The following program is a simple ``alarm clock'' program.
+You give it a time of day and an optional message.  At the specified time,
+it prints the message on the standard output. In addition, you can give it
+the number of times to repeat the message as well as a delay between
+repetitions.
+
+This program uses the @code{gettimeofday} function from
+@ref{Gettimeofday Function}.
+
+All the work is done in the @code{BEGIN} rule.  The first part is argument
+checking and setting of defaults: the delay, the count, and the message to
+print.  If the user supplied a message without the ASCII BEL
+character (known as the ``alert'' character, @code{"\a"}), then it is added to
+the message.  (On many systems, printing the ASCII BEL generates an
+audible alert. Thus when the alarm goes off, the system calls attention
+to itself in case the user is not looking at the computer or terminal.)
+Here is the program:
+
+@cindex @code{alarm.awk} program
+@example
+@c file eg/prog/alarm.awk
+# alarm.awk --- set an alarm
+#
+# Requires gettimeofday library function
+@c endfile
+@ignore
+@c file eg/prog/alarm.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+
+@c endfile
+@end ignore
+@c file eg/prog/alarm.awk
+# usage: alarm time [ "message" [ count [ delay ] ] ]
+
+BEGIN    \
+@{
+    # Initial argument sanity checking
+    usage1 = "usage: alarm time ['message' [count [delay]]]"
+    usage2 = sprintf("\t(%s) time ::= hh:mm", ARGV[1])
+
+    if (ARGC < 2 || ARGV[1] !~ /[0-9]?[0-9]:[0-9][0-9]/) @{
+        # missing or malformed time, whatever else was supplied
+        print usage1 > "/dev/stderr"
+        print usage2 > "/dev/stderr"
+        exit 1
+    @} else if (ARGC == 5) @{
+        delay = ARGV[4] + 0
+        count = ARGV[3] + 0
+        message = ARGV[2]
+    @} else if (ARGC == 4) @{
+        count = ARGV[3] + 0
+        message = ARGV[2]
+    @} else if (ARGC == 3) @{
+        message = ARGV[2]
+    @}
+
+    # set defaults for once we reach the desired time
+    if (delay == 0)
+        delay = 180    # 3 minutes
+@group
+    if (count == 0)
+        count = 5
+@end group
+    if (message == "")
+        message = sprintf("\aIt is now %s!\a", ARGV[1])
+    else if (index(message, "\a") == 0)
+        message = "\a" message "\a"
+@c endfile
+@end example
+
+The next section of code turns the alarm time into hours and minutes,
+converts it (if necessary) to a 24-hour clock, and then turns that
+time into a count of the seconds since midnight.  Next it turns the current
+time into a count of seconds since midnight.  The difference between the two
+is how long to wait before setting off the alarm:
+
+@example
+@c file eg/prog/alarm.awk
+    # split up alarm time
+    split(ARGV[1], atime, ":")
+    hour = atime[1] + 0    # force numeric
+    minute = atime[2] + 0  # force numeric
+
+    # get current broken down time
+    gettimeofday(now)
+
+    # if time given is 12-hour hours and it's after that
+    # hour, e.g., `alarm 5:30' at 9 a.m. means 5:30 p.m.,
+    # then add 12 to real hour
+    if (hour < 12 && now["hour"] > hour)
+        hour += 12
+
+    # set target time in seconds since midnight
+    target = (hour * 60 * 60) + (minute * 60)
+
+    # get current time in seconds since midnight
+    current = (now["hour"] * 60 * 60) + \
+               (now["minute"] * 60) + now["second"]
+
+    # how long to sleep for
+    naptime = target - current
+    if (naptime <= 0) @{
+        print "time is in the past!" > "/dev/stderr"
+        exit 1
+    @}
+@c endfile
+@end example
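+
+To make the arithmetic concrete, here is a small standalone sketch of the
+same calculation.  The clock values are made up for the example (the real
+program gets them from @code{gettimeofday}): the alarm is set for
+@samp{5:30} and the current time is taken to be 9:15:20 a.m.:
+
+@example
+BEGIN @{
+    hour = 5; minute = 30                # as if from `alarm 5:30'
+    # made-up current time, standing in for gettimeofday(now)
+    now["hour"] = 9; now["minute"] = 15; now["second"] = 20
+
+    if (hour < 12 && now["hour"] > hour)
+        hour += 12                       # 5:30 a.m. has passed; use 17:30
+
+    target = (hour * 60 * 60) + (minute * 60)
+    current = (now["hour"] * 60 * 60) + \
+               (now["minute"] * 60) + now["second"]
+    print target, current, target - current
+@}
+@end example
+
+@noindent
+This prints @samp{63000 33320 29680}.  In other words, with these values
+the program would sleep for 29,680 seconds (8 hours, 14 minutes, and
+40 seconds) before printing the message.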
+
+@cindex @command{sleep} utility
+Finally, the program uses the @code{system} function
+(@pxref{I/O Functions})
+to call the @command{sleep} utility.  The @command{sleep} utility simply pauses
+for the given number of seconds.  If the exit status is not zero,
+the program assumes that @command{sleep} was interrupted and exits. If
+@command{sleep} exited with an OK status (zero), then the program prints the
+message in a loop, again using @command{sleep} to delay for however many
+seconds are necessary:
+
+@example
+@c file eg/prog/alarm.awk
+    # zzzzzz..... go away if interrupted
+    if (system(sprintf("sleep %d", naptime)) != 0)
+        exit 1
+
+    # time to notify!
+    command = sprintf("sleep %d", delay)
+    for (i = 1; i <= count; i++) @{
+        print message
+        # if sleep command interrupted, go away
+        if (system(command) != 0)
+            break
+    @}
+
+    exit 0
+@}
+@c endfile
+@end example
+@c ENDOFRANGE tialarm
+@c ENDOFRANGE alaex
+
+@node Translate Program
+@subsection Transliterating Characters
+
+@c STARTOFRANGE chtra
+@cindex characters, transliterating
+@cindex @command{tr} utility
+The system @command{tr} utility transliterates characters.  For example, it is
+often used to map uppercase letters into lowercase for further processing:
+
+@example
+@var{generate data} | tr 'A-Z' 'a-z' | @var{process data} @dots{}
+@end example
+
+@command{tr} requires two lists of characters.@footnote{On some older
+System V systems,
+@ifset ORA
+including Solaris,
+@end ifset
+@command{tr} may require that the lists be written as
+range expressions enclosed in square brackets (@samp{[a-z]}) and quoted,
+to prevent the shell from attempting a @value{FN} expansion.  This is
+not a feature.}  When processing the input, the first character in the
+first list is replaced with the first character in the second list,
+the second character in the first list is replaced with the second
+character in the second list, and so on.  If there are more characters
+in the ``from'' list than in the ``to'' list, the last character of the
+``to'' list is used for the remaining characters in the ``from'' list.
+
+Some time ago,
+@c early or mid-1989!
+a user proposed that a transliteration function should
+be added to @command{gawk}.
+@c Wishing to avoid gratuitous new features,
+@c at least theoretically
+The following program was written to
+prove that character transliteration could be done with a user-level
+function.  This program is not as complete as the system @command{tr} utility
+but it does most of the job.
+
+The @command{translate} program demonstrates one of the few weaknesses
+of standard @command{awk}: dealing with individual characters is very
+painful, requiring repeated use of the @code{substr}, @code{index},
+and @code{gsub} built-in functions
+(@pxref{String Functions}).@footnote{This
+program was written before @command{gawk} acquired the ability to
+split each character in a string into separate array elements.}
+@c Exercise: How might you use this new feature to simplify the program?
+There are two functions.  The first, @code{stranslate}, takes three
+arguments:
+
+@table @code
+@item from
+A list of characters from which to translate.
+
+@item to
+A list of characters to which to translate.
+
+@item target
+The string on which to do the translation.
+@end table
+
+Associative arrays make the translation part fairly easy. @code{t_ar} holds
+the ``to'' characters, indexed by the ``from'' characters.  Then a simple
+loop goes through @code{from}, one character at a time.  For each character
+in @code{from}, if the character appears in @code{target}, @code{gsub}
+is used to change it to the corresponding @code{to} character.
+
+The @code{translate} function simply calls @code{stranslate} using @code{$0}
+as the target.  The main program sets two global variables, @code{FROM} and
+@code{TO}, from the command line, and then changes @code{ARGV} so that
+@command{awk} reads from the standard input.
+
+Finally, the processing rule simply calls @code{translate} for each record:
+
+@cindex @code{translate.awk} program
+@example
+@c file eg/prog/translate.awk
+# translate.awk --- do tr-like stuff
+@c endfile
+@ignore
+@c file eg/prog/translate.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# August 1989
+
+@c endfile
+@end ignore
+@c file eg/prog/translate.awk
+# Bugs: does not handle things like: tr A-Z a-z, it has
+# to be spelled out. However, if `to' is shorter than `from',
+# the last character in `to' is used for the rest of `from'.
+
+function stranslate(from, to, target,     lf, lt, t_ar, i, c)
+@{
+    lf = length(from)
+    lt = length(to)
+    for (i = 1; i <= lt; i++)
+        t_ar[substr(from, i, 1)] = substr(to, i, 1)
+    if (lt < lf)
+        for (; i <= lf; i++)
+            t_ar[substr(from, i, 1)] = substr(to, lt, 1)
+    for (i = 1; i <= lf; i++) @{
+        c = substr(from, i, 1)
+        if (index(target, c) > 0)
+            gsub(c, t_ar[c], target)
+    @}
+    return target
+@}
+
+function translate(from, to)
+@{
+    return $0 = stranslate(from, to, $0)
+@}
+
+# main program
+BEGIN @{
+@group
+    if (ARGC < 3) @{
+        print "usage: translate from to" > "/dev/stderr"
+        exit
+    @}
+@end group
+    FROM = ARGV[1]
+    TO = ARGV[2]
+    ARGC = 2
+    ARGV[1] = "-"
+@}
+
+@{
+    translate(FROM, TO)
+    print
+@}
+@c endfile
+@end example
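+
+As a small usage sketch (the input text is arbitrary), the following
+command maps @samp{l} to @samp{0} and @samp{o} to @samp{1}:
+
+@example
+$ echo 'hello world' | awk -f translate.awk lo 01
+@print{} he001 w1r0d
+@end example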
+
+While it is possible to do character transliteration in a user-level
+function, it is not necessarily efficient, and we (the @command{gawk}
+authors) started to consider adding a built-in function.  However,
+shortly after writing this program, we learned that the System V Release 4
+@command{awk} had added the @code{toupper} and @code{tolower} functions
+(@pxref{String Functions}).
+These functions handle the vast majority of the
+cases where character transliteration is necessary, and so we chose to
+simply add those functions to @command{gawk} as well and then leave well
+enough alone.
+
+An obvious improvement to this program would be to set up the
+@code{t_ar} array only once, in a @code{BEGIN} rule. However, this
+assumes that the ``from'' and ``to'' lists
+will never change throughout the lifetime of the program.
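+
+Because @code{stranslate} hands each ``from'' character to @code{gsub}
+as a regexp and rewrites @code{target} once per character, regexp
+metacharacters such as @samp{.} and lists that swap two characters
+(@samp{ab} to @samp{ba}) do not behave as they do with @command{tr}.
+The following is a sketch, not part of @file{translate.awk}, of one way
+around this: it builds the result by walking @code{target} one character
+at a time.  The name @code{stranslate2} is made up for this example:
+
+@example
+function stranslate2(from, to, target,    lf, lt, n, t_ar, i, c, result)
+@{
+    lf = length(from)
+    lt = length(to)
+    # map each `from' character; reuse the last `to' character
+    # when `to' is shorter than `from'
+    for (i = 1; i <= lf; i++)
+        t_ar[substr(from, i, 1)] = substr(to, (i <= lt ? i : lt), 1)
+
+    # copy target, replacing each character at most once
+    n = length(target)
+    result = ""
+    for (i = 1; i <= n; i++) @{
+        c = substr(target, i, 1)
+        result = result ((c in t_ar) ? t_ar[c] : c)
+    @}
+    return result
+@}
+@end example
+
+@noindent
+With this version, @code{stranslate2("ab", "ba", "abba")} returns
+@samp{baab}, whereas the @code{gsub}-based loop turns every character
+into @samp{a}.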
+@c ENDOFRANGE chtra
+
+@node Labels Program
+@subsection Printing Mailing Labels
+
+@c STARTOFRANGE prml
+@cindex printing, mailing labels
+@c comma is part of primary
+@c STARTOFRANGE mlprint
+@cindex mailing labels, printing
+Here is a ``real world''@footnote{``Real world'' is defined as
+``a program actually used to get something done.''}
+program.  This
+script reads lists of names and
+addresses and generates mailing labels.  Each page of labels has 20 labels
+on it, 2 across and 10 down.  The addresses are guaranteed to be no more
+than 5 lines of data.  Each address is separated from the next by a blank
+line.
+
+The basic idea is to read 20 labels worth of data.  Each line of each label
+is stored in the @code{line} array.  The single rule takes care of filling
+the @code{line} array and printing the page when 20 labels have been read.
+
+The @code{BEGIN} rule simply sets @code{RS} to the empty string, so that
+@command{awk} splits records at blank lines
+(@pxref{Records}).
+It sets @code{MAXLINES} to 100, since 100 is the maximum number
+of lines on the page (20 * 5 = 100).
+
+Most of the work is done in the @code{printpage} function.
+The label lines are stored sequentially in the @code{line} array.  But they
+have to print horizontally; @code{line[1]} next to @code{line[6]},
+@code{line[2]} next to @code{line[7]}, and so on.  Two loops are used to
+accomplish this.  The outer loop, controlled by @code{i}, steps through
+every 10 lines of data; this is each row of labels.  The inner loop,
+controlled by @code{j}, goes through the lines within the row.
+As @code{j} goes from 0 to 4, @samp{i+j} is the @code{j}-th line in
+the row, and @samp{i+j+5} is the entry next to it.  The output ends up
+looking something like this:
+
+@example
+line 1          line 6
+line 2          line 7
+line 3          line 8
+line 4          line 9
+line 5          line 10
+@dots{}
+@end example
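+
+The index arithmetic can be checked on its own with a short sketch that
+prints only the subscripts for the first two rows of labels, rather than
+the label text.  The limit of 20 is chosen just for this illustration:
+
+@example
+BEGIN @{
+    for (i = 1; i <= 20; i += 10)        # one pass per row of labels
+        for (j = 0; j < 5; j++)          # five lines within the row
+            printf "line[%d]\tline[%d]\n", i+j, i+j+5
+@}
+@end example
+
+@noindent
+The first five lines of output pair elements 1 through 5 with elements
+6 through 10, and the next five pair elements 11 through 15 with
+elements 16 through 20.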
+
+As a final note, an extra blank line is printed at lines 21 and 61, to keep
+the output lined up on the labels.  This is dependent on the particular
+brand of labels in use when the program was written.  You will also note
+that there are 2 blank lines at the top and 2 blank lines at the bottom.
+
+The @code{END} rule arranges to flush the final page of labels; there may
+not have been an even multiple of 20 labels in the data:
+
+@cindex @code{labels.awk} program
+@example
+@c file eg/prog/labels.awk
+# labels.awk --- print mailing labels
+@c endfile
+@ignore
+@c file eg/prog/labels.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# June 1992
+@c endfile
+@end ignore
+@c file eg/prog/labels.awk
+
+# Each label is 5 lines of data that may have blank lines.
+# The label sheets have 2 blank lines at the top and 2 at
+# the bottom.
+
+BEGIN    @{ RS = "" ; MAXLINES = 100 @}
+
+function printpage(    i, j)
+@{
+    if (Nlines <= 0)
+        return
+
+    printf "\n\n"        # header
+
+    for (i = 1; i <= Nlines; i += 10) @{
+        if (i == 21 || i == 61)
+            print ""
+        for (j = 0; j < 5; j++) @{
+            if (i + j > MAXLINES)
+                break
+            printf "   %-41s %s\n", line[i+j], line[i+j+5]
+        @}
+        print ""
+    @}
+
+    printf "\n\n"        # footer
+
+    for (i in line)
+        line[i] = ""
+@}
+
+# main rule
+@{
+    if (Count >= 20) @{
+        printpage()
+        Count = 0
+        Nlines = 0
+    @}
+    n = split($0, a, "\n")
+    for (i = 1; i <= n; i++)
+        line[++Nlines] = a[i]
+    for (; i <= 5; i++)
+        line[++Nlines] = ""
+    Count++
+@}
+
+END    \
+@{
+    printpage()
+@}
+@c endfile
+@end example
+@c ENDOFRANGE prml
+@c ENDOFRANGE mlprint
+
+@node Word Sorting
+@subsection Generating Word-Usage Counts
+
+@c last comma is part of secondary
+@c STARTOFRANGE worus
+@cindex words, usage counts, generating
+@c NEXT ED: Rewrite this whole section and example
+The following @command{awk} program prints
+the number of occurrences of each word in its input.  It illustrates the
+associative nature of @command{awk} arrays by using strings as subscripts.  It
+also demonstrates the @samp{for @var{index} in @var{array}} mechanism.
+Finally, it shows how @command{awk} is used in conjunction with other
+utility programs to do a useful task of some complexity with a minimum of
+effort.  Some explanations follow the program listing:
+
+@example
+# Print list of word frequencies
+@{
+    for (i = 1; i <= NF; i++)
+        freq[$i]++
+@}
+
+END @{
+    for (word in freq)
+        printf "%s\t%d\n", word, freq[word]
+@}
+@end example
+
+@c Exercise: Use asort() here
+
+This program has two rules.  The
+first rule, because it has an empty pattern, is executed for every input line.
+It uses @command{awk}'s field-accessing mechanism
+(@pxref{Fields}) to pick out the individual words from
+the line, and the built-in variable @code{NF} (@pxref{Built-in Variables})
+to know how many fields are available.
+For each input word, it increments an element of the array @code{freq} to
+reflect that the word has been seen an additional time.
+
+The second rule, because it has the pattern @code{END}, is not executed
+until the input has been exhausted.  It prints out the contents of the
+@code{freq} table that has been built up inside the first action.
+This program has several problems that would prevent it from being
+useful by itself on real text files:
+
+@itemize @bullet
+@item
+Words are detected using the @command{awk} convention that fields are
+separated just by whitespace.  Other characters in the input (except
+newlines) don't have any special meaning to @command{awk}.  This means that
+punctuation characters count as part of words.
+
+@item
+The @command{awk} language considers upper- and lowercase characters to be
+distinct.  Therefore, ``bartender'' and ``Bartender'' are not treated
+as the same word.  This is undesirable, since in normal text, words
+are capitalized if they begin sentences, and a frequency analyzer should not
+be sensitive to capitalization.
+
+@item
+The output does not come out in any useful order.  You're more likely to be
+interested in which words occur most frequently or in having an alphabetized
+table of how frequently each word occurs.
+@end itemize
+
+@cindex @command{sort} utility
+The way to solve these problems is to use some of @command{awk}'s more advanced
+features.  First, we use @code{tolower} to remove
+case distinctions.  Next, we use @code{gsub} to remove punctuation
+characters.  Finally, we use the system @command{sort} utility to process the
+output of the @command{awk} script.  Here is the new version of
+the program:
+
+@cindex @code{wordfreq.awk} program
+@example
+@c file eg/prog/wordfreq.awk
+# wordfreq.awk --- print list of word frequencies
+
+@{
+    $0 = tolower($0)    # remove case distinctions
+    # remove punctuation
+    gsub(/[^[:alnum:]_[:blank:]]/, "", $0)
+    for (i = 1; i <= NF; i++)
+        freq[$i]++
+@}
+
+END @{
+    for (word in freq)
+        printf "%s\t%d\n", word, freq[word]
+@}
+@c endfile
+@end example
+
+Assuming we have saved this program in a file named @file{wordfreq.awk},
+and that the data is in @file{file1}, the following pipeline:
+
+@example
+awk -f wordfreq.awk file1 | sort -k 2nr
+@end example
+
+@noindent
+produces a table of the words appearing in @file{file1} in order of
+decreasing frequency.  The @command{awk} program suitably massages the
+data and produces a word frequency table, which is not ordered.
+
+The @command{awk} script's output is then sorted by the @command{sort}
+utility and printed on the terminal.  The options given to @command{sort}
+specify a sort that uses the second field of each input line (skipping
+one field), that the sort keys should be treated as numeric quantities
+(otherwise @samp{15} would come before @samp{5}), and that the sorting
+should be done in descending (reverse) order.
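+
+For example, with a short piece of made-up input supplied on standard
+input instead of @file{file1}:
+
+@example
+$ echo 'Go, go, go. Stop, stop. Wait?' |
+> awk -f wordfreq.awk | sort -k 2nr
+@print{} go      3
+@print{} stop    2
+@print{} wait    1
+@end example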
+
+The @command{sort} could even be done from within the program, by changing
+the @code{END} action to:
+
+@example
+@c file eg/prog/wordfreq.awk
+END @{
+    sort = "sort -k 2nr"
+    for (word in freq)
+        printf "%s\t%d\n", word, freq[word] | sort
+    close(sort)
+@}
+@c endfile
+@end example
+
+This way of sorting must be used on systems that do not
+have true pipes at the command-line (or batch-file) level.
+See the general operating system documentation for more information on how
+to use the @command{sort} program.
+@c ENDOFRANGE worus
+
+@node History Sorting
+@subsection Removing Duplicates from Unsorted Text
+
+@c last comma is part of secondary
+@c STARTOFRANGE lidu
+@cindex lines, duplicate, removing
+The @command{uniq} program
+(@pxref{Uniq Program})
+removes duplicate lines from @emph{sorted} data.
+
+Suppose, however, you need to remove duplicate lines from a @value{DF} but
+want to preserve the order the lines are in.  A good example of
+this might be a shell history file.  The history file keeps a copy of all
+the commands you have entered, and it is not unusual to repeat a command
+several times in a row.  Occasionally you might want to compact the history
+by removing duplicate entries.  Yet it is desirable to maintain the order
+of the original commands.
+
+This simple program does the job.  It uses two arrays.  The @code{data}
+array is indexed by the text of each line.
+For each line, @code{data[$0]} is incremented.
+If a particular line has not
+been seen before, then @code{data[$0]} is zero.
+In this case, the text of the line is stored in @code{lines[count]}.
+Each element of @code{lines} is a unique command, and the indices of
+@code{lines} indicate the order in which those lines are encountered.
+The @code{END} rule simply prints out the lines, in order:
+
+@cindex Rakitzis, Byron
+@cindex @code{histsort.awk} program
+@example
+@c file eg/prog/histsort.awk
+# histsort.awk --- compact a shell history file
+# Thanks to Byron Rakitzis for the general idea
+@c endfile
+@ignore
+@c file eg/prog/histsort.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+
+@c endfile
+@end ignore
+@c file eg/prog/histsort.awk
+@group
+@{
+    if (data[$0]++ == 0)
+        lines[++count] = $0
+@}
+@end group
+
+END @{
+    for (i = 1; i <= count; i++)
+        print lines[i]
+@}
+@c endfile
+@end example
+
+This program also provides a foundation for generating other useful
+information.  For example, using the following @code{print} statement in the
+@code{END} rule indicates how often a particular command is used:
+
+@example
+print data[lines[i]], lines[i]
+@end example
+
+This works because @code{data[$0]} is incremented each time a line is
+seen.
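+
+As a brief illustration with made-up history lines, the following
+one-liner combines the main rule with the modified @code{print}
+statement:
+
+@example
+$ printf 'ls\ncd /tmp\nls\nmake\ncd /tmp\n' |
+> awk '@{ if (data[$0]++ == 0) lines[++count] = $0 @}
+>      END @{ for (i = 1; i <= count; i++)
+>                print data[lines[i]], lines[i] @}'
+@print{} 2 ls
+@print{} 2 cd /tmp
+@print{} 1 make
+@end example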
+@c ENDOFRANGE lidu
+
+@node Extract Program
+@subsection Extracting Programs from Texinfo Source Files
+
+@c STARTOFRANGE texse
+@cindex Texinfo, extracting programs from source files
+@c last comma is part of secondary
+@c STARTOFRANGE fitex
+@cindex files, Texinfo, extracting programs from
+@ifnotinfo
+Both this chapter and the previous chapter
+(@ref{Library Functions})
+present a large number of @command{awk} programs.
+@end ifnotinfo
+@ifinfo
+The nodes
+@ref{Library Functions},
+and @ref{Sample Programs},
+are the top level nodes for a large number of @command{awk} programs.
+@end ifinfo
+If you want to experiment with these programs, it is tedious to have to type
+them in by hand.  Here we present a program that can extract parts of a
+Texinfo input file into separate files.
+
+@cindex Texinfo
+This @value{DOCUMENT} is written in Texinfo, the GNU project's document
+formatting
+language.
+A single Texinfo source file can be used to produce both
+printed and online documentation.
+@ifnotinfo
+Texinfo is fully documented in the book
+@cite{Texinfo---The GNU Documentation Format},
+available from the Free Software Foundation.
+@end ifnotinfo
+@ifinfo
+The Texinfo language is described fully, starting with
+@ref{Top}.
+@end ifinfo
+
+For our purposes, it is enough to know three things about Texinfo input
+files:
+
+@itemize @bullet
+@item
+The ``at'' symbol (@samp{@@}) is special in Texinfo, much as
+the backslash (@samp{\}) is in C
+or @command{awk}.  Literal @samp{@@} symbols are represented in Texinfo source
+files as @samp{@@@@}.
+
+@item
+Comments start with either @samp{@@c} or @samp{@@comment}.
+The file-extraction program works by using special comments that start
+at the beginning of a line.
+
+@item
+Lines containing @samp{@@group} and @samp{@@end group} commands bracket
+example text that should not be split across a page boundary.
+(Unfortunately, @TeX{} isn't always smart enough to do things exactly right,
+and we have to give it some help.)
+@end itemize
+
+The following program, @file{extract.awk}, reads through a Texinfo source
+file and does two things, based on the special comments.
+Upon seeing @samp{@w{@@c system @dots{}}},
+it runs a command, by extracting the command text from the
+control line and passing it on to the @code{system} function
+(@pxref{I/O Functions}).
+Upon seeing @samp{@@c file @var{filename}}, each subsequent line is sent to
+the file @var{filename}, until @samp{@@c endfile} is encountered.
+The rules in @file{extract.awk} match either @samp{@@c} or
+@samp{@@comment} by letting the @samp{omment} part be optional.
+Lines containing @samp{@@group} and @samp{@@end group} are simply removed.
+@file{extract.awk} uses the @code{join} library function
+(@pxref{Join Function}).
+
+The example programs in the online Texinfo source for @cite{@value{TITLE}}
+(@file{gawk.texi}) have all been bracketed inside @samp{file} and
+@samp{endfile} lines.  The @command{gawk} distribution uses a copy of
+@file{extract.awk} to extract the sample programs and install many
+of them in a standard directory where @command{gawk} can find them.
+The Texinfo file looks something like this:
+
+@example
+@dots{}
+This program has a @@code@{BEGIN@} rule,
+that prints a nice message:
+
+@@example
+@@c file examples/messages.awk
+BEGIN @@@{ print "Don't panic!" @@@}
+@@c endfile
+@@end example
+
+It also prints some final advice:
+
+@@example
+@@c file examples/messages.awk
+END @@@{ print "Always avoid bored archeologists!" @@@}
+@@c endfile
+@@end example
+@dots{}
+@end example
+
+@file{extract.awk} begins by setting @code{IGNORECASE} to one, so that
+mixed upper- and lowercase letters in the directives won't matter.
+
+The first rule handles calling @code{system}, checking that a command is
+given (@code{NF} is at least three) and also checking that the command
+exits with a zero exit status, signifying OK:
+
+@cindex @code{extract.awk} program
+@example
+@c file eg/prog/extract.awk
+# extract.awk --- extract files and run programs
+#                 from texinfo files
+@c endfile
+@ignore
+@c file eg/prog/extract.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# May 1993
+# Revised September 2000
+
+@c endfile
+@end ignore
+@c file eg/prog/extract.awk
+BEGIN    @{ IGNORECASE = 1 @}
+
+/^@@c(omment)?[ \t]+system/    \
+@{
+    if (NF < 3) @{
+        e = (FILENAME ":" FNR)
+        e = (e  ": badly formed `system' line")
+        print e > "/dev/stderr"
+        next
+    @}
+    $1 = ""
+    $2 = ""
+    stat = system($0)
+    if (stat != 0) @{
+        e = (FILENAME ":" FNR)
+        e = (e ": warning: system returned " stat)
+        print e > "/dev/stderr"
+    @}
+@}
+@c endfile
+@end example
+
+@noindent
+The variable @code{e} is used so that the rule
+fits nicely on the
+@ifnotinfo
+page.
+@end ifnotinfo
+@ifnottex
+screen.
+@end ifnottex
+
+The second rule handles moving data into files.  It verifies that a
+@value{FN} is given in the directive.  If the file named is not the
+current file, then the current file is closed.  Keeping the current file
+open until a new file is encountered allows the use of the @samp{>}
+redirection for printing the contents, keeping open file management
+simple.
+
+The @samp{for} loop does the work.  It reads lines using @code{getline}
+(@pxref{Getline}).
+For an unexpected end of file, it calls the @code{@w{unexpected_eof}}
+function.  If the line is an ``endfile'' line, then it breaks out of
+the loop.
+If the line is an @samp{@@group} or @samp{@@end group} line, then it
+ignores it and goes on to the next line.
+Similarly, comments within examples are also ignored.
+
+Most of the work is in the following few lines.  If the line has no @samp{@@}
+symbols, the program can print it directly.
+Otherwise, each leading @samp{@@} must be stripped off.
+To remove the @samp{@@} symbols, the line is split into separate elements of
+the array @code{a}, using the @code{split} function
+(@pxref{String Functions}).
+The @samp{@@} symbol is used as the separator character.
+Each element of @code{a} that is empty indicates two successive @samp{@@}
+symbols in the original line.  For each two empty elements (@samp{@@@@} in
+the original file), we have to add a single @samp{@@} symbol back in.
+
+When the processing of the array is finished, @code{join} is called with the
+value of @code{SUBSEP} as the separator, which by that function's convention
+rejoins the pieces into a single line with nothing between them.
+That line is then printed to the output file:
+
+@example
+@c file eg/prog/extract.awk
+/^@@c(omment)?[ \t]+file/    \
+@{
+    if (NF != 3) @{
+        e = (FILENAME ":" FNR ": badly formed `file' line")
+        print e > "/dev/stderr"
+        next
+    @}
+    if ($3 != curfile) @{
+        if (curfile != "")
+            close(curfile)
+        curfile = $3
+    @}
+
+    for (;;) @{
+        if ((getline line) <= 0)
+            unexpected_eof()
+        if (line ~ /^@@c(omment)?[ \t]+endfile/)
+            break
+        else if (line ~ /^@@(end[ \t]+)?group/)
+            continue
+        else if (line ~ /^@@c(omment)?[ \t]+/)
+            continue
+        if (index(line, "@@") == 0) @{
+            print line > curfile
+            continue
+        @}
+        n = split(line, a, "@@")
+        # if a[1] == "", means leading @@,
+        # don't add one back in.
+        for (i = 2; i <= n; i++) @{
+            if (a[i] == "") @{ # was an @@@@
+                a[i] = "@@"
+                if (a[i+1] == "")
+                    i++
+            @}
+        @}
+        print join(a, 1, n, SUBSEP) > curfile
+    @}
+@}
+@c endfile
+@end example
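+
+To see the @samp{@@} handling in isolation, here is a minimal sketch that
+applies the same loop to each input line.  The file name
+@file{strip_at.awk} and the function name @code{strip_at} are made up for
+this illustration, and the pieces are concatenated directly instead of
+calling @code{join}, so that the sketch stands alone:
+
+@example
+# strip_at.awk --- hypothetical demo of the @@-stripping loop
+function strip_at(line,    a, n, i, out)
+@{
+    n = split(line, a, "@@")
+    for (i = 2; i <= n; i++) @{
+        if (a[i] == "") @{ # was an @@@@
+            a[i] = "@@"
+            if (a[i+1] == "")
+                i++
+        @}
+    @}
+    out = a[1]
+    for (i = 2; i <= n; i++)  # rejoin with no separator
+        out = out a[i]
+    return out
+@}
+
+@{ print strip_at($0) @}
+@end example
+
+@noindent
+Given the Texinfo source line @samp{BEGIN @@@{ print "hi" @@@}}, this should
+print @samp{BEGIN @{ print "hi" @}}, and @samp{arnold@@@@gnu.org} should come
+out as @samp{arnold@@gnu.org}.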
+
+An important thing to note is the use of the @samp{>} redirection.
+Output done with @samp{>} opens the file only once; it stays open and
+subsequent output is appended to the file
+(@pxref{Redirection}).
+This makes it easy to mix program text and explanatory prose for the same
+sample source file (as has been done here!) without any hassle.  The file is
+only closed when a new data @value{FN} is encountered or at the end of the
+input file.
+
+Finally, the function @code{@w{unexpected_eof}} prints an appropriate
+error message and then exits.
+The @code{END} rule handles the final cleanup, closing the open file:
+
+@c function lb put on same line for page breaking. sigh
+@example
+@c file eg/prog/extract.awk
+@group
+function unexpected_eof() @{
+    printf("%s:%d: unexpected EOF or error\n",
+        FILENAME, FNR) > "/dev/stderr"
+    exit 1
+@}
+@end group
+
+END @{
+    if (curfile)
+        close(curfile)
+@}
+@c endfile
+@end example
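+
+As a usage sketch, assuming @file{join.awk} and @file{extract.awk} are in
+the current directory and that the directories named in the
+@samp{@@c file} lines already exist (@command{awk}'s @samp{>} redirection
+does not create directories), the programs could be extracted like this:
+
+@example
+mkdir -p eg/prog   # plus any other directories the file names use
+gawk -f join.awk -f extract.awk gawk.texi
+@end example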
+@c ENDOFRANGE texse
+@c ENDOFRANGE fitex
+
+@node Simple Sed
+@subsection A Simple Stream Editor
+
+@cindex @command{sed} utility
+@cindex stream editors
+The @command{sed} utility is a stream editor, a program that reads a
+stream of data, makes changes to it, and passes it on.
+It is often used to make global changes to a large file or to a stream
+of data generated by a pipeline of commands.
+While @command{sed} is a complicated program in its own right, its most common
+use is to perform global substitutions in the middle of a pipeline:
+
+@example
+command1 < orig.data | sed 's/old/new/g' | command2 > result
+@end example
+
+Here, @samp{s/old/new/g} tells @command{sed} to look for the regexp
+@samp{old} on each input line and globally replace it with the text
+@samp{new}, i.e., all the occurrences on a line.  This is similar to
+@command{awk}'s @code{gsub} function
+(@pxref{String Functions}).
+
+The following program, @file{awksed.awk}, accepts at least two command-line
+arguments: the pattern to look for and the text to replace it with. Any
+additional arguments are treated as data @value{FN}s to process. If none
+are provided, the standard input is used:
+
+@cindex Brennan, Michael
+@cindex @code{awksed.awk} program
+@c @cindex simple stream editor
+@c @cindex stream editor, simple
+@example
+@c file eg/prog/awksed.awk
+# awksed.awk --- do s/foo/bar/g using just print
+#    Thanks to Michael Brennan for the idea
+@c endfile
+@ignore
+@c file eg/prog/awksed.awk
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# August 1995
+
+@c endfile
+@end ignore
+@c file eg/prog/awksed.awk
+function usage()
+@{
+    print "usage: awksed pat repl [files...]" > "/dev/stderr"
+    exit 1
+@}
+
+BEGIN @{
+    # validate arguments
+    if (ARGC < 3)
+        usage()
+
+    RS = ARGV[1]
+    ORS = ARGV[2]
+
+    # don't use arguments as files
+    ARGV[1] = ARGV[2] = ""
+@}
+
+@group
+# look ma, no hands!
+@{
+    if (RT == "")
+        printf "%s", $0
+    else
+        print
+@}
+@end group
+@c endfile
+@end example
+
+The program relies on @command{gawk}'s ability to have @code{RS} be a regexp,
+as well as on the setting of @code{RT} to the actual text that terminates the
+record (@pxref{Records}).
+
+The idea is to have @code{RS} be the pattern to look for. @command{gawk}
+automatically sets @code{$0} to the text between matches of the pattern.
+This is text that we want to keep, unmodified.  Then, by setting @code{ORS}
+to the replacement text, a simple @code{print} statement outputs the
+text we want to keep, followed by the replacement text.
+
+There is one wrinkle to this scheme, which is what to do if the last record
+doesn't end with text that matches @code{RS}.  Using a @code{print}
+statement unconditionally prints the replacement text, which is not correct.
+However, if the file did not end in text that matches @code{RS}, @code{RT}
+is set to the null string.  In this case, we can print @code{$0} using
+@code{printf}
+(@pxref{Printf}).
+
+The @code{BEGIN} rule handles the setup, checking for the right number
+of arguments and calling @code{usage} if there is a problem. Then it sets
+@code{RS} and @code{ORS} from the command-line arguments and sets
+@code{ARGV[1]} and @code{ARGV[2]} to the null string, so that they are
+not treated as @value{FN}s
+(@pxref{ARGC and ARGV}).
+
+The @code{usage} function prints an error message and exits.
+Finally, the single rule handles the printing scheme outlined above,
+using @code{print} or @code{printf} as appropriate, depending upon the
+value of @code{RT}.
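+
+As a quick check of the behavior described above, assuming
+@file{awksed.awk} is in the current directory, replacing every @samp{foo}
+with @samp{baz} should look like this:
+
+@example
+$ echo 'foo bar foo' | gawk -f awksed.awk foo baz
+@print{} baz bar baz
+@end example
+
+@noindent
+Here the final record is just the trailing newline, for which @code{RT} is
+empty, so it is printed with @code{printf} and no extra @samp{baz} appears.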
+
+@ignore
+Exercise, compare the performance of this version with the more
+straightforward:
+
+BEGIN {
+    pat = ARGV[1]
+    repl = ARGV[2]
+    ARGV[1] = ARGV[2] = ""
+}
+
+{ gsub(pat, repl); print }
+
+Exercise: what are the advantages and disadvantages of this version versus sed?
+  Advantage: egrep regexps
+             speed (?)
+  Disadvantage: no & in replacement text
+
+Others?
+@end ignore
+
+@node Igawk Program
+@subsection An Easy Way to Use Library Functions
+
+@c STARTOFRANGE libfex
+@cindex libraries of @command{awk} functions, example program for using
+@c STARTOFRANGE flibex
+@cindex functions, library, example program for using
+Using library functions in @command{awk} can be very beneficial. It
+encourages code reuse and the writing of general functions. Programs are
+smaller and therefore clearer.
+However, using library functions is only easy when writing @command{awk}
+programs; it is painful when running them, requiring multiple @option{-f}
+options.  If @command{gawk} is unavailable, then so too is the @env{AWKPATH}
+environment variable and the ability to put @command{awk} functions into a
+library directory (@pxref{Options}).
+It would be nice to be able to write programs in the following manner:
+
+@example
+# library functions
+@@include getopt.awk
+@@include join.awk
+@dots{}
+
+# main program
+BEGIN @{
+    while ((c = getopt(ARGC, ARGV, "a:b:cde")) != -1)
+        @dots{}
+    @dots{}
+@}
+@end example
+
+The following program, @file{igawk.sh}, provides this service.
+It simulates @command{gawk}'s searching of the @env{AWKPATH} variable
+and also allows @dfn{nested} includes; i.e., a file that is included
+with @samp{@@include} can contain further @samp{@@include} statements.
+@command{igawk} makes an effort to only include files once, so that nested
+includes don't accidentally include a library function twice.
+
+@command{igawk} should behave just like @command{gawk} externally.  This
+means it should accept all of @command{gawk}'s command-line arguments,
+including the ability to have multiple source files specified via
+@option{-f}, and the ability to mix command-line and library source files.
+
+The program is written using the POSIX Shell (@command{sh}) command
+language.@footnote{Fully explaining the @command{sh} language is beyond
+the scope of this book. We provide some minimal explanations, but see
+a good shell programming book if you wish to understand things in more
+depth.} It works as follows:
+
+@enumerate
+@item
+Loop through the arguments, saving anything that doesn't represent
+@command{awk} source code for later, when the expanded program is run.
+
+@item
+For any arguments that do represent @command{awk} text, put the arguments into
+a shell variable that will be expanded.  There are two cases:
+
+@enumerate a
+@item
+Literal text, provided with @option{--source} or @option{--source=}.  This
+text is just appended directly.
+
+@item
+Source @value{FN}s, provided with @option{-f}.  We use a neat trick and append
+@samp{@@include @var{filename}} to the shell variable's contents.  Since the file-inclusion
+program works the way @command{gawk} does, this gets the text
+of the file included into the program at the correct point.
+@end enumerate
+
+@item
+Run an @command{awk} program (naturally) over the shell variable's contents to expand
+@samp{@@include} statements.  The expanded program is placed in a second
+shell variable.
+
+@item
+Run the expanded program with @command{gawk} and any other original command-line
+arguments that the user supplied (such as the data @value{FN}s).
+@end enumerate
+
+This program uses shell variables extensively: to store the command-line arguments,
+the text of the @command{awk} program that will expand the user's program, the
+user's original program, and the expanded program.  Doing so removes some
+potential problems that might arise were we to use temporary files instead,
+at the cost of making the script somewhat more complicated.
+
+The initial part of the program turns on shell tracing if the first
+argument is @samp{debug}.
+
+The next part loops through all the command-line arguments.
+There are several cases of interest:
+
+@table @code
+@item --
+This ends the arguments to @command{igawk}.  Anything else should be passed on
+to the user's @command{awk} program without being evaluated.
+
+@item -W
+This indicates that the next option is specific to @command{gawk}.  To make
+argument processing easier, the @option{-W} is prepended to the
+remaining arguments and the loop continues.  (This is an @command{sh}
+programming trick.  Don't worry about it if you are not familiar with
+@command{sh}.)
+
+@item -v@r{,} -F
+These are saved and passed on to @command{gawk}.
+
+@item -f@r{,} --file@r{,} --file=@r{,} -Wfile=
+The @value{FN} is appended to the shell variable @code{program} with an
+@samp{@@include} statement.
+The @command{expr} utility is used to remove the leading option part of the
+argument (e.g., @samp{--file=}).
+(Typical @command{sh} usage would be to use the @command{echo} and @command{sed}
+utilities to do this work.  Unfortunately, some versions of @command{echo} evaluate
+escape sequences in their arguments, possibly mangling the program text.
+Using @command{expr} avoids this problem.)
+
+@item --source@r{,} --source=@r{,} -Wsource=
+The source text is appended to @code{program}.
+
+@item --version@r{,} -Wversion
+@command{igawk} prints its version number, runs @samp{gawk --version}
+to get the @command{gawk} version information, and then exits.
+@end table
+
+If none of the @option{-f}, @option{--file}, @option{-Wfile}, @option{--source},
+or @option{-Wsource} arguments are supplied, then the first nonoption argument
+should be the @command{awk} program.  If there are no command-line
+arguments left, @command{igawk} prints an error message and exits.
+Otherwise, the first argument is appended to @code{program}.
+In any case, after the arguments have been processed,
+@code{program} contains the complete text of the original @command{awk}
+program.
+
+The program is as follows:
+
+@cindex @code{igawk.sh} program
+@example
+@c file eg/prog/igawk.sh
+#! /bin/sh
+# igawk --- like gawk but do @@include processing
+@c endfile
+@ignore
+@c file eg/prog/igawk.sh
+#
+# Arnold Robbins, arnold@@gnu.org, Public Domain
+# July 1993
+
+@c endfile
+@end ignore
+@c file eg/prog/igawk.sh
+if [ "$1" = debug ]
+then
+    set -x
+    shift
+fi
+
+# A literal newline, so that program text is formatted correctly
+n='
+'
+
+# Initialize variables to empty
+program=
+opts=
+
+while [ $# -ne 0 ] # loop over arguments
+do
+    case $1 in
+    --)     shift; break;;
+
+    -W)     shift
+            # The $@{x?'message here'@} construct prints a
+            # diagnostic if $x is unset
+            set -- -W"$@{@@?'missing operand'@}"
+            continue;;
+
+    -[vF])  opts="$opts $1 '$@{2?'missing operand'@}'"
+            shift;;
+
+    -[vF]*) opts="$opts '$1'" ;;
+
+    -f)     program="$program$n@@include $@{2?'missing operand'@}"
+            shift;;
+
+    -f*)    f=`expr "$1" : '-f\(.*\)'`
+            program="$program$n@@include $f";;
+
+    -[W-]file=*)
+            f=`expr "$1" : '-.file=\(.*\)'`
+            program="$program$n@@include $f";;
+
+    -[W-]file)
+            program="$program$n@@include $@{2?'missing operand'@}"
+            shift;;
+
+    -[W-]source=*)
+            t=`expr "$1" : '-.source=\(.*\)'`
+            program="$program$n$t";;
+
+    -[W-]source)
+            program="$program$n$@{2?'missing operand'@}"
+            shift;;
+
+    -[W-]version)
+            echo igawk: version 2.0 1>&2
+            gawk --version
+            exit 0 ;;
+
+    -[W-]*) opts="$opts '$1'" ;;
+
+    *)      break;;
+    esac
+    shift
+done
+
+if [ -z "$program" ]
+then
+     program=$@{1?'missing program'@}
+     shift
+fi
+
+# At this point, `program' has the program.
+@c endfile
+@end example
+
+The @command{awk} program to process @samp{@@include} directives
+is stored in the shell variable @code{expand_prog}.  Doing this keeps
+the shell script readable.  The @command{awk} program
+reads through the user's program, one line at a time, using @code{getline}
+(@pxref{Getline}).  The input
+@value{FN}s and @samp{@@include} statements are managed using a stack.
+As each @samp{@@include} is encountered, the current @value{FN} is
+``pushed'' onto the stack and the file named in the @samp{@@include}
+directive becomes the current @value{FN}.  As each file is finished,
+the stack is ``popped,'' and the previous input file becomes the current
+input file again.  The process is started by making the original file
+the first one on the stack.
+
+The @code{pathto} function does the work of finding the full path to
+a file.  It simulates @command{gawk}'s behavior when searching the
+@env{AWKPATH} environment variable
+(@pxref{AWKPATH Variable}).
+If a @value{FN} has a @samp{/} in it, no path search is done. Otherwise,
+the @value{FN} is concatenated with the name of each directory in
+the path, and an attempt is made to open the generated @value{FN}.
+The only way to test if a file can be read in @command{awk} is to go
+ahead and try to read it with @code{getline}; this is what @code{pathto}
+does.@footnote{On some very old versions of @command{awk}, the test
+@samp{getline junk < t} can loop forever if the file exists but is empty.
+Caveat emptor.} If the file can be read, it is closed and the @value{FN}
+is returned:
+
+@ignore
+An alternative way to test for the file's existence would be to call
+@samp{system("test -r " t)}, which uses the @command{test} utility to
+see if the file exists and is readable.  The disadvantage to this method
+is that it requires creating an extra process and can thus be slightly
+slower.
+@end ignore
+
+@example
+@c file eg/prog/igawk.sh
+expand_prog='
+
+function pathto(file,    i, t, junk)
+@{
+    if (index(file, "/") != 0)
+        return file
+
+    for (i = 1; i <= ndirs; i++) @{
+        t = (pathlist[i] "/" file)
+@group
+        if ((getline junk < t) > 0) @{
+            # found it
+            close(t)
+            return t
+        @}
+@end group
+    @}
+    return ""
+@}
+@c endfile
+@end example
+
+The main program is contained inside one @code{BEGIN} rule.  The first thing it
+does is set up the @code{pathlist} array that @code{pathto} uses.  After
+splitting the path on @samp{:}, null elements are replaced with @code{"."},
+which represents the current directory:
+
+@example
+@c file eg/prog/igawk.sh
+BEGIN @{
+    path = ENVIRON["AWKPATH"]
+    ndirs = split(path, pathlist, ":")
+    for (i = 1; i <= ndirs; i++) @{
+        if (pathlist[i] == "")
+            pathlist[i] = "."
+    @}
+@c endfile
+@end example
+
+The stack is initialized with @code{ARGV[1]}, which will be @file{/dev/stdin}.
+The main loop comes next.  Input lines are read in succession. Lines that
+do not start with @samp{@@include} are printed verbatim.
+If the line does start with @samp{@@include}, the @value{FN} is in @code{$2}.
+@code{pathto} is called to generate the full path.  If it cannot, then we
+print an error message and continue.
+
+The next thing to check is if the file is included already.  The
+@code{processed} array is indexed by the full @value{FN} of each included
+file and it tracks this information for us.  If the file is
+seen again, a warning message is printed. Otherwise, the new @value{FN} is
+pushed onto the stack and processing continues.
+
+Finally, when @code{getline} encounters the end of the input file, the file
+is closed and the stack is popped.  When @code{stackptr} is less than zero,
+the program is done:
+
+@example
+@c file eg/prog/igawk.sh
+    stackptr = 0
+    input[stackptr] = ARGV[1] # ARGV[1] is first file
+
+    for (; stackptr >= 0; stackptr--) @{
+        while ((getline < input[stackptr]) > 0) @{
+            if (tolower($1) != "@@include") @{
+                print
+                continue
+            @}
+            fpath = pathto($2)
+@group
+            if (fpath == "") @{
+                printf("igawk:%s:%d: cannot find %s\n",
+                    input[stackptr], FNR, $2) > "/dev/stderr"
+                continue
+            @}
+@end group
+            if (! (fpath in processed)) @{
+                processed[fpath] = input[stackptr]
+                input[++stackptr] = fpath  # push onto stack
+            @} else
+                print $2, "included in", input[stackptr],
+                    "already included in",
+                    processed[fpath] > "/dev/stderr"
+        @}
+        close(input[stackptr])
+    @}
+@}'  # close quote ends `expand_prog' variable
+
+processed_program=`gawk -- "$expand_prog" /dev/stdin <<EOF
+$program
+EOF
+`
+@c endfile
+@end example
+
+The shell construct @samp{@var{command} << @var{marker}} is called a @dfn{here document}.
+Everything in the shell script up to the @var{marker} is fed to @var{command} as input.
+The shell processes the contents of the here document for variable and command substitution
+(and possibly other things as well, depending upon the shell).
+
+The shell construct @samp{`@dots{}`} is called @dfn{command substitution}.
+The output of the command between the two backquotes (grave accents) is substituted
+into the command line.  It is saved as a single string, even if the results
+contain whitespace.
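+
+The following small sketch, which is not part of @command{igawk}, combines
+the two constructs in the same way.  The variable names @code{text} and
+@code{upper} are arbitrary:
+
+@example
+text='hello world'
+upper=`tr a-z A-Z <<EOF
+$text
+EOF
+`
+echo "$upper"
+@end example
+
+@noindent
+The here document supplies the value of @code{text} as the standard input of
+@command{tr}, and the backquotes capture its output, so this should print
+@samp{HELLO WORLD}.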
+
+The expanded program is saved in the variable @code{processed_program}.
+It's done in these steps:
+
+@enumerate
+@item
+Run @command{gawk} with the @samp{@@include}-processing program (the
+value of the @code{expand_prog} shell variable) on standard input.
+
+@item
+Standard input is the contents of the user's program, from the shell variable @code{program}.
+Its contents are fed to @command{gawk} via a here document.
+
+@item
+The results of this processing are saved in the shell variable @code{processed_program} by using command substitution.
+@end enumerate
+
+The last step is to call @command{gawk} with the expanded program,
+along with the original
+options and command-line arguments that the user supplied.
+
+@c this causes more problems than it solves, so leave it out.
+@ignore
+The special file @file{/dev/null} is passed as a @value{DF} to @command{gawk}
+to handle an interesting case. Suppose that the user's program only has
+a @code{BEGIN} rule and there are no @value{DF}s to read.
+The program should exit without reading any @value{DF}s.
+However, suppose that an included library file defines an @code{END}
+rule of its own. In this case, @command{gawk} will hang, reading standard
+input. In order to avoid this, @file{/dev/null} is explicitly added to the
+command-line. Reading from @file{/dev/null} always returns an immediate
+end of file indication.
+
+@c Hmm. Add /dev/null if $# is 0?  Still messes up ARGV. Sigh.
+@end ignore
+
+@example
+@c file eg/prog/igawk.sh
+eval gawk $opts -- '"$processed_program"' '"$@@"'
+@c endfile
+@end example
+
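+The @command{eval} command makes the shell parse its arguments a second time.
+The options saved in @code{opts} carry their own single quotes, and the
+literal double quotes around @code{$processed_program} and @code{$@@} survive
+the first pass, so the second pass sees every argument properly quoted.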
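+
+A minimal sketch of why the second parse matters, using a made-up option
+value that contains a space:
+
+@example
+opts="-v 'msg=hello world'"
+eval gawk $opts -- "'BEGIN @{ print msg @}'"
+@end example
+
+@noindent
+The single quotes stored inside @code{opts} are ordinary characters until
+@command{eval} parses the command line again; at that point they group
+@samp{msg=hello world} into one argument, so this should print
+@samp{hello world}.  Without @command{eval}, the value would be split at
+the space.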
+
+This version of @command{igawk} represents my fourth attempt at this program.
+There are four key simplifications that make the program work better:
+
+@itemize @bullet
+@item
+Using @samp{@@include} even for the files named with @option{-f} makes building
+the initial collected @command{awk} program much simpler; all the
+@samp{@@include} processing can be done once.
+
+@item
+Not trying to save the line read with @code{getline}
+in the @code{pathto} function when testing for the
+file's accessibility for use with the main program simplifies things
+considerably.
+@c what problem does this engender though - exercise
+@c answer, reading from "-" or /dev/stdin
+
+@item
+Using a @code{getline} loop in the @code{BEGIN} rule does it all in one
+place.  It is not necessary to call out to a separate loop for processing
+nested @samp{@@include} statements.
+
+@item
+Instead of saving the expanded program in a temporary file, putting it in a shell variable
+avoids some potential security problems.
+This has the disadvantage that the script relies upon more features
+of the @command{sh} language, making it harder to follow for those who
+aren't familiar with @command{sh}.
+@end itemize
+
+Also, this program illustrates that it is often worthwhile to combine
+@command{sh} and @command{awk} programming.  You can usually
+accomplish quite a lot, without having to resort to low-level programming
+in C or C++, and it is frequently easier to do certain kinds of string
+and argument manipulation using the shell than it is in @command{awk}.
+
+Finally, @command{igawk} shows that it is not always necessary to add new
+features to a program; they can often be layered on top.  With @command{igawk},
+there is no real reason to build @samp{@@include} processing into
+@command{gawk} itself.
+
+@cindex search paths, for source files
+@c comma is part of primary
+@cindex source files, search path for
+@c last comma is part of secondary
+@cindex files, source, search path for
+@cindex directories, searching
+As an additional example of this, consider the idea of having two
+files in a directory in the search path:
+
+@table @file
+@item default.awk
+This file contains a set of default library functions, such
+as @code{getopt} and @code{assert}.
+
+@item site.awk
+This file contains library functions that are specific to a site or
+installation; i.e., locally developed functions.
+Having a separate file allows @file{default.awk} to change with
+new @command{gawk} releases, without requiring the system administrator to
+update it each time by adding the local functions.
+@end table
+
+One user
+@c Karl Berry, karl@ileaf.com, 10/95
+suggested that @command{gawk} be modified to automatically read these files
+upon startup.  Instead, it would be very simple to modify @command{igawk}
+to do this. Since @command{igawk} can process nested @samp{@@include}
+directives, @file{default.awk} could simply contain @samp{@@include}
+statements for the desired library functions.
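+
+One possible way to do this, offered only as a sketch, is to add a line to
+@file{igawk.sh} just after the point where @code{program} has been
+collected, so that both files are included ahead of the user's program:
+
+@example
+# hypothetical addition to igawk.sh: always include the two library files
+program="@@include default.awk$n@@include site.awk$n$program"
+@end example
+
+@noindent
+With this change, the include-expansion program would print its usual
+``cannot find'' diagnostic when either file is missing from the
+@env{AWKPATH} directories.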
+@c ENDOFRANGE libfex
+@c ENDOFRANGE flibex
+@c ENDOFRANGE awkpex
+
+@c Exercise: make this change
+
+@ignore
+@c Try this
+@iftex
+@page
+@headings off
+@majorheading III@ @ @ Appendixes
+Part III provides the appendixes, the Glossary, and two licenses that cover
+the @command{gawk} source code and this @value{DOCUMENT}, respectively.
+It contains the following appendixes:
+
+@itemize @bullet
+@item
+@ref{Language History}.
+
+@item
+@ref{Installation}.
+
+@item
+@ref{Notes}.
+
+@item
+@ref{Basic Concepts}.
+
+@item
+@ref{Glossary}.
+
+@item
+@ref{Copying}.
+
+@item
+@ref{GNU Free Documentation License}.
+@end itemize
+
+@page
+@evenheading @thispage@ @ @ @strong{@value{TITLE}} @| @|
+@oddheading  @| @| @strong{@thischapter}@ @ @ @thispage
+@end iftex
+@end ignore
+
+@node Language History
+@appendix The Evolution of the @command{awk} Language
+
+This @value{DOCUMENT} describes the GNU implementation of @command{awk}, which follows
+the POSIX specification.
+Many long-time @command{awk} users learned @command{awk} programming
+with the original @command{awk} implementation in Version 7 Unix.
+(This implementation was the basis for @command{awk} in Berkeley Unix,
+through 4.3-Reno.  Subsequent versions of Berkeley Unix, and systems
+derived from 4.4BSD-Lite, use various versions of @command{gawk}
+for their @command{awk}.)
+This @value{CHAPTER} briefly describes the
+evolution of the @command{awk} language, with cross-references to other parts
+of the @value{DOCUMENT} where you can find more information.
+
+@menu
+* V7/SVR3.1::                   The major changes between V7 and System V
+                                Release 3.1.
+* SVR4::                        Minor changes between System V Releases 3.1
+                                and 4.
+* POSIX::                       New features from the POSIX standard.
+* BTL::                         New features from the Bell Laboratories
+                                version of @command{awk}.
+* POSIX/GNU::                   The extensions in @command{gawk} not in POSIX
+                                @command{awk}.
+* Contributors::                The major contributors to @command{gawk}.
+@end menu
+
+@node V7/SVR3.1
+@appendixsec Major Changes Between V7 and SVR3.1
+@c STARTOFRANGE gawkv
+@cindex @command{awk}, versions of
+@c STARTOFRANGE gawkv1
+@cindex @command{awk}, versions of, changes between V7 and SVR3.1
+
+The @command{awk} language evolved considerably between the release of
+Version 7 Unix (1978) and the new version that was first made generally available in
+System V Release 3.1 (1987).  This @value{SECTION} summarizes the changes, with
+cross-references to further details:
+
+@itemize @bullet
+@item
+The requirement for @samp{;} to separate rules on a line
+(@pxref{Statements/Lines}).
+
+@item
+User-defined functions and the @code{return} statement
+(@pxref{User-defined}).
+
+@item
+The @code{delete} statement (@pxref{Delete}).
+
+@item
+The @code{do}-@code{while} statement
+(@pxref{Do Statement}).
+
+@item
+The built-in functions @code{atan2}, @code{cos}, @code{sin}, @code{rand}, and
+@code{srand} (@pxref{Numeric Functions}).
+
+@item
+The built-in functions @code{gsub}, @code{sub}, and @code{match}
+(@pxref{String Functions}).
+
+@item
+The built-in functions @code{close} and @code{system}
+(@pxref{I/O Functions}).
+
+@item
+The @code{ARGC}, @code{ARGV}, @code{FNR}, @code{RLENGTH}, @code{RSTART},
+and @code{SUBSEP} built-in variables (@pxref{Built-in Variables}).
+
+@item
+The conditional expression using the ternary operator @samp{?:}
+(@pxref{Conditional Exp}).
+
+@item
+The exponentiation operator @samp{^}
+(@pxref{Arithmetic Ops}) and its assignment operator
+form @samp{^=} (@pxref{Assignment Ops}).
+
+@item
+C-compatible operator precedence, which breaks some old @command{awk}
+programs (@pxref{Precedence}).
+
+@item
+Regexps as the value of @code{FS}
+(@pxref{Field Separators}) and as the
+third argument to the @code{split} function
+(@pxref{String Functions}).
+
+@item
+Dynamic regexps as operands of the @samp{~} and @samp{!~} operators
+(@pxref{Regexp Usage}).
+
+@item
+The escape sequences @samp{\b}, @samp{\f}, and @samp{\r}
+(@pxref{Escape Sequences}).
+(Some vendors have updated their old versions of @command{awk} to
+recognize @samp{\b}, @samp{\f}, and @samp{\r}, but this is not
+something you can rely on.)
+
+@item
+Redirection of input for the @code{getline} function
+(@pxref{Getline}).
+
+@item
+Multiple @code{BEGIN} and @code{END} rules
+(@pxref{BEGIN/END}).
+
+@item
+Multidimensional arrays
+(@pxref{Multi-dimensional}).
+@end itemize
+@c ENDOFRANGE gawkv1
+
+@node SVR4
+@appendixsec Changes Between SVR3.1 and SVR4
+
+@cindex @command{awk}, versions of, changes between SVR3.1 and SVR4
+The System V Release 4 (1989) version of Unix @command{awk} added these features
+(some of which originated in @command{gawk}):
+
+@itemize @bullet
+@item
+The @code{ENVIRON} variable (@pxref{Built-in Variables}).
+@c gawk and MKS awk
+
+@item
+Multiple @option{-f} options on the command line
+(@pxref{Options}).
+@c MKS awk
+
+@item
+The @option{-v} option for assigning variables before program execution begins
+(@pxref{Options}).
+@c GNU, Bell Laboratories & MKS together
+
+@item
+The @option{--} option for terminating command-line options.
+
+@item
+The @samp{\a}, @samp{\v}, and @samp{\x} escape sequences
+(@pxref{Escape Sequences}).
+@c GNU, for ANSI C compat
+
+@item
+A defined return value for the @code{srand} built-in function
+(@pxref{Numeric Functions}).
+
+@item
+The @code{toupper} and @code{tolower} built-in string functions
+for case translation
+(@pxref{String Functions}).
+
+@item
+A cleaner specification for the @samp{%c} format-control letter in the
+@code{printf} function
+(@pxref{Control Letters}).
+
+@item
+The ability to dynamically pass the field width and precision (@code{"%*.*d"})
+in the argument list of the @code{printf} function
+(@pxref{Control Letters}).
+
+@item
+The use of regexp constants, such as @code{/foo/}, as expressions, where
+they are equivalent to using the matching operator, as in @samp{$0 ~ /foo/}
+(@pxref{Using Constant Regexps}).
+
+@item
+Processing of escape sequences inside command-line variable assignments
+(@pxref{Assignment Options}).
+@end itemize
+
+@node POSIX
+@appendixsec Changes Between SVR4 and POSIX @command{awk}
+@cindex @command{awk}, versions of, changes between SVR4 and POSIX @command{awk}
+@cindex POSIX @command{awk}, changes in @command{awk} versions
+
+The POSIX Command Language and Utilities standard for @command{awk} (1992)
+introduced the following changes into the language:
+
+@itemize @bullet
+@item
+The use of @option{-W} for implementation-specific options
+(@pxref{Options}).
+
+@item
+The use of @code{CONVFMT} for controlling the conversion of numbers
+to strings (@pxref{Conversion}).
+
+@item
+The concept of a numeric string and tighter comparison rules to go
+with it (@pxref{Typing and Comparison}).
+
+@item
+More complete documentation of many of the previously undocumented
+features of the language.
+@end itemize
+
+The following common extensions are not permitted by the POSIX
+standard.  Each item states the behavior that results without the extension:
+
+@c IMPORTANT! Keep this list in sync with the one in node Options
+
+@itemize @bullet
+@item
+@code{\x} escape sequences are not recognized
+(@pxref{Escape Sequences}).
+
+@item
+Newlines do not act as whitespace to separate fields when @code{FS} is
+equal to a single space
+(@pxref{Fields}).
+
+@item
+Newlines are not allowed after @samp{?} or @samp{:}
+(@pxref{Conditional Exp}).
+
+@item
+The synonym @code{func} for the keyword @code{function} is not
+recognized (@pxref{Definition Syntax}).
+
+@item
+The operators @samp{**} and @samp{**=} cannot be used in
+place of @samp{^} and @samp{^=} (@pxref{Arithmetic Ops},
+and @ref{Assignment Ops}).
+
+@item
+Specifying @samp{-Ft} on the command line does not set the value
+of @code{FS} to be a single TAB character
+(@pxref{Field Separators}).
+
+@item
+The @code{fflush} built-in function is not supported
+(@pxref{I/O Functions}).
+@end itemize
+@c ENDOFRANGE gawkv
+
+@node BTL
+@appendixsec Extensions in the Bell Laboratories @command{awk}
+
+@cindex @command{awk}, versions of, See Also Bell Laboratories @command{awk}
+@cindex extensions, Bell Laboratories @command{awk}
+@cindex Bell Laboratories @command{awk} extensions
+@cindex Kernighan, Brian
+Brian Kernighan, one of the original designers of Unix @command{awk},
+has made his version available via his home page
+(@pxref{Other Versions}).
+This @value{SECTION} describes extensions in his version of @command{awk} that are
+not in POSIX @command{awk}:
+
+@itemize @bullet
+@item
+The @samp{-mf @var{N}} and @samp{-mr @var{N}} command-line options
+to set the maximum number of fields and the maximum
+record size, respectively
+(@pxref{Options}).
+As a side note, his @command{awk} no longer needs these options;
+it continues to accept them to avoid breaking old programs.
+
+@item
+The @code{fflush} built-in function for flushing buffered output
+(@pxref{I/O Functions}).
+
+@item
+The @samp{**} and @samp{**=} operators
+(@pxref{Arithmetic Ops}
+and
+@ref{Assignment Ops}).
+
+@item
+The use of @code{func} as an abbreviation for @code{function}
+(@pxref{Definition Syntax}).
+
+@ignore
+@item
+The @code{SYMTAB} array, that allows access to @command{awk}'s internal symbol
+table. This feature is not documented, largely because
+it is somewhat shakily implemented. For instance, you cannot access arrays
+or array elements through it.
+@end ignore
+@end itemize
+
+The Bell Laboratories @command{awk} also incorporates the following extensions,
+originally developed for @command{gawk}:
+
+@itemize @bullet
+@item
+The @samp{\x} escape sequence
+(@pxref{Escape Sequences}).
+
+@item
+The @file{/dev/stdin}, @file{/dev/stdout}, and @file{/dev/stderr}
+special files
+(@pxref{Special Files}).
+
+@item
+The ability for @code{FS} and for the third
+argument to @code{split} to be null strings
+(@pxref{Single Character Fields}).
+
+@item
+The @code{nextfile} statement
+(@pxref{Nextfile Statement}).
+
+@item
+The ability to delete all of an array at once with @samp{delete @var{array}}
+(@pxref{Delete}).
+@end itemize
+
+@node POSIX/GNU
+@appendixsec Extensions in @command{gawk} Not in POSIX @command{awk}
+
+@ignore
+I've tried to follow this general order, esp. for the 3.0 and 3.1 sections:
+       variables
+       special files
+       language changes (e.g., hex constants)
+       differences in standard awk functions
+       new gawk functions
+       new keywords
+       new command-line options
+       new ports
+Within each category, be alphabetical.
+@end ignore
+
+@c STARTOFRANGE fripls
+@cindex compatibility mode (@command{gawk}), extensions
+@c STARTOFRANGE exgnot
+@cindex extensions, in @command{gawk}, not in POSIX @command{awk}
+@c STARTOFRANGE posnot
+@cindex POSIX, @command{gawk} extensions not included in
+The GNU implementation, @command{gawk}, adds a large number of features.
+This @value{SECTION} lists them in the order they were added to @command{gawk}.
+They can all be disabled with either the @option{--traditional} or
+the @option{--posix} option
+(@pxref{Options}).
+
+Version 2.10 of @command{gawk} introduced the following features:
+
+@itemize @bullet
+@item
+The @env{AWKPATH} environment variable for specifying a path search for
+the @option{-f} command-line option
+(@pxref{Options}).
+
+@item
+The @code{IGNORECASE} variable and its effects
+(@pxref{Case-sensitivity}).
+
+@item
+The @file{/dev/stdin}, @file{/dev/stdout}, @file{/dev/stderr} and
+@file{/dev/fd/@var{N}} special @value{FN}s
+(@pxref{Special Files}).
+@end itemize
+
+Version 2.13 of @command{gawk} introduced the following features:
+
+@itemize @bullet
+@item
+The @code{FIELDWIDTHS} variable and its effects
+(@pxref{Constant Size}).
+
+@item
+The @code{systime} and @code{strftime} built-in functions for obtaining
+and printing timestamps
+(@pxref{Time Functions}).
+
+@item
+The @option{-W lint} option to provide error and portability checking
+both for the source code and at runtime
+(@pxref{Options}).
+
+@item
+The @option{-W compat} option to turn off the GNU extensions
+(@pxref{Options}).
+
+@item
+The @option{-W posix} option for full POSIX compliance
+(@pxref{Options}).
+@end itemize
+
+Version 2.14 of @command{gawk} introduced the following feature:
+
+@itemize @bullet
+@item
+The @code{next file} statement for skipping to the next @value{DF}
+(@pxref{Nextfile Statement}).
+@end itemize
+
+Version 2.15 of @command{gawk} introduced the following features:
+
+@itemize @bullet
+@item
+The @code{ARGIND} variable, which tracks the movement of @code{FILENAME}
+through @code{ARGV}  (@pxref{Built-in Variables}).
+
+@item
+The @code{ERRNO} variable, which contains the system error message when
+@code{getline} returns @minus{}1 or @code{close} fails
+(@pxref{Built-in Variables}).
+
+@item
+The @file{/dev/pid}, @file{/dev/ppid}, @file{/dev/pgrpid}, and
+@file{/dev/user} @value{FN} interpretation
+(@pxref{Special Files}).
+
+@item
+The ability to delete all of an array at once with @samp{delete @var{array}}
+(@pxref{Delete}).
+
+@item
+The ability to use GNU-style long-named options that start with @option{--}
+(@pxref{Options}).
+
+@item
+The @option{--source} option for mixing command-line and library-file
+source code
+(@pxref{Options}).
+@end itemize
+
+Version 3.0 of @command{gawk} introduced the following features:
+
+@itemize @bullet
+@item
+@code{IGNORECASE} changed, now applying to string comparison as well
+as regexp operations
+(@pxref{Case-sensitivity}).
+
+@item
+The @code{RT} variable that contains the input text that
+matched @code{RS}
+(@pxref{Records}).
+
+@item
+Full support for both POSIX and GNU regexps
+(@pxref{Regexp}).
+
+@item
+The @code{gensub} function for more powerful text manipulation
+(@pxref{String Functions}).
+
+@item
+The @code{strftime} function acquired a default time format,
+allowing it to be called with no arguments
+(@pxref{Time Functions}).
+
+@item
+The ability for @code{FS} and for the third
+argument to @code{split} to be null strings
+(@pxref{Single Character Fields}).
+
+@item
+The ability for @code{RS} to be a regexp
+(@pxref{Records}).
+
+@item
+The @code{next file} statement became @code{nextfile}
+(@pxref{Nextfile Statement}).
+
+@item
+The @option{--lint-old} option to
+warn about constructs that are not available in
+the original Version 7 Unix version of @command{awk}
+(@pxref{V7/SVR3.1}).
+
+@item
+The @option{-m} option and the @code{fflush} function from the
+Bell Laboratories research version of @command{awk}
+(@pxref{Options}; also
+@pxref{I/O Functions}).
+
+@item
+The @option{--re-interval} option to provide interval expressions in regexps
+(@pxref{Regexp Operators}).
+
+@item
+The @option{--traditional} option was added as a better name for
+@option{--compat} (@pxref{Options}).
+
+@item
+The use of GNU Autoconf to control the configuration process
+(@pxref{Quick Installation}).
+
+@item
+Amiga support
+(@pxref{Amiga Installation}).
+
+@end itemize
+
+Version 3.1 of @command{gawk} introduced the following features:
+
+@itemize @bullet
+@item
+The @code{BINMODE} special variable for non-POSIX systems,
+which allows binary I/O for input and/or output files
+(@pxref{PC Using}).
+
+@item
+The @code{LINT} special variable, which dynamically controls lint warnings
+(@pxref{Built-in Variables}).
+
+@item
+The @code{PROCINFO} array for providing process-related information
+(@pxref{Built-in Variables}).
+
+@item
+The @code{TEXTDOMAIN} special variable for setting an application's
+internationalization text domain
+(@pxref{Built-in Variables},
+and
+@ref{Internationalization}).
+
+@item
+The ability to use octal and hexadecimal constants in @command{awk}
+program source code
+(@pxref{Nondecimal-numbers}).
+
+@item
+The @samp{|&} operator for two-way I/O to a coprocess
+(@pxref{Two-way I/O}).
+
+@item
+The @file{/inet} special files for TCP/IP networking using @samp{|&}
+(@pxref{TCP/IP Networking}).
+
+@item
+The optional second argument to @code{close} that allows closing one end
+of a two-way pipe to a coprocess
+(@pxref{Two-way I/O}).
+
+@item
+The optional third argument to the @code{match} function
+for capturing text-matching subexpressions within a regexp
+(@pxref{String Functions}).
+
+@item
+Positional specifiers in @code{printf} formats for
+making translations easier
+(@pxref{Printf Ordering}).
+
+@item
+The @code{asort} and @code{asorti} functions for sorting arrays
+(@pxref{Array Sorting}).
+
+@item
+The @code{bindtextdomain}, @code{dcgettext} and @code{dcngettext} functions
+for internationalization
+(@pxref{Programmer i18n}).
+
+@item
+The @code{extension} built-in function and the ability to add
+new built-in functions dynamically
+(@pxref{Dynamic Extensions}).
+
+@item
+The @code{mktime} built-in function for creating timestamps
+(@pxref{Time Functions}).
+
+@item
+The
+@code{and},
+@code{or},
+@code{xor},
+@code{compl},
+@code{lshift},
+@code{rshift},
+and
+@code{strtonum} built-in
+functions
+(@pxref{Bitwise Functions}).
+
+@item
+@cindex @code{next file} statement
+The support for @samp{next file} as two words was removed completely
+(@pxref{Nextfile Statement}).
+
+@item
+The @option{--dump-variables} option to print a list of all global variables
+(@pxref{Options}).
+
+@item
+The @option{--gen-po} command-line option and the use of a leading
+underscore to mark strings that should be translated
+(@pxref{String Extraction}).
+
+@item
+The @option{--non-decimal-data} option to allow non-decimal
+input data
+(@pxref{Nondecimal Data}).
+
+@item
+The @option{--profile} option and @command{pgawk}, the
+profiling version of @command{gawk}, for producing execution
+profiles of @command{awk} programs
+(@pxref{Profiling}).
+
+@item
+The @option{--enable-portals} configuration option to enable special treatment of
+pathnames that begin with @file{/p} as BSD portals
+(@pxref{Portal Files}).
+
+@item
+The use of GNU Automake to help in standardizing the configuration process
+(@pxref{Quick Installation}).
+
+@item
+The use of GNU @code{gettext} for @command{gawk}'s own message output
+(@pxref{Gawk I18N}).
+
+@item
+BeOS support
+(@pxref{BeOS Installation}).
+
+@item
+Tandem support
+(@pxref{Tandem Installation}).
+
+@item
+The Atari port became officially unsupported
+(@pxref{Atari Installation}).
+
+@item
+The source code now uses new-style function definitions, with
+@command{ansi2knr} to convert the code on systems with old compilers.
+
+@item
+The @option{--disable-lint} configuration option to disable lint checking
+at compile time
+(@pxref{Additional Configuration Options}).
+
+@end itemize
+
+@c XXX ADD MORE STUFF HERE
+
+@c ENDOFRANGE fripls
+@c ENDOFRANGE exgnot
+@c ENDOFRANGE posnot
+
+@node Contributors
+@appendixsec Major Contributors to @command{gawk}
+@cindex @command{gawk}, list of contributors to
+@quotation
+@i{Always give credit where credit is due.}@*
+Anonymous
+@end quotation
+
+This @value{SECTION} names the major contributors to @command{gawk}
+and/or this @value{DOCUMENT}, in approximate chronological order:
+
+@itemize @bullet
+@item
+@cindex Aho, Alfred
+@cindex Weinberger, Peter
+@cindex Kernighan, Brian
+Dr.@: Alfred V.@: Aho,
+Dr.@: Peter J.@: Weinberger, and
+Dr.@: Brian W.@: Kernighan, all of Bell Laboratories,
+designed and implemented Unix @command{awk},
+from which @command{gawk} gets the majority of its feature set.
+
+@item
+@cindex Rubin, Paul
+Paul Rubin
+did the initial design and implementation in 1986, and wrote
+the first draft (around 40 pages) of this @value{DOCUMENT}.
+
+@item
+@cindex Fenlason, Jay
+Jay Fenlason
+finished the initial implementation.
+
+@item
+@cindex Close, Diane
+Diane Close
+revised the first draft of this @value{DOCUMENT}, bringing it
+to around 90 pages.
+
+@item
+@cindex Stallman, Richard
+Richard Stallman
+helped finish the implementation and the initial draft of this
+@value{DOCUMENT}.
+He is also the founder of the FSF and the GNU project.
+
+@item
+@cindex Woods, John
+John Woods
+contributed parts of the code (mostly fixes) in
+the initial version of @command{gawk}.
+
+@item
+@cindex Trueman, David
+In 1988,
+David Trueman
+took over primary maintenance of @command{gawk},
+making it compatible with ``new'' @command{awk}, and
+greatly improving its performance.
+
+@item
+@cindex Rankin, Pat
+Pat Rankin
+provided the VMS port and its documentation.
+
+@item
+@cindex Kwok, Conrad
+@cindex Garfinkle, Scott
+@cindex Williams, Kent
+Conrad Kwok,
+Scott Garfinkle,
+and
+Kent Williams
+did the initial ports to MS-DOS with various versions of MSC.
+
+@item
+@cindex Peterson, Hal
+Hal Peterson
+provided help in porting @command{gawk} to Cray systems.
+
+@item
+@cindex Rommel, Kai Uwe
+Kai Uwe Rommel
+provided the initial port to OS/2 and its documentation.
+
+@item
+@cindex Jaegermann, Michal
+Michal Jaegermann
+provided the port to Atari systems and its documentation.
+He continues to provide portability checking with DEC Alpha
+systems, and has done a lot of work to make sure @command{gawk}
+works on non-32-bit systems.
+
+@item
+@cindex Fish, Fred
+Fred Fish
+provided the port to Amiga systems and its documentation.
+
+@item
+@cindex Deifik, Scott
+Scott Deifik
+currently maintains the MS-DOS port.
+
+@item
+@cindex Grigera, Juan
+Juan Grigera
+maintains the port to Windows32 systems.
+
+@item
+@cindex Hankerson, Darrel
+Dr.@: Darrel Hankerson
+acts as coordinator for the various ports to different PC platforms
+and creates binary distributions for various PC operating systems.
+He is also instrumental in keeping the documentation up to date for
+the various PC platforms.
+
+@item
+@cindex Zoulas, Christos
+Christos Zoulas
+provided the @code{extension}
+built-in function for dynamically adding new modules.
+
+@item
+@cindex Kahrs, J@"urgen
+J@"urgen Kahrs
+contributed the initial version of the TCP/IP networking
+code and documentation, and motivated the inclusion of the @samp{|&} operator.
+
+@item
+@cindex Davies, Stephen
+Stephen Davies
+provided the port to Tandem systems and its documentation.
+
+@item
+@cindex Brown, Martin
+Martin Brown
+provided the port to BeOS and its documentation.
+
+@item
+@cindex Peters, Arno
+Arno Peters
+did the initial work to convert @command{gawk} to use
+GNU Automake and @code{gettext}.
+
+@item
+@cindex Broder, Alan J.@:
+Alan J.@: Broder
+provided the initial version of the @code{asort} function
+as well as the code for the new optional third argument to the @code{match} function.
+
+@item
+@cindex Buening, Andreas
+Andreas Buening
+updated the @command{gawk} port for OS/2.
+
+@item
+@cindex Hasegawa, Isamu
+Isamu Hasegawa,
+of IBM in Japan, contributed support for multibyte characters.
+
+@item
+@cindex Benzinger, Michael
+Michael Benzinger contributed the initial code for @code{switch} statements.
+
+@item
+@cindex McPhee, Patrick
+Patrick T.J.@: McPhee contributed the code for dynamic loading in Windows32
+environments.
+
+@item
+@cindex Robbins, Arnold
+Arnold Robbins
+has been working on @command{gawk} since 1988, at first
+helping David Trueman, and as the primary maintainer since around 1994.
+@end itemize
+
+@node Installation
+@appendix Installing @command{gawk}
+
+@c last two commas are part of see also
+@cindex operating systems, See Also GNU/Linux, PC operating systems, Unix
+@c STARTOFRANGE gligawk
+@cindex @command{gawk}, installing
+@c STARTOFRANGE ingawk
+@cindex installing @command{gawk}
+This appendix provides instructions for installing @command{gawk} on the
+various platforms that are supported by the developers.  The primary
+developer supports GNU/Linux (and Unix), whereas the other ports are
+contributed.
+@xref{Bugs},
+for the electronic mail addresses of the people who did
+the respective ports.
+
+@menu
+* Gawk Distribution::           What is in the @command{gawk} distribution.
+* Unix Installation::           Installing @command{gawk} under various
+                                versions of Unix.
+* Non-Unix Installation::       Installation on Other Operating Systems.
+* Unsupported::                 Systems whose ports are no longer supported.
+* Bugs::                        Reporting Problems and Bugs.
+* Other Versions::              Other freely available @command{awk}
+                                implementations.
+@end menu
+
+@node Gawk Distribution
+@appendixsec The @command{gawk} Distribution
+@cindex source code, @command{gawk}
+
+This @value{SECTION} describes how to get the @command{gawk}
+distribution, how to extract it, and then what is in the various files and
+subdirectories.
+
+@menu
+* Getting::                     How to get the distribution.
+* Extracting::                  How to extract the distribution.
+* Distribution contents::       What is in the distribution.
+@end menu
+
+@node Getting
+@appendixsubsec Getting the @command{gawk} Distribution
+@c last comma is part of secondary
+@cindex @command{gawk}, source code, obtaining
+There are three ways to get GNU software:
+
+@itemize @bullet
+@item
+Copy it from someone else who already has it.
+
+@cindex FSF (Free Software Foundation)
+@cindex Free Software Foundation (FSF)
+@item
+Order @command{gawk} directly from the Free Software Foundation.
+Software distributions are available for
+GNU/Linux, Unix, and MS-Windows, in several CD packages.
+Their address is:
+
+@display
+Free Software Foundation
+59 Temple Place, Suite 330
+Boston, MA  02111-1307 USA
+Phone: +1-617-542-5942
+Fax (including Japan): +1-617-542-2652
+Email: @email{gnu@@gnu.org}
+URL: @uref{http://www.gnu.org}
+@end display
+
+@noindent
+Ordering from the FSF directly contributes to the support of the foundation
+and to the production of more free software.
+
+@item
+Retrieve @command{gawk} by using anonymous @command{ftp} to the Internet host
+@code{ftp.gnu.org}, in the directory @file{/gnu/gawk}.
+@end itemize
+
+The GNU software archive is mirrored around the world.
+The up-to-date list of mirror sites is available from
+@uref{http://www.gnu.org/order/ftp.html, the main FSF web site}.
+Try to use one of the mirrors; they
+will be less busy, and you can usually find one closer to your site.
+
+@node Extracting
+@appendixsubsec Extracting the Distribution
+@command{gawk} is distributed as a @code{tar} file compressed with the
+GNU Zip program, @code{gzip}.
+
+Once you have the distribution (for example,
+@file{gawk-@value{VERSION}.@value{PATCHLEVEL}.tar.gz}),
+use @code{gzip} to expand the
+file and then use @code{tar} to extract it.  You can use the following
+pipeline to produce the @command{gawk} distribution:
+
+@example
+# Under System V, add 'o' to the tar options
+gzip -d -c gawk-@value{VERSION}.@value{PATCHLEVEL}.tar.gz | tar -xvpf -
+@end example
+
+@noindent
+This creates a directory named @file{gawk-@value{VERSION}.@value{PATCHLEVEL}}
+in the current directory.
+
+The distribution @value{FN} is of the form
+@file{gawk-@var{V}.@var{R}.@var{P}.tar.gz}.
+The @var{V} represents the major version of @command{gawk},
+the @var{R} represents the current release of version @var{V}, and
+the @var{P} represents a @dfn{patch level}, meaning that minor bugs have
+been fixed in the release.  The current patch level is @value{PATCHLEVEL},
+but when retrieving distributions, you should get the version with the highest
+version, release, and patch level.  (Note, however, that patch levels greater than
+or equal to 80 denote ``beta'' or nonproduction software; you might not want
+to retrieve such a version unless you don't mind experimenting.)
+If you are not on a Unix system, you need to make other arrangements
+for getting and extracting the @command{gawk} distribution.  You should consult
+a local expert.
+
+@node Distribution contents
+@appendixsubsec Contents of the @command{gawk} Distribution
+@c STARTOFRANGE gawdis
+@cindex @command{gawk}, distribution
+
+The @command{gawk} distribution has a number of C source files,
+documentation files,
+subdirectories, and files related to the configuration process
+(@pxref{Unix Installation}),
+as well as several subdirectories related to different non-Unix
+operating systems:
+
+@table @asis
+@item Various @samp{.c}, @samp{.y}, and @samp{.h} files
+The actual @command{gawk} source code.
+@end table
+
+@table @file
+@item README
+@itemx README_d/README.*
+Descriptive files: @file{README} for @command{gawk} under Unix and the
+rest for the various hardware and software combinations.
+
+@item INSTALL
+A file providing an overview of the configuration and installation process.
+
+@item ChangeLog
+A detailed list of source code changes as bugs are fixed or improvements made.
+
+@item NEWS
+A list of changes to @command{gawk} since the last release or patch.
+
+@item COPYING
+The GNU General Public License.
+
+@item FUTURES
+A brief list of features and changes being contemplated for future
+releases, with some indication of the time frame for the feature, based
+on its difficulty.
+
+@item LIMITATIONS
+A list of those factors that limit @command{gawk}'s performance.
+Most of these depend on the hardware or operating system software and
+are not limits in @command{gawk} itself.
+
+@item POSIX.STD
+A description of one area in which the POSIX standard for @command{awk} is
+incorrect, as well as how @command{gawk} handles the problem.
+
+@c comma is part of primary
+@cindex artificial intelligence, @command{gawk} and
+@item doc/awkforai.txt
+A short article describing why @command{gawk} is a good language for
+AI (Artificial Intelligence) programming.
+
+@item doc/README.card
+@itemx doc/ad.block
+@itemx doc/awkcard.in
+@itemx doc/cardfonts
+@itemx doc/colors
+@itemx doc/macros
+@itemx doc/no.colors
+@itemx doc/setter.outline
+The @command{troff} source for a five-color @command{awk} reference card.
+A modern version of @command{troff} such as GNU @command{troff} (@command{groff}) is
+needed to produce the color version. See the file @file{README.card}
+for instructions if you have an older @command{troff}.
+
+@item doc/gawk.1
+The @command{troff} source for a manual page describing @command{gawk}.
+This is distributed for the convenience of Unix users.
+
+@cindex Texinfo
+@item doc/gawk.texi
+The Texinfo source file for this @value{DOCUMENT}.
+It should be processed with @TeX{} to produce a printed document, and
+with @command{makeinfo} to produce an Info or HTML file.
+
+@item doc/awk.info
+The generated Info file for this @value{DOCUMENT}.
+
+@item doc/gawkinet.texi
+The Texinfo source file for
+@ifinfo
+@xref{Top}.
+@end ifinfo
+@ifnotinfo
+@cite{TCP/IP Internetworking with @command{gawk}}.
+@end ifnotinfo
+It should be processed with @TeX{} to produce a printed document and
+with @command{makeinfo} to produce an Info or HTML file.
+
+@item doc/gawkinet.info
+The generated Info file for
+@cite{TCP/IP Internetworking with @command{gawk}}.
+
+@item doc/igawk.1
+The @command{troff} source for a manual page describing the @command{igawk}
+program presented in
+@ref{Igawk Program}.
+
+@item doc/Makefile.in
+The input file used during the configuration process to generate the
+actual @file{Makefile} for creating the documentation.
+
+@item Makefile.am
+@itemx */Makefile.am
+Files used by the GNU @command{automake} software for generating
+the @file{Makefile.in} files used by @command{autoconf} and
+@command{configure}.
+
+@item Makefile.in
+@itemx acconfig.h
+@itemx acinclude.m4
+@itemx aclocal.m4
+@itemx configh.in
+@itemx configure.in
+@itemx configure
+@itemx custom.h
+@itemx missing_d/*
+@itemx m4/*
+These files and subdirectories are used when configuring @command{gawk}
+for various Unix systems.  They are explained in
+@ref{Unix Installation}.
+
+@item intl/*
+@itemx po/*
+The @file{intl} directory provides the GNU @code{gettext} library, which implements
+@command{gawk}'s internationalization features, while the @file{po} directory
+contains message translations.
+
+@item awklib/extract.awk
+@itemx awklib/Makefile.am
+@itemx awklib/Makefile.in
+@itemx awklib/eg/*
+The @file{awklib} directory contains a copy of @file{extract.awk}
+(@pxref{Extract Program}),
+which can be used to extract the sample programs from the Texinfo
+source file for this @value{DOCUMENT}. It also contains a @file{Makefile.in} file, which
+@command{configure} uses to generate a @file{Makefile}.
+@file{Makefile.am} is used by GNU Automake to create @file{Makefile.in}.
+The library functions from
+@ref{Library Functions},
+and the @command{igawk} program from
+@ref{Igawk Program},
+are included as ready-to-use files in the @command{gawk} distribution.
+They are installed as part of the installation process.
+The rest of the programs in this @value{DOCUMENT} are available in appropriate
+subdirectories of @file{awklib/eg}.
+
+@item unsupported/atari/*
+Files needed for building @command{gawk} on an Atari ST
+(@pxref{Atari Installation}, for details).
+
+@item unsupported/tandem/*
+Files needed for building @command{gawk} on a Tandem
+(@pxref{Tandem Installation}, for details).
+
+@item posix/*
+Files needed for building @command{gawk} on POSIX-compliant systems.
+
+@item pc/*
+Files needed for building @command{gawk} under MS-DOS, MS Windows and OS/2
+(@pxref{PC Installation}, for details).
+
+@item vms/*
+Files needed for building @command{gawk} under VMS
+(@pxref{VMS Installation}, for details).
+
+@item test/*
+A test suite for
+@command{gawk}.  You can use @samp{make check} from the top-level @command{gawk}
+directory to run your version of @command{gawk} against the test suite.
+If @command{gawk} successfully passes @samp{make check}, then you can
+be confident of a successful port.
+@end table
+@c ENDOFRANGE gawdis
+
+@node Unix Installation
+@appendixsec Compiling and Installing @command{gawk} on Unix
+
+Usually, you can compile and install @command{gawk} by typing only two
+commands.  However, if you use an unusual system, you may need
+to configure @command{gawk} for your system yourself.
+
+@menu
+* Quick Installation::               Compiling @command{gawk} under Unix.
+* Additional Configuration Options:: Other compile-time options.
+* Configuration Philosophy::         How it's all supposed to work.
+@end menu
+
+@node Quick Installation
+@appendixsubsec Compiling @command{gawk} for Unix
+
+@c @cindex installation, unix
+After you have extracted the @command{gawk} distribution, @command{cd}
+to @file{gawk-@value{VERSION}.@value{PATCHLEVEL}}.  Like most GNU software,
+@command{gawk} is configured
+automatically for your Unix system by running the @command{configure} program.
+This program is a Bourne shell script that is generated automatically using
+GNU @command{autoconf}.
+@ifnotinfo
+(The @command{autoconf} software is
+described fully in
+@cite{Autoconf---Generating Automatic Configuration Scripts},
+which is available from the Free Software Foundation.)
+@end ifnotinfo
+@ifinfo
+(The @command{autoconf} software is described fully starting with
+@ref{Top}.)
+@end ifinfo
+
+To configure @command{gawk}, simply run @command{configure}:
+
+@example
+sh ./configure
+@end example
+
+This produces a @file{Makefile} and @file{config.h} tailored to your system.
+The @file{config.h} file describes various facts about your system.
+You might want to edit the @file{Makefile} to
+change the @code{CFLAGS} variable, which controls
+the command-line options that are passed to the C compiler (such as
+optimization levels or compiling for debugging).
+
+Alternatively, you can add your own values for most @command{make}
+variables on the command line, such as @code{CC} and @code{CFLAGS}, when
+running @command{configure}:
+
+@example
+CC=cc CFLAGS=-g sh ./configure
+@end example
+
+@noindent
+See the file @file{INSTALL} in the @command{gawk} distribution for
+all the details.
+
+After you have run @command{configure} and possibly edited the @file{Makefile},
+type:
+
+@example
+make
+@end example
+
+@noindent
+Shortly thereafter, you should have an executable version of @command{gawk}.
+That's all there is to it!
+To verify that @command{gawk} is working properly,
+run @samp{make check}.  All of the tests should succeed.
+If these steps do not work, or if any of the tests fail,
+check the files in the @file{README_d} directory to see if you've
+found a known problem.  If the failure is not described there,
+please send in a bug report
+(@pxref{Bugs}).
+
+@node Additional Configuration Options
+@appendixsubsec Additional Configuration Options
+@cindex @command{gawk}, configuring, options
+@c comma is part of primary
+@cindex configuration options, @command{gawk}
+
+There are several additional options you may use on the @command{configure}
+command line when compiling @command{gawk} from scratch, including:
+
+@table @code
+@cindex @code{--enable-portals} configuration option
+@cindex configuration option, @code{--enable-portals}
+@item --enable-portals
+Treat pathnames that begin
+with @file{/p} as BSD portal files when doing two-way I/O with
+the @samp{|&} operator
+(@pxref{Portal Files}).
+
+@cindex @code{--enable-switch} configuration option
+@cindex configuration option, @code{--enable-switch}
+@item --enable-switch
+Enable the recognition and execution of C-style @code{switch} statements
+in @command{awk} programs
+(@pxref{Switch Statement}).
+
+@cindex Linux
+@cindex GNU/Linux
+@cindex @code{--with-included-gettext} configuration option
+@cindex @code{--with-included-gettext} configuration option, configuring @command{gawk} with
+@cindex configuration option, @code{--with-included-gettext}
+@item --with-included-gettext
+Use the version of the @code{gettext} library that comes with @command{gawk}.
+This option should be used on systems that do @emph{not} use @value{PVERSION} 2 (or later)
+of the GNU C library.
+All known modern GNU/Linux systems use Glibc 2.  Use this option on any other system.
+
+@cindex @code{--disable-lint} configuration option
+@cindex configuration option, @code{--disable-lint}
+@item --disable-lint
+This option disables all lint checking within @command{gawk}.  The
+@option{--lint} and @option{--lint-old} options
+(@pxref{Options})
+are accepted, but silently do nothing.
+Similarly, setting the @code{LINT} variable
+(@pxref{User-modified})
+has no effect on the running @command{awk} program.
+
+When used with GCC's automatic dead-code-elimination, this option
+cuts almost 200K bytes off the size of the @command{gawk}
+executable on GNU/Linux x86 systems.  Results on other systems and
+with other compilers are likely to vary.
+Using this option may bring you some slight performance improvement.
+
+Using this option will cause some of the tests in the test suite
+to fail.  This option may be removed at a later date.
+
+@cindex @code{--disable-nls} configuration option
+@cindex configuration option, @code{--disable-nls}
+@item --disable-nls
+Disable all message-translation facilities.
+This is usually not desirable, but it may bring you some slight performance
+improvement.
+You should also use this option if @option{--with-included-gettext}
+doesn't work on your system.
+@end table
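+
+As a hedged illustration, several of these options can be combined in a
+single @command{configure} run.  The particular combination shown here is
+only an assumption chosen for the example; pick the options that suit
+your system:
+
+@example
+sh ./configure --enable-switch --disable-nls
+@end example
+
+@noindent
+This sketch enables @code{switch} statements and turns off message
+translation; every other setting keeps its default.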
+
+@node Configuration Philosophy
+@appendixsubsec The Configuration Process
+
+@cindex @command{gawk}, configuring
+This @value{SECTION} is of interest only if you know something about using the
+C language and the Unix operating system.
+
+The source code for @command{gawk} generally attempts to adhere to formal
+standards wherever possible.  This means that @command{gawk} uses library
+routines that are specified by the ISO C standard and by the POSIX
+operating system interface standard.  When using an ISO C compiler,
+function prototypes are used to help improve the compile-time checking.
+
+Many Unix systems do not support all of either the ISO or the
+POSIX standards.  The @file{missing_d} subdirectory in the @command{gawk}
+distribution contains replacement versions of those functions that are
+most likely to be missing.
+
+The @file{config.h} file that @command{configure} creates contains
+definitions that describe features of the particular operating system
+where you are attempting to compile @command{gawk}.  The three things
+described by this file are: what header files are available, so that
+they can be correctly included; what (supposedly) standard functions
+are actually available in your C libraries; and various miscellaneous
+facts about your variant of Unix.  For example, there may not be an
+@code{st_blksize} element in the @code{stat} structure.  In this case,
+@samp{HAVE_ST_BLKSIZE} is undefined.
+
+@cindex @code{custom.h} file
+It is possible for your C compiler to lie to @command{configure}. It may
+do so by not exiting with an error when a library function is not
+available.  To get around this, edit the file @file{custom.h}.
+Use an @samp{#ifdef} that is appropriate for your system, and either
+@code{#define} any constants that @command{configure} should have defined but
+didn't, or @code{#undef} any constants that @command{configure} defined and
+should not have.  @file{custom.h} is automatically included by
+@file{config.h}.
+
+It is also possible that the @command{configure} program generated by
+@command{autoconf} will not work on your system in some other fashion.
+If you do have a problem, the file @file{configure.in} is the input for
+@command{autoconf}.  You may be able to change this file and generate a
+new version of @command{configure} that works on your system
+(@pxref{Bugs},
+for information on how to report problems in configuring @command{gawk}).
+The same mechanism may be used to send in updates to @file{configure.in}
+and/or @file{custom.h}.
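+
+A minimal sketch of regenerating the script, assuming a compatible version
+of GNU @command{autoconf} is installed and that the commands are run from
+the top-level @command{gawk} source directory after editing
+@file{configure.in}:
+
+@example
+autoconf
+sh ./configure
+@end example
+
+@noindent
+Here @command{autoconf} reads @file{configure.in} and writes a new
+@file{configure} script, which is then run as usual.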
+
+@node Non-Unix Installation
+@appendixsec Installation on Other Operating Systems
+
+This @value{SECTION} describes how to install @command{gawk} on
+various non-Unix systems.
+
+@menu
+* Amiga Installation::          Installing @command{gawk} on an Amiga.
+* BeOS Installation::           Installing @command{gawk} on BeOS.
+* PC Installation::             Installing and Compiling @command{gawk} on
+                                MS-DOS and OS/2.
+* VMS Installation::            Installing @command{gawk} on VMS.
+@end menu
+
+@node Amiga Installation
+@appendixsubsec Installing @command{gawk} on an Amiga
+
+@cindex amiga
+@cindex installation, amiga
+You can install @command{gawk} on an Amiga system using a Unix emulation
+environment, available via anonymous @command{ftp} from
+@code{ftp.ninemoons.com} in the directory @file{pub/ade/current}.
+This includes a shell based on @command{pdksh}.  The primary component of
+this environment is a Unix emulation library, @file{ixemul.lib}.
+@c could really use more background here, who wrote this, etc.
+
+A more complete distribution for the Amiga is available on
+the Geek Gadgets CD-ROM, available from:
+
+@display
+CRONUS
+1840 E. Warner Road #105-265
+Tempe, AZ 85284  USA
+US Toll Free: (800) 804-0833
+Phone: +1-602-491-0442
+FAX: +1-602-491-0048
+Email: @email{info@@ninemoons.com}
+WWW: @uref{http://www.ninemoons.com}
+Anonymous @command{ftp} site: @code{ftp.ninemoons.com}
+@end display
+
+Once you have the distribution, you can configure @command{gawk} simply by
+running @command{configure}:
+
+@example
+configure -v m68k-amigaos
+@end example
+
+Then run @command{make} and you should be all set!
+If these steps do not work, please send in a bug report
+(@pxref{Bugs}).
+
+@node BeOS Installation
+@appendixsubsec Installing @command{gawk} on BeOS
+@cindex BeOS
+@cindex installation, beos
+
+@c From email contributed by Martin Brown, mc@whoever.com
+Since BeOS DR9, all the tools that you should need to build @command{gawk} are
+included with BeOS. The process is basically identical to the Unix process
+of running @command{configure} and then @command{make}. Full instructions are given below.
+
+You can compile @command{gawk} under BeOS by extracting the standard sources
+and running @command{configure}. You @emph{must} specify the location
+prefix for the installation directory. For BeOS DR9 and beyond, the best directory to
+use is @file{/boot/home/config}, so the @command{configure} command is:
+
+@example
+configure --prefix=/boot/home/config
+@end example
+
+This installs the compiled application into @file{/boot/home/config/bin},
+which is already specified in the standard @env{PATH}.
+
+Once the configuration process is completed, you can run @command{make},
+and then @samp{make install}:
+
+@example
+$ make
+@dots{}
+$ make install
+@end example
+
+BeOS uses @command{bash} as its shell; thus, you use @command{gawk} the same way you would
+under Unix.
+If these steps do not work, please send in a bug report
+(@pxref{Bugs}).
+
+@c Rewritten by Scott Deifik <scottd@amgen.com>
+@c and Darrel Hankerson <hankedr@mail.auburn.edu>
+
+@node PC Installation
+@appendixsubsec Installation on PC Operating Systems
+
+@c first comma is part of primary
+@cindex PC operating systems, @command{gawk} on, installing
+@c {PC, gawk on} is the secondary term
+@cindex operating systems, PC, @command{gawk} on, installing
+This @value{SECTION} covers installation and usage of @command{gawk} on x86 machines
+running DOS, any version of Windows, or OS/2.
+In this @value{SECTION}, the term ``Windows32''
+refers to any of Windows-95/98/ME/NT/2000.
+
+The limitations of DOS (and DOS shells under Windows or OS/2) have meant
+that various ``DOS extenders'' are often used with programs such as
+@command{gawk}.  The varying capabilities of Microsoft Windows 3.1
+and Windows32 can add to the confusion.  For an overview of the
+considerations, please refer to @file{README_d/README.pc} in the
+distribution.
+
+@menu
+* PC Binary Installation::      Installing a prepared distribution.
+* PC Compiling::                Compiling @command{gawk} for MS-DOS, Windows32,
+                                and OS/2.
+* PC Dynamic::                  Compiling @command{gawk} for dynamic libraries.
+* PC Using::                    Running @command{gawk} on MS-DOS, Windows32 and
+                                OS/2.
+* Cygwin::                      Building and running @command{gawk} for
+                                Cygwin.
+@end menu
+
+@node PC Binary Installation
+@appendixsubsubsec Installing a Prepared Distribution for PC Systems
+
+If you have received a binary distribution prepared by the DOS
+maintainers, then @command{gawk} and the necessary support files appear
+under the @file{gnu} directory, with executables in @file{gnu/bin},
+libraries in @file{gnu/lib/awk}, and manual pages under @file{gnu/man}.
+This is designed for easy installation to a @file{/gnu} directory on your
+drive---however, the files can be installed anywhere provided @env{AWKPATH} is
+set properly.  Regardless of the installation directory, the first line of
+@file{igawk.cmd} and @file{igawk.bat} (in @file{gnu/bin}) may need to be
+edited.
+
+The binary distribution contains a separate file describing the
+contents. In particular, it may include more than one version of the
+@command{gawk} executable.
+
+OS/2 (32 bit, EMX) binary distributions are prepared for the @file{/usr}
+directory of your preferred drive. Set @env{UNIXROOT} to your installation
+drive (e.g., @samp{e:}) if you want to install @command{gawk} onto a drive
+other than the hardcoded default @samp{c:}. Executables appear in @file{/usr/bin},
+libraries under @file{/usr/share/awk}, manual pages under @file{/usr/man},
+Texinfo documentation under @file{/usr/info} and NLS files under @file{/usr/share/locale}.
+If you already have a file @file{/usr/info/dir} from another package,
+@emph{do not overwrite it!} Instead, enter the following commands at your prompt
+(replace @samp{x:} by your installation drive):
+
+@example
+install-info --info-dir=x:/usr/info x:/usr/info/awk.info
+install-info --info-dir=x:/usr/info x:/usr/info/gawkinet.info
+@end example
+
+However, the files can be installed anywhere provided @env{AWKPATH} is
+set properly.
+
+The binary distribution may contain a separate file containing additional
+or more detailed installation instructions.
+
+@node PC Compiling
+@appendixsubsubsec Compiling @command{gawk} for PC Operating Systems
+
+@command{gawk} can be compiled for MS-DOS, Windows32, and OS/2 using the GNU
+development tools from DJ Delorie (DJGPP; MS-DOS only) or Eberhard
+Mattes (EMX; MS-DOS, Windows32 and OS/2).  Microsoft Visual C/C++ can be used
+to build a Windows32 version, and Microsoft C/C++ can be
+used to build 16-bit versions for MS-DOS and OS/2.
+@c FIXME:
+(As of @command{gawk} 3.1.2, the MSC version doesn't work. However,
+the maintainer is working on fixing it.)
+The file
+@file{README_d/README.pc} in the @command{gawk} distribution contains
+additional notes, and @file{pc/Makefile} contains important information on
+compilation options.
+
+To build @command{gawk} for MS-DOS, Windows32, and OS/2 (16 bit only; for 32 bit
+(EMX) you can use the @command{configure} script and skip the following paragraphs;
+for details see below), copy the files in the @file{pc} directory (@emph{except}
+for @file{ChangeLog}) to the directory with the rest of the @command{gawk}
+sources. The @file{Makefile} contains a configuration section with comments and
+may need to be edited in order to work with your @command{make} utility.
+
+The @file{Makefile} contains a number of targets for building various MS-DOS,
+Windows32, and OS/2 versions. A list of targets is printed if the @command{make}
+command is given without a target. As an example, to build @command{gawk}
+using the DJGPP tools, enter @samp{make djgpp}.
+
+Using @command{make} to run the standard tests and to install @command{gawk}
+requires additional Unix-like tools, including @command{sh}, @command{sed}, and
+@command{cp}. In order to run the tests, the @file{test/*.ok} files may need to
+be converted so that they have the usual DOS-style end-of-line markers. Most
+of the tests work properly with Stewartson's shell along with the
+companion utilities or appropriate GNU utilities.  However, some editing of
+@file{test/Makefile} is required. It is recommended that you copy the file
+@file{pc/Makefile.tst} over the file @file{test/Makefile} as a
+replacement. Details can be found in @file{README_d/README.pc}
+and in the file @file{pc/Makefile.tst}.
+
+The 32 bit EMX version of @command{gawk} works ``out of the box'' under OS/2.
+In principle, it is possible to compile @command{gawk} the following way:
+
+@example
+$ ./configure
+$ make
+@end example
+
+This is not recommended, though. To get an OMF executable you should
+use the following commands at your @command{sh} prompt:
+
+@example
+$ CPPFLAGS="-D__ST_MT_ERRNO__"
+$ export CPPFLAGS
+$ CFLAGS="-O2 -Zomf -Zmt"
+$ export CFLAGS
+$ LDFLAGS="-s -Zcrtdll -Zlinker /exepack:2 -Zlinker /pm:vio -Zstack 0x8000"
+$ export LDFLAGS
+$ RANLIB="echo"
+$ export RANLIB
+$ ./configure --prefix=c:/usr --without-included-gettext
+$ make AR=emxomfar
+@end example
+
+These are just suggestions. You may use any other set of (self-consistent)
+environment variables and compiler flags.
+
+To get an FHS-compliant file hierarchy it is recommended to use the additional
+@command{configure} options @option{--infodir=c:/usr/share/info}, @option{--mandir=c:/usr/share/man}
+and @option{--libexecdir=c:/usr/lib}.
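+
+Put together, such a @command{configure} invocation might look like the
+following sketch, which assumes the same @file{c:/usr} prefix used in the
+example above:
+
+@example
+$ ./configure --prefix=c:/usr --infodir=c:/usr/share/info \
+    --mandir=c:/usr/share/man --libexecdir=c:/usr/lib
+@end example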
+
+The internal @code{gettext} library tends to be problematic. It is therefore recommended
+to use either an external one (@option{--without-included-gettext}) or to disable
+NLS entirely (@option{--disable-nls}).
+
+If you use GCC 2.95 or newer, it is recommended that you also use:
+
+@example
+$ LIBS="-lgcc"
+$ export LIBS
+@end example
+
+You can also get an @code{a.out} executable if you prefer:
+
+@example
+$ CPPFLAGS="-D__ST_MT_ERRNO__"
+$ export CPPFLAGS
+$ CFLAGS="-O2 -Zmt"
+$ export CFLAGS
+$ LDFLAGS="-s -Zstack 0x8000"
+$ export LDFLAGS
+$ LIBS="-lgcc"
+$ export LIBS
+$ unset RANLIB
+$ ./configure --prefix=c:/usr --without-included-gettext
+$ make
+@end example
+
+@strong{Note:} Even if the compiled @command{gawk.exe} (@code{a.out}) executable
+contains a DOS header, it does @emph{not} work under DOS. To compile an executable
+that runs under DOS, @code{"-DPIPES_SIMULATED"} must be added to @env{CPPFLAGS}.
+But then some nonstandard extensions of @command{gawk} (e.g., @samp{|&}) do not work!
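+
+As a sketch, the @code{a.out} build shown earlier could be adapted for DOS
+by changing only the @env{CPPFLAGS} setting before running
+@command{configure}; the remaining settings are assumed to stay as they were:
+
+@example
+$ CPPFLAGS="-D__ST_MT_ERRNO__ -DPIPES_SIMULATED"
+$ export CPPFLAGS
+@end example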
+
+After compilation the internal tests can be performed. Enter
+@samp{make check CMP="diff -a"} at your command prompt. All tests
+but the @code{pid} test are expected to work properly. The @code{pid}
+test fails because child processes are not started by @code{fork()}.
+
+@samp{make install} works as expected.
+
+@strong{Note:} Most OS/2 ports of GNU @command{make} are not able to handle
+the Makefiles of this package. If you encounter any problems with @command{make},
+try GNU Make 3.79.1 or later versions. You should find the latest
+version on @uref{http://www.unixos2.org/sw/pub/binary/make/} or on
+@uref{ftp://hobbes.nmsu.edu/pub/os2/}.
+
+@node PC Dynamic
+@appendixsubsubsec Compiling @command{gawk} For Dynamic Libraries
+
+@c From README_d/README.pcdynamic
+@c 11 June 2003
+
+To compile @command{gawk} with dynamic extension support,
+uncomment the definitions of @code{DYN_FLAGS}, @code{DYN_EXP},
+@code{DYN_OBJ}, and @code{DYN_MAKEXP} in the configuration section of
+the @file{Makefile}. There are two definitions for @code{DYN_MAKEXP}:
+pick the one that matches your target.
+
+To build some of the example extension libraries, @command{cd} to the
+extension directory and copy @file{Makefile.pc} to @file{Makefile}. You
+can then build using the same two targets. To run the example
+@command{awk} scripts, you'll need to either change the call to
+the @code{extension} function to match the name of the library (for
+instance, change @code{"./ordchr.so"} to @code{"ordchr.dll"} or simply
+@code{"ordchr"}), or rename the library to match the call (for instance,
+rename @file{ordchr.dll} to @file{ordchr.so}).
+
+If you build @command{gawk.exe} with one compiler but want to build
+an extension library with the other, you need to copy the import
+library. Visual C uses a library called @file{gawk.lib}, while MinGW uses
+a library called @file{libgawk.a}. These files are equivalent and will
+interoperate if you give them the correct name.  The resulting shared
+libraries are also interoperable.
+
+To create your own extension library, you can use the examples as models,
+but you're essentially on your own. Post to @code{comp.lang.awk} or
+send electronic mail to @email{ptjm@@interlog.com} if you have problems getting
+started. If you need to access functions or variables which are not
+exported by @command{gawk.exe}, add them to @file{gawkw32.def} and
+rebuild. You should also add @code{ATTRIBUTE_EXPORTED} to the declaration
+in @file{awk.h} of any variables you add to @file{gawkw32.def}.
+
+Note that extension libraries have the name of the @command{awk}
+executable embedded in them at link time, so they will work only
+with @command{gawk.exe}. In particular, they won't work if you
+rename @command{gawk.exe} to @command{awk.exe} or if you try to use
+@command{pgawk.exe}. You can perform profiling by temporarily renaming
+@command{pgawk.exe} to @command{gawk.exe}. You can resolve this problem
+by changing the program name in the definition of @code{DYN_MAKEXP}
+for your compiler.
+
+On Windows32, libraries are sought first in the current directory, then in
+the directory containing @command{gawk.exe}, and finally through the
+@env{PATH} environment variable.
+
+@node PC Using
+@appendixsubsubsec Using @command{gawk} on PC Operating Systems
+@c STARTOFRANGE opgawx
+@cindex operating systems, PC, @command{gawk} on
+@c STARTOFRANGE pcgawon
+@cindex PC operating systems, @command{gawk} on
+
+With the exception of the Cygwin environment,
+the @samp{|&} operator and TCP/IP networking
+(@pxref{TCP/IP Networking})
+are not supported for MS-DOS or MS-Windows. EMX (OS/2 only) does support
+at least the @samp{|&} operator.
+
+@cindex search paths
+@cindex @command{gawk}, OS/2 version of
+@cindex @command{gawk}, MS-DOS version of
+@cindex @code{;} (semicolon), @code{AWKPATH} variable and
+@cindex semicolon (@code{;}), @code{AWKPATH} variable and
+@cindex @code{AWKPATH} environment variable
+The OS/2 and MS-DOS versions of @command{gawk} search for program files as
+described in @ref{AWKPATH Variable}.
+However, semicolons (rather than colons) separate elements
+in the @env{AWKPATH} variable. If @env{AWKPATH} is not set or is empty,
+then the default search path for OS/2 (16 bit) and MS-DOS versions is
+@code{@w{".;c:/lib/awk;c:/gnu/lib/awk"}}.
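+
+For instance, in an @command{sh}-like shell the search path could be set
+as follows.  The directory @file{d:/awklib} is purely hypothetical; the
+quotes keep the shell from treating the semicolons as command separators:
+
+@example
+$ AWKPATH=".;d:/awklib;c:/gnu/lib/awk"
+$ export AWKPATH
+@end example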
+
+The search path for OS/2 (32 bit, EMX) is determined by the prefix directory
+(most likely @file{/usr} or @file{c:/usr}) that has been specified as an option of
+the @command{configure} script, as is the case for the Unix versions.
+If @file{c:/usr} is the prefix directory then the default search path contains @file{.}
+and @file{c:/usr/share/awk}.
+Additionally, to support binary distributions of @command{gawk} for OS/2
+systems whose drive @samp{c:} might not support long file names or might not exist
+at all, there is a special environment variable. If @env{UNIXROOT} specifies
+a drive then this specific drive is also searched for program files.
+E.g., if @env{UNIXROOT} is set to @file{e:} the complete default search path is
+@code{@w{".;c:/usr/share/awk;e:/usr/share/awk"}}.
+
+An @command{sh}-like shell (as opposed to @command{command.com} under MS-DOS
+or @command{cmd.exe} under OS/2) may be useful for @command{awk} programming.
+Ian Stewartson has written an excellent shell for MS-DOS and OS/2,
+Daisuke Aoyama has ported GNU @command{bash} to MS-DOS using the DJGPP tools,
+and several shells are available for OS/2, including @command{ksh}.  The file
+@file{README_d/README.pc} in the @command{gawk} distribution contains
+information on these shells.  Users of Stewartson's shell on DOS should
+examine its documentation for handling command lines; in particular,
+the setting for @command{gawk} in the shell configuration may need to be
+changed and the @code{ignoretype} option may also be of interest.
+
+@cindex differences in @command{awk} and @command{gawk}, @code{BINMODE} variable
+@cindex @code{BINMODE} variable
+Under OS/2 and DOS, @command{gawk} (and many other text programs) silently
+translate end-of-line @code{"\r\n"} to @code{"\n"} on input and @code{"\n"}
+to @code{"\r\n"} on output.  A special @code{BINMODE} variable allows
+control over these translations and is interpreted as follows:
+
+@itemize @bullet
+@item
+If @code{BINMODE} is @code{"r"}, or
+@code{(BINMODE & 1)} is nonzero, then
+binary mode is set on read (i.e., no translations on reads).
+
+@item
+If @code{BINMODE} is @code{"w"}, or
+@code{(BINMODE & 2)} is nonzero, then
+binary mode is set on write (i.e., no translations on writes).
+
+@item
+If @code{BINMODE} is @code{"rw"} or @code{"wr"},
+binary mode is set for both read and write
+(same as @code{(BINMODE & 3)}).
+
+@item
+@code{BINMODE=@var{non-null-string}} is
+the same as @samp{BINMODE=3} (i.e., no translations on
+reads or writes).  However, @command{gawk} issues a warning
+message if the string is not one of @code{"rw"} or @code{"wr"}.
+@end itemize
+
+@noindent
+The modes for standard input and standard output are set one time
+only (after the
+command line is read, but before processing any of the @command{awk} program).
+Setting @code{BINMODE} for standard input or
+standard output is accomplished by using an
+appropriate @samp{-v BINMODE=@var{N}} option on the command line.
+@code{BINMODE} is set at the time a file or pipe is opened and cannot be
+changed mid-stream.
+
+The name @code{BINMODE} was chosen to match @command{mawk}
+(@pxref{Other Versions}).
+Both @command{mawk} and @command{gawk} handle @code{BINMODE} similarly; however,
+@command{mawk} adds a @samp{-W BINMODE=@var{N}} option and an environment
+variable that can set @code{BINMODE}, @code{RS}, and @code{ORS}.  The
+files @file{binmode[1-3].awk} (under @file{gnu/lib/awk} in some of the
+prepared distributions) have been chosen to match @command{mawk}'s @samp{-W
+BINMODE=@var{N}} option.  These can be changed or discarded; in particular,
+the setting of @code{RS} giving the fewest ``surprises'' is open to debate.
+@command{mawk} uses @samp{RS = "\r\n"} if binary mode is set on read, which is
+appropriate for files with the DOS-style end-of-line.
+
+To illustrate, the following examples set binary mode on writes for standard
+output and other files, and set @code{ORS} as the ``usual'' DOS-style
+end-of-line:
+
+@example
+gawk -v BINMODE=2 -v ORS="\r\n" @dots{}
+@end example
+
+@noindent
+or:
+
+@example
+gawk -v BINMODE=w -f binmode2.awk @dots{}
+@end example
+
+@noindent
+These give the same result as the @samp{-W BINMODE=2} option in
+@command{mawk}.
+The following changes the record separator to @code{"\r\n"} and sets binary
+mode on reads, but does not affect the mode on standard input:
+
+@example
+gawk -v RS="\r\n" --source "BEGIN @{ BINMODE = 1 @}" @dots{}
+@end example
+
+@noindent
+or:
+
+@example
+gawk -f binmode1.awk @dots{}
+@end example
+
+@noindent
+With proper quoting, in the first example the setting of @code{RS} can be
+moved into the @code{BEGIN} rule.
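+
+One possible form, assuming an @command{sh}-like shell in which single quotes
+protect the whole program text:
+
+@example
+gawk --source 'BEGIN @{ BINMODE = 1; RS = "\r\n" @}' @dots{}
+@end example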
+
+@node Cygwin
+@appendixsubsubsec Using @command{gawk} In The Cygwin Environment
+
+@command{gawk} can be used ``out of the box'' under Windows if you are
+using the Cygwin environment.@footnote{@uref{http://www.cygwin.com}}
+This environment provides an excellent simulation of Unix, using the
+GNU tools, such as @command{bash}, the GNU Compiler Collection (GCC),
+GNU Make, and other GNU tools.  Compilation and installation for Cygwin
+is the same as for a Unix system:
+
+@example
+tar -xvpzf gawk-@value{VERSION}.@value{PATCHLEVEL}.tar.gz
+cd gawk-@value{VERSION}.@value{PATCHLEVEL}
+./configure
+make
+@end example
+
+When compared to GNU/Linux on the same system, the @samp{configure}
+step on Cygwin takes considerably longer.  However, it does finish,
+and then the @samp{make} proceeds as usual.
+
+@strong{Note:} The @samp{|&} operator and TCP/IP networking
+(@pxref{TCP/IP Networking})
+are fully supported in the Cygwin environment.  This is not true
+for any other environment for MS-DOS or MS-Windows.
+
+@node VMS Installation
+@appendixsubsec How to Compile and Install @command{gawk} on VMS
+
+@c based on material from Pat Rankin <rankin@eql.caltech.edu>
+@c now rankin@pactechdata.com
+
+@cindex installation, vms
+This @value{SUBSECTION} describes how to compile and install @command{gawk} under VMS.
+
+@menu
+* VMS Compilation::             How to compile @command{gawk} under VMS.
+* VMS Installation Details::    How to install @command{gawk} under VMS.
+* VMS Running::                 How to run @command{gawk} under VMS.
+* VMS POSIX::                   Alternate instructions for VMS POSIX.
+@end menu
+
+@node VMS Compilation
+@appendixsubsubsec Compiling @command{gawk} on VMS
+
+To compile @command{gawk} under VMS, there is a @code{DCL} command procedure that
+issues all the necessary @code{CC} and @code{LINK} commands. There is
+also a @file{Makefile} for use with the @code{MMS} utility.  From the source
+directory, use either:
+
+@example
+$ @@[.VMS]VMSBUILD.COM
+@end example
+
+@noindent
+or:
+
+@example
+$ MMS/DESCRIPTION=[.VMS]DESCRIP.MMS GAWK
+@end example
+
+Depending upon which C compiler you are using, follow one of the sets
+of instructions in this table:
+
+@table @asis
+@item VAX C V3.x
+Use either @file{vmsbuild.com} or @file{descrip.mms} as is.  These use
+@code{CC/OPTIMIZE=NOLINE}, which is essential for Version 3.0.
+
+@item VAX C V2.x
+You must have Version 2.3 or 2.4; older ones won't work.  Edit either
+@file{vmsbuild.com} or @file{descrip.mms} according to the comments in them.
+For @file{vmsbuild.com}, this just entails removing two @samp{!} delimiters.
+Also edit @file{config.h} (which is a copy of file @file{[.config]vms-conf.h})
+and comment out or delete the two lines @samp{#define __STDC__ 0} and
+@samp{#define VAXC_BUILTINS} near the end.
+
+@item GNU C
+Edit @file{vmsbuild.com} or @file{descrip.mms}; the changes are different
+from those for VAX C V2.x but equally straightforward.  No changes to
+@file{config.h} are needed.
+
+@item DEC C
+Edit @file{vmsbuild.com} or @file{descrip.mms} according to their comments.
+No changes to @file{config.h} are needed.
+@end table
+
+@command{gawk} has been tested under VAX/VMS 5.5-1 using VAX C V3.2, and
+GNU C 1.40 and 2.3.  It should work without modifications for VMS V4.6 and up.
+
+@node VMS Installation Details
+@appendixsubsubsec Installing @command{gawk} on VMS
+
+To install @command{gawk}, all you need is a ``foreign'' command, which is
+a @code{DCL} symbol whose value begins with a dollar sign. For example:
+
+@example
+$ GAWK :== $disk1:[gnubin]GAWK
+@end example
+
+@noindent
+Substitute the actual location of @command{gawk.exe} for
+@samp{$disk1:[gnubin]}. The symbol should be placed in the
+@file{login.com} of any user who wants to run @command{gawk},
+so that it is defined every time the user logs on.
+Alternatively, the symbol may be placed in the system-wide
+@file{sylogin.com} procedure, which allows all users
+to run @command{gawk}.
+
+Optionally, the help entry can be loaded into a VMS help library:
+
+@example
+$ LIBRARY/HELP SYS$HELP:HELPLIB [.VMS]GAWK.HLP
+@end example
+
+@noindent
+(You may want to substitute a site-specific help library rather than
+the standard VMS library @samp{HELPLIB}.)  After loading the help text,
+the command:
+
+@example
+$ HELP GAWK
+@end example
+
+@noindent
+provides information about both the @command{gawk} implementation and the
+@command{awk} programming language.
+
+The logical name @samp{AWK_LIBRARY} can designate a default location
+for @command{awk} program files.  For the @option{-f} option, if the specified
+@value{FN} has no device or directory path information in it, @command{gawk}
+looks in the current directory first, then in the directory specified
+by the translation of @samp{AWK_LIBRARY} if the file is not found.
+If, after searching in both directories, the file still is not found,
+@command{gawk} appends the suffix @samp{.awk} to the filename and retries
+the file search.  If @samp{AWK_LIBRARY} is not defined, that
+portion of the file search fails benignly.
+
+@node VMS Running
+@appendixsubsubsec Running @command{gawk} on VMS
+
+Command-line parsing and quoting conventions are significantly different
+on VMS, so examples in this @value{DOCUMENT} or from other sources often need minor
+changes.  They @emph{are} minor though, and all @command{awk} programs
+should run correctly.
+
+Here are a couple of trivial tests:
+
+@example
+$ gawk -- "BEGIN @{print ""Hello, World!""@}"
+$ gawk -"W" version
+! could also be -"W version" or "-W version"
+@end example
+
+@noindent
+Note that uppercase and mixed-case text must be quoted.
+
+The VMS port of @command{gawk} includes a @code{DCL}-style interface in addition
+to the original shell-style interface (see the help entry for details).
+One side effect of dual command-line parsing is that if there is only a
+single parameter (as in the quoted string program above), the command
+becomes ambiguous.  To work around this, the normally optional @option{--}
+flag is required to force Unix style rather than @code{DCL} parsing.  If any
+other dash-type options (or multiple parameters such as @value{DF}s to
+process) are present, there is no ambiguity and @option{--} can be omitted.
+
+@c @cindex directory search
+@c @cindex path, search
+@cindex search paths
+@cindex search paths, for source files
+The default search path, when looking for @command{awk} program files specified
+by the @option{-f} option, is @code{"SYS$DISK:[],AWK_LIBRARY:"}.  The logical
+name @samp{AWKPATH} can be used to override this default.  The format
+of @samp{AWKPATH} is a comma-separated list of directory specifications.
+When defining it, the value should be quoted so that it retains a single
+translation and not a multitranslation @code{RMS} searchlist.
+
+@node VMS POSIX
+@appendixsubsubsec Building and Using @command{gawk} on VMS POSIX
+
+Ignore the instructions above, although @file{vms/gawk.hlp} should still
+be made available in a help library.  The source tree should be unpacked
+into a container file subsystem rather than into the ordinary VMS filesystem.
+Make sure that the two scripts, @file{configure} and
+@file{vms/posix-cc.sh}, are executable; use @samp{chmod +x} on them if
+necessary.  Then execute the following two commands:
+
+@example
+psx> CC=vms/posix-cc.sh configure
+psx> make CC=c89 gawk
+@end example
+
+@noindent
+The first command constructs files @file{config.h} and @file{Makefile} out
+of templates, using a script to make the C compiler fit @command{configure}'s
+expectations.  The second command compiles and links @command{gawk} using
+the C compiler directly; ignore any warnings from @command{make} about being
+unable to redefine @code{CC}.  @command{configure} takes a very long
+time to execute, but at least it provides incremental feedback as it runs.
+
+This has been tested with VAX/VMS V6.2, VMS POSIX V2.0, and DEC C V5.2.
+
+Once built, @command{gawk} works like any other shell utility.  Unlike
+the normal VMS port of @command{gawk}, no special command-line manipulation is
+needed in the VMS POSIX environment.
+
+@node Unsupported
+@appendixsec Unsupported Operating System Ports
+
+This section describes systems for which
+the @command{gawk} port is no longer supported.
+
+@menu
+* Atari Installation::          Installing @command{gawk} on the Atari ST.
+* Tandem Installation::         Installing @command{gawk} on a Tandem.
+@end menu
+
+@node Atari Installation
+@appendixsubsec Installing @command{gawk} on the Atari ST
+
+The Atari port is no longer supported or actively maintained.  It is
+included for those who might still want to use it.
+
+@c based on material from Michal Jaegermann <michal@gortel.phys.ualberta.ca>
+@cindex atari
+@cindex installation, atari
+There are no substantial differences when installing @command{gawk} on
+various Atari models.  Compiled @command{gawk} executables do not require
+a large amount of memory with most @command{awk} programs, and should run on all
+Motorola processor-based models (referred to below as ST, even if that is not
+exactly right).
+
+In order to use @command{gawk}, you need to have a shell, either text or
+graphics, that does not map all the characters of a command line to
+uppercase.  Maintaining case distinction in option flags is very
+important (@pxref{Options}).
+These days this is the default and it may only be a problem for some
+very old machines.  If your system does not preserve the case of option
+flags, you need to upgrade your tools.  Support for I/O
+redirection is necessary to make it easy to import @command{awk} programs
+from other environments.  Pipes are nice to have but not vital.
+
+@menu
+* Atari Compiling::             Compiling @command{gawk} on Atari.
+* Atari Using::                 Running @command{gawk} on Atari.
+@end menu
+
+@node Atari Compiling
+@appendixsubsubsec Compiling @command{gawk} on the Atari ST
+
+A proper compilation of @command{gawk} sources when @code{sizeof(int)}
+differs from @code{sizeof(void *)} requires an ISO C compiler. An initial
+port was done with @command{gcc}.  You may actually prefer executables
+where @code{int}s are four bytes wide but the other variant works as well.
+
+You may need quite a bit of memory when trying to recompile the @command{gawk}
+sources, as some source files (@file{regex.c} in particular) are quite
+big.  If you run out of memory compiling such a file, try reducing the
+optimization level for this particular file, which may help.
+
+@cindex Linux
+@cindex GNU/Linux
+With a reasonable shell (@command{bash} will do), you have a pretty good chance
+that the @command{configure} utility will succeed, particularly if
+you run GNU/Linux, MiNT or a similar operating system.  Otherwise
+sample versions of @file{config.h} and @file{Makefile.st} are given in the
+@file{atari} subdirectory and can be edited and copied to the
+corresponding files in the main source directory.  Even if
+@command{configure} produces something, it might be advisable to compare
+its results with the sample versions and possibly make adjustments.
+
+Some @command{gawk} source code fragments depend on a preprocessor define
+@samp{atarist}.  This basically assumes the TOS environment with @command{gcc}.
+Modify these sections as appropriate if they are not right for your
+environment.  Also see the remarks about @env{AWKPATH} and @code{envsep} in
+@ref{Atari Using}.
+
+As shipped, the sample @file{config.h} claims that the @code{system}
+function is missing from the libraries, which is not true, and an
+alternative implementation of this function is provided in
+@file{unsupported/atari/system.c}.
+Depending upon your particular combination of
+shell and operating system, you might want to change the file to indicate
+that @code{system} is available.
+
+@node Atari Using
+@appendixsubsubsec Running @command{gawk} on the Atari ST
+
+An executable version of @command{gawk} should be placed, as usual,
+anywhere in your @env{PATH} where your shell can find it.
+
+While executing, the Atari version of @command{gawk} creates a number of temporary files.  When
+using @command{gcc} libraries for TOS, @command{gawk} looks for either of
+the environment variables, @env{TEMP} or @env{TMPDIR}, in that order.
+If either one is found, its value is assumed to be a directory for
+temporary files.  This directory must exist, and if you can spare the
+memory, it is a good idea to put it on a RAM drive.  If neither
+@env{TEMP} nor @env{TMPDIR} is found, then @command{gawk} uses the
+current directory for its temporary files.
+
+The ST version of @command{gawk} searches for its program files, as described in
+@ref{AWKPATH Variable}.
+The default value for the @env{AWKPATH} variable is taken from
+@code{DEFPATH} defined in @file{Makefile}. The sample @command{gcc}/TOS
+@file{Makefile} for the ST in the distribution sets @code{DEFPATH} to
+@code{@w{".,c:\lib\awk,c:\gnu\lib\awk"}}.  The search path can be
+modified by explicitly setting @env{AWKPATH} to whatever you want.
+Note that colons cannot be used on the ST to separate elements in the
+@env{AWKPATH} variable, since they have another reserved meaning.
+Instead, you must use a comma to separate elements in the path.  When
+recompiling, the separating character can be modified by initializing
+the @code{envsep} variable in @file{unsupported/atari/gawkmisc.atr} to another
+value.
+
+Although @command{awk} allows great flexibility in doing I/O redirections
+from within a program, this facility should be used with care on the ST
+running under TOS.  In some circumstances, the OS routines for file-handle
+pool processing lose track of certain events, causing the
+computer to crash and requiring a reboot.  Often a warm reboot is
+sufficient.  Fortunately, this happens infrequently and in rather
+esoteric situations.  In particular, avoid having one part of an
+@command{awk} program using @code{print} statements explicitly redirected
+to @file{/dev/stdout}, while other @code{print} statements use the
+default standard output, and a calling shell has redirected standard
+output to a file.
+@c 10/2000: Is this still true, now that gawk does /dev/stdout internally?
+
+When @command{gawk} is compiled with the ST version of @command{gcc} and its
+usual libraries, it accepts both @samp{/} and @samp{\} as path separators.
+While this is convenient, it should be remembered that this removes one
+technically valid character (@samp{/}) from your @value{FN}.
+It may also create problems for external programs called via the @code{system}
+function, which may not support this convention.  Whenever it is possible
+that a file created by @command{gawk} will be used by some other program,
+use only backslashes.  Also remember that in @command{awk}, backslashes in
+strings have to be doubled in order to get literal backslashes
+(@pxref{Escape Sequences}).
+
+@node Tandem Installation
+@appendixsubsec Installing @command{gawk} on a Tandem
+@cindex tandem
+@cindex installation, tandem
+
+The Tandem port is only minimally supported.
+The port's contributor no longer has access to a Tandem system.
+
+@c This section based on README.Tandem by Stephen Davies (scldad@sdc.com.au)
+The Tandem port was done on a Cyclone machine running D20.
+The port is pretty clean and all facilities seem to work except for
+the I/O piping facilities
+(@pxref{Getline/Pipe},
+@ref{Getline/Variable/Pipe},
+and
+@ref{Redirection}),
+which are just too foreign a concept for Tandem.
+
+To build a Tandem executable from source, download all of the files so
+that the @value{FN}s on the Tandem box conform to the restrictions of D20.
+For example, @file{array.c} becomes @file{ARRAYC}, and @file{awk.h}
+becomes @file{AWKH}.  The totally Tandem-specific files are in the
+@file{tandem} ``subvolume'' (@file{unsupported/tandem} in the @command{gawk}
+distribution) and should be copied to the main source directory before
+building @command{gawk}.
+
+The file @file{compit} can then be used to compile and bind an executable.
+Alas, there is no @command{configure} or @command{make}.
+
+Usage is the same as for Unix, except that D20 requires all @samp{@{} and
+@samp{@}} characters to be escaped with @samp{~} on the command line
+(but @emph{not} in script files). Also, the standard Tandem syntax for
+@samp{/in filename,out filename/} must be used instead of the usual
+Unix @samp{<} and @samp{>} for file redirection.  (Redirection options
+on @code{getline}, @code{print} etc., are supported.)
+
+The @samp{-mr @var{val}} option
+(@pxref{Options})
+has been ``stolen'' to enable Tandem users to process fixed-length
+records with no ``end-of-line'' character. That is, @samp{-mr 74} tells
+@command{gawk} to read the input file as fixed 74-byte records.
+@c ENDOFRANGE opgawx
+@c ENDOFRANGE pcgawon
+
+@node Bugs
+@appendixsec Reporting Problems and Bugs
+@cindex archeologists
+@quotation
+@i{There is nothing more dangerous than a bored archeologist.}@*
+The Hitchhiker's Guide to the Galaxy
+@end quotation
+@c the radio show, not the book. :-)
+
+@c STARTOFRANGE dbugg
+@cindex debugging @command{gawk}, bug reports
+@c STARTOFRANGE tblgawb
+@cindex troubleshooting, @command{gawk}, bug reports
+If you have problems with @command{gawk} or think that you have found a bug,
+please report it to the developers; we cannot promise to do anything
+but we might well want to fix it.
+
+Before reporting a bug, make sure you have actually found a real bug.
+Carefully reread the documentation and see if it really says you can do
+what you're trying to do.  If it's not clear whether you should be able
+to do something or not, report that too; it's a bug in the documentation!
+
+Before reporting a bug or trying to fix it yourself, try to isolate it
+to the smallest possible @command{awk} program and input @value{DF} that
+reproduces the problem.  Then send us the program and @value{DF},
+some idea of what kind of Unix system you're using,
+the compiler you used to compile @command{gawk}, and the exact results
+@command{gawk} gave you.  Also say what you expected to occur; this helps
+us decide whether the problem is really in the documentation.
+
+@cindex @code{bug-gawk@@gnu.org} bug reporting address
+@cindex email address for bug reports, @code{bug-gawk@@gnu.org}
+@cindex bug reports, email address, @code{bug-gawk@@gnu.org}
+Once you have a precise problem, send email to @email{bug-gawk@@gnu.org}.
+
+@cindex Robbins, Arnold
+Please include the version number of @command{gawk} you are using.
+You can get this information with the command @samp{gawk --version}.
+Using this address automatically sends a carbon copy of your
+mail to me.  If necessary, I can be reached directly at
+@email{arnold@@gnu.org}.  The bug reporting address is preferred since the
+email list is archived at the GNU Project.
+@emph{All email should be in English, since that is my native language.}
+
+@cindex @code{comp.lang.awk} newsgroup
+@strong{Caution:} Do @emph{not} try to report bugs in @command{gawk} by
+posting to the Usenet/Internet newsgroup @code{comp.lang.awk}.
+While the @command{gawk} developers do occasionally read this newsgroup,
+there is no guarantee that we will see your posting.  The steps described
+above are the officially recognized ways for reporting bugs.
+
+Non-bug suggestions are always welcome as well.  If you have questions
+about things that are unclear in the documentation or are just obscure
+features, ask me; I will try to help you out, although I
+may not have the time to fix the problem.  You can send me electronic
+mail at the Internet address noted previously.
+
+If you find bugs in one of the non-Unix ports of @command{gawk}, please send
+an electronic mail message to the person who maintains that port.  They
+are named in the following list, as well as in the @file{README} file in the @command{gawk}
+distribution.  Information in the @file{README} file should be considered
+authoritative if it conflicts with this @value{DOCUMENT}.
+
+The people maintaining the non-Unix ports of @command{gawk} are
+as follows:
+
+@ignore
+@table @asis
+@cindex Fish, Fred
+@item Amiga
+Fred Fish, @email{fnf@@ninemoons.com}.
+
+@cindex Brown, Martin
+@item BeOS
+Martin Brown, @email{mc@@whoever.com}.
+
+@cindex Deifik, Scott
+@cindex Hankerson, Darrel
+@item MS-DOS
+Scott Deifik, @email{scottd@@amgen.com} and
+Darrel Hankerson, @email{hankedr@@mail.auburn.edu}.
+
+@cindex Grigera, Juan
+@item MS-Windows
+Juan Grigera, @email{juan@@biophnet.unlp.edu.ar}.
+
+@item OS/2
+The Unix for OS/2 team, @email{gawk-maintainer@@unixos2.org}.
+
+@cindex Davies, Stephen
+@item Tandem
+Stephen Davies, @email{scldad@@sdc.com.au}.
+
+@cindex Rankin, Pat
+@item VMS
+Pat Rankin, @email{rankin@@pactechdata.com}.
+@end table
+@end ignore
+
+@multitable {MS-Windows} {123456789012345678901234567890123456789001234567890}
+@cindex Fish, Fred
+@item Amiga @tab Fred Fish, @email{fnf@@ninemoons.com}.
+
+@cindex Brown, Martin
+@item BeOS @tab Martin Brown, @email{mc@@whoever.com}.
+
+@cindex Deifik, Scott
+@cindex Hankerson, Darrel
+@item MS-DOS @tab Scott Deifik, @email{scottd@@amgen.com} and
+Darrel Hankerson, @email{hankedr@@mail.auburn.edu}.
+
+@cindex Grigera, Juan
+@item MS-Windows @tab Juan Grigera, @email{juan@@biophnet.unlp.edu.ar}.
+
+@item OS/2 @tab The Unix for OS/2 team, @email{gawk-maintainer@@unixos2.org}.
+
+@cindex Davies, Stephen
+@item Tandem @tab Stephen Davies, @email{scldad@@sdc.com.au}.
+
+@cindex Rankin, Pat
+@item VMS @tab Pat Rankin, @email{rankin@@pactechdata.com}.
+@end multitable
+
+If your bug is also reproducible under Unix, please send a copy of your
+report to the @email{bug-gawk@@gnu.org} email list as well.
+@c ENDOFRANGE dbugg
+@c ENDOFRANGE tblgawb
+
+@node Other Versions
+@appendixsec Other Freely Available @command{awk} Implementations
+@c STARTOFRANGE awkim
+@cindex @command{awk}, implementations
+@ignore
+From: emory!amc.com!brennan (Michael Brennan)
+Subject: C++ comments in awk programs
+To: arnold@gnu.ai.mit.edu (Arnold Robbins)
+Date: Wed, 4 Sep 1996 08:11:48 -0700 (PDT)
+
+@end ignore
+@cindex Brennan, Michael
+@quotation
+@i{It's kind of fun to put comments like this in your awk code.}@*
+@ @ @ @ @ @ @code{// Do C++ comments work? answer: yes! of course}@*
+Michael Brennan
+@end quotation
+
+There are several other freely available @command{awk} implementations.
+This @value{SECTION} briefly describes where to get them:
+
+@table @asis
+@cindex Kernighan, Brian
+@cindex source code, Bell Laboratories @command{awk}
+@item Unix @command{awk}
+Brian Kernighan has made his implementation of
+@command{awk} freely available.
+You can retrieve this version via the World Wide Web from
+his home page.@footnote{@uref{http://cm.bell-labs.com/who/bwk}}
+It is available in several archive formats:
+
+@table @asis
+@item Shell archive
+@uref{http://cm.bell-labs.com/who/bwk/awk.shar}
+
+@item Compressed @command{tar} file
+@uref{http://cm.bell-labs.com/who/bwk/awk.tar.gz}
+
+@item Zip file
+@uref{http://cm.bell-labs.com/who/bwk/awk.zip}
+@end table
+
+This version requires an ISO C (1990 standard) compiler;
+the C compiler from
+GCC (the GNU Compiler Collection)
+works quite nicely.
+
+@xref{BTL},
+for a list of extensions in this @command{awk} that are not in POSIX @command{awk}.
+
+@cindex Brennan, Michael
+@cindex @command{mawk} program
+@cindex source code, @command{mawk}
+@item @command{mawk}
+Michael Brennan has written an independent implementation of @command{awk},
+called @command{mawk}.  It is available under the GPL
+(@pxref{Copying}),
+just as @command{gawk} is.
+
+You can get it via anonymous @command{ftp} to the host
+@code{@w{ftp.whidbey.net}}.  Change directory to @file{/pub/brennan}.
+Use ``binary'' or ``image'' mode, and retrieve @file{mawk1.3.3.tar.gz}
+(or the latest version that is there).
+
+@command{gunzip} may be used to decompress this file. Installation
+is similar to @command{gawk}'s
+(@pxref{Unix Installation}).
+
+@cindex extensions, @command{mawk}
+@command{mawk} has the following extensions that are not in POSIX @command{awk}:
+
+@itemize @bullet
+@item
+The @code{fflush} built-in function for flushing buffered output
+(@pxref{I/O Functions}).
+
+@item
+The @samp{**} and @samp{**=} operators
+(@pxref{Arithmetic Ops}
+and also see
+@ref{Assignment Ops}).
+
+@item
+The use of @code{func} as an abbreviation for @code{function}
+(@pxref{Definition Syntax}).
+
+@item
+The @samp{\x} escape sequence
+(@pxref{Escape Sequences}).
+
+@item
+The @file{/dev/stdout} and @file{/dev/stderr}
+special files
+(@pxref{Special Files}).
+Use @code{"-"} instead of @code{"/dev/stdin"} with @command{mawk}.
+
+@item
+The ability for @code{FS} and for the third
+argument to @code{split} to be null strings
+(@pxref{Single Character Fields}).
+
+@item
+The ability to delete all of an array at once with @samp{delete @var{array}}
+(@pxref{Delete}).
+
+@item
+The ability for @code{RS} to be a regexp
+(@pxref{Records}).
+
+@item
+The @code{BINMODE} special variable for non-Unix operating systems
+(@pxref{PC Using}).
+@end itemize
+
+The next version of @command{mawk} will support @code{nextfile}.
+
+@cindex Sumner, Andrew
+@cindex @command{awka} compiler for @command{awk}
+@cindex source code, @command{awka}
+@item @command{awka}
+Written by Andrew Sumner,
+@command{awka} translates @command{awk} programs into C, compiles them,
+and links them with a library of functions that provides the core
+@command{awk} functionality.
+It also has a number of extensions.
+
+The @command{awk} translator is released under the GPL, and the library
+is under the LGPL.
+
+To get @command{awka}, go to @uref{http://awka.sourceforge.net}.
+You can reach Andrew Sumner at @email{andrew@@zbcom.net}.
+
+@cindex Beebe, Nelson H.F.
+@cindex @command{pawk} profiling Bell Labs @command{awk}
+@item @command{pawk}
+Nelson H.F.@: Beebe at the University of Utah has modified
+the Bell Labs @command{awk} to provide timing and profiling information.
+It is different from @command{pgawk}
+(@pxref{Profiling}),
+in that it uses CPU-based profiling, not line-count
+profiling.  You may find it at either
+@uref{ftp://ftp.math.utah.edu/pub/pawk/pawk-20020210.tar.gz}
+or
+@uref{http://www.math.utah.edu/pub/pawk/pawk-20020210.tar.gz}.
+
+@end table
+@c ENDOFRANGE gligawk
+@c ENDOFRANGE ingawk
+@c ENDOFRANGE awkim
+
+@node Notes
+@appendix Implementation Notes
+@c STARTOFRANGE gawii
+@cindex @command{gawk}, implementation issues
+@c STARTOFRANGE impis
+@cindex implementation issues, @command{gawk}
+
+This appendix contains information mainly of interest to implementors and
+maintainers of @command{gawk}.  Everything in it applies specifically to
+@command{gawk} and not to other implementations.
+
+@menu
+* Compatibility Mode::          How to disable certain @command{gawk}
+                                extensions.
+* Additions::                   Making Additions To @command{gawk}.
+* Dynamic Extensions::          Adding new built-in functions to
+                                @command{gawk}.
+* Future Extensions::           New features that may be implemented one day.
+@end menu
+
+@node Compatibility Mode
+@appendixsec Downward Compatibility and Debugging
+@cindex @command{gawk}, implementation issues, downward compatibility
+@cindex @command{gawk}, implementation issues, debugging
+@cindex troubleshooting, @command{gawk}
+@c first comma is part of primary
+@cindex implementation issues, @command{gawk}, debugging
+
+@xref{POSIX/GNU},
+for a summary of the GNU extensions to the @command{awk} language and program.
+All of these features can be turned off by invoking @command{gawk} with the
+@option{--traditional} option or with the @option{--posix} option.
+
+If @command{gawk} is compiled for debugging with @samp{-DDEBUG}, then there
+is one more option available on the command line:
+
+@table @code
+@item -W parsedebug
+@itemx --parsedebug
+Prints out the parse stack information as the program is being parsed.
+@end table
+
+This option is intended only for serious @command{gawk} developers
+and not for the casual user.  It probably has not even been compiled into
+your version of @command{gawk}, since it slows down execution.
+
+@node Additions
+@appendixsec Making Additions to @command{gawk}
+
+If you find that you want to enhance @command{gawk} in a significant
+fashion, you are perfectly free to do so.  That is the point of having
+free software; the source code is available and you are free to change
+it as you want (@pxref{Copying}).
+
+This @value{SECTION} discusses the ways you might want to change @command{gawk}
+as well as any considerations you should bear in mind.
+
+@menu
+* Adding Code::                 Adding code to the main body of
+                                @command{gawk}.
+* New Ports::                   Porting @command{gawk} to a new operating
+                                system.
+@end menu
+
+@node Adding Code
+@appendixsubsec Adding New Features
+
+@c STARTOFRANGE adfgaw
+@cindex adding, features to @command{gawk}
+@c STARTOFRANGE fadgaw
+@cindex features, adding to @command{gawk}
+@c STARTOFRANGE gawadf
+@cindex @command{gawk}, features, adding
+You are free to add any new features you like to @command{gawk}.
+However, if you want your changes to be incorporated into the @command{gawk}
+distribution, there are several steps that you need to take in order to
+make it possible for me to include your changes:
+
+@enumerate 1
+@item
+Before building the new feature into @command{gawk} itself,
+consider writing it as an extension module
+(@pxref{Dynamic Extensions}).
+If that's not possible, continue with the rest of the steps in this list.
+
+@item
+Get the latest version.
+It is much easier for me to integrate changes if they are relative to
+the most recent distributed version of @command{gawk}.  If your version of
+@command{gawk} is very old, I may not be able to integrate them at all.
+(@xref{Getting},
+for information on getting the latest version of @command{gawk}.)
+
+@item
+@ifnotinfo
+Follow the @cite{GNU Coding Standards}.
+@end ifnotinfo
+@ifinfo
+See @inforef{Top, , Version, standards, GNU Coding Standards}.
+@end ifinfo
+This document describes how GNU software should be written. If you haven't
+read it, please do so, preferably @emph{before} starting to modify @command{gawk}.
+(The @cite{GNU Coding Standards} are available from
+the GNU Project's
+@command{ftp}
+site, at
+@uref{ftp://ftp.gnu.org/gnu/GNUinfo/standards.text}.
+An HTML version, suitable for reading with a WWW browser, is
+available at
+@uref{http://www.gnu.org/prep/standards_toc.html}.
+Texinfo, Info, and DVI versions are also available.)
+
+@cindex @command{gawk}, coding style in
+@item
+Use the @command{gawk} coding style.
+The C code for @command{gawk} follows the instructions in the
+@cite{GNU Coding Standards}, with minor exceptions.  The code is formatted
+using the traditional ``K&R'' style, particularly as regards the placement
+of braces and the use of tabs.  In brief, the coding rules for @command{gawk}
+are as follows:
+
+@itemize @bullet
+@item
+Use ANSI/ISO style (prototype) function headers when defining functions.
+
+@item
+Put the name of the function at the beginning of its own line.
+
+@item
+Put the return type of the function, even if it is @code{int}, on the
+line above the line with the name and arguments of the function.
+
+@item
+Put spaces around parentheses used in control structures
+(@code{if}, @code{while}, @code{for}, @code{do}, @code{switch},
+and @code{return}).
+
+@item
+Do not put spaces in front of parentheses used in function calls.
+
+@item
+Put spaces around all C operators and after commas in function calls.
+
+@item
+Do not use the comma operator to produce multiple side effects, except
+in @code{for} loop initialization and increment parts, and in macro bodies.
+
+@item
+Use real tabs for indenting, not spaces.
+
+@item
+Use the ``K&R'' brace layout style.
+
+@item
+Use comparisons against @code{NULL} and @code{'\0'} in the conditions of
+@code{if}, @code{while}, and @code{for} statements, as well as in the @code{case}s
+of @code{switch} statements, instead of just the
+plain pointer or character value.
+
+@item
+Use the @code{TRUE}, @code{FALSE} and @code{NULL} symbolic constants
+and the character constant @code{'\0'} where appropriate, instead of @code{1}
+and @code{0}.
+
+@item
+Use the @code{ISALPHA}, @code{ISDIGIT}, etc.@: macros, instead of the
+traditional lowercase versions; these macros are better behaved for
+non-ASCII character sets.
+
+@item
+Provide one-line descriptive comments for each function.
+
+@item
+Do not use @samp{#elif}. Many older Unix C compilers cannot handle it.
+
+@item
+Do not use the @code{alloca} function for allocating memory off the stack.
+Its use causes more portability trouble than the minor benefit of not having
+to free the storage is worth. Instead, use @code{malloc} and @code{free}.
+@end itemize
+
+@strong{Note:}
+If I have to reformat your code to follow the coding style used in
+@command{gawk}, I may not bother to integrate your changes at all.
+
+@item
+Be prepared to sign the appropriate paperwork.
+In order for the FSF to distribute your changes, you must either place
+those changes in the public domain and submit a signed statement to that
+effect, or assign the copyright in your changes to the FSF.
+Both of these actions are easy to do and @emph{many} people have done so
+already. If you have questions, please contact me
+(@pxref{Bugs}),
+or @email{gnu@@gnu.org}.
+
+@cindex Texinfo
+@item
+Update the documentation.
+Along with your new code, please supply new sections and/or chapters
+for this @value{DOCUMENT}.  If at all possible, please use real
+Texinfo, instead of just supplying unformatted ASCII text (although
+even that is better than no documentation at all).
+Conventions to be followed in @cite{@value{TITLE}} are provided
+after the @samp{@@bye} at the end of the Texinfo source file.
+If possible, please update the @command{man} page as well.
+
+You will also have to sign paperwork for your documentation changes.
+
+@item
+Submit changes as context diffs or unified diffs.
+Use @samp{diff -c -r -N} or @samp{diff -u -r -N} to compare
+the original @command{gawk} source tree with your version.
+(I find context diffs to be more readable but unified diffs are
+more compact.)
+I recommend using the GNU version of @command{diff}.
+Send the output produced by either run of @command{diff} to me when you
+submit your changes.
+(@xref{Bugs}, for the electronic mail
+information.)
+
+Using this format makes it easy for me to apply your changes to the
+master version of the @command{gawk} source code (using @code{patch}).
+If I have to apply the changes manually, using a text editor, I may
+not do so, particularly if there are lots of changes.
+
+@item
+Include an entry for the @file{ChangeLog} file with your submission.
+This helps further minimize the amount of work I have to do,
+making it easier for me to accept patches.
+@end enumerate
+
+Although this sounds like a lot of work, please remember that while you
+may write the new code, I have to maintain it and support it. If it
+isn't possible for me to do that with a minimum of extra work, then I
+probably will not.
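+
+As a minimal sketch of the coding rules listed earlier in this
+@value{SECTION}, the following small C program shows the function header
+layout, brace placement, and explicit comparisons described there.
+The function @code{count_blanks} is hypothetical and is not part of the
+@command{gawk} sources; it exists only to illustrate the layout:
+
+@example
+#include <stdio.h>
+
+/* count_blanks --- count the space characters in a string */
+
+static int
+count_blanks(const char *s)
+@{
+	int n = 0;
+
+	if (s == NULL)
+		return 0;
+
+	for (; *s != '\0'; s++) @{
+		if (*s == ' ')
+			n++;
+	@}
+	return n;
+@}
+
+/* main --- exercise count_blanks on a sample string */
+
+int
+main(void)
+@{
+	printf("%d\n", count_blanks("a b c"));
+	return 0;
+@}
+@end example
+
+@noindent
+Note the return type on its own line, the function name at the start of
+the following line, tab indentation, and the comparisons against
+@code{NULL} and @code{'\0'} rather than bare pointer or character tests.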
+@c ENDOFRANGE adfgaw
+@c ENDOFRANGE gawadf
+@c ENDOFRANGE fadgaw
+
+@node New Ports
+@appendixsubsec Porting @command{gawk} to a New Operating System
+@cindex portability, @command{gawk}
+@cindex operating systems, porting @command{gawk} to
+
+@cindex porting @command{gawk}
+If you want to port @command{gawk} to a new operating system, there are
+several steps:
+
+@enumerate 1
+@item
+Follow the guidelines in
+@ifinfo
+@ref{Adding Code},
+@end ifinfo
+@ifnotinfo
+the previous @value{SECTION}
+@end ifnotinfo
+concerning coding style, submission of diffs, and so on.
+
+@item
+When doing a port, bear in mind that your code must coexist peacefully
+with the rest of @command{gawk} and the other ports. Avoid gratuitous
+changes to the system-independent parts of the code. If at all possible,
+avoid sprinkling @samp{#ifdef}s just for your port throughout the
+code.
+
+If the changes needed for a particular system affect too much of the
+code, I probably will not accept them.  In such a case, you can, of course,
+distribute your changes on your own, as long as you comply
+with the GPL
+(@pxref{Copying}).
+
+@item
+A number of the files that come with @command{gawk} are maintained by other
+people at the Free Software Foundation.  Thus, you should not change them
+unless it is for a very good reason; i.e., changes are not out of the
+question, but changes to these files are scrutinized extra carefully.
+The files are @file{getopt.h}, @file{getopt.c},
+@file{getopt1.c}, @file{regex.h}, @file{regex.c}, @file{dfa.h},
+@file{dfa.c}, @file{install-sh}, and @file{mkinstalldirs}.
+
+@item
+Be willing to continue to maintain the port.
+Non-Unix operating systems are supported by volunteers who maintain
+the code needed to compile and run @command{gawk} on their systems. If no one
+volunteers to maintain a port, it becomes unsupported and it may
+be necessary to remove it from the distribution.
+
+@item
+Supply an appropriate @file{gawkmisc.???} file.
+Each port has its own @file{gawkmisc.???} that implements certain
+operating system specific functions. This is cleaner than a plethora of
+@samp{#ifdef}s scattered throughout the code.  The @file{gawkmisc.c} in
+the main source directory includes the appropriate
+@file{gawkmisc.???} file from each subdirectory.
+Be sure to update it as well.
+
+Each port's @file{gawkmisc.???} file has a suffix reminiscent of the machine
+or operating system for the port---for example, @file{pc/gawkmisc.pc} and
+@file{vms/gawkmisc.vms}. The use of separate suffixes, instead of plain
+@file{gawkmisc.c}, makes it possible to move files from a port's subdirectory
+into the main subdirectory, without accidentally destroying the real
+@file{gawkmisc.c} file.  (Currently, this is only an issue for the
+PC operating system ports.)
+
+@item
+Supply a @file{Makefile} as well as any other C source and header files that are
+necessary for your operating system.  All your code should be in a
+separate subdirectory, with a name that is the same as, or reminiscent
+of, either your operating system or the computer system.  If possible,
+try to structure things so that it is not necessary to move files out
+of the subdirectory into the main source directory.  If that is not
+possible, then be sure to avoid using names for your files that
+duplicate the names of files in the main source directory.
+
+@item
+Update the documentation.
+Please write a section (or sections) for this @value{DOCUMENT} describing the
+installation and compilation steps needed to compile and/or install
+@command{gawk} for your system.
+
+@item
+Be prepared to sign the appropriate paperwork.
+In order for the FSF to distribute your code, you must either place
+your code in the public domain and submit a signed statement to that
+effect, or assign the copyright in your code to the FSF.
+@ifinfo
+Both of these actions are easy to do and @emph{many} people have done so
+already. If you have questions, please contact me, or
+@email{gnu@@gnu.org}.
+@end ifinfo
+@end enumerate
+
+Following these steps makes it much easier to integrate your changes
+into @command{gawk} and have them coexist happily with other
+operating systems' code that is already there.
+
+In the code that you supply and maintain, feel free to use a
+coding style and brace layout that suits your taste.
+
+@node Dynamic Extensions
+@appendixsec Adding New Built-in Functions to @command{gawk}
+@cindex Robinson, Will
+@cindex robot, the
+@cindex Lost In Space
+@quotation
+@i{Danger Will Robinson!  Danger!!@*
+Warning! Warning!}@*
+The Robot
+@end quotation
+
+@c STARTOFRANGE gladfgaw
+@cindex @command{gawk}, functions, adding
+@c STARTOFRANGE adfugaw
+@cindex adding, functions to @command{gawk}
+@c STARTOFRANGE fubadgaw
+@cindex functions, built-in, adding to @command{gawk}
+Beginning with @command{gawk} 3.1, it is possible to add new built-in
+functions to @command{gawk} using dynamically loaded libraries. This
+facility is available on systems (such as GNU/Linux) that support
+the @code{dlopen} and @code{dlsym} functions.
+This @value{SECTION} describes how to write and use dynamically
+loaded extensions for @command{gawk}.
+Experience with programming in
+C or C++ is necessary when reading this @value{SECTION}.
+
+@strong{Caution:} The facilities described in this @value{SECTION}
+are very much subject to change in the next @command{gawk} release.
+Be aware that you may have to re-do everything, perhaps from scratch,
+upon the next release.
+
+@menu
+* Internals::                   A brief look at some @command{gawk} internals.
+* Sample Library::              An example of new functions.
+@end menu
+
+@node Internals
+@appendixsubsec A Minimal Introduction to @command{gawk} Internals
+@c STARTOFRANGE gawint
+@cindex @command{gawk}, internals
+
+The truth is that @command{gawk} was not designed for simple extensibility.
+The facilities for adding functions using shared libraries work, but
+are something of a ``bag on the side.''  Thus, this tour is
+brief and simplistic; would-be @command{gawk} hackers are encouraged to
+spend some time reading the source code before trying to write
+extensions based on the material presented here.  Of particular note
+are the files @file{awk.h}, @file{builtin.c}, and @file{eval.c}.
+Reading @file{awk.y} in order to see how the parse tree is built
+would also be of use.
+
+@cindex @code{awk.h} file (internal)
+With the disclaimers out of the way, the following types, structure
+members, functions, and macros are declared in @file{awk.h} and are of
+use when writing extensions.  The next @value{SECTION}
+shows how they are used:
+
+@table @code
+@cindex floating-point, numbers, @code{AWKNUM} internal type
+@cindex numbers, floating-point, @code{AWKNUM} internal type
+@cindex @code{AWKNUM} internal type
+@item AWKNUM
+An @code{AWKNUM} is the internal type of @command{awk}
+floating-point numbers.  Typically, it is a C @code{double}.
+
+@cindex @code{NODE} internal type
+@cindex strings, @code{NODE} internal type
+@cindex numbers, @code{NODE} internal type
+@item NODE
+Just about everything is done using objects of type @code{NODE}.
+These contain both strings and numbers, as well as variables and arrays.
+
+@cindex @code{force_number} internal function
+@cindex numeric, values
+@item AWKNUM force_number(NODE *n)
+This macro forces a value to be numeric. It returns the actual
+numeric value contained in the node.
+It may end up calling an internal @command{gawk} function.
+
+@cindex @code{force_string} internal function
+@item void force_string(NODE *n)
+This macro guarantees that a @code{NODE}'s string value is current.
+It may end up calling an internal @command{gawk} function.
+It also guarantees that the string is zero-terminated.
+
+@c comma is part of primary
+@cindex parameters, number of
+@cindex @code{param_cnt} internal variable
+@item n->param_cnt
+The number of parameters actually passed in a function call at runtime.
+
+@cindex @code{stptr} internal variable
+@cindex @code{stlen} internal variable
+@item n->stptr
+@itemx n->stlen
+The data and length of a @code{NODE}'s string value, respectively.
+The string is @emph{not} guaranteed to be zero-terminated.
+If you need to pass the string value to a C library function, save
+the value in @code{n->stptr[n->stlen]}, assign @code{'\0'} to it,
+call the routine, and then restore the value.
+
+@cindex @code{type} internal variable
+@item n->type
+The type of the @code{NODE}. This is a C @code{enum}. Values should
+be either @code{Node_var} or @code{Node_var_array} for function
+parameters.
+
+@cindex @code{vname} internal variable
+@item n->vname
+The ``variable name'' of a node.  This is not of much use inside
+externally written extensions.
+
+@cindex arrays, associative, clearing
+@cindex @code{assoc_clear} internal function
+@item void assoc_clear(NODE *n)
+Clears the associative array pointed to by @code{n}.
+Make sure that @samp{n->type == Node_var_array} first.
+
+@cindex arrays, elements, installing
+@cindex @code{assoc_lookup} internal function
+@item NODE **assoc_lookup(NODE *symbol, NODE *subs, int reference)
+Finds, and installs if necessary, array elements.
+@code{symbol} is the array, @code{subs} is the subscript.
+This is usually a value created with @code{tmp_string} (see below).
+@code{reference} should be @code{TRUE} if it is an error to use the
+value before it is created. Typically, @code{FALSE} is the
+correct value to use from extension functions.
+
+@cindex strings
+@cindex @code{make_string} internal function
+@item NODE *make_string(char *s, size_t len)
+Take a C string and turn it into a pointer to a @code{NODE} that
+can be stored appropriately.  This is permanent storage; understanding
+of @command{gawk} memory management is helpful.
+
+@cindex numbers
+@cindex @code{make_number} internal function
+@item NODE *make_number(AWKNUM val)
+Take an @code{AWKNUM} and turn it into a pointer to a @code{NODE} that
+can be stored appropriately.  This is permanent storage; understanding
+of @command{gawk} memory management is helpful.
+
+@cindex @code{tmp_string} internal function
+@item NODE *tmp_string(char *s, size_t len)
+Take a C string and turn it into a pointer to a @code{NODE} that
+can be stored appropriately.  This is temporary storage; understanding
+of @command{gawk} memory management is helpful.
+
+@cindex @code{tmp_number} internal function
+@item NODE *tmp_number(AWKNUM val)
+Take an @code{AWKNUM} and turn it into a pointer to a @code{NODE} that
+can be stored appropriately.  This is temporary storage;
+understanding of @command{gawk} memory management is helpful.
+
+@c comma is part of primary
+@cindex nodes, duplicating
+@cindex @code{dupnode} internal function
+@item NODE *dupnode(NODE *n)
+Duplicate a node.  In most cases, this increments an internal
+reference count instead of actually duplicating the entire @code{NODE};
+understanding of @command{gawk} memory management is helpful.
+
+@cindex memory, releasing
+@cindex @code{free_temp} internal macro
+@item void free_temp(NODE *n)
+This macro releases the memory associated with a @code{NODE}
+allocated with @code{tmp_string} or @code{tmp_number}.
+Understanding of @command{gawk} memory management is helpful.
+
+@cindex @code{make_builtin} internal function
+@item void make_builtin(char *name, NODE *(*func)(NODE *), int count)
+Register a C function pointed to by @code{func} as new built-in
+function @code{name}. @code{name} is a regular C string. @code{count}
+is the maximum number of arguments that the function takes.
+The function should be written in the following manner:
+
+@example
+/* do_xxx --- do xxx function for gawk */
+
+NODE *
+do_xxx(NODE *tree)
+@{
+    @dots{}
+@}
+@end example
+
+@cindex arguments, retrieving
+@cindex @code{get_argument} internal function
+@item NODE *get_argument(NODE *tree, int i)
+This function is called from within a C extension function to get
+the @code{i}-th argument from the function call.
+The first argument is argument zero.
+
+@c last comma is part of secondary
+@cindex functions, return values, setting
+@cindex @code{set_value} internal function
+@item void set_value(NODE *tree)
+This function is called from within a C extension function to set
+the return value from the extension function.  This value is
+what the @command{awk} program sees as the return value from the
+new @command{awk} function.
+
+@cindex @code{ERRNO} variable
+@cindex @code{update_ERRNO} internal function
+@item void update_ERRNO(void)
+This function is called from within a C extension function to set
+the value of @command{gawk}'s @code{ERRNO} variable, based on the current
+value of the C @code{errno} variable.
+It is provided as a convenience.
+@end table
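+
+The save-and-restore idiom described for @code{n->stptr} and
+@code{n->stlen} in the table above can be sketched in stand-alone C.
+This is only an illustration: @code{struct fake_node} and
+@code{call_with_terminator} are hypothetical names that do not appear
+in @file{awk.h}, and the sketch assumes that the byte just past the
+end of the string is writable.
+

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the two NODE members discussed above;
 * the real NODE in awk.h has many more fields.
 */
struct fake_node {
    char *stptr;    /* string data, not necessarily NUL-terminated */
    size_t stlen;   /* number of valid bytes in stptr */
};

/*
 * call_with_terminator --- temporarily NUL-terminate n's string,
 * pass it to func, then restore the byte that was overwritten.
 * Assumes the buffer has at least stlen + 1 accessible bytes.
 */
static size_t
call_with_terminator(struct fake_node *n, size_t (*func)(const char *))
{
    char save;
    size_t result;

    assert(n != NULL && n->stptr != NULL && func != NULL);

    save = n->stptr[n->stlen];      /* remember the byte past the end */
    n->stptr[n->stlen] = '\0';      /* make it a proper C string */
    result = func(n->stptr);        /* e.g., strlen */
    n->stptr[n->stlen] = save;      /* put the original byte back */

    return result;
}
```

+
+For a buffer holding @samp{hello, world} and an @code{stlen} of 5,
+passing @code{strlen} as @code{func} yields 5 and leaves the comma in
+place afterwards.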
+
+An argument that is supposed to be an array needs to be handled with
+some extra code, in case the array being passed in is actually
+from a function parameter.
+
+In versions of @command{gawk} up to and including 3.1.2, the
+following boilerplate code shows how to do this:
+
+@smallexample
+NODE *the_arg;
+
+the_arg = get_argument(tree, 2); /* assume need 3rd arg, 0-based */
+
+/* if a parameter, get it off the stack */
+if (the_arg->type == Node_param_list)
+    the_arg = stack_ptr[the_arg->param_cnt];
+
+/* parameter referenced an array, get it */
+if (the_arg->type == Node_array_ref)
+    the_arg = the_arg->orig_array;
+
+/* check type */
+if (the_arg->type != Node_var && the_arg->type != Node_var_array)
+    fatal("newfunc: third argument is not an array");
+
+/* force it to be an array, if necessary, clear it */
+the_arg->type = Node_var_array;
+assoc_clear(the_arg);
+@end smallexample
+
+For versions 3.1.3 and later, the internals changed.  In particular,
+the interface was actually @emph{simplified} drastically.  The
+following boilerplate code now suffices:
+
+@smallexample
+NODE *the_arg;
+
+the_arg = get_argument(tree, 2); /* assume need 3rd arg, 0-based */
+
+/* force it to be an array: */
+the_arg = get_array(the_arg);
+
+/* if necessary, clear it: */
+assoc_clear(the_arg);
+@end smallexample
+
+Again, you should spend time studying the @command{gawk} internals;
+don't just blindly copy this code.
+@c ENDOFRANGE gawint
+
+@node Sample Library
+@appendixsubsec Directory and File Operation Built-ins
+@c comma is part of primary
+@c STARTOFRANGE chdirg
+@cindex @code{chdir} function, implementing in @command{gawk}
+@c comma is part of primary
+@c STARTOFRANGE statg
+@cindex @code{stat} function, implementing in @command{gawk}
+@c last comma is part of secondary
+@c STARTOFRANGE filre
+@cindex files, information about, retrieving
+@c STARTOFRANGE dirch
+@cindex directories, changing
+
+Two useful functions that are not in @command{awk} are @code{chdir}
+(so that an @command{awk} program can change its directory) and
+@code{stat} (so that an @command{awk} program can gather information about
+a file).
+This @value{SECTION} implements these functions for @command{gawk} in an
+external extension library.
+
+@menu
+* Internal File Description::   What the new functions will do.
+* Internal File Ops::           The code for internal file operations.
+* Using Internal File Ops::     How to use an external extension.
+@end menu
+
+@node Internal File Description
+@appendixsubsubsec Using @code{chdir} and @code{stat}
+
+This @value{SECTION} shows how to use the new functions at the @command{awk}
+level once they've been integrated into the running @command{gawk}
+interpreter.
+Using @code{chdir} is very straightforward. It takes one argument,
+the new directory to change to:
+
+@example
+@dots{}
+newdir = "/home/arnold/funstuff"
+ret = chdir(newdir)
+if (ret < 0) @{
+    printf("could not change to %s: %s\n",
+                   newdir, ERRNO) > "/dev/stderr"
+    exit 1
+@}
+@dots{}
+@end example
+
+The return value is negative if the @code{chdir} failed,
+and @code{ERRNO}
+(@pxref{Built-in Variables})
+is set to a string indicating the error.
+
+Using @code{stat} is a bit more complicated.
+The C @code{stat} function fills in a structure that has a fair
+amount of information.
+The right way to model this in @command{awk} is to fill in an associative
+array with the appropriate information:
+
+@c broke printf for page breaking
+@example
+file = "/home/arnold/.profile"
+fdata[1] = "x"    # force `fdata' to be an array
+ret = stat(file, fdata)
+if (ret < 0) @{
+    printf("could not stat %s: %s\n",
+             file, ERRNO) > "/dev/stderr"
+    exit 1
+@}
+printf("size of %s is %d bytes\n", file, fdata["size"])
+@end example
+
+The @code{stat} function always clears the data array, even if
+the @code{stat} fails.  It fills in the following elements:
+
+@table @code
+@item "name"
+The name of the file that was @code{stat}'ed.
+
+@item "dev"
+@itemx "ino"
+The file's device and inode numbers, respectively.
+
+@item "mode"
+The file's mode, as a numeric value. This includes both the file's
+type and its permissions.
+
+@item "nlink"
+The number of hard links (directory entries) the file has.
+
+@item "uid"
+@itemx "gid"
+The numeric user and group ID numbers of the file's owner.
+
+@item "size"
+The size in bytes of the file.
+
+@item "blocks"
+The number of disk blocks the file actually occupies. This may not
+be a function of the file's size if the file has holes.
+
+@item "atime"
+@itemx "mtime"
+@itemx "ctime"
+The file's last access, modification, and inode update times,
+respectively.  These are numeric timestamps, suitable for formatting
+with @code{strftime}
+(@pxref{Built-in}).
+
+@item "pmode"
+The file's ``printable mode.''  This is a string representation of
+the file's type and permissions, such as what is produced by
+@samp{ls -l}---for example, @code{"drwxr-xr-x"}.
+
+@item "type"
+A printable string representation of the file's type.  The value
+is one of the following:
+
+@table @code
+@item "blockdev"
+@itemx "chardev"
+The file is a block or character device (``special file'').
+
+@ignore
+@item "door"
+The file is a Solaris ``door'' (special file used for
+interprocess communications).
+@end ignore
+
+@item "directory"
+The file is a directory.
+
+@item "fifo"
+The file is a named-pipe (also known as a FIFO).
+
+@item "file"
+The file is just a regular file.
+
+@item "socket"
+The file is an @code{AF_UNIX} (``Unix domain'') socket in the
+filesystem.
+
+@item "symlink"
+The file is a symbolic link.
+@end table
+@end table
+
+Several additional elements may be present depending upon the operating
+system and the type of the file.  You can test for them in your @command{awk}
+program by using the @code{in} operator
+(@pxref{Reference to Elements}):
+
+@table @code
+@item "blksize"
+The preferred block size for I/O to the file. This field is not
+present on all POSIX-like systems in the C @code{stat} structure.
+
+@item "linkval"
+If the file is a symbolic link, this element is the name of the
+file the link points to (i.e., the value of the link).
+
+@item "rdev"
+@itemx "major"
+@itemx "minor"
+If the file is a block or character device file, then these values
+represent the numeric device number and the major and minor components
+of that number, respectively.
+@end table
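+
+As a rough sketch of where the @code{"type"} strings listed above
+might come from, the following stand-alone C code maps the mode
+reported by @code{lstat} onto those names using the standard
+@code{S_IS@dots{}} test macros.  The helper names
+@code{file_type_name} and @code{path_type} are hypothetical; the
+version in the @command{gawk} distribution may be organized
+differently.
+

```c
#define _XOPEN_SOURCE 700   /* ask for lstat, S_ISLNK, S_ISSOCK */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * file_type_name --- map a stat mode to one of the "type" strings
 * listed above.  Hypothetical helper, not taken from filefuncs.c.
 */
static const char *
file_type_name(mode_t mode)
{
    if (S_ISREG(mode))  return "file";
    if (S_ISDIR(mode))  return "directory";
    if (S_ISCHR(mode))  return "chardev";
    if (S_ISBLK(mode))  return "blockdev";
    if (S_ISFIFO(mode)) return "fifo";
    if (S_ISLNK(mode))  return "symlink";
    if (S_ISSOCK(mode)) return "socket";
    return "unknown";
}

/*
 * path_type --- lstat a path and return its type name,
 * or NULL if the lstat fails (errno then says why).
 */
static const char *
path_type(const char *path)
{
    struct stat sbuf;

    assert(path != NULL);

    if (lstat(path, &sbuf) < 0)
        return NULL;

    return file_type_name(sbuf.st_mode);
}
```

+
+Because the sketch uses @code{lstat} rather than @code{stat}, a
+symbolic link is reported as @code{"symlink"} instead of as the type
+of the file it points to.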
+
+@node Internal File Ops
+@appendixsubsubsec C Code for @code{chdir} and @code{stat}
+
+Here is the C code for these extensions.  They were written for
+GNU/Linux.  The code needs some more work for complete portability
+to other POSIX-compliant systems:@footnote{This version is edited
+slightly for presentation.  The complete version can be found in
+@file{extension/filefuncs.c} in the @command{gawk} distribution.}
+
+@c break line for page breaking
+@example
+#include "awk.h"
+
+#include <sys/sysmacros.h>
+
+/*  do_chdir --- provide dynamically loaded
+                 chdir() builtin for gawk */
+
+static NODE *
+do_chdir(tree)
+NODE *tree;
+@{
+    NODE *newdir;
+    int ret = -1;
+
+    newdir = get_argument(tree, 0);
+@end example
+
+The file includes the @code{"awk.h"} header file for definitions
+for the @command{gawk} internals.  It includes @code{<sys/sysmacros.h>}
+for access to the @code{major} and @code{minor} macros.
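+
+For readers who have not met these macros, here is a small sketch of
+how @code{major} and @code{minor} split a device number, which is
+presumably how the @code{"major"} and @code{"minor"} array elements
+are derived from @code{"rdev"}.  It assumes a system such as
+GNU/Linux that provides @code{<sys/sysmacros.h>}; the
+@code{struct dev_parts} type and the @code{split_dev} function are
+hypothetical names used only here.
+

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/sysmacros.h>

/* Hypothetical holder for the two halves of a device number. */
struct dev_parts {
    unsigned int major_num;
    unsigned int minor_num;
};

/* split_dev --- break a device number into major and minor parts */
static struct dev_parts
split_dev(dev_t dev)
{
    struct dev_parts p;

    p.major_num = major(dev);   /* identifies the driver */
    p.minor_num = minor(dev);   /* identifies the unit within it */

    return p;
}
```

+
+The companion macro @code{makedev} goes the other way, so
+@code{split_dev(makedev(8, 1))} gives back 8 and 1.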
+
+@cindex programming conventions, @command{gawk} internals
+By convention, for an @command{awk} function @code{foo}, the function that
+implements it is called @samp{do_foo}.  The function should take
+a @samp{NODE *} argument, usually called @code{tree}, that
+represents the argument list to the function.  The @code{newdir}
+variable represents the new directory to change to, retrieved
+with @code{get_argument}.  Note that the first argument is
+numbered zero.
+
+This code actually accomplishes the @code{chdir}. It first forces
+the argument to be a string and passes the string value to the
+@code{chdir} system call. If the @code{chdir} fails, @code{ERRNO}
+is updated.
+The result of @code{force_string} has to be freed with @code{free_temp}:
+
+@example
+    if (newdir != NULL) @{
+        (void) force_string(newdir);
+        ret = chdir(newdir->stptr);
+        if (ret < 0)
+            update_ERRNO();
+
+        free_temp(newdir);
+    @}
+@end example
+
+Finally, the function returns the return value to the @command{awk} level,
+using @code{set_value}. Then it must return a value from the call to
+the new built-in (this value is ignored by the interpreter):
+
+@example
+    /* Set the return value */
+    set_value(tmp_number((AWKNUM) ret));
+
+    /* Just to make the interpreter happy */
+    return tmp_number((AWKNUM) 0);
+@}
+@end example
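+
+The core of @code{do_chdir} can be tried outside of @command{gawk}
+with a sketch like the following.  It is an assumption on my part
+that a @code{strerror}-style message is a fair stand-in for what
+@code{update_ERRNO} stores in @code{ERRNO}; @code{try_chdir} is a
+hypothetical name and uses none of the @command{gawk} internals.
+

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * try_chdir --- change directory; on failure, point *msg at a
 * message describing errno, much as ERRNO is set at the awk level.
 * Hypothetical helper for illustration only.
 */
static int
try_chdir(const char *dir, const char **msg)
{
    int ret;

    assert(dir != NULL && msg != NULL);

    *msg = "";                      /* empty means no error */
    ret = chdir(dir);
    if (ret < 0)
        *msg = strerror(errno);     /* text form of the failure */

    return ret;
}
```

+
+As with the @command{awk}-level @code{chdir}, a negative return value
+signals failure, and the message is meaningful only in that case.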
+
+The @code{stat} built-in is more involved.  First comes a function
+that turns a numeric mode into a printable representation
+(e.g., octal 644 becomes @samp{-rw-r--r--}). This is omitted here for brevity:
+
+@c break line for page breaking
+@example
+/* format_mode --- turn a stat mode field
+                   into something readable */
+
+static char *
+format_mode(fmode)
+unsigned long fmode;
+@{
+    @dots{}
+@}
+@end example
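+
+Since the body is omitted, here is one possible sketch of such a
+conversion.  It is not the @code{format_mode} from
+@file{extension/filefuncs.c}: the name @code{mode_to_string} is
+hypothetical, and unlike the function above it writes into a
+caller-supplied buffer instead of returning its own storage.
+

```c
#define _XOPEN_SOURCE 700   /* ask for S_ISLNK, S_ISSOCK, S_ISVTX */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * mode_to_string --- fill buf (at least 11 bytes) with an
 * "ls -l"-style rendering of mode.  Hypothetical sketch.
 */
static char *
mode_to_string(mode_t mode, char buf[11])
{
    static const mode_t bits[9] = {
        S_IRUSR, S_IWUSR, S_IXUSR,
        S_IRGRP, S_IWGRP, S_IXGRP,
        S_IROTH, S_IWOTH, S_IXOTH
    };
    static const char letters[] = "rwxrwxrwx";
    int i;

    assert(buf != NULL);

    /* first character: file type */
    if (S_ISDIR(mode))       buf[0] = 'd';
    else if (S_ISLNK(mode))  buf[0] = 'l';
    else if (S_ISCHR(mode))  buf[0] = 'c';
    else if (S_ISBLK(mode))  buf[0] = 'b';
    else if (S_ISFIFO(mode)) buf[0] = 'p';
    else if (S_ISSOCK(mode)) buf[0] = 's';
    else                     buf[0] = '-';

    /* next nine characters: one per permission bit */
    for (i = 0; i < 9; i++)
        buf[i + 1] = (mode & bits[i]) ? letters[i] : '-';

    /* setuid, setgid, and sticky bits replace the execute slots */
    if (mode & S_ISUID) buf[3] = (mode & S_IXUSR) ? 's' : 'S';
    if (mode & S_ISGID) buf[6] = (mode & S_IXGRP) ? 's' : 'S';
    if (mode & S_ISVTX) buf[9] = (mode & S_IXOTH) ? 't' : 'T';

    buf[10] = '\0';
    return buf;
}
```

+
+For instance, @code{mode_to_string(S_IFDIR | 0755, buf)} leaves
+@samp{drwxr-xr-x} in @code{buf}.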
+
+Next comes the actual @code{do_stat} function itself.  First come the
+variable declarations and argument checking:
+
+@ignore
+Changed message for page breaking. Used to be:
+    "stat: called with incorrect number of arguments (%d), should be 2",
+@end ignore
+@example
+/* do_stat --- provide a stat() function for gawk */
+
+static NODE *
+do_stat(tree)
+NODE *tree;
+@{
+    NODE *file, *array;
+    struct stat sbuf;
+    int ret;
+    char *msg;
+    NODE **aptr;
+    char *pmode;    /* printable mode */
+    char *type = "unknown";
+
+    /* check arg count */
+    if (tree->param_cnt != 2)
+        fatal(
+    "stat: called with %d arguments, should be 2",
+            tree->param_cnt);
+@end example
+
+Then comes the actual work. First, we get the arguments.
+Then, we always clear the array.  To get the file information,
+we use @code{lstat}, in case the file is a symbolic link.
+If there's an error, we set @code{ERRNO} and return:
+
+@c comment made multiline for page breaking
+@example
+    /*
+     * file is first arg,
+     * array to hold results is second
+     */
+    file = get_argument(tree, 0);
+    array = get_argument(tree, 1);
+
+    /* empty out the array */
+    assoc_clear(array);
+
+    /* lstat the file, if error, set ERRNO and return */
+    (void) force_string(file);
+    ret = lstat(file->stptr, & sbuf);
+    if (ret < 0) @{
+        update_ERRNO();
+
+        set_value(tmp_number((AWKNUM) ret));
+
+        free_temp(file);
+        return tmp_number((AWKNUM) 0);
+    @}
+@end example
+
+Now comes the tedious part: filling in the array.  Only a few of the
+calls are shown here, since they all follow the same pattern:
+
+@example
+    /* fill in the array */
+    aptr = assoc_lookup(array, tmp_string("name", 4), FALSE);
+    *aptr = dupnode(file);
+
+    aptr = assoc_lookup(array, tmp_string("mode", 4), FALSE);
+    *aptr = make_number((AWKNUM) sbuf.st_mode);
+
+    aptr = assoc_lookup(array, tmp_string("pmode", 5), FALSE);
+    pmode = format_mode(sbuf.st_mode);
+    *aptr = make_string(pmode, strlen(pmode));
+@end example
+
+When done, we free the temporary value containing the @value{FN},
+set the return value, and return:
+
+@example
+    free_temp(file);
+
+    /* Set the return value */
+    set_value(tmp_number((AWKNUM) ret));
+
+    /* Just to make the interpreter happy */
+    return tmp_number((AWKNUM) 0);
+@}
+@end example
+
+@cindex programming conventions, @command{gawk} internals
+Finally, it's necessary to provide the ``glue'' that loads the
+new function(s) into @command{gawk}.  By convention, each library has
+a routine named @code{dlload} that does the job:
+
+@example
+/* dlload --- load new builtins in this library */
+
+NODE *
+dlload(tree, dl)
+NODE *tree;
+void *dl;
+@{
+    make_builtin("chdir", do_chdir, 1);
+    make_builtin("stat", do_stat, 2);
+    return tmp_number((AWKNUM) 0);
+@}
+@end example
+
+And that's it!  As an exercise, consider adding functions to
+implement system calls such as @code{chown}, @code{chmod}, and @code{umask}.
+
+@node Using Internal File Ops
+@appendixsubsubsec Integrating the Extensions
+
+@c last comma is part of secondary
+@cindex @command{gawk}, interpreter, adding code to
+Now that the code is written, it must be possible to add it at
+runtime to the running @command{gawk} interpreter.  First, the
+code must be compiled.  Assuming that the functions are in
+a file named @file{filefuncs.c}, and @var{idir} is the location
+of the @command{gawk} include files,
+the following steps create
+a GNU/Linux shared library:
+
+@example
+$ gcc -shared -DHAVE_CONFIG_H -c -O -g -I@var{idir} filefuncs.c
+$ ld -o filefuncs.so -shared filefuncs.o
+@end example
+
+@cindex @code{extension} function (@command{gawk})
+Once the library exists, it is loaded by calling the @code{extension}
+built-in function.
+This function takes two arguments: the name of the
+library to load and the name of a function to call when the library
+is first loaded. That initialization function adds the new functions
+to @command{gawk}.  The @code{extension} function returns the value
+returned by the initialization function within the shared library:
+
+@example
+# file testff.awk
+BEGIN @{
+    extension("./filefuncs.so", "dlload")
+
+    chdir(".")  # no-op
+
+    data[1] = 1 # force `data' to be an array
+    print "Info for testff.awk"
+    ret = stat("testff.awk", data)
+    print "ret =", ret
+    for (i in data)
+        printf "data[\"%s\"] = %s\n", i, data[i]
+    print "testff.awk modified:",
+        strftime("%m %d %y %H:%M:%S", data["mtime"])
+@}
+@end example
+
+Here are the results of running the program:
+
+@example
+$ gawk -f testff.awk
+@print{} Info for testff.awk
+@print{} ret = 0
+@print{} data["blksize"] = 4096
+@print{} data["mtime"] = 932361936
+@print{} data["mode"] = 33188
+@print{} data["type"] = file
+@print{} data["dev"] = 2065
+@print{} data["gid"] = 10
+@print{} data["ino"] = 878597
+@print{} data["ctime"] = 971431797
+@print{} data["blocks"] = 2
+@print{} data["nlink"] = 1
+@print{} data["name"] = testff.awk
+@print{} data["atime"] = 971608519
+@print{} data["pmode"] = -rw-r--r--
+@print{} data["size"] = 607
+@print{} data["uid"] = 2076
+@print{} testff.awk modified: 07 19 99 08:25:36
+@end example
+@c ENDOFRANGE filre
+@c ENDOFRANGE dirch
+@c ENDOFRANGE statg
+@c ENDOFRANGE chdirg
+@c ENDOFRANGE gladfgaw
+@c ENDOFRANGE adfugaw
+@c ENDOFRANGE fubadgaw
+
+@node Future Extensions
+@appendixsec Probable Future Extensions
+@ignore
+From emory!scalpel.netlabs.com!lwall Tue Oct 31 12:43:17 1995
+Return-Path: <emory!scalpel.netlabs.com!lwall>
+Message-Id: <9510311732.AA28472@scalpel.netlabs.com>
+To: arnold@skeeve.atl.ga.us (Arnold D. Robbins)
+Subject: Re: May I quote you?
+In-Reply-To: Your message of "Tue, 31 Oct 95 09:11:00 EST."
+             <m0tAHPQ-00014MC@skeeve.atl.ga.us>
+Date: Tue, 31 Oct 95 09:32:46 -0800
+From: Larry Wall <emory!scalpel.netlabs.com!lwall>
+
+: Greetings. I am working on the release of gawk 3.0. Part of it will be a
+: thoroughly updated manual. One of the sections deals with planned future
+: extensions and enhancements.  I have the following at the beginning
+: of it:
+:
+: @cindex PERL
+: @cindex Wall, Larry
+: @display
+: @i{AWK is a language similar to PERL, only considerably more elegant.} @*
+: Arnold Robbins
+: @sp 1
+: @i{Hey!} @*
+: Larry Wall
+: @end display
+:
+: Before I actually release this for publication, I wanted to get your
+: permission to quote you.  (Hopefully, in the spirit of much of GNU, the
+: implied humor is visible... :-)
+
+I think that would be fine.
+
+Larry
+@end ignore
+@cindex PERL
+@cindex Wall, Larry
+@cindex Robbins, Arnold
+@quotation
+@i{AWK is a language similar to PERL, only considerably more elegant.}@*
+Arnold Robbins
+
+@i{Hey!}@*
+Larry Wall
+@end quotation
+
+This @value{SECTION} briefly lists extensions and possible improvements
+that indicate the directions we are
+currently considering for @command{gawk}.  The file @file{FUTURES} in the
+@command{gawk} distribution lists these extensions as well.
+
+Following is a list of probable future changes visible at the
+@command{awk} language level:
+
+@c these are ordered by likelihood
+@table @asis
+@item Loadable module interface
+It is not clear that the @command{awk}-level interface to the
+modules facility is as good as it should be.  The interface needs to be
+redesigned, particularly taking namespace issues into account, as
+well as possibly including issues such as library search path order
+and versioning.
+
+@item @code{RECLEN} variable for fixed-length records
+Along with @code{FIELDWIDTHS}, this would speed up the processing of
+fixed-length records.
+@code{PROCINFO["RS"]} would be @code{"RS"} or @code{"RECLEN"},
+depending upon which kind of record processing is in effect.
+
+@item Additional @code{printf} specifiers
+The 1999 ISO C standard added a number of additional @code{printf}
+format specifiers.  These should be evaluated for possible inclusion
+in @command{gawk}.
+
+@ignore
+@item A @samp{%'d} flag
+Add @samp{%'d} for putting in commas in formatting numeric values.
+@end ignore
+
+@item Databases
+It may be possible to map a GDBM/NDBM/SDBM file into an @command{awk} array.
+
+@item Large character sets
+It would be nice if @command{gawk} could handle UTF-8 and other
+character sets that are larger than eight bits.
+
+@item More @code{lint} warnings
+There are more things that could be checked for portability.
+@end table
+
+Following is a list of probable improvements that will make @command{gawk}'s
+source code easier to work with:
+
+@table @asis
+@item Loadable module mechanics
+The current extension mechanism works
+(@pxref{Dynamic Extensions}),
+but is rather primitive. It requires a fair amount of manual work
+to create and integrate a loadable module.
+Nor is the current mechanism as portable as might be desired.
+The GNU @command{libtool} package provides a number of features that
+would make using loadable modules much easier.
+@command{gawk} should be changed to use @command{libtool}.
+
+@item Loadable module internals
+The API to its internals that @command{gawk} ``exports'' should be revised.
+Too many things are needlessly exposed.  A new API should be designed
+and implemented to make module writing easier.
+
+@item Better array subscript management
+@command{gawk}'s management of array subscript storage could use revamping,
+so that using the same value to index multiple arrays only
+stores one copy of the index value.
+
+@item Integrating the DBUG library
+Integrating Fred Fish's DBUG library would be helpful during development,
+but it's a lot of work to do.
+@end table
+
+Following is a list of probable improvements that will make @command{gawk}
+perform better:
+
+@table @asis
+@c NEXT ED: remove this item. awka and mawk do these respectively
+@item Compilation of @command{awk} programs
+@command{gawk} uses a Bison (YACC-like)
+parser to convert the script given it into a syntax tree; the syntax
+tree is then executed by a simple recursive evaluator.  This method incurs
+a lot of overhead, since the recursive evaluator performs many procedure
+calls to do even the simplest things.
+
+It should be possible for @command{gawk} to convert the script's parse tree
+into a C program which the user would then compile, using the normal
+C compiler and a special @command{gawk} library to provide all the needed
+functions (regexps, fields, associative arrays, type coercion, and so on).
+
+@c last comma is part of secondary
+@cindex @command{gawk}, interpreter, adding code to
+An easier possibility might be for an intermediate phase of @command{gawk} to
+convert the parse tree into a linear byte code form like the one used
+in GNU Emacs Lisp.  The recursive evaluator would then be replaced by
+a straight line byte code interpreter that would be intermediate in speed
+between running a compiled program and doing what @command{gawk} does
+now.
+@end table
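+
+To make the byte code idea a little more concrete, here is a toy
+sketch of a straight-line interpreter for arithmetic expressions.
+Everything in it is hypothetical: @command{gawk} has no such opcodes
+today, and a real design would need many more instructions (field
+access, string operations, control flow, and so on).
+

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes; gawk defines no such byte code today. */
enum opcode { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

struct instr {
    enum opcode op;
    double operand;     /* used only by OP_PUSH */
};

/*
 * run --- execute a linear array of instructions using an explicit
 * value stack, in a single loop instead of recursive calls.
 */
static double
run(const struct instr *code)
{
    double stack[64];
    size_t sp = 0;      /* number of values currently on the stack */
    size_t pc;          /* index of the current instruction */

    assert(code != NULL);

    for (pc = 0; ; pc++) {
        switch (code[pc].op) {
        case OP_PUSH:
            assert(sp < 64);
            stack[sp++] = code[pc].operand;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            assert(sp == 1);
            return stack[0];
        }
    }
}
```

+
+The expression @samp{(2 + 3) * 4} would be flattened into the
+sequence push 2, push 3, add, push 4, multiply, halt.  Each step is
+then one trip around the loop rather than a nested procedure call,
+which is where the hoped-for speedup over the recursive evaluator
+would come from.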
+
+Finally,
+the programs in the test suite could use documenting in this @value{DOCUMENT}.
+
+@xref{Additions},
+if you are interested in tackling any of these projects.
+@c ENDOFRANGE impis
+@c ENDOFRANGE gawii
+
+@node Basic Concepts
+@appendix Basic Programming Concepts
+@cindex programming, concepts
+@c STARTOFRANGE procon
+@cindex programming, concepts
+
+This @value{APPENDIX} attempts to define some of the basic concepts
+and terms that are used throughout the rest of this @value{DOCUMENT}.
+As this @value{DOCUMENT} is specifically about @command{awk},
+and not about computer programming in general, the coverage here
+is by necessity fairly cursory and simplistic.
+(If you need more background, there are many
+other introductory texts that you should refer to instead.)
+
+@menu
+* Basic High Level::            The high level view.
+* Basic Data Typing::           A very quick intro to data types.
+* Floating Point Issues::       Stuff to know about floating-point numbers.
+@end menu
+
+@node Basic High Level
+@appendixsec What a Program Does
+
+@cindex processing data
+At the most basic level, the job of a program is to process
+some input data and produce results.
+
+@c NEXT ED: Use real images here
+@iftex
+@tex
+\expandafter\ifx\csname graph\endcsname\relax \csname newbox\endcsname\graph\fi
+\expandafter\ifx\csname graphtemp\endcsname\relax \csname newdimen\endcsname\graphtemp\fi
+\setbox\graph=\vtop{\vskip 0pt\hbox{%
+    \special{pn 20}%
+    \special{pa 2425 200}%
+    \special{pa 2850 200}%
+    \special{fp}%
+    \special{sh 1.000}%
+    \special{pn 20}%
+    \special{pa 2750 175}%
+    \special{pa 2850 200}%
+    \special{pa 2750 225}%
+    \special{pa 2750 175}%
+    \special{fp}%
+    \special{pn 20}%
+    \special{pa 850 200}%
+    \special{pa 1250 200}%
+    \special{fp}%
+    \special{sh 1.000}%
+    \special{pn 20}%
+    \special{pa 1150 175}%
+    \special{pa 1250 200}%
+    \special{pa 1150 225}%
+    \special{pa 1150 175}%
+    \special{fp}%
+    \special{pn 20}%
+    \special{pa 2950 400}%
+    \special{pa 3650 400}%
+    \special{pa 3650 0}%
+    \special{pa 2950 0}%
+    \special{pa 2950 400}%
+    \special{fp}%
+    \special{pn 10}%
+    \special{ar 1800 200 450 200 0 6.28319}%
+    \graphtemp=.5ex\advance\graphtemp by 0.200in
+    \rlap{\kern 3.300in\lower\graphtemp\hbox to 0pt{\hss Results\hss}}%
+    \graphtemp=.5ex\advance\graphtemp by 0.200in
+    \rlap{\kern 1.800in\lower\graphtemp\hbox to 0pt{\hss Program\hss}}%
+    \special{pn 10}%
+    \special{pa 0 400}%
+    \special{pa 700 400}%
+    \special{pa 700 0}%
+    \special{pa 0 0}%
+    \special{pa 0 400}%
+    \special{fp}%
+    \graphtemp=.5ex\advance\graphtemp by 0.200in
+    \rlap{\kern 0.350in\lower\graphtemp\hbox to 0pt{\hss Data\hss}}%
+    \hbox{\vrule depth0.400in width0pt height 0pt}%
+    \kern 3.650in
+  }%
+}%
+\centerline{\box\graph}
+@end tex
+@end iftex
+@ifnottex
+@example
+                  _______
++------+         /       \         +---------+
+| Data | -----> < Program > -----> | Results |
++------+         \_______/         +---------+
+@end example
+@end ifnottex
+
+@cindex compiled programs
+@cindex interpreted programs
+The ``program'' in the figure can be either a compiled
+program@footnote{Compiled programs are typically written
+in lower-level languages such as C, C++, Fortran, or Ada,
+and then translated, or @dfn{compiled}, into a form that
+the computer can execute directly.}
+(such as @command{ls}),
+or it may be @dfn{interpreted}.  In the latter case, a machine-executable
+program such as @command{awk} reads your program, and then uses the
+instructions in your program to process the data.
+
+@cindex programming, basic steps
+When you write a program, it usually consists
+of the following, very basic set of steps:
+
+@c NEXT ED: Use real images here
+@iftex
+@tex
+\expandafter\ifx\csname graph\endcsname\relax \csname newbox\endcsname\graph\fi
+\expandafter\ifx\csname graphtemp\endcsname\relax \csname newdimen\endcsname\graphtemp\fi
+\setbox\graph=\vtop{\vskip 0pt\hbox{%
+    \graphtemp=.5ex\advance\graphtemp by 0.600in
+    \rlap{\kern 2.800in\lower\graphtemp\hbox to 0pt{\hss Yes\hss}}%
+    \graphtemp=.5ex\advance\graphtemp by 0.100in
+    \rlap{\kern 3.300in\lower\graphtemp\hbox to 0pt{\hss No\hss}}%
+    \special{pn 8}%
+    \special{pa 2100 1000}%
+    \special{pa 1600 1000}%
+    \special{pa 1600 1000}%
+    \special{pa 1600 300}%
+    \special{fp}%
+    \special{sh 1.000}%
+    \special{pn 8}%
+    \special{pa 1575 400}%
+    \special{pa 1600 300}%
+    \special{pa 1625 400}%
+    \special{pa 1575 400}%
+    \special{fp}%
+    \special{pn 8}%
+    \special{pa 2600 500}%
+    \special{pa 2600 900}%
+    \special{fp}%
+    \special{sh 1.000}%
+    \special{pn 8}%
+    \special{pa 2625 800}%
+    \special{pa 2600 900}%
+    \special{pa 2575 800}%
+    \special{pa 2625 800}%
+    \special{fp}%
+    \special{pn 8}%
+    \special{pa 3200 200}%
+    \special{pa 4000 200}%
+    \special{fp}%
+    \special{sh 1.000}%
+    \special{pn 8}%
+    \special{pa 3900 175}%
+    \special{pa 4000 200}%
+    \special{pa 3900 225}%
+    \special{pa 3900 175}%
+    \special{fp}%
+    \special{pn 8}%
+    \special{pa 1400 200}%
+    \special{pa 2100 200}%
+    \special{fp}%
+    \special{sh 1.000}%
+    \special{pn 8}%
+    \special{pa 2000 175}%
+    \special{pa 2100 200}%
+    \special{pa 2000 225}%
+    \special{pa 2000 175}%
+    \special{fp}%
+    \special{pn 8}%
+    \special{ar 2600 1000 400 100 0 6.28319}%
+    \graphtemp=.5ex\advance\graphtemp by 1.000in
+    \rlap{\kern 2.600in\lower\graphtemp\hbox to 0pt{\hss Process\hss}}%
+    \special{pn 8}%
+    \special{pa 2200 400}%
+    \special{pa 3100 400}%
+    \special{pa 3100 0}%
+    \special{pa 2200 0}%
+    \special{pa 2200 400}%
+    \special{fp}%
+    \graphtemp=.5ex\advance\graphtemp by 0.200in
+    \rlap{\kern 2.688in\lower\graphtemp\hbox to 0pt{\hss More Data?\hss}}%
+    \special{pn 8}%
+    \special{ar 650 200 650 200 0 6.28319}%
+    \graphtemp=.5ex\advance\graphtemp by 0.200in
+    \rlap{\kern 0.613in\lower\graphtemp\hbox to 0pt{\hss Initialization\hss}}%
+    \special{pn 8}%
+    \special{ar 0 200 0 0 0 6.28319}%
+    \special{pn 8}%
+    \special{ar 4550 200 450 100 0 6.28319}%
+    \graphtemp=.5ex\advance\graphtemp by 0.200in
+    \rlap{\kern 4.600in\lower\graphtemp\hbox to 0pt{\hss Clean Up\hss}}%
+    \hbox{\vrule depth1.100in width0pt height 0pt}%
+    \kern 5.000in
+  }%
+}%
+\centerline{\box\graph}
+@end tex
+@end iftex
+@ifnottex
+@example
+                              ______
++----------------+           / More \  No       +----------+
+| Initialization | -------> <  Data  > -------> | Clean Up |
++----------------+    ^      \   ?  /           +----------+
+                      |       +--+-+
+                      |          | Yes
+                      |          |
+                      |          V
+                      |     +---------+
+                      +-----+ Process |
+                            +---------+
+@end example
+@end ifnottex
+
+@table @asis
+@item Initialization
+These are the things you do before actually starting to process
+data, such as checking arguments, initializing any data you need
+to work with, and so on.
+This step corresponds to @command{awk}'s @code{BEGIN} rule
+(@pxref{BEGIN/END}).
+
+If you were baking a cake, this might consist of laying out all the
+mixing bowls and the baking pan, and making sure you have all the
+ingredients that you need.
+
+@item Processing
+This is where the actual work is done.  Your program reads data,
+one logical chunk at a time, and processes it as appropriate.
+
+In most programming languages, you have to manually manage the reading
+of data, checking to see if there is more each time you read a chunk.
+@command{awk}'s pattern-action paradigm
+(@pxref{Getting Started})
+handles the mechanics of this for you.
+
+In baking a cake, the processing corresponds to the actual labor:
+breaking eggs, mixing the flour, water, and other ingredients, and then putting the cake
+into the oven.
+
+@item Clean Up
+Once you've processed all the data, you may have things you need to
+do before exiting.
+This step corresponds to @command{awk}'s @code{END} rule
+(@pxref{BEGIN/END}).
+
+After the cake comes out of the oven, you still have to wrap it in
+plastic wrap to keep anyone from tasting it, as well as wash
+the mixing bowls and utensils.
+@end table
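+
+As a rough sketch of how these three steps map onto an @command{awk}
+program, the following counts the input records and adds up the first
+field of each one.  The variable names @code{count} and @code{total}
+are made up for this illustration:
+
+@example
+# Initialization: runs once, before any input is read
+BEGIN @{ count = 0; total = 0 @}
+
+# Processing: runs once for each input record
+@{ count++; total += $1 @}
+
+# Clean up: runs once, after all the input has been read
+END   @{ print count, "records, total", total @}
+@end example
+
+@noindent
+Note that the program never asks whether there is more data;
+@command{awk} makes that check itself.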
+
+@cindex algorithms
+An @dfn{algorithm} is a detailed set of instructions necessary to accomplish
+a task, or process data.  It is much the same as a recipe for baking
+a cake.  Programs implement algorithms.  Often, it is up to you to design
+the algorithm and implement it, simultaneously.
+
+@cindex records
+@cindex fields
+The ``logical chunks'' we talked about previously are called @dfn{records},
+similar to the records a company keeps on employees, a school keeps for
+students, or a doctor keeps for patients.
+Each record has many component parts, such as first and last names,
+date of birth, address, and so on.  The component parts are referred
+to as the @dfn{fields} of the record.
+
+The act of reading data is termed @dfn{input}, and that of
+generating results, not too surprisingly, is termed @dfn{output}.
+They are often referred to together as ``input/output,''
+and even more often, as ``I/O'' for short.
+(You will also see ``input'' and ``output'' used as verbs.)
+
+@cindex data-driven languages
+@c comma is part of primary
+@cindex languages, data-driven
+@command{awk} manages the reading of data for you, as well as
+breaking it up into records and fields.  Your program's job is to
+tell @command{awk} what to do with the data.  You do this by describing
+@dfn{patterns} in the data to look for, and @dfn{actions} to execute
+when those patterns are seen.  This @dfn{data-driven} nature of
+@command{awk} programs usually makes them both easier to write
+and easier to read.
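+
+For instance, assuming a hypothetical input file in which each
+record is a line of the form ``first name, last name, year of birth,''
+separated by whitespace, a one-rule program might look like this:
+
+@example
+# pattern: the third field is less than 1970
+# action:  print the second and first fields
+$3 < 1970 @{ print $2, $1 @}
+@end example
+
+@noindent
+Here @command{awk} does the reading and the field splitting; the program
+only states which records matter and what to do with them.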
+
+@node Basic Data Typing
+@appendixsec Data Values in a Computer
+
+@cindex variables
+In a program,
+you keep track of information and values in things called @dfn{variables}.
+A variable is just a name for a given value, such as @code{first_name},
+@code{last_name}, @code{address}, and so on.
+@command{awk} has several predefined variables, and it has
+special names to refer to the current input record
+and the fields of the record.
+You may also group multiple
+associated values under one name, as an array.
+
+@cindex values, numeric
+@cindex values, string
+@cindex scalar values
+Data, particularly in @command{awk}, consists of either numeric
+values, such as 42 or 3.1415927, or string values.
+String values are essentially anything that's not a number, such as a name.
+Strings are sometimes referred to as @dfn{character data}, since they
+store the individual characters that comprise them.
+Individual values, whether numeric or string, are
+referred to as @dfn{scalar} values.
+Groups of values, such as arrays, are not scalars.
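+
+The following short sketch shows a numeric scalar, a string scalar,
+and an array element side by side.  All of the variable names and values
+are invented for the example:
+
+@example
+BEGIN @{
+    answer = 42                   # a numeric scalar
+    first_name = "Pat"            # a string scalar
+    phone["home"] = "555-1234"    # an array element, indexed by a string
+    print answer + 1, first_name, phone["home"]
+@}
+@end example
+
+@noindent
+This prints @samp{43 Pat 555-1234}.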
+
+@cindex integers
+@cindex floating-point, numbers
+@cindex numbers, floating-point
+Within computers, there are two kinds of numeric values: @dfn{integers}
+and @dfn{floating-point}.
+In school, integer values were referred to as ``whole'' numbers---that is,
+numbers without any fractional part, such as 1, 42, or @minus{}17.
+The advantage to integer numbers is that they represent values exactly.
+The disadvantage is that their range is limited.  On most modern systems,
+which typically use 32-bit integers, this range is
+@minus{}2,147,483,648 to 2,147,483,647.
+
+@cindex unsigned integers
+@cindex integers, unsigned
+Integer values come in two flavors: @dfn{signed} and @dfn{unsigned}.
+Signed values may be negative or positive, with the range of values just
+described.
+Unsigned values are never negative.  On most modern systems,
+the range is from 0 to 4,294,967,295.
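+
+These limits are simply powers of two for a 32-bit integer: the signed
+range runs from @minus{}2^31 to 2^31 @minus{} 1, and the unsigned range
+from 0 to 2^32 @minus{} 1.  As a quick check, the arithmetic can be done
+in @command{awk} itself.  (The @samp{%.0f} format is used here so that
+the result does not depend on how a particular @command{awk} prints
+large values.)
+
+@example
+$ awk 'BEGIN @{ printf "%.0f to %.0f\n", -(2^31), 2^31 - 1 @}'
+@print{} -2147483648 to 2147483647
+$ awk 'BEGIN @{ printf "0 to %.0f\n", 2^32 - 1 @}'
+@print{} 0 to 4294967295
+@end example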
+
+@cindex double-precision floating-point
+@cindex single-precision floating-point
+Floating-point numbers represent what are called ``real'' numbers; i.e.,
+those that do have a fractional part, such as 3.1415927.
+The advantage to floating-point numbers is that they
+can represent a much larger range of values.
+The disadvantage is that there are numbers that they cannot represent
+exactly.
+@command{awk} uses @dfn{double-precision} floating-point numbers, which
+can hold more digits than @dfn{single-precision}
+floating-point numbers.
+Floating-point issues are discussed more fully in
+@ref{Floating Point Issues}.
+
+At the very lowest level, computers store values as groups of binary digits,
+or @dfn{bits}.  Modern computers group bits into groups of eight, called @dfn{bytes}.
+Advanced applications sometimes have to manipulate bits directly,
+and @command{gawk} provides functions for doing so.
+
+@cindex null strings
+While you are probably used to the idea of a number without a value (i.e., zero),
+it takes a bit more getting used to the idea of zero-length character data.
+Nevertheless, such a thing exists.
+It is called the @dfn{null string}.
+The null string is character data that has no value.
+In other words, it is empty.  It is written in @command{awk} programs
+like this: @code{""}.
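+
+A minimal illustration (the variable name @code{s} is arbitrary) shows
+that the null string has a length of zero and contributes nothing
+when concatenated:
+
+@example
+$ awk 'BEGIN @{ s = ""; print length(s); print "[" s "]" @}'
+@print{} 0
+@print{} []
+@end example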
+
+Humans are used to working in decimal; i.e., base 10.  In base 10,
+numbers go from 0 to 9, and then ``roll over'' into the next
+column.  (Remember grade school? 42 is 4 times 10 plus 2.)
+
+There are other number bases though.  Computers commonly use base 2
+or @dfn{binary}, base 8 or @dfn{octal}, and base 16 or @dfn{hexadecimal}.
+In binary, each column represents two times the value in the column to
+its right. Each column may contain either a 0 or a 1.
+Thus, binary 1010 represents 1 times 8, plus 0 times 4, plus 1 times 2,
+plus 0 times 1, or decimal 10.
+Octal and hexadecimal are discussed more in
+@ref{Nondecimal-numbers}.
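+
+The column arithmetic just described can be sketched as a small
+@command{awk} function.  The name @code{bin2dec} is hypothetical; it is
+not a built-in function, and the code assumes that its argument
+contains only the characters @samp{0} and @samp{1}:
+
+@example
+# bin2dec --- convert a string of binary digits to a number
+function bin2dec(s,    i, n)
+@{
+    n = 0
+    for (i = 1; i <= length(s); i++)
+        # moving one column left doubles what we have so far
+        n = n * 2 + substr(s, i, 1)
+    return n
+@}
+
+BEGIN @{ print bin2dec("1010") @}
+@end example
+
+@noindent
+Run as is, this prints @samp{10}, matching the calculation done by hand.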
+
+Programs are written in programming languages.
+Hundreds, if not thousands, of programming languages exist.
+One of the most popular is the C programming language.
+The C language had a very strong influence on the design of
+the @command{awk} language.
+
+@cindex Kernighan, Brian
+@cindex Ritchie, Dennis
+There have been several versions of C.  The first is often referred to
+as ``K&R'' C, after the initials of Brian Kernighan and Dennis Ritchie,
+the authors of the first book on C.  (Dennis Ritchie created the language,
+and Brian Kernighan was one of the creators of @command{awk}.)
+
+In the mid-1980s, an effort began to produce an international standard
+for C.  This work culminated in 1989, with the production of the ANSI
+standard for C.  This standard became an ISO standard in 1990.
+Where it makes sense, POSIX @command{awk} is compatible with 1990 ISO C.
+
+In 1999, a revised ISO C standard was approved and released.
+Future versions of @command{gawk} will be as compatible as possible
+with this standard.
+
+@node Floating Point Issues
+@appendixsec Floating-Point Number Caveats
+
+As mentioned earlier, floating-point numbers represent what are called
+``real'' numbers, i.e., those that have a fractional part.  @command{awk}
+uses double-precision floating-point numbers to represent all
+numeric values.  This @value{SECTION} describes some of the issues
+involved in using floating-point numbers.
+
+There is a very nice paper on floating-point arithmetic by
+David Goldberg, ``What Every
+Computer Scientist Should Know About Floating-point Arithmetic,''
+@cite{ACM Computing Surveys} @strong{23}, 1 (1991-03),
+5-48.@footnote{@uref{http://www.validlab.com/goldberg/paper.ps}.}
+This is worth reading if you are interested in the details,
+but it does require a background in computer science.
+
+Internally, @command{awk} keeps both the numeric value
+(double-precision floating-point) and the string value for a variable.
+Separately, @command{awk} keeps
+track of what type the variable has
+(@pxref{Typing and Comparison}),
+which plays a role in how variables are used in comparisons.
+
+It is important to note that the string value for a number may not
+reflect the full value (all the digits) that the numeric value
+actually contains.
+The following program (@file{values.awk}) illustrates this:
+
+@example
+@{
+   $1 = $2 + $3
+   # see it for what it is
+   printf("$1 = %.12g\n", $1)
+   # use CONVFMT
+   a = "<" $1 ">"
+   print "a =", a
+@group
+   # use OFMT
+   print "$1 =", $1
+@end group
+@}
+@end example
+
+@noindent
+This program shows the full value of the sum of @code{$2} and @code{$3}
+using @code{printf}, and then prints the string values obtained
+from both automatic conversion (via @code{CONVFMT}) and
+from printing (via @code{OFMT}).
+
+Here is what happens when the program is run:
+
+@example
+$ echo 2 3.654321 1.2345678 | awk -f values.awk
+@print{} $1 = 4.8888888
+@print{} a = <4.88889>
+@print{} $1 = 4.88889
+@end example
+
+This makes it clear that the full numeric value is different from
+what the default string representations show.
+
+@code{CONVFMT}'s default value is @code{"%.6g"}, which yields a value with
+at most six significant digits.  For some applications, you might want to
+change it to specify more precision.
+On most modern machines, most of the time,
+17 digits is enough to capture a floating-point number's
+value exactly.@footnote{Pathological cases can require up to
+752 digits (!), but we doubt that you need to worry about this.}
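+
+As a small sketch of the effect, the earlier input can be run again with
+@code{CONVFMT} raised to @code{"%.12g"}, the same precision that
+the @code{printf} call in @file{values.awk} uses.  The variable @code{x}
+is introduced here only for the illustration:
+
+@example
+$ echo 2 3.654321 1.2345678 |
+> awk '@{ CONVFMT = "%.12g"; x = $2 + $3; a = "<" x ">"; print "a =", a @}'
+@print{} a = <4.8888888>
+@end example
+
+@noindent
+The automatic conversion now keeps all of the digits that the default
+setting dropped.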
+
+@cindex floating-point
+Unlike numbers in the abstract sense (such as what you studied in high school
+or college math), numbers stored in computers are limited in certain ways.
+They cannot represent an infinite number of digits, nor can they always
+represent things exactly.
+In particular,
+floating-point numbers cannot
+always represent values exactly.  Here is an example:
+
+@example
+$ awk '@{ printf("%010d\n", $1 * 100) @}'
+515.79
+@print{} 0000051579
+515.80
+@print{} 0000051579
+515.81
+@print{} 0000051580
+515.82
+@print{} 0000051582
+@kbd{@value{CTL}-d}
+@end example
+
+@noindent
+This shows that some values can be represented exactly,
+whereas others are only approximated.  This is not a ``bug''
+in @command{awk}, but simply an artifact of how computers
+represent numbers.
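+
+One common way to avoid this kind of surprise, shown here only as a
+sketch and not as the only possible approach, is to have @code{printf}
+round the value instead of truncating it.  The @samp{%d} format
+discards any fractional part, whereas @samp{%.0f} rounds to the nearest
+whole number:
+
+@example
+$ echo 515.80 | awk '@{ printf("%010.0f\n", $1 * 100) @}'
+@print{} 0000051580
+@end example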
+
+@cindex negative zero
+@cindex positive zero
+@c comma is part of primary
+@cindex zero, negative vs.@: positive
+Another peculiarity of floating-point numbers on modern systems
+is that they often have more than one representation for the number zero!
+In particular, it is possible to represent ``minus zero'' as well as
+regular, or ``positive'' zero.
+
+This example shows that negative and positive zero are distinct values
+when stored internally, but that they are in fact equal to each other,
+as well as to ``regular'' zero:
+
+@smallexample
+$ gawk 'BEGIN @{ mz = -0 ; pz = 0
+> printf "-0 = %g, +0 = %g, (-0 == +0) -> %d\n", mz, pz, mz == pz
+> printf "mz == 0 -> %d, pz == 0 -> %d\n", mz == 0, pz == 0
+> @}'
+@print{} -0 = -0, +0 = 0, (-0 == +0) -> 1
+@print{} mz == 0 -> 1, pz == 0 -> 1
+@end smallexample
+
+It helps to keep this in mind should you process numeric data
+that contains negative zero values; the fact that the zero is negative
+is noted and can affect comparisons.
+@c ENDOFRANGE procon
+
+@node Glossary
+@unnumbered Glossary
+
+@table @asis
+@item Action
+A series of @command{awk} statements attached to a rule.  If the rule's
+pattern matches an input record, @command{awk} executes the
+rule's action.  Actions are always enclosed in curly braces.
+(@xref{Action Overview}.)
+
+@cindex Spencer, Henry
+@cindex @command{sed} utility
+@cindex amazing @command{awk} assembler (@command{aaa})
+@item Amazing @command{awk} Assembler
+Henry Spencer at the University of Toronto wrote a retargetable assembler
+completely as @command{sed} and @command{awk} scripts.  It is thousands
+of lines long, including machine descriptions for several eight-bit
+microcomputers.  It is a good example of a program that would have been
+better written in another language.
+You can get it from @uref{ftp://ftp.freefriends.org/arnold/Awkstuff/aaa.tgz}.
+
+@cindex amazingly workable formatter (@command{awf})
+@cindex @command{awf} (amazingly workable formatter) program
+@item Amazingly Workable Formatter (@command{awf})
+Henry Spencer at the University of Toronto wrote a formatter that accepts
+a large subset of the @samp{nroff -ms} and @samp{nroff -man} formatting
+commands, using @command{awk} and @command{sh}.
+It is available over the Internet
+from @uref{ftp://ftp.freefriends.org/arnold/Awkstuff/awf.tgz}.
+
+@item Anchor
+The regexp metacharacters @samp{^} and @samp{$}, which force the match
+to the beginning or end of the string, respectively.
+
+@cindex ANSI
+@item ANSI
+The American National Standards Institute.  This organization produces
+many standards, among them the standards for the C and C++ programming
+languages.
+These standards often become international standards as well. See also
+``ISO.''
+
+@item Array
+A grouping of multiple values under the same name.
+Most languages just provide sequential arrays.
+@command{awk} provides associative arrays.
+
+@item Assertion
+A statement in a program that a condition is true at this point in the program.
+Useful for reasoning about how a program is supposed to behave.
+
+@item Assignment
+An @command{awk} expression that changes the value of some @command{awk}
+variable or data object.  An object that you can assign to is called an
+@dfn{lvalue}.  The assigned values are called @dfn{rvalues}.
+@xref{Assignment Ops}.
+
+@item Associative Array
+Arrays in which the indices may be numbers or strings, not just
+sequential integers in a fixed range.
+
+@item @command{awk} Language
+The language in which @command{awk} programs are written.
+
+@item @command{awk} Program
+An @command{awk} program consists of a series of @dfn{patterns} and
+@dfn{actions}, collectively known as @dfn{rules}.  For each input record
+given to the program, the program's rules are all processed in turn.
+@command{awk} programs may also contain function definitions.
+
+@item @command{awk} Script
+Another name for an @command{awk} program.
+
+@item Bash
+The GNU version of the standard shell
+@ifnotinfo
+(the @b{B}ourne-@b{A}gain @b{SH}ell).
+@end ifnotinfo
+@ifinfo
+(the Bourne-Again SHell).
+@end ifinfo
+See also ``Bourne Shell.''
+
+@item BBS
+See ``Bulletin Board System.''
+
+@item Bit
+Short for ``Binary Digit.''
+All values in computer memory ultimately reduce to binary digits: values
+that are either zero or one.
+Groups of bits may be interpreted differently---as integers,
+floating-point numbers, character data, addresses of other
+memory objects, or other data.
+@command{awk} lets you work with floating-point numbers and strings.
+@command{gawk} lets you manipulate bit values with the built-in
+functions described in
+@ref{Bitwise Functions}.
+
+Computers are often defined by how many bits they use to represent integer
+values.  Typical systems are 32-bit systems, but 64-bit systems are
+becoming increasingly popular, and 16-bit systems are waning in
+popularity.
+
+@item Boolean Expression
+Named after the English mathematician Boole. See also ``Logical Expression.''
+
+@item Bourne Shell
+The standard shell (@file{/bin/sh}) on Unix and Unix-like systems,
+originally written by Stephen R.@: Bourne.
+Many shells (@command{bash}, @command{ksh}, @command{pdksh}, @command{zsh}) are
+generally upwardly compatible with the Bourne shell.
+
+@item Built-in Function
+The @command{awk} language provides built-in functions that perform various
+numerical, I/O-related, and string computations.  Examples are
+@code{sqrt} (for the square root of a number) and @code{substr} (for a
+substring of a string).
+@command{gawk} provides functions for timestamp management, bit manipulation,
+and runtime string translation.
+(@xref{Built-in}.)
+
+@item Built-in Variable
+@code{ARGC},
+@code{ARGV},
+@code{CONVFMT},
+@code{ENVIRON},
+@code{FILENAME},
+@code{FNR},
+@code{FS},
+@code{NF},
+@code{NR},
+@code{OFMT},
+@code{OFS},
+@code{ORS},
+@code{RLENGTH},
+@code{RSTART},
+@code{RS},
+and
+@code{SUBSEP}
+are the variables that have special meaning to @command{awk}.
+In addition,
+@code{ARGIND},
+@code{BINMODE},
+@code{ERRNO},
+@code{FIELDWIDTHS},
+@code{IGNORECASE},
+@code{LINT},
+@code{PROCINFO},
+@code{RT},
+and
+@code{TEXTDOMAIN}
+are the variables that have special meaning to @command{gawk}.
+Changing some of them affects @command{awk}'s running environment.
+(@xref{Built-in Variables}.)
+
+@item Braces
+See ``Curly Braces.''
+
+@item Bulletin Board System
+A computer system allowing users to log in and read and/or leave messages
+for other users of the system, much like leaving paper notes on a bulletin
+board.
+
+@item C
+The system programming language that most GNU software is written in.  The
+@command{awk} programming language has C-like syntax, and this @value{DOCUMENT}
+points out similarities between @command{awk} and C when appropriate.
+
+In general, @command{gawk} attempts to be as similar to the 1990 version
+of ISO C as makes sense.  Future versions of @command{gawk} may adopt features
+from the newer 1999 standard, as appropriate.
+
+@item C++
+A popular object-oriented programming language derived from C.
+
+@cindex ISO 8859-1
+@cindex ISO Latin-1
+@cindex character sets (machine character encodings)
+@item Character Set
+The set of numeric codes used by a computer system to represent the
+characters (letters, numbers, punctuation, etc.) of a particular country
+or place. The most common character set in use today is ASCII (American
+Standard Code for Information Interchange).  Many European
+countries use an extension of ASCII known as ISO-8859-1 (ISO Latin-1).
+
+@cindex @command{chem} utility
+@item CHEM
+A preprocessor for @command{pic} that reads descriptions of molecules
+and produces @command{pic} input for drawing them.
+It was written in @command{awk}
+by Brian Kernighan and Jon Bentley, and is available from
+@uref{http://cm.bell-labs.com/netlib/typesetting/chem.gz}.
+
+@item Coprocess
+A subordinate program with which two-way communication is possible.
+
+@cindex compiled programs
+@item Compiler
+A program that translates human-readable source code into
+machine-executable object code.  The object code is then executed
+directly by the computer.
+See also ``Interpreter.''
+
+@item Compound Statement
+A series of @command{awk} statements, enclosed in curly braces.  Compound
+statements may be nested.
+(@xref{Statements}.)
+
+@item Concatenation
+Concatenating two strings means sticking them together, one after another,
+producing a new string.  For example, the string @samp{foo} concatenated with
+the string @samp{bar} gives the string @samp{foobar}.
+(@xref{Concatenation}.)
+
+@item Conditional Expression
+An expression using the @samp{?:} ternary operator, such as
+@samp{@var{expr1} ? @var{expr2} : @var{expr3}}.  The expression
+@var{expr1} is evaluated; if the result is true, the value of the whole
+expression is the value of @var{expr2}; otherwise the value is
+@var{expr3}.  In either case, only one of @var{expr2} and @var{expr3}
+is evaluated. (@xref{Conditional Exp}.)
+
+@item Comparison Expression
+A relation that is either true or false, such as @samp{(a < b)}.
+Comparison expressions are used in @code{if}, @code{while}, @code{do},
+and @code{for}
+statements, and in patterns to select which input records to process.
+(@xref{Typing and Comparison}.)
+
+@item Curly Braces
+The characters @samp{@{} and @samp{@}}.  Curly braces are used in
+@command{awk} for delimiting actions, compound statements, and function
+bodies.
+
+@cindex dark corner
+@item Dark Corner
+An area in the language where specifications often were (or still
+are) not clear, leading to unexpected or undesirable behavior.
+Such areas are marked in this @value{DOCUMENT} with
+@iftex
+the picture of a flashlight in the margin
+@end iftex
+@ifnottex
+``(d.c.)'' in the text
+@end ifnottex
+and are indexed under the heading ``dark corner.''
+
+@item Data Driven
+A description of @command{awk} programs, where you specify the data you
+are interested in processing, and what to do when that data is seen.
+
+@item Data Objects
+These are numbers and strings of characters.  Numbers are converted into
+strings and vice versa, as needed.
+(@xref{Conversion}.)
+
+@item Deadlock
+The situation in which two communicating processes are each waiting
+for the other to perform an action.
+
+@item Double-Precision
+An internal representation of numbers that can have fractional parts.
+Double-precision numbers keep track of more digits than do single-precision
+numbers, but operations on them are sometimes more expensive.  This is the way
+@command{awk} stores numeric values.  It is the C type @code{double}.
+
+@item Dynamic Regular Expression
+A dynamic regular expression is a regular expression written as an
+ordinary expression.  It could be a string constant, such as
+@code{"foo"}, but it may also be an expression whose value can vary.
+(@xref{Computed Regexps}.)
+
+@item Environment
+A collection of strings, of the form @var{name@code{=}val}, that each
+program has available to it. Users generally place values into the
+environment in order to provide information to various programs. Typical
+examples are the environment variables @env{HOME} and @env{PATH}.
+
+@item Empty String
+See ``Null String.''
+
+@cindex epoch, definition of
+@item Epoch
+The date used as the ``beginning of time'' for timestamps.
+Time values in Unix systems are represented as seconds since the epoch,
+with library functions available for converting these values into
+standard date and time formats.
+
+The epoch on Unix and POSIX systems is 1970-01-01 00:00:00 UTC.
+See also ``GMT'' and ``UTC.''
+
+@item Escape Sequences
+A special sequence of characters used for describing nonprinting
+characters, such as @samp{\n} for newline or @samp{\033} for the ASCII
+ESC (Escape) character. (@xref{Escape Sequences}.)
+
+@item FDL
+See ``Free Documentation License.''
+
+@item Field
+When @command{awk} reads an input record, it splits the record into pieces
+separated by whitespace (or by a separator regexp that you can
+change by setting the built-in variable @code{FS}).  Such pieces are
+called fields.  If the pieces are of fixed length, you can use the built-in
+variable @code{FIELDWIDTHS} to describe their lengths.
+(@xref{Field Separators},
+and
+@ref{Constant Size}.)
+
+@item Flag
+A variable whose truth value indicates the existence or nonexistence
+of some condition.
+
+@item Floating-Point Number
+Often referred to in mathematical terms as a ``rational'' or real number,
+this is just a number that can have a fractional part.
+See also ``Double-Precision'' and ``Single-Precision.''
+
+@item Format
+Format strings are used to control the appearance of output in the
+@code{strftime} and @code{sprintf} functions, and are used in the
+@code{printf} statement as well.  Also, data conversions from numbers to strings
+are controlled by the format string contained in the built-in variable
+@code{CONVFMT}. (@xref{Control Letters}.)
+
+@item Free Documentation License
+This document describes the terms under which this @value{DOCUMENT}
+is published and may be copied. (@xref{GNU Free Documentation License}.)
+
+@item Function
+A specialized group of statements used to encapsulate general
+or program-specific tasks.  @command{awk} has a number of built-in
+functions, and also allows you to define your own.
+(@xref{Functions}.)
+
+@item FSF
+See ``Free Software Foundation.''
+
+@cindex FSF (Free Software Foundation)
+@cindex Free Software Foundation (FSF)
+@cindex Stallman, Richard
+@item Free Software Foundation
+A nonprofit organization dedicated
+to the production and distribution of freely distributable software.
+It was founded by Richard M.@: Stallman, the author of the original
+Emacs editor.  GNU Emacs is the most widely used version of Emacs today.
+
+@item @command{gawk}
+The GNU implementation of @command{awk}.
+
+@cindex GPL (General Public License)
+@cindex General Public License (GPL)
+@cindex GNU General Public License
+@item General Public License
+This document describes the terms under which @command{gawk} and its source
+code may be distributed. (@xref{Copying}.)
+
+@item GMT
+``Greenwich Mean Time.''
+This is the old term for UTC.
+It is the time of day used as the epoch for Unix and POSIX systems.
+See also ``Epoch'' and ``UTC.''
+
+@cindex FSF (Free Software Foundation)
+@cindex Free Software Foundation (FSF)
+@cindex GNU Project
+@item GNU
+``GNU's not Unix.''  An ongoing project of the Free Software Foundation
+to create a complete, freely distributable, POSIX-compliant computing
+environment.
+
+@item GNU/Linux
+A variant of the GNU system using the Linux kernel, instead of the
+Free Software Foundation's Hurd kernel.
+Linux is a stable, efficient, full-featured clone of Unix that has
+been ported to a variety of architectures.
+It is most popular on PC-class systems, but runs well on a variety of
+other systems too.
+The Linux kernel source code is available under the terms of the GNU General
+Public License, which is perhaps its most important aspect.
+
+@item GPL
+See ``General Public License.''
+
+@item Hexadecimal
+Base 16 notation, where the digits are @code{0}--@code{9} and
+@code{A}--@code{F}, with @samp{A}
+representing 10, @samp{B} representing 11, and so on, up to @samp{F} for 15.
+Hexadecimal numbers are written in C using a leading @samp{0x},
+to indicate their base.  Thus, @code{0x12} is 18 (1 times 16 plus 2).
+
+@item I/O
+Abbreviation for ``Input/Output,'' the act of moving data into and/or
+out of a running program.
+
+@item Input Record
+A single chunk of data that is read in by @command{awk}.  Usually, an @command{awk} input
+record consists of one line of text.
+(@xref{Records}.)
+
+@item Integer
+A whole number, i.e., a number that does not have a fractional part.
+
+@item Internationalization
+The process of writing or modifying a program so
+that it can use multiple languages without requiring
+further source code changes.
+
+@cindex interpreted programs
+@item Interpreter
+A program that reads human-readable source code directly, and uses
+the instructions in it to process data and produce results.
+@command{awk} is typically (but not always) implemented as an interpreter.
+See also ``Compiler.''
+
+@item Interval Expression
+A component of a regular expression that lets you specify repeated matches of
+some part of the regexp.  Interval expressions were not traditionally available
+in @command{awk} programs.
+
+@cindex ISO
+@item ISO
+The International Organization for Standardization.
+This organization produces international standards for many things, including
+programming languages, such as C and C++.
+In the computer arena, important standards like those for C, C++, and POSIX
+become both American national and ISO international standards simultaneously.
+This @value{DOCUMENT} refers to Standard C as ``ISO C'' throughout.
+
+@item Keyword
+In the @command{awk} language, a keyword is a word that has special
+meaning.  Keywords are reserved and may not be used as variable names.
+
+@command{gawk}'s keywords are:
+@code{BEGIN},
+@code{END},
+@code{if},
+@code{else},
+@code{while},
+@code{do@dots{}while},
+@code{for},
+@code{for@dots{}in},
+@code{break},
+@code{continue},
+@code{delete},
+@code{next},
+@code{nextfile},
+@code{function},
+@code{func},
+and
+@code{exit}.
+
+@cindex LGPL (Lesser General Public License)
+@cindex Lesser General Public License (LGPL)
+@cindex GNU Lesser General Public License
+@item Lesser General Public License
+The license that describes the terms under which binary library archives
+or shared objects, and their source code, may be distributed.
+
+@item Linux
+See ``GNU/Linux.''
+
+@item LGPL
+See ``Lesser General Public License.''
+
+@item Localization
+The process of providing the data necessary for an
+internationalized program to work in a particular language.
+
+@item Logical Expression
+An expression using the operators for logic, AND, OR, and NOT, written
+@samp{&&}, @samp{||}, and @samp{!} in @command{awk}.  Logical expressions
+are often called Boolean expressions, after George Boole, the mathematician
+who pioneered this kind of mathematical logic.
+
+@item Lvalue
+An expression that can appear on the left side of an assignment
+operator.  In most languages, lvalues can be variables or array
+elements.  In @command{awk}, a field designator can also be used as an
+lvalue.
+
+@item Matching
+The act of testing a string against a regular expression.  If the
+regexp describes the contents of the string, it is said to @dfn{match} it.
+
+@item Metacharacters
+Characters used within a regexp that do not stand for themselves.
+Instead, they denote regular expression operations, such as repetition,
+grouping, or alternation.
+
+@item Null String
+A string with no characters in it.  It is represented explicitly in
+@command{awk} programs by placing two double quote characters next to
+each other (@code{""}).  It can appear in input data by having two successive
+occurrences of the field separator appear next to each other.
+
+@item Number
+A numeric-valued data object.  Modern @command{awk} implementations use
+double-precision floating-point to represent numbers.
+Very old @command{awk} implementations use single-precision floating-point.
+
+@item Octal
+Base-eight notation, where the digits are @code{0}--@code{7}.
+Octal numbers are written in C using a leading @samp{0},
+to indicate their base.  Thus, @code{013} is 11 (one times 8 plus 3).
+
+@cindex P1003.2 POSIX standard
+@item P1003.2
+See ``POSIX.''
+
+@item Pattern
+Patterns tell @command{awk} which input records are interesting to which
+rules.
+
+A pattern is an arbitrary conditional expression against which input is
+tested.  If the condition is satisfied, the pattern is said to @dfn{match}
+the input record.  A typical pattern might compare the input record against
+a regular expression. (@xref{Pattern Overview}.)
+
+@item POSIX
+The name for a series of standards
+@c being developed by the IEEE
+that specify a Portable Operating System interface.  The ``IX'' denotes
+the Unix heritage of these standards.  The main standard of interest for
+@command{awk} users is
+@cite{IEEE Standard for Information Technology, Standard 1003.2-1992,
+Portable Operating System Interface (POSIX) Part 2: Shell and Utilities}.
+Informally, this standard is often referred to as simply ``P1003.2.''
+
+@item Precedence
+The order in which operations are performed when operators are used
+without explicit parentheses.
+
+@item Private
+Variables and/or functions that are meant for use exclusively by library
+functions and not for the main @command{awk} program. Special care must be
+taken when naming such variables and functions.
+(@xref{Library Names}.)
+
+@item Range (of input lines)
+A sequence of consecutive lines from the input file(s).  A pattern
+can specify ranges of input lines for @command{awk} to process or it can
+specify single lines. (@xref{Pattern Overview}.)
+
+@item Recursion
+When a function calls itself, either directly or indirectly.
+If this isn't clear, refer to the entry for ``recursion.''
+
+@item Redirection
+Redirection means performing input from something other than the standard input
+stream, or performing output to something other than the standard output stream.
+
+You can redirect the output of the @code{print} and @code{printf} statements
+to a file or a system command, using the @samp{>}, @samp{>>}, @samp{|}, and @samp{|&}
+operators.  You can redirect input to the @code{getline} statement using
+the @samp{<}, @samp{|}, and @samp{|&} operators.
+(@xref{Redirection},
+and @ref{Getline}.)
+
+@item Regexp
+Short for @dfn{regular expression}.  A regexp is a pattern that denotes a
+set of strings, possibly an infinite set.  For example, the regexp
+@samp{R.*xp} matches any string starting with the letter @samp{R}
+and ending with the letters @samp{xp}.  In @command{awk}, regexps are
+used in patterns and in conditional expressions.  Regexps may contain
+escape sequences. (@xref{Regexp}.)
+
+@item Regular Expression
+See ``regexp.''
+
+@item Regular Expression Constant
+A regular expression constant is a regular expression written within
+slashes, such as @code{/foo/}.  This regular expression is chosen
+when you write the @command{awk} program and cannot be changed during
+its execution. (@xref{Regexp Usage}.)
+
+@item Rule
+A segment of an @command{awk} program that specifies how to process single
+input records.  A rule consists of a @dfn{pattern} and an @dfn{action}.
+@command{awk} reads an input record; then, for each rule, if the input record
+satisfies the rule's pattern, @command{awk} executes the rule's action.
+Otherwise, the rule does nothing for that input record.
+
+@item Rvalue
+A value that can appear on the right side of an assignment operator.
+In @command{awk}, essentially every expression has a value. These values
+are rvalues.
+
+@item Scalar
+A single value, be it a number or a string.
+Regular variables are scalars; arrays and functions are not.
+
+@item Search Path
+In @command{gawk}, a list of directories to search for @command{awk} program source files.
+In the shell, a list of directories to search for executable programs.
+
+@item Seed
+The initial value, or starting point, for a sequence of random numbers.
+
+@item @command{sed}
+See ``Stream Editor.''
+
+@item Shell
+The command interpreter for Unix and POSIX-compliant systems.
+The shell works both interactively and as a programming language
+for batch files, or shell scripts.
+
+@item Short-Circuit
+The nature of the @command{awk} logical operators @samp{&&} and @samp{||}.
+If the value of the entire expression is determinable from evaluating just
+the lefthand side of these operators, the righthand side is not
+evaluated.
+(@xref{Boolean Ops}.)
+
+@item Side Effect
+A side effect occurs when an expression has an effect aside from merely
+producing a value.  Assignment expressions, increment and decrement
+expressions, and function calls have side effects.
+(@xref{Assignment Ops}.)
+
+@item Single-Precision
+An internal representation of numbers that can have fractional parts.
+Single-precision numbers keep track of fewer digits than do double-precision
+numbers, but operations on them are sometimes less expensive in terms of CPU time.
+This is the type used by some very old versions of @command{awk} to store
+numeric values.  It is the C type @code{float}.
+
+@item Space
+The character generated by hitting the space bar on the keyboard.
+
+@item Special File
+A @value{FN} interpreted internally by @command{gawk}, instead of being handed
+directly to the underlying operating system---for example, @file{/dev/stderr}.
+(@xref{Special Files}.)
+
+@item Stream Editor
+A program that reads records from an input stream and processes them one
+or more at a time.  This is in contrast with batch programs, which may
+expect to read their input files in their entirety before starting to do
+anything, as well as with interactive programs, which require input from the
+user.
+
+@item String
+A datum consisting of a sequence of characters, such as @samp{I am a
+string}.  Constant strings are written with double quotes in the
+@command{awk} language and may contain escape sequences.
+(@xref{Escape Sequences}.)
+
+@item Tab
+The character generated by hitting the @kbd{TAB} key on the keyboard.
+It usually expands to up to eight spaces upon output.
+
+@item Text Domain
+A unique name that identifies an application.
+Used for grouping messages that are translated at runtime
+into the local language.
+
+@item Timestamp
+A value in the ``seconds since the epoch'' format used by Unix
+and POSIX systems.  Used for the @command{gawk} functions
+@code{mktime}, @code{strftime}, and @code{systime}.
+See also ``Epoch'' and ``UTC.''
+
+@cindex Linux
+@cindex GNU/Linux
+@cindex Unix
+@cindex BSD-based operating systems
+@cindex NetBSD
+@cindex FreeBSD
+@cindex OpenBSD
+@item Unix
+A computer operating system originally developed in the early 1970s at
+AT&T Bell Laboratories.  It initially became popular in universities around
+the world and later moved into commercial environments as a software
+development system and network server system. There are many commercial
+versions of Unix, as well as several work-alike systems whose source code
+is freely available (such as GNU/Linux, NetBSD, FreeBSD, and OpenBSD).
+
+@item UTC
+The accepted abbreviation for ``Coordinated Universal Time.''
+This is standard time in Greenwich, England, which is used as a
+reference time for day and date calculations.
+See also ``Epoch'' and ``GMT.''
+
+@item Whitespace
+A sequence of space, TAB, or newline characters occurring inside an input
+record or a string.
+@end table
+
+@node Copying
+@unnumbered GNU General Public License
+@center Version 2, June 1991
+
+@display
+Copyright @copyright{} 1989, 1991 Free Software Foundation, Inc.
+59 Temple Place, Suite 330, Boston, MA 02111, USA
+
+Everyone is permitted to copy and distribute verbatim copies
+of this license document, but changing it is not allowed.
+@end display
+
+@c fakenode --- for prepinfo
+@unnumberedsec Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it.  By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software---to make sure the software is free for all its users.  This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it.  (Some other Free Software Foundation software is covered by
+the GNU Library General Public License instead.)  You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price.  Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have.  You must make sure that they, too, receive or can get the
+source code.  And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software.  If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents.  We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary.  To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+@ifnotinfo
+@c fakenode --- for prepinfo
+@unnumberedsec Terms and Conditions for Copying, Distribution and Modification
+@end ifnotinfo
+@ifinfo
+@center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+@end ifinfo
+
+@enumerate 0
+@item
+This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License.  The ``Program'', below,
+refers to any such program or work, and a ``work based on the Program''
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language.  (Hereinafter, translation is included without limitation in
+the term ``modification''.)  Each licensee is addressed as ``you''.
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope.  The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+@item
+You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+@item
+You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+@enumerate a
+@item
+You must cause the modified files to carry prominent notices
+stating that you changed the files and the date of any change.
+
+@item
+You must cause any work that you distribute or publish, that in
+whole or in part contains or is derived from the Program or any
+part thereof, to be licensed as a whole at no charge to all third
+parties under the terms of this License.
+
+@item
+If the modified program normally reads commands interactively
+when run, you must cause it, when started running for such
+interactive use in the most ordinary way, to print or display an
+announcement including an appropriate copyright notice and a
+notice that there is no warranty (or else, saying that you provide
+a warranty) and that users may redistribute the program under
+these conditions, and telling the user how to view a copy of this
+License.  (Exception: if the Program itself is interactive but
+does not normally print such an announcement, your work based on
+the Program is not required to print an announcement.)
+@end enumerate
+
+These requirements apply to the modified work as a whole.  If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works.  But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+@item
+You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+@enumerate a
+@item
+Accompany it with the complete corresponding machine-readable
+source code, which must be distributed under the terms of Sections
+1 and 2 above on a medium customarily used for software interchange; or,
+
+@item
+Accompany it with a written offer, valid for at least three
+years, to give any third party, for a charge no more than your
+cost of physically performing source distribution, a complete
+machine-readable copy of the corresponding source code, to be
+distributed under the terms of Sections 1 and 2 above on a medium
+customarily used for software interchange; or,
+
+@item
+Accompany it with the information you received as to the offer
+to distribute corresponding source code.  (This alternative is
+allowed only for noncommercial distribution and only if you
+received the program in object code or executable form with such
+an offer, in accord with Subsection b above.)
+@end enumerate
+
+The source code for a work means the preferred form of the work for
+making modifications to it.  For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable.  However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+@item
+You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+@item
+You are not required to accept this License, since you have not
+signed it.  However, nothing else grants you permission to modify or
+distribute the Program or its derivative works.  These actions are
+prohibited by law if you do not accept this License.  Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+@item
+Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions.  You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+@item
+If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License.  If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all.  For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices.  Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+@item
+If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded.  In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+@item
+The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time.  Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number.  If the Program
+specifies a version number of this License which applies to it and ``any
+later version'', you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation.  If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+@item
+If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission.  For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this.  Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+@ifnotinfo
+@c fakenode --- for prepinfo
+@heading NO WARRANTY
+@end ifnotinfo
+@ifinfo
+@center NO WARRANTY
+@end ifinfo
+
+@item
+BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW@.  EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE@.  THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU@.  SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+@item
+IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+@end enumerate
+
+@ifnotinfo
+@c fakenode --- for prepinfo
+@heading END OF TERMS AND CONDITIONS
+@end ifnotinfo
+@ifinfo
+@center END OF TERMS AND CONDITIONS
+@end ifinfo
+
+@page
+@c fakenode --- for prepinfo
+@unnumberedsec How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program.  It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the ``copyright'' line and a pointer to where the full notice is found.
+
+@smallexample
+@var{one line to give the program's name and an idea of what it does.}
+Copyright (C) @var{year}  @var{name of author}
+
+This program is free software; you can redistribute it and/or
+modify it under the terms of the GNU General Public License
+as published by the Free Software Foundation; either version 2
+of the License, or (at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE@.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.
+@end smallexample
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+@smallexample
+Gnomovision version 69, Copyright (C) @var{year} @var{name of author}
+Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
+type `show w'.  This is free software, and you are welcome
+to redistribute it under certain conditions; type `show c'
+for details.
+@end smallexample
+
+The hypothetical commands @samp{show w} and @samp{show c} should show
+the appropriate parts of the General Public License.  Of course, the
+commands you use may be called something other than @samp{show w} and
+@samp{show c}; they could even be mouse-clicks or menu items---whatever
+suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a ``copyright disclaimer'' for the program, if
+necessary.  Here is a sample; alter the names:
+
+@smallexample
+@group
+Yoyodyne, Inc., hereby disclaims all copyright
+interest in the program `Gnomovision'
+(which makes passes at compilers) written
+by James Hacker.
+
+@var{signature of Ty Coon}, 1 April 1989
+Ty Coon, President of Vice
+@end group
+@end smallexample
+
+This General Public License does not permit incorporating your program into
+proprietary programs.  If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library.  If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
+
+@node GNU Free Documentation License
+@unnumbered GNU Free Documentation License
+
+@cindex FDL (Free Documentation License)
+@cindex Free Documentation License (FDL)
+@cindex GNU Free Documentation License
+@center Version 1.2, November 2002
+
+@display
+Copyright @copyright{} 2000,2001,2002 Free Software Foundation, Inc.
+59 Temple Place, Suite 330, Boston, MA  02111-1307, USA
+
+Everyone is permitted to copy and distribute verbatim copies
+of this license document, but changing it is not allowed.
+@end display
+
+@enumerate 0
+@item
+PREAMBLE
+
+The purpose of this License is to make a manual, textbook, or other
+functional and useful document @dfn{free} in the sense of freedom: to
+assure everyone the effective freedom to copy and redistribute it,
+with or without modifying it, either commercially or noncommercially.
+Secondarily, this License preserves for the author and publisher a way
+to get credit for their work, while not being considered responsible
+for modifications made by others.
+
+This License is a kind of ``copyleft'', which means that derivative
+works of the document must themselves be free in the same sense.  It
+complements the GNU General Public License, which is a copyleft
+license designed for free software.
+
+We have designed this License in order to use it for manuals for free
+software, because free software needs free documentation: a free
+program should come with manuals providing the same freedoms that the
+software does.  But this License is not limited to software manuals;
+it can be used for any textual work, regardless of subject matter or
+whether it is published as a printed book.  We recommend this License
+principally for works whose purpose is instruction or reference.
+
+@item
+APPLICABILITY AND DEFINITIONS
+
+This License applies to any manual or other work, in any medium, that
+contains a notice placed by the copyright holder saying it can be
+distributed under the terms of this License.  Such a notice grants a
+world-wide, royalty-free license, unlimited in duration, to use that
+work under the conditions stated herein.  The ``Document'', below,
+refers to any such manual or work.  Any member of the public is a
+licensee, and is addressed as ``you''.  You accept the license if you
+copy, modify or distribute the work in a way requiring permission
+under copyright law.
+
+A ``Modified Version'' of the Document means any work containing the
+Document or a portion of it, either copied verbatim, or with
+modifications and/or translated into another language.
+
+A ``Secondary Section'' is a named appendix or a front-matter section
+of the Document that deals exclusively with the relationship of the
+publishers or authors of the Document to the Document's overall
+subject (or to related matters) and contains nothing that could fall
+directly within that overall subject.  (Thus, if the Document is in
+part a textbook of mathematics, a Secondary Section may not explain
+any mathematics.)  The relationship could be a matter of historical
+connection with the subject or with related matters, or of legal,
+commercial, philosophical, ethical or political position regarding
+them.
+
+The ``Invariant Sections'' are certain Secondary Sections whose titles
+are designated, as being those of Invariant Sections, in the notice
+that says that the Document is released under this License.  If a
+section does not fit the above definition of Secondary then it is not
+allowed to be designated as Invariant.  The Document may contain zero
+Invariant Sections.  If the Document does not identify any Invariant
+Sections then there are none.
+
+The ``Cover Texts'' are certain short passages of text that are listed,
+as Front-Cover Texts or Back-Cover Texts, in the notice that says that
+the Document is released under this License.  A Front-Cover Text may
+be at most 5 words, and a Back-Cover Text may be at most 25 words.
+
+A ``Transparent'' copy of the Document means a machine-readable copy,
+represented in a format whose specification is available to the
+general public, that is suitable for revising the document
+straightforwardly with generic text editors or (for images composed of
+pixels) generic paint programs or (for drawings) some widely available
+drawing editor, and that is suitable for input to text formatters or
+for automatic translation to a variety of formats suitable for input
+to text formatters.  A copy made in an otherwise Transparent file
+format whose markup, or absence of markup, has been arranged to thwart
+or discourage subsequent modification by readers is not Transparent.
+An image format is not Transparent if used for any substantial amount
+of text.  A copy that is not ``Transparent'' is called ``Opaque''.
+
+Examples of suitable formats for Transparent copies include plain
+@sc{ascii} without markup, Texinfo input format, La@TeX{} input
+format, @acronym{SGML} or @acronym{XML} using a publicly available
+@acronym{DTD}, and standard-conforming simple @acronym{HTML},
+PostScript or @acronym{PDF} designed for human modification.  Examples
+of transparent image formats include @acronym{PNG}, @acronym{XCF} and
+@acronym{JPG}.  Opaque formats include proprietary formats that can be
+read and edited only by proprietary word processors, @acronym{SGML} or
+@acronym{XML} for which the @acronym{DTD} and/or processing tools are
+not generally available, and the machine-generated @acronym{HTML},
+PostScript or @acronym{PDF} produced by some word processors for
+output purposes only.
+
+The ``Title Page'' means, for a printed book, the title page itself,
+plus such following pages as are needed to hold, legibly, the material
+this License requires to appear in the title page.  For works in
+formats which do not have any title page as such, ``Title Page'' means
+the text near the most prominent appearance of the work's title,
+preceding the beginning of the body of the text.
+
+A section ``Entitled XYZ'' means a named subunit of the Document whose
+title either is precisely XYZ or contains XYZ in parentheses following
+text that translates XYZ in another language.  (Here XYZ stands for a
+specific section name mentioned below, such as ``Acknowledgements'',
+``Dedications'', ``Endorsements'', or ``History''.)  To ``Preserve the Title''
+of such a section when you modify the Document means that it remains a
+section ``Entitled XYZ'' according to this definition.
+
+The Document may include Warranty Disclaimers next to the notice which
+states that this License applies to the Document.  These Warranty
+Disclaimers are considered to be included by reference in this
+License, but only as regards disclaiming warranties: any other
+implication that these Warranty Disclaimers may have is void and has
+no effect on the meaning of this License.
+
+@item
+VERBATIM COPYING
+
+You may copy and distribute the Document in any medium, either
+commercially or noncommercially, provided that this License, the
+copyright notices, and the license notice saying this License applies
+to the Document are reproduced in all copies, and that you add no other
+conditions whatsoever to those of this License.  You may not use
+technical measures to obstruct or control the reading or further
+copying of the copies you make or distribute.  However, you may accept
+compensation in exchange for copies.  If you distribute a large enough
+number of copies you must also follow the conditions in section 3.
+
+You may also lend copies, under the same conditions stated above, and
+you may publicly display copies.
+
+@item
+COPYING IN QUANTITY
+
+If you publish printed copies (or copies in media that commonly have
+printed covers) of the Document, numbering more than 100, and the
+Document's license notice requires Cover Texts, you must enclose the
+copies in covers that carry, clearly and legibly, all these Cover
+Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
+the back cover.  Both covers must also clearly and legibly identify
+you as the publisher of these copies.  The front cover must present
+the full title with all words of the title equally prominent and
+visible.  You may add other material on the covers in addition.
+Copying with changes limited to the covers, as long as they preserve
+the title of the Document and satisfy these conditions, can be treated
+as verbatim copying in other respects.
+
+If the required texts for either cover are too voluminous to fit
+legibly, you should put the first ones listed (as many as fit
+reasonably) on the actual cover, and continue the rest onto adjacent
+pages.
+
+If you publish or distribute Opaque copies of the Document numbering
+more than 100, you must either include a machine-readable Transparent
+copy along with each Opaque copy, or state in or with each Opaque copy
+a computer-network location from which the general network-using
+public has access to download using public-standard network protocols
+a complete Transparent copy of the Document, free of added material.
+If you use the latter option, you must take reasonably prudent steps,
+when you begin distribution of Opaque copies in quantity, to ensure
+that this Transparent copy will remain thus accessible at the stated
+location until at least one year after the last time you distribute an
+Opaque copy (directly or through your agents or retailers) of that
+edition to the public.
+
+It is requested, but not required, that you contact the authors of the
+Document well before redistributing any large number of copies, to give
+them a chance to provide you with an updated version of the Document.
+
+@item
+MODIFICATIONS
+
+You may copy and distribute a Modified Version of the Document under
+the conditions of sections 2 and 3 above, provided that you release
+the Modified Version under precisely this License, with the Modified
+Version filling the role of the Document, thus licensing distribution
+and modification of the Modified Version to whoever possesses a copy
+of it.  In addition, you must do these things in the Modified Version:
+
+@enumerate A
+@item
+Use in the Title Page (and on the covers, if any) a title distinct
+from that of the Document, and from those of previous versions
+(which should, if there were any, be listed in the History section
+of the Document).  You may use the same title as a previous version
+if the original publisher of that version gives permission.
+
+@item
+List on the Title Page, as authors, one or more persons or entities
+responsible for authorship of the modifications in the Modified
+Version, together with at least five of the principal authors of the
+Document (all of its principal authors, if it has fewer than five),
+unless they release you from this requirement.
+
+@item
+State on the Title Page the name of the publisher of the
+Modified Version, as the publisher.
+
+@item
+Preserve all the copyright notices of the Document.
+
+@item
+Add an appropriate copyright notice for your modifications
+adjacent to the other copyright notices.
+
+@item
+Include, immediately after the copyright notices, a license notice
+giving the public permission to use the Modified Version under the
+terms of this License, in the form shown in the Addendum below.
+
+@item
+Preserve in that license notice the full lists of Invariant Sections
+and required Cover Texts given in the Document's license notice.
+
+@item
+Include an unaltered copy of this License.
+
+@item
+Preserve the section Entitled ``History'', Preserve its Title, and add
+to it an item stating at least the title, year, new authors, and
+publisher of the Modified Version as given on the Title Page.  If
+there is no section Entitled ``History'' in the Document, create one
+stating the title, year, authors, and publisher of the Document as
+given on its Title Page, then add an item describing the Modified
+Version as stated in the previous sentence.
+
+@item
+Preserve the network location, if any, given in the Document for
+public access to a Transparent copy of the Document, and likewise
+the network locations given in the Document for previous versions
+it was based on.  These may be placed in the ``History'' section.
+You may omit a network location for a work that was published at
+least four years before the Document itself, or if the original
+publisher of the version it refers to gives permission.
+
+@item
+For any section Entitled ``Acknowledgements'' or ``Dedications'', Preserve
+the Title of the section, and preserve in the section all the
+substance and tone of each of the contributor acknowledgements and/or
+dedications given therein.
+
+@item
+Preserve all the Invariant Sections of the Document,
+unaltered in their text and in their titles.  Section numbers
+or the equivalent are not considered part of the section titles.
+
+@item
+Delete any section Entitled ``Endorsements''.  Such a section
+may not be included in the Modified Version.
+
+@item
+Do not retitle any existing section to be Entitled ``Endorsements'' or
+to conflict in title with any Invariant Section.
+
+@item
+Preserve any Warranty Disclaimers.
+@end enumerate
+
+If the Modified Version includes new front-matter sections or
+appendices that qualify as Secondary Sections and contain no material
+copied from the Document, you may at your option designate some or all
+of these sections as invariant.  To do this, add their titles to the
+list of Invariant Sections in the Modified Version's license notice.
+These titles must be distinct from any other section titles.
+
+You may add a section Entitled ``Endorsements'', provided it contains
+nothing but endorsements of your Modified Version by various
+parties---for example, statements of peer review or that the text has
+been approved by an organization as the authoritative definition of a
+standard.
+
+You may add a passage of up to five words as a Front-Cover Text, and a
+passage of up to 25 words as a Back-Cover Text, to the end of the list
+of Cover Texts in the Modified Version.  Only one passage of
+Front-Cover Text and one of Back-Cover Text may be added by (or
+through arrangements made by) any one entity.  If the Document already
+includes a cover text for the same cover, previously added by you or
+by arrangement made by the same entity you are acting on behalf of,
+you may not add another; but you may replace the old one, on explicit
+permission from the previous publisher that added the old one.
+
+The author(s) and publisher(s) of the Document do not by this License
+give permission to use their names for publicity for or to assert or
+imply endorsement of any Modified Version.
+
+@item
+COMBINING DOCUMENTS
+
+You may combine the Document with other documents released under this
+License, under the terms defined in section 4 above for modified
+versions, provided that you include in the combination all of the
+Invariant Sections of all of the original documents, unmodified, and
+list them all as Invariant Sections of your combined work in its
+license notice, and that you preserve all their Warranty Disclaimers.
+
+The combined work need only contain one copy of this License, and
+multiple identical Invariant Sections may be replaced with a single
+copy.  If there are multiple Invariant Sections with the same name but
+different contents, make the title of each such section unique by
+adding at the end of it, in parentheses, the name of the original
+author or publisher of that section if known, or else a unique number.
+Make the same adjustment to the section titles in the list of
+Invariant Sections in the license notice of the combined work.
+
+In the combination, you must combine any sections Entitled ``History''
+in the various original documents, forming one section Entitled
+``History''; likewise combine any sections Entitled ``Acknowledgements'',
+and any sections Entitled ``Dedications''.  You must delete all
+sections Entitled ``Endorsements''.
+
+@item
+COLLECTIONS OF DOCUMENTS
+
+You may make a collection consisting of the Document and other documents
+released under this License, and replace the individual copies of this
+License in the various documents with a single copy that is included in
+the collection, provided that you follow the rules of this License for
+verbatim copying of each of the documents in all other respects.
+
+You may extract a single document from such a collection, and distribute
+it individually under this License, provided you insert a copy of this
+License into the extracted document, and follow this License in all
+other respects regarding verbatim copying of that document.
+
+@item
+AGGREGATION WITH INDEPENDENT WORKS
+
+A compilation of the Document or its derivatives with other separate
+and independent documents or works, in or on a volume of a storage or
+distribution medium, is called an ``aggregate'' if the copyright
+resulting from the compilation is not used to limit the legal rights
+of the compilation's users beyond what the individual works permit.
+When the Document is included in an aggregate, this License does not
+apply to the other works in the aggregate which are not themselves
+derivative works of the Document.
+
+If the Cover Text requirement of section 3 is applicable to these
+copies of the Document, then if the Document is less than one half of
+the entire aggregate, the Document's Cover Texts may be placed on
+covers that bracket the Document within the aggregate, or the
+electronic equivalent of covers if the Document is in electronic form.
+Otherwise they must appear on printed covers that bracket the whole
+aggregate.
+
+@item
+TRANSLATION
+
+Translation is considered a kind of modification, so you may
+distribute translations of the Document under the terms of section 4.
+Replacing Invariant Sections with translations requires special
+permission from their copyright holders, but you may include
+translations of some or all Invariant Sections in addition to the
+original versions of these Invariant Sections.  You may include a
+translation of this License, and all the license notices in the
+Document, and any Warranty Disclaimers, provided that you also include
+the original English version of this License and the original versions
+of those notices and disclaimers.  In case of a disagreement between
+the translation and the original version of this License or a notice
+or disclaimer, the original version will prevail.
+
+If a section in the Document is Entitled ``Acknowledgements'',
+``Dedications'', or ``History'', the requirement (section 4) to Preserve
+its Title (section 1) will typically require changing the actual
+title.
+
+@item
+TERMINATION
+
+You may not copy, modify, sublicense, or distribute the Document except
+as expressly provided for under this License.  Any other attempt to
+copy, modify, sublicense or distribute the Document is void, and will
+automatically terminate your rights under this License.  However,
+parties who have received copies, or rights, from you under this
+License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+@item
+FUTURE REVISIONS OF THIS LICENSE
+
+The Free Software Foundation may publish new, revised versions
+of the GNU Free Documentation License from time to time.  Such new
+versions will be similar in spirit to the present version, but may
+differ in detail to address new problems or concerns.  See
+@uref{http://www.gnu.org/copyleft/}.
+
+Each version of the License is given a distinguishing version number.
+If the Document specifies that a particular numbered version of this
+License ``or any later version'' applies to it, you have the option of
+following the terms and conditions either of that specified version or
+of any later version that has been published (not as a draft) by the
+Free Software Foundation.  If the Document does not specify a version
+number of this License, you may choose any version ever published (not
+as a draft) by the Free Software Foundation.
+@end enumerate
+
+@c fakenode --- for prepinfo
+@unnumberedsec ADDENDUM: How to use this License for your documents
+
+To use this License in a document you have written, include a copy of
+the License in the document and put the following copyright and
+license notices just after the title page:
+
+@smallexample
+@group
+  Copyright (C)  @var{year}  @var{your name}.
+  Permission is granted to copy, distribute and/or modify this document
+  under the terms of the GNU Free Documentation License, Version 1.2
+  or any later version published by the Free Software Foundation;
+  with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
+  A copy of the license is included in the section entitled ``GNU
+  Free Documentation License''.
+@end group
+@end smallexample
+
+If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts,
+replace the ``with...Texts.'' line with this:
+
+@smallexample
+@group
+    with the Invariant Sections being @var{list their titles}, with
+    the Front-Cover Texts being @var{list}, and with the Back-Cover Texts
+    being @var{list}.
+@end group
+@end smallexample
+
+If you have Invariant Sections without Cover Texts, or some other
+combination of the three, merge those two alternatives to suit the
+situation.
+
+If your document contains nontrivial examples of program code, we
+recommend releasing these examples in parallel under your choice of
+free software license, such as the GNU General Public License,
+to permit their use in free software.
+
+@c Local Variables:
+@c ispell-local-pdict: "ispell-dict"
+@c End:
+
+
+@node Index
+@unnumbered Index
+@printindex cp
+
+@bye
+
+Unresolved Issues:
+------------------
+1. From ADR.
+
+   Robert J. Chassell points out that awk programs should have some indication
+   of how to use them.  It would be useful to perhaps have a "programming
+   style" section of the manual that would include this and other tips.
+
+2. The default AWKPATH search path should be configurable via `configure'.
+   The default, and how it changes, needs to be documented.
+
+Consistency issues:
+	/.../ regexps are in @code, not @samp
+	".." strings are in @code, not @samp
+	no @print before @dots
+	values of expressions in the text (@code{x} has the value 15),
+		should be in roman, not @code
+	Use   TAB   and not   tab
+	Use   ESC   and not   ESCAPE
+	Use   space and not   blank	to describe the space bar's character
+	The term "blank" is thus basically reserved for "blank lines" etc.
+	To make dark corners work, the @value{DARKCORNER} has to be outside
+		closing `.' of a sentence and after (pxref{...}).  This is
+		a change from earlier versions.
+	" " should have an @w{} around it
+	Use "non-" only with language names or acronyms, or the words bug and option
+	Use @command{ftp} when talking about anonymous ftp
+	Use uppercase and lowercase, not "upper-case" and "lower-case"
+		or "upper case" and "lower case"
+	Use "single precision" and "double precision", not "single-precision" or "double-precision"
+	Use alphanumeric, not alpha-numeric
+	Use POSIX-compliant, not POSIX compliant
+	Use --foo, not -Wfoo when describing long options
+	Use "Bell Laboratories", but not "Bell Labs".
+	Use "behavior" instead of "behaviour".
+	Use "zeros" instead of "zeroes".
+	Use "nonzero" not "non-zero".
+	Use "runtime" not "run time" or "run-time".
+	Use "command-line" not "command line".
+	Use "online" not "on-line".
+	Use "whitespace" not "white space".
+	Use "Input/Output", not "input/output". Also "I/O", not "i/o".
+	Use "lefthand"/"righthand", not "left-hand"/"right-hand".
+	Use "workaround", not "work-around".
+	Use "startup"/"cleanup", not "start-up"/"clean-up"
+	Use @code{do}, and not @code{do}-@code{while}, except where
+		actually discussing the do-while.
+	Use "versus" in text and "vs." in index entries
+	The words "a", "and", "as", "between", "for", "from", "in", "of",
+		"on", "that", "the", "to", "with", and "without",
+		should not be capitalized in @chapter, @section etc.
+		"Into" and "How" should.
+	Search for @dfn; make sure important items are also indexed.
+	"e.g." should always be followed by a comma.
+	"i.e." should always be followed by a comma.
+	The numbers zero through ten should be spelled out, except when
+		talking about file descriptor numbers.  For values greater
+		than ten or less than zero, it's ok to use numerals.
+	In tables, put command-line options in @code, while in the text,
+		put them in @option.
+	When using @strong, use "Note:" or "Caution:" with colons and
+		not exclamation points.  Do not surround the paragraphs
+		with @quotation ... @end quotation.
+	For most cases, do NOT put a comma before "and", "or" or "but".
+		But exercise taste with this rule.
+	Don't show the awk command with a program in quotes when it's
+		just the program.  I.e.
+
+			{
+				....
+			}
+
+		not
+			awk '{
+				...
+			}'
+		
+	Do show it when showing command-line arguments, data files, etc., even
+		if there is no output shown.
+
+	Use numbered lists only to show a sequential series of steps.
+
+	Use @code{xxx} for the xxx operator in indexing statements, not @samp.
+
+Date: Wed, 13 Apr 94 15:20:52 -0400
+From: rms@gnu.org (Richard Stallman)
+To: gnu-prog@gnu.org
+Subject: A reminder: no pathnames in GNU
+
+It's a GNU convention to use the term "file name" for the name of a
+file, never "pathname".  We use the term "path" for search paths,
+which are lists of file names.  Using it for a single file name as
+well is potentially confusing to users.
+
+So please check any documentation you maintain, if you think you might
+have used "pathname".
+
+Note that "file name" should be two words when it appears as ordinary
+text.  It's ok as one word when it's a metasyntactic variable, though.
+
+------------------------
+ORA uses filename, thus the macro.
+
+Suggestions:
+------------
+Enhance FIELDWIDTHS with some way to indicate "the rest of the record".
+E.g., a length of 0 or -1 or something.  Maybe "n"?
+
+Make FIELDWIDTHS be an array?
+
+% Next edition:
+%	1. Talk about common extensions, those in nawk, gawk, mawk
+%	2. Use @code{foo} for variables and @code{foo()} for functions
+%	3. Standardize the error messages from the functions and programs
+%	   in Chapters 12 and 13.
+%	4. Nuke the BBS stuff and use something that won't be obsolete
+%	5. Reorg chapters 5 & 7 like so:
+%Chapter 5:
+% - Constants, Variables, and Conversions
+%   + Constant Expressions
+%   + Using Regular Expression Constants
+%   + Variables
+%   + Conversion of Strings and Numbers
+% - Operators
+%   + Arithmetic Operators
+%   + String Concatenation
+%   + Assignment Expressions
+%   + Increment and Decrement Operators
+% - Truth Values and Conditions
+%   + True and False in Awk
+%   + Boolean Expressions
+%   + Conditional Expressions
+% - Function Calls
+% - Operator Precedence
+%
+%Chapter 7:
+%  - Array Basics
+%    + Introduction to Arrays
+%    + Referring to an Array Element
+%    + Assigning Array Elements
+%    + Basic Array Example
+%    + Scanning All Elements of an Array
+%  - The delete Statement
+%  - Using Numbers to Subscript Arrays
+%  - Using Uninitialized Variables as Subscripts
+%  - Multidimensional Arrays
+%    + Scanning Multidimensional Arrays
+%  - Sorting Array Values and Indices with gawk