Clean up deleted files.
parent 969988f0a6
commit 536d2c03e3
@ -1,8 +0,0 @@

PROG= egrep
SRCS=dfa.c grep.c regex.c
CFLAGS+=-DEGREP
NOMAN= noman
.PATH: ${.CURDIR}/../grep

.include <bsd.prog.mk>
@ -1,249 +0,0 @@

GNU GENERAL PUBLIC LICENSE
Version 1, February 1989

Copyright (C) 1989 Free Software Foundation, Inc.
675 Mass Ave, Cambridge, MA 02139, USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

Preamble

The license agreements of most software companies try to keep users
at the mercy of those companies. By contrast, our General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. The
General Public License applies to the Free Software Foundation's
software and to any other program whose authors commit to using it.
You can use it for your programs, too.

When we speak of free software, we are referring to freedom, not
price. Specifically, the General Public License is designed to make
sure that you have the freedom to give away or sell copies of free
software, that you receive source code or can get it if you want it,
that you can change the software or use pieces of it in new free
programs; and that you know you can do these things.

To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.

For example, if you distribute copies of a such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must tell them their rights.

We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.

The precise terms and conditions for copying, distribution and
modification follow.

GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. This License Agreement applies to any program or other work which
contains a notice placed by the copyright holder saying it may be
distributed under the terms of this General Public License. The
"Program", below, refers to any such program or work, and a "work based
on the Program" means either the Program or any work containing the
Program or a portion of it, either verbatim or with modifications. Each
licensee is addressed as "you".

1. You may copy and distribute verbatim copies of the Program's source
code as you receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice and
disclaimer of warranty; keep intact all the notices that refer to this
General Public License and to the absence of any warranty; and give any
other recipients of the Program a copy of this General Public License
along with the Program. You may charge a fee for the physical act of
transferring a copy.

2. You may modify your copy or copies of the Program or any portion of
it, and copy and distribute such modifications under the terms of Paragraph
1 above, provided that you also do the following:

a) cause the modified files to carry prominent notices stating that
you changed the files and the date of any change; and

b) cause the whole of any work that you distribute or publish, that
in whole or in part contains the Program or any part thereof, either
with or without modifications, to be licensed at no charge to all
third parties under the terms of this General Public License (except
that you may choose to grant warranty protection to some or all
third parties, at your option).

c) If the modified program normally reads commands interactively when
run, you must cause it, when started running for such interactive use
in the simplest and most usual way, to print or display an
announcement including an appropriate copyright notice and a notice
that there is no warranty (or else, saying that you provide a
warranty) and that users may redistribute the program under these
conditions, and telling the user how to view a copy of this General
Public License.

d) You may charge a fee for the physical act of transferring a
copy, and you may at your option offer warranty protection in
exchange for a fee.

Mere aggregation of another independent work with the Program (or its
derivative) on a volume of a storage or distribution medium does not bring
the other work under the scope of these terms.

3. You may copy and distribute the Program (or a portion or derivative of
it, under Paragraph 2) in object code or executable form under the terms of
Paragraphs 1 and 2 above provided that you also do one of the following:

a) accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of
Paragraphs 1 and 2 above; or,

b) accompany it with a written offer, valid for at least three
years, to give any third party free (except for a nominal charge
for the cost of distribution) a complete machine-readable copy of the
corresponding source code, to be distributed under the terms of
Paragraphs 1 and 2 above; or,

c) accompany it with the information you received as to where the
corresponding source code may be obtained. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form alone.)

Source code for a work means the preferred form of the work for making
modifications to it. For an executable file, complete source code means
all the source code for all modules it contains; but, as a special
exception, it need not include source code for modules which are standard
libraries that accompany the operating system on which the executable
file runs, or for standard header files or definitions files that
accompany that operating system.

4. You may not copy, modify, sublicense, distribute or transfer the
Program except as expressly provided under this General Public License.
Any attempt otherwise to copy, modify, sublicense, distribute or transfer
the Program is void, and will automatically terminate your rights to use
the Program under this License. However, parties who have received
copies, or rights to use copies, from you under this General Public
License will not have their licenses terminated so long as such parties
remain in full compliance.

5. By copying, distributing or modifying the Program (or any work based
on the Program) you indicate your acceptance of this license to do so,
and all its terms and conditions.

6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the original
licensor to copy, distribute or modify the Program subject to these
terms and conditions. You may not impose any further restrictions on the
recipients' exercise of the rights granted herein.

7. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

Each version is given a distinguishing version number. If the Program
specifies a version number of the license which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
the license, you may choose any version ever published by the Free Software
Foundation.

8. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.

NO WARRANTY

9. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.

10. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

END OF TERMS AND CONDITIONS

Appendix: How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest
possible use to humanity, the best way to achieve this is to make it
free software which everyone can redistribute and change under these
terms.

To do so, attach the following notices to the program. It is safest to
attach them to the start of each source file to most effectively convey
the exclusion of warranty; and each file should have at least the
"copyright" line and a pointer to where the full notice is found.

<one line to give the program's name and a brief idea of what it does.>
Copyright (C) 19yy <name of author>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 1, or (at your option)
any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:

Gnomovision version 69, Copyright (C) 19xx name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the
appropriate parts of the General Public License. Of course, the
commands you use may be called something other than `show w' and `show
c'; they could even be mouse-clicks or menu items--whatever suits your
program.

You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here a sample; alter the names:

Yoyodyne, Inc., hereby disclaims all copyright interest in the
program `Gnomovision' (a program to direct compilers to make passes
at assemblers) written by James Hacker.

<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice

That's all there is to it!
@ -1,5 +0,0 @@
PROG= fgrep
SRCS= fgrep.c kwset.c obstack.c std.c
NOMAN=noman

.include <bsd.prog.mk>
@ -1,17 +0,0 @@
#DEFS=-DX_strerror
CFLAGS= -O $(DEFS)
LIBS=
FOBJ=fgrep.o kwset.o obstack.o std.o

fgrep: $(FOBJ)
	$(CC) $(CFLAGS) -o fgrep $(FOBJ) $(LIBS)

clean: FRC
	rm -f fgrep core $(FOBJ)

FRC:

fgrep.o: unix.h
kwset.o fgrep.o: kwset.h obstack.h std.h
obstack.o: obstack.h
std.o: std.h unix.h
@ -1,18 +0,0 @@
This is GNU fgrep, version 1.1. Eventually it will disappear into the
single grep program mandated by POSIX, but for now it should be a
compatible replacement for UNIX fgrep. The options are the same
as those of GNU egrep, and are a proper superset of the fgrep options
on all UNIX systems I am aware of.

I have not tried comparing this performance-wise to any of the
various other free fgreps that are floating around, but it should
never be any slower, and for demanding applications will probably
be substantially faster. For single fixed-string searches the
speed is about the same as that of GNU egrep; for multiple fixed
strings it is substantially faster.

Correspondence regarding this program should be addressed directly
to me, mike@ai.mit.edu, *not* bug-gnu-utils. Enjoy.

Mike Haertel
mike@ai.mit.edu
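For readers unfamiliar with what "multiple fixed strings" means here, the following is a minimal sketch of the matching semantics only, assuming the keys arrive as one newline-separated buffer (as the pattern handling in fgrep.c suggests). The helper name `match_any` is hypothetical and does not appear in the deleted sources; it uses a naive scan, not the kwset algorithm that fgrep relies on for speed.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if LINE (LINE_LEN bytes) contains any of the newline-separated
   fixed strings in KEYS (KEYS_LEN bytes), else 0.  Hypothetical helper:
   a naive scan for illustration, not fgrep's kwset matcher. */
static int
match_any(const char *keys, size_t keys_len,
          const char *line, size_t line_len)
{
  const char *beg = keys;
  const char *end = keys + keys_len;

  do
    {
      /* Find the end of the current key. */
      const char *lim = beg;
      while (lim < end && *lim != '\n')
        ++lim;
      size_t klen = (size_t) (lim - beg);

      /* Try every offset at which the key could still fit. */
      for (size_t i = 0; i + klen <= line_len; ++i)
        if (memcmp(line + i, beg, klen) == 0)
          return 1;

      /* Step over the newline separator, if there is one. */
      beg = lim < end ? lim + 1 : lim;
    }
  while (beg < end);

  return 0;
}
```

The cost of this version grows with the number of keys times the length of the text, which is the overhead a keyword-set matcher is designed to avoid.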
@ -1,627 +0,0 @@
/* fgrep.c - grep program built around matcher.
   Copyright 1989 Free Software Foundation
   Written August 1989 by Mike Haertel.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 1, or (at your option)
   any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

   The author may be reached (Email) at the address mike@ai.mit.edu,
   or (US mail) as Mike Haertel c/o Free Software Foundation. */

#include "std.h"
#include "unix.h"

#include <errno.h>
#include <stdio.h>
#include <string.h>

#include "kwset.h"

#define NCHAR (UCHAR_MAX + 1)

/* For error messages. */
static const char *prog;
static int error_seen;

/* Flags controlling the style of output. */
static int out_silent; /* Suppress all normal output. */
static int out_invert; /* Print nonmatching stuff. */
static int out_file; /* Print filenames. */
static int out_line; /* Print line numbers. */
static int out_byte; /* Print byte offsets. */
static int out_before; /* Lines of leading context. */
static int out_after; /* Lines of trailing context. */

/* Print MESG and possibly the error string for ERRNUM. Remember
   that something awful happened. */
static void
DEFUN(error, (mesg, errnum), const char *mesg AND int errnum)
{
  if (errnum)
    fprintf(stderr, "%s: %s: %s\n", prog, mesg, strerror(errnum));
  else
    fprintf(stderr, "%s: %s\n", prog, mesg);
  error_seen = 1;
}

/* Like error(), but die horribly after printing. */
static void
DEFUN(fatal, (mesg, errnum), const char *mesg AND int errnum)
{
  error(mesg, errnum);
  exit(2);
}

/* Interface to handle errors and fix library lossage. */
static PTR
DEFUN(xmalloc, (size), size_t size)
{
  PTR result;

  result = malloc(size);
  if (size && !result)
    fatal("memory exhausted", 0);
  return result;
}

/* Interface to handle errors and fix some library lossage. */
static PTR
DEFUN(xrealloc, (ptr, size), PTR ptr AND size_t size)
{
  PTR result;

  if (ptr)
    result = realloc(ptr, size);
  else
    result = malloc(size);
  if (size && !result)
    fatal("memory exhausted", 0);
  return result;
}

/* Compiled search pattern. */
kwset_t kwset;

/* Flags controlling how pattern matching is performed. */
static int match_fold; /* Fold all letters to one case. */
static int match_words; /* Match only whole words. */
static int match_lines; /* Match only whole lines. */

static void
DEFUN(compile, (pattern, size), const char *pattern AND size_t size)
{
  const char *beg, *lim, *err;
  static char trans[NCHAR];
  int i;

  if (match_fold)
    for (i = 0; i < NCHAR; ++i)
      trans[i] = TOLOWER(i);

  if (!(kwset = kwsalloc(match_fold ? trans : (const char *) NULL)))
    fatal("memory exhausted", 0);

  beg = pattern;
  do
    {
      for (lim = beg; lim < pattern + size && *lim != '\n'; ++lim)
        ;
      if (err = kwsincr(kwset, beg, lim - beg))
        fatal(err, 0);
      if (lim < pattern + size)
        ++lim;
      beg = lim;
    }
  while (beg < pattern + size);

  if (err = kwsprep(kwset))
    fatal(err, 0);
}

static char *
DEFUN(execute, (buf, size), char *buf AND size_t size)
{
  register char *beg, *try;
  register size_t len;
  struct kwsmatch kwsmatch;

  beg = buf;
  for (;beg <= buf + size; ++beg)
    {
      if (!(beg = kwsexec(kwset, beg, buf + size - beg, &kwsmatch)))
        return NULL;;
      len = kwsmatch.size[0];
      if (match_lines)
        {
          if (beg > buf && beg[-1] != '\n')
            continue;
          if (beg + len < buf + size && *(beg + len) != '\n')
            continue;
          return beg;
        }
      else if (match_words)
        for (try = beg; len && try;)
          {
            if (try > buf && (ISALNUM((unsigned char) try[-1])
                              || !ISALNUM((unsigned char) *try)))
              goto retry;
            if (try + len < buf + size
                && (ISALNUM((unsigned char) *(try + len))
                    || !ISALNUM((unsigned char) (try + len)[-1])))
              goto retry;
            return try;
          retry:
            if (--len)
              try = kwsexec(kwset, beg, len, &kwsmatch);
            else
              break;
            len = kwsmatch.size[0];
          }
      else
        return beg;
    }

  return NULL;
}

/* Hairy buffering mechanism to efficiently support all the options. */
static char *bufbeg; /* Beginning of user-visible portion. */
static char *buflim; /* Limit of user-visible portion. */
static char *buf; /* Pointer to base of buffer. */
static size_t bufalloc; /* Allocated size of buffer. */
static size_t bufcc; /* Count of characters in buffer. */
static unsigned long int buftotalcc;
/* Total character count since reset. */
static char *buflast; /* Pointer after last character printed. */
static int bufgap; /* Weird flag indicating buflast is a lie. */
static unsigned long int buftotalnl;
/* Count of newlines before last character. */
static int bufpending; /* Lines of pending output at buflast. */
static int bufdesc; /* File descriptor to read from. */
static int bufeof; /* Flag indicating EOF reached. */
static const char *buffile; /* File name for messages. */

/* Scan and count the newlines prior to LIM in the buffer. */
static void
DEFUN(nlscan, (lim), register char *lim)
{
  register char *p;

  for (p = buflast; p < lim; ++p)
    if (*p == '\n')
      ++buftotalnl;
  buflast = lim;
}

/* Print the line beginning at BEG, using SEP to separate optional label
   fields from the text of the line. Return the size of the line. */
static size_t
DEFUN(prline, (beg, sep), register char *beg AND register char sep)
{
  register size_t cc;
  register char c;
  static int err;

  cc = 0;

  if (out_silent || err)
    while (beg < buflim)
      {
        ++cc;
        if (*beg++ == '\n')
          break;
      }
  else
    {
      if (out_file)
        printf("%s%c", buffile, sep);
      if (out_line)
        {
          nlscan(beg);
          printf("%d%c", buftotalnl + 1, sep);
        }
      if (out_byte)
        printf("%lu%c", buftotalcc + (beg - buf), sep);
      while (beg < buflim)
        {
          ++cc;
          c = *beg++;
          putchar(c);
          if (c == '\n')
            break;
        }
      if (ferror(stdout))
        {
          error("output error", errno);
          err = 1;
        }
    }

  if (out_line)
    nlscan(beg);
  else
    buflast = beg;
  bufgap = 0;

  return cc;
}

/* Print pending bytes of last trailing context prior to LIM. */
static void
DEFUN(prpending, (lim), register char *lim)
{
  while (buflast < lim && bufpending)
    {
      --bufpending;
      prline(buflast, '-');
    }
}

/* Print the lines between BEG and LIM. Deal with context crap.
   Return the count of lines between BEG and LIM. */
static int
DEFUN(prtext, (beg, lim), char *beg AND char *lim)
{
  static int used;
  register char *p;
  int i, n;

  prpending(beg);

  p = beg;
  for (i = 0; i < out_before; ++i)
    if (p > buflast)
      do
        --p;
      while (p > buflast && p[-1] != '\n');

  if ((out_before || out_after) && used && (p > buflast || bufgap))
    puts("--");

  while (p < beg)
    p += prline(p, '-');

  n = 0;
  while (p < lim)
    {
      ++n;
      p += prline(p, ':');
    }

  bufpending = out_after;
  used = 1;

  return n;
}

/* Fill the user-visible portion of the buffer, returning a byte count. */
static int
fillbuf()
{
  register char *b, *d, *l;
  int i, cc;
  size_t discard, save;

  prpending(buflim);

  b = buflim;
  for (i = 0; i < out_before; ++i)
    if (b > buflast)
      do
        --b;
      while (b > buflast && b[-1] != '\n');

  if (buflast < b)
    bufgap = 1;
  if (out_line)
    nlscan(b);

  discard = b - buf;
  save = buflim - b;

  if (b > buf)
    {
      d = buf;
      l = buf + bufcc;
      while (b < l)
        *d++ = *b++;
    }

  bufcc -= discard;
  buftotalcc += discard;

  do
    {
      if (!bufeof)
        {
          if (bufcc > bufalloc / 2)
            buf = xrealloc(buf, bufalloc *= 2);
          cc = read(bufdesc, buf + bufcc, bufalloc - bufcc);
          if (cc < 0)
            {
              error(buffile, errno);
              bufeof = 1;
            }
          else
            {
              bufeof = !cc;
              bufcc += cc;
            }
        }
      bufbeg = buf + save;
      for (l = buf + bufcc; l > bufbeg && l[-1] != '\n'; --l)
        ;
      buflim = l;
      buflast = buf;
    }
  while (!bufeof && bufbeg == buflim);

  if (bufeof)
    buflim = buf + bufcc;

  return buflim - bufbeg;
}

/* One-time initialization. */
static void
initbuf()
{
  bufalloc = 8192;
  buf = xmalloc(bufalloc);
}

/* Reset the buffer for a new file. */
static void
DEFUN(resetbuf, (desc, file), int desc AND const char *file)
{
  bufbeg = buf;
  buflim = buf;
  bufcc = 0;
  buftotalcc = 0;
  buflast = buf;
  bufgap = 0;
  buftotalnl = 0;
  bufpending = 0;
  bufdesc = desc;
  bufeof = 0;
  buffile = file;
}

/* Scan the user-visible portion of the buffer, calling prtext() for
   matching lines (or between matching lines if OUT_INVERT is true).
   Return a count of lines printed. */
static int
grepbuf()
{
  int total;
  register char *p, *b, *l;

  total = 0;
  p = bufbeg;
  while (b = execute(p, buflim - p))
    {
      if (b == buflim && (b > bufbeg && b[-1] == '\n' || b == bufbeg))
        break;
      while (b > bufbeg && b[-1] != '\n')
        --b;
      l = b + 1;
      while (l < buflim && l[-1] != '\n')
        ++l;
      if (!out_invert)
        total += prtext(b, l);
      else if (p < b)
        total += prtext(p, b);
      p = l;
    }
  if (out_invert && p < buflim)
    total += prtext(p, buflim);
  return total;
}

/* Scan the given file, returning a count of lines printed. */
static int
DEFUN(grep, (desc, file), int desc AND const char *file)
{
  int total;

  total = 0;
  resetbuf(desc, file);
  while (fillbuf())
    total += grepbuf();
  return total;
}

static const char version[] = "GNU fgrep, version 1.1";

#define USAGE \
  "usage: %s [-[[AB] ]<num>] [-[CVchilnsvwx]] [-[ef]] <expr> [<files...>]\n"

static void
usage()
{
  fprintf(stderr, USAGE, prog);
  exit(2);
}

int
DEFUN(main, (argc, argv), int argc AND char *argv[])
{
  char *keys;
  size_t keycc, keyalloc;
  int count_matches, no_filenames, list_files;
  int opt, cc, desc, count, status;
  FILE *fp;

  prog = argv[0];
  if (prog && strrchr(prog, '/'))
    prog = strrchr(prog, '/') + 1;

  keys = NULL;
  count_matches = 0;
  no_filenames = 0;
  list_files = 0;

  while ((opt = getopt(argc, argv, "0123456789A:B:CVbce:f:hilnsvwxy")) != EOF)
    switch (opt)
      {
      case '0':
      case '1':
      case '2':
      case '3':
      case '4':
      case '5':
      case '6':
      case '7':
      case '8':
      case '9':
        out_before = 10 * out_before + opt - '0';
        out_after = 10 * out_after + opt - '0';
        break;
      case 'A':
        out_after = atoi(optarg);
        if (out_after < 0)
          usage();
        break;
      case 'B':
        out_before = atoi(optarg);
        if (out_before < 0)
          usage();
        break;
      case 'C':
        out_before = out_after = 2;
        break;
      case 'V':
        fprintf(stderr, "%s\n", version);
        break;
      case 'b':
        out_byte = 1;
        break;
      case 'c':
        out_silent = 1;
        count_matches = 1;
        break;
      case 'e':
        if (keys)
          usage();
        keys = optarg;
        keycc = strlen(keys);
        break;
      case 'f':
        if (keys)
          usage();
        fp = strcmp(optarg, "-") ? fopen(optarg, "r") : stdin;
        if (!fp)
          fatal(optarg, errno);
        keyalloc = 1024;
        keys = xmalloc(keyalloc);
        keycc = 0;
        while (!feof(fp)
               && (cc = fread(keys + keycc, 1, keyalloc - keycc, fp)) > 0)
          {
            keycc += cc;
            if (keycc == keyalloc)
              keys = xrealloc(keys, keyalloc *= 2);
          }
        if (fp != stdin)
          fclose(fp);
        break;
      case 'h':
        no_filenames = 1;
        break;
      case 'i':
      case 'y': /* For old-timers . . . */
        match_fold = 1;
        break;
      case 'l':
        out_silent = 1;
        list_files = 1;
        break;
      case 'n':
        out_line = 1;
        break;
      case 's':
        out_silent = 1;
        break;
      case 'v':
        out_invert = 1;
        break;
      case 'w':
        match_words = 1;
        break;
      case 'x':
        match_lines = 1;
        break;
      default:
        usage();
        break;
      }

  if (!keys)
    if (optind < argc)
      {
        keys = argv[optind++];
        keycc = strlen(keys);
      }
    else
      usage();

  compile(keys, keycc);

  if (argc - optind > 1 && !no_filenames)
    out_file = 1;

  status = 1;
  initbuf();

  if (optind < argc)
    while (optind < argc)
      {
        desc = strcmp(argv[optind], "-") ? open(argv[optind], 0) : 0;
        if (desc < 0)
          error(argv[optind], errno);
        else
          {
            count = grep(desc, argv[optind]);
            if (count_matches)
              {
                if (out_file)
                  printf("%s:", argv[optind]);
                printf("%d\n", count);
              }
            if (count)
              {
                status = 0;
                if (list_files)
                  printf("%s\n", argv[optind]);
              }
          }
        if (desc)
          close(desc);
        ++optind;
      }
  else
    {
      count = grep(0, "<stdin>");
      if (count_matches)
        printf("%d\n", count);
      if (count)
        {
          status = 0;
          if (list_files)
            printf("%s\n", argv[optind]);
        }
    }

  exit(error_seen ? 2 : status);
}
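The whole-word test that execute() applies under -w can be hard to follow through its retry loop. The sketch below isolates only the boundary idea, under the simplifying assumption that a match counts as a whole word when neither neighbouring byte is alphanumeric. The function `is_whole_word` is a hypothetical helper written for illustration; it is not part of the deleted file and does not reproduce the retry-with-shorter-length behaviour of the original.

```c
#include <ctype.h>
#include <stddef.h>

/* Return 1 if the LEN bytes at BUF + POS, inside the SIZE-byte buffer BUF,
   are not flanked by an alphanumeric character on either side, else 0.
   Hypothetical helper; a simplification of the -w check in execute(). */
static int
is_whole_word(const char *buf, size_t size, size_t pos, size_t len)
{
  /* Reject empty or out-of-range candidates. */
  if (len == 0 || pos > size || len > size - pos)
    return 0;
  /* The byte before the match must not continue a word. */
  if (pos > 0 && isalnum((unsigned char) buf[pos - 1]))
    return 0;
  /* The byte after the match must not continue a word. */
  if (pos + len < size && isalnum((unsigned char) buf[pos + len]))
    return 0;
  return 1;
}
```

With the buffer `"a cat, concat"`, the occurrence of `cat` at offset 2 passes the check, while the one at offset 10 fails because it is preceded by `n`.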
@ -1,609 +0,0 @@
|
||||
/* kwset.c - search for any of a set of keywords.
|
||||
Copyright 1989 Free Software Foundation
|
||||
Written August 1989 by Mike Haertel.
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 1, or (at your option)
|
||||
any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
||||
|
||||
The author may be reached (Email) at the address mike@ai.mit.edu,
|
||||
or (US mail) as Mike Haertel c/o Free Software Foundation. */
|
||||
|
||||
#include "std.h"
|
||||
|
||||
/* The algorithm implemented by these routines bears a startling resemblance
|
||||
to one discovered by Beate Commentz-Walter, although it is not identical.
|
||||
See "A String Matching Algorithm Fast on the Average," Technical Report,
|
||||
IBM-Germany, Scientific Center Heidelberg, Tiergartenstrasse 15, D-6900
|
||||
Heidelberg, Germany. See also Aho, A.V., and M. Corasick, "Efficient
|
||||
String Matching: An Aid to Bibliographic Search," CACM June 1975,
|
||||
Vol. 18, No. 6, which describes the failure function used below. */
|
||||
|
||||
#include "kwset.h"
|
||||
#include "obstack.h"
|
||||
|
||||
#define NCHAR (UCHAR_MAX + 1)
|
||||
#define obstack_chunk_alloc malloc
|
||||
#define obstack_chunk_free free
|
||||
|
||||
/* Balanced tree of edges and labels leaving a given trie node. */
|
||||
struct tree
|
||||
{
|
||||
struct tree *llink; /* Left link; MUST be first field. */
|
||||
struct tree *rlink; /* Right link (to larger labels). */
|
||||
struct trie *trie; /* Trie node pointed to by this edge. */
|
||||
unsigned char label; /* Label on this edge. */
|
||||
char balance; /* Difference in depths of subtrees. */
|
||||
};
|
||||
|
||||
/* Node of a trie representing a set of reversed keywords. */
|
||||
struct trie
|
||||
{
|
||||
unsigned int accepting; /* Word index of accepted word, or zero. */
|
||||
struct tree *links; /* Tree of edges leaving this node. */
|
||||
struct trie *parent; /* Parent of this node. */
|
||||
struct trie *next; /* List of all trie nodes in level order. */
|
||||
struct trie *fail; /* Aho-Corasick failure function. */
|
||||
int depth; /* Depth of this node from the root. */
|
||||
int shift; /* Shift function for search failures. */
|
||||
int maxshift; /* Max shift of self and descendents. */
|
||||
};
|
||||
|
||||
/* Structure returned opaquely to the caller, containing everything. */
|
||||
struct kwset
|
||||
{
|
||||
struct obstack obstack; /* Obstack for node allocation. */
|
||||
int words; /* Number of words in the trie. */
|
||||
struct trie *trie; /* The trie itself. */
|
||||
int mind; /* Minimum depth of an accepting node. */
|
||||
int maxd; /* Maximum depth of any node. */
|
||||
int delta[NCHAR]; /* Delta table for rapid search. */
|
||||
struct trie *next[NCHAR]; /* Table of children of the root. */
|
||||
const char *trans; /* Character translation table. */
|
||||
};
|
||||
|
||||
/* Allocate and initialize a keyword set object, returning an opaque
|
||||
pointer to it. Return NULL if memory is not available. */
|
||||
kwset_t
|
||||
DEFUN(kwsalloc, (trans), const char *trans)
|
||||
{
|
||||
struct kwset *kwset;
|
||||
|
||||
kwset = (struct kwset *) malloc(sizeof (struct kwset));
|
||||
if (!kwset)
|
||||
return NULL;
|
||||
|
||||
obstack_init(&kwset->obstack);
|
||||
kwset->words = 0;
|
||||
kwset->trie
|
||||
= (struct trie *) obstack_alloc(&kwset->obstack, sizeof (struct trie));
|
||||
if (!kwset->trie)
|
||||
{
|
||||
kwsfree((kwset_t) kwset);
|
||||
return NULL;
|
||||
}
|
||||
kwset->trie->accepting = 0;
|
||||
kwset->trie->links = NULL;
|
||||
kwset->trie->parent = NULL;
|
||||
kwset->trie->next = NULL;
|
||||
kwset->trie->fail = NULL;
|
||||
kwset->trie->depth = 0;
|
||||
kwset->trie->shift = 0;
|
||||
kwset->mind = INT_MAX;
|
||||
kwset->maxd = -1;
|
||||
kwset->trans = trans;
|
||||
|
||||
return (kwset_t) kwset;
|
||||
}
|
||||
|
||||
/* Add the given string to the contents of the keyword set. Return NULL
|
||||
for success, an error message otherwise. */
|
||||
const char *
|
||||
DEFUN(kwsincr, (kws, text, len),
|
||||
kwset_t kws AND const char *text AND size_t len)
|
||||
{
|
||||
struct kwset *kwset;
|
||||
register struct trie *trie;
|
||||
register unsigned char label;
|
||||
register struct tree *link;
|
||||
register int depth;
|
||||
struct tree *links[12];
|
||||
enum { L, R } dirs[12];
|
||||
struct tree *t, *r, *l, *rl, *lr;
|
||||
|
||||
kwset = (struct kwset *) kws;
|
||||
trie = kwset->trie;
|
||||
text += len;
|
||||
|
||||
/* Descend the trie (built of reversed keywords) character-by-character,
|
||||
installing new nodes when necessary. */
|
||||
while (len--)
|
||||
{
|
||||
label = kwset->trans ? kwset->trans[(unsigned char) *--text] : *--text;
|
||||
|
||||
/* Descend the tree of outgoing links for this trie node,
|
||||
looking for the current character and keeping track
|
||||
of the path followed. */
|
||||
link = trie->links;
|
||||
links[0] = (struct tree *) &trie->links;
|
||||
dirs[0] = L;
|
||||
depth = 1;
|
||||
|
||||
while (link && label != link->label)
|
||||
{
|
||||
links[depth] = link;
|
||||
if (label < link->label)
|
||||
dirs[depth++] = L, link = link->llink;
|
||||
else
|
||||
dirs[depth++] = R, link = link->rlink;
|
||||
}
|
||||
|
||||
/* The current character doesn't have an outgoing link at
|
||||
this trie node, so build a new trie node and install
|
||||
a link in the current trie node's tree. */
|
||||
if (!link)
|
||||
{
|
||||
link = (struct tree *) obstack_alloc(&kwset->obstack,
|
||||
sizeof (struct tree));
|
||||
if (!link)
|
||||
return "memory exhausted";
|
||||
link->llink = NULL;
|
||||
link->rlink = NULL;
|
||||
link->trie = (struct trie *) obstack_alloc(&kwset->obstack,
|
||||
sizeof (struct trie));
|
||||
if (!link->trie)
|
||||
return "memory exhausted";
|
||||
link->trie->accepting = 0;
|
||||
link->trie->links = NULL;
|
||||
link->trie->parent = trie;
|
||||
link->trie->next = NULL;
|
||||
link->trie->fail = NULL;
|
||||
link->trie->depth = trie->depth + 1;
|
||||
link->trie->shift = 0;
|
||||
link->label = label;
|
||||
link->balance = 0;
|
||||
|
||||
/* Install the new tree node in its parent. */
|
||||
if (dirs[--depth] == L)
|
||||
links[depth]->llink = link;
|
||||
else
|
||||
links[depth]->rlink = link;
|
||||
|
||||
/* Back up the tree fixing the balance flags. */
|
||||
while (depth && !links[depth]->balance)
|
||||
{
|
||||
if (dirs[depth] == L)
|
||||
--links[depth]->balance;
|
||||
else
|
||||
++links[depth]->balance;
|
||||
--depth;
|
||||
}
|
||||
|
||||
/* Rebalance the tree by pointer rotations if necessary. */
|
||||
if (depth && (dirs[depth] == L && --links[depth]->balance
|
||||
|| dirs[depth] == R && ++links[depth]->balance))
|
||||
{
|
||||
switch (links[depth]->balance)
|
||||
{
|
||||
case (char) -2:
|
||||
switch (dirs[depth + 1])
|
||||
{
|
||||
case L:
|
||||
r = links[depth], t = r->llink, rl = t->rlink;
|
||||
t->rlink = r, r->llink = rl;
|
||||
t->balance = r->balance = 0;
|
||||
break;
|
||||
case R:
|
||||
r = links[depth], l = r->llink, t = l->rlink;
|
||||
rl = t->rlink, lr = t->llink;
|
||||
t->llink = l, l->rlink = lr, t->rlink = r, r->llink = rl;
|
||||
l->balance = t->balance != 1 ? 0 : -1;
|
||||
r->balance = t->balance != (char) -1 ? 0 : 1;
|
||||
t->balance = 0;
|
||||
break;
|
||||
}
|
||||
break;
|
||||
case 2:
|
||||
switch (dirs[depth + 1])
|
||||
{
|
||||
case R:
|
||||
l = links[depth], t = l->rlink, lr = t->llink;
|
||||
t->llink = l, l->rlink = lr;
|
||||
t->balance = l->balance = 0;
|
||||
break;
|
||||
case L:
|
||||
l = links[depth], r = l->rlink, t = r->llink;
|
||||
lr = t->llink, rl = t->rlink;
|
||||
t->llink = l, l->rlink = lr, t->rlink = r, r->llink = rl;
|
||||
l->balance = t->balance != 1 ? 0 : -1;
|
||||
r->balance = t->balance != (char) -1 ? 0 : 1;
|
||||
t->balance = 0;
|
||||
break;
|
||||
}
|
||||
break;
|
||||
}
|
||||
|
||||
if (dirs[depth - 1] == L)
|
||||
links[depth - 1]->llink = t;
|
||||
else
|
||||
links[depth - 1]->rlink = t;
|
||||
}
|
||||
}
|
||||
|
||||
trie = link->trie;
|
||||
}
|
||||
|
||||
/* Mark the node we finally reached as accepting, encoding the
|
||||
index number of this word in the keyword set so far. */
|
||||
if (!trie->accepting)
|
||||
trie->accepting = 1 + 2 * kwset->words;
|
||||
++kwset->words;
|
||||
|
||||
/* Keep track of the longest and shortest string of the keyword set. */
|
||||
if (trie->depth < kwset->mind)
|
||||
kwset->mind = trie->depth;
|
||||
if (trie->depth > kwset->maxd)
|
||||
kwset->maxd = trie->depth;
|
||||
|
||||
return NULL;
|
||||
}
|
||||
|
||||
/* Enqueue the trie nodes referenced from the given tree in the
|
||||
given queue. */
|
||||
static void
|
||||
DEFUN(enqueue, (tree, last), struct tree *tree AND struct trie **last)
|
||||
{
|
||||
if (!tree)
|
||||
return;
|
||||
enqueue(tree->llink, last);
|
||||
enqueue(tree->rlink, last);
|
||||
(*last) = (*last)->next = tree->trie;
|
||||
}
|
||||
|
||||
/* Compute the Aho-Corasick failure function for the trie nodes referenced
|
||||
from the given tree, given the failure function for their parent as
|
||||
well as a last resort failure node. */
|
||||
static void
|
||||
DEFUN(treefails, (tree, fail, recourse),
|
||||
register struct tree *tree
|
||||
AND struct trie *fail AND struct trie *recourse)
|
||||
{
|
||||
register struct tree *link;
|
||||
|
||||
if (!tree)
|
||||
return;
|
||||
|
||||
treefails(tree->llink, fail, recourse);
|
||||
treefails(tree->rlink, fail, recourse);
|
||||
|
||||
/* Find, in the chain of fails going back to the root, the first
|
||||
node that has a descendent on the current label. */
|
||||
while (fail)
|
||||
{
|
||||
link = fail->links;
|
||||
while (link && tree->label != link->label)
|
||||
if (tree->label < link->label)
|
||||
link = link->llink;
|
||||
else
|
||||
link = link->rlink;
|
||||
if (link)
|
||||
{
|
||||
tree->trie->fail = link->trie;
|
||||
return;
|
||||
}
|
||||
fail = fail->fail;
|
||||
}
|
||||
|
||||
tree->trie->fail = recourse;
|
||||
}
|
||||
|
||||
/* Set delta entries for the links of the given tree such that
|
||||
the preexisting delta value is larger than the current depth. */
|
||||
static void
|
||||
DEFUN(treedelta, (tree, depth, delta),
|
||||
register struct tree *tree AND register int depth AND int delta[])
|
||||
{
|
||||
if (!tree)
|
||||
return;
|
||||
treedelta(tree->llink, depth, delta);
|
||||
treedelta(tree->rlink, depth, delta);
|
||||
if (depth < delta[tree->label])
|
||||
delta[tree->label] = depth;
|
||||
}
|
||||
|
||||
/* Return true if A has every label in B. */
|
||||
static int
|
||||
DEFUN(hasevery, (a, b), register struct tree *a AND register struct tree *b)
|
||||
{
|
||||
if (!b)
|
||||
return 1;
|
||||
if (!hasevery(a, b->llink))
|
||||
return 0;
|
||||
if (!hasevery(a, b->rlink))
|
||||
return 0;
|
||||
while (a && b->label != a->label)
|
||||
if (b->label < a->label)
|
||||
a = a->llink;
|
||||
else
|
||||
a = a->rlink;
|
||||
return !!a;
|
||||
}
|
||||
|
||||
/* Compute a vector, indexed by character code, of the trie nodes
|
||||
referenced from the given tree. */
|
||||
static void
|
||||
DEFUN(treenext, (tree, next), struct tree *tree AND struct trie *next[])
|
||||
{
|
||||
if (!tree)
|
||||
return;
|
||||
treenext(tree->llink, next);
|
||||
treenext(tree->rlink, next);
|
||||
next[tree->label] = tree->trie;
|
||||
}
|
||||
|
||||
/* Compute the shift for each trie node, as well as the delta
|
||||
table and next cache for the given keyword set. */
|
||||
const char *
|
||||
DEFUN(kwsprep, (kws), kwset_t kws)
|
||||
{
|
||||
register struct kwset *kwset;
|
||||
register int i;
|
||||
register struct trie *curr, *fail;
|
||||
register const char *trans;
|
||||
int delta[NCHAR];
|
||||
struct trie *last, *next[NCHAR];
|
||||
|
||||
kwset = (struct kwset *) kws;
|
||||
|
||||
/* Initial values for the delta table; will be changed later. The
|
||||
delta entry for a given character is the smallest depth of any
|
||||
node at which an outgoing edge is labeled by that character. */
|
||||
for (i = 0; i < NCHAR; ++i)
|
||||
delta[i] = kwset->mind;
|
||||
|
||||
/* Traverse the nodes of the trie in level order, simultaneously
|
||||
computing the delta table, failure function, and shift function. */
|
||||
for (curr = last = kwset->trie; curr; curr = curr->next)
|
||||
{
|
||||
/* Enqueue the immediate descendents in the level order queue. */
|
||||
enqueue(curr->links, &last);
|
||||
|
||||
curr->shift = kwset->mind;
|
||||
curr->maxshift = kwset->mind;
|
||||
|
||||
/* Update the delta table for the descendents of this node. */
|
||||
treedelta(curr->links, curr->depth, delta);
|
||||
|
||||
/* Compute the failure function for the descendents of this node. */
|
||||
treefails(curr->links, curr->fail, kwset->trie);
|
||||
|
||||
/* Update the shifts at each node in the current node's chain
|
||||
of fails back to the root. */
|
||||
for (fail = curr->fail; fail; fail = fail->fail)
|
||||
{
|
||||
/* If the current node has some outgoing edge that the fail
|
||||
doesn't, then the shift at the fail should be no larger
|
||||
than the difference of their depths. */
|
||||
if (!hasevery(fail->links, curr->links))
|
||||
if (curr->depth - fail->depth < fail->shift)
|
||||
fail->shift = curr->depth - fail->depth;
|
||||
|
||||
/* If the current node is accepting then the shift at the
|
||||
fail and its descendents should be no larger than the
|
||||
difference of their depths. */
|
||||
if (curr->accepting && fail->maxshift > curr->depth - fail->depth)
|
||||
fail->maxshift = curr->depth - fail->depth;
|
||||
}
|
||||
}
|
||||
|
||||
/* Traverse the trie in level order again, fixing up all nodes whose
|
||||
shift exceeds their inherited maxshift. */
|
||||
for (curr = kwset->trie->next; curr; curr = curr->next)
|
||||
{
|
||||
if (curr->maxshift > curr->parent->maxshift)
|
||||
curr->maxshift = curr->parent->maxshift;
|
||||
if (curr->shift > curr->maxshift)
|
||||
curr->shift = curr->maxshift;
|
||||
}
|
||||
|
||||
/* Create a vector, indexed by character code, of the outgoing links
|
||||
from the root node. */
|
||||
for (i = 0; i < NCHAR; ++i)
|
||||
next[i] = NULL;
|
||||
treenext(kwset->trie->links, next);
|
||||
|
||||
/* Fix things up for any translation table. */
|
||||
if (trans = kwset->trans)
|
||||
for (i = 0; i < NCHAR; ++i)
|
||||
{
|
||||
kwset->delta[i] = delta[(unsigned char) trans[i]];
|
||||
kwset->next[i] = next[(unsigned char) trans[i]];
|
||||
}
|
||||
else
|
||||
for (i = 0; i < NCHAR; ++i)
|
||||
{
|
||||
kwset->delta[i] = delta[i];
|
||||
kwset->next[i] = next[i];
|
||||
}
|
||||
|
||||
return NULL;
|
||||
}
|
||||
|
||||
/* Search through the given text for a match of any member of the
|
||||
given keyword set. Return a pointer to the first character of
|
||||
the matching substring, or NULL if no match is found. If KWSMATCH
|
||||
is non-NULL, store in the referenced structure the index number of
|
||||
the keyword matched, the start of the matching substring, and its
|
||||
length. */
|
||||
char *
|
||||
DEFUN(kwsexec, (kws, text, len, kwsmatch),
|
||||
kwset_t kws AND char *text AND size_t len AND struct kwsmatch *kwsmatch)
|
||||
{
|
||||
struct kwset *kwset;
|
||||
struct trie **next, *trie, *accept;
|
||||
char *beg, *lim, *mch, *lmch;
|
||||
register unsigned char c;
|
||||
register int *delta, d;
|
||||
register char *end, *qlim;
|
||||
register struct tree *tree;
|
||||
register const char *trans;
|
||||
|
||||
/* Initialize register copies and look for easy ways out. */
|
||||
kwset = (struct kwset *) kws;
|
||||
if (len < kwset->mind)
|
||||
return NULL;
|
||||
next = kwset->next;
|
||||
delta = kwset->delta;
|
||||
trans = kwset->trans;
|
||||
lim = text + len;
|
||||
end = text;
|
||||
if (d = kwset->mind)
|
||||
mch = NULL;
|
||||
else
|
||||
{
|
||||
mch = text, accept = kwset->trie;
|
||||
goto match;
|
||||
}
|
||||
|
||||
if (len >= 4 * kwset->mind)
|
||||
qlim = lim - 4 * kwset->mind;
|
||||
else
|
||||
qlim = NULL;
|
||||
|
||||
while (lim - end >= d)
|
||||
{
|
||||
if (qlim && end <= qlim)
|
||||
{
|
||||
end += d - 1;
|
||||
while ((d = delta[c = *end]) && end < qlim)
|
||||
{
|
||||
end += d;
|
||||
end += delta[(unsigned char) *end];
|
||||
end += delta[(unsigned char) *end];
|
||||
}
|
||||
++end;
|
||||
}
|
||||
else
|
||||
d = delta[c = (end += d)[-1]];
|
||||
if (d)
|
||||
continue;
|
||||
beg = end - 1;
|
||||
trie = next[c];
|
||||
if (trie->accepting)
|
||||
{
|
||||
mch = beg;
|
||||
accept = trie;
|
||||
}
|
||||
d = trie->shift;
|
||||
while (beg > text)
|
||||
{
|
||||
c = trans ? trans[(unsigned char) *--beg] : *--beg;
|
||||
tree = trie->links;
|
||||
while (tree && c != tree->label)
|
||||
if (c < tree->label)
|
||||
tree = tree->llink;
|
||||
else
|
||||
tree = tree->rlink;
|
||||
if (tree)
|
||||
{
|
||||
trie = tree->trie;
|
||||
if (trie->accepting)
|
||||
{
|
||||
mch = beg;
|
||||
accept = trie;
|
||||
}
|
||||
}
|
||||
else
|
||||
break;
|
||||
d = trie->shift;
|
||||
}
|
||||
if (mch)
|
||||
goto match;
|
||||
}
|
||||
return NULL;
|
||||
|
||||
match:
|
||||
/* Given a known match, find the longest possible match anchored
|
||||
at or before its starting point. This is nearly a verbatim
|
||||
copy of the preceding main search loops. */
|
||||
if (lim - mch > kwset->maxd)
|
||||
lim = mch + kwset->maxd;
|
||||
lmch = NULL;
|
||||
d = 1;
|
||||
while (lim - end >= d)
|
||||
{
|
||||
if (d = delta[c = (end += d)[-1]])
|
||||
continue;
|
||||
beg = end - 1;
|
||||
if (!(trie = next[c]))
|
||||
{
|
||||
d = 1;
|
||||
continue;
|
||||
}
|
||||
if (trie->accepting && beg <= mch)
|
||||
{
|
||||
lmch = beg;
|
||||
accept = trie;
|
||||
}
|
||||
d = trie->shift;
|
||||
while (beg > text)
|
||||
{
|
||||
c = trans ? trans[(unsigned char) *--beg] : *--beg;
|
||||
tree = trie->links;
|
||||
while (tree && c != tree->label)
|
||||
if (c < tree->label)
|
||||
tree = tree->llink;
|
||||
else
|
||||
tree = tree->rlink;
|
||||
if (tree)
|
||||
{
|
||||
trie = tree->trie;
|
||||
if (trie->accepting && beg <= mch)
|
||||
{
|
||||
lmch = beg;
|
||||
accept = trie;
|
||||
}
|
||||
}
|
||||
else
|
||||
break;
|
||||
d = trie->shift;
|
||||
}
|
||||
if (lmch)
|
||||
{
|
||||
mch = lmch;
|
||||
goto match;
|
||||
}
|
||||
if (!d)
|
||||
d = 1;
|
||||
}
|
||||
|
||||
if (kwsmatch)
|
||||
{
|
||||
kwsmatch->index = accept->accepting / 2;
|
||||
kwsmatch->beg[0] = mch;
|
||||
kwsmatch->size[0] = accept->depth;
|
||||
}
|
||||
return mch;
|
||||
}
|
||||
|
||||
/* Free the components of the given keyword set. */
|
||||
void
|
||||
DEFUN(kwsfree, (kws), kwset_t kws)
|
||||
{
|
||||
struct kwset *kwset;
|
||||
|
||||
kwset = (struct kwset *) kws;
|
||||
obstack_free(&kwset->obstack, (PTR) NULL);
|
||||
free((PTR) kws);
|
||||
}
|
@@ -1,69 +0,0 @@
|
||||
/* kwset.h - header declaring the keyword set library.
|
||||
Copyright 1989 Free Software Foundation
|
||||
Written August 1989 by Mike Haertel.
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 1, or (at your option)
|
||||
any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
||||
|
||||
The author may be reached (Email) at the address mike@ai.mit.edu,
|
||||
or (US mail) as Mike Haertel c/o Free Software Foundation. */
|
||||
|
||||
struct kwsmatch
|
||||
{
|
||||
int index; /* Index number of matching keyword. */
|
||||
char *beg[1]; /* Begin pointer for each submatch. */
|
||||
size_t size[1]; /* Length of each submatch. */
|
||||
};
|
||||
|
||||
#if __STDC__
|
||||
|
||||
typedef void *kwset_t;
|
||||
|
||||
/* Return an opaque pointer to a newly allocated keyword set, or NULL
|
||||
if enough memory cannot be obtained. The argument if non-NULL
|
||||
specifies a table of character translations to be applied to all
|
||||
pattern and search text. */
|
||||
extern kwset_t kwsalloc(const char *);
|
||||
|
||||
/* Incrementally extend the keyword set to include the given string.
|
||||
Return NULL for success, or an error message. Remember an index
|
||||
number for each keyword included in the set. */
|
||||
extern const char *kwsincr(kwset_t, const char *, size_t);
|
||||
|
||||
/* When the keyword set has been completely built, prepare it for
|
||||
use. Return NULL for success, or an error message. */
|
||||
extern const char *kwsprep(kwset_t);
|
||||
|
||||
/* Search through the given buffer for a member of the keyword set.
|
||||
Return a pointer to the leftmost longest match found, or NULL if
|
||||
no match is found. If the struct kwsmatch pointer is non-NULL,
|
||||
store in it the index of the keyword found, the start of the
|
||||
matching substring, and its length. */
|
||||
extern char *kwsexec(kwset_t, char *, size_t, struct kwsmatch *);
|
||||
|
||||
/* Deallocate the given keyword set and all its associated storage. */
|
||||
extern void kwsfree(kwset_t);
|
||||
|
||||
#else
|
||||
|
||||
typedef char *kwset_t;
|
||||
|
||||
extern kwset_t kwsalloc();
|
||||
extern char *kwsincr();
|
||||
extern char *kwsprep();
|
||||
extern char *kwsexec();
|
||||
extern void kwsfree();
|
||||
|
||||
#endif
|