Import bzip2 1.0.2

mjl 2002-03-15 01:35:17 +00:00
parent 1c444efedf
commit 3849fd5579
38 changed files with 6462 additions and 4918 deletions

88
dist/bzip2/CHANGES vendored

@ -134,7 +134,7 @@ Several minor bugfixes and enhancements:
* Advance the version number to 1.0, so as to counteract the
(false-in-this-case) impression some people have that programs
with version numbers less than 1.0 are in someway, experimental,
with version numbers less than 1.0 are in some way, experimental,
pre-release versions.
* Create an initial Makefile-libbz2_so to build a shared library.
@ -165,3 +165,89 @@ There are no functionality changes or bug fixes relative to version
1.0.0. This is just a documentation update + a fix for minor Win32
build problems. For almost everyone, upgrading from 1.0.0 to 1.0.1 is
utterly pointless. Don't bother.
1.0.2
~~~~~
A bug fix release, addressing various minor issues which have appeared
in the 18 or so months since 1.0.1 was released. Most of the fixes
are to do with file-handling or documentation bugs. To the best of my
knowledge, there have been no data-loss-causing bugs reported in the
compression/decompression engine of 1.0.0 or 1.0.1.
Note that this release does not improve the rather crude build system
for Unix platforms. The general plan here is to autoconfiscate/
libtoolise 1.0.2 soon after release, and release the result as 1.1.0
or perhaps 1.2.0. That, however, is still just a plan at this point.
Here are the changes in 1.0.2. Bug-reporters and/or patch-senders in
parentheses.
* Fix an infinite segfault loop in 1.0.1 when a directory is
encountered in -f (force) mode.
(Trond Eivind Glomsrod, Nicholas Nethercote, Volker Schmidt)
* Avoid double fclose() of output file on certain I/O error paths.
(Solar Designer)
* Don't fail with internal error 1007 when fed a long stream (> 48MB)
of byte 251. Also print useful message suggesting that 1007s may be
caused by bad memory.
(noticed by Juan Pedro Vallejo, fixed by me)
* Fix uninitialised variable silly bug in demo prog dlltest.c.
(Jorj Bauer)
* Remove 512-MB limitation on recovered file size for bzip2recover
on selected platforms which support 64-bit ints. At the moment
all GCC supported platforms, and Win32.
(me, Alson van der Meulen)
* Hard-code header byte values, to give correct operation on platforms
using EBCDIC as their native character set (IBM's OS/390).
(Leland Lucius)
* Copy file access times correctly.
(Marty Leisner)
* Add distclean and check targets to Makefile.
(Michael Carmack)
* Parameterise use of ar and ranlib in Makefile. Also add $(LDFLAGS).
(Rich Ireland, Bo Thorsen)
* Pass -p (create parent dirs as needed) to mkdir during make install.
(Jeremy Fusco)
* Dereference symlinks when copying file permissions in -f mode.
(Volker Schmidt)
* Majorly simplify implementation of uInt64_qrm10.
(Bo Lindbergh)
* Check the input file still exists before deleting the output one,
when aborting in cleanUpAndFail().
(Joerg Prante, Robert Linden, Matthias Krings)
Also a bunch of patches courtesy of Philippe Troin, the Debian maintainer
of bzip2:
* Wrapper scripts (with manpages): bzdiff, bzgrep, bzmore.
* Spelling changes and minor enhancements in bzip2.1.
* Avoid race condition between creating the output file and setting its
interim permissions safely, by using fopen_output_safely().
No changes to bzip2recover since there is no issue with file
permissions there.
* do not print senseless report with -v when compressing an empty
file.
* bzcat -f works on non-bzip2 files.
* do not try to escape shell meta-characters on unix (the shell takes
care of these).
* added --fast and --best aliases for -1 -9 for gzip compatibility.

4
dist/bzip2/LICENSE vendored

@ -1,6 +1,6 @@
This program, "bzip2" and associated library "libbzip2", are
copyright (C) 1996-2000 Julian R Seward. All rights reserved.
copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
@ -35,5 +35,5 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Julian Seward, Cambridge, UK.
jseward@acm.org
bzip2/libbzip2 version 1.0 of 21 March 2000
bzip2/libbzip2 version 1.0.2 of 30 December 2001

81
dist/bzip2/Makefile vendored

@ -1,9 +1,20 @@
SHELL=/bin/sh
# To assist in cross-compiling
CC=gcc
AR=ar
RANLIB=ranlib
LDFLAGS=
# Suitably paranoid flags to avoid bugs in gcc-2.7
BIGFILES=-D_FILE_OFFSET_BITS=64
CFLAGS=-Wall -Winline -O2 -fomit-frame-pointer -fno-strength-reduce $(BIGFILES)
# Where you want it installed when you do 'make install'
PREFIX=/usr
OBJS= blocksort.o \
huffman.o \
crctable.o \
@ -15,20 +26,21 @@ OBJS= blocksort.o \
all: libbz2.a bzip2 bzip2recover test
bzip2: libbz2.a bzip2.o
$(CC) $(CFLAGS) -o bzip2 bzip2.o -L. -lbz2
$(CC) $(CFLAGS) $(LDFLAGS) -o bzip2 bzip2.o -L. -lbz2
bzip2recover: bzip2recover.o
$(CC) $(CFLAGS) -o bzip2recover bzip2recover.o
$(CC) $(CFLAGS) $(LDFLAGS) -o bzip2recover bzip2recover.o
libbz2.a: $(OBJS)
rm -f libbz2.a
ar cq libbz2.a $(OBJS)
@if ( test -f /usr/bin/ranlib -o -f /bin/ranlib -o \
-f /usr/ccs/bin/ranlib ) ; then \
echo ranlib libbz2.a ; \
ranlib libbz2.a ; \
$(AR) cq libbz2.a $(OBJS)
@if ( test -f $(RANLIB) -o -f /usr/bin/ranlib -o \
-f /bin/ranlib -o -f /usr/ccs/bin/ranlib ) ; then \
echo $(RANLIB) libbz2.a ; \
$(RANLIB) libbz2.a ; \
fi
check: test
test: bzip2
@cat words1
./bzip2 -1 < sample1.ref > sample1.rb2
@ -45,14 +57,12 @@ test: bzip2
cmp sample3.tst sample3.ref
@cat words3
PREFIX=/usr
install: bzip2 bzip2recover
if ( test ! -d $(PREFIX)/bin ) ; then mkdir $(PREFIX)/bin ; fi
if ( test ! -d $(PREFIX)/lib ) ; then mkdir $(PREFIX)/lib ; fi
if ( test ! -d $(PREFIX)/man ) ; then mkdir $(PREFIX)/man ; fi
if ( test ! -d $(PREFIX)/man/man1 ) ; then mkdir $(PREFIX)/man/man1 ; fi
if ( test ! -d $(PREFIX)/include ) ; then mkdir $(PREFIX)/include ; fi
if ( test ! -d $(PREFIX)/bin ) ; then mkdir -p $(PREFIX)/bin ; fi
if ( test ! -d $(PREFIX)/lib ) ; then mkdir -p $(PREFIX)/lib ; fi
if ( test ! -d $(PREFIX)/man ) ; then mkdir -p $(PREFIX)/man ; fi
if ( test ! -d $(PREFIX)/man/man1 ) ; then mkdir -p $(PREFIX)/man/man1 ; fi
if ( test ! -d $(PREFIX)/include ) ; then mkdir -p $(PREFIX)/include ; fi
cp -f bzip2 $(PREFIX)/bin/bzip2
cp -f bzip2 $(PREFIX)/bin/bunzip2
cp -f bzip2 $(PREFIX)/bin/bzcat
@ -67,7 +77,26 @@ install: bzip2 bzip2recover
chmod a+r $(PREFIX)/include/bzlib.h
cp -f libbz2.a $(PREFIX)/lib
chmod a+r $(PREFIX)/lib/libbz2.a
cp -f bzgrep $(PREFIX)/bin/bzgrep
ln $(PREFIX)/bin/bzgrep $(PREFIX)/bin/bzegrep
ln $(PREFIX)/bin/bzgrep $(PREFIX)/bin/bzfgrep
chmod a+x $(PREFIX)/bin/bzgrep
cp -f bzmore $(PREFIX)/bin/bzmore
ln $(PREFIX)/bin/bzmore $(PREFIX)/bin/bzless
chmod a+x $(PREFIX)/bin/bzmore
cp -f bzdiff $(PREFIX)/bin/bzdiff
ln $(PREFIX)/bin/bzdiff $(PREFIX)/bin/bzcmp
chmod a+x $(PREFIX)/bin/bzdiff
cp -f bzgrep.1 bzmore.1 bzdiff.1 $(PREFIX)/man/man1
chmod a+r $(PREFIX)/man/man1/bzgrep.1
chmod a+r $(PREFIX)/man/man1/bzmore.1
chmod a+r $(PREFIX)/man/man1/bzdiff.1
echo ".so man1/bzgrep.1" > $(PREFIX)/man/man1/bzegrep.1
echo ".so man1/bzgrep.1" > $(PREFIX)/man/man1/bzfgrep.1
echo ".so man1/bzmore.1" > $(PREFIX)/man/man1/bzless.1
echo ".so man1/bzdiff.1" > $(PREFIX)/man/man1/bzcmp.1
distclean: clean
clean:
rm -f *.o libbz2.a bzip2 bzip2recover \
sample1.rb2 sample2.rb2 sample3.rb2 \
@ -93,7 +122,7 @@ bzip2.o: bzip2.c
bzip2recover.o: bzip2recover.c
$(CC) $(CFLAGS) -c bzip2recover.c
DISTNAME=bzip2-1.0.1
DISTNAME=bzip2-1.0.2
tarfile:
rm -f $(DISTNAME)
ln -sf . $(DISTNAME)
@ -112,6 +141,7 @@ tarfile:
$(DISTNAME)/Makefile \
$(DISTNAME)/manual.texi \
$(DISTNAME)/manual.ps \
$(DISTNAME)/manual.pdf \
$(DISTNAME)/LICENSE \
$(DISTNAME)/bzip2.1 \
$(DISTNAME)/bzip2.1.preformatted \
@ -138,4 +168,25 @@ tarfile:
$(DISTNAME)/Y2K_INFO \
$(DISTNAME)/unzcrash.c \
$(DISTNAME)/spewG.c \
$(DISTNAME)/mk251.c \
$(DISTNAME)/bzdiff \
$(DISTNAME)/bzdiff.1 \
$(DISTNAME)/bzmore \
$(DISTNAME)/bzmore.1 \
$(DISTNAME)/bzgrep \
$(DISTNAME)/bzgrep.1 \
$(DISTNAME)/Makefile-libbz2_so
gzip -v $(DISTNAME).tar
# For rebuilding the manual from sources on my RedHat 7.2 box
manual: manual.ps manual.pdf manual.html
manual.ps: manual.texi
tex manual.texi
dvips -o manual.ps manual.dvi
manual.pdf: manual.ps
ps2pdf manual.ps
manual.html: manual.texi
texi2html -split_chapter manual.texi


@ -1,8 +1,9 @@
# This Makefile builds a shared version of the library,
# libbz2.so.1.0.1, with soname libbz2.so.1.0,
# at least on x86-Linux (RedHat 5.2),
# with gcc-2.7.2.3. Please see the README file for some
# libbz2.so.1.0.2, with soname libbz2.so.1.0,
# at least on x86-Linux (RedHat 7.2),
# with gcc-2.96 20000731 (Red Hat Linux 7.1 2.96-98).
# Please see the README file for some
# important info about building the library like this.
SHELL=/bin/sh
@ -19,13 +20,13 @@ OBJS= blocksort.o \
bzlib.o
all: $(OBJS)
$(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.1 $(OBJS)
$(CC) $(CFLAGS) -o bzip2-shared bzip2.c libbz2.so.1.0.1
$(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.2 $(OBJS)
$(CC) $(CFLAGS) -o bzip2-shared bzip2.c libbz2.so.1.0.2
rm -f libbz2.so.1.0
ln -s libbz2.so.1.0.1 libbz2.so.1.0
ln -s libbz2.so.1.0.2 libbz2.so.1.0
clean:
rm -f $(OBJS) bzip2.o libbz2.so.1.0.1 libbz2.so.1.0 bzip2-shared
rm -f $(OBJS) bzip2.o libbz2.so.1.0.2 libbz2.so.1.0 bzip2-shared
blocksort.o: blocksort.c
$(CC) $(CFLAGS) -c blocksort.c

89
dist/bzip2/README vendored

@ -1,15 +1,15 @@
This is the README for bzip2, a block-sorting file compressor, version
1.0. This version is fully compatible with the previous public
releases, bzip2-0.1pl2, bzip2-0.9.0 and bzip2-0.9.5.
1.0.2. This version is fully compatible with the previous public
releases, versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1.
bzip2-1.0 is distributed under a BSD-style license. For details,
bzip2-1.0.2 is distributed under a BSD-style license. For details,
see the file LICENSE.
Complete documentation is available in Postscript form (manual.ps) or
html (manual_toc.html). A plain-text version of the manual page is
available as bzip2.txt. A statement about Y2K issues is now included
in the file Y2K_INFO.
Complete documentation is available in Postscript form (manual.ps),
PDF (manual.pdf, amazingly enough) or html (manual_toc.html). A
plain-text version of the manual page is available as bzip2.txt.
A statement about Y2K issues is now included in the file Y2K_INFO.
HOW TO BUILD -- UNIX
@ -33,34 +33,41 @@ not actually execute them.
HOW TO BUILD -- UNIX, shared library libbz2.so.
Do 'make -f Makefile-libbz2_so'. This Makefile seems to work for
Linux-ELF (RedHat 5.2 on an x86 box), with gcc. I make no claims
Linux-ELF (RedHat 7.2 on an x86 box), with gcc. I make no claims
that it works for any other platform, though I suspect it probably
will work for most platforms employing both ELF and gcc.
bzip2-shared, a client of the shared library, is also build, but
not self-tested. So I suggest you also build using the normal
Makefile, since that conducts a self-test.
bzip2-shared, a client of the shared library, is also built, but not
self-tested. So I suggest you also build using the normal Makefile,
since that conducts a self-test. A second reason to prefer the
version statically linked to the library is that, on x86 platforms,
building shared objects makes a valuable register (%ebx) unavailable
to gcc, resulting in a slowdown of 10%-20%, at least for bzip2.
Important note for people upgrading .so's from 0.9.0/0.9.5 to
version 1.0. All the functions in the library have been renamed,
from (eg) bzCompress to BZ2_bzCompress, to avoid namespace pollution.
Important note for people upgrading .so's from 0.9.0/0.9.5 to version
1.0.X. All the functions in the library have been renamed, from (eg)
bzCompress to BZ2_bzCompress, to avoid namespace pollution.
Unfortunately this means that the libbz2.so created by
Makefile-libbz2_so will not work with any program which used an
older version of the library. Sorry. I do encourage library
clients to make the effort to upgrade to use version 1.0, since
it is both faster and more robust than previous versions.
Makefile-libbz2_so will not work with any program which used an older
version of the library. Sorry. I do encourage library clients to
make the effort to upgrade to use version 1.0, since it is both faster
and more robust than previous versions.
HOW TO BUILD -- Windows 95, NT, DOS, Mac, etc.
It's difficult for me to support compilation on all these platforms.
My approach is to collect binaries for these platforms, and put them
on the master web page (http://sourceware.cygnus.com/bzip2). Look
there. However (FWIW), bzip2-1.0 is very standard ANSI C and should
compile unmodified with MS Visual C. For Win32, there is one
important caveat: in bzip2.c, you must set BZ_UNIX to 0 and
BZ_LCCWIN32 to 1 before building. If you have difficulties building,
you might want to read README.COMPILATION.PROBLEMS.
on the master web page (http://sources.redhat.com/bzip2). Look there.
However (FWIW), bzip2-1.0.X is very standard ANSI C and should compile
unmodified with MS Visual C. If you have difficulties building, you
might want to read README.COMPILATION.PROBLEMS.
At least using MS Visual C++ 6, you can build from the unmodified
sources by issuing, in a command shell:
nmake -f makefile.msc
(you may need to first run the MSVC-provided script VCVARS32.BAT
so as to set up paths to the MSVC tools correctly).
VALIDATION
@ -138,29 +145,37 @@ WHAT'S NEW IN 0.9.5 ?
* Many small improvements in file and flag handling.
* A Y2K statement.
WHAT'S NEW IN 1.0
WHAT'S NEW IN 1.0.0 ?
See the CHANGES file.
WHAT'S NEW IN 1.0.2 ?
See the CHANGES file.
I hope you find bzip2 useful. Feel free to contact me at
jseward@acm.org
if you have any suggestions or queries. Many people mailed me with
comments, suggestions and patches after the releases of bzip-0.15,
bzip-0.21, bzip2-0.1pl2 and bzip2-0.9.0, and the changes in bzip2 are
largely a result of this feedback. I thank you for your comments.
bzip-0.21, and bzip2 versions 0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1,
and the changes in bzip2 are largely a result of this feedback.
I thank you for your comments.
At least for the time being, bzip2's "home" is (or can be reached via)
http://www.muraroa.demon.co.uk.
http://sources.redhat.com/bzip2.
Julian Seward
jseward@acm.org
Cambridge, UK
18 July 1996 (version 0.15)
25 August 1996 (version 0.21)
7 August 1997 (bzip2, version 0.1)
29 August 1997 (bzip2, version 0.1pl2)
23 August 1998 (bzip2, version 0.9.0)
8 June 1999 (bzip2, version 0.9.5)
4 Sept 1999 (bzip2, version 0.9.5d)
5 May 2000 (bzip2, version 1.0pre8)
Cambridge, UK (and what a great town this is!)
18 July 1996 (version 0.15)
25 August 1996 (version 0.21)
7 August 1997 (bzip2, version 0.1)
29 August 1997 (bzip2, version 0.1pl2)
23 August 1998 (bzip2, version 0.9.0)
8 June 1999 (bzip2, version 0.9.5)
4 Sept 1999 (bzip2, version 0.9.5d)
5 May 2000 (bzip2, version 1.0pre8)
30 December 2001 (bzip2, version 1.0.2pre1)


@ -117,11 +117,11 @@ Known problems as of 1.0pre8:
All that said: you might be able to get somewhere
by finding the line in Makefile-libbz2_so which says
$(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.1 $(OBJS)
$(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.2 $(OBJS)
and replacing with
($CC) -G -shared -o libbz2.so.1.0.1 -h libbz2.so.1.0 $(OBJS)
$(CC) -G -shared -o libbz2.so.1.0.2 -h libbz2.so.1.0 $(OBJS)
If gcc objects to the combination -fpic -fPIC, get rid of
the second one, leaving just "-fpic".


@ -1,4 +1,4 @@
/* $NetBSD: blocksort.c,v 1.1.1.1 2001/06/03 13:03:01 simonb Exp $ */
/* $NetBSD: blocksort.c,v 1.1.1.2 2002/03/15 01:35:18 mjl Exp $ */
/*-------------------------------------------------------------*/
@ -10,7 +10,7 @@
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
@ -983,7 +983,14 @@ void mainSort ( UInt32* ptr,
}
}
AssertH ( copyStart[ss]-1 == copyEnd[ss], 1007 );
AssertH ( (copyStart[ss]-1 == copyEnd[ss])
||
/* Extremely rare case missing in bzip2-1.0.0 and 1.0.1.
Necessity for this case is demonstrated by compressing
a sequence of approximately 48.5 million of character
251; 1.0.0/1.0.1 will then die here. */
(copyStart[ss] == 0 && copyEnd[ss] == nblock-1),
1007 )
for (j = 0; j <= 255; j++) ftab[(j << 8) + ss] |= SETMASK;

76
dist/bzip2/bzdiff vendored Normal file

@ -0,0 +1,76 @@
#!/bin/sh
# sh is buggy on RS/6000 AIX 3.2. Replace above line with #!/bin/ksh
# Bzcmp/diff wrapped for bzip2,
# adapted from zdiff by Philippe Troin <phil@fifi.org> for Debian GNU/Linux.
# Bzcmp and bzdiff are used to invoke the cmp or the diff pro-
# gram on compressed files. All options specified are passed
# directly to cmp or diff. If only 1 file is specified, then
# the files compared are file1 and an uncompressed file1.gz.
# If two files are specified, then they are uncompressed (if
# necessary) and fed to cmp or diff. The exit status from cmp
# or diff is preserved.
PATH="/usr/bin:$PATH"; export PATH
prog=`echo $0 | sed 's|.*/||'`
case "$prog" in
*cmp) comp=${CMP-cmp} ;;
*) comp=${DIFF-diff} ;;
esac
OPTIONS=
FILES=
for ARG
do
case "$ARG" in
-*) OPTIONS="$OPTIONS $ARG";;
*) if test -f "$ARG"; then
FILES="$FILES $ARG"
else
echo "${prog}: $ARG not found or not a regular file"
exit 1
fi ;;
esac
done
if test -z "$FILES"; then
echo "Usage: $prog [${comp}_options] file [file]"
exit 1
fi
tmp=`tempfile -d /tmp -p bz` || {
echo 'cannot create a temporary file' >&2
exit 1
}
set $FILES
if test $# -eq 1; then
FILE=`echo "$1" | sed 's/.bz2$//'`
bzip2 -cd "$FILE.bz2" | $comp $OPTIONS - "$FILE"
STAT="$?"
elif test $# -eq 2; then
case "$1" in
*.bz2)
case "$2" in
*.bz2)
F=`echo "$2" | sed 's|.*/||;s|.bz2$||'`
bzip2 -cdfq "$2" > $tmp
bzip2 -cdfq "$1" | $comp $OPTIONS - $tmp
STAT="$?"
/bin/rm -f $tmp;;
*) bzip2 -cdfq "$1" | $comp $OPTIONS - "$2"
STAT="$?";;
esac;;
*) case "$2" in
*.bz2)
bzip2 -cdfq "$2" | $comp $OPTIONS "$1" -
STAT="$?";;
*) $comp $OPTIONS "$1" "$2"
STAT="$?";;
esac;;
esac
exit "$STAT"
else
echo "Usage: $prog [${comp}_options] file [file]"
exit 1
fi

49
dist/bzip2/bzdiff.1 vendored Normal file

@ -0,0 +1,49 @@
.\" $NetBSD: bzdiff.1,v 1.1.1.1 2002/03/15 01:35:18 mjl Exp $
.\"
\"Shamelessly copied from zmore.1 by Philippe Troin <phil@fifi.org>
\"for Debian GNU/Linux
.TH BZDIFF 1
.SH NAME
bzcmp, bzdiff \- compare bzip2 compressed files
.SH SYNOPSIS
.B bzcmp
[ cmp_options ] file1
[ file2 ]
.br
.B bzdiff
[ diff_options ] file1
[ file2 ]
.SH DESCRIPTION
.I Bzcmp
and
.I bzdiff
are used to invoke the
.I cmp
or the
.I diff
program on bzip2 compressed files. All options specified are passed
directly to
.I cmp
or
.IR diff "."
If only 1 file is specified, then the files compared are
.I file1
and an uncompressed
.IR file1 ".bz2."
If two files are specified, then they are uncompressed if necessary and fed to
.I cmp
or
.IR diff "."
The exit status from
.I cmp
or
.I diff
is preserved.
.SH "SEE ALSO"
cmp(1), diff(1), bzmore(1), bzless(1), bzgrep(1), bzip2(1)
.SH BUGS
Messages from the
.I cmp
or
.I diff
programs refer to temporary filenames instead of those specified.

71
dist/bzip2/bzgrep vendored Normal file

@ -0,0 +1,71 @@
#!/bin/sh
# Bzgrep wrapped for bzip2,
# adapted from zgrep by Philippe Troin <phil@fifi.org> for Debian GNU/Linux.
## zgrep notice:
## zgrep -- a wrapper around a grep program that decompresses files as needed
## Adapted from a version sent by Charles Levert <charles@comm.polymtl.ca>
PATH="/usr/bin:$PATH"; export PATH
prog=`echo $0 | sed 's|.*/||'`
case "$prog" in
*egrep) grep=${EGREP-egrep} ;;
*fgrep) grep=${FGREP-fgrep} ;;
*) grep=${GREP-grep} ;;
esac
pat=""
while test $# -ne 0; do
case "$1" in
-e | -f) opt="$opt $1"; shift; pat="$1"
if test "$grep" = grep; then # grep is buggy with -e on SVR4
grep=egrep
fi;;
-A | -B) opt="$opt $1 $2"; shift;;
-*) opt="$opt $1";;
*) if test -z "$pat"; then
pat="$1"
else
break;
fi;;
esac
shift
done
if test -z "$pat"; then
echo "grep through bzip2 files"
echo "usage: $prog [grep_options] pattern [files]"
exit 1
fi
list=0
silent=0
op=`echo "$opt" | sed -e 's/ //g' -e 's/-//g'`
case "$op" in
*l*) list=1
esac
case "$op" in
*h*) silent=1
esac
if test $# -eq 0; then
bzip2 -cdfq | $grep $opt "$pat"
exit $?
fi
res=0
for i do
if test -f "$i"; then :; else if test -f "$i.bz2"; then i="$i.bz2"; fi; fi
if test $list -eq 1; then
bzip2 -cdfq "$i" | $grep $opt "$pat" 2>&1 > /dev/null && echo $i
r=$?
elif test $# -eq 1 -o $silent -eq 1; then
bzip2 -cdfq "$i" | $grep $opt "$pat"
r=$?
else
bzip2 -cdfq "$i" | $grep $opt "$pat" | sed "s|^|${i}:|"
r=$?
fi
test "$r" -ne 0 && res="$r"
done
exit $res

58
dist/bzip2/bzgrep.1 vendored Normal file

@ -0,0 +1,58 @@
.\" $NetBSD: bzgrep.1,v 1.1.1.1 2002/03/15 01:35:18 mjl Exp $
.\"
\"Shamelessly copied from zmore.1 by Philippe Troin <phil@fifi.org>
\"for Debian GNU/Linux
.TH BZGREP 1
.SH NAME
bzgrep, bzfgrep, bzegrep \- search possibly bzip2 compressed files for a regular expression
.SH SYNOPSIS
.B bzgrep
[ grep_options ]
.BI [\ -e\ ] " pattern"
.IR filename ".\|.\|."
.br
.B bzegrep
[ egrep_options ]
.BI [\ -e\ ] " pattern"
.IR filename ".\|.\|."
.br
.B bzfgrep
[ fgrep_options ]
.BI [\ -e\ ] " pattern"
.IR filename ".\|.\|."
.SH DESCRIPTION
.IR Bzgrep
is used to invoke the
.I grep
on bzip2-compressed files. All options specified are passed directly to
.I grep.
If no file is specified, then the standard input is decompressed
if necessary and fed to grep.
Otherwise the given files are uncompressed if necessary and fed to
.I grep.
.PP
If
.I bzgrep
is invoked as
.I bzegrep
or
.I bzfgrep
then
.I egrep
or
.I fgrep
is used instead of
.I grep.
If the GREP environment variable is set,
.I bzgrep
uses it as the
.I grep
program to be invoked. For example:
for sh: GREP=fgrep bzgrep string files
for csh: (setenv GREP fgrep; bzgrep string files)
.SH AUTHOR
Charles Levert (charles@comm.polymtl.ca). Adapted to bzip2 by Philippe
Troin <phil@fifi.org> for Debian GNU/Linux.
.SH "SEE ALSO"
grep(1), egrep(1), fgrep(1), bzdiff(1), bzmore(1), bzless(1), bzip2(1)

58
dist/bzip2/bzip2.1 vendored

@ -1,9 +1,9 @@
.\" $NetBSD: bzip2.1,v 1.1.1.1 2001/06/03 13:03:02 simonb Exp $
.\" $NetBSD: bzip2.1,v 1.1.1.2 2002/03/15 01:35:23 mjl Exp $
.\"
.PU
.TH bzip2 1
.SH NAME
bzip2, bunzip2 \- a block-sorting file compressor, v1.0
bzip2, bunzip2 \- a block-sorting file compressor, v1.0.2
.br
bzcat \- decompresses files to stdout
.br
@ -199,7 +199,7 @@ to decompress.
.TP
.B \-z --compress
The complement to \-d: forces compression, regardless of the
invokation name.
invocation name.
.TP
.B \-t --test
Check integrity of the specified file(s), but don't decompress them.
@ -213,6 +213,10 @@ existing output files. Also forces
.I bzip2
to break hard links
to files, which it otherwise wouldn't do.
bzip2 normally declines to decompress files which don't have the
correct magic header bytes. If forced (-f), however, it will pass
such files through unmodified. This is how GNU gzip behaves.
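
The "magic header bytes" mentioned above are the letters B, Z, h followed by a block-size digit from 1 to 9. As a rough illustration only, the check can be sketched as below. The function name looks_like_bzip2 is hypothetical and is not taken from bzip2.c. The byte values are written numerically rather than as character constants, in the spirit of the CHANGES entry about hard-coding header bytes for EBCDIC platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT taken from bzip2.c.
   Returns 1 if buf begins with a bzip2 stream header, else 0.
   The comparisons use numeric byte values, so the result does not
   depend on the compiler's execution character set. */
static int looks_like_bzip2(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len < 4)
        return 0;
    return buf[0] == 0x42      /* 'B' in ASCII */
        && buf[1] == 0x5A      /* 'Z' in ASCII */
        && buf[2] == 0x68      /* 'h' in ASCII */
        && buf[3] >= 0x31      /* '1': 100 k blocks */
        && buf[3] <= 0x39;     /* '9': 900 k blocks */
}
```

With a check of this kind, a tool running in forced mode can copy the input through unchanged whenever the function returns 0, instead of reporting an error.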
.TP
.B \-k --keep
Keep (don't delete) input files during compression
@ -241,9 +245,13 @@ information which is primarily of interest for diagnostic purposes.
.B \-L --license -V --version
Display the software version, license terms and conditions.
.TP
.B \-1 to \-9
.B \-1 (or \-\-fast) to \-9 (or \-\-best)
Set the block size to 100 k, 200 k .. 900 k when compressing. Has no
effect when decompressing. See MEMORY MANAGEMENT below.
The \-\-fast and \-\-best aliases are primarily for GNU gzip
compatibility. In particular, \-\-fast doesn't make things
significantly faster.
And \-\-best merely selects the default behaviour.
.TP
.B \--
Treats all subsequent arguments as file names, even if they start
@ -354,11 +362,11 @@ undamaged.
.I bzip2recover
takes a single argument, the name of the damaged file,
and writes a number of files "rec0001file.bz2",
"rec0002file.bz2", etc, containing the extracted blocks.
and writes a number of files "rec00001file.bz2",
"rec00002file.bz2", etc, containing the extracted blocks.
The output filenames are designed so that the use of
wildcards in subsequent processing -- for example,
"bzip2 -dc rec*file.bz2 > recovered_data" -- lists the files in
"bzip2 -dc rec*file.bz2 > recovered_data" -- processes the files in
the correct order.
.I bzip2recover
@ -399,27 +407,31 @@ I/O error messages are not as helpful as they could be.
tries hard to detect I/O errors and exit cleanly, but the details of
what the problem is sometimes seem rather misleading.
This manual page pertains to version 1.0 of
This manual page pertains to version 1.0.2 of
.I bzip2.
Compressed
data created by this version is entirely forwards and backwards
compatible with the previous public releases, versions 0.1pl2, 0.9.0
and 0.9.5,
but with the following exception: 0.9.0 and above can correctly
decompress multiple concatenated compressed files. 0.1pl2 cannot do
this; it will stop after decompressing just the first file in the
stream.
Compressed data created by this version is entirely forwards and
backwards compatible with the previous public releases, versions
0.1pl2, 0.9.0, 0.9.5, 1.0.0 and 1.0.1, but with the following
exception: 0.9.0 and above can correctly decompress multiple
concatenated compressed files. 0.1pl2 cannot do this; it will stop
after decompressing just the first file in the stream.
.I bzip2recover
uses 32-bit integers to represent bit positions in
compressed files, so it cannot handle compressed files more than 512
megabytes long. This could easily be fixed.
versions prior to this one, 1.0.2, used 32-bit integers to represent
bit positions in compressed files, so it could not handle compressed
files more than 512 megabytes long. Version 1.0.2 and above uses
64-bit ints on some platforms which support them (GNU supported
targets, and Windows). To establish whether or not bzip2recover was
built with such a limitation, run it without arguments. In any event
you can build yourself an unlimited version if you can recompile it
with MaybeUInt64 set to be an unsigned 64-bit integer.
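
The 512 megabyte figure above follows directly from counting bit positions in a 32-bit integer: 2^32 bits divided by 8 bits per byte is 2^29 bytes. The arithmetic can be sketched as follows. The function name addressable_bytes is illustrative only and does not appear in bzip2recover.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only, not part of bzip2recover.
   Number of whole bytes whose bit positions can all be indexed by
   an unsigned counter that is `counter_bits` wide (3..64).
   2^counter_bits bits / 8 bits per byte = 2^(counter_bits - 3). */
static uint64_t addressable_bytes(unsigned counter_bits)
{
    assert(counter_bits >= 3 && counter_bits <= 64);
    return (uint64_t)1 << (counter_bits - 3);
}
```

For a 32-bit counter this gives 536870912 bytes, which is 512 * 1024 * 1024. A 64-bit counter raises the limit to 2^61 bytes, far beyond any practical file size.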
.SH AUTHOR
Julian Seward, jseward@acm.org.
http://sourceware.cygnus.com/bzip2
http://www.muraroa.demon.co.uk
http://sources.redhat.com/bzip2
The ideas embodied in
.I bzip2
@ -436,6 +448,8 @@ indebted for their help, support and advice. See the manual in the
source distribution for pointers to sources of documentation. Christian
von Roques encouraged me to look for faster sorting algorithms, so as to
speed up compression. Bela Lubkin encouraged me to improve the
worst-case compression performance. Many people sent patches, helped
worst-case compression performance.
The bz* scripts are derived from those of GNU gzip.
Many people sent patches, helped
with portability problems, lent machines, gave advice and were generally
helpful.

537
dist/bzip2/bzip2.c vendored

@ -1,4 +1,4 @@
/* $NetBSD: bzip2.c,v 1.1.1.1 2001/06/03 13:03:03 simonb Exp $ */
/* $NetBSD: bzip2.c,v 1.1.1.2 2002/03/15 01:35:24 mjl Exp $ */
/*-----------------------------------------------------------*/
@ -9,7 +9,7 @@
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
@ -115,13 +115,16 @@
/*--
Generic 32-bit Unix.
Also works on 64-bit Unix boxes.
This is the default.
--*/
#define BZ_UNIX 1
/*--
Win32, as seen by Jacob Navia's excellent
port of (Chris Fraser & David Hanson)'s excellent
lcc compiler.
lcc compiler. Or with MS Visual C.
This is selected automatically if compiled by a compiler which
defines _WIN32, not including the Cygwin GCC.
--*/
#define BZ_LCCWIN32 0
@ -158,6 +161,7 @@
--*/
#if BZ_UNIX
# include <fcntl.h>
# include <sys/types.h>
# include <utime.h>
# include <unistd.h>
@ -166,8 +170,9 @@
# define PATH_SEP '/'
# define MY_LSTAT lstat
# define MY_S_IFREG S_ISREG
# define MY_STAT stat
# define MY_S_ISREG S_ISREG
# define MY_S_ISDIR S_ISDIR
# define APPEND_FILESPEC(root, name) \
root=snocString((root), (name))
@ -182,19 +187,23 @@
# else
# define NORETURN /**/
# endif
# ifdef __DJGPP__
# include <io.h>
# include <fcntl.h>
# undef MY_LSTAT
# undef MY_STAT
# define MY_LSTAT stat
# define MY_STAT stat
# undef SET_BINARY_MODE
# define SET_BINARY_MODE(fd) \
do { \
int retVal = setmode ( fileno ( fd ), \
O_BINARY ); \
O_BINARY ); \
ERROR_IF_MINUS_ONE ( retVal ); \
} while ( 0 )
# endif
# ifdef __CYGWIN__
# include <io.h>
# include <fcntl.h>
@ -202,11 +211,11 @@
# define SET_BINARY_MODE(fd) \
do { \
int retVal = setmode ( fileno ( fd ), \
O_BINARY ); \
O_BINARY ); \
ERROR_IF_MINUS_ONE ( retVal ); \
} while ( 0 )
# endif
#endif
#endif /* BZ_UNIX */
@ -219,46 +228,23 @@
# define PATH_SEP '\\'
# define MY_LSTAT _stat
# define MY_STAT _stat
# define MY_S_IFREG(x) ((x) & _S_IFREG)
# define MY_S_ISREG(x) ((x) & _S_IFREG)
# define MY_S_ISDIR(x) ((x) & _S_IFDIR)
# define APPEND_FLAG(root, name) \
root=snocString((root), (name))
# if 0
/*-- lcc-win32 seems to expand wildcards itself --*/
# define APPEND_FILESPEC(root, spec) \
do { \
if ((spec)[0] == '-') { \
root = snocString((root), (spec)); \
} else { \
struct _finddata_t c_file; \
long hFile; \
hFile = _findfirst((spec), &c_file); \
if ( hFile == -1L ) { \
root = snocString ((root), (spec)); \
} else { \
int anInt = 0; \
while ( anInt == 0 ) { \
root = snocString((root), \
&c_file.name[0]); \
anInt = _findnext(hFile, &c_file); \
} \
} \
} \
} while ( 0 )
# else
# define APPEND_FILESPEC(root, name) \
root = snocString ((root), (name))
# endif
# define SET_BINARY_MODE(fd) \
do { \
int retVal = setmode ( fileno ( fd ), \
O_BINARY ); \
O_BINARY ); \
ERROR_IF_MINUS_ONE ( retVal ); \
} while ( 0 )
#endif
#endif /* BZ_LCCWIN32 */
/*---------------------------------------------*/
@ -340,6 +326,7 @@ typedef
struct { UChar b[8]; }
UInt64;
static
void uInt64_from_UInt32s ( UInt64* n, UInt32 lo32, UInt32 hi32 )
{
@ -353,6 +340,7 @@ void uInt64_from_UInt32s ( UInt64* n, UInt32 lo32, UInt32 hi32 )
n->b[0] = (UChar) (lo32 & 0xFF);
}
static
double uInt64_to_double ( UInt64* n )
{
@ -366,77 +354,6 @@ double uInt64_to_double ( UInt64* n )
return sum;
}
static
void uInt64_add ( UInt64* src, UInt64* dst )
{
Int32 i;
Int32 carry = 0;
for (i = 0; i < 8; i++) {
carry += ( ((Int32)src->b[i]) + ((Int32)dst->b[i]) );
dst->b[i] = (UChar)(carry & 0xFF);
carry >>= 8;
}
}
static
void uInt64_sub ( UInt64* src, UInt64* dst )
{
Int32 t, i;
Int32 borrow = 0;
for (i = 0; i < 8; i++) {
t = ((Int32)dst->b[i]) - ((Int32)src->b[i]) - borrow;
if (t < 0) {
dst->b[i] = (UChar)(t + 256);
borrow = 1;
} else {
dst->b[i] = (UChar)t;
borrow = 0;
}
}
}
static
void uInt64_mul ( UInt64* a, UInt64* b, UInt64* r_hi, UInt64* r_lo )
{
UChar sum[16];
Int32 ia, ib, carry;
for (ia = 0; ia < 16; ia++) sum[ia] = 0;
for (ia = 0; ia < 8; ia++) {
carry = 0;
for (ib = 0; ib < 8; ib++) {
carry += ( ((Int32)sum[ia+ib])
+ ((Int32)a->b[ia]) * ((Int32)b->b[ib]) );
sum[ia+ib] = (UChar)(carry & 0xFF);
carry >>= 8;
}
sum[ia+8] = (UChar)(carry & 0xFF);
if ((carry >>= 8) != 0) panic ( "uInt64_mul" );
}
for (ia = 0; ia < 8; ia++) r_hi->b[ia] = sum[ia+8];
for (ia = 0; ia < 8; ia++) r_lo->b[ia] = sum[ia];
}
static
void uInt64_shr1 ( UInt64* n )
{
Int32 i;
for (i = 0; i < 8; i++) {
n->b[i] >>= 1;
if (i < 7 && (n->b[i+1] & 1)) n->b[i] |= 0x80;
}
}
static
void uInt64_shl1 ( UInt64* n )
{
Int32 i;
for (i = 7; i >= 0; i--) {
n->b[i] <<= 1;
if (i > 0 && (n->b[i-1] & 0x80)) n->b[i]++;
}
}
static
Bool uInt64_isZero ( UInt64* n )
@ -447,49 +364,23 @@ Bool uInt64_isZero ( UInt64* n )
return 1;
}
static
/* Divide *n by 10, and return the remainder. */
static
Int32 uInt64_qrm10 ( UInt64* n )
{
/* Divide *n by 10, and return the remainder. Long division
is difficult, so we cheat and instead multiply by
0xCCCC CCCC CCCC CCCD, which is 0.8 (viz, 0.1 << 3).
*/
UInt32 rem, tmp;
Int32 i;
UInt64 tmp1, tmp2, n_orig, zero_point_eight;
zero_point_eight.b[1] = zero_point_eight.b[2] =
zero_point_eight.b[3] = zero_point_eight.b[4] =
zero_point_eight.b[5] = zero_point_eight.b[6] =
zero_point_eight.b[7] = 0xCC;
zero_point_eight.b[0] = 0xCD;
n_orig = *n;
/* divide n by 10,
by multiplying by 0.8 and then shifting right 3 times */
uInt64_mul ( n, &zero_point_eight, &tmp1, &tmp2 );
uInt64_shr1(&tmp1); uInt64_shr1(&tmp1); uInt64_shr1(&tmp1);
*n = tmp1;
/* tmp1 = 8*n, tmp2 = 2*n */
uInt64_shl1(&tmp1); uInt64_shl1(&tmp1); uInt64_shl1(&tmp1);
tmp2 = *n; uInt64_shl1(&tmp2);
/* tmp1 = 10*n */
uInt64_add ( &tmp2, &tmp1 );
/* n_orig = n_orig - 10*n */
uInt64_sub ( &tmp1, &n_orig );
/* n_orig should now hold quotient, in range 0 .. 9 */
for (i = 7; i >= 1; i--)
if (n_orig.b[i] != 0) panic ( "uInt64_qrm10(1)" );
if (n_orig.b[0] > 9)
panic ( "uInt64_qrm10(2)" );
return (int)n_orig.b[0];
rem = 0;
for (i = 7; i >= 0; i--) {
tmp = rem * 256 + n->b[i];
n->b[i] = tmp / 10;
rem = tmp % 10;
}
return rem;
}
/* ... and the Whole Entire Point of all this UInt64 stuff is
so that we can supply the following function.
*/
@ -506,7 +397,8 @@ void uInt64_toAscii ( char* outbuf, UInt64* n )
nBuf++;
} while (!uInt64_isZero(&n_copy));
outbuf[nBuf] = 0;
for (i = 0; i < nBuf; i++) outbuf[i] = buf[nBuf-i-1];
for (i = 0; i < nBuf; i++)
outbuf[i] = buf[nBuf-i-1];
}
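The replacement uInt64_qrm10 above is plain schoolbook long division over the byte array, working from the most significant byte down, and uInt64_toAscii calls it repeatedly to peel off decimal digits. A minimal standalone sketch of that technique follows. The names U64Bytes, u64_qrm10, u64_is_zero, u64_to_ascii and u64_from_native are hypothetical and do not appear in bzip2; the layout (eight bytes, least significant first) is assumed from the diff.

```c
/* Hypothetical stand-in for the UInt64 struct in the diff:
   eight bytes, least significant byte first. */
typedef struct { unsigned char b[8]; } U64Bytes;

/* Divide *n by 10 in place and return the remainder, using
   long division from the most significant byte downwards. */
static int u64_qrm10(U64Bytes *n)
{
    unsigned int rem = 0;
    for (int i = 7; i >= 0; i--) {
        unsigned int tmp = rem * 256 + n->b[i];  /* at most 9*256 + 255 */
        n->b[i] = (unsigned char)(tmp / 10);     /* quotient fits in a byte */
        rem = tmp % 10;
    }
    return (int)rem;
}

/* Return 1 if every byte of *n is zero, otherwise 0. */
static int u64_is_zero(const U64Bytes *n)
{
    for (int i = 0; i < 8; i++)
        if (n->b[i] != 0) return 0;
    return 1;
}

/* Write the decimal form of *n into outbuf (at least 21 bytes).
   Digits come out least significant first, so they are reversed. */
static void u64_to_ascii(char *outbuf, const U64Bytes *n)
{
    char buf[32];
    int nBuf = 0;
    U64Bytes copy = *n;
    do {
        buf[nBuf++] = (char)('0' + u64_qrm10(&copy));
    } while (!u64_is_zero(&copy));
    for (int i = 0; i < nBuf; i++)
        outbuf[i] = buf[nBuf - i - 1];
    outbuf[nBuf] = '\0';
}

/* Demonstration helper only: load a native 64-bit value. */
static U64Bytes u64_from_native(unsigned long long v)
{
    U64Bytes n;
    for (int i = 0; i < 8; i++) {
        n.b[i] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
    return n;
}
```

Because the running remainder is always below 10, the intermediate value never exceeds 2559, so ordinary unsigned int arithmetic is enough and no multi-byte multiply, shift or subtract helpers are required.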
@ -568,35 +460,38 @@ void compressStream ( FILE *stream, FILE *zStream )
if (ret == EOF) goto errhandler_io;
if (zStream != stdout) {
ret = fclose ( zStream );
outputHandleJustInCase = NULL;
if (ret == EOF) goto errhandler_io;
}
outputHandleJustInCase = NULL;
if (ferror(stream)) goto errhandler_io;
ret = fclose ( stream );
if (ret == EOF) goto errhandler_io;
if (nbytes_in_lo32 == 0 && nbytes_in_hi32 == 0)
nbytes_in_lo32 = 1;
if (verbosity >= 1) {
Char buf_nin[32], buf_nout[32];
UInt64 nbytes_in, nbytes_out;
double nbytes_in_d, nbytes_out_d;
uInt64_from_UInt32s ( &nbytes_in,
nbytes_in_lo32, nbytes_in_hi32 );
uInt64_from_UInt32s ( &nbytes_out,
nbytes_out_lo32, nbytes_out_hi32 );
nbytes_in_d = uInt64_to_double ( &nbytes_in );
nbytes_out_d = uInt64_to_double ( &nbytes_out );
uInt64_toAscii ( buf_nin, &nbytes_in );
uInt64_toAscii ( buf_nout, &nbytes_out );
fprintf ( stderr, "%6.3f:1, %6.3f bits/byte, "
"%5.2f%% saved, %s in, %s out.\n",
nbytes_in_d / nbytes_out_d,
(8.0 * nbytes_out_d) / nbytes_in_d,
100.0 * (1.0 - nbytes_out_d / nbytes_in_d),
buf_nin,
buf_nout
);
if (nbytes_in_lo32 == 0 && nbytes_in_hi32 == 0) {
fprintf ( stderr, " no data compressed.\n");
} else {
Char buf_nin[32], buf_nout[32];
UInt64 nbytes_in, nbytes_out;
double nbytes_in_d, nbytes_out_d;
uInt64_from_UInt32s ( &nbytes_in,
nbytes_in_lo32, nbytes_in_hi32 );
uInt64_from_UInt32s ( &nbytes_out,
nbytes_out_lo32, nbytes_out_hi32 );
nbytes_in_d = uInt64_to_double ( &nbytes_in );
nbytes_out_d = uInt64_to_double ( &nbytes_out );
uInt64_toAscii ( buf_nin, &nbytes_in );
uInt64_toAscii ( buf_nout, &nbytes_out );
fprintf ( stderr, "%6.3f:1, %6.3f bits/byte, "
"%5.2f%% saved, %s in, %s out.\n",
nbytes_in_d / nbytes_out_d,
(8.0 * nbytes_out_d) / nbytes_in_d,
100.0 * (1.0 - nbytes_out_d / nbytes_in_d),
buf_nin,
buf_nout
);
}
}
return;
@ -654,7 +549,7 @@ Bool uncompressStream ( FILE *zStream, FILE *stream )
while (bzerr == BZ_OK) {
nread = BZ2_bzRead ( &bzerr, bzf, obuf, 5000 );
if (bzerr == BZ_DATA_ERROR_MAGIC) goto errhandler;
if (bzerr == BZ_DATA_ERROR_MAGIC) goto trycat;
if ((bzerr == BZ_OK || bzerr == BZ_STREAM_END) && nread > 0)
fwrite ( obuf, sizeof(UChar), nread, stream );
if (ferror(stream)) goto errhandler_io;
@ -670,9 +565,9 @@ Bool uncompressStream ( FILE *zStream, FILE *stream )
if (bzerr != BZ_OK) panic ( "decompress:bzReadGetUnused" );
if (nUnused == 0 && myfeof(zStream)) break;
}
closeok:
if (ferror(zStream)) goto errhandler_io;
ret = fclose ( zStream );
if (ret == EOF) goto errhandler_io;
@ -682,11 +577,26 @@ Bool uncompressStream ( FILE *zStream, FILE *stream )
if (ret != 0) goto errhandler_io;
if (stream != stdout) {
ret = fclose ( stream );
outputHandleJustInCase = NULL;
if (ret == EOF) goto errhandler_io;
}
outputHandleJustInCase = NULL;
if (verbosity >= 2) fprintf ( stderr, "\n " );
return True;
trycat:
if (forceOverwrite) {
rewind(zStream);
while (True) {
if (myfeof(zStream)) break;
nread = fread ( obuf, sizeof(UChar), 5000, zStream );
if (ferror(zStream)) goto errhandler_io;
if (nread > 0) fwrite ( obuf, sizeof(UChar), nread, stream );
if (ferror(stream)) goto errhandler_io;
}
goto closeok;
}
errhandler:
BZ2_bzReadClose ( &bzerr_dummy, bzf );
switch (bzerr) {
@ -834,7 +744,7 @@ void cadvise ( void )
stderr,
"\nIt is possible that the compressed file(s) have become corrupted.\n"
"You can use the -tvv option to test integrity of such files.\n\n"
"You can use the `bzip2recover' program to *attempt* to recover\n"
"You can use the `bzip2recover' program to attempt to recover\n"
"data from undamaged sections of corrupted files.\n\n"
);
}
@ -857,28 +767,55 @@ void showFileNames ( void )
static
void cleanUpAndFail ( Int32 ec )
{
IntNative retVal;
IntNative retVal;
struct MY_STAT statBuf;
if ( srcMode == SM_F2F
&& opMode != OM_TEST
&& deleteOutputOnInterrupt ) {
if (noisy)
fprintf ( stderr, "%s: Deleting output file %s, if it exists.\n",
progName, outName );
if (outputHandleJustInCase != NULL)
fclose ( outputHandleJustInCase );
retVal = remove ( outName );
if (retVal != 0)
/* Check whether input file still exists. Delete output file
only if input exists to avoid loss of data. Joerg Prante, 5
January 2002. (JRS 06-Jan-2002: other changes in 1.0.2 mean
this is less likely to happen. But to be ultra-paranoid, we
do the check anyway.) */
retVal = MY_STAT ( inName, &statBuf );
if (retVal == 0) {
if (noisy)
fprintf ( stderr,
"%s: Deleting output file %s, if it exists.\n",
progName, outName );
if (outputHandleJustInCase != NULL)
fclose ( outputHandleJustInCase );
retVal = remove ( outName );
if (retVal != 0)
fprintf ( stderr,
"%s: WARNING: deletion of output file "
"(apparently) failed.\n",
progName );
} else {
fprintf ( stderr,
"%s: WARNING: deletion of output file (apparently) failed.\n",
"%s: WARNING: deletion of output file suppressed\n",
progName );
fprintf ( stderr,
"%s: since input file no longer exists. Output file\n",
progName );
fprintf ( stderr,
"%s: `%s' may be incomplete.\n",
progName, outName );
fprintf ( stderr,
"%s: I suggest doing an integrity test (bzip2 -tv)"
" of it.\n",
progName );
}
}
if (noisy && numFileNames > 0 && numFilesProcessed < numFileNames) {
fprintf ( stderr,
"%s: WARNING: some files have not been processed:\n"
"\t%d specified on command line, %d not processed yet.\n\n",
progName, numFileNames,
numFileNames - numFilesProcessed );
"%s: %d specified on command line, %d not processed yet.\n\n",
progName, progName,
numFileNames, numFileNames - numFilesProcessed );
}
setExit(ec);
exit(exitValue);
@ -917,14 +854,16 @@ void crcError ( void )
static
void compressedStreamEOF ( void )
{
fprintf ( stderr,
"\n%s: Compressed file ends unexpectedly;\n\t"
"perhaps it is corrupted? *Possible* reason follows.\n",
progName );
perror ( progName );
showFileNames();
cadvise();
cleanUpAndFail( 2 );
if (noisy) {
fprintf ( stderr,
"\n%s: Compressed file ends unexpectedly;\n\t"
"perhaps it is corrupted? *Possible* reason follows.\n",
progName );
perror ( progName );
showFileNames();
cadvise();
}
cleanUpAndFail( 2 );
}
@ -1040,6 +979,11 @@ void configError ( void )
/*--- The main driver machinery ---*/
/*---------------------------------------------------*/
/* All rather crufty. The main problem is that input files
are stat()d multiple times before use. This should be
cleaned up.
*/
/*---------------------------------------------*/
static
void pad ( Char *s )
@ -1083,6 +1027,32 @@ Bool fileExists ( Char* name )
}
/*---------------------------------------------*/
/* Open an output file safely with O_EXCL and good permissions.
This avoids a race condition in versions < 1.0.2, in which
the file was first opened and then had its interim permissions
set safely. We instead use open() to create the file with
the interim permissions required. (--- --- rw-).
For non-Unix platforms, if we are not worrying about
security issues, this simply behaves like fopen.
*/
FILE* fopen_output_safely ( Char* name, const char* mode )
{
# if BZ_UNIX
FILE* fp;
IntNative fh;
fh = open(name, O_WRONLY|O_CREAT|O_EXCL, S_IWUSR|S_IRUSR);
if (fh == -1) return NULL;
fp = fdopen(fh, mode);
if (fp == NULL) close(fh);
return fp;
# else
return fopen(name, mode);
# endif
}
/*---------------------------------------------*/
/*--
if in doubt, return True
@ -1095,7 +1065,7 @@ Bool notAStandardFile ( Char* name )
i = MY_LSTAT ( name, &statBuf );
if (i != 0) return True;
if (MY_S_IFREG(statBuf.st_mode)) return False;
if (MY_S_ISREG(statBuf.st_mode)) return False;
return True;
}
@ -1117,42 +1087,66 @@ Int32 countHardLinks ( Char* name )
/*---------------------------------------------*/
static
void copyDatePermissionsAndOwner ( Char *srcName, Char *dstName )
{
/* Copy modification date, access date, permissions and owner from the
source to destination file. We have to copy this meta-info off
into fileMetaInfo before starting to compress / decompress it,
because doing it afterwards means we get the wrong access time.
To complicate matters, in compress() and decompress() below, the
sequence of tests preceding the call to saveInputFileMetaInfo()
involves calling fileExists(), which in turn establishes its result
by attempting to fopen() the file, and if successful, immediately
fclose()ing it again. So we have to assume that the fopen() call
does not cause the access time field to be updated.
Reading of the man page for stat() (man 2 stat) on RedHat 7.2 seems
to imply that merely doing open() will not affect the access time.
Therefore we merely need to hope that the C library only does
open() as a result of fopen(), and not any kind of read()-ahead
cleverness.
It sounds pretty fragile to me. Whether this carries across
robustly to arbitrary Unix-like platforms (or even works robustly
on this one, RedHat 7.2) is unknown to me. Nevertheless ...
*/
#if BZ_UNIX
static
struct MY_STAT fileMetaInfo;
#endif
static
void saveInputFileMetaInfo ( Char *srcName )
{
# if BZ_UNIX
IntNative retVal;
/* Note use of stat here, not lstat. */
retVal = MY_STAT( srcName, &fileMetaInfo );
ERROR_IF_NOT_ZERO ( retVal );
# endif
}
static
void applySavedMetaInfoToOutputFile ( Char *dstName )
{
# if BZ_UNIX
IntNative retVal;
struct MY_STAT statBuf;
struct utimbuf uTimBuf;
retVal = MY_LSTAT ( srcName, &statBuf );
ERROR_IF_NOT_ZERO ( retVal );
uTimBuf.actime = statBuf.st_atime;
uTimBuf.modtime = statBuf.st_mtime;
uTimBuf.actime = fileMetaInfo.st_atime;
uTimBuf.modtime = fileMetaInfo.st_mtime;
retVal = chmod ( dstName, statBuf.st_mode );
retVal = chmod ( dstName, fileMetaInfo.st_mode );
ERROR_IF_NOT_ZERO ( retVal );
retVal = utime ( dstName, &uTimBuf );
ERROR_IF_NOT_ZERO ( retVal );
retVal = chown ( dstName, statBuf.st_uid, statBuf.st_gid );
retVal = chown ( dstName, fileMetaInfo.st_uid, fileMetaInfo.st_gid );
/* chown() will in many cases return with EPERM, which can
be safely ignored.
*/
#endif
}
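The long comment above explains why the metadata is captured before the input is opened and applied to the output only afterwards. A minimal sketch of that two-step pattern follows, assuming a POSIX system. The names saved_meta, save_meta and apply_meta are hypothetical. Unlike the code in the diff, this sketch masks the mode down to the permission bits and reports failure through a return value rather than an error macro.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <utime.h>

/* Snapshot of the source file's metadata, taken before it is opened. */
static struct stat saved_meta;

/* Record the metadata of src. Uses stat, not lstat, so a symlink's
   target is described. Returns 0 on success, -1 on failure. */
static int save_meta(const char *src)
{
    return stat(src, &saved_meta);
}

/* Apply the saved permissions and timestamps to dst. A chown failure
   is ignored, since it commonly fails with EPERM for unprivileged
   users. Returns 0 on success, -1 otherwise. */
static int apply_meta(const char *dst)
{
    struct utimbuf times;
    times.actime  = saved_meta.st_atime;
    times.modtime = saved_meta.st_mtime;

    if (chmod(dst, saved_meta.st_mode & 07777) != 0) return -1;
    if (utime(dst, &times) != 0) return -1;
    if (chown(dst, saved_meta.st_uid, saved_meta.st_gid) != 0) {
        /* deliberately ignored, see comment above */
    }
    return 0;
}
```

The intended call order is save_meta(input), then open and process the input, then apply_meta(output), so that reading the input cannot disturb the access time that ends up on the output.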
/*---------------------------------------------*/
static
void setInterimPermissions ( Char *dstName )
{
#if BZ_UNIX
IntNative retVal;
retVal = chmod ( dstName, S_IRUSR | S_IWUSR );
ERROR_IF_NOT_ZERO ( retVal );
#endif
# endif
}
@ -1160,10 +1154,19 @@ void setInterimPermissions ( Char *dstName )
static
Bool containsDubiousChars ( Char* name )
{
Bool cdc = False;
# if BZ_UNIX
/* On unix, files can contain any characters and the file expansion
* is performed by the shell.
*/
return False;
# else /* ! BZ_UNIX */
/* On non-unix (Win* platforms), wildcard characters are not allowed in
* filenames.
*/
for (; *name != '\0'; name++)
if (*name == '?' || *name == '*') cdc = True;
return cdc;
if (*name == '?' || *name == '*') return True;
return False;
# endif /* BZ_UNIX */
}
@ -1203,6 +1206,7 @@ void compress ( Char *name )
FILE *inStr;
FILE *outStr;
Int32 n, i;
struct MY_STAT statBuf;
deleteOutputOnInterrupt = False;
@ -1248,6 +1252,16 @@ void compress ( Char *name )
return;
}
}
if ( srcMode == SM_F2F || srcMode == SM_F2O ) {
MY_STAT(inName, &statBuf);
if ( MY_S_ISDIR(statBuf.st_mode) ) {
fprintf( stderr,
"%s: Input file %s is a directory.\n",
progName,inName);
setExit(1);
return;
}
}
if ( srcMode == SM_F2F && !forceOverwrite && notAStandardFile ( inName )) {
if (noisy)
fprintf ( stderr, "%s: Input file %s is not a normal file.\n",
@ -1255,11 +1269,15 @@ void compress ( Char *name )
setExit(1);
return;
}
if ( srcMode == SM_F2F && !forceOverwrite && fileExists ( outName ) ) {
fprintf ( stderr, "%s: Output file %s already exists.\n",
progName, outName );
setExit(1);
return;
if ( srcMode == SM_F2F && fileExists ( outName ) ) {
if (forceOverwrite) {
remove(outName);
} else {
fprintf ( stderr, "%s: Output file %s already exists.\n",
progName, outName );
setExit(1);
return;
}
}
if ( srcMode == SM_F2F && !forceOverwrite &&
(n=countHardLinks ( inName )) > 0) {
@ -1269,6 +1287,12 @@ void compress ( Char *name )
return;
}
if ( srcMode == SM_F2F ) {
/* Save the file's meta-info before we open it. Doing it later
means we mess up the access times. */
saveInputFileMetaInfo ( inName );
}
switch ( srcMode ) {
case SM_I2O:
@ -1308,7 +1332,7 @@ void compress ( Char *name )
case SM_F2F:
inStr = fopen ( inName, "rb" );
outStr = fopen ( outName, "wb" );
outStr = fopen_output_safely ( outName, "wb" );
if ( outStr == NULL) {
fprintf ( stderr, "%s: Can't create output file %s: %s.\n",
progName, outName, strerror(errno) );
@ -1323,7 +1347,6 @@ void compress ( Char *name )
setExit(1);
return;
};
setInterimPermissions ( outName );
break;
default:
@ -1345,7 +1368,7 @@ void compress ( Char *name )
/*--- If there was an I/O error, we won't get here. ---*/
if ( srcMode == SM_F2F ) {
copyDatePermissionsAndOwner ( inName, outName );
applySavedMetaInfoToOutputFile ( outName );
deleteOutputOnInterrupt = False;
if ( !keepInputFiles ) {
IntNative retVal = remove ( inName );
@ -1366,6 +1389,7 @@ void uncompress ( Char *name )
Int32 n, i;
Bool magicNumberOK;
Bool cantGuess;
struct MY_STAT statBuf;
deleteOutputOnInterrupt = False;
@ -1407,6 +1431,16 @@ void uncompress ( Char *name )
setExit(1);
return;
}
if ( srcMode == SM_F2F || srcMode == SM_F2O ) {
MY_STAT(inName, &statBuf);
if ( MY_S_ISDIR(statBuf.st_mode) ) {
fprintf( stderr,
"%s: Input file %s is a directory.\n",
progName,inName);
setExit(1);
return;
}
}
if ( srcMode == SM_F2F && !forceOverwrite && notAStandardFile ( inName )) {
if (noisy)
fprintf ( stderr, "%s: Input file %s is not a normal file.\n",
@ -1421,11 +1455,15 @@ void uncompress ( Char *name )
progName, inName, outName );
/* just a warning, no return */
}
if ( srcMode == SM_F2F && !forceOverwrite && fileExists ( outName ) ) {
fprintf ( stderr, "%s: Output file %s already exists.\n",
progName, outName );
setExit(1);
return;
if ( srcMode == SM_F2F && fileExists ( outName ) ) {
if (forceOverwrite) {
remove(outName);
} else {
fprintf ( stderr, "%s: Output file %s already exists.\n",
progName, outName );
setExit(1);
return;
}
}
if ( srcMode == SM_F2F && !forceOverwrite &&
(n=countHardLinks ( inName ) ) > 0) {
@ -1435,6 +1473,12 @@ void uncompress ( Char *name )
return;
}
if ( srcMode == SM_F2F ) {
/* Save the file's meta-info before we open it. Doing it later
means we mess up the access times. */
saveInputFileMetaInfo ( inName );
}
switch ( srcMode ) {
case SM_I2O:
@ -1465,7 +1509,7 @@ void uncompress ( Char *name )
case SM_F2F:
inStr = fopen ( inName, "rb" );
outStr = fopen ( outName, "wb" );
outStr = fopen_output_safely ( outName, "wb" );
if ( outStr == NULL) {
fprintf ( stderr, "%s: Can't create output file %s: %s.\n",
progName, outName, strerror(errno) );
@ -1480,7 +1524,6 @@ void uncompress ( Char *name )
setExit(1);
return;
};
setInterimPermissions ( outName );
break;
default:
@ -1503,7 +1546,7 @@ void uncompress ( Char *name )
/*--- If there was an I/O error, we won't get here. ---*/
if ( magicNumberOK ) {
if ( srcMode == SM_F2F ) {
copyDatePermissionsAndOwner ( inName, outName );
applySavedMetaInfoToOutputFile ( outName );
deleteOutputOnInterrupt = False;
if ( !keepInputFiles ) {
IntNative retVal = remove ( inName );
@ -1541,6 +1584,7 @@ void testf ( Char *name )
{
FILE *inStr;
Bool allOK;
struct MY_STAT statBuf;
deleteOutputOnInterrupt = False;
@ -1567,6 +1611,16 @@ void testf ( Char *name )
setExit(1);
return;
}
if ( srcMode != SM_I2O ) {
MY_STAT(inName, &statBuf);
if ( MY_S_ISDIR(statBuf.st_mode) ) {
fprintf( stderr,
"%s: Input file %s is a directory.\n",
progName,inName);
setExit(1);
return;
}
}
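The directory checks added to compress, uncompress and testf call MY_STAT and then inspect st_mode without looking at the return value. As a hedged illustration only, and not the code in this import, the same test can be written so that a failed stat() is reported separately. The helper name is_directory is hypothetical, and a POSIX system is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/* Returns 1 if path names a directory, 0 if it names something else,
   and -1 if stat() itself failed (for example, the path is missing). */
static int is_directory(const char *path)
{
    struct stat sb;
    if (stat(path, &sb) != 0) return -1;
    return S_ISDIR(sb.st_mode) ? 1 : 0;
}
```

A caller would treat 1 as "skip with a message" and could fall through on -1 to whatever open-failure handling follows, so st_mode is never read from an unfilled buffer.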
switch ( srcMode ) {
@ -1605,6 +1659,7 @@ void testf ( Char *name )
}
/*--- Now the input handle is sane. Do the Biz. ---*/
outputHandleJustInCase = NULL;
allOK = testStream ( inStr );
if (allOK && verbosity >= 1) fprintf ( stderr, "ok\n" );
@ -1621,7 +1676,7 @@ void license ( void )
"bzip2, a block-sorting file compressor. "
"Version %s.\n"
" \n"
" Copyright (C) 1996-2000 by Julian Seward.\n"
" Copyright (C) 1996-2002 by Julian Seward.\n"
" \n"
" This program is free software; you can redistribute it and/or modify\n"
" it under the terms set out in the LICENSE file, which is included\n"
@ -1660,6 +1715,8 @@ void usage ( Char *fullProgName )
" -V --version display software version & license\n"
" -s --small use less memory (at most 2500k)\n"
" -1 .. -9 set block size to 100k .. 900k\n"
" --fast alias for -1\n"
" --best alias for -9\n"
"\n"
" If invoked as `bzip2', default action is to compress.\n"
" as `bunzip2', default action is to decompress.\n"
@ -1668,9 +1725,9 @@ void usage ( Char *fullProgName )
" If no file names are given, bzip2 compresses or decompresses\n"
" from standard input to standard output. You can combine\n"
" short flags, so `-v -4' means the same as -v4 or -4v, &c.\n"
#if BZ_UNIX
# if BZ_UNIX
"\n"
#endif
# endif
,
BZ2_bzlibVersion(),
@ -1820,11 +1877,11 @@ IntNative main ( IntNative argc, Char *argv[] )
/*-- Set up signal handlers for mem access errors --*/
signal (SIGSEGV, mySIGSEGVorSIGBUScatcher);
#if BZ_UNIX
#ifndef __DJGPP__
# if BZ_UNIX
# ifndef __DJGPP__
signal (SIGBUS, mySIGSEGVorSIGBUScatcher);
#endif
#endif
# endif
# endif
copyFileName ( inName, "(none)" );
copyFileName ( outName, "(none)" );
@ -1935,6 +1992,8 @@ IntNative main ( IntNative argc, Char *argv[] )
if (ISFLAG("--exponential")) workFactor = 1; else
if (ISFLAG("--repetitive-best")) redundant(aa->name); else
if (ISFLAG("--repetitive-fast")) redundant(aa->name); else
if (ISFLAG("--fast")) blockSize100k = 1; else
if (ISFLAG("--best")) blockSize100k = 9; else
if (ISFLAG("--verbose")) verbosity++; else
if (ISFLAG("--help")) { usage ( progName ); exit ( 0 ); }
else

37
dist/bzip2/bzlib.c vendored

@ -1,4 +1,4 @@
/* $NetBSD: bzlib.c,v 1.1.1.1 2001/06/03 13:03:04 simonb Exp $ */
/* $NetBSD: bzlib.c,v 1.1.1.2 2002/03/15 01:35:26 mjl Exp $ */
/*-------------------------------------------------------------*/
@ -10,7 +10,7 @@
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
@ -95,10 +95,39 @@ void BZ2_bz__AssertH__fail ( int errcode )
"component, you should also report this bug to the author(s)\n"
"of that program. Please make an effort to report this bug;\n"
"timely and accurate bug reports eventually lead to higher\n"
"quality software. Thanks. Julian Seward, 21 March 2000.\n\n",
"quality software. Thanks. Julian Seward, 30 December 2001.\n\n",
errcode,
BZ2_bzlibVersion()
);
if (errcode == 1007) {
fprintf(stderr,
"\n*** A special note about internal error number 1007 ***\n"
"\n"
"Experience suggests that a common cause of internal error 1007\n"
"is unreliable memory or other hardware. The 1007 assertion\n"
"just happens to cross-check the results of huge numbers of\n"
"memory reads/writes, and so acts (unintendedly) as a stress\n"
"test of your memory system.\n"
"\n"
"I suggest the following: try compressing the file again,\n"
"possibly monitoring progress in detail with the -vv flag.\n"
"\n"
"* If the error cannot be reproduced, and/or happens at different\n"
" points in compression, you may have a flaky memory system.\n"
" Try a memory-test program. I have used Memtest86\n"
" (www.memtest86.com). At the time of writing it is free (GPLd).\n"
" Memtest86 tests memory much more thoroughly than your BIOS's\n"
" power-on test, and may find failures that the BIOS doesn't.\n"
"\n"
"* If the error can be repeatably reproduced, this is a bug in\n"
" bzip2, and I would very much like to hear about it. Please\n"
" let me know, and, ideally, save a copy of the file causing the\n"
" problem -- without which I will be unable to investigate it.\n"
"\n"
);
}
exit(3);
}
#endif
@ -1404,7 +1433,7 @@ BZFILE * bzopen_or_bzdopen
smallMode = 1; break;
default:
if (isdigit((int)(*mode))) {
blockSize100k = *mode-'0';
blockSize100k = *mode-BZ_HDR_0;
}
}
mode++;

8
dist/bzip2/bzlib.h vendored

@ -1,4 +1,4 @@
/* $NetBSD: bzlib.h,v 1.1.1.1 2001/06/03 13:03:04 simonb Exp $ */
/* $NetBSD: bzlib.h,v 1.1.1.2 2002/03/15 01:35:26 mjl Exp $ */
/*-------------------------------------------------------------*/
@ -10,7 +10,7 @@
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
@ -112,8 +112,10 @@ typedef
#define BZ_EXPORT
#endif
/* Need a definition for FILE */
#include <stdio.h>
#ifdef _WIN32
# include <stdio.h>
# include <windows.h>
# ifdef small
/* windows.h defines small to char */

View File

@ -1,4 +1,4 @@
/* $NetBSD: bzlib_private.h,v 1.1.1.1 2001/06/03 13:03:04 simonb Exp $ */
/* $NetBSD: bzlib_private.h,v 1.1.1.2 2002/03/15 01:35:26 mjl Exp $ */
/*-------------------------------------------------------------*/
@ -10,7 +10,7 @@
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
@ -78,7 +78,7 @@
/*-- General stuff. --*/
#define BZ_VERSION "1.0.1, 23-June-2000"
#define BZ_VERSION "1.0.2, 30-Dec-2001"
typedef char Char;
typedef unsigned char Bool;
@ -139,6 +139,13 @@ extern void bz_internal_error ( int errcode );
#define BZFREE(ppp) (strm->bzfree)(strm->opaque,(ppp))
/*-- Header bytes. --*/
#define BZ_HDR_B 0x42 /* 'B' */
#define BZ_HDR_Z 0x5a /* 'Z' */
#define BZ_HDR_h 0x68 /* 'h' */
#define BZ_HDR_0 0x30 /* '0' */
/*-- Constants for the back end. --*/
#define BZ_MAX_ALPHA_SIZE 258
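The BZ_HDR_* constants give the four stream header bytes as explicit numeric values instead of character literals. Presumably this keeps the on-disk format fixed even where the compiler's character set is not ASCII, although the diff itself does not state the reason. A minimal sketch of header validation in that style follows. The HDR_* macros copy the values from the diff, and parse_stream_header is a hypothetical helper, not a libbzip2 function.

```c
#include <stddef.h>

/* Byte values copied from the BZ_HDR_* definitions in the diff. */
#define HDR_B 0x42   /* 'B' */
#define HDR_Z 0x5a   /* 'Z' */
#define HDR_h 0x68   /* 'h' */
#define HDR_0 0x30   /* '0' */

/* Return the block size (1..9, in units of 100k) encoded in a
   4-byte stream header, or -1 if the header is not valid. */
static int parse_stream_header(const unsigned char *p, size_t len)
{
    if (len < 4) return -1;
    if (p[0] != HDR_B || p[1] != HDR_Z || p[2] != HDR_h) return -1;
    if (p[3] < HDR_0 + 1 || p[3] > HDR_0 + 9) return -1;
    return p[3] - HDR_0;
}
```

The range check mirrors the one in the decompress.c hunk: the fourth byte must lie between HDR_0 + 1 and HDR_0 + 9 before HDR_0 is subtracted.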

61
dist/bzip2/bzmore vendored Normal file
View File

@ -0,0 +1,61 @@
#!/bin/sh
# Bzmore wrapped for bzip2,
# adapted from zmore by Philippe Troin <phil@fifi.org> for Debian GNU/Linux.
PATH="/usr/bin:$PATH"; export PATH
prog=`echo $0 | sed 's|.*/||'`
case "$prog" in
*less) more=less ;;
*) more=more ;;
esac
if test "`echo -n a`" = "-n a"; then
# looks like a SysV system:
n1=''; n2='\c'
else
n1='-n'; n2=''
fi
oldtty=`stty -g 2>/dev/null`
if stty -cbreak 2>/dev/null; then
cb='cbreak'; ncb='-cbreak'
else
# 'stty min 1' resets eof to ^a on both SunOS and SysV!
cb='min 1 -icanon'; ncb='icanon eof ^d'
fi
if test $? -eq 0 -a -n "$oldtty"; then
trap 'stty $oldtty 2>/dev/null; exit' 0 2 3 5 10 13 15
else
trap 'stty $ncb echo 2>/dev/null; exit' 0 2 3 5 10 13 15
fi
if test $# = 0; then
if test -t 0; then
echo usage: $prog files...
else
bzip2 -cdfq | eval $more
fi
else
FIRST=1
for FILE
do
if test $FIRST -eq 0; then
echo $n1 "--More--(Next file: $FILE)$n2"
stty $cb -echo 2>/dev/null
ANS=`dd bs=1 count=1 2>/dev/null`
stty $ncb echo 2>/dev/null
echo " "
if test "$ANS" = 'e' -o "$ANS" = 'q'; then
exit
fi
fi
if test "$ANS" != 's'; then
echo "------> $FILE <------"
bzip2 -cdfq "$FILE" | eval $more
fi
if test -t; then
FIRST=0
fi
done
fi

154
dist/bzip2/bzmore.1 vendored Normal file
View File

@ -0,0 +1,154 @@
.\" $NetBSD: bzmore.1,v 1.1.1.1 2002/03/15 01:35:28 mjl Exp $
.\"
.\"Shamelessly copied from zmore.1 by Philippe Troin <phil@fifi.org>
.\"for Debian GNU/Linux
.TH BZMORE 1
.SH NAME
bzmore, bzless \- file perusal filter for crt viewing of bzip2 compressed text
.SH SYNOPSIS
.B bzmore
[ name ... ]
.br
.B bzless
[ name ... ]
.SH NOTE
In the following description,
.I bzless
and
.I less
can be used interchangeably with
.I bzmore
and
.I more.
.SH DESCRIPTION
.I Bzmore
is a filter which allows examination of compressed or plain text files
one screenful at a time on a soft-copy terminal.
.I bzmore
works on files compressed with
.I bzip2
and also on uncompressed files.
If a file does not exist,
.I bzmore
looks for a file of the same name with the addition of a .bz2 suffix.
.PP
.I Bzmore
normally pauses after each screenful, printing --More--
at the bottom of the screen.
If the user then types a carriage return, one more line is displayed.
If the user hits a space,
another screenful is displayed. Other possibilities are enumerated later.
.PP
.I Bzmore
looks in the file
.I /etc/termcap
to determine terminal characteristics,
and to determine the default window size.
On a terminal capable of displaying 24 lines,
the default window size is 22 lines.
Other sequences which may be typed when
.I bzmore
pauses, and their effects, are as follows (\fIi\fP is an optional integer
argument, defaulting to 1) :
.PP
.IP \fIi\|\fP<space>
display
.I i
more lines, (or another screenful if no argument is given)
.PP
.IP ^D
display 11 more lines (a ``scroll'').
If
.I i
is given, then the scroll size is set to \fIi\|\fP.
.PP
.IP d
same as ^D (control-D)
.PP
.IP \fIi\|\fPz
same as typing a space except that \fIi\|\fP, if present, becomes the new
window size. Note that the window size reverts back to the default at the
end of the current file.
.PP
.IP \fIi\|\fPs
skip \fIi\|\fP lines and print a screenful of lines
.PP
.IP \fIi\|\fPf
skip \fIi\fP screenfuls and print a screenful of lines
.PP
.IP "q or Q"
quit reading the current file; go on to the next (if any)
.PP
.IP "e or q"
When the prompt --More--(Next file:
.IR file )
is printed, this command causes bzmore to exit.
.PP
.IP s
When the prompt --More--(Next file:
.IR file )
is printed, this command causes bzmore to skip the next file and continue.
.PP
.IP =
Display the current line number.
.PP
.IP \fIi\|\fP/expr
search for the \fIi\|\fP-th occurrence of the regular expression \fIexpr.\fP
If the pattern is not found,
.I bzmore
goes on to the next file (if any).
Otherwise, a screenful is displayed, starting two lines before the place
where the expression was found.
The user's erase and kill characters may be used to edit the regular
expression.
Erasing back past the first column cancels the search command.
.PP
.IP \fIi\|\fPn
search for the \fIi\|\fP-th occurrence of the last regular expression entered.
.PP
.IP !command
invoke a shell with \fIcommand\|\fP.
Each `!' character in "command" is replaced with the
previous shell command. The sequence "\\!" is replaced by "!".
.PP
.IP ":q or :Q"
quit reading the current file; go on to the next (if any)
(same as q or Q).
.PP
.IP .
(dot) repeat the previous command.
.PP
The commands take effect immediately, i.e., it is not necessary to
type a carriage return.
Up to the time when the command character itself is given,
the user may hit the line kill character to cancel the numerical
argument being formed.
In addition, the user may hit the erase character to redisplay the
--More-- message.
.PP
At any time when output is being sent to the terminal, the user can
hit the quit key (normally control\-\\).
.I Bzmore
will stop sending output, and will display the usual --More--
prompt.
The user may then enter one of the above commands in the normal manner.
Unfortunately, some output is lost when this is done, due to the
fact that any characters waiting in the terminal's output queue
are flushed when the quit signal occurs.
.PP
The terminal is set to
.I noecho
mode by this program so that the output can be continuous.
What you type will thus not show on your terminal, except for the / and !
commands.
.PP
If the standard output is not a teletype, then
.I bzmore
acts just like
.I bzcat,
except that a header is printed before each file.
.SH FILES
.DT
/etc/termcap Terminal data base
.SH "SEE ALSO"
more(1), less(1), bzip2(1), bzdiff(1), bzgrep(1)

12
dist/bzip2/compress.c vendored
View File

@ -1,4 +1,4 @@
/* $NetBSD: compress.c,v 1.1.1.1 2001/06/03 13:03:05 simonb Exp $ */
/* $NetBSD: compress.c,v 1.1.1.2 2002/03/15 01:35:27 mjl Exp $ */
/*-------------------------------------------------------------*/
@ -10,7 +10,7 @@
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
@ -665,10 +665,10 @@ void BZ2_compressBlock ( EState* s, Bool is_last_block )
/*-- If this is the first block, create the stream header. --*/
if (s->blockNo == 1) {
BZ2_bsInitWrite ( s );
bsPutUChar ( s, 'B' );
bsPutUChar ( s, 'Z' );
bsPutUChar ( s, 'h' );
bsPutUChar ( s, (UChar)('0' + s->blockSize100k) );
bsPutUChar ( s, BZ_HDR_B );
bsPutUChar ( s, BZ_HDR_Z );
bsPutUChar ( s, BZ_HDR_h );
bsPutUChar ( s, (UChar)(BZ_HDR_0 + s->blockSize100k) );
}
if (s->nblock > 0) {

View File

@ -1,4 +1,4 @@
/* $NetBSD: crctable.c,v 1.1.1.1 2001/06/03 13:03:05 simonb Exp $ */
/* $NetBSD: crctable.c,v 1.1.1.2 2002/03/15 01:35:28 mjl Exp $ */
/*-------------------------------------------------------------*/
@ -10,7 +10,7 @@
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions

View File

@ -1,4 +1,4 @@
/* $NetBSD: decompress.c,v 1.1.1.1 2001/06/03 13:03:06 simonb Exp $ */
/* $NetBSD: decompress.c,v 1.1.1.2 2002/03/15 01:35:28 mjl Exp $ */
/*-------------------------------------------------------------*/
@ -10,7 +10,7 @@
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
@ -237,18 +237,18 @@ Int32 BZ2_decompress ( DState* s )
switch (s->state) {
GET_UCHAR(BZ_X_MAGIC_1, uc);
if (uc != 'B') RETURN(BZ_DATA_ERROR_MAGIC);
if (uc != BZ_HDR_B) RETURN(BZ_DATA_ERROR_MAGIC);
GET_UCHAR(BZ_X_MAGIC_2, uc);
if (uc != 'Z') RETURN(BZ_DATA_ERROR_MAGIC);
if (uc != BZ_HDR_Z) RETURN(BZ_DATA_ERROR_MAGIC);
GET_UCHAR(BZ_X_MAGIC_3, uc)
if (uc != 'h') RETURN(BZ_DATA_ERROR_MAGIC);
if (uc != BZ_HDR_h) RETURN(BZ_DATA_ERROR_MAGIC);
GET_BITS(BZ_X_MAGIC_4, s->blockSize100k, 8)
if (s->blockSize100k < '1' ||
s->blockSize100k > '9') RETURN(BZ_DATA_ERROR_MAGIC);
s->blockSize100k -= '0';
if (s->blockSize100k < (BZ_HDR_0 + 1) ||
s->blockSize100k > (BZ_HDR_0 + 9)) RETURN(BZ_DATA_ERROR_MAGIC);
s->blockSize100k -= BZ_HDR_0;
if (s->smallDecompress) {
s->ll16 = BZALLOC( s->blockSize100k * 100000 * sizeof(UInt16) );
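
Reviewer's note on the hunk above: the new code compares the first four stream bytes against the `BZ_HDR_*` constants instead of character literals. Below is a minimal sketch of the same check as a standalone function. It assumes the `BZ_HDR_*` macros expand to the ASCII codes of the literals they replace; the helper name `parse_bz_header` is hypothetical and not part of libbzip2.

```c
#include <stddef.h>

/* Assumed values: the ASCII codes of the literals the hunk replaces. */
#define BZ_HDR_B 0x42   /* 'B' */
#define BZ_HDR_Z 0x5a   /* 'Z' */
#define BZ_HDR_h 0x68   /* 'h' */
#define BZ_HDR_0 0x30   /* '0' */

/* Hypothetical helper, not part of libbzip2.
   Returns the block size in units of 100k (1..9) if buf starts with a
   valid stream header, or -1 otherwise. */
static int parse_bz_header(const unsigned char *buf, size_t len)
{
    if (len < 4) return -1;
    if (buf[0] != BZ_HDR_B) return -1;
    if (buf[1] != BZ_HDR_Z) return -1;
    if (buf[2] != BZ_HDR_h) return -1;
    /* Fourth byte is the digit '1'..'9', i.e. BZ_HDR_0 + blockSize100k. */
    if (buf[3] < BZ_HDR_0 + 1 || buf[3] > BZ_HDR_0 + 9) return -1;
    return buf[3] - BZ_HDR_0;
}
```

With these assumed values, a buffer beginning `BZh9` yields 9, which the decompressor then multiplies by 100000 when sizing its working arrays, as in the `BZALLOC` call above.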


@ -1,4 +1,4 @@
/* $NetBSD: dlltest.c,v 1.1.1.1 2001/06/03 13:03:06 simonb Exp $ */
/* $NetBSD: dlltest.c,v 1.1.1.2 2002/03/15 01:35:28 mjl Exp $ */
/*
minibz2
@ -21,7 +21,7 @@
#ifdef _WIN32
#define BZ2_LIBNAME "libbz2-1.0.0.DLL"
#define BZ2_LIBNAME "libbz2-1.0.2.DLL"
#include <windows.h>
static int BZ2DLLLoaded = 0;
@ -132,8 +132,8 @@ int main(int argc,char *argv[])
}else{
fp_w = stdout;
}
if((BZ2fp_r == NULL && (BZ2fp_r = BZ2_bzdopen(fileno(stdin),"rb"))==NULL)
|| (BZ2fp_r != NULL && (BZ2fp_r = BZ2_bzopen(fn_r,"rb"))==NULL)){
if((fn_r == NULL && (BZ2fp_r = BZ2_bzdopen(fileno(stdin),"rb"))==NULL)
|| (fn_r != NULL && (BZ2fp_r = BZ2_bzopen(fn_r,"rb"))==NULL)){
printf("can't bz2openstream\n");
exit(1);
}
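
Reviewer's note on the hunk above: the old condition tested `BZ2fp_r`, the handle that the same expression is about to assign, while the new one tests `fn_r`, the input file name. The intended selection can be sketched with plain stdio as follows. The function `open_input` is a hypothetical stand-in for the `BZ2_bzdopen`/`BZ2_bzopen` pair, not part of the test program.

```c
#include <stdio.h>

/* Hypothetical stand-in for the BZ2_bzdopen/BZ2_bzopen pair.
   The choice of source depends only on whether a file name was given:
   no name means standard input, otherwise the named file is opened.
   Returns NULL if the named file cannot be opened. */
static FILE *open_input(const char *fn_r)
{
    if (fn_r == NULL)
        return stdin;
    return fopen(fn_r, "rb");
}
```

Keying the branch on the file name keeps the decision independent of whatever value the handle variable held before the call.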


@ -1,4 +1,4 @@
/* $NetBSD: huffman.c,v 1.1.1.1 2001/06/03 13:03:06 simonb Exp $ */
/* $NetBSD: huffman.c,v 1.1.1.2 2002/03/15 01:35:28 mjl Exp $ */
/*-------------------------------------------------------------*/
@ -10,7 +10,7 @@
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions


@ -4,7 +4,7 @@
# Fixed up by JRS for bzip2-0.9.5d release.
CC=cl
CFLAGS= -DWIN32 -MD -Ox -D_FILE_OFFSET_BITS=64
CFLAGS= -DWIN32 -MD -Ox -D_FILE_OFFSET_BITS=64 -nologo
OBJS= blocksort.obj \
huffman.obj \

117
dist/bzip2/manual.html vendored Normal file

@ -0,0 +1,117 @@
<HTML>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!-- Created on January, 5 2002 by texi2html 1.64 -->
<!--
Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
Karl Berry <karl@freefriends.org>
Olaf Bachmann <obachman@mathematik.uni-kl.de>
and many others.
Maintained by: Olaf Bachmann <obachman@mathematik.uni-kl.de>
Send bugs and suggestions to <texi2html@mathematik.uni-kl.de>
-->
<HEAD>
<TITLE>Untitled Document: Untitled Document</TITLE>
<META NAME="description" CONTENT="Untitled Document: Untitled Document">
<META NAME="keywords" CONTENT="Untitled Document: Untitled Document">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META NAME="Generator" CONTENT="texi2html 1.64">
</HEAD>
<BODY LANG="" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
<A NAME="SEC_Top"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H1>Untitled Document</H1></P><P>
The following text is the License for this software. You should
find it identical to that contained in the file LICENSE in the
source distribution.
</P><P>
@bf{------------------ START OF THE LICENSE ------------------}
</P><P>
This program, <CODE>bzip2</CODE>,
and associated library <CODE>libbzip2</CODE>, are
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
</P><P>
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
<UL>
<LI>
Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
<LI>
The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product
documentation would be appreciated but is not required.
<LI>
Altered source versions must be plainly marked as such, and must
not be misrepresented as being the original software.
<LI>
The name of the author may not be used to endorse or promote
products derived from this software without specific prior written
permission.
</UL>
THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
<P>
Julian Seward, Cambridge, UK.
</P><P>
<CODE>jseward@acm.org</CODE>
</P><P>
<CODE>bzip2</CODE>/<CODE>libbzip2</CODE> version 1.0.2 of 30 December 2001.
</P><P>
@bf{------------------ END OF THE LICENSE ------------------}
</P><P>
Web sites:
</P><P>
<CODE>http://sources.redhat.com/bzip2</CODE>
</P><P>
<CODE>http://www.cacheprof.org</CODE>
</P><P>
PATENTS: To the best of my knowledge, <CODE>bzip2</CODE> does not use any patented
algorithms. However, I do not have the resources available to carry out
a full patent search. Therefore I cannot give any guarantee of the
above statement.
</P><P>
<HR SIZE=1>
<BR>
<FONT SIZE="-1">
This document was generated
by <I>Julian Seward</I> on <I>January, 5 2002</I>
using <A HREF="http://www.mathematik.uni-kl.de/~obachman/Texi2html
"><I>texi2html</I></A>
</BODY>
</HTML>

BIN
dist/bzip2/manual.pdf vendored Normal file

Binary file not shown.

6795
dist/bzip2/manual.ps vendored

File diff suppressed because it is too large

118
dist/bzip2/manual.texi vendored

@ -1,12 +1,12 @@
\input texinfo @c -*- Texinfo -*-
@c $NetBSD: manual.texi,v 1.1.1.1 2001/06/03 13:03:16 simonb Exp $
@c $NetBSD: manual.texi,v 1.1.1.2 2002/03/15 01:36:18 mjl Exp $
@setfilename bzip2.info
@ignore
This file documents bzip2 version 1.0, and associated library
This file documents bzip2 version 1.0.2, and associated library
libbzip2, written by Julian Seward (jseward@acm.org).
Copyright (C) 1996-2000 Julian R Seward
Copyright (C) 1996-2002 Julian R Seward
Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
@ -31,8 +31,8 @@ END-INFO-DIR-ENTRY
@titlepage
@title bzip2 and libbzip2
@subtitle a program and library for data compression
@subtitle copyright (C) 1996-2000 Julian Seward
@subtitle version 1.0 of 21 March 2000
@subtitle copyright (C) 1996-2002 Julian Seward
@subtitle version 1.0.2 of 30 December 2001
@author Julian Seward
@end titlepage
@ -41,11 +41,17 @@ END-INFO-DIR-ENTRY
@parskip 2mm
@end iftex
@node Top, Overview, (dir), (dir)
@node Top,,, (dir)
The following text is the License for this software. You should
find it identical to that contained in the file LICENSE in the
source distribution.
@bf{------------------ START OF THE LICENSE ------------------}
This program, @code{bzip2},
and associated library @code{libbzip2}, are
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
@ -83,14 +89,16 @@ Julian Seward, Cambridge, UK.
@code{jseward@@acm.org}
@code{http://sourceware.cygnus.com/bzip2}
@code{bzip2}/@code{libbzip2} version 1.0.2 of 30 December 2001.
@bf{------------------ END OF THE LICENSE ------------------}
Web sites:
@code{http://sources.redhat.com/bzip2}
@code{http://www.cacheprof.org}
@code{http://www.muraroa.demon.co.uk}
@code{bzip2}/@code{libbzip2} version 1.0 of 21 March 2000.
PATENTS: To the best of my knowledge, @code{bzip2} does not use any patented
algorithms. However, I do not have the resources available to carry out
a full patent search. Therefore I cannot give any guarantee of the
@ -102,7 +110,6 @@ above statement.
@node Overview, Implementation, Top, Top
@chapter Introduction
@code{bzip2} compresses files using the Burrows-Wheeler
@ -135,7 +142,7 @@ and nothing else.
@unnumberedsubsubsec NAME
@itemize
@item @code{bzip2}, @code{bunzip2}
- a block-sorting file compressor, v1.0
- a block-sorting file compressor, v1.0.2
@item @code{bzcat}
- decompresses files to stdout
@item @code{bzip2recover}
@ -265,6 +272,11 @@ This really performs a trial decompression and throws away the result.
Force overwrite of output files. Normally, @code{bzip2} will not overwrite
existing output files. Also forces @code{bzip2} to break hard links
to files, which it otherwise wouldn't do.
@code{bzip2} normally declines to decompress files which don't have the
correct magic header bytes. If forced (@code{-f}), however, it will
pass such files through unmodified. This is how GNU @code{gzip}
behaves.
@item -k --keep
Keep (don't delete) input files during compression
or decompression.
@ -287,9 +299,13 @@ Further @code{-v}'s increase the verbosity level, spewing out lots of
information which is primarily of interest for diagnostic purposes.
@item -L --license -V --version
Display the software version, license terms and conditions.
@item -1 to -9
@item -1 (or --fast) to -9 (or --best)
Set the block size to 100 k, 200 k .. 900 k when compressing. Has no
effect when decompressing. See MEMORY MANAGEMENT below.
The @code{--fast} and @code{--best} aliases are primarily for GNU
@code{gzip} compatibility. In particular, @code{--fast} doesn't make
things significantly faster. And @code{--best} merely selects the
default behaviour.
@item --
Treats all subsequent arguments as file names, even if they start
with a dash. This is so you can handle files with names beginning
@ -390,21 +406,19 @@ integrity of the resulting files, and decompress those which are
undamaged.
@code{bzip2recover}
takes a single argument, the name of the damaged file,
and writes a number of files @code{rec0001file.bz2},
@code{rec0002file.bz2}, etc, containing the extracted blocks.
The output filenames are designed so that the use of
wildcards in subsequent processing -- for example,
@code{bzip2 -dc rec*file.bz2 > recovered_data} -- lists the files in
the correct order.
takes a single argument, the name of the damaged file, and writes a
number of files @code{rec00001file.bz2}, @code{rec00002file.bz2}, etc,
containing the extracted blocks. The output filenames are designed so
that the use of wildcards in subsequent processing -- for example,
@code{bzip2 -dc rec*file.bz2 > recovered_data} -- processes the files in
the correct order.
@code{bzip2recover} should be of most use dealing with large @code{.bz2}
files, as these will contain many blocks. It is clearly
futile to use it on damaged single-block files, since a
damaged block cannot be recovered. If you wish to minimise
any potential data loss through media or transmission errors,
you might consider compressing with a smaller
block size.
files, as these will contain many blocks. It is clearly futile to use
it on damaged single-block files, since a damaged block cannot be
recovered. If you wish to minimise any potential data loss through
media or transmission errors, you might consider compressing with a
smaller block size.
@unnumberedsubsubsec PERFORMANCE NOTES
@ -436,22 +450,31 @@ I/O error messages are not as helpful as they could be. @code{bzip2}
tries hard to detect I/O errors and exit cleanly, but the details of
what the problem is sometimes seem rather misleading.
This manual page pertains to version 1.0 of @code{bzip2}. Compressed
This manual page pertains to version 1.0.2 of @code{bzip2}. Compressed
data created by this version is entirely forwards and backwards
compatible with the previous public releases, versions 0.1pl2, 0.9.0 and
0.9.5, but with the following exception: 0.9.0 and above can correctly
decompress multiple concatenated compressed files. 0.1pl2 cannot do
this; it will stop after decompressing just the first file in the
stream.
compatible with the previous public releases, versions 0.1pl2, 0.9.0,
0.9.5, 1.0.0 and 1.0.1, but with the following exception: 0.9.0 and
above can correctly decompress multiple concatenated compressed files.
0.1pl2 cannot do this; it will stop after decompressing just the first
file in the stream.
@code{bzip2recover} versions prior to this one, 1.0.2, used 32-bit
integers to represent bit positions in compressed files, so it could not
handle compressed files more than 512 megabytes long. Version 1.0.2 and
above uses 64-bit ints on some platforms which support them (GNU
supported targets, and Windows). To establish whether or not
@code{bzip2recover} was built with such a limitation, run it without
arguments. In any event you can build yourself an unlimited version if
you can recompile it with @code{MaybeUInt64} set to be an unsigned
64-bit integer.
@code{bzip2recover} uses 32-bit integers to represent bit positions in
compressed files, so it cannot handle compressed files more than 512
megabytes long. This could easily be fixed.
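
Reviewer's note on the 512 megabyte figure in both versions of this paragraph: a 32-bit counter can address 2^32 bit positions, and at 8 bits per byte that covers 2^29 bytes, which is 512 MB. The arithmetic as a small sketch (the function name is hypothetical and does not appear in `bzip2recover`):

```c
#include <stdint.h>

/* Hypothetical helper: number of bytes whose bit positions can all be
   represented by an unsigned counter of the given width. */
static uint64_t max_bytes_for_bit_counter(unsigned width_bits)
{
    if (width_bits >= 64)
        return (uint64_t)1 << 61;               /* 2^64 bits / 8 */
    return ((uint64_t)1 << width_bits) / 8;     /* 2^width bits / 8 */
}
```

For a width of 32 this gives 536870912 bytes, i.e. 512 * 1024 * 1024. For a width of 64 it gives 2^61 bytes, which is why the manual describes a 64-bit build as effectively unlimited.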
@unnumberedsubsubsec AUTHOR
Julian Seward, @code{jseward@@acm.org}.
@code{http://sources.redhat.com/bzip2}
The ideas embodied in @code{bzip2} are due to (at least) the following
people: Michael Burrows and David Wheeler (for the block sorting
transformation), David Wheeler (again, for the Huffman coder), Peter
@ -462,8 +485,9 @@ indebted for their help, support and advice. See the manual in the
source distribution for pointers to sources of documentation. Christian
von Roques encouraged me to look for faster sorting algorithms, so as to
speed up compression. Bela Lubkin encouraged me to improve the
worst-case compression performance. Many people sent patches, helped
with portability problems, lent machines, gave advice and were generally
worst-case compression performance. The @code{bz*} scripts are derived
from those of GNU @code{gzip}. Many people sent patches, helped with
portability problems, lent machines, gave advice and were generally
helpful.
@end quotation
@ -1770,16 +1794,20 @@ was compiled with @code{BZ_NO_STDIO} set.
For a normal compile, an assertion failure yields the message
@example
bzip2/libbzip2: internal error number N.
This is a bug in bzip2/libbzip2, 1.0 of 21-Mar-2000.
This is a bug in bzip2/libbzip2, 1.0.2, 30-Dec-2001.
Please report it to me at: jseward@@acm.org. If this happened
when you were using some program which uses libbzip2 as a
component, you should also report this bug to the author(s)
of that program. Please make an effort to report this bug;
timely and accurate bug reports eventually lead to higher
quality software. Thanks. Julian Seward, 21 March 2000.
quality software. Thanks. Julian Seward, 30 December 2001.
@end example
where @code{N} is some error code number. @code{exit(3)}
is then called.
where @code{N} is some error code number. If @code{N == 1007}, it also
prints some extra text advising the reader that unreliable memory is
often associated with internal error 1007. (This is a
frequently-observed-phenomenon with versions 1.0.0/1.0.1).
@code{exit(3)} is then called.
For a @code{stdio}-free library, assertion failures result
in a call to a function declared as:
@ -2057,10 +2085,10 @@ Maybe this isn't what you want.
If you want a compressor and/or library which is faster, uses less
memory but gets pretty good compression, and has minimal latency,
consider Jean-loup
Gailly's and Mark Adler's work, @code{zlib-1.1.2} and
Gailly's and Mark Adler's work, @code{zlib-1.1.3} and
@code{gzip-1.2.4}. Look for them at
@code{http://www.cdrom.com/pub/infozip/zlib} and
@code{http://www.zlib.org} and
@code{http://www.gzip.org} respectively.
For something faster and lighter still, you might try Markus F X J


@ -1,47 +1,81 @@
<HTML>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!-- Created on January, 5 2002 by texi2html 1.64 -->
<!--
Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
Karl Berry <karl@freefriends.org>
Olaf Bachmann <obachman@mathematik.uni-kl.de>
and many others.
Maintained by: Olaf Bachmann <obachman@mathematik.uni-kl.de>
Send bugs and suggestions to <texi2html@mathematik.uni-kl.de>
-->
<HEAD>
<!-- This HTML file has been created by texi2html 1.54
from manual.texi on 23 March 2000 -->
<TITLE>Untitled Document: 1. Introduction</TITLE>
<TITLE>bzip2 and libbzip2 - Introduction</TITLE>
<link href="manual_2.html" rel=Next>
<link href="manual_toc.html" rel=ToC>
<META NAME="description" CONTENT="Untitled Document: 1. Introduction">
<META NAME="keywords" CONTENT="Untitled Document: 1. Introduction">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META NAME="Generator" CONTENT="texi2html 1.64">
</HEAD>
<BODY>
<p>Go to the first, previous, <A HREF="manual_2.html">next</A>, <A HREF="manual_4.html">last</A> section, <A HREF="manual_toc.html">table of contents</A>.
<P><HR><P>
<BODY LANG="" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
<H1><A NAME="SEC1" HREF="manual_toc.html#TOC1">Introduction</A></H1>
<A NAME="SEC1"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC2"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H1> 1. Introduction </H1>
<!--docid::SEC1::-->
<P>
<CODE>bzip2</CODE> compresses files using the Burrows-Wheeler
block-sorting text compression algorithm, and Huffman coding.
Compression is generally considerably better than that
achieved by more conventional LZ77/LZ78-based compressors,
and approaches the performance of the PPM family of statistical compressors.
</P><P>
</P>
<P>
<CODE>bzip2</CODE> is built on top of <CODE>libbzip2</CODE>, a flexible library
for handling compressed data in the <CODE>bzip2</CODE> format. This manual
describes both how to use the program and
how to work with the library interface. Most of the
manual is devoted to this library, not the program,
which is good news if your interest is only in the program.
</P><P>
</P>
<P>
Chapter 2 describes how to use <CODE>bzip2</CODE>; this is the only part
you need to read if you just want to know how to operate the program.
Chapter 3 describes the programming interfaces in detail, and
Chapter 4 records some miscellaneous notes which I thought
ought to be recorded somewhere.
</P><P>
</P>
<HR SIZE="6">
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<BR>
<FONT SIZE="-1">
This document was generated
by <I>Julian Seward</I> on <I>January, 5 2002</I>
using <A HREF="http://www.mathematik.uni-kl.de/~obachman/Texi2html
"><I>texi2html</I></A>
<P><HR><P>
<p>Go to the first, previous, <A HREF="manual_2.html">next</A>, <A HREF="manual_4.html">last</A> section, <A HREF="manual_toc.html">table of contents</A>.
</BODY>
</HTML>


@ -1,78 +1,127 @@
<HTML>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!-- Created on January, 5 2002 by texi2html 1.64 -->
<!--
Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
Karl Berry <karl@freefriends.org>
Olaf Bachmann <obachman@mathematik.uni-kl.de>
and many others.
Maintained by: Olaf Bachmann <obachman@mathematik.uni-kl.de>
Send bugs and suggestions to <texi2html@mathematik.uni-kl.de>
-->
<HEAD>
<!-- This HTML file has been created by texi2html 1.54
from manual.texi on 23 March 2000 -->
<TITLE>Untitled Document: 2. How to use <CODE>bzip2</CODE></TITLE>
<TITLE>bzip2 and libbzip2 - How to use bzip2</TITLE>
<link href="manual_3.html" rel=Next>
<link href="manual_1.html" rel=Previous>
<link href="manual_toc.html" rel=ToC>
<META NAME="description" CONTENT="Untitled Document: 2. How to use <CODE>bzip2</CODE>">
<META NAME="keywords" CONTENT="Untitled Document: 2. How to use <CODE>bzip2</CODE>">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META NAME="Generator" CONTENT="texi2html 1.64">
</HEAD>
<BODY>
<p>Go to the <A HREF="manual_1.html">first</A>, <A HREF="manual_1.html">previous</A>, <A HREF="manual_3.html">next</A>, <A HREF="manual_4.html">last</A> section, <A HREF="manual_toc.html">table of contents</A>.
<P><HR><P>
<BODY LANG="" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
<H1><A NAME="SEC2" HREF="manual_toc.html#TOC2">How to use <CODE>bzip2</CODE></A></H1>
<A NAME="SEC2"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_1.html#SEC1"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC3"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H1> 2. How to use <CODE>bzip2</CODE> </H1>
<!--docid::SEC2::-->
<P>
This chapter contains a copy of the <CODE>bzip2</CODE> man page,
and nothing else.
</P>
</P><P>
<BLOCKQUOTE>
<P>
<H4><A NAME="SEC3" HREF="manual_toc.html#TOC3">NAME</A></H4>
<HR SIZE="6">
<A NAME="SEC3"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC2"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC4"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H4> NAME </H4>
<!--docid::SEC3::-->
<UL>
<LI><CODE>bzip2</CODE>, <CODE>bunzip2</CODE>
- a block-sorting file compressor, v1.0
- a block-sorting file compressor, v1.0.2
<LI><CODE>bzcat</CODE>
- decompresses files to stdout
<LI><CODE>bzip2recover</CODE>
- recovers data from damaged bzip2 files
</UL>
<P>
<H4><A NAME="SEC4" HREF="manual_toc.html#TOC4">SYNOPSIS</A></H4>
<HR SIZE="6">
<A NAME="SEC4"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC3"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC5"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H4> SYNOPSIS </H4>
<!--docid::SEC4::-->
<UL>
<LI><CODE>bzip2</CODE> [ -cdfkqstvzVL123456789 ] [ filenames ... ]
<LI><CODE>bunzip2</CODE> [ -fkvsVL ] [ filenames ... ]
<LI><CODE>bzcat</CODE> [ -s ] [ filenames ... ]
<LI><CODE>bzip2recover</CODE> filename
</UL>
<H4><A NAME="SEC5" HREF="manual_toc.html#TOC5">DESCRIPTION</A></H4>
<P>
<HR SIZE="6">
<A NAME="SEC5"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC4"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC6"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H4> DESCRIPTION </H4>
<!--docid::SEC5::-->
<P>
<CODE>bzip2</CODE> compresses files using the Burrows-Wheeler block sorting
text compression algorithm, and Huffman coding. Compression is
generally considerably better than that achieved by more conventional
LZ77/LZ78-based compressors, and approaches the performance of the PPM
family of statistical compressors.
</P><P>
</P>
<P>
The command-line options are deliberately very similar to those of GNU
<CODE>gzip</CODE>, but they are not identical.
</P><P>
</P>
<P>
<CODE>bzip2</CODE> expects a list of file names to accompany the command-line
flags. Each file is replaced by a compressed version of itself, with
the name <CODE>original_name.bz2</CODE>. Each compressed file has the same
@ -82,61 +131,47 @@ restored at decompression time. File name handling is naive in the
sense that there is no mechanism for preserving original file names,
permissions, ownerships or dates in filesystems which lack these
concepts, or have serious file name length restrictions, such as MS-DOS.
</P><P>
</P>
<P>
<CODE>bzip2</CODE> and <CODE>bunzip2</CODE> will by default not overwrite existing
files. If you want this to happen, specify the <CODE>-f</CODE> flag.
</P><P>
</P>
<P>
If no file names are specified, <CODE>bzip2</CODE> compresses from standard
input to standard output. In this case, <CODE>bzip2</CODE> will decline to
write compressed output to a terminal, as this would be entirely
incomprehensible and therefore pointless.
</P><P>
</P>
<P>
<CODE>bunzip2</CODE> (or <CODE>bzip2 -d</CODE>) decompresses all
specified files. Files which were not created by <CODE>bzip2</CODE>
will be detected and ignored, and a warning issued.
<CODE>bzip2</CODE> attempts to guess the filename for the decompressed file
from that of the compressed file as follows:
<UL>
<LI><CODE>filename.bz2 </CODE> becomes <CODE>filename</CODE>
<LI><CODE>filename.bz </CODE> becomes <CODE>filename</CODE>
<LI><CODE>filename.tbz2</CODE> becomes <CODE>filename.tar</CODE>
<LI><CODE>filename.tbz </CODE> becomes <CODE>filename.tar</CODE>
<LI><CODE>anyothername </CODE> becomes <CODE>anyothername.out</CODE>
</UL>
<P>
If the file does not end in one of the recognised endings,
<CODE>.bz2</CODE>, <CODE>.bz</CODE>,
<CODE>.tbz2</CODE> or <CODE>.tbz</CODE>, <CODE>bzip2</CODE> complains that it cannot
guess the name of the original file, and uses the original name
with <CODE>.out</CODE> appended.
</P>
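
Reviewer's note on the name-guessing rules listed above: they amount to a suffix table with a `.out` fallback. A minimal sketch follows, assuming exactly the four suffixes the manual names. The helper `guess_output_name` is hypothetical and is not the code `bzip2` itself uses.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not bzip2's own implementation.
   Writes the guessed decompressed name into out. Returns 1 if one of the
   recognised suffixes was found, 0 if ".out" was appended instead.
   Assumption: a name consisting only of a suffix is treated as unrecognised. */
static int guess_output_name(const char *in, char *out, size_t outsz)
{
    static const struct { const char *from; const char *to; } map[] = {
        { ".bz2",  ""     },
        { ".bz",   ""     },
        { ".tbz2", ".tar" },
        { ".tbz",  ".tar" }
    };
    size_t n = strlen(in);
    size_t i;

    for (i = 0; i < sizeof map / sizeof map[0]; i++) {
        size_t k = strlen(map[i].from);
        if (n > k && strcmp(in + n - k, map[i].from) == 0) {
            /* Copy the stem, then the replacement suffix. */
            snprintf(out, outsz, "%.*s%s", (int)(n - k), in, map[i].to);
            return 1;
        }
    }
    snprintf(out, outsz, "%s.out", in);
    return 0;
}
```

Under these assumptions `archive.tbz2` maps to `archive.tar`, and `notes.txt` maps to `notes.txt.out` with a return value of 0, the case in which the manual says `bzip2` complains that it cannot guess the original name.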
<P>
As with compression, supplying no
filenames causes decompression from standard input to standard output.
</P><P>
</P>
<P>
<CODE>bunzip2</CODE> will correctly decompress a file which is the
concatenation of two or more compressed files. The result is the
concatenation of the corresponding uncompressed files. Integrity
testing (<CODE>-t</CODE>) of concatenated compressed files is also supported.
</P><P>
</P>
<P>
You can also compress or decompress files to the standard output by
giving the <CODE>-c</CODE> flag. Multiple files may be compressed and
decompressed like this. The resulting outputs are fed sequentially to
@ -145,30 +180,26 @@ containing multiple compressed file representations. Such a stream
can be decompressed correctly only by <CODE>bzip2</CODE> version 0.9.0 or
later. Earlier versions of <CODE>bzip2</CODE> will stop after decompressing
the first file in the stream.
</P><P>
</P>
<P>
<CODE>bzcat</CODE> (or <CODE>bzip2 -dc</CODE>) decompresses all specified files to
the standard output.
</P><P>
</P>
<P>
<CODE>bzip2</CODE> will read arguments from the environment variables
<CODE>BZIP2</CODE> and <CODE>BZIP</CODE>, in that order, and will process them
before any arguments read from the command line. This gives a
convenient way to supply default arguments.
</P><P>
</P>
<P>
Compression is always performed, even if the compressed file is slightly
larger than the original. Files of less than about one hundred bytes
tend to get larger, since the compression mechanism has a constant
overhead in the region of 50 bytes. Random data (including the output
of most file compressors) is coded at about 8.05 bits per byte, giving
an expansion of around 0.5%.
</P><P>
</P>
<P>
As a self-check for your protection, <CODE>bzip2</CODE> uses 32-bit CRCs to
make sure that the decompressed version of a file is identical to the
original. This guards against corruption of the compressed data, and
@ -179,94 +210,113 @@ the check occurs upon decompression, so it can only tell you that
something is wrong. It can't help you recover the original uncompressed
data. You can use <CODE>bzip2recover</CODE> to try to recover data from
damaged files.
</P><P>
</P>
<P>
Return values: 0 for a normal exit, 1 for environmental problems (file
not found, invalid flags, I/O errors, &#38;c), 2 to indicate a corrupt
compressed file, 3 for an internal consistency error (eg, bug) which
caused <CODE>bzip2</CODE> to panic.
</P><P>
</P>
<H4><A NAME="SEC6" HREF="manual_toc.html#TOC6">OPTIONS</A></H4>
<HR SIZE="6">
<A NAME="SEC6"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC5"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC7"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H4> OPTIONS </H4>
<!--docid::SEC6::-->
<DL COMPACT>
<DT><CODE>-c --stdout</CODE>
<DD>
Compress or decompress to standard output.
<DD>Compress or decompress to standard output.
<DT><CODE>-d --decompress</CODE>
<DD>
Force decompression. <CODE>bzip2</CODE>, <CODE>bunzip2</CODE> and <CODE>bzcat</CODE> are
<DD>Force decompression. <CODE>bzip2</CODE>, <CODE>bunzip2</CODE> and <CODE>bzcat</CODE> are
really the same program, and the decision about what actions to take is
done on the basis of which name is used. This flag overrides that
mechanism, and forces bzip2 to decompress.
<DT><CODE>-z --compress</CODE>
<DD>
The complement to <CODE>-d</CODE>: forces compression, regardless of the
<DD>The complement to <CODE>-d</CODE>: forces compression, regardless of the
invocation name.
<DT><CODE>-t --test</CODE>
<DD>
Check integrity of the specified file(s), but don't decompress them.
<DD>Check integrity of the specified file(s), but don't decompress them.
This really performs a trial decompression and throws away the result.
<DT><CODE>-f --force</CODE>
<DD>
Force overwrite of output files. Normally, <CODE>bzip2</CODE> will not overwrite
<DD>Force overwrite of output files. Normally, <CODE>bzip2</CODE> will not overwrite
existing output files. Also forces <CODE>bzip2</CODE> to break hard links
to files, which it otherwise wouldn't do.
<P>
<CODE>bzip2</CODE> normally declines to decompress files which don't have the
correct magic header bytes. If forced (<CODE>-f</CODE>), however, it will
pass such files through unmodified. This is how GNU <CODE>gzip</CODE>
behaves.
<DT><CODE>-k --keep</CODE>
<DD>
Keep (don't delete) input files during compression
<DD>Keep (don't delete) input files during compression
or decompression.
<DT><CODE>-s --small</CODE>
<DD>
Reduce memory usage, for compression, decompression and testing. Files
<DD>Reduce memory usage, for compression, decompression and testing. Files
are decompressed and tested using a modified algorithm which only
requires 2.5 bytes per block byte. This means any file can be
decompressed in 2300k of memory, albeit at about half the normal speed.
<P>
During compression, <CODE>-s</CODE> selects a block size of 200k, which limits
memory use to around the same figure, at the expense of your compression
ratio. In short, if your machine is low on memory (8 megabytes or
less), use -s for everything. See MEMORY MANAGEMENT below.
<DT><CODE>-q --quiet</CODE>
<DD>
Suppress non-essential warning messages. Messages pertaining to
<DD>Suppress non-essential warning messages. Messages pertaining to
I/O errors and other critical events will not be suppressed.
<DT><CODE>-v --verbose</CODE>
<DD>
Verbose mode -- show the compression ratio for each file processed.
<DD>Verbose mode -- show the compression ratio for each file processed.
Further <CODE>-v</CODE>'s increase the verbosity level, spewing out lots of
information which is primarily of interest for diagnostic purposes.
<DT><CODE>-L --license -V --version</CODE>
<DD>
Display the software version, license terms and conditions.
<DT><CODE>-1 to -9</CODE>
<DD>
Set the block size to 100 k, 200 k .. 900 k when compressing. Has no
<DD>Display the software version, license terms and conditions.
<DT><CODE>-1 (or --fast) to -9 (or --best)</CODE>
<DD>Set the block size to 100 k, 200 k .. 900 k when compressing. Has no
effect when decompressing. See MEMORY MANAGEMENT below.
The <CODE>--fast</CODE> and <CODE>--best</CODE> aliases are primarily for GNU
<CODE>gzip</CODE> compatibility. In particular, <CODE>--fast</CODE> doesn't make
things significantly faster. And <CODE>--best</CODE> merely selects the
default behaviour.
<DT><CODE>--</CODE>
<DD>
Treats all subsequent arguments as file names, even if they start
<DD>Treats all subsequent arguments as file names, even if they start
with a dash. This is so you can handle files with names beginning
with a dash, for example: <CODE>bzip2 -- -myfilename</CODE>.
<DT><CODE>--repetitive-fast</CODE>
<DD>
<DT><CODE>--repetitive-best</CODE>
<DD>
These flags are redundant in versions 0.9.5 and above. They provided
<DD><DT><CODE>--repetitive-best</CODE>
<DD>These flags are redundant in versions 0.9.5 and above. They provided
some coarse control over the behaviour of the sorting algorithm in
earlier versions, which was sometimes useful. 0.9.5 and above have an
improved algorithm which renders these flags irrelevant.
</DL>
<H4><A NAME="SEC7" HREF="manual_toc.html#TOC7">MEMORY MANAGEMENT</A></H4>
<P>
<HR SIZE="6">
<A NAME="SEC7"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC6"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC8"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H4> MEMORY MANAGEMENT </H4>
<!--docid::SEC7::-->
<P>
<CODE>bzip2</CODE> compresses large files in blocks. The block size affects
both the compression ratio achieved, and the amount of memory needed for
compression and decompression. The flags <CODE>-1</CODE> through <CODE>-9</CODE>
@ -277,43 +327,34 @@ compression is read from the header of the compressed file, and
the file. Since block sizes are stored in compressed files, it follows
that the flags <CODE>-1</CODE> to <CODE>-9</CODE> are irrelevant to and so ignored
during decompression.
</P><P>
</P>
<P>
Compression and decompression requirements, in bytes, can be estimated
as:
<PRE>
Compression: 400k + ( 8 x block size )
<TABLE><tr><td>&nbsp;</td><td class=example><pre> Compression: 400k + ( 8 x block size )
Decompression: 100k + ( 4 x block size ), or
100k + ( 2.5 x block size )
</PRE>
<P>
Larger block sizes give rapidly diminishing marginal returns. Most of
</pre></td></tr></table>Larger block sizes give rapidly diminishing marginal returns. Most of
the compression comes from the first two or three hundred k of block
size, a fact worth bearing in mind when using <CODE>bzip2</CODE> on small machines.
It is also important to appreciate that the decompression memory
requirement is set at compression time by the choice of block size.
</P><P>
</P>
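<P>
The estimates above are simple enough to compute directly. The sketch
below assumes that flag <CODE>-N</CODE> selects a block size of N x 100k
and that "k" means 1000 bytes, as in the arithmetic elsewhere in this
section; the function name <CODE>memory_estimate_kbytes</CODE> is an
illustrative choice.
</P>

```python
def memory_estimate_kbytes(level: int) -> dict:
    """Estimate memory use in kbytes for block-size flags -1 to -9."""
    if not 1 <= level <= 9:
        raise ValueError("level must be between 1 and 9")
    block_k = 100 * level  # block size in kbytes
    return {
        "compress": 400 + 8 * block_k,
        "decompress": 100 + 4 * block_k,
        "decompress_small": 100 + 2.5 * block_k,  # with the -s flag
    }


for level in (1, 9):
    print(level, memory_estimate_kbytes(level))
```

<P>
For <CODE>-9</CODE> this gives 7600k, 3700k and 2350k, and for
<CODE>-1</CODE> it gives 1200k, 500k and 350k.
</P>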
<P>
For files compressed with the default 900k block size, <CODE>bunzip2</CODE>
will require about 3700 kbytes to decompress. To support decompression
of any file on a 4 megabyte machine, <CODE>bunzip2</CODE> has an option to
decompress using approximately half this amount of memory, about 2300
kbytes. Decompression speed is also halved, so you should use this
option only where necessary. The relevant flag is <CODE>-s</CODE>.
</P><P>
</P>
<P>
In general, try to use the largest block size that memory constraints allow,
since that maximises the compression achieved. Compression and
decompression speed are virtually unaffected by block size.
</P><P>
</P>
<P>
Another significant point applies to files which fit in a single block
-- that means most files you'd encounter using a large block size. The
amount of real memory touched is proportional to the size of the file,
@ -322,18 +363,15 @@ since the file is smaller than a block. For example, compressing a file
allocate around 7600k of memory, but only touch 400k + 20000 * 8 = 560
kbytes of it. Similarly, the decompressor will allocate 3700k but only
touch 100k + 20000 * 4 = 180 kbytes.
</P><P>
</P>
<P>
Here is a table which summarises the maximum memory usage for different
block sizes. Also recorded is the total compressed size for 14 files of
the Calgary Text Compression Corpus totalling 3,141,622 bytes. This
column gives some feel for how compression varies with block size.
These figures tend to understate the advantage of larger block sizes for
larger files, since the Corpus is dominated by smaller files.
<PRE>
Compress Decompress Decompress Corpus
<TABLE><tr><td>&nbsp;</td><td class=example><pre> Compress Decompress Decompress Corpus
Flag usage usage -s usage Size
-1 1200k 500k 350k 914704
@ -345,61 +383,78 @@ larger files, since the Corpus is dominated by smaller files.
-7 6000k 2900k 1850k 834096
-8 6800k 3300k 2100k 828642
-9 7600k 3700k 2350k 828642
</PRE>
<H4><A NAME="SEC8" HREF="manual_toc.html#TOC8">RECOVERING DATA FROM DAMAGED FILES</A></H4>
</pre></td></tr></table></P><P>
<HR SIZE="6">
<A NAME="SEC8"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC7"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC9"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H4> RECOVERING DATA FROM DAMAGED FILES </H4>
<!--docid::SEC8::-->
<P>
<CODE>bzip2</CODE> compresses files in blocks, usually 900kbytes long. Each
block is handled independently. If a media or transmission error causes
a multi-block <CODE>.bz2</CODE> file to become damaged, it may be possible to
recover data from the undamaged blocks in the file.
</P><P>
</P>
<P>
The compressed representation of each block is delimited by a 48-bit
pattern, which makes it possible to find the block boundaries with
reasonable certainty. Each block also carries its own 32-bit CRC, so
damaged blocks can be distinguished from undamaged ones.
</P><P>
</P>
<P>
<CODE>bzip2recover</CODE> is a simple program whose purpose is to search for
blocks in <CODE>.bz2</CODE> files, and write each block out into its own
<CODE>.bz2</CODE> file. You can then use <CODE>bzip2 -t</CODE> to test the
integrity of the resulting files, and decompress those which are
undamaged.
</P><P>
</P>
<P>
<CODE>bzip2recover</CODE>
takes a single argument, the name of the damaged file,
and writes a number of files <CODE>rec0001file.bz2</CODE>,
<CODE>rec0002file.bz2</CODE>, etc, containing the extracted blocks.
The output filenames are designed so that the use of
wildcards in subsequent processing -- for example,
<CODE>bzip2 -dc rec*file.bz2 &#62; recovered_data</CODE> -- lists the files in
the correct order.
takes a single argument, the name of the damaged file, and writes a
number of files <CODE>rec00001file.bz2</CODE>, <CODE>rec00002file.bz2</CODE>, etc,
containing the extracted blocks. The output filenames are designed so
that the use of wildcards in subsequent processing -- for example,
<CODE>bzip2 -dc rec*file.bz2 &#62; recovered_data</CODE> -- processes the files in
the correct order.
</P><P>
</P>
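<P>
The recovery workflow can also be scripted. The following sketch uses
Python's standard <CODE>bz2</CODE> module in place of
<CODE>bzip2 -t</CODE> and <CODE>bzip2 -dc</CODE>, which is a substitution
made for this example. The function name
<CODE>join_recovered_blocks</CODE> and the sample files are hypothetical;
only the <CODE>rec*file.bz2</CODE> naming scheme comes from the text. It
visits the block files in sorted name order, skips any that fail to
decompress, and concatenates the rest.
</P>

```python
import bz2
import glob
import os
import tempfile


def join_recovered_blocks(directory, pattern="rec*file.bz2"):
    """Decompress every intact recovered block, visiting files in name order."""
    pieces = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        with open(path, "rb") as handle:
            raw = handle.read()
        try:
            pieces.append(bz2.decompress(raw))
        except (OSError, ValueError):
            # Damaged or truncated block: leave it out of the result.
            continue
    return b"".join(pieces)


# Demonstration with made-up block files; the middle one is deliberately damaged.
with tempfile.TemporaryDirectory() as workdir:
    samples = {
        "rec00001file.bz2": bz2.compress(b"first "),
        "rec00002file.bz2": b"this is not bzip2 data",
        "rec00003file.bz2": bz2.compress(b"third"),
    }
    for name, payload in samples.items():
        with open(os.path.join(workdir, name), "wb") as handle:
            handle.write(payload)
    result = join_recovered_blocks(workdir)

print(result)
```

<P>
The zero-padded block numbers are what make a plain sort produce the
right order here.
</P>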
<P>
<CODE>bzip2recover</CODE> should be of most use dealing with large <CODE>.bz2</CODE>
files, as these will contain many blocks. It is clearly
futile to use it on damaged single-block files, since a
damaged block cannot be recovered. If you wish to minimise
any potential data loss through media or transmission errors,
you might consider compressing with a smaller
block size.
</P>
<H4><A NAME="SEC9" HREF="manual_toc.html#TOC9">PERFORMANCE NOTES</A></H4>
files, as these will contain many blocks. It is clearly futile to use
it on damaged single-block files, since a damaged block cannot be
recovered. If you wish to minimise any potential data loss through
media or transmission errors, you might consider compressing with a
smaller block size.
</P><P>
<HR SIZE="6">
<A NAME="SEC9"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC8"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC10"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H4> PERFORMANCE NOTES </H4>
<!--docid::SEC9::-->
<P>
The sorting phase of compression gathers together similar strings in the
file. Because of this, files containing very long runs of repeated
symbols, like "aabaabaabaab ..." (repeated several hundred times) may
@ -408,13 +463,11 @@ better than previous versions in this respect. The ratio between
worst-case and average-case compression time is in the region of 10:1.
For previous versions, this figure was more like 100:1. You can use the
<CODE>-vvvv</CODE> option to monitor progress in great detail, if you want.
</P><P>
</P>
<P>
Decompression speed is unaffected by these phenomena.
</P><P>
</P>
<P>
<CODE>bzip2</CODE> usually allocates several megabytes of memory to operate
in, and then charges all over it in a fairly random fashion. This means
that performance, both for compressing and decompressing, is largely
@ -423,44 +476,71 @@ Because of this, small changes to the code to reduce the miss rate have
been observed to give disproportionately large performance improvements.
I imagine <CODE>bzip2</CODE> will perform best on machines with very large
caches.
</P><P>
</P>
<H4><A NAME="SEC10" HREF="manual_toc.html#TOC10">CAVEATS</A></H4>
<HR SIZE="6">
<A NAME="SEC10"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC9"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC11"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H4> CAVEATS </H4>
<!--docid::SEC10::-->
<P>
I/O error messages are not as helpful as they could be. <CODE>bzip2</CODE>
tries hard to detect I/O errors and exit cleanly, but the details of
what the problem is sometimes seem rather misleading.
</P><P>
</P>
<P>
This manual page pertains to version 1.0 of <CODE>bzip2</CODE>. Compressed
This manual page pertains to version 1.0.2 of <CODE>bzip2</CODE>. Compressed
data created by this version is entirely forwards and backwards
compatible with the previous public releases, versions 0.1pl2, 0.9.0 and
0.9.5, but with the following exception: 0.9.0 and above can correctly
decompress multiple concatenated compressed files. 0.1pl2 cannot do
this; it will stop after decompressing just the first file in the
stream.
compatible with the previous public releases, versions 0.1pl2, 0.9.0,
0.9.5, 1.0.0 and 1.0.1, but with the following exception: 0.9.0 and
above can correctly decompress multiple concatenated compressed files.
0.1pl2 cannot do this; it will stop after decompressing just the first
file in the stream.
</P><P>
</P>
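<P>
The behaviour with concatenated compressed files can be illustrated as
follows. This sketch relies on Python's standard <CODE>bz2</CODE> module,
whose <CODE>decompress</CODE> function accepts multiple concatenated
streams in Python 3.3 and later; it demonstrates the file format
property rather than the <CODE>bzip2</CODE> program itself, and the
sample data is made up.
</P>

```python
import bz2

# Two independently compressed streams, joined byte for byte.
first = bz2.compress(b"alpha\n")
second = bz2.compress(b"beta\n")
combined = first + second

# A decompressor that understands concatenated streams returns both payloads.
restored = bz2.decompress(combined)
print(restored)
```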
<P>
<CODE>bzip2recover</CODE> uses 32-bit integers to represent bit positions in
compressed files, so it cannot handle compressed files more than 512
megabytes long. This could easily be fixed.
<CODE>bzip2recover</CODE> versions prior to this one, 1.0.2, used 32-bit
integers to represent bit positions in compressed files, so they could not
handle compressed files more than 512 megabytes long. Versions 1.0.2 and
above use 64-bit ints on some platforms which support them (GNU
supported targets, and Windows). To establish whether or not
<CODE>bzip2recover</CODE> was built with such a limitation, run it without
arguments. In any event you can build yourself an unlimited version if
you can recompile it with <CODE>MaybeUInt64</CODE> set to be an unsigned
64-bit integer.
</P><P>
</P>
<H4><A NAME="SEC11" HREF="manual_toc.html#TOC11">AUTHOR</A></H4>
<P>
<HR SIZE="6">
<A NAME="SEC11"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_2.html#SEC10"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_3.html#SEC12"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H4> AUTHOR </H4>
<!--docid::SEC11::-->
Julian Seward, <CODE>jseward@acm.org</CODE>.
</P>
<P>
<CODE>http://sources.redhat.com/bzip2</CODE>
</P><P>
The ideas embodied in <CODE>bzip2</CODE> are due to (at least) the following
people: Michael Burrows and David Wheeler (for the block sorting
transformation), David Wheeler (again, for the Huffman coder), Peter
@ -471,14 +551,29 @@ indebted for their help, support and advice. See the manual in the
source distribution for pointers to sources of documentation. Christian
von Roques encouraged me to look for faster sorting algorithms, so as to
speed up compression. Bela Lubkin encouraged me to improve the
worst-case compression performance. Many people sent patches, helped
with portability problems, lent machines, gave advice and were generally
worst-case compression performance. The <CODE>bz*</CODE> scripts are derived
from those of GNU <CODE>gzip</CODE>. Many people sent patches, helped with
portability problems, lent machines, gave advice and were generally
helpful.
</P><P>
</P>
</BLOCKQUOTE>
<P><HR><P>
<p>Go to the <A HREF="manual_1.html">first</A>, <A HREF="manual_1.html">previous</A>, <A HREF="manual_3.html">next</A>, <A HREF="manual_4.html">last</A> section, <A HREF="manual_toc.html">table of contents</A>.
<HR SIZE="6">
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<BR>
<FONT SIZE="-1">
This document was generated
by <I>Julian Seward</I> on <I>January, 5 2002</I>
using <A HREF="http://www.mathematik.uni-kl.de/~obachman/Texi2html"><I>texi2html</I></A>
</BODY>
</HTML>

1400
dist/bzip2/manual_3.html vendored

File diff suppressed because it is too large


@ -1,45 +1,76 @@
<HTML>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!-- Created on January, 5 2002 by texi2html 1.64 -->
<!--
Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
Karl Berry <karl@freefriends.org>
Olaf Bachmann <obachman@mathematik.uni-kl.de>
and many others.
Maintained by: Olaf Bachmann <obachman@mathematik.uni-kl.de>
Send bugs and suggestions to <texi2html@mathematik.uni-kl.de>
-->
<HEAD>
<!-- This HTML file has been created by texi2html 1.54
from manual.texi on 23 March 2000 -->
<TITLE>Untitled Document: 4. Miscellanea</TITLE>
<TITLE>bzip2 and libbzip2 - Miscellanea</TITLE>
<link href="manual_3.html" rel=Previous>
<link href="manual_toc.html" rel=ToC>
<META NAME="description" CONTENT="Untitled Document: 4. Miscellanea">
<META NAME="keywords" CONTENT="Untitled Document: 4. Miscellanea">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META NAME="Generator" CONTENT="texi2html 1.64">
</HEAD>
<BODY>
<p>Go to the <A HREF="manual_1.html">first</A>, <A HREF="manual_3.html">previous</A>, next, last section, <A HREF="manual_toc.html">table of contents</A>.
<P><HR><P>
<BODY LANG="" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
<H1><A NAME="SEC43" HREF="manual_toc.html#TOC43">Miscellanea</A></H1>
<A NAME="SEC43"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_3.html#SEC42"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC44"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H1> 4. Miscellanea </H1>
<!--docid::SEC43::-->
<P>
These are just some random thoughts of mine. Your mileage may
vary.
</P><P>
</P>
<H2><A NAME="SEC44" HREF="manual_toc.html#TOC44">Limitations of the compressed file format</A></H2>
<P>
<HR SIZE="6">
<A NAME="SEC44"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC43"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC45"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H2> 4.1 Limitations of the compressed file format </H2>
<!--docid::SEC44::-->
<CODE>bzip2-1.0</CODE>, <CODE>0.9.5</CODE> and <CODE>0.9.0</CODE>
use exactly the same file format as the previous
version, <CODE>bzip2-0.1</CODE>. This decision was made in the interests of
stability. Creating yet another incompatible compressed file format
would create further confusion and disruption for users.
</P>
<P>
Nevertheless, this is not a painless decision. Development
work since the release of <CODE>bzip2-0.1</CODE> in August 1997
has shown complexities in the file format which slow down
decompression and, in retrospect, are unnecessary. These are:
<UL>
<LI>The run-length encoder, which is the first of the
compression transformations, is entirely irrelevant.
The original purpose was to protect the sorting algorithm
from the very worst case input: a string of repeated
@ -48,7 +79,6 @@ decompression and, in retrospect, are unnecessary. These are:
repeats can be handled without difficulty in block
sorting.
<LI>The randomisation mechanism doesn't really need to be
there. Udi Manber and Gene Myers published a suffix
array construction algorithm a few years back, which
can be employed to sort any block, no matter how
@ -56,6 +86,7 @@ decompression and, in retrospect, are unnecessary. These are:
Kunihiko Sadakane has produced a derivative O(N (log N)^2)
algorithm which usually outperforms the Manber-Myers
algorithm.
<P>
I could have changed to Sadakane's algorithm, but I find
it to be slower than <CODE>bzip2</CODE>'s existing algorithm for
@ -65,6 +96,7 @@ decompression and, in retrospect, are unnecessary. These are:
that I was not flooded with email complaints about
<CODE>bzip2-0.1</CODE>'s performance on repetitive data, so
perhaps it isn't a problem for real inputs.
</P><P>
Probably the best long-term solution,
and the one I have incorporated into 0.9.5 and above,
@ -72,7 +104,6 @@ decompression and, in retrospect, are unnecessary. These are:
algorithm initially, and fall back to a O(N (log N)^2)
algorithm if the standard algorithm gets into difficulties.
<LI>The compressed file format was never designed to be
handled by a library, and I have had to jump through
some hoops to produce an efficient implementation of
decompression. It's a bit hairy. Try passing
@ -81,52 +112,52 @@ decompression and, in retrospect, are unnecessary. These are:
could have been avoided if the compressed size of
each block of data was recorded in the data stream.
<LI>An Adler-32 checksum, rather than a CRC32 checksum,
would be faster to compute.
</UL>
<P>
It would be fair to say that the <CODE>bzip2</CODE> format was frozen
before I properly and fully understood the performance
consequences of doing so.
</P>
<P>
Improvements which I was able to incorporate into
0.9.0, despite using the same file format, are:
<UL>
<LI>Single array implementation of the inverse BWT. This
significantly speeds up decompression, presumably
because it reduces the number of cache misses.
<LI>Faster inverse MTF transform for large MTF values. The
new implementation is based on the notion of sliding blocks
of values.
<LI><CODE>bzip2-0.9.0</CODE> now reads and writes files with <CODE>fread</CODE>
and <CODE>fwrite</CODE>; version 0.1 used <CODE>putc</CODE> and <CODE>getc</CODE>.
Duh! Well, you live and learn.
<P>
</UL>
<P>
Further ahead, it would be nice
to be able to do random access into files. This will
require some careful design of compressed file formats.
</P>
<H2><A NAME="SEC45" HREF="manual_toc.html#TOC45">Portability issues</A></H2>
<P>
<HR SIZE="6">
<A NAME="SEC45"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC44"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC46"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H2> 4.2 Portability issues </H2>
<!--docid::SEC45::-->
After some consideration, I have decided not to use
GNU <CODE>autoconf</CODE> to configure 0.9.5 or 1.0.
</P>
<P>
<CODE>autoconf</CODE>, admirable and wonderful though it is,
mainly assists with portability problems between Unix-like
platforms. But <CODE>bzip2</CODE> doesn't have much in the way
@ -134,15 +165,13 @@ of portability problems on Unix; most of the difficulties appear
when porting to the Mac, or to Microsoft's operating systems.
<CODE>autoconf</CODE> doesn't help in those cases, and brings in a
whole load of new complexity.
</P><P>
</P>
<P>
Most people should be able to compile the library and program
under Unix straight out-of-the-box, so to speak, especially
if you have a version of GNU C available.
</P><P>
</P>
<P>
There are a couple of <CODE>__inline__</CODE> directives in the code. GNU C
(<CODE>gcc</CODE>) should be able to handle them. If you're not using
GNU C, your C compiler shouldn't see them at all.
@ -150,9 +179,8 @@ If your compiler does, for some reason, see them and doesn't
like them, just <CODE>#define</CODE> <CODE>__inline__</CODE> to be <CODE>/* */</CODE>. One
easy way to do this is to compile with the flag <CODE>-D__inline__=</CODE>,
which should be understood by most Unix compilers.
</P><P>
</P>
<P>
If you still have difficulties, try compiling with the macro
<CODE>BZ_STRICT_ANSI</CODE> defined. This should enable you to build the
library in a strictly ANSI compliant environment. Building the program
@ -160,165 +188,164 @@ itself like this is dangerous and not supported, since you remove
<CODE>bzip2</CODE>'s checks against compressing directories, symbolic links,
devices, and other not-really-a-file entities. This could cause
filesystem corruption!
</P><P>
</P>
<P>
One other thing: if you create a <CODE>bzip2</CODE> binary for public
distribution, please try to link it statically (<CODE>gcc -static</CODE>). This
avoids all sorts of library-version issues that others may encounter
later on.
</P><P>
</P>
<P>
If you build <CODE>bzip2</CODE> on Win32, you must set <CODE>BZ_UNIX</CODE> to 0 and
<CODE>BZ_LCCWIN32</CODE> to 1, in the file <CODE>bzip2.c</CODE>, before compiling.
Otherwise the resulting binary won't work correctly.
</P><P>
</P>
<H2><A NAME="SEC46" HREF="manual_toc.html#TOC46">Reporting bugs</A></H2>
<P>
<HR SIZE="6">
<A NAME="SEC46"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC45"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC47"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H2> 4.3 Reporting bugs </H2>
<!--docid::SEC46::-->
I tried pretty hard to make sure <CODE>bzip2</CODE> is
bug free, both by design and by testing. Hopefully
you'll never need to read this section for real.
</P>
<P>
Nevertheless, if <CODE>bzip2</CODE> dies with a segmentation
fault, a bus error or an internal assertion failure, it
will ask you to email me a bug report. Experience with
version 0.1 shows that almost all these problems can
be traced to either compiler bugs or hardware problems.
<UL>
<LI>
Recompile the program with no optimisation, and see if it
works. And/or try a different compiler.
I heard all sorts of stories about various flavours
of GNU C (and other compilers) generating bad code for
<CODE>bzip2</CODE>, and I've run across two such examples myself.
<P>
2.7.X versions of GNU C are known to generate bad code from
time to time, at high optimisation levels.
If you get problems, try using the flags
<CODE>-O2</CODE> <CODE>-fomit-frame-pointer</CODE> <CODE>-fno-strength-reduce</CODE>.
You should specifically <EM>not</EM> use <CODE>-funroll-loops</CODE>.
</P><P>
You may notice that the Makefile runs six tests as part of
the build process. If the program passes all of these, it's
a pretty good (but not 100%) indication that the compiler has
done its job correctly.
<LI>
If <CODE>bzip2</CODE> crashes randomly, and the crashes are not
repeatable, you may have a flaky memory subsystem. <CODE>bzip2</CODE>
really hammers your memory hierarchy, and if it's a bit marginal,
you may get these problems. Ditto if your disk or I/O subsystem
is slowly failing. Yup, this really does happen.
<P>
Try using a different machine of the same type, and see if
you can repeat the problem.
<LI>This isn't really a bug, but ... If <CODE>bzip2</CODE> tells
you your file is corrupted on decompression, and you
obtained the file via FTP, there is a possibility that you
forgot to tell FTP to do a binary mode transfer. That absolutely
will cause the file to be non-decompressible. You'll have to transfer
it again.
</UL>
<P>
If you've incorporated <CODE>libbzip2</CODE> into your own program
and are getting problems, please, please, please, check that the
parameters you are passing in calls to the library are
correct, and in accordance with what the documentation says
is allowable. I have tried to make the library robust against
such problems, but I'm sure I haven't succeeded.
</P><P>
</P>
<P>
Finally, if the above comments don't help, you'll have to send
me a bug report. Now, it's just amazing how many people will
send me a bug report saying something like
<PRE>
bzip2 crashed with segmentation fault on my machine
</PRE>
<P>
and absolutely nothing else. Needless to say, such a report
<TABLE><tr><td>&nbsp;</td><td class=display><pre style="font-family: serif"> bzip2 crashed with segmentation fault on my machine
</pre></td></tr></table>and absolutely nothing else. Needless to say, such a report
is <EM>totally, utterly, completely and comprehensively 100% useless;
a waste of your time, my time, and net bandwidth</EM>.
With no details at all, there's no way I can possibly begin
to figure out what the problem is.
</P><P>
</P>
<P>
The rules of the game are: facts, facts, facts. Don't omit
them because "oh, they won't be relevant". At the bare
minimum:
<PRE>
Machine type. Operating system version.
<TABLE><tr><td>&nbsp;</td><td class=display><pre style="font-family: serif"> Machine type. Operating system version.
Exact version of <CODE>bzip2</CODE> (do <CODE>bzip2 -V</CODE>).
Exact version of the compiler used.
Flags passed to the compiler.
</PRE>
<P>
However, the most important single thing that will help me is
</pre></td></tr></table>However, the most important single thing that will help me is
the file that you were trying to compress or decompress at the
time the problem happened. Without that, my ability to do anything
more than speculate about the cause is limited.
</P><P>
</P>
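<P>
The bare-minimum facts listed above can be gathered mechanically. The
following is only a sketch, assuming a POSIX system (it relies on
<CODE>uname()</CODE>) and a compiler that predefines <CODE>__VERSION__</CODE>,
as GCC does. The function name <CODE>format_bug_report_env</CODE> and its
parameters are hypothetical and not part of <CODE>bzip2</CODE>; the
<CODE>bzip2</CODE> version and the compiler flags still have to be supplied by
hand, for example copied from the output of <CODE>bzip2 -V</CODE> and from the
Makefile.
</P>

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper, not part of bzip2: formats the environment facts
   requested in a bug report into buf.
   bzip2_version and cflags are supplied by the caller.
   Returns the value of snprintf(), or -1 if uname() fails. */
int format_bug_report_env(char *buf, size_t size,
                          const char *bzip2_version, const char *cflags)
{
    struct utsname u;
#ifdef __VERSION__
    /* Caveat: this is the compiler building THIS helper, which may differ
       from the compiler that built the failing bzip2 binary. */
    const char *compiler = __VERSION__;
#else
    const char *compiler = "unknown (fill in by hand)";
#endif

    if (uname(&u) < 0)
        return -1;

    return snprintf(buf, size,
                    "Machine type:     %s\n"
                    "Operating system: %s %s\n"
                    "bzip2 version:    %s\n"
                    "Compiler version: %s\n"
                    "Compiler flags:   %s\n",
                    u.machine, u.sysname, u.release,
                    bzip2_version, compiler, cflags);
}
```

<P>
A caller would pass a buffer of a few hundred bytes and print the result with
<CODE>fputs</CODE>. The file that triggered the problem still has to be
attached separately.
</P>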
<P>
Please remember that I connect to the Internet with a modem, so
you should contact me before mailing me huge files.
</P><P>
</P>
<H2><A NAME="SEC47" HREF="manual_toc.html#TOC47">Did you get the right package?</A></H2>
<HR SIZE="6">
<A NAME="SEC47"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC46"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC48"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H2> 4.4 Did you get the right package? </H2>
<!--docid::SEC47::-->
<P>
<CODE>bzip2</CODE> is a resource hog. It soaks up large amounts of CPU cycles
and memory. Also, it gives very large latencies. In the worst case, you
can feed many megabytes of uncompressed data into the library before
getting any compressed output, so this probably rules out applications
requiring interactive behaviour.
</P><P>
</P>
<P>
These aren't faults of my implementation, I hope, but more
an intrinsic property of the Burrows-Wheeler transform (unfortunately).
Maybe this isn't what you want.
</P><P>
</P>
<P>
If you want a compressor and/or library which is faster, uses less
memory but gets pretty good compression, and has minimal latency,
consider Jean-loup
Gailly's and Mark Adler's work, <CODE>zlib-1.1.2</CODE> and
Gailly's and Mark Adler's work, <CODE>zlib-1.1.3</CODE> and
<CODE>gzip-1.2.4</CODE>. Look for them at
</P><P>
</P>
<P>
<CODE>http://www.cdrom.com/pub/infozip/zlib</CODE> and
<CODE>http://www.zlib.org</CODE> and
<CODE>http://www.gzip.org</CODE> respectively.
</P><P>
</P>
<P>
For something faster and lighter still, you might try Markus F X J
Oberhumer's <CODE>LZO</CODE> real-time compression/decompression library, at
<BR> <CODE>http://wildsau.idv.uni-linz.ac.at/mfx/lzo.html</CODE>.
</P><P>
</P>
<P>
If you want to use the <CODE>bzip2</CODE> algorithms to compress small blocks
of data, 64k bytes or smaller, for example on an on-the-fly disk
compressor, you'd be well advised not to use this library. Instead,
@ -326,136 +353,117 @@ I've made a special library tuned for that kind of use. It's part of
<CODE>e2compr-0.40</CODE>, an on-the-fly disk compressor for the Linux
<CODE>ext2</CODE> filesystem. Look at
<CODE>http://www.netspace.net.au/~reiter/e2compr</CODE>.
</P><P>
</P>
<H2><A NAME="SEC48" HREF="manual_toc.html#TOC48">Testing</A></H2>
<HR SIZE="6">
<A NAME="SEC48"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC47"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC49"> &gt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H2> 4.5 Testing </H2>
<!--docid::SEC48::-->
<P>
A record of the tests I've done.
</P><P>
</P>
<P>
First, some data sets:
<UL>
<LI>B: a directory containing 6001 files, one for every length in the
range 0 to 6000 bytes. The files contain random lowercase
letters. 18.7 megabytes.
<LI>H: my home directory tree. Documents, source code, mail files,
compressed data. H contains B, and also a directory of
files designed as boundary cases for the sorting; mostly very
repetitive, nasty files. 565 megabytes.
<LI>A: directory tree holding various applications built from source:
<CODE>egcs</CODE>, <CODE>gcc-2.8.1</CODE>, KDE, GTK, Octave, etc.
2200 megabytes.
</UL>
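<P>
Data set B is simple enough to regenerate. The sketch below is an assumption
about how such a set could be built, not the script actually used: it writes
one file per length in the range 0 to <CODE>max_len</CODE>, filled with
pseudo-random lowercase letters from <CODE>rand()</CODE>. The function names
and the <CODE>lenNNNNN.dat</CODE> file naming are hypothetical. For lengths 0
to 6000 the raw payload adds up to 18,003,000 bytes; the 18.7 megabytes quoted
above may have been measured differently, for example as disk usage.
</P>

```c
#include <stdio.h>
#include <stdlib.h>

/* Fill buf with len pseudo-random lowercase letters ('a'..'z'). */
void fill_lowercase(unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] = (unsigned char)('a' + rand() % 26);
}

/* Total payload, in bytes, of one file per length in 0..max_len. */
unsigned long dataset_payload_bytes(unsigned long max_len)
{
    return max_len * (max_len + 1UL) / 2UL;
}

/* Write one file per length into the existing directory dir.
   File names (lenNNNNN.dat) are an arbitrary choice for this sketch.
   Returns the number of files written, or -1 on the first error. */
long write_dataset_b(const char *dir, size_t max_len)
{
    unsigned char *buf;
    char path[1024];
    size_t len;
    long written = 0;

    buf = malloc(max_len ? max_len : 1);
    if (buf == NULL)
        return -1;

    for (len = 0; len <= max_len; len++) {
        FILE *f;
        int n = snprintf(path, sizeof path, "%s/len%05lu.dat",
                         dir, (unsigned long)len);
        if (n < 0 || (size_t)n >= sizeof path) { free(buf); return -1; }

        f = fopen(path, "wb");
        if (f == NULL) { free(buf); return -1; }

        fill_lowercase(buf, len);
        if (fwrite(buf, 1, len, f) != len) {
            fclose(f);
            free(buf);
            return -1;
        }
        if (fclose(f) != 0) { free(buf); return -1; }
        written++;
    }

    free(buf);
    return written;
}
```

<P>
Calling <CODE>write_dataset_b("B", 6000)</CODE> with an existing directory
<CODE>B</CODE> would produce 6001 files. Seed the generator with
<CODE>srand()</CODE> first if a reproducible set is wanted.
</P>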
<P>
The tests conducted are as follows. Each test means compressing
(a copy of) each file in the data set, decompressing it and
comparing it against the original.
</P>
<P>
First, a bunch of tests with block sizes and internal buffer
sizes set very small,
to detect any problems with the
blocking and buffering mechanisms.
This required modifying the source code so as to try to
break it.
<OL>
<LI>Data set H, with
buffer size of 1 byte, and block size of 23 bytes.
<LI>Data set B, buffer sizes 1 byte, block size 1 byte.
<LI>As (2) but small-mode decompression.
<LI>As (2) with block size 2 bytes.
<LI>As (2) with block size 3 bytes.
<LI>As (2) with block size 4 bytes.
<LI>As (2) with block size 5 bytes.
<LI>As (2) with block size 6 bytes and small-mode decompression.
<LI>H with buffer size of 1 byte, but normal block
size (up to 900000 bytes).
</OL>
<P>
Then some tests with unmodified source code.
<OL>
<LI>H, all settings normal.
<LI>As (1), with small-mode decompress.
<LI>H, compress with flag <CODE>-1</CODE>.
<LI>H, compress with flag <CODE>-s</CODE>, decompress with flag <CODE>-s</CODE>.
<LI>Forwards compatibility: H, <CODE>bzip2-0.1pl2</CODE> compressing,
<CODE>bzip2-0.9.5</CODE> decompressing, all settings normal.
<LI>Backwards compatibility: H, <CODE>bzip2-0.9.5</CODE> compressing,
<CODE>bzip2-0.1pl2</CODE> decompressing, all settings normal.
<LI>Bigger tests: A, all settings normal.
<LI>As (7), using the fallback (Sadakane-like) sorting algorithm.
<LI>As (8), compress with flag <CODE>-1</CODE>, decompress with flag
<CODE>-s</CODE>.
<LI>H, using the fallback sorting algorithm.
<LI>Forwards compatibility: A, <CODE>bzip2-0.1pl2</CODE> compressing,
<CODE>bzip2-0.9.5</CODE> decompressing, all settings normal.
<LI>Backwards compatibility: A, <CODE>bzip2-0.9.5</CODE> compressing,
<CODE>bzip2-0.1pl2</CODE> decompressing, all settings normal.
<LI>Misc test: about 400 megabytes of <CODE>.tar</CODE> files with
<CODE>bzip2</CODE> compiled with Checker (a memory access error
detector, like Purify).
<LI>Misc tests to make sure it builds and runs ok on non-Linux/x86
platforms.
</OL>
<P>
These tests were conducted on a 225 MHz IDT WinChip machine, running
Linux 2.0.36. They represent nearly a week of continuous computation.
All tests completed successfully.
</P>
<H2><A NAME="SEC49" HREF="manual_toc.html#TOC49">Further reading</A></H2>
<P>
<HR SIZE="6">
<A NAME="SEC49"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_4.html#SEC48"> &lt; </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top"> Up </A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H2> 4.6 Further reading </H2>
<!--docid::SEC49::-->
<CODE>bzip2</CODE> is not research work, in the sense that it doesn't present
any new ideas. Rather, it's an engineering exercise based on existing
ideas.
</P>
<P>
Four documents describe essentially all the ideas behind <CODE>bzip2</CODE>:
<PRE>
Michael Burrows and D. J. Wheeler:
Four documents describe essentially all the ideas behind <CODE>bzip2</CODE>:
<TABLE><tr><td>&nbsp;</td><td class=example><pre>Michael Burrows and D. J. Wheeler:
"A block-sorting lossless data compression algorithm"
10th May 1994.
Digital SRC Research Report 124.
@ -479,50 +487,44 @@ Jon L. Bentley and Robert Sedgewick
"Fast Algorithms for Sorting and Searching Strings"
Available from Sedgewick's web page,
www.cs.princeton.edu/~rs
</PRE>
<P>
The following paper gives valuable additional insights into the
</pre></td></tr></table>The following paper gives valuable additional insights into the
algorithm, but is not immediately the basis of any code
used in bzip2.
<PRE>
Peter Fenwick:
<TABLE><tr><td>&nbsp;</td><td class=example><pre>Peter Fenwick:
Block Sorting Text Compression
Proceedings of the 19th Australasian Computer Science Conference,
Melbourne, Australia. Jan 31 - Feb 2, 1996.
ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps
</PRE>
<P>
Kunihiko Sadakane's sorting algorithm, mentioned above,
</pre></td></tr></table>Kunihiko Sadakane's sorting algorithm, mentioned above,
is available from:
<PRE>
http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz
</PRE>
<P>
The Manber-Myers suffix array construction
<TABLE><tr><td>&nbsp;</td><td class=example><pre>http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz
</pre></td></tr></table>The Manber-Myers suffix array construction
algorithm is described in a paper
available from:
<PRE>
http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps
</PRE>
<P>
Finally, the following paper documents some recent investigations
<TABLE><tr><td>&nbsp;</td><td class=example><pre>http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps
</pre></td></tr></table>Finally, the following paper documents some recent investigations
I made into the performance of sorting algorithms:
<PRE>
Julian Seward:
<TABLE><tr><td>&nbsp;</td><td class=example><pre>Julian Seward:
On the Performance of BWT Sorting Algorithms
Proceedings of the IEEE Data Compression Conference 2000
Snowbird, Utah. 28-30 March 2000.
</PRE>
</pre></td></tr></table></P><P>
<HR SIZE="6">
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[ &lt;&lt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[ &gt;&gt; ]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT"> &nbsp; <TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<BR>
<FONT SIZE="-1">
This document was generated
by <I>Julian Seward</I> on <I>January, 5 2002</I>
using <A HREF="http://www.mathematik.uni-kl.de/~obachman/Texi2html
"><I>texi2html</I></A>
<P><HR><P>
<p>Go to the <A HREF="manual_1.html">first</A>, <A HREF="manual_3.html">previous</A>, next, last section, <A HREF="manual_toc.html">table of contents</A>.
</BODY>
</HTML>

201
dist/bzip2/manual_abt.html vendored Normal file

@ -0,0 +1,201 @@
<HTML>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!-- Created on January, 5 2002 by texi2html 1.64 -->
<!--
Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
Karl Berry <karl@freefriends.org>
Olaf Bachmann <obachman@mathematik.uni-kl.de>
and many others.
Maintained by: Olaf Bachmann <obachman@mathematik.uni-kl.de>
Send bugs and suggestions to <texi2html@mathematik.uni-kl.de>
-->
<HEAD>
<TITLE>Untitled Document: About this document</TITLE>
<META NAME="description" CONTENT="Untitled Document: About this document">
<META NAME="keywords" CONTENT="Untitled Document: About this document">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META NAME="Generator" CONTENT="texi2html 1.64">
</HEAD>
<BODY LANG="" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
<A NAME="SEC_About"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H1>About this document</H1>
This document was generated by <I>Julian Seward</I> on <I>January, 5 2002</I>
using <A HREF="http://www.mathematik.uni-kl.de/~obachman/Texi2html
"><I>texi2html</I></A>
<P></P>
The buttons in the navigation panels have the following meaning:
<P></P>
<table border = "1">
<TR>
<TH> Button </TH>
<TH> Name </TH>
<TH> Go to </TH>
<TH> From 1.2.3 go to</TH>
</TR>
<TR>
<TD ALIGN="CENTER">
[ &lt; ] </TD>
<TD ALIGN="CENTER">
Back
</TD>
<TD>
previous section in reading order
</TD>
<TD>
1.2.2
</TD>
</TR>
<TR>
<TD ALIGN="CENTER">
[ &gt; ] </TD>
<TD ALIGN="CENTER">
Forward
</TD>
<TD>
next section in reading order
</TD>
<TD>
1.2.4
</TD>
</TR>
<TR>
<TD ALIGN="CENTER">
[ &lt;&lt; ] </TD>
<TD ALIGN="CENTER">
FastBack
</TD>
<TD>
previous or up-and-previous section
</TD>
<TD>
1.1
</TD>
</TR>
<TR>
<TD ALIGN="CENTER">
[ Up ] </TD>
<TD ALIGN="CENTER">
Up
</TD>
<TD>
up section
</TD>
<TD>
1.2
</TD>
</TR>
<TR>
<TD ALIGN="CENTER">
[ &gt;&gt; ] </TD>
<TD ALIGN="CENTER">
FastForward
</TD>
<TD>
next or up-and-next section
</TD>
<TD>
1.3
</TD>
</TR>
<TR>
<TD ALIGN="CENTER">
[Top] </TD>
<TD ALIGN="CENTER">
Top
</TD>
<TD>
cover (top) of document
</TD>
<TD>
&nbsp;
</TD>
</TR>
<TR>
<TD ALIGN="CENTER">
[Contents] </TD>
<TD ALIGN="CENTER">
Contents
</TD>
<TD>
table of contents
</TD>
<TD>
&nbsp;
</TD>
</TR>
<TR>
<TD ALIGN="CENTER">
[Index] </TD>
<TD ALIGN="CENTER">
Index
</TD>
<TD>
concept index
</TD>
<TD>
&nbsp;
</TD>
</TR>
<TR>
<TD ALIGN="CENTER">
[ ? ] </TD>
<TD ALIGN="CENTER">
About
</TD>
<TD>
this page
</TD>
<TD>
&nbsp;
</TD>
</TR>
</TABLE>
<P></P>
where the <STRONG> Example </STRONG> assumes that the current position
is at <STRONG> Subsubsection One-Two-Three </STRONG> of a document of
the following structure:
<UL>
<LI> 1. Section One </LI>
<UL>
<LI>1.1 Subsection One-One</LI>
<UL>
<LI> ... </LI>
</UL>
<LI>1.2 Subsection One-Two</LI>
<UL>
<LI>1.2.1 Subsubsection One-Two-One
</LI><LI>1.2.2 Subsubsection One-Two-Two
</LI><LI>1.2.3 Subsubsection One-Two-Three &nbsp; &nbsp; <STRONG>
&lt;== Current Position </STRONG>
</LI><LI>1.2.4 Subsubsection One-Two-Four
</LI></UL>
<LI>1.3 Subsection One-Three</LI>
<UL>
<LI> ... </LI>
</UL>
<LI>1.4 Subsection One-Four</LI>
</UL>
</UL>
<HR SIZE=1>
<BR>
<FONT SIZE="-1">
This document was generated
by <I>Julian Seward</I> on <I>January, 5 2002</I>
using <A HREF="http://www.mathematik.uni-kl.de/~obachman/Texi2html
"><I>texi2html</I></A>
</BODY>
</HTML>

54
dist/bzip2/manual_ovr.html vendored Normal file
View File

@ -0,0 +1,54 @@
<HTML>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!-- Created on January, 5 2002 by texi2html 1.64 -->
<!--
Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
Karl Berry <karl@freefriends.org>
Olaf Bachmann <obachman@mathematik.uni-kl.de>
and many others.
Maintained by: Olaf Bachmann <obachman@mathematik.uni-kl.de>
Send bugs and suggestions to <texi2html@mathematik.uni-kl.de>
-->
<HEAD>
<TITLE>Untitled Document: Short Table of Contents</TITLE>
<META NAME="description" CONTENT="Untitled Document: Short Table of Contents">
<META NAME="keywords" CONTENT="Untitled Document: Short Table of Contents">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META NAME="Generator" CONTENT="texi2html 1.64">
</HEAD>
<BODY LANG="" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
<A NAME="SEC_OVERVIEW"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H1>Short Table of Contents</H1>
<BLOCKQUOTE>
<A NAME="TOC1" HREF="manual_1.html#SEC1">1. Introduction</A>
<BR>
<A NAME="TOC2" HREF="manual_2.html#SEC2">2. How to use <CODE>bzip2</CODE></A>
<BR>
<A NAME="TOC12" HREF="manual_3.html#SEC12">3. Programming with <CODE>libbzip2</CODE></A>
<BR>
<A NAME="TOC43" HREF="manual_4.html#SEC43">4. Miscellanea</A>
<BR>
</BLOCKQUOTE>
<HR SIZE=1>
<BR>
<FONT SIZE="-1">
This document was generated
by <I>Julian Seward</I> on <I>January, 5 2002</I>
using <A HREF="http://www.mathematik.uni-kl.de/~obachman/Texi2html
"><I>texi2html</I></A>
</BODY>
</HTML>


@ -1,173 +1,163 @@
<HTML>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<!-- Created on January, 5 2002 by texi2html 1.64 -->
<!--
Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
Karl Berry <karl@freefriends.org>
Olaf Bachmann <obachman@mathematik.uni-kl.de>
and many others.
Maintained by: Olaf Bachmann <obachman@mathematik.uni-kl.de>
Send bugs and suggestions to <texi2html@mathematik.uni-kl.de>
-->
<HEAD>
<!-- This HTML file has been created by texi2html 1.54
from manual.texi on 23 March 2000 -->
<TITLE>Untitled Document: Table of Contents</TITLE>
<TITLE>bzip2 and libbzip2 - Table of Contents</TITLE>
<META NAME="description" CONTENT="Untitled Document: Table of Contents">
<META NAME="keywords" CONTENT="Untitled Document: Table of Contents">
<META NAME="resource-type" CONTENT="document">
<META NAME="distribution" CONTENT="global">
<META NAME="Generator" CONTENT="texi2html 1.64">
</HEAD>
<BODY>
<H1>bzip2 and libbzip2</H1>
<H2>a program and library for data compression</H2>
<H2>copyright (C) 1996-2000 Julian Seward</H2>
<H2>version 1.0 of 21 March 2000</H2>
<ADDRESS>Julian Seward</ADDRESS>
<P>
<P><HR><P>
<P>
This program, <CODE>bzip2</CODE>,
and associated library <CODE>libbzip2</CODE>, are
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
</P>
<P>
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
<BODY LANG="" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
<A NAME="SEC_Contents"></A>
<TABLE CELLPADDING=1 CELLSPACING=1 BORDER=0>
<TR><TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual.html#SEC_Top">Top</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_toc.html#SEC_Contents">Contents</A>]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[Index]</TD>
<TD VALIGN="MIDDLE" ALIGN="LEFT">[<A HREF="manual_abt.html#SEC_About"> ? </A>]</TD>
</TR></TABLE>
<H1>Table of Contents</H1>
<UL>
<LI>
Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
<LI>
The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product
documentation would be appreciated but is not required.
<LI>
Altered source versions must be plainly marked as such, and must
not be misrepresented as being the original software.
<LI>
The name of the author may not be used to endorse or promote
products derived from this software without specific prior written
permission.
</UL>
<P>
THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
</P>
<P>
Julian Seward, Cambridge, UK.
</P>
<P>
<CODE>jseward@acm.org</CODE>
</P>
<P>
<CODE>http://sourceware.cygnus.com/bzip2</CODE>
</P>
<P>
<CODE>http://www.cacheprof.org</CODE>
</P>
<P>
<CODE>http://www.muraroa.demon.co.uk</CODE>
</P>
<P>
<CODE>bzip2</CODE>/<CODE>libbzip2</CODE> version 1.0 of 21 March 2000.
</P>
<P>
PATENTS: To the best of my knowledge, <CODE>bzip2</CODE> does not use any patented
algorithms. However, I do not have the resources available to carry out
a full patent search. Therefore I cannot give any guarantee of the
above statement.
</P>
<UL>
<LI><A NAME="TOC1" HREF="manual_1.html#SEC1">Introduction</A>
<LI><A NAME="TOC2" HREF="manual_2.html#SEC2">How to use <CODE>bzip2</CODE></A>
<A NAME="TOC1" HREF="manual_1.html#SEC1">1. Introduction</A>
<BR>
<A NAME="TOC2" HREF="manual_2.html#SEC2">2. How to use <CODE>bzip2</CODE></A>
<BR>
<UL>
<UL>
<UL>
<LI><A NAME="TOC3" HREF="manual_2.html#SEC3">NAME</A>
<LI><A NAME="TOC4" HREF="manual_2.html#SEC4">SYNOPSIS</A>
<LI><A NAME="TOC5" HREF="manual_2.html#SEC5">DESCRIPTION</A>
<LI><A NAME="TOC6" HREF="manual_2.html#SEC6">OPTIONS</A>
<LI><A NAME="TOC7" HREF="manual_2.html#SEC7">MEMORY MANAGEMENT</A>
<LI><A NAME="TOC8" HREF="manual_2.html#SEC8">RECOVERING DATA FROM DAMAGED FILES</A>
<LI><A NAME="TOC9" HREF="manual_2.html#SEC9">PERFORMANCE NOTES</A>
<LI><A NAME="TOC10" HREF="manual_2.html#SEC10">CAVEATS</A>
<LI><A NAME="TOC11" HREF="manual_2.html#SEC11">AUTHOR</A>
<A NAME="TOC3" HREF="manual_2.html#SEC3">NAME</A>
<BR>
<A NAME="TOC4" HREF="manual_2.html#SEC4">SYNOPSIS</A>
<BR>
<A NAME="TOC5" HREF="manual_2.html#SEC5">DESCRIPTION</A>
<BR>
<A NAME="TOC6" HREF="manual_2.html#SEC6">OPTIONS</A>
<BR>
<A NAME="TOC7" HREF="manual_2.html#SEC7">MEMORY MANAGEMENT</A>
<BR>
<A NAME="TOC8" HREF="manual_2.html#SEC8">RECOVERING DATA FROM DAMAGED FILES</A>
<BR>
<A NAME="TOC9" HREF="manual_2.html#SEC9">PERFORMANCE NOTES</A>
<BR>
<A NAME="TOC10" HREF="manual_2.html#SEC10">CAVEATS</A>
<BR>
<A NAME="TOC11" HREF="manual_2.html#SEC11">AUTHOR</A>
<BR>
</UL>
</UL>
</UL>
<LI><A NAME="TOC12" HREF="manual_3.html#SEC12">Programming with <CODE>libbzip2</CODE></A>
<A NAME="TOC12" HREF="manual_3.html#SEC12">3. Programming with <CODE>libbzip2</CODE></A>
<BR>
<UL>
<LI><A NAME="TOC13" HREF="manual_3.html#SEC13">Top-level structure</A>
<A NAME="TOC13" HREF="manual_3.html#SEC13">3.1 Top-level structure</A>
<BR>
<UL>
<LI><A NAME="TOC14" HREF="manual_3.html#SEC14">Low-level summary</A>
<LI><A NAME="TOC15" HREF="manual_3.html#SEC15">High-level summary</A>
<LI><A NAME="TOC16" HREF="manual_3.html#SEC16">Utility functions summary</A>
<A NAME="TOC14" HREF="manual_3.html#SEC14">3.1.1 Low-level summary</A>
<BR>
<A NAME="TOC15" HREF="manual_3.html#SEC15">3.1.2 High-level summary</A>
<BR>
<A NAME="TOC16" HREF="manual_3.html#SEC16">3.1.3 Utility functions summary</A>
<BR>
</UL>
<LI><A NAME="TOC17" HREF="manual_3.html#SEC17">Error handling</A>
<LI><A NAME="TOC18" HREF="manual_3.html#SEC18">Low-level interface</A>
<A NAME="TOC17" HREF="manual_3.html#SEC17">3.2 Error handling</A>
<BR>
<A NAME="TOC18" HREF="manual_3.html#SEC18">3.3 Low-level interface</A>
<BR>
<UL>
<LI><A NAME="TOC19" HREF="manual_3.html#SEC19"><CODE>BZ2_bzCompressInit</CODE></A>
<LI><A NAME="TOC20" HREF="manual_3.html#SEC20"><CODE>BZ2_bzCompress</CODE></A>
<LI><A NAME="TOC21" HREF="manual_3.html#SEC21"><CODE>BZ2_bzCompressEnd</CODE></A>
<LI><A NAME="TOC22" HREF="manual_3.html#SEC22"><CODE>BZ2_bzDecompressInit</CODE></A>
<LI><A NAME="TOC23" HREF="manual_3.html#SEC23"><CODE>BZ2_bzDecompress</CODE></A>
<LI><A NAME="TOC24" HREF="manual_3.html#SEC24"><CODE>BZ2_bzDecompressEnd</CODE></A>
<A NAME="TOC19" HREF="manual_3.html#SEC19">3.3.1 <CODE>BZ2_bzCompressInit</CODE></A>
<BR>
<A NAME="TOC20" HREF="manual_3.html#SEC20">3.3.2 <CODE>BZ2_bzCompress</CODE></A>
<BR>
<A NAME="TOC21" HREF="manual_3.html#SEC21">3.3.3 <CODE>BZ2_bzCompressEnd</CODE></A>
<BR>
<A NAME="TOC22" HREF="manual_3.html#SEC22">3.3.4 <CODE>BZ2_bzDecompressInit</CODE></A>
<BR>
<A NAME="TOC23" HREF="manual_3.html#SEC23">3.3.5 <CODE>BZ2_bzDecompress</CODE></A>
<BR>
<A NAME="TOC24" HREF="manual_3.html#SEC24">3.3.6 <CODE>BZ2_bzDecompressEnd</CODE></A>
<BR>
</UL>
<LI><A NAME="TOC25" HREF="manual_3.html#SEC25">High-level interface</A>
<A NAME="TOC25" HREF="manual_3.html#SEC25">3.4 High-level interface</A>
<BR>
<UL>
<LI><A NAME="TOC26" HREF="manual_3.html#SEC26"><CODE>BZ2_bzReadOpen</CODE></A>
<LI><A NAME="TOC27" HREF="manual_3.html#SEC27"><CODE>BZ2_bzRead</CODE></A>
<LI><A NAME="TOC28" HREF="manual_3.html#SEC28"><CODE>BZ2_bzReadGetUnused</CODE></A>
<LI><A NAME="TOC29" HREF="manual_3.html#SEC29"><CODE>BZ2_bzReadClose</CODE></A>
<LI><A NAME="TOC30" HREF="manual_3.html#SEC30"><CODE>BZ2_bzWriteOpen</CODE></A>
<LI><A NAME="TOC31" HREF="manual_3.html#SEC31"><CODE>BZ2_bzWrite</CODE></A>
<LI><A NAME="TOC32" HREF="manual_3.html#SEC32"><CODE>BZ2_bzWriteClose</CODE></A>
<LI><A NAME="TOC33" HREF="manual_3.html#SEC33">Handling embedded compressed data streams</A>
<LI><A NAME="TOC34" HREF="manual_3.html#SEC34">Standard file-reading/writing code</A>
<A NAME="TOC26" HREF="manual_3.html#SEC26">3.4.1 <CODE>BZ2_bzReadOpen</CODE></A>
<BR>
<A NAME="TOC27" HREF="manual_3.html#SEC27">3.4.2 <CODE>BZ2_bzRead</CODE></A>
<BR>
<A NAME="TOC28" HREF="manual_3.html#SEC28">3.4.3 <CODE>BZ2_bzReadGetUnused</CODE></A>
<BR>
<A NAME="TOC29" HREF="manual_3.html#SEC29">3.4.4 <CODE>BZ2_bzReadClose</CODE></A>
<BR>
<A NAME="TOC30" HREF="manual_3.html#SEC30">3.4.5 <CODE>BZ2_bzWriteOpen</CODE></A>
<BR>
<A NAME="TOC31" HREF="manual_3.html#SEC31">3.4.6 <CODE>BZ2_bzWrite</CODE></A>
<BR>
<A NAME="TOC32" HREF="manual_3.html#SEC32">3.4.7 <CODE>BZ2_bzWriteClose</CODE></A>
<BR>
<A NAME="TOC33" HREF="manual_3.html#SEC33">3.4.8 Handling embedded compressed data streams</A>
<BR>
<A NAME="TOC34" HREF="manual_3.html#SEC34">3.4.9 Standard file-reading/writing code</A>
<BR>
</UL>
<LI><A NAME="TOC35" HREF="manual_3.html#SEC35">Utility functions</A>
<A NAME="TOC35" HREF="manual_3.html#SEC35">3.5 Utility functions</A>
<BR>
<UL>
<LI><A NAME="TOC36" HREF="manual_3.html#SEC36"><CODE>BZ2_bzBuffToBuffCompress</CODE></A>
<LI><A NAME="TOC37" HREF="manual_3.html#SEC37"><CODE>BZ2_bzBuffToBuffDecompress</CODE></A>
<A NAME="TOC36" HREF="manual_3.html#SEC36">3.5.1 <CODE>BZ2_bzBuffToBuffCompress</CODE></A>
<BR>
<A NAME="TOC37" HREF="manual_3.html#SEC37">3.5.2 <CODE>BZ2_bzBuffToBuffDecompress</CODE></A>
<BR>
</UL>
<LI><A NAME="TOC38" HREF="manual_3.html#SEC38"><CODE>zlib</CODE> compatibility functions</A>
<LI><A NAME="TOC39" HREF="manual_3.html#SEC39">Using the library in a <CODE>stdio</CODE>-free environment</A>
<A NAME="TOC38" HREF="manual_3.html#SEC38">3.6 <CODE>zlib</CODE> compatibility functions</A>
<BR>
<A NAME="TOC39" HREF="manual_3.html#SEC39">3.7 Using the library in a <CODE>stdio</CODE>-free environment</A>
<BR>
<UL>
<LI><A NAME="TOC40" HREF="manual_3.html#SEC40">Getting rid of <CODE>stdio</CODE></A>
<LI><A NAME="TOC41" HREF="manual_3.html#SEC41">Critical error handling</A>
<A NAME="TOC40" HREF="manual_3.html#SEC40">3.7.1 Getting rid of <CODE>stdio</CODE></A>
<BR>
<A NAME="TOC41" HREF="manual_3.html#SEC41">3.7.2 Critical error handling</A>
<BR>
</UL>
<LI><A NAME="TOC42" HREF="manual_3.html#SEC42">Making a Windows DLL</A>
<A NAME="TOC42" HREF="manual_3.html#SEC42">3.8 Making a Windows DLL</A>
<BR>
</UL>
<LI><A NAME="TOC43" HREF="manual_4.html#SEC43">Miscellanea</A>
<A NAME="TOC43" HREF="manual_4.html#SEC43">4. Miscellanea</A>
<BR>
<UL>
<LI><A NAME="TOC44" HREF="manual_4.html#SEC44">Limitations of the compressed file format</A>
<LI><A NAME="TOC45" HREF="manual_4.html#SEC45">Portability issues</A>
<LI><A NAME="TOC46" HREF="manual_4.html#SEC46">Reporting bugs</A>
<LI><A NAME="TOC47" HREF="manual_4.html#SEC47">Did you get the right package?</A>
<LI><A NAME="TOC48" HREF="manual_4.html#SEC48">Testing</A>
<LI><A NAME="TOC49" HREF="manual_4.html#SEC49">Further reading</A>
<A NAME="TOC44" HREF="manual_4.html#SEC44">4.1 Limitations of the compressed file format</A>
<BR>
<A NAME="TOC45" HREF="manual_4.html#SEC45">4.2 Portability issues</A>
<BR>
<A NAME="TOC46" HREF="manual_4.html#SEC46">4.3 Reporting bugs</A>
<BR>
<A NAME="TOC47" HREF="manual_4.html#SEC47">4.4 Did you get the right package?</A>
<BR>
<A NAME="TOC48" HREF="manual_4.html#SEC48">4.5 Testing</A>
<BR>
<A NAME="TOC49" HREF="manual_4.html#SEC49">4.6 Further reading</A>
<BR>
</UL>
</UL>
<P><HR><P>
This document was generated on 23 March 2000 using the
<A HREF="http://wwwcn.cern.ch/dci/texi2html/">texi2html</A>
translator version 1.51a.</P>
<HR SIZE=1>
<BR>
<FONT SIZE="-1">
This document was generated
by <I>Julian Seward</I> on <I>January, 5 2002</I>
using <A HREF="http://www.mathematik.uni-kl.de/~obachman/Texi2html
"><I>texi2html</I></A>
</BODY>
</HTML>

18
dist/bzip2/mk251.c vendored Normal file

@ -0,0 +1,18 @@
/* $NetBSD: mk251.c,v 1.1.1.1 2002/03/15 01:35:28 mjl Exp $ */
/* Spew out a long sequence of the byte 251. When fed to bzip2
versions 1.0.0 or 1.0.1, causes it to die with internal error
1007 in blocksort.c. This assertion misses an extremely rare
case, which is fixed in this version (1.0.2) and above.
*/
#include <stdio.h>
int main ()
{
int i;
for (i = 0; i < 48500000 ; i++)
putchar(251);
return 0;
}
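
The generator above emits one byte per `putchar` call with the length hard-coded. As a minimal sketch only, and not part of the bzip2 distribution, the same stream could be produced by a parameterized helper, which makes it easier to try lengths on either side of the roughly 48MB threshold mentioned in the CHANGES entry. The name `write_byte_run` and the 4096-byte chunk size are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of bzip2): write `count` copies of the
   byte `value` to `out` in fixed-size chunks.
   Returns 0 on success, -1 if a write comes up short. */
static int write_byte_run(FILE *out, unsigned char value, unsigned long count)
{
    unsigned char chunk[4096];   /* chunk size is an arbitrary choice */

    memset(chunk, value, sizeof chunk);
    while (count > 0) {
        /* Write a full chunk, or whatever remains on the last pass. */
        size_t n = count < sizeof chunk ? (size_t)count : sizeof chunk;
        if (fwrite(chunk, 1, n, out) != n)
            return -1;
        count -= n;
    }
    return 0;
}
```

Calling `write_byte_run(stdout, 251, 48500000UL)` from `main` should yield the same bytes as mk251.c, and the return value lets the caller notice a failed write, which the `putchar` loop does not check.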

dist/bzip2/randtable.c vendored

@ -1,4 +1,4 @@
/* $NetBSD: randtable.c,v 1.1.1.1 2001/06/03 13:03:06 simonb Exp $ */
/* $NetBSD: randtable.c,v 1.1.1.2 2002/03/15 01:35:29 mjl Exp $ */
/*-------------------------------------------------------------*/
@ -10,7 +10,7 @@
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.
Copyright (C) 1996-2000 Julian R Seward. All rights reserved.
Copyright (C) 1996-2002 Julian R Seward. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions

4
dist/bzip2/words3 vendored

@ -15,8 +15,8 @@ not actually execute them.
Instructions for use are in the preformatted manual page, in the file
bzip2.txt. For more detailed documentation, read the full manual.
It is available in Postscript form (manual.ps) and HTML form
(manual_toc.html).
It is available in Postscript form (manual.ps), PDF form (manual.pdf),
and HTML form (manual_toc.html).
You can also do "bzip2 --help" to see some helpful information.
"bzip2 -L" displays the software license.