Initial import of bzip2-0.9.0b from Julian Seward.
parent 27f3588d7b
commit 6610fc2af7
39
lib/libbz2/LICENSE
Normal file
@@ -0,0 +1,39 @@

This program, "bzip2" and associated library "libbzip2", are
copyright (C) 1996-1998 Julian R Seward. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.

2. The origin of this software must not be misrepresented; you must
   not claim that you wrote the original software. If you use this
   software in a product, an acknowledgment in the product
   documentation would be appreciated but is not required.

3. Altered source versions must be plainly marked as such, and must
   not be misrepresented as being the original software.

4. The name of the author may not be used to endorse or promote
   products derived from this software without specific prior written
   permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Julian Seward, Guildford, Surrey, UK.
jseward@acm.org
bzip2/libbzip2 version 0.9.0 of 28 June 1998

137
lib/libbz2/README
Normal file
@@ -0,0 +1,137 @@

This is the README for bzip2, a block-sorting file compressor, version
0.9.0. This version is fully compatible with the previous public
release, bzip2-0.1pl2.

bzip2-0.9.0 is distributed under a BSD-style license. For details,
see the file LICENSE.

Complete documentation is available in Postscript form (manual.ps)
or html (manual_toc.html). A plain-text version of the manual page is
available as bzip2.txt.


HOW TO BUILD -- UNIX

Type `make'.

This creates binaries "bzip2" and "bzip2recover".

It also runs four compress-decompress tests to make sure things are
working properly. If all goes well, you should be up & running.
Please be sure to read the output from `make' just to be sure that the
tests went ok.

To install bzip2 properly:

* Copy the binaries "bzip2" and "bzip2recover" to a publicly visible
  place, possibly /usr/bin or /usr/local/bin.

* In that directory, make "bunzip2" and "bzcat" be symbolic links
  to "bzip2".

* Copy the manual page, bzip2.1, to the relevant place.
  Probably the right place is /usr/man/man1/.

If you want to program with the library, you'll need to copy libbz2.a
and bzlib.h to /usr/lib and /usr/include respectively.


HOW TO BUILD -- Windows 95, NT, DOS, Mac, etc.

It's difficult for me to support compilation on all these platforms.
My approach is to collect binaries for these platforms, and put them
on my web page (http://www.muraroa.demon.co.uk). Look there.


VALIDATION

Correct operation, in the sense that a compressed file can always be
decompressed to reproduce the original, is obviously of paramount
importance. To validate bzip2, I used a modified version of Mark
Nelson's churn program. Churn is an automated test driver which
recursively traverses a directory structure, using bzip2 to compress
and then decompress each file it encounters, and checking that the
decompressed data is the same as the original. There are more details
in Section 4 of the user guide.



Please read and be aware of the following:

WARNING:

   This program (attempts to) compress data by performing several
   non-trivial transformations on it. Unless you are 100% familiar
   with *all* the algorithms contained herein, and with the
   consequences of modifying them, you should NOT meddle with the
   compression or decompression machinery. Incorrect changes can and
   very likely *will* lead to disastrous loss of data.


DISCLAIMER:

   I TAKE NO RESPONSIBILITY FOR ANY LOSS OF DATA ARISING FROM THE
   USE OF THIS PROGRAM, HOWSOEVER CAUSED.

   Every compression of a file implies an assumption that the
   compressed file can be decompressed to reproduce the original.
   Great efforts in design, coding and testing have been made to
   ensure that this program works correctly. However, the complexity
   of the algorithms, and, in particular, the presence of various
   special cases in the code which occur with very low but non-zero
   probability make it impossible to rule out the possibility of bugs
   remaining in the program. DO NOT COMPRESS ANY DATA WITH THIS
   PROGRAM UNLESS YOU ARE PREPARED TO ACCEPT THE POSSIBILITY, HOWEVER
   SMALL, THAT THE DATA WILL NOT BE RECOVERABLE.

   That is not to say this program is inherently unreliable. Indeed,
   I very much hope the opposite is true. bzip2 has been carefully
   constructed and extensively tested.


PATENTS:

   To the best of my knowledge, bzip2 does not use any patented
   algorithms. However, I do not have the resources available to
   carry out a full patent search. Therefore I cannot give any
   guarantee of the above statement.

End of legalities.


WHAT'S NEW IN 0.9.0 (as compared to 0.1pl2) ?

* Approx 10% faster compression, 30% faster decompression
* -t (test mode) is a lot quicker
* Can decompress concatenated compressed files
* Programming interface, so programs can directly read/write .bz2 files
* Less restrictive (BSD-style) licensing
* Flag handling more compatible with GNU gzip
* Much more documentation, i.e., a proper user manual
* Hopefully, improved portability (at least of the library)


I hope you find bzip2 useful. Feel free to contact me at
   jseward@acm.org
if you have any suggestions or queries. Many people mailed me with
comments, suggestions and patches after the releases of bzip-0.15,
bzip-0.21 and bzip2-0.1pl2, and the changes in bzip2 are largely a
result of this feedback. I thank you for your comments.

At least for the time being, bzip2's "home" is
http://www.muraroa.demon.co.uk.

Julian Seward
jseward@acm.org

Manchester, UK
18 July 1996 (version 0.15)
25 August 1996 (version 0.21)

Guildford, Surrey, UK
7 August 1997 (bzip2, version 0.1)
29 August 1997 (bzip2, version 0.1pl2)
23 August 1998 (bzip2, version 0.9.0)

709
lib/libbz2/blocksort.c
Normal file
@@ -0,0 +1,709 @@

/*-------------------------------------------------------------*/
/*--- Block sorting machinery                               ---*/
/*---                                           blocksort.c ---*/
/*-------------------------------------------------------------*/

/*--
  This file is a part of bzip2 and/or libbzip2, a program and
  library for lossless, block-sorting data compression.

  Copyright (C) 1996-1998 Julian R Seward. All rights reserved.

  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions
  are met:

  1. Redistributions of source code must retain the above copyright
     notice, this list of conditions and the following disclaimer.

  2. The origin of this software must not be misrepresented; you must
     not claim that you wrote the original software. If you use this
     software in a product, an acknowledgment in the product
     documentation would be appreciated but is not required.

  3. Altered source versions must be plainly marked as such, and must
     not be misrepresented as being the original software.

  4. The name of the author may not be used to endorse or promote
     products derived from this software without specific prior written
     permission.

  THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
  OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
  WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
  DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
  GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
  WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
  NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
  SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

  Julian Seward, Guildford, Surrey, UK.
  jseward@acm.org
  bzip2/libbzip2 version 0.9.0 of 28 June 1998

  This program is based on (at least) the work of:
     Mike Burrows
     David Wheeler
     Peter Fenwick
     Alistair Moffat
     Radford Neal
     Ian H. Witten
     Robert Sedgewick
     Jon L. Bentley

  For more information on these sources, see the manual.
--*/


#include "bzlib_private.h"

/*---------------------------------------------*/
/*--
   Compare two strings in block. We assume (see
   discussion above) that i1 and i2 have a max
   offset of 10 on entry, and that the first
   bytes of both block and quadrant have been
   copied into the "overshoot area", ie
   into the subscript range
   [nblock .. nblock+NUM_OVERSHOOT_BYTES-1].
--*/
static __inline__ Bool fullGtU ( UChar*  block,
                                 UInt16* quadrant,
                                 UInt32  nblock,
                                 Int32*  workDone,
                                 Int32   i1,
                                 Int32   i2
                               )
{
   Int32 k;
   UChar c1, c2;
   UInt16 s1, s2;

   AssertD ( i1 != i2, "fullGtU(1)" );

   c1 = block[i1];
   c2 = block[i2];
   if (c1 != c2) return (c1 > c2);
   i1++; i2++;

   c1 = block[i1];
   c2 = block[i2];
   if (c1 != c2) return (c1 > c2);
   i1++; i2++;

   c1 = block[i1];
   c2 = block[i2];
   if (c1 != c2) return (c1 > c2);
   i1++; i2++;

   c1 = block[i1];
   c2 = block[i2];
   if (c1 != c2) return (c1 > c2);
   i1++; i2++;

   c1 = block[i1];
   c2 = block[i2];
   if (c1 != c2) return (c1 > c2);
   i1++; i2++;

   c1 = block[i1];
   c2 = block[i2];
   if (c1 != c2) return (c1 > c2);
   i1++; i2++;

   k = nblock;

   do {

      c1 = block[i1];
      c2 = block[i2];
      if (c1 != c2) return (c1 > c2);
      s1 = quadrant[i1];
      s2 = quadrant[i2];
      if (s1 != s2) return (s1 > s2);
      i1++; i2++;

      c1 = block[i1];
      c2 = block[i2];
      if (c1 != c2) return (c1 > c2);
      s1 = quadrant[i1];
      s2 = quadrant[i2];
      if (s1 != s2) return (s1 > s2);
      i1++; i2++;

      c1 = block[i1];
      c2 = block[i2];
      if (c1 != c2) return (c1 > c2);
      s1 = quadrant[i1];
      s2 = quadrant[i2];
      if (s1 != s2) return (s1 > s2);
      i1++; i2++;

      c1 = block[i1];
      c2 = block[i2];
      if (c1 != c2) return (c1 > c2);
      s1 = quadrant[i1];
      s2 = quadrant[i2];
      if (s1 != s2) return (s1 > s2);
      i1++; i2++;

      if (i1 >= nblock) i1 -= nblock;
      if (i2 >= nblock) i2 -= nblock;

      k -= 4;
      (*workDone)++;
   }
      while (k >= 0);

   return False;
}
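
The comparison performed by fullGtU() can be hard to see through the unrolling, the overshoot area and the quadrant cache. The following is a minimal sketch of the underlying idea only: deciding whether one rotation of the block is lexicographically greater than another. The name rotation_gt is hypothetical, and the sketch wraps indices with a modulo on every byte instead of relying on an overshoot area, so it is much slower than the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of bzip2.
   Returns 1 if the rotation of block[0..n-1] that starts at i1 is
   lexicographically greater than the rotation that starts at i2,
   and 0 otherwise (including when the two rotations are identical). */
int rotation_gt(const unsigned char *block, size_t n, size_t i1, size_t i2)
{
   size_t k;
   assert(n > 0 && i1 < n && i2 < n);
   for (k = 0; k < n; k++) {
      /* Wrap around the end of the block instead of using overshoot bytes. */
      unsigned char c1 = block[(i1 + k) % n];
      unsigned char c2 = block[(i2 + k) % n];
      if (c1 != c2) return c1 > c2;
   }
   return 0;
}
```

For the block "banana", rotation 0 ("banana") compares greater than rotation 1 ("ananab"), which is the same answer fullGtU() would be expected to give once the overshoot bytes are in place.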

/*---------------------------------------------*/
/*--
   Knuth's increments seem to work better
   than Incerpi-Sedgewick here. Possibly
   because the number of elems to sort is
   usually small, typically <= 20.
--*/
static Int32 incs[14] = { 1, 4, 13, 40, 121, 364, 1093, 3280,
                          9841, 29524, 88573, 265720,
                          797161, 2391484 };

static void simpleSort ( EState* s, Int32 lo, Int32 hi, Int32 d )
{
   Int32 i, j, h, bigN, hp;
   Int32 v;

   UChar*  block        = s->block;
   UInt32* zptr         = s->zptr;
   UInt16* quadrant     = s->quadrant;
   Int32*  workDone     = &(s->workDone);
   Int32   nblock       = s->nblock;
   Int32   workLimit    = s->workLimit;
   Bool    firstAttempt = s->firstAttempt;

   bigN = hi - lo + 1;
   if (bigN < 2) return;

   hp = 0;
   while (incs[hp] < bigN) hp++;
   hp--;

   for (; hp >= 0; hp--) {
      h = incs[hp];
      i = lo + h;
      while (True) {

         /*-- copy 1 --*/
         if (i > hi) break;
         v = zptr[i];
         j = i;
         while ( fullGtU ( block, quadrant, nblock, workDone,
                           zptr[j-h]+d, v+d ) ) {
            zptr[j] = zptr[j-h];
            j = j - h;
            if (j <= (lo + h - 1)) break;
         }
         zptr[j] = v;
         i++;

         /*-- copy 2 --*/
         if (i > hi) break;
         v = zptr[i];
         j = i;
         while ( fullGtU ( block, quadrant, nblock, workDone,
                           zptr[j-h]+d, v+d ) ) {
            zptr[j] = zptr[j-h];
            j = j - h;
            if (j <= (lo + h - 1)) break;
         }
         zptr[j] = v;
         i++;

         /*-- copy 3 --*/
         if (i > hi) break;
         v = zptr[i];
         j = i;
         while ( fullGtU ( block, quadrant, nblock, workDone,
                           zptr[j-h]+d, v+d ) ) {
            zptr[j] = zptr[j-h];
            j = j - h;
            if (j <= (lo + h - 1)) break;
         }
         zptr[j] = v;
         i++;

         if (*workDone > workLimit && firstAttempt) return;
      }
   }
}
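
simpleSort() is a Shell sort driven by Knuth's 3h+1 increments, with the inner insertion step unrolled three times and fullGtU() as the comparison. A minimal sketch of the same control structure on a plain array of ints may make the shape easier to follow. The names shell_sort_ints and knuth_incs are hypothetical, the increment table is truncated, and there is no work limit or unrolling.

```c
#include <assert.h>

/* Knuth's increments h -> 3h+1, a truncated copy of the incs[] idea. */
static const int knuth_incs[] = { 1, 4, 13, 40, 121, 364, 1093 };

/* Hypothetical helper: sorts a[0..n-1] into ascending order.
   Limited to n <= 1093 so the increment search stays inside the table. */
void shell_sort_ints(int *a, int n)
{
   int hp, h, i, j, v;
   assert(n >= 0 && n <= 1093);
   if (n < 2) return;

   /* Find the largest increment smaller than n. */
   hp = 0;
   while (knuth_incs[hp] < n) hp++;
   hp--;

   for (; hp >= 0; hp--) {
      h = knuth_incs[hp];
      /* Insertion sort over elements that are h apart. */
      for (i = h; i < n; i++) {
         v = a[i];
         j = i;
         while (j >= h && a[j-h] > v) {
            a[j] = a[j-h];
            j -= h;
         }
         a[j] = v;
      }
   }
}
```

The final pass always uses h = 1, which is an ordinary insertion sort, so the result is fully sorted regardless of how well the earlier passes did.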


/*---------------------------------------------*/
/*--
   The following is an implementation of
   an elegant 3-way quicksort for strings,
   described in a paper "Fast Algorithms for
   Sorting and Searching Strings", by Robert
   Sedgewick and Jon L. Bentley.
--*/

#define swap(lv1, lv2) \
   { Int32 tmp = lv1; lv1 = lv2; lv2 = tmp; }

static void vswap ( UInt32* zptr, Int32 p1, Int32 p2, Int32 n )
{
   while (n > 0) {
      swap(zptr[p1], zptr[p2]);
      p1++; p2++; n--;
   }
}

static UChar med3 ( UChar a, UChar b, UChar c )
{
   UChar t;
   if (a > b) { t = a; a = b; b = t; };
   if (b > c) { t = b; b = c; c = t; };
   if (a > b)          b = a;
   return b;
}


/*-- Fully parenthesised so the expansion is safe inside larger expressions. --*/
#define min(a,b) (((a) < (b)) ? (a) : (b))

typedef
   struct { Int32 ll; Int32 hh; Int32 dd; }
   StackElem;

#define push(lz,hz,dz) { stack[sp].ll = lz; \
                         stack[sp].hh = hz; \
                         stack[sp].dd = dz; \
                         sp++; }

#define pop(lz,hz,dz) { sp--;               \
                        lz = stack[sp].ll;  \
                        hz = stack[sp].hh;  \
                        dz = stack[sp].dd; }

#define SMALL_THRESH 20
#define DEPTH_THRESH 10

/*--
   If you are ever unlucky/improbable enough
   to get a stack overflow whilst sorting,
   increase the following constant and try
   again. In practice I have never seen the
   stack go above 27 elems, so the following
   limit seems very generous.
--*/
#define QSORT_STACK_SIZE 1000


static void qSort3 ( EState* s, Int32 loSt, Int32 hiSt, Int32 dSt )
{
   Int32 unLo, unHi, ltLo, gtHi, med, n, m;
   Int32 sp, lo, hi, d;
   StackElem stack[QSORT_STACK_SIZE];

   UChar*  block        = s->block;
   UInt32* zptr         = s->zptr;
   Int32*  workDone     = &(s->workDone);
   Int32   workLimit    = s->workLimit;
   Bool    firstAttempt = s->firstAttempt;

   sp = 0;
   push ( loSt, hiSt, dSt );

   while (sp > 0) {

      AssertH ( sp < QSORT_STACK_SIZE, 1001 );

      pop ( lo, hi, d );

      if (hi - lo < SMALL_THRESH || d > DEPTH_THRESH) {
         simpleSort ( s, lo, hi, d );
         if (*workDone > workLimit && firstAttempt) return;
         continue;
      }

      med = med3 ( block[zptr[ lo         ]+d],
                   block[zptr[ hi         ]+d],
                   block[zptr[ (lo+hi)>>1 ]+d] );

      unLo = ltLo = lo;
      unHi = gtHi = hi;

      while (True) {
         while (True) {
            if (unLo > unHi) break;
            n = ((Int32)block[zptr[unLo]+d]) - med;
            if (n == 0) { swap(zptr[unLo], zptr[ltLo]); ltLo++; unLo++; continue; };
            if (n >  0) break;
            unLo++;
         }
         while (True) {
            if (unLo > unHi) break;
            n = ((Int32)block[zptr[unHi]+d]) - med;
            if (n == 0) { swap(zptr[unHi], zptr[gtHi]); gtHi--; unHi--; continue; };
            if (n <  0) break;
            unHi--;
         }
         if (unLo > unHi) break;
         swap(zptr[unLo], zptr[unHi]); unLo++; unHi--;
      }

      AssertD ( unHi == unLo-1, "bad termination in qSort3" );

      if (gtHi < ltLo) {
         push(lo, hi, d+1 );
         continue;
      }

      n = min(ltLo-lo, unLo-ltLo); vswap(zptr, lo, unLo-n, n);
      m = min(hi-gtHi, gtHi-unHi); vswap(zptr, unLo, hi-m+1, m);

      n = lo + unLo - ltLo - 1;
      m = hi - (gtHi - unHi) + 1;

      push ( lo, n, d );
      push ( n+1, m-1, d+1 );
      push ( m, hi, d );
   }
}
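
qSort3() is an explicit-stack version of the Bentley-Sedgewick 3-way string quicksort: partition on the byte at depth d, then sort the "less" and "greater" parts at the same depth and the "equal" part at depth d+1. The recursive sketch below shows that structure on NUL-terminated C strings. It is an illustration under stated assumptions, not the code above: the names mkqsort, swap_str and is_sorted_strings are hypothetical, the pivot is simply the first element rather than a median of three, the partitioning is the simpler single-pass 3-way scheme rather than the "equal keys parked at both ends, then vswap" scheme, and there is no fallback to a Shell sort for small or deep ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static void swap_str(const char **a, int i, int j)
{
   const char *t = a[i]; a[i] = a[j]; a[j] = t;
}

/* Hypothetical helper: sorts a[lo..hi], all of which are assumed to
   share their first d bytes, into ascending byte order. */
void mkqsort(const char **a, int lo, int hi, int d)
{
   int lt, gt, i, v, c;
   assert(a != NULL);
   if (hi <= lo) return;

   lt = lo; gt = hi; i = lo + 1;
   v = (unsigned char)a[lo][d];           /* pivot byte at depth d */
   while (i <= gt) {
      c = (unsigned char)a[i][d];
      if      (c < v) swap_str(a, lt++, i++);
      else if (c > v) swap_str(a, i, gt--);
      else            i++;
   }

   /* Now a[lo..lt-1] < v, a[lt..gt] == v, a[gt+1..hi] > v at depth d. */
   mkqsort(a, lo, lt - 1, d);
   if (v != 0) mkqsort(a, lt, gt, d + 1); /* v == 0 means end of string */
   mkqsort(a, gt + 1, hi, d);
}

/* Returns 1 if a[0..n-1] is in non-decreasing strcmp order. */
int is_sorted_strings(const char **a, int n)
{
   int i;
   for (i = 1; i < n; i++)
      if (strcmp(a[i-1], a[i]) > 0) return 0;
   return 1;
}
```

The three recursive calls correspond to the three push() calls at the end of the loop in qSort3(); only the middle one advances the depth.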


/*---------------------------------------------*/

#define BIGFREQ(b) (ftab[((b)+1) << 8] - ftab[(b) << 8])

#define SETMASK (1 << 21)
#define CLEARMASK (~(SETMASK))

static void sortMain ( EState* s )
{
   Int32 i, j, k, ss, sb;
   Int32 runningOrder[256];
   Int32 copy[256];
   Bool  bigDone[256];
   UChar c1, c2;
   Int32 numQSorted;

   UChar*  block        = s->block;
   UInt32* zptr         = s->zptr;
   UInt16* quadrant     = s->quadrant;
   Int32*  ftab         = s->ftab;
   Int32*  workDone     = &(s->workDone);
   Int32   nblock       = s->nblock;
   Int32   workLimit    = s->workLimit;
   Bool    firstAttempt = s->firstAttempt;

   /*--
      In the various block-sized structures, live data runs
      from 0 to last+NUM_OVERSHOOT_BYTES inclusive. First,
      set up the overshoot area for block.
   --*/

   if (s->verbosity >= 4)
      VPrintf0( " sort initialise ...\n" );

   for (i = 0; i < BZ_NUM_OVERSHOOT_BYTES; i++)
      block[nblock+i] = block[i % nblock];
   for (i = 0; i < nblock+BZ_NUM_OVERSHOOT_BYTES; i++)
      quadrant[i] = 0;


   if (nblock <= 4000) {

      /*--
         Use simpleSort(), since the full sorting mechanism
         has quite a large constant overhead.
      --*/
      if (s->verbosity >= 4) VPrintf0( " simpleSort ...\n" );
      for (i = 0; i < nblock; i++) zptr[i] = i;
      firstAttempt = False;
      *workDone = workLimit = 0;
      simpleSort ( s, 0, nblock-1, 0 );
      if (s->verbosity >= 4) VPrintf0( " simpleSort done.\n" );

   } else {

      numQSorted = 0;
      for (i = 0; i <= 255; i++) bigDone[i] = False;

      if (s->verbosity >= 4) VPrintf0( " bucket sorting ...\n" );

      for (i = 0; i <= 65536; i++) ftab[i] = 0;

      c1 = block[nblock-1];
      for (i = 0; i < nblock; i++) {
         c2 = block[i];
         ftab[(c1 << 8) + c2]++;
         c1 = c2;
      }

      for (i = 1; i <= 65536; i++) ftab[i] += ftab[i-1];

      c1 = block[0];
      for (i = 0; i < nblock-1; i++) {
         c2 = block[i+1];
         j = (c1 << 8) + c2;
         c1 = c2;
         ftab[j]--;
         zptr[ftab[j]] = i;
      }
      j = (block[nblock-1] << 8) + block[0];
      ftab[j]--;
      zptr[ftab[j]] = nblock-1;

      /*--
         Now ftab contains the first loc of every small bucket.
         Calculate the running order, from smallest to largest
         big bucket.
      --*/

      for (i = 0; i <= 255; i++) runningOrder[i] = i;

      {
         Int32 vv;
         Int32 h = 1;
         do h = 3 * h + 1; while (h <= 256);
         do {
            h = h / 3;
            for (i = h; i <= 255; i++) {
               vv = runningOrder[i];
               j = i;
               while ( BIGFREQ(runningOrder[j-h]) > BIGFREQ(vv) ) {
                  runningOrder[j] = runningOrder[j-h];
                  j = j - h;
                  if (j <= (h - 1)) goto zero;
               }
               zero:
               runningOrder[j] = vv;
            }
         } while (h != 1);
      }

      /*--
         The main sorting loop.
      --*/

      for (i = 0; i <= 255; i++) {

         /*--
            Process big buckets, starting with the least full.
            Basically this is a 4-step process in which we call
            qSort3 to sort the small buckets [ss, j], but
            also make a big effort to avoid the calls if we can.
         --*/
         ss = runningOrder[i];

         /*--
            Step 1:
            Complete the big bucket [ss] by quicksorting
            any unsorted small buckets [ss, j], for j != ss.
            Hopefully previous pointer-scanning phases have already
            completed many of the small buckets [ss, j], so
            we don't have to sort them at all.
         --*/
         for (j = 0; j <= 255; j++) {
            if (j != ss) {
               sb = (ss << 8) + j;
               if ( ! (ftab[sb] & SETMASK) ) {
                  Int32 lo = ftab[sb]   & CLEARMASK;
                  Int32 hi = (ftab[sb+1] & CLEARMASK) - 1;
                  if (hi > lo) {
                     if (s->verbosity >= 4)
                        VPrintf4( " qsort [0x%x, 0x%x] done %d this %d\n",
                                  ss, j, numQSorted, hi - lo + 1 );
                     qSort3 ( s, lo, hi, 2 );
                     numQSorted += ( hi - lo + 1 );
                     if (*workDone > workLimit && firstAttempt) return;
                  }
               }
               ftab[sb] |= SETMASK;
            }
         }

         /*--
            Step 2:
            Deal specially with case [ss, ss]. This establishes the
            sorted order for [ss, ss] without any comparisons.
            A clever trick, cryptically described as steps Q6b and Q6c
            in SRC-124 (aka BW94). This makes it entirely practical to
            not use a preliminary run-length coder, but unfortunately
            we are now stuck with the .bz2 file format.
         --*/
         {
            Int32 put0, get0, put1, get1;
            Int32 sbn = (ss << 8) + ss;
            Int32 lo = ftab[sbn] & CLEARMASK;
            Int32 hi = (ftab[sbn+1] & CLEARMASK) - 1;
            UChar ssc = (UChar)ss;
            put0 = lo;
            get0 = ftab[ss << 8] & CLEARMASK;
            put1 = hi;
            get1 = (ftab[(ss+1) << 8] & CLEARMASK) - 1;
            while (get0 < put0) {
               j = zptr[get0]-1; if (j < 0) j += nblock;
               c1 = block[j];
               if (c1 == ssc) { zptr[put0] = j; put0++; };
               get0++;
            }
            while (get1 > put1) {
               j = zptr[get1]-1; if (j < 0) j += nblock;
               c1 = block[j];
               if (c1 == ssc) { zptr[put1] = j; put1--; };
               get1--;
            }
            ftab[sbn] |= SETMASK;
         }

         /*--
            Step 3:
            The [ss] big bucket is now done. Record this fact,
            and update the quadrant descriptors. Remember to
            update quadrants in the overshoot area too, if
            necessary. The "if (i < 255)" test merely skips
            this updating for the last bucket processed, since
            updating for the last bucket is pointless.

            The quadrant array provides a way to incrementally
            cache sort orderings, as they appear, so as to
            make subsequent comparisons in fullGtU() complete
            faster. For repetitive blocks this makes a big
            difference (but not big enough to be able to avoid
            randomisation for very repetitive data.)

            The precise meaning is: at all times:

               for 0 <= i < nblock and 0 <= j <= nblock

               if block[i] != block[j],

                  then the relative values of quadrant[i] and
                       quadrant[j] are meaningless.

                  else {
                     if quadrant[i] < quadrant[j]
                        then the string starting at i lexicographically
                        precedes the string starting at j

                     else if quadrant[i] > quadrant[j]
                        then the string starting at j lexicographically
                        precedes the string starting at i

                     else
                        the relative ordering of the strings starting
                        at i and j has not yet been determined.
                  }
         --*/
         bigDone[ss] = True;

         if (i < 255) {
            Int32 bbStart = ftab[ss << 8] & CLEARMASK;
            Int32 bbSize  = (ftab[(ss+1) << 8] & CLEARMASK) - bbStart;
            Int32 shifts  = 0;

            while ((bbSize >> shifts) > 65534) shifts++;

            for (j = 0; j < bbSize; j++) {
               Int32 a2update = zptr[bbStart + j];
               UInt16 qVal = (UInt16)(j >> shifts);
               quadrant[a2update] = qVal;
               if (a2update < BZ_NUM_OVERSHOOT_BYTES)
                  quadrant[a2update + nblock] = qVal;
            }

            AssertH ( ( ((bbSize-1) >> shifts) <= 65535 ), 1002 );
         }

         /*--
            Step 4:
            Now scan this big bucket [ss] so as to synthesise the
            sorted order for small buckets [t, ss] for all t != ss.
            This will avoid doing Real Work in subsequent Step 1's.
         --*/
         for (j = 0; j <= 255; j++)
            copy[j] = ftab[(j << 8) + ss] & CLEARMASK;

         for (j = ftab[ss << 8] & CLEARMASK;
              j < (ftab[(ss+1) << 8] & CLEARMASK);
              j++) {
            k = zptr[j]-1; if (k < 0) k += nblock;
            c1 = block[k];
            if ( ! bigDone[c1] ) {
               zptr[copy[c1]] = k;
               copy[c1] ++;
            }
         }

         for (j = 0; j <= 255; j++) ftab[(j << 8) + ss] |= SETMASK;
      }
      if (s->verbosity >= 4)
         VPrintf3( " %d pointers, %d sorted, %d scanned\n",
                   nblock, numQSorted, nblock - numQSorted );
   }
}
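
The bucket-sorting phase of sortMain() starts by counting every two-byte pair in the (cyclic) block and turning the counts into bucket boundaries in ftab. The sketch below shows one common way to do that with a single prefix sum. It is a variant, not a transcription: sortMain() counts into ftab[j], accumulates to get bucket ends, and then decrements each entry while filling zptr, whereas this sketch counts into ftab[j+1] so that the prefix sum yields bucket starts directly. The name pair_bucket_starts is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper.  ftab must have room for 65537 ints.
   On return, for every pair index j = (c1 << 8) + c2:
      ftab[j]   = number of cyclic byte pairs that sort before (c1, c2)
      ftab[j+1] = one past the last slot of the (c1, c2) bucket
   so ftab[j+1] - ftab[j] is the size of that small bucket. */
void pair_bucket_starts(const unsigned char *block, int n, int *ftab)
{
   int i;
   assert(block != NULL && ftab != NULL && n > 0);

   for (i = 0; i <= 65536; i++) ftab[i] = 0;

   /* Count each pair (block[i], block[i+1]), wrapping at the end. */
   for (i = 0; i < n; i++) {
      int c1 = block[i];
      int c2 = block[(i + 1) % n];
      ftab[((c1 << 8) + c2) + 1]++;
   }

   /* Prefix sum: counts become starting positions. */
   for (i = 1; i <= 65536; i++) ftab[i] += ftab[i-1];
}
```

With bucket boundaries in this form, a macro in the style of BIGFREQ(b) still works: the size of big bucket b is ftab[(b+1) << 8] - ftab[b << 8].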


/*---------------------------------------------*/
static void randomiseBlock ( EState* s )
{
   Int32 i;
   BZ_RAND_INIT_MASK;
   for (i = 0; i < 256; i++) s->inUse[i] = False;

   for (i = 0; i < s->nblock; i++) {
      BZ_RAND_UPD_MASK;
      s->block[i] ^= BZ_RAND_MASK;
      s->inUse[s->block[i]] = True;
   }
}
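
randomiseBlock() perturbs the block by XORing each byte with a mask value. Because XOR is its own inverse, applying the same mask sequence a second time restores the original bytes, which is what makes such a perturbation reversible. The sketch below demonstrates only that property. The mask rule in toy_mask (flip the low bit of every 50th byte) is an invented stand-in and is not the sequence produced by the BZ_RAND_* macros, whose definitions are not shown in this file; toy_randomise is likewise a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Invented mask rule for illustration only: 1 at positions 49, 99, ... */
static unsigned char toy_mask(int i)
{
   return (unsigned char)((i % 50 == 49) ? 1 : 0);
}

/* XOR every byte with its mask value.  Calling this twice on the same
   buffer returns the buffer to its original contents. */
void toy_randomise(unsigned char *block, int n)
{
   int i;
   assert(block != NULL && n >= 0);
   for (i = 0; i < n; i++)
      block[i] ^= toy_mask(i);
}
```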


/*---------------------------------------------*/
void blockSort ( EState* s )
{
   Int32 i;

   s->workLimit       = s->workFactor * (s->nblock - 1);
   s->workDone        = 0;
   s->blockRandomised = False;
   s->firstAttempt    = True;

   sortMain ( s );

   if (s->verbosity >= 3)
      VPrintf3( " %d work, %d block, ratio %5.2f\n",
                s->workDone, s->nblock-1,
                (float)(s->workDone) / (float)(s->nblock-1) );

   if (s->workDone > s->workLimit && s->firstAttempt) {
      if (s->verbosity >= 2)
         VPrintf0( " sorting aborted; randomising block\n" );
      randomiseBlock ( s );
      s->workLimit = s->workDone = 0;
      s->blockRandomised = True;
      s->firstAttempt = False;
      sortMain ( s );
      if (s->verbosity >= 3)
         VPrintf3( " %d work, %d block, ratio %f\n",
                   s->workDone, s->nblock-1,
                   (float)(s->workDone) / (float)(s->nblock-1) );
   }

   s->origPtr = -1;
   for (i = 0; i < s->nblock; i++)
      if (s->zptr[i] == 0)
         { s->origPtr = i; break; };

   AssertH( s->origPtr != -1, 1003 );
}
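
The end result of blockSort() is a zptr array holding the starting positions of all rotations of the block in sorted order, plus origPtr, the position in that order of the rotation starting at 0. A slow reference version that produces the same two outputs for small inputs can be written with the standard qsort(); something like it could serve as a cross-check when experimenting. This is a sketch under stated assumptions: naive_block_sort and cmp_rot are hypothetical names, the comparator reaches the block through file-scope variables because qsort() has no context argument, and nothing here reflects the work limit or randomisation fallback.

```c
#include <assert.h>
#include <stdlib.h>

/* Context for the comparator; qsort() offers no user-data parameter. */
static const unsigned char *g_block;
static int g_n;

/* Orders two rotation start indices by the rotations they denote. */
static int cmp_rot(const void *pa, const void *pb)
{
   int i1 = *(const int *)pa;
   int i2 = *(const int *)pb;
   int k;
   for (k = 0; k < g_n; k++) {
      unsigned char c1 = g_block[(i1 + k) % g_n];
      unsigned char c2 = g_block[(i2 + k) % g_n];
      if (c1 != c2) return (c1 > c2) ? 1 : -1;
   }
   return 0;
}

/* Fills zptr[0..n-1] with rotation start indices in sorted order and
   returns the position at which rotation 0 ended up (the analogue of
   origPtr), or -1 if it was not found. */
int naive_block_sort(const unsigned char *block, int n, int *zptr)
{
   int i, origPtr = -1;
   assert(block != NULL && zptr != NULL && n > 0);

   g_block = block;
   g_n = n;
   for (i = 0; i < n; i++) zptr[i] = i;
   qsort(zptr, (size_t)n, sizeof zptr[0], cmp_rot);

   for (i = 0; i < n; i++)
      if (zptr[i] == 0) { origPtr = i; break; }
   return origPtr;
}
```

For "banana" the sorted rotations start at 5, 3, 1, 0, 4, 2, so the function returns 3.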

/*-------------------------------------------------------------*/
/*--- end                                       blocksort.c ---*/
/*-------------------------------------------------------------*/

355
lib/libbz2/bzip2.txt
Normal file
@ -0,0 +1,355 @@
|
||||
|
||||
bzip2(1) bzip2(1)
|
||||
|
||||
|
||||
NAME
|
||||
bzip2, bunzip2 - a block-sorting file compressor, v0.9.0
|
||||
bzcat - decompresses files to stdout
|
||||
bzip2recover - recovers data from damaged bzip2 files
|
||||
|
||||
|
||||
SYNOPSIS
|
||||
bzip2 [ -cdfkstvzVL123456789 ] [ filenames ... ]
|
||||
bunzip2 [ -fkvsVL ] [ filenames ... ]
|
||||
bzcat [ -s ] [ filenames ... ]
|
||||
bzip2recover filename
|
||||
|
||||
|
||||
DESCRIPTION
|
||||
bzip2 compresses files using the Burrows-Wheeler block-
|
||||
sorting text compression algorithm, and Huffman coding.
|
||||
Compression is generally considerably better than that
|
||||
achieved by more conventional LZ77/LZ78-based compressors,
|
||||
and approaches the performance of the PPM family of sta-
|
||||
tistical compressors.
|
||||
|
||||
The command-line options are deliberately very similar to
|
||||
those of GNU Gzip, but they are not identical.
|
||||
|
||||
bzip2 expects a list of file names to accompany the com-
|
||||
mand-line flags. Each file is replaced by a compressed
|
||||
version of itself, with the name "original_name.bz2".
|
||||
Each compressed file has the same modification date and
|
||||
permissions as the corresponding original, so that these
|
||||
properties can be correctly restored at decompression
|
||||
time. File name handling is naive in the sense that there
|
||||
is no mechanism for preserving original file names, per-
|
||||
missions and dates in filesystems which lack these con-
|
||||
cepts, or have serious file name length restrictions, such
|
||||
as MS-DOS.
|
||||
|
||||
bzip2 and bunzip2 will by default not overwrite existing
|
||||
files; if you want this to happen, specify the -f flag.
|
||||
|
||||
If no file names are specified, bzip2 compresses from
|
||||
standard input to standard output. In this case, bzip2
|
||||
will decline to write compressed output to a terminal, as
|
||||
this would be entirely incomprehensible and therefore
|
||||
pointless.
|
||||
|
||||
bunzip2 (or bzip2 -d ) decompresses and restores all spec-
|
||||
ified files whose names end in ".bz2". Files without this
|
||||
suffix are ignored. Again, supplying no filenames causes
|
||||
decompression from standard input to standard output.
|
||||
|
||||
bunzip2 will correctly decompress a file which is the con-
|
||||
catenation of two or more compressed files. The result is
|
||||
the concatenation of the corresponding uncompressed files.
|
||||
Integrity testing (-t) of concatenated compressed files is
|
||||
also supported.
|
||||
|
||||
You can also compress or decompress files to the standard
|
||||
output by giving the -c flag. Multiple files may be com-
|
||||
pressed and decompressed like this. The resulting outputs
|
||||
are fed sequentially to stdout. Compression of multiple
|
||||
files in this manner generates a stream containing multi-
|
||||
ple compressed file representations. Such a stream can be
|
||||
decompressed correctly only by bzip2 version 0.9.0 or
|
||||
later. Earlier versions of bzip2 will stop after decom-
|
||||
pressing the first file in the stream.
|
||||
|
||||
bzcat (or bzip2 -dc ) decompresses all specified files to
|
||||
the standard output.
|
||||
|
||||
Compression is always performed, even if the compressed
|
||||
file is slightly larger than the original. Files of less
|
||||
than about one hundred bytes tend to get larger, since the
|
||||
compression mechanism has a constant overhead in the
|
||||
region of 50 bytes. Random data (including the output of
|
||||
most file compressors) is coded at about 8.05 bits per
|
||||
byte, giving an expansion of around 0.5%.
|
||||
|
||||
As a self-check for your protection, bzip2 uses 32-bit
|
||||
CRCs to make sure that the decompressed version of a file
|
||||
is identical to the original. This guards against corrup-
|
||||
tion of the compressed data, and against undetected bugs
|
||||
in bzip2 (hopefully very unlikely). The chance of data
|
||||
corruption going undetected is microscopic, about one
|
||||
chance in four billion for each file processed. Be aware,
|
||||
though, that the check occurs upon decompression, so it
|
||||
can only tell you that something is wrong. It can't
|
||||
help you recover the original uncompressed data. You can
|
||||
use bzip2recover to try to recover data from damaged
|
||||
files.
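The CRC update step mentioned above appears later in this commit, in bzlib_private.h, as the macros BZ_INITIALISE_CRC, BZ_UPDATE_CRC and BZ_FINALISE_CRC, driven by a 256-entry crc32Table. The sketch below mirrors those three steps. The table contents are not shown in this listing, so the sketch assumes they come from the polynomial 0x04c11db7 processed most-significant-bit first; the sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the table matches the MSB-first CRC-32 polynomial 0x04c11db7.
   The real crc32Table lives in a file not shown in this listing. */
static uint32_t sketch_crc_table[256];
static int sketch_crc_table_ready = 0;

static void sketch_crc_build_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i << 24;
        for (int bit = 0; bit < 8; bit++)
            c = (c & 0x80000000u) ? (c << 1) ^ 0x04c11db7u : (c << 1);
        sketch_crc_table[i] = c;
    }
    sketch_crc_table_ready = 1;
}

/* Same three steps as BZ_INITIALISE_CRC, BZ_UPDATE_CRC, BZ_FINALISE_CRC. */
static uint32_t sketch_crc32(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (!sketch_crc_table_ready) sketch_crc_build_table();

    uint32_t crc = 0xffffffffu;                       /* initialise */
    for (size_t i = 0; i < len; i++)                  /* update, byte by byte */
        crc = (crc << 8) ^ sketch_crc_table[(crc >> 24) ^ buf[i]];
    return ~crc;                                      /* finalise */
}
```

A 32-bit check value has 2^32 (about four billion) possible results, which is where the "one chance in four billion" figure comes from.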
|
||||
|
||||
Return values: 0 for a normal exit, 1 for environmental
|
||||
problems (file not found, invalid flags, I/O errors, &c),
|
||||
2 to indicate a corrupt compressed file, 3 for an internal
|
||||
consistency error (eg, bug) which caused bzip2 to panic.
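A caller that runs bzip2 as a child process could map the four return values listed above to messages roughly as follows. This is only a sketch; the enum and function names are hypothetical and are not part of bzip2.

```c
#include <assert.h>

/* Exit statuses as listed in the manual page text above. */
enum {
    SKETCH_EXIT_OK      = 0,
    SKETCH_EXIT_ENV     = 1,
    SKETCH_EXIT_CORRUPT = 2,
    SKETCH_EXIT_PANIC   = 3
};

/* Returns a short description of a bzip2 exit status. */
static const char *sketch_describe_exit(int status)
{
    assert(status >= 0);   /* exit statuses are never negative */
    switch (status) {
    case SKETCH_EXIT_OK:      return "normal exit";
    case SKETCH_EXIT_ENV:     return "environmental problem";
    case SKETCH_EXIT_CORRUPT: return "corrupt compressed file";
    case SKETCH_EXIT_PANIC:   return "internal consistency error";
    default:                  return "unknown status";
    }
}
```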
|
||||
|
||||
|
||||
MEMORY MANAGEMENT
|
||||
Bzip2 compresses large files in blocks. The block size
|
||||
affects both the compression ratio achieved, and the
|
||||
amount of memory needed both for compression and decom-
|
||||
pression. The flags -1 through -9 specify the block size
|
||||
to be 100,000 bytes through 900,000 bytes (the default)
|
||||
respectively. At decompression-time, the block size used
|
||||
for compression is read from the header of the compressed
|
||||
file, and bunzip2 then allocates itself just enough memory
|
||||
to decompress the file. Since block sizes are stored in
|
||||
compressed files, it follows that the flags -1 to -9 are
|
||||
irrelevant to and so ignored during decompression.
|
||||
|
||||
Compression and decompression requirements, in bytes, can
|
||||
be estimated as:
|
||||
|
||||
Compression: 400k + ( 7 x block size )
|
||||
|
||||
Decompression: 100k + ( 4 x block size ), or
|
||||
100k + ( 2.5 x block size )
|
||||
|
||||
Larger block sizes give rapidly diminishing marginal
|
||||
returns; most of the compression comes from the first two
|
||||
or three hundred k of block size, a fact worth bearing in
|
||||
mind when using bzip2 on small machines. It is also
|
||||
important to appreciate that the decompression memory
|
||||
requirement is set at compression-time by the choice of
|
||||
block size.
|
||||
|
||||
For files compressed with the default 900k block size,
|
||||
bunzip2 will require about 3700 kbytes to decompress. To
|
||||
support decompression of any file on a 4 megabyte machine,
|
||||
bunzip2 has an option to decompress using approximately
|
||||
half this amount of memory, about 2300 kbytes. Decompres-
|
||||
sion speed is also halved, so you should use this option
|
||||
only where necessary. The relevant flag is -s.
|
||||
|
||||
In general, try to use the largest block size memory constraints
|
||||
allow, since that maximises the compression
|
||||
achieved. Compression and decompression speed are virtu-
|
||||
ally unaffected by block size.
|
||||
|
||||
Another significant point applies to files which fit in a
|
||||
single block -- that means most files you'd encounter
|
||||
using a large block size. The amount of real memory
|
||||
touched is proportional to the size of the file, since the
|
||||
file is smaller than a block. For example, compressing a
|
||||
file 20,000 bytes long with the flag -9 will cause the
|
||||
compressor to allocate around 6700k of memory, but only
|
||||
touch 400k + 20000 * 7 = 540 kbytes of it. Similarly, the
|
||||
decompressor will allocate 3700k but only touch 100k +
|
||||
20000 * 4 = 180 kbytes.
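The estimates above can be written out as a few small functions. This is a sketch under one assumption: "k" is taken to mean 1000 bytes, consistent with the 100,000-byte block sizes quoted. The function names are hypothetical.

```c
#include <assert.h>

/* Block size in bytes for flags -1 .. -9. */
static long sketch_block_bytes(int level)
{
    assert(level >= 1 && level <= 9);
    return 100000L * level;
}

/* Memory allocated for compression: 400k + 7 x block size. */
static long sketch_compress_alloc(int level)
{
    return 400000L + 7 * sketch_block_bytes(level);
}

/* Memory allocated for decompression: 100k + 4 x block size,
   or 100k + 2.5 x block size when the -s flag is given. */
static long sketch_decompress_alloc(int level, int small)
{
    long block = sketch_block_bytes(level);
    /* 2.5 x block is computed as 5/2 to stay in integer arithmetic. */
    return 100000L + (small ? (5 * block) / 2 : 4 * block);
}

/* Memory actually touched: only the part of the block the file fills. */
static long sketch_compress_touched(int level, long file_bytes)
{
    long block = sketch_block_bytes(level);
    long used = file_bytes < block ? file_bytes : block;
    return 400000L + 7 * used;
}

static long sketch_decompress_touched(int level, long file_bytes)
{
    long block = sketch_block_bytes(level);
    long used = file_bytes < block ? file_bytes : block;
    return 100000L + 4 * used;
}
```

For the 20,000-byte example with -9, these give 6,700,000 bytes allocated and 540,000 touched for compression, and 3,700,000 allocated and 180,000 touched for decompression. The formula gives 5300k for -7, whereas the table that follows lists 5400k; the sketch follows the formula.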
|
||||
|
||||
Here is a table which summarises the maximum memory usage
|
||||
for different block sizes. Also recorded is the total
|
||||
compressed size for 14 files of the Calgary Text Compres-
|
||||
sion Corpus totalling 3,141,622 bytes. This column gives
|
||||
some feel for how compression varies with block size.
|
||||
These figures tend to understate the advantage of larger
|
||||
block sizes for larger files, since the Corpus is domi-
|
||||
nated by smaller files.
|
||||
|
||||
           Compress   Decompress   Decompress   Corpus
    Flag     usage      usage       -s usage     Size

     -1      1100k       500k         350k      914704
     -2      1800k       900k         600k      877703
     -3      2500k      1300k         850k      860338
     -4      3200k      1700k        1100k      846899
     -5      3900k      2100k        1350k      845160
     -6      4600k      2500k        1600k      838626
     -7      5400k      2900k        1850k      834096
     -8      6000k      3300k        2100k      828642
     -9      6700k      3700k        2350k      828642
|
||||
|
||||
|
||||
OPTIONS
|
||||
-c --stdout
|
||||
Compress or decompress to standard output. -c will
|
||||
decompress multiple files to stdout, but will only
|
||||
compress a single file to stdout.
|
||||
|
||||
-d --decompress
|
||||
Force decompression. bzip2, bunzip2 and bzcat are
|
||||
really the same program, and the decision about
|
||||
what actions to take is done on the basis of which
|
||||
name is used. This flag overrides that mechanism,
|
||||
and forces bzip2 to decompress.
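The name-based dispatch described above might look roughly like this. It is a simplified sketch, not the logic of bzip2 itself: the struct and function names are hypothetical, and only the three program names and the -d, -z and -c flags from this manual page are handled.

```c
#include <assert.h>
#include <string.h>

struct sketch_mode {
    int decompress;   /* 0 = compress, 1 = decompress */
    int to_stdout;    /* 1 = write results to standard output */
};

/* Picks the default behaviour from the name the program was invoked as. */
static struct sketch_mode sketch_mode_from_name(const char *argv0)
{
    struct sketch_mode m = { 0, 0 };
    const char *base;

    assert(argv0 != NULL);
    /* Strip any leading directory components. */
    base = strrchr(argv0, '/');
    base = base ? base + 1 : argv0;

    if (strcmp(base, "bunzip2") == 0) {
        m.decompress = 1;
    } else if (strcmp(base, "bzcat") == 0) {
        m.decompress = 1;
        m.to_stdout = 1;
    }
    return m;
}

/* Flags given on the command line override what the name implied. */
static void sketch_apply_flag(struct sketch_mode *m, char flag)
{
    if (flag == 'd')      m->decompress = 1;
    else if (flag == 'z') m->decompress = 0;
    else if (flag == 'c') m->to_stdout = 1;
}
```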
|
||||
|
||||
-z --compress
|
||||
The complement to -d: forces compression, regardless
|
||||
of the invocation name.
|
||||
|
||||
-t --test
|
||||
Check integrity of the specified file(s), but don't
|
||||
decompress them. This really performs a trial
|
||||
decompression and throws away the result.
|
||||
|
||||
-f --force
|
||||
Force overwrite of output files. Normally, bzip2
|
||||
will not overwrite existing output files.
|
||||
|
||||
-k --keep
|
||||
Keep (don't delete) input files during compression
|
||||
or decompression.
|
||||
|
||||
-s --small
|
||||
Reduce memory usage, for compression, decompression
|
||||
and testing. Files are decompressed and tested
|
||||
using a modified algorithm which only requires 2.5
|
||||
bytes per block byte. This means any file can be
|
||||
decompressed in 2300k of memory, albeit at about
|
||||
half the normal speed.
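The figure of 2.5 bytes per block byte is consistent with the UInt16 *ll16 and UChar *ll4 fields of the decompression state in bzlib_private.h, later in this commit: two bytes plus half a byte per position. The sketch below shows one way a value of up to 20 bits (enough for positions in a 900,000-byte block) can be split that way. The exact packing the library uses is not shown in this listing, so the layout here is an assumption and the sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct sketch_ll {
    uint16_t *lo16;   /* low 16 bits of each entry */
    uint8_t  *hi4;    /* high 4 bits, two entries packed per byte */
    size_t    n;
};

/* Allocates 2 bytes + half a byte per entry. Returns 0 on success. */
static int sketch_ll_init(struct sketch_ll *t, size_t n)
{
    assert(n > 0);
    t->lo16 = calloc(n, sizeof *t->lo16);
    t->hi4  = calloc((n + 1) / 2, 1);
    t->n    = n;
    if (t->lo16 == NULL || t->hi4 == NULL) {
        free(t->lo16);
        free(t->hi4);
        return -1;
    }
    return 0;
}

static void sketch_ll_free(struct sketch_ll *t)
{
    free(t->lo16);
    free(t->hi4);
}

static void sketch_ll_set(struct sketch_ll *t, size_t i, uint32_t v)
{
    uint8_t nib;
    assert(i < t->n && v < (1u << 20));
    t->lo16[i] = (uint16_t)(v & 0xffffu);
    nib = (uint8_t)(v >> 16);
    if (i & 1)   /* odd entries use the high nibble */
        t->hi4[i >> 1] = (uint8_t)((t->hi4[i >> 1] & 0x0f) | (nib << 4));
    else         /* even entries use the low nibble */
        t->hi4[i >> 1] = (uint8_t)((t->hi4[i >> 1] & 0xf0) | nib);
}

static uint32_t sketch_ll_get(const struct sketch_ll *t, size_t i)
{
    uint32_t nib;
    assert(i < t->n);
    nib = (uint32_t)(t->hi4[i >> 1] >> ((i & 1) ? 4 : 0)) & 0x0fu;
    return (uint32_t)t->lo16[i] | (nib << 16);
}
```

The extra shifting and masking on every access is one plausible reason for the slower decompression the text mentions.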
|
||||
|
||||
During compression, -s selects a block size of
|
||||
200k, which limits memory use to around the same
|
||||
figure, at the expense of your compression ratio.
|
||||
In short, if your machine is low on memory (8
|
||||
megabytes or less), use -s for everything. See
|
||||
MEMORY MANAGEMENT above.
|
||||
|
||||
-v --verbose
|
||||
Verbose mode -- show the compression ratio for each
|
||||
file processed. Further -v's increase the ver-
|
||||
bosity level, spewing out lots of information which
|
||||
is primarily of interest for diagnostic purposes.
|
||||
|
||||
-L --license -V --version
|
||||
Display the software version, license terms and
|
||||
conditions.
|
||||
|
||||
-1 to -9
|
||||
Set the block size to 100 k, 200 k .. 900 k when
|
||||
compressing. Has no effect when decompressing.
|
||||
See MEMORY MANAGEMENT above.
|
||||
|
||||
--repetitive-fast
|
||||
bzip2 injects some small pseudo-random variations
|
||||
into very repetitive blocks to limit worst-case
|
||||
performance during compression. If sorting runs
|
||||
into difficulties, the block is randomised, and
|
||||
sorting is restarted. Very roughly, bzip2 persists
|
||||
for three times as long as a well-behaved input
|
||||
would take before resorting to randomisation. This
|
||||
flag makes it give up much sooner.
|
||||
|
||||
--repetitive-best
|
||||
Opposite of --repetitive-fast; try a lot harder
|
||||
before resorting to randomisation.
|
||||
|
||||
|
||||
RECOVERING DATA FROM DAMAGED FILES
|
||||
bzip2 compresses files in blocks, usually 900kbytes long.
|
||||
Each block is handled independently. If a media or trans-
|
||||
mission error causes a multi-block .bz2 file to become
|
||||
damaged, it may be possible to recover data from the
|
||||
undamaged blocks in the file.
|
||||
|
||||
The compressed representation of each block is delimited
|
||||
by a 48-bit pattern, which makes it possible to find the
|
||||
block boundaries with reasonable certainty. Each block
|
||||
also carries its own 32-bit CRC, so damaged blocks can be
|
||||
distinguished from undamaged ones.
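Because compressed blocks are not byte-aligned, a search for the 48-bit delimiter has to look at every bit offset. The sketch below does this with a 48-bit shift register. The pattern value 0x314159265359 is an assumption (this manual page does not state it), and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLOCK_MAGIC 0x314159265359ULL   /* assumed 48-bit delimiter */
#define SKETCH_MASK48      0xffffffffffffULL

/* Scans buf bit by bit (most significant bit of each byte first) and
   records the bit offset at which each occurrence of the pattern starts.
   Returns the number of occurrences found; at most max_hits offsets
   are stored in hits. */
static size_t sketch_find_blocks(const unsigned char *buf, size_t len,
                                 uint64_t *hits, size_t max_hits)
{
    uint64_t window = 0;
    uint64_t nbits = (uint64_t)len * 8;
    size_t found = 0;

    assert(buf != NULL && hits != NULL);
    for (uint64_t bit = 0; bit < nbits; bit++) {
        unsigned b = (buf[bit >> 3] >> (7 - (bit & 7))) & 1u;
        window = ((window << 1) | b) & SKETCH_MASK48;
        /* A full 48-bit window is only available from bit 47 onwards. */
        if (bit >= 47 && window == SKETCH_BLOCK_MAGIC) {
            if (found < max_hits) hits[found] = bit - 47;
            found++;
        }
    }
    return found;
}
```

Bit offsets are held in 64-bit integers here. The CAVEATS section notes that bzip2recover itself uses 32-bit integers for bit positions, which is what limits it to files of 512 megabytes.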
|
||||
|
||||
bzip2recover is a simple program whose purpose is to
|
||||
search for blocks in .bz2 files, and write each block out
|
||||
into its own .bz2 file. You can then use bzip2 -t to test
|
||||
the integrity of the resulting files, and decompress those
|
||||
which are undamaged.
|
||||
|
||||
bzip2recover takes a single argument, the name of the dam-
|
||||
aged file, and writes a number of files "rec0001file.bz2",
|
||||
"rec0002file.bz2", etc, containing the extracted blocks.
|
||||
The output filenames are designed so that the use of
|
||||
wildcards in subsequent processing -- for example, "bzip2
|
||||
-dc rec*file.bz2 > recovered_data" -- lists the files in
|
||||
the "right" order.
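The naming scheme works because the block number is zero-padded to a fixed width, so the usual lexicographic order of wildcard expansion agrees with the numeric order. A minimal sketch, with hypothetical helper names and assuming a four-digit field as in the examples above:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "rec<NNNN><name>" into out. Returns 0 on success, -1 if the
   result did not fit. */
static int sketch_recovery_name(char *out, size_t out_size,
                                int block_no, const char *name)
{
    int n;
    assert(out != NULL && name != NULL && block_no >= 1);
    n = snprintf(out, out_size, "rec%04d%s", block_no, name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Returns 1 if the name for block_a sorts before the name for block_b,
   0 if it does not, and -1 if either name could not be built. With a
   four-digit field the ordering holds for up to 9999 blocks. */
static int sketch_sorts_before(int block_a, int block_b, const char *name)
{
    char a[256], b[256];
    if (sketch_recovery_name(a, sizeof a, block_a, name) != 0) return -1;
    if (sketch_recovery_name(b, sizeof b, block_b, name) != 0) return -1;
    return strcmp(a, b) < 0;
}
```

Without the padding, "rec10file.bz2" would sort before "rec2file.bz2" and the recovered data would be reassembled out of order.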
|
||||
|
||||
bzip2recover should be of most use dealing with large .bz2
|
||||
files, as these will contain many blocks. It is clearly
|
||||
futile to use it on damaged single-block files, since a
|
||||
damaged block cannot be recovered. If you wish to min-
|
||||
imise any potential data loss through media or transmis-
|
||||
sion errors, you might consider compressing with a smaller
|
||||
block size.
|
||||
|
||||
|
||||
PERFORMANCE NOTES
|
||||
The sorting phase of compression gathers together similar
|
||||
strings in the file. Because of this, files containing
|
||||
very long runs of repeated symbols, like "aabaabaabaab
|
||||
..." (repeated several hundred times) may compress
|
||||
extraordinarily slowly. You can use the -vvvvv option to
|
||||
monitor progress in great detail, if you want. Decompres-
|
||||
sion speed is unaffected.
|
||||
|
||||
Such pathological cases seem rare in practice, appearing
|
||||
mostly in artificially-constructed test files, and in low-
|
||||
level disk images. It may be inadvisable to use bzip2 to
|
||||
compress the latter. If you do get a file which causes
|
||||
severe slowness in compression, try making the block size
|
||||
as small as possible, with flag -1.
|
||||
|
||||
bzip2 usually allocates several megabytes of memory to
|
||||
operate in, and then charges all over it in a fairly ran-
|
||||
dom fashion. This means that performance, both for com-
|
||||
pressing and decompressing, is largely determined by the
|
||||
speed at which your machine can service cache misses.
|
||||
Because of this, small changes to the code to reduce the
|
||||
miss rate have been observed to give disproportionately
|
||||
large performance improvements. I imagine bzip2 will per-
|
||||
form best on machines with very large caches.
|
||||
|
||||
|
||||
CAVEATS
|
||||
I/O error messages are not as helpful as they could be.
|
||||
Bzip2 tries hard to detect I/O errors and exit cleanly,
|
||||
but the details of what the problem is sometimes seem
|
||||
rather misleading.
|
||||
|
||||
This manual page pertains to version 0.9.0 of bzip2. Com-
|
||||
pressed data created by this version is entirely forwards
|
||||
and backwards compatible with the previous public release,
|
||||
version 0.1pl2, but with the following exception: 0.9.0
|
||||
can correctly decompress multiple concatenated compressed
|
||||
files. 0.1pl2 cannot do this; it will stop after decom-
|
||||
pressing just the first file in the stream.
|
||||
|
||||
Wildcard expansion for Windows 95 and NT is flaky.
|
||||
|
||||
bzip2recover uses 32-bit integers to represent bit posi-
|
||||
tions in compressed files, so it cannot handle compressed
|
||||
files more than 512 megabytes long. This could easily be
|
||||
fixed.
|
||||
|
||||
|
||||
AUTHOR
|
||||
Julian Seward, jseward@acm.org.
|
||||
http://www.muraroa.demon.co.uk
|
||||
|
||||
The ideas embodied in bzip2 are due to (at least) the fol-
|
||||
lowing people: Michael Burrows and David Wheeler (for the
|
||||
block sorting transformation), David Wheeler (again, for
|
||||
the Huffman coder), Peter Fenwick (for the structured cod-
|
||||
ing model in the original bzip, and many refinements), and
|
||||
Alistair Moffat, Radford Neal and Ian Witten (for the
|
||||
arithmetic coder in the original bzip). I am much
|
||||
indebted for their help, support and advice. See the man-
|
||||
ual in the source distribution for pointers to sources of
|
||||
documentation. Christian von Roques encouraged me to look
|
||||
for faster sorting algorithms, so as to speed up compres-
|
||||
sion. Bela Lubkin encouraged me to improve the worst-case
|
||||
compression performance. Many people sent patches, helped
|
||||
with portability problems, lent machines, gave advice and
|
||||
were generally helpful.
|
1486
lib/libbz2/bzlib.c
Normal file
File diff suppressed because it is too large
299
lib/libbz2/bzlib.h
Normal file
@ -0,0 +1,299 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Public header file for the library. ---*/
|
||||
/*--- bzlib.h ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0 of 28 June 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#ifndef _BZLIB_H
|
||||
#define _BZLIB_H
|
||||
|
||||
#define BZ_RUN 0
|
||||
#define BZ_FLUSH 1
|
||||
#define BZ_FINISH 2
|
||||
|
||||
#define BZ_OK 0
|
||||
#define BZ_RUN_OK 1
|
||||
#define BZ_FLUSH_OK 2
|
||||
#define BZ_FINISH_OK 3
|
||||
#define BZ_STREAM_END 4
|
||||
#define BZ_SEQUENCE_ERROR (-1)
|
||||
#define BZ_PARAM_ERROR (-2)
|
||||
#define BZ_MEM_ERROR (-3)
|
||||
#define BZ_DATA_ERROR (-4)
|
||||
#define BZ_DATA_ERROR_MAGIC (-5)
|
||||
#define BZ_IO_ERROR (-6)
|
||||
#define BZ_UNEXPECTED_EOF (-7)
|
||||
#define BZ_OUTBUFF_FULL (-8)
|
||||
|
||||
typedef
|
||||
struct {
|
||||
char *next_in;
|
||||
unsigned int avail_in;
|
||||
unsigned int total_in;
|
||||
|
||||
char *next_out;
|
||||
unsigned int avail_out;
|
||||
unsigned int total_out;
|
||||
|
||||
void *state;
|
||||
|
||||
void *(*bzalloc)(void *,int,int);
|
||||
void (*bzfree)(void *,void *);
|
||||
void *opaque;
|
||||
}
|
||||
bz_stream;
|
||||
|
||||
|
||||
#ifndef BZ_IMPORT
|
||||
#define BZ_EXPORT
|
||||
#endif
|
||||
|
||||
#ifdef _WIN32
|
||||
# include <stdio.h>
|
||||
# include <windows.h>
|
||||
# ifdef small
|
||||
/* windows.h defines small as char */
|
||||
# undef small
|
||||
# endif
|
||||
# ifdef BZ_EXPORT
|
||||
# define BZ_API(func) WINAPI func
|
||||
# define BZ_EXTERN extern
|
||||
# else
|
||||
/* import windows dll dynamically */
|
||||
# define BZ_API(func) (WINAPI * func)
|
||||
# define BZ_EXTERN
|
||||
# endif
|
||||
#else
|
||||
# define BZ_API(func) func
|
||||
# define BZ_EXTERN extern
|
||||
#endif
|
||||
|
||||
|
||||
/*-- Core (low-level) library functions --*/
|
||||
|
||||
BZ_EXTERN int BZ_API(bzCompressInit) (
|
||||
bz_stream* strm,
|
||||
int blockSize100k,
|
||||
int verbosity,
|
||||
int workFactor
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzCompress) (
|
||||
bz_stream* strm,
|
||||
int action
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzCompressEnd) (
|
||||
bz_stream* strm
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzDecompressInit) (
|
||||
bz_stream *strm,
|
||||
int verbosity,
|
||||
int small
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzDecompress) (
|
||||
bz_stream* strm
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzDecompressEnd) (
|
||||
bz_stream *strm
|
||||
);
|
||||
|
||||
|
||||
|
||||
/*-- High(er) level library functions --*/
|
||||
|
||||
#ifndef BZ_NO_STDIO
|
||||
#define BZ_MAX_UNUSED 5000
|
||||
|
||||
typedef void BZFILE;
|
||||
|
||||
BZ_EXTERN BZFILE* BZ_API(bzReadOpen) (
|
||||
int* bzerror,
|
||||
FILE* f,
|
||||
int verbosity,
|
||||
int small,
|
||||
void* unused,
|
||||
int nUnused
|
||||
);
|
||||
|
||||
BZ_EXTERN void BZ_API(bzReadClose) (
|
||||
int* bzerror,
|
||||
BZFILE* b
|
||||
);
|
||||
|
||||
BZ_EXTERN void BZ_API(bzReadGetUnused) (
|
||||
int* bzerror,
|
||||
BZFILE* b,
|
||||
void** unused,
|
||||
int* nUnused
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzRead) (
|
||||
int* bzerror,
|
||||
BZFILE* b,
|
||||
void* buf,
|
||||
int len
|
||||
);
|
||||
|
||||
BZ_EXTERN BZFILE* BZ_API(bzWriteOpen) (
|
||||
int* bzerror,
|
||||
FILE* f,
|
||||
int blockSize100k,
|
||||
int verbosity,
|
||||
int workFactor
|
||||
);
|
||||
|
||||
BZ_EXTERN void BZ_API(bzWrite) (
|
||||
int* bzerror,
|
||||
BZFILE* b,
|
||||
void* buf,
|
||||
int len
|
||||
);
|
||||
|
||||
BZ_EXTERN void BZ_API(bzWriteClose) (
|
||||
int* bzerror,
|
||||
BZFILE* b,
|
||||
int abandon,
|
||||
unsigned int* nbytes_in,
|
||||
unsigned int* nbytes_out
|
||||
);
|
||||
#endif
|
||||
|
||||
|
||||
/*-- Utility functions --*/
|
||||
|
||||
BZ_EXTERN int BZ_API(bzBuffToBuffCompress) (
|
||||
char* dest,
|
||||
unsigned int* destLen,
|
||||
char* source,
|
||||
unsigned int sourceLen,
|
||||
int blockSize100k,
|
||||
int verbosity,
|
||||
int workFactor
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzBuffToBuffDecompress) (
|
||||
char* dest,
|
||||
unsigned int* destLen,
|
||||
char* source,
|
||||
unsigned int sourceLen,
|
||||
int small,
|
||||
int verbosity
|
||||
);
|
||||
|
||||
|
||||
/*--
|
||||
Code contributed by Yoshioka Tsuneo
|
||||
(QWF00133@niftyserve.or.jp/tsuneo-y@is.aist-nara.ac.jp),
|
||||
to support better zlib compatibility.
|
||||
This code is not _officially_ part of libbzip2 (yet);
|
||||
I haven't tested it, documented it, or considered the
|
||||
threading-safeness of it.
|
||||
If this code breaks, please contact both Yoshioka and me.
|
||||
--*/
|
||||
|
||||
BZ_EXTERN const char * BZ_API(bzlibVersion) (
|
||||
void
|
||||
);
|
||||
|
||||
#ifndef BZ_NO_STDIO
|
||||
BZ_EXTERN BZFILE * BZ_API(bzopen) (
|
||||
const char *path,
|
||||
const char *mode
|
||||
);
|
||||
|
||||
BZ_EXTERN BZFILE * BZ_API(bzdopen) (
|
||||
int fd,
|
||||
const char *mode
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzread) (
|
||||
BZFILE* b,
|
||||
void* buf,
|
||||
int len
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzwrite) (
|
||||
BZFILE* b,
|
||||
void* buf,
|
||||
int len
|
||||
);
|
||||
|
||||
BZ_EXTERN int BZ_API(bzflush) (
|
||||
BZFILE* b
|
||||
);
|
||||
|
||||
BZ_EXTERN void BZ_API(bzclose) (
|
||||
BZFILE* b
|
||||
);
|
||||
|
||||
BZ_EXTERN const char * BZ_API(bzerror) (
|
||||
BZFILE *b,
|
||||
int *errnum
|
||||
);
|
||||
#endif
|
||||
|
||||
|
||||
#endif
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end bzlib.h ---*/
|
||||
/*-------------------------------------------------------------*/
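The status codes in bzlib.h follow a simple convention: success and progress codes are zero or positive, and error codes are negative. A caller could use that as sketched below. The constant values are copied from the header listing above; the sketch_ helper functions are hypothetical and are not part of the library.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from the bzlib.h listing above. */
#define BZ_OK                0
#define BZ_RUN_OK            1
#define BZ_FLUSH_OK          2
#define BZ_FINISH_OK         3
#define BZ_STREAM_END        4
#define BZ_SEQUENCE_ERROR    (-1)
#define BZ_PARAM_ERROR       (-2)
#define BZ_MEM_ERROR         (-3)
#define BZ_DATA_ERROR        (-4)
#define BZ_DATA_ERROR_MAGIC  (-5)
#define BZ_IO_ERROR          (-6)
#define BZ_UNEXPECTED_EOF    (-7)
#define BZ_OUTBUFF_FULL      (-8)

/* Returns the symbolic name of a status code, or NULL if it is unknown. */
static const char *sketch_status_name(int code)
{
    switch (code) {
    case BZ_OK:               return "BZ_OK";
    case BZ_RUN_OK:           return "BZ_RUN_OK";
    case BZ_FLUSH_OK:         return "BZ_FLUSH_OK";
    case BZ_FINISH_OK:        return "BZ_FINISH_OK";
    case BZ_STREAM_END:       return "BZ_STREAM_END";
    case BZ_SEQUENCE_ERROR:   return "BZ_SEQUENCE_ERROR";
    case BZ_PARAM_ERROR:      return "BZ_PARAM_ERROR";
    case BZ_MEM_ERROR:        return "BZ_MEM_ERROR";
    case BZ_DATA_ERROR:       return "BZ_DATA_ERROR";
    case BZ_DATA_ERROR_MAGIC: return "BZ_DATA_ERROR_MAGIC";
    case BZ_IO_ERROR:         return "BZ_IO_ERROR";
    case BZ_UNEXPECTED_EOF:   return "BZ_UNEXPECTED_EOF";
    case BZ_OUTBUFF_FULL:     return "BZ_OUTBUFF_FULL";
    default:                  return NULL;
    }
}

/* In this header every failure code is negative. */
static int sketch_status_is_error(int code)
{
    return code < 0;
}

/* Example of a check a caller might make after a library call. */
static void sketch_expect_ok(int code)
{
    assert(!sketch_status_is_error(code));
}
```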
|
523
lib/libbz2/bzlib_private.h
Normal file
@ -0,0 +1,523 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Private header file for the library. ---*/
|
||||
/*--- bzlib_private.h ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0 of 28 June 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#ifndef _BZLIB_PRIVATE_H
|
||||
#define _BZLIB_PRIVATE_H
|
||||
|
||||
#include <stdlib.h>
|
||||
|
||||
#ifndef BZ_NO_STDIO
|
||||
#include <stdio.h>
|
||||
#include <ctype.h>
|
||||
#include <string.h>
|
||||
#endif
|
||||
|
||||
#include "bzlib.h"
|
||||
|
||||
|
||||
|
||||
/*-- General stuff. --*/
|
||||
|
||||
#define BZ_VERSION "0.9.0"
|
||||
|
||||
typedef char Char;
|
||||
typedef unsigned char Bool;
|
||||
typedef unsigned char UChar;
|
||||
typedef int Int32;
|
||||
typedef unsigned int UInt32;
|
||||
typedef short Int16;
|
||||
typedef unsigned short UInt16;
|
||||
|
||||
#define True ((Bool)1)
|
||||
#define False ((Bool)0)
|
||||
|
||||
#ifdef _WIN32
|
||||
#define __inline__ __inline
|
||||
#endif
|
||||
|
||||
#ifndef BZ_NO_STDIO
|
||||
extern void bz__AssertH__fail ( int errcode );
|
||||
#define AssertH(cond,errcode) \
|
||||
{ if (!(cond)) bz__AssertH__fail ( errcode ); }
|
||||
#if BZ_DEBUG
|
||||
#define AssertD(cond,msg) \
|
||||
{ if (!(cond)) { \
|
||||
fprintf ( stderr, \
|
||||
"\n\nlibbzip2(debug build): internal error\n\t%s\n", msg );\
|
||||
exit(1); \
|
||||
}}
|
||||
#else
|
||||
#define AssertD(cond,msg) /* */
|
||||
#endif
|
||||
#define VPrintf0(zf) \
|
||||
fprintf(stderr,zf)
|
||||
#define VPrintf1(zf,za1) \
|
||||
fprintf(stderr,zf,za1)
|
||||
#define VPrintf2(zf,za1,za2) \
|
||||
fprintf(stderr,zf,za1,za2)
|
||||
#define VPrintf3(zf,za1,za2,za3) \
|
||||
fprintf(stderr,zf,za1,za2,za3)
|
||||
#define VPrintf4(zf,za1,za2,za3,za4) \
|
||||
fprintf(stderr,zf,za1,za2,za3,za4)
|
||||
#define VPrintf5(zf,za1,za2,za3,za4,za5) \
|
||||
fprintf(stderr,zf,za1,za2,za3,za4,za5)
|
||||
#else
|
||||
extern void bz_internal_error ( int errcode );
|
||||
#define AssertH(cond,errcode) \
|
||||
{ if (!(cond)) bz_internal_error ( errcode ); }
|
||||
#define AssertD(cond,msg) /* */
|
||||
#define VPrintf0(zf) /* */
|
||||
#define VPrintf1(zf,za1) /* */
|
||||
#define VPrintf2(zf,za1,za2) /* */
|
||||
#define VPrintf3(zf,za1,za2,za3) /* */
|
||||
#define VPrintf4(zf,za1,za2,za3,za4) /* */
|
||||
#define VPrintf5(zf,za1,za2,za3,za4,za5) /* */
|
||||
#endif
|
||||
|
||||
|
||||
#define BZALLOC(nnn) (strm->bzalloc)(strm->opaque,(nnn),1)
|
||||
#define BZFREE(ppp) (strm->bzfree)(strm->opaque,(ppp))
|
||||
|
||||
|
||||
/*-- Constants for the back end. --*/
|
||||
|
||||
#define BZ_MAX_ALPHA_SIZE 258
|
||||
#define BZ_MAX_CODE_LEN 23
|
||||
|
||||
#define BZ_RUNA 0
|
||||
#define BZ_RUNB 1
|
||||
|
||||
#define BZ_N_GROUPS 6
|
||||
#define BZ_G_SIZE 50
|
||||
#define BZ_N_ITERS 4
|
||||
|
||||
#define BZ_MAX_SELECTORS (2 + (900000 / BZ_G_SIZE))
|
||||
|
||||
|
||||
|
||||
/*-- Stuff for randomising repetitive blocks. --*/
|
||||
|
||||
extern Int32 rNums[512];
|
||||
|
||||
#define BZ_RAND_DECLS \
|
||||
Int32 rNToGo; \
|
||||
Int32 rTPos \
|
||||
|
||||
#define BZ_RAND_INIT_MASK \
|
||||
s->rNToGo = 0; \
|
||||
s->rTPos = 0 \
|
||||
|
||||
#define BZ_RAND_MASK ((s->rNToGo == 1) ? 1 : 0)
|
||||
|
||||
#define BZ_RAND_UPD_MASK \
|
||||
if (s->rNToGo == 0) { \
|
||||
s->rNToGo = rNums[s->rTPos]; \
|
||||
s->rTPos++; \
|
||||
if (s->rTPos == 512) s->rTPos = 0; \
|
||||
} \
|
||||
s->rNToGo--;
|
||||
|
||||
|
||||
|
||||
/*-- Stuff for doing CRCs. --*/
|
||||
|
||||
extern UInt32 crc32Table[256];
|
||||
|
||||
#define BZ_INITIALISE_CRC(crcVar) \
|
||||
{ \
|
||||
crcVar = 0xffffffffL; \
|
||||
}
|
||||
|
||||
#define BZ_FINALISE_CRC(crcVar) \
|
||||
{ \
|
||||
crcVar = ~(crcVar); \
|
||||
}
|
||||
|
||||
#define BZ_UPDATE_CRC(crcVar,cha) \
|
||||
{ \
|
||||
crcVar = (crcVar << 8) ^ \
|
||||
crc32Table[(crcVar >> 24) ^ \
|
||||
((UChar)cha)]; \
|
||||
}
|
||||
|
||||
|
||||
|
||||
/*-- States and modes for compression. --*/
|
||||
|
||||
#define BZ_M_IDLE 1
|
||||
#define BZ_M_RUNNING 2
|
||||
#define BZ_M_FLUSHING 3
|
||||
#define BZ_M_FINISHING 4
|
||||
|
||||
#define BZ_S_OUTPUT 1
|
||||
#define BZ_S_INPUT 2
|
||||
|
||||
#define BZ_NUM_OVERSHOOT_BYTES 20
|
||||
|
||||
|
||||
|
||||
/*-- Structure holding all the compression-side stuff. --*/
|
||||
|
||||
typedef
|
||||
struct {
|
||||
/* pointer back to the struct bz_stream */
|
||||
bz_stream* strm;
|
||||
|
||||
/* mode this stream is in, and whether inputting */
|
||||
/* or outputting data */
|
||||
Int32 mode;
|
||||
Int32 state;
|
||||
|
||||
/* remembers avail_in when flush/finish requested */
|
||||
UInt32 avail_in_expect;
|
||||
|
||||
/* for doing the block sorting */
|
||||
UChar* block;
|
||||
UInt16* quadrant;
|
||||
UInt32* zptr;
|
||||
UInt16* szptr;
|
||||
Int32* ftab;
|
||||
Int32 workDone;
|
||||
Int32 workLimit;
|
||||
Int32 workFactor;
|
||||
Bool firstAttempt;
|
||||
Bool blockRandomised;
|
||||
Int32 origPtr;
|
||||
|
||||
/* run-length-encoding of the input */
|
||||
UInt32 state_in_ch;
|
||||
Int32 state_in_len;
|
||||
BZ_RAND_DECLS;
|
||||
|
||||
/* input and output limits and current posns */
|
||||
Int32 nblock;
|
||||
Int32 nblockMAX;
|
||||
Int32 numZ;
|
||||
Int32 state_out_pos;
|
||||
|
||||
/* map of bytes used in block */
|
||||
Int32 nInUse;
|
||||
Bool inUse[256];
|
||||
UChar unseqToSeq[256];
|
||||
|
||||
/* the buffer for bit stream creation */
|
||||
UInt32 bsBuff;
|
||||
Int32 bsLive;
|
||||
|
||||
/* block and combined CRCs */
|
||||
UInt32 blockCRC;
|
||||
UInt32 combinedCRC;
|
||||
|
||||
/* misc administratium */
|
||||
Int32 verbosity;
|
||||
Int32 blockNo;
|
||||
Int32 nBlocksRandomised;
|
||||
Int32 blockSize100k;
|
||||
|
||||
/* stuff for coding the MTF values */
|
||||
Int32 nMTF;
|
||||
Int32 mtfFreq [BZ_MAX_ALPHA_SIZE];
|
||||
UChar selector [BZ_MAX_SELECTORS];
|
||||
UChar selectorMtf[BZ_MAX_SELECTORS];
|
||||
|
||||
UChar len [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
Int32 code [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
Int32 rfreq[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
|
||||
}
|
||||
EState;
|
||||
|
||||
|
||||
|
||||
/*-- externs for compression. --*/
|
||||
|
||||
extern void
|
||||
blockSort ( EState* );
|
||||
|
||||
extern void
|
||||
compressBlock ( EState*, Bool );
|
||||
|
||||
extern void
|
||||
bsInitWrite ( EState* );
|
||||
|
||||
extern void
|
||||
hbAssignCodes ( Int32*, UChar*, Int32, Int32, Int32 );
|
||||
|
||||
extern void
|
||||
hbMakeCodeLengths ( UChar*, Int32*, Int32, Int32 );
|
||||
|
||||
|
||||
|
||||
/*-- states for decompression. --*/
|
||||
|
||||
#define BZ_X_IDLE 1
|
||||
#define BZ_X_OUTPUT 2
|
||||
|
||||
#define BZ_X_MAGIC_1 10
|
||||
#define BZ_X_MAGIC_2 11
|
||||
#define BZ_X_MAGIC_3 12
|
||||
#define BZ_X_MAGIC_4 13
|
||||
#define BZ_X_BLKHDR_1 14
|
||||
#define BZ_X_BLKHDR_2 15
|
||||
#define BZ_X_BLKHDR_3 16
|
||||
#define BZ_X_BLKHDR_4 17
|
||||
#define BZ_X_BLKHDR_5 18
|
||||
#define BZ_X_BLKHDR_6 19
|
||||
#define BZ_X_BCRC_1 20
|
||||
#define BZ_X_BCRC_2 21
|
||||
#define BZ_X_BCRC_3 22
|
||||
#define BZ_X_BCRC_4 23
|
||||
#define BZ_X_RANDBIT 24
|
||||
#define BZ_X_ORIGPTR_1 25
|
||||
#define BZ_X_ORIGPTR_2 26
|
||||
#define BZ_X_ORIGPTR_3 27
|
||||
#define BZ_X_MAPPING_1 28
|
||||
#define BZ_X_MAPPING_2 29
|
||||
#define BZ_X_SELECTOR_1 30
|
||||
#define BZ_X_SELECTOR_2 31
|
||||
#define BZ_X_SELECTOR_3 32
|
||||
#define BZ_X_CODING_1 33
|
||||
#define BZ_X_CODING_2 34
|
||||
#define BZ_X_CODING_3 35
|
||||
#define BZ_X_MTF_1 36
|
||||
#define BZ_X_MTF_2 37
|
||||
#define BZ_X_MTF_3 38
|
||||
#define BZ_X_MTF_4 39
|
||||
#define BZ_X_MTF_5 40
|
||||
#define BZ_X_MTF_6 41
|
||||
#define BZ_X_ENDHDR_2 42
|
||||
#define BZ_X_ENDHDR_3 43
|
||||
#define BZ_X_ENDHDR_4 44
|
||||
#define BZ_X_ENDHDR_5 45
|
||||
#define BZ_X_ENDHDR_6 46
|
||||
#define BZ_X_CCRC_1 47
|
||||
#define BZ_X_CCRC_2 48
|
||||
#define BZ_X_CCRC_3 49
|
||||
#define BZ_X_CCRC_4 50
|
||||
|
||||
|
||||
|
||||
/*-- Constants for the fast MTF decoder. --*/
|
||||
|
||||
#define MTFA_SIZE 4096
|
||||
#define MTFL_SIZE 16
|
||||
|
||||
|
||||
|
/*-- Structure holding all the decompression-side stuff. --*/

typedef
   struct {
      /* pointer back to the struct bz_stream */
      bz_stream* strm;

      /* state indicator for this stream */
      Int32    state;

      /* for doing the final run-length decoding */
      UChar    state_out_ch;
      Int32    state_out_len;
      Bool     blockRandomised;
      BZ_RAND_DECLS;

      /* the buffer for bit stream reading */
      UInt32   bsBuff;
      Int32    bsLive;

      /* misc administratium */
      Int32    blockSize100k;
      Bool     smallDecompress;
      Int32    currBlockNo;
      Int32    verbosity;

      /* for undoing the Burrows-Wheeler transform */
      Int32    origPtr;
      UInt32   tPos;
      Int32    k0;
      Int32    unzftab[256];
      Int32    nblock_used;
      Int32    cftab[257];
      Int32    cftabCopy[257];

      /* for undoing the Burrows-Wheeler transform (FAST) */
      UInt32   *tt;

      /* for undoing the Burrows-Wheeler transform (SMALL) */
      UInt16   *ll16;
      UChar    *ll4;

      /* stored and calculated CRCs */
      UInt32   storedBlockCRC;
      UInt32   storedCombinedCRC;
      UInt32   calculatedBlockCRC;
      UInt32   calculatedCombinedCRC;

      /* map of bytes used in block */
      Int32    nInUse;
      Bool     inUse[256];
      Bool     inUse16[16];
      UChar    seqToUnseq[256];

      /* for decoding the MTF values */
      UChar    mtfa   [MTFA_SIZE];
      Int32    mtfbase[256 / MTFL_SIZE];
      UChar    selector   [BZ_MAX_SELECTORS];
      UChar    selectorMtf[BZ_MAX_SELECTORS];
      UChar    len  [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];

      Int32    limit  [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
      Int32    base   [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
      Int32    perm   [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
      Int32    minLens[BZ_N_GROUPS];

      /* save area for scalars in the main decompress code */
      Int32    save_i;
      Int32    save_j;
      Int32    save_t;
      Int32    save_alphaSize;
      Int32    save_nGroups;
      Int32    save_nSelectors;
      Int32    save_EOB;
      Int32    save_groupNo;
      Int32    save_groupPos;
      Int32    save_nextSym;
      Int32    save_nblockMAX;
      Int32    save_nblock;
      Int32    save_es;
      Int32    save_N;
      Int32    save_curr;
      Int32    save_zt;
      Int32    save_zn;
      Int32    save_zvec;
      Int32    save_zj;
      Int32    save_gSel;
      Int32    save_gMinlen;
      Int32*   save_gLimit;
      Int32*   save_gBase;
      Int32*   save_gPerm;

   }
   DState;
||||
|
||||
|
||||
|
/*-- Macros for decompression. --*/

#define BZ_GET_FAST(cccc)                     \
    s->tPos = s->tt[s->tPos];                 \
    cccc = (UChar)(s->tPos & 0xff);           \
    s->tPos >>= 8;

#define BZ_GET_FAST_C(cccc)                   \
    c_tPos = c_tt[c_tPos];                    \
    cccc = (UChar)(c_tPos & 0xff);            \
    c_tPos >>= 8;

#define SET_LL4(i,n)                                             \
   { if (((i) & 0x1) == 0)                                       \
        s->ll4[(i) >> 1] = (s->ll4[(i) >> 1] & 0xf0) | (n); else \
        s->ll4[(i) >> 1] = (s->ll4[(i) >> 1] & 0x0f) | ((n) << 4); \
   }

#define GET_LL4(i)                             \
    (((UInt32)(s->ll4[(i) >> 1])) >> (((i) << 2) & 0x4) & 0xF)

#define SET_LL(i,n)                            \
   { s->ll16[i] = (UInt16)(n & 0x0000ffff);    \
     SET_LL4(i, n >> 16);                      \
   }

#define GET_LL(i) \
   (((UInt32)s->ll16[i]) | (GET_LL4(i) << 16))

#define BZ_GET_SMALL(cccc)                        \
      cccc = indexIntoF ( s->tPos, s->cftab );    \
      s->tPos = GET_LL(s->tPos);
||||
|
||||
|
||||
/*-- externs for decompression. --*/
|
||||
|
||||
extern Int32
|
||||
indexIntoF ( Int32, Int32* );
|
||||
|
||||
extern Int32
|
||||
decompress ( DState* );
|
||||
|
||||
extern void
|
||||
hbCreateDecodeTables ( Int32*, Int32*, Int32*, UChar*,
|
||||
Int32, Int32, Int32 );
|
||||
|
||||
|
||||
#endif
|
||||
|
||||
|
||||
/*-- BZ_NO_STDIO seems to make NULL disappear on some platforms. --*/
|
||||
|
||||
#ifdef BZ_NO_STDIO
|
||||
#ifndef NULL
|
||||
#define NULL 0
|
||||
#endif
|
||||
#endif
|
||||
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end bzlib_private.h ---*/
|
||||
/*-------------------------------------------------------------*/
|
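The SET_LL / GET_LL macros in bzlib_private.h store each 20-bit value as a 16-bit low part in ll16 plus a 4-bit high part packed two per byte in ll4. The following is a minimal standalone sketch of that packing, not part of the imported sources. The names LLStore, LL_COUNT, ll_set and ll_get are hypothetical and exist only for this illustration, and the fixed-size arrays stand in for the heap-allocated ll16 and ll4 buffers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the ll16/ll4 pair: room for LL_COUNT
   entries of 20 bits each. */
#define LL_COUNT 8

typedef struct {
   uint16_t ll16[LL_COUNT];            /* low 16 bits of every entry     */
   uint8_t  ll4[(LL_COUNT + 1) / 2];   /* high 4 bits, two entries/byte  */
} LLStore;

/* Same layout as SET_LL: even indices use the low nibble of
   ll4[i >> 1], odd indices use the high nibble. */
static void ll_set(LLStore *st, uint32_t i, uint32_t n)
{
   uint32_t hi = n >> 16;
   assert(i < LL_COUNT && n < (1u << 20));
   st->ll16[i] = (uint16_t)(n & 0xffffu);
   if ((i & 1u) == 0)
      st->ll4[i >> 1] = (uint8_t)((st->ll4[i >> 1] & 0xf0u) | hi);
   else
      st->ll4[i >> 1] = (uint8_t)((st->ll4[i >> 1] & 0x0fu) | (hi << 4));
}

/* Same layout as GET_LL: ((i << 2) & 0x4) is 0 for even i and 4 for
   odd i, which selects the nibble to shift down. */
static uint32_t ll_get(const LLStore *st, uint32_t i)
{
   uint32_t hi;
   assert(i < LL_COUNT);
   hi = ((uint32_t)st->ll4[i >> 1] >> ((i << 2) & 0x4u)) & 0xfu;
   return (uint32_t)st->ll16[i] | (hi << 16);
}
```

Storing 20 bits in 2.5 bytes instead of a full 32-bit word is presumably what lets the SMALL decompression path use less memory than the FAST path, which keeps a UInt32 per entry in tt.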
575
lib/libbz2/compress.c
Normal file
@ -0,0 +1,575 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Compression machinery (not incl block sorting) ---*/
|
||||
/*--- compress.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0 of 28 June 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#include "bzlib_private.h"
|
||||
|
||||
|
/*---------------------------------------------------*/
/*--- Bit stream I/O                              ---*/
/*---------------------------------------------------*/

/*---------------------------------------------------*/
void bsInitWrite ( EState* s )
{
   s->bsLive = 0;
   s->bsBuff = 0;
}


/*---------------------------------------------------*/
static
void bsFinishWrite ( EState* s )
{
   while (s->bsLive > 0) {
      ((UChar*)(s->quadrant))[s->numZ] = (UChar)(s->bsBuff >> 24);
      s->numZ++;
      s->bsBuff <<= 8;
      s->bsLive -= 8;
   }
}


/*---------------------------------------------------*/
#define bsNEEDW(nz)                           \
{                                             \
   while (s->bsLive >= 8) {                   \
      ((UChar*)(s->quadrant))[s->numZ]        \
         = (UChar)(s->bsBuff >> 24);          \
      s->numZ++;                              \
      s->bsBuff <<= 8;                        \
      s->bsLive -= 8;                         \
   }                                          \
}


/*---------------------------------------------------*/
static
void bsW ( EState* s, Int32 n, UInt32 v )
{
   bsNEEDW ( n );
   s->bsBuff |= (v << (32 - s->bsLive - n));
   s->bsLive += n;
}


/*---------------------------------------------------*/
static
void bsPutUInt32 ( EState* s, UInt32 u )
{
   bsW ( s, 8, (u >> 24) & 0xffL );
   bsW ( s, 8, (u >> 16) & 0xffL );
   bsW ( s, 8, (u >>  8) & 0xffL );
   bsW ( s, 8,  u        & 0xffL );
}


/*---------------------------------------------------*/
static
void bsPutUChar ( EState* s, UChar c )
{
   bsW( s, 8, (UInt32)c );
}


/*---------------------------------------------------*/
/*--- The back end proper                         ---*/
/*---------------------------------------------------*/

/*---------------------------------------------------*/
static
void makeMaps_e ( EState* s )
{
   Int32 i;
   s->nInUse = 0;
   for (i = 0; i < 256; i++)
      if (s->inUse[i]) {
         s->unseqToSeq[i] = s->nInUse;
         s->nInUse++;
      }
}
||||
|
||||
|
/*---------------------------------------------------*/
static
void generateMTFValues ( EState* s )
{
   UChar  yy[256];
   Int32  i, j;
   UChar  tmp;
   UChar  tmp2;
   Int32  zPend;
   Int32  wr;
   Int32  EOB;

   makeMaps_e ( s );
   EOB = s->nInUse+1;

   for (i = 0; i <= EOB; i++) s->mtfFreq[i] = 0;

   wr = 0;
   zPend = 0;
   for (i = 0; i < s->nInUse; i++) yy[i] = (UChar) i;

   for (i = 0; i < s->nblock; i++) {
      UChar ll_i;

      AssertD ( wr <= i, "generateMTFValues(1)" );
      j = s->zptr[i]-1; if (j < 0) j += s->nblock;
      ll_i = s->unseqToSeq[s->block[j]];
      AssertD ( ll_i < s->nInUse, "generateMTFValues(2a)" );

      j = 0;
      tmp = yy[j];
      while ( ll_i != tmp ) {
         j++;
         tmp2 = tmp;
         tmp = yy[j];
         yy[j] = tmp2;
      };
      yy[0] = tmp;

      if (j == 0) {
         zPend++;
      } else {
         if (zPend > 0) {
            zPend--;
            while (True) {
               switch (zPend % 2) {
                  case 0: s->szptr[wr] = BZ_RUNA; wr++; s->mtfFreq[BZ_RUNA]++; break;
                  case 1: s->szptr[wr] = BZ_RUNB; wr++; s->mtfFreq[BZ_RUNB]++; break;
               };
               if (zPend < 2) break;
               zPend = (zPend - 2) / 2;
            };
            zPend = 0;
         }
         s->szptr[wr] = j+1; wr++; s->mtfFreq[j+1]++;
      }
   }

   if (zPend > 0) {
      zPend--;
      while (True) {
         switch (zPend % 2) {
            case 0: s->szptr[wr] = BZ_RUNA; wr++; s->mtfFreq[BZ_RUNA]++; break;
            case 1: s->szptr[wr] = BZ_RUNB; wr++; s->mtfFreq[BZ_RUNB]++; break;
         };
         if (zPend < 2) break;
         zPend = (zPend - 2) / 2;
      };
   }

   s->szptr[wr] = EOB; wr++; s->mtfFreq[EOB]++;

   s->nMTF = wr;
}
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
#define BZ_LESSER_ICOST 0
|
||||
#define BZ_GREATER_ICOST 15
|
||||
|
||||
static
|
||||
void sendMTFValues ( EState* s )
|
||||
{
|
||||
Int32 v, t, i, j, gs, ge, totc, bt, bc, iter;
|
||||
Int32 nSelectors, alphaSize, minLen, maxLen, selCtr;
|
||||
Int32 nGroups, nBytes;
|
||||
|
||||
/*--
|
||||
UChar len [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
is a global since the decoder also needs it.
|
||||
|
||||
Int32 code[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
Int32 rfreq[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
|
||||
are also globals only used in this proc.
|
||||
Made global to keep stack frame size small.
|
||||
--*/
|
||||
|
||||
|
||||
UInt16 cost[BZ_N_GROUPS];
|
||||
Int32 fave[BZ_N_GROUPS];
|
||||
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf3( " %d in block, %d after MTF & 1-2 coding, "
|
||||
"%d+2 syms in use\n",
|
||||
s->nblock, s->nMTF, s->nInUse );
|
||||
|
||||
alphaSize = s->nInUse+2;
|
||||
for (t = 0; t < BZ_N_GROUPS; t++)
|
||||
for (v = 0; v < alphaSize; v++)
|
||||
s->len[t][v] = BZ_GREATER_ICOST;
|
||||
|
||||
/*--- Decide how many coding tables to use ---*/
|
||||
AssertH ( s->nMTF > 0, 3001 );
|
||||
if (s->nMTF < 200) nGroups = 2; else
|
||||
if (s->nMTF < 800) nGroups = 4; else
|
||||
nGroups = 6;
|
||||
|
||||
/*--- Generate an initial set of coding tables ---*/
|
||||
{
|
||||
Int32 nPart, remF, tFreq, aFreq;
|
||||
|
||||
nPart = nGroups;
|
||||
remF = s->nMTF;
|
||||
gs = 0;
|
||||
while (nPart > 0) {
|
||||
tFreq = remF / nPart;
|
||||
ge = gs-1;
|
||||
aFreq = 0;
|
||||
while (aFreq < tFreq && ge < alphaSize-1) {
|
||||
ge++;
|
||||
aFreq += s->mtfFreq[ge];
|
||||
}
|
||||
|
||||
if (ge > gs
|
||||
&& nPart != nGroups && nPart != 1
|
||||
&& ((nGroups-nPart) % 2 == 1)) {
|
||||
aFreq -= s->mtfFreq[ge];
|
||||
ge--;
|
||||
}
|
||||
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf5( " initial group %d, [%d .. %d], "
|
||||
"has %d syms (%4.1f%%)\n",
|
||||
nPart, gs, ge, aFreq,
|
||||
(100.0 * (float)aFreq) / (float)(s->nMTF) );
|
||||
|
||||
for (v = 0; v < alphaSize; v++)
|
||||
if (v >= gs && v <= ge)
|
||||
s->len[nPart-1][v] = BZ_LESSER_ICOST; else
|
||||
s->len[nPart-1][v] = BZ_GREATER_ICOST;
|
||||
|
||||
nPart--;
|
||||
gs = ge+1;
|
||||
remF -= aFreq;
|
||||
}
|
||||
}
|
||||
|
||||
/*---
|
||||
Iterate up to BZ_N_ITERS times to improve the tables.
|
||||
---*/
|
||||
for (iter = 0; iter < BZ_N_ITERS; iter++) {
|
||||
|
||||
for (t = 0; t < nGroups; t++) fave[t] = 0;
|
||||
|
||||
for (t = 0; t < nGroups; t++)
|
||||
for (v = 0; v < alphaSize; v++)
|
||||
s->rfreq[t][v] = 0;
|
||||
|
||||
nSelectors = 0;
|
||||
totc = 0;
|
||||
gs = 0;
|
||||
while (True) {
|
||||
|
||||
/*--- Set group start & end marks. --*/
|
||||
if (gs >= s->nMTF) break;
|
||||
ge = gs + BZ_G_SIZE - 1;
|
||||
if (ge >= s->nMTF) ge = s->nMTF-1;
|
||||
|
||||
/*--
|
||||
Calculate the cost of this group as coded
|
||||
by each of the coding tables.
|
||||
--*/
|
||||
for (t = 0; t < nGroups; t++) cost[t] = 0;
|
||||
|
||||
if (nGroups == 6) {
|
||||
register UInt16 cost0, cost1, cost2, cost3, cost4, cost5;
|
||||
cost0 = cost1 = cost2 = cost3 = cost4 = cost5 = 0;
|
||||
for (i = gs; i <= ge; i++) {
|
||||
UInt16 icv = s->szptr[i];
|
||||
cost0 += s->len[0][icv];
|
||||
cost1 += s->len[1][icv];
|
||||
cost2 += s->len[2][icv];
|
||||
cost3 += s->len[3][icv];
|
||||
cost4 += s->len[4][icv];
|
||||
cost5 += s->len[5][icv];
|
||||
}
|
||||
cost[0] = cost0; cost[1] = cost1; cost[2] = cost2;
|
||||
cost[3] = cost3; cost[4] = cost4; cost[5] = cost5;
|
||||
} else {
|
||||
for (i = gs; i <= ge; i++) {
|
||||
UInt16 icv = s->szptr[i];
|
||||
for (t = 0; t < nGroups; t++) cost[t] += s->len[t][icv];
|
||||
}
|
||||
}
|
||||
|
||||
/*--
|
||||
Find the coding table which is best for this group,
|
||||
and record its identity in the selector table.
|
||||
--*/
|
||||
bc = 999999999; bt = -1;
|
||||
for (t = 0; t < nGroups; t++)
|
||||
if (cost[t] < bc) { bc = cost[t]; bt = t; };
|
||||
totc += bc;
|
||||
fave[bt]++;
|
||||
s->selector[nSelectors] = bt;
|
||||
nSelectors++;
|
||||
|
||||
/*--
|
||||
Increment the symbol frequencies for the selected table.
|
||||
--*/
|
||||
for (i = gs; i <= ge; i++)
|
||||
s->rfreq[bt][ s->szptr[i] ]++;
|
||||
|
||||
gs = ge+1;
|
||||
}
|
||||
if (s->verbosity >= 3) {
|
||||
VPrintf2 ( " pass %d: size is %d, grp uses are ",
|
||||
iter+1, totc/8 );
|
||||
for (t = 0; t < nGroups; t++)
|
||||
VPrintf1 ( "%d ", fave[t] );
|
||||
VPrintf0 ( "\n" );
|
||||
}
|
||||
|
||||
/*--
|
||||
Recompute the tables based on the accumulated frequencies.
|
||||
--*/
|
||||
for (t = 0; t < nGroups; t++)
|
||||
hbMakeCodeLengths ( &(s->len[t][0]), &(s->rfreq[t][0]),
|
||||
alphaSize, 20 );
|
||||
}
|
||||
|
||||
|
||||
AssertH( nGroups < 8, 3002 );
|
||||
AssertH( nSelectors < 32768 &&
|
||||
nSelectors <= (2 + (900000 / BZ_G_SIZE)),
|
||||
3003 );
|
||||
|
||||
|
||||
/*--- Compute MTF values for the selectors. ---*/
|
||||
{
|
||||
UChar pos[BZ_N_GROUPS], ll_i, tmp2, tmp;
|
||||
for (i = 0; i < nGroups; i++) pos[i] = i;
|
||||
for (i = 0; i < nSelectors; i++) {
|
||||
ll_i = s->selector[i];
|
||||
j = 0;
|
||||
tmp = pos[j];
|
||||
while ( ll_i != tmp ) {
|
||||
j++;
|
||||
tmp2 = tmp;
|
||||
tmp = pos[j];
|
||||
pos[j] = tmp2;
|
||||
};
|
||||
pos[0] = tmp;
|
||||
s->selectorMtf[i] = j;
|
||||
}
|
||||
};
|
||||
|
||||
/*--- Assign actual codes for the tables. --*/
|
||||
for (t = 0; t < nGroups; t++) {
|
||||
minLen = 32;
|
||||
maxLen = 0;
|
||||
for (i = 0; i < alphaSize; i++) {
|
||||
if (s->len[t][i] > maxLen) maxLen = s->len[t][i];
|
||||
if (s->len[t][i] < minLen) minLen = s->len[t][i];
|
||||
}
|
||||
AssertH ( !(maxLen > 20), 3004 );
|
||||
AssertH ( !(minLen < 1), 3005 );
|
||||
hbAssignCodes ( &(s->code[t][0]), &(s->len[t][0]),
|
||||
minLen, maxLen, alphaSize );
|
||||
}
|
||||
|
||||
/*--- Transmit the mapping table. ---*/
|
||||
{
|
||||
Bool inUse16[16];
|
||||
for (i = 0; i < 16; i++) {
|
||||
inUse16[i] = False;
|
||||
for (j = 0; j < 16; j++)
|
||||
if (s->inUse[i * 16 + j]) inUse16[i] = True;
|
||||
}
|
||||
|
||||
nBytes = s->numZ;
|
||||
for (i = 0; i < 16; i++)
|
||||
if (inUse16[i]) bsW(s,1,1); else bsW(s,1,0);
|
||||
|
||||
for (i = 0; i < 16; i++)
|
||||
if (inUse16[i])
|
||||
for (j = 0; j < 16; j++) {
|
||||
if (s->inUse[i * 16 + j]) bsW(s,1,1); else bsW(s,1,0);
|
||||
}
|
||||
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf1( " bytes: mapping %d, ", s->numZ-nBytes );
|
||||
}
|
||||
|
||||
/*--- Now the selectors. ---*/
|
||||
nBytes = s->numZ;
|
||||
bsW ( s, 3, nGroups );
|
||||
bsW ( s, 15, nSelectors );
|
||||
for (i = 0; i < nSelectors; i++) {
|
||||
for (j = 0; j < s->selectorMtf[i]; j++) bsW(s,1,1);
|
||||
bsW(s,1,0);
|
||||
}
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf1( "selectors %d, ", s->numZ-nBytes );
|
||||
|
||||
/*--- Now the coding tables. ---*/
|
||||
nBytes = s->numZ;
|
||||
|
||||
for (t = 0; t < nGroups; t++) {
|
||||
Int32 curr = s->len[t][0];
|
||||
bsW ( s, 5, curr );
|
||||
for (i = 0; i < alphaSize; i++) {
|
||||
while (curr < s->len[t][i]) { bsW(s,2,2); curr++; /* 10 */ };
|
||||
while (curr > s->len[t][i]) { bsW(s,2,3); curr--; /* 11 */ };
|
||||
bsW ( s, 1, 0 );
|
||||
}
|
||||
}
|
||||
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf1 ( "code lengths %d, ", s->numZ-nBytes );
|
||||
|
||||
/*--- And finally, the block data proper ---*/
|
||||
nBytes = s->numZ;
|
||||
selCtr = 0;
|
||||
gs = 0;
|
||||
while (True) {
|
||||
if (gs >= s->nMTF) break;
|
||||
ge = gs + BZ_G_SIZE - 1;
|
||||
if (ge >= s->nMTF) ge = s->nMTF-1;
|
||||
for (i = gs; i <= ge; i++) {
|
||||
AssertH ( s->selector[selCtr] < nGroups, 3006 );
|
||||
bsW ( s,
|
||||
s->len [s->selector[selCtr]] [s->szptr[i]],
|
||||
s->code [s->selector[selCtr]] [s->szptr[i]] );
|
||||
}
|
||||
|
||||
gs = ge+1;
|
||||
selCtr++;
|
||||
}
|
||||
AssertH( selCtr == nSelectors, 3007 );
|
||||
|
||||
if (s->verbosity >= 3)
|
||||
VPrintf1( "codes %d\n", s->numZ-nBytes );
|
||||
}
|
||||
|
||||
|
/*---------------------------------------------------*/
void compressBlock ( EState* s, Bool is_last_block )
{
   if (s->nblock > 0) {

      BZ_FINALISE_CRC ( s->blockCRC );
      s->combinedCRC = (s->combinedCRC << 1) | (s->combinedCRC >> 31);
      s->combinedCRC ^= s->blockCRC;
      if (s->blockNo > 1) s->numZ = 0;

      if (s->verbosity >= 2)
         VPrintf4( " block %d: crc = 0x%8x, "
                   "combined CRC = 0x%8x, size = %d\n",
                   s->blockNo, s->blockCRC, s->combinedCRC, s->nblock );

      blockSort ( s );
   }

   /*-- If this is the first block, create the stream header. --*/
   if (s->blockNo == 1) {
      bsInitWrite ( s );
      bsPutUChar ( s, 'B' );
      bsPutUChar ( s, 'Z' );
      bsPutUChar ( s, 'h' );
      bsPutUChar ( s, '0' + s->blockSize100k );
   }

   if (s->nblock > 0) {

      bsPutUChar ( s, 0x31 ); bsPutUChar ( s, 0x41 );
      bsPutUChar ( s, 0x59 ); bsPutUChar ( s, 0x26 );
      bsPutUChar ( s, 0x53 ); bsPutUChar ( s, 0x59 );

      /*-- Now the block's CRC, so it is in a known place. --*/
      bsPutUInt32 ( s, s->blockCRC );

      /*-- Now a single bit indicating randomisation. --*/
      if (s->blockRandomised) {
         bsW(s,1,1); s->nBlocksRandomised++;
      } else
         bsW(s,1,0);

      bsW ( s, 24, s->origPtr );
      generateMTFValues ( s );
      sendMTFValues ( s );
   }


   /*-- If this is the last block, add the stream trailer. --*/
   if (is_last_block) {

      if (s->verbosity >= 2 && s->nBlocksRandomised > 0)
         VPrintf2 ( " %d block%s needed randomisation\n",
                    s->nBlocksRandomised,
                    s->nBlocksRandomised == 1 ? "" : "s" );

      bsPutUChar ( s, 0x17 ); bsPutUChar ( s, 0x72 );
      bsPutUChar ( s, 0x45 ); bsPutUChar ( s, 0x38 );
      bsPutUChar ( s, 0x50 ); bsPutUChar ( s, 0x90 );
      bsPutUInt32 ( s, s->combinedCRC );
      if (s->verbosity >= 2)
         VPrintf1( " final combined CRC = 0x%x\n ", s->combinedCRC );
      bsFinishWrite ( s );
   }
}
||||
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end compress.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
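In generateMTFValues, a run of zPend consecutive zero MTF positions is written as a short string of RUNA / RUNB symbols rather than zPend separate zeros. The loop amounts to writing the run length in bijective base 2, least significant digit first, with RUNA worth 1 and RUNB worth 2 at each position. Below is a standalone sketch of that step with a matching decoder, not part of the imported sources. The function names are hypothetical, and the symbol values RUNA = 0 and RUNB = 1 are an assumption, since the BZ_RUNA / BZ_RUNB definitions are not shown in this part of the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed symbol values; the real BZ_RUNA / BZ_RUNB are defined
   elsewhere in bzlib_private.h. */
enum { RUNA = 0, RUNB = 1 };

/* Encode a zero run of length run (> 0) using the same loop as
   generateMTFValues. Returns the number of symbols written. */
static size_t encode_zero_run(int run, unsigned char *out, size_t cap)
{
   size_t n = 0;
   assert(run > 0);
   run--;
   for (;;) {
      assert(n < cap);
      out[n++] = (run % 2 == 0) ? RUNA : RUNB;
      if (run < 2) break;
      run = (run - 2) / 2;
   }
   return n;
}

/* Inverse: digit k contributes 1 (RUNA) or 2 (RUNB) times 2^k. */
static int decode_zero_run(const unsigned char *syms, size_t n)
{
   int run = 0;
   int weight = 1;
   size_t k;
   for (k = 0; k < n; k++) {
      run += (syms[k] == RUNA ? 1 : 2) * weight;
      weight *= 2;
   }
   return run;
}

/* Helper for checking: every run length in 1..max_run must survive an
   encode/decode round trip. Returns 1 on success, 0 otherwise. */
static int roundtrip_ok(int max_run)
{
   unsigned char buf[64];
   int r;
   for (r = 1; r <= max_run; r++) {
      size_t n = encode_zero_run(r, buf, sizeof buf);
      if (decode_zero_run(buf, n) != r) return 0;
   }
   return 1;
}
```

For example, a run of 5 becomes RUNA RUNB (1 + 2*2), so the number of symbols grows only logarithmically with the run length.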
144
lib/libbz2/crctable.c
Normal file
@ -0,0 +1,144 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Table for doing CRCs ---*/
|
||||
/*--- crctable.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0 of 28 June 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#include "bzlib_private.h"
|
||||
|
||||
/*--
|
||||
I think this is an implementation of the AUTODIN-II,
|
||||
Ethernet & FDDI 32-bit CRC standard. Vaguely derived
|
||||
from code by Rob Warnock, in Section 51 of the
|
||||
comp.compression FAQ.
|
||||
--*/
|
||||
|
||||
UInt32 crc32Table[256] = {
|
||||
|
||||
/*-- Ugly, innit? --*/
|
||||
|
||||
0x00000000L, 0x04c11db7L, 0x09823b6eL, 0x0d4326d9L,
|
||||
0x130476dcL, 0x17c56b6bL, 0x1a864db2L, 0x1e475005L,
|
||||
0x2608edb8L, 0x22c9f00fL, 0x2f8ad6d6L, 0x2b4bcb61L,
|
||||
0x350c9b64L, 0x31cd86d3L, 0x3c8ea00aL, 0x384fbdbdL,
|
||||
0x4c11db70L, 0x48d0c6c7L, 0x4593e01eL, 0x4152fda9L,
|
||||
0x5f15adacL, 0x5bd4b01bL, 0x569796c2L, 0x52568b75L,
|
||||
0x6a1936c8L, 0x6ed82b7fL, 0x639b0da6L, 0x675a1011L,
|
||||
0x791d4014L, 0x7ddc5da3L, 0x709f7b7aL, 0x745e66cdL,
|
||||
0x9823b6e0L, 0x9ce2ab57L, 0x91a18d8eL, 0x95609039L,
|
||||
0x8b27c03cL, 0x8fe6dd8bL, 0x82a5fb52L, 0x8664e6e5L,
|
||||
0xbe2b5b58L, 0xbaea46efL, 0xb7a96036L, 0xb3687d81L,
|
||||
0xad2f2d84L, 0xa9ee3033L, 0xa4ad16eaL, 0xa06c0b5dL,
|
||||
0xd4326d90L, 0xd0f37027L, 0xddb056feL, 0xd9714b49L,
|
||||
0xc7361b4cL, 0xc3f706fbL, 0xceb42022L, 0xca753d95L,
|
||||
0xf23a8028L, 0xf6fb9d9fL, 0xfbb8bb46L, 0xff79a6f1L,
|
||||
0xe13ef6f4L, 0xe5ffeb43L, 0xe8bccd9aL, 0xec7dd02dL,
|
||||
0x34867077L, 0x30476dc0L, 0x3d044b19L, 0x39c556aeL,
|
||||
0x278206abL, 0x23431b1cL, 0x2e003dc5L, 0x2ac12072L,
|
||||
0x128e9dcfL, 0x164f8078L, 0x1b0ca6a1L, 0x1fcdbb16L,
|
||||
0x018aeb13L, 0x054bf6a4L, 0x0808d07dL, 0x0cc9cdcaL,
|
||||
0x7897ab07L, 0x7c56b6b0L, 0x71159069L, 0x75d48ddeL,
|
||||
0x6b93dddbL, 0x6f52c06cL, 0x6211e6b5L, 0x66d0fb02L,
|
||||
0x5e9f46bfL, 0x5a5e5b08L, 0x571d7dd1L, 0x53dc6066L,
|
||||
0x4d9b3063L, 0x495a2dd4L, 0x44190b0dL, 0x40d816baL,
|
||||
0xaca5c697L, 0xa864db20L, 0xa527fdf9L, 0xa1e6e04eL,
|
||||
0xbfa1b04bL, 0xbb60adfcL, 0xb6238b25L, 0xb2e29692L,
|
||||
0x8aad2b2fL, 0x8e6c3698L, 0x832f1041L, 0x87ee0df6L,
|
||||
0x99a95df3L, 0x9d684044L, 0x902b669dL, 0x94ea7b2aL,
|
||||
0xe0b41de7L, 0xe4750050L, 0xe9362689L, 0xedf73b3eL,
|
||||
0xf3b06b3bL, 0xf771768cL, 0xfa325055L, 0xfef34de2L,
|
||||
0xc6bcf05fL, 0xc27dede8L, 0xcf3ecb31L, 0xcbffd686L,
|
||||
0xd5b88683L, 0xd1799b34L, 0xdc3abdedL, 0xd8fba05aL,
|
||||
0x690ce0eeL, 0x6dcdfd59L, 0x608edb80L, 0x644fc637L,
|
||||
0x7a089632L, 0x7ec98b85L, 0x738aad5cL, 0x774bb0ebL,
|
||||
0x4f040d56L, 0x4bc510e1L, 0x46863638L, 0x42472b8fL,
|
||||
0x5c007b8aL, 0x58c1663dL, 0x558240e4L, 0x51435d53L,
|
||||
0x251d3b9eL, 0x21dc2629L, 0x2c9f00f0L, 0x285e1d47L,
|
||||
0x36194d42L, 0x32d850f5L, 0x3f9b762cL, 0x3b5a6b9bL,
|
||||
0x0315d626L, 0x07d4cb91L, 0x0a97ed48L, 0x0e56f0ffL,
|
||||
0x1011a0faL, 0x14d0bd4dL, 0x19939b94L, 0x1d528623L,
|
||||
0xf12f560eL, 0xf5ee4bb9L, 0xf8ad6d60L, 0xfc6c70d7L,
|
||||
0xe22b20d2L, 0xe6ea3d65L, 0xeba91bbcL, 0xef68060bL,
|
||||
0xd727bbb6L, 0xd3e6a601L, 0xdea580d8L, 0xda649d6fL,
|
||||
0xc423cd6aL, 0xc0e2d0ddL, 0xcda1f604L, 0xc960ebb3L,
|
||||
0xbd3e8d7eL, 0xb9ff90c9L, 0xb4bcb610L, 0xb07daba7L,
|
||||
0xae3afba2L, 0xaafbe615L, 0xa7b8c0ccL, 0xa379dd7bL,
|
||||
0x9b3660c6L, 0x9ff77d71L, 0x92b45ba8L, 0x9675461fL,
|
||||
0x8832161aL, 0x8cf30badL, 0x81b02d74L, 0x857130c3L,
|
||||
0x5d8a9099L, 0x594b8d2eL, 0x5408abf7L, 0x50c9b640L,
|
||||
0x4e8ee645L, 0x4a4ffbf2L, 0x470cdd2bL, 0x43cdc09cL,
|
||||
0x7b827d21L, 0x7f436096L, 0x7200464fL, 0x76c15bf8L,
|
||||
0x68860bfdL, 0x6c47164aL, 0x61043093L, 0x65c52d24L,
|
||||
0x119b4be9L, 0x155a565eL, 0x18197087L, 0x1cd86d30L,
|
||||
0x029f3d35L, 0x065e2082L, 0x0b1d065bL, 0x0fdc1becL,
|
||||
0x3793a651L, 0x3352bbe6L, 0x3e119d3fL, 0x3ad08088L,
|
||||
0x2497d08dL, 0x2056cd3aL, 0x2d15ebe3L, 0x29d4f654L,
|
||||
0xc5a92679L, 0xc1683bceL, 0xcc2b1d17L, 0xc8ea00a0L,
|
||||
0xd6ad50a5L, 0xd26c4d12L, 0xdf2f6bcbL, 0xdbee767cL,
|
||||
0xe3a1cbc1L, 0xe760d676L, 0xea23f0afL, 0xeee2ed18L,
|
||||
0xf0a5bd1dL, 0xf464a0aaL, 0xf9278673L, 0xfde69bc4L,
|
||||
0x89b8fd09L, 0x8d79e0beL, 0x803ac667L, 0x84fbdbd0L,
|
||||
0x9abc8bd5L, 0x9e7d9662L, 0x933eb0bbL, 0x97ffad0cL,
|
||||
0xafb010b1L, 0xab710d06L, 0xa6322bdfL, 0xa2f33668L,
|
||||
0xbcb4666dL, 0xb8757bdaL, 0xb5365d03L, 0xb1f740b4L
|
||||
};
|
||||
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- end crctable.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
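The first nonzero entry of crc32Table, 0x04c11db7, is the generator polynomial named in the comment (AUTODIN-II / Ethernet / FDDI), which suggests the table was produced by the usual MSB-first, non-reflected construction. The sketch below regenerates such a table and checks a few entries against the values listed in crctable.c. It is an illustration, not part of the imported sources. The function names are hypothetical, and the update rule in crc32_msb_first (start at 0xffffffff, complement at the end) is an assumption about the CRC macros in bzlib_private.h, which are not shown in this part of the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY 0x04c11db7u

/* Build the 256-entry table: entry i is the remainder of (i << 24)
   after eight MSB-first division steps by CRC32_POLY. */
static void make_crc32_table(uint32_t table[256])
{
   uint32_t i;
   int bit;
   for (i = 0; i < 256; i++) {
      uint32_t c = i << 24;
      for (bit = 0; bit < 8; bit++)
         c = (c & 0x80000000u) ? ((c << 1) ^ CRC32_POLY) : (c << 1);
      table[i] = c;
   }
}

/* Assumed usage of the table: MSB-first update, initial value
   0xffffffff, final complement. */
static uint32_t crc32_msb_first(const uint32_t table[256],
                                const unsigned char *buf, size_t len)
{
   uint32_t crc = 0xffffffffu;
   size_t k;
   assert(buf != NULL || len == 0);
   for (k = 0; k < len; k++)
      crc = (crc << 8) ^ table[(crc >> 24) ^ buf[k]];
   return ~crc;
}
```

Generating the table at start-up would trade a little initialisation time for 1 KB less static data; the sources ship the precomputed table instead.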
636
lib/libbz2/decompress.c
Normal file
@ -0,0 +1,636 @@
|
||||
|
||||
/*-------------------------------------------------------------*/
|
||||
/*--- Decompression machinery ---*/
|
||||
/*--- decompress.c ---*/
|
||||
/*-------------------------------------------------------------*/
|
||||
|
||||
/*--
|
||||
This file is a part of bzip2 and/or libbzip2, a program and
|
||||
library for lossless, block-sorting data compression.
|
||||
|
||||
Copyright (C) 1996-1998 Julian R Seward. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions
|
||||
are met:
|
||||
|
||||
1. Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
|
||||
2. The origin of this software must not be misrepresented; you must
|
||||
not claim that you wrote the original software. If you use this
|
||||
software in a product, an acknowledgment in the product
|
||||
documentation would be appreciated but is not required.
|
||||
|
||||
3. Altered source versions must be plainly marked as such, and must
|
||||
not be misrepresented as being the original software.
|
||||
|
||||
4. The name of the author may not be used to endorse or promote
|
||||
products derived from this software without specific prior written
|
||||
permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
|
||||
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
|
||||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
||||
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
|
||||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
||||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
|
||||
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
|
||||
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
|
||||
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
|
||||
Julian Seward, Guildford, Surrey, UK.
|
||||
jseward@acm.org
|
||||
bzip2/libbzip2 version 0.9.0 of 28 June 1998
|
||||
|
||||
This program is based on (at least) the work of:
|
||||
Mike Burrows
|
||||
David Wheeler
|
||||
Peter Fenwick
|
||||
Alistair Moffat
|
||||
Radford Neal
|
||||
Ian H. Witten
|
||||
Robert Sedgewick
|
||||
Jon L. Bentley
|
||||
|
||||
For more information on these sources, see the manual.
|
||||
--*/
|
||||
|
||||
|
||||
#include "bzlib_private.h"
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
|
||||
static
|
||||
void makeMaps_d ( DState* s )
|
||||
{
|
||||
Int32 i;
|
||||
s->nInUse = 0;
|
||||
for (i = 0; i < 256; i++)
|
||||
if (s->inUse[i]) {
|
||||
s->seqToUnseq[s->nInUse] = i;
|
||||
s->nInUse++;
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
/*---------------------------------------------------*/
#define RETURN(rrr)                               \
   { retVal = rrr; goto save_state_and_return; };

#define GET_BITS(lll,vvv,nnn)                     \
   case lll: s->state = lll;                      \
   while (True) {                                 \
      if (s->bsLive >= nnn) {                     \
         UInt32 v;                                \
         v = (s->bsBuff >>                        \
             (s->bsLive-nnn)) & ((1 << nnn)-1);   \
         s->bsLive -= nnn;                        \
         vvv = v;                                 \
         break;                                   \
      }                                           \
      if (s->strm->avail_in == 0) RETURN(BZ_OK);  \
      s->bsBuff                                   \
         = (s->bsBuff << 8) |                     \
           ((UInt32)                              \
              (*((UChar*)(s->strm->next_in))));   \
      s->bsLive += 8;                             \
      s->strm->next_in++;                         \
      s->strm->avail_in--;                        \
      s->strm->total_in++;                        \
   }

#define GET_UCHAR(lll,uuu)                        \
   GET_BITS(lll,uuu,8)

#define GET_BIT(lll,uuu)                          \
   GET_BITS(lll,uuu,1)

/*---------------------------------------------------*/
#define GET_MTF_VAL(label1,label2,lval)           \
{                                                 \
   if (groupPos == 0) {                           \
      groupNo++;                                  \
      groupPos = BZ_G_SIZE;                       \
      gSel = s->selector[groupNo];                \
      gMinlen = s->minLens[gSel];                 \
      gLimit = &(s->limit[gSel][0]);              \
      gPerm = &(s->perm[gSel][0]);                \
      gBase = &(s->base[gSel][0]);                \
   }                                              \
   groupPos--;                                    \
   zn = gMinlen;                                  \
   GET_BITS(label1, zvec, zn);                    \
   while (zvec > gLimit[zn]) {                    \
      zn++;                                       \
      GET_BIT(label2, zj);                        \
      zvec = (zvec << 1) | zj;                    \
   };                                             \
   lval = gPerm[zvec - gBase[zn]];                \
}


/*---------------------------------------------------*/
Int32 decompress ( DState* s )
{
   UChar      uc;
   Int32      retVal;
   Int32      minLen, maxLen;
   bz_stream* strm = s->strm;

   /* stuff that needs to be saved/restored */
   Int32  i;
   Int32  j;
   Int32  t;
   Int32  alphaSize;
   Int32  nGroups;
   Int32  nSelectors;
   Int32  EOB;
   Int32  groupNo;
   Int32  groupPos;
   Int32  nextSym;
   Int32  nblockMAX;
   Int32  nblock;
   Int32  es;
   Int32  N;
   Int32  curr;
   Int32  zt;
   Int32  zn;
   Int32  zvec;
   Int32  zj;
   Int32  gSel;
   Int32  gMinlen;
   Int32* gLimit;
   Int32* gBase;
   Int32* gPerm;

   if (s->state == BZ_X_MAGIC_1) {
      /*initialise the save area*/
      s->save_i           = 0;
      s->save_j           = 0;
      s->save_t           = 0;
      s->save_alphaSize   = 0;
      s->save_nGroups     = 0;
      s->save_nSelectors  = 0;
      s->save_EOB         = 0;
      s->save_groupNo     = 0;
      s->save_groupPos    = 0;
      s->save_nextSym     = 0;
      s->save_nblockMAX   = 0;
      s->save_nblock      = 0;
      s->save_es          = 0;
      s->save_N           = 0;
      s->save_curr        = 0;
      s->save_zt          = 0;
      s->save_zn          = 0;
      s->save_zvec        = 0;
      s->save_zj          = 0;
      s->save_gSel        = 0;
      s->save_gMinlen     = 0;
      s->save_gLimit      = NULL;
      s->save_gBase       = NULL;
      s->save_gPerm       = NULL;
   }

   /*restore from the save area*/
   i           = s->save_i;
   j           = s->save_j;
   t           = s->save_t;
   alphaSize   = s->save_alphaSize;
   nGroups     = s->save_nGroups;
   nSelectors  = s->save_nSelectors;
   EOB         = s->save_EOB;
   groupNo     = s->save_groupNo;
   groupPos    = s->save_groupPos;
   nextSym     = s->save_nextSym;
   nblockMAX   = s->save_nblockMAX;
   nblock      = s->save_nblock;
   es          = s->save_es;
   N           = s->save_N;
   curr        = s->save_curr;
   zt          = s->save_zt;
   zn          = s->save_zn;
   zvec        = s->save_zvec;
   zj          = s->save_zj;
   gSel        = s->save_gSel;
   gMinlen     = s->save_gMinlen;
   gLimit      = s->save_gLimit;
   gBase       = s->save_gBase;
   gPerm       = s->save_gPerm;

   retVal = BZ_OK;

   switch (s->state) {

      GET_UCHAR(BZ_X_MAGIC_1, uc);
      if (uc != 'B') RETURN(BZ_DATA_ERROR_MAGIC);

      GET_UCHAR(BZ_X_MAGIC_2, uc);
      if (uc != 'Z') RETURN(BZ_DATA_ERROR_MAGIC);

      GET_UCHAR(BZ_X_MAGIC_3, uc)
      if (uc != 'h') RETURN(BZ_DATA_ERROR_MAGIC);

      GET_BITS(BZ_X_MAGIC_4, s->blockSize100k, 8)
      if (s->blockSize100k < '1' ||
          s->blockSize100k > '9') RETURN(BZ_DATA_ERROR_MAGIC);
      s->blockSize100k -= '0';

      if (s->smallDecompress) {
         s->ll16 = BZALLOC( s->blockSize100k * 100000 * sizeof(UInt16) );
         s->ll4  = BZALLOC(
                      ((1 + s->blockSize100k * 100000) >> 1) * sizeof(UChar)
                   );
         if (s->ll16 == NULL || s->ll4 == NULL) RETURN(BZ_MEM_ERROR);
      } else {
         s->tt  = BZALLOC( s->blockSize100k * 100000 * sizeof(Int32) );
         if (s->tt == NULL) RETURN(BZ_MEM_ERROR);
      }

      GET_UCHAR(BZ_X_BLKHDR_1, uc);

      if (uc == 0x17) goto endhdr_2;
      if (uc != 0x31) RETURN(BZ_DATA_ERROR);
      GET_UCHAR(BZ_X_BLKHDR_2, uc);
      if (uc != 0x41) RETURN(BZ_DATA_ERROR);
      GET_UCHAR(BZ_X_BLKHDR_3, uc);
      if (uc != 0x59) RETURN(BZ_DATA_ERROR);
      GET_UCHAR(BZ_X_BLKHDR_4, uc);
      if (uc != 0x26) RETURN(BZ_DATA_ERROR);
      GET_UCHAR(BZ_X_BLKHDR_5, uc);
      if (uc != 0x53) RETURN(BZ_DATA_ERROR);
      GET_UCHAR(BZ_X_BLKHDR_6, uc);
      if (uc != 0x59) RETURN(BZ_DATA_ERROR);

      s->currBlockNo++;
      if (s->verbosity >= 2)
         VPrintf1 ( "\n    [%d: huff+mtf ", s->currBlockNo );

      s->storedBlockCRC = 0;
      GET_UCHAR(BZ_X_BCRC_1, uc);
      s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);
      GET_UCHAR(BZ_X_BCRC_2, uc);
      s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);
      GET_UCHAR(BZ_X_BCRC_3, uc);
      s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);
      GET_UCHAR(BZ_X_BCRC_4, uc);
      s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);

      GET_BITS(BZ_X_RANDBIT, s->blockRandomised, 1);

      s->origPtr = 0;
      GET_UCHAR(BZ_X_ORIGPTR_1, uc);
      s->origPtr = (s->origPtr << 8) | ((Int32)uc);
      GET_UCHAR(BZ_X_ORIGPTR_2, uc);
      s->origPtr = (s->origPtr << 8) | ((Int32)uc);
      GET_UCHAR(BZ_X_ORIGPTR_3, uc);
      s->origPtr = (s->origPtr << 8) | ((Int32)uc);

      /*--- Receive the mapping table ---*/
      for (i = 0; i < 16; i++) {
         GET_BIT(BZ_X_MAPPING_1, uc);
         if (uc == 1)
            s->inUse16[i] = True; else
            s->inUse16[i] = False;
      }

      for (i = 0; i < 256; i++) s->inUse[i] = False;

      for (i = 0; i < 16; i++)
         if (s->inUse16[i])
            for (j = 0; j < 16; j++) {
               GET_BIT(BZ_X_MAPPING_2, uc);
               if (uc == 1) s->inUse[i * 16 + j] = True;
            }
      makeMaps_d ( s );
      alphaSize = s->nInUse+2;

      /*--- Now the selectors ---*/
      GET_BITS(BZ_X_SELECTOR_1, nGroups, 3);
      GET_BITS(BZ_X_SELECTOR_2, nSelectors, 15);
      for (i = 0; i < nSelectors; i++) {
         j = 0;
         while (True) {
            GET_BIT(BZ_X_SELECTOR_3, uc);
            if (uc == 0) break;
            j++;
            if (j > 5) RETURN(BZ_DATA_ERROR);
         }
         s->selectorMtf[i] = j;
      }

      /*--- Undo the MTF values for the selectors. ---*/
      {
         UChar pos[BZ_N_GROUPS], tmp, v;
         for (v = 0; v < nGroups; v++) pos[v] = v;

         for (i = 0; i < nSelectors; i++) {
            v = s->selectorMtf[i];
            tmp = pos[v];
            while (v > 0) { pos[v] = pos[v-1]; v--; }
            pos[0] = tmp;
            s->selector[i] = tmp;
         }
      }

      /*--- Now the coding tables ---*/
      for (t = 0; t < nGroups; t++) {
         GET_BITS(BZ_X_CODING_1, curr, 5);
         for (i = 0; i < alphaSize; i++) {
            while (True) {
               if (curr < 1 || curr > 20) RETURN(BZ_DATA_ERROR);
               GET_BIT(BZ_X_CODING_2, uc);
               if (uc == 0) break;
               GET_BIT(BZ_X_CODING_3, uc);
               if (uc == 0) curr++; else curr--;
            }
            s->len[t][i] = curr;
         }
      }

      /*--- Create the Huffman decoding tables ---*/
      for (t = 0; t < nGroups; t++) {
         minLen = 32;
         maxLen = 0;
         for (i = 0; i < alphaSize; i++) {
            if (s->len[t][i] > maxLen) maxLen = s->len[t][i];
            if (s->len[t][i] < minLen) minLen = s->len[t][i];
         }
         hbCreateDecodeTables (
            &(s->limit[t][0]),
            &(s->base[t][0]),
            &(s->perm[t][0]),
            &(s->len[t][0]),
            minLen, maxLen, alphaSize
         );
         s->minLens[t] = minLen;
      }

      /*--- Now the MTF values ---*/

      EOB       = s->nInUse+1;
      nblockMAX = 100000 * s->blockSize100k;
      groupNo   = -1;
      groupPos  = 0;

      for (i = 0; i <= 255; i++) s->unzftab[i] = 0;

      /*-- MTF init --*/
      {
         Int32 ii, jj, kk;
         kk = MTFA_SIZE-1;
         for (ii = 256 / MTFL_SIZE - 1; ii >= 0; ii--) {
            for (jj = MTFL_SIZE-1; jj >= 0; jj--) {
               s->mtfa[kk] = (UChar)(ii * MTFL_SIZE + jj);
               kk--;
            }
            s->mtfbase[ii] = kk + 1;
         }
      }
      /*-- end MTF init --*/

      nblock = 0;

      GET_MTF_VAL(BZ_X_MTF_1, BZ_X_MTF_2, nextSym);

      while (True) {

         if (nextSym == EOB) break;

         if (nextSym == BZ_RUNA || nextSym == BZ_RUNB) {

            es = -1;
            N = 1;
            do {
               if (nextSym == BZ_RUNA) es = es + (0+1) * N; else
               if (nextSym == BZ_RUNB) es = es + (1+1) * N;
               N = N * 2;
               GET_MTF_VAL(BZ_X_MTF_3, BZ_X_MTF_4, nextSym);
            }
               while (nextSym == BZ_RUNA || nextSym == BZ_RUNB);

            es++;
            uc = s->seqToUnseq[ s->mtfa[s->mtfbase[0]] ];
            s->unzftab[uc] += es;

            if (s->smallDecompress)
               while (es > 0) {
                  s->ll16[nblock] = (UInt16)uc;
                  nblock++;
                  es--;
               }
            else
               while (es > 0) {
                  s->tt[nblock] = (UInt32)uc;
                  nblock++;
                  es--;
               };

            if (nblock > nblockMAX) RETURN(BZ_DATA_ERROR);
            continue;

         } else {

            if (nblock > nblockMAX) RETURN(BZ_DATA_ERROR);

            /*-- uc = MTF ( nextSym-1 ) --*/
            {
               Int32 ii, jj, kk, pp, lno, off;
               UInt32 nn;
               nn = (UInt32)(nextSym - 1);

               if (nn < MTFL_SIZE) {
                  /* avoid general-case expense */
                  pp = s->mtfbase[0];
                  uc = s->mtfa[pp+nn];
                  while (nn > 3) {
                     Int32 z = pp+nn;
                     s->mtfa[(z)  ] = s->mtfa[(z)-1];
                     s->mtfa[(z)-1] = s->mtfa[(z)-2];
                     s->mtfa[(z)-2] = s->mtfa[(z)-3];
                     s->mtfa[(z)-3] = s->mtfa[(z)-4];
                     nn -= 4;
                  }
                  while (nn > 0) {
                     s->mtfa[(pp+nn)] = s->mtfa[(pp+nn)-1]; nn--;
                  };
                  s->mtfa[pp] = uc;
               } else {
                  /* general case */
                  lno = nn / MTFL_SIZE;
                  off = nn % MTFL_SIZE;
                  pp = s->mtfbase[lno] + off;
                  uc = s->mtfa[pp];
                  while (pp > s->mtfbase[lno]) {
                     s->mtfa[pp] = s->mtfa[pp-1]; pp--;
                  };
                  s->mtfbase[lno]++;
                  while (lno > 0) {
                     s->mtfbase[lno]--;
                     s->mtfa[s->mtfbase[lno]]
                        = s->mtfa[s->mtfbase[lno-1] + MTFL_SIZE - 1];
                     lno--;
                  }
                  s->mtfbase[0]--;
                  s->mtfa[s->mtfbase[0]] = uc;
                  if (s->mtfbase[0] == 0) {
                     kk = MTFA_SIZE-1;
                     for (ii = 256 / MTFL_SIZE-1; ii >= 0; ii--) {
                        for (jj = MTFL_SIZE-1; jj >= 0; jj--) {
                           s->mtfa[kk] = s->mtfa[s->mtfbase[ii] + jj];
                           kk--;
                        }
                        s->mtfbase[ii] = kk + 1;
                     }
                  }
               }
            }
            /*-- end uc = MTF ( nextSym-1 ) --*/

            s->unzftab[s->seqToUnseq[uc]]++;
            if (s->smallDecompress)
               s->ll16[nblock] = (UInt16)(s->seqToUnseq[uc]); else
               s->tt[nblock]   = (UInt32)(s->seqToUnseq[uc]);
            nblock++;

            GET_MTF_VAL(BZ_X_MTF_5, BZ_X_MTF_6, nextSym);
            continue;
         }
      }

      s->state_out_len = 0;
      s->state_out_ch  = 0;
      BZ_INITIALISE_CRC ( s->calculatedBlockCRC );
      s->state = BZ_X_OUTPUT;
      if (s->verbosity >= 2) VPrintf0 ( "rt+rld" );

      /*-- Set up cftab to facilitate generation of T^(-1) --*/
      s->cftab[0] = 0;
      for (i = 1; i <= 256; i++) s->cftab[i] = s->unzftab[i-1];
      for (i = 1; i <= 256; i++) s->cftab[i] += s->cftab[i-1];

      if (s->smallDecompress) {

         /*-- Make a copy of cftab, used in generation of T --*/
         for (i = 0; i <= 256; i++) s->cftabCopy[i] = s->cftab[i];

         /*-- compute the T vector --*/
         for (i = 0; i < nblock; i++) {
            uc = (UChar)(s->ll16[i]);
            SET_LL(i, s->cftabCopy[uc]);
            s->cftabCopy[uc]++;
         }

         /*-- Compute T^(-1) by pointer reversal on T --*/
         i = s->origPtr;
         j = GET_LL(i);
         do {
            Int32 tmp = GET_LL(j);
            SET_LL(j, i);
            i = j;
            j = tmp;
         }
            while (i != s->origPtr);

         s->tPos = s->origPtr;
         s->nblock_used = 0;
         if (s->blockRandomised) {
            BZ_RAND_INIT_MASK;
            BZ_GET_SMALL(s->k0); s->nblock_used++;
            BZ_RAND_UPD_MASK; s->k0 ^= BZ_RAND_MASK;
         } else {
            BZ_GET_SMALL(s->k0); s->nblock_used++;
         }

      } else {

         /*-- compute the T^(-1) vector --*/
         for (i = 0; i < nblock; i++) {
            uc = (UChar)(s->tt[i] & 0xff);
            s->tt[s->cftab[uc]] |= (i << 8);
            s->cftab[uc]++;
         }

         s->tPos = s->tt[s->origPtr] >> 8;
         s->nblock_used = 0;
         if (s->blockRandomised) {
            BZ_RAND_INIT_MASK;
            BZ_GET_FAST(s->k0); s->nblock_used++;
            BZ_RAND_UPD_MASK; s->k0 ^= BZ_RAND_MASK;
         } else {
            BZ_GET_FAST(s->k0); s->nblock_used++;
         }

      }

      RETURN(BZ_OK);



    endhdr_2:

      GET_UCHAR(BZ_X_ENDHDR_2, uc);
      if (uc != 0x72) RETURN(BZ_DATA_ERROR);
      GET_UCHAR(BZ_X_ENDHDR_3, uc);
      if (uc != 0x45) RETURN(BZ_DATA_ERROR);
      GET_UCHAR(BZ_X_ENDHDR_4, uc);
      if (uc != 0x38) RETURN(BZ_DATA_ERROR);
      GET_UCHAR(BZ_X_ENDHDR_5, uc);
      if (uc != 0x50) RETURN(BZ_DATA_ERROR);
      GET_UCHAR(BZ_X_ENDHDR_6, uc);
      if (uc != 0x90) RETURN(BZ_DATA_ERROR);

      s->storedCombinedCRC = 0;
      GET_UCHAR(BZ_X_CCRC_1, uc);
      s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);
      GET_UCHAR(BZ_X_CCRC_2, uc);
      s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);
      GET_UCHAR(BZ_X_CCRC_3, uc);
      s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);
      GET_UCHAR(BZ_X_CCRC_4, uc);
      s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);

      s->state = BZ_X_IDLE;
      RETURN(BZ_STREAM_END);

      default: AssertH ( False, 4001 );
   }

   AssertH ( False, 4002 );

   save_state_and_return:

   s->save_i           = i;
   s->save_j           = j;
   s->save_t           = t;
   s->save_alphaSize   = alphaSize;
   s->save_nGroups     = nGroups;
   s->save_nSelectors  = nSelectors;
   s->save_EOB         = EOB;
   s->save_groupNo     = groupNo;
   s->save_groupPos    = groupPos;
   s->save_nextSym     = nextSym;
   s->save_nblockMAX   = nblockMAX;
   s->save_nblock      = nblock;
   s->save_es          = es;
   s->save_N           = N;
   s->save_curr        = curr;
   s->save_zt          = zt;
   s->save_zn          = zn;
   s->save_zvec        = zvec;
   s->save_zj          = zj;
   s->save_gSel        = gSel;
   s->save_gMinlen     = gMinlen;
   s->save_gLimit      = gLimit;
   s->save_gBase       = gBase;
   s->save_gPerm       = gPerm;

   return retVal;
}


/*-------------------------------------------------------------*/
/*--- end decompress.c ---*/
/*-------------------------------------------------------------*/
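Editorial illustration, not part of the imported sources: the fast (non-small) path at the end of decompress() rebuilds a block by storing the T^(-1) links in the upper 24 bits of tt[] and then following them, starting from origPtr. The sketch below replays that idea on a plain byte array. The name inverse_bwt and its parameters are hypothetical, <stdint.h> types stand in for the library's UInt32, and randomisation, CRC checking and run-length decoding are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a libbzip2 function).  'last' holds the n
   block-sorted bytes, 'orig_ptr' the row of the original string in the
   sorted rotation matrix, 'out' receives n bytes.
   Returns 0 on success, -1 if the scratch array cannot be allocated. */
static int inverse_bwt(const unsigned char *last, size_t n,
                       size_t orig_ptr, unsigned char *out)
{
    uint32_t cftab[257];
    uint32_t *tt;
    uint32_t t_pos;
    size_t i;

    /* Links are packed into 24 bits, so the block must fit in them. */
    assert(n > 0 && n < ((size_t)1 << 24) && orig_ptr < n);

    tt = malloc(n * sizeof *tt);
    if (tt == NULL) return -1;

    /* cftab[c] = number of bytes in the block that are smaller than c. */
    memset(cftab, 0, sizeof cftab);
    for (i = 0; i < n; i++) cftab[last[i] + 1]++;
    for (i = 1; i <= 256; i++) cftab[i] += cftab[i - 1];

    /* Low 8 bits: the byte itself.  Upper bits: the T^(-1) link. */
    for (i = 0; i < n; i++) tt[i] = last[i];
    for (i = 0; i < n; i++) {
        unsigned char uc = (unsigned char)(tt[i] & 0xff);
        tt[cftab[uc]] |= (uint32_t)i << 8;
        cftab[uc]++;
    }

    /* Follow the links; each step yields one byte of the original. */
    t_pos = tt[orig_ptr] >> 8;
    for (i = 0; i < n; i++) {
        t_pos = tt[t_pos];
        out[i] = (unsigned char)(t_pos & 0xff);
        t_pos >>= 8;
    }

    free(tt);
    return 0;
}
```

As a small check of the sketch: for the block-sorted bytes "nnbaaa" with orig_ptr 3, inverse_bwt() writes "banana" into out.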
228
lib/libbz2/huffman.c
Normal file
@ -0,0 +1,228 @@

/*-------------------------------------------------------------*/
/*--- Huffman coding low-level stuff ---*/
/*--- huffman.c ---*/
/*-------------------------------------------------------------*/

/*--
  This file is a part of bzip2 and/or libbzip2, a program and
  library for lossless, block-sorting data compression.

  Copyright (C) 1996-1998 Julian R Seward.  All rights reserved.

  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions
  are met:

  1. Redistributions of source code must retain the above copyright
     notice, this list of conditions and the following disclaimer.

  2. The origin of this software must not be misrepresented; you must
     not claim that you wrote the original software.  If you use this
     software in a product, an acknowledgment in the product
     documentation would be appreciated but is not required.

  3. Altered source versions must be plainly marked as such, and must
     not be misrepresented as being the original software.

  4. The name of the author may not be used to endorse or promote
     products derived from this software without specific prior written
     permission.

  THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
  OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
  WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
  DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
  GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
  INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
  WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
  NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
  SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

  Julian Seward, Guildford, Surrey, UK.
  jseward@acm.org
  bzip2/libbzip2 version 0.9.0 of 28 June 1998

  This program is based on (at least) the work of:
     Mike Burrows
     David Wheeler
     Peter Fenwick
     Alistair Moffat
     Radford Neal
     Ian H. Witten
     Robert Sedgewick
     Jon L. Bentley

  For more information on these sources, see the manual.
--*/


#include "bzlib_private.h"

/*---------------------------------------------------*/
#define WEIGHTOF(zz0)  ((zz0) & 0xffffff00)
#define DEPTHOF(zz1)   ((zz1) & 0x000000ff)
#define MYMAX(zz2,zz3) ((zz2) > (zz3) ? (zz2) : (zz3))

#define ADDWEIGHTS(zw1,zw2)                           \
   (WEIGHTOF(zw1)+WEIGHTOF(zw2)) |                    \
   (1 + MYMAX(DEPTHOF(zw1),DEPTHOF(zw2)))

#define UPHEAP(z)                                     \
{                                                     \
   Int32 zz, tmp;                                     \
   zz = z; tmp = heap[zz];                            \
   while (weight[tmp] < weight[heap[zz >> 1]]) {      \
      heap[zz] = heap[zz >> 1];                       \
      zz >>= 1;                                       \
   }                                                  \
   heap[zz] = tmp;                                    \
}

#define DOWNHEAP(z)                                   \
{                                                     \
   Int32 zz, yy, tmp;                                 \
   zz = z; tmp = heap[zz];                            \
   while (True) {                                     \
      yy = zz << 1;                                   \
      if (yy > nHeap) break;                          \
      if (yy < nHeap &&                               \
          weight[heap[yy+1]] < weight[heap[yy]])      \
         yy++;                                        \
      if (weight[tmp] < weight[heap[yy]]) break;      \
      heap[zz] = heap[yy];                            \
      zz = yy;                                        \
   }                                                  \
   heap[zz] = tmp;                                    \
}


/*---------------------------------------------------*/
void hbMakeCodeLengths ( UChar *len,
                         Int32 *freq,
                         Int32 alphaSize,
                         Int32 maxLen )
{
   /*--
      Nodes and heap entries run from 1.  Entry 0
      for both the heap and nodes is a sentinel.
   --*/
   Int32 nNodes, nHeap, n1, n2, i, j, k;
   Bool  tooLong;

   Int32 heap   [ BZ_MAX_ALPHA_SIZE + 2 ];
   Int32 weight [ BZ_MAX_ALPHA_SIZE * 2 ];
   Int32 parent [ BZ_MAX_ALPHA_SIZE * 2 ];

   for (i = 0; i < alphaSize; i++)
      weight[i+1] = (freq[i] == 0 ? 1 : freq[i]) << 8;

   while (True) {

      nNodes = alphaSize;
      nHeap = 0;

      heap[0] = 0;
      weight[0] = 0;
      parent[0] = -2;

      for (i = 1; i <= alphaSize; i++) {
         parent[i] = -1;
         nHeap++;
         heap[nHeap] = i;
         UPHEAP(nHeap);
      }

      AssertH( nHeap < (BZ_MAX_ALPHA_SIZE+2), 2001 );

      while (nHeap > 1) {
         n1 = heap[1]; heap[1] = heap[nHeap]; nHeap--; DOWNHEAP(1);
         n2 = heap[1]; heap[1] = heap[nHeap]; nHeap--; DOWNHEAP(1);
         nNodes++;
         parent[n1] = parent[n2] = nNodes;
         weight[nNodes] = ADDWEIGHTS(weight[n1], weight[n2]);
         parent[nNodes] = -1;
         nHeap++;
         heap[nHeap] = nNodes;
         UPHEAP(nHeap);
      }

      AssertH( nNodes < (BZ_MAX_ALPHA_SIZE * 2), 2002 );

      tooLong = False;
      for (i = 1; i <= alphaSize; i++) {
         j = 0;
         k = i;
         while (parent[k] >= 0) { k = parent[k]; j++; }
         len[i-1] = j;
         if (j > maxLen) tooLong = True;
      }

      if (! tooLong) break;

      /* Leaf weights occupy weight[1..alphaSize], so the bound must be
         <= alphaSize; with "i < alphaSize" the last symbol's weight is
         never rescaled. */
      for (i = 1; i <= alphaSize; i++) {
         j = weight[i] >> 8;
         j = 1 + (j / 2);
         weight[i] = j << 8;
      }
   }
}


/*---------------------------------------------------*/
void hbAssignCodes ( Int32 *code,
                     UChar *length,
                     Int32 minLen,
                     Int32 maxLen,
                     Int32 alphaSize )
{
   Int32 n, vec, i;

   vec = 0;
   for (n = minLen; n <= maxLen; n++) {
      for (i = 0; i < alphaSize; i++)
         if (length[i] == n) { code[i] = vec; vec++; };
      vec <<= 1;
   }
}


/*---------------------------------------------------*/
void hbCreateDecodeTables ( Int32 *limit,
                            Int32 *base,
                            Int32 *perm,
                            UChar *length,
                            Int32 minLen,
                            Int32 maxLen,
                            Int32 alphaSize )
{
   Int32 pp, i, j, vec;

   pp = 0;
   for (i = minLen; i <= maxLen; i++)
      for (j = 0; j < alphaSize; j++)
         if (length[j] == i) { perm[pp] = j; pp++; };

   for (i = 0; i < BZ_MAX_CODE_LEN; i++) base[i] = 0;
   for (i = 0; i < alphaSize; i++) base[length[i]+1]++;

   for (i = 1; i < BZ_MAX_CODE_LEN; i++) base[i] += base[i-1];

   for (i = 0; i < BZ_MAX_CODE_LEN; i++) limit[i] = 0;
   vec = 0;

   for (i = minLen; i <= maxLen; i++) {
      vec += (base[i+1] - base[i]);
      limit[i] = vec-1;
      vec <<= 1;
   }
   for (i = minLen + 1; i <= maxLen; i++)
      base[i] = ((limit[i-1] + 1) << 1) - base[i];
}


/*-------------------------------------------------------------*/
/*--- end huffman.c ---*/
/*-------------------------------------------------------------*/
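Editorial illustration, not part of the imported sources: hbAssignCodes() in huffman.c hands out canonical Huffman codes. It walks the code lengths from shortest to longest, gives consecutive values to symbols of the same length, and shifts the running value left by one bit each time the length grows. The sketch below restates that loop with plain C types so it can be tried in isolation; assign_codes and its parameter names are hypothetical stand-ins, not library API.

```c
#include <assert.h>

/* Hypothetical stand-in for the loop in hbAssignCodes(): code[i]
   receives the canonical code of symbol i, whose length in bits is
   length[i].  Lengths are assumed to lie in [min_len, max_len]. */
static void assign_codes(int *code, const unsigned char *length,
                         int min_len, int max_len, int alpha_size)
{
    int n, i;
    int vec = 0;  /* next unused code value at the current length */

    assert(min_len >= 1 && min_len <= max_len);

    for (n = min_len; n <= max_len; n++) {
        for (i = 0; i < alpha_size; i++)
            if (length[i] == n) { code[i] = vec; vec++; }
        vec <<= 1;  /* moving to codes that are one bit longer */
    }
}
```

With lengths {2, 1, 3, 3} the sketch yields the codes 10, 0, 110 and 111 (values 2, 0, 6 and 7), which are prefix-free; the decoder can rebuild the same assignment from the lengths alone.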
25
lib/libbz2/libbz2.def
Normal file
@ -0,0 +1,25 @@
LIBRARY LIBBZ2
DESCRIPTION "libbzip2: library for data compression"
EXPORTS
    bzCompressInit
    bzCompress
    bzCompressEnd
    bzDecompressInit
    bzDecompress
    bzDecompressEnd
    bzReadOpen
    bzReadClose
    bzReadGetUnused
    bzRead
    bzWriteOpen
    bzWrite
    bzWriteClose
    bzBuffToBuffCompress
    bzBuffToBuffDecompress
    bzlibVersion
    bzopen
    bzdopen
    bzread
    bzwrite
    bzflush
    bzclose
6748
lib/libbz2/manual.ps
Normal file
File diff suppressed because it is too large
2099
lib/libbz2/manual.texi
Normal file
File diff suppressed because it is too large
122
lib/libbz2/manual_1.html
Normal file
@ -0,0 +1,122 @@
<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.51
     from manual.texi on 23 August 1998 -->

<TITLE>bzip2 and libbzip2 - Introduction</TITLE>
</HEAD>
<BODY>
Go to the first, previous, <A HREF="manual_2.html">next</A>, <A HREF="manual_4.html">last</A> section, <A HREF="manual_toc.html">table of contents</A>.
<P><HR><P>

<P>
@parindent 0mm
@parskip 2mm

</P>

<P>
This program, <CODE>bzip2</CODE>,
and associated library <CODE>libbzip2</CODE>, are
Copyright (C) 1996-1998 Julian R Seward.  All rights reserved.

</P>
<P>
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

<UL>
<LI>

Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
<LI>

The origin of this software must not be misrepresented; you must
not claim that you wrote the original software.  If you use this
software in a product, an acknowledgment in the product
documentation would be appreciated but is not required.
<LI>

Altered source versions must be plainly marked as such, and must
not be misrepresented as being the original software.
<LI>

The name of the author may not be used to endorse or promote
products derived from this software without specific prior written
permission.
</UL>

<P>
THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

</P>
<P>
Julian Seward, Guildford, Surrey, UK.

</P>
<P>
<CODE>jseward@acm.org</CODE>

</P>
<P>
<CODE>http://www.muraroa.demon.co.uk</CODE>

</P>
<P>
<CODE>bzip2</CODE>/<CODE>libbzip2</CODE> version 0.9.0 of 23 August 1998.

</P>
<P>
PATENTS: To the best of my knowledge, <CODE>bzip2</CODE> does not use any patented
algorithms.  However, I do not have the resources available to carry out
a full patent search.  Therefore I cannot give any guarantee of the
above statement.

</P>



<H1><A NAME="SEC1" HREF="manual_toc.html#TOC1">Introduction</A></H1>

<P>
<CODE>bzip2</CODE> compresses files using the Burrows-Wheeler
block-sorting text compression algorithm, and Huffman coding.
Compression is generally considerably better than that
achieved by more conventional LZ77/LZ78-based compressors,
and approaches the performance of the PPM family of statistical compressors.

</P>
<P>
<CODE>bzip2</CODE> is built on top of <CODE>libbzip2</CODE>, a flexible library
for handling compressed data in the <CODE>bzip2</CODE> format.  This manual
describes both how to use the program and
how to work with the library interface.  Most of the
manual is devoted to this library, not the program,
which is good news if your interest is only in the program.

</P>
<P>
Chapter 2 describes how to use <CODE>bzip2</CODE>; this is the only part
you need to read if you just want to know how to operate the program.
Chapter 3 describes the programming interfaces in detail, and
Chapter 4 records some miscellaneous notes which I thought
ought to be recorded somewhere.

</P>

<P><HR><P>
Go to the first, previous, <A HREF="manual_2.html">next</A>, <A HREF="manual_4.html">last</A> section, <A HREF="manual_toc.html">table of contents</A>.
</BODY>
</HTML>
367
lib/libbz2/manual_2.html
Normal file
@ -0,0 +1,367 @@
<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.51
     from manual.texi on 23 August 1998 -->

<TITLE>bzip2 and libbzip2 - How to use bzip2</TITLE>
</HEAD>
<BODY>
Go to the <A HREF="manual_1.html">first</A>, <A HREF="manual_1.html">previous</A>, <A HREF="manual_3.html">next</A>, <A HREF="manual_4.html">last</A> section, <A HREF="manual_toc.html">table of contents</A>.
<P><HR><P>


<H1><A NAME="SEC2" HREF="manual_toc.html#TOC2">How to use <CODE>bzip2</CODE></A></H1>

<P>
This chapter contains a copy of the <CODE>bzip2</CODE> man page,
and nothing else.

<PRE>
NAME
       bzip2, bunzip2 - a block-sorting file compressor, v0.9.0
       bzcat - decompresses files to stdout
       bzip2recover - recovers data from damaged bzip2 files

SYNOPSIS
       bzip2 [ -cdfkstvzVL123456789 ] [ filenames ... ]
       bunzip2 [ -fkvsVL ] [ filenames ... ]
       bzcat [ -s ] [ filenames ... ]
       bzip2recover filename

DESCRIPTION
       bzip2 compresses files using the Burrows-Wheeler
       block-sorting text compression algorithm, and Huffman coding.
       Compression is generally considerably better than that
       achieved by more conventional LZ77/LZ78-based compressors,
       and approaches the performance of the PPM family of
       statistical compressors.

       The command-line options are deliberately very similar to
       those of GNU Gzip, but they are not identical.

       bzip2 expects a list of file names to accompany the
       command-line flags.  Each file is replaced by a compressed
       version of itself, with the name "original_name.bz2".
       Each compressed file has the same modification date and
       permissions as the corresponding original, so that these
       properties can be correctly restored at decompression
       time.  File name handling is naive in the sense that there
       is no mechanism for preserving original file names,
       permissions and dates in filesystems which lack these
       concepts, or have serious file name length restrictions, such
       as MS-DOS.

       bzip2 and bunzip2 will by default not overwrite existing
       files; if you want this to happen, specify the -f flag.

       If no file names are specified, bzip2 compresses from
       standard input to standard output.  In this case, bzip2
       will decline to write compressed output to a terminal, as
       this would be entirely incomprehensible and therefore
       pointless.

       bunzip2 (or bzip2 -d) decompresses and restores all
       specified files whose names end in ".bz2".  Files without this
       suffix are ignored.  Again, supplying no filenames causes
       decompression from standard input to standard output.

       bunzip2 will correctly decompress a file which is the
       concatenation of two or more compressed files.  The result is
       the concatenation of the corresponding uncompressed files.
       Integrity testing (-t) of concatenated compressed files is
       also supported.

       You can also compress or decompress files to the standard
       output by giving the -c flag.  Multiple files may be
       compressed and decompressed like this.  The resulting outputs
       are fed sequentially to stdout.  Compression of multiple
       files in this manner generates a stream containing
       multiple compressed file representations.  Such a stream can be
       decompressed correctly only by bzip2 version 0.9.0 or
       later.  Earlier versions of bzip2 will stop after
       decompressing the first file in the stream.

       bzcat (or bzip2 -dc) decompresses all specified files to
       the standard output.

Compression is always performed, even if the compressed
|
||||
file is slightly larger than the original. Files of less
|
||||
than about one hundred bytes tend to get larger, since the
|
||||
compression mechanism has a constant overhead in the
|
||||
region of 50 bytes. Random data (including the output of
|
||||
most file compressors) is coded at about 8.05 bits per
|
||||
byte, giving an expansion of around 0.5%.
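
Both effects are easy to observe. The following sketch assumes Python's standard bz2 module (a wrapper around a current libbzip2, not bzip2 0.9.0), so the exact byte counts may differ from the figures quoted above; the helper name size_change is illustrative.

```python
import bz2
import os

def size_change(data: bytes):
    """Return (original size, compressed size) for one bzip2 stream."""
    return len(data), len(bz2.compress(data, compresslevel=9))

# A tiny input grows, because the fixed per-stream overhead dominates.
tiny_in, tiny_out = size_change(b"hello")

# Random bytes are incompressible, so the output is slightly larger.
rand_in, rand_out = size_change(os.urandom(200_000))
expansion_percent = 100.0 * (rand_out - rand_in) / rand_in

print(f"tiny:   {tiny_in} -> {tiny_out} bytes")
print(f"random: {rand_in} -> {rand_out} bytes ({expansion_percent:.2f}% larger)")
```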
|
||||
|
||||
As a self-check for your protection, bzip2 uses 32-bit
|
||||
CRCs to make sure that the decompressed version of a file
|
||||
is identical to the original. This guards against corrup-
|
||||
tion of the compressed data, and against undetected bugs
|
||||
in bzip2 (hopefully very unlikely). The chances of data
|
||||
corruption going undetected are microscopic, about one
|
||||
chance in four billion for each file processed. Be aware,
|
||||
though, that the check occurs upon decompression, so it
|
||||
can only tell you that something is wrong. It can't
|
||||
help you recover the original uncompressed data. You can
|
||||
use bzip2recover to try to recover data from damaged
|
||||
files.
|
||||
|
||||
Return values: 0 for a normal exit, 1 for environmental
|
||||
problems (file not found, invalid flags, I/O errors, &c),
|
||||
2 to indicate a corrupt compressed file, 3 for an internal
|
||||
consistency error (eg, bug) which caused bzip2 to panic.
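
A wrapper script might translate these return values into messages. The sketch below is only an illustration; the table and the function name describe_exit are hypothetical and simply restate the four values listed above.

```python
# Meanings of the bzip2 return values, as listed in the text above.
EXIT_MEANINGS = {
    0: "normal exit",
    1: "environmental problem (file not found, invalid flags, I/O error)",
    2: "corrupt compressed file",
    3: "internal consistency error (bzip2 panicked)",
}

def describe_exit(status: int) -> str:
    """Translate a bzip2 exit status into the meaning listed above."""
    return EXIT_MEANINGS.get(status, f"unexpected exit status {status}")

for code in (0, 1, 2, 3):
    print(code, describe_exit(code))
```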
|
||||
|
||||
MEMORY MANAGEMENT
|
||||
Bzip2 compresses large files in blocks. The block size
|
||||
affects both the compression ratio achieved, and the
|
||||
amount of memory needed both for compression and decom-
|
||||
pression. The flags -1 through -9 specify the block size
|
||||
to be 100,000 bytes through 900,000 bytes (the default)
|
||||
respectively. At decompression-time, the block size used
|
||||
for compression is read from the header of the compressed
|
||||
file, and bunzip2 then allocates itself just enough memory
|
||||
to decompress the file. Since block sizes are stored in
|
||||
compressed files, it follows that the flags -1 to -9 are
|
||||
irrelevant to and so ignored during decompression.
|
||||
|
||||
Compression and decompression requirements, in bytes, can
|
||||
be estimated as:
|
||||
|
||||
Compression: 400k + ( 7 x block size )
|
||||
|
||||
Decompression: 100k + ( 4 x block size ), or
|
||||
100k + ( 2.5 x block size )
|
||||
|
||||
Larger block sizes give rapidly diminishing marginal
|
||||
returns; most of the compression comes from the first two
|
||||
or three hundred k of block size, a fact worth bearing in
|
||||
mind when using bzip2 on small machines. It is also
|
||||
important to appreciate that the decompression memory
|
||||
requirement is set at compression-time by the choice of
|
||||
block size.
|
||||
|
||||
For files compressed with the default 900k block size,
|
||||
bunzip2 will require about 3700 kbytes to decompress. To
|
||||
support decompression of any file on a 4 megabyte machine,
|
||||
bunzip2 has an option to decompress using approximately
|
||||
half this amount of memory, about 2300 kbytes. Decompres-
|
||||
sion speed is also halved, so you should use this option
|
||||
only where necessary. The relevant flag is -s.
|
||||
|
||||
In general, try and use the largest block size memory con-
|
||||
straints allow, since that maximises the compression
|
||||
achieved. Compression and decompression speed are virtu-
|
||||
ally unaffected by block size.
|
||||
|
||||
Another significant point applies to files which fit in a
|
||||
single block -- that means most files you'd encounter
|
||||
using a large block size. The amount of real memory
|
||||
touched is proportional to the size of the file, since the
|
||||
file is smaller than a block. For example, compressing a
|
||||
file 20,000 bytes long with the flag -9 will cause the
|
||||
compressor to allocate around 6700k of memory, but only
|
||||
touch 400k + 20000 * 7 = 540 kbytes of it. Similarly, the
|
||||
decompressor will allocate 3700k but only touch 100k +
|
||||
20000 * 4 = 180 kbytes.
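
The arithmetic above can be reproduced with a few lines of code. This is a sketch of the stated estimates only, written in Python; the function names are illustrative and "k" is taken as 1000 bytes, as in the worked figures.

```python
BLOCK_UNIT = 100_000  # bytes per block-size step, flags -1 .. -9

def compress_memory(block_bytes: int) -> int:
    """Estimate from the text: 400k + 7 x block size."""
    return 400_000 + 7 * block_bytes

def decompress_memory(block_bytes: int, small: bool = False) -> int:
    """Estimate from the text: 100k + 4 x block size, or 2.5 x with -s."""
    # 4 and 2.5 are held as 8/2 and 5/2 so the sum stays in integers.
    factor_times_two = 5 if small else 8
    return 100_000 + (factor_times_two * block_bytes) // 2

full_block = 9 * BLOCK_UNIT   # flag -9
small_file = 20_000           # a file much smaller than one block

print("allocated at -9:", compress_memory(full_block), decompress_memory(full_block))
print("touched for 20,000 bytes:", compress_memory(small_file), decompress_memory(small_file))
print("decompress -s at -9:", decompress_memory(full_block, small=True))
```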
|
||||
|
||||
Here is a table which summarises the maximum memory usage
|
||||
for different block sizes. Also recorded is the total
|
||||
compressed size for 14 files of the Calgary Text Compres-
|
||||
sion Corpus totalling 3,141,622 bytes. This column gives
|
||||
some feel for how compression varies with block size.
|
||||
These figures tend to understate the advantage of larger
|
||||
block sizes for larger files, since the Corpus is domi-
|
||||
nated by smaller files.
|
||||
|
||||
           Compress   Decompress   Decompress   Corpus
|
||||
    Flag     usage      usage       -s usage     Size
|
||||
|
||||
     -1      1100k       500k         350k      914704
|
||||
     -2      1800k       900k         600k      877703
|
||||
     -3      2500k      1300k         850k      860338
|
||||
     -4      3200k      1700k        1100k      846899
|
||||
     -5      3900k      2100k        1350k      845160
|
||||
     -6      4600k      2500k        1600k      838626
|
||||
     -7      5400k      2900k        1850k      834096
|
||||
     -8      6000k      3300k        2100k      828642
|
||||
     -9      6700k      3700k        2350k      828642
|
||||
|
||||
OPTIONS
|
||||
-c --stdout
|
||||
Compress or decompress to standard output. -c will
|
||||
compress or decompress multiple files to stdout, one
|
||||
after another.
|
||||
|
||||
-d --decompress
|
||||
Force decompression. bzip2, bunzip2 and bzcat are
|
||||
really the same program, and the decision about
|
||||
what actions to take is done on the basis of which
|
||||
name is used. This flag overrides that mechanism,
|
||||
and forces bzip2 to decompress.
|
||||
|
||||
-z --compress
|
||||
The complement to -d: forces compression,
|
||||
regardless of the invocation name.
|
||||
|
||||
-t --test
|
||||
Check integrity of the specified file(s), but don't
|
||||
decompress them. This really performs a trial
|
||||
decompression and throws away the result.
|
||||
|
||||
-f --force
|
||||
Force overwrite of output files. Normally, bzip2
|
||||
will not overwrite existing output files.
|
||||
|
||||
-k --keep
|
||||
Keep (don't delete) input files during compression
|
||||
or decompression.
|
||||
|
||||
-s --small
|
||||
Reduce memory usage, for compression, decompression
|
||||
and testing. Files are decompressed and tested
|
||||
using a modified algorithm which only requires 2.5
|
||||
bytes per block byte. This means any file can be
|
||||
decompressed in 2300k of memory, albeit at about
|
||||
half the normal speed.
|
||||
|
||||
During compression, -s selects a block size of
|
||||
200k, which limits memory use to around the same
|
||||
figure, at the expense of your compression ratio.
|
||||
In short, if your machine is low on memory (8
|
||||
megabytes or less), use -s for everything. See
|
||||
MEMORY MANAGEMENT above.
|
||||
|
||||
-v --verbose
|
||||
Verbose mode -- show the compression ratio for each
|
||||
file processed. Further -v's increase the ver-
|
||||
bosity level, spewing out lots of information which
|
||||
is primarily of interest for diagnostic purposes.
|
||||
|
||||
-L --license -V --version
|
||||
Display the software version, license terms and
|
||||
conditions.
|
||||
|
||||
-1 to -9
|
||||
Set the block size to 100 k, 200 k .. 900 k when
|
||||
compressing. Has no effect when decompressing.
|
||||
See MEMORY MANAGEMENT above.
|
||||
|
||||
--repetitive-fast
|
||||
bzip2 injects some small pseudo-random variations
|
||||
into very repetitive blocks to limit worst-case
|
||||
performance during compression. If sorting runs
|
||||
into difficulties, the block is randomised, and
|
||||
sorting is restarted. Very roughly, bzip2 persists
|
||||
for three times as long as a well-behaved input
|
||||
would take before resorting to randomisation. This
|
||||
flag makes it give up much sooner.
|
||||
|
||||
--repetitive-best
|
||||
Opposite of --repetitive-fast; try a lot harder
|
||||
before resorting to randomisation.
|
||||
|
||||
RECOVERING DATA FROM DAMAGED FILES
|
||||
bzip2 compresses files in blocks, usually 900kbytes long.
|
||||
Each block is handled independently. If a media or trans-
|
||||
mission error causes a multi-block .bz2 file to become
|
||||
damaged, it may be possible to recover data from the
|
||||
undamaged blocks in the file.
|
||||
|
||||
The compressed representation of each block is delimited
|
||||
by a 48-bit pattern, which makes it possible to find the
|
||||
block boundaries with reasonable certainty. Each block
|
||||
also carries its own 32-bit CRC, so damaged blocks can be
|
||||
distinguished from undamaged ones.
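
To make the idea of searching for block boundaries concrete, here is a small sketch in Python. It is not the bzip2recover source. The 48-bit constant is not given in this manual; the value used is the block header pattern of the bzip2 file format as commonly documented, so treat it as an assumption. Blocks need not start on a byte boundary, which is why the scan moves one bit at a time.

```python
import bz2

BLOCK_MAGIC = 0x314159265359  # assumed 48-bit block header pattern
MAGIC_BITS = 48

def find_block_starts(stream: bytes) -> list:
    """Return the bit offsets at which the 48-bit block pattern occurs."""
    total_bits = len(stream) * 8
    as_int = int.from_bytes(stream, "big")
    mask = (1 << MAGIC_BITS) - 1
    hits = []
    for offset in range(total_bits - MAGIC_BITS + 1):
        # Shift so the 48-bit window starting at `offset` sits in the low bits.
        window = (as_int >> (total_bits - MAGIC_BITS - offset)) & mask
        if window == BLOCK_MAGIC:
            hits.append(offset)
    return hits

sample = bz2.compress(b"some example text " * 50)
starts = find_block_starts(sample)
print(starts)
```

For a single-block stream the first hit is at bit 32, directly after the 4-byte stream header. This bit-by-bit loop is only practical for small inputs.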
|
||||
|
||||
bzip2recover is a simple program whose purpose is to
|
||||
search for blocks in .bz2 files, and write each block out
|
||||
into its own .bz2 file. You can then use bzip2 -t to test
|
||||
the integrity of the resulting files, and decompress those
|
||||
which are undamaged.
|
||||
|
||||
bzip2recover takes a single argument, the name of the dam-
|
||||
aged file, and writes a number of files "rec0001file.bz2",
|
||||
"rec0002file.bz2", etc, containing the extracted blocks.
|
||||
The output filenames are designed so that the use of
|
||||
wildcards in subsequent processing -- for example, "bzip2
|
||||
-dc rec*file.bz2 > recovered_data" -- lists the files in
|
||||
the "right" order.
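
The reason the names sort correctly is the zero-padded block number. A short sketch, with a hypothetical helper recovered_name that only mimics the naming pattern quoted above:

```python
def recovered_name(block_number: int, original: str) -> str:
    """Build a name of the form rec0001file.bz2 (number padded to 4 digits)."""
    return f"rec{block_number:04d}{original}"

# Blocks listed out of order on purpose.
names = [recovered_name(n, "file.bz2") for n in (10, 2, 1, 100)]

# Wildcard expansion yields names in sorted order; with zero padding,
# sorted order is the same as numeric block order.
print(sorted(names))
```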
|
||||
|
||||
bzip2recover should be of most use dealing with large .bz2
|
||||
files, as these will contain many blocks. It is clearly
|
||||
futile to use it on damaged single-block files, since a
|
||||
damaged block cannot be recovered. If you wish to min-
|
||||
imise any potential data loss through media or transmis-
|
||||
sion errors, you might consider compressing with a smaller
|
||||
block size.
|
||||
|
||||
PERFORMANCE NOTES
|
||||
The sorting phase of compression gathers together similar
|
||||
strings in the file. Because of this, files containing
|
||||
very long runs of repeated symbols, like "aabaabaabaab
|
||||
..." (repeated several hundred times) may compress
|
||||
extraordinarily slowly. You can use the -vvvvv option to
|
||||
monitor progress in great detail, if you want. Decompres-
|
||||
sion speed is unaffected.
|
||||
|
||||
Such pathological cases seem rare in practice, appearing
|
||||
mostly in artificially-constructed test files, and in low-
|
||||
level disk images. It may be inadvisable to use bzip2 to
|
||||
compress the latter. If you do get a file which causes
|
||||
severe slowness in compression, try making the block size
|
||||
as small as possible, with flag -1.
|
||||
|
||||
bzip2 usually allocates several megabytes of memory to
|
||||
operate in, and then charges all over it in a fairly ran-
|
||||
dom fashion. This means that performance, both for com-
|
||||
pressing and decompressing, is largely determined by the
|
||||
speed at which your machine can service cache misses.
|
||||
Because of this, small changes to the code to reduce the
|
||||
miss rate have been observed to give disproportionately
|
||||
large performance improvements. I imagine bzip2 will per-
|
||||
form best on machines with very large caches.
|
||||
|
||||
CAVEATS
|
||||
I/O error messages are not as helpful as they could be.
|
||||
Bzip2 tries hard to detect I/O errors and exit cleanly,
|
||||
but the details of what the problem is sometimes seem
|
||||
rather misleading.
|
||||
|
||||
This manual page pertains to version 0.9.0 of bzip2. Com-
|
||||
pressed data created by this version is entirely forwards
|
||||
and backwards compatible with the previous public release,
|
||||
version 0.1pl2, but with the following exception: 0.9.0
|
||||
can correctly decompress multiple concatenated compressed
|
||||
files. 0.1pl2 cannot do this; it will stop after decom-
|
||||
pressing just the first file in the stream.
|
||||
|
||||
Wildcard expansion for Windows 95 and NT is flaky.
|
||||
|
||||
bzip2recover uses 32-bit integers to represent bit posi-
|
||||
tions in compressed files, so it cannot handle compressed
|
||||
files more than 512 megabytes long. This could easily be
|
||||
fixed.
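
The 512 megabyte figure follows directly from the width of the counter. A minimal check, assuming an unsigned 32-bit bit position and megabytes of 1024 x 1024 bytes:

```python
BITS_IN_COUNTER = 32
BITS_PER_BYTE = 8

# Number of distinct bit positions an unsigned 32-bit integer can index.
addressable_bits = 2 ** BITS_IN_COUNTER
addressable_bytes = addressable_bits // BITS_PER_BYTE
addressable_megabytes = addressable_bytes // (1024 * 1024)

print(addressable_bytes, addressable_megabytes)
```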
|
||||
|
||||
AUTHOR
|
||||
Julian Seward, jseward@acm.org.
|
||||
|
||||
The ideas embodied in bzip2 are due to (at least) the fol-
|
||||
lowing people: Michael Burrows and David Wheeler (for the
|
||||
block sorting transformation), David Wheeler (again, for
|
||||
the Huffman coder), Peter Fenwick (for the structured cod-
|
||||
ing model in the original bzip, and many refinements), and
|
||||
Alistair Moffat, Radford Neal and Ian Witten (for the
|
||||
arithmetic coder in the original bzip). I am much
|
||||
indebted for their help, support and advice. See the man-
|
||||
ual in the source distribution for pointers to sources of
|
||||
documentation. Christian von Roques encouraged me to look
|
||||
for faster sorting algorithms, so as to speed up compres-
|
||||
sion. Bela Lubkin encouraged me to improve the worst-case
|
||||
compression performance. Many people sent patches, helped
|
||||
with portability problems, lent machines, gave advice and
|
||||
were generally helpful.
|
||||
</PRE>
|
||||
|
||||
<P><HR><P>
|
||||
Go to the <A HREF="manual_1.html">first</A>, <A HREF="manual_1.html">previous</A>, <A HREF="manual_3.html">next</A>, <A HREF="manual_4.html">last</A> section, <A HREF="manual_toc.html">table of contents</A>.
|
||||
</BODY>
|
||||
</HTML>
|
1625
lib/libbz2/manual_3.html
Normal file
File diff suppressed because it is too large
500
lib/libbz2/manual_4.html
Normal file
@ -0,0 +1,500 @@
|
||||
<HTML>
|
||||
<HEAD>
|
||||
<!-- This HTML file has been created by texi2html 1.51
|
||||
from manual.texi on 23 August 1998 -->
|
||||
|
||||
<TITLE>bzip2 and libbzip2 - Miscellanea</TITLE>
|
||||
</HEAD>
|
||||
<BODY>
|
||||
Go to the <A HREF="manual_1.html">first</A>, <A HREF="manual_3.html">previous</A>, next, last section, <A HREF="manual_toc.html">table of contents</A>.
|
||||
<P><HR><P>
|
||||
|
||||
|
||||
<H1><A NAME="SEC33" HREF="manual_toc.html#TOC33">Miscellanea</A></H1>
|
||||
|
||||
<P>
|
||||
These are just some random thoughts of mine. Your mileage may
|
||||
vary.
|
||||
|
||||
</P>
|
||||
|
||||
|
||||
<H2><A NAME="SEC34" HREF="manual_toc.html#TOC34">Limitations of the compressed file format</A></H2>
|
||||
<P>
|
||||
<CODE>bzip2-0.9.0</CODE> uses exactly the same file format as the previous
|
||||
version, <CODE>bzip2-0.1</CODE>. This decision was made in the interests of
|
||||
stability. Creating yet another incompatible compressed file format
|
||||
would create further confusion and disruption for users.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
Nevertheless, this is not a painless decision. Development
|
||||
work since the release of <CODE>bzip2-0.1</CODE> in August 1997
|
||||
has shown complexities in the file format which slow down
|
||||
decompression and, in retrospect, are unnecessary. These are:
|
||||
|
||||
<UL>
|
||||
<LI>The run-length encoder, which is the first of the
|
||||
|
||||
compression transformations, is entirely irrelevant.
|
||||
The original purpose was to protect the sorting algorithm
|
||||
from the very worst case input: a string of repeated
|
||||
symbols. But algorithm steps Q6a and Q6b in the original
|
||||
Burrows-Wheeler technical report (SRC-124) show how
|
||||
repeats can be handled without difficulty in block
|
||||
sorting.
|
||||
<LI>The randomisation mechanism doesn't really need to be
|
||||
|
||||
there. Udi Manber and Gene Myers published a suffix
|
||||
array construction algorithm a few years back, which
|
||||
can be employed to sort any block, no matter how
|
||||
repetitive, in O(N log N) time. Subsequent work by
|
||||
Kunihiko Sadakane has produced a derivative O(N (log N)^2)
|
||||
algorithm which usually outperforms the Manber-Myers
|
||||
algorithm.
|
||||
|
||||
I could have changed to Sadakane's algorithm, but I find
|
||||
it to be slower than <CODE>bzip2</CODE>'s existing algorithm for
|
||||
most inputs, and the randomisation mechanism protects
|
||||
adequately against bad cases. I didn't think it was
|
||||
a good tradeoff to make. Partly this is due to the fact
|
||||
that I was not flooded with email complaints about
|
||||
<CODE>bzip2-0.1</CODE>'s performance on repetitive data, so
|
||||
perhaps it isn't a problem for real inputs.
|
||||
|
||||
Probably the best long-term solution
|
||||
is to use the existing sorting
|
||||
algorithm initially, and fall back to a O(N (log N)^2)
|
||||
algorithm if the standard algorithm gets into difficulties.
|
||||
This can be done without much difficulty; I made
|
||||
a prototype implementation of it some months ago.
|
||||
<LI>The compressed file format was never designed to be
|
||||
|
||||
handled by a library, and I have had to jump through
|
||||
some hoops to produce an efficient implementation of
|
||||
decompression. It's a bit hairy. Try passing
|
||||
<CODE>decompress.c</CODE> through the C preprocessor
|
||||
and you'll see what I mean. Much of this complexity
|
||||
could have been avoided if the compressed size of
|
||||
each block of data was recorded in the data stream.
|
||||
<LI>An Adler-32 checksum, rather than a CRC32 checksum,
|
||||
|
||||
would be faster to compute.
|
||||
</UL>
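
For readers unfamiliar with the checksum named in the last item, the following Python sketch shows the Adler-32 computation (two running sums modulo 65521, as defined for zlib). It is included only to illustrate why the checksum is cheap: there is no table lookup per byte. It is not code from bzip2, which uses CRC32.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16

def adler32(data: bytes) -> int:
    """Plain Adler-32: two running sums, combined into one 32-bit value."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    return (b << 16) | a

sample = b"bzip2 and libbzip2"
print(hex(adler32(sample)), hex(zlib.crc32(sample)))
```

The result can be compared against zlib.adler32 from the Python standard library.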
|
||||
|
||||
<P>
|
||||
It would be fair to say that the <CODE>bzip2</CODE> format was frozen
|
||||
before I properly and fully understood the performance
|
||||
consequences of doing so.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
Improvements which I have been able to incorporate into
|
||||
0.9.0, despite using the same file format, are:
|
||||
|
||||
<UL>
|
||||
<LI>Single array implementation of the inverse BWT. This
|
||||
|
||||
significantly speeds up decompression, presumably
|
||||
because it reduces the number of cache misses.
|
||||
<LI>Faster inverse MTF transform for large MTF values. The
|
||||
|
||||
new implementation is based on the notion of sliding blocks
|
||||
of values.
|
||||
<LI><CODE>bzip2-0.9.0</CODE> now reads and writes files with <CODE>fread</CODE>
|
||||
|
||||
and <CODE>fwrite</CODE>; version 0.1 used <CODE>putc</CODE> and <CODE>getc</CODE>.
|
||||
Duh! I'm embarrassed at my own moronicness (moronicity?) on this
|
||||
one.
|
||||
|
||||
</UL>
|
||||
|
||||
<P>
|
||||
Further ahead, it would be nice
|
||||
to be able to do random access into files. This will
|
||||
require some careful design of compressed file formats.
|
||||
|
||||
</P>
|
||||
|
||||
|
||||
|
||||
<H2><A NAME="SEC35" HREF="manual_toc.html#TOC35">Portability issues</A></H2>
|
||||
<P>
|
||||
After some consideration, I have decided not to use
|
||||
GNU <CODE>autoconf</CODE> to configure 0.9.0.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
<CODE>autoconf</CODE>, admirable and wonderful though it is,
|
||||
mainly assists with portability problems between Unix-like
|
||||
platforms. But <CODE>bzip2</CODE> doesn't have much in the way
|
||||
of portability problems on Unix; most of the difficulties appear
|
||||
when porting to the Mac, or to Microsoft's operating systems.
|
||||
<CODE>autoconf</CODE> doesn't help in those cases, and brings in a
|
||||
whole load of new complexity.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
Most people should be able to compile the library and program
|
||||
under Unix straight out-of-the-box, so to speak, especially
|
||||
if you have a version of GNU C available.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
There are a couple of <CODE>__inline__</CODE> directives in the code. GNU C
|
||||
(<CODE>gcc</CODE>) should be able to handle them. If your compiler doesn't
|
||||
like them, just <CODE>#define</CODE> <CODE>__inline__</CODE> to be null. One
|
||||
easy way to do this is to compile with the flag <CODE>-D__inline__=</CODE>,
|
||||
which should be understood by most Unix compilers.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
If you still have difficulties, try compiling with the macro
|
||||
<CODE>BZ_STRICT_ANSI</CODE> defined. This should enable you to build the
|
||||
library in a strictly ANSI compliant environment. Building the program
|
||||
itself like this is dangerous and not supported, since you remove
|
||||
<CODE>bzip2</CODE>'s checks against compressing directories, symbolic links,
|
||||
devices, and other not-really-a-file entities. This could cause
|
||||
filesystem corruption!
|
||||
|
||||
</P>
|
||||
<P>
|
||||
One other thing: if you create a <CODE>bzip2</CODE> binary for public
|
||||
distribution, please try and link it statically (<CODE>gcc -static</CODE>). This
|
||||
avoids all sorts of library-version issues that others may encounter
|
||||
later on.
|
||||
|
||||
</P>
|
||||
|
||||
|
||||
|
||||
<H2><A NAME="SEC36" HREF="manual_toc.html#TOC36">Reporting bugs</A></H2>
|
||||
<P>
|
||||
I tried pretty hard to make sure <CODE>bzip2</CODE> is
|
||||
bug free, both by design and by testing. Hopefully
|
||||
you'll never need to read this section for real.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
Nevertheless, if <CODE>bzip2</CODE> dies with a segmentation
|
||||
fault, a bus error or an internal assertion failure, it
|
||||
will ask you to email me a bug report. Experience with
|
||||
version 0.1 shows that almost all these problems can
|
||||
be traced to either compiler bugs or hardware problems.
|
||||
|
||||
<UL>
|
||||
<LI>
|
||||
|
||||
Recompile the program with no optimisation, and see if it
|
||||
works. And/or try a different compiler.
|
||||
I heard all sorts of stories about various flavours
|
||||
of GNU C (and other compilers) generating bad code for
|
||||
<CODE>bzip2</CODE>, and I've run across two such examples myself.
|
||||
|
||||
2.7.X versions of GNU C are known to generate bad code from
|
||||
time to time, at high optimisation levels.
|
||||
If you get problems, try using the flags
|
||||
<CODE>-O2</CODE> <CODE>-fomit-frame-pointer</CODE> <CODE>-fno-strength-reduce</CODE>.
|
||||
You should specifically <EM>not</EM> use <CODE>-funroll-loops</CODE>.
|
||||
|
||||
You may notice that the Makefile runs four tests as part of
|
||||
the build process. If the program passes all of these, it's
|
||||
a pretty good (but not 100%) indication that the compiler has
|
||||
done its job correctly.
|
||||
<LI>
|
||||
|
||||
If <CODE>bzip2</CODE> crashes randomly, and the crashes are not
|
||||
repeatable, you may have a flaky memory subsystem. <CODE>bzip2</CODE>
|
||||
really hammers your memory hierarchy, and if it's a bit marginal,
|
||||
you may get these problems. Ditto if your disk or I/O subsystem
|
||||
is slowly failing. Yup, this really does happen.
|
||||
|
||||
Try using a different machine of the same type, and see if
|
||||
you can repeat the problem.
|
||||
<LI>This isn't really a bug, but ... If <CODE>bzip2</CODE> tells
|
||||
|
||||
you your file is corrupted on decompression, and you
|
||||
obtained the file via FTP, there is a possibility that you
|
||||
forgot to tell FTP to do a binary mode transfer. That absolutely
|
||||
will cause the file to be non-decompressible. You'll have to transfer
|
||||
it again.
|
||||
</UL>
|
||||
|
||||
<P>
|
||||
If you've incorporated <CODE>libbzip2</CODE> into your own program
|
||||
and are getting problems, please, please, please, check that the
|
||||
parameters you are passing in calls to the library are
|
||||
correct, and in accordance with what the documentation says
|
||||
is allowable. I have tried to make the library robust against
|
||||
such problems, but I'm sure I haven't succeeded.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
Finally, if the above comments don't help, you'll have to send
|
||||
me a bug report. Now, it's just amazing how many people will
|
||||
send me a bug report saying something like
|
||||
|
||||
<PRE>
|
||||
bzip2 crashed with segmentation fault on my machine
|
||||
</PRE>
|
||||
|
||||
<P>
|
||||
and absolutely nothing else. Needless to say, such a report
|
||||
is <EM>totally, utterly, completely and comprehensively 100% useless;
|
||||
a waste of your time, my time, and net bandwidth</EM>.
|
||||
With no details at all, there's no way I can possibly begin
|
||||
to figure out what the problem is.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
The rules of the game are: facts, facts, facts. Don't omit
|
||||
them because "oh, they won't be relevant". At the bare
|
||||
minimum:
|
||||
|
||||
<PRE>
|
||||
Machine type. Operating system version.
|
||||
Exact version of <CODE>bzip2</CODE> (do <CODE>bzip2 -V</CODE>).
|
||||
Exact version of the compiler used.
|
||||
Flags passed to the compiler.
|
||||
</PRE>
|
||||
|
||||
<P>
|
||||
However, the most important single thing that will help me is
|
||||
the file that you were trying to compress or decompress at the
|
||||
time the problem happened. Without that, my ability to do anything
|
||||
more than speculate about the cause is limited.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
Please remember that I connect to the Internet with a modem, so
|
||||
you should contact me before mailing me huge files.
|
||||
|
||||
</P>
|
||||
|
||||
|
||||
|
||||
<H2><A NAME="SEC37" HREF="manual_toc.html#TOC37">Did you get the right package?</A></H2>
|
||||
|
||||
<P>
|
||||
<CODE>bzip2</CODE> is a resource hog. It soaks up large amounts of CPU cycles
|
||||
and memory. Also, it gives very large latencies. In the worst case, you
|
||||
can feed many megabytes of uncompressed data into the library before
|
||||
getting any compressed output, so this probably rules out applications
|
||||
requiring interactive behaviour.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
These aren't faults of my implementation, I hope, but more
|
||||
an intrinsic property of the Burrows-Wheeler transform (unfortunately).
|
||||
Maybe this isn't what you want.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
If you want a compressor and/or library which is faster, uses less
|
||||
memory but gets pretty good compression, and has minimal latency,
|
||||
consider Jean-loup
|
||||
Gailly's and Mark Adler's work, <CODE>zlib-1.1.2</CODE> and
|
||||
<CODE>gzip-1.2.4</CODE>. Look for them at
|
||||
<CODE>http://www.cdrom.com/pub/infozip/zlib</CODE> and
|
||||
<CODE>http://www.gzip.org</CODE> respectively.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
For something faster and lighter still, you might try Markus F X J
|
||||
Oberhumer's <CODE>LZO</CODE> real-time compression/decompression library, at
|
||||
<BR> <CODE>http://wildsau.idv.uni-linz.ac.at/mfx/lzo.html</CODE>.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
If you want to use the <CODE>bzip2</CODE> algorithms to compress small blocks
|
||||
of data, 64k bytes or smaller, for example on an on-the-fly disk
|
||||
compressor, you'd be well advised not to use this library. Instead,
|
||||
I've made a special library tuned for that kind of use. It's part of
|
||||
<CODE>e2compr-0.40</CODE>, an on-the-fly disk compressor for the Linux
|
||||
<CODE>ext2</CODE> filesystem. Look at
|
||||
<CODE>http://www.netspace.net.au/~reiter/e2compr</CODE>.
|
||||
|
||||
</P>
|
||||
|
||||
|
||||
|
||||
<H2><A NAME="SEC38" HREF="manual_toc.html#TOC38">Testing</A></H2>
|
||||
|
||||
<P>
|
||||
A record of the tests I've done.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
First, some data sets:
|
||||
|
||||
<UL>
|
||||
<LI>B: a directory containing 6001 files, one for every length in the
|
||||
|
||||
range 0 to 6000 bytes. The files contain random lowercase
|
||||
letters. 18.7 megabytes.
|
||||
<LI>H: my home directory tree. Documents, source code, mail files,
|
||||
|
||||
compressed data. H contains B, and also a directory of
|
||||
files designed as boundary cases for the sorting; mostly very
|
||||
repetitive, nasty files. 445 megabytes.
|
||||
<LI>A: directory tree holding various applications built from source:
|
||||
|
||||
<CODE>egcs-1.0.2</CODE>, <CODE>gcc-2.8.1</CODE>, KDE Beta 4, GTK, Octave, etc.
|
||||
827 megabytes.
|
||||
<LI>P: directory tree holding large amounts of source code (<CODE>.tar</CODE>
|
||||
|
||||
files) of the entire GNU distribution, plus a couple of
|
||||
Linux distributions. 2400 megabytes.
|
||||
</UL>
|
||||
|
||||
<P>
|
||||
The tests conducted are as follows. Each test means compressing
|
||||
(a copy of) each file in the data set, decompressing it and
|
||||
comparing it against the original.
|
||||
|
||||
</P>
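
A scaled-down version of this compress, decompress and compare loop can be sketched as follows. It assumes Python's standard bz2 module rather than the bzip2 binary under test, works in memory instead of on files, and samples only some of the lengths used in data set B; the helper names are illustrative.

```python
import bz2
import random
import string

def random_lowercase(length: int, rng: random.Random) -> bytes:
    """Random lowercase letters, like the files in data set B."""
    letters = (rng.choice(string.ascii_lowercase) for _ in range(length))
    return "".join(letters).encode("ascii")

def round_trip_ok(data: bytes) -> bool:
    """Compress, decompress, and compare against the original."""
    return bz2.decompress(bz2.compress(data)) == data

rng = random.Random(0)          # fixed seed so a failure can be reproduced
lengths = range(0, 6001, 250)   # a sample of the 0..6000 byte lengths
failures = [n for n in lengths if not round_trip_ok(random_lowercase(n, rng))]
print("failures:", failures)
```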
|
||||
<P>
|
||||
First, a bunch of tests with block sizes, internal buffer
|
||||
sizes and randomisation lengths set very small,
|
||||
to detect any problems with the
|
||||
blocking, buffering and randomisation mechanisms.
|
||||
This required modifying the source code so as to try to
|
||||
break it.
|
||||
|
||||
<OL>
|
||||
<LI>Data set H, with
|
||||
|
||||
buffer size of 1 byte, and block size of 23 bytes.
|
||||
<LI>Data set B, buffer sizes 1 byte, block size 1 byte.
|
||||
|
||||
<LI>As (2) but small-mode decompression (first 1700 files).
|
||||
|
||||
<LI>As (2) with block size 2 bytes.
|
||||
|
||||
<LI>As (2) with block size 3 bytes.
|
||||
|
||||
<LI>As (2) with block size 4 bytes.
|
||||
|
||||
<LI>As (2) with block size 5 bytes.
|
||||
|
||||
<LI>As (2) with block size 6 bytes and small-mode decompression.
|
||||
|
||||
<LI>H with normal buffer sizes (5000 bytes), normal block
|
||||
|
||||
size (up to 900000 bytes), but with randomisation
|
||||
mechanism running intensely (randomising approximately every
|
||||
third byte).
|
||||
<LI>As (9) with small-mode decompression.
|
||||
|
||||
</OL>
|
||||
|
||||
<P>
|
||||
Then some tests with unmodified source code.
|
||||
|
||||
<OL>
|
||||
<LI>H, all settings normal.
|
||||
|
||||
<LI>As (1), with small-mode decompress.
|
||||
|
||||
<LI>H, compress with flag <CODE>-1</CODE>.
|
||||
|
||||
<LI>H, compress with flag <CODE>-s</CODE>, decompress with flag <CODE>-s</CODE>.
|
||||
|
||||
<LI>Forwards compatibility: H, <CODE>bzip2-0.1pl2</CODE> compressing,
|
||||
|
||||
<CODE>bzip2-0.9.0</CODE> decompressing, all settings normal.
|
||||
<LI>Backwards compatibility: H, <CODE>bzip2-0.9.0</CODE> compressing,
|
||||
|
||||
<CODE>bzip2-0.1pl2</CODE> decompressing, all settings normal.
|
||||
<LI>Bigger tests: A, all settings normal.
|
||||
|
||||
<LI>P, all settings normal.
|
||||
|
||||
<LI>Misc test: about 100 megabytes of <CODE>.tar</CODE> files with
|
||||
|
||||
<CODE>bzip2</CODE> compiled with Purify.
|
||||
<LI>Misc tests to make sure it builds and runs ok on non-Linux/x86
|
||||
|
||||
platforms.
|
||||
</OL>
|
||||
|
||||
<P>
|
||||
These tests were conducted on a 205 MHz Cyrix 6x86MX machine, running
|
||||
Linux 2.0.32. They represent nearly a week of continuous computation.
|
||||
All tests completed successfully.
|
||||
|
||||
</P>
|
||||
|
||||
|
||||
|
||||
<H2><A NAME="SEC39" HREF="manual_toc.html#TOC39">Further reading</A></H2>
|
||||
<P>
|
||||
<CODE>bzip2</CODE> is not research work, in the sense that it doesn't present
|
||||
any new ideas. Rather, it's an engineering exercise based on existing
|
||||
ideas.
|
||||
|
||||
</P>
|
||||
<P>
|
||||
Four documents describe essentially all the ideas behind <CODE>bzip2</CODE>:
|
||||
|
||||
<PRE>
|
||||
Michael Burrows and D. J. Wheeler:
|
||||
"A block-sorting lossless data compression algorithm"
|
||||
10th May 1994.
|
||||
Digital SRC Research Report 124.
|
||||
ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz
|
||||
If you have trouble finding it, try searching at the
|
||||
New Zealand Digital Library, http://www.nzdl.org.
|
||||
|
||||
Daniel S. Hirschberg and Debra A. Lelewer
|
||||
"Efficient Decoding of Prefix Codes"
|
||||
Communications of the ACM, April 1990, Vol 33, Number 4.
|
||||
You might be able to get an electronic copy of this
|
||||
from the ACM Digital Library.
|
||||
|
||||
David J. Wheeler
|
||||
Program bred3.c and accompanying document bred3.ps.
|
||||
This contains the idea behind the multi-table Huffman
|
||||
coding scheme.
|
||||
ftp://ftp.cl.cam.ac.uk/pub/user/djw3/
|
||||
|
||||
Jon L. Bentley and Robert Sedgewick
|
||||
"Fast Algorithms for Sorting and Searching Strings"
|
||||
Available from Sedgewick's web page,
|
||||
www.cs.princeton.edu/~rs
|
||||
</PRE>

<P>
The following paper gives valuable additional insights into the
algorithm, but is not immediately the basis of any code
used in bzip2.

<PRE>
Peter Fenwick:
Block Sorting Text Compression
Proceedings of the 19th Australasian Computer Science Conference,
Melbourne, Australia. Jan 31 - Feb 2, 1996.
ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps
</PRE>

<P>
Kunihiko Sadakane's sorting algorithm, mentioned above,
is available from:

<PRE>
http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz
</PRE>

<P>
The Manber-Myers suffix array construction
algorithm is described in a paper
available from:

<PRE>
http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps
</PRE>

<P><HR><P>
Go to the <A HREF="manual_1.html">first</A>, <A HREF="manual_3.html">previous</A>, next, last section, <A HREF="manual_toc.html">table of contents</A>.
</BODY>
</HTML>
76
lib/libbz2/manual_toc.html
Normal file
@ -0,0 +1,76 @@
<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.51
from manual.texi on 23 August 1998 -->

<TITLE>bzip2 and libbzip2 - Table of Contents</TITLE>
</HEAD>
<BODY>
<H1>bzip2 and libbzip2</H1>
<H2>a program and library for data compression</H2>
<H2>copyright (C) 1996-1998 Julian Seward</H2>
<H2>version 0.9.0 of 23 August 1998</H2>
<ADDRESS>Julian Seward</ADDRESS>
<P>
<P><HR><P>
<UL>
<LI><A NAME="TOC1" HREF="manual_1.html#SEC1">Introduction</A>
<LI><A NAME="TOC2" HREF="manual_2.html#SEC2">How to use <CODE>bzip2</CODE></A>
<LI><A NAME="TOC3" HREF="manual_3.html#SEC3">Programming with <CODE>libbzip2</CODE></A>
<UL>
<LI><A NAME="TOC4" HREF="manual_3.html#SEC4">Top-level structure</A>
<UL>
<LI><A NAME="TOC5" HREF="manual_3.html#SEC5">Low-level summary</A>
<LI><A NAME="TOC6" HREF="manual_3.html#SEC6">High-level summary</A>
<LI><A NAME="TOC7" HREF="manual_3.html#SEC7">Utility functions summary</A>
</UL>
<LI><A NAME="TOC8" HREF="manual_3.html#SEC8">Error handling</A>
<LI><A NAME="TOC9" HREF="manual_3.html#SEC9">Low-level interface</A>
<UL>
<LI><A NAME="TOC10" HREF="manual_3.html#SEC10"><CODE>bzCompressInit</CODE></A>
<LI><A NAME="TOC11" HREF="manual_3.html#SEC11"><CODE>bzCompress</CODE></A>
<LI><A NAME="TOC12" HREF="manual_3.html#SEC12"><CODE>bzCompressEnd</CODE></A>
<LI><A NAME="TOC13" HREF="manual_3.html#SEC13"><CODE>bzDecompressInit</CODE></A>
<LI><A NAME="TOC14" HREF="manual_3.html#SEC14"><CODE>bzDecompress</CODE></A>
<LI><A NAME="TOC15" HREF="manual_3.html#SEC15"><CODE>bzDecompressEnd</CODE></A>
</UL>
<LI><A NAME="TOC16" HREF="manual_3.html#SEC16">High-level interface</A>
<UL>
<LI><A NAME="TOC17" HREF="manual_3.html#SEC17"><CODE>bzReadOpen</CODE></A>
<LI><A NAME="TOC18" HREF="manual_3.html#SEC18"><CODE>bzRead</CODE></A>
<LI><A NAME="TOC19" HREF="manual_3.html#SEC19"><CODE>bzReadGetUnused</CODE></A>
<LI><A NAME="TOC20" HREF="manual_3.html#SEC20"><CODE>bzReadClose</CODE></A>
<LI><A NAME="TOC21" HREF="manual_3.html#SEC21"><CODE>bzWriteOpen</CODE></A>
<LI><A NAME="TOC22" HREF="manual_3.html#SEC22"><CODE>bzWrite</CODE></A>
<LI><A NAME="TOC23" HREF="manual_3.html#SEC23"><CODE>bzWriteClose</CODE></A>
<LI><A NAME="TOC24" HREF="manual_3.html#SEC24">Handling embedded compressed data streams</A>
<LI><A NAME="TOC25" HREF="manual_3.html#SEC25">Standard file-reading/writing code</A>
</UL>
<LI><A NAME="TOC26" HREF="manual_3.html#SEC26">Utility functions</A>
<UL>
<LI><A NAME="TOC27" HREF="manual_3.html#SEC27"><CODE>bzBuffToBuffCompress</CODE></A>
<LI><A NAME="TOC28" HREF="manual_3.html#SEC28"><CODE>bzBuffToBuffDecompress</CODE></A>
</UL>
<LI><A NAME="TOC29" HREF="manual_3.html#SEC29">Using the library in a <CODE>stdio</CODE>-free environment</A>
<UL>
<LI><A NAME="TOC30" HREF="manual_3.html#SEC30">Getting rid of <CODE>stdio</CODE></A>
<LI><A NAME="TOC31" HREF="manual_3.html#SEC31">Critical error handling</A>
</UL>
<LI><A NAME="TOC32" HREF="manual_3.html#SEC32">Making a Windows DLL</A>
</UL>
<LI><A NAME="TOC33" HREF="manual_4.html#SEC33">Miscellanea</A>
<UL>
<LI><A NAME="TOC34" HREF="manual_4.html#SEC34">Limitations of the compressed file format</A>
<LI><A NAME="TOC35" HREF="manual_4.html#SEC35">Portability issues</A>
<LI><A NAME="TOC36" HREF="manual_4.html#SEC36">Reporting bugs</A>
<LI><A NAME="TOC37" HREF="manual_4.html#SEC37">Did you get the right package?</A>
<LI><A NAME="TOC38" HREF="manual_4.html#SEC38">Testing</A>
<LI><A NAME="TOC39" HREF="manual_4.html#SEC39">Further reading</A>
</UL>
</UL>
<P><HR><P>
This document was generated on 23 August 1998 using the
<A HREF="http://wwwcn.cern.ch/dci/texi2html/">texi2html</A>
translator version 1.51.</P>
</BODY>
</HTML>
124
lib/libbz2/randtable.c
Normal file
@ -0,0 +1,124 @@

/*-------------------------------------------------------------*/
/*--- Table for randomising repetitive blocks ---*/
/*--- randtable.c ---*/
/*-------------------------------------------------------------*/

/*--
This file is a part of bzip2 and/or libbzip2, a program and
library for lossless, block-sorting data compression.

Copyright (C) 1996-1998 Julian R Seward. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product
documentation would be appreciated but is not required.

3. Altered source versions must be plainly marked as such, and must
not be misrepresented as being the original software.

4. The name of the author may not be used to endorse or promote
products derived from this software without specific prior written
permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Julian Seward, Guildford, Surrey, UK.
jseward@acm.org
bzip2/libbzip2 version 0.9.0 of 28 June 1998

This program is based on (at least) the work of:
Mike Burrows
David Wheeler
Peter Fenwick
Alistair Moffat
Radford Neal
Ian H. Witten
Robert Sedgewick
Jon L. Bentley

For more information on these sources, see the manual.
--*/


#include "bzlib_private.h"


/*---------------------------------------------*/
Int32 rNums[512] = {
   619, 720, 127, 481, 931, 816, 813, 233, 566, 247,
   985, 724, 205, 454, 863, 491, 741, 242, 949, 214,
   733, 859, 335, 708, 621, 574, 73, 654, 730, 472,
   419, 436, 278, 496, 867, 210, 399, 680, 480, 51,
   878, 465, 811, 169, 869, 675, 611, 697, 867, 561,
   862, 687, 507, 283, 482, 129, 807, 591, 733, 623,
   150, 238, 59, 379, 684, 877, 625, 169, 643, 105,
   170, 607, 520, 932, 727, 476, 693, 425, 174, 647,
   73, 122, 335, 530, 442, 853, 695, 249, 445, 515,
   909, 545, 703, 919, 874, 474, 882, 500, 594, 612,
   641, 801, 220, 162, 819, 984, 589, 513, 495, 799,
   161, 604, 958, 533, 221, 400, 386, 867, 600, 782,
   382, 596, 414, 171, 516, 375, 682, 485, 911, 276,
   98, 553, 163, 354, 666, 933, 424, 341, 533, 870,
   227, 730, 475, 186, 263, 647, 537, 686, 600, 224,
   469, 68, 770, 919, 190, 373, 294, 822, 808, 206,
   184, 943, 795, 384, 383, 461, 404, 758, 839, 887,
   715, 67, 618, 276, 204, 918, 873, 777, 604, 560,
   951, 160, 578, 722, 79, 804, 96, 409, 713, 940,
   652, 934, 970, 447, 318, 353, 859, 672, 112, 785,
   645, 863, 803, 350, 139, 93, 354, 99, 820, 908,
   609, 772, 154, 274, 580, 184, 79, 626, 630, 742,
   653, 282, 762, 623, 680, 81, 927, 626, 789, 125,
   411, 521, 938, 300, 821, 78, 343, 175, 128, 250,
   170, 774, 972, 275, 999, 639, 495, 78, 352, 126,
   857, 956, 358, 619, 580, 124, 737, 594, 701, 612,
   669, 112, 134, 694, 363, 992, 809, 743, 168, 974,
   944, 375, 748, 52, 600, 747, 642, 182, 862, 81,
   344, 805, 988, 739, 511, 655, 814, 334, 249, 515,
   897, 955, 664, 981, 649, 113, 974, 459, 893, 228,
   433, 837, 553, 268, 926, 240, 102, 654, 459, 51,
   686, 754, 806, 760, 493, 403, 415, 394, 687, 700,
   946, 670, 656, 610, 738, 392, 760, 799, 887, 653,
   978, 321, 576, 617, 626, 502, 894, 679, 243, 440,
   680, 879, 194, 572, 640, 724, 926, 56, 204, 700,
   707, 151, 457, 449, 797, 195, 791, 558, 945, 679,
   297, 59, 87, 824, 713, 663, 412, 693, 342, 606,
   134, 108, 571, 364, 631, 212, 174, 643, 304, 329,
   343, 97, 430, 751, 497, 314, 983, 374, 822, 928,
   140, 206, 73, 263, 980, 736, 876, 478, 430, 305,
   170, 514, 364, 692, 829, 82, 855, 953, 676, 246,
   369, 970, 294, 750, 807, 827, 150, 790, 288, 923,
   804, 378, 215, 828, 592, 281, 565, 555, 710, 82,
   896, 831, 547, 261, 524, 462, 293, 465, 502, 56,
   661, 821, 976, 991, 658, 869, 905, 758, 745, 193,
   768, 550, 608, 933, 378, 286, 215, 979, 792, 961,
   61, 688, 793, 644, 986, 403, 106, 366, 905, 644,
   372, 567, 466, 434, 645, 210, 389, 550, 919, 135,
   780, 773, 635, 389, 707, 100, 626, 958, 165, 504,
   920, 176, 193, 713, 857, 265, 203, 50, 668, 108,
   645, 990, 626, 197, 510, 357, 358, 850, 858, 364,
   936, 638
};


/*-------------------------------------------------------------*/
/*--- end randtable.c ---*/
/*-------------------------------------------------------------*/
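The banner of randtable.c describes `rNums` as a table for randomising repetitive blocks, but this file does not show how the table is consumed. The following is a minimal sketch of one plausible scheme, assuming a countdown-and-XOR approach: each table value is treated as a gap length, and one byte per gap has its low bit toggled. The names `demo_rnums`, `DEMO_TABLE_LEN` and `demo_randomise` are hypothetical and do not appear in this commit, and only the first ten table values are copied to keep the sketch short.

```c
#include <stddef.h>

/* Hypothetical short table: the first ten values of rNums[] from
   randtable.c. The real table has 512 entries. */
#define DEMO_TABLE_LEN 10
static const int demo_rnums[DEMO_TABLE_LEN] = {
   619, 720, 127, 481, 931, 816, 813, 233, 566, 247
};

/* Assumed scheme: count down through each table value and toggle
   the low bit of the byte reached when the counter hits 1. */
void demo_randomise(unsigned char *block, size_t n)
{
   int to_go = 0;   /* bytes remaining in the current gap */
   int pos   = 0;   /* index of the next table entry to load */
   size_t i;

   for (i = 0; i < n; i++) {
      if (to_go == 0) {
         to_go = demo_rnums[pos];
         pos = (pos + 1) % DEMO_TABLE_LEN;   /* wrap around the table */
      }
      to_go--;
      if (to_go == 1) block[i] ^= 1;
   }
}
```

Because the toggle is an XOR driven only by the table and the byte position, running `demo_randomise` a second time over the same buffer restores the original bytes, so a decompressor could undo the step with the same routine.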
BIN
lib/libbz2/sample1.bz2
Normal file
Binary file not shown.
BIN
lib/libbz2/sample1.ref
Normal file
Binary file not shown.
BIN
lib/libbz2/sample2.bz2
Normal file
Binary file not shown.
BIN
lib/libbz2/sample2.ref
Normal file
Binary file not shown.
4
lib/libbz2/words1
Normal file
@ -0,0 +1,4 @@

Doing 4 tests (2 compress, 2 uncompress) ...
If there's a problem, things might stop at this point.

5
lib/libbz2/words2
Normal file
@ -0,0 +1,5 @@

Checking test results. If any of the four "cmp"s which follow
report any differences, something is wrong. If you can't easily
figure out what, please let me know (jseward@acm.org).

20
lib/libbz2/words3
Normal file
@ -0,0 +1,20 @@

If you got this far and the "cmp"s didn't find anything amiss, looks
like you're in business. You should install bzip2, bunzip2 and bzcat:

Copy bzip2 and bzip2recover to a public place, maybe /usr/bin.
In that public place, make bunzip2 and bzcat be
symbolic links to the bzip2 you just copied there.
Put the manual page, bzip2.1, somewhere appropriate;
perhaps in /usr/man/man1.

Instructions for use are in the preformatted manual page, in the file
bzip2.txt. For more detailed documentation, read the full manual.
It is available in Postscript form (manual.ps) and HTML form
(manual_toc.html).

You can also do "bzip2 --help" to see some helpful information.
"bzip2 -L" displays the software license.

Happy compressing. -- JRS, 30 August 1998.
