A bug fix in zlib 1.2.12 resulted in a slight slowdown (1-2%) of
deflate. This commit provides the option to #define LIT_MEM, which
uses more memory to reverse most of that slowdown. The memory for
the pending buffer and symbol buffers is increased by 25%, which
increases the total memory usage with the default parameters by
about 6%.
slide_hash() knowingly reads potentially uninitialized memory, see
comment lower down about prev[n] potentially being garbage. In
this case, the result is never used.
C2X has removed K&R definitions from the C function syntax.
Though the standard has not yet been approved, some high-profile
compilers are now issuing warnings when such definitions are
encountered.
memLevel 9 would cause deflateBound() to assume the use of fixed
blocks, even if the compression level was 0, which forces stored
blocks. That could result in a bound less than the size of the
compressed data. Now level 0 always uses the stored blocks bound.
This bug was reported by Danilo Ramos of Eideticom, Inc. It has
lain in wait 13 years before being found! The bug was introduced
in zlib 1.2.2.2, with the addition of the Z_FIXED option. That
option forces the use of fixed Huffman codes. For rare inputs with
a large number of distant matches, the pending buffer into which
the compressed data is written can overwrite the distance symbol
table which it overlays. That results in corrupted output due to
invalid distances, and can result in out-of-bound accesses,
crashing the application.
The fix here combines the distance buffer and literal/length
buffers into a single symbol buffer. Now three bytes of pending
buffer space are opened up for each literal or length/distance
pair consumed, instead of the previous two bytes. This assures
that the pending buffer cannot overwrite the symbol table, since
the maximum fixed code compressed length/distance is 31 bits, and
since there are four bytes of pending space for every three bytes
of symbol space.
This limits hash table inserts to the available data in the window
and to the sliding window size in deflate_stored(). The hash table
inserts are deferred until deflateParams() switches to a non-zero
compression level.
This commit allows a parameter change even if the input data has
not all been compressed and copied to the application output
buffer, so long as all of the input data has been compressed to
the internal pending output buffer. This also allows an immediate
deflateParams change so long as there have been no deflate calls
since initialization or reset.
This permits deflateParams to change the strategy and level right
after deflateInit, without having to wait until a header has been
written. The parameters can be changed immediately up until the
first deflate call that consumes any input data.
This avoids unnecessary filling of bytes in the sliding window
buffer when switching from level zero to a non-zero level. This
also provides a consistent indication of deflate having taken
input for a later commit ...
There have been many reports of bugs in the assembler codes
intended to speed up deflate and inflate. They are third-party
contributions in contrib, and so are not supported by the zlib
maintainers.
The previous code slid the window and the hash table and copied
every input byte three times in order to just write the data as
stored blocks with no compression. This commit minimizes sliding
and copying, especially for large input and output buffers.
Level 0 compression is now more than 20 times faster than before
the commit.
Most of the speedup is due to deferring hash table slides until
deflateParams() is called to change the compression level away
from 0. More speedup is due to copying directly from next_in to
next_out when the amounts of available input data and output space
permit it, avoiding the intermediate pending buffer. Additionally,
only the last 32K of the used input data is copied back to the
sliding window when large input buffers are provided.
This alters the specification in zlib.h, so that deflateParams()
will not change any parameters if there is not enough output space
in the event that a block is emitted in order to allow switching
the compression function.
Compression level 0 requests no compression, using only stored
blocks. When Z_HUFFMAN or Z_RLE was used with level 0 (granted,
an odd choice, but permitted), the resulting blocks were mostly
fixed or dynamic. The reason is that deflate_stored() was not
being called in that case. The compressed data was valid, but it
was not what the application requested. This commit assures that
only stored blocks are emitted for compression level 0, regardless
of the strategy selected.