The stbi__sbraw() macro in stb_image_write.h causes Clang to spew about 24
warnings complaining that "cast from 'unsigned char *' to 'int *' increases
required alignment from 1 to 4" when compiled with the -Wcast-align option.
In practice, this is spurious so long as STBIW_MALLOC() and STBIW_REALLOC()
follow the usual alignment semantics for malloc() and realloc() in that they
align sufficiently for any built-in type.
To quell the warning, we can cast through a void pointer as an intermediary.
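
A minimal sketch of the idea, assuming a stretchy-buffer layout in which two ints of bookkeeping sit directly before the data pointer. The names SB_RAW, sb_alloc, sb_capacity and sb_free are hypothetical stand-ins for illustration, not the library's own identifiers, and plain malloc()/free() stand in for STBIW_MALLOC() and its counterpart:

```c
#include <stdlib.h>

/* Hypothetical stand-in for the header-lookup macro: going through
   (void *) first means Clang no longer sees a direct
   'unsigned char *' -> 'int *' cast, so -Wcast-align stays quiet. */
#define SB_RAW(a) ((int *) (void *) (a) - 2)

/* Allocates n data bytes plus a two-int header and returns a pointer
   just past the header, or NULL on failure. The result is int-aligned
   because malloc() aligns for any built-in type. */
static unsigned char *sb_alloc(int n)
{
    int *raw = (int *) malloc(2 * sizeof(int) + (size_t) n);
    if (raw == NULL) return NULL;
    raw[0] = n;  /* capacity */
    raw[1] = 0;  /* used count */
    return (unsigned char *) (raw + 2);
}

/* Reads the capacity back out of the hidden header. */
static int sb_capacity(unsigned char *a) { return SB_RAW(a)[0]; }

/* Frees the original allocation, header included. */
static void sb_free(unsigned char *a) { free(SB_RAW(a)); }
```

The cast is only safe under the alignment assumption stated above; a custom allocator that returns byte-aligned memory would make the warning a real one.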
JPG always encodes 8x8 pixel blocks. If the input image's width or height
is not a multiple of 8, the last column or row is simply repeated to fill
the remaining pixels of the block.
The original code first calculated p (the index into the pixel data)
with the "imaginary" row/column (which might be up to 7 pixels too far
in each direction) and then subtracted the necessary number of bytes
from it if row >= height or col >= width.
That was a bit cryptic (IMHO), and didn't get more readable/obvious when
vertical flipping was added - which introduced a bug, by not taking
stbi__flip_vertically_on_write into account when adjusting p for
row >= height...
The code should be more obvious (and less buggy) now.
This fixes bug #592
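
As a rough illustration of the clamp-first approach, the index computation could look like the sketch below. The helper name clamped_pixel_offset and its parameter list are assumptions made for this example (the flip flag is passed in rather than read from stbi__flip_vertically_on_write), and rows are assumed to be tightly packed:

```c
/* Hypothetical helper: byte offset of the pixel that supplies sample
   (row, col) of an 8x8 block. row and col may run up to 7 past the
   image edge; they are clamped first, and only then is the vertical
   flip applied. */
static int clamped_pixel_offset(int row, int col, int width, int height,
                                int comp, int flip_vertically)
{
    /* Reuse the last row/column for positions beyond the image. */
    int clamped_row = (row < height) ? row : height - 1;
    int clamped_col = (col < width) ? col : width - 1;

    /* Map the output row to the row it comes from in the input. */
    int src_row = flip_vertically ? (height - 1 - clamped_row) : clamped_row;

    return (src_row * width + clamped_col) * comp;
}
```

Clamping before flipping is what keeps the two concerns independent: the out-of-range case never has to know which direction the image is being read in.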
The PNG filter for output row N is computed from row N-1 of the final image. If the image is flipped vertically when saving, that row corresponds to row N+1 of the input image.
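
A small sketch of that relationship, using the "up" filter (PNG type 2) as the example. The function names and signatures here are hypothetical, the flip flag is passed as a parameter for the sake of a self-contained example, and one row is assumed to be exactly stride_bytes long:

```c
#include <stddef.h>

/* Input-image row that ends up as output row y. */
static int source_row(int y, int height, int flip_vertically)
{
    return flip_vertically ? (height - 1 - y) : y;
}

/* Step, in input rows, from the current row to the row that precedes
   it in the output: -1 normally, +1 when flipped. */
static int prior_row_step(int flip_vertically)
{
    return flip_vertically ? 1 : -1;
}

/* Applies the "up" filter to output row y and writes stride_bytes
   filtered bytes to out. The first output row has no predecessor,
   so its prior bytes are treated as zero. */
static void filter_up(const unsigned char *pixels, int stride_bytes,
                      int height, int y, int flip_vertically,
                      unsigned char *out)
{
    const unsigned char *cur =
        pixels + (size_t) stride_bytes * (size_t) source_row(y, height, flip_vertically);
    int step = prior_row_step(flip_vertically) * stride_bytes;
    int i;
    for (i = 0; i < stride_bytes; ++i) {
        unsigned char prior = (y > 0) ? cur[i + step] : 0;
        out[i] = (unsigned char) (cur[i] - prior);
    }
}
```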
* `force_filter` being < 0 means the original behavior (i.e. figure out
the best-performing filter per scanline); any other values 0 <= x <= 4 correspond
to PNG filters (0 = none, 1 = sub, 2 = up, 3 = average, 4 = Paeth).
* `compression_level` being < 0 equals `compression_level` 8 (the previous value).
The higher this is, the better the compression should be (though it will use
more memory).
These new parameters are not (yet) exposed for the higher-level API functions.
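
The parameter conventions above can be summarized in a short sketch. The helper names are hypothetical and exist only to spell out the defaults described here; how values of `force_filter` above 4 are handled is not stated, so the sketch merely labels them as out of range:

```c
/* Negative values select the previous default of 8. */
static int effective_compression_level(int compression_level)
{
    return (compression_level < 0) ? 8 : compression_level;
}

/* Returns 1 when the best filter is chosen per scanline (the original
   behavior), 0 when a single filter is forced for every scanline. */
static int uses_adaptive_filter(int force_filter)
{
    return force_filter < 0;
}

/* Human-readable name for a force_filter value. */
static const char *filter_name(int force_filter)
{
    static const char *const names[] = {
        "none", "sub", "up", "average", "Paeth"
    };
    if (force_filter < 0) return "adaptive";
    if (force_filter > 4) return "out of range";
    return names[force_filter];
}
```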