From 9f8f71ebc7540c8cd9a097fa23918c2d491b14b3 Mon Sep 17 00:00:00 2001 From: Google Code Exporter Date: Fri, 13 Mar 2015 19:51:11 -0400 Subject: [PATCH] Migrating wiki contents from Google Code --- FeatureRequests.md | 56 +++++++++ ProjectHome.md | 104 ++++++++++++++++ RejectedRequestExample.md | 113 +++++++++++++++++ ReleaseNotes.md | 110 +++++++++++++++++ SpkFileFormat.md | 249 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 632 insertions(+) create mode 100644 FeatureRequests.md create mode 100644 ProjectHome.md create mode 100644 RejectedRequestExample.md create mode 100644 ReleaseNotes.md create mode 100644 SpkFileFormat.md diff --git a/FeatureRequests.md b/FeatureRequests.md new file mode 100644 index 0000000..c688037 --- /dev/null +++ b/FeatureRequests.md @@ -0,0 +1,56 @@ +After reading this page, you can go here to request a feature: [Issues](http://code.google.com/p/stb-imv/issues/list) + +Check the Open and Closed issues to make sure it hasn't already been +requested. If not, go ahead and create a new issue. Don't worry about +the fact that it will be called a 'defect' by default. + +Potential developers should also read this: RejectedRequestExample + +# Information for Users # + +**stb(imv)** is an image viewing program. Unless someone has an +_extremely_ compelling reason, we're not going to consider +feature requests that change that. + +So, for example, we will **not** be adding features like: + * Image filters + * Photoshop-style Levels or Curve operations + +On the other hand, the following features _are_ the sort of things +that might qualify as features appropriate for a viewing program. + + * image rotation (display only, no saving) + * gamma correction (display only, no saving) + * view multiple thumbnails in one window + +# Information for Developers # + +All the things above apply. 
But also: + + * Do not suggest features _because_ the implementation is easy (see notes below) + * Do not suggest features merely by describing the implementation; describe the feature in user terms (as suggested above) + +## On the Cost of Implementing Features ## + +Developers especially need to be cautious about saying that features are easy. There are four major costs in implementing a feature: + + 1. Development time to implement the feature + 1. Increase in the size of the executable + 1. Performance/memory overhead when not using the feature + 1. Maintenance overhead for continued development of the program + +Of these, #1 is most visible, and #4 is the most important (and perhaps least-considered). So let me address the other two first. + +2. Most small features will not increase the executable size much. As long as **imv(stb)** stays under 100KB, I'm not too worried, so there's plenty of room to add new features. But that doesn't mean a feature that uses 40KB is ok! A feature that requires 1KB is a no-brainer; we've got room for 50 of those. A feature that uses 10KB only leaves us room for 4 more, so it better be darn important. + +3. Performance overhead for users who don't use the feature must be minimized. In some cases this is trivial, in other cases this may significantly expand the basic development cost. + +That leads us to the first cost. Some possible features are non-starters due to #1; if no developer on the project is willing to devote the time necessary to implement it, it won't happen. Developer time-devotion-willingness tends to follow the developer's interests and tastes, but is influenced by user desires. Creating an issue, or starring an existing one, is a good way to influence the developers this way. Or, if you are a developer yourself, you might just devote the time yourself to implement it. However, if you implement something because you think it's easy, don't forget to account for the other three issues above. 
+ +As stated before, #4 is the doozy. With 2000 lines of code, a feature that can be added in 2-3 lines probably doesn't hurt much, but one which takes 100+ lines, or even just 5 lines in 5 different places, may have a non-linear effect on further development. For example, toggling the image border seems like it ought to be fairly easy, but its effect is fairly squirrely in the code base. Much of the code must reason about the image size and the window size. For simplicity, most of the code pretends the border is there while reasoning, and then at the last minute adjusts numbers as needed to compensate. But this logic doesn't apply everywhere; dragging the window just ignores the border entirely. To write any new code that pays attention to the window size, you now have to deal with the burden of worrying about this. So we have to be very careful about adding more features that introduce this kind of development overhead. + +So, rule of thumb: trivial, local changes are fine by #4. If there's one place in the code where you can put an if() and do something different some of the time (plus preference setting and such), that's probably going to be fine by #4. But don't expect patches that add 200 lines of code, or change multiple places in the existing codebase, to be approved. (Unless there's high demand! Ask a developer first!) + +But even a feature that is small in source and executable still has to pass the test of being useful to users as an image viewer behavior; otherwise we might get 200 features added all of which are random tangential things desired by individual developers. Even though each one by itself might be small, the effective increase in the codebase would be significant, which would hurt further development. Even if your feature is "optional", it still has significant costs. Does it require a keyboard toggle? 
There's only so much room to document UI on the F1 page, and even ignoring that, every added toggle _slows down users_ reading the documentation (if they don't want it). Each addition alone is trivial, but many additions add up; when there are 3x as many commands, it's a big deal. + +Similarly, getting/saving preferences to the registry is trivial, but there's a significant mental difference in looking at a codebase that saves 6 preferences rather than one that saves 100 preferences. Worse yet is the UI for setting that preference. \ No newline at end of file diff --git a/ProjectHome.md b/ProjectHome.md new file mode 100644 index 0000000..ab03c56 --- /dev/null +++ b/ProjectHome.md @@ -0,0 +1,104 @@ +Project status: 1.02 release + +# Information for Users # + +**imv_(stb)_** is an extremely lightweight and fast image viewer/browser for the Windows platform. The current executable is about 70KB (without executable compression). + +## Features ## + +Inspired by [vjpeg](http://www.stereopsis.com/vjpeg/), **imv_(stb)_** offers a very simple, no-frills image viewer with a minimal interface. Each opened image is easy to drag and resize. A minimal border around the image makes it easy to see the contents of the edge of the image, and (compared to vjpeg) prevents you from confusing a screenshot of a windows application with the actual application. + +Inspired by Windows Picture and Fax Viewer, **imv_(stb)_** lets you navigate through multiple images in the same directory in a single window. Unlike Windows Picture and Fax Viewer, you can have multiple **imv(stb)** browsers open simultaneously. + +Details of the latest changes can be found in the ReleaseNotes. + +## Limitations ## + +It is not possible to zoom in on part of an image; in other words, you can't zoom to the point where you'd need scrollbars. (Well, you can if you get sneaky, but you don't get scrollbars.) 
+ +## Issues, Bugs, Feature Requests ## + +To submit a bug report, go here: [Issues](http://code.google.com/p/stb-imv/adminIssues) + +To submit a suggestion or a feature request, go here: FeatureRequests + +# Information for Developers # + +## Image Loading ## + +**imv(stb)** is built around http://nothings.org/stb_image.c, a public domain, portable JPEG/PNG/BMP/TGA reader. stb\_image.c has some limits; it doesn't handle progressive jpegs or interlaced PNGs, or 1bpp BMPs. To improve this, we dynamically load gdiplus if available (it supports many interesting file formats, and is available on most win32 machines), and dynamically load FreeImage if it's available (it supports many, many more file formats and isn't on most people's machines at all). We use stb\_image first, although it's not necessarily the fastest, to provide more testing bandwidth for stb\_image.c, and because it's well-behaved and sometimes fast. + +It is also possible to improve stb\_image.c or add new loaders for other types. Because Google Code does not support projects with a public domain (PD) "license", there is no Google Code project for stb\_image.c. We do not generally want to fork stb\_image.c, though, so changes to stb\_image.c in this project should all be public domain; no changes that alter the copyright status of stb\_image.c will be accepted. If you want to write new loaders that are GPL'd rather than PD'd, simply place them in separate files, and either make imv.c call them if stbi\_load fails, or use stb\_image's 'register a new loader' feature. + +The current version of **imv(stb)** falls back to GDI+ and then to FreeImage if they can be dynamically loaded. + +## Portability ## + +**imv(stb)** itself is non-portable, offering a simple, flexible, highly-tuned user interface hard-coded to the windows platform. If you go back to its earliest state it might be a reasonable starting place for a port. 
+ +The project is currently available with VC6 and VC7 project files for maximum compatibility. The automatic upgrade to VS2005 should just work. + +## Performance ## + +**imv(stb)** makes heavy use of threads. If you use it to browse images in a directory, it will read ahead (in either direction) by one image. Disk I/O is performed in one thread, jpeg/png decoding in a second thread, and a single image-resize operation is performed in parallel on one or more other threads (as many threads as you have processors). The main thread accepts input as fast as it can and manages the other threads (maintaining single-threaded ownership of most data but lending it as needed). For example, when you drag a corner to resize, image resize operations are "queued", but the window itself is not resized until the image resize operation completes. + +The goals/ideals motivating the design are: + 1. Minimize the latency of the user experience + 1. Don't waste work + 1. Maximize the quality of the user experience + 1. 
If you've started doing something, finish it (corollary to 2) + +These principles lead to the following threading/performance design: + * Keep a cache of decoded images (1,2) + * If you've loaded the image data from disk, eventually it'll get decoded (2) + * The user can browse "ahead" of the system to a "goal image" (1) + * Always decode the "goal image" before anything else pending; generally, decode in most-recently-browsed order (1) + * At any given time, try disk-reading the goal image and the image before and after (1) + * At any given time, _only_ try disk-reading the goal image and the image before and after (2) + * If you've started decoding an image, go ahead and finish decoding it (4) + * If you've started resizing an image, finish it and show it (4) + * Resizer always resizes whichever decoded image is most-recently-browsed (1) + * Don't resize the window until the resizer is done (3) + * Draw the window without clearing it (3) + * Don't start doing file readaheads until the first time the user browses (i.e. the first time you step to a next/prev image) (2) + * Don't use threads (except for resizing) on opening the first image, to make a clear, optimized code path (threads couldn't reduce this latency anyway) (1,2,3) + * Although decoded images are cached, do not cache non-decoded data, let the OS do that (2) + * Use multiple threads for a single resize operation (1) + +This leads to the following behaviors: + * When browsing images at moderate speeds, switching images only takes the time of a single resize operation + * When browsing images faster than they can be decoded, the user will see whichever images manage to be decoded flip by, in (sparse) browse order, but all are being decoded, so if you go back through the skipped images it's fast + * When browsing images faster than they can be read from disk, the user will see whichever images manage to be read and decoded flip by, in (sparse) browse order. 
Old load requests are dropped entirely, though, so if they go back through them they'll still need loading. + * In other words, we have to trade off between pre-decoding and wasting work. If we outrun file I/O, we choose to avoid wasting work and simply don't read and don't decode. If the file I/O keeps up, having loaded the data we go ahead and decode it, even though that's wasted work if we never go back. There isn't a _really_ good reason for handling the two cases differently, but an approximate one is that the viewer is a foreground task competing with background tasks, and it seems ok to claim lots of CPU but annoying to grind the disk (e.g. for users running peer-to-peer applications or large file copies in the background). + +The first of these, and a general 'see some full-sized, fully-decoded images if you scan too fast' were part of the initial design requirements that motivated the specifics of the implementation. + +## Todo ## + * additional optimizations + * refactorings to reduce size + +## Possible Todo ## + * refresh current image, refresh directory list + * ReadDirectoryChangesW + +If you want any of the features above, feel free to create an issue for it so we know there's interest. + +## Done ## + * build under VS2003 + * UNICODE (currently we support non-8bit filenames that you iterate through in a directory, but only a very few open from the commandline, and none(?) from the open file dialog) + * Iterate through images recursively in subdirectories? 
(stb.h already includes a recursive readdir, so it would be trivial) + * Slideshow mode (WM\_TIMER triggers advance(1)) + * label settings in preferences + * comment source code (partway done) + * better error messages in stb\_image.c + * Preferences dialog + * Get config info from registry (cache size, image frame preference) + * The current bilinear resampling is a placeholder; the plan is to upsample with bicubic, and downsample with the nearest integral-sized box-filter followed by a bilinear filter + * Display filename as label at bottom + * Allow toggling the grey highlight in the image frame, and toggling the frame entirely + * Display error messages for files that fail to load + * when switching windows, leave window in current position if possible + * mousewheel zoom + * integrated help on F1 + * File Open dialog + * Need an image-switching-mode 'use actual/best size' vs. 'use current size' (currently does the former). Double-click/alt-enter to switch between actual/best size and "fullscreen" on primary, and back. Whenever you resize the window or switch to fullscreen, switch the image-switching-mode to 'use current size'; when you double-click back to best size, set the image-switching-mode to that. \ No newline at end of file diff --git a/RejectedRequestExample.md b/RejectedRequestExample.md new file mode 100644 index 0000000..20cbdb2 --- /dev/null +++ b/RejectedRequestExample.md @@ -0,0 +1,113 @@ +This page will be of interest to developers or potential developers. + +# Introduction # + +A programmer suggested using **UpdateLayeredWindow()** to support +(optionally) drawing images with alpha in them transparently on the desktop. +(I will call this behavior "desktop alpha".) 
+ +In the interests of helping developers understand why candidate features they +are considering might not be accepted to the code base, this page +describes why I rejected the above suggestion (in +the sense both of choosing not to implement it, but also making +clear that I was unlikely to accept a patch), and why it would have +been much harder to implement properly than the programmer probably imagined. + +# Details # + +As discussed in FeatureRequests, there are four major costs to +implementing a feature: + 1. Development time to implement the feature + 1. Increase in the size of the executable + 1. Performance/memory overhead when not using the feature + 1. Maintenance overhead for continued development of the program + +A central premise to **imv(stb)** is to make significant trade-offs +for _performance_; that is, the fundamental threaded design has _huge_ +costs of type #1 and #4. This was, in some sense, the premise of **imv(stb)**; +to accept that trade-off. + +But because **imv(stb)** is so performance-oriented, I am strongly +opposed to any significant costs of type #3! That means additional +features must pay more #1 and #4 if needed to avoid unnecessary +overhead. + +But I'm also opposed to significant costs in #4, unless the gain +to users is really significant. That can make it hard to implement +some features in any way that I find acceptable, as is the case +for this feature. + +Now let's look at the actual issues of implementing this feature. + +## Overview of problems ## + +Just so they don't come totally out of left-field later, here are +the central issues in the existing code base and design that made +the proposed feature problematic at the time it was proposed: + * compile-time switch to only store 3 bytes per pixel + * image cache stores image already composited against background + * no alpha resampling + +## Implementation #1: proof-of-concept ## + +Naively, the feature sounds pretty straightforward. 
There's an image +with alpha, **imv(stb)** currently loads the image and alpha blends +the image against a static background, and then it gets rendered. +So the things we need are (1) an image with alpha, (2) the ability +to draw it, and (3) some code to change the rendering path / window +type as appropriate. + +How hard is a first cut, proof-of-concept implementation? + + * Disable the code that alpha-blends over a static background + * Change the window type + * Insert call(s) to UpdateLayeredWindow. There are also a number of calls to MoveWindow and SetWindowPos which may need changing, I don't know. + +**Issues with implementation #1** + +This just hard-codes the desktop alpha to be always on so we can see whether +it works or not. There are many other issues that won't work. For example, +you cannot resize an image (or use an image larger than the desktop), +because the resampler doesn't resample the alpha channel. This will +have to be addressed in a correct implementation. + +## Implementation #2: real, but unsatisfactory ## + +The following additional changes must be made: + + * Add a toggle for whether to make the background transparent or not + * UpdateLayeredWindow obeys the toggle + * If it's not transparent, you need to alpha-blend against the static background somewhere in the code (not in the current place, as will be discussed next section). + * Modify the resampler to resample the alpha channel as well + +**Issues with implementation #2** + +This implementation is unsatisfactory because of the many ways it +increases the third type of cost, performance overhead for users +who don't need it. + + * The code can currently be compiled to use either 3 or 4 bytes per pixel. At the moment it uses 4 because the 32-bit resampler is faster. However, the 24-bit case could be sped up. I've chosen not to pursue that yet until we get the bicubic sampler in, which will be the more common performance case (upsampling is the slowest case). 
It's quite possible that the 24-bit case will be faster due to reduced cache usage; it will also have the benefit of using less memory, or being able to cache more decoded images in the same memory. The proposed alpha solution will simply not work if the code is recompiled to use 3 bytes per pixel. + * Even if compiled for 4-bytes-per-pixel, resampling alpha correctly may incur performance overhead on the non-alpha-blended path. + * We need to detect if images are opaque and force UpdateLayeredWindow to run in opaque mode for those images, to avoid unnecessary alpha blending overhead + * Even so, UpdateLayeredWindow is probably slower (and uses more memory) for opaque images, so really we should just use the existing path for opaque images and when desktop alpha is toggled off + * The current implementation avoids the cost of alpha blending the image most of the time, because it's blended immediately after loading. The cached image is stored already blended. If we want to toggle desktop alpha on and off, we either need to always store the cached image non-blended--which costs an overhead when we're not doing desktop alpha (and note that the optimized bilinear resampler uses 5 integer multiplies per pixel; doing an alpha blend requires 2 integer multiplies per pixel, thus it's a potentially significant cost (40%))--or we need to cache the image maybe pre-blended and maybe not, and flush the cache when the user toggles it. Probably this should be lazy, so if a user browses through a bunch of images, then toggles one to transparent and then back, then browses back through them, they haven't been flushed. Or we need to keep two separate caches, one for each version (significantly increasing the storage overhead for images with alpha). 
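
To make the "2 integer multiplies per pixel" figure concrete, here is a rough sketch of one common way to blend a pixel against a static background with only two multiplies: red and blue are packed into one 32-bit multiply, green gets the other. This is an assumption about the kind of blend meant above, not code from imv.c; the name `blend_over_bg` and the 0xAARRGGBB pixel layout are made up for illustration. Against the resampler's 5 multiplies per pixel, those 2 extra multiplies are where the 40% (2/5) comes from.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from imv.c.
 * src is 0xAARRGGBB with straight (non-premultiplied) alpha;
 * bg is an opaque 0x00RRGGBB background color.
 * Each channel becomes bg + (src - bg) * a / 256, using one multiply
 * for the packed red/blue pair and one multiply for green. */
static uint32_t blend_over_bg(uint32_t src, uint32_t bg)
{
    uint32_t a    = src >> 24;                              /* alpha, 0..255 */
    uint32_t s_rb = src & 0x00FF00FFu, d_rb = bg & 0x00FF00FFu;
    uint32_t s_g  = src & 0x0000FF00u, d_g  = bg & 0x0000FF00u;

    /* Unsigned wraparound lets negative differences come out right
     * once the sum is masked back to its own channel bits. */
    uint32_t rb = (d_rb + (((s_rb - d_rb) * a) >> 8)) & 0x00FF00FFu;
    uint32_t g  = (d_g  + (((s_g  - d_g ) * a) >> 8)) & 0x0000FF00u;
    return rb | g;                                          /* 0x00RRGGBB */
}
```

Because this divides by 256 rather than 255, a fully opaque white pixel over black comes out as 0xFEFEFE, not 0xFFFFFF; a real implementation would probably special-case alpha 255 or use a slightly different scale.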
+ +## Implementation #3: satisfies most requirements, but bloated ## + + * Replicate the resampler, so there's an optimized version for images with alpha, and an optimized version without + * Switch the bytes-per-pixel from being a #define to being a per-image thing; then use 3 bytes for opaque images and 4 for ones with alpha. (Note that you can't get this easily out of stb\_image, because there's a separate issue of expanding greyscale to RGB. Either you let stb\_image return you whatever it's got, and replicate the grey->RGB conversion, or you modify stb\_image to allow querying the number of components WITHOUT decoding the image, and then use that result to decide whether to ask for 3 or 4.) + * Make a #define to decide whether opaque images use 3 or 4 bytes. + * Make the non-alpha resampler follow the #define. + * Cache flushing as described above; need to record with each decoded image whether it's been pre-blended, and if it's the wrong type, flush it and redecode. This will interact messily with prefetching! There's no real good answer here. Perhaps we should cache images with alpha unblended, then the first time we need to display them blended, we blend them and update the cache with the blended version. If we need the unblended version and have the blended version, we discard the cache and reload. That fixes half the prefetch cases, but not the other half. + +**Issues with implementation #3** + +Now we have: + +> A bunch of code squirreled everywhere through our app (all the MoveWindows, +> the support for multiple window types, switching window types on the fly, +> keeping track of the # of components in each image and whether it's been preblended, etc.) 
+ +> An inefficiency when you switch between desktop alpha and not (probably not the end of the world, since that toggling is likely rare, and you only pay the cost if you use it; but as described above, we have slightly slowed the prefetching path even when disabled, because now we don't pre-blend the alpha until display time, instead of immediately after decode). + +And what do we get in return? A feature that is probably not very useful to most users. If there was no user other than the developer proposing the feature who wanted it, I have no doubt I would reject the patch, unless an efficient implementation turned out to be far simpler than I'm imagining it. \ No newline at end of file diff --git a/ReleaseNotes.md b/ReleaseNotes.md new file mode 100644 index 0000000..5da4102 --- /dev/null +++ b/ReleaseNotes.md @@ -0,0 +1,110 @@ +# Release # + +> version 1.01 + +# Release Notes # + +> Version 1.01: Release 1.01 (2012-01-17) + * bugfix: fix crash when closing preferences + * feature: if image is entire alpha=0, assume opaque + * feature: full-size expands on whichever monitor image is current on + * change: don't stretch full-size image across multiple monitors (#define ALLOW\_MULTISCREEN for old behavior) + +> Version 1.0: Release 1 ( 2008-10-19 ) + * feature: open a directory + * feature: sharpen when upscaling + * secret feature: toggle use of full-size virtual desktop + +> Version 0.99: Beta 11 ( 2008-02-07 ) + * bugfix: further attempt to support BACK/FORWARD + +> Version 0.98: Beta 10 ( 2008-01-31 ) + * bugfix: attempt to support BACK/FORWARD mouse buttons + * feature: custom .spk image-delta file format + * secret feature: recursive slideshows with clumsy UI using 'ctrl-R' and '.' 
+ +> Version 0.97: Beta 9 ( 2007-11-27 ) + * bugfix: if starting path is unicode don't blow up trying to read the directory + +> Version 0.96: Beta 8 ( 2007-10-20 ) + * feature: VC7 project files + * feature: imv\_light.exe - uses GDI+ only, not stb\_image or FreeImage + * feature: press 's' for a primitive slideshow of current directory + * feature: HDR support in stb\_image + * feature: TGA support in stb\_image + * feature: save border choice to registry (expose in prefs) + * secret feature: use { } [ ] to rescale dark/light images + +> Version 0.95: Beta 7 ( 2007-08-15 ) + * bugfix: minor stb\_image.c fixes + * bugfix: don't have gdi+ use threads; correct lock/unlock of global memory + * bugfix: load non-7bit-filenames from commandline + * feature: ctrl-i launches new viewer instance on current image + * bugfix: fix cacheing code to allow refreshing current image after flip etc. + * bugfix: clean up repainting when dragging top or left to avoid dragging old data + * bugfix: fix out-of-control resize when border is off + +> Version 0.94: Beta 5 (2007-07-15) + * bugfix: changing image resize quality in preferences refreshes on OK + * feature: sort filenames to sort numbers in human-friendly order + * feature: show index within sorted list + * feature: ctrl-C puts current image filename (with full path) in the clipboard + * change: change the border control keys to 'b' instead of 'f' for next feature + * feature: use FreeImage.dll if it's available + * bugfix: fix bug in right or bottom cursor region due to internal cleanup + +> Version 0.93: Beta 4 (2007-07-10) + * bugfix: alter stb\_image to support jpegs with weird header blocks + * bugfix: exit after printing directory error message + * bugfix: change naming of frame/border variables + * bugfix: ESC when showing help clears help, rather than exiting + * internal: clean up registry code to halve registry ops + +> Version 0.92: Beta 3 (2007-07-03) + * internal: replace Sleep()-based thread-joining code with 
synchronization primitive + * internal: change work queue internals to use stb\_mutex + * internal: change stb\_mutex from using win32 semaphore to using CRITICAL\_SECTION + * internal: stbi\_load\_from\_memory() only; remove stdio from stb\_image (500 bytes) + +> Version 0.91: Beta 2 (2007-07-01) + * feature: allow changing the label font size, toggle label in preferences + * internal: various refactorings to clean up the code + * bugfix: finish commenting code (except resizer) + * bugfix: fix tiny leak closing the preferences dialog with the close button + +> Version 0.90: Beta 1 (2007-06-30) + * bugfix: user-friendlier error messages + * feature: save preferences to registry + * feature: preferences dialog + +> Version 0.57 (2007-06-29) + * feature: cubic image resampling + * bugfix: advancing to pre-loaded image then retreated to previous image + * bugfix: occasional error when advancing to image that was about to be decoded + * bugfix: commented about 75% of code + * bugfix: fix logic for fitting large images onscreen to not stop a few pixels short + +> Version 0.56 (2007-06-27) + * bugfix: stb\_image wouldn't load jpegtran output (which is invalid JFIF) + +> Version 0.55 (2007-06-27) + * feature: toggle filename label + * feature: toggle entire border + * feature: toggle white stripe in border + * bugfix: display error message for files that don't load + +> Version 0.54 (2007-06-26) + * bugfix: keep current window position while switching windows in actual-size mode + * feature: mousewheel to resize + * feature: checkerboard border behind alpha - but image sized, not zoom-independent + * feature: integrated help with F1/h/? + * bugfix: resizing with ctrl- and ctrl+ correctly sets the actual-size-mode + +> Version 0.53 (2007-06-25) + * feature: double-click, alt-enter to toggle actual-size vs. 
fullscreen + * feature: ctrl-O lets you open an arbitrary file + * feature: added Open File dialog if you run without a commandline + * bugfix: changing images doesn't change window size except in actual-size mode + +> Version 0.52 (2007-06-25) + * bugfix: hang when resizing first image \ No newline at end of file diff --git a/SpkFileFormat.md b/SpkFileFormat.md new file mode 100644 index 0000000..c7768e1 --- /dev/null +++ b/SpkFileFormat.md @@ -0,0 +1,249 @@ +# Introduction # + +This is the SPK file format document from daeyna.com, since +it's no longer accessible at that site and isn't on archive.org. + + +# Details # + +``` + +January 17, 2008 + +SPK is a simple image file format that encodes an image as a delta +from another image stored separately. decoding an SPK file requires +access to this other file, which requires access to a file system. +it is not appropriate for streaming or general purpose usage. it is +primarily designed for archival purposes to reduce storage. see the +separate rationale section for more information. + + +file format +=========== + +all integers are stored as 4-byte little-endian integers. + +header: + ++---------+------+---------------------+----------------------------------+ +| Offset | Size | Value | Purpose | ++---------|------|---------------------+----------------------------------+ +| 0 | 16 | "xPIC-delta-image" | signature | +| 16 | 1 | byte VERSION=0 | version number | +| 17 | 4 | integer FLEN | length of following filename | +| 21 | FLEN | utf8 string NAME | filename with terminating \0 | +| FLEN+21 | 4 | integer WIDTH | width of image in pixels | +| FLEN+25 | 4 | integer HEIGHT | height of image in pixels | +| FLEN+29 | 4 | integer CHN | number of 8-bit channels in image| ++---------|------|---------------------+----------------------------------+ + +note that the byte at FLEN+20 must be 0--it is the terminating nul for +the filename string, which simplifies decoding (such as if you memory map +the file). 
+ +after the header comes repeated "delta packets", until the end of the file +is reached. these are encoded as follows, assuming the start of the packet +is at location PCKT: + ++------------+------+---------------------+----------------------------------+ +| Offset | Size | Value | Purpose | ++------------|------|---------------------+----------------------------------+ +| PCKT + 0 | 4 | integer START | offset in image of delta pixels | +| PCKT + 4 | 4 | integer LEN | number of delta pixels in packet | +| PCKT + 8 | CHN | pixel | first replacement pixel | +| PCKT+CHN+8 | CHN | pixel | second replacement pixel | +| ... | ... | ... | ... | ++------------|------|---------------------+----------------------------------+ + + +interpretation +============== + +the version number VERSION must be 0. + +the NAME field contains the name of an image file stored in the same +directory as the SPK file. the name must not contain the character '/' +the character ':', or the character '\', since these are filename +separators. a decoder MUST reject SPK files using these characters, to +avoid security issues. + +the NAME is interpreted as a utf8 string that names a file in the same +directory. for version 0, that file MUST be a PNG file with 8-bit channels. +the specified name should be an exact match of the name of the file in +the filesystem, including case. the named file MUST NOT be an SPK file. + +after locating the file called NAME, a decoder should decode the PNG +file it contains and, if paletted, convert it to a non-palettized form +(e.g. expand the palette entries). the PNG should be decoded with an +alpha channel if transparency information is available in the file in +any form. + +if the width of the decoded PNG is not equal to WIDTH, the decoder MUST +reject the SPK file. if the height of the decoded PNG is not equal to +HEIGHT, the decoder MUST reject the SPK file. 
if the number of channels +in the decoded PNG is not equal to CHN, the decoder MUST reject the SPK +file. the number of channels in the PNG is, for a monochrome image, 1 +if no alpha channel or 2 if an alpha channel; or for a color image, 3 +if no alpha channel or 4 if an alpha channel. + +(if the file has not been rejected at this point, it will be accepted; +there are no further rejection cases.) + +at this point the "delta packets" are applied to the image. this process +is modeled on the assumption that the PNG file is decoded into a continuous +stream of pixels, with the first row of pixels, then the second, then the +third each consecutive in memory. + +each delta packet specifies a horizontal run of pixels to change. these +horizontal runs are identified by their starting pixel offset location, not +their coordinates, and the runs can cross multiple scanlines. the "pixel +offset" of the top-left pixel is 0; that of the pixel at (1,0) is 1; that +of the pixel at (0,1) is WIDTH. + +each replacement pixel in the delta packet consists of CHN bytes, which +replace the corresponding pixel in the PNG image. (they are not alpha-blended; +they are directly replaced). the bytes in each pixel are stored in the +following order: + + +-----|--------------------------| + | CHN | byte ordering | + +-----|--------------------------| + | 1 | luminance | + | 2 | lum, alpha | + | 3 | red, green, blue | + | 4 | red, green, blue, alpha | + +-----|--------------------------| + +each successive pixel in the delta packet occurs at an increasing pixel +offset. For example, with START=10, LEN=3, the packet modifies the 3 +pixels at offsets 10, 11, and 12. If CHN is 4, the pixel data will take +up 12 bytes, and the header will take up 8 bytes, so the total packet +will require 20 bytes. + +after the end of a packet, a new packet begins on the very next byte. + +let END = WIDTH * HEIGHT, i.e. one more than the largest valid offset.
+if START >= END, or START+LEN-1 >= END, or START+LEN < START, the +decoder MUST stop processing delta packets and discard the rest of the +SPK file data and produce the so-far-processed image. (in other words, +if the range specified includes any pixels outside of the image.) + +if end of file is reached in the middle of a delta packet, the decoder +MAY process the available part of the packet or MAY discard it entirely. +in either case it MUST produce the so-far-processed image. + +if the pixel regions specified in the delta packets overlap each other, +the decoder MUST produce results consistent with the above description +(that is, the produced pixel value comes from the last delta packet +that overlaps it). + +this completes the specification. + + +rationale +========= + +SPK files exist primarily to achieve more compressed archival of sets of +images that are very similar. we identify two important cases: + + 1. the set is stored in a directory + 2. the set is stored in a compressed archive (zip, rar, 7z, etc.) + +the goal of SPK is to minimize storage in both these cases, without +causing excessive overhead or complexity for image decoding. + +SPK is only intended to address _lossless_ compression; if the files +are lossily compressed, SPK is not applicable. + +the current standard for meeting the goal in case #1 in a lossless case +is to use PNG. + +the current standard for meeting the goal in case #2 is the use of +"solid" 7-zip archives. a "solid" archive allows for compression across +multiple files, finding shared elements between them. because PNG files +are already compressed, common elements in the original images are likely +to not be common in the PNG files. to address this, the standard approach +is to compress BMP files in a solid 7z archive. + +unfortunately, this means that converting between these two formats +requires both unpacking and then converting BMPs to PNGs.
the intermediate +step involves having all the BMPs unpacked; since the BMPs are uncompressed, +this can take up inordinate amounts of space. (the sets that motivated +the creation of SPK were multiple gigabytes as PNGs, and tens of gigabytes +as BMPs!) + +SPK is intended to allow one or a few images from a similar-set to be +encoded as PNGs, and the rest to be encoded as SPKs. the SPK files are +themselves uncompressed for simplicity (it would be expensive to encode +each individual delta packet as a PNG, and using rectangles would be +very complicated for the encoder and might not be efficient). the SPK +file cannot be relative to another SPK file, which reduces the possible +space efficiency but keeps decoding fast and simple. although in some +cases a set is generated by applying several independent details in +combination, so many SPK files will be applying the exact same sets of +changes to a base file, we do not attempt to share this data, again for +speed and simplicity. + +with a solid archive, however, the delta packets will often be identical +between multiple SPK files, and the solid archiving will further compress +them. this allows further savings in the compressed case. + +some image viewers can view directly from archive files. however, viewing +a solid archive cannot be efficient, since random access is not possible. +in the non-solid case, the compressors can still reduce the size of the +SPK files (which aren't compressed), while leaving open the possibility of +random access. however, the image viewer will still need to decode two files, +so we doubt anybody would actually implement it. + +we tested with two sample data sets. the first involved 204 image +files, the second involved 1,212 image files. our encoder for the +first set preserved 37 PNG files, while that for the second preserved +56 PNG files.
+ ++-----------------------+--------------+---------------+ +| compression method | set 1 size | set 2 size | ++-----------------------+--------------+---------------+ +| BMP files | 152,859,816 | 1,381,745,448 | +| PNG files | 36,952,795 | 247,963,721 | +| PNG + SPK files | 7,300,912 | 24,933,386 | ++-----------------------+--------------+---------------+ +| PNG non-solid RAR | 36,951,803 | 247,641,338 | +| PNG solid RAR | 30,061,920 | 194,108,416 | +| PNG solid 7z | 26,860,657 | 145,120,893 | +| BMP solid RAR | 16,760,954 | [*] | +| BMP solid 7z | 4,081,486 | 18,612,638 | +| PNG+SPK non-solid RAR | 6,962,703 | 21,005,839 | +| PNG+SPK solid RAR | 5,633,416 | 11,246,404 | +| PNG+SPK solid 7z | 5,558,082 | 10,908,167 | ++-----------------------+--------------+---------------+ + +we omitted [*] because solid RAR is just never competitive with solid 7z. + +according to these results, when stored as files, not in archives, use of .SPK +files for sets of images with many similar images can achieve a significant +storage savings, in our case, between 5x and 10x. + +additionally, they are competitive with the standard way of minimizing +storage in archives. For non-solid archives, they are strictly superior +(showing a range of 6x to 12x as expected--since PNGs are already compressed, +non-solid compression offers little additional gain). for solid archives, +PNG+SPK files are always superior to PNG or BMP RAR solid archives, and they +are always superior to PNG 7z archives. Compared to the best standard +compression, BMP 7z, they are sometimes smaller and sometimes larger, in +these two cases showing a range of 1:1.5 (worse) to 2:1 (better). we do not +know how well these generalize (are they more often better or more often +worse?) + +based on these results, we highly recommend using PNG+SPK in solid 7z archives +as a mechanism for distributing sets of highly-similar images.
the size +result is comparable, but the decode step does not require decoding through +an enormous intermediate step. instead, the archive will decompress to the +recommended smallest-possible format for storing files. at this point, the +PNG+SPK directory can be expanded to purely PNG images to maximize +compatibility with other applications, or used directly if compatible image +viewers exist. the only drawback is the lack of an SPK => PNG decoder on +all platforms. we have provided a windows implementation and the source +code so it can be ported, but we have not attempted a Mac or Linux port +ourselves. + +``` \ No newline at end of file
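
The header layout and delta-packet rules in the SPK document above can be sketched in Python. This is a minimal illustration, not the original decoder: the function names `parse_spk_header` and `apply_delta_packets` are hypothetical, the 4-byte integers are assumed to be unsigned, and PNG decoding is left out, so the caller is assumed to supply the base image as a flat `bytearray` of WIDTH * HEIGHT * CHN bytes in row order. A packet truncated by end of file is discarded entirely, which is one of the two behaviours the document permits.

```python
import struct

SIGNATURE = b"xPIC-delta-image"  # 16-byte signature from the header table


def parse_spk_header(data):
    """Parse the SPK header; return (name, width, height, chn, packet_offset).

    Hypothetical helper: raises ValueError for the rejection cases that can
    be checked without opening the referenced PNG file.
    """
    if len(data) < 21 or data[:16] != SIGNATURE:
        raise ValueError("not an SPK file")
    if data[16] != 0:
        raise ValueError("unsupported VERSION")
    # Assumption: integers are unsigned 32-bit little-endian.
    (flen,) = struct.unpack_from("<I", data, 17)
    if flen < 1 or 21 + flen + 12 > len(data):
        raise ValueError("truncated header")
    raw_name = data[21:21 + flen]
    if raw_name[-1] != 0:  # the byte at FLEN+20 must be the terminating nul
        raise ValueError("filename is not nul-terminated")
    name = raw_name[:-1].decode("utf-8")
    if any(sep in name for sep in "/:\\"):
        raise ValueError("filename contains a separator character")
    width, height, chn = struct.unpack_from("<III", data, 21 + flen)
    return name, width, height, chn, 21 + flen + 12


def apply_delta_packets(pixels, data, offset, width, height, chn):
    """Apply delta packets starting at `offset` to `pixels` in place.

    `pixels` is a bytearray holding the decoded base image, CHN bytes per
    pixel, rows stored consecutively.
    """
    end = width * height  # one more than the largest valid pixel offset
    while offset + 8 <= len(data):
        start, length = struct.unpack_from("<II", data, offset)
        offset += 8
        # Python integers do not wrap, so the overflow test is not needed;
        # start + length > end is the same condition as START+LEN-1 >= END.
        if start >= end or start + length > end:
            break  # stop and keep the image processed so far
        nbytes = length * chn
        if offset + nbytes > len(data):
            break  # end of file inside a packet: discard the partial packet
        # Direct replacement (no alpha blending); later packets win on overlap.
        pixels[start * chn:(start + length) * chn] = data[offset:offset + nbytes]
        offset += nbytes
    return pixels
```

A caller would first decode the PNG named by the header into `pixels`, check its width, height, and channel count against the header values, and only then call `apply_delta_packets`.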