Additional technical documentation about ImageWorsener
======================================================

This file contains extra information about ImageWorsener. The main
documentation is in readme.txt.

Web site: <http://entropymine.com/imageworsener/>

Acknowledgments
---------------

Some of the inspiration for this project came from these web pages:
  "Gamma error in picture scaling"
    http://www.4p8.com/eric.brasseur/gamma.html
  "How to make a resampler that doesn't suck"
    http://www.virtualdub.org/blog/pivot/entry.php?id=86

Information about resampling functions and other algorithms was gathered from
many sources, but ImageMagick's page on resizing was particularly helpful:
  http://www.imagemagick.org/Usage/resize/

Alternatives
------------

There are many applications and libraries that do image processing, but in the
free software world, the leader is ImageMagick (http://www.imagemagick.org/).
Or you might prefer ImageMagick's conservative alter-ego, GraphicsMagick
(http://www.graphicsmagick.org/).

Installing / Building from source
---------------------------------

Dependencies (optional):
  libpng <http://www.libpng.org/pub/png/libpng.html>
  zlib <http://zlib.net/>
  libjpeg <http://www.ijg.org/>
  libwebp
     <http://www.webmproject.org/code/#libwebp_webp_image_decoder_library>

Here are four possible ways to build ImageWorsener:

* Prebuilt Visual Studio 2008 project files

Open the scripts/imagew2008.sln file in Visual Studio 2008 or newer.

To compile without libwebp: Edit the project settings to not link to
libwebp.lib, and change the line in src/imagew-config.h to
"#define IW_SUPPORT_WEBP 0".

* Generic Makefile

In a Unix-ish environment, try typing "make -C scripts". It should build an
executable file named "imagew" or "imagew.exe".

To compile without libwebp: Set the "IW_SUPPORT_WEBP" environment variable to
"0" (type "IW_SUPPORT_WEBP=0 make").

* Using autotools

Official source releases contain a file named "configure". In simplest form,
run
   ./configure
then
   make

Many options can be passed to the "configure" utility. For help, run
   ./configure --help
Suggested options:
   CFLAGS="-g -O3" ./configure --disable-shared

If there is no "configure" file in the distribution you're using, you need to
generate it by running 
   scripts/autogen.sh
You must have GNU autotools (autoconf, automake, libtool) installed. To clean
up the mess made by autogen.sh, run
   scripts/autogen.sh clean

* Using CMake (deprecated?)

CMake is a utility that generates Makefiles and project files.

If you don't have CMake installed, you can download it from
<http://www.cmake.org/>.

In a Unix-ish environment:

$ mkdir build
$ cd build
$ cmake ..
$ make

Using CMake from Windows is not recommended at this time, but you can try it
if you want.

First, set your environment variables correctly by running a command prompt
via Start -> All Programs -> Microsoft Visual [whatever] ->
Visual Studio Tools -> Visual Studio Command Prompt. If you can't find such
a utility, look for a script named VCVARS32.BAT and run that.

To build from the command line:

> cd <path_to_imagew>
> mkdir build
> cd build
> cmake ..
> nmake

To build from the IDE:

> cd <path_to_imagew>
> mkdir build
> cd build
> cmake -G "Visual Studio 9 2008" ..
Now open the imagew.sln file.

Instead of "Visual Studio 9 2008", you can name any "generator" supported by
CMake. Consult the CMake documentation. Here are some examples:

 "Visual Studio 7 .NET 2003"
 "Visual Studio 8 2005"
 "Visual Studio 8 2005 Win64"
 "Visual Studio 9 2008"
 "Visual Studio 9 2008 Win64"
 "Visual Studio 10"
 "Visual Studio 10 Win64"

This is not meant to imply that IW is guaranteed to work with all of the
compilers listed above.

Philosophy
----------

ImageWorsener attempts to have good defaults. The user should not have to know
anything about gamma correction, bit depths, filters, windowing functions,
etc., in order to get good results.

IW tries to be as accurate as possible. It never trades accuracy for speed.
Really, it goes too far, as nearly everyone would rather have a program that
works twice as fast and is imperceptibly less accurate. But there are lots
of utilities that are optimized for speed, and there would be no reason for
IW to exist if it worked the same as everything else.

I don't intend to add millions of options to IW. It is nearly feature complete
as it is. I want most of the options to have some practical purpose (which may
include the ability to imitate what other applications do). Admittedly, some
fairly useless options exist just for orthogonal completeness, or to scratch
some particular itch I had.

I've taken a lot of care to make sure the resizing algorithms are implemented
correctly. I won't add an algorithm until I'm sure that I understand it. This
isn't so easy. There's a lot of confusing and contradictory information out
there.

IW's command line should not be thought of as a sequence of image processing
commands. Instead, imagine you're describing the properties of a display
device, and IW will try to create the best image for that device. For example,
if you tell IW to dither an image and resize it, it knows that it should
resize the image first, then dither it, instead of doing it in the opposite
order.

IW does not really care about the details of how an image is stored in a file;
it only cares about the essential image itself. For example, a 1-bit image is
treated the same as an 8-bit representation of the same image. If you resize a
bilevel image, you'll automatically get a high-quality grayscale image, not a
low-quality bilevel image.

Architecture
------------

IW has three components: The core library, the auxiliary library, and the
command-line utility.

The core library does the image processing, but does not do any file I/O. It
knows almost nothing about specific file formats. It has access to the
internal data structures defined in imagew-internals.h. It does not make any
direct calls to the auxiliary library.

The auxiliary library consists of the file I/O code that is specific to file
formats like PNG and JPEG. It does not use the internal data structures from
imagew-internals.h.

The public interface is completely defined in the imagew.h file. It includes
declarations for both the core and auxiliary library.

The command-line utility is implemented in imagew-cmd.c. It uses both the core
library and the auxiliary library.

The core and auxiliary libraries are separated in order to break dependencies.
For example, if your application supports only PNG files, you can probably
(given how most linkers work) build it without linking to libjpeg.

Files in core library:
 imagew-internals.h, imagew-main.c, imagew-resize.c, imagew-opt.c,
 imagew-api.c, imagew-util.c
 
Files in auxiliary library:
 imagew-png.c, imagew-jpeg.c, imagew-webp.c, imagew-gif.c, imagew-miff.c,
 imagew-bmp.c, imagew-tiff.c, imagew-zlib.c, imagew-allfmts.c

Files in command-line utility:
 imagew-cmd.c, imagew.rc, imagew.ico

Other files:
 imagew.h (Public header file, Core, Aux., Command-line)
 imagew-config.h (Core, Aux., Command-line)

Double-precision floating point?
--------------------------------

IW can be compiled to use any available floating point type for its internal
representation of samples. (Unfortunately, it's impractical to make this a
run-time option.) Its default is currently set to be "double", which is
usually an 8-byte floating-point number. This may seem like overkill, and I
admit, it probably is. Using double precision shouldn't have much effect on
performance, especially if it's compiled as a 64-bit application. But it will
use a lot more memory.

The real reason I haven't switched to single-precision is simply that I
haven't found any particular reason to do so. IW is intended to be used on
reasonably modern PCs, and it does not aim for low memory use or the highest
possible performance, so this might not be much of an issue.

For 8-bit target images, switching to 4-byte floating point affects very
roughly one pixel in every 50,000. For 16-bit target images, it's more like
one pixel in every few hundred. Not that that means anything -- the image
processing algorithms being used aren't even "correct" to that degree.

4-byte floating-point numbers give you about 7 significant digits, which in
extreme cases may not be quite enough. Particularly for 16-bit target images,
when working in a linear colorspace, bright samples are much, much brighter
than the dimmest samples. If IW has to add a huge number of dim pixels together
with just a few bright pixels, 7 significant digits might not be enough to
do the kind of accurate calculations it strives for.
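
The effect is easy to reproduce outside of IW. The following C sketch is only
an illustration (the function names are made up, and it is not code from IW).
It adds one bright sample to a large number of dim samples, once in single
precision and once in double precision, assuming "float" is a 4-byte IEEE 754
type and that arithmetic on it is not carried out in a wider type:

```c
#include <assert.h>

/* Sum one bright sample and n dim samples using 4-byte floats.
 * Every addition is rounded to roughly 7 significant digits. */
static float sum_single(float bright, float dim, long n)
{
    float acc = bright;
    long i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        acc += dim;
    return acc;
}

/* The same sum using 8-byte doubles (roughly 16 significant digits). */
static double sum_double(double bright, double dim, long n)
{
    double acc = bright;
    long i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        acc += dim;
    return acc;
}
```

With a bright sample of 1.0 and a million dim samples of 0.00000001 each, the
single-precision sum never moves away from 1.0, because each dim sample is
smaller than half the spacing between adjacent floats near 1.0. The
double-precision sum comes out close to the true total of 1.01.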

Unicode
-------

Text files like this one notwithstanding, I've had enough of ASCII, and I want
to support Unicode even in an application like this that does very little with
text. IW supports Unicode filenames, and will try to use Unicode quotation
marks, arrows, etc., if possible. If IW does not correctly figure out the
encoding you want, you can explicitly set it using the "-encoding" option. In
a Unix environment, Unicode output can also probably be turned off with
environment variables, such as by setting "LANG=C".

The encoding setting does not affect the interpretation of the parameters on
the command line. This should not be a problem in Windows, because Windows can
translate them. But on a Unix system, they are always assumed to be UTF-8.

All strings produced by the library (e.g. error messages) are encoded in UTF-8.
Applications must convert them if necessary.

Rationale for the default resizing algorithm
--------------------------------------------

By default, IW uses a Catmull-Rom ("catrom") filter for both upscaling and
downscaling. Why?

For one thing, I don't want to default to a filter that has any inherent
blurring. A casual user would expect that when you "resize" an image without
changing the size, it will not modify the image at all. This requirement
eliminates mitchell, gaussian, etc.

The "echoes" produced by filters like lanczos(3) are too weird, I think; and
they can be too severe when using proper gamma correction.

When upscaling, hermite, triangle, and pixel mixing just don't have acceptable
quality. That really only leaves catrom and lanczos2. I somewhat arbitrarily
chose catrom over lanczos2 (they are almost identical).

When downscaling, the differences between various algorithms are much more
subtle. Hermite and pixel mixing are both reasonable candidates, and are nice
in that they have no ringing at all. But they're not quite as sharp as catrom,
and can do badly with images that have thin lines or repetitive details.

Colorspaces
-----------

Unless it has reason to believe otherwise, IW assumes that images use the sRGB
colorspace. This is the colorspace that standard computer monitors use, and
it's a reasonable assumption that most computer image files (whether by
accident or design) are intended to be directly displayable on computer
monitors.

It does this even if the file format predates the invention of sRGB, and/or
the file format specification says that, by default, colors have a gamma of
2.2 (which is similar, but not identical, to sRGB).

IW does not support ICC color profiles. Full or partial support for them may
be added in a future version.

TIFF output support
-------------------

IW mainly sticks to the "baseline" TIFF v6 specification, but it will write
images with a sample depth of 16 bits, which is not part of the baseline spec.
It writes transparent images using unassociated alpha, which is probably less
common in TIFF files than associated alpha, and may not be supported as well
by TIFF viewers.

TIFF colorspaces
----------------

When writing TIFF files, IW uses the TransferFunction TIFF tag to describe the
colorspace that the output image uses. I doubt that many TIFF viewers read
this tag, and actually, I don't even know how to test whether I'm using it
correctly. You can disable the TransferFunction tag by using the "-nocslabel"
option.

GIF screen size vs. image size
------------------------------

Every GIF file has a global "screen size", and a sequence of one or more
images. Each image has its own size, and an offset to indicate its position on
the screen. By default, IW treats the screen size as the final image size, and
paints the GIF image (as selected by the -page option) onto the screen at the
appropriate position. Any area not covered by the image will be made
transparent.

If you use the -noincludescreen option, it will instead ignore the screen size
and the image position, and extract just the selected image.
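
As a rough illustration of the default behavior, here is a minimal C sketch
that paints one image onto a transparent screen. It is not IW's actual code:
the function name, the RGBA byte layout, and the decision to clip any part of
the image that falls outside the screen are all assumptions made for the
example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: paint a w x h image at (xpos, ypos) onto a
 * sw x sh screen. Both buffers hold 4 bytes per pixel: R, G, B, A. */
static void paint_on_screen(unsigned char *screen, int sw, int sh,
    const unsigned char *img, int w, int h, int xpos, int ypos)
{
    int x, y;

    assert(sw > 0 && sh > 0 && w > 0 && h > 0);

    /* Start with a fully transparent screen (alpha = 0 everywhere). */
    memset(screen, 0, (size_t)sw * (size_t)sh * 4);

    for (y = 0; y < h; y++) {
        int sy = y + ypos;
        if (sy < 0 || sy >= sh)
            continue;              /* row is outside the screen (assumed: clip) */
        for (x = 0; x < w; x++) {
            int sx = x + xpos;
            if (sx < 0 || sx >= sw)
                continue;          /* column is outside the screen */
            memcpy(&screen[((size_t)sy * (size_t)sw + (size_t)sx) * 4],
                   &img[((size_t)y * (size_t)w + (size_t)x) * 4], 4);
        }
    }
}
```

Any screen pixel that the image does not cover keeps its alpha value of 0,
which corresponds to the transparent area described above.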

MIFF support
------------

IW can write to ImageMagick's MIFF image format, and can read back the small
subset of MIFF files that it writes. MIFF supports floating point samples, and
this is intended to be used to store intermediate images, in order to perform
multiple operations on an image with no loss of precision. MIFF support is
experimental and incomplete. Some features, such as dithering, may not be
supported with floating point output.

To use ImageMagick to write a MIFF file that IW can read, try:
$ convert <input-file> -define quantum:format=floating-point -depth 32 \
 -compress Zip <output-file.miff>

Non-square pixels
-----------------

Most image formats can contain metadata specifying different "densities" (i.e.
number of pixels/inch) for the X and Y dimension. In other words, the pixels
can be thought of as being non-square rectangles.

Non-square pixels are a pain, and make it really messy to figure out the best
size and density to use for the output image, if (as is usually the case) the
user did not fully specify that information.

IW's rules are as follows:

If the user used the -noresize option, behave as if the user requested a height
and width that are exactly the size of the source image, and did not use
-bestfit.

If the user specified both the width and the height (absolute or relative), and
did not use the -bestfit flag, then IW doesn't have to "fit" the image in any
way, so there's no real difficulty. If a density is written to the output
image, it will likely indicate non-square pixels.

Otherwise, for the purposes of sizing, IW pretends that the input image is a
larger image (as measured by number of pixels) with square pixels. For example,
if an image is 150x150 pixels with a density of 100x200dpi, it will behave as
if it were 300x150, with a density of 200x200dpi. Thus, even if you don't tell
it to resize the image at all, the output image will be a different size in
pixels. If you use relative sizing (e.g. "-w x2"), it will be relative to the
adjusted size, not the original size.
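
The adjustment in the last rule can be sketched in C as follows. This is only
an illustration under stated assumptions, not IW's actual code: the function
name is hypothetical, and rounding to the nearest pixel is a guess.

```c
#include <assert.h>

/* Hypothetical helper: compute the size an image would have if it were
 * stored with square pixels at the higher of its two densities. */
static void square_pixel_size(int w, int h, double xdens, double ydens,
    int *adj_w, int *adj_h, double *adj_dens)
{
    assert(w > 0 && h > 0 && xdens > 0.0 && ydens > 0.0);

    /* Use the higher density for both dimensions, so the image can only
     * grow (in pixels), never shrink. */
    *adj_dens = (xdens > ydens) ? xdens : ydens;

    /* Scale each dimension by its density ratio. Rounding to the nearest
     * integer is an assumption of this sketch. */
    *adj_w = (int)(w * (*adj_dens / xdens) + 0.5);
    *adj_h = (int)(h * (*adj_dens / ydens) + 0.5);
}
```

For the 150x150 image at 100x200dpi mentioned above, this yields 300x150 at
200dpi. A relative width such as "-w x2" would then be applied to 300, not
to 150.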

"Color" of transparent pixels
-----------------------------

In image formats that use unassociated alpha values to indicate transparency,
pixels that are fully transparent still have "colors", but those colors are
irrelevant. IW will not attempt to retain such colors, and will make fully-
transparent pixels black in most cases. An exception is if the output image
uses color-keyed transparency, in which case it uses a different strategy.

Box filter
----------

It's not obvious how a box filter should behave when a source pixel falls
right on the boundary between two target pixels. There seem to be several
options:
1. "Clone" the source pixel, and put it into both "boxes" (target pixels).
2. "Split" the source pixel, and put it into both boxes, but with half the
   usual weight.
3. Arbitrarily select one of the two boxes (which could be the left box, the
   right box, or some other strategy like selecting the box nearest to the
   center of the image).
4. Ignore the problem, in which case the algorithm may behave unpredictably,
   due to the intricacies of floating point rounding. It may sometimes clone,
   sometimes round, and sometimes skip over a pixel completely.

IW arbitrarily selects the left (or top) box. To make it select the right (or
bottom) box instead, you could translate the image by a very small amount;
e.g. "-translate 0.000001,0.000001". To use the "clone" strategy, use a very
small blur; e.g. "-blur 1.000001".
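
To make the "select the left box" strategy concrete, here is a small C sketch.
It is an illustration only, not IW's implementation, and the function name is
hypothetical. It uses integer arithmetic so that the tie is detected exactly,
which sidesteps the rounding problems described in option 4.

```c
#include <assert.h>

/* Return the index of the target pixel ("box") that receives source pixel i
 * when a row of src_w pixels is resized to dst_w pixels.
 *
 * The center of source pixel i, in target coordinates, is
 *   (i + 0.5) * dst_w / src_w  =  num / den
 * with num = (2*i + 1) * dst_w and den = 2 * src_w. */
static int box_for_source_pixel(int i, int src_w, int dst_w)
{
    long num, den;

    assert(src_w > 0 && dst_w > 0 && i >= 0 && i < src_w);
    num = (2L * i + 1) * dst_w;
    den = 2L * src_w;

    /* This computes ceil(num/den) - 1. Away from a boundary it equals
     * floor(num/den). Exactly on the boundary between boxes k-1 and k,
     * it picks k-1, the left box. */
    return (int)((num - 1) / den);
}
```

For example, when resizing from 6 pixels to 4, the center of source pixel 1
falls exactly on the boundary between target pixels 0 and 1, and the function
assigns it to target pixel 0.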

Nearest neighbor
----------------

When using the nearest neighbor algorithm, if a target pixel is equally close
to two source pixels, it will be given the color of the one to the right (or
bottom). This is the same tiebreaking logic as is used for the box filter. (It
may sound like it's the opposite, but it's not: image features are shifted to
the left in each case.) As with a box filter, you can change this by
translating the image by a very small amount.
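
A minimal C sketch of this tiebreaking rule, again using exact integer
arithmetic, might look like the following. The function name is hypothetical
and this is not taken from IW's source.

```c
#include <assert.h>

/* Return the index of the source pixel whose color is given to target
 * pixel j when a row of src_w pixels is resized to dst_w pixels.
 *
 * The center of target pixel j, in source coordinates, is
 *   (j + 0.5) * src_w / dst_w  =  num / den
 * with num = (2*j + 1) * src_w and den = 2 * dst_w. */
static int nearest_source_pixel(int j, int src_w, int dst_w)
{
    long num, den;

    assert(src_w > 0 && dst_w > 0 && j >= 0 && j < dst_w);
    num = (2L * j + 1) * src_w;
    den = 2L * dst_w;

    /* floor(num/den). Exactly on the boundary between source pixels k-1
     * and k, the two are equally close, and this picks k, the one to the
     * right. */
    return (int)(num / den);
}
```

When resizing from 4 pixels to 6, target pixel 1 is centered exactly between
source pixels 0 and 1, and it takes the color of source pixel 1.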

PNG sBIT chunks
---------------

If a PNG image contains the rarely-used sBIT chunk, IW will ignore any bits
that the sBIT chunk indicates are not significant.

Suppose you have an 8-bit grayscale image with an sBIT chunk that says 3 bits
are significant. This means there will probably be only 8 distinct colors in
the image, similar to these:

00000000 =   0/255 = 0.00000000
00100100 =  36/255 = 0.14117647
01001001 =  73/255 = 0.28627450
01101101 = 109/255 = 0.42745098
10010010 = 146/255 = 0.57254901
10110110 = 182/255 = 0.71372549
11011011 = 219/255 = 0.85882352
11111111 = 255/255 = 1.00000000

IW, though, will see only the significant bits, and will interpret the image
like this:

000 = 0/7 = 0.00000000
001 = 1/7 = 0.14285714
010 = 2/7 = 0.28571428
011 = 3/7 = 0.42857142
100 = 4/7 = 0.57142857
101 = 5/7 = 0.71428571
110 = 6/7 = 0.85714285
111 = 7/7 = 1.00000000

So, the interpretation is slightly different (e.g. 0.14285714 instead of
0.14117647).

A similar thing happens with BMP images with 16 bits/pixel, in which a color
channel usually has 5 or 6 bits. A value of 7/31, for example, is not converted
to 58/255, but is interpreted as exactly 7/31.
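
The interpretation described in this section amounts to the following C
sketch. The function name and parameters are hypothetical; it is meant to
show the arithmetic, not to reproduce IW's code.

```c
#include <assert.h>

/* Convert a stored sample to a value in the range 0.0 to 1.0, using only
 * its significant bits.
 *   raw:   the sample as stored in the file
 *   depth: number of bits stored per sample (e.g. 8)
 *   sig:   number of significant bits (e.g. 3, from an sBIT chunk) */
static double sample_from_significant_bits(unsigned int raw, int depth,
    int sig)
{
    unsigned int top, maxval;

    assert(sig >= 1 && sig <= depth && depth <= 16);
    assert(raw < (1u << depth));

    top = raw >> (depth - sig);   /* keep only the significant high bits */
    maxval = (1u << sig) - 1u;    /* largest possible value, e.g. 7 */
    return (double)top / (double)maxval;
}
```

With depth 8 and 3 significant bits, the stored value 00100100 (36) becomes
001, which is interpreted as 1/7 rather than 36/255. With depth and
significant bits both set to 5, the value 7 is interpreted as 7/31.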

BMP transparency
----------------

Windows BMP images that use RLE compression can leave the color of some pixels
undefined, by using "delta" codes, or premature end-of-line codes. Many
applications give these undefined pixels the first color in the palette.
Others interpret them as black. Still others (such as
IW, Mozilla Firefox, and Google Chrome) interpret them as transparent.

IW has a "-bmptrns" option to create such a transparent BMP, but it's kind of
a hack. It will only work if the final image has no more than 255 opaque
colors, and does not have partial transparency. If that's not the case, it will
fail, and write no image at all.

Transparent BMP images can have up to 256 opaque colors, but IW currently
limits it to 255. It leaves the first palette color unused, and assigns it a
bright color, so that it's likely to contrast with the foreground image.

IW is not really a good application to use to create images that are restricted
to a certain number of colors, because it does not support generating optimized
palettes. If your image has too many colors, the best you can do is to
posterize it. For example:
  imagew in.png out.bmp -bmptrns -cc 6 -ccgreen 7 -ccalpha 2 -dither f

Ordered dithering + transparency
--------------------------------

Ordered (or halftone) dithering with IW can produce poor results when used
with images that have partial transparency. If you ordered-dither both the
colors and the alpha channel, you can have a situation where all the (say)
darker pixels are made transparent, leaving only the lighter pixels visible,
and making the image much lighter than it should be. This happens because the
same dither pattern is used for two purposes (color thresholding and
transparency thresholding).
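
The following C sketch shows the effect on a single 50% gray, 50% opaque
color. It is a simplified illustration, not IW's dithering code: it assumes a
2x2 Bayer threshold matrix and 1-bit output for both color and alpha, and the
names are hypothetical.

```c
#include <assert.h>

/* 2x2 Bayer matrix, scaled to thresholds between 0 and 1. */
static const double bayer2[2][2] = {
    { 0.125, 0.625 },
    { 0.875, 0.375 }
};

struct dithered {
    int color;   /* 1 = white, 0 = black */
    int opaque;  /* 1 = opaque, 0 = transparent */
};

/* Dither one pixel, using the same threshold for color and for alpha. */
static struct dithered dither_shared(double color, double alpha, int x, int y)
{
    struct dithered d;
    double t;

    assert(x >= 0 && y >= 0);
    t = bayer2[y & 1][x & 1];
    d.color = (color > t);
    d.opaque = (alpha > t);
    return d;
}
```

For color 0.5 and alpha 0.5, the two positions with low thresholds come out
white and opaque, while the two positions with high thresholds come out black
and transparent. Every visible pixel is white, so the result looks much
lighter than the 50% gray it is supposed to represent.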

Obscure details about clamping, backgrounds, and alpha channel resizing
-----------------------------------------------------------------------

"Clamping" is the restricting of sample values to the range that is
displayable on a computer monitor. This must be done when writing to any file
format other than MIFF. But if you use -intclamp, it will also be done at
other times. Essentially, it will be done as often as possible, after every
dimension of every resizing operation. If a background is applied after
resizing, clamping will be done individually to both the alpha channel and the
color channels, then the background will be applied.

If you don't use -intclamp, no clamping will be done, except as the very last
step. If IW applies a background after resizing the image, the alpha channel
will not be clamped first, so it could actually contain negative opacity
values. That's hard to envision, but the math works out, and you generally get
the same result as if you had applied the background before resizing.

Currently, the only time IW applies a background before resizing is when a
channel offset is being used. This means that using -offset can have
unexpected side effects if you also use -intclamp.

Cropping
--------

IW's -crop option crops the image before resizing it, completely ignoring any
pixels outside the region to crop. This is not quite ideal. Ideally, any pixel
that could have an effect on the pixels at the edge of the image should be kept
around until after the resize, then the crop should be completed. This is not
difficult in theory, but coding it would be messy enough that I haven't
attempted it.

To do
-----

Features I'm considering adding:

- More options for specifying the image size to use; e.g. "enlarge the image
  only if it's smaller than a certain size".

- More options for aligning the input pixels with the output pixels.

- Ability to maintain PNG and GIF background colors.

- Hilbert curve dithering.

- Support for ICC color profiles.

Contributing
------------

I may accept code contributions, if they fit the spirit of the project. I will
probably not accept contributions on which you or someone else claims
copyright. At this stage, I want to retain the ability to change the licensing
terms unilaterally.

Of course, the license allows you to fork your own version of ImageWorsener if
you wish to.
