This document explains how to do image resampling, in the way that ImageWorsener does it. It’s not really about ImageWorsener, though, and some of the options mentioned here are not supported by ImageWorsener.

This is not the only way to resample images. There are other algorithms that are more sophisticated, and potentially better. But it is more or less how most applications do it, and it’s not easy to improve upon.

I’m only going to show how to resize a one-dimensional image having one sample per pixel (i.e. a grayscale image).

To resize a two-dimensional image, simply resize it in two passes:
one for each dimension.
(For a discussion of which dimension to resize first, see the
*Which dimension?* section below.)

For a color RGB image, resize the red, green, and blue components separately, as if you had three separate images.

If your image has transparency, you can resize the alpha channel
the same as a normal color channel. However, the *color* channels may
require special processing
in this situation.

Before resizing an image, it’s best to convert it to a linear colorspace. Then, after all the resizing is complete, convert it back to the original colorspace. Do not convert alpha channels in this way, as they are most likely already linear representations of the opacity.
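Here is a minimal sketch of that round trip. It assumes the original colorspace is sRGB (the text above doesn't name one), and that samples are floating-point values where 0 is black and 1 is white. The function names are just for illustration.

```python
def srgb_to_linear(c):
    """Convert one sRGB-encoded sample to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Convert one linear-light sample back to sRGB encoding."""
    # The linear segment also covers negative values, which a filter
    # with negative lobes can produce.
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * c ** (1 / 2.4) - 0.055


row = [0.0, 0.2, 0.5, 1.0]                      # sRGB-encoded samples
linear = [srgb_to_linear(c) for c in row]       # resize this version
restored = [linear_to_srgb(c) for c in linear]  # convert back afterward
```

In a real program the resize would happen between the two conversions. An alpha channel would skip both of them.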

The above diagram represents *pixels* as rectangular regions,
and *samples* as points located at the center of the
corresponding pixel (indicated by the blue and green dots).
The discussion below will follow this convention, in that
pixels have a nonzero size, and the distance between two neighboring
samples is one pixel.

You may have seen graphs of resampling filters, like these:

How do you actually go about using such a filter to resize an image?

Before we get to that, take note of a few properties of these filters.

On the horizontal scale, a length of 1 is the size of one
“pixel” (more about that later).
The vertical scale usually does not matter from the programmer’s
perspective, because we will be
*normalizing* the samples in a certain way that will automatically
scale it correctly.

Once a filter is normalized, the area under its curve will be 1. (In calculus terminology, the integral from −∞ to +∞ is 1.) A filter without this property would cause resized images to be consistently darker or lighter than the original. In this example, the area of the green regions minus the area of the red regions equals 1.

Choose any real number, say for example 0.25. Now form the set of all numbers that differ from that number by an integer: {..., −3.75, −2.75, −1.75, −0.75, 0.25, 1.25, 2.25, 3.25, ...}. If you add up the curve’s values at those points, the sum will always be 1. So, in the following image, the lengths of the green lines minus the lengths of the red lines should equal 1.
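Both of these properties are easy to check numerically. The sketch below does so for a Catmull-Rom filter, which I've picked only as a convenient example. The helper names `area` and `comb_sum` are made up for this illustration.

```python
def catmull_rom(x):
    """Catmull-Rom cubic filter; nonzero only for |x| < 2."""
    x = abs(x)
    if x < 1.0:
        return 1.5 * x**3 - 2.5 * x**2 + 1.0
    if x < 2.0:
        return -0.5 * x**3 + 2.5 * x**2 - 4.0 * x + 2.0
    return 0.0


def area(filt, radius, steps=100000):
    """Approximate the integral over [-radius, radius] (midpoint rule)."""
    dx = 2.0 * radius / steps
    return sum(filt(-radius + (k + 0.5) * dx) for k in range(steps)) * dx


def comb_sum(filt, offset, radius):
    """Sum the filter at every point that differs from offset by an integer."""
    reach = int(radius) + 1
    return sum(filt(offset + k) for k in range(-reach, reach + 1))


print(area(catmull_rom, 2.0))            # should be very close to 1
print(comb_sum(catmull_rom, 0.25, 2.0))  # should be very close to 1
```

Trying other offsets in place of 0.25 should give the same sum, up to floating-point rounding.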

If the filter’s value is 1 at position 0, and 0 at all other
integers, then it has the property that if the source and target
images are the same size, it will leave the image unchanged.
You might expect that all good filters have this property, but
that’s not the case. It *is* a nice property to have,
but it can be worth
sacrificing in order to improve the filter in other ways.

The filter should be symmetric around the line x=0. If it were symmetric around a different line, the new image would be shifted to the left or right. If it were asymmetric, symmetric images would, after resizing, become asymmetric.

Now for the actual resampling algorithm. I’ll consider enlarging (upscaling) and reducing (downscaling) separately.

*Enlarging* is the easier case to deal with.

First, align the source samples with the target samples in the appropriate fashion for the resampling that you want to do.

Scale the filter horizontally so that the distance from its
x-coordinates 0 to 1 is the size of
one *source* pixel.
It doesn’t necessarily matter how you scale it vertically.
If you
normalize your calculations as described below, the absolute vertical
scale is irrelevant.

Do the rest of these steps for each target sample:

Align the filter so that its origin corresponds to the position of the target sample (the large green dot in this diagram).

Note that this diagram shows (approximately) a Catmull-Rom filter, which is a little different from the filter shown in the previous examples on this page. The procedure is the same regardless of the filter used.

Now, consider all the source samples that fall within the domain where the filter takes nonzero values. In this example, there are four such samples (the blue dots that are in the region highlighted with a gradient). The number of source samples that contribute to a target sample will be the same for each target sample (with the possible exception of samples near the edges of the image).

Process the relevant source samples, and calculate a running total of two things:

- The value of the filter at the location of the source sample. I’ll call this sum *norm*.
- The value of the source sample multiplied by the value of the filter at that location. I’ll call this sum *val*.

After you’ve processed
all the relevant source samples, you can calculate
the value of the target sample. It is simply *val* divided by *norm*.

Depending on implementation details, the value of *norm* might always
be 1. If you know that’s the case, then you don’t need to calculate *norm*
at all.
But in general, you’ll find it easier if you calculate *norm* and use
it to normalize the value.
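The enlarging steps might be sketched as follows. This assumes sample number *k* of either image sits at the center of its pixel, that the two images are aligned edge to edge, and that a Catmull-Rom filter is used. Source samples beyond the edge of the image are simply skipped. The function name `enlarge_row` is my own.

```python
import math


def catmull_rom(x):
    """Catmull-Rom cubic filter; nonzero only for |x| < 2."""
    x = abs(x)
    if x < 1.0:
        return 1.5 * x**3 - 2.5 * x**2 + 1.0
    if x < 2.0:
        return -0.5 * x**3 + 2.5 * x**2 - 4.0 * x + 2.0
    return 0.0


def enlarge_row(src, dst_n, filt=catmull_rom, radius=2.0):
    """Enlarge a row of samples to dst_n samples (dst_n >= len(src))."""
    src_n = len(src)
    dst = []
    for j in range(dst_n):
        # Position of target sample j, measured in source pixels.
        center = (j + 0.5) * src_n / dst_n
        # Source sample i is at position i + 0.5. This range is slightly
        # generous; the filter returns 0 for anything outside its domain.
        first = math.floor(center - radius - 0.5)
        last = math.ceil(center + radius - 0.5)
        val = 0.0
        norm = 0.0
        for i in range(max(first, 0), min(last, src_n - 1) + 1):
            weight = filt(i + 0.5 - center)
            norm += weight
            val += src[i] * weight
        dst.append(val / norm)
    return dst


print(enlarge_row([10.0, 20.0, 30.0, 40.0], 8))
```

If the target size equals the source size, this filter gives weight 1 to the matching source sample and 0 to its neighbors, so the row comes back unchanged.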

Downscaling is almost the same as upscaling, but with one key difference:
the filter is scaled to the size
of the *target* pixels instead of the source pixels. The number of
source samples that contribute to a target sample will not, in general,
be the same for all target samples.

The general rule is that you scale the filter to whichever pixels are larger, that is, to the pixels of whichever image has fewer of them. When downscaling, the target pixels are larger.

Now, proceed the same as when enlarging. Look at all the source
samples in the filter’s nonzero range, add up their weighted values,
then normalize. In this example, there are 5 or 6 such source
samples, depending on whether you consider the one right on the
border. Fortunately, the filter’s value is zero at that point, so it
makes no difference whether we include it or not.
(Borderline pixels like this can be an annoyance, though, with filters that
aren’t continuous, notably a *box* filter.)
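Here is a rough sketch of downscaling. The only real change from enlarging is the `stretch` factor that widens the filter to the size of a target pixel. I've used a triangle filter purely to keep the example short. It assumes samples sit at pixel centers and that the images are aligned edge to edge, and the name `reduce_row` is my own.

```python
import math


def triangle(x):
    """Triangle (linear) filter; nonzero only for |x| < 1."""
    return max(0.0, 1.0 - abs(x))


def reduce_row(src, dst_n, filt=triangle, radius=1.0):
    """Reduce a row of samples to dst_n samples (dst_n <= len(src))."""
    src_n = len(src)
    # Size of one target pixel, measured in source pixels.
    stretch = src_n / dst_n
    dst = []
    for j in range(dst_n):
        # Position of target sample j, measured in source pixels.
        center = (j + 0.5) * stretch
        first = math.floor(center - radius * stretch - 0.5)
        last = math.ceil(center + radius * stretch - 0.5)
        val = 0.0
        norm = 0.0
        for i in range(max(first, 0), min(last, src_n - 1) + 1):
            # Dividing by stretch scales the filter to target-pixel size.
            weight = filt((i + 0.5 - center) / stretch)
            norm += weight
            val += src[i] * weight
        dst.append(val / norm)
    return dst


print(reduce_row([1.0, 3.0, 5.0, 7.0], 2))
```

For the first target sample in this example, the weights come out as 0.75, 0.75, 0.25 and 0, so *val* is 4.25 and *norm* is 1.75.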

A common thing to do after downscaling an image is to post-process it by applying a sharpening filter (not discussed in this document) to the resized image. This artificially emphasizes the edges of objects, which can be good if the image is going to be displayed at a very small size, as an icon. Whether it’s desirable in other cases is debatable. It does tend to make images look better at first glance, but at the expense of realism.

The straightforward way to implement this algorithm is fairly slow, but in most cases there’s an easy way to make it more efficient.

Note that every row in the target image will use the same set of weights. Calculate the weights only once, and store them in a list. Each item in the list might contain a source sample number, a target sample number, and a weight. Then, for each row, simply read through the list, and do what it says (multiply the source sample by the weight, and add it to the target sample).

Do the same thing when resizing the columns. They will usually need a different weight list.

The time it takes to create the weight lists should be almost insignificant, so it really doesn’t matter if your filter function uses expensive transcendental math functions.
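A weight list along these lines might look like the sketch below. It assumes a triangle filter, samples at pixel centers, and images aligned edge to edge. The normalization is folded into the stored weights, so applying the list is nothing but multiply-and-add. All of the function names are hypothetical.

```python
import math


def triangle(x):
    """Triangle (linear) filter; nonzero only for |x| < 1."""
    return max(0.0, 1.0 - abs(x))


def make_weight_list(src_n, dst_n, filt=triangle, radius=1.0):
    """Return a list of (source index, target index, weight) items."""
    ratio = src_n / dst_n
    stretch = max(1.0, ratio)  # scale the filter to the larger pixels
    weights = []
    for j in range(dst_n):
        center = (j + 0.5) * ratio
        first = math.floor(center - radius * stretch - 0.5)
        last = math.ceil(center + radius * stretch - 0.5)
        entries = []
        for i in range(max(first, 0), min(last, src_n - 1) + 1):
            w = filt((i + 0.5 - center) / stretch)
            if w != 0.0:
                entries.append((i, w))
        norm = sum(w for _, w in entries)
        weights.extend((i, j, w / norm) for i, w in entries)
    return weights


def apply_weight_list(row, dst_n, weights):
    """Resize one row (or column) by reading through the weight list."""
    out = [0.0] * dst_n
    for i, j, w in weights:
        out[j] += row[i] * w
    return out


def resize_image(image, dst_w, dst_h):
    """Resize a list of rows: horizontal pass first, then vertical."""
    src_h, src_w = len(image), len(image[0])
    h_weights = make_weight_list(src_w, dst_w)
    rows = [apply_weight_list(r, dst_w, h_weights) for r in image]
    v_weights = make_weight_list(src_h, dst_h)
    cols = [apply_weight_list(c, dst_h, v_weights) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


image = [[float(x + y) for x in range(6)] for y in range(4)]
small = resize_image(image, 3, 2)
```

The filter function is only called inside `make_weight_list`, which runs once per dimension, not once per row.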

Note that this optimization may not be possible if you’re doing something fancy, such as rotating the image by 37 degrees.

What do you do near the edges of the image, when some of the source samples you need for your calculation do not exist? There are two main strategies: one is to invent “virtual pixels”, and I don’t know what the other one is named so I’ll just call it the “standard method”.

The *de facto* standard approach seems to be to do nothing special. If you do the normalization described above, you should be okay: instead of a weighted average of, say, 4 samples, you might have a weighted average of only 3 or 2 samples.

Under normal circumstances, there will always be enough source samples to give a meaningful result. By normal circumstances, I mean a number of things, including the fact that the filter takes only positive values on the interval between −1 and 1, which is not the case for some kinds of “sharpened” filters.

Another requirement is that the target image be aligned exactly with the source image. If it’s shifted or scaled in an unusual way, there may not be enough (or any) source samples to use. You could end up dividing by zero, or calculating a very strange sample value.

I *suspect* that many resampling implementations have a
special-case rule, something like:
If there are not enough source samples near the target sample,
simply copy the nearest source sample to the target sample.
Such a rule would be necessary in order for so-called
“point” resampling to make sense. But I don’t know exactly what
“not enough” means.

A strategy that always works is to invent “virtual” source samples to use to replace the ones that are beyond the edge of the image.

One way to do this is *replication*: give a virtual pixel
the value of the nearest
pixel that does exist. This is easy, but it can give the edge pixels
too much importance.

Or you could make the virtual pixels some predetermined color (black? white? gray?), or transparent.

Or you could use some method to extrapolate from the nearest two or more image pixels to estimate what the missing samples ought to be. (You’ll probably need to handle extremely small images as a special case.)

Or you could make the virtual pixels “wrap around” to the other side of the image. This is a good thing to do if and only if the image is going to be tiled; e.g. used as a web page background.
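Three of these strategies (replication, a predetermined value, and wrapping around) can be sketched as a single lookup function. The function name and the mode names are hypothetical, and extrapolation is left out because there are too many ways to do it.

```python
def virtual_sample(row, i, mode="replicate", fill=0.0):
    """Return sample i of row, inventing a value if i is out of range."""
    n = len(row)
    if 0 <= i < n:
        return row[i]               # a real sample
    if mode == "replicate":
        return row[0] if i < 0 else row[n - 1]
    if mode == "constant":
        return fill                 # some predetermined value
    if mode == "wrap":
        return row[i % n]           # Python's % is never negative here
    raise ValueError("unknown mode: " + mode)


row = [10.0, 20.0, 30.0]
print([virtual_sample(row, i) for i in range(-2, 5)])
print([virtual_sample(row, i, "wrap") for i in range(-2, 5)])
```

A resampling loop would call this wherever it would otherwise index the source row directly, and would no longer skip out-of-range indices.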

The dimensions of the source image are by definition an integral
number of pixels, but it’s easy to overlook the fact that this is
not really the case for the *target* image. By that, I mean that
if you want to reduce a 99-pixel row to half that size, you don’t
have to round it and change the image features’ size
by a factor of 50/99. There’s nothing stopping you from
changing them by a factor of exactly 0.5
(49.5/99). Assuming you then use 50 pixels to store the resized row,
the outermost pixels would not be as well-defined as usual;
refer to the *Image edges* section above.
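To make the 99-pixel example concrete, here is a small sketch that maps each target sample back to a position in the source row, once with the exact factor 0.5 and once with the rounded factor 50/99. It assumes the left edges of the two images coincide, which is just one possible alignment. The name `source_position` is made up.

```python
def source_position(j, scale):
    """Position, in source pixels, of the center of target sample j.

    Assumes the left edges of the source and target images coincide.
    """
    return (j + 0.5) / scale


src_n, dst_n = 99, 50

exact = [source_position(j, 0.5) for j in range(dst_n)]
rounded = [source_position(j, dst_n / src_n) for j in range(dst_n)]

print(exact[-1])    # center of the last target sample, exact factor
print(rounded[-1])  # same, with the factor rounded to 50/99
```

With the exact factor, the last target sample lands at position 99, which is right on the edge of the source row. About half of the source samples its filter would normally cover do not exist.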

Don’t think of resampling an image as simply changing the number of pixels. Sometimes that is what you want. But in general, think of it as painting a resized image onto a canvas, where the canvas could be larger, smaller, or the same size as the resized image.

It’s possible for a filter to give you a value for a target sample that is darker or lighter than any source sample. It could even be darker or lighter than the range of values that you are capable of storing in your image. I discuss this issue on another page.

The dimension you choose to resize first has no significant impact on image quality, or on the number of calculations needed.

If you use a filter that has negative values, and you clamp intermediate samples, then it does make a difference which dimension you resize first. But there is probably no reason to prefer one dimension over the other.

If you are writing the resized image to a file, it may be more convenient to resize the vertical dimension first. That way, you’ll finish resizing the image one row at a time.

If you want to follow the crowd, then based on the evidence I’ve seen, most applications resize the horizontal dimension first.

If performance is your goal, be aware of CPU caching issues. Data that is close together in memory can usually be processed more quickly. Most likely, your image is stored in such a way that samples in the same row are close together in memory. That means the horizontal resize will run more quickly, so you should try to minimize the amount of work done by the vertical resize. If you are reducing the image, resize it horizontally first. If you are enlarging the image, resize it vertically first.

Algorithms of this type are potentially “unfair”, in the sense that some source pixels have less effect on the resized image than others do. I discuss this issue on another page.