Basics of Image Resampling

Introduction

This document explains how to do image resampling, in the way that ImageWorsener does it. It’s not really about ImageWorsener, though, and some of the options mentioned here are not supported by ImageWorsener.

This is not the only way to resample images. There are other algorithms that are more sophisticated, and potentially better. But it is more or less how most applications do it, and it’s not easy to improve upon.

I’m only going to show how to resize a one-dimensional image having one sample per pixel (i.e. a grayscale image):

To resize a two-dimensional image, simply resize it in two passes: one for each dimension. (For a discussion of which dimension to resize first, see the Which dimension? section below.)

For a color RGB image, resize the red, green, and blue components separately, as if you had three separate images.

If your image has transparency, you can resize the alpha channel the same as a normal color channel. However, the color channels may require special processing in this situation.

Before resizing an image, it’s best to convert it to a linear colorspace. Then, after all the resizing is complete, convert it back to the original colorspace. Do not convert alpha channels in this way, as they are most likely already linear representations of the opacity.

The above diagram represents pixels as rectangular regions, and samples as points located at the center of the corresponding pixel (indicated by the blue and green dots). The discussion below will follow this convention, in that pixels have a nonzero size, and the distance between two neighboring samples is one pixel.
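As an aside on that colorspace conversion: if the image uses the common sRGB encoding, converting a sample to linear light and back might look something like the sketch below. The function names are mine, chosen for illustration; the constants are the standard sRGB ones, and samples are assumed to be in the range 0.0 to 1.0.

    def srgb_to_linear(s):
        # s is an sRGB-encoded sample value, 0.0 to 1.0.
        if s <= 0.04045:
            return s / 12.92
        return ((s + 0.055) / 1.055) ** 2.4

    def linear_to_srgb(lin):
        # lin is a linear-light value, 0.0 to 1.0.
        if lin <= 0.0031308:
            return lin * 12.92
        return 1.055 * (lin ** (1.0 / 2.4)) - 0.055

    # Convert to linear light, do all the resizing there, then convert back:
    #   linear_row = [srgb_to_linear(s) for s in row]
    #   ... resize linear_row ...
    #   new_row = [linear_to_srgb(s) for s in resized_row]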

Resampling Filters

You may have seen graphs of resampling filters, like these:

How do you actually go about using such a filter to resize an image?

Before we get to that, take note of a few properties of these filters.

On the horizontal scale, a length of 1 is the size of one “pixel” (more about that later). The vertical scale usually does not matter from the programmer’s perspective, because we will be normalizing the samples in a certain way that will automatically scale it correctly.

Once a filter is normalized, the area under its curve will be 1. (In calculus terminology, the integral from −∞ to +∞ is 1.) A filter without this property would cause resized images to be consistently darker or lighter than the original. In this example, the area of the green regions minus the area of the red regions equals 1.

Choose any real number, say for example 0.25. Now form the set of all numbers that differ from that number by an integer: {..., −3.75, −2.75, −1.75, −0.75, 0.25, 1.25, 2.25, 3.25, ...}. If you add up the curve’s values at those points, the sum will always be 1. So, in the following image, the lengths of the green lines minus the lengths of the red lines should equal 1.
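You can check this property numerically. The sketch below does so for a Catmull-Rom filter (a filter that comes up again later in this document); the formula is the standard Catmull-Rom kernel, and the 0.25 starting point is arbitrary.

    def catmull_rom(x):
        # Standard Catmull-Rom cubic filter; nonzero only for |x| < 2.
        x = abs(x)
        if x < 1.0:
            return 1.5 * x**3 - 2.5 * x**2 + 1.0
        if x < 2.0:
            return -0.5 * x**3 + 2.5 * x**2 - 4.0 * x + 2.0
        return 0.0

    # Sum the filter's values at all points that differ from 0.25 by an
    # integer. The result should be 1 (up to floating-point error).
    total = sum(catmull_rom(0.25 + k) for k in range(-10, 11))
    print(total)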

If the filter’s value is 1 at position 0, and 0 at all other integers, then it has the property that if the source and target images are the same size, it will leave the image unchanged. You might expect that all good filters have this property, but that’s not the case. It is a nice property to have, but it can be worth sacrificing in order to improve the filter in other ways.

The filter should be symmetric around the line x=0. If it were symmetric around a different line, the new image would be shifted to the left or right. If it were asymmetric, symmetric images would, after resizing, become asymmetric.

Now for the actual resampling algorithm. I’ll consider enlarging (upscaling) and reducing (downscaling) separately.

Enlarging

Enlarging is the easier case to deal with.

First, align the source samples with the target samples in the appropriate fashion for the resampling that you want to do.

Scale the filter horizontally so that the distance from x=0 to x=1 on its graph corresponds to the size of one source pixel. How you scale it vertically does not particularly matter: if you normalize your calculations as described below, the absolute vertical scale is irrelevant.

Do the rest of these steps for each target sample:

Align the filter so that its origin corresponds to the position of the target sample (the large green dot in this diagram).

Note that this diagram shows (approximately) a Catmull-Rom filter, which is a little different from the filter shown in the previous examples on this page. The procedure is the same regardless of the filter used.

Now, consider all the source samples that fall within the domain where the filter takes nonzero values. In this example, there are four such samples (the blue dots that are in the region highlighted with a gradient). The number of source samples that contribute to a target sample will be the same for each target sample (with the possible exception of samples near the edges of the image).

Process the relevant source samples, and calculate a running total of two things:

val: the sum of (the filter's value at the source sample's position) × (the value of that source sample).

norm: the sum of the filter's values at those same positions.

After you’ve processed all the relevant source samples, you can calculate the value of the target sample. It is simply val divided by norm.

Depending on implementation details, the value of norm might always be 1. If you know that’s the case, then you don’t need to calculate norm at all. But in general, you’ll find it easier if you calculate norm and use it to normalize the value.
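Putting the enlarging steps together, here is a sketch of resizing one row. This is illustrative code of my own, not ImageWorsener's; it assumes a filter function such as the catmull_rom one above, with a support radius of 2 source pixels, and it aligns the two sample grids by their centers (one common choice).

    import math

    def enlarge_row(src, dst_len, filt, radius=2.0):
        # src: the source row's sample values.
        # dst_len: number of target samples (>= len(src) when enlarging).
        # filt: the resampling filter, e.g. catmull_rom.
        # radius: how far from 0 the filter is nonzero, in source pixels.
        src_len = len(src)
        scale = src_len / dst_len   # size of a target pixel, in source pixels
        dst = []
        for j in range(dst_len):
            # Position of target sample j in source-pixel coordinates,
            # with the grids aligned by their centers.
            center = (j + 0.5) * scale - 0.5
            # Source samples falling within the filter's nonzero range.
            lo = int(math.floor(center - radius)) + 1
            hi = int(math.floor(center + radius))
            val = 0.0    # running total of (weight * sample value)
            norm = 0.0   # running total of the weights
            for i in range(lo, hi + 1):
                if i < 0 or i >= src_len:
                    continue   # near the edges; see "Image edges" below
                w = filt(i - center)   # filter scaled to source pixels
                val += w * src[i]
                norm += w
            dst.append(val / norm)
        return dst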

Reducing

Downscaling is almost the same as upscaling, but with one key difference: the filter is scaled to the size of the target pixels instead of the source pixels. The number of source samples that contribute to a target sample will not, in general, be the same for all target samples.

The general rule is that you scale the filter to the size of whichever pixels are larger: the source pixels when enlarging, and the target pixels when reducing. (Measured in source coordinates, the target pixels are the larger ones when downscaling; measured in target coordinates, the source pixels are the smaller ones. It is the same rule either way, just viewed from a different perspective.)

Now, proceed the same as when enlarging. Look at all the source samples in the filter’s nonzero range, add up their weighted values, then normalize. In this example, there are 5 or 6 such source samples, depending on whether you consider the one right on the border. Fortunately, the filter’s value is zero at that point, so it makes no difference whether we include it or not. (Borderline pixels like this can be an annoyance, though, with filters that aren’t continuous, notably a box filter.)
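In code, the only change from the enlarging sketch above is that the filter (and its reach) is stretched to the size of a target pixel whenever that is the larger one. A sketch, using the same hypothetical names as before:

    import math

    def resize_row(src, dst_len, filt, radius=2.0):
        # Handles both enlarging and reducing a single row.
        src_len = len(src)
        scale = src_len / dst_len
        # When reducing (scale > 1), stretch the filter to the size of the
        # target pixels; when enlarging, leave it at the source-pixel size.
        fscale = max(scale, 1.0)
        dst = []
        for j in range(dst_len):
            center = (j + 0.5) * scale - 0.5
            lo = int(math.floor(center - radius * fscale)) + 1
            hi = int(math.floor(center + radius * fscale))
            val = 0.0
            norm = 0.0
            for i in range(lo, hi + 1):
                if i < 0 or i >= src_len:
                    continue
                w = filt((i - center) / fscale)   # the stretched filter
                val += w * src[i]
                norm += w
            dst.append(val / norm)
        return dst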

Details

Post-processing

A common thing to do after downscaling an image is to post-process it by applying a sharpening filter (not discussed in this document) to the resized image. This artificially emphasizes the edges of objects, which can be good if the image is going to be displayed at a very small size, for example as an icon. Whether it's desirable in other cases is debatable. It does tend to make images look better at first glance, but at the expense of realism.

Optimization

The straightforward way to implement this algorithm is fairly slow, but in most cases there’s an easy way to make it more efficient.

Note that every row in the target image will use the same set of weights. Calculate the weights only once, and store them in a list. Each item in the list might contain a source sample number, a target sample number, and a weight. Then, for each row, simply read through the list, and do what it says (multiply the source sample by the weight, and add it to the target sample).

Do the same thing when resizing the columns. They will usually need a different weight list.

The time it takes to create the weight lists should be almost insignificant, so it really doesn’t matter if your filter function uses expensive transcendental math functions.
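Here is a sketch of that optimization, using the same conventions as the resize_row example above. The (target index, source index, weight) entry format is just one reasonable choice, and the normalization is folded into the stored weights.

    import math

    def build_weight_list(src_len, dst_len, filt, radius=2.0):
        # Compute the (target index, source index, weight) entries once.
        scale = src_len / dst_len
        fscale = max(scale, 1.0)
        entries = []
        for j in range(dst_len):
            center = (j + 0.5) * scale - 0.5
            lo = int(math.floor(center - radius * fscale)) + 1
            hi = int(math.floor(center + radius * fscale))
            items = []
            for i in range(lo, hi + 1):
                if 0 <= i < src_len:
                    items.append((i, filt((i - center) / fscale)))
            norm = sum(w for _, w in items)
            entries.extend((j, i, w / norm) for i, w in items)
        return entries

    def apply_weight_list(src_row, dst_len, entries):
        # Read through the list and do what it says.
        dst = [0.0] * dst_len
        for j, i, w in entries:
            dst[j] += w * src_row[i]
        return dst

    # For a whole image: build one list for the rows and (usually) a
    # different list for the columns, then apply each list to every
    # row or column respectively.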

Note that this optimization may not be possible if you’re doing something fancy, such as rotating the image by 37 degrees.

Image edges

What do you do near the edges of the image, when some of the source samples you need for your calculation do not exist? There are two main strategies: one is to invent “virtual pixels”, and I don’t know what the other one is named so I’ll just call it the “standard method”.

The standard method

The de facto standard seems to be to do nothing special. If you do the normalization described above, you should be okay. Instead of a weighted average of, say, 4 samples, you might have a weighted average of only 3 or 2 samples.

Under normal circumstances, there will always be enough source samples to give a meaningful result. By normal circumstances, I mean a number of things, including that the filter's values are strictly positive everywhere between −1 and +1, which is not the case for some kinds of “sharpened” filters.

Another requirement is that the target image be aligned exactly with the source image. If it’s shifted or scaled in an unusual way, there may not be enough (or any) source samples to use. You could end up dividing by zero, or calculating a very strange sample value.

I suspect that many resampling implementations have a special-case rule, something like: If there are not enough source samples near the target sample, simply copy the nearest source sample to the target sample. Such a rule would be necessary in order for so-called “point” resampling to make sense. But I don’t know exactly what “not enough” means.

Virtual pixels

A strategy that always works is to invent “virtual” source samples to use to replace the ones that are beyond the edge of the image.

One way to do this is replication: give a virtual pixel the value of the nearest pixel that does exist. This is easy, but it can give the edge pixels too much importance.

Or you could make the virtual pixels some predetermined color (black? white? gray?), or transparent.

Or you could use some method to extrapolate from the nearest two or more image pixels to estimate what the missing samples ought to be. (You’ll probably need to handle extremely small images as a special case.)

Or you could make the virtual pixels “wrap around” to the other side of the image. This is a good thing to do if and only if the image is going to be tiled; e.g. used as a web page background.
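Several of these strategies amount to remapping or substituting an out-of-range source index before reading the sample, which could look something like the sketch below (the strategy names are mine).

    def sample_replicated(src, i):
        # "Virtual" pixels take the value of the nearest real pixel.
        return src[min(max(i, 0), len(src) - 1)]

    def sample_constant(src, i, value=0.5):
        # "Virtual" pixels get some predetermined value (here, mid-gray).
        if 0 <= i < len(src):
            return src[i]
        return value

    def sample_wrapped(src, i):
        # "Virtual" pixels wrap around to the other side of the image;
        # appropriate when the image will be tiled.
        return src[i % len(src)]

Any of these can replace the bounds check in the earlier sketches: instead of skipping an out-of-range index, fetch a virtual value for it and include its weight in norm.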

Non-integer dimensions

The dimensions of the source image are by definition an integral number of pixels, but it’s easy to overlook the fact that this is not really the case for the target image. By that, I mean that if you want to reduce a 99-pixel row to half that size, you don’t have to round it and change the image features’ size by a factor of 50/99. There’s nothing stopping you from changing them by a factor of exactly 0.5 (49.5/99). Assuming you then use 50 pixels to store the resized row, the outermost pixels would not be as well-defined as usual; refer to the Image edges section above.

Don’t think of resampling an image as simply changing the number of pixels. Sometimes that is what you want. But in general, think of it as painting a resized image onto a canvas, where the canvas could be larger, smaller, or the same size as the resized image.

Clamping

It’s possible for a filter to give you a value for a target sample that is darker or lighter than any source sample. It could even be darker or lighter than the range of values that you are capable of storing in your image. I discuss this issue on another page.
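For example, if samples are stored as values from 0.0 to 1.0, the final step might simply be something like this (a minimal sketch):

    def clamp_sample(v, lo=0.0, hi=1.0):
        # Clamp a computed sample to the representable range.
        return min(max(v, lo), hi)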

Which dimension to resize first?

The dimension you choose to resize first has no significant impact on image quality, and usually little impact on the number of calculations needed.

If you use a filter that has negative values, and you clamp intermediate samples, then it does make a difference which dimension you resize first. But there is probably no reason to prefer one dimension over the other.

If you are writing the resized image to a file, it may be more convenient to resize the vertical dimension first. That way, you’ll finish resizing the image one row at a time.

If you want to follow the crowd, then based on the evidence I’ve seen, most applications resize the horizontal dimension first.

If performance is your goal, be aware of CPU caching issues. Data that is close together in memory can usually be processed more quickly. Most likely, your image is stored in such a way that samples in the same row are close together in memory. That means the horizontal resize will run more quickly, so you should try to minimize the amount of work done by the vertical resize. If you are reducing the image, resize it horizontally first. If you are enlarging the image, resize it vertically first.

Fairness

Algorithms of this type are potentially “unfair”, in the sense that some source pixels have less effect on the resized image than others do. I discuss this issue on another page.