Slow power computation by 64-bit glibc

[The document is based on glibc version 2.13. It might not apply to newer versions. It does not apply to applications that don’t use glibc, such as most Windows applications.]

[Update: In fact, there probably has been some improvement since this document was written.]

Here’s a little C program I wrote to take numbers to powers:

/* powtest.c */

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(int argc,char**argv)
{
  int count, i;
  volatile double b, e, x;

  if (argc<4) { printf("error\n"); return 1; }
  count = atoi(argv[1]);
  b = atof(argv[2]);
  e = atof(argv[3]);

  for (i=0;i<count;i++) {
    x = pow(b,e);
  }
  printf("(count=%d) %.16f^%.16f = %.16f\n",count,b,e,x);
  return 0;
}

The program accepts a count of the number of times to do the computation. This is to help us to see how long it takes.

How long would you expect the following computations to take? Note that we’re doing them 100 million times each.

./powtest 100000000 1.0000000000000020 1.5000050000000000
./powtest 100000000 1.0000000000000020 1.5000005000000000
./powtest 100000000 1.0000000000000020 1.5000000000000000

On the 64-bit Linux system I used to test this, here are my results:

9.3 seconds
45.6 minutes
21.6 hours

(Of course, I didn’t actually run them all to completion. I extrapolated.)

The pow() function nearly always runs as fast as in (1). The absolute worst case is exhibited by (3). There are also intermediate cases, as exhibited by (2).

Does it surprise you that the pow() function sometimes runs over 8000 times slower than normal? It surprised me. But then, maybe that’s my own fault, for not reading the documentation. Let’s see what the man page for pow() says about this surprising behavior:

BUGS
   In  glibc  2.9  and  earlier, when a pole error occurs, errno is set to
   EDOM instead of the POSIX-mandated ERANGE.  Since version  2.10,  glibc
   does the right thing.

   If  x is negative, then large negative or positive y values yield a NaN
   as the function  result,  with  errno  set  to  EDOM,  and  an  invalid
   (FE_INVALID)  floating-point  exception.   For example, with pow(), one
   sees this behavior when the absolute value of y is greater  than  about
   9.223373e18.

   In  version  2.3.2  and  earlier,  when  an overflow or underflow error
   occurs, glibc's pow() generates a bogus invalid  floating-point  excep-
   tion (FE_INVALID) in addition to the overflow or underflow exception.

So... it says absolutely nothing.

To the internet, then. But after much searching, I found only a few scattered references to this issue (1 2 3).

I also studied and compiled the glibc source code to help figure out what’s going on. At this point, I can say the following about it:

This issue has existed and been known about for over five years (since 2006).
It affects 64-bit x86_64 code. Most or all 32-bit builds of glibc use completely different code for pow(), and are not affected.
It (mostly?) affects bases that are close to, but not exactly, 1.0. The exponents that are slow depend to some degree on the base.
The slowness is intentional, and is not considered to be a bug, at least not by the person who programmed it.
There’s also a function named “slowexp”, suggesting that the the exp() function might have the same problem, though it’s not clear what parameters would cause it. (Update: Parameters in the range −0.00000000033301145180213807 to −0.00000000033301145180369239 cause exp() to run about 100 times slower than normal. Thanks to D. Given for this example.)

Here’s where the slow code is in the glibc source code:

pow (aliased as __pow) is implemented in math/w_pow.c.
It calls __ieee754_pow, defined in sysdeps/ieee754/dbl-64/e_pow.c.
Which calls power1, defined in the same file.
Which calls __slowpow, defined in sysdeps/ieee754/dbl-64/slowpow.c.
Which calls lots little functions, many of which contain loops that are iterated many times each. The name “slowpow” is an understatement.

I am aware that transcendental functions like pow() are hard to calculate, but I find it difficult to believe that this is the state of the art.

But, you say, what are the odds of ever running into one of these pathological cases? And even the slowest computation takes only one thousandth of a second. Does this really matter?

Well, I ran into it because my image resizing utility was going into an infinite loop when resizing a certain image by certain amounts. At least, it seemed like an infinite loop. I soon figured out that it was just running incredibly slowly. Here’s the (worst-case) computation that bit me:

  pow(0.9999999999999940, 0.4166666666666666)

The 0.416666... exponent is 1/2.4, which is the exponent you have to use to convert to the standard sRGB color space. And the way my utility resized images often resulted in bases that differed from 1.0 only very slightly, due to round-off error.

I spent quite a lot of time debugging this, and I’m angry that I had to do that. At the very least, this “gotcha” should be clearly documented, and a fast-but-less-accurate alternative offered.

Somewhere out there, there’s probably a web application that’s vulnerable to a denial-of-service attack by feeding it these pathological numbers. Admittedly, such a vulnerability is probably very rare. Web applications that may run too long usually have special protections against that, and anyway, there aren’t that many reasons to do so many power computations.