A puzzle - Embedded Software

By Colin Walls

I am always interested in some of the subtle effects that coding can have on not just the behavior of code, but also its performance. Recently, I stumbled across some benchmarking code – I cannot recall how I actually found it. It was not written in C, so I translated it.

In the process, I realized that a literal translation might not be ideal …

The code is to do with Mandelbrot sets. My first C translation looked like this:

int mandelbrot(float a)

float b;

int iterations = 50;

int i;

b = a;

for (i=1; i<=iterations; i++)

if (abs(a) > 2)

return (i - 1);

a = pow(a, 2) + b;

return (iterations);

I then saw a couple of optimizations that I could make:

#define ITERATIONS 50

int mandelbrot(float a)

float b;

int i;

b = a;

for (i=1; i<=ITERATIONS; i++)

if (abs(a) > 2)

return (i - 1);

a = a * a + b;

return (ITERATIONS);

I made it clear that iterations was a constant. I could have simply qualified it with const, but I chose to use #define. Call me old fashioned if you like. Both would most likely produce the same result. A good compiler may well optimize this without my help.

Since the code was translated from a language that included an exponentiation operator, using the pow() library function seemed logical, as C has no such operator. As a is simply squared, multiplication seemed better. Again, a smart compiler may well have addressed this.

I then saw another possible enhancement that might be effective:

#define ITERATIONS 50

int mandelbrot(float a)

float b;

int i;

b = a;

for (i=1; i<=ITERATIONS; i++)

if (abs(b) > 2)

return (i - 1);

b = b * b + a;

return (ITERATIONS);

Does this improve matters? If so, why? Please let me know your thoughts by email or comment. Sorry, no prizes, but I will follow up this posting with my further thoughts sometime soon.

Comments

0 thoughts about “A puzzle”

Can’t see why the second change would make any difference, because I would expect a compiler to treat a parameter and a local variable in the same way – probably keeping both in registers, in this case.

I would add a third “optimisation”, but this is just a tidying exercise, really, as I don’t suppose this would make any difference, either: just do the b=a assignment at the declaration of b.

Good input Peter.

Firstly, for many CPU architectures, parameter passing in registers is not a possibility, except for static functions, where the definition and calls are all seen at once. This is even more true for older compilers. The optimisation of auto variables to registers, even without the register keyword being used, is more common.

Your proposed further change is, I suppose neater to look at. However, as an assignment of an auto variable on definition is only shorthand [as assignment will need some code, unlike with a static], I actually prefer to keep them separate. Just personal taste.

Following your reply to mine, I had a further thought. If there is a floating-point coprocessor present, then maybe it does its calculations using an internal stack (think of Reverse Polish Notation – RPN). If so, it may be advantageous to write

const float b = a;

then work with a as the “accumulating” variable, as originally.

The compiler, knowing that b is constant (for the current function invocation) has the opportunity to push it onto the FP stack just once, before the loop is entered, and to pop it just once before returning. Otherwise, it is possible that the supposedly variable b will be pushed and popped for every loop iteration.

Peter: That makes sense, if indeed a coprocessor does have a stack that can be used in that way [i.e. for passing FP variables]. As, for me, RPN is the only obvious way to do arithmetic, I’d like to think that this is correct.

It’s not about optimization, but
shouldn’t it be fabsf() instead of abs() ?

You are quite correct Grygorii. Well spotted.

Comments

Leave a Reply Cancel reply