Efficient code – a quiz
I was recently reading a set of “golden rules” for embedded programming. I am very skeptical about such proscriptive instructions as, for them to be valid, a great many assumptions must be made and clearly stated. These rules were supposed to promote the production of safe, efficient code. I am OK with “safe” – that simply means that the code does what it is supposed to do, without deviation as a result of unexpected data or unexpected side-effects. I am not so sure about “efficient”, as that depends entirely on the design parameters in force; it might mean fast or it might mean small, for example. And what about maintainability and portability being significant parameters?
So, instead of trying to outline rules myself, I thought that I might set out to find out what real developers think …
Just for fun [budgets to not stretch to offering a prize], I would like to pose a question and request answers by comment or email:
You are developing a system, where execution speed is important. At one point in the code, there is an unsigned integer variable x which needs to be divided by 8. I can immediately think of 4 ways to code this:
x /= 8;
x = x / 8;
x >>= 3;
x = x >> 3;
My question is: which of these options is the most efficient and why? Of course, if you have other suggestions about how to address the problem, I would be very interested to hear.
I will publish some results and my answer in a couple of weeks.
The second is clearest, but the first is a well-known C idiom and is quite acceptable. The others are not acceptable because they hide the arithmetic nature of the job to be done. The mechanism by which division is performed is a matter for the processor, aided in this case, possibly, by the compiler, which should optimise the code to a simple shift if this results in quicker execution than just generating a divide instruction (assuming the processor provides one, of course)
There is another important point, too, related to the C standard, but you have neatly rendered its discussion “off-topic” by making x unsigned!
I take it, by the way, that you don’t want us to divert the discussion into arguments about the use of the explicit constants 8 and 3 in your examples.
Thanks Peter. I will comment in a follow up blog soon.
I agree with Peter.
I’ve heard the saying, “If you mean to shift, then shift; if you want to multiply or divide, then multiply or divide”.
Generally speaking, on most architectures, all 4 possibilities will result in the same assembly code being generated, at least with any sort of optimization turned on.
But the first 2 versions are more expressive – you want to divide, and you are dividing. Not to mention the fact that now if you want to divide by 9, it’s a single character being changed. Also, even though most of us see “>> 3” and “instantly” know this is a divide-by-8, some people might have trouble converting 2**3 to 8. I had to fix code once where the original coder meant to divide by 16, but everywhere in the code it was coded as “>> 5” – the dreaded “off by one [bit]” error.
And what if the value now changes to a floating point (float, double) type? How’s that bit shift working out now?
But most importantly, the first 2 versions do not rely on any specific architecture or internal representation. Although most architectures use 2s complement representation, this is not universal; binary representation is an implementation detail of the processor.
Bottom line: write your code for human beings, and let the compiler do its job.
Good points Dan.
You are developing a system, where execution speed is important.
Fewer Cycles = faster execution speed
Use of fewer registers is more efficient
I would code x = x >> 3; I may create an intermediate term xshiftn
The shift should take 2 cycles and use 1 register
The divide code would take 3,4, or 5 cycles to produce a result and probably use 2 or 3 registers.
Code for the machine. The code may become the machine!
Thanks Ken. I will be commenting on the responses I have received in a blog post very soon.
Code should present idea, operation, not implementation details. If your compiler do not optimize dividing by 8 then there is one ultimate way to efficient program – change compiler.
I agree entirely Krzysztof.
I also agree with Krzysztof. Back when we first moved from Assembly to C, we used to examine the Assembly output of the compiler, and we found that certain C constructs were more efficient than others. That gave the Assembly “old guard” reason to claim C was useless and to keep coding in Assembly. Shift rather than divide and if-then-else rather than switch-case were two hand optimizations we saw. After a few years, the C compilers solved these issues and it became no longer necessary to look at the compiler output in Assembly. I would hope that those years are long behind us.
I am a recent graduate so know that my reply is coming from a position of inexperience.
Without actually looking at the assembly code I would assume that method 1 and 2 are identical and would produce the same result. The second method may be a tad bit clearer but that’s the only difference I can spot.
A possible problem with using a right shift as a multiplier, even if it did result in better optimization, may in fact come from cross platform applications. In short, I believe that the code will have portability issues when dealing with data of different sizes and could possibly produce unreliable results in certain applications.
Just my thoughts,