Assembly language is always smallest/fastest - not!

By Colin Walls

It would seem intuitive that writing in assembly language is the best option if you want optimal code in terms of size and/or speed. After all, assuming the programmer is smart, is competent in the assembler for a particular CPU, understands its architecture well enough, and has been fully apprised of the functional requirements of the code, the only possible result is code that uses the processor’s capabilities in an ideal way. A compiler just cannot compete, as it has no information on the precise requirements of the code.

Well, that would just seem common sense, but it does not take into account one key factor: human nature.

Let me illustrate this issue with a simple example: a C language switch statement. There are three patterns which I can envisage for the case values: contiguous values; almost contiguous values, with a few values missing; completely non-contiguous values. For contiguous case values, a good compiler will probably generate code with a simple list of addresses, which is indexed by the case values. The same thing would result for almost contiguous values, except the table would have a few dummy entries. For completely non-contiguous values, the likely code is a look-up table of values and addresses. In short, a compiler will use a strategy which is appropriate to the pattern of case values.
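To make the two table shapes concrete, here is a minimal C sketch that mimics them by hand. It is only an illustration: the handler functions, the case values (0, 1, 3 and 7, 100, 4096) and the names `dispatch_jump_table` and `dispatch_lookup_table` are all hypothetical, and a real compiler emits machine code rather than function-pointer tables and may well choose other strategies, such as a chain or tree of compares.

```c
#include <stddef.h>

/* Hypothetical handlers standing in for the bodies of the case clauses. */
static int handle_a(void)       { return 10; }
static int handle_b(void)       { return 20; }
static int handle_c(void)       { return 30; }
static int handle_default(void) { return -1; }

typedef int (*handler_fn)(void);

/* Strategy 1: contiguous or almost contiguous case values (here 0, 1, 3).
 * A simple list of addresses indexed directly by the case value; the
 * missing value 2 gets a dummy entry that leads to the default handler. */
int dispatch_jump_table(unsigned value)
{
    static const handler_fn table[] = {
        handle_a,        /* case 0 */
        handle_b,        /* case 1 */
        handle_default,  /* dummy entry for the missing value 2 */
        handle_c         /* case 3 */
    };

    if (value >= sizeof table / sizeof table[0])
        return handle_default();   /* out of range: default case */
    return table[value]();
}

/* Strategy 2: completely non-contiguous case values (here 7, 100, 4096).
 * A look-up table of value/address pairs, searched linearly. */
struct case_entry {
    int        value;
    handler_fn handler;
};

int dispatch_lookup_table(int value)
{
    static const struct case_entry table[] = {
        { 7,    handle_a },
        { 100,  handle_b },
        { 4096, handle_c }
    };

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (table[i].value == value)
            return table[i].handler();
    }
    return handle_default();       /* no match: default case */
}
```

The indexed table costs one bounds check and one indirect call regardless of the value, whereas the look-up table costs a search but copes with any set of values. Changing the case values means editing only the data in the second table, while the first may need restructuring if the values spread out.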

A smart human programmer would take a different approach and almost always code a look-up table. Why? Because this code is maintainable. If the code were written for contiguous values and a change of requirements then made the case values non-contiguous, a re-write would be necessary, and a smart programmer would want to avoid that possibility. A compiler re-writes the code every time it is run, so it does not care.

So, for switch statements anyway, a compiler will, on average, produce better code than a human programmer. In such cases, C will yield a more efficient result than assembly language.

Comments


Interesting article and good subject. Assembly rarely gets a mention these days.

Anyway, I just wanted to add another viewpoint: most programmers who code in assembly (the ones I mostly know are in the security software and OS development fields) think differently. So the human nature argument doesn’t really come into play when a *good* assembly programmer knows exactly how many clock cycles every instruction will take, etc.

It’s true that, with the optimization available on most current C compilers, it’s hard to compete. That being said, my main response is as follows:

“Assembly language is always the smallest/fastest – not!” is not true. Assembly always is the smallest/fastest and always will be, because it is what every other programming language is translated to in order to run. Therefore the only merit your article has comes from nothing to do with assembly, but from humans. Or at least the assembly programmers that you know.

I only bother to post this because it’s easy to read your article and think “Okay so I will just use C and never assembly because of my own human nature.” No. If you are smart enough and understand each instruction well, then assembly will always be the fastest/smallest, and if it is not, then that is down to a lack of education and not to any shortcoming of “assembly.”

-Ryan

All good and valid points Ryan.

Maybe the title should be “Assembly language programmers don’t always produce the fastest/smallest code”.

I agree with Ryan’s point and I think the suggested title by Colin fits well with that point. Good article overall.

Ryan is right. A good example is in DOS, where the interrupt table is broken down by a series of compare statements, and this is quite often the best way to do a switch in assembly. Yet, on certain types of processor, a switch table can be used, and a single call/return sequence will end up at the right location. Or, with some processors, the vector can be preloaded, and a sequence of call, pop old return, push new return, and then return will create the branch.

The question is how the assembly programmer chooses to automate the branch versus how the compiler generates the code, and what kinds of optimization take place.

Ryan is right: the programmer who knows his code will do better optimizations and create better assembly than the compiler/optimizer.

Most of the good programmers I know will write code in high-level languages, then hand-optimize at both the C and assembly levels to get the best results. Using C gets the overall construct and working code faster on large projects; optimizing at the assembly level gives the best run-time performance. For embedded products, hand-coded assembly is often the only effective choice for power, size, speed, and cycle count.

Even if I agree with Ryan and Colin, this is always a never-ending story. Yes, assembler will always be fastest and smallest, as long as the programmer knows the core’s assembler and features well.
Nowadays, portability and more readable/maintainable code are needed most of the time. For that reason, C is the best choice.
But it is also important to say that some cases require the developer to use assembly language to have complete mastery of execution time and actions. In these cases, C will not produce accurate code, whatever the compiler is.
In my experience, I have seen many RTOS schedulers written in assembly, with the services written in C. I personally think this is the way most applications are developed today.
We can add that portability forces the use of high-level languages, but also the use of more powerful processors, because of the “encapsulation” of all features (you can also call that drivers, BSP,…), and so assembly language is not really used by most developers.

Good article, Colin. Thought-provoking. One of the thoughts it provoked in me is to remember that premature optimisation is a sin! 😉 In my experience (32 years of embedded firmware development) I rarely found that my firmware needed the optimisation-for-efficiency that we are discussing here. First confirm that optimisation is necessary, then invest the time and effort to do it.
