Spotting the difference – subtleties of C code

By Colin Walls

It is common for C to provide several different ways to do something, all of which are exactly equivalent. For example, given that x is a normal int variable:

x = x + 1;

is exactly equivalent to:

x += 1;

or

x++;

The only possible difference is that a less capable compiler might generate slightly better code for the second and third options [which would be a hint that getting a better compiler would be worthwhile].
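As a quick check of the claim above, here is a minimal sketch that applies each form to its own copy of the same starting value and compares the results. The function names are invented for illustration, and the start value is assumed to be below INT_MAX, since signed overflow is undefined in all three forms.

```c
/* Each helper applies one form of the increment to its own copy of x. */
static int add_one_long(int x)     { x = x + 1; return x; }
static int add_one_compound(int x) { x += 1;    return x; }
static int add_one_postfix(int x)  { x++;       return x; }

/* Returns 1 when all three forms produce the same result for start.
   Assumes start < INT_MAX (hypothetical helper, not from the article). */
int increment_forms_agree(int start)
{
    int a = add_one_long(start);
    int b = add_one_compound(start);
    int c = add_one_postfix(start);

    return (a == b) && (b == c);
}
```

Calling increment_forms_agree() with any ordinary value should return 1. Note that the equivalence only holds when the expression is used as a statement on its own; the value of x++ inside a larger expression is a different matter.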

However, sometimes constructs that appear to be equivalent have very subtle differences …

Probably the simplest thing that you can do in any programming language is assign a value to a variable. So, in C, we might write:

alpha = 99;
beta = 99;
gamma = 99;

Of course, this might be written more compactly like this:

alpha = beta = gamma = 99;

And these are 100% equivalent. Or are they?

Most of the time, these two constructs are entirely equivalent, but there are [at least] four situations when choosing one or the other might make a difference:

First, and most prosaically, each variable is separate, and a comment indicating why it is set to this value might be appropriate. Separate statements give each comment a natural home.

Second, it is always good to write maintainable code. Maybe, at some point in the future, the code might need to be changed so that the three variables are no longer all set to the same value. The first format lends itself more readily to modification.

The third reason relates to substandard compilers, which might generate code like this for the first construct:

mov r0, #99
mov alpha, r0
mov r0, #99
mov beta, r0
mov r0, #99
mov gamma, r0

The second construct gives the hint that r0 only needs to be loaded once. Again, a better compiler would not need the hint.

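Lastly, there is the question of execution order. In the first construct, the source makes it entirely clear that alpha is to be assigned first and gamma last. The assignment operator groups from right to left, so a compiler will interpret the second construct thus:

alpha = (beta = (gamma = 99));

This means that the assignment order is, in effect, reversed: gamma would typically be written first and alpha last. Strictly speaking, C does not guarantee the order of the stores within a single expression, so even the reversed order should not be relied upon. But does that matter? Most of the time, it does not. But if these were device registers [which would be declared volatile], not ordinary variables, it might make a big difference. It is very common for hardware to need set-up values to be loaded in a precise sequence.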
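To make the device register case concrete, here is a minimal sketch under stated assumptions. The register block, its field names and the set-up function are hypothetical; on real hardware the block would sit at a fixed address taken from the device documentation, whereas here it is an ordinary object so that the code can run anywhere. Each write is a separate statement to a volatile object, so the compiler has to perform the writes in the order written.

```c
#include <stdint.h>

/* Hypothetical register block for an imaginary peripheral. */
struct device_regs {
    volatile uint32_t alpha;   /* must be written first  */
    volatile uint32_t beta;    /* must be written second */
    volatile uint32_t gamma;   /* must be written last   */
};

/* One statement per register: every statement ends at a sequence point,
   and volatile accesses may not be reordered across sequence points,
   so the stores happen as alpha, then beta, then gamma. */
void device_setup(struct device_regs *dev)
{
    dev->alpha = 99u;
    dev->beta  = 99u;
    dev->gamma = 99u;
}
```

Had the body been written as dev->alpha = dev->beta = dev->gamma = 99u; the same three values would end up in the registers, but the order of the writes would no longer be spelled out in the source.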

So, I would say that assigning to multiple variables in a single statement should be avoided. But, of course, I am interested in alternative arguments by comment, email or via social networks.

This article first appeared on the Siemens Digital Industries Software blog at https://blogs.sw.siemens.com/embedded-software/2015/06/01/spotting-the-difference-subtleties-of-c-code/