Endianness

In almost all modern embedded systems, memory is organized into bytes. CPUs, however, process data as 8-, 16-, 32- or 64-bit words. As soon as this word size is larger than a byte, a decision needs to be made with regard to how the bytes in a word are stored in memory. There are two obvious options and a number of other variations. The property that describes this byte ordering is called “endianness” [or, sometimes, “endianity”].

Broadly speaking, the endianness in use is determined by the CPU. Because there are a number of options, it is unsurprising that different semiconductor vendors have chosen different endianness for their CPUs. The questions, from an embedded software engineer’s perspective, are “Does endianness matter?” and “If so, how much?” …

First of all we need to provide some boundaries for this discussion. I am going to just consider 32-bit CPUs – the same issues apply to 16- and 64-bit devices. Even 8-bit devices typically have instructions that deal with larger data units. I am also going to limit my consideration to the obvious endianness options: least significant byte stored at lowest address [“little-endian”] and most significant byte stored at lowest address [“big-endian”]. These two options may be visualized quite easily: the 32-bit value 0x0a0b0c0d is stored, from the lowest address upwards, as the bytes 0d 0c 0b 0a on a little-endian CPU and as 0a 0b 0c 0d on a big-endian one.
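The byte layout of these two options can also be inspected directly on a target. What follows is only a minimal sketch: dump_bytes is a hypothetical helper name of my own, and it simply formats the bytes of any object as hex text, lowest address first.

```c
#include <stddef.h>
#include <stdio.h>

/* Format the bytes of an object as hex text, lowest address first,
   e.g. "0d 0c 0b 0a". dump_bytes is a hypothetical helper name. */
static void dump_bytes(const void *obj, size_t size, char *out, size_t out_size)
{
    const unsigned char *p = (const unsigned char *)obj;
    size_t used = 0;

    if (out_size > 0)
        out[0] = '\0';

    /* Each byte needs at most 3 characters plus the terminating NUL. */
    for (size_t i = 0; i < size && used + 3 < out_size; i++) {
        used += (size_t)snprintf(out + used, out_size - used,
                                 i == 0 ? "%02x" : " %02x", (unsigned)p[i]);
    }
}
```

Passing the address of a 32-bit variable holding 0x0a0b0c0d, with a size of 4, gives the text "0d 0c 0b 0a" on a little-endian CPU and "0a 0b 0c 0d" on a big-endian one.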

There are also other possibilities, like using little-endian within 16-bit words, but storing the 16-bit words inside 32-bit words using big-endian. This is commonly called “middle-endian” or “mixed-endian”, but is rarely encountered nowadays. The order of bits within a byte is also potentially arbitrary, but I will ignore that too.

Examples of little-endian CPUs include Intel x86 and Altera Nios II. Big-endian CPUs include Freescale 68K and ColdFire and Xilinx MicroBlaze. Many modern architectures support both modes and can be switched in software; such “bi-endian” devices include ARM, PowerPC and MIPS.

There are broadly two circumstances when a software developer needs to think about endianness:

  • data transmitted over a communications link or network
  • data handled in multiple representations in software

The former situation is quite straightforward – simply a matter of following or defining a protocol. The latter is trickier and requires some thought.
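For the communications case, here is a minimal sketch of what following a protocol might look like. I am assuming a protocol that specifies big-endian fields, as many network protocols do; put_u32_be and get_u32_be are hypothetical names, not part of any particular library.

```c
#include <stdint.h>

/* Store a 32-bit value into a buffer in big-endian order:
   most significant byte first. */
static void put_u32_be(uint8_t *buf, uint32_t value)
{
    buf[0] = (uint8_t)(value >> 24);
    buf[1] = (uint8_t)(value >> 16);
    buf[2] = (uint8_t)(value >> 8);
    buf[3] = (uint8_t)value;
}

/* Rebuild a 32-bit value from four big-endian bytes in a buffer. */
static uint32_t get_u32_be(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

Because only shifts and masks on values are used, these functions produce the same bytes on the wire regardless of the endianness of the CPU running them.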

Consider this code:

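unsigned int n = 0x0a0b0c0d;
unsigned char c, d, *p;
c = (unsigned char) n;      /* value conversion: keeps the low-order byte */
p = (unsigned char *) &n;   /* p points at the lowest-addressed byte of n */
d = *p;                     /* reads whichever byte is stored there */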

What values would c and d contain at the end? Whatever the endianness, c will contain the value 0x0d, because the cast operates on the value of n, not on the way it is stored. However, the value of d will depend on the endianness. On a little-endian system d will contain 0x0d; on big-endian it will have the value 0x0a. The same kind of effect would be observed if a union were to be made between n and, say, unsigned char a[4].
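The union variant might be sketched as follows. The names word_bytes and host_is_little_endian are my own, chosen for illustration, and I use uint32_t so that the word is exactly four bytes wide.

```c
#include <stdint.h>

/* Overlay a 32-bit word with its four constituent bytes. */
union word_bytes {
    uint32_t n;
    unsigned char a[4];
};

/* Return 1 if the least significant byte of a word is stored at the
   lowest address (little-endian), 0 otherwise. */
static int host_is_little_endian(void)
{
    union word_bytes u;

    u.n = 0x0a0b0c0d;
    return u.a[0] == 0x0d;
}
```

Here u.a[0] plays the same role as d in the pointer example: it holds 0x0d on a little-endian system and 0x0a on a big-endian one.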

So, does this matter? With care, most code may be written to be independent of endianness and I would contend that almost all well-written code would be like this. However, if you do build in an endianness dependency, as usual, good documentation/commenting is obviously essential.
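As one illustration of endianness-independent code, a byte can be picked out of a word by arithmetic on its value rather than by looking at its storage. This is only a sketch; byte_of is a hypothetical helper name.

```c
#include <stdint.h>

/* Return byte `index` of `value`, where index 0 is the least
   significant byte and index 3 the most significant. Only the value
   is examined, never its layout in memory, so the result is the same
   on little-endian and big-endian CPUs. */
static uint8_t byte_of(uint32_t value, unsigned index)
{
    if (index > 3u)
        return 0;               /* out of range: no such byte */
    return (uint8_t)(value >> (8u * index));
}
```

With this approach, byte_of(0x0a0b0c0d, 0) is 0x0d and byte_of(0x0a0b0c0d, 3) is 0x0a on any CPU, whereas the pointer and union techniques give answers that depend on the target.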

Colin Walls

I have over thirty years’ experience in the electronics industry, largely dedicated to embedded software. A frequent presenter at conferences and seminars and author of numerous technical articles and two books on embedded software, I am a member of the marketing team of the Mentor Graphics Embedded Systems Division, and am based in the UK. Away from work, I have a wide range of interests including photography and trying to point my two daughters in the right direction in life. Learn more about Colin, including his go-to karaoke song and the best parts of being British: http://go.mentor.com/3_acv


This article first appeared on the Siemens Digital Industries Software blog at https://blogs.sw.siemens.com/embedded-software/2013/03/18/endianness/