Thought Leadership

Stuffing bits

By Colin Walls

I am not a networking specialist. If you are an expert in this area, this posting will be teaching a grandmother to suck eggs [strange expression – I wonder what it actually means]. Obviously, over years of working with embedded systems, I have learned quite a lot about protocols, so learning about a new one is not starting from scratch. For many, networking begins and ends with TCP/IP. However, there are lots of other Internet protocols – FTP, UDP and HTTP, for example. There are also other kinds of connectivity that may or may not be thought of as networking – WiFi, Bluetooth and USB, for example.

It was while studying the operation of the last of these, USB, that I came across a technique that was familiar in form, but unfamiliar in application: bit stuffing …

The term “bit stuffing” broadly refers to a technique whereby extra bits are added to a data stream, which do not themselves carry any information, but either assist in management of the communications or deal with other issues.

The use of bit stuffing, with which I was familiar, was to do with protocol management. Imagine you have a stream of binary data and you periodically want to include a marker to say that a data set has finished. Obviously, you can use a particular bit sequence, but how do you recognize it when any sequence of bits might occur within the data stream? This is where bit stuffing comes in.

For example, we can define a rule that no more than five 1s can occur in a row in the transmitted data. To make this work, the sending software will insert [stuff] an extra 0 after any sequence of five 1s. It can then send a sequence of six 1s specifically to indicate an end of data set. The receiving software, on seeing a sequence of five 1s, checks the next bit. If it is 0, the bit is simply discarded; if it is 1, then it notes that an end of data set has been flagged. This technique is very flexible and can be adapted to various circumstances. It is broadly the same idea as using an “escape” character in byte-oriented protocols.

I was interested to see how and why bit stuffing is used in USB, where it has a different purpose. USB is a synchronous protocol, which means that the sender and receiver must be somewhat synchronized to correctly recognize data on the bus. There is no separate clock line between the devices. The receiver uses transitions in the data stream to get in synch. Because of the way the encoding is implemented, a stream of 0s is not a problem, as there are plenty of transitions. However a stream of 1s would have no transitions, which might compromise the synchronization. This is fixed by the transmitter stuffing a 0 after any sequence of six 1s. The receiver just discards any bit following a sequence of six 1s. This is a simplified description, but I hope that it conveys the idea.

The downsides of bit stuffing are that it introduces an overhead, but this is minimal [0.8% on random data in USB, for example], and it means that the overall data transmission rate is not predictable, as it is [slightly] data dependent.

Interestingly, USB 3.0 also messes with data streams in another way. It has a data scrambling mechanism, which eliminates repetitive patterns in data, which would otherwise cause radio interference that would compromise FCC requirements. But that another story entirely …


0 thoughts about “Stuffing bits
  • While bit stuffing is usually done at the hardware level, character-stuffing is sometimes done in software, especially in communications protocols.

    For example, in the Point to Point Protocol (PPP) character 0x7E is the “Flag” byte & is used for frame delimiting (start / end). So what if the character 0x7E appears in the data frame? On transmission, a 0x7E is replaced by 2 characters – 0x7D (the PPP Control Escape character) and 0x5E. Upon reception, the 2-character sequence is translated to the single byte 0x7E. Similarly, a 0x7D becomes 0x7D 0x5D on transmission, and is collapsed down to just 0x7D upon reception.

    My first few years after university were in communications companies, and I worked pretty close to the hardware. This exposed me to issues such as clock recovery, jitter, etc. It’s always interesting to see how these issues are addressed in hardware & software. Also, the issues, and the solutions, change as the clock speed gets faster. I remember a hardware guy holding up an 8-foot piece of cable (this was in 1992) and saying “here is a 1-byte RAM”. I looked at him, confused. He said “At our 1GHz clock, this piece of wire can hold eight distinct bits – each foot of wire is about 1 nanosecond of a signal.”

    • Thanks for that Dan. I love the story about a piece of wire being, in effect, a data storage device.

      At one point in my career, I was doing some “software archaeology”. We had some old data capture equipment that was hard-wired electronics and used unique [and uniquely weird] bit-oriented serial protocols. I had to figure out how they worked and get a new microprocessor based device to talk the same protocol. Fun times.

Leave a Reply

This article first appeared on the Siemens Digital Industries Software blog at