
Staying in line

By Colin Walls

The idea of inlining code – placing the actual code of a small function at each call site – is a well-known compiler optimization, which I have discussed before. This technique can provide significant performance improvements, due to the elimination of the call/return sequence. Stack usage is also reduced. The possible cost is an increased program memory requirement, as the code of the function is duplicated at every call site.

It is reasonable to expect a good C or C++ compiler, when told to compile for speed, to perform inlining automatically. Some C compilers have extensions to give more control over this, but C++ has intrinsic support for inlining within the language …

C++ has a number of keywords that are additional to those in C. Among those is inline. This is a directive to the compiler requesting that the function be inlined. The compiler is at liberty to ignore the request, if the function is too big or inlining conflicts with optimization settings [i.e. if switches request small code instead of fast code]. The inline keyword has also been part of standard C since C99, and many older C compilers implement it as a language extension; it works in broadly the same way, although the linkage rules differ in detail.
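
As a minimal sketch of the keyword in use, the following declares a small free function inline. The function name square is purely illustrative and is not taken from any real code; the point is only that the call looks and behaves the same whether or not the compiler honors the request.

```cpp
#include <cassert>

// Hypothetical helper, used only for illustration.
// "inline" is a request: the compiler may still emit an ordinary call.
inline int square(int x)
{
	return x * x;
}

// The caller is written identically in either case; only the
// generated code differs.
int sum_of_squares(int a, int b)
{
	return square(a) + square(b);
}
```

Whether the body of square() actually appears at the call sites depends on the compiler and its optimization settings, so the only reliable way to find out is to inspect the generated code.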

But C++ has some other inlining tricks. It is very common for functions that are part of objects [class member functions] to be small and called with great frequency. Thus, such functions are strong contenders for inlining. They may be declared inline in the same way, using the keyword. However, there is another way to convey the inlining request to the compiler: place the actual code inside the class definition, instead of just declaring the function and defining it outside.

So, C++ has two different ways to declare a function to be inline:

	class T1
	{
	public:
		void foo() 
		{
		}
	};

	class T2
	{
	public:
		void foo();
	};

	inline void T2::foo()
	{
	}

In this code, both classes have an inline function called foo().
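
To make the two forms a little more concrete, here is a sketch using the same pattern as T1 and T2, but with small accessor functions that actually do something. The class names C1 and C2 and their members are assumptions made up for this illustration.

```cpp
#include <cassert>

// Form 1: member functions defined inside the class definition.
// No keyword is needed; they are implicitly inline.
class C1
{
public:
	C1() : value(0) {}
	int get() const { return value; }
	void set(int v) { value = v; }
private:
	int value;
};

// Form 2: member functions declared in the class and defined
// outside it, with the inline keyword on each definition.
class C2
{
public:
	C2() : value(0) {}
	int get() const;
	void set(int v);
private:
	int value;
};

inline int C2::get() const
{
	return value;
}

inline void C2::set(int v)
{
	value = v;
}
```

From the point of view of the calling code, the two classes are used in exactly the same way; the choice between the forms is largely a matter of how the class definition reads.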

This raises a question: what is the difference, in terms of generated code, between these two ways to declare an inline?
Please post answers as comments or by email. Sorry, there is no prize, but I will post the result as a comment in a week or so.

Comments

  • Colin,

    Good that you pointed out that inlining is a suggestion to the compiler, not a guarantee by the compiler.

    So, we know inlining sometimes doesn’t happen in spite of the request.

    The flip side to that topic is the question “Does inlining ever happen when it *isn’t* requested?”

    Good interview question, not a huge ding if the candidate doesn’t know the answer.

  • Dan:
    Good input.
    I would expect that a compiler might automatically inline code if it were told to optimize, unless the user has asked for small code and the inlining would make the code larger. A tiny function’s code might be as small as the call/return sequence.

  • Inlining depends on the compiler. I have worked with gcc and the Tensilica compiler (which is based on Open64/SGI).

    gcc is quite lame – it still doesn’t have inter-procedural analysis/optimization (although gcc 4.5 promises a -flto option). As a result, gcc does not inline functions unless they are in a header file. The inline keyword is pretty much ignored: gcc can decide to inline a function w/o “inline” if it is in a header file, or decide not to inline a function with “inline”. As far as I understand, the only criterion is code size, which is obviously suboptimal in many cases, especially in C++ code.

    To force gcc to inline all the time, __attribute__((always_inline)) must be used. The -fno-inline option can be used to prevent any inlining, which is useful for debugging.

    The Tensilica compiler has inter-procedural analysis/optimization, which allows the compiler to inline any function during the link stage regardless of whether it is in a header file or not, e.g. if a function is called in only one place in the code, it will always be inlined. Also, the Tensilica compiler has an option to force inlining for every function with the “inline” keyword.

  • gcc will inline single call site functions if they are declared static when using -Os, and I think other -O levels.
    Without the static qualifier the compiler is obliged to place the function code into the object file, so in this case it is never a size win to inline them (unless you can do whole program analysis type stuff, to remove the non-inlined version).
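
The gcc-specific points raised in the comments above can be sketched roughly as follows. The macro FORCE_INLINE and the function names are hypothetical, introduced only for this illustration; __attribute__((always_inline)) is a gcc extension, so it is guarded here and falls back to plain inline on other compilers.

```cpp
#include <cassert>

// FORCE_INLINE is a made-up macro name for this sketch.
#if defined(__GNUC__)
#define FORCE_INLINE inline __attribute__((always_inline))
#else
#define FORCE_INLINE inline
#endif

// Asks gcc to inline this function regardless of its usual heuristics.
FORCE_INLINE int add_one(int x)
{
	return x + 1;
}

// static gives the function internal linkage, so the compiler need not
// keep an out-of-line copy if every call in this file has been inlined.
static int twice(int x)
{
	return 2 * x;
}

int demo()
{
	return twice(add_one(20));
}
```

Building with and without -fno-inline and comparing the generated code is a reasonable way to see what a particular gcc version really does with these functions.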


This article first appeared on the Siemens Digital Industries Software blog at https://blogs.sw.siemens.com/embedded-software/2010/02/22/staying-in-line/