Efficient C++: The hidden compile-time cost of auto return types
In today's post, I would like to dive into writing efficient C++ code. As you probably know, one post will not cover this entire topic. Today, I'd like to focus on controlling your compile times by picking one element out of the language: auto.
I would like to start by saying that I think auto type deduction was a great extension of the language. The C++14 version, auto as a return type, makes it easier to write more flexible and correct functions, and that is a good thing.
But let's start without auto as a return type. Below, you find a function that doesn't do anything interesting; my apologies.
[Listing: the 8-line definition of Fun with return type int; the code itself is not preserved here.]
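For readers who want something concrete to compile, here is a minimal sketch of a function in that spirit. Only the name Fun and the int return type come from the text; the inline keyword, the parameters, and the body are assumptions made up for illustration, so this is not the listing that was measured.

```cpp
// Hypothetical stand-in: only the name Fun and the int return type are from the post.
// `inline` is an assumption so the definition could live in a header file.
inline int Fun(bool doubleIt, int value)
{
  if(doubleIt) {
    return value * 2;
  }

  return value + 1;
}
```

The body only needs a few statements so that the compiler has some work to do when it parses it.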
All I needed was a function with some body that wasn't totally trivial to the compiler. Since today's post is about efficient C++, I'd like to talk about the compile time required for these few lines of code. Not necessarily the absolute numbers, as they will differ between your machine and mine.
First of all, how can you measure your compiler? Using the Linux time command is one option. But it gives only a little information, and none of it helps you improve your build times. When a build feels slow to you, you don't need a tool telling you that you're right; you need a tool that tells you where a possible speedup is.
If I say time isn't the right tool, then what is? Easy, your compiler, of course! At least Clang. It comes with the great command-line option -ftime-trace, which lets Clang generate a .json file for each object file, containing plenty of data. You can explore the information, for example, using Chrome with its tracing viewer. Just open a tab there and type chrome://tracing. Then load the .json file. For the code above on my machine with Clang 19, the visualization looks like this:
One word about what I did. I assume that the function Fun is in a header file and not used in the current translation unit. For the sake of simplicity, I did not include other code or even use Fun. I had one .cpp file containing the code I presented above and compiled it with Clang's -ftime-trace option.
Now back to the visualization. You might have a hard time reading the numbers; the top bar saying "ExecuteCompiler" takes 6.499ms. Let's leave it at that for now.
Using auto as return type
Here is a modified version of the earlier code. The only change I made is using auto instead of int as the return type.
[Listing: the same 8-line definition of Fun with auto as the return type; the code itself is not preserved here.]
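To make the change concrete, here is a minimal sketch of what such a function could look like. Only the name Fun and the auto return type come from the text; the inline keyword, the parameters, and the body are assumptions made up for illustration, not the measured listing.

```cpp
// Hypothetical stand-in: only the name Fun and the auto return type are from the post.
// Requires C++14 or later for return type deduction on an ordinary function.
inline auto Fun(bool doubleIt, int value)
{
  if(doubleIt) {
    return value * 2;  // the first return statement fixes the deduced type: int
  }

  return value + 1;    // every further return must deduce the same type
}
```

Nothing in the signature says what the function returns; that information is only available from the return statements in the body.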
Now, let's measure this piece of code and see what the results are.
This time, "ExecuteCompiler" takes 8.114ms. That's roughly 1.6ms longer. Yes, this is only a single data point. Maybe we are looking at jitter. That can be. There may be invocations where the two numbers come closer together, but it is unlikely that they will ever become the same. I am confident here because of the one additional step that appears in the second trace, "ParseFunctionDefinition". Again, the screenshot might be hard to read for you, but either trust me or run your own tracing. This step isn't present unless you use auto as the return type for the function, or you actually use the function. But my picture here is that Fun is declared in a header file.
The second code example takes longer to compile because of "ParseFunctionDefinition". With C++14's auto as a return type, the compiler has to process the function's definition as soon as it sees it, because the return type can only be deduced from the body. Without auto, the compiler delays parsing the body until the function is actually used.
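Why the body matters for a deduced return type can be seen at the language level. The following is a minimal sketch with two hypothetical functions, Plain and Deduced, that are not part of the original code. It shows only the language rule, not how Clang schedules its parsing.

```cpp
#include <type_traits>

// Spelled-out return type: the declaration alone tells callers everything.
int Plain(int value);

// Deduced return type: a legal declaration, but the type is unknown for now.
auto Deduced(int value);

// Works with just the declaration of Plain.
static_assert(std::is_same<decltype(Plain(1)), int>::value,
              "known from the signature");

// decltype(Deduced(1)) at this point would be a compile error,
// because the return type has not been deduced yet.

// Once the body has been seen, the return type is deduced (here: long).
auto Deduced(int value) { return value * 2L; }

static_assert(std::is_same<decltype(Deduced(1)), long>::value,
              "known only after the body");

int Plain(int value) { return value + 1; }
```

The two static_assert lines mark the difference: Plain's type is usable before any body exists, while Deduced's type only becomes available after its definition.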
Summary
If you want fast compile times, be careful with functions using auto as a return type in header files.
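One possible way to act on this, sketched under assumptions: spell out the return type in the header and keep the body in a single .cpp file. The file names, parameters, and body below are hypothetical, and whether this pays off in your build is something to check with your own trace.

```cpp
// fun.h (hypothetical): the header only declares Fun with a spelled-out return type.
int Fun(bool doubleIt, int value);

// fun.cpp (hypothetical): the body is compiled once, in this translation unit only.
int Fun(bool doubleIt, int value)
{
  // auto for local variables is not affected by the return type question.
  auto result = doubleIt ? value * 2 : value + 1;
  return result;
}
```

Every translation unit that includes the header then sees only the declaration, and no return type has to be deduced there.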
Andreas