The correct way to do type punning in C++
Today's post is relevant for everybody who does type punning in C++. Something I did for years while working in the embedded software domain. Something others had done for a long time before I entered the game. Something that is 100% illegal according to the standard. And yet I know that many embedded devices are built using type punning, despite it being illegal and therefore undefined behavior (UB).
Let's start at the beginning
First, by using the code below, let's establish what I mean by type punning and why it is undefined behavior in C++.
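A minimal sketch of what such code can look like. The labels A, B, and C and the names first and second follow the text; the variable name pi and the two wrapper functions are assumptions made for illustration, and the sketch assumes that int and float have the same size.

```cpp
#include <cassert>

static_assert(sizeof(int) == sizeof(float),
              "this sketch assumes int and float have the same size");

const float pi = 3.14f;  // A: the float whose bits we want (name assumed)

// Hypothetical wrapper around the first attempt.
int first_attempt()
{
  const int first = static_cast<int>(pi);  // B: valid C++, but converts the value
  assert(first == 3);                      // the fraction is dropped; these are not the bits
  return first;
}

// Hypothetical wrapper around the second attempt. Do not use this in real code.
int second_attempt()
{
  const int second = *reinterpret_cast<const int*>(&pi);  // C: compiles, but is undefined behavior
  return second;
}
```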
The code aims to convert the bit representation of the float in A to an integer. It's not about converting the value, 3.14, to an integer, which, as you know, would be 3. The goal is to get the bit representation.
I show two attempts I have often seen. The first, in B, uses a static_cast. While this attempt compiles and is 100% valid C++, the result isn't what you're looking for: the value gets converted, and you end up with the number 3 in first.
Clearly, the second attempt must be more clever, and I would say C does look clever. This piece of code first grabs the address of our float, then uses a reinterpret_cast to convert the float-pointer to an int-pointer, and finally dereferences the freshly obtained pointer to an int. Yes! You just bent the rules of C++ (and C, by the way) so far that the compiler allows you the conversion. Success!
Well, ..., remember from your starting days? That the code compiles is only the first step! Next, it must link, which this code does as well. Then, the code must do what you've planned. This is where things get tricky. You just wrote code that contains undefined behavior.
By being overly clever with the conversion sequence, you opened the door for the compiler to optimize away the assignment to second. This basically comes down to object lifetime and its rules. Very roughly speaking, in the view of the compiler, the int you assign in C never became alive. There is no constructor or any allowed conversion sequence that would make the compiler aware of a lifetime start. A perfect opportunity for the compiler to save us some instructions by optimizing this entire assignment away.
I have customers who, for that reason, do not raise the optimization level above -O1, for example.
All right, sorry for the sad news, in case you weren't aware of all that already.
C++20 to the rescue
Now, you may be from a later time, but while thinking about the "to the rescue" part, the TV series Baywatch came to my mind, where the lifeguards ran into the water with the red safety buoy (it's the first time I looked that word up in English). A quick look with my favorite search engine shows that such a safety buoy comes in different flavors: other colors than red, and different shapes. But I'm digressing here, sorry.
All right, back from the beach into the office. C++20 has a safety buoy named std::bit_cast for the case I presented above. Instead of writing a bunch of code to trick the compiler into the desired conversion sequence, apply std::bit_cast in the same way as first used static_cast. Here is how the code then looks:
As you can see, std::bit_cast really does look like a static_cast. You provide the destination type in the angle brackets and the source variable or value as an argument. What you get back is an object of the destination type.
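Conceptually, std::bit_cast behaves like a memcpy that copies the source bits into a destination object before returning the latter (implementations typically use a compiler builtin instead, which is what allows std::bit_cast to be constexpr). The memcpy approach works because copying the bytes of one trivially copyable object into another is sanctioned by the standard, and since C++20 memcpy is additionally specified to implicitly create objects in the destination, that is, to start their lifetime.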
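For code bases that cannot use C++20 yet, the memcpy idea can be sketched by hand. The names bit_cast_sketch and demo are hypothetical, and this is not how any particular standard library implements std::bit_cast. Unlike the real thing, the sketch is not constexpr and requires a default-constructible destination type. The expected value in demo assumes IEEE 754 floats.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>      // std::memcpy
#include <type_traits>

template<typename To, typename From>
To bit_cast_sketch(const From& src)
{
  static_assert(sizeof(To) == sizeof(From), "source and destination must have the same size");
  static_assert(std::is_trivially_copyable<From>::value, "source must be trivially copyable");
  static_assert(std::is_trivially_copyable<To>::value, "destination must be trivially copyable");

  To dst;                               // a real object of the destination type
  std::memcpy(&dst, &src, sizeof(To));  // copy the bytes, not the value
  return dst;
}

// Usage: read the bits of 1.0f, which are 0x3F800000 under IEEE 754.
inline void demo()
{
  const std::uint32_t bits = bit_cast_sketch<std::uint32_t>(1.0f);
  assert(bits == 0x3F800000u);
}
```

Compilers generally recognize this fixed-size memcpy and turn it into a plain register move, so the well-defined version usually costs nothing compared to the reinterpret_cast trick.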
Whenever you approach a situation like the one at the beginning of the post, please prefer std::bit_cast if you can, to be on the safe side.
Andreas