C++ for embedded systems: constexpr and consteval
In today's post, you'll learn how modern C++ can influence the code you write for your embedded system. You will see code using up to C++23. The example I show you below revolves around at least two questions customers have asked me many times:
- What is consteval good for?
- What is that user-defined literal operator, and why should I care?
I'm going to address these two questions, but I will not leave you with just that. The real-world example you find below also showcases the latest additions to C++, which help to make your code more robust and safe.
Chapter One: What is a MAC address?
I teach a lot of classes to customers who are developing embedded systems. That makes sense. I worked for a long time in that domain, and I enjoyed it so much.
One recurring topic is networking. While nowadays we have many different network types and technologies, let's use the Internet Protocol (IP) today. The base of network communication is your Network Interface Card (NIC). Each NIC has a unique Media Access Control (MAC) address assigned. The MAC address is the base layer for everything on top, like TCP/IP.
A MAC address consists of exactly six bytes. One way to represent a MAC address in code is this:
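The original listing is not preserved here. A minimal sketch, assuming MACAddr is an alias for a std::array of six bytes (the text below mentions both the name and std::array, but the exact definition is an assumption):

```cpp
#include <array>
#include <cstdint>

// Assumed definition: six raw bytes held in a std::array.
using MACAddr = std::array<std::uint8_t, 6>;

// The size is part of the type, so it can be checked at compile time.
static_assert(MACAddr{}.size() == 6);
```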
The human-readable MAC address form presents these six bytes as hexadecimal digits grouped in two and separated by a colon or a dash like this:
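As an illustration, here are two made-up addresses in that textual form. The names and byte values are hypothetical; only the layout matters:

```cpp
#include <string_view>

// Hypothetical example addresses; any six bytes would do.
constexpr std::string_view macWithColons{"00:1A:2B:3C:4D:5E"};
constexpr std::string_view macWithDashes{"00-1A-2B-3C-4D-5E"};

// Six two-digit groups plus five separators.
static_assert(macWithColons.size() == 6 * 2 + 5);
static_assert(macWithDashes.size() == 17);
```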
Some of these MAC addresses are known at compile time, others might be entered by users during run time.
Chapter Two: Parsing a MAC address string.
As a first step, let's explore a function that converts a MAC address in string form to the six-byte version. You already saw the base above: MACAddr. I use a std::array for memory safety reasons. I replace all C-style arrays with the stronger C++ version whenever I can. The major benefit is that I can query the size at any time.
For the parse function, we can conclude that a string is only valid if it contains at least 17 characters (six bytes times two due to the hex format plus five separators). Finding the separators is another item.
In macFromString, you find one possible implementation, which I will walk you through now.
Let me start with A, the return type, std::expected<MACAddr, std::errc>. That type is available in C++23. As you can see, the first template parameter is the type we expect in the good case (hence expected), and the second type is the error condition type. For simplicity, I used std::errc here. However, choose a type tailored to your needs in your code base.
The power of std::expected is that it contains either a value or an error code. While there are more ways for a MAC address to be invalid, let's take the easy way and consider the address invalid if the string is too short, doesn't contain the required separators, or contains invalid characters (like T, which isn't a hexadecimal digit). Using std::expected helps remove the out-parameter pattern, which I greatly dislike.
Next, you can see in B that macFromString takes a std::span as a parameter. The beauty of std::span, which was added in C++20, is that it is a very cheap view of the original data that still preserves the data size. All these points make std::span the perfect type to pass array-like data around while keeping the size available for bounds checks.
The first thing I do inside macFromString
is checking the string is of sufficient length (C). Thanks to the std::span
, not only an easy but also a safe task.
What if the string is too short? That would be unexpected, right? That's why, in that case, I return a std::unexpected with a std::errc code. Here, you can see the beauty of std::expected; the error case is marked absolutely clearly.
Now, let's look at the conversion, which requires a loop. Of course, I use a range-based for loop here (D). You can spot that I'm using C++20 at this point because I use the form with an initializer, my count variable i.
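The range-based for loop with an initializer, shown in isolation. The function indexTable is a hypothetical example and not part of the parser:

```cpp
#include <array>
#include <cstddef>

// Fills each element with its own index, using a counter scoped to the loop.
constexpr std::array<std::size_t, 6> indexTable()
{
  std::array<std::size_t, 6> table{};

  for(std::size_t i{}; auto& element : table) {
    element = i;
    ++i;  // the counter is not advanced automatically
  }

  return table;
}

static_assert(indexTable()[5] == 5);
```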
Inside the loop, I check for the separator, which is at position two in the std::span, unless we are looking at the final part. The same procedure applies as in C above. If there is no separator (or a different one; I took the easy route here), I return a std::unexpected, in E.
Next comes the actual conversion in F. I use C++17's std::from_chars, which is constexpr for integer types since C++23. The beauty here is that I can pass the values from the std::span directly, even though they do not start with the usual hexadecimal prefix 0x. You can tell std::from_chars which base the numbers use.
Should the conversion fail, which I detect by checking the error code ec, I once again return a std::unexpected. As I wrote above, using different error values to indicate at which point the conversion failed is super useful for your code.
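Here is the conversion step on its own, as a run-time sketch that works from C++17 on. The helper parseHexPair and its return convention of -1 on failure are hypothetical, and the check that both characters were consumed is an extra assumption:

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses exactly two hex digits, returns -1 on failure.
inline int parseHexPair(std::string_view text)
{
  if(text.size() < 2) { return -1; }

  std::uint8_t      value{};
  const char* const first = text.data();
  const char* const last  = first + 2;

  // Base 16: no "0x" prefix is expected or accepted.
  const auto [ptr, ec] = std::from_chars(first, last, value, 16);

  // Fail on a conversion error or when only one digit was consumed.
  if((std::errc{} != ec) || (ptr != last)) { return -1; }

  return value;
}
```

With this sketch, parseHexPair("ae") yields 174, while parseHexPair("T1") yields -1.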
The final step in G then is to advance the std::span using its subspan functionality. Sadly, at this point, you have to be careful; going out-of-bounds here is undefined behavior. This is why I check how many elements are left and advance either by three or by whatever is left. The latter applies to the last digit pair, which comes without a following separator.
It is important not to forget to increment i as the final step, and then we are done. Here, you are looking at a robust and safe MAC address parser using the latest C++ features.
Chapter Three: Why the constexpr?
One tiny piece I haven't talked about when walking through the implementation of macFromString is the very first sequence of characters in the function declaration, forming the keyword constexpr.
Well, the answer is easy, right? You and I, we want to be able to invoke macFromString at compile time. Sure! But how? Remember the user-defined literal (UDL) operator I promised to answer a question about at the beginning? Here it comes.
One interesting property of the UDL operator is that we can invoke it only with compile-time constants [1]!
Let's create a UDL _macaddr, which returns a MACAddr object to make the following code valid:
The implementation of the UDL operator is fairly easy, as you can see here:
I use the UDL operator form, which takes a const char* and a std::size_t. The compiler graciously determines the size of the compile-time constant string and passes it to us. With that, all information to invoke macFromString is present. The best part here is that the string and the length match 100% of the time since we have absolutely nothing to do with it. Just pass the data along, forming a std::span when calling macFromString.
But wait, macFromString returns a std::expected<MACAddr, std::errc>, which is more than just a MACAddr. What to do here? Well, I simply call .value() on the result of macFromString. In case there is no value in the std::expected, the data type throws an exception. But isn't that bad? Well, maybe in other cases, yes; here I argue it is more than okay; it is great!
Did you notice the first keyword I used for the UDL operator? It is consteval! I'm forcing this function to a compile-time-only evaluation. In case of an invalid MAC address string, the exception causes the compilation to fail. This effectively enables you to catch such errors during development. No hard-coded MAC address should be invalid, right?
The consteval here has another benefit. If the UDL operator were only constexpr and the evaluation of macFromString turned out not to be a constant expression, for example because it throws or runs into undefined behavior, the evaluation of the UDL could silently become a run-time call. Most certainly not what you want.
Chapter Four: Your turn!
I showed you various new elements and how they help you to make your code:
- more robust
- safer
- more readable
Applying the latest features of C++ is beneficial in various ways.
One additional takeaway from today's post: As a rule of thumb, make every UDL operator consteval in C++20 and later.
Andreas
[1] Yes, you can invoke the UDL operator by hand and, with that, with run-time values. But that totally defeats the purpose of an operator.