C++20 ranges benefits: avoid dangling pointers
In my last monthly post, C++20 benefits: consistency with ranges, we looked at what ranges do for us when it comes to consistency and how we can get the same level of consistency for our code.
Today I'd like to continue with the last example and see how ranges protect us from dangling pointers, another safeguard that is great to have in our codebase.
Dangling pointers are bad
Okay, I assume that you already know that dangling pointers are bad. Just to be on the same page, let's recap what dangling pointers are and how quickly we can accidentally create one.
In the previous post, we finished by creating our custom begin function, which works for types that provide begin either as a free function or as a member function.
We used it by simply calling it. That example does not really shine, as the result of begin is never used. Let's change that. Suppose we store the result of begin and dereference it afterwards.
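A sketch of that code, with the markers A and B placed where the text refers to them, could look as follows. To keep it short, custom::begin is a trimmed stand-in that only handles the member-function case, and Container as well as first_element are hypothetical names for this sketch.

```cpp
namespace custom {
  // Trimmed stand-in for custom::begin: only the member-function case.
  inline constexpr auto begin = [](auto&& r) { return r.begin(); };
}  // namespace custom

// Hypothetical sample type.
struct Container {
  int data[3]{1, 2, 3};
  constexpr int* begin() { return data; }
};

constexpr int first_element()
{
  Container c{};

  auto iter = custom::begin(c);  // A: retrieve the begin iterator
  return *iter;                  // B: dereference it
}

// c outlives iter, so the dereference in B is fine.
static_assert(first_element() == 1);
```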
In A, we use begin to retrieve the begin iterator, and in B, we dereference it. This is, for example, what a range-based for loop does. Now the code does something, at least a little, and that is enough for today's purpose.
This code works and is perfectly fine. But what happens if we change it slightly and pass a temporary instead of a named object?
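The changed call can be sketched like this. The placement of the markers C and D is an assumption, D stays commented out because executing it would be undefined behavior, and custom::begin is again a trimmed stand-in. Container and pass_temporary are hypothetical names.

```cpp
#include <type_traits>

namespace custom {
  // Trimmed stand-in for custom::begin: only the member-function case.
  inline constexpr auto begin = [](auto&& r) { return r.begin(); };
}  // namespace custom

// Hypothetical sample type.
struct Container {
  int data[3]{1, 2, 3};
  constexpr int* begin() { return data; }
};

void pass_temporary()
{
  // C: the temporary Container is destroyed at the end of this statement
  [[maybe_unused]] auto iter2 = custom::begin(Container{});

  // *iter2;  // D: would read through a dangling pointer
}

// The type system sees nothing suspicious here: iter2 is a plain int*.
static_assert(std::is_same_v<decltype(custom::begin(Container{})), int*>);
```

The static_assert shows why the compiler stays silent: the returned type carries no hint that the object behind it is already gone.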
In this case, we pass a temporary object of type Container to custom::begin. This changes everything. The call to custom::begin is fine. Dereferencing iter2 isn't. We have a dangling pointer. The temporary object is destroyed at the end of the full expression, at the semicolon in C.
Once we start using iter2, we are looking at undefined behavior.
Avoid dangling pointers - Strategy 1
One approach to avoid a dangling pointer is for custom::begin to reject temporaries. Only l-value references make sense here. A simple approach comes to mind: let's ban all other types with a static_assert.
A slight change, here in A, and we effectively prevent our custom::begin from getting called with temporaries.
This is good, and in some cases exactly what you want. However, this is not what ranges do. The static_assert solution has one drawback: we cannot pass a temporary to it at all. Ah, wasn't that the purpose of this exercise? Yes, but passing the temporary isn't the issue. As long as we do not dereference the result, it doesn't matter.
Let's look at an alternative solution.
Avoid dangling pointers - Strategy 2
Obviously, the static_assert is the limiting factor here. The std::is_lvalue_reference_v<R> part is fine.
We know the type at compile-time, so let's use another constexpr if to guard the good case and return a special type dangling in all other cases.
With this change, we delay the error to the point where it really occurs. The interesting construct is dangling, which is returned in case begin is invoked with a temporary, see A. A possible implementation of dangling needs only a few lines.
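One possible implementation, placed in the namespace custom as an assumption of this sketch, follows the same shape as std::ranges::dangling: a default constructor plus a variadic constructor that accepts and ignores anything.

```cpp
#include <type_traits>

namespace custom {
  struct dangling {
    constexpr dangling() noexcept = default;

    // Accepts any arguments and ignores them.
    template<typename... Args>
    constexpr dangling(Args&&...) noexcept {}
  };
}  // namespace custom

// The type can be created from nothing or from anything, and it carries no data.
static_assert(std::is_default_constructible_v<custom::dangling>);
static_assert(std::is_constructible_v<custom::dangling, int, const char*, double>);
static_assert(std::is_empty_v<custom::dangling>);
```

What matters is what the type does not offer: there is no operator*, no operator++, nothing an iterator would have.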
We can see that dangling is a struct with a default constructor and a second constructor, which is a variadic template. Neither does any real work. All this type should do is give users a helpful error message. Assume we uncomment D.
Once we compile it, Clang rejects the dereference in D, because dangling provides no operator*, and the error message names the type dangling.
It is the name that should draw the user's attention to the fact that something is wrong here. We delayed this error until the variable is really used. This is the approach ranges take with std::ranges::dangling, because under some circumstances this behavior is desired.
For your own codebase, you can decide from case to case what's the better option.
I hope you learned something. I appreciate your feedback. Please reach out to me on X or via email.
Andreas