C++ Call Store but Get the Old Value Back

December 18, 2021

std::optional: How, when, and why

Casey

September 4th, 2018

This post is part of a regular series of posts where the C++ product team here at Microsoft and other guests answer questions we have received from customers. The questions can be about anything C++ related: MSVC toolset, the standard language and library, the C++ standards committee, isocpp.org, CppCon, etc. Today's post is by Casey Carter.

C++17 adds several new "vocabulary types" – types intended to be used in the interfaces between components from different sources – to the standard library. MSVC has been shipping implementations of std::optional, std::any, and std::variant since the Visual Studio 2017 release, but we haven't provided any guidelines on how and when these vocabulary types should be used. This article on std::optional is the first of a series that will examine each of the vocabulary types in turn.

The need for "sometimes-a-thing"

How do you write a function that optionally accepts or returns an object? The traditional solution is to choose one of the potential values as a sentinel to indicate the absence of a value:

void maybe_take_an_int(int value = -1); // an argument of -1 means "no value"
int maybe_return_an_int();              // a return value of -1 means "no value"

This works reasonably well when one of the representable values of the type never occurs in practice. It's less great when there's no obvious choice of sentinel and you want to be able to pass all representable values. If that's the case, the typical approach is to use a separate boolean to indicate whether the optional parameter holds a valid value:

void maybe_take_an_int(int value = -1, bool is_valid = false);
void or_even_better(pair<int, bool> param = std::make_pair(-1, false));
pair<int, bool> maybe_return_an_int();

This is also feasible, but awkward. The "two distinct parameters" technique of maybe_take_an_int requires the caller to pass two things instead of one to represent a single notion, and fails silently when the caller forgets the bool and simply calls maybe_take_an_int(42). The use of pair in the other two functions avoids those problems, but it's possible for the user of the pair to forget to check the bool and potentially use a garbage value in the int. Passing std::make_pair(42, true) or std::make_pair(whatever, false) is also hugely different than passing 42 or nothing – we've made the interface hard to use.
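To make the silent failure concrete, here is a minimal sketch. Only the parameter list comes from the declaration above; the function name describe and its body are assumptions added for illustration.

```cpp
#include <string>

// Hypothetical body for the "two distinct parameters" interface:
// the int is only meaningful when is_valid is true.
std::string describe(int value = -1, bool is_valid = false) {
    if (!is_valid) {
        return "no value";
    }
    return "value " + std::to_string(value);
}
```

With this sketch, describe(42) compiles cleanly and returns "no value": the caller forgot the bool, is_valid defaulted to false, and the 42 was silently dropped. Only describe(42, true) returns "value 42".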

The need for "not-yet-a-thing"

How do you write a class with a member object whose initialization is delayed, i.e., optionally contains an object? For whatever reason, you do not want to initialize this member in a constructor. The initialization may happen in a later mandatory call, or it may happen only on request. When the object is destroyed the member must be destroyed only if it has been initialized. It's possible to achieve this by allocating raw storage for the member object, using a bool to track its initialization status, and doing horrible placement new tricks:

using T = /* some object type */;

struct S {
  bool is_initialized = false;
  alignas(T) unsigned char maybe_T[sizeof(T)];

  void construct_the_T(int arg) {
    assert(!is_initialized);
    new (&maybe_T) T(arg);
    is_initialized = true;
  }

  T& get_the_T() {
    assert(is_initialized);
    return reinterpret_cast<T&>(maybe_T);
  }

  ~S() {
    if (is_initialized) {
      get_the_T().~T(); // destroy the T
    }
  }

  // ... lots of code ...
};

The "lots of code" comment in the body of S is where you write copy/move constructors/assignment operators that do the right thing depending on whether the source and target objects contain an initialized T. If this all seems horribly messy and fragile to you, then give yourself a pat on the back – your instincts are right. We're walking right along the cliff's edge where small mistakes will send us tumbling into undefined behavior.

Another possible solution to many of the above problems is to dynamically allocate the "optional" value and pass it via pointer – ideally std::unique_ptr. Given that we C++ programmers are accustomed to using pointers, this solution has good usability: a null pointer indicates the no-value condition, * is used to access the value, std::make_unique<int>(42) is only slightly awkward compared to return 42, and unique_ptr handles the deallocation for us automatically. Of course usability is not the only concern; readers accustomed to C++'s zero-overhead abstractions will immediately pounce upon this solution and complain that dynamic allocation is orders of magnitude more expensive than simply returning an integer. We'd like to solve this class of problem without requiring dynamic allocation.

`optional` is mandatory

C++17's solution to the above problems is std::optional. optional<T> directly addresses the issues that arise when passing or storing what may-or-may-not-currently-be an object. optional<T> provides interfaces to determine if it contains a T and to query the stored value. You can initialize an optional with an actual T value, or default-initialize it (or initialize with std::nullopt) to put it in the "empty" state. optional<T> even extends T's ordering operations <, >, <=, >= – where an empty optional compares as less than any optional that contains a T – so you can use it in some contexts exactly as if it were a T. optional<T> stores the T object internally, so dynamic allocation is not necessary and in fact explicitly forbidden by the C++ Standard.
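The properties in this paragraph can be checked directly. The following is a small sketch rather than anything from the original post; the function name optional_basics_hold is made up for illustration, and it assumes a C++17 compiler.

```cpp
#include <optional>

// Returns true when every property described above holds.
bool optional_basics_hold() {
    std::optional<int> empty;                      // default-initialized: empty
    std::optional<int> also_empty = std::nullopt;  // explicitly empty
    std::optional<int> zero = 0;                   // contains the int 0
    std::optional<int> seven = 7;                  // contains the int 7

    return !empty.has_value()
        && !also_empty.has_value()
        && zero.has_value()        // holding 0 is not the same as being empty
        && empty == also_empty     // two empty optionals compare equal
        && empty < zero            // empty orders before any contained value
        && zero < seven            // otherwise the contained values are compared
        && seven == 7;             // an optional also compares against a plain T
}
```

Note that an optional<int> holding 0 is distinct from an empty one, which is exactly what the sentinel approach could not express.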

Our functions that need to optionally pass a T would be declared as:

void maybe_take_an_int(optional<int> potential_value = nullopt);
    // or equivalently, "potential_value = {}"
optional<int> maybe_return_an_int();

Since optional<T> can be initialized from a T value, callers of maybe_take_an_int need not change unless they were explicitly passing -1 to indicate "not-a-value." Similarly, the implementation of maybe_return_an_int need only change places that are returning -1 for "not-a-value" to instead return nullopt (or equivalently {}).
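As a rough illustration of how small that migration can be, here is a before/after sketch. The functions digit_value_old, digit_value, and got_value are hypothetical stand-ins, not part of the original post.

```cpp
#include <optional>

// Hypothetical "before" version: -1 is the sentinel for "not a digit".
int digit_value_old(char c) {
    if (c < '0' || c > '9') return -1;
    return c - '0';
}

// "After" version: only the sentinel return statement changes.
std::optional<int> digit_value(char c) {
    if (c < '0' || c > '9') return std::nullopt;  // was: return -1;
    return c - '0';  // unchanged: the int converts to optional<int>
}

// Hypothetical receiver with an optional parameter that defaults to empty.
bool got_value(std::optional<int> potential_value = std::nullopt) {
    return potential_value.has_value();
}
```

Existing call sites such as got_value(42) keep compiling because the int converts implicitly, while got_value() and got_value({}) both pass an empty optional.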

Callers of maybe_return_an_int and the implementation of maybe_take_an_int require more substantial changes. You can ask explicitly if an instance of optional holds a value using either the has_value member or by contextual conversion to bool:

optional<int> o = maybe_return_an_int();
if (o.has_value()) { /* ... */ }
if (o) { /* ... */ } // "if" converts its condition to bool

Once you know that the optional contains a value, you can extract it with the * operator:

if (o) { cout << "The value is: " << *o << '\n'; }

or you can use the value member function to get the stored value or a bad_optional_access exception if there is none, and not bother with checking:

cout << "The value is: " << o.value() << '\n';

or the value_or member function if you'd rather get a fallback value than an exception from an empty optional:

cout << "The value might be: " << o.value_or(42) << '\n';

All of which together means we cannot inadvertently use a garbage value as was the case for the "traditional" solutions. Attempting to access the contained value of an empty optional results in an exception if accessed with the value() member, or in undefined behavior if accessed via the * operator, which debug libraries and static analysis tools can catch. Updating the "old" code is probably as simple as replacing validity tests like value == not_a_value_sentinel and if (is_valid) with opt_value.has_value() and if (opt_value), and replacing uses with *opt_value.
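The two checked access paths behave as follows in a small sketch. The helper names checked_access and with_fallback are assumptions for illustration; value(), value_or(), and std::bad_optional_access are the standard library names mentioned above.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: shows what value() does for a full or empty optional.
std::string checked_access(const std::optional<int>& o) {
    try {
        return "got " + std::to_string(o.value());
    } catch (const std::bad_optional_access&) {
        return "empty";  // value() threw because o holds nothing
    }
}

// Hypothetical helper: value_or never throws, it substitutes the fallback.
int with_fallback(const std::optional<int>& o) {
    return o.value_or(42);
}
```

Neither helper can read a garbage int: the empty case either throws or is replaced by the fallback. The unchecked * operator is deliberately left out of the sketch, since dereferencing an empty optional is undefined behavior.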

Returning to the concrete example, your function that looks up a string given an integer can simply return optional<string>. This avoids the problems of the suggested solutions; we can

easily discern the no-value case from the value-found case, unlike for the "return a default value" solution,
report the no-value case without using exception handling machinery, which is likely too expensive if such cases are frequent rather than exceptional,
avoid leaking implementation details to the caller as would be necessary to expose an "end" iterator with which they could compare a returned iterator.
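A lookup along these lines might be sketched as follows. The function name find_name, the use of std::map, and the table contents are all assumptions made up for illustration; the post only specifies "a string given an integer".

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical lookup: the container is an implementation detail that the
// caller never sees, so no "end" iterator leaks through the interface.
std::optional<std::string> find_name(int id) {
    static const std::map<int, std::string> names{{1, "one"}, {2, ""}};

    auto it = names.find(id);
    if (it == names.end()) {
        return std::nullopt;  // no-value case, reported without an exception
    }
    return it->second;        // value-found case, even when the string is empty
}
```

Here find_name(2) returns an optional that contains an empty string, while find_name(3) returns an empty optional, so the caller can tell "found an empty name" apart from "found nothing".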

Solving the delayed initialization problem is straightforward: we simply add an optional<T> member to our class. The standard library implementer is responsible for getting the placement new handling correct, and std::optional already handles all of the special cases for the copy/move constructors/assignment operators:

using T = /* some object type */;

struct S {
  optional<T> maybe_T;

  void construct_the_T(int arg) {
    // We need not guard against repeat initialization;
    // optional's emplace member will destroy any
    // contained object and make a fresh one.
    maybe_T.emplace(arg);
  }

  T& get_the_T() {
    assert(maybe_T);
    return *maybe_T;
    // Or, if we prefer an exception when maybe_T is not initialized:
    // return maybe_T.value();
  }

  // ... No error-prone handwritten special member functions! ...
};

optional is particularly well-suited to the delayed initialization problem because it is itself an instance of delayed initialization. The contained T may be initialized at construction, or sometime later, or never. Any contained T must be destroyed when the optional is destroyed. The designers of optional have already answered most of the questions that arise in this context.
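One way to watch those lifetime rules at work is to count live objects. The Tracked type and the two functions below are hypothetical and exist only for this sketch; it assumes C++17 for the inline static member.

```cpp
#include <optional>

// Hypothetical member type that counts how many instances are alive.
struct Tracked {
    static inline int live = 0;  // C++17 inline variable
    int id;
    explicit Tracked(int i) : id(i) { ++live; }
    Tracked(const Tracked& other) : id(other.id) { ++live; }
    ~Tracked() { --live; }
};

// Emplacing twice leaves exactly one live object: emplace destroys the
// first Tracked before constructing the second.
int live_after_two_emplaces() {
    std::optional<Tracked> slot;  // no Tracked exists yet
    slot.emplace(1);
    slot.emplace(2);
    return Tracked::live;         // read while slot is still alive
}

// When the optional goes out of scope, its contained object goes with it.
int live_after_scope_exit() {
    {
        std::optional<Tracked> slot;
        slot.emplace(3);
    }  // slot's destructor destroys the contained Tracked here
    return Tracked::live;
}
```

The first function should report one live object and the second none, with no hand-written destructor or bool flag involved.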

Conclusions

Any time you need a tool to express "value-or-not-value", or "possibly an answer", or "object with delayed initialization", you should reach into your toolbox for std::optional. Using a vocabulary type for these cases raises the level of abstraction, making it easier for others to understand what your code is doing. The declarations optional<T> f(); and void g(optional<T>); express intent more clearly and concisely than do pair<T, bool> f(); or void g(T t, bool is_valid);. Just as is the case with words, adding to our vocabulary of types increases our capacity to describe complex problems simply – it makes us more efficient.

If you have any questions, please feel free to post in the comments below. You can also send any comments and suggestions directly to the author via e-mail at cacarter@microsoft.com, or Twitter @CoderCasey. Thank you!