C++11 brought concurrency into the language for the first time. Tasks vs threads, futures vs promises, std::atomic vs volatile — each serves a distinct purpose. Getting them confused leads to bugs that are nightmarish to debug.
You want to run a function asynchronously. You have two choices:
```c++
int doAsyncWork();

// Thread-based: you manage everything
std::thread t(doAsyncWork);

// Task-based: the runtime manages threads for you
auto fut = std::async(doAsyncWork);
```
std::async gives you a future for the return value, propagates exceptions via get(), and lets the runtime handle thread creation, destruction, load balancing, and oversubscription avoidance.

| Kind | What It Is |
|---|---|
| Hardware thread | Physical execution unit on a CPU core. Fixed by hardware. |
| Software thread | OS-managed thread, scheduled onto hardware threads. More can exist than hardware threads. |
| std::thread | C++ object that acts as a handle to a software thread. Can be null, moved, joined, or detached. |
Thread-based code leaves several problems to you:

- std::thread t(f) throws std::system_error if the OS is out of threads, even if f is noexcept.
- std::thread gives you no direct way to get the function's return value, and if the function throws, std::terminate is called.

std::async shifts these problems to the Standard Library implementer, who is free to defer execution, use a thread pool, or apply work stealing. You get the result via fut.get().
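The return-value and exception handling described above can be sketched as follows. This is a minimal illustration, not code from the original: doubleIfNonNegative and runTask are hypothetical names chosen for the example.

```cpp
#include <future>
#include <stdexcept>
#include <string>

// Hypothetical worker: returns a value or throws.
int doubleIfNonNegative(int n) {
    if (n < 0) throw std::invalid_argument("negative input");
    return n * 2;
}

// Runs the worker as a task and reports what get() delivered.
std::string runTask(int n) {
    auto fut = std::async(doubleIfNonNegative, n);   // default launch policy
    try {
        // The result travels back through the future.
        return "value " + std::to_string(fut.get());
    } catch (const std::invalid_argument& e) {
        // An exception thrown inside the task is rethrown by get().
        return std::string("error: ") + e.what();
    }
}
```

The same call site handles both outcomes, whichever launch policy the runtime picks. With a raw std::thread, the exception would have ended the program instead.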
Use std::thread directly when: (1) You need native_handle() for platform-specific APIs (priorities, affinity). (2) You need to optimize thread usage for a known hardware profile. (3) You're implementing threading technology beyond the standard API.

std::async doesn't always run your function asynchronously. Its default launch policy is std::launch::async | std::launch::deferred, which means: "run it however you want."
| Policy | Behavior |
|---|---|
| std::launch::async | Must run on a different thread. Guaranteed concurrency. |
| std::launch::deferred | Runs synchronously when you call get() or wait(). May never run at all. |
| Default (both OR'd) | Runtime decides. Might be async, might be deferred. You don't know. |
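One way to observe the difference between the two explicit policies is to compare thread ids. The sketch below is illustrative only; ranOnCallingThread is a hypothetical helper name, not part of the Standard Library.

```cpp
#include <future>
#include <thread>

// Returns true if a task launched with the given policy
// executed on the same thread that called get().
bool ranOnCallingThread(std::launch policy) {
    const auto caller = std::this_thread::get_id();
    auto fut = std::async(policy, [] { return std::this_thread::get_id(); });
    return fut.get() == caller;   // a deferred task runs inside this get()
}
```

With std::launch::deferred the function reports true, and with std::launch::async it reports false. With the default policy either answer is possible, which is exactly the uncertainty the table describes.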
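With the default policy, you can't predict:

- whether f runs concurrently with the calling thread;
- which thread's thread_local variables are used (f may run on a new thread or on the thread that calls get or wait);
- whether f runs at all (it never runs if get or wait is never called);
- whether a wait_for loop terminates (a deferred task always reports future_status::deferred, never ready).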
```c++
using namespace std::literals;   // for the ms and s suffixes (C++14)

auto fut = std::async(f);        // default policy

while (fut.wait_for(100ms) != std::future_status::ready) {
    ...   // If f is deferred, this loops FOREVER!
          // wait_for always returns future_status::deferred
}
```
```c++
auto fut = std::async(f);

if (fut.wait_for(0s) == std::future_status::deferred) {
    fut.get();   // call synchronously
} else {
    while (fut.wait_for(100ms) != std::future_status::ready) {
        ...      // safe: task is not deferred
    }
}
```
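When deferral is never acceptable, the check can be avoided by always passing std::launch::async. A minimal sketch of a forwarding wrapper follows, assuming C++14 for the deduced return type. The names reallyAsync and addConcurrently are assumptions for this example, not functions defined in the text above.

```cpp
#include <chrono>
#include <future>
#include <utility>

// Forwards everything to std::async, but always with std::launch::async.
template <typename F, typename... Ts>
auto reallyAsync(F&& f, Ts&&... params) {
    return std::async(std::launch::async,
                      std::forward<F>(f),
                      std::forward<Ts>(params)...);
}

// Usage: the future is never deferred, so a timed wait loop terminates.
int addConcurrently(int a, int b) {
    auto fut = reallyAsync([](int x, int y) { return x + y; }, a, b);
    while (fut.wait_for(std::chrono::milliseconds(10)) != std::future_status::ready) {
        // the task is still running on its own thread
    }
    return fut.get();
}
```

The trade-off is that the runtime loses the option to defer, so this wrapper should only be used when concurrency is genuinely required.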
If the task must run asynchronously, specify std::launch::async explicitly: auto fut = std::async(std::launch::async, f);

Every std::thread is either joinable (it corresponds to an underlying thread that is or could be running, including one that is blocked, waiting to be scheduled, or finished but not yet joined) or unjoinable (default-constructed, moved-from, joined, or detached). If a joinable thread's destructor runs, your program is terminated.
| Unjoinable State | How It Got There |
|---|---|
| Default-constructed | std::thread t; — no function to execute |
| Moved-from | auto t2 = std::move(t); — t is now empty |
| Joined | t.join(); — underlying thread finished |
| Detached | t.detach(); — connection severed |
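The states in the table can be checked directly with joinable(). The sketch below is an illustration under the assumption that a trivial empty lambda stands in for real work; observeJoinable is a hypothetical name.

```cpp
#include <thread>
#include <utility>
#include <vector>

// Records joinable() in each state, in this order:
// default-constructed, running, moved-from, move target, joined, detached.
std::vector<bool> observeJoinable() {
    std::vector<bool> seen;

    std::thread empty;                 // no function to execute
    seen.push_back(empty.joinable());

    std::thread t([] {});              // owns an underlying thread
    seen.push_back(t.joinable());

    std::thread t2 = std::move(t);     // ownership moves to t2
    seen.push_back(t.joinable());      // moved-from
    seen.push_back(t2.joinable());     // move target

    t2.join();
    seen.push_back(t2.joinable());     // joined

    std::thread d([] {});
    d.detach();
    seen.push_back(d.joinable());      // detached

    return seen;
}
```

Only the two objects that still own an underlying thread report true. Every std::thread in this function is unjoinable by the time its destructor runs, which is why the function returns normally.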
Termination was chosen because the two alternatives are worse: implicit join (could silently block for minutes) and implicit detach (could cause memory corruption by writing to destroyed stack frames). An RAII wrapper lets you pick the destructor behavior explicitly:

```c++
class ThreadRAII {
public:
    enum class DtorAction { join, detach };

    ThreadRAII(std::thread&& t, DtorAction a)
        : action(a), t(std::move(t)) {}

    ~ThreadRAII() {
        if (t.joinable()) {
            if (action == DtorAction::join) t.join();
            else t.detach();
        }
    }

    ThreadRAII(ThreadRAII&&) = default;
    ThreadRAII& operator=(ThreadRAII&&) = default;

    std::thread& get() { return t; }

private:
    DtorAction action;
    std::thread t;   // declared last: initialized last (after all other members)
};
```
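Two details of ThreadRAII matter. The std::thread member is declared last so that it is initialized after all other members, which keeps the new thread from running before the members it may depend on exist. The move operations are explicitly defaulted because declaring a destructor suppresses their implicit generation.

| Option | Problem |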
|---|---|
| Implicit join | Could silently block the destructor for an unpredictable duration. If conditionsAreSatisfied() returns false, you'd still wait for the entire filter computation to finish. |
| Implicit detach | The thread keeps running after the function returns. It may write to stack memory that's now occupied by another function's local variables. Memory corruption with no visible cause. |
| Program termination | Harsh, but at least the bug is immediately visible. The standard committee chose this. |
```c++
bool doWork(std::function<bool(int)> filter, int maxVal) {
    std::vector<int> goodVals;

    ThreadRAII t(
        std::thread([&filter, maxVal, &goodVals] {
            for (auto i = 0; i <= maxVal; ++i)
                if (filter(i)) goodVals.push_back(i);
        }),
        ThreadRAII::DtorAction::join   // join if exception or early return
    );

    auto nh = t.get().native_handle();
    // ... set priority via platform API ...

    if (conditionsAreSatisfied()) {
        t.get().join();
        performComputation(goodVals);
        return true;
    }
    return false;   // ThreadRAII destructor joins the thread safely
}
```
Unlike std::thread, futures don't terminate your program when destroyed. But their destructor behavior is surprisingly nuanced.
Results from async tasks live in a shared state, typically a heap-allocated object accessible to both the caller's future and the callee's std::promise.

(Figure: the shared state sits between caller and callee.)
| Condition | Destructor Behavior |
|---|---|
| Normal case | Destroys the future's data members. No join, no detach, no block. |
| Exception: last future for a non-deferred std::async task | Blocks until the task completes (implicit join) |
A future's destructor blocks only when all three conditions hold: (1) It refers to a shared state created by a call to std::async. (2) The task's launch policy is std::launch::async. (3) This future is the last one referring to the shared state.

```c++
// This future's dtor will NOT block
std::packaged_task<int()> pt(calcValue);
auto fut = pt.get_future();     // shared state NOT from std::async
std::thread t(std::move(pt));
// You must join or detach t yourself
```
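Both destructor behaviors can be demonstrated without relying on timing. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the 50 ms sleep stands in for real work, and a promise named gate is used only to hold the second task open while its future is destroyed.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>
#include <utility>

// Case 1: the last future for a std::launch::async task blocks in its destructor.
bool asyncFutureDtorWaits() {
    std::atomic<bool> done{false};
    {
        auto fut = std::async(std::launch::async, [&done] {
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            done = true;
        });
    }   // fut is destroyed here and waits for the task to finish
    return done.load();
}

// Case 2: a future obtained from std::packaged_task does not block.
bool packagedTaskFutureDtorDoesNotWait() {
    std::promise<void> gate;
    auto opened = gate.get_future().share();
    std::atomic<bool> finished{false};

    std::packaged_task<void()> pt([opened, &finished] {
        opened.wait();        // parked until the gate is opened below
        finished = true;
    });

    std::thread t;
    {
        auto fut = pt.get_future();
        t = std::thread(std::move(pt));
    }   // fut is destroyed while the task is still parked

    const bool stillPending = !finished.load();
    gate.set_value();         // now let the task finish
    t.join();                 // the thread is our responsibility
    return stillPending;
}
```

If the second destructor blocked, the function would deadlock, because the gate is only opened after the future is gone.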
You have a detecting task that needs to signal a reacting task. What's the best mechanism?
| Mechanism | Problem |
|---|---|
| Condition variable | Requires a mutex (even if no shared data), misses a notification sent before the reactor starts waiting, and is subject to spurious wakeups |
| Atomic flag + polling | Wastes CPU — the reacting task spins instead of blocking |
| Condvar + flag combo | Works but stilted — two redundant signaling channels |
| std::promise<void> | Clean, one-shot, no mutex, no spurious wakeups, truly blocks |
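For contrast with the promise-based row, the condvar + flag combination can be sketched as below. This is an assumption-laden illustration rather than code from the original: condvarWithFlag, eventOccurred, and reactions are hypothetical names, and the reaction is reduced to incrementing a counter.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Returns how many times the reacting task reacted (expected: once).
int condvarWithFlag() {
    std::mutex m;
    std::condition_variable cv;
    bool eventOccurred = false;   // guarded by m
    int reactions = 0;

    std::thread reactor([&] {
        std::unique_lock<std::mutex> lk(m);
        // The predicate covers spurious wakeups and a notification
        // that was sent before this thread started waiting.
        cv.wait(lk, [&] { return eventOccurred; });
        ++reactions;
    });

    {
        std::lock_guard<std::mutex> g(m);
        eventOccurred = true;     // detecting task sets the flag
    }
    cv.notify_one();              // and then signals

    reactor.join();
    return reactions;
}
```

It works, but the flag and the condition variable both carry the same piece of information, which is the redundancy the table calls stilted.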
```c++
std::promise<void> p;

// Detecting task:
...                       // detect event
p.set_value();            // signal!

// Reacting task:
...                       // prepare
p.get_future().wait();    // block until signaled
...                       // react
```
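A std::promise can be set only once, so this channel is strictly one-shot. For repeated notifications, use the condvar + flag combination. One-shot signaling is still enough to start a thread in a suspended state, configure it, and then release it:

```c++
std::promise<void> p;

void detect() {
    std::thread t([] {
        p.get_future().wait();   // suspend until signaled
        react();
    });
    ...                          // configure t (priority, affinity, etc.)
    p.set_value();               // unsuspend t
    ...
    t.join();
}
```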
```c++
std::promise<void> p;
auto sf = p.get_future().share();

std::vector<std::thread> threads;
for (int i = 0; i < n; ++i)
    threads.emplace_back([sf] { sf.wait(); react(); });

p.set_value();                   // wake all threads simultaneously

for (auto& t : threads) t.join();
```
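The one-shot limitation can be observed directly: a second set_value() on the same promise throws std::future_error. The helper name secondSignalIsRejected below is hypothetical and exists only for this sketch.

```cpp
#include <future>

// Returns true if signaling the same promise twice is rejected.
bool secondSignalIsRejected() {
    std::promise<void> p;
    auto fut = p.get_future();

    p.set_value();   // first signal succeeds
    fut.wait();      // returns at once: the shared state is already ready

    try {
        p.set_value();   // second signal on the same promise
    } catch (const std::future_error& e) {
        return e.code() == std::future_errc::promise_already_satisfied;
    }
    return false;
}
```

Signaling again therefore requires a fresh promise and future pair, or one of the other mechanisms in the table.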
std::atomic and volatile serve completely different purposes. Confusing them is a common and dangerous mistake.
| Feature | std::atomic | volatile |
|---|---|---|
| Purpose | Concurrent access | Special memory (MMIO) |
| Atomicity | Guaranteed (all operations) | None |
| Reordering prevention | Yes (sequential consistency) | No |
| Redundant load elimination | Allowed (optimizer can merge) | Forbidden (every read/write preserved) |
| Dead store elimination | Allowed | Forbidden |
| Copyable? | No (copy ops deleted) | Yes |
```c++
std::atomic<int> ai(0);
ai = 10;   // atomically set to 10
++ai;      // atomic read-modify-write → 11
--ai;      // atomic RMW → 10
// Other threads see only 0, 10, or 11. Never a torn value.

// Sequential consistency: no reordering past atomic writes
auto val = computeValue();        // (1)
std::atomic<bool> ready(false);
ready = true;                     // (2) guaranteed after (1)
```
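The atomicity of read-modify-write operations is easiest to see with several threads incrementing one counter. The following is a minimal sketch; countWithAtomic and its parameters are hypothetical names chosen for the example.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Each of numThreads threads increments a shared counter incrementsPerThread times.
int countWithAtomic(int numThreads, int incrementsPerThread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;

    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([&counter, incrementsPerThread] {
            for (int i = 0; i < incrementsPerThread; ++i)
                ++counter;   // one indivisible read-modify-write
        });
    }
    for (auto& w : workers) w.join();

    return counter.load();
}
```

No increment is ever lost, so the result is always numThreads * incrementsPerThread. Replacing the counter with a volatile int would turn the increments into a data race, which is undefined behavior and in practice tends to lose updates.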
```c++
volatile int x;

auto y = x;   // read x (preserved: might be sensor data)
y = x;        // read x again (preserved: value might have changed!)
x = 10;       // write x (preserved: might be a hardware command)
x = 20;       // write x (preserved: might be a different command)

// Without volatile, the compiler could reduce this to:
//   auto y = x;
//   x = 20;
```
```c++
// With std::atomic: guaranteed ordering
auto imptValue = computeImportantValue();   // (1)
std::atomic<bool> valAvailable(false);
valAvailable = true;                        // (2): (1) is guaranteed to come before (2)
// Other threads see imptValue BEFORE valAvailable becomes true

// With volatile: NO ordering guarantee
auto imptValue = computeImportantValue();   // might be reordered!
volatile bool valAvailable = false;
valAvailable = true;                        // might execute BEFORE imptValue is assigned!
// Other threads could see valAvailable == true but imptValue not yet computed
```
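The ordering guarantee is what makes the following publish-then-flag pattern safe. This is a sketch under assumptions: publishAndConsume is a hypothetical name, 42 stands in for the computed value, and the consumer polls with yield() purely to keep the example short.

```cpp
#include <atomic>
#include <thread>

// Producer writes plain data, then raises an atomic flag.
// Consumer waits for the flag, then reads the data.
int publishAndConsume() {
    int imptValue = 0;                        // ordinary, non-atomic data
    std::atomic<bool> valAvailable{false};
    int seen = -1;

    std::thread consumer([&] {
        while (!valAvailable) {               // atomic read
            std::this_thread::yield();
        }
        seen = imptValue;                     // safe: the write at (1) is visible here
    });

    imptValue = 42;                           // (1)
    valAvailable = true;                      // (2) cannot appear to precede (1)

    consumer.join();
    return seen;
}
```

Because the consumer only reads imptValue after it has seen the flag, it always observes 42. With a volatile bool flag, the same code would contain a data race on imptValue.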
std::atomic (with the default memory order, sequential consistency) guarantees that no code preceding an atomic write can appear to other threads to happen after it. This is the foundation of lock-free programming. volatile provides no such guarantee.

std::atomic objects also can't be copied:

```c++
std::atomic<int> x;
auto y = x;   // ERROR! Copy operations are deleted.
              // Reading x and writing y atomically as ONE operation
              // is impossible on most hardware.

// Use load() and store() instead:
std::atomic<int> y(x.load());   // atomic read, then init y
y.store(x.load());              // atomic read, then atomic write
// Each operation is atomic, but the pair is NOT atomic together
```
volatile std::atomic<int> vai; means "operations are atomic AND can't be optimized away." This is useful for memory-mapped I/O that's accessed from multiple threads.

Understanding the three layers of threads helps you reason about concurrency.

(Figure: how C++ threads map to OS threads and hardware cores.)
| Layer | Count | Managed By |
|---|---|---|
| Hardware threads | Fixed (e.g., 8 cores × 2 SMT = 16) | CPU |
| Software threads | Hundreds to thousands | OS scheduler |
| std::thread objects | As many as you create | Your code (or std::async) |
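The hardware layer can be queried from code through std::thread::hardware_concurrency(), which returns a hint and may return 0 when the count is unknown. The sketch below is a small illustration; usableHardwareThreads and the fallback of 1 are assumptions made for this example.

```cpp
#include <thread>

// Number of hardware threads to plan for, never less than 1.
unsigned usableHardwareThreads() {
    const unsigned n = std::thread::hardware_concurrency();   // hint; 0 if unknown
    return n == 0 ? 1u : n;                                    // assumed fallback
}
```

A thread-based design might use this value to cap how many std::thread objects it creates. A task-based design leaves that decision to the runtime.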
When there are more runnable software threads than hardware threads (oversubscription), the scheduler has to time-slice them. std::async with the default policy can help avoid this by deferring tasks.
| Item | Key Takeaway |
|---|---|
| Item 35 | Prefer task-based (std::async) to thread-based (std::thread). Tasks handle thread management, return values, and exceptions. |
| Item 36 | The default launch policy may defer. Specify std::launch::async if concurrency is essential. |
| Item 37 | Make std::threads unjoinable on all paths. Use RAII wrappers. Declare thread members last. |
| Item 38 | Only futures from std::async (non-deferred, last reference) block in their destructors. |
| Item 39 | Use std::promise<void> for clean one-shot event communication. Use shared_future for multiple reactors. |
| Item 40 | std::atomic = concurrency (atomicity + ordering). volatile = special memory (no optimization). Different tools for different jobs. |
"For the first time in C++'s history, programmers can write multithreaded programs with standard behavior across all platforms." — Scott Meyers