The Concurrency API — Effective Modern C++, Chapter 7

Chapter 0: Tasks vs Threads

You want to run a function asynchronously. You have two choices:

c++
int doAsyncWork();

// Thread-based: you manage everything
std::thread t(doAsyncWork);

// Task-based: the runtime manages threads for you
auto fut = std::async(doAsyncWork);

Prefer task-based. std::async gives you a future for the return value, propagates exceptions via get(), and lets the runtime handle thread creation, destruction, load balancing, and oversubscription avoidance.
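
As a minimal sketch of the two benefits named above (a return value and exception propagation through get()), the following uses hypothetical stand-ins for doAsyncWork; the function bodies and the names doFailingWork, resultViaFuture, and errorViaFuture are assumptions for illustration, not part of the original text.

```cpp
#include <future>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the section's doAsyncWork().
int doAsyncWork() { return 42; }
int doFailingWork() { throw std::runtime_error("work failed"); }

// Returns the task's result, retrieved through the future.
int resultViaFuture() {
    auto fut = std::async(doAsyncWork);
    return fut.get();
}

// Returns the message of the exception rethrown by get(), or "" if none.
std::string errorViaFuture() {
    auto fut = std::async(doFailingWork);
    try {
        fut.get();  // rethrows the exception thrown inside the task
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return "";
}
```

With a raw std::thread, an exception escaping doFailingWork would instead call std::terminate.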

Three kinds of "thread"

Kind	What It Is
Hardware thread	Physical execution unit on a CPU core. Fixed by hardware.
Software thread	OS-managed thread, scheduled onto hardware threads. More can exist than hardware threads.
std::thread	C++ object that acts as a handle to a software thread. Can be null, moved, joined, or detached.

Why thread-based is dangerous

Thread exhaustion

std::thread t(f) throws std::system_error if the OS cannot create another thread, even if f is noexcept.

Oversubscription

More ready threads than hardware threads = expensive context switches, cache pollution, thrashing.

No return value

std::thread gives you no direct way to get the function's return value or catch its exception.

std::async shifts all of these problems to the Standard Library implementer. It can defer execution, use thread pools, or steal work — you get the result via fut.get().

When to use raw threads: (1) You need native_handle() for platform-specific APIs (priorities, affinity). (2) You need to optimize thread usage for a known hardware profile. (3) You're implementing threading technology beyond the standard API.

What happens if std::thread t(f) runs out of OS threads?

• It throws std::system_error, even if f is noexcept
• It silently runs f on the calling thread
• It queues f until a thread becomes available

Chapter 1: Launch Policies

std::async doesn't always run your function asynchronously. Its default launch policy is std::launch::async | std::launch::deferred, which means: "run it however you want."

Policy	Behavior
`std::launch::async`	Must run on a different thread. Guaranteed concurrency.
`std::launch::deferred`	Runs synchronously when you call `get()` or `wait()`. May never run at all.
Default (both or'd)	Runtime decides. Might be async, might be deferred. You don't know.

The default policy is dangerous because:
• You can't predict if f runs concurrently.
• You can't predict which thread's thread_local variables are used.
• f might never run (if get/wait is never called).
• Timeout loops using wait_for may run forever: for a deferred task, wait_for always returns std::future_status::deferred, never ready.
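
The points above can be observed directly by forcing the deferred policy. This is a sketch under that assumption; the helper names isDeferred and deferredRunsOnCallingThread are hypothetical.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Reports whether a future's task was deferred, without blocking.
template <typename T>
bool isDeferred(const std::future<T>& fut) {
    return fut.wait_for(std::chrono::seconds(0)) == std::future_status::deferred;
}

// Runs a deferred task and reports whether it executed on the caller's thread.
bool deferredRunsOnCallingThread() {
    auto fut = std::async(std::launch::deferred,
                          [] { return std::this_thread::get_id(); });
    if (!isDeferred(fut)) return false;              // would contradict the policy
    return fut.get() == std::this_thread::get_id();  // get() runs the task here
}
```

Because the task runs inside get(), it uses the calling thread's thread_local variables, and it never runs if get() or wait() is never called.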

The infinite loop trap

c++
using namespace std::literals;  // for the 100ms duration suffix (C++14)

auto fut = std::async(f);  // default policy

while (fut.wait_for(100ms) != std::future_status::ready) {
    ...  // If f is deferred, this loops FOREVER!
         // wait_for always returns future_status::deferred
}

The fix

c++
using namespace std::literals;  // for the 0s and 100ms duration suffixes (C++14)

auto fut = std::async(f);  // default policy

if (fut.wait_for(0s) == std::future_status::deferred) {
    fut.get();  // call synchronously
} else {
    while (fut.wait_for(100ms) != std::future_status::ready) {
        ...  // safe: task is not deferred
    }
}

Rule of thumb: If asynchronous execution is essential, always specify std::launch::async explicitly: auto fut = std::async(std::launch::async, f);
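
One way to apply this rule consistently is a small forwarding wrapper in the spirit of the book's Item 36. The sketch below assumes C++14 (for the deduced return type); runsOnAnotherThread is a hypothetical helper added only to show the effect.

```cpp
#include <future>
#include <thread>
#include <utility>

// Forwards to std::async with std::launch::async always specified.
template <typename F, typename... Ts>
auto reallyAsync(F&& f, Ts&&... params) {
    return std::async(std::launch::async,
                      std::forward<F>(f),
                      std::forward<Ts>(params)...);
}

// Reports whether a reallyAsync task ran on a thread other than the caller's.
bool runsOnAnotherThread() {
    auto fut = reallyAsync([] { return std::this_thread::get_id(); });
    return fut.get() != std::this_thread::get_id();
}
```

Futures returned this way are never deferred, so a wait_for timeout loop on them can terminate.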

What does wait_for return for a deferred task?

• std::future_status::deferred, every time; it never becomes ready
• std::future_status::ready after the timeout
• It throws an exception

Chapter 2: Make std::threads Unjoinable on All Paths

Every std::thread is either joinable (it corresponds to an underlying thread that is running, could run, is blocked, or has finished but not yet been joined) or unjoinable (default-constructed, moved-from, joined, or detached). If the destructor of a joinable std::thread runs, std::terminate is called and the program ends.

Unjoinable State	How It Got There
Default-constructed	`std::thread t;` — no function to execute
Moved-from	`auto t2 = std::move(t);` — t is now empty
Joined	`t.join();` — underlying thread finished
Detached	`t.detach();` — connection severed
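
The states in the table can be checked with joinable(). This is a small sketch; the JoinableStates struct and observeJoinability function are hypothetical names introduced for the example.

```cpp
#include <thread>
#include <utility>

// Records what joinable() reported at each stage.
struct JoinableStates {
    bool defaultConstructed;
    bool started;
    bool movedFrom;
    bool afterJoin;
};

JoinableStates observeJoinability() {
    JoinableStates s{};

    std::thread empty;                        // no function to execute
    s.defaultConstructed = empty.joinable();  // false

    std::thread t([] {});
    s.started = t.joinable();                 // true, even if the lambda already finished

    std::thread t2 = std::move(t);
    s.movedFrom = t.joinable();               // false: t no longer owns a thread

    t2.join();
    s.afterJoin = t2.joinable();              // false: the thread has been joined
    return s;
}
```

Every std::thread in this sketch is unjoinable by the time it is destroyed, which is what keeps the program from terminating.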

Why program termination? The standard rejected both alternatives: implicit join (could silently block for minutes) and implicit detach (could cause memory corruption by writing to destroyed stack frames).

ThreadRAII: an RAII wrapper

c++
class ThreadRAII {
public:
    enum class DtorAction { join, detach };

    ThreadRAII(std::thread&& t, DtorAction a)
        : action(a), t(std::move(t)) {}

    ~ThreadRAII() {
        if (t.joinable()) {
            if (action == DtorAction::join) t.join();
            else t.detach();
        }
    }

    ThreadRAII(ThreadRAII&&) = default;
    ThreadRAII& operator=(ThreadRAII&&) = default;

    std::thread& get() { return t; }
private:
    DtorAction action;
    std::thread t;  // declared last: initialized last (after all other members)
};

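Design notes: (1) The joinability check is necessary because calling join or detach on an unjoinable thread is invalid (the book describes it as undefined behavior). (2) The std::thread member is declared last so it is initialized after all other members: a thread may start running as soon as it is constructed, so anything it might access must already be initialized. (3) The move operations are explicitly defaulted because declaring a destructor suppresses their implicit generation.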

Why both join and detach were rejected as defaults

Option	Problem
Implicit join	Could silently block the destructor for an unpredictable duration. If `conditionsAreSatisfied()` returns false, you'd still wait for the entire filter computation to finish.
Implicit detach	The thread keeps running after the function returns. It may write to stack memory that's now occupied by another function's local variables. Memory corruption with no visible cause.
Program termination	Harsh, but at least the bug is immediately visible. The standard committee chose this.

Using ThreadRAII

c++
bool doWork(std::function<bool(int)> filter, int maxVal) {
    std::vector<int> goodVals;

    ThreadRAII t(
        std::thread([&filter, maxVal, &goodVals] {
            for (auto i = 0; i <= maxVal; ++i)
                if (filter(i)) goodVals.push_back(i);
        }),
        ThreadRAII::DtorAction::join  // join if exception or early return
    );

    auto nh = t.get().native_handle();
    // ... set priority via platform API ...

    if (conditionsAreSatisfied()) {
        t.get().join();
        performComputation(goodVals);
        return true;
    }
    return false;  // ThreadRAII destructor joins the thread safely
}

What happens if a joinable std::thread's destructor is called?

• The program is terminated (calls std::terminate)
• The thread is silently joined
• The thread is silently detached

Chapter 3: Future Destructor Behavior

Unlike std::thread, futures don't terminate your program when destroyed. But their destructor behavior is surprisingly nuanced.

The shared state

Results from async tasks live in a shared state, an object (typically heap-allocated) that both the caller's future and the callee's std::promise can access:

Future / Promise / Shared State

The shared state sits between caller and callee.

Destructor rules

Condition	Destructor Behavior
Normal case	Destroys the future's data members. No join, no detach, no block.
Exception: last future for a non-deferred std::async task	Blocks until the task completes (implicit join)

All three conditions must hold for blocking: (1) The shared state was created by std::async. (2) The task's launch policy is std::launch::async, either because you specified it or because the runtime chose it under the default policy. (3) This future is the last one referring to the shared state.
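
The blocking case can be sketched as follows. The 50 ms sleep and the function name lastAsyncFutureWaitsInDtor are assumptions chosen only to make the effect observable.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>

// Returns true if the task had finished by the time the future's scope ended.
bool lastAsyncFutureWaitsInDtor() {
    std::atomic<bool> finished{false};
    {
        auto fut = std::async(std::launch::async, [&finished] {
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            finished = true;
        });
    }  // fut is the last future for this std::async task: its destructor blocks here
    return finished.load();
}
```

All three conditions hold for fut, so leaving the inner scope behaves like an implicit join and the function returns true.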

std::packaged_task avoids blocking

c++
// This future's dtor will NOT block
std::packaged_task<int()> pt(calcValue);
auto fut = pt.get_future();  // shared state NOT from std::async
std::thread t(std::move(pt));
// You must join or detach t yourself
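
A complete version of this pattern might look like the sketch below. The body of calcValue and the wrapper name runViaPackagedTask are hypothetical.

```cpp
#include <future>
#include <thread>
#include <utility>

// Hypothetical stand-in for the section's calcValue().
int calcValue() { return 42; }

int runViaPackagedTask() {
    std::packaged_task<int()> pt(calcValue);
    auto fut = pt.get_future();    // shared state not created by std::async
    std::thread t(std::move(pt));  // packaged_task is move-only
    int result = fut.get();        // waits for the task and fetches its value
    t.join();                      // the thread is our responsibility, not the future's
    return result;
}
```

Here the decision to join or detach belongs to the code that owns t, so the future's destructor has nothing to wait for.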

When does a future's destructor block until the task completes?

• Only when it's the last future for a non-deferred std::async task
• Always: futures always block in their destructors
• Never: futures never block

Chapter 4: Void Futures for One-Shot Events

You have a detecting task that needs to signal a reacting task. What's the best mechanism?

Mechanism	Problem
Condition variable	Requires a mutex even when no data is shared; the notification is lost if it is sent before the reacting task starts waiting; wait can return spuriously
Atomic flag + polling	Wastes CPU — the reacting task spins instead of blocking
Condvar + flag combo	Works but stilted — two redundant signaling channels
std::promise<void>	Clean, one-shot, no mutex, no spurious wakeups, truly blocks

c++
std::promise<void> p;

// Detecting task:
...                       // detect event
p.set_value();              // signal!

// Reacting task:
...                       // prepare
p.get_future().wait();    // block until signaled
...                       // react

Limitations: (1) Heap-allocated shared state (cost). (2) One-shot only: a std::promise can be set only once. For repeated notifications, use condvar+flag.
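
The signaling pattern and its one-shot limitation can be sketched in runnable form. The function names reactAfterSignal and secondSignalThrows are hypothetical, and the "reaction" is reduced to setting a flag.

```cpp
#include <atomic>
#include <future>
#include <thread>

// Detecting side signals once; the reacting thread blocks until then.
bool reactAfterSignal() {
    std::promise<void> p;
    std::future<void> signal = p.get_future();
    std::atomic<bool> reacted{false};

    std::thread reactor([&signal, &reacted] {
        signal.wait();   // blocks without spinning until set_value() is called
        reacted = true;  // stands in for the real reaction
    });

    p.set_value();       // signal the event
    reactor.join();
    return reacted.load();
}

// A promise can be satisfied only once; a second set_value() throws.
bool secondSignalThrows() {
    std::promise<void> p;
    p.set_value();
    try {
        p.set_value();
    } catch (const std::future_error& e) {
        return e.code() == std::future_errc::promise_already_satisfied;
    }
    return false;
}
```

Unlike a condition variable, the signal is not lost if set_value() runs before the reactor reaches wait(): the shared state remembers it.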

Suspending threads at creation

c++
std::promise<void> p;

void detect() {
    std::thread t([] {
        p.get_future().wait();   // suspend until signaled
        react();
    });
    ...                         // configure t (priority, affinity, etc.)
    p.set_value();               // unsuspend t
    ...
    t.join();
}

Multiple reacting tasks: use shared_future

c++
std::promise<void> p;
auto sf = p.get_future().share();

std::vector<std::thread> threads;
for (int i = 0; i < n; ++i)
    threads.emplace_back([sf]{ sf.wait(); react(); });

p.set_value();  // wake all threads simultaneously

for (auto& t : threads) t.join();

Why is std::promise<void> better than a condition variable for one-shot event signaling?

• No mutex needed, no spurious wakeups, no timing dependency, truly blocks
• It's faster than condition variables
• It supports multiple notifications

Chapter 5: std::atomic vs volatile

std::atomic and volatile serve completely different purposes. Confusing them is a common and dangerous mistake.

Feature	std::atomic	volatile
Purpose	Concurrent access	Special memory (MMIO)
Atomicity	Guaranteed (all operations)	None
Reordering prevention	Yes (sequential consistency)	No
Redundant load elimination	Allowed (optimizer can merge)	Forbidden (every read/write preserved)
Dead store elimination	Allowed	Forbidden
Copyable?	No (copy ops deleted)	Yes

Atomic guarantees atomicity and ordering

c++
std::atomic<int> ai(0);
ai = 10;      // atomically set to 10
++ai;         // atomic read-modify-write → 11
--ai;         // atomic RMW → 10
// Other threads see only 0, 10, or 11. Never a torn value.

// Sequential consistency: no reordering past atomic writes
auto val = computeValue();    // (1)
std::atomic<bool> ready(false);
ready = true;                  // (2) guaranteed after (1)

volatile prevents optimization of special memory

c++
volatile int x;

auto y = x;  // read x (preserved — might be sensor data)
y = x;       // read x again (preserved — value might have changed!)

x = 10;      // write x (preserved — might be a hardware command)
x = 20;      // write x (preserved — might be a different command)

// Without volatile, the compiler could reduce this to: auto y = x; x = 20;

Code reordering: the critical difference

c++
// With std::atomic: guaranteed ordering
auto imptValue = computeImportantValue();  // (1)
std::atomic<bool> valAvailable(false);
valAvailable = true;                        // (2) — (1) guaranteed before (2)
// Other threads see imptValue BEFORE valAvailable becomes true

// With volatile (alternative version of the same code): NO ordering guarantee
auto imptValue = computeImportantValue();  // might be reordered!
volatile bool valAvailable = false;
valAvailable = true;                        // might take effect BEFORE the assignment to imptValue!
// Other threads could see valAvailable == true while imptValue is not yet computed

Sequential consistency: std::atomic (with default memory order) guarantees that no code preceding an atomic write can appear to other threads to happen after it. This is the foundation of lock-free programming. volatile provides no such guarantee.
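
A two-thread sketch of this guarantee follows. The body of computeImportantValue and the name consumeAfterFlag are hypothetical, and the spin-wait is used only to keep the example short.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for the section's computeImportantValue().
int computeImportantValue() { return 42; }

// The consumer reads imptValue only after it observes the atomic flag.
int consumeAfterFlag() {
    int imptValue = 0;                      // plain, non-atomic data
    std::atomic<bool> valAvailable{false};
    int observed = -1;

    std::thread consumer([&] {
        while (!valAvailable.load()) std::this_thread::yield();
        observed = imptValue;               // safe: ordered after the flag write
    });

    imptValue = computeImportantValue();    // (1)
    valAvailable = true;                    // (2) cannot appear to happen before (1)
    consumer.join();
    return observed;
}
```

If valAvailable were a volatile bool instead, the same code would contain a data race and the consumer could read a stale imptValue.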

Why std::atomic isn't copyable

c++
std::atomic<int> x;
auto y = x;       // ERROR! Copy operations are deleted.
// Reading x and writing y atomically as ONE operation
// is impossible on most hardware.

// Use load() and store() instead:
std::atomic<int> y(x.load());  // atomic read, then init y
y.store(x.load());               // atomic read, then atomic write
// Each operation is atomic, but the pair is NOT atomic together
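
The deleted copy operations can be confirmed at compile time. A sketch using standard type traits; copyValueBetweenAtomics is a hypothetical helper name.

```cpp
#include <atomic>
#include <type_traits>

// Both checks pass because std::atomic deletes its copy operations.
static_assert(!std::is_copy_constructible<std::atomic<int>>::value,
              "std::atomic<int> must not be copy-constructible");
static_assert(!std::is_copy_assignable<std::atomic<int>>::value,
              "std::atomic<int> must not be copy-assignable");

// Transfers a value with two separate atomic operations.
int copyValueBetweenAtomics() {
    std::atomic<int> x{7};
    std::atomic<int> y{0};
    y.store(x.load());  // atomic read of x, then atomic write of y
    return y.load();
}
```

Another thread could modify x between the load and the store, which is why the pair is not a single atomic step.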

They can be combined: volatile std::atomic<int> vai; means "operations are atomic AND can't be optimized away." Useful for memory-mapped I/O that's accessed from multiple threads.

Which is correct for concurrent shared data: std::atomic or volatile?

• std::atomic: it guarantees atomicity and prevents reordering. volatile provides neither.
• volatile: it prevents the compiler from caching values
• Either works equally well

Chapter 6: The Thread Model

Understanding the three layers of threads helps you reason about concurrency.

Hardware → Software → std::thread

How C++ threads map to OS threads and hardware cores.

Layer	Count	Managed By
Hardware threads	Fixed (e.g., 8 cores × 2 SMT = 16)	CPU
Software threads	Hundreds to thousands	OS scheduler
std::thread objects	As many as you create	Your code (or std::async)

Oversubscription occurs when ready-to-run software threads outnumber hardware threads. This causes context switches, cache pollution, and performance degradation. std::async with the default policy lets the runtime avoid this by deferring tasks.
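
When you do manage threads yourself, the hardware thread count is available as a hint through std::thread::hardware_concurrency(). The sketch below assumes a fallback of one worker when the hint is unavailable; workerCount is a hypothetical helper name.

```cpp
#include <thread>

// Suggests how many worker threads to create without oversubscribing.
unsigned workerCount() {
    unsigned hw = std::thread::hardware_concurrency();  // may return 0 if unknown
    return hw == 0 ? 1u : hw;                           // assumed fallback: one worker
}
```

The value is only a hint: it says nothing about how many other threads, in this process or others, are already ready to run.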

What is oversubscription?

• More ready-to-run software threads than hardware threads, causing costly context switches
• Running out of memory for thread stacks
• Creating more std::thread objects than the OS allows

Chapter 7: Event Communication Patterns

Event Signaling Patterns

Four approaches to inter-thread event communication.

What is the main drawback of polling an atomic flag for event notification?

• The reacting task wastes CPU by spinning instead of truly blocking
• Atomic flags can't be set from another thread
• It requires a mutex

Chapter 8: Atomic vs Volatile Visualized

See how std::atomic and volatile handle concurrent increments differently.

Concurrent Increment: atomic vs volatile

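Two threads each increment a counter once. With std::atomic<int>, each increment is an indivisible read-modify-write, so the final value is always 2. With volatile int, the increments form a data race: the behavior is undefined, and in practice one update can be lost, leaving 1.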
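
The atomic half of this comparison can be sketched as runnable code. The function name countWithAtomic and the perThread parameter are assumptions; the volatile variant is deliberately omitted because running it would be undefined behavior.

```cpp
#include <atomic>
#include <thread>

// Two threads each increment a shared atomic counter perThread times.
int countWithAtomic(int perThread) {
    std::atomic<int> counter{0};
    auto work = [&counter, perThread] {
        for (int i = 0; i < perThread; ++i) ++counter;  // atomic read-modify-write
    };
    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    return counter.load();  // always 2 * perThread
}
```

Calling countWithAtomic(1) reproduces the scenario above; larger counts make the absence of lost updates more convincing.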

Chapter 9: Beyond

Item	Key Takeaway
Item 35	Prefer task-based (`std::async`) to thread-based (`std::thread`). Tasks handle thread management, return values, and exceptions.
Item 36	The default launch policy may defer. Specify `std::launch::async` if concurrency is essential.
Item 37	Make `std::thread`s unjoinable on all paths. Use RAII wrappers. Declare thread members last.
Item 38	Only futures from `std::async` (non-deferred, last reference) block in their destructors.
Item 39	Use `std::promise<void>` for clean one-shot event communication. Use `shared_future` for multiple reactors.
Item 40	`std::atomic` = concurrency (atomicity + ordering). `volatile` = special memory (no optimization). Different tools for different jobs.

Next: Chapter 8: Tweaks covers the final two items: pass by value for copyable parameters, and emplacement vs insertion.

"For the first time in C++'s history, programmers can write multithreaded programs with standard behavior across all platforms." — Scott Meyers