Multithreading in Java

Multithreading is the foundation of concurrent programming in Java. It allows multiple tasks to execute simultaneously within a single process, enabling efficient CPU utilization, responsive applications, and high-throughput systems.

Modern Java (21+): Start Here

If you're on Java 21+, Virtual Threads are the default choice for I/O-bound concurrency. You write simple blocking code that scales to millions of concurrent tasks — no thread pool tuning, no reactive frameworks. See Virtual Threads & Structured Concurrency for the modern approach. This page covers the foundational threading model that Virtual Threads build upon — essential for understanding the JVM, debugging production issues, and answering interview questions about what happens under the hood.


When to Use What (2026 Decision Guide)

| Scenario | Recommended Approach | Why |
| --- | --- | --- |
| I/O-bound work (HTTP calls, DB queries) | Virtual Threads (Executors.newVirtualThreadPerTaskExecutor()) | Scales to millions of tasks with zero pool tuning |
| CPU-bound computation (image processing, crypto) | Platform threads with ForkJoinPool | Virtual threads can't parallelize faster than core count |
| Need fine-grained control (locks, barriers) | Platform threads + java.util.concurrent | Explicit control over pool sizing and thread priority (the JDK offers no standard CPU-affinity API) |
| Legacy codebase (Java 8-17) | ExecutorService + thread pools | Virtual threads require Java 21+ |
| Fire-and-forget async tasks | Virtual Threads (simplest) or CompletableFuture | Both work; VTs are simpler for blocking I/O |

Process vs Thread

A process is an independent program with its own memory space. A thread is the smallest unit of CPU execution that runs within a process and shares its memory.

%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '13px', 'fontFamily': 'Inter, -apple-system, sans-serif'}, 'flowchart': {'nodeSpacing': 30, 'rankSpacing': 50, 'padding': 12, 'curve': 'basis'}, 'sequence': {'actorMargin': 60, 'messageMargin': 40}, 'class': {'padding': 12}}}%%
flowchart LR
    subgraph Process A
        direction LR
        A_HEAP[["Shared Heap Memory"]]
        A_T1(("Thread 1<br/>Own Stack"))
        A_T2(("Thread 2<br/>Own Stack"))
        A_T3(("Thread 3<br/>Own Stack"))
        A_T1 --> A_HEAP
        A_T2 --> A_HEAP
        A_T3 --> A_HEAP
    end
    subgraph Process B
        direction LR
        B_HEAP[["Shared Heap Memory"]]
        B_T1(("Thread 1<br/>Own Stack"))
        B_T2(("Thread 2<br/>Own Stack"))
        B_T1 --> B_HEAP
        B_T2 --> B_HEAP
    end

    style A_HEAP fill:#DBEAFE,stroke:#93C5FD,color:#1E40AF
    style A_T1 fill:#D1FAE5,stroke:#6EE7B7,color:#065F46
    style A_T2 fill:#FEF3C7,stroke:#FCD34D,color:#92400E
    style A_T3 fill:#FEE2E2,stroke:#FCA5A5,color:#991B1B
    style B_HEAP fill:#DBEAFE,stroke:#93C5FD,color:#1E40AF
    style B_T1 fill:#D1FAE5,stroke:#6EE7B7,color:#065F46
    style B_T2 fill:#FEF3C7,stroke:#FCD34D,color:#92400E

| Aspect | Process | Thread |
| --- | --- | --- |
| Memory | Separate address space | Shares heap, has own stack |
| Creation cost | Expensive (new address space, OS-level) | Cheaper (a platform thread is still an OS thread, but it reuses the process's address space) |
| Communication | IPC (pipes, sockets) | Direct shared memory access |
| Isolation | Crash doesn't affect others | Crash can bring down the process |
| Context switch | Slower (address-space switch, TLB flush) | Faster (same address space; only stack pointer + registers) |

Thread Creation

1. Extending Thread Class

Java
public class DownloadThread extends Thread {
    private final String url;

    public DownloadThread(String url) {
        this.url = url;
    }

    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName() + " downloading: " + url);
    }
}

DownloadThread thread = new DownloadThread("https://example.com/file.zip");
thread.start();

2. Implementing Runnable (Preferred)

Java
public class DownloadTask implements Runnable {
    private final String url;

    public DownloadTask(String url) {
        this.url = url;
    }

    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName() + " downloading: " + url);
    }
}

Thread thread = new Thread(new DownloadTask("https://example.com/file.zip"));
thread.start();

// Lambda style (Java 8+); a distinct variable name avoids redeclaring `thread`
Thread lambdaThread = new Thread(() -> System.out.println("Running in: " + Thread.currentThread().getName()));
lambdaThread.start();

3. Callable + Future (Returns a Result)

Java
public class PriceCalculator implements Callable<Double> {
    private final String ticker;

    public PriceCalculator(String ticker) {
        this.ticker = ticker;
    }

    @Override
    public Double call() throws Exception {
        Thread.sleep(1000);
        return Math.random() * 1000;
    }
}

ExecutorService executor = Executors.newFixedThreadPool(4);
Future<Double> future = executor.submit(new PriceCalculator("GOOG"));
Double price = future.get();  // blocks until result is available
executor.shutdown();

Thread vs Runnable vs Callable

| Feature | Thread | Runnable | Callable |
| --- | --- | --- | --- |
| Return value | No | No | Yes (Future<T>) |
| Checked exceptions | No | No | Yes |
| Inheritance | Extends Thread (uses up single inheritance) | Implements interface | Implements interface |
| Reusability | Low | High | High |
| Use with thread pools | No | Yes | Yes |

Thread Lifecycle

%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '13px', 'fontFamily': 'Inter, -apple-system, sans-serif'}, 'flowchart': {'nodeSpacing': 30, 'rankSpacing': 50, 'padding': 12, 'curve': 'basis'}, 'sequence': {'actorMargin': 60, 'messageMargin': 40}, 'class': {'padding': 12}}}%%
stateDiagram-v2
    [*] --> NEW : Thread created
    NEW --> RUNNABLE : start()
    RUNNABLE --> RUNNING : Scheduler picks thread
    RUNNING --> RUNNABLE : yield() / time slice expires
    RUNNING --> BLOCKED : waiting for monitor lock
    RUNNING --> WAITING : wait() / join() / park()
    RUNNING --> TIMED_WAITING : sleep(ms) / wait(ms) / join(ms)
    BLOCKED --> RUNNABLE : lock acquired
    WAITING --> RUNNABLE : notify() / notifyAll() / unpark()
    TIMED_WAITING --> RUNNABLE : timeout expires / notify()
    RUNNING --> TERMINATED : run() completes / exception
    TERMINATED --> [*]

| State | Description | How to enter |
| --- | --- | --- |
| NEW | Thread object created, not yet started | new Thread(runnable) |
| RUNNABLE | Ready to run, waiting for CPU time | start(), or lock acquired |
| RUNNING | Currently executing on CPU (conceptual only: Thread.State has no RUNNING constant, and getState() reports RUNNABLE) | Scheduler assigns CPU |
| BLOCKED | Waiting to acquire a monitor lock | Entering synchronized block |
| WAITING | Waiting indefinitely for another thread | wait(), join(), LockSupport.park() |
| TIMED_WAITING | Waiting with a timeout | sleep(ms), wait(ms), join(ms) |
| TERMINATED | Execution complete | run() returns or throws exception |
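
The states in the table can be observed with Thread.getState(). The following is a minimal sketch, not a definitive test harness: the class name ThreadStateDemo and the use of a CountDownLatch to hold the worker in WAITING are illustrative assumptions.

```java
import java.util.concurrent.CountDownLatch;

class ThreadStateDemo {
    /** Returns the states observed before start(), while parked, and after join(). */
    static Thread.State[] observe() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                release.await();   // parks until countDown(): reported as WAITING
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread.State beforeStart = worker.getState();
        worker.start();
        while (worker.getState() != Thread.State.WAITING) {
            Thread.sleep(1);       // poll until the worker has parked
        }
        Thread.State whileParked = worker.getState();
        release.countDown();
        worker.join();
        Thread.State afterJoin = worker.getState();
        return new Thread.State[] { beforeStart, whileParked, afterJoin };
    }

    public static void main(String[] args) throws InterruptedException {
        for (Thread.State state : observe()) {
            System.out.println(state);
        }
    }
}
```

Running main should print NEW, WAITING, and TERMINATED in that order. A thread that is actually executing on a core is still reported as RUNNABLE.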

Essential Thread Methods

start() vs run()

Java
Thread t = new Thread(() -> System.out.println("Running in: " + Thread.currentThread().getName()));

t.start();  // Creates a NEW thread, executes run() in that thread
t.run();    // Does NOT create a new thread — runs in caller's thread

Calling start() twice on the same thread throws IllegalThreadStateException.
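
To make the difference concrete, the sketch below records which thread actually executes the task in each case. The class name StartVsRunDemo and the thread names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

class StartVsRunDemo {
    /** Returns {thread name seen via run(), thread name seen via start()}. */
    static String[] compare() throws InterruptedException {
        AtomicReference<String> seen = new AtomicReference<>();
        Runnable task = () -> seen.set(Thread.currentThread().getName());

        Thread direct = new Thread(task, "worker-run");
        direct.run();                 // ordinary method call: executes on the caller's thread
        String viaRun = seen.get();

        Thread started = new Thread(task, "worker-start");
        started.start();              // a new thread executes run()
        started.join();
        String viaStart = seen.get();

        return new String[] { viaRun, viaStart };
    }

    public static void main(String[] args) throws InterruptedException {
        String[] names = compare();
        System.out.println("run()   executed on: " + names[0]);
        System.out.println("start() executed on: " + names[1]);
    }
}
```

The first name is the caller's own thread (for example main), even though the Thread object was named worker-run; only start() produces worker-start.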

sleep() — Pause Execution

Java
try {
    Thread.sleep(2000);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();  // Restore interrupt flag
}
  • Does NOT release any locks held by the thread
  • Throws InterruptedException if another thread interrupts it

join() — Wait for Another Thread

Java
Thread worker = new Thread(() -> computeResult());
worker.start();
worker.join();  // Current thread waits until worker finishes

interrupt() — Cooperative Cancellation

Java
Thread worker = new Thread(() -> {
    while (!Thread.currentThread().isInterrupted()) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            break;
        }
    }
});
worker.start();
worker.interrupt();  // Request cancellation — sets flag
  • Does NOT forcefully stop a thread — it sets a flag
  • Blocking methods (sleep, wait, join) throw InterruptedException when interrupted

Daemon Threads

Java
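Thread daemon = new Thread(() -> {
    try {
        while (true) {
            cleanupExpiredSessions();
            Thread.sleep(60_000);
        }
    } catch (InterruptedException e) {
        // sleep() throws a checked exception, so a Runnable must handle it
        Thread.currentThread().interrupt();
    }
});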
daemon.setDaemon(true);  // Must be set BEFORE start()
daemon.start();
  • Daemon threads do NOT prevent JVM shutdown
  • When all non-daemon threads finish, JVM exits (killing daemons abruptly)
  • Examples: GC, signal handlers, monitoring threads

Java Memory Model (JMM)

The JMM defines how threads interact through memory and what guarantees the JVM provides about visibility and ordering of operations.

The Problem: Why We Need a Memory Model

%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '13px', 'fontFamily': 'Inter, -apple-system, sans-serif'}, 'flowchart': {'nodeSpacing': 30, 'rankSpacing': 50, 'padding': 12, 'curve': 'basis'}, 'sequence': {'actorMargin': 60, 'messageMargin': 40}, 'class': {'padding': 12}}}%%
flowchart LR
    subgraph "CPU 1"
        direction LR
        T1(("Thread 1"))
        C1[["L1/L2 Cache"]]
        T1 --> C1
    end
    subgraph "CPU 2"
        direction LR
        T2(("Thread 2"))
        C2[["L1/L2 Cache"]]
        T2 --> C2
    end
    C1 --> RAM(["Main Memory (RAM)"])
    C2 --> RAM

    style RAM fill:#FEF3C7,stroke:#FCD34D,color:#92400E

Each CPU core has its own caches and store buffers, and the JIT compiler may keep values in registers or reorder accesses. Without explicit synchronization, nothing guarantees when (or whether) Thread 1's writes become visible to Thread 2. The JMM specifies happens-before rules that guarantee visibility.

Happens-Before Relationships

If action A happens-before action B, then A's effects are guaranteed visible to B. Key rules:

| Rule | Guarantees |
| --- | --- |
| Program Order | Each action in a thread happens-before every subsequent action in that thread |
| Monitor Lock | An unlock on a monitor happens-before every subsequent lock on that monitor |
| Volatile Variable | A write to volatile happens-before every subsequent read of that volatile |
| Thread Start | thread.start() happens-before any action in the started thread |
| Thread Join | All actions in a thread happen-before join() returns |
| Transitivity | If A happens-before B, and B happens-before C, then A happens-before C |
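
The Thread Start and Thread Join rules are enough on their own to pass data safely through plain fields. A minimal sketch, with a hypothetical class name StartJoinVisibilityDemo:

```java
class StartJoinVisibilityDemo {
    // Plain (non-volatile) fields: visibility comes from start() and join() alone.
    private static int input;
    private static int output;

    static int compute(int value) throws InterruptedException {
        input = value;                     // (1) written before start()
        Thread worker = new Thread(() -> {
            output = input * 2;            // (2) sees (1) via the Thread Start rule
        });
        worker.start();
        worker.join();
        return output;                     // (3) sees (2) via the Thread Join rule
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(compute(21));
    }
}
```

This only works because the reads and writes are separated by start() and join(). If the main thread read output while the worker was still running, no happens-before edge would apply.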

Visibility Without Happens-Before

Java
// Thread 1                       // Thread 2
data = 42;                        while (!flag) { }  // May loop forever!
flag = true;                      print(data);       // May print 0!

Without volatile or synchronization, Thread 2 might never see flag = true (stale cache) and even if it does, data might still be 0 (reordering).
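
One way to repair this pattern is to declare the flag volatile and write the data before the flag. The sketch below assumes a hypothetical class VolatilePublicationDemo; the volatile write to flag happens-before the read that observes it, so the earlier plain write to data is visible as well.

```java
class VolatilePublicationDemo {
    private static int data;                 // plain field
    private static volatile boolean flag;    // volatile guard

    static int publishAndRead() throws InterruptedException {
        data = 0;
        flag = false;
        int[] observed = new int[1];

        Thread reader = new Thread(() -> {
            while (!flag) {                  // volatile read: will eventually see true
                Thread.onSpinWait();
            }
            observed[0] = data;              // guaranteed to see 42
        });
        reader.start();

        data = 42;                           // ordinary write, ordered before...
        flag = true;                         // ...the volatile write that publishes it
        reader.join();                       // join() makes observed[0] visible here
        return observed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(publishAndRead());
    }
}
```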

Instruction Reordering

The JVM and CPU reorder instructions for performance. The JMM permits reordering as long as single-threaded semantics are preserved — but this breaks multi-threaded expectations:

Java
// Source code             // After reordering (legal for single thread)
x = 1;                    y = 2;  // moved up!
y = 2;                    x = 1;

Memory barriers (fences) prevent reordering across them. synchronized, volatile, and java.util.concurrent classes insert appropriate barriers.

Memory Barrier Types

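| Barrier | Prevents |
| --- | --- |
| LoadLoad | Reordering of two reads |
| StoreStore | Reordering of two writes |
| LoadStore | A read being reordered after a write |
| StoreLoad | A write being reordered after a read (most expensive, a full fence) |

In the conservative strategy described in the JSR-133 Cookbook, a volatile write is preceded by a StoreStore barrier and followed by a StoreLoad barrier, and a volatile read is followed by LoadLoad + LoadStore barriers. Real JIT compilers emit fewer barriers where the hardware already provides the ordering.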


volatile Keyword — Deep Dive

volatile provides visibility and ordering guarantees but NOT atomicity.

What volatile Guarantees

Java
private volatile boolean shutdownRequested = false;

// Thread 1 (writer)
shutdownRequested = true;  // StoreStore barrier before the write, StoreLoad barrier after it

// Thread 2 (reader)
while (!shutdownRequested) {  // LoadLoad + LoadStore barriers after each read
    doWork();
}
// Guaranteed to see the write from Thread 1

What volatile Does NOT Guarantee

Java
private volatile int counter = 0;

// Thread 1                    // Thread 2
counter++;                     counter++;

// counter++ is: read → increment → write (3 operations)
// volatile makes each individual read/write visible, but the compound
// operation is NOT atomic. Final counter could be 1, not 2.
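
The lost-update effect is easy to reproduce. The sketch below (class name CounterDemo is hypothetical) increments a volatile int and an AtomicInteger side by side; only the atomic counter is guaranteed to reach the expected total, while the volatile result varies from run to run.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    private static volatile int volatileCounter;
    private static final AtomicInteger atomicCounter = new AtomicInteger();

    /** Returns {volatile counter, atomic counter} after all workers finish. */
    static int[] run(int threads, int perThread) throws InterruptedException {
        volatileCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    volatileCounter++;                // read-modify-write: updates can be lost
                    atomicCounter.incrementAndGet();  // single atomic operation
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return new int[] { volatileCounter, atomicCounter.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = run(4, 100_000);
        System.out.println("volatile: " + result[0] + ", atomic: " + result[1]);
    }
}
```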

When to Use volatile

| Use Case | volatile Works? |
| --- | --- |
| Simple boolean flag (one writer, many readers) | Yes |
| Status/state field read by multiple threads | Yes |
| Counter incremented by multiple threads | No, use AtomicInteger |
| Double-checked locking singleton | Yes (on the instance field) |
| Publishing an immutable object | Yes |

Double-Checked Locking (Correct Version)

Java
public class Singleton {
    private static volatile Singleton instance;  // volatile is critical here

    public static Singleton getInstance() {
        if (instance == null) {                   // first check (no lock)
            synchronized (Singleton.class) {
                if (instance == null) {           // second check (with lock)
                    instance = new Singleton();   // without volatile, partially
                }                                 // constructed object may be visible
            }
        }
        return instance;
    }
}

Without volatile, the write to instance can be reordered before the constructor completes — other threads see a non-null but partially constructed object.


Synchronization Internals

How Intrinsic Locks (Monitors) Work

%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '13px', 'fontFamily': 'Inter, -apple-system, sans-serif'}, 'flowchart': {'nodeSpacing': 30, 'rankSpacing': 50, 'padding': 12, 'curve': 'basis'}, 'sequence': {'actorMargin': 60, 'messageMargin': 40}, 'class': {'padding': 12}}}%%
flowchart LR
    T1(("Thread 1")) -->|tries to enter| SYNC{{"synchronized block"}}
    T2(("Thread 2")) -->|tries to enter| SYNC
    T3(("Thread 3")) -->|tries to enter| SYNC
    SYNC -->|acquires lock| OWNER(["Lock Owner: Thread 1"])
    T2 -->|BLOCKED| QUEUE[/"Entry Set / Wait Queue"/]
    T3 -->|BLOCKED| QUEUE
    OWNER -->|exits block| RELEASE(["Lock Released"])
    RELEASE -->|one thread unblocked| QUEUE

    style OWNER fill:#DBEAFE,stroke:#93C5FD,color:#1E40AF
    style QUEUE fill:#D1FAE5,stroke:#6EE7B7,color:#065F46
    style RELEASE fill:#FEF3C7,stroke:#FCD34D,color:#92400E
    style SYNC fill:#FEE2E2,stroke:#FCA5A5,color:#991B1B
    style T1 fill:#DBEAFE,stroke:#93C5FD,color:#1E40AF
    style T2 fill:#D1FAE5,stroke:#6EE7B7,color:#065F46
    style T3 fill:#FEF3C7,stroke:#FCD34D,color:#92400E
  • Every Java object has an intrinsic lock (monitor)
  • synchronized acquires the lock on entry, releases on exit (even if exception thrown)
  • Static synchronized methods lock on the Class object (ClassName.class)
  • Intrinsic locks are reentrant — a thread can re-acquire a lock it already holds
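
Two of these points, reentrancy and the Class-object lock used by static synchronized methods, can be checked with Thread.holdsLock. A small sketch with a hypothetical class ReentrancyDemo:

```java
class ReentrancyDemo {
    private int depth;

    synchronized int outer() {
        depth = 1;
        return inner();   // re-acquires the monitor this thread already holds
    }

    synchronized int inner() {
        // holdsLock confirms the current thread owns this object's monitor
        if (!Thread.holdsLock(this)) {
            throw new IllegalStateException("monitor not held");
        }
        return depth + 1;
    }

    static synchronized boolean staticMethodLocksClass() {
        // A static synchronized method locks the Class object, not an instance
        return Thread.holdsLock(ReentrancyDemo.class);
    }

    public static void main(String[] args) {
        System.out.println(new ReentrancyDemo().outer());
        System.out.println(staticMethodLocksClass());
    }
}
```

If intrinsic locks were not reentrant, the call from outer() to inner() would deadlock on the lock the thread already owns.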

Lock Optimization in the JVM (HotSpot)

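HotSpot escalates a lock through progressively more expensive states as contention grows. Note that biased locking was deprecated and disabled by default in JDK 15 (JEP 374), so on current JDKs an uncontended lock starts as a thin lock: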

%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '13px', 'fontFamily': 'Inter, -apple-system, sans-serif'}, 'flowchart': {'nodeSpacing': 30, 'rankSpacing': 50, 'padding': 12, 'curve': 'basis'}, 'sequence': {'actorMargin': 60, 'messageMargin': 40}, 'class': {'padding': 12}}}%%
flowchart LR
    B(["Biased Locking<br/>(single thread, no CAS)"]) -->|contention detected| T{{"Thin Lock<br/>(CAS on mark word)"}}
    T -->|spinning fails| F{{"Fat Lock<br/>(OS mutex, park thread)"}}

    style B fill:#DBEAFE,stroke:#93C5FD,color:#1E40AF
    style F fill:#D1FAE5,stroke:#6EE7B7,color:#065F46
    style T fill:#FEF3C7,stroke:#FCD34D,color:#92400E

| Level | Mechanism | When | Cost |
| --- | --- | --- | --- |
| Biased Lock (on by default only through JDK 14) | Mark word stores owner thread ID. No atomic ops on reentry. | Single thread accesses repeatedly | Near zero |
| Thin Lock (lightweight) | CAS on object's mark word | Low contention, short critical sections | CAS per lock/unlock |
| Fat Lock (heavyweight) | OS-level mutex (pthread_mutex) + thread parking | High contention, long critical sections | Context switch |
| Lock Coarsening (JIT optimization) | JVM merges adjacent synchronized blocks on same lock | Detected pattern | Reduces lock/unlock overhead |
| Lock Elision (JIT optimization) | JVM removes lock entirely via escape analysis | Object doesn't escape thread | Zero |

Block-Level Synchronization

Java
public class BankAccount {
    private double balance;
    private final Object lock = new Object();

    public void deposit(double amount) {
        synchronized (lock) {
            balance += amount;
        }
        notifyObservers();  // outside lock — better throughput
    }
}

wait(), notify(), notifyAll()

Rules

  • Must be called from within a synchronized block (thread must hold the monitor)
  • wait() releases the lock and enters WAITING state
  • notify() wakes one waiting thread; notifyAll() wakes all
  • Waiting thread must re-acquire the lock before resuming

Producer-Consumer Pattern

Java
public class BoundedBuffer<T> {
    private final Queue<T> queue = new LinkedList<>();
    private final int capacity;

    public BoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    public synchronized void produce(T item) throws InterruptedException {
        while (queue.size() == capacity) {
            wait();  // Buffer full — release lock and wait
        }
        queue.add(item);
        notifyAll();  // Wake up consumers
    }

    public synchronized T consume() throws InterruptedException {
        while (queue.isEmpty()) {
            wait();  // Buffer empty — release lock and wait
        }
        T item = queue.poll();
        notifyAll();  // Wake up producers
        return item;
    }
}

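Why while and not if? Spurious wakeups can occur: the Java Language Specification and the Object.wait() Javadoc permit a thread to wake from wait() without being notified. In addition, after notifyAll() another thread may change the condition before this one re-acquires the lock. Always re-check the condition in a loop.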

wait() vs sleep()

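| Aspect | wait() | sleep() |
| --- | --- | --- |
| Called on | Object (monitor) | Thread (static method) |
| Releases lock | Yes | No |
| Wakeup | notify() / notifyAll(), timeout for wait(ms), or interrupt | Timeout expires, or interrupt |
| Must be in synchronized | Yes | No |
| Purpose | Inter-thread communication | Pause execution |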

ReentrantLock — Beyond synchronized

ReentrantLock provides the same mutual exclusion as synchronized but with additional capabilities.

Key Advantages Over synchronized

| Feature | synchronized | ReentrantLock |
| --- | --- | --- |
| Try without blocking | No | tryLock() |
| Timeout on lock attempt | No | tryLock(timeout, unit) |
| Interruptible waiting | No | lockInterruptibly() |
| Fairness policy | No (always unfair) | new ReentrantLock(true) |
| Multiple conditions | One wait set per monitor | Multiple Condition objects |
| Non-block-structured | No | Yes (lock/unlock in different methods) |
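
The first two rows of the table can be demonstrated in a few lines. In this sketch (class name TryLockDemo and the 50 ms timeout are illustrative assumptions), a second thread attempts the lock while the main thread holds it, so neither attempt can block forever.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TryLockDemo {
    /** Returns {tryLock() while held, timed tryLock while held, tryLock() after release}. */
    static boolean[] run() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        boolean[] results = new boolean[3];

        lock.lock();                          // main thread holds the lock
        try {
            Thread contender = new Thread(() -> {
                results[0] = lock.tryLock();  // returns false immediately
                try {
                    // gives up after 50 ms instead of blocking forever
                    results[1] = lock.tryLock(50, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            contender.start();
            contender.join();
        } finally {
            lock.unlock();
        }

        results[2] = lock.tryLock();          // lock is free now
        if (results[2]) {
            lock.unlock();
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```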

Usage Pattern

Java
private final ReentrantLock lock = new ReentrantLock();

public void transfer(Account from, Account to, double amount) {
    lock.lock();
    try {
        from.debit(amount);
        to.credit(amount);
    } finally {
        lock.unlock();  // ALWAYS in finally — prevents lock leak on exception
    }
}

tryLock — Deadlock Avoidance

Java
public boolean transferWithTimeout(Account from, Account to, double amount) 
        throws InterruptedException {
    if (from.lock.tryLock(1, TimeUnit.SECONDS)) {
        try {
            if (to.lock.tryLock(1, TimeUnit.SECONDS)) {
                try {
                    from.debit(amount);
                    to.credit(amount);
                    return true;
                } finally {
                    to.lock.unlock();
                }
            }
        } finally {
            from.lock.unlock();
        }
    }
    return false;  // Could not acquire both locks — caller retries
}

Condition Variables — Multiple Wait Sets

Java
private final ReentrantLock lock = new ReentrantLock();
private final Condition notFull = lock.newCondition();
private final Condition notEmpty = lock.newCondition();

public void produce(T item) throws InterruptedException {
    lock.lock();
    try {
        while (count == capacity) notFull.await();
        enqueue(item);
        notEmpty.signal();  // Only wake consumers, not other producers
    } finally {
        lock.unlock();
    }
}

public T consume() throws InterruptedException {
    lock.lock();
    try {
        while (count == 0) notEmpty.await();
        T item = dequeue();
        notFull.signal();  // Only wake producers
        return item;
    } finally {
        lock.unlock();
    }
}
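
The snippet above leaves count, capacity, enqueue, and dequeue undefined. One possible self-contained version is sketched below; the class name ConditionBuffer, the ArrayDeque backing store, and the demo method are assumptions made for illustration rather than the original author's implementation.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class ConditionBuffer<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();
    private final Condition notEmpty = lock.newCondition();
    private final Queue<T> items = new ArrayDeque<>();
    private final int capacity;

    ConditionBuffer(int capacity) {
        this.capacity = capacity;
    }

    void produce(T item) throws InterruptedException {
        lock.lock();
        try {
            while (items.size() == capacity) {
                notFull.await();      // releases the lock while waiting
            }
            items.add(item);
            notEmpty.signal();        // wake one consumer only
        } finally {
            lock.unlock();
        }
    }

    T consume() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) {
                notEmpty.await();
            }
            T item = items.remove();
            notFull.signal();         // wake one producer only
            return item;
        } finally {
            lock.unlock();
        }
    }

    /** Pushes 1..n through a buffer of capacity 2 and returns the sum the consumer saw. */
    static long demo(int n) throws InterruptedException {
        ConditionBuffer<Integer> buffer = new ConditionBuffer<>(2);
        long[] sum = new long[1];
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < n; i++) {
                    sum[0] += buffer.consume();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        for (int i = 1; i <= n; i++) {
            buffer.produce(i);
        }
        consumer.join();
        return sum[0];
    }
}
```

Because producers and consumers wait on separate Condition objects, signal() is sufficient here; the synchronized version needs notifyAll() since both kinds of thread share a single wait set.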

ReadWriteLock — Reader/Writer Optimization

Java
private final ReadWriteLock rwLock = new ReentrantReadWriteLock();
private final Map<String, Object> cache = new HashMap<>();

public Object get(String key) {
    rwLock.readLock().lock();  // Multiple readers can hold simultaneously
    try {
        return cache.get(key);
    } finally {
        rwLock.readLock().unlock();
    }
}

public void put(String key, Object value) {
    rwLock.writeLock().lock();  // Exclusive — blocks all readers and writers
    try {
        cache.put(key, value);
    } finally {
        rwLock.writeLock().unlock();
    }
}

StampedLock (Java 8) — Optimistic Reads

Java
private final StampedLock sl = new StampedLock();
private double x, y;

public double distanceFromOrigin() {
    long stamp = sl.tryOptimisticRead();  // No lock acquired — just a stamp
    double currentX = x, currentY = y;
    if (!sl.validate(stamp)) {  // Check if a write occurred meanwhile
        stamp = sl.readLock();  // Fall back to pessimistic read
        try {
            currentX = x;
            currentY = y;
        } finally {
            sl.unlockRead(stamp);
        }
    }
    return Math.sqrt(currentX * currentX + currentY * currentY);
}

StampedLock is NOT reentrant — do not use in recursive code.


Atomic Variables & CAS

Compare-And-Swap (CAS) — The Foundation

CAS is a CPU-level atomic instruction: "If the value at address X is currently V, set it to N. Return whether it succeeded."

%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '13px', 'fontFamily': 'Inter, -apple-system, sans-serif'}, 'flowchart': {'nodeSpacing': 30, 'rankSpacing': 50, 'padding': 12, 'curve': 'basis'}, 'sequence': {'actorMargin': 60, 'messageMargin': 40}, 'class': {'padding': 12}}}%%
flowchart LR
    READ[/"Read current value (expected = 5)"/] --> COMPUTE{{"Compute new value (6)"}}
    COMPUTE --> CAS{"CAS(expected=5, new=6)"}
    CAS -->|Success: was 5, now 6| DONE(["Operation complete"])
    CAS -->|Failure: value changed| READ

    style CAS fill:#D1FAE5,stroke:#6EE7B7,color:#065F46
    style COMPUTE fill:#FEF3C7,stroke:#FCD34D,color:#92400E
    style DONE fill:#FEE2E2,stroke:#FCA5A5,color:#991B1B
    style READ fill:#DBEAFE,stroke:#93C5FD,color:#1E40AF

AtomicInteger Internals

Java
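// Conceptually, incrementAndGet() is a CAS retry loop. This is the pre-Java 8 form;
// current JDKs delegate to an intrinsic atomic add (for example LOCK XADD on x86).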
public final int incrementAndGet() {
    int prev, next;
    do {
        prev = get();              // volatile read
        next = prev + 1;
    } while (!compareAndSet(prev, next));  // CAS retry loop
    return next;
}
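
The same read, compute, compareAndSet loop works for any update that is not built in. As a sketch, the hypothetical CasMaxDemo below maintains a running maximum without locks (on Java 8+ AtomicInteger.accumulateAndGet offers this directly; the explicit loop is shown only to illustrate the retry pattern).

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasMaxDemo {
    /** Raises target to candidate if candidate is larger; returns the resulting maximum. */
    static int updateMax(AtomicInteger target, int candidate) {
        while (true) {
            int current = target.get();               // volatile read
            if (candidate <= current) {
                return current;                       // nothing to do
            }
            if (target.compareAndSet(current, candidate)) {
                return candidate;                     // CAS succeeded
            }
            // CAS failed: another thread changed the value, so re-read and retry
        }
    }

    /** Each of `threads` workers offers a distinct range of values; returns the final maximum. */
    static int maxOf(int threads, int perThread) throws InterruptedException {
        AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    updateMax(max, base + j);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return max.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(maxOf(4, 1000));
    }
}
```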

Atomic Classes

| Class | Use Case |
| --- | --- |
| AtomicInteger / AtomicLong | Lock-free counters |
| AtomicBoolean | Lock-free flags |
| AtomicReference<V> | Lock-free object reference updates |
| AtomicStampedReference<V> | Solves ABA problem (tracks version stamp) |
| AtomicIntegerArray | Lock-free array element updates |
| LongAdder (Java 8) | High-contention counters (striped cells) |
| LongAccumulator | Custom accumulation function |

LongAdder vs AtomicLong — High Contention

Java
// AtomicLong: all threads CAS on single variable — high contention, many retries
private final AtomicLong counter = new AtomicLong();

// LongAdder: internally striped across cells — threads update different cells
// sum() aggregates all cells (slightly higher read cost, much lower write contention)
private final LongAdder counter = new LongAdder();
counter.increment();      // write — distributed across cells
counter.sum();            // read — aggregates (eventual consistency for reads)

Use LongAdder when updates are far more frequent than reads (metrics, counters). Use AtomicLong when you need precise real-time reads.
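
A minimal sketch of the metrics-counter case, assuming a hypothetical class LongAdderDemo. The total is exact here because sum() is only called after every writer has been joined; a sum() taken while updates are in flight may miss some of them.

```java
import java.util.concurrent.atomic.LongAdder;

class LongAdderDemo {
    static long count(int threads, int perThread) throws InterruptedException {
        LongAdder hits = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    hits.increment();   // contended updates spread across internal cells
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return hits.sum();              // aggregates all cells
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(count(8, 100_000));
    }
}
```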

The ABA Problem

Text Only
Thread 1: reads value A
Thread 2: changes A → B → A
Thread 1: CAS succeeds (value is still A) — but state has changed!

Solution: AtomicStampedReference pairs value with a version stamp — CAS checks both.

Java
AtomicStampedReference<Node> head = new AtomicStampedReference<>(node, 0);

int[] stampHolder = new int[1];
Node current = head.get(stampHolder);
int stamp = stampHolder[0];

head.compareAndSet(current, newNode, stamp, stamp + 1);
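
The difference can be shown deterministically by replaying the A → B → A interleaving on a single thread. In this sketch (class name AbaDemo is hypothetical) the plain AtomicReference CAS succeeds, while the stamped CAS fails because the stamp has moved from 0 to 2.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

class AbaDemo {
    /** Returns {plain CAS succeeded, stamped CAS succeeded} after an A -> B -> A change. */
    static boolean[] run() {
        String a = "A", b = "B", c = "C";

        // Plain reference: the intermediate change goes unnoticed.
        AtomicReference<String> plain = new AtomicReference<>(a);
        String seen = plain.get();                    // "Thread 1" reads A
        plain.set(b);                                 // "Thread 2": A -> B
        plain.set(a);                                 //             B -> A
        boolean plainCas = plain.compareAndSet(seen, c);

        // Stamped reference: every change bumps the stamp.
        AtomicStampedReference<String> stamped = new AtomicStampedReference<>(a, 0);
        int[] stampHolder = new int[1];
        String seenRef = stamped.get(stampHolder);    // reads A with stamp 0
        int seenStamp = stampHolder[0];
        stamped.compareAndSet(a, b, 0, 1);            // A -> B, stamp 1
        stamped.compareAndSet(b, a, 1, 2);            // B -> A, stamp 2
        boolean stampedCas =
            stamped.compareAndSet(seenRef, c, seenStamp, seenStamp + 1);

        return new boolean[] { plainCas, stampedCas };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("plain CAS: " + r[0] + ", stamped CAS: " + r[1]);
    }
}
```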

Thread Pools and ExecutorService

Thread Pool Architecture

%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '13px', 'fontFamily': 'Inter, -apple-system, sans-serif'}, 'flowchart': {'nodeSpacing': 30, 'rankSpacing': 50, 'padding': 12, 'curve': 'basis'}, 'sequence': {'actorMargin': 60, 'messageMargin': 40}, 'class': {'padding': 12}}}%%
flowchart LR
    TASKS{{"Task Queue<br/>BlockingQueue"}} --> T1(["Worker Thread 1"])
    TASKS --> T2(["Worker Thread 2"])
    TASKS --> T3(["Worker Thread 3"])
    TASKS --> TN(["Worker Thread N"])
    CLIENT1(("Client")) -->|submit task| TASKS
    CLIENT2(("Client")) -->|submit task| TASKS
    CLIENT3(("Client")) -->|submit task| TASKS

    style T1 fill:#DBEAFE,stroke:#93C5FD,color:#1E40AF
    style T2 fill:#D1FAE5,stroke:#6EE7B7,color:#065F46
    style T3 fill:#FEF3C7,stroke:#FCD34D,color:#92400E
    style CLIENT1 fill:#FEE2E2,stroke:#FCA5A5,color:#991B1B
    style CLIENT2 fill:#DBEAFE,stroke:#93C5FD,color:#1E40AF
    style CLIENT3 fill:#D1FAE5,stroke:#6EE7B7,color:#065F46
    style TN fill:#FEF3C7,stroke:#FCD34D,color:#92400E
    style TASKS fill:#FEE2E2,stroke:#FCA5A5,color:#991B1B

Types of Thread Pools

| Thread Pool | Core | Max | Queue | Best For |
| --- | --- | --- | --- | --- |
| FixedThreadPool | N | N | Unbounded LinkedBlockingQueue | CPU-bound with known load |
| CachedThreadPool | 0 | Integer.MAX_VALUE | SynchronousQueue | Many short-lived I/O tasks |
| ScheduledThreadPool | N | Integer.MAX_VALUE | DelayedWorkQueue | Periodic/delayed tasks |
| SingleThreadExecutor | 1 | 1 | Unbounded LinkedBlockingQueue | Sequential ordering |
| ForkJoinPool | N (processors) | N | Per-thread work queues | Recursive divide-and-conquer |

ThreadPoolExecutor — Full Control

Java
ThreadPoolExecutor executor = new ThreadPoolExecutor(
    4,                              // corePoolSize
    16,                             // maximumPoolSize
    60L, TimeUnit.SECONDS,          // keepAliveTime for idle threads > core
    new ArrayBlockingQueue<>(1000), // bounded queue — backpressure!
    new ThreadFactory() {           // custom thread naming
        private final AtomicInteger count = new AtomicInteger();
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r, "order-processor-" + count.incrementAndGet());
            t.setDaemon(false);
            return t;
        }
    },
    new ThreadPoolExecutor.CallerRunsPolicy()  // rejection policy
);

Rejection Policies (When Queue is Full)

| Policy | Behavior | Use Case |
| --- | --- | --- |
| AbortPolicy (default) | Throws RejectedExecutionException | Fail-fast systems |
| CallerRunsPolicy | Caller thread executes the task | Backpressure (slows producer) |
| DiscardPolicy | Silently drops task | Fire-and-forget metrics |
| DiscardOldestPolicy | Drops oldest queued task, retries | Latest-value-wins scenarios |
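
A policy only fires once the queue is full and the pool is already at maximumPoolSize (or the executor has been shut down). The sketch below forces that state with a one-thread pool and a one-slot queue; the class RejectionDemo and its helper methods are hypothetical names chosen for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class RejectionDemo {
    /** Builds a 1-thread pool with a 1-slot queue and fills both with tasks blocked on `release`. */
    private static ThreadPoolExecutor saturatedPool(RejectedExecutionHandler handler,
                                                    CountDownLatch release) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1), handler);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        pool.execute(blocker);   // handed straight to the single worker
        pool.execute(blocker);   // waits in the queue
        return pool;
    }

    private static void drain(ThreadPoolExecutor pool, CountDownLatch release)
            throws InterruptedException {
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }

    /** Returns true if a third task is rejected with RejectedExecutionException. */
    static boolean abortPolicyRejects() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = saturatedPool(new ThreadPoolExecutor.AbortPolicy(), release);
        try {
            pool.execute(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        } finally {
            drain(pool, release);
        }
    }

    /** Returns the name of the thread that ran the overflow task under CallerRunsPolicy. */
    static String callerRunsThread() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = saturatedPool(new ThreadPoolExecutor.CallerRunsPolicy(), release);
        AtomicReference<String> ranOn = new AtomicReference<>();
        try {
            pool.execute(() -> ranOn.set(Thread.currentThread().getName()));
            return ranOn.get();
        } finally {
            drain(pool, release);
        }
    }
}
```

With CallerRunsPolicy the submitting thread is busy running the task itself, which is what throttles the producer.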

Ideal Pool Size

  • CPU-bound: threads = CPU cores (more = context-switching waste)
  • I/O-bound: threads = cores × (1 + waitTime / computeTime) — typically 2x-10x cores
  • Mixed: Separate pools for CPU-bound and I/O-bound work
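
The I/O-bound formula is simple to apply once wait and compute times have been measured. A small sketch, where the class PoolSizing and the 90 ms / 10 ms figures are made-up illustration values rather than recommendations:

```java
class PoolSizing {
    /** threads = cores * (1 + waitTime / computeTime), rounded up. */
    static int ioBoundThreads(int cores, double waitMillis, double computeMillis) {
        return (int) Math.ceil(cores * (1.0 + waitMillis / computeMillis));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("CPU-bound pool size: " + cores);
        // A task that waits 90 ms on I/O for every 10 ms of computation
        System.out.println("I/O-bound pool size: " + ioBoundThreads(cores, 90, 10));
    }
}
```

For 8 cores and a 90:10 wait-to-compute ratio this gives 8 × (1 + 9) = 80 threads. Treat the result as a starting point to validate under load.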

ForkJoinPool — Work-Stealing

Java
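public class MergeSortTask extends RecursiveAction {
    private static final int THRESHOLD = 1024;
    private final int[] array;
    private final int left, right;  // sorts the half-open range [left, right)

    public MergeSortTask(int[] array, int left, int right) {
        this.array = array;
        this.left = left;
        this.right = right;
    }

    @Override
    protected void compute() {
        if (right - left < THRESHOLD) {
            Arrays.sort(array, left, right);  // small range: sort sequentially
            return;
        }
        int mid = (left + right) >>> 1;       // overflow-safe midpoint
        invokeAll(                            // fork both halves and wait for both
            new MergeSortTask(array, left, mid),
            new MergeSortTask(array, mid, right)
        );
        merge(array, left, mid, right);       // merge helper (not shown) combines the sorted halves
    }
}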

ForkJoinPool pool = new ForkJoinPool();
pool.invoke(new MergeSortTask(array, 0, array.length));

Work-stealing: idle threads steal tasks from busy threads' queues — improves utilization for unbalanced workloads.
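
For tasks that return a value, RecursiveTask follows the same divide-and-conquer shape. The sketch below sums an array; the class SumTask, its threshold of 1,000 elements, and the use of the common pool are assumptions chosen for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] values;
    private final int from, to;   // half-open range [from, to)

    SumTask(long[] values, int from, int to) {
        this.values = values;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += values[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(values, from, mid);
        SumTask right = new SumTask(values, mid, to);
        left.fork();                       // queue the left half; an idle worker may steal it
        long rightSum = right.compute();   // process the right half on this thread
        return left.join() + rightSum;
    }

    static long sum(long[] values) {
        return ForkJoinPool.commonPool().invoke(new SumTask(values, 0, values.length));
    }

    /** Sums 1..n using the fork/join task. */
    static long sumOneTo(int n) {
        long[] values = new long[n];
        for (int i = 0; i < n; i++) {
            values[i] = i + 1;
        }
        return sum(values);
    }
}
```

Forking one half and computing the other inline keeps the current worker busy instead of leaving it blocked in join().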


java.util.concurrent Synchronizers

CountDownLatch — Wait for N Events

Java
int serviceCount = 5;
CountDownLatch latch = new CountDownLatch(serviceCount);

for (int i = 0; i < serviceCount; i++) {
    executor.submit(() -> {
        try {
            initializeService();
        } finally {
            latch.countDown();  // Decrement count
        }
    });
}

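// await() returns false if the timeout elapses before the count reaches 0
if (!latch.await(30, TimeUnit.SECONDS)) {
    throw new IllegalStateException("Services did not initialize within 30 seconds");
}
// All services initialized: start accepting requests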

One-shot only — cannot be reset.

CyclicBarrier — Reusable Rendezvous Point

Java
int threadCount = 4;
CyclicBarrier barrier = new CyclicBarrier(threadCount, () -> {
    mergePartialResults();  // Runs when all threads arrive
});

for (int i = 0; i < threadCount; i++) {
    final int partition = i;
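    executor.submit(() -> {
        try {
            while (hasMoreData()) {
                processPartition(partition);
                barrier.await();  // Wait for all threads to finish this phase
                // barrier resets and is ready for the next phase
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (BrokenBarrierException e) {
            // another party was interrupted or timed out; abandon this worker
        }
    });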
}

| Aspect | CountDownLatch | CyclicBarrier |
| --- | --- | --- |
| Reusable | No (one-shot) | Yes (resets after each trip) |
| Wait condition | Count reaches 0 | All parties arrive |
| Threads that wait | Any thread calls await() | Participating threads only |
| Use case | Wait for external events | Multi-phase parallel computation |
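
The "resets after each trip" row can be checked by counting how often the barrier action runs. A minimal sketch with a hypothetical class BarrierReuseDemo:

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class BarrierReuseDemo {
    /** Runs `parties` threads through `phases` phases; returns how often the barrier tripped. */
    static int run(int parties, int phases) throws InterruptedException {
        AtomicInteger trips = new AtomicInteger();
        // The barrier action runs once per trip, after the last party arrives.
        CyclicBarrier barrier = new CyclicBarrier(parties, trips::incrementAndGet);

        Thread[] workers = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            workers[i] = new Thread(() -> {
                try {
                    for (int phase = 0; phase < phases; phase++) {
                        barrier.await();   // same barrier object reused every phase
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (BrokenBarrierException e) {
                    // another party failed; stop participating
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return trips.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(3, 4));
    }
}
```

A CountDownLatch used the same way would release everyone after the first phase and never block again.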

Semaphore — Resource Limiting

Java
// Connection pool with max 10 concurrent connections
Semaphore semaphore = new Semaphore(10, true);  // fair = true

public Connection getConnection() throws InterruptedException {
    semaphore.acquire();  // blocks if 10 connections already in use
    try {
        return pool.borrowConnection();
    } catch (Exception e) {
        semaphore.release();
        throw e;
    }
}

public void releaseConnection(Connection conn) {
    pool.returnConnection(conn);
    semaphore.release();  // permit becomes available
}
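
To see the limit being enforced, the sketch below measures the peak number of threads inside the guarded section. The class SemaphoreLimitDemo, the 10 ms simulated work, and the thread counts are illustrative assumptions.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class SemaphoreLimitDemo {
    /** Returns the highest number of threads observed inside the guarded section at once. */
    static int maxConcurrent(int permits, int threads) throws InterruptedException {
        Semaphore semaphore = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    semaphore.acquire();               // blocks when no permit is free
                    try {
                        int now = inside.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(10);              // simulate using the resource
                    } finally {
                        inside.decrementAndGet();
                        semaphore.release();           // always return the permit
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak concurrency: " + maxConcurrent(3, 12));
    }
}
```

The reported peak never exceeds the number of permits, however many threads are started.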

Phaser — Flexible Multi-Phase Synchronization

Java
Phaser phaser = new Phaser(1);  // register self

for (int i = 0; i < workerCount; i++) {
    phaser.register();  // dynamic registration
    executor.submit(() -> {
        // Phase 0: load data
        loadData();
        phaser.arriveAndAwaitAdvance();

        // Phase 1: process
        process();
        phaser.arriveAndAwaitAdvance();

        // Phase 2: write results
        writeResults();
        phaser.arriveAndDeregister();  // done — leave the phaser
    });
}

phaser.arriveAndDeregister();  // deregister self

Phaser supports dynamic party count (register/deregister at any time) — CountDownLatch and CyclicBarrier do not.

Exchanger — Two-Thread Rendezvous

Java
Exchanger<List<Order>> exchanger = new Exchanger<>();

// Producer thread
List<Order> buffer = new ArrayList<>();
while (running) {
    buffer.add(fetchOrder());
    if (buffer.size() >= BATCH_SIZE) {
        buffer = exchanger.exchange(buffer);  // swap full buffer for empty one
        buffer.clear();
    }
}

// Consumer thread
List<Order> buffer = new ArrayList<>();
while (running) {
    buffer = exchanger.exchange(buffer);  // swap empty buffer for full one
    processBatch(buffer);
}

Concurrent Collections

ConcurrentHashMap Internals

%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '13px', 'fontFamily': 'Inter, -apple-system, sans-serif'}, 'flowchart': {'nodeSpacing': 30, 'rankSpacing': 50, 'padding': 12, 'curve': 'basis'}, 'sequence': {'actorMargin': 60, 'messageMargin': 40}, 'class': {'padding': 12}}}%%
flowchart LR
    CHM{{"ConcurrentHashMap"}}
    CHM --> S0[["Segment 0<br/>(own lock)"]]
    CHM --> S1[["Segment 1<br/>(own lock)"]]
    CHM --> S2[["Segment 2<br/>(own lock)"]]
    CHM --> SN[["Segment N<br/>(own lock)"]]
    S0 --> B0(["Bucket 0 → Node → Node"])
    S1 --> B1(["Bucket 1 → Node → Node"])

    style CHM fill:#DBEAFE,stroke:#93C5FD,color:#1E40AF
    style S0 fill:#D1FAE5,stroke:#6EE7B7,color:#065F46
    style S1 fill:#FEF3C7,stroke:#FCD34D,color:#92400E

The diagram above shows the Java 7 and earlier design. Java 8+ no longer uses segments: it uses CAS to insert into an empty bucket and synchronized on the head node of a non-empty bucket. Reads are lock-free (volatile reads of nodes).
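
Per-bucket locking makes single operations safe, but a separate get followed by put is still a race. Atomic compound methods such as merge avoid it. A sketch, using a hypothetical class WordCountDemo:

```java
import java.util.concurrent.ConcurrentHashMap;

class WordCountDemo {
    /** Every thread counts every word once; returns the shared totals. */
    static ConcurrentHashMap<String, Integer> count(String[] words, int threads)
            throws InterruptedException {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (String word : words) {
                    // Atomic read-modify-write on the entry: no external lock needed
                    counts.merge(word, 1, Integer::sum);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(count(new String[] { "a", "b", "a" }, 4));
    }
}
```

With four threads each counting "a" twice and "b" once, the totals are a=8 and b=4.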

| Collection | Thread Safety | Performance Profile |
|---|---|---|
| ConcurrentHashMap | Lock striping (per-bucket) | High concurrent throughput for reads + writes |
| CopyOnWriteArrayList | Copy entire array on write | Excellent reads, expensive writes |
| ConcurrentLinkedQueue | Lock-free (CAS) | Non-blocking FIFO |
| BlockingQueue (various) | Lock-based | Producer-consumer with backpressure |
| ConcurrentSkipListMap | Lock-free | Sorted concurrent map (O(log n)) |

BlockingQueue Implementations

| Implementation | Bound | Ordering | Use Case |
|---|---|---|---|
| ArrayBlockingQueue | Bounded | FIFO | Fixed-size producer-consumer |
| LinkedBlockingQueue | Optional bound | FIFO | General purpose (unbounded by default) |
| PriorityBlockingQueue | Unbounded | Priority | Task scheduling by priority |
| SynchronousQueue | Zero capacity | Direct handoff | Thread-to-thread handoff (CachedThreadPool) |
| DelayQueue | Unbounded | By delay expiry | Scheduled tasks, TTL-based expiry |
| LinkedTransferQueue | Unbounded | FIFO | High-performance async messaging |
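
As a minimal sketch of the bounded case, the example below pushes integers through an ArrayBlockingQueue. The class name BoundedPipeline and the POISON sentinel are assumptions made for illustration; put() blocks the producer whenever the queue is full, which is the backpressure mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical example class: one producer, one consumer, one bounded queue
class BoundedPipeline {
    private static final int POISON = -1;  // sentinel that tells the consumer to stop

    // Sends 0..items-1 through a queue of the given capacity; returns what the consumer received
    static List<Integer> run(int items, int capacity) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        List<Integer> received = new ArrayList<>();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) queue.put(i);  // blocks while the queue is full
                queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                // take() blocks while the queue is empty
                for (int v = queue.take(); v != POISON; v = queue.take()) received.add(v);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;  // safe to read: join() happens-before this point
    }
}
```

A small capacity forces the producer to wait for the consumer, so memory use stays bounded no matter how fast items are produced.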

CopyOnWriteArrayList — When to Use

Java
// Excellent for: listeners, event handlers, configuration (read >> write)
private final CopyOnWriteArrayList<EventListener> listeners = new CopyOnWriteArrayList<>();

public void addListener(EventListener l) { listeners.add(l); }  // copies entire array

public void fireEvent(Event e) {
    for (EventListener l : listeners) {  // no lock needed — iterates snapshot
        l.onEvent(e);
    }
}

Trade-off: O(n) writes (copy array), O(1) lock-free reads. Use only when reads vastly outnumber writes.
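
The snapshot behaviour can be seen in a small, single-threaded sketch (SnapshotIteration is a hypothetical name). The loop adds to the list it is iterating; an ArrayList would throw ConcurrentModificationException here, while CopyOnWriteArrayList keeps iterating the array it started with.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical example class
class SnapshotIteration {
    // Adds to the list while iterating it; returns {elements visited, final list size}
    static int[] visitWhileAdding() {
        List<String> listeners = new CopyOnWriteArrayList<>(List.of("a", "b"));
        List<String> visited = new ArrayList<>();
        for (String l : listeners) {      // the iterator is bound to the array as it is right now
            visited.add(l);
            listeners.add(l + "-late");   // copies the array; the running iterator never sees this
        }
        return new int[] { visited.size(), listeners.size() };
    }
}
```

The loop visits only the two original elements, yet the list ends up with four. Each add() paid for a full array copy, which is why this collection suits rarely modified listener lists.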


CompletableFuture — Async Composition

Creation Patterns

Java
// Run async, no return value
CompletableFuture<Void> emailFuture = CompletableFuture.runAsync(() -> sendEmail());

// Run async, return value (uses ForkJoinPool.commonPool() by default)
CompletableFuture<User> userFuture = CompletableFuture.supplyAsync(() -> fetchUser(id));

// With custom executor
CompletableFuture<User> userOnIoPool = CompletableFuture.supplyAsync(
    () -> fetchUser(id), ioExecutor
);

Chaining and Composition

Java
CompletableFuture<OrderConfirmation> pipeline = 
    CompletableFuture.supplyAsync(() -> validateOrder(order))
        .thenApply(valid -> enrichOrder(valid))          // sync transform
        .thenCompose(enriched -> chargePayment(enriched)) // async flatMap
        .thenApply(charged -> createConfirmation(charged))
        .exceptionally(ex -> handleFailure(ex));
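
| Method | Input | Output | Notes |
|---|---|---|---|
| thenApply(fn) | T → U | CF<U> | Sync transform (map) |
| thenCompose(fn) | T → CF<U> | CF<U> | Chains a dependent async step (flatMap) |
| thenCombine(other, fn) | (T, U) → V | CF<V> | Combine two futures |
| thenAccept(consumer) | T → void | CF<Void> | Consume the result |
| thenRun(runnable) | () → void | CF<Void> | Run after completion, ignores the result |
| exceptionally(fn) | Throwable → T | CF<T> | Recovery |
| handle(fn) | (T, Throwable) → U | CF<U> | Both success/failure |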

Combining Multiple Futures

Java
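// Wait for ALL to complete. Each argument must already be a CompletableFuture;
// the *Async helpers are assumed to return one.
CompletableFuture<Void> allOf = CompletableFuture.allOf(
    fetchUserAsync(id),
    fetchOrdersAsync(id),
    fetchRecommendationsAsync(id)
);
// allOf carries no values: read the results from the individual futures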

// Wait for FIRST to complete (racing)
CompletableFuture<Object> anyOf = CompletableFuture.anyOf(
    callServiceA(),
    callServiceB()  // hedged request
);
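
Since allOf returns CompletableFuture<Void>, collecting the individual results takes one more step. A common idiom is sketched below; AllOfResults, allAsList and demo are hypothetical helper names, not JDK API.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Hypothetical helper class
class AllOfResults {
    // Turns a list of futures into one future holding the list of results (input order preserved)
    static <T> CompletableFuture<List<T>> allAsList(List<CompletableFuture<T>> futures) {
        return CompletableFuture
            .allOf(futures.toArray(new CompletableFuture<?>[0]))
            // every future is complete once allOf completes, so join() below does not block
            .thenApply(ignored -> futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList()));
    }

    static String demo() {
        List<CompletableFuture<Integer>> futures = List.of(
            CompletableFuture.supplyAsync(() -> 1),
            CompletableFuture.supplyAsync(() -> 2),
            CompletableFuture.completedFuture(3));
        return allAsList(futures).join().toString();
    }
}
```

If any input future fails, the combined future completes exceptionally as well, so one exceptionally or handle stage on the result covers all inputs.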

Production Pattern: Timeout + Fallback

Java
CompletableFuture<Price> price = CompletableFuture
    .supplyAsync(() -> callPricingService(item))
    .orTimeout(2, TimeUnit.SECONDS)                     // Java 9+: fails with TimeoutException after 2s
    .exceptionally(ex -> getCachedPrice(item));          // fallback for a timeout or any other failure

Production Pattern: Parallel Fan-Out

Java
public UserProfile buildProfile(String userId) {
    var userFuture = CompletableFuture.supplyAsync(() -> fetchUser(userId), ioPool);
    var ordersFuture = CompletableFuture.supplyAsync(() -> fetchOrders(userId), ioPool);
    var prefsFuture = CompletableFuture.supplyAsync(() -> fetchPrefs(userId), ioPool);

    return userFuture.thenCombine(ordersFuture, (user, orders) -> 
        new PartialProfile(user, orders)
    ).thenCombine(prefsFuture, (partial, prefs) -> 
        new UserProfile(partial, prefs)
    ).join();  // block for final result
}

Virtual Threads (Java 21 — Project Loom)

Virtual threads are lightweight threads managed by the JVM, not the OS. They enable writing blocking code that scales like async code.

Platform Threads vs Virtual Threads

| Aspect | Platform Thread | Virtual Thread |
|---|---|---|
| Managed by | OS kernel | JVM scheduler |
| Memory | ~1 MB stack (fixed) | ~few KB (grows dynamically) |
| Creation cost | Expensive (~1ms) | Cheap (~1μs, millions possible) |
| Blocking behavior | Blocks OS thread | Unmounts from carrier, frees OS thread |
| Best for | CPU-bound work | I/O-bound work |
| Pool needed? | Yes, to limit thread count | No, create one per task |

How Virtual Threads Work Internally

%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '13px', 'fontFamily': 'Inter, -apple-system, sans-serif'}, 'flowchart': {'nodeSpacing': 30, 'rankSpacing': 50, 'padding': 12, 'curve': 'basis'}, 'sequence': {'actorMargin': 60, 'messageMargin': 40}, 'class': {'padding': 12}}}%%
flowchart LR
    subgraph "JVM Scheduler"
        direction LR
        VT1(("Virtual Thread 1<br/>(running)"))
        VT2(("Virtual Thread 2<br/>(blocked on I/O)"))
        VT3(("Virtual Thread 3<br/>(runnable)"))
    end

    subgraph "Carrier Threads (ForkJoinPool)"
        direction LR
        CT1[["Carrier Thread 1"]]
        CT2[["Carrier Thread 2"]]
    end

    VT1 -->|"mounted on"| CT1
    VT3 -->|"waiting for"| CT2
    VT2 -->|"unmounted<br/>(parked)"| HEAP[/"Heap<br/>(continuation stored)"/]

    style CT1 fill:#DBEAFE,stroke:#93C5FD,color:#1E40AF
    style CT2 fill:#D1FAE5,stroke:#6EE7B7,color:#065F46
    style HEAP fill:#FEF3C7,stroke:#FCD34D,color:#92400E
    style VT1 fill:#FEE2E2,stroke:#FCA5A5,color:#991B1B
    style VT2 fill:#DBEAFE,stroke:#93C5FD,color:#1E40AF
    style VT3 fill:#D1FAE5,stroke:#6EE7B7,color:#065F46

When a virtual thread blocks on I/O:

  1. JVM saves its stack (continuation) to heap
  2. Carrier thread is released to run other virtual threads
  3. When I/O completes, virtual thread is rescheduled on any available carrier
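
The effect of these steps can be observed with a rough sketch that parks many virtual threads at once. It requires JDK 21 or later; VirtualSleepers is a hypothetical name, and Thread.sleep stands in for a blocking I/O call.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class (JDK 21+)
class VirtualSleepers {
    // Starts `tasks` virtual threads that each block for `sleepMillis`; returns how many finished
    static int run(int tasks, long sleepMillis) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    // sleep parks the virtual thread; its carrier is free to run other virtual threads
                    Thread.sleep(Duration.ofMillis(sleepMillis));
                    return completed.incrementAndGet();
                });
            }
        }  // close() waits for all submitted tasks to finish
        return completed.get();
    }
}
```

Calling run(10_000, 100) keeps ten thousand tasks blocked concurrently on a handful of carrier threads, so the whole batch takes far less time than the tasks' combined sleep time.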

Creating Virtual Threads

Java
// Single virtual thread
Thread.ofVirtual().name("worker-1").start(() -> {
    String result = blockingHttpCall();  // OK to block!
    processResult(result);
});

// Virtual thread per task executor (replaces CachedThreadPool)
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    List<Future<String>> futures = IntStream.range(0, 100_000)
        .mapToObj(i -> executor.submit(() -> fetchData(i)))
        .toList();
    for (Future<String> f : futures) {
        process(f.get());
    }
}

Pinning — When Virtual Threads Don't Scale

On JDK 21 to 23, a virtual thread is pinned to its carrier when it blocks inside:

  • synchronized block/method (holds monitor lock)
  • Native method / JNI call

Pinning defeats the purpose: the carrier thread stays blocked along with the virtual thread. JDK 24 (JEP 491) removed pinning for synchronized; native frames still pin.

Java
// BAD on JDK 21 to 23: synchronized pins the virtual thread to the carrier
synchronized (lock) {
    connection.query(sql);  // blocks while pinned, wasting the carrier
}

// GOOD: ReentrantLock does NOT pin
private final ReentrantLock lock = new ReentrantLock();
lock.lock();
try {
    connection.query(sql);  // virtual thread unmounts while waiting for I/O
} finally {
    lock.unlock();
}

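Detect pinning on JDK 21 to 23 with -Djdk.tracePinnedThreads=full. The JFR event jdk.VirtualThreadPinned also records pinning and is the tool to use on later releases.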

Structured Concurrency (Preview — Java 21+)

Java
try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
    Subtask<User> userTask = scope.fork(() -> fetchUser(userId));
    Subtask<List<Order>> ordersTask = scope.fork(() -> fetchOrders(userId));

    scope.join();           // Wait for all
    scope.throwIfFailed();  // Propagate exceptions

    return new UserProfile(userTask.get(), ordersTask.get());
}
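// If any subtask fails, all others are cancelled automatically
// Note: this is the JDK 21 preview API (needs --enable-preview); JDK 25 reworked it around StructuredTaskScope.open()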

Scoped Values (Preview in Java 21, Final in Java 25)

Thread-safe alternative to ThreadLocal for virtual threads:

Java
private static final ScopedValue<RequestContext> CONTEXT = ScopedValue.newInstance();

ScopedValue.where(CONTEXT, new RequestContext(requestId))
    .run(() -> {
        handleRequest();  // CONTEXT.get() available in this scope and child tasks
    });

ThreadLocal with virtual threads is problematic: millions of threads × per-thread storage = massive memory waste. A ScopedValue binding is immutable, is inherited by subtasks forked in a StructuredTaskScope, and disappears automatically when run() returns.


ThreadLocal — Per-Thread State

How It Works

Java
private static final ThreadLocal<SimpleDateFormat> dateFormat = 
    ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));

public String formatDate(Date date) {
    return dateFormat.get().format(date);  // each thread gets its own instance
}

ThreadLocal Internals

Each Thread object has a ThreadLocalMap (open-addressing hash table). ThreadLocal.get() looks up the value in the current thread's map using the ThreadLocal instance as key.

Memory Leak Pattern

Java
// LEAK: ThreadLocal not removed in thread pool
executorService.submit(() -> {
    CONTEXT.set(expensiveObject);  // stored in worker thread's map
    doWork();
    // Missing CONTEXT.remove() — object lives until thread dies
    // In a thread pool, thread never dies → permanent leak!
});

// FIX: always remove in finally
executorService.submit(() -> {
    CONTEXT.set(expensiveObject);
    try {
        doWork();
    } finally {
        CONTEXT.remove();  // critical in thread pools
    }
});

InheritableThreadLocal

Java
// Child threads inherit parent's ThreadLocal value (copy at creation time)
private static final InheritableThreadLocal<String> traceId = 
    new InheritableThreadLocal<>();

traceId.set("trace-12345");
new Thread(() -> {
    System.out.println(traceId.get());  // "trace-12345" — inherited
}).start();

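Does NOT work reliably with thread pools: a worker thread copies the value once, when the pool creates it, and is then reused, so later tasks see that stale value instead of the submitter's.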
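
The stale-value effect is easy to reproduce. In the sketch below (StaleInheritedValue and TRACE_ID are hypothetical names), a single-thread executor creates its worker on the first submit, so that worker inherits the first trace ID and keeps it for every later task.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example class
class StaleInheritedValue {
    private static final InheritableThreadLocal<String> TRACE_ID = new InheritableThreadLocal<>();

    // Returns the trace ID that the second pooled task observes
    static String secondTaskSees() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            TRACE_ID.set("trace-1");
            pool.submit(() -> { }).get();   // worker thread is created here and copies "trace-1"
            TRACE_ID.set("trace-2");        // next request's ID, set on the submitting thread
            return pool.submit(() -> TRACE_ID.get()).get();  // reused worker still holds "trace-1"
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            TRACE_ID.remove();
        }
    }
}
```

The second task reports the first request's ID. Passing context explicitly into the task, for example by capturing it in the lambda, avoids the problem.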


Common Concurrency Problems

Race Condition

Java
// BUG: counter++ is NOT atomic (read → increment → write)
private int counter = 0;
public void increment() { counter++; }

// FIX 1: synchronized
public synchronized void increment() { counter++; }

// FIX 2: AtomicInteger (better performance)
private final AtomicInteger counter = new AtomicInteger();
public void increment() { counter.incrementAndGet(); }

Deadlock

Java
// Thread 1: lock A → tries lock B
// Thread 2: lock B → tries lock A
// DEADLOCK!

// Prevention: always acquire locks in consistent global order
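
One way to apply a consistent global order is to sort the locks by a unique ID before acquiring them. The sketch below assumes every account has a distinct id; OrderedAccount, transfer and stress are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class
class OrderedAccount {
    final long id;       // unique per account; defines the global lock order
    long balance;
    final ReentrantLock lock = new ReentrantLock();

    OrderedAccount(long id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    // Always locks the lower-id account first, whatever the transfer direction
    static void transfer(OrderedAccount from, OrderedAccount to, long amount) {
        OrderedAccount first = from.id < to.id ? from : to;
        OrderedAccount second = first == from ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    // Two threads transfer in opposite directions; returns the combined balance afterwards
    static long stress(int transfersPerThread) {
        OrderedAccount a = new OrderedAccount(1, 1_000);
        OrderedAccount b = new OrderedAccount(2, 1_000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < transfersPerThread; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < transfersPerThread; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return a.balance + b.balance;
    }
}
```

If each thread locked `from` before `to`, the opposite-direction transfers could each hold one lock and wait forever for the other. With the ordering in place, both threads contend for the same first lock and no cycle can form.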

Deadlock Detection — Thread Dump Analysis

Text Only
"Thread-1":
  waiting to lock <0x00000007d6a48e68> (a Object) — held by "Thread-2"
  locked <0x00000007d6a48e58> (a Object)

"Thread-2":
  waiting to lock <0x00000007d6a48e58> (a Object) — held by "Thread-1"
  locked <0x00000007d6a48e68> (a Object)

Get thread dumps: jstack <pid>, jcmd <pid> Thread.print, or kill -3 <pid> (Unix).

Livelock

Threads keep responding to each other but make no progress — like two people in a hallway stepping aside in the same direction. Fix: add randomized backoff.

Thread Starvation

Low-priority threads never get CPU time. Fix: fair locks (new ReentrantLock(true)), avoid priority abuse.

False Sharing — Cache Line Contention

Java
// BAD: counters[0] and counters[1] on same cache line (64 bytes)
// Writing to counters[0] invalidates counters[1] in other CPU's cache
private long[] counters = new long[NUM_THREADS];

// Thread i increments counters[i] — but all threads slow each other down
// due to cache line bouncing between CPUs

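// FIX: pad to separate cache lines. @Contended is a JDK-internal annotation (not part of JMH):
// application code needs --add-exports java.base/jdk.internal.vm.annotation=ALL-UNNAMED to compile
// and -XX:-RestrictContended at run time for the padding to take effect.
@jdk.internal.vm.annotation.Contended
private volatile long counter;  // padded to own cache line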

False sharing causes dramatic performance degradation in tight loops. Use @Contended or manual padding (array with 8-long stride).
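
Manual padding with an 8-long stride could look like the sketch below. It assumes 64-byte cache lines, and PaddedCounters is a hypothetical name. The code shows the layout only; it is not a benchmark, and measuring the actual effect needs a harness such as JMH.

```java
// Hypothetical example class
class PaddedCounters {
    private static final int STRIDE = 8;  // 8 longs = 64 bytes, the assumed cache-line size

    // Each thread increments its own slot; slots sit STRIDE elements apart
    static long run(int threads, int incrementsPerThread) {
        long[] counters = new long[threads * STRIDE];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int slot = t * STRIDE;  // slots 0, 8, 16, ... land on different cache lines
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) counters[slot]++;
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long total = 0;
        for (int t = 0; t < threads; t++) total += counters[t * STRIDE];  // visible after join()
        return total;
    }
}
```

The padding costs memory (seven unused longs per counter), which is the usual price for keeping hot counters off each other's cache lines.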


Production Patterns

Rate Limiter (Token Bucket)

Java
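public class TokenBucketRateLimiter {
    private final int maxTokens;
    private final int refillRate;  // tokens per second
    private double availableTokens;
    private long lastRefillTime;
    private final ReentrantLock lock = new ReentrantLock();

    public TokenBucketRateLimiter(int maxTokens, int refillRate) {
        this.maxTokens = maxTokens;
        this.refillRate = refillRate;
        this.availableTokens = maxTokens;         // start with a full bucket
        this.lastRefillTime = System.nanoTime();  // refill() measures elapsed time from here
    }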

    public boolean tryAcquire() {
        lock.lock();
        try {
            refill();
            if (availableTokens >= 1) {
                availableTokens--;
                return true;
            }
            return false;
        } finally {
            lock.unlock();
        }
    }

    private void refill() {
        long now = System.nanoTime();
        double elapsed = (now - lastRefillTime) / 1_000_000_000.0;
        availableTokens = Math.min(maxTokens, availableTokens + elapsed * refillRate);
        lastRefillTime = now;
    }
}

Read-Through Cache with Stampede Protection

Java
public class StampedeProtectedCache<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<K, CompletableFuture<V>> inflight = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    public StampedeProtectedCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        V cached = cache.get(key);
        if (cached != null) return cached;

        // Only ONE thread loads; others wait on the same future
        CompletableFuture<V> future = inflight.computeIfAbsent(key, k ->
            CompletableFuture.supplyAsync(() -> {
                try {
                    V value = loader.apply(k);
                    cache.put(k, value);
                    return value;
                } finally {
                    inflight.remove(k);  // also on failure, so a failed load can be retried
                }
            })
        );
        return future.join();
    }
}

Periodic Task with Graceful Shutdown

Java
ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1, r -> {
    Thread t = new Thread(r, "health-check");
    t.setDaemon(true);
    return t;
});

ScheduledFuture<?> task = scheduler.scheduleAtFixedRate(
    () -> {
        try { checkHealth(); }
        catch (Exception e) { log.warn("Health check failed", e); }
    },
    0, 30, TimeUnit.SECONDS  // initial delay, period
);

// Graceful shutdown
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    task.cancel(false);
    scheduler.shutdown();
    try {
        if (!scheduler.awaitTermination(5, TimeUnit.SECONDS)) {
            scheduler.shutdownNow();  // still running after 5s: interrupt it
        }
    } catch (InterruptedException e) {
        scheduler.shutdownNow();
        Thread.currentThread().interrupt();
    }
}));

Thread-Safe Lazy Initialization (Holder Pattern)

Java
// No synchronized, no volatile, no double-checked locking
// JVM guarantees class initialization is thread-safe
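public class ExpensiveService {
    private ExpensiveService() { }  // instances come only from getInstance()

    private static class Holder {
        static final ExpensiveService INSTANCE = new ExpensiveService();
    }

    public static ExpensiveService getInstance() {
        return Holder.INSTANCE;  // Holder is initialized on the first call, not when ExpensiveService loads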
    }
}

Debugging and Profiling Concurrent Code

Thread Dump Analysis

Bash
# Generate thread dump
jcmd <pid> Thread.print
jstack <pid>

# Detect deadlocks
jcmd <pid> Thread.print | grep -A5 "deadlock"

Key Signals in Thread Dumps

| Pattern | Indicates |
|---|---|
| Many threads BLOCKED on same lock | Lock contention bottleneck |
| Circular "waiting to lock" chains | Deadlock |
| Many threads in WAITING at BlockingQueue.take() | Underutilized pool (too many threads) |
| Many tasks queued, few running | Pool too small for workload |
| RUNNABLE threads consuming CPU | CPU-bound bottleneck or spin-wait |

JVM Flags for Concurrency Debugging

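| Flag | Purpose |
|---|---|
| -Djdk.tracePinnedThreads=full | Detect virtual thread pinning (JDK 21 to 23) |
| -Xlog:gc* | Log GC details, including pauses that stop all threads (replaces -XX:+PrintGCDetails since JDK 9) |
| -XX:-UseBiasedLocking | Disable biased locking; only relevant on JDK 14 and earlier (off by default and deprecated since JDK 15) |
| -XX:+UseG1GC | G1 GC (shorter pause times; the default collector since JDK 9) |
| -XX:+UseZGC | ZGC (sub-ms pauses, good for latency-sensitive concurrent apps) |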

Java Flight Recorder (JFR)

Bash
# Start recording
jcmd <pid> JFR.start duration=60s filename=recording.jfr

# Analyze with JDK Mission Control or programmatically

JFR captures: lock contention events, thread state transitions, blocking times, CPU usage per thread — all with minimal overhead (~1-2%).


Interview Questions

1. Explain the Java Memory Model. What is happens-before?

The JMM defines how threads interact through memory. Without synchronization, writes by one thread may never become visible to another (because of CPU caches and compiler or CPU reordering). The happens-before relationship guarantees visibility and ordering: if A happens-before B, then A's effects are visible to B. Key rules: (1) an unlock happens-before every subsequent lock of the same monitor, (2) a volatile write happens-before every subsequent read of that same variable, (3) Thread.start() happens-before the first action in the started thread, (4) all actions in a thread happen-before another thread returns from join() on it. Program order within a thread plus transitivity chains these together.
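
The volatile rule combined with transitivity is what makes the classic "publish via flag" idiom safe. A minimal sketch follows; VolatilePublish and its members are hypothetical names.

```java
// Hypothetical example class
class VolatilePublish {
    private int payload;             // plain field, published through the volatile flag
    private volatile boolean ready;

    void publish(int value) {
        payload = value;   // (1) ordinary write
        ready = true;      // (2) volatile write; (1) happens-before (2) by program order
    }

    int awaitPayload() {
        while (!ready) Thread.onSpinWait();  // (3) volatile read that eventually sees (2)
        return payload;                      // (4) sees the value from (1), by transitivity
    }

    static int demo() {
        VolatilePublish box = new VolatilePublish();
        Thread writer = new Thread(() -> box.publish(42));
        writer.start();
        return box.awaitPayload();
    }
}
```

Without volatile on ready, the reader could spin forever or observe ready == true alongside a stale payload, because nothing would order the two writes for it.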

2. What is false sharing and how do you fix it?

False sharing occurs when threads on different CPUs write to different variables that reside on the same cache line (64 bytes). Writing one variable invalidates the entire cache line on other CPUs, causing expensive cache-coherence traffic even though threads aren't logically sharing data. Fix: pad variables to separate cache lines using @Contended annotation or manual padding (e.g., 7 unused longs between hot fields). LongAdder avoids this internally by striping counters across cells on different cache lines.

3. When would you use StampedLock over ReadWriteLock?

StampedLock adds an optimistic read mode: read without acquiring a lock, then validate that no write occurred. If validation fails, fall back to a pessimistic read lock. On read-heavy workloads with occasional writes, this keeps readers from blocking writers and avoids updating shared lock state on every read. Trade-offs: StampedLock is NOT reentrant (deadlock if you recurse), doesn't support Conditions, and requires careful coding (copy fields to locals, then validate before using them). Use ReadWriteLock for simpler cases or when reentrancy is needed; StampedLock for hot paths where read performance is critical.
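
The optimistic-read pattern, modelled on the usage shown in the StampedLock Javadoc, might look like this sketch (OptimisticPoint is a hypothetical name):

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical example class
class OptimisticPoint {
    private final StampedLock lock = new StampedLock();
    private double x, y;

    void move(double dx, double dy) {
        long stamp = lock.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    double distanceFromOrigin() {
        long stamp = lock.tryOptimisticRead();  // does not block and takes no lock
        double cx = x, cy = y;                  // copy fields to locals before validating
        if (!lock.validate(stamp)) {            // a write intervened: the copies may be inconsistent
            stamp = lock.readLock();            // fall back to a pessimistic read lock
            try {
                cx = x;
                cy = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return Math.hypot(cx, cy);              // only validated or locked copies are used
    }
}
```

The fields are read into locals and used only after validate() succeeds; using x and y directly after validation would reintroduce the race.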

4. How do virtual threads handle blocking I/O internally?

When a virtual thread hits a blocking operation (socket read, Thread.sleep, Lock.lock), the JVM: (1) saves the virtual thread's stack as a continuation on the heap, (2) unmounts it from the carrier thread (platform thread from the ForkJoinPool), (3) schedules another virtual thread on that carrier. When I/O completes (via non-blocking I/O under the hood), the continuation is resumed on any available carrier. Exception: on JDK 21 to 23, blocking inside a synchronized block pins the virtual thread to its carrier, so prefer ReentrantLock around blocking calls there; JDK 24 (JEP 491) removed this limitation, while native frames still pin.

5. Design a thread-safe bounded cache with expiry.

Use ConcurrentHashMap + ScheduledExecutorService: entries stored with timestamps, scheduled task evicts expired entries. For bounded size, use LinkedHashMap with access-order under a ReadWriteLock, or Caffeine library (production choice). Key considerations: (1) read performance (avoid locks on reads — ConcurrentHashMap or StampedLock), (2) stampede protection (only one thread loads a missing key — use computeIfAbsent or CompletableFuture in an inflight map), (3) eviction policy (LRU, TTL, or size-based), (4) memory overhead (weak references for large values).

6. Explain CAS. What is the ABA problem and how do you solve it?

CAS (Compare-And-Swap) is a CPU instruction: atomically write a new value only if the current value matches expected. Lock-free algorithms use CAS retry loops instead of locks. ABA problem: Thread 1 reads value A, gets preempted. Thread 2 changes A→B→A. Thread 1's CAS succeeds (sees A) but state has semantically changed — dangerous in linked structures (node recycling). Solutions: (1) AtomicStampedReference — pairs value with a monotonic stamp; CAS checks both. (2) AtomicMarkableReference — boolean flag + value. (3) Epoch-based reclamation (for lock-free data structures).
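
The ABA sequence can be replayed deterministically on one thread to show what each CAS variant observes. In this sketch, AbaDemo is a hypothetical name and the "threads" are simulated by statement order.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

// Hypothetical example class
class AbaDemo {
    // Plain CAS cannot tell that the value went A -> B -> A in between
    static boolean plainCasAfterAba() {
        AtomicReference<String> ref = new AtomicReference<>("A");
        String seen = ref.get();              // "thread 1" reads A, then is preempted
        ref.set("B");                         // "thread 2" changes A -> B
        ref.set("A");                         //            and back B -> A
        return ref.compareAndSet(seen, "C");  // succeeds: the reference matches again
    }

    // Stamped CAS also compares a version stamp that only moves forward
    static boolean stampedCasAfterAba() {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);
        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder);   // "thread 1" reads A together with stamp 0
        int seenStamp = stampHolder[0];
        ref.compareAndSet("A", "B", 0, 1);    // "thread 2": A -> B, stamp 0 -> 1
        ref.compareAndSet("B", "A", 1, 2);    //             B -> A, stamp 1 -> 2
        return ref.compareAndSet(seen, "C", seenStamp, seenStamp + 1);  // fails: stamp is 2, not 0
    }
}
```

The plain CAS returns true even though the state changed twice, while the stamped CAS returns false and forces the caller to re-read and retry.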

7. A service handles 100k concurrent requests. Compare thread-per-request (virtual threads) vs reactive (WebFlux).

Virtual threads: Write blocking code (jdbc.query(), http.get()) — JVM handles multiplexing. Pros: simple mental model, existing libraries work, easy debugging (readable stack traces), familiar exception handling. Cons: pinning with synchronized, ThreadLocal misuse, doesn't help CPU-bound work. Reactive (WebFlux): Explicit async pipeline (Mono/Flux). Pros: explicit backpressure, fine-grained control, works before Java 21. Cons: complex debugging (no stack traces), callback hell, all libraries must be reactive, steep learning curve. Recommendation: Virtual threads for most new Java 21+ services — simpler code with equivalent throughput. Reactive only for extreme cases needing backpressure signaling or pre-Java 21.

8. How would you detect and resolve a deadlock in production?

Detection: (1) Take a thread dump (jcmd <pid> Thread.print); the JVM automatically reports detected deadlocks at the bottom. (2) Programmatic: ThreadMXBean.findDeadlockedThreads() in a monitoring thread. (3) Symptoms: request latency spikes, thread pool exhaustion, specific operations never complete. Resolution: (1) Identify the lock cycle from the thread dump. (2) Impose a global lock ordering: always acquire locks in the same order (e.g., by account ID for bank transfers). (3) Replace synchronized with tryLock(timeout), so a thread backs off instead of waiting forever. (4) Reduce lock scope. (5) Use lock-free structures where possible. Prevention: code review for nested locks, integration tests with concurrency stress, and alerts driven by a periodic ThreadMXBean.findDeadlockedThreads() check.
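
A programmatic check could be sketched as follows. DeadlockProbe and its methods are hypothetical names; createDeadlock exists only to give the probe something to find, and it uses daemon threads so the JVM can still exit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical example class
class DeadlockProbe {
    // Number of threads currently stuck in a deadlock cycle (0 if none)
    static int deadlockedThreadCount() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads();  // returns null when no deadlock exists
        return ids == null ? 0 : ids.length;
    }

    // Deliberately deadlocks two daemon threads by locking two monitors in opposite order
    static void createDeadlock() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startDaemon(a, b, bothHoldFirst);
        startDaemon(b, a, bothHoldFirst);
    }

    private static void startDaemon(Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await();           // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) { }    // never acquired: the other thread holds it
            }
        });
        t.setDaemon(true);
        t.start();
    }

    // Creates the deadlock, then polls for up to about 5 seconds until the probe reports it
    static int demo() {
        createDeadlock();
        for (int i = 0; i < 100 && deadlockedThreadCount() == 0; i++) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return deadlockedThreadCount();
    }
}
```

In a service, deadlockedThreadCount() would typically run on a scheduled monitoring thread and feed a metric or alert.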

9. Explain the difference between LongAdder and AtomicLong. When would you choose each?

AtomicLong: single volatile long + CAS. Every thread CAS-retries on the same variable — under high contention, most attempts fail and retry, causing a CAS storm. LongAdder: internally maintains a base + array of cells. Threads update different cells (striped by thread hash), reducing contention. sum() aggregates all cells. Trade-offs: LongAdder has higher memory overhead and sum() is not atomic (may miss concurrent updates). Choose AtomicLong when: precise real-time reads needed, low contention, or using compareAndSet. Choose LongAdder when: write-heavy (metrics counters, statistics) and reads are infrequent or approximate is acceptable.
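
A write-heavy metrics counter is the typical LongAdder use case. The sketch below uses the hypothetical class name HitCounter; sum() is read only after all writers have finished, so the total is exact.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical example class
class HitCounter {
    private final LongAdder hits = new LongAdder();

    void hit() {
        hits.increment();   // under contention, threads update different internal cells
    }

    long total() {
        return hits.sum();  // adds base + cells; not an atomic snapshot while writers are running
    }

    // Runs `threads` writers that each record `hitsPerThread` hits; returns the final total
    static long run(int threads, int hitsPerThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++) counter.hit();
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.total();  // exact here: every writer has finished
    }
}
```

If the code needed compareAndSet or a value that is exact at every instant, AtomicLong would be the better fit.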

10. How do you size a thread pool for a microservice that makes database calls (avg 50ms) and serves 5000 req/s with 8 CPU cores?

Calculation: Each request blocks ~50ms on DB. At 5000 req/s, Little's Law (L = λ × W) gives 5000 × 0.05 = 250 requests in flight, so the I/O pool needs about 250 threads just to sustain throughput. The CPU-utilization formula, assuming ~2ms of compute per 50ms of wait, gives threads = 8 × (1 + 50/2) = 208, the count that keeps 8 cores busy. The two formulas answer different questions, so size for the larger (~250 threads) and then check the CPU budget: 5000 req/s × 2ms = 10 core-seconds of work per second, more than 8 cores provide, so at this rate compute must stay under 1.6ms per request or the service needs more cores. With virtual threads (Java 21): just use newVirtualThreadPerTaskExecutor(); no sizing is needed because each request gets a virtual thread that unmounts during the 50ms DB wait. Key pitfalls: unbounded pools (OOM under spike), too-small pools (request queuing → latency), shared pool for CPU + I/O work (mutual interference). Use separate pools: small fixed pool for CPU work, larger pool (or virtual threads) for I/O.
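
The arithmetic can be captured in a few helper methods. PoolSizing and its method names are hypothetical; integer milliseconds are used to avoid floating-point rounding.

```java
// Hypothetical example class
class PoolSizing {
    // Little's Law, L = λ × W: requests in flight = arrival rate × time per request (rounded up)
    static long inFlight(long requestsPerSecond, long latencyMillis) {
        return (requestsPerSecond * latencyMillis + 999) / 1000;
    }

    // threads = cores × (1 + wait / compute): the thread count that keeps every core busy
    static long threadsToSaturateCpu(int cores, long waitMillis, long computeMillis) {
        return cores * (waitMillis + computeMillis) / computeMillis;
    }

    // Highest request rate the cores can sustain for a given CPU time per request
    static long maxRequestsPerSecond(int cores, long computeMillis) {
        return cores * 1000L / computeMillis;
    }
}
```

With the numbers from the question, inFlight(5000, 50) is 250 and threadsToSaturateCpu(8, 50, 2) is 208. maxRequestsPerSecond(8, 2) is 4000, which suggests that the assumed 2ms of compute per request would make CPU, not thread count, the limiting factor at 5000 req/s.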


See Also