
Scoped Threads

Borrowing across thread boundaries


What are Scoped Threads?

Scoped threads allow spawning threads that can borrow non-'static data from their parent scope. This enables safe concurrent access to stack-allocated data without requiring Arc, Mutex, or moving ownership. The compiler guarantees that scoped threads complete before their borrowed data goes out of scope.

Rust's std::thread::scope (stable since 1.63.0) provides this functionality with zero runtime overhead.

The Problem

Regular thread::spawn requires its closure, and everything it captures, to be 'static:

let data = vec![1, 2, 3];

// ❌ ERROR: data doesn't live long enough
// thread::spawn(|| {
//     println!("{:?}", data);
// });

You're forced to either:

  1. Move ownership (can't use data after)
  2. Use Arc (reference-counting overhead; mutation still needs a Mutex or atomics)
  3. Use 'static data only (limits flexibility)

Scoped threads solve this by guaranteeing thread completion within a scope.

Example Code

use std::thread;

// Example 1: Basic scoped threads
fn basic_scoped() {
    let data = vec![1, 2, 3, 4, 5];

    thread::scope(|s| {
        // Spawn threads that borrow `data`
        s.spawn(|| {
            println!("Thread 1: {:?}", data);
        });

        s.spawn(|| {
            println!("Thread 2: {:?}", data);
        });

        // All spawned threads complete before scope ends
    });

    // Safe to use `data` here
    println!("Main: {:?}", data);
}

// Example 2: Parallel computation with borrowing
fn parallel_sum() {
    let numbers = vec![1, 2, 3, 4, 5, 6, 7, 8];
    let mut sum1 = 0;
    let mut sum2 = 0;

    thread::scope(|s| {
        s.spawn(|| {
            sum1 = numbers[0..4].iter().sum();
        });

        s.spawn(|| {
            sum2 = numbers[4..8].iter().sum();
        });
    });

    println!("Total sum: {}", sum1 + sum2);
}

// Example 3: Cell cannot cross threads (the compiler rejects it)
use std::cell::Cell;

fn mutable_access() {
    let counter = Cell::new(0);

    // ❌ ERROR: `Cell<i32>` is not `Sync`, so `&Cell<i32>` is not `Send`
    // and cannot be captured by a scoped thread. The would-be data race
    // is caught at compile time:
    //
    // thread::scope(|s| {
    //     for _ in 0..5 {
    //         s.spawn(|| {
    //             let current = counter.get();
    //             counter.set(current + 1);
    //         });
    //     }
    // });

    counter.set(counter.get() + 1); // fine on a single thread

    println!("Counter: {}", counter.get());
    // Use AtomicUsize or Mutex for an actual concurrent counter
}

// Example 4: Correct mutable access with Mutex
use std::sync::Mutex;

fn safe_mutable_access() {
    let counter = Mutex::new(0);

    thread::scope(|s| {
        for _ in 0..5 {
            s.spawn(|| {
                let mut lock = counter.lock().unwrap();
                *lock += 1;
            });
        }
    });

    println!("Counter: {}", counter.lock().unwrap());
}

// Example 5: Returning values from scoped threads
fn returning_values() {
    let data = vec![1, 2, 3, 4, 5];
    let mut results = Vec::new();

    thread::scope(|s| {
        let handle1 = s.spawn(|| {
            data[0..3].iter().sum::<i32>()
        });

        let handle2 = s.spawn(|| {
            data[3..5].iter().sum::<i32>()
        });

        results.push(handle1.join().unwrap());
        results.push(handle2.join().unwrap());
    });

    println!("Results: {:?}", results);
}

// Example 6: Complex example - parallel quicksort
fn parallel_quicksort<T: Ord + Send>(slice: &mut [T]) {
    if slice.len() <= 1 {
        return;
    }

    let pivot = slice.len() / 2;
    slice.swap(pivot, slice.len() - 1);

    let mut i = 0;
    for j in 0..slice.len() - 1 {
        if slice[j] <= slice[slice.len() - 1] {
            slice.swap(i, j);
            i += 1;
        }
    }
    slice.swap(i, slice.len() - 1);

    let (left, right) = slice.split_at_mut(i);
    let (_, right) = right.split_at_mut(1);

    // Only parallelize if chunks are large enough
    if left.len() > 1000 && right.len() > 1000 {
        thread::scope(|s| {
            s.spawn(|| parallel_quicksort(left));
            s.spawn(|| parallel_quicksort(right));
        });
    } else {
        parallel_quicksort(left);
        parallel_quicksort(right);
    }
}

fn main() {
    println!("=== Basic Scoped ===");
    basic_scoped();

    println!("\n=== Parallel Sum ===");
    parallel_sum();

    println!("\n=== Cell Example ===");
    mutable_access();

    println!("\n=== Safe Mutable Access ===");
    safe_mutable_access();

    println!("\n=== Returning Values ===");
    returning_values();

    println!("\n=== Parallel Quicksort ===");
    let mut data = vec![5, 2, 8, 1, 9, 3, 7, 4, 6];
    parallel_quicksort(&mut data);
    println!("Sorted: {:?}", data);
}

Why It Works

Lifetime Guarantees

thread::scope(|s| {
    // Threads spawned here MUST complete before scope ends
    s.spawn(|| { /* can borrow scope data */ });
}); // Compiler ensures all threads joined here

Type System Enforcement

  • Scoped threads can't outlive their scope
  • Borrow checker verifies all lifetimes
  • No runtime overhead compared to regular threads
  • Zero-cost abstraction

Automatic Joining

  • All threads automatically joined at scope end
  • No need to manually call .join()
  • Panic propagation through scope boundary
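A related convenience worth noting: spawn returns a ScopedJoinHandle that can still be joined manually for its value, and thread::scope itself returns whatever the closure returns. A minimal sketch (the even/odd split is our own illustration):

```rust
use std::thread;

fn main() {
    let data = vec![1, 2, 3, 4];

    // `thread::scope` returns the closure's value, so results from
    // joined threads can flow straight out of the scope.
    let (evens, odds) = thread::scope(|s| {
        let e = s.spawn(|| data.iter().filter(|n| *n % 2 == 0).count());
        let o = s.spawn(|| data.iter().filter(|n| *n % 2 == 1).count());
        (e.join().unwrap(), o.join().unwrap())
    });

    println!("{} even, {} odd", evens, odds);
}
```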

When to Use

Use scoped threads when:

  • Need to parallelize work over stack data
  • Want to avoid Arc overhead
  • Working with non-Clone types
  • Guaranteed completion within a scope

Use regular threads when:

  • Threads outlive their parent
  • Background tasks that run independently
  • Service threads or long-lived workers
  • Need explicit control over joining

Specific use cases:

  • Parallel iteration over slices
  • Divide-and-conquer algorithms
  • Map-reduce operations
  • Temporary parallel sections
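The parallel-iteration and map-reduce cases can be sketched as a chunked sum; the function name chunked_sum and the chunk size are our own choices for illustration:

```rust
use std::thread;

// Sum a slice in parallel: one scoped thread per chunk, with the
// partial sums collected through the handles' return values.
fn chunked_sum(data: &[i64]) -> i64 {
    const CHUNK: usize = 4; // small for illustration
    thread::scope(|s| {
        // Spawn first, collect eagerly, then join: this lets the
        // chunks run concurrently instead of serializing on join().
        let handles: Vec<_> = data
            .chunks(CHUNK)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<i64>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}

fn main() {
    let data: Vec<i64> = (1..=10).collect();
    println!("sum = {}", chunked_sum(&data)); // 55
}
```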

⚠️ Anti-patterns

⚠️ Mistake #1: Unnecessary Arc When Scoped Threads Suffice

use std::sync::Arc;

// ❌ DON'T: Use Arc for short-lived parallelism
let data = Arc::new(vec![1, 2, 3, 4, 5]);

let mut handles = vec![];
for i in 0..5 {
    let data = Arc::clone(&data);
    handles.push(thread::spawn(move || {
        println!("{}", data[i]);
    }));
}

for h in handles {
    h.join().unwrap();
}

// ✅ DO: Use scoped threads
let data = vec![1, 2, 3, 4, 5];

thread::scope(|s| {
    for i in 0..5 {
        s.spawn(|| {
            println!("{}", data[i]);
        });
    }
});

⚠️ Mistake #2: Expecting Cell to Work Across Threads

use std::cell::Cell;

// ❌ DON'T: Cell is not Sync, so this does not even compile;
// the compiler rejects the would-be data race
let counter = Cell::new(0);

thread::scope(|s| {
    for _ in 0..100 {
        s.spawn(|| {
            // ERROR: `&Cell<i32>` cannot be shared between threads
            let v = counter.get();
            counter.set(v + 1);
        });
    }
});

// ✅ DO: Use atomic types
use std::sync::atomic::{AtomicUsize, Ordering};

let counter = AtomicUsize::new(0);

thread::scope(|s| {
    for _ in 0..100 {
        s.spawn(|| {
            counter.fetch_add(1, Ordering::Relaxed);
        });
    }
});

⚠️ Mistake #3: Panics in Scoped Threads

// ❌ DON'T: Ignore panic handling
thread::scope(|s| {
    s.spawn(|| {
        panic!("Oops!"); // Panics are propagated
    });
}); // This will panic!

// ✅ DO: Handle panics explicitly
use std::panic;

let result = panic::catch_unwind(|| {
    thread::scope(|s| {
        s.spawn(|| {
            panic!("Oops!");
        });
    });
});

if result.is_err() {
    println!("Thread panicked!");
}
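Alternatively, joining a handle manually inside the scope yields a Result, so a single thread's panic can be handled without catch_unwind; the scope only re-panics for threads it had to join implicitly. A minimal sketch:

```rust
use std::thread;

fn main() {
    let outcome = thread::scope(|s| {
        let handle = s.spawn(|| -> i32 {
            panic!("Oops!");
        });
        // join() returns Err with the panic payload. Because we joined
        // the thread ourselves, the scope does not re-panic at its end.
        handle.join()
    });

    match outcome {
        Ok(v) => println!("thread returned {}", v),
        Err(_) => println!("thread panicked, handled gracefully"),
    }
}
```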

Advanced Example: Parallel Tree Traversal

#[derive(Debug)]
struct TreeNode {
    value: i32,
    left: Option<Box<TreeNode>>,
    right: Option<Box<TreeNode>>,
}

impl TreeNode {
    fn new(value: i32) -> Self {
        TreeNode {
            value,
            left: None,
            right: None,
        }
    }

    fn with_children(
        value: i32,
        left: Option<Box<TreeNode>>,
        right: Option<Box<TreeNode>>,
    ) -> Self {
        TreeNode { value, left, right }
    }

    // Parallel sum of all nodes
    fn parallel_sum(&self) -> i32 {
        let mut sum = self.value;

        thread::scope(|s| {
            // `move` copies the shared reference into the thread;
            // without it, the closure would borrow `left`, a local
            // of the short-lived `map` closure.
            let left_handle = self.left.as_ref().map(|left| {
                s.spawn(move || left.parallel_sum())
            });

            let right_handle = self.right.as_ref().map(|right| {
                s.spawn(move || right.parallel_sum())
            });

            if let Some(handle) = left_handle {
                sum += handle.join().unwrap();
            }

            if let Some(handle) = right_handle {
                sum += handle.join().unwrap();
            }
        });

        sum
    }

    // Parallel search
    fn parallel_contains(&self, target: i32) -> bool {
        if self.value == target {
            return true;
        }

        thread::scope(|s| {
            let left_handle = self.left.as_ref().map(|left| {
                s.spawn(move || left.parallel_contains(target))
            });

            let right_handle = self.right.as_ref().map(|right| {
                s.spawn(move || right.parallel_contains(target))
            });

            left_handle.map_or(false, |h| h.join().unwrap())
                || right_handle.map_or(false, |h| h.join().unwrap())
        })
    }
}

fn tree_example() {
    let tree = TreeNode::with_children(
        1,
        Some(Box::new(TreeNode::with_children(
            2,
            Some(Box::new(TreeNode::new(4))),
            Some(Box::new(TreeNode::new(5))),
        ))),
        Some(Box::new(TreeNode::with_children(
            3,
            Some(Box::new(TreeNode::new(6))),
            Some(Box::new(TreeNode::new(7))),
        ))),
    );

    println!("Tree sum: {}", tree.parallel_sum());
    println!("Contains 5: {}", tree.parallel_contains(5));
    println!("Contains 10: {}", tree.parallel_contains(10));
}

Advanced Example: Parallel Matrix Multiplication

type Matrix = Vec<Vec<f64>>;

fn parallel_matrix_multiply(a: &Matrix, b: &Matrix) -> Matrix {
    let n = a.len();
    let m = b[0].len();
    let p = b.len();

    assert_eq!(a[0].len(), p);

    let mut result = vec![vec![0.0; m]; n];

    thread::scope(|s| {
        for (i, result_row) in result.iter_mut().enumerate() {
            // `move` transfers the row's `&mut` (and copies `i`) into the thread
            s.spawn(move || {
                for j in 0..m {
                    for k in 0..p {
                        result_row[j] += a[i][k] * b[k][j];
                    }
                }
            });
        }
    });

    result
}

// Chunked parallel multiplication for better cache locality
fn chunked_parallel_multiply(a: &Matrix, b: &Matrix, chunk_size: usize) -> Matrix {
    let n = a.len();
    let m = b[0].len();
    let p = b.len();

    assert_eq!(a[0].len(), p);

    let mut result = vec![vec![0.0; m]; n];

    thread::scope(|s| {
        for (c, chunk) in result.chunks_mut(chunk_size).enumerate() {
            s.spawn(move || {
                for (r, row) in chunk.iter_mut().enumerate() {
                    let i = c * chunk_size + r; // global row index
                    for j in 0..m {
                        for k in 0..p {
                            row[j] += a[i][k] * b[k][j];
                        }
                    }
                }
            });
        }
    });

    result
}

Performance Characteristics

Overhead

  • Zero-cost abstraction: Same as regular threads
  • No Arc overhead: Direct borrowing
  • Stack allocation: No heap allocation for borrowed data
  • Automatic joining: No manual join overhead

Scalability

  • Linear with cores: For embarrassingly parallel problems
  • Task granularity: Balance between parallelism and overhead
  • Cache effects: Consider locality when dividing work

Trade-offs

  • Limited lifetime: Can't spawn long-lived threads
  • Blocking: Scope waits for all threads
  • Thread creation overhead: Consider thread pools for many small tasks
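One way to act on the thread-creation caveat, sketched here with a hypothetical parallel_max helper, is to split work into one chunk per core using std::thread::available_parallelism:

```rust
use std::thread;

// Sketch: size the work to the machine, one chunk per available core,
// so thread-creation overhead stays bounded regardless of input size.
fn parallel_max(data: &[u32]) -> Option<u32> {
    if data.is_empty() {
        return None;
    }
    // Fall back to 1 worker if parallelism cannot be queried.
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    // Ceiling division so every element lands in some chunk.
    let chunk_len = (data.len() + workers - 1) / workers;
    thread::scope(|s| {
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|c| s.spawn(move || c.iter().copied().max()))
            .collect();
        handles.into_iter().filter_map(|h| h.join().unwrap()).max()
    })
}

fn main() {
    println!("max = {:?}", parallel_max(&[3, 9, 1, 7, 5]));
}
```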

Exercises

Beginner

  1. Parallelize summing a large vector using scoped threads
  2. Implement parallel map operation on a slice
  3. Write parallel min/max finder

Intermediate

  1. Implement parallel merge sort using scoped threads
  2. Create a parallel filter operation
  3. Build a parallel file processor

Advanced

  1. Implement parallel matrix operations (multiply, transpose)
  2. Create a parallel graph traversal algorithm
  3. Build a work-stealing fork-join pool

Real-World Usage

Rayon

use rayon::prelude::*;

// Rayon's parallel iterators (and rayon::scope) offer the same
// borrow-friendly model on top of a work-stealing thread pool
let sum: i32 = (0..1000).into_par_iter().sum();

Standard Library

// Direct scoped thread usage
use std::thread;

let data = vec![1, 2, 3, 4, 5];
thread::scope(|s| {
    for item in &data {
        s.spawn(move || println!("{}", item));
    }
});

Custom Parallel Algorithms

// Parallel sorting, searching, transformations
fn parallel_transform<T, F>(data: &mut [T], f: F)
where
    T: Send,
    F: Fn(&mut T) + Send + Sync,
{
    thread::scope(|s| {
        for item in data {
            s.spawn(|| f(item));
        }
    });
}
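A self-contained usage sketch of parallel_transform (the helper is repeated so the snippet compiles on its own):

```rust
use std::thread;

// Repeated from above: apply `f` to every element in parallel,
// one scoped thread per element (illustrative, not tuned).
fn parallel_transform<T, F>(data: &mut [T], f: F)
where
    T: Send,
    F: Fn(&mut T) + Send + Sync,
{
    thread::scope(|s| {
        for item in data {
            s.spawn(|| f(item));
        }
    });
}

fn main() {
    let mut values = vec![1, 2, 3, 4];
    parallel_transform(&mut values, |x| *x *= 10);
    println!("{:?}", values); // [10, 20, 30, 40]
}
```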
