Tokio Troubleshooting

Install skill "tokio-troubleshooting" with this command: npx skills add geoffjay/claude-plugins/geoffjay-claude-plugins-tokio-troubleshooting


This skill provides techniques for debugging and troubleshooting async applications built with Tokio.

Using tokio-console for Runtime Inspection

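Monitor the async runtime in real time. tokio-console only receives task data when the application is built with the tokio_unstable cfg (for example RUSTFLAGS="--cfg tokio_unstable") and Tokio's tracing feature is enabled.

# In Cargo.toml
[dependencies]
console-subscriber = "0.2"
tokio = { version = "1", features = ["full", "tracing"] }

// In main.rs
fn main() {
    // Install the console subscriber before the runtime starts spawning tasks.
    console_subscriber::init();

    tokio::runtime::Builder::new_multi_thread()
        .enable_all()
        .build()
        .unwrap()
        .block_on(async {
            run_application().await
        });
}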

Run the console in a separate terminal:

tokio-console

Key metrics to monitor:

  • Task spawn rate and total tasks

  • Poll duration per task

  • Idle vs. busy time

  • Waker operations

  • Resource utilization

Identifying issues:

  • Long poll durations: CPU-intensive work in async context

  • Many wakers: Potential contention or inefficient polling

  • Growing task count: Task leak or unbounded spawning

  • High idle time: Not enough work, or tasks stalled waiting on a lock, channel, or I/O
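Long polls can also be caught in code when tokio-console is not attached. The sketch below is a hypothetical PollTimer wrapper (not part of Tokio) that records the slowest single poll of any future. It uses only the standard library, so it should behave the same under any executor.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll};
use std::time::Instant;

/// Hypothetical helper: wraps a future and remembers its slowest poll.
pub struct PollTimer<F> {
    inner: Pin<Box<F>>,
    slowest_micros: Arc<AtomicU64>,
}

impl<F: Future> PollTimer<F> {
    /// Returns the wrapper plus a shared handle holding the slowest poll in microseconds.
    pub fn new(fut: F) -> (Self, Arc<AtomicU64>) {
        let slowest = Arc::new(AtomicU64::new(0));
        let timer = Self {
            inner: Box::pin(fut),
            slowest_micros: slowest.clone(),
        };
        (timer, slowest)
    }
}

impl<F: Future> Future for PollTimer<F> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<F::Output> {
        // Time exactly one poll of the inner future.
        let started = Instant::now();
        let result = self.inner.as_mut().poll(cx);
        let micros = started.elapsed().as_micros() as u64;
        // Keep the maximum seen so far.
        self.slowest_micros.fetch_max(micros, Ordering::Relaxed);
        result
    }
}
```

Wrap a suspect future before spawning it and log the handle's value periodically. A slowest poll in the millisecond range usually points at blocking or CPU-heavy work between await points.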

Debugging Deadlocks and Hangs

Detect and resolve deadlock situations:

Common Deadlock Pattern

use std::sync::Arc;
use std::time::Duration;
use tokio::sync::Mutex;

// BAD: Potential deadlock
async fn deadlock_example() {
    let mutex1 = Arc::new(Mutex::new(()));
    let mutex2 = Arc::new(Mutex::new(()));

    let m1 = mutex1.clone();
    let m2 = mutex2.clone();
    tokio::spawn(async move {
        // Spawned task locks mutex1, then mutex2
        let _g1 = m1.lock().await;
        tokio::time::sleep(Duration::from_millis(10)).await;
        let _g2 = m2.lock().await; // May deadlock
    });

    // This task locks in the opposite order: mutex2, then mutex1
    let _g2 = mutex2.lock().await;
    tokio::time::sleep(Duration::from_millis(10)).await;
    let _g1 = mutex1.lock().await; // May deadlock
}

// GOOD: Consistent lock ordering
async fn no_deadlock_example() {
    let mutex1 = Arc::new(Mutex::new(()));
    let mutex2 = Arc::new(Mutex::new(()));

    // Always acquire locks in same order
    let _g1 = mutex1.lock().await;
    let _g2 = mutex2.lock().await;
}

use tokio::sync::mpsc;

// BETTER: Avoid nested locks
// process_message is an application-defined handler.
async fn best_example(message: String) {
    // Use message passing instead
    let (tx, mut rx) = mpsc::channel(10);

    tokio::spawn(async move {
        while let Some(msg) = rx.recv().await {
            process_message(msg).await;
        }
    });

    tx.send(message).await.unwrap();
}

Detecting Hangs with Timeouts

use tokio::time::{timeout, Duration};

async fn detect_hang() {
    match timeout(Duration::from_secs(5), potentially_hanging_operation()).await {
        Ok(result) => println!("Completed: {:?}", result),
        Err(_) => {
            eprintln!("Operation timed out - potential hang detected");
            // Log stack traces, metrics, etc.
        }
    }
}

Deadlock Detection with try_lock

use std::time::Duration;
use tokio::sync::Mutex;

// State must implement Clone so a copy can be returned without holding the lock.
async fn try_with_timeout(mutex: &Mutex<State>) -> Option<State> {
    // Up to 10 attempts, 10 ms apart
    for _ in 0..10 {
        if let Ok(guard) = mutex.try_lock() {
            return Some(guard.clone());
        }
        tokio::time::sleep(Duration::from_millis(10)).await;
    }
    eprintln!("Failed to acquire lock - possible deadlock");
    None
}

Memory Leak Detection

Identify and fix memory leaks:

Task Leaks

// BAD: Tasks never complete
async fn leaking_tasks() {
    loop {
        tokio::spawn(async {
            loop {
                // Never exits
                tokio::time::sleep(Duration::from_secs(1)).await;
            }
        });
    }
}

// GOOD: Bounded number of tasks, each with an exit condition
async fn proper_tasks(shutdown: broadcast::Receiver<()>, workers: usize) {
    // A bare `loop` here would still spawn without limit
    for _ in 0..workers {
        let mut shutdown_rx = shutdown.resubscribe();
        tokio::spawn(async move {
            loop {
                tokio::select! {
                    _ = shutdown_rx.recv() => break,
                    _ = tokio::time::sleep(Duration::from_secs(1)) => {
                        // Work
                    }
                }
            }
        });
    }
}
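A growing task count can be tracked from inside the application as well. The sketch below assumes a hypothetical TaskCounter type (not part of Tokio) built on a shared atomic. Each task holds a guard that decrements the count when the task finishes, panics, or is cancelled.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical helper: counts tasks that are currently alive.
#[derive(Clone, Default)]
pub struct TaskCounter(Arc<AtomicUsize>);

/// Decrements the counter when dropped.
pub struct TaskGuard(Arc<AtomicUsize>);

impl TaskCounter {
    /// Registers one task; keep the returned guard alive for the task's lifetime.
    pub fn track(&self) -> TaskGuard {
        self.0.fetch_add(1, Ordering::Relaxed);
        TaskGuard(self.0.clone())
    }

    /// Number of guards that have not been dropped yet.
    pub fn live(&self) -> usize {
        self.0.load(Ordering::Relaxed)
    }
}

impl Drop for TaskGuard {
    fn drop(&mut self) {
        self.0.fetch_sub(1, Ordering::Relaxed);
    }
}
```

Create the guard before spawning and move it into the task, so that tasks still waiting for their first poll are counted too. A live() value that only ever rises suggests a leak.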

Arc Cycles

// BAD: Reference cycle
struct Node {
    next: Option<Arc<Mutex<Node>>>,
    prev: Option<Arc<Mutex<Node>>>, // Creates cycle!
}

// GOOD: Use weak references
use std::sync::Weak;

struct Node {
    next: Option<Arc<Mutex<Node>>>,
    prev: Option<Weak<Mutex<Node>>>, // Weak reference breaks cycle
}
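The effect of the weak back-pointer can be checked with Arc::strong_count and Arc::weak_count. This sketch uses std::sync::Mutex so that it runs without a runtime; the ownership counts would be the same with tokio::sync::Mutex. linked_pair is a hypothetical helper written for the illustration.

```rust
use std::sync::{Arc, Mutex, Weak};

pub struct Node {
    pub next: Option<Arc<Mutex<Node>>>,
    pub prev: Option<Weak<Mutex<Node>>>,
}

/// Hypothetical helper: builds a two-node list and returns (first, second).
pub fn linked_pair() -> (Arc<Mutex<Node>>, Arc<Mutex<Node>>) {
    let first = Arc::new(Mutex::new(Node { next: None, prev: None }));
    let second = Arc::new(Mutex::new(Node {
        next: None,
        // The back-pointer does not own `first`, so no cycle is formed.
        prev: Some(Arc::downgrade(&first)),
    }));
    // The forward pointer is a strong reference: `first` owns `second`.
    first.lock().unwrap().next = Some(second.clone());
    (first, second)
}
```

Dropping the last strong handle to the first node frees it even though the second node still points back at it; upgrading the weak pointer afterwards returns None.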

Monitoring Memory Usage

// Written for sysinfo 0.30+, where the methods live directly on System and memory
// is reported in bytes; older releases also need `use sysinfo::SystemExt;`.
use std::time::Duration;
use sysinfo::System;

pub async fn memory_monitor() {
    let mut system = System::new_all();
    let mut interval = tokio::time::interval(Duration::from_secs(60));

    loop {
        interval.tick().await;
        system.refresh_memory();

        let used = system.used_memory();
        let total = system.total_memory();
        let percent = (used as f64 / total as f64) * 100.0;

        tracing::info!(
            used_mb = used / 1024 / 1024,
            total_mb = total / 1024 / 1024,
            percent = %format!("{:.2}", percent),
            "Memory usage"
        );

        if percent > 80.0 {
            tracing::warn!("High memory usage detected");
        }
    }
}

Performance Profiling with Tracing

Instrument code for performance analysis:

use tracing::{info_span, instrument, Instrument};

#[instrument]
async fn process_request(id: u64) -> Result<Response, Error> {
    // Do not hold a Span::enter() guard across an .await: the span would stay
    // entered while other tasks run on the same thread. Attach it to the future.
    let data = fetch_from_database(id)
        .instrument(info_span!("database_query"))
        .await?;

    let result = transform_data(data)
        .instrument(info_span!("transformation"))
        .await?;

    Ok(Response { result })
}

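// Configure the subscriber. The fmt layer prints events filtered by RUST_LOG;
// flame graphs need an additional layer such as tracing-flame's FlameLayer.
use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::util::SubscriberInitExt;

fn init_tracing() {
    let fmt_layer = tracing_subscriber::fmt::layer();
    let filter_layer = tracing_subscriber::EnvFilter::from_default_env();

    tracing_subscriber::registry()
        .with(filter_layer)
        .with(fmt_layer)
        .init();
}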

Understanding Panic Messages

Common async panic patterns:

Panics in Spawned Tasks

// Panic is isolated to the task
tokio::spawn(async {
    panic!("This won't crash the program");
});

// To catch panics
let handle = tokio::spawn(async {
    // Work that might panic
});

match handle.await {
    Ok(result) => println!("Success: {:?}", result),
    Err(e) if e.is_panic() => {
        eprintln!("Task panicked: {:?}", e);
        // Handle panic
    }
    Err(e) => eprintln!("Task cancelled: {:?}", e),
}
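The panic payload itself is often more useful than the Debug output of the error. Tokio's JoinError::into_panic returns the payload as a Box<dyn Any + Send>. The sketch below shows a hypothetical panic_message helper that turns such a payload into text, using only the standard library.

```rust
use std::any::Any;

/// Hypothetical helper: extracts a readable message from a panic payload.
pub fn panic_message(payload: &(dyn Any + Send)) -> String {
    if let Some(text) = payload.downcast_ref::<&'static str>() {
        // panic!("literal") carries a &'static str
        (*text).to_string()
    } else if let Some(text) = payload.downcast_ref::<String>() {
        // panic!("formatted {}", value) carries a String
        text.clone()
    } else {
        // std::panic::panic_any can carry any other type
        "non-string panic payload".to_string()
    }
}
```

In the is_panic() branch above, this could be called as panic_message(&*e.into_panic()).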

Send + 'static Errors

// ERROR: future cannot be sent between threads safely
async fn bad_example() {
    let rc = Rc::new(5); // Rc is !Send
    tokio::spawn(async move {
        println!("{}", rc); // Error!
    });
}

// FIX: Use Arc instead
async fn good_example() {
    let rc = Arc::new(5); // Arc is Send
    tokio::spawn(async move {
        println!("{}", rc); // OK
    });
}

// ERROR: async block may outlive the current function, but it borrows `data`
async fn lifetime_error() {
    let data = String::from("hello");
    tokio::spawn(async {
        println!("{}", data); // Error: data might not live long enough
    });
}

// FIX: Move ownership
async fn lifetime_fixed() {
    let data = String::from("hello");
    tokio::spawn(async move {
        println!("{}", data); // OK: data is moved
    });
}
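These bounds can be checked at the place a future is built, before it ever reaches tokio::spawn. The sketch below uses a hypothetical assert_spawnable function that carries the same bounds as tokio::spawn; if the future captures an Rc or a borrow, the compile error points at the construction site. build_future is likewise an illustrative name.

```rust
use std::future::Future;

/// Hypothetical helper: compiles only if `fut` satisfies the bounds of `tokio::spawn`.
pub fn assert_spawnable<F>(fut: F) -> F
where
    F: Future + Send + 'static,
    F::Output: Send + 'static,
{
    fut
}

/// Illustrative future that owns its input and returns its length.
pub fn build_future(data: String) -> impl Future<Output = usize> + Send + 'static {
    // `async move` takes ownership of `data`, so nothing is borrowed from the caller.
    assert_spawnable(async move { data.len() })
}
```

The check costs nothing at run time, since the function only returns its argument.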

Common Error Patterns and Solutions

Blocking in Async Context

// PROBLEM: Detected with tokio-console (long poll time)
async fn blocking_example() {
    std::thread::sleep(Duration::from_secs(1)); // Blocks thread!
}

// SOLUTION
async fn non_blocking_example() {
    tokio::time::sleep(Duration::from_secs(1)).await; // Yields control
}

// For unavoidable blocking
async fn necessary_blocking() {
    tokio::task::spawn_blocking(|| {
        expensive_cpu_work()
    }).await.unwrap();
}

Channel Closed Errors

// PROBLEM: SendError because receiver dropped
async fn send_error_example() {
    let (tx, rx) = mpsc::channel(10);
    drop(rx); // Receiver dropped

    match tx.send(42).await {
        Ok(_) => println!("Sent"),
        Err(e) => eprintln!("Send failed: {}", e), // Channel closed
    }
}

// SOLUTION: Keep the receiver alive, and handle the error if it is gone
async fn handle_closed_channel() {
    let (tx, mut rx) = mpsc::channel(10); // recv() needs a mutable receiver

    tokio::spawn(async move {
        // Receiver keeps channel open
        while let Some(msg) = rx.recv().await {
            process(msg).await;
        }
    });

    // Or handle the error
    if let Err(e) = tx.send(42).await {
        tracing::warn!("Channel closed: {}", e);
        // Cleanup or alternative action
    }
}

Task Cancellation

// PROBLEM: Task cancelled unexpectedly
let handle = tokio::spawn(async {
    // Long-running work
});

handle.abort(); // Cancels task

// SOLUTION: Handle cancellation gracefully
let handle = tokio::spawn(async {
    let result = tokio::select! {
        result = do_work() => result,
        _ = tokio::signal::ctrl_c() => {
            cleanup().await;
            return Err(Error::Cancelled);
        }
    };
    result
});
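An aborted task gets no chance to run code after its current await point, but values it owns are still dropped. The sketch below uses that to detect unexpected cancellation. CancelGuard and guarded_work are hypothetical names, and only the standard library is used.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical helper: flags a task that was dropped before it finished.
pub struct CancelGuard {
    cancelled: Arc<AtomicBool>,
    finished: bool,
}

impl CancelGuard {
    pub fn new(cancelled: Arc<AtomicBool>) -> Self {
        Self { cancelled, finished: false }
    }

    /// Call as the last step of the task to mark normal completion.
    pub fn finish(mut self) {
        self.finished = true;
    }
}

impl Drop for CancelGuard {
    fn drop(&mut self) {
        if !self.finished {
            // Reached when the future is dropped at an await point (abort, select!,
            // timeout) and also when the task unwinds from a panic.
            self.cancelled.store(true, Ordering::SeqCst);
        }
    }
}

/// Illustrative task body: optionally waits forever, then finishes normally.
pub async fn guarded_work(cancelled: Arc<AtomicBool>, wait_forever: bool) {
    let guard = CancelGuard::new(cancelled);
    if wait_forever {
        std::future::pending::<()>().await;
    }
    guard.finish();
}
```

In a real task the Drop implementation would more likely emit a log line or metric than set a flag. Drop cannot await, so asynchronous cleanup still needs a cooperative shutdown signal.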

Testing Async Code Effectively

Write reliable async tests:

#[tokio::test]
async fn test_with_timeout() {
    tokio::time::timeout(
        Duration::from_secs(5),
        async {
            let result = my_async_function().await;
            assert!(result.is_ok());
        }
    )
    .await
    .expect("Test timed out");
}

#[tokio::test]
async fn test_concurrent_access() {
    let shared = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..10)
        .map(|_| {
            let shared = shared.clone();
            tokio::spawn(async move {
                let mut lock = shared.lock().await;
                *lock += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.await.unwrap();
    }

    assert_eq!(*shared.lock().await, 10);
}

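// Test with mocked time (requires Tokio's test-util feature)
#[tokio::test(start_paused = true)]
async fn test_with_time_control() {
    // Measure wall-clock time. tokio::time::Instant follows the mocked clock
    // and would report the full 100 seconds.
    let start = std::time::Instant::now();

    tokio::time::sleep(Duration::from_secs(100)).await;

    // Time is mocked, so this completes almost instantly in real time
    assert!(start.elapsed() < Duration::from_secs(1));
}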

Debugging Checklist

When troubleshooting async issues:

  • Use tokio-console to monitor runtime behavior

  • Check for blocking operations with tracing

  • Verify all locks are released properly

  • Look for task leaks (growing task count)

  • Monitor memory usage over time

  • Add timeouts to detect hangs

  • Check for channel closure errors

  • Verify Send + 'static bounds are satisfied

  • Use try_lock to detect potential deadlocks

  • Profile with tracing for performance bottlenecks

  • Test time-based code with paused time (start_paused) or tokio-test

  • Check for Arc cycles and break them with Weak references

Helpful Tools

  • tokio-console: Real-time async runtime monitoring

  • tracing: Structured logging and profiling

  • cargo-flamegraph: Generate flame graphs

  • valgrind/heaptrack: Memory profiling

  • perf: CPU profiling on Linux

  • Instruments: Profiling on macOS

Best Practices

  • Always use tokio-console in development

  • Add tracing spans to critical code paths

  • Use timeouts liberally to detect hangs

  • Monitor task count for leaks

  • Profile before optimizing - measure first

  • Test with real concurrency - don't just test happy paths

  • Handle cancellation gracefully in all tasks

  • Use structured logging for debugging

  • Avoid nested locks - prefer message passing

  • Document lock ordering when necessary
