ADK Streaming Patterns
Comprehensive patterns for configuring Google ADK bidi-streaming (bidirectional streaming) to build real-time, multimodal AI agents with voice, video, and streaming tool capabilities.
Core Concepts
ADK bidi-streaming enables low-latency, bidirectional communication between users and AI agents with:
- Real-time interaction: Process and respond while the user is still providing input
- Natural interruption: The user can interrupt the agent mid-response
- Multimodal support: Text, audio, and video inputs/outputs
- Streaming tools: Tools that yield intermediate results over time
- Session persistence: Maintain context across ~10-minute connection timeouts
Quick Start Patterns
- Basic Bidi-Streaming Setup
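    from google.adk.agents import Agent
    from google.adk.agents.run_config import RunConfig, StreamingMode
    from google.genai import types

    # Configure for audio streaming
    run_config = RunConfig(
        response_modalities=["AUDIO"],
        streaming_mode=StreamingMode.BIDI,
    )

    # Run agent with bidi-streaming.
    # Assumes `runner` (a google.adk.runners.Runner wrapping the agent),
    # `session`, and `request_queue` (a LiveRequestQueue) already exist;
    # live runs go through the runner rather than the agent object itself.
    async for event in runner.run_live(
        session=session,
        live_request_queue=request_queue,
        run_config=run_config,
    ):
        if event.content:
            # Handle streaming response
            handle_response(event)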
- LiveRequestQueue Pattern
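    from google.adk.agents import LiveRequestQueue
    from google.genai import types

    # Create queue for multimodal inputs
    request_queue = LiveRequestQueue()

    # Enqueue text (queue methods are synchronous, so no await is needed)
    request_queue.send_content(
        types.Content(role="user", parts=[types.Part(text="What's the weather?")])
    )

    # Enqueue audio chunks (the PCM MIME type is an assumption; match your capture format)
    request_queue.send_realtime(types.Blob(data=audio_chunk, mime_type="audio/pcm"))

    # Signal activity boundaries (assumes an ADK release that provides these
    # helpers; they apply when automatic voice activity detection is disabled)
    request_queue.send_activity_start()
    request_queue.send_activity_end()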
Configuration Patterns
Response Modalities
CRITICAL: A session supports only ONE response modality, and it cannot be changed mid-session.
    # Audio output (voice agent)
    RunConfig(response_modalities=["AUDIO"])

    # Text output (chat agent)
    RunConfig(response_modalities=["TEXT"])
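Since a session accepts exactly one response modality, it can help to check the list before building a RunConfig. The helper below is a minimal sketch of that rule only; `validate_response_modalities` and `ALLOWED_MODALITIES` are hypothetical names and not part of ADK.

```python
ALLOWED_MODALITIES = {"AUDIO", "TEXT"}  # the two modalities named in this guide


def validate_response_modalities(modalities: list[str]) -> str:
    """Return the single modality, or raise ValueError if the list is invalid."""
    if len(modalities) != 1:
        raise ValueError(
            f"expected exactly one response modality, got {len(modalities)}"
        )
    modality = modalities[0].upper()
    if modality not in ALLOWED_MODALITIES:
        raise ValueError(f"unsupported response modality: {modalities[0]!r}")
    return modality
```

Calling it once at startup, before the session opens, surfaces a bad configuration early instead of partway through a live conversation.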
Session Management
Session Resumption (automatic reconnection):
    RunConfig(
        session_resumption=types.SessionResumptionConfig()
    )
Context Window Compression (unlimited sessions):
    RunConfig(
        context_window_compression=types.ContextWindowCompressionConfig(
            trigger_tokens=100000,
            sliding_window=types.SlidingWindow(target_tokens=80000),
        )
    )
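To make the relationship between `trigger_tokens` and `target_tokens` concrete, here is a small library-free illustration. The real compression is performed by the Live API service, not by client code; `compress_context` is a hypothetical function that only mimics the idea of dropping the oldest turns once a trigger threshold is crossed.

```python
from collections import deque


def compress_context(
    turn_tokens: list[int], trigger_tokens: int, target_tokens: int
) -> list[int]:
    """Drop the oldest turns once the total exceeds trigger_tokens.

    Turns are removed from the front until the total is at or below
    target_tokens. Each list entry is the token count of one turn.
    """
    window = deque(turn_tokens)
    total = sum(window)
    if total <= trigger_tokens:
        return list(window)  # below the trigger: nothing is dropped
    while window and total > target_tokens:
        total -= window.popleft()
    return list(window)
```

Keeping the target below the trigger, as in the 100000/80000 configuration above, leaves headroom so that compression does not fire again on the very next turn.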
Audio Configuration
See templates/audio-config.py for speech and transcription settings.
Platform Selection
Select the platform with an environment variable (no code changes needed):
    # Google AI Studio (Gemini Live API)
    export GOOGLE_GENAI_USE_VERTEXAI=FALSE

    # Vertex AI (Live API)
    export GOOGLE_GENAI_USE_VERTEXAI=TRUE
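For logging or startup checks, the selected platform can be read back from the same variable. This is a sketch under assumptions: `selected_platform` is a hypothetical helper, and treating only "true" and "1" (case-insensitive) as enabling Vertex AI is an assumption about how the flag is parsed.

```python
import os
from typing import Mapping, Optional


def selected_platform(env: Optional[Mapping[str, str]] = None) -> str:
    """Report which platform GOOGLE_GENAI_USE_VERTEXAI selects."""
    env = os.environ if env is None else env
    flag = env.get("GOOGLE_GENAI_USE_VERTEXAI", "").strip().lower()
    # Assumption: only "true" or "1" switch the SDK to Vertex AI.
    return "Vertex AI" if flag in {"true", "1"} else "Google AI Studio"
```

Passing a plain dict instead of relying on `os.environ` keeps the helper easy to test on both platforms.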
Streaming Tools Pattern
Define tools as async generators for continuous results:
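    import asyncio
    from typing import AsyncGenerator

    # A streaming tool is written here as a plain async generator function with
    # an AsyncGenerator return type; no decorator is assumed.
    # `fetch_current_price` is a placeholder for your own data source.
    async def monitor_stock(symbol: str) -> AsyncGenerator[str, None]:
        """Stream real-time stock price updates."""
        while True:
            price = await fetch_current_price(symbol)
            yield f"Current price: ${price}"
            await asyncio.sleep(1)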
See templates/streaming-tool-template.py for complete pattern.
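The async generator shape can be tried without ADK at all. The sketch below uses only the standard library: `countdown` is a hypothetical stand-in for a streaming tool, and `collect_updates` shows one way to consume a bounded number of intermediate results and then close the generator cleanly.

```python
import asyncio
from typing import AsyncGenerator


async def countdown(start: int) -> AsyncGenerator[str, None]:
    """Stand-in streaming tool: yields one update per tick."""
    for n in range(start, 0, -1):
        yield f"remaining: {n}"
        await asyncio.sleep(0)  # hand control back to the event loop


async def collect_updates(
    stream: AsyncGenerator[str, None], limit: int
) -> list[str]:
    """Consume at most `limit` intermediate results, then stop the generator."""
    updates: list[str] = []
    async for update in stream:
        updates.append(update)
        if len(updates) >= limit:
            break
    await stream.aclose()  # run the generator's cleanup even after an early exit
    return updates
```

For example, `asyncio.run(collect_updates(countdown(5), 2))` returns the first two updates and then closes the generator. An open-ended tool such as a price monitor needs a stop condition like this on the consuming side.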
Event Handling
Process events from run_live():
    # Assumes `runner`, `session`, `request_queue`, and `run_config` exist, and
    # that events expose content, turn_complete, and interrupted as ADK events do.
    async for event in runner.run_live(
        session=session,
        live_request_queue=request_queue,
        run_config=run_config,
    ):
        # Agent responses (text/audio parts from the model)
        if event.content and event.content.parts:
            process_model_response(event.content)

        # Agent finished speaking
        if event.turn_complete:
            handle_turn_complete()

        # Tool calls (ADK executes tools automatically)
        for call in event.get_function_calls():
            log_tool_execution(call)

        # Interruptions
        if event.interrupted:
            handle_interruption()
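The turn and interruption signals can be exercised offline with stand-in events. In this sketch `FakeEvent` is a hypothetical dataclass, not the ADK event class, and discarding the partial reply on interruption is one possible policy rather than required behavior.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FakeEvent:
    """Stand-in for a streamed event; not the ADK event class."""
    text: Optional[str] = None
    turn_complete: bool = False
    interrupted: bool = False


def fold_events(events: Iterable[FakeEvent]) -> list[str]:
    """Accumulate text per turn; drop the in-progress turn when interrupted."""
    finished: list[str] = []
    current: list[str] = []
    for event in events:
        if event.interrupted:
            current.clear()  # user spoke over the agent: discard the partial reply
            continue
        if event.text:
            current.append(event.text)
        if event.turn_complete:
            finished.append("".join(current))
            current.clear()
    return finished
```

Feeding it a recorded sequence of events is a cheap way to test interruption handling before connecting real audio.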
Multi-Agent Streaming
Transfer stateful sessions between agents:
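    # Schematic handoff, not a verified recipe. Assumes `runner1` and `runner2`
    # wrap agent1 and agent2 and share one session service, so both runs see
    # the same `session`. run_live is an async generator: iterate it rather
    # than awaiting it.

    # Agent 1 handles the live run first
    async for event in runner1.run_live(
        session=session, live_request_queue=request_queue, run_config=config
    ):
        handle_response(event)

    # Transfer to Agent 2 (seamless handoff); the shared session
    # maintains conversation context
    async for event in runner2.run_live(
        session=session, live_request_queue=request_queue, run_config=config
    ):
        handle_response(event)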
Workflows
Complete Bidi-Streaming Agent Workflow
1. Configure RunConfig
   - Choose response modality (AUDIO or TEXT)
   - Enable session resumption
   - Configure context window compression (optional)
   - Set audio/speech configs (for audio modality)
2. Create LiveRequestQueue
   - Initialize queue for multimodal inputs
   - Enqueue messages as they arrive
   - Use activity markers for segmentation
3. Implement Event Handling
   - Process server_content for agent responses
   - Handle tool_call events
   - Manage interruption events
   - Track turn_complete signals
4. Define Streaming Tools (optional)
   - Use async generators for continuous output
   - Yield intermediate results over time
   - Support real-time monitoring/analysis
5. Test and Deploy
   - Validate audio/video processing
   - Test interruption handling
   - Verify session resumption
   - Monitor quota usage
Audio/Video Workflow
See examples/audio-video-agent.py for complete multimodal setup including:
- Audio input processing
- Video frame handling
- Speech configuration
- Transcription settings
Templates
All templates use placeholders only (no hardcoded API keys):
- templates/bidi-streaming-config.py: Complete RunConfig patterns
- templates/streaming-tool-template.py: Async generator tool pattern
- templates/audio-config.py: Speech and transcription setup
- templates/video-config.py: Video frame processing
- templates/liverequest-queue.py: Queue management patterns
- templates/event-handler.py: Event processing patterns
Scripts
Utility scripts for validation and setup:
- scripts/validate-streaming-config.py: Validate RunConfig settings
- scripts/test-liverequest-queue.py: Test queue functionality
- scripts/check-modality-support.py: Verify modality compatibility
Examples
Real-world streaming agent implementations:
- examples/voice-agent.py: Complete audio streaming agent
- examples/video-agent.py: Multimodal video processing agent
- examples/streaming-tool-agent.py: Agent with streaming tools
- examples/multi-agent-handoff.py: Session transfer between agents
Best Practices
- Choose Modality Carefully: The response modality cannot be switched mid-session
- Use Session Resumption: Reconnect automatically after connection timeouts
- Enable Context Compression: For extended conversations
- Implement Streaming Tools: For real-time monitoring/analysis
- Handle Interruptions: Natural conversation requires interruption support
- Segment Context: Use activity markers for logical event boundaries
- Test Platform Switch: Verify behavior on both AI Studio and Vertex AI
Common Patterns
Pattern 1: Voice Agent with Interruption
See examples/voice-agent.py
Pattern 2: Streaming Analysis Tool
See examples/streaming-tool-agent.py
Pattern 3: Multi-Agent Coordination
See examples/multi-agent-handoff.py
References
- ADK Bidi-streaming Docs
- RunConfig Guide (Part 4)
- Audio/Video Guide (Part 5)
- Real-Time Multi-Agent Architecture
Security Compliance
This skill follows strict security rules:
- All code examples use placeholder values only
- No real API keys, passwords, or secrets
- Environment variable references in all code
- .gitignore protection documented