Adaptive Rejection Sampler
Overview
This skill provides guidance for implementing Adaptive Rejection Sampling (ARS) algorithms. ARS is a method for generating samples from log-concave probability distributions by constructing piecewise linear upper and lower envelopes of the log-density function. The skill focuses on procedural approaches, performance optimization, and verification strategies rather than providing implementation code.
When to Use This Skill
Use this skill when:
-
Implementing adaptive rejection sampling from scratch
-
Working with log-concave distribution samplers
-
Building statistical sampling algorithms that require envelope construction
-
Debugging or optimizing existing ARS implementations
-
The task involves R, Python, or other statistical computing environments
Implementation Approach
Phase 1: Algorithm Design Before Coding
Before writing any code:
Understand the mathematical foundations
-
Review the ARS algorithm requirements for log-concave functions
-
Understand how piecewise linear envelopes are constructed from tangent lines
-
Identify the squeeze function optimization
Design with performance in mind
-
Plan iteration limits and safeguards from the start
-
Consider worst-case computational complexity of the sampling loop
-
Design for timeout constraints that may be stricter than development testing
Plan the module structure
-
Log-concavity verification module
-
Envelope construction and update module
-
Sampling loop with rejection logic
-
Initialization point selection
Phase 2: Critical Implementation Considerations
Log-Concavity Checking
When implementing log-concavity verification:
-
Use appropriate tolerance values for numerical comparison
-
Consider that some valid distributions have constant second derivatives (e.g., exponential distribution)
-
Use <= tolerance instead of strict < comparisons when checking for non-positive second derivatives
-
Test with edge cases like exponential, normal, and gamma distributions
Initialization Points
Proper initialization significantly affects sampling quality and performance:
-
Handle shifted distributions by adjusting initialization points relative to the mode
-
Consider the support bounds when selecting initial points
-
Use multiple initial points to ensure good envelope coverage
-
Test with distributions that have modes far from the origin
Iteration Limits and Safeguards
Critical for preventing infinite loops and timeouts:
-
Implement maximum iteration limits in all sampling loops
-
Add progress indicators or logging for long-running computations
-
Include timeout protection mechanisms
-
Design early exit conditions for convergence
Phase 3: Performance-First Development
Timeout Considerations
-
External test frameworks may use shorter timeouts than development testing
-
If development tests use 180-second timeouts, production may use 60 seconds
-
Profile performance early to catch issues before they become blocking
-
Test with varying timeout constraints to ensure robustness
Computational Efficiency
-
Analyze the computational complexity of the sampling loop
-
Optimize envelope updates to minimize recalculation
-
Consider caching frequently computed values
-
Profile with realistic sample sizes
Verification Strategies
Unit Testing Approach
Test each module independently
-
Log-concavity checker with known log-concave and non-log-concave functions
-
Envelope construction with simple distributions
-
Sampling loop with controlled random seeds
Test known distributions
-
Standard normal distribution
-
Exponential distribution (constant second derivative edge case)
-
Truncated normal distributions
-
Gamma distributions with various parameters
Test edge cases explicitly
-
Non-function inputs
-
Negative sample counts
-
Invalid bounds (lower > upper)
-
Very small and very large sample sizes
Performance Testing
-
Match testing timeouts to expected production constraints
-
Test worst-case behavior, not just average case
-
Profile with different random seeds to catch stochastic failures
-
Measure time per sample to identify performance degradation
Integration Testing
-
Test the complete pipeline from input to samples
-
Verify statistical properties of generated samples (mean, variance, distribution shape)
-
Use statistical tests (KS test, chi-square) to verify sample quality
-
Test with distributions that have known analytical moments
Common Pitfalls and Mistakes
Critical Mistakes to Avoid
Missing iteration limits
-
Always include maximum iteration safeguards in rejection sampling loops
-
An unbounded rejection loop will cause timeouts on difficult distributions
Inadequate performance testing
-
Development tests passing does not guarantee production success
-
Test with the same timeout constraints as the target environment
Tolerance issues in log-concavity checking
-
Strict inequality checks (< 0 ) may incorrectly reject valid distributions
-
Use tolerant comparisons (<= tolerance ) for numerical stability
Poor initialization for shifted distributions
-
Hard-coded initialization points fail for distributions with non-zero modes
-
Always compute initialization relative to the distribution's characteristics
Incomplete file verification
-
After writing large code blocks, verify the complete content was written
-
Truncated files can cause subtle bugs
Debugging Approach
When tests fail or timeout:
Systematic debugging over trial-and-error
-
Understand root causes before implementing fixes
-
Use logging to identify where time is being spent
Isolate the problematic component
-
Test log-concavity checking separately
-
Test envelope construction separately
-
Test sampling loop with mock envelopes
Check for infinite loops
-
Add iteration counters to all loops
-
Log when iteration limits are approached
-
Verify exit conditions are reachable
Quality Checklist
Before considering the implementation complete:
-
All sampling loops have maximum iteration limits
-
Log-concavity checking uses appropriate tolerances
-
Initialization points adapt to distribution characteristics
-
Performance tested with target timeout constraints
-
Edge cases for invalid inputs are handled
-
Statistical properties of samples verified
-
Code verified to be completely written (no truncation)
-
Tested with multiple random seeds