stan-development

Use this skill when working with Stan models for Bayesian inference and probabilistic programming, particularly when integrating with R (cmdstanr) or Python (cmdstanpy).

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "stan-development" with this command: npx skills add seabbs/claude-code-config/seabbs-claude-code-config-stan-development

Stan Development

Use this skill when working with Stan models for Bayesian inference and probabilistic programming, particularly when integrating with R (cmdstanr) or Python (cmdstanpy).

Modern Stan Syntax

Array Declarations (IMPORTANT)

Use new array syntax introduced in Stan 2.26+:

// Good - Modern syntax (Stan 2.26+) array[10] int x; // Array of 10 integers array[N] real y; // Array of N reals array[N, M] real z; // 2D array array[N] vector[K] v; // Array of vectors

// Avoid - Old syntax (deprecated) int x[10]; // Old style real y[N]; // Old style real z[N, M]; // Old style vector[K] v[N]; // Old style

Why the change?

  • More consistent and readable

  • Clearer distinction between arrays and matrix/vector types

  • Aligns with modern Stan style

Other Key Syntax

Data types:

// Scalar types int n; real x;

// Vector/Matrix types vector[N] v; // Column vector row_vector[N] rv; // Row vector matrix[N, M] A; // Matrix

// Arrays of vectors/matrices array[K] vector[N] vectors; array[K] matrix[N, M] matrices;

// Constrained types real<lower=0> sigma; // Lower bound real<lower=0, upper=1> theta; // Bounded simplex[K] pi; // Simplex constraint ordered[N] sorted_values; // Ordered constraint

Stan File Organization

Standard Structure

project/ ├── inst/stan/ # Stan models (R packages) │ ├── model.stan # Main model file │ ├── functions/ # Reusable Stan functions │ │ ├── likelihood.stan │ │ ├── priors.stan │ │ └── utils.stan │ └── chunks/ # Reusable code chunks (optional) └── stan/ # Alternative location (Python projects)

Modular Stan Code

Main model includes functions:

functions { #include functions/likelihood.stan #include functions/priors.stan #include functions/utils.stan }

data { int<lower=0> N; array[N] real y; }

parameters { real mu; real<lower=0> sigma; }

model { // Use functions from included files y ~ custom_likelihood(mu, sigma); mu ~ custom_prior(); }

Benefits of modular approach:

  • Reusable functions across models

  • Easier testing of individual components

  • Better organization for complex models

  • Cleaner main model files

Stan Integration with R (cmdstanr)

Basic Workflow

library(cmdstanr)

Compile Stan model

model <- cmdstan_model("path/to/model.stan")

Prepare data (list with names matching Stan data block)

stan_data <- list( N = 100, y = rnorm(100) )

Fit model

fit <- model$sample( data = stan_data, chains = 4, parallel_chains = 4, iter_warmup = 1000, iter_sampling = 1000 )

Extract samples

draws <- fit$draws() summary <- fit$summary()

Model Compilation and Caching

Get model path from package

model_path <- system.file("stan", "model.stan", package = "mypackage")

Compile with cmdstanr

model <- cmdstan_model(model_path)

Package-specific model functions

Many packages provide wrapper functions:

my_package_model <- get_model() # Returns compiled model

Inference Algorithms

NUTS sampling (default, most robust)

fit <- model$sample(data = stan_data)

Variational inference (faster, approximate)

fit <- model$variational(data = stan_data)

Pathfinder (fast, approximate)

fit <- model$pathfinder(data = stan_data)

Laplace approximation (very fast, approximate)

fit <- model$laplace(data = stan_data)

Optimization (MAP estimate)

fit <- model$optimize(data = stan_data)

Testing Stan Functions

Exposing Stan Functions to R

In R packages with cmdstanr:

Expose Stan functions for testing

Typically in a package function

expose_stan_functions <- function() { model_path <- system.file("stan", "functions.stan", package = "mypackage")

Note: This typically works on Linux only

cmdstanr::cmdstan_model(model_path, compile_standalone = TRUE) }

Then in tests

test_that("Stan function works correctly", {

Exposed functions are now available in R

result <- stan_function_name(args) expect_equal(result, expected) })

Common pattern in test setup:

tests/testthat/setup.R

if (on_linux()) { expose_stan_functions() }

tests/testthat/test-stan-functions.R

test_that("likelihood function works", { skip_if_not(on_linux()) # Stan function exposure often Linux-only

result <- stan_likelihood(data, params) expect_gt(result, -Inf) })

Stan-R Interface Patterns

Data Preparation

Convert R data structures to Stan format

stan_data_list <- list( N = nrow(data), K = ncol(X), y = data$outcome, X = as.matrix(X),

Arrays use R vectors/matrices directly

group = as.integer(data$group) )

For complex models, packages often provide conversion functions

stan_data <- prepare_stan_data(preprocessed_data, model_spec)

Prior Specification as Data

Empirical Bayes approach - priors as data:

data { // Data int<lower=0> N; array[N] real y;

// Priors as data (empirical Bayes) real prior_mu_mean; real<lower=0> prior_mu_sd; real<lower=0> prior_sigma_alpha; real<lower=0> prior_sigma_beta; }

parameters { real mu; real<lower=0> sigma; }

model { // Use data-specified priors mu ~ normal(prior_mu_mean, prior_mu_sd); sigma ~ gamma(prior_sigma_alpha, prior_sigma_beta); y ~ normal(mu, sigma); }

Benefits:

  • Priors can be updated without recompiling

  • Easy to specify different priors for different models

  • Enables automated prior selection

Distribution Lookup Systems

For packages with multiple distribution options:

R side - get distribution ID

dist_id <- get_distribution_id("lognormal")

Stan side - use distribution

stan_data <- list( distribution = dist_id, # Integer ID

... other data

)

// Stan functions for distribution dispatch real compute_lpdf(real y, int distribution, vector params) { if (distribution == 1) { // Normal return normal_lpdf(y | params[1], params[2]); } else if (distribution == 2) { // Lognormal return lognormal_lpdf(y | params[1], params[2]); } // ... other distributions }

Common Stan Patterns

Vectorization

// Good - vectorized (much faster) y ~ normal(mu, sigma);

// Avoid - loop (slower) for (n in 1:N) { y[n] ~ normal(mu, sigma); }

Efficient Matrix Operations

// Use built-in matrix operations vector[N] mu = X * beta; // Matrix-vector multiplication

// Avoid loops when vectorization possible

Handling Missing Data

data { int<lower=0> N; int<lower=0> N_obs; array[N_obs] int<lower=1, upper=N> obs_idx; array[N_obs] real y_obs; }

parameters { array[N] real y_latent; real mu; real<lower=0> sigma; }

model { // Likelihood for observed data y_obs ~ normal(mu, sigma);

// Prior/model for all data y_latent ~ normal(mu, sigma);

// Constrain observed values y_latent[obs_idx] ~ normal(y_obs, 0.001); // Small noise }

Debugging Stan Models

Common Issues

Divergences:

Increase adapt_delta

fit <- model$sample( data = stan_data, adapt_delta = 0.95 # Default 0.8 )

Slow mixing:

Use non-centered parameterization

Reparameterize hierarchical models

Model diagnostics:

Check convergence

fit$diagnostic_summary()

Check R-hat, ESS

fit$summary(c("Rhat", "ess_bulk", "ess_tail"))

Pairs plot for problematic parameters

bayesplot::mcmc_pairs(fit$draws(), pars = c("mu", "sigma"))

When to Use This Skill

Activate this skill when:

  • Writing Stan models

  • Integrating Stan with R packages (cmdstanr)

  • Testing Stan functions

  • Debugging Stan models

  • Working with Bayesian hierarchical models

  • Implementing custom likelihoods or priors in Stan

This skill provides Stan-specific development patterns. Project-specific model architecture and domain knowledge should remain in project CLAUDE.md files.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

taskfile-automation

No summary provided by upstream source.

Repository SourceNeeds Review
-1.2K
seabbs
Coding

academic-writing-standards

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

analyzing-research-papers

No summary provided by upstream source.

Repository SourceNeeds Review