rstan-to-pystan

RStan to PyStan Conversion

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "rstan-to-pystan" with this command: npx skills add letta-ai/skills/letta-ai-skills-rstan-to-pystan

RStan to PyStan Conversion

This skill provides guidance for converting RStan (R interface to Stan) code to PyStan (Python interface to Stan), focusing on the significant API differences between the two libraries.

Key Insight: Stan Model Code is Language-Agnostic

The Stan modeling language itself is identical between RStan and PyStan. The Stan model code (data blocks, parameters, model, generated quantities) can typically be copied directly. The conversion challenge lies in the wrapper code that:

  • Prepares data for the model

  • Calls the sampler with correct parameters

  • Extracts and processes posterior samples

Pre-Conversion Checklist

Before writing any conversion code:

Verify system dependencies for PyStan

  • PyStan 3.x requires a C++ compiler (g++ on Linux, clang on macOS)

  • Install with: apt-get install g++ or equivalent

  • Missing compiler causes cryptic compilation errors at runtime

Identify PyStan version to target

  • PyStan 2.x API mirrors RStan closely

  • PyStan 3.x has a completely different API (recommended for new projects)

  • This guide focuses on PyStan 3.x conversion

Read the complete R script before converting

  • Identify all hyperparameters used

  • Note any data transformations

  • Understand the expected output format

Hyperparameter Mapping (RStan to PyStan 3.x)

Create an explicit mapping table before coding:

RStan Parameter PyStan 3.x Equivalent Notes

iter

N/A (see calculation) Total iterations including warmup

warmup

num_warmup

Number of warmup iterations

chains

num_chains

Number of MCMC chains

thin

N/A PyStan 3 does not support thinning directly

seed

random_seed

Random seed for reproducibility

control=list(adapt_delta=X)

N/A Not directly available in PyStan 3

control=list(max_treedepth=X)

N/A Not directly available in PyStan 3

Critical calculation for num_samples:

PyStan num_samples = (RStan iter - RStan warmup) / RStan thin

Example: If RStan uses iter=2000, warmup=1000, thin=2 :

  • Effective samples = (2000 - 1000) / 2 = 500

  • Use num_samples=500 in PyStan 3

Sample Extraction: Critical API Difference

This is the most common source of errors in conversion.

RStan sample extraction:

Returns matrix with shape (n_iterations, n_parameters)

samples <- extract(fit)$parameter_name

For vector parameters, shape is (n_iterations, param_length)

PyStan 3.x sample extraction:

Returns array with shape (param_length, n_samples) - TRANSPOSED!

samples = fit['parameter_name']

Single parameters have shape (n_samples,)

Verification approach:

Always add shape debugging in first version

for param in ['rho', 'beta', 'sigma']: print(f"{param} shape: {fit[param].shape}")

Step-by-Step Conversion Process

Step 1: Analyze the RStan Script

  • Extract the Stan model code (between stan() or in separate .stan file)

  • Document all hyperparameters with their values

  • Identify data preparation steps

  • Note the expected output format and calculations

Step 2: Set Up Python Environment

Verify compiler availability first

import subprocess result = subprocess.run(['g++', '--version'], capture_output=True) if result.returncode != 0: raise RuntimeError("g++ not found - install C++ compiler first")

Then install and import

import stan # PyStan 3.x

Step 3: Translate the Stan Model

  • Copy Stan model code directly (it's language-agnostic)

  • Store as a Python string with proper escaping

  • Verify all data block variables match your Python data preparation

Step 4: Prepare Data Dictionary

RStan uses list(), PyStan uses dict

Ensure all variable names match the Stan data block exactly

data = { 'N': len(y), 'K': X.shape[1], 'y': y.tolist(), # Convert numpy arrays to lists 'X': X.tolist() }

Step 5: Build and Sample

PyStan 3.x pattern

posterior = stan.build(stan_code, data=data) fit = posterior.sample( num_chains=chains, num_samples=num_samples, # Calculated from RStan params num_warmup=warmup, random_seed=seed )

Step 6: Extract Results with Shape Verification

ALWAYS verify shapes before computing statistics

samples = fit['parameter_name'] print(f"Shape: {samples.shape}") # Debug first

For posterior means:

- If shape is (n_samples,): use samples.mean()

- If shape is (param_dim, n_samples): use samples.mean(axis=1)

Common Pitfalls and Solutions

Pitfall 1: Assuming RStan-like sample shapes

  • Symptom: Getting 1000 values when expecting 3 (or vice versa)

  • Cause: PyStan 3 returns (param_dim, n_samples) not (n_samples, param_dim)

  • Solution: Always check .shape before computing statistics

Pitfall 2: Missing C++ compiler

  • Symptom: Compilation errors when building Stan model

  • Cause: PyStan compiles Stan to C++ at runtime

  • Solution: Install g++ before attempting to run PyStan code

Pitfall 3: Incorrect num_samples calculation

  • Symptom: Different number of posterior samples than expected

  • Cause: Not accounting for thinning in calculation

  • Solution: Use formula: num_samples = (iter - warmup) / thin

Pitfall 4: Data type mismatches

  • Symptom: Stan compilation or runtime errors about data types

  • Cause: NumPy arrays not converted to Python lists

  • Solution: Use .tolist() on numpy arrays in data dictionary

Pitfall 5: Ignoring sampling warnings

  • Symptom: Warnings about Cholesky decomposition, divergences, etc.

  • Cause: Can indicate model issues or poor sampling

  • Solution: While often transient, log warnings and verify results make sense

Verification Strategy

After conversion, verify the following:

Sample counts match expectations

expected_samples = (rstan_iter - rstan_warmup) // rstan_thin assert fit['param'].shape[-1] == expected_samples

Parameter dimensions are correct

For a 3-dimensional parameter vector

assert fit['beta'].shape[0] == 3

Posterior statistics are reasonable

Compare means, check they're in expected ranges

posterior_mean = fit['param'].mean(axis=-1) print(f"Posterior mean: {posterior_mean}")

Output format matches requirements

  • Verify JSON/CSV structure

  • Check key names and data types

  • Ensure numerical precision is adequate

Minimal Test Pattern

Before full conversion, write a minimal test to understand PyStan's behavior:

import stan

Simple model to verify API understanding

code = """ data { int N; array[N] real y; } parameters { real mu; real<lower=0> sigma; } model { y ~ normal(mu, sigma); } """

data = {'N': 10, 'y': [1.0]*10} posterior = stan.build(code, data=data) fit = posterior.sample(num_chains=1, num_samples=100)

Verify you understand the output structure

print(f"mu shape: {fit['mu'].shape}") print(f"sigma shape: {fit['sigma'].shape}")

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

extracting-pdf-text

No summary provided by upstream source.

Repository SourceNeeds Review
General

video-processing

No summary provided by upstream source.

Repository SourceNeeds Review
General

google-workspace

No summary provided by upstream source.

Repository SourceNeeds Review
General

portfolio-optimization

No summary provided by upstream source.

Repository SourceNeeds Review