
Numerical Methods

You are an expert in numerical stability and computational aspects of statistical methods.

Floating-Point Fundamentals

IEEE 754 Double Precision

  • Precision: ~15-17 significant decimal digits

  • Range: ~10⁻³⁰⁸ to 10³⁰⁸

  • Machine epsilon: ε ≈ 2.2 × 10⁻¹⁶

  • Special values: Inf, -Inf, NaN
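These limits are easy to verify directly; a minimal sketch in Python, whose float is the same IEEE 754 binary64 as R's double:

```python
import sys
import math

eps = sys.float_info.epsilon     # machine epsilon, ~2.22e-16 (2^-52)

# 1 + eps is the next representable double after 1;
# adding only eps/2 rounds back to 1 exactly
print(1.0 + eps == 1.0)          # False
print(1.0 + eps / 2 == 1.0)      # True

# Overflow past the finite range gives Inf, and Inf - Inf is NaN
print(sys.float_info.max)        # ~1.80e+308
print(math.isinf(sys.float_info.max * 2))  # True
print(math.isnan(math.inf - math.inf))     # True
```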

Key Constants in R

```r
.Machine$double.eps      # ~2.22e-16 (machine epsilon)
.Machine$double.xmax     # ~1.80e+308 (largest finite double)
.Machine$double.xmin     # ~2.23e-308 (smallest positive normalized double)
.Machine$double.neg.eps  # ~1.11e-16 (epsilon for subtraction)
```

Common Numerical Issues

  1. Catastrophic Cancellation

When subtracting nearly equal numbers:

```r
# BAD: loses precision
x <- 1e16 + 1
y <- 1e16
x - y  # Should be 1, returns 0: the low-order "+ 1" never survives rounding
```

BETTER: reformulate to avoid subtraction

Example: Computing variance

```r
var_bad  <- mean(x^2) - mean(x)^2           # Can be negative!
var_good <- sum((x - mean(x))^2) / (n - 1)  # Always non-negative
```
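The failure mode is easy to reproduce in any IEEE 754 double arithmetic; a minimal Python sketch where the one-pass formula returns exactly 0 even though the true population variance is 0.25:

```python
# Two points with a large common offset: mean(x^2) and mean(x)^2 agree
# in every stored digit, so the one-pass formula cancels to nothing
x = [1e8, 1e8 + 1]

n = len(x)
m = sum(x) / n                                      # 100000000.5, exact
var_bad  = sum(v * v for v in x) / n - m * m        # one-pass: 0.0
var_good = sum((v - m) ** 2 for v in x) / (n - 1)   # two-pass: 0.5
print(var_bad, var_good)
```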

  2. Overflow/Underflow

```r
# BAD: overflow
prod(1:200)      # Inf

# GOOD: work on the log scale
sum(log(1:200))  # Then exp() if needed (here it would still overflow;
                 # keep intermediate results on the log scale)
```

```r
# BAD: underflow in probabilities
prod(dnorm(x))   # Underflows to 0 for long x or extreme values

# GOOD: sum log probabilities
sum(dnorm(x, log = TRUE))
```

  3. Log-Sum-Exp Trick

Essential for working with log probabilities:

```r
log_sum_exp <- function(log_x) {
  max_log <- max(log_x)
  if (is.infinite(max_log)) return(max_log)
  max_log + log(sum(exp(log_x - max_log)))
}
```

```r
# Example: log(exp(-1000) + exp(-1001))
log_sum_exp(c(-1000, -1001))  # Correct: ~-999.69
log(exp(-1000) + exp(-1001))  # Wrong: -Inf (both exp() calls underflow)
```

  4. Softmax Stability

```r
# BAD: exp(x) overflows for large x
softmax_bad <- function(x) exp(x) / sum(exp(x))

# GOOD: shift by the maximum before exponentiating
softmax <- function(x) {
  x_max <- max(x)
  exp_x <- exp(x - x_max)
  exp_x / sum(exp_x)
}
```

Matrix Computations

Conditioning

The condition number κ(A) measures sensitivity to perturbation:

  • κ(A) = ‖A‖ · ‖A⁻¹‖

  • Rule of thumb: expect to lose about log₁₀(κ) decimal digits of accuracy

  • κ(A) > 10¹⁵ means the matrix is numerically singular in double precision
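The rule of thumb is visible even on a tiny system; a Python sketch (hypothetical 2×2 example, κ ≈ 4×10¹⁰) where a 10⁻¹⁰ change in the right-hand side moves the solution by order 1:

```python
# Nearly parallel rows: condition number ~4e10
a11, a12 = 1.0, 1.0
a21, a22 = 1.0, 1.0 + 1e-10

def solve2(b1, b2):
    # Cramer's rule; fine for a 2x2 demonstration
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

x1 = solve2(2.0, 2.0 + 1e-10)   # exact solution is (1, 1)
x2 = solve2(2.0, 2.0 + 2e-10)   # b changed by 1e-10 in one entry
print(x1)                       # ~(1, 1)
print(x2)                       # ~(0, 2): the solution moved by O(1)
```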

```r
# Check the condition number
kappa(X, exact = TRUE)

# For regression: check X'X conditioning (kappa(X'X) is roughly kappa(X)^2)
kappa(crossprod(X))
```

Solving Linear Systems

Prefer decomposition methods over explicit inversion.

```r
# BAD: explicit inverse
beta <- solve(t(X) %*% X) %*% t(X) %*% y

# GOOD: QR decomposition
beta <- qr.coef(qr(X), y)

# BETTER for positive definite X'X: Cholesky
R <- chol(crossprod(X))
beta <- backsolve(R, forwardsolve(t(R), crossprod(X, y)))

# For ill-conditioned X: SVD / pseudoinverse
beta <- MASS::ginv(X) %*% y
```

Symmetric Positive Definite Matrices

Always use specialized methods:

```r
# Cholesky factor for SPD matrices
# (chol() returns upper-triangular R with Sigma = t(R) %*% R)
R <- chol(Sigma)

# Symmetric eigendecomposition
eig <- eigen(Sigma, symmetric = TRUE)

# Check positive definiteness
all(eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values > 0)
```

Optimization Stability

Gradient Computation

Numerical gradient (for verification)

```r
# Central differences; eps^(1/3) is the usual step size for them
# (sqrt(eps) is the usual choice for forward differences)
numerical_grad <- function(f, x, h = .Machine$double.eps^(1/3)) {
  sapply(seq_along(x), function(i) {
    x_plus <- x_minus <- x
    x_plus[i]  <- x[i] + h
    x_minus[i] <- x[i] - h
    (f(x_plus) - f(x_minus)) / (2 * h)
  })
}
```

Central differences are O(h²) accurate; forward differences are only O(h).
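Those orders are observable numerically; a Python sketch differentiating sin at x = 1, where shrinking h tenfold cuts the forward-difference error ~10x but the central-difference error ~100x:

```python
import math

def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h

def central_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)

x, exact = 1.0, math.cos(1.0)    # d/dx sin(x) = cos(x)
for h in (1e-2, 1e-3, 1e-4):
    err_f = abs(forward_diff(math.sin, x, h) - exact)  # shrinks ~10x per step
    err_c = abs(central_diff(math.sin, x, h) - exact)  # shrinks ~100x per step
    print(f"h={h:g}  forward={err_f:.2e}  central={err_c:.2e}")
```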

Hessian Stability

```r
# Check that the Hessian is positive definite at the optimum
check_hessian <- function(H, tol = 1e-8) {
  eigs <- eigen(H, symmetric = TRUE, only.values = TRUE)$values
  min_eig <- min(eigs)

  list(
    positive_definite = min_eig > tol,
    min_eigenvalue    = min_eig,
    condition_number  = max(eigs) / min_eig
  )
}
```

Line Search

For gradient descent stability:

```r
backtracking_line_search <- function(f, x, d, grad,
                                     alpha = 1, rho = 0.5, c = 1e-4) {
  # Armijo (sufficient decrease) condition
  while (f(x + alpha * d) > f(x) + c * alpha * sum(grad * d)) {
    alpha <- rho * alpha
    if (alpha < 1e-10) break
  }
  alpha
}
```

Integration and Quadrature

Numerical Integration Guidelines

```r
# Adaptive quadrature (default choice)
integrate(f, lower, upper)

# Infinite limits are handled by transformation
integrate(f, -Inf, Inf)

# For highly oscillatory or sharply peaked integrands,
# raise the subdivision limit
integrate(f, lower, upper, subdivisions = 1000)
```

For known singularities, split the domain at the singularity and integrate each piece separately.

Monte Carlo Integration

```r
mc_integrate <- function(f, n, lower, upper) {
  x  <- runif(n, lower, upper)
  fx <- sapply(x, f)

  estimate <- (upper - lower) * mean(fx)
  se       <- (upper - lower) * sd(fx) / sqrt(n)

  list(value = estimate, se = se)
}
```

Root Finding

Newton-Raphson Stability

```r
newton_raphson <- function(f, df, x0, tol = 1e-8, max_iter = 100) {
  x <- x0
  for (i in 1:max_iter) {
    fx  <- f(x)
    dfx <- df(x)

    # Check for near-zero derivative
    if (abs(dfx) < .Machine$double.eps * 100) {
      warning("Near-zero derivative")
      break
    }

    x_new <- x - fx / dfx

    if (abs(x_new - x) < tol) break
    x <- x_new
  }
  x
}
```

Brent's Method

For robust root finding without derivatives:

```r
uniroot(f, interval = c(lower, upper), tol = .Machine$double.eps^0.5)
```

Statistical Computing Patterns

Safe Likelihood Computation

```r
# Always work with the log-likelihood
log_lik <- function(theta, data) {
  # Compute the log-likelihood, not the likelihood
  sum(dnorm(data, mean = theta[1], sd = theta[2], log = TRUE))
}
```

Robust Standard Errors

```r
# Sandwich estimator with numerical safeguards
sandwich_se <- function(score, hessian) {
  # Fall back to the pseudoinverse if the Hessian is singular
  H_inv <- tryCatch(
    solve(hessian),
    error = function(e) MASS::ginv(hessian)
  )

  meat <- crossprod(score)
  V <- H_inv %*% meat %*% H_inv

  sqrt(diag(V))
}
```

Bootstrap with Error Handling

```r
safe_bootstrap <- function(data, statistic, R = 1000) {
  results  <- numeric(R)
  failures <- 0

  for (i in 1:R) {
    boot_data <- data[sample(nrow(data), replace = TRUE), ]
    result <- tryCatch(statistic(boot_data), error = function(e) NA)
    results[i] <- result
    if (is.na(result)) failures <- failures + 1
  }

  if (failures > 0.1 * R) {
    warning(sprintf("%.1f%% bootstrap failures", 100 * failures / R))
  }

  list(
    estimate = mean(results, na.rm = TRUE),
    se       = sd(results, na.rm = TRUE),
    failures = failures
  )
}
```

Debugging Numerical Issues

Diagnostic Checklist

  • Check for NaN/Inf: any(is.nan(x)), any(is.infinite(x))

  • Check conditioning: kappa(matrix)

  • Check eigenvalues: for PD matrices, all should be strictly positive

  • Check gradients: compare numerical and analytical values

  • Check scale: are variables on similar scales?

Debugging Functions

```r
# Trace NaN/Inf sources
debug_numeric <- function(x, name = "x") {
  cat(sprintf("%s: range [%.3g, %.3g], ", name, min(x), max(x)))
  cat(sprintf("NaN: %d, Inf: %d, -Inf: %d\n",
              sum(is.nan(x)),
              sum(x == Inf, na.rm = TRUE),
              sum(x == -Inf, na.rm = TRUE)))
}
```

```r
# Relative error, guarded against division by tiny true values
rel_error <- function(computed, true) {
  abs(computed - true) / max(abs(true), 1)
}
```

Best Practices Summary

  • Always work on log scale for products of probabilities

  • Use QR or Cholesky instead of matrix inversion

  • Check conditioning before solving linear systems

  • Center and scale predictors in regression

  • Handle edge cases (empty data, singular matrices)

  • Use existing implementations (LAPACK, BLAS) when possible

  • Test with extreme values (very small, very large, near-zero)

  • Compare analytical and numerical gradients

  • Monitor convergence in iterative algorithms

  • Document numerical assumptions and limitations

Key References

  • Higham, Accuracy and Stability of Numerical Algorithms

  • Golub & Van Loan, Matrix Computations
