r-package-development

R Package Development Decision Guide

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "r-package-development" with this command: npx skills add ab604/claude-code-r-skills/ab604-claude-code-r-skills-r-package-development

R Package Development Decision Guide

Dependencies, API design, testing, documentation, and best practices for R packages

Dependency Strategy

When to Add Dependencies vs Base R

Add dependency when:

- Significant functionality gain

- Maintenance burden reduction

- User experience improvement

- Complex implementation (regex, dates, web)

Use base R when:

- Simple utility functions

- Package will be widely used (minimize deps)

- Dependency is large for small benefit

- Base R solution is straightforward

Example decisions:

str_detect(x, "pattern") # Worth stringr dependency length(x) > 0 # Don't need purrr for this parse_dates(x) # Worth lubridate dependency x + 1 # Don't need dplyr for this

Tidyverse Dependency Guidelines

Core tidyverse (usually worth it):

dplyr # Complex data manipulation purrr # Functional programming, parallel stringr # String manipulation tidyr # Data reshaping

Specialized tidyverse (evaluate carefully):

lubridate # If heavy date manipulation forcats # If many categorical operations readr # If specific file reading needs ggplot2 # If package creates visualizations

Heavy dependencies (use sparingly):

tidyverse # Meta-package, very heavy shiny # Only for interactive apps

Dependency Specification in DESCRIPTION

Strong dependencies (required)

Imports: dplyr (>= 1.1.0), rlang (>= 1.0.0)

Suggested dependencies (optional)

Suggests: testthat (>= 3.0.0), knitr, rmarkdown

Enhanced functionality (optional but loaded if available)

Enhances: data.table

API Design Patterns

Function Design Strategy

Modern tidyverse API patterns

1. Use .by for per-operation grouping

my_summarise <- function(.data, ..., .by = NULL) {

Support modern grouped operations

}

2. Use {{ }} for user-provided columns

my_select <- function(.data, cols) { .data |> select({{ cols }}) }

3. Use ... for flexible arguments

my_mutate <- function(.data, ..., .by = NULL) { .data |> mutate(..., .by = {{ .by }}) }

4. Return consistent types (tibbles, not data.frames)

my_function <- function(.data) { result |> tibble::as_tibble() }

Input Validation Strategy

Validation level by function type:

User-facing functions - comprehensive validation

user_function <- function(x, threshold = 0.5) {

Check all inputs thoroughly

if (!is.numeric(x)) stop("x must be numeric") if (!is.numeric(threshold) || length(threshold) != 1) { stop("threshold must be a single number") }

... function body

}

Internal functions - minimal validation

.internal_function <- function(x, threshold) {

Assume inputs are valid (document assumptions)

Only check critical invariants

... function body

}

Package functions with vctrs - type-stable validation

safe_function <- function(x, y) { x <- vec_cast(x, double()) y <- vec_cast(y, double())

Automatic type checking and coercion

}

Error Handling Patterns

Good error messages - specific and actionable

if (length(x) == 0) { cli::cli_abort( "Input {.arg x} cannot be empty.", "i" = "Provide a non-empty vector." ) }

Include function name in errors

validate_input <- function(x, call = caller_env()) { if (!is.numeric(x)) { cli::cli_abort("Input must be numeric", call = call) } }

Use consistent error styling

cli package for user-friendly messages

rlang for developer tools

Error Classes

Custom error classes for programmatic handling

my_error <- function(message, ..., call = caller_env()) { cli::cli_abort( message, ..., class = "my_package_error", call = call ) }

Specific error types

validation_error <- function(message, ..., call = caller_env()) { cli::cli_abort( message, ..., class = c("validation_error", "my_package_error"), call = call ) }

When to Create Internal vs Exported Functions

Export Function When

Export when:

- Users will call it directly

- Other packages might want to extend it

- Part of the core package functionality

- Stable API that won't change often

Example: main data processing functions

#' @export process_data <- function(.data, ...) {

Comprehensive input validation

Full documentation required

Stable API contract

}

Keep Function Internal When

Keep internal when:

- Implementation detail that may change

- Only used within package

- Complex implementation helpers

- Would clutter user-facing API

Example: helper functions (no @export)

.validate_input <- function(x, y) {

Minimal documentation

Can change without breaking users

Assume inputs are pre-validated

}

Naming convention: prefix with . for internal functions

.compute_metrics <- function(data) { ... }

Testing and Documentation Strategy

Testing Levels

Unit tests - individual functions

test_that("function handles edge cases", { expect_equal(my_func(c()), expected_empty_result) expect_error(my_func(NULL), class = "my_error_class") })

Integration tests - workflow combinations

test_that("pipeline works end-to-end", { result <- data |> step1() |> step2() |> step3() expect_s3_class(result, "expected_class") })

Property-based tests for package functions

test_that("function properties hold", {

Test invariants across many inputs

})

Test File Organization

tests/ testthat/ test-validation.R # Input validation tests test-processing.R # Core processing tests test-output.R # Output format tests test-integration.R # End-to-end tests helper-fixtures.R # Shared test fixtures testthat.R # Test runner

Snapshot Testing

For complex outputs that are hard to specify exactly

test_that("summary output is correct", { expect_snapshot(summary(my_object)) })

For error messages

test_that("errors are informative", expect_snapshot(my_function(bad_input), error = TRUE) })

Documentation Priorities

Must document:

- All exported functions

- Complex algorithms or formulas

- Non-obvious parameter interactions

- Examples of typical usage

Can skip documentation:

- Simple internal helpers

- Obvious parameter meanings

- Functions that just call other functions

roxygen2 Documentation

#' Process and summarize data #' #' @description #' Takes a data frame and computes summary statistics #' for specified variables. #' #' @param data A data frame or tibble. #' @param vars <[tidy-select][dplyr::dplyr_tidy_select]> Columns to summarize. #' @param .by <[data-masking][dplyr::dplyr_data_masking]> Optional grouping variable. #' #' @return A tibble with summary statistics. #' #' @examples #' mtcars |> process_data(mpg, .by = cyl) #' #' @export process_data <- function(data, vars, .by = NULL) {

...

}

Package Structure

Recommended Directory Layout

mypackage/ DESCRIPTION NAMESPACE LICENSE README.md R/ utils.R # Internal utilities validation.R # Input validation core.R # Core functionality methods.R # S3/S7 methods zzz.R # .onLoad, .onAttach man/ # Generated by roxygen2 tests/ testthat/ testthat.R vignettes/ getting-started.Rmd inst/ extdata/ # Example data files data/ # Package data (lazy-loaded) data-raw/ # Scripts to create package data

DESCRIPTION Best Practices

Package: mypackage Title: What The Package Does (One Line) Version: 0.1.0 Authors@R: person("First", "Last", email = "email@example.com", role = c("aut", "cre")) Description: A longer description that spans multiple lines. Use four spaces for continuation lines. License: MIT + file LICENSE Encoding: UTF-8 Roxygen: list(markdown = TRUE) RoxygenNote: 7.2.3 Imports: dplyr (>= 1.1.0), rlang (>= 1.0.0) Suggests: testthat (>= 3.0.0) Config/testthat/edition: 3

Release Checklist

Before release:

devtools::check() # Must pass with 0 errors, warnings, notes devtools::test() # All tests pass devtools::document() # Documentation up to date urlchecker::url_check() # All URLs valid spelling::spell_check_package() # No typos

Update version

usethis::use_version("minor") # or "major", "patch"

Update NEWS.md with changes

Final checks

devtools::check(remote = TRUE, manual = TRUE)

Common Package Development Mistakes

Avoid - Using library() in package code

library(dplyr) # Never in package code!

Good - Use namespace qualification

dplyr::filter(data, x > 0)

Or import in NAMESPACE via roxygen2

#' @importFrom dplyr filter mutate

Avoid - Modifying global state

options(my_option = TRUE) # Side effect!

Good - Restore state if you must modify

old_opts <- options(my_option = TRUE) on.exit(options(old_opts), add = TRUE)

Avoid - Hardcoded paths

read.csv("/home/user/data.csv")

Good - Use system.file for package data

system.file("extdata", "data.csv", package = "mypackage")

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

r-style-guide

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

r-performance

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

rlang-patterns

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

tidyverse-patterns

No summary provided by upstream source.

Repository SourceNeeds Review