data-visualization

Create data visualizations, design figures, and plot analysis results. Specialized for observational cosmology (spectra, covariances, posteriors, sky maps) but capable of creative/expressive output for outreach. Use when: plotting data, designing figures for papers/talks, visualizing uncertainty, creating publication-quality graphics, or any task involving matplotlib/seaborn/plotly. Triggers on: "plot", "figure", "visualize", "chart", "graph", "histogram", "scatter", "heatmap", "contour", "power spectrum", "corner plot", or any request to create or improve data graphics.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "data-visualization" with this command: npx skills add cailmdaley/skills/cailmdaley-skills-data-visualization

Data Visualization

Decision framework for translating data into effective visual form. Synthesizes Bertin, Cleveland, Tufte, Cairo, Wilke, and Knaflic — optimized for scientific work with cosmology-specific conventions.

The Intake Protocol

Before plotting, establish two dimensions:

1. Data Structure Analysis

Identify what you're visualizing:

| Data Type | Description | Likely Forms |
|---|---|---|
| Amounts | Values across categories | Bar, dot plot, heatmap |
| Distributions | Spread/shape of values | Histogram, KDE, violin, ridgeline |
| X-Y Relationships | Continuous variables | Scatter, line, confidence bands |
| Uncertainty | Error on measurements | Error bars, bands, gradient ribbons |
| Proportions | Parts of whole | Stacked bar, pie (rarely) |
| Spatial/Maps | Geographic or sky data | Mollweide, healpix, choropleth |
| Correlations | Variable relationships | Covariance matrix, triangle plot |

2. Communication Mode

Determine the venue—this switches the entire rule set:

Mode A: Analytical/Paper

  • Audience: Expert peers, reviewers
  • Optimize for: Precision, black/white printing, convention
  • Philosophy: Tufte/Cleveland/Wilke—density is permitted, accuracy is paramount
  • Color: Restrained, colorblind-safe, grayscale-compatible
  • Default: This mode unless otherwise specified

Mode B: Presentation/Outreach

  • Audience: Mixed expertise, attention-competitive
  • Optimize for: Impact, engagement, narrative clarity
  • Philosophy: Cairo/McCandless/Knaflic—preattentive pop, visual hierarchy
  • Color: Bold accent colors, clear entry points
  • Use when: Talks, posters, press releases, social media

The Decision Framework

Route from data to visualization form:

Step 1: Analyze Variables (Bertin)

For each variable, classify:

  • Quantitative: Continuous numeric (position, intensity, redshift)
  • Ordered: Categorical with sequence (low/med/high, redshift bins)
  • Categorical: Nominal groups (experiments, instruments, sky regions)

Check for uncertainty: Is it the error on a mean (draw discrete error bars) or the intrinsic spread of the data (draw a continuous band)?

Step 2: Select Encoding (Cleveland)

Match importance to perceptual accuracy:

| Rank | Encoding | Use For |
|---|---|---|
| 1 | Position on common scale | Primary comparisons, precise values |
| 2 | Position on non-aligned scales | Secondary comparisons |
| 3 | Length | Bar charts (amounts only) |
| 4 | Angle/Slope | Avoid for precise reading |
| 5 | Area | Gestalt impressions, bubble charts |
| 6 | Color saturation | Tertiary encoding, density |

Rule: If precise comparison is needed, use position. If gestalt impression is needed, use color/area.

Step 3: Select Form (Wilke)

Consult viz-catalog.md for the specific form. Key mappings:

| You Have | Consider |
|---|---|
| Spectrum (continuous x, continuous y, uncertainty) | Line + confidence band, residual subplot |
| Correlation/covariance matrix | Heatmap, diverging colormap, white at zero |
| Parameter posteriors | Triangle plot, ridgeline, violin |
| Comparison across groups | Small multiples > overlay when groups > 4 |
| Time series | Line, banking to 45 degrees |
| Amounts across categories | Dot plot (Cleveland) > bar chart |
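
The "small multiples > overlay when groups > 4" mapping can be sketched as a layout switch. This is a minimal illustration, not part of the skill itself: `plot_groups`, `MAX_OVERLAY`, and the toy curves are hypothetical names and data introduced here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

MAX_OVERLAY = 4  # assumption: threshold taken from the "groups > 4" rule


def plot_groups(x, curves):
    """Overlay up to MAX_OVERLAY curves; otherwise draw one panel per group."""
    names = list(curves)
    if len(names) <= MAX_OVERLAY:
        fig, ax = plt.subplots()
        for name in names:
            ax.plot(x, curves[name])
            # Direct label at the end of each line instead of a legend
            ax.annotate(name, xy=(x[-1], curves[name][-1]), xytext=(4, 0),
                        textcoords="offset points", va="center")
        return fig, [ax]
    ncols = 3
    nrows = -(-len(names) // ncols)  # ceiling division
    fig, grid = plt.subplots(nrows, ncols, sharex=True, sharey=True, squeeze=False)
    axes = list(grid.flat)
    for ax, name in zip(axes, names):
        ax.plot(x, curves[name], color="black")
        ax.set_title(name, fontsize=9)
    for ax in axes[len(names):]:
        ax.set_visible(False)  # hide unused panels in the grid
    return fig, axes[: len(names)]


x = np.linspace(0, 1, 50)
few = {f"g{i}": (i + 1) * x for i in range(3)}    # toy data, 3 groups
many = {f"g{i}": (i + 1) * x for i in range(7)}   # toy data, 7 groups
fig_few, axes_few = plot_groups(x, few)
fig_many, axes_many = plot_groups(x, many)
```

Shared axes in the faceted branch keep every panel on a common scale, so comparisons across panels still rely on position rather than on reading tick labels.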

Step 4: Apply Mode-Specific Rules

If Mode A (Paper):

  • Enforce strict linear/log scaling
  • No bubble charts for precise quantities
  • No dual y-axes
  • Redundant encoding (shape + color) for colorblind safety
  • Direct labeling over legends when <=4 series
  • Light grid lines, subordinate to data
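
A minimal sketch of the Mode A rules, assuming three made-up series: each series gets a distinct marker and line style in addition to its color, labels sit next to the lines instead of in a legend, and the grid is kept light and behind the data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 10.0, 11)
series = {"A": 0.5 * x, "B": 1.0 * x, "C": 1.5 * x}   # made-up data
styles = [("o", "-"), ("s", "--"), ("^", ":")]        # marker + linestyle per series

fig, ax = plt.subplots(figsize=(4, 3))
for (name, y), (marker, ls) in zip(series.items(), styles):
    (line,) = ax.plot(x, y, marker=marker, linestyle=ls, markersize=4)
    # Direct label at the right-hand end of the line, in the line's own color
    ax.annotate(name, xy=(x[-1], y[-1]), xytext=(4, 0),
                textcoords="offset points", va="center", color=line.get_color())

ax.grid(True, color="0.9", linewidth=0.5)  # light grid, subordinate to the data
ax.set_axisbelow(True)
ax.set_xlabel("x")
ax.set_ylabel("y")
```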

If Mode B (Outreach):

  • Establish visual hierarchy—most important data most salient
  • One clear entry point (where does eye go first?)
  • Bolder colors, but maintain accuracy
  • Annotations that guide reading
  • Title states the takeaway, not the topic
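
One way to apply the Mode B rules is to grey out context and reserve a single accent color for the key series. The sketch below uses random toy data; the series names and the title text are placeholders, not results.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(20)
# Toy data: five background series and one series to emphasize
background = {f"run {i}": np.cumsum(rng.normal(size=x.size)) for i in range(5)}
highlight = 0.4 * x + rng.normal(scale=0.5, size=x.size)

fig, ax = plt.subplots(figsize=(5, 3))
for y in background.values():
    ax.plot(x, y, color="0.75", linewidth=1, zorder=1)   # context stays quiet
(key_line,) = ax.plot(x, highlight, color="tab:orange", linewidth=3, zorder=2)
ax.annotate("new method", xy=(x[-1], highlight[-1]), xytext=(5, 0),
            textcoords="offset points", va="center",
            color="tab:orange", fontweight="bold")

# The title states the takeaway (placeholder wording), not the topic
ax.set_title("The new method keeps improving", loc="left")
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
```

The accent line is the single entry point: it is the only saturated, thick element, so it should survive the squint test.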

Cosmology-Specific Overrides

These conventions override general principles for domain consistency:

Power Spectra

  • Flatten steeply falling spectra: Multiply by a power of the x-axis variable to reveal percent-level features
    • Angular: Plot ell^n C_ell (commonly D_ell = ell(ell+1) C_ell / (2 pi), but the factor varies by convention)
    • Matter: Plot k^3 P(k) or Delta^2(k) to flatten
    • Correlation functions: Plot theta xi(theta) or similar
  • Log-linear preferred: Log scale on x (multipole/k), linear on y after flattening
    • Reveals small differences hidden by log-log compression
    • Reserve log-log only when dynamic range is the message
  • Label x-axis with actual values (10, 100, 1000), not exponents
  • Residual panel: Show (data - model)/sigma or data/model below main panel
  • Uncertainty: Confidence bands if dense sampling, error bars if sparse
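
The power-spectrum conventions above can be combined in one figure. The following is a sketch on a toy spectrum (not a physical model): it flattens C_ell into D_ell, uses a log x-axis with a linear y-axis, labels ticks with plain numbers, and adds a (data - model)/sigma residual panel. All variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import ScalarFormatter

rng = np.random.default_rng(1)
ell = np.unique(np.logspace(1, np.log10(2000), 25).astype(int))
# Toy model (assumption): steeply falling C_ell with a small oscillation
cl_model = 1e3 / (ell * (ell + 1.0)) * (1 + 0.3 * np.sin(ell / 100.0))
sigma_cl = 0.05 * cl_model
cl_data = cl_model + rng.normal(scale=sigma_cl)

# Flatten: D_ell = ell (ell + 1) C_ell / (2 pi)
to_dl = ell * (ell + 1.0) / (2 * np.pi)
dl_model, dl_data, sigma_dl = to_dl * cl_model, to_dl * cl_data, to_dl * sigma_cl

fig, (ax, ax_res) = plt.subplots(
    2, 1, sharex=True, figsize=(5, 4),
    gridspec_kw={"height_ratios": [3, 1], "hspace": 0.05},
)
ax.plot(ell, dl_model, color="black", linewidth=1)
ax.errorbar(ell, dl_data, yerr=sigma_dl, fmt="o", markersize=3, color="tab:blue")
ax.set_xscale("log")                      # log x, linear y after flattening
ax.set_ylabel(r"$D_\ell = \ell(\ell+1)C_\ell/2\pi$")

# Residual panel in units of sigma, so every error bar has length 1
ax_res.axhline(0, color="black", linewidth=0.5)
ax_res.errorbar(ell, (dl_data - dl_model) / sigma_dl, yerr=1.0,
                fmt="o", markersize=3, color="tab:blue")
ax_res.set_xlabel(r"Multipole $\ell$")
ax_res.set_ylabel(r"$\Delta/\sigma$")
ax_res.xaxis.set_major_formatter(ScalarFormatter())  # 10, 100, 1000 instead of 10^n
```

Error bars are used here because the toy bandpowers are sparse; with dense sampling a `fill_between` band would follow the same layout.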

Covariance Matrices

  • Diverging colormap required (RdBu, coolwarm)
  • White/neutral at zero; for correlation matrices, fix the color limits to [-1, 1] so zero stays neutral
  • Explicit colorbar with position-based lookup for precise values
  • Consider: Showing only upper/lower triangle for symmetry
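
A minimal sketch of these rules on simulated data: the covariance is normalized to a correlation matrix, drawn with a diverging colormap and symmetric limits so the neutral color sits at zero, and only the lower triangle is shown. The sample data and variable names are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)
samples = rng.normal(size=(500, 6))      # toy data: 500 draws of 6 parameters
samples[:, 1] += 0.8 * samples[:, 0]     # induce a positive correlation
samples[:, 3] -= 0.6 * samples[:, 2]     # and a negative one
cov = np.cov(samples, rowvar=False)

# Normalize covariance to correlation: corr_ij = cov_ij / (sigma_i sigma_j)
std = np.sqrt(np.diag(cov))
corr = cov / np.outer(std, std)

# Mask the strict upper triangle; the matrix is symmetric, so nothing is lost
upper = np.triu(np.ones_like(corr, dtype=bool), k=1)
lower = np.ma.masked_where(upper, corr)

fig, ax = plt.subplots(figsize=(4, 3.5))
im = ax.imshow(lower, cmap="RdBu_r", vmin=-1.0, vmax=1.0)  # symmetric limits
fig.colorbar(im, ax=ax, label="correlation coefficient")
```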

Triangle/Corner Plots

  • Standard layout: 1D posteriors on diagonal, 2D contours off-diagonal
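  • Contour levels: 68% and 95% credible regions (conventionally called 1sigma and 2sigma; for a 2D Gaussian the literal 1sigma and 2sigma ellipses enclose only about 39% and 86% of the probability)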
  • Consistent axis ranges across all panels showing same parameter
  • Direct parameter labels on axes, not legend
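
To draw 68% and 95% contours from samples, one common approach is to find the density thresholds whose super-level sets enclose those mass fractions. The sketch below does this on a 2D histogram of toy Gaussian samples; `mass_levels` is a hypothetical helper, and dedicated corner-plot packages handle this internally with their own smoothing choices.

```python
import numpy as np


def mass_levels(hist2d, fractions=(0.68, 0.95)):
    """Return density thresholds whose super-level sets enclose the given mass fractions."""
    flat = np.sort(hist2d.ravel())[::-1]         # densest bins first
    cumulative = np.cumsum(flat) / flat.sum()    # mass enclosed as bins are added
    levels = []
    for frac in fractions:
        idx = np.searchsorted(cumulative, frac)  # first bin reaching the fraction
        levels.append(flat[min(idx, flat.size - 1)])
    return levels


rng = np.random.default_rng(3)
# Toy posterior samples (assumption): a correlated 2D Gaussian
samples = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=50_000)
hist, xedges, yedges = np.histogram2d(samples[:, 0], samples[:, 1], bins=40)

level_68, level_95 = mass_levels(hist)
enclosed_68 = hist[hist >= level_68].sum() / hist.sum()
enclosed_95 = hist[hist >= level_95].sum() / hist.sum()
```

The thresholds can then be passed, in increasing order, as `levels=[level_95, level_68]` to a matplotlib `contour` call on the transposed histogram evaluated at the bin centers.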

Sky Maps (Healpix/Mollweide)

  • Projection matters: Mollweide for full-sky, orthographic for regions
  • Graticule: RA/Dec grid, labeled at edges
  • Sequential colormap for intensity, diverging for residuals
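
A sketch of a full-sky residual map using matplotlib's built-in Mollweide projection and a toy signed field. HEALPix maps are often drawn with healpy instead; plain matplotlib is used here only to keep the example self-contained, and the regular longitude/latitude grid is an assumption. Matplotlib's geographic axes expect angles in radians.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Cell edges in radians: longitude in [-pi, pi], latitude in [-pi/2, pi/2]
lon_edges = np.linspace(-np.pi, np.pi, 181)
lat_edges = np.linspace(-np.pi / 2, np.pi / 2, 91)
lon_c = 0.5 * (lon_edges[:-1] + lon_edges[1:])
lat_c = 0.5 * (lat_edges[:-1] + lat_edges[1:])
lon_grid, lat_grid = np.meshgrid(lon_c, lat_c)

# Toy residual field (assumption): signed, so a diverging colormap applies
signal = np.cos(3 * lon_grid) * np.cos(lat_grid) ** 2

fig = plt.figure(figsize=(6, 3))
ax = fig.add_subplot(111, projection="mollweide")
limit = np.abs(signal).max()
mesh = ax.pcolormesh(lon_edges, lat_edges, signal, cmap="RdBu_r",
                     vmin=-limit, vmax=limit)        # symmetric about zero
ax.grid(True, color="0.5", linewidth=0.5)            # graticule
fig.colorbar(mesh, ax=ax, orientation="horizontal", pad=0.08,
             label="residual (arbitrary units)")
```

For an intensity map, the same layout would take a sequential colormap and unsigned limits.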

Error Representation

  • Asymmetric errors: Make asymmetry visually obvious
  • Bands vs bars: Use bands for continuous functions, bars for discrete points
  • Multiple sigma levels: Gradient opacity (dark = 1sigma, light = 2sigma)
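
These three rules can be sketched together on toy data: nested `fill_between` bands for a continuous function (the inner band is darker), and `errorbar` with separate lower and upper errors for discrete points. All values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Continuous function: bands, with opacity encoding the sigma level
x = np.linspace(0, 10, 200)
mean = np.sin(x)
sigma = 0.1 + 0.02 * x   # toy uncertainty growing with x

fig, ax = plt.subplots(figsize=(5, 3))
band_2s = ax.fill_between(x, mean - 2 * sigma, mean + 2 * sigma,
                          color="tab:blue", alpha=0.2, linewidth=0)
band_1s = ax.fill_between(x, mean - sigma, mean + sigma,
                          color="tab:blue", alpha=0.45, linewidth=0)
ax.plot(x, mean, color="tab:blue", linewidth=1)

# Discrete points: bars, with asymmetric lower/upper errors passed as a pair
xp = np.array([1.0, 4.0, 7.0])
yp = np.sin(xp)
err_lo = np.array([0.10, 0.15, 0.30])
err_hi = np.array([0.30, 0.15, 0.10])
bars = ax.errorbar(xp, yp, yerr=[err_lo, err_hi], fmt="o",
                   color="black", capsize=3)
```

Because the 1sigma band is drawn on top of the 2sigma band, the two opacities stack and the inner region reads as clearly darker.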

Encoding Principles

Brief rules from perceptual science:

Preattentive Attributes (Cairo)

These "pop out" in <250ms—use for key distinctions:

  • Color (hue)
  • Size
  • Position
  • Orientation

If your main finding should be visible at a glance, encode it preattentively.

Working Memory Limits

Humans hold ~4 chunks in working memory:

  • Legends with >4 items require constant back-and-forth
  • Direct labeling dramatically reduces cognitive load
  • Group by meaningful categories to chunk (8 items -> 2 groups of 4)

Redundant Encoding (Wilke)

Never rely on color alone:

  • Shape + color for categories
  • Position + color for emphasis
  • Ensures colorblind safety and bad projector survival

The Refinement Loop

After generating the plot, inspect against:

The Squint Test (Knaflic)

Squint at the figure. What stands out? If it's not your main finding, you have:

  • Clutter competing with signal
  • Wrong visual hierarchy
  • Preattentive attributes on wrong elements

Data-Ink Ratio (Tufte)

For each element, ask: "Does this earn its ink?"

  • Remove chart frames if not essential
  • Lighten or remove gridlines
  • Replace legends with direct labels
  • Remove redundant axis lines
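
A hedged sketch of how this checklist might be applied to a matplotlib Axes. `declutter` is a hypothetical helper, not a library function, and the specific values (grid shade, tick length) are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def declutter(ax):
    """Strip non-essential ink from an Axes, in place."""
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)      # drop the redundant frame lines
    ax.grid(True, color="0.9", linewidth=0.5)   # lighten the grid
    ax.set_axisbelow(True)                      # keep the grid behind the data
    ax.tick_params(length=3, width=0.5)
    legend = ax.get_legend()
    if legend is not None:
        legend.set_frame_on(False)              # no box around the legend
    return ax


fig, ax = plt.subplots()
ax.plot(np.arange(10), np.arange(10) ** 2, label="toy data")
ax.legend()
declutter(ax)
```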

The 1+1=3 Principle (Tufte)

Two elements placed close together create a third, emergent one: the space between them. Check:

  • Dense grids creating moire
  • Grouped bars creating unintended rhythms
  • Close parallel lines creating "third" shapes

Colorblind Check

Verify with a color-vision-deficiency (CVD) simulation; viridis is designed to be CVD-safe. Also test: would the message survive grayscale printing?
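
The grayscale half of this check can be roughly automated by asking whether a colormap's brightness changes direction along its length. The sketch below uses Rec. 601 luma weights as a crude proxy for printed grayscale; `luma_reversals` is a hypothetical helper, and it does not simulate CVD, which needs dedicated tooling.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def luma_reversals(cmap_name, n=256, tol=1e-3):
    """Count direction changes in approximate luma along a colormap."""
    rgb = plt.get_cmap(cmap_name)(np.linspace(0, 1, n))[:, :3]
    luma = rgb @ np.array([0.299, 0.587, 0.114])   # Rec. 601 weights (approximation)
    steps = np.diff(luma)
    steps = steps[np.abs(steps) > tol]             # ignore nearly flat stretches
    return int(np.sum(np.sign(steps[1:]) != np.sign(steps[:-1])))


reversals = {name: luma_reversals(name) for name in ("gray", "viridis", "jet")}
```

A count of zero means brightness runs in one direction, so the ordering of values should survive a grayscale print; a rainbow map such as jet brightens and then darkens again, so distinct values can collapse to the same gray.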

Reference Files

Consult as needed:

Library preference: Use seaborn over raw matplotlib when possible. Seaborn provides cleaner defaults and better statistical visualization primitives.

Quick Reference: Common Mistakes

| Mistake | Fix |
|---|---|
| Jet/rainbow colormap | Use forestdawn (diverging) or mako/rocket (sequential) |
| >5 colors in legend | Small multiples or direct labeling |
| Dual y-axes | Two separate plots or faceting |
| 3D effects | Never. Use 2D with color/facets |
| Pie charts for comparison | Dot plot or bar chart |
| Bar chart not starting at zero | Start at zero (length encoding) or use dot plot |
| Truncated axis exaggerating effect | Show full range or use log scale |
| Heavy matplotlib defaults | Apply decluttering checklist |
