Interactive Notebooks
Use this skill for creating reproducible, well-structured notebooks for data exploration, analysis, and communication.
When to use this skill
- Exploratory analysis: interactively investigate data
- Reproducible research: document methodology with code and results
- Teaching/demos: explain concepts with executable examples
- Stakeholder communication: share insights with narrative and visuals
- Prototyping: quickly iterate on data transformations or models
Tool selection
| Tool | Best for | Key feature |
|------|----------|-------------|
| JupyterLab | Traditional data science, extensions ecosystem | Full IDE experience |
| marimo | Reproducible notebooks, reactive execution | Python-native, version-control friendly |
| VS Code + Jupyter | IDE-native notebook experience | IntelliSense, debugging, Git integration |
| Google Colab | Cloud GPUs, sharing, collaboration | Free TPU/GPU, easy sharing |
Core principles
- Structure for readability
Title: Clear project/question description
  Setup: imports and configuration
  Data Loading: load and validate data
  Analysis:
    - Subsection per question/hypothesis
    - Clear markdown explanations
    - Visualizations with interpretations
  Conclusions: key findings and next steps
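The "load and validate" step in the outline above could look roughly like the sketch below. It sticks to the standard library so it runs anywhere; the helper `load_and_validate`, the column names `id` and `value`, and the sample data are all hypothetical, and in a real notebook the same checks would typically be applied to a pandas DataFrame read from the data/ folder.

```python
import csv
import io


def load_and_validate(handle, required_columns):
    """Read CSV rows from a file-like object; fail fast on missing columns or empty data."""
    reader = csv.DictReader(handle)
    missing = set(required_columns) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    rows = list(reader)
    if not rows:
        raise ValueError("No data rows found")
    return rows


# Hypothetical sample standing in for a file opened from the data/ folder
sample = io.StringIO("id,value\n1,10.5\n2,7.25\n")
rows = load_and_validate(sample, required_columns=["id", "value"])
print(f"Loaded {len(rows)} rows")
```

Failing early in the Data Loading section keeps later analysis cells from running on incomplete input.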
- Ensure reproducibility
Set random seeds
import random
import numpy as np

random.seed(42)
np.random.seed(42)
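One way to keep seeding consistent across notebooks is to wrap it in a small helper. The sketch below is an illustration, not a required pattern: `set_seeds` is a hypothetical name, and NumPy is treated as optional so the helper also works in environments without it.

```python
import random


def set_seeds(seed=42):
    """Seed the stdlib RNG and, if installed, NumPy's legacy global RNG."""
    random.seed(seed)
    try:
        import numpy as np
    except ImportError:
        return  # NumPy not installed; stdlib seeding still applies
    np.random.seed(seed)


set_seeds(42)
first = random.random()
set_seeds(42)
second = random.random()
print(first == second)  # same seed, same draw
```

For new NumPy code, passing an explicit generator created with `np.random.default_rng(seed)` is generally preferred over the global seed, since it avoids hidden shared state between cells.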
Pin versions in requirements.txt or environment.yml
requirements.txt example:
pandas==2.1.0
scikit-learn==1.3.0
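Pinned versions only help if the running environment matches them. As a rough sketch, a setup cell could compare the pins against what is installed using `importlib.metadata` from the standard library. The helpers `parse_pins` and `find_mismatches` are hypothetical, and the parser only handles plain `name==version` lines (no extras, markers, or version ranges).

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(requirements_text):
    """Extract {package: version} from lines of the form `name==version`."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, pinned = line.split("==", 1)
            pins[name.strip()] = pinned.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (pinned, installed)} for differences; installed is None if absent."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


pins = parse_pins("pandas==2.1.0\nscikit-learn==1.3.0\n")
for name, (pinned, installed) in find_mismatches(pins).items():
    print(f"{name}: pinned {pinned}, installed {installed}")
```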
- Keep cells focused
- One concept per cell
- Avoid cells longer than 50 lines
- Refactor helper functions to .py files
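As an illustration of moving a helper out of a cell, the function below could live in a module next to the notebook. The module name `helpers.py` and the function `clean_column_names` are hypothetical; the point is only that the notebook cell shrinks to an import and a call, and the helper becomes testable on its own.

```python
# Contents of a hypothetical helpers.py, importable from the notebook
import re


def clean_column_names(columns):
    """Normalize labels to snake_case: trim, lowercase, collapse non-alphanumerics to '_'."""
    cleaned = []
    for col in columns:
        name = re.sub(r"[^0-9a-zA-Z]+", "_", col.strip().lower())
        cleaned.append(name.strip("_"))
    return cleaned


# In the notebook, the cell then reduces to:
# from helpers import clean_column_names
print(clean_column_names([" Total Sales ", "Unit-Price (USD)"]))
```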
- Never hardcode secrets
✅ Use environment variables
import os
api_key = os.environ.get("OPENAI_API_KEY")
❌ Never do this
api_key = "sk-abc123..."
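Note that `os.environ.get` returns `None` when the variable is unset, which can surface later as a confusing authentication error. A small wrapper that fails immediately is one option; `require_env` below is a hypothetical helper, not part of any library.

```python
import os


def require_env(name):
    """Return an environment variable's value, or raise a clear error if unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running this notebook")
    return value


# Usage in a notebook cell, with the variable name from the example above:
# api_key = require_env("OPENAI_API_KEY")
```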
Jupyter best practices
Magic commands (Jupyter/IPython)
In a Jupyter cell (these are IPython magics, not standard Python)
Auto-reload modules during development
%load_ext autoreload
%autoreload 2
Timing
%timeit function_call()
Debugging
%debug
Environment info (requires the watermark package; load the extension first)
%load_ext watermark
%watermark -v -m -p numpy,pandas,sklearn
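Magics only work inside IPython. When the same measurement needs to live in a helper module or script, the standard library's `timeit` gives a comparable result. The sketch below assumes a placeholder `function_call` workload; the repeat and number values are arbitrary choices.

```python
import timeit


def function_call():
    """Placeholder workload standing in for the function being measured."""
    return sum(i * i for i in range(1_000))


# Run the call 200 times per trial, repeat 5 trials, and report the best trial
trials = timeit.repeat(function_call, number=200, repeat=5)
best_per_call = min(trials) / 200
print(f"best of 5: {best_per_call * 1e6:.1f} microseconds per call")
```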
Clean outputs before git
Using nbstripout
pip install nbstripout
nbstripout --install
Or as a pre-commit hook (requires a .pre-commit-config.yaml that lists the nbstripout hook)
pip install pre-commit
pre-commit install
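To make concrete what "cleaning outputs" means, the sketch below clears the two fields that cause most notebook diff noise: `outputs` and `execution_count` on code cells. It is a simplified illustration on a hand-built nbformat-4 dictionary, not nbstripout's actual implementation, which handles more metadata.

```python
import json


def strip_outputs(notebook):
    """Clear outputs and execution counts from code cells of a notebook dict, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


# Minimal hypothetical notebook with one executed code cell
nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["1 + 1"],
            "execution_count": 3,
            "outputs": [{"output_type": "execute_result", "execution_count": 3,
                         "data": {"text/plain": ["2"]}, "metadata": {}}],
        },
    ],
}
print(json.dumps(strip_outputs(nb)["cells"][1], indent=1))
```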
marimo advantages
Reactive execution
marimo notebook - cells auto-recompute when dependencies change
import marimo as mo
slider = mo.ui.slider(1, 100, value=50)
slider  # Display the slider
This cell re-runs automatically when slider changes
df_filtered = df[df['value'] > slider.value]
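The idea behind reactive execution can be illustrated without marimo itself. The toy model below is a sketch and makes no claim about how marimo works internally: `ToyReactiveNotebook` is a hypothetical class in which each "cell" declares the variables it reads, and changing a variable re-runs every cell downstream of it.

```python
class ToyReactiveNotebook:
    """Toy dependency tracker: a cell re-runs when any variable it reads changes."""

    def __init__(self):
        self.values = {}  # variable name -> current value
        self.cells = []   # (output name, input names, function), in definition order

    def cell(self, output, inputs, fn):
        """Register a cell that computes `output` from the named `inputs`."""
        self.cells.append((output, tuple(inputs), fn))

    def set(self, name, value):
        """Assign a variable (e.g. a slider value) and re-run every affected cell."""
        self.values[name] = value
        changed = {name}
        # Assumes cells are registered after the cells they depend on
        for output, inputs, fn in self.cells:
            ready = all(i in self.values for i in inputs)
            if ready and changed.intersection(inputs):
                self.values[output] = fn(*(self.values[i] for i in inputs))
                changed.add(output)


nb = ToyReactiveNotebook()
nb.cell("filtered", ["rows", "threshold"],
        lambda rows, threshold: [r for r in rows if r > threshold])
nb.cell("count", ["filtered"], len)

nb.set("rows", [10, 40, 60, 90])
nb.set("threshold", 50)
print(nb.values["count"])  # 2
nb.set("threshold", 5)     # dependents recompute without being re-run by hand
print(nb.values["count"])  # 4
```

Because dependents always recompute, the displayed state cannot drift from the code, which is the property that out-of-order execution in a traditional notebook breaks.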
Version control friendly
- Pure Python (.py files)
- No output blobs in git
- Readable diffs
Convert Jupyter to marimo
marimo convert notebook.ipynb -o notebook.py
Common anti-patterns
- ❌ Running cells out of order (Jupyter)
- ❌ Giant cells with mixed concerns
- ❌ Hardcoded file paths
- ❌ No markdown explanations
- ❌ Committing large output files
- ❌ Inline data (use data/ folder)
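For the hardcoded-path item in the list above, one possible alternative is to build paths from a single configurable base directory. In this sketch the `DATA_DIR` environment variable, the `data_path` helper, and the file name `sales.csv` are all hypothetical.

```python
import os
from pathlib import Path

# Base directory is configurable; falls back to a data/ folder next to the notebook
DATA_DIR = Path(os.environ.get("DATA_DIR", "data"))


def data_path(filename, base=DATA_DIR):
    """Build a path under the data directory instead of hardcoding an absolute location."""
    return Path(base) / filename


print(data_path("sales.csv"))
```

Collaborators can then point the notebook at their own copy of the data by setting one variable rather than editing cells.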
Progressive disclosure
- ../references/jupyter-advanced.md: Widgets, extensions, debugging
- ../references/marimo-guide.md: Reactive patterns, UI components
- ../references/notebook-testing.md: Unit tests for notebook code
- ../references/sharing-publishing.md: nbconvert, Quarto, Voilà
Related skills
- @data-science-eda: Exploration patterns for notebooks
- @data-science-interactive-apps: Convert notebooks to apps
- @data-engineering-core: Production-ready code patterns
References
- Jupyter Documentation
- marimo Documentation
- nbstripout
- Quarto (publishing)