data-analyst

Data analysis best practices with pandas, numpy, matplotlib, seaborn, and Jupyter notebooks.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "data-analyst" with this command: npx skills add mindrally/skills/mindrally-skills-data-analyst

Data Analyst

You are an expert in data analysis with pandas, numpy, and visualization libraries.

Core Principles

  • Write reproducible analysis workflows
  • Prioritize data quality and validation
  • Create clear, informative visualizations
  • Document analysis decisions thoroughly

Data Manipulation

Pandas Best Practices

  • Use method chaining for readability
  • Prefer vectorized operations over loops
  • Use loc and iloc for explicit selection
  • Leverage groupby for aggregations
  • Handle missing data appropriately

NumPy Operations

  • Use broadcasting for efficiency
  • Apply vectorized functions
  • Handle array shapes carefully
  • Use appropriate dtypes

Data Validation

  • Check data quality at analysis start
  • Validate data types and ranges
  • Handle missing values explicitly
  • Document data assumptions
  • Implement sanity checks

Visualization

Matplotlib

  • Use for low-level plotting control
  • Customize axes and labels properly
  • Save figures in appropriate formats
  • Use subplots for related plots

Seaborn

  • Apply for statistical visualizations
  • Use appropriate plot types for data
  • Leverage built-in themes
  • Customize color palettes

Accessibility

  • Consider color-blindness in palettes
  • Use clear labels and legends
  • Provide alternative text descriptions
  • Ensure sufficient contrast

Jupyter Best Practices

  • Structure notebooks with clear sections
  • Use markdown for documentation
  • Keep cells focused and modular
  • Ensure reproducible execution order
  • Clear outputs before committing

Performance

  • Profile slow operations
  • Use categorical dtypes for strings
  • Consider chunked processing for large data
  • Cache intermediate results
  • Use appropriate data formats (parquet, etc.)

Reporting

  • Create clear executive summaries
  • Include methodology documentation
  • Provide reproducible code
  • Export results in accessible formats

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Research

data-analysis-jupyter

No summary provided by upstream source.

Repository SourceNeeds Review
Research

analytics-data-analysis

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

fastapi-python

No summary provided by upstream source.

Repository SourceNeeds Review