data-export-excel

Excel Data Export (Universal)

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "data-export-excel" with this command: npx skills add starlitnightly/omicverse/starlitnightly-omicverse-data-export-excel

Excel Data Export (Universal)

Overview

This skill enables you to export bioinformatics data, analysis results, and formatted tables to professional Excel spreadsheets. Unlike cloud-hosted solutions, this skill uses the openpyxl Python library and executes locally in your environment, making it compatible with ALL LLM providers including GPT, Gemini, Claude, DeepSeek, and Qwen.

When to Use This Skill

  • Export AnnData observations (.obs) or variables (.var) to Excel

  • Save DEG analysis results with formatting

  • Create multi-sheet workbooks with different data types

  • Generate formatted Excel reports with cell styling

  • Export cluster annotations, cell type assignments, or quality control metrics

How to Use

Step 1: Import Required Libraries

import openpyxl from openpyxl import Workbook from openpyxl.styles import Font, PatternFill, Alignment, Border, Side from openpyxl.utils.dataframe import dataframe_to_rows import pandas as pd import numpy as np

Step 2: Prepare Your Data

Convert your data to pandas DataFrame format:

Example: Export AnnData observations

df = adata.obs.copy()

Example: Export DEG results

deg_df = pd.DataFrame({ 'gene': gene_names, 'log2FC': log2_fold_changes, 'pvalue': pvalues, 'qvalue': qvalues })

Example: Export cluster statistics

cluster_stats = adata.obs.groupby('clusters').size().reset_index(name='count')

Step 3: Create Excel Workbook

Create new workbook

wb = Workbook() ws = wb.active ws.title = "Sheet Name"

Write DataFrame to worksheet

for r in dataframe_to_rows(df, index=False, header=True): ws.append(r)

Step 4: Add Formatting (Optional)

Style header row

header_fill = PatternFill(start_color="4472C4", end_color="4472C4", fill_type="solid") header_font = Font(bold=True, color="FFFFFF")

for cell in ws[1]: cell.fill = header_fill cell.font = header_font cell.alignment = Alignment(horizontal='center')

Auto-adjust column widths

for column in ws.columns: max_length = 0 column_letter = column[0].column_letter for cell in column: try: if len(str(cell.value)) > max_length: max_length = len(str(cell.value)) except: pass adjusted_width = min(max_length + 2, 50) ws.column_dimensions[column_letter].width = adjusted_width

Add borders

thin_border = Border( left=Side(style='thin'), right=Side(style='thin'), top=Side(style='thin'), bottom=Side(style='thin') ) for row in ws.iter_rows(min_row=1, max_row=ws.max_row, min_col=1, max_col=ws.max_column): for cell in row: cell.border = thin_border

Step 5: Save the Workbook

Save to file

output_path = "analysis_results.xlsx" wb.save(output_path) print(f"✅ Excel file saved to: {output_path}")

Multi-Sheet Workbooks

Create workbooks with multiple sheets for different data types:

wb = Workbook()

Sheet 1: Cell metadata

ws1 = wb.active ws1.title = "Cell Metadata" for r in dataframe_to_rows(adata.obs, index=True, header=True): ws1.append(r)

Sheet 2: Gene metadata

ws2 = wb.create_sheet("Gene Metadata") for r in dataframe_to_rows(adata.var, index=True, header=True): ws2.append(r)

Sheet 3: DEG results

ws3 = wb.create_sheet("DEG Results") for r in dataframe_to_rows(deg_df, index=False, header=True): ws3.append(r)

wb.save("multi_sheet_analysis.xlsx")

Best Practices

  • Column Headers: Always include column headers in the first row

  • Data Types: Convert numpy arrays to lists before writing

  • Large Datasets: For datasets >100K rows, consider CSV export instead

  • File Paths: Use absolute paths or ensure output directory exists

  • Formatting: Apply formatting sparingly to reduce file size

  • Index: Decide whether to include DataFrame index (set index=True/False in dataframe_to_rows )

Common Use Cases

Export Quality Control Metrics

qc_metrics = adata.obs[['n_genes', 'n_counts', 'percent_mito', 'clusters']].copy()

wb = Workbook() ws = wb.active ws.title = "QC Metrics"

for r in dataframe_to_rows(qc_metrics, index=False, header=True): ws.append(r)

Highlight cells with high mitochondrial content

for row in range(2, ws.max_row + 1): if ws.cell(row, 3).value > 0.2: # percent_mito > 20% ws.cell(row, 3).fill = PatternFill(start_color="FFC7CE", end_color="FFC7CE", fill_type="solid")

wb.save("qc_metrics.xlsx")

Export Marker Genes by Cluster

Assuming you have marker genes for each cluster

marker_dict = { 'Cluster_0': ['CD3D', 'CD3E', 'CD8A'], 'Cluster_1': ['CD79A', 'MS4A1', 'CD19'], 'Cluster_2': ['LYZ', 'S100A9', 'CD14'] }

wb = Workbook()

for cluster_name, genes in marker_dict.items(): ws = wb.create_sheet(cluster_name) ws.append(['Marker Gene']) for gene in genes: ws.append([gene])

Remove default sheet

if 'Sheet' in wb.sheetnames: wb.remove(wb['Sheet'])

wb.save("marker_genes.xlsx")

Export DEG Analysis with Conditional Formatting

wb = Workbook() ws = wb.active ws.title = "DEG Analysis"

Write DEG results

for r in dataframe_to_rows(deg_df, index=False, header=True): ws.append(r)

Color code by fold change

for row in range(2, ws.max_row + 1): log2fc = ws.cell(row, 2).value # Assuming log2FC in column 2 if log2fc > 1: # Upregulated ws.cell(row, 2).fill = PatternFill(start_color="C6EFCE", end_color="C6EFCE", fill_type="solid") elif log2fc < -1: # Downregulated ws.cell(row, 2).fill = PatternFill(start_color="FFC7CE", end_color="FFC7CE", fill_type="solid")

wb.save("deg_results_formatted.xlsx")

Troubleshooting

Issue: "openpyxl not found"

Solution: Install the library:

import subprocess subprocess.check_call(['pip', 'install', 'openpyxl'])

Issue: "Invalid data type for cell"

Solution: Convert numpy/pandas types to native Python types:

Convert numpy types

df = df.astype(object).where(pd.notnull(df), None)

Or convert specific columns

df['column_name'] = df['column_name'].astype(str)

Issue: "Memory error with large datasets"

Solution: Export in chunks or use CSV format instead:

Fallback to CSV for large data

df.to_csv('large_dataset.csv', index=False) print("Dataset too large for Excel, saved as CSV instead")

Technical Notes

  • Library: Uses openpyxl (pure Python, no external dependencies)

  • Execution: Runs locally in the agent's sandbox

  • Compatibility: Works with ALL LLM providers (GPT, Gemini, Claude, DeepSeek, Qwen, etc.)

  • File Limits: Excel has a 1,048,576 row limit (use CSV for larger datasets)

  • Performance: Writing ~10K rows takes 1-2 seconds; 100K rows takes 10-20 seconds

References

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

data-viz-plots

No summary provided by upstream source.

Repository SourceNeeds Review
General

data-export-pdf

No summary provided by upstream source.

Repository SourceNeeds Review
General

single-cell-multi-omics-integration

No summary provided by upstream source.

Repository SourceNeeds Review
General

single-cell-cellphonedb-communication-mapping

No summary provided by upstream source.

Repository SourceNeeds Review