spatial-transcriptomics-tutorials-with-omicverse

Spatial Transcriptomics with OmicVerse

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "spatial-transcriptomics-tutorials-with-omicverse" with this command: npx skills add starlitnightly/omicverse/starlitnightly-omicverse-spatial-transcriptomics-tutorials-with-omicverse

Spatial Transcriptomics with OmicVerse

This skill covers spatial analysis workflows organized into three stages: Preprocessing, Deconvolution, and Downstream Analysis. Each stage includes the critical function calls, parameter guidance, and common pitfalls.

Defensive Validation: Always Check Spatial Coordinates First

Before ANY spatial operation, verify that spatial coordinates exist and are numeric:

Required check before spatial analysis

assert 'spatial' in adata.obsm,
"Missing adata.obsm['spatial']. Load with ov.io.spatial.read_visium() or set manually."

Cast to float64 to prevent coordinate precision issues during rotation/cropping

adata.obsm['spatial'] = adata.obsm['spatial'].astype('float64')

Stage 1: Preprocessing

Crop, Rotate, and Align Coordinates

Load Visium data and manipulate spatial coordinates for region selection and alignment:

import scanpy as sc, omicverse as ov ov.plot_set()

adata = sc.datasets.visium_sge(sample_id="V1_Breast_Cancer_Block_A_Section_1") library_id = list(adata.uns['spatial'].keys())[0]

Cast coordinates before manipulation

adata.obsm['spatial'] = adata.obsm['spatial'].astype('float64')

Crop to region of interest

adata_crop = ov.space.crop_space_visium(adata, crop_loc=(0, 0), crop_area=(1000, 1000), library_id=library_id, scale=1)

Rotate and auto-align

adata_rot = ov.space.rotate_space_visium(adata, angle=45, library_id=library_id) ov.space.map_spatial_auto(adata_rot, method='phase')

For manual refinement: ov.space.map_spatial_manual(adata_rot, ...)

Visium HD Cell Segmentation

Segment Visium HD bins into cells using cellpose:

adata = ov.space.read_visium_10x(path="binned_outputs/square_002um/", source_image_path="tissue_image.btf") ov.pp.filter_genes(adata, min_cells=3) ov.pp.filter_cells(adata, min_counts=1)

H&E-based segmentation

adata = ov.space.visium_10x_hd_cellpose_he(adata, mpp=0.3, gpu=True, buffer=150)

Expand labels to neighboring bins

ov.space.visium_10x_hd_cellpose_expand(adata, labels_key='labels_he', expanded_labels_key='labels_he_expanded', max_bin_distance=4)

Gene-expression-driven seeds

ov.space.visium_10x_hd_cellpose_gex(adata, obs_key="n_counts_adjusted", mpp=0.3, sigma=5)

Merge labels and aggregate to cell-level

ov.space.salvage_secondary_labels(adata, primary_label='labels_he_expanded', secondary_label='labels_gex', labels_key='labels_joint') cdata = ov.space.bin2cell(adata, labels_key='labels_joint')

Stage 2: Deconvolution

Critical API Reference: Method Selection

CORRECT — Tangram: method passed to deconvolution() call

decov_obj = ov.space.Deconvolution(adata_sc=sc_adata, adata_sp=sp_adata, celltype_key='Subset', result_dir='result/tangram') decov_obj.preprocess_sc(max_cells=5000) decov_obj.preprocess_sp() decov_obj.deconvolution(method='tangram', num_epochs=1000)

CORRECT — cell2location: method passed at INIT time

cell2_obj = ov.space.Deconvolution(adata_sc=sc_adata, adata_sp=sp_adata, celltype_key='Subset', result_dir='result/c2l', method='cell2location') cell2_obj.deconvolution(max_epochs=30000) cell2_obj.save_model('result/c2l/model')

Note: For cell2location, the method parameter is set at initialization, not at the deconvolution() call. For Tangram, it's passed to deconvolution() .

Starfysh Archetypal Deconvolution

from omicverse.external.starfysh import AA, utils, plot_utils

visium_args = utils.prepare_data(adata_path="data/counts.h5ad", signature_path="data/signatures.csv", min_cells=10, filter_hvg=True, n_top_genes=3000) adata, adata_normed = visium_args.get_adata() aa_model = AA.ArchetypalAnalysis(adata_orig=adata_normed) aa_model.fit(k=12, n_init=10) visium_args = utils.refine_anchors(visium_args, aa_model, add_marker=True) model, history = utils.run_starfysh(visium_args, poe=False, n_repeat=5, lr=5e-3, max_epochs=500)

Stage 3: Downstream Analysis

Spatial Clustering

GraphST + mclust

ov.utils.cluster(adata, use_rep='graphst|original|X_pca', method='mclust', n_components=7)

STAGATE

ov.utils.cluster(adata, use_rep='STAGATE', method='mclust', n_components=7)

Merge small clusters

ov.space.merge_cluster(adata, groupby='mclust', resolution=0.5)

Algorithm choice: GraphST and STAGATE require precalculated latent spaces. For standard clustering without spatial-aware embeddings, use Leiden/Louvain directly.

Multi-Slice Integration (STAligner)

import anndata as ad Batch_list = [ov.read(p) for p in slice_paths] adata_concat = ad.concat(Batch_list, label='slice_name', keys=section_ids) STAligner_obj = ov.space.pySTAligner(adata=adata_concat, batch_key='slice_name', hidden_dims=[256, 64], use_gpu=True) STAligner_obj.train_STAligner_subgraph(nepochs=800, lr=1e-3) STAligner_obj.train() adata_aligned = STAligner_obj.predicted() sc.pp.neighbors(adata_aligned, use_rep='STAligner')

Spatial Trajectories

SpaceFlow — pseudo-spatial maps:

sf_obj = ov.space.pySpaceFlow(adata) sf_obj.train(spatial_regularization_strength=0.1, num_epochs=300, patience=50) sf_obj.cal_pSM(n_neighbors=20, resolution=1.0)

STT — transition dynamics:

STT_obj = ov.space.STT(adata, spatial_loc='xy_loc', region='Region', n_neighbors=20) STT_obj.stage_estimate() STT_obj.train(n_states=9, n_iter=15, weight_connectivities=0.5)

Cell Communication (COMMOT + FlowSig)

df_cellchat = ov.external.commot.pp.ligand_receptor_database(species='human', database='cellchat') df_cellchat = ov.external.commot.pp.filter_lr_database(df_cellchat, adata, min_expr_frac=0.05) ov.external.commot.tl.spatial_communication(adata, lr_database=df_cellchat, distance_threshold=500, result_prefix='cellchat')

FlowSig network

adata.layers['normalized'] = adata.X.copy() ov.external.flowsig.tl.construct_intercellular_flow_network( adata, commot_output_key='commot-cellchat', flowsig_output_key='flowsig-cellchat', edge_threshold=0.7)

Structural Layers (GASTON) and Slice Alignment (SLAT)

GASTON — iso-depth estimation:

gas_obj = ov.space.GASTON(adata) A = gas_obj.prepare_inputs(n_pcs=50) gas_obj.load_rescale(A) gas_obj.train(hidden_dims=[64, 32], dropout=0.1, max_epochs=2000) gaston_isodepth, gaston_labels = gas_obj.cal_iso_depth(n_layers=5)

SLAT — cross-slice alignment:

from omicverse.external.scSLAT.model import Cal_Spatial_Net, load_anndatas, run_SLAT, spatial_match Cal_Spatial_Net(adata1, k_cutoff=20, model='KNN') Cal_Spatial_Net(adata2, k_cutoff=20, model='KNN') edges, features = load_anndatas([adata1, adata2], feature='DPCA', check_order=False) embeddings, *_ = run_SLAT(features, edges, LGCN_layer=5) best, index, distance = spatial_match(embeddings, adatas=[adata1, adata2])

Troubleshooting

  • ValueError: spatial coordinates out of bounds after rotation: Cast adata.obsm['spatial'] to float64 BEFORE calling rotate_space_visium . Integer coordinates lose precision during trigonometric rotation.

  • Cellpose segmentation fails with memory error: For large .btf images, use backend='tifffile' to memory-map the image. Reduce buffer parameter if GPU memory is insufficient.

  • Gene ID overlap failure in Tangram/cell2location: Harmonise identifiers (ENSEMBL vs gene symbols) between adata_sc and adata_sp before calling preprocess_sc /preprocess_sp . Drop non-overlapping genes.

  • mclust clustering error: Requires rpy2 and the R mclust package. If R bindings are unavailable, switch to method='louvain' or method='leiden' .

  • STAligner/SpaceFlow embeddings collapse to a single point: Verify adata.obsm['spatial'] exists and coordinates are scaled appropriately. Tune learning rate (try lr=5e-4 ) and regularisation strength.

  • FlowSig returns empty network: Build spatial neighbor graphs before Moran's I filtering. Increase bootstraps or lower edge_threshold (try 0.5) if the network is too sparse.

  • GASTON RuntimeError in training: Provide a writable out_dir path. PyTorch nondeterminism may cause variation between runs—set torch.manual_seed() for reproducibility.

  • SLAT alignment has many low-quality matches: Regenerate spatial graphs with a higher k_cutoff value. Inspect low_quality_index flags and filter cells with high distance scores.

  • STT pathway enrichment fails: gseapy needs network access for gene set downloads. Cache gene sets locally with ov.utils.geneset_prepare() and pass the dictionary directly.

Dependencies

  • Core: omicverse , scanpy , anndata , squidpy , numpy , matplotlib

  • Segmentation: cellpose , opencv-python /tifffile , optional GPU PyTorch

  • Deconvolution: tangram-sc , cell2location , pytorch-lightning ; Starfysh needs torch , scikit-learn

  • Downstream: scikit-learn , commot , flowsig , gseapy , torch-backed modules (STAligner, SpaceFlow, GASTON, SLAT)

Examples

  • "Crop and rotate my Visium slide, then run cellpose segmentation on the HD data and aggregate to cell-level AnnData."

  • "Deconvolve my lymph node spatial data with Tangram and cell2location, compare proportions, and plot cell-type maps."

  • "Integrate three DLPFC slices with STAligner, cluster with STAGATE, and infer communication with COMMOT+FlowSig."

References

  • Quick copy/paste commands: reference.md

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

data-viz-plots

No summary provided by upstream source.

Repository SourceNeeds Review
General

data-export-pdf

No summary provided by upstream source.

Repository SourceNeeds Review
General

data-export-excel

No summary provided by upstream source.

Repository SourceNeeds Review
General

single-cell-multi-omics-integration

No summary provided by upstream source.

Repository SourceNeeds Review