bio-microbiome-diversity-analysis

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "bio-microbiome-diversity-analysis" with this command: npx skills add gptomics/bioskills/gptomics-bioskills-bio-microbiome-diversity-analysis

Diversity Analysis

Create phyloseq Object

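library(phyloseq)
library(vegan)
library(ggplot2)

seqtab <- readRDS('seqtab_nochim.rds')
taxa <- readRDS('taxa.rds')
metadata <- read.csv('sample_metadata.csv', row.names = 1)

ps <- phyloseq(otu_table(seqtab, taxa_are_rows = FALSE), tax_table(taxa), sample_data(metadata))

# Store the ASV sequences in the refseq slot before renaming, so that refseq(ps) is available for tree building later
# (assumes the column names of seqtab are the ASV sequences, as in DADA2 output)
dna <- Biostrings::DNAStringSet(taxa_names(ps))
names(dna) <- taxa_names(ps)
ps <- merge_phyloseq(ps, dna)
taxa_names(ps) <- paste0('ASV', seq(ntaxa(ps)))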

Alpha Diversity

Calculate multiple metrics

alpha_div <- estimate_richness(ps, measures = c('Observed', 'Chao1', 'Shannon', 'Simpson'))
alpha_div$SampleID <- rownames(alpha_div)
# Convert sample_data to a plain data.frame before merging on row names
alpha_div <- merge(alpha_div, data.frame(sample_data(ps)), by = 'row.names')

Statistical test

kruskal.test(Shannon ~ Group, data = alpha_div)

Pairwise comparisons

pairwise.wilcox.test(alpha_div$Shannon, alpha_div$Group, p.adjust.method = 'BH')
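For intuition about what the Observed, Shannon, and Simpson columns measure, the following is a minimal base-R sketch on a hypothetical count vector (the counts are invented for illustration and are not from the source data). It assumes the conventions used by vegan, which phyloseq relies on for these indices: natural log for Shannon, and Simpson reported as 1 minus the sum of squared proportions.

```r
# Hypothetical read counts for one sample with four ASVs (illustration only)
counts <- c(ASV1 = 50, ASV2 = 30, ASV3 = 15, ASV4 = 5)

# Relative abundance of each ASV
p <- counts / sum(counts)

# Observed richness: number of ASVs with at least one read
observed <- sum(counts > 0)

# Shannon index: -sum(p * ln(p)), skipping zero-abundance ASVs
shannon <- -sum(p[p > 0] * log(p[p > 0]))

# Simpson index in the 1 - D form: 1 - sum(p^2)
simpson <- 1 - sum(p^2)

print(c(Observed = observed, Shannon = shannon, Simpson = simpson))
```

Shannon rises with both the number of ASVs and how evenly reads are spread across them, while Simpson is weighted more toward the dominant ASVs.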

Alpha Diversity Plots

plot_richness(ps, x = 'Group', measures = c('Observed', 'Shannon')) + geom_boxplot() + theme_minimal()

Custom plot

ggplot(alpha_div, aes(x = Group, y = Shannon, fill = Group)) + geom_boxplot() + geom_jitter(width = 0.2, alpha = 0.5) + theme_minimal() + labs(y = 'Shannon Diversity Index')

Faith's Phylogenetic Diversity

library(picante)

Requires phylogenetic tree in phyloseq object

Build tree from ASV sequences

library(DECIPHER)
library(phangorn)

seqs <- refseq(ps)  # ASV sequences; requires a refseq slot in ps
alignment <- AlignSeqs(seqs, anchor = NA)
phang_align <- phyDat(as(alignment, 'matrix'), type = 'DNA')
dm <- dist.ml(phang_align)
tree <- NJ(dm)
tree <- midpoint(tree)  # midpoint-root the neighbor-joining tree
phy_tree(ps) <- tree

Calculate Faith's PD

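# picante::pd() expects samples in rows and taxa in columns
otu_mat <- as(otu_table(ps), 'matrix')
if (taxa_are_rows(ps)) otu_mat <- t(otu_mat)
faith_pd <- pd(otu_mat, phy_tree(ps), include.root = TRUE)
# Match by sample name rather than by position, since merge() may have reordered alpha_div
# (assumes estimate_richness() left the sample names unchanged)
alpha_div$PD <- faith_pd[alpha_div$SampleID, 'PD']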

Rarefaction Curves

Check if sequencing depth is adequate

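# rarecurve() expects samples in rows and taxa in columns
otu_mat <- as(otu_table(ps), 'matrix')
if (taxa_are_rows(ps)) otu_mat <- t(otu_mat)
rarecurve_data <- vegan::rarecurve(otu_mat, step = 100, sample = min(sample_sums(ps)))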

ggplot version with ggrare (install from GitHub)

devtools::install_github('gauravsk/ranacapa')

library(ranacapa)
p_rare <- ggrare(ps, step = 100, color = 'Group', se = FALSE)
p_rare + theme_minimal() + labs(title = 'Rarefaction Curves')

Rarefaction

Check sequencing depth

sample_sums(ps)

Rarefy to minimum depth

ps_rarefied <- rarefy_even_depth(ps, sample.size = min(sample_sums(ps)), rngseed = 42, replace = FALSE)
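Conceptually, rarefying a sample means drawing a fixed number of reads from it at random without replacement. The base-R sketch below illustrates that idea on a single hypothetical sample; the counts and the target depth are invented for illustration, and the internals of rarefy_even_depth() may differ in detail.

```r
set.seed(42)  # fixed seed so the random draw is reproducible

# Hypothetical read counts for one sample (illustration only)
counts <- c(ASV1 = 120, ASV2 = 60, ASV3 = 15, ASV4 = 5)
depth <- 50  # hypothetical target depth, must not exceed sum(counts)

# Expand the counts into one entry per read, labelled by ASV
reads <- rep(names(counts), times = counts)

# Draw `depth` reads without replacement
subsample <- sample(reads, size = depth, replace = FALSE)

# Tabulate the draw, keeping ASVs that received zero reads
rarefied <- table(factor(subsample, levels = names(counts)))
print(rarefied)
```

Rare ASVs can drop to zero reads after subsampling, which is why richness estimates depend on the depth chosen.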

Beta Diversity

Calculate distance matrices

bray <- phyloseq::distance(ps, method = 'bray')  # Bray-Curtis
jaccard <- phyloseq::distance(ps, method = 'jaccard', binary = TRUE)  # Jaccard (presence/absence; without binary = TRUE vegan computes the quantitative form)
unifrac <- UniFrac(ps, weighted = TRUE)  # Weighted UniFrac (requires tree)

Ordination

ord_bray <- ordinate(ps, method = 'PCoA', distance = bray)

Plot

plot_ordination(ps, ord_bray, color = 'Group') + stat_ellipse(level = 0.95) + theme_minimal()

PERMANOVA

Test for group differences

metadata <- data.frame(sample_data(ps))
permanova_result <- adonis2(bray ~ Group, data = metadata, permutations = 999)
permanova_result

With covariates

adonis2(bray ~ Group + Age + Sex, data = metadata, permutations = 999)

Beta Dispersion

Test homogeneity of dispersions (assumption of PERMANOVA)

beta_disp <- betadisper(bray, metadata$Group)
permutest(beta_disp)
plot(beta_disp)

NMDS Ordination

ord_nmds <- ordinate(ps, method = 'NMDS', distance = bray)

Check stress

ord_nmds$stress # Should be < 0.2

plot_ordination(ps, ord_nmds, color = 'Group') + theme_minimal()

Distance Metrics Comparison

| Metric | Type | Considers Abundance | Phylogeny |
|---|---|---|---|
| Bray-Curtis | Quantitative | Yes | No |
| Jaccard | Binary | No | No |
| UniFrac (unweighted) | Binary | No | Yes |
| UniFrac (weighted) | Quantitative | Yes | Yes |
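The difference between a quantitative and a binary metric in the table above can be checked by hand. The sketch below computes Bray-Curtis and presence/absence Jaccard dissimilarity for two hypothetical samples in base R (the count vectors are invented for illustration), then changes one abundance without changing which ASVs are present.

```r
# Hypothetical counts for two samples over the same four ASVs (illustration only)
a <- c(ASV1 = 10, ASV2 = 0, ASV3 = 5, ASV4 = 5)
b <- c(ASV1 = 2,  ASV2 = 8, ASV3 = 5, ASV4 = 0)

# Bray-Curtis: sum of absolute differences over the total count
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Binary Jaccard: 1 - (shared ASVs / ASVs present in either sample)
jaccard_binary <- function(x, y) {
  n_shared <- sum(x > 0 & y > 0)
  n_either <- sum(x > 0 | y > 0)
  1 - n_shared / n_either
}

bc_ab <- bray_curtis(a, b)     # 21 / 35 = 0.6
jac_ab <- jaccard_binary(a, b) # 1 - 2 / 4 = 0.5

# Increase ASV1 in sample a tenfold; presence/absence is unchanged
a2 <- a
a2["ASV1"] <- 100

bc_a2b <- bray_curtis(a2, b)     # 111 / 125 = 0.888
jac_a2b <- jaccard_binary(a2, b) # still 0.5

print(c(bray = bc_ab, jaccard = jac_ab, bray_scaled = bc_a2b, jaccard_scaled = jac_a2b))
```

Bray-Curtis responds to the abundance change while binary Jaccard does not, which matches the "Considers Abundance" column.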

Related Skills

  • amplicon-processing - Generate ASV table

  • differential-abundance - Identify taxa driving differences

  • data-visualization/ggplot2-fundamentals - Custom diversity plots
