bio-proteomics-dia-analysis

DIA Proteomics Analysis

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "bio-proteomics-dia-analysis" with this command: npx skills add gptomics/bioskills/gptomics-bioskills-bio-proteomics-dia-analysis

DIA Proteomics Analysis

DIA-NN Library-Free Analysis

Library-free mode (generates library from data)

diann
--f sample1.mzML
--f sample2.mzML
--lib ""
--threads 8
--verbose 1
--out report.tsv
--qvalue 0.01
--matrices
--out-lib generated_lib.tsv
--gen-spec-lib
--predictor
--fasta uniprot_human.fasta
--fasta-search
--min-fr-mz 200
--max-fr-mz 1800
--met-excision
--cut K*,R*
--missed-cleavages 1
--min-pep-len 7
--max-pep-len 30
--min-pr-mz 300
--max-pr-mz 1800
--min-pr-charge 1
--max-pr-charge 4
--unimod4
--var-mods 1
--var-mod UniMod:35,15.994915,M
--reanalyse
--smart-profiling

DIA-NN with Spectral Library

Use pre-built or predicted library

diann
--f sample1.mzML
--f sample2.mzML
--lib spectral_library.tsv
--threads 8
--verbose 1
--out report.tsv
--qvalue 0.01
--matrices
--reanalyse
--smart-profiling

DIA-NN Output Files

report.tsv # Main quantification report (long format) report.stats.tsv # Run statistics report.pg_matrix.tsv # Protein group quantities (wide format) report.pr.matrix.tsv # Precursor quantities (wide format) report.gg_matrix.tsv # Gene group quantities (wide format) generated_lib.tsv # Generated spectral library (if requested)

Load DIA-NN Results in R

library(tidyverse)

Load main report

report <- read_tsv('report.tsv')

Load protein matrix (already wide format)

proteins <- read_tsv('report.pg_matrix.tsv')

Filter and reshape for analysis

protein_matrix <- proteins %>% column_to_rownames('Protein.Group') %>% select(starts_with('sample')) %>% as.matrix()

Log2 transform (DIA-NN outputs raw intensities)

log2_matrix <- log2(protein_matrix) log2_matrix[is.infinite(log2_matrix)] <- NA

Load DIA-NN Results in Python

import pandas as pd import numpy as np

Load main report

report = pd.read_csv('report.tsv', sep='\t')

Load protein matrix

proteins = pd.read_csv('report.pg_matrix.tsv', sep='\t') proteins = proteins.set_index('Protein.Group')

Log2 transform

log2_proteins = np.log2(proteins.replace(0, np.nan))

MSFragger-DIA Analysis

MSFragger for DIA (alternative to DIA-NN)

Requires FragPipe GUI or command-line workflow

Generate predicted library with EasyPQP

easypqp library
--in psm_results.tsv
--out library.pqp
--psmtsv
--rt_reference irt.tsv

Convert to DIA-NN format

easypqp convert
--in library.pqp
--out library.tsv
--format diann

Spectronaut Export Processing

Load Spectronaut report

spectronaut <- read_tsv('spectronaut_report.tsv')

Pivot to protein matrix

protein_matrix <- spectronaut %>% select(PG.ProteinGroups, R.FileName, PG.Quantity) %>% pivot_wider(names_from = R.FileName, values_from = PG.Quantity) %>% column_to_rownames('PG.ProteinGroups')

DIA Quality Metrics

library(tidyverse)

report <- read_tsv('report.tsv')

Identifications per run

ids_per_run <- report %>% group_by(Run) %>% summarise( precursors = n_distinct(Precursor.Id), proteins = n_distinct(Protein.Group), genes = n_distinct(Genes) )

Missing value analysis

proteins <- read_tsv('report.pg_matrix.tsv') protein_values <- proteins %>% select(-Protein.Group) missing_pct <- colSums(protein_values == 0 | is.na(protein_values)) / nrow(protein_values) * 100

Match Between Runs

DIA-NN MBR is automatic with --reanalyse flag

First pass: identifies peptides per run

Second pass: transfers IDs between runs

diann
--f *.mzML
--lib library.tsv
--reanalyse
--out report_mbr.tsv

DIA vs DDA Comparison

Feature DIA DDA

Acquisition All precursors fragmented Top-N precursors selected

Missing values Lower (5-20%) Higher (30-50%)

Dynamic range Better for low-abundance Better for high-abundance

Library required Optional (library-free) Not applicable

Quantification More reproducible More variable

Analysis tools DIA-NN, Spectronaut MaxQuant, MSFragger

Related Skills

  • data-import - Load raw MS data

  • spectral-libraries - Build and use spectral libraries

  • quantification - Normalization methods

  • differential-abundance - Statistical testing

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Research

bio-microbiome-diversity-analysis

No summary provided by upstream source.

Repository SourceNeeds Review
Research

bio-metabolomics-statistical-analysis

No summary provided by upstream source.

Repository SourceNeeds Review
Research

bio-pdb-geometric-analysis

No summary provided by upstream source.

Repository SourceNeeds Review
Research

bio-restriction-fragment-analysis

No summary provided by upstream source.

Repository SourceNeeds Review