Citation Management

Overview

Manage citations systematically throughout the research and writing process. This skill provides tools and strategies for searching academic databases (Google Scholar, PubMed), extracting accurate metadata from multiple sources (CrossRef, PubMed, arXiv), validating citation information, and generating properly formatted BibTeX entries.

This is critical for maintaining citation accuracy, avoiding reference errors, and keeping research reproducible. The skill is designed to work alongside the literature-review skill in broader research workflows.

When to Use This Skill

Use this skill when:

Searching for specific papers on Google Scholar or PubMed
Converting DOIs, PMIDs, or arXiv IDs to properly formatted BibTeX
Extracting complete metadata for citations (authors, title, journal, year, etc.)
Validating existing citations for accuracy
Cleaning and formatting BibTeX files
Finding highly cited papers in a specific field
Verifying that citation information matches the actual publication
Building a bibliography for a manuscript or thesis
Checking for duplicate citations
Ensuring consistent citation formatting

Visual Enhancement with Scientific Schematics

When creating documents with this skill, always consider adding scientific diagrams and schematics to enhance visual communication.

If your document does not already contain schematics or diagrams:

Use the scientific-schematics skill to generate AI-powered publication-quality diagrams
Simply describe your desired diagram in natural language
Nano Banana Pro will automatically generate, review, and refine the schematic

For new documents: Scientific schematics should be generated by default to visually represent key concepts, workflows, architectures, or relationships described in the text.

How to generate schematics:

python scripts/generate_schematic.py "your diagram description" -o figures/output.png

The AI will automatically:

Create publication-quality images with proper formatting
Review and refine through multiple iterations
Ensure accessibility (colorblind-friendly, high contrast)
Save outputs in the figures/ directory

When to add schematics:

Citation workflow diagrams
Literature search methodology flowcharts
Reference management system architectures
Citation style decision trees
Database integration diagrams
Any complex concept that benefits from visualization

For detailed guidance on creating schematics, refer to the scientific-schematics skill documentation.

Core Workflow

Citation management follows a systematic process:

Phase 1: Paper Discovery and Search

Goal: Find relevant papers using academic search engines.

Google Scholar Search

Google Scholar offers broad coverage across disciplines.

Basic Search:

Search for papers on a topic

python scripts/search_google_scholar.py "CRISPR gene editing" \
  --limit 50 \
  --output results.json

Search with year filter

python scripts/search_google_scholar.py "machine learning protein folding" \
  --year-start 2020 \
  --year-end 2024 \
  --limit 100 \
  --output ml_proteins.json

Advanced Search Strategies (see references/google_scholar_search.md ):

Use quotation marks for exact phrases: "deep learning"
Search by author: author:LeCun
Search in title: intitle:"neural networks"
Exclude terms: machine learning -survey
Find highly cited papers by comparing "Cited by" counts, or with the script's --sort-by citations option
Filter by date ranges to get recent work

Best Practices:

Use specific, targeted search terms
Include key technical terms and acronyms
Filter by recent years for fast-moving fields
Check "Cited by" to find seminal papers
Export top results for further analysis

PubMed Search

PubMed specializes in biomedical and life sciences literature (35+ million citations).

Basic Search:

Search PubMed

python scripts/search_pubmed.py "Alzheimer's disease treatment" \
  --limit 100 \
  --output alzheimers.json

Search with MeSH terms and filters

python scripts/search_pubmed.py \
  --query '"Alzheimer Disease"[MeSH] AND "Drug Therapy"[MeSH]' \
  --date-start 2020 \
  --date-end 2024 \
  --publication-types "Clinical Trial,Review" \
  --output alzheimers_trials.json

Advanced PubMed Queries (see references/pubmed_search.md ):

Use MeSH terms: "Diabetes Mellitus"[MeSH]
Field tags: "cancer"[Title] , "Smith J"[Author]
Boolean operators: AND , OR , NOT
Date filters: 2020:2024[Publication Date]
Publication types: "Review"[Publication Type]
Combine with E-utilities API for automation

Best Practices:

Use MeSH Browser to find correct controlled vocabulary
Construct complex queries in PubMed Advanced Search Builder first
Include multiple synonyms with OR
Retrieve PMIDs for easy metadata extraction
Export to JSON or directly to BibTeX
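The query-building advice above can be sketched in a few lines of Python. This is a minimal illustration, not the bundled script's implementation: the helper names (tag, any_of, build_query) and the synonym terms are hypothetical, and only the PubMed field-tag syntax comes from this document.

```python
def tag(term: str, field: str) -> str:
    """Quote a term and attach a PubMed field tag, e.g. "Drug Therapy"[MeSH]."""
    return f'"{term}"[{field}]'


def any_of(terms, field="Title/Abstract") -> str:
    """OR synonyms together, parenthesized so the surrounding ANDs bind correctly."""
    return "(" + " OR ".join(tag(t, field) for t in terms) + ")"


def build_query(mesh_terms, synonyms=(), pub_type=None, years=None) -> str:
    """Combine MeSH terms, optional synonyms, publication type, and a year range."""
    parts = [tag(m, "MeSH") for m in mesh_terms]
    if synonyms:
        parts.append(any_of(synonyms))
    if pub_type:
        parts.append(tag(pub_type, "Publication Type"))
    if years:
        parts.append(f"{years[0]}:{years[1]}[Publication Date]")
    return " AND ".join(parts)


query = build_query(
    ["Alzheimer Disease", "Drug Therapy"],
    synonyms=["treatment", "therapy"],  # illustrative synonyms only
    pub_type="Clinical Trial",
    years=(2020, 2024),
)
print(query)
```

The resulting string can be pasted into PubMed Advanced Search to check the hit count before it is passed to a script.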

Phase 2: Metadata Extraction

Goal: Convert paper identifiers (DOI, PMID, arXiv ID) to complete, accurate metadata.

Quick DOI to BibTeX Conversion

For single DOIs, use the quick conversion tool:

Convert single DOI

python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2

Convert multiple DOIs from a file

python scripts/doi_to_bibtex.py --input dois.txt --output references.bib

Different output formats

python scripts/doi_to_bibtex.py 10.1038/nature12345 --format json
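One common way to turn a DOI into BibTeX is DOI content negotiation: requesting https://doi.org/<doi> with an Accept header of application/x-bibtex. Whether doi_to_bibtex.py works this way is not stated here, so treat the following as a sketch of the technique rather than the script's internals; the function name bibtex_request is hypothetical. The request is only constructed, not sent.

```python
import urllib.parse
import urllib.request


def bibtex_request(doi: str) -> urllib.request.Request:
    """Build (but do not send) a content-negotiation request for a DOI."""
    url = "https://doi.org/" + urllib.parse.quote(doi, safe="/")
    return urllib.request.Request(url, headers={"Accept": "application/x-bibtex"})


req = bibtex_request("10.1038/s41586-021-03819-2")
print(req.full_url)
print(req.get_header("Accept"))

# To fetch the entry (requires network access):
# bibtex = urllib.request.urlopen(req, timeout=30).read().decode("utf-8")
```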

Comprehensive Metadata Extraction

For DOIs, PMIDs, arXiv IDs, or URLs:

Extract from DOI

python scripts/extract_metadata.py --doi 10.1038/s41586-021-03819-2

Extract from PMID

python scripts/extract_metadata.py --pmid 34265844

Extract from arXiv ID

python scripts/extract_metadata.py --arxiv 2103.14030

Extract from URL

python scripts/extract_metadata.py --url "https://www.nature.com/articles/s41586-021-03819-2"

Batch extraction from file (mixed identifiers)

python scripts/extract_metadata.py --input identifiers.txt --output citations.bib
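A file of mixed identifiers has to be sorted by type before each line can be sent to the right API. The sketch below shows one way to do that; the regular expressions and the classify_identifier name are assumptions made for illustration, not the logic of extract_metadata.py. Old-style arXiv IDs (such as hep-th/9901001) are not handled.

```python
import re

DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")          # e.g. 10.1038/s41586-021-03819-2
ARXIV_RE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")  # new-style arXiv IDs, e.g. 2103.14030
PMID_RE = re.compile(r"^\d{1,8}$")                 # PMIDs are plain integers


def classify_identifier(raw: str) -> str:
    """Return 'doi', 'arxiv', 'pmid', 'url', or 'unknown' for one input line."""
    text = raw.strip()
    lowered = text.lower()
    # Strip common prefixes so "doi:10.x/y" and "arXiv:2103.14030" are recognized.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:", "arxiv:"):
        if lowered.startswith(prefix):
            text = text[len(prefix):]
            break
    if DOI_RE.match(text):
        return "doi"
    if ARXIV_RE.match(text):
        return "arxiv"
    if PMID_RE.match(text):
        return "pmid"
    if text.lower().startswith(("http://", "https://")):
        return "url"
    return "unknown"


for line in ["10.1038/s41586-021-03819-2", "34265844", "2103.14030",
             "https://www.nature.com/articles/s41586-021-03819-2"]:
    print(line, "->", classify_identifier(line))
```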

Metadata Sources (see references/metadata_extraction.md ):

CrossRef API: Primary source for DOIs

Comprehensive metadata for journal articles
Publisher-provided information
Includes authors, title, journal, volume, pages, dates
Free, no API key required

PubMed E-utilities: Biomedical literature

Official NCBI metadata
Includes MeSH terms, abstracts
PMID and PMCID identifiers
Free, API key recommended for high volume

arXiv API: Preprints in physics, math, CS, q-bio

Complete metadata for preprints
Version tracking
Author affiliations
Free, open access

DataCite API: Research datasets, software, other resources

Metadata for non-traditional scholarly outputs
DOIs for datasets and code
Free access

What Gets Extracted:

Required fields: author, title, year
Journal articles: journal, volume, number, pages, DOI
Books: publisher, ISBN, edition
Conference papers: booktitle, conference location, pages
Preprints: repository (arXiv, bioRxiv), preprint ID
Additional: abstract, keywords, URL
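To make the field mapping concrete, here is a sketch that converts a CrossRef-style work record into BibTeX fields. The input keys (title, author, container-title, issue, page, issued.date-parts, DOI) follow the CrossRef REST API's JSON layout; the function name and the sample record are hypothetical placeholders, not real publication data or the bundled script's code.

```python
import re


def crossref_to_fields(message: dict) -> dict:
    """Map a CrossRef 'message' dict to BibTeX field names, dropping empty values."""
    authors = " and ".join(
        f"{a.get('family', '')}, {a.get('given', '')}".strip(", ")
        for a in message.get("author", [])
    )
    date_parts = message.get("issued", {}).get("date-parts") or [[]]
    year = date_parts[0][0] if date_parts[0] else None
    fields = {
        "author": authors,
        "title": (message.get("title") or [""])[0],
        "journal": (message.get("container-title") or [""])[0],
        "year": str(year) if year else "",
        "volume": message.get("volume", ""),
        "number": message.get("issue", ""),
        # BibTeX convention: page ranges use a double hyphen.
        "pages": re.sub(r"\s*-+\s*", "--", message.get("page", "")),
        "doi": message.get("DOI", ""),
    }
    return {name: value for name, value in fields.items() if value}


# Placeholder record shaped like a CrossRef response.
sample = {
    "DOI": "10.1234/example",
    "title": ["Article Title"],
    "author": [{"given": "First1", "family": "Last1"},
               {"given": "First2", "family": "Last2"}],
    "container-title": ["Journal Name"],
    "volume": "10", "issue": "3", "page": "123-145",
    "issued": {"date-parts": [[2024, 3, 1]]},
}
fields = crossref_to_fields(sample)
print(fields)
```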

Phase 3: BibTeX Formatting

Goal: Generate clean, properly formatted BibTeX entries.

Understanding BibTeX Entry Types

See references/bibtex_formatting.md for complete guide.

Common Entry Types:

@article : Journal articles (most common)
@book : Books
@inproceedings : Conference papers
@incollection : Book chapters
@phdthesis : Dissertations
@misc : Preprints, software, datasets

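Typical Fields by Type (standard BibTeX requires only author, title, journal, and year for @article; volume, number, pages, and doi are optional but recommended):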

@article{citationkey,
  author  = {Last1, First1 and Last2, First2},
  title   = {Article Title},
  journal = {Journal Name},
  year    = {2024},
  volume  = {10},
  number  = {3},
  pages   = {123--145},
  doi     = {10.1234/example}
}

@inproceedings{citationkey,
  author    = {Last, First},
  title     = {Paper Title},
  booktitle = {Conference Name},
  year      = {2024},
  pages     = {1--10}
}

@book{citationkey,
  author    = {Last, First},
  title     = {Book Title},
  publisher = {Publisher Name},
  year      = {2024}
}
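A required-field check for these entry types can be sketched as a lookup table. The table below follows the standard BibTeX definitions (for example, @book accepts either author or editor); the names REQUIRED and missing_fields are hypothetical, and the bundled validator may apply different rules.

```python
# Required fields per entry type, following the standard BibTeX styles.
REQUIRED = {
    "article": {"author", "title", "journal", "year"},
    "book": {"title", "publisher", "year"},  # plus author OR editor, checked below
    "inproceedings": {"author", "title", "booktitle", "year"},
    "incollection": {"author", "title", "booktitle", "publisher", "year"},
    "phdthesis": {"author", "title", "school", "year"},
    "misc": set(),
}


def missing_fields(entry_type: str, fields: dict) -> list:
    """Return the required fields that are absent or blank for this entry type."""
    kind = entry_type.lower().lstrip("@")
    present = {name.lower() for name, value in fields.items() if str(value).strip()}
    missing = sorted(REQUIRED.get(kind, set()) - present)
    if kind == "book" and not ({"author", "editor"} & present):
        missing.append("author/editor")
    return missing


print(missing_fields("@article", {"author": "Last, First", "title": "T", "year": "2024"}))
```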

Formatting and Cleaning

Use the formatter to standardize BibTeX files:

Format and clean BibTeX file

python scripts/format_bibtex.py references.bib \
  --output formatted_references.bib

Sort entries by citation key

python scripts/format_bibtex.py references.bib \
  --sort key \
  --output sorted_references.bib

Sort by year (newest first)

python scripts/format_bibtex.py references.bib \
  --sort year \
  --descending \
  --output sorted_references.bib

Remove duplicates

python scripts/format_bibtex.py references.bib \
  --deduplicate \
  --output clean_references.bib

Validate and report issues

python scripts/format_bibtex.py references.bib \
  --validate \
  --report validation_report.txt

Formatting Operations:

Standardize field order
Consistent indentation and spacing
Proper capitalization in titles (protected with {})
Standardized author name format
Consistent citation key format
Remove unnecessary fields
Fix common errors (missing commas, braces)
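Two of the operations above, page-range normalization and title capitalization protection, are simple enough to sketch. These helpers are illustrative assumptions about how such cleaning could work, not the code in format_bibtex.py. The capitalization rule used here (brace any word with an uppercase letter after its first character) is a heuristic that catches acronyms and mixed-case names but not ordinary capitalized proper nouns.

```python
import re


def normalize_pages(pages: str) -> str:
    """Rewrite hyphen, en dash, or em dash page separators as the BibTeX '--'."""
    return re.sub(r"\s*[-\u2013\u2014]+\s*", "--", pages.strip())


def protect_capitals(title: str) -> str:
    """Wrap acronyms and mixed-case words in braces; leave braced text untouched."""
    def wrap(match):
        word = match.group(0)
        if word.startswith("{"):
            return word  # already protected
        if any(ch.isupper() for ch in word[1:]):
            return "{" + word + "}"
        return word

    # Match an existing {...} group first, otherwise a (possibly hyphenated) word.
    return re.sub(r"\{[^{}]*\}|[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", wrap, title)


print(normalize_pages("123-145"))
print(protect_capitals("CRISPR-Cas9 editing in {DNA} repair"))
```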

Phase 4: Citation Validation

Goal: Verify all citations are accurate and complete.

Comprehensive Validation

Validate BibTeX file

python scripts/validate_citations.py references.bib

Validate and fix common issues

python scripts/validate_citations.py references.bib \
  --auto-fix \
  --output validated_references.bib

Generate detailed validation report

python scripts/validate_citations.py references.bib \
  --report validation_report.json \
  --verbose

Validation Checks (see references/citation_validation.md ):

DOI Verification:

DOI resolves correctly via doi.org
Metadata matches between BibTeX and CrossRef
No broken or invalid DOIs

Required Fields:

All required fields present for entry type
No empty or missing critical information
Author names properly formatted

Data Consistency:

Year is valid (4 digits, reasonable range)
Volume/number are numeric
Pages formatted correctly (e.g., 123--145)
URLs are accessible

Duplicate Detection:

Same DOI used multiple times
Similar titles (possible duplicates)
Same author/year/title combinations

Format Compliance:

Valid BibTeX syntax
Proper bracing and quoting
Citation keys are unique
Special characters handled correctly
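The data-consistency checks listed above can be expressed as a small offline function. This is a sketch under stated assumptions: the lower year bound of 1500, the DOI pattern, and the function name consistency_issues are choices made for illustration, and the checks that need network access (DOI resolution, URL accessibility) are left out.

```python
import re
from datetime import date

DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
PAGES_RE = re.compile(r"^\d+(--\d+)?$")  # a single page or a range such as 123--145


def consistency_issues(fields: dict) -> list:
    """Return the names of fields that fail basic format checks."""
    issues = []
    year = fields.get("year", "")
    # Assumed "reasonable range": 1500 through next calendar year.
    if not (re.fullmatch(r"\d{4}", year) and 1500 <= int(year) <= date.today().year + 1):
        issues.append("year")
    for name in ("volume", "number"):
        if name in fields and not fields[name].isdigit():
            issues.append(name)
    if "pages" in fields and not PAGES_RE.match(fields["pages"]):
        issues.append("pages")
    if "doi" in fields and not DOI_RE.match(fields["doi"]):
        issues.append("doi")
    return issues


print(consistency_issues({"year": "24", "pages": "123-145", "doi": "doi:10.1234/x"}))
```

Journals that use article numbers (for example e1234) instead of page ranges would be flagged by this strict pages pattern, so a real validator would likely report those as warnings rather than errors.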

Validation Output:

{
  "total_entries": 150,
  "valid_entries": 145,
  "errors": [
    {
      "citation_key": "Smith2023",
      "error_type": "missing_field",
      "field": "journal",
      "severity": "high"
    },
    {
      "citation_key": "Jones2022",
      "error_type": "invalid_doi",
      "doi": "10.1234/broken",
      "severity": "high"
    }
  ],
  "warnings": [
    {
      "citation_key": "Brown2021",
      "warning_type": "possible_duplicate",
      "duplicate_of": "Brown2021a",
      "severity": "medium"
    }
  ]
}
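A report in this shape is easy to summarize programmatically. The sketch below assumes the report keys shown in the sample output (total_entries, valid_entries, errors, error_type, severity, citation_key); the summarize function itself is a hypothetical helper, not part of the bundled scripts.

```python
import json
from collections import Counter


def summarize(report: dict) -> dict:
    """Condense a validation report into counts and the keys needing attention."""
    errors = report.get("errors", [])
    return {
        "invalid_entries": report["total_entries"] - report["valid_entries"],
        "errors_by_type": dict(Counter(e["error_type"] for e in errors)),
        "high_severity_keys": sorted(
            e["citation_key"] for e in errors if e.get("severity") == "high"
        ),
    }


# Abbreviated copy of the sample report; normally loaded with json.load(open(path)).
report = json.loads("""
{
  "total_entries": 150,
  "valid_entries": 145,
  "errors": [
    {"citation_key": "Smith2023", "error_type": "missing_field", "severity": "high"},
    {"citation_key": "Jones2022", "error_type": "invalid_doi", "severity": "high"}
  ],
  "warnings": []
}
""")
summary = summarize(report)
print(summary)
```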

Phase 5: Integration with Writing Workflow

Building References for Manuscripts

Complete workflow for creating a bibliography:

1. Search for papers on your topic

python scripts/search_pubmed.py \
  '"CRISPR-Cas Systems"[MeSH] AND "Gene Editing"[MeSH]' \
  --date-start 2020 \
  --limit 200 \
  --output crispr_papers.json

2. Extract DOIs from search results and convert to BibTeX

python scripts/extract_metadata.py \
  --input crispr_papers.json \
  --output crispr_refs.bib

3. Add specific papers by DOI

python scripts/doi_to_bibtex.py 10.1038/nature12345 >> crispr_refs.bib
python scripts/doi_to_bibtex.py 10.1126/science.abcd1234 >> crispr_refs.bib

4. Format and clean the BibTeX file

python scripts/format_bibtex.py crispr_refs.bib \
  --deduplicate \
  --sort year \
  --descending \
  --output references.bib

5. Validate all citations

python scripts/validate_citations.py references.bib \
  --auto-fix \
  --report validation.json \
  --output final_references.bib

6. Review validation report and fix any remaining issues

cat validation.json

7. Use in your LaTeX document

\bibliography{final_references}

Integration with Literature Review Skill

This skill complements the literature-review skill:

Literature Review Skill → Systematic search and synthesis
Citation Management Skill → Technical citation handling

Combined Workflow:

Use literature-review for comprehensive multi-database search
Use citation-management to extract and validate all citations
Use literature-review to synthesize findings thematically
Use citation-management to verify final bibliography accuracy

After completing literature review

Verify all citations in the review document

python scripts/validate_citations.py my_review_references.bib --report review_validation.json

Format for specific citation style if needed

python scripts/format_bibtex.py my_review_references.bib \
  --style nature \
  --output formatted_refs.bib

Search Strategies

Google Scholar Best Practices

Finding Seminal and High-Impact Papers (CRITICAL):

Always prioritize papers based on citation count, venue quality, and author reputation:

Citation Count Thresholds:

Paper Age    Citations    Classification
0-3 years    20+          Noteworthy
0-3 years    100+         Highly Influential
3-7 years    100+         Significant
3-7 years    500+         Landmark Paper
7+ years     500+         Seminal Work
7+ years     1000+        Foundational
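The thresholds above translate directly into a lookup. The table leaves the boundaries at exactly 3 and 7 years ambiguous, so this sketch assumes half-open bands (0 to under 3, 3 to under 7, 7 and over) and returns the highest classification a paper qualifies for; the function name classify is hypothetical.

```python
# (min_age, max_age_exclusive, [(min_citations, label), ...] highest threshold first)
THRESHOLDS = [
    (0, 3, [(100, "Highly Influential"), (20, "Noteworthy")]),
    (3, 7, [(500, "Landmark Paper"), (100, "Significant")]),
    (7, float("inf"), [(1000, "Foundational"), (500, "Seminal Work")]),
]


def classify(age_years: float, citations: int):
    """Return the highest matching classification, or None if below every threshold."""
    for min_age, max_age, bands in THRESHOLDS:
        if min_age <= age_years < max_age:
            for minimum, label in bands:
                if citations >= minimum:
                    return label
    return None


for age, cites in [(2, 150), (5, 600), (10, 700), (10, 200)]:
    print(age, cites, classify(age, cites))
```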

Venue Quality Tiers:

Tier 1 (Prefer): Nature, Science, Cell, NEJM, Lancet, JAMA, PNAS
Tier 2 (High Priority): Impact Factor >10, top conferences (NeurIPS, ICML, ICLR)
Tier 3 (Good): Specialized journals (IF 5-10)
Tier 4 (Sparingly): Lower-impact peer-reviewed venues

Author Reputation Indicators:

Senior researchers with h-index >40
Multiple publications in Tier-1 venues
Leadership at recognized institutions
Awards and editorial positions

Search Strategies for High-Impact Papers:

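Rank results by citation count (most cited first), for example with the script's --sort-by citations option; the Google Scholar web interface itself sorts only by relevance or date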
Look for review articles from Tier-1 journals for overview
Check "Cited by" for impact assessment and recent follow-up work
Use citation alerts for tracking new citations to key papers
Filter by top venues using source:Nature or source:Science
Search for papers by known field leaders using author:LastName

Advanced Operators (full list in references/google_scholar_search.md ):

"exact phrase"     # Exact phrase matching
author:lastname    # Search by author
intitle:keyword    # Search in title only
source:journal     # Search specific journal
-exclude           # Exclude terms
OR                 # Alternative terms
2020..2024         # Year range

Example Searches:

Find recent reviews on a topic

"CRISPR" intitle:review 2023..2024

Find papers by specific author on topic

author:Church "synthetic biology"

Find highly cited foundational work

"deep learning" 2012..2015
(then rank the results by "Cited by" count; Google Scholar has no sort operator in its query syntax)

Exclude surveys and focus on methods

"protein folding" -survey -review intitle:method

PubMed Best Practices

Using MeSH Terms: MeSH (Medical Subject Headings) provides controlled vocabulary for precise searching.

Find MeSH terms at https://meshb.nlm.nih.gov/search
Use in queries: "Diabetes Mellitus, Type 2"[MeSH]
Combine with keywords for comprehensive coverage

Field Tags:

[Title]               # Search in title only
[Title/Abstract]      # Search in title or abstract
[Author]              # Search by author name
[Journal]             # Search specific journal
[Publication Date]    # Date range
[Publication Type]    # Article type
[MeSH]                # MeSH term

Building Complex Queries:

Clinical trials on diabetes treatment published recently

"Diabetes Mellitus, Type 2"[MeSH] AND "Drug Therapy"[MeSH] AND "Clinical Trial"[Publication Type] AND 2020:2024[Publication Date]

Reviews on CRISPR in specific journal

"CRISPR-Cas Systems"[MeSH] AND "Nature"[Journal] AND "Review"[Publication Type]

Specific author's recent work

"Smith AB"[Author] AND cancer[Title/Abstract] AND 2022:2024[Publication Date]

E-utilities for Automation: The scripts use NCBI E-utilities API for programmatic access:

ESearch: Search and retrieve PMIDs
EFetch: Retrieve full metadata
ESummary: Get summary information
ELink: Find related articles

See references/pubmed_search.md for complete API documentation.
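As a rough illustration of the ESearch step, the sketch below builds the request URL with the standard E-utilities parameters db, term, retmax, retmode, and api_key. It does not send the request, and the helper name esearch_url is hypothetical rather than a function from search_pubmed.py.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term: str, retmax: int = 100, api_key=None) -> str:
    """Build an ESearch URL that returns matching PMIDs as JSON."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    if api_key:
        params["api_key"] = api_key  # optional; raises NCBI's rate limit
    return f"{ESEARCH}?{urlencode(params)}"


url = esearch_url('"CRISPR-Cas Systems"[MeSH] AND "Review"[Publication Type]', retmax=200)
print(url)

# urlencode escapes quotes, brackets, and spaces; parsing the URL recovers the query.
print(parse_qs(urlparse(url).query)["term"][0])

# With network access, the PMIDs are found under ["esearchresult"]["idlist"]
# in the JSON response and can then be passed to EFetch for full metadata.
```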

Tools and Scripts

search_google_scholar.py

Search Google Scholar and export results.

Features:

Automated searching with rate limiting
Pagination support
Year range filtering
Export to JSON or BibTeX
Citation count information

Usage:

Basic search

python scripts/search_google_scholar.py "quantum computing"

Advanced search with filters

python scripts/search_google_scholar.py "quantum computing" \
  --year-start 2020 \
  --year-end 2024 \
  --limit 100 \
  --sort-by citations \
  --output quantum_papers.json

Export directly to BibTeX

python scripts/search_google_scholar.py "machine learning" \
  --limit 50 \
  --format bibtex \
  --output ml_papers.bib

search_pubmed.py

Search PubMed using E-utilities API.

Features:

Complex query support (MeSH, field tags, Boolean)
Date range filtering
Publication type filtering
Batch retrieval with metadata
Export to JSON or BibTeX

Usage:

Simple keyword search

python scripts/search_pubmed.py "CRISPR gene editing"

Complex query with filters

python scripts/search_pubmed.py \
  --query '"CRISPR-Cas Systems"[MeSH] AND "therapeutic"[Title/Abstract]' \
  --date-start 2020-01-01 \
  --date-end 2024-12-31 \
  --publication-types "Clinical Trial,Review" \
  --limit 200 \
  --output crispr_therapeutic.json

Export to BibTeX

python scripts/search_pubmed.py "Alzheimer's disease" \
  --limit 100 \
  --format bibtex \
  --output alzheimers.bib

extract_metadata.py

Extract complete metadata from paper identifiers.

Features:

Supports DOI, PMID, arXiv ID, URL
Queries CrossRef, PubMed, arXiv APIs
Handles multiple identifier types
Batch processing
Multiple output formats

Usage:

Single DOI

python scripts/extract_metadata.py --doi 10.1038/s41586-021-03819-2

Single PMID

python scripts/extract_metadata.py --pmid 34265844

Single arXiv ID

python scripts/extract_metadata.py --arxiv 2103.14030

From URL

python scripts/extract_metadata.py \
  --url "https://www.nature.com/articles/s41586-021-03819-2"

Batch processing (file with one identifier per line)

python scripts/extract_metadata.py \
  --input paper_ids.txt \
  --output references.bib

Different output formats

python scripts/extract_metadata.py \
  --doi 10.1038/nature12345 \
  --format json # or bibtex, yaml

validate_citations.py

Validate BibTeX entries for accuracy and completeness.

Features:

DOI verification via doi.org and CrossRef
Required field checking
Duplicate detection
Format validation
Auto-fix common issues
Detailed reporting

Usage:

Basic validation

python scripts/validate_citations.py references.bib

With auto-fix

python scripts/validate_citations.py references.bib \
  --auto-fix \
  --output fixed_references.bib

Detailed validation report

python scripts/validate_citations.py references.bib \
  --report validation_report.json \
  --verbose

Only check DOIs

python scripts/validate_citations.py references.bib \
  --check-dois-only

format_bibtex.py

Format and clean BibTeX files.

Features:

Standardize formatting
Sort entries (by key, year, author)
Remove duplicates
Validate syntax
Fix common errors
Enforce citation key conventions

Usage:

Basic formatting

python scripts/format_bibtex.py references.bib

Sort by year (newest first)

python scripts/format_bibtex.py references.bib \
  --sort year \
  --descending \
  --output sorted_refs.bib

Remove duplicates

python scripts/format_bibtex.py references.bib \
  --deduplicate \
  --output clean_refs.bib

Complete cleanup

python scripts/format_bibtex.py references.bib \
  --deduplicate \
  --sort year \
  --validate \
  --auto-fix \
  --output final_refs.bib

doi_to_bibtex.py

Quick DOI to BibTeX conversion.

Features:

Fast single DOI conversion
Batch processing
Multiple output formats
Clipboard support

Usage:

Single DOI

python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2

Multiple DOIs

python scripts/doi_to_bibtex.py \
  10.1038/nature12345 \
  10.1126/science.abc1234 \
  10.1016/j.cell.2023.01.001

From file (one DOI per line)

python scripts/doi_to_bibtex.py --input dois.txt --output references.bib

Copy to clipboard

python scripts/doi_to_bibtex.py 10.1038/nature12345 --clipboard

Best Practices

Search Strategy

Start broad, then narrow:

Begin with general terms to understand the field
Refine with specific keywords and filters
Use synonyms and related terms

Use multiple sources:

Google Scholar for comprehensive coverage
PubMed for biomedical focus
arXiv for preprints
Combine results for completeness

Leverage citations:

Check "Cited by" for seminal papers
Review references from key papers
Use citation networks to discover related work

Document your searches:

Save search queries and dates
Record number of results
Note any filters or restrictions applied
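One lightweight way to document searches is to append a structured record per search to a log file. The sketch below is a suggestion rather than a feature of the bundled scripts; the SearchRecord class, its field names, and the JSON-lines format are all assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class SearchRecord:
    """One search: where it ran, the exact query, what came back, and when."""
    database: str
    query: str
    result_count: int
    filters: dict = field(default_factory=dict)
    searched_on: str = field(default_factory=lambda: date.today().isoformat())


def to_json_line(record: SearchRecord) -> str:
    """Serialize a record as one JSON line, suitable for appending to a log file."""
    return json.dumps(asdict(record), sort_keys=True)


record = SearchRecord(
    database="PubMed",
    query='"CRISPR-Cas Systems"[MeSH] AND "Gene Editing"[MeSH]',
    result_count=200,
    filters={"date_start": "2020"},
    searched_on="2024-01-15",  # fixed here for a reproducible example
)
print(to_json_line(record))

# To persist: open("search_log.jsonl", "a").write(to_json_line(record) + "\n")
```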

Metadata Extraction

Always use DOIs when available:

Most reliable identifier
Permanent link to the publication
Best metadata source via CrossRef

Verify extracted metadata:

Check author names are correct
Verify journal/conference names
Confirm publication year
Validate page numbers and volume

Handle edge cases:

Preprints: Include repository and ID
Preprints later published: Use published version
Conference papers: Include conference name and location
Book chapters: Include book title and editors

Maintain consistency:

Use consistent author name format
Standardize journal abbreviations
Use same DOI format (URL preferred)

BibTeX Quality

Follow conventions:

Use meaningful citation keys (FirstAuthor2024keyword)
Protect capitalization in titles with {}
Use -- for page ranges (not single dash)
Include DOI field for all modern publications
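The FirstAuthor2024keyword key convention can be generated automatically. The following is one possible sketch: the stopword list, the ASCII folding of accented names, and the make_key name are assumptions for illustration, and real bibliographies may also need a suffix to keep keys unique.

```python
import re
import unicodedata

STOPWORDS = {"a", "an", "the", "of", "on", "in", "for", "and", "to", "with"}


def ascii_fold(text: str) -> str:
    """Drop diacritics so keys stay plain ASCII (e.g. García becomes Garcia)."""
    return unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")


def make_key(author_field: str, year, title: str) -> str:
    """Build a key from the first author's surname, the year, and a title keyword."""
    first_author = author_field.split(" and ")[0].strip()
    if "," in first_author:           # "Last, First"
        surname = first_author.split(",")[0]
    else:                             # "First Last"
        parts = first_author.split()
        surname = parts[-1] if parts else "Anon"
    surname = re.sub(r"[^A-Za-z]", "", ascii_fold(surname))
    words = re.findall(r"[a-z]+", ascii_fold(title).lower())
    keyword = next((w for w in words if w not in STOPWORDS), "")
    return f"{surname}{year}{keyword}"


print(make_key("Doe, Jane and Roe, Richard", 2024, "A Study of Citation Accuracy"))
```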

Keep it clean:

Remove unnecessary fields
No redundant information
Consistent formatting
Validate syntax regularly

Organize systematically:

Sort by year or topic
Group related papers
Use separate files for different projects
Merge carefully to avoid duplicates
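When merging files, duplicates can be caught by comparing DOIs first and normalized titles second. This sketch uses the standard library's difflib for title similarity; the 0.9 threshold and the function names are assumptions, and the bundled --deduplicate option may use different rules. The pairwise comparison is quadratic, which is acceptable for bibliographies of a few thousand entries.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize_title(title: str) -> str:
    """Lowercase, drop braces and punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", "", title.lower()).split())


def find_duplicates(entries: dict, threshold: float = 0.9) -> list:
    """Return (key_a, key_b, reason) for entries sharing a DOI or a near-identical title."""
    pairs = []
    for (key_a, a), (key_b, b) in combinations(entries.items(), 2):
        doi_a, doi_b = a.get("doi", "").lower(), b.get("doi", "").lower()
        if doi_a and doi_a == doi_b:
            pairs.append((key_a, key_b, "same_doi"))
            continue
        title_a = normalize_title(a.get("title", ""))
        title_b = normalize_title(b.get("title", ""))
        if title_a and title_b and SequenceMatcher(None, title_a, title_b).ratio() >= threshold:
            pairs.append((key_a, key_b, "similar_title"))
    return pairs


# Placeholder entries for illustration.
entries = {
    "Brown2021": {"title": "Deep Learning for {DNA} Repair", "doi": "10.1234/abc"},
    "Brown2021a": {"title": "Deep learning for DNA repair."},
    "Smith2023": {"title": "A completely different paper", "doi": "10.1234/ABC"},
}
duplicates = find_duplicates(entries)
print(duplicates)
```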

Validation

Validate early and often:

Check citations when adding them
Validate complete bibliography before submission
Re-validate after any manual edits

Fix issues promptly:

Broken DOIs: Find correct identifier
Missing fields: Extract from original source
Duplicates: Choose best version, remove others
Format errors: Use auto-fix when safe

Manual review for critical citations:

Verify key papers cited correctly
Check author names match publication
Confirm page numbers and volume
Ensure URLs are current

Common Pitfalls to Avoid

Single source bias: Only using Google Scholar or PubMed

Solution: Search multiple databases for comprehensive coverage

Accepting metadata blindly: Not verifying extracted information

Solution: Spot-check extracted metadata against original sources

Ignoring DOI errors: Broken or incorrect DOIs in bibliography

Solution: Run validation before final submission

Inconsistent formatting: Mixed citation key styles, formatting

Solution: Use format_bibtex.py to standardize

Duplicate entries: Same paper cited multiple times with different keys

Solution: Use duplicate detection in validation

Missing required fields: Incomplete BibTeX entries

Solution: Validate and ensure all required fields present

Outdated preprints: Citing preprint when published version exists

Solution: Check if preprints have been published, update to journal version

Special character issues: Broken LaTeX compilation due to characters

Solution: Use proper escaping or Unicode in BibTeX

No validation before submission: Submitting with citation errors

Solution: Always run validation as final check

Manual BibTeX entry: Typing entries by hand

Solution: Always extract from metadata sources using scripts

Example Workflows

Example 1: Building a Bibliography for a Paper

Step 1: Find key papers on your topic

python scripts/search_google_scholar.py "transformer neural networks" \
  --year-start 2017 \
  --limit 50 \
  --output transformers_gs.json

python scripts/search_pubmed.py "deep learning medical imaging" \
  --date-start 2020 \
  --limit 50 \
  --output medical_dl_pm.json

Step 2: Extract metadata from search results

python scripts/extract_metadata.py \
  --input transformers_gs.json \
  --output transformers.bib

python scripts/extract_metadata.py \
  --input medical_dl_pm.json \
  --output medical.bib

Step 3: Add specific papers you already know

python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2 >> specific.bib
python scripts/doi_to_bibtex.py 10.1126/science.aam9317 >> specific.bib

Step 4: Combine all BibTeX files

cat transformers.bib medical.bib specific.bib > combined.bib

Step 5: Format and deduplicate

python scripts/format_bibtex.py combined.bib \
  --deduplicate \
  --sort year \
  --descending \
  --output formatted.bib

Step 6: Validate

python scripts/validate_citations.py formatted.bib \
  --auto-fix \
  --report validation.json \
  --output final_references.bib

Step 7: Review any issues

grep -A 3 '"errors"' validation.json

Step 8: Use in LaTeX

\bibliography{final_references}

Example 2: Converting a List of DOIs

You have a text file with DOIs (one per line)

dois.txt contains:

10.1038/s41586-021-03819-2

10.1126/science.aam9317

10.1016/j.cell.2023.01.001

Convert all to BibTeX

python scripts/doi_to_bibtex.py --input dois.txt --output references.bib

Validate the result

python scripts/validate_citations.py references.bib --verbose

Example 3: Cleaning an Existing BibTeX File

You have a messy BibTeX file from various sources

Clean it up systematically

Step 1: Format and standardize

python scripts/format_bibtex.py messy_references.bib \
  --output step1_formatted.bib

Step 2: Remove duplicates

python scripts/format_bibtex.py step1_formatted.bib \
  --deduplicate \
  --output step2_deduplicated.bib

Step 3: Validate and auto-fix

python scripts/validate_citations.py step2_deduplicated.bib \
  --auto-fix \
  --output step3_validated.bib

Step 4: Sort by year

python scripts/format_bibtex.py step3_validated.bib \
  --sort year \
  --descending \
  --output clean_references.bib

Step 5: Final validation report

python scripts/validate_citations.py clean_references.bib \
  --report final_validation.json \
  --verbose

Review report

cat final_validation.json

Example 4: Finding and Citing Seminal Papers

Find highly cited papers on a topic

python scripts/search_google_scholar.py "AlphaFold protein structure" \
  --year-start 2020 \
  --year-end 2024 \
  --sort-by citations \
  --limit 20 \
  --output alphafold_seminal.json

Extract the top 10 by citation count

(script will have included citation counts in JSON)

Convert to BibTeX

python scripts/extract_metadata.py \
  --input alphafold_seminal.json \
  --output alphafold_refs.bib

The BibTeX file now contains the most influential papers

Integration with Other Skills

Literature Review Skill

Citation Management provides the technical infrastructure for Literature Review:

Literature Review: Multi-database systematic search and synthesis
Citation Management: Metadata extraction and validation

Combined workflow:

Use literature-review for systematic search methodology
Use citation-management to extract and validate citations
Use literature-review to synthesize findings
Use citation-management to ensure bibliography accuracy

Scientific Writing Skill

Citation Management ensures accurate references for Scientific Writing:

Export validated BibTeX for use in LaTeX manuscripts
Verify citations match publication standards
Format references according to journal requirements

Venue Templates Skill

Citation Management works with Venue Templates for submission-ready manuscripts:

Different venues require different citation styles
Generate properly formatted references
Validate citations meet venue requirements

Resources

Bundled Resources

References (in references/ ):

google_scholar_search.md : Complete Google Scholar search guide
pubmed_search.md : PubMed and E-utilities API documentation
metadata_extraction.md : Metadata sources and field requirements
citation_validation.md : Validation criteria and quality checks
bibtex_formatting.md : BibTeX entry types and formatting rules

Scripts (in scripts/ ):

search_google_scholar.py : Google Scholar search automation
search_pubmed.py : PubMed E-utilities API client
extract_metadata.py : Universal metadata extractor
validate_citations.py : Citation validation and verification
format_bibtex.py : BibTeX formatter and cleaner
doi_to_bibtex.py : Quick DOI to BibTeX converter

Assets (in assets/ ):

bibtex_template.bib : Example BibTeX entries for all types
citation_checklist.md : Quality assurance checklist

External Resources

Search Engines:

Google Scholar: https://scholar.google.com/
PubMed: https://pubmed.ncbi.nlm.nih.gov/
PubMed Advanced Search: https://pubmed.ncbi.nlm.nih.gov/advanced/

Metadata APIs:

CrossRef API: https://api.crossref.org/
PubMed E-utilities: https://www.ncbi.nlm.nih.gov/books/NBK25501/
arXiv API: https://arxiv.org/help/api/
DataCite API: https://api.datacite.org/

Tools and Validators:

MeSH Browser: https://meshb.nlm.nih.gov/search
DOI Resolver: https://doi.org/
BibTeX Format: http://www.bibtex.org/Format/

Citation Styles:

BibTeX documentation: http://www.bibtex.org/
LaTeX bibliography management: https://www.overleaf.com/learn/latex/Bibliography_management

Dependencies

Required Python Packages

Core dependencies

pip install requests       # HTTP requests for APIs
pip install bibtexparser   # BibTeX parsing and formatting
pip install biopython      # PubMed E-utilities access

Optional (for Google Scholar)

pip install scholarly # Google Scholar API wrapper

or

pip install selenium # For more robust Scholar scraping

Optional Tools

For advanced validation

pip install crossref-commons   # Enhanced CrossRef API access
pip install pylatexenc         # LaTeX special character handling

Summary

The citation-management skill provides:

Comprehensive search capabilities for Google Scholar and PubMed
Automated metadata extraction from DOI, PMID, arXiv ID, URLs
Citation validation with DOI verification and completeness checking
BibTeX formatting with standardization and cleaning tools
Quality assurance through validation and reporting
Integration with scientific writing workflow
Reproducibility through documented search and extraction methods

Use this skill to maintain accurate, complete citations throughout your research and ensure publication-ready bibliographies.
