fetch-meta-to-kb

Fetch journal articles from Crossref published after a user-specified date and insert them into PostgreSQL `journals` with DOI deduplication. Use when incrementally ingesting journal metadata from `journals_issn` into `journals`.

Install skill "fetch-meta-to-kb" with this command: npx skills add tiangong-ai/skills/tiangong-ai-skills-fetch-meta-to-kb

Fetch Meta to KB

Core Goal

Pull journal-article records from Crossref after a given --from-date.
Read ISSN seed rows from journals_issn (journal, issn1).
Insert rows into journals with ON CONFLICT (doi) DO NOTHING.
Keep the implementation aligned with fetch_meta_to_kb.py.
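
The DOI deduplication described above can be illustrated with a small, hedged sketch. It is not taken from fetch_meta_to_kb.py: it uses an in-memory SQLite table as a stand-in so it runs without a PostgreSQL server, the column types are assumptions, and the helper names (`make_demo_table`, `insert_articles`) are hypothetical. The `ON CONFLICT (doi) DO NOTHING` clause has the same form in PostgreSQL, where a driver such as psycopg would use `%s` placeholders instead of `?`.

```python
import sqlite3

# Column list follows the fields named in this skill; types are assumptions.
INSERT_SQL = """
INSERT INTO journals (title, doi, journal, authors, date, abstract)
VALUES (?, ?, ?, ?, ?, ?)
ON CONFLICT (doi) DO NOTHING
"""


def make_demo_table():
    """Create an in-memory stand-in for the `journals` table (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE journals ("
        "title TEXT, doi TEXT PRIMARY KEY, journal TEXT, "
        "authors TEXT, date TEXT, abstract TEXT)"
    )
    return conn


def insert_articles(conn, rows):
    """Insert rows and return how many were actually new in this call."""
    inserted = 0
    cur = conn.cursor()
    for row in rows:
        cur.execute(INSERT_SQL, row)
        inserted += cur.rowcount  # 0 when the DOI already exists
    conn.commit()
    return inserted


# Made-up sample rows; the third repeats the first DOI and is skipped.
SAMPLE_ROWS = [
    ("Paper A", "10.1234/example.a", "Example Journal", "A. Author", "2024-05-02", None),
    ("Paper B", "10.1234/example.b", "Example Journal", "B. Author", "2024-05-03", "text"),
    ("Paper A again", "10.1234/example.a", "Example Journal", "A. Author", "2024-05-02", None),
]
```

Counting per-statement row counts, rather than querying the table afterwards, is what lets a per-run figure such as total_inserted stay independent of rows already present.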

Run Workflow

Set database connection env vars (user-managed keys prefixed with KB_):

KB_DB_HOST
KB_DB_PORT
KB_DB_NAME
KB_DB_USER
KB_DB_PASSWORD
KB_LOG_DIR (required, log output directory)
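
A pre-flight check for these variables might look like the sketch below. It is an illustration, not the script's own logic: the function name `load_kb_config` is hypothetical, and treating every `KB_DB_*` key as mandatory is an assumption (the source only marks KB_LOG_DIR as required).

```python
import os

# Keys listed above; treating all of them as required is an assumption.
REQUIRED_KEYS = (
    "KB_DB_HOST",
    "KB_DB_PORT",
    "KB_DB_NAME",
    "KB_DB_USER",
    "KB_DB_PASSWORD",
    "KB_LOG_DIR",
)


def load_kb_config(env=None):
    """Return the KB_ settings as a dict, failing early if any are missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    config = {key: env[key] for key in REQUIRED_KEYS}
    config["KB_DB_PORT"] = int(config["KB_DB_PORT"])  # ports are numeric
    return config
```

Calling `load_kb_config()` with no argument reads from the real process environment; passing a plain dict makes the check easy to exercise in isolation.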

Run incremental fetch with a required date:

python3 scripts/fetch_meta_to_kb.py --from-date 2024-05-01

If executing through an exec tool call, set timeout to 1800 seconds (30 minutes).
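
When the script is launched from Python rather than a shell, the same command and 30-minute limit could be expressed roughly as follows. This is a sketch under assumptions: `build_command` and `run_fetch` are hypothetical wrappers, and validating the date as ISO `YYYY-MM-DD` is inferred from the example above rather than documented.

```python
import subprocess
from datetime import date

TIMEOUT_SECONDS = 1800  # 30 minutes, as recommended above


def build_command(from_date):
    """Return the argv list for one incremental run."""
    date.fromisoformat(from_date)  # raises ValueError on a malformed date
    return ["python3", "scripts/fetch_meta_to_kb.py", "--from-date", from_date]


def run_fetch(from_date):
    """Run the script and return the CompletedProcess with captured output.

    subprocess.run raises subprocess.TimeoutExpired if the run exceeds
    TIMEOUT_SECONDS.
    """
    return subprocess.run(
        build_command(from_date),
        capture_output=True,
        text=True,
        timeout=TIMEOUT_SECONDS,
        check=False,  # leave exit-code handling to the caller
    )
```

Capturing stdout and stderr keeps the current run's output available for building the summary later.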

Check logs in:

${KB_LOG_DIR}/fetch-meta-to-kb-YYYYMMDD-HHMMSS.log (UTC timestamp, one file per run)
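
For reference, a file name following this pattern can be derived from a UTC timestamp as sketched below. The helper `run_log_path` is hypothetical and only mirrors the naming scheme stated above; it does not claim to reproduce how the script itself creates the file.

```python
from datetime import datetime, timezone
from pathlib import Path


def run_log_path(log_dir, started_at=None):
    """Build the per-run log path from a timezone-aware start time (default: now, UTC)."""
    if started_at is None:
        started_at = datetime.now(timezone.utc)
    # Normalise to UTC so the stamp matches the documented convention.
    stamp = started_at.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return Path(log_dir) / f"fetch-meta-to-kb-{stamp}.log"
```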

Build user-facing summary strictly from the current run output:

Prefer RUN_SUMMARY_JSON emitted by fetch_meta_to_kb.py.
If JSON is unavailable, parse only this run's ${KB_LOG_DIR}/fetch-meta-to-kb-YYYYMMDD-HHMMSS.log.
total_inserted must count only the rows inserted in this run (after DOI deduplication), not the cumulative number of rows in the table.
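
One way to pull the summary out of a run's captured output is sketched below. The exact line format emitted by fetch_meta_to_kb.py is not specified here, so this assumes the marker `RUN_SUMMARY_JSON` appears on a line followed by a single JSON object; the function name `extract_run_summary` is hypothetical, and the parsing should be adjusted to the real output.

```python
import json

MARKER = "RUN_SUMMARY_JSON"  # marker name from this skill; line layout is assumed


def extract_run_summary(output):
    """Return the last parseable summary object in `output`, or None."""
    summary = None
    for line in output.splitlines():
        marker_pos = line.find(MARKER)
        if marker_pos == -1:
            continue
        brace_pos = line.find("{", marker_pos)
        if brace_pos == -1:
            continue
        try:
            summary = json.loads(line[brace_pos:])
        except json.JSONDecodeError:
            continue  # ignore lines that mention the marker without valid JSON
    return summary
```

Because the function reads only the text it is given, feeding it this run's stdout or this run's log file keeps the reported numbers scoped to the current run.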

Behavior Contract

Query Crossref endpoint: https://api.crossref.org/journals/{issn}/works.
Filter with type:journal-article,from-pub-date:<from-date>.
Keep only items whose container-title equals target journal title (case-insensitive).
Continue pagination with cursor until no matching items remain.
Store fields in journals: title, doi, journal, authors, date, abstract (nullable when Crossref has no abstract).
Reporting/announcement metrics must use current-run log/summary only.
Do not compute announcement counts via database-wide or time-window SQL such as WHERE date >= ....
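
The query, title filter, and cursor loop in this contract can be sketched as follows. This is an illustration under assumptions, not the script's code: the function names are hypothetical, the page size of 100 is arbitrary, and `fetch_page` is a caller-supplied callable that takes a URL and returns the `message` object of the Crossref response (with `items` and `next-cursor`), so the sketch itself makes no network calls.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.crossref.org/journals"


def build_works_url(issn, from_date, cursor="*", rows=100):
    """Build the works URL for one page; cursor="*" starts a new cursor session."""
    params = {
        "filter": f"type:journal-article,from-pub-date:{from_date}",
        "rows": rows,  # page size is an assumption
        "cursor": cursor,
    }
    return f"{API_ROOT}/{issn}/works?{urlencode(params)}"


def matches_journal(item, journal_title):
    """True if any container-title equals the target title, ignoring case."""
    wanted = journal_title.strip().casefold()
    titles = item.get("container-title") or []
    return any(title.strip().casefold() == wanted for title in titles)


def iter_matching_items(fetch_page, issn, from_date, journal_title):
    """Yield matching items page by page until a page has no matches."""
    cursor = "*"
    while cursor:
        message = fetch_page(build_works_url(issn, from_date, cursor))
        matching = [
            item for item in message.get("items", [])
            if matches_journal(item, journal_title)
        ]
        if not matching:
            break  # stop once no matching items remain
        yield from matching
        cursor = message.get("next-cursor")
```

Injecting `fetch_page` keeps the HTTP client choice (and any retry or rate-limit policy) outside the pagination logic.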

Scope Boundary

Implement only Crossref incremental fetch + insert into journals.

Script

scripts/fetch_meta_to_kb.py
