AMiner Open Platform Academic Data Query
AMiner is a globally leading academic data platform that provides comprehensive academic data covering scholars, papers, institutions, journals, patents, and more.
This skill covers all 28 open APIs and organizes them into 6 practical workflows.
Before use, please generate a token in the console and set it as the environment variable AMINER_API_KEY for automatic script access.
- API Documentation: https://open.aminer.cn/open/docs
- Console (Generate Token): https://open.aminer.cn/open/board?tab=control
High-Priority Mandatory Rules (Critical)
The following four rules are the highest priority and must be followed in any query task:
- Token Security: Only check whether
AMINER_API_KEYexists; never expose the token in plain text in any location (including terminal output, logs, example results, or debug information). - Cost Control: Always prefer the optimal combined query; never perform indiscriminate full-detail retrieval. When many results are matched and the user has not specified a count, default to fetching details for only the top 10 results.
- Free-First: Prefer free APIs unless the user explicitly requires deeper fields or higher precision; only upgrade to paid APIs when free ones cannot meet the need.
- Result Links: Whenever this skill is used and the result contains entities (papers/scholars/patents/journals), an accessible URL must be appended after each entity, regardless of the scenario or output format.
Entity URL templates (mandatory):
- Paper:
https://www.aminer.cn/pub/{paper_id} - Scholar:
https://www.aminer.cn/profile/{scholar_id} - Patent:
https://www.aminer.cn/patent/{patent_id} - Journal:
https://www.aminer.cn/open/journal/detail/{journal_id}
Enforcement note: This rule applies to all returned results (including summaries, lists, details, comparative analyses, workflow outputs, and raw output transcriptions). Whenever an entity appears and a usable ID is available, a link must be attached.
Violating any of the above rules is considered a process non-compliance; execution must be immediately halted and corrected before continuing.
Step 1: Check Environment Variable Token (Required)
Before making any API call, you must first check whether the environment variable AMINER_API_KEY exists (request headers must include both: Authorization: <your_token> and X-Platform: openclaw).
Only determine whether it "exists / does not exist"; never output, echo, or log the token in plain text (including logs, terminal output, or example results).
Standard check (recommended for direct use):
if [ -z "${AMINER_API_KEY+x}" ]; then
echo "AMINER_API_KEY does not exist"
else
echo "AMINER_API_KEY exists"
fi
- If the token already exists in the environment variable: proceed with the subsequent query workflow.
- If no token is in the environment variable: check whether the user has explicitly provided
--token. - If neither the environment variable nor
--tokenis available: stop immediately; do not call any API or enter any subsequent workflow; guide the user to obtain a token first.
Recommended token configuration (preferred):
- Go to the AMiner Console, log in, and generate an API Token.
- Set the token as an environment variable:
export AMINER_API_KEY="<TOKEN>" - The script reads the environment variable
AMINER_API_KEYby default (if--tokenis explicitly provided, it takes precedence).
Guidance when no token is available:
- Clearly inform the user: "A token is currently missing; AMiner API calls cannot continue."
- Direct the user to the AMiner Console to log in and generate an API Token.
- For assistance, refer to the Open Platform Documentation.
- Prompt the user to continue after obtaining a token; they can reply directly:
Here is my token: <TOKEN>
The token can be generated in the console and reused within its validity period. Do not execute any data query steps before obtaining a token.
Quick Start (Python Script)
All workflows can be driven through scripts/aminer_client.py:
# Recommended: set the environment variable first (no need to pass --token repeatedly)
export AMINER_API_KEY="<TOKEN>"
# Scholar profile analysis
python scripts/aminer_client.py --action scholar_profile --name "Andrew Ng"
# Paper deep dive (with citation chain)
python scripts/aminer_client.py --action paper_deep_dive --title "Attention is all you need"
# Institution research capability analysis
python scripts/aminer_client.py --action org_analysis --org "Tsinghua University"
# Journal paper monitoring (specify year)
python scripts/aminer_client.py --action venue_papers --venue "Nature" --year 2024
# Academic Q&A (natural language query)
python scripts/aminer_client.py --action paper_qa --query "latest advances in transformer architecture"
# Patent search and details
python scripts/aminer_client.py --action patent_search --query "quantum computing"
You can also call a single API directly:
python scripts/aminer_client.py --action raw \
--api paper_search --params '{"title": "BERT", "page": 0, "size": 5}'
# Or temporarily override the environment variable by explicitly passing --token
python scripts/aminer_client.py --token <TOKEN> --action raw \
--api paper_search --params '{"title": "BERT", "page": 0, "size": 5}'
Raw mode error-prevention rules (mandatory):
- Before calling, verify the function signature (parameter names and types must match exactly); never "guess parameters by semantics".
- Raw parameter constraints are governed by
references/api-catalog.md; if it conflicts with prior knowledge, the catalog always takes precedence. paper_infois only for batch basic information; the parameter must be{"ids": [...]}.paper_detailonly supports single-paper details; the parameter must be{"paper_id": "..."}. Never passids.- When multiple paper details are needed: first use low-cost interfaces for filtering (e.g.,
paper_info/paper_search_pro), then callpaper_detailonly for the target subset (default top 10 if the user has not specified a count). - Before executing, output "the function name to be called + parameter JSON" for self-inspection, then make the request.
Stability and Failure Handling Strategy (Must Read)
The client scripts/aminer_client.py has built-in request retry and fallback strategies to reduce the impact of network fluctuations and transient service errors on results.
- Timeout and Retry
- Default request timeout:
30s - Maximum retries:
3 - Backoff strategy: exponential backoff (
1s -> 2s -> 4s) + random jitter
- Default request timeout:
- Retryable Status Codes
408 / 429 / 500 / 502 / 503 / 504
- Non-Retryable Scenarios
- Common
4xxerrors (e.g., parameter errors, authentication issues) are not retried by default; an error structure is returned directly.
- Common
- Workflow Fallback
paper_deep_dive: automatically falls back topaper_search_proifpaper_searchyields no results.paper_qa: automatically falls back topaper_search_proif thequerymode yields no results.
- Traceable Call Chain
- Combined workflow output includes
source_api_chain, marking which APIs were combined to produce the result.
- Combined workflow output includes
Paper Search API Selection Guide
When the user says "search for papers", first determine whether the goal is "find an ID", "filter results", "Q&A", or "generate an analysis report", then select the API:
| API | Focus | Use Case | Cost |
|---|---|---|---|
paper_search | Title search, quickly get paper_id | Known paper title, locate the target paper first | Free |
paper_search_pro | Multi-condition search and sorting (author/institution/journal/keywords) | Topic search, sort by citations or year | ¥0.01/call |
paper_qa_search | Natural language Q&A / topic keyword search | User describes need in natural language; semantic search first | ¥0.05/call |
paper_list_by_search_venue | Returns more complete paper info (suitable for analysis) | Need richer fields for analysis/reports | ¥0.30/call |
paper_list_by_keywords | Multi-keyword batch retrieval | Batch thematic retrieval (e.g., AlphaFold + protein folding) | ¥0.10/call |
paper_detail_by_condition | Retrieve details by year + journal dimension | Journal annual monitoring, venue selection analysis | ¥0.20/call |
Recommended routing (default):
- Known title:
paper_search -> paper_detail -> paper_relation - Conditional filtering:
paper_search_pro -> paper_detail - Natural language Q&A:
paper_qa_search(fall back topaper_search_proif no results) - Journal annual analysis:
venue_search -> venue_paper_relation -> paper_detail_by_condition
Supplementary rules (strongly recommended):
- When searching by title only, always use
paper_searchfirst (free) to quickly locate the paper ID. - For complex semantic retrieval (natural language, multi-condition, fuzzy expressions), prefer
paper_qa_search. - When using
paper_qa_search, first break the natural language need into structured conditions, then fill in the fields (e.g., year, topic keywords, author/institution, etc.). queryandtopic_high/topic_middle/topic_loware mutually exclusive: choose one; do not pass both simultaneously.- When using
querymode, fill in a natural language string directly; when usingtopic_*mode, first expand with synonyms/English variants before filling in. - Example: querying "AI-related papers from 2012":
year→[2012]- Option A:
query→"artificial intelligence" - Option B:
topic_high→[["artificial intelligence","ai","Artificial Intelligence"]](withuse_topicenabled)
Handling Out-of-Workflow Requests (Required)
When the user's request falls outside the 6 workflows above, or existing workflows cannot directly cover it, the following steps must be executed:
- First read
references/api-catalog.mdto confirm available interfaces, parameter constraints, and response fields. - Select the most appropriate API based on the user's goal and design the shortest viable call chain (locate ID first, then supplement details, then expand relationships).
- When necessary, combine multiple APIs to complete the query, and annotate
source_api_chainin the result to clearly describe the data source path. - If multiple combination approaches exist, prefer the one with lower cost, higher stability, and fields that satisfy the requirement.
- Use the "optimal query combination" as much as possible; avoid indiscriminate full retrieval; perform low-cost search and filtering first, then fetch details for a small set of targets.
- When results are large and the user has not specified a count, default to querying only the top 10 details and returning a summary first; for example, when 1000 papers are matched, do not call the detail API for all 1000 to reduce user costs.
- For
rawcalls, parameter-level validation is required: e.g.,paper_infousesids,paper_detailusespaper_id; do not mix them up. - When the user has not explicitly requested deep information, prefer the free path (e.g.,
paper_search/paper_info/venue_search); only supplement with necessary paid APIs after confirming they are insufficient. - When returning the final entity list, the corresponding URL must be included; if entity IDs are missing, supplement them before outputting results.
Do not give up on a query simply because "no existing workflow fits"; actively complete the API combination based on
api-catalog.
6 Combined Workflows
Workflow 1: Scholar Profile
Use Case: Understand a scholar's complete academic profile, including bio, research interests, published papers, patents, and research projects.
Call Chain:
Scholar search (name → person_id)
↓
Parallel calls:
├── Scholar details (bio/education/honors)
├── Scholar portrait (research interests/experience/work history)
├── Scholar papers (paper list)
├── Scholar patents (patent list)
└── Scholar projects (research projects/funding info)
Command:
python scripts/aminer_client.py --token <TOKEN> --action scholar_profile --name "Yann LeCun"
Sample output fields:
- Basic info: name, institution, title, gender
- Personal bio (bilingual)
- Research interests and domains
- Education history (structured)
- Work experience (structured)
- Paper list (ID + title)
- Patent list (ID + title)
- Research projects (title/funding amount/dates)
Workflow 2: Paper Deep Dive
Use Case: Retrieve complete paper information and citation relationships based on a paper title or keywords.
Call Chain:
Paper search / Paper search pro (title/keyword → paper_id)
↓
Paper details (abstract/authors/DOI/journal/year/keywords)
↓
Paper citations (which papers this paper cites → cited_ids)
↓
(Optional) Batch retrieve basic info for cited papers
Command:
# Search by title
python scripts/aminer_client.py --token <TOKEN> --action paper_deep_dive --title "BERT"
# Search by keyword (using pro API)
python scripts/aminer_client.py --token <TOKEN> --action paper_deep_dive \
--keyword "large language model" --author "Hinton" --order n_citation
Workflow 3: Org Analysis
Use Case: Analyze an institution's scholar size, paper output, and patent count; suitable for competitive research or partnership evaluation.
Call Chain:
Org disambiguation pro (raw string → org_id, handles alias/full-name differences)
↓
Parallel calls:
├── Org details (description/type/founding date)
├── Org scholars (scholar list)
├── Org papers (paper list)
└── Org patents (patent ID list, supports pagination, up to 10,000)
If multiple institutions share the same name, org search returns a candidate list; use org disambiguation pro for precise matching.
Command:
python scripts/aminer_client.py --token <TOKEN> --action org_analysis --org "MIT"
# Specify raw string (with abbreviation/alias)
python scripts/aminer_client.py --token <TOKEN> --action org_analysis --org "Massachusetts Institute of Technology, CSAIL"
Workflow 4: Venue Papers
Use Case: Track papers published in a specific journal for a given year; useful for submission research or research trend analysis.
Call Chain:
Venue search (name → venue_id)
↓
Venue details (ISSN/type/abbreviation)
↓
Venue papers (venue_id + year → paper_id list)
↓
(Optional) Batch paper detail query
Command:
python scripts/aminer_client.py --token <TOKEN> --action venue_papers --venue "NeurIPS" --year 2023
Workflow 5: Paper QA Search
Use Case: Intelligently search for papers using natural language or structured keywords; supports SCI filtering, citation-based sorting, author/institution constraints.
Core API: Paper QA Search (¥0.05/call), supports:
query: natural language question; system automatically breaks it into keywordstopic_high/middle/low: fine-grained keyword weight control (nested array OR/AND logic)sci_flag: show SCI papers onlyforce_citation_sort: sort by citation countforce_year_sort: sort by yearauthor_terms / org_terms: filter by author name or institution nameauthor_id / org_id: filter by author ID or institution ID (recommended for disambiguation)venue_ids: filter by conference/journal ID list
Command:
# Natural language Q&A
python scripts/aminer_client.py --token <TOKEN> --action paper_qa \
--query "deep learning methods for protein structure prediction"
# Fine-grained keyword search (must contain A and B, bonus for C)
python scripts/aminer_client.py --token <TOKEN> --action paper_qa \
--topic_high '[["transformer","self-attention"],["protein folding"]]' \
--topic_middle '[["AlphaFold"]]' \
--sci_flag --sort_citation
Workflow 6: Patent Analysis
Use Case: Search for patents in a specific technology domain, or retrieve a scholar's/institution's patent portfolio.
Call Chain (standalone search):
Patent search (query → patent_id)
↓
Patent details (abstract/filing date/application number/assignee/inventor)
Call Chain (via scholar/institution):
Scholar search → Scholar patents (patent_id list)
Org disambiguation → Org patents (patent_id list)
↓
Patent info / Patent details
Command:
python scripts/aminer_client.py --token <TOKEN> --action patent_search --query "quantum computing chip"
python scripts/aminer_client.py --token <TOKEN> --action scholar_patents --name "Shou-Cheng Zhang"
Individual API Quick Reference
For complete parameter descriptions, read
references/api-catalog.md
| # | Title | Method | Price | API Path (Base domain: datacenter.aminer.cn/gateway/open_platform) |
|---|---|---|---|---|
| 1 | Paper QA Search | POST | ¥0.05 | /api/paper/qa/search |
| 2 | Scholar Search | POST | Free | /api/person/search |
| 3 | Paper Search | GET | Free | /api/paper/search |
| 4 | Paper Search Pro | GET | ¥0.01 | /api/paper/search/pro |
| 5 | Patent Search | POST | Free | /api/patent/search |
| 6 | Org Search | POST | Free | /api/organization/search |
| 7 | Venue Search | POST | Free | /api/venue/search |
| 8 | Scholar Details | GET | ¥1.00 | /api/person/detail |
| 9 | Scholar Projects | GET | ¥3.00 | /api/project/person/v3/open |
| 10 | Scholar Papers | GET | ¥1.50 | /api/person/paper/relation |
| 11 | Scholar Patents | GET | ¥1.50 | /api/person/patent/relation |
| 12 | Scholar Portrait | GET | ¥0.50 | /api/person/figure |
| 13 | Paper Info | POST | Free | /api/paper/info |
| 14 | Paper Details | GET | ¥0.01 | /api/paper/detail |
| 15 | Paper Citations | GET | ¥0.10 | /api/paper/relation |
| 16 | Patent Info | GET | Free | /api/patent/info |
| 17 | Patent Details | GET | ¥0.01 | /api/patent/detail |
| 18 | Org Details | POST | ¥0.01 | /api/organization/detail |
| 19 | Org Patents | GET | ¥0.10 | /api/organization/patent/relation |
| 20 | Org Scholars | GET | ¥0.50 | /api/organization/person/relation |
| 21 | Org Papers | GET | ¥0.10 | /api/organization/paper/relation |
| 22 | Venue Details | POST | ¥0.20 | /api/venue/detail |
| 23 | Venue Papers | POST | ¥0.10 | /api/venue/paper/relation |
| 24 | Org Disambiguation | POST | ¥0.01 | /api/organization/na |
| 25 | Org Disambiguation Pro | POST | ¥0.05 | /api/organization/na/pro |
| 26 | Paper Search by Venue | GET | ¥0.30 | /api/paper/list/by/search/venue |
| 27 | Paper Batch Query | GET | ¥0.10 | /api/paper/list/citation/by/keywords |
| 28 | Paper Details by Year and Venue | GET | ¥0.20 | /api/paper/platform/allpubs/more/detail/by/ts/org/venue |
References
- Full API parameter documentation: read
references/api-catalog.md - Python client source:
scripts/aminer_client.py - Test cases:
evals/evals.json - Official documentation: https://open.aminer.cn/open/docs
- Console: https://open.aminer.cn/open/board?tab=control