Hong Kong Open Data API
Official Documentation: https://data.gov.hk/en/help/api-spec
Overview
DATA.GOV.HK provides multiple APIs for accessing government open data. All APIs use GMT+8 timezone.
API Types:
- CKAN API - Dataset discovery and metadata
- Data Filtering API - Query and filter dataset resources
- Historical Archive API - Access historical versions
- Nearest Facilities API - Location-based queries
Language Endpoints
All CKAN APIs have language-specific endpoints:
| Language | Endpoint Pattern |
|---|---|
| English | https://data.gov.hk/en-data/api/3/action/... |
| Traditional Chinese | https://data.gov.hk/tc-data/api/3/action/... |
| Simplified Chinese | https://data.gov.hk/sc-data/api/3/action/... |
CRITICAL: Always choose the correct language endpoint for your users.
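The endpoint pattern in the table can be captured in a small helper so the language is chosen in one place. This is a minimal sketch: the function name `ckan_action_url` and the use of the short codes `en`, `tc`, `sc` as arguments are illustrative choices, not part of the official API.

```python
# Language prefixes taken from the endpoint table above.
CKAN_LANGUAGES = {"en", "tc", "sc"}


def ckan_action_url(action: str, lang: str = "en") -> str:
    """Build a CKAN action URL for the given language (hypothetical helper)."""
    if lang not in CKAN_LANGUAGES:
        raise ValueError(f"Unsupported language code: {lang!r}")
    return f"https://data.gov.hk/{lang}-data/api/3/action/{action}"


print(ckan_action_url("package_list", "tc"))
# https://data.gov.hk/tc-data/api/3/action/package_list
```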
CKAN API (Dataset Discovery)
List All Datasets
import requests
# Get all dataset IDs (English)
response = requests.get("https://data.gov.hk/en-data/api/3/action/package_list")
datasets = response.json()["result"]
Parameters: limit (int, optional), offset (int, optional)
# Get first 10 datasets with pagination
response = requests.get(
"https://data.gov.hk/en-data/api/3/action/package_list",
params={"limit": 10, "offset": 0}
)
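To walk the full list with limit and offset, a generator along these lines may help. It is a sketch under one assumption that the source does not confirm: a page shorter than the requested size marks the end of the list. The names `iter_dataset_ids` and `fetch_page` are hypothetical.

```python
from typing import Callable, Iterator, List


def iter_dataset_ids(
    fetch_page: Callable[[int, int], List[str]],
    page_size: int = 100,
) -> Iterator[str]:
    """Yield dataset IDs page by page.

    fetch_page(limit, offset) must return one page of IDs.
    Assumption: a page shorter than page_size means there is no more data.
    """
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        if len(page) < page_size:
            break
        offset += page_size
```

Here `fetch_page` could wrap the `requests.get(..., params={"limit": limit, "offset": offset})` call shown above and return `response.json()["result"]`. Injecting it as a callable keeps the paging logic testable without network access.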
Get Dataset Metadata
# Get details of a specific dataset
dataset_id = "hk-td-tis_2-traffic-snapshot-images"
response = requests.get(
"https://data.gov.hk/en-data/api/3/action/package_show",
params={"id": dataset_id}
)
metadata = response.json()["result"]
resources = metadata["resources"] # List of data files/URLs
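A dataset often lists several resources, so it can be useful to pick out only the files of one type before querying them. The sketch below assumes each resource dictionary carries `url` and `format` keys, as is conventional for CKAN; check the actual metadata before relying on this. The helper name `resource_urls` is hypothetical.

```python
from typing import Any, Dict, List


def resource_urls(metadata: Dict[str, Any], fmt: str) -> List[str]:
    """Return URLs of resources whose "format" matches fmt, ignoring case.

    Assumes the CKAN-style keys "resources", "format" and "url".
    """
    wanted = fmt.strip().lower()
    return [
        res["url"]
        for res in metadata.get("resources", [])
        if str(res.get("format", "")).strip().lower() == wanted and res.get("url")
    ]
```

With the `metadata` from the example above, `resource_urls(metadata, "csv")` would return only the CSV resource URLs, if any exist.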
List Data Categories
response = requests.get("https://data.gov.hk/en-data/api/3/action/group_list")
categories = response.json()["result"]
# Returns: ["city-management", "transport", "health", ...]
Get Datasets by Category
category_id = "transport" # Use category ID from group_list
response = requests.get(
"https://data.gov.hk/en-data/api/3/action/group_show",
params={"id": category_id}
)
datasets = response.json()["result"]["packages"]
Category IDs: city-management, climate-and-weather, commerce-and-industry, social-welfare, development, education, legislature, employment-and-labour, environment, finance, food, health, housing, law-and-security, population, recreation-and-culture, information-technology-and-broadcasting, tourism, transport, miscellaneous
Data Filtering API
CRITICAL: Use v2 endpoint, not v1.
Endpoint: https://api.data.gov.hk/v2/filter
Method: GET with URL-encoded JSON query parameter
Query Structure
Columns are 1-indexed (not 0-indexed).
import requests
import urllib.parse
import json
query = {
"resource": "http://www.cr.gov.hk/datagovhk/psi/ml_licensees.csv",
"section": 1, # Required if dataset has sections (default: 1)
"format": "json", # Options: csv (default), json, xml
"filters": [
[1, "eq", ["3983"]], # Column 1 equals "3983"
[2, "nct", ["2021"]], # Column 2 does not contain "2021"
[3, "gt", ["100"]] # Column 3 greater than 100
],
"sorts": [
[4, "asc"], # Sort column 4 ascending
[7, "desc"] # Then column 7 descending
]
}
# URL encode the JSON query
url = f"https://api.data.gov.hk/v2/filter?q={urllib.parse.quote(json.dumps(query))}"
response = requests.get(url)
data = response.json()
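The encode step is easy to get wrong, so it may be worth wrapping it in a helper together with a decoder for debugging. This is a sketch, not an official client: `build_filter_url` and `decode_filter_url` are hypothetical names, and the defaults (`section=1`, `fmt="json"`) simply mirror the example above.

```python
import json
import urllib.parse

FILTER_ENDPOINT = "https://api.data.gov.hk/v2/filter"


def build_filter_url(resource, filters=None, sorts=None, section=1, fmt="json"):
    """Return a v2 filter URL with the JSON query URL-encoded into q."""
    query = {"resource": resource, "section": section, "format": fmt}
    if filters:
        query["filters"] = filters
    if sorts:
        query["sorts"] = sorts
    return f"{FILTER_ENDPOINT}?q={urllib.parse.quote(json.dumps(query))}"


def decode_filter_url(url):
    """Recover the JSON query from a URL built above (useful when debugging)."""
    query_string = urllib.parse.urlparse(url).query
    return json.loads(urllib.parse.parse_qs(query_string)["q"][0])
```

Round-tripping a URL through `decode_filter_url` is a quick way to confirm that the filters sent to the API are the ones intended.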
Filter Operators
Text Only:
- eq - equals (1 operand)
- ne - not equals (1 operand)
- in - is in (2+ operands)
- ni - is not in (2+ operands)
- ct - contains (1 operand)
- nct - does not contain (1 operand)
- bw - begins with (1 operand)
- nbw - does not begin with (1 operand)
- ew - ends with (1 operand)
- new - does not end with (1 operand)
Number Only:
- lt - less than (1 operand)
- le - less than or equal (1 operand)
- gt - greater than (1 operand)
- ge - greater than or equal (1 operand)
- bt - between (2 operands)
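The operand counts listed above can be checked locally before a request is sent. The following is a sketch of such a check; `check_filter` is a hypothetical helper, and it validates only the rules stated in this document (1-indexed columns, known operators, operand counts, string operands), not whether a column is text or numeric.

```python
# Operand counts copied from the operator lists above: (minimum, maximum or None).
OPERAND_COUNTS = {
    "eq": (1, 1), "ne": (1, 1), "in": (2, None), "ni": (2, None),
    "ct": (1, 1), "nct": (1, 1), "bw": (1, 1), "nbw": (1, 1),
    "ew": (1, 1), "new": (1, 1),
    "lt": (1, 1), "le": (1, 1), "gt": (1, 1), "ge": (1, 1), "bt": (2, 2),
}


def check_filter(filter_spec):
    """Raise ValueError if a [column, operator, operands] filter is malformed."""
    column, operator, operands = filter_spec
    if not isinstance(column, int) or column < 1:
        raise ValueError("column must be a 1-indexed integer")
    if operator not in OPERAND_COUNTS:
        raise ValueError(f"unknown operator: {operator!r}")
    low, high = OPERAND_COUNTS[operator]
    count = len(operands)
    if count < low or (high is not None and count > high):
        raise ValueError(f"{operator!r} got {count} operand(s)")
    if not all(isinstance(value, str) for value in operands):
        raise ValueError("operands must be strings, e.g. '100' not 100")
```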
Example - Multiple Values with "in":
query = {
"resource": "...",
"filters": [
[2, "in", ["Kowloon", "Hong Kong Island", "New Territories"]]
]
}
Example - Range with "bt":
query = {
"resource": "...",
"filters": [
[5, "bt", ["100", "500"]] # Column 5 between 100 and 500
]
}
CRITICAL: Always include "section": 1 if the dataset has multiple sections. Check the dataset page or metadata to confirm.
Historical Archive API
Endpoint: https://app.data.gov.hk/v1/historical-archive/...
Note: Latest historical data is from yesterday (not today).
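Because the APIs use GMT+8 and the archive lags by one day, the latest usable date can be computed rather than hard-coded. This is a minimal sketch; `latest_archive_date` is a hypothetical helper, and it assumes that any `now` value passed in is a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone

HKT = timezone(timedelta(hours=8))  # GMT+8, the timezone used by all the APIs


def latest_archive_date(now=None):
    """Return yesterday's date in GMT+8 as a YYYYMMDD string.

    now, if given, should be timezone-aware; it defaults to the current time.
    """
    now = now or datetime.now(HKT)
    return (now.astimezone(HKT) - timedelta(days=1)).strftime("%Y%m%d")
```

The result can be passed as the `end` parameter of the archive calls so that requests never ask for today's data.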
List Historical Files
response = requests.get(
"https://app.data.gov.hk/v1/historical-archive/list-files",
params={
"start": "20230101", # YYYYMMDD
"end": "20231231", # YYYYMMDD
"category": "climate-and-weather", # Optional
"provider": "hk-hko", # Optional
"format": "csv", # Optional: file extension
"search": "temperature", # Optional: keyword
"order": "url", # Optional: dataset-en, dataset-tc, resource-en, etc.
"skip": 0 # Optional: pagination offset
}
)
files = response.json()["result"]
List File Versions
response = requests.get(
"https://app.data.gov.hk/v1/historical-archive/list-file-versions",
params={
"url": "http://www.cr.gov.hk/datagovhk/psi/ml_licensees.csv",
"start": "20230101",
"end": "20231231"
}
)
versions = response.json()["result"]
# Note: Maximum 10,000 results returned
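One way to stay under the 10,000-result cap is to query shorter date windows and combine the results. The sketch below only splits a range into windows; it assumes that `start` and `end` are both inclusive, which the source does not state, and `date_windows` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def date_windows(start, end, days=30):
    """Yield (start, end) YYYYMMDD pairs covering start..end in windows of `days` days.

    Assumption: both bounds are inclusive, so windows do not overlap.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    current = datetime.strptime(start, "%Y%m%d").date()
    last = datetime.strptime(end, "%Y%m%d").date()
    while current <= last:
        window_end = min(current + timedelta(days=days - 1), last)
        yield current.strftime("%Y%m%d"), window_end.strftime("%Y%m%d")
        current = window_end + timedelta(days=1)
```

Each pair can then be used as the `start` and `end` parameters of a separate `list-file-versions` request.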
Download Historical File
response = requests.get(
"https://app.data.gov.hk/v1/historical-archive/get-file",
params={
"url": "http://www.cr.gov.hk/datagovhk/psi/ml_licensees.csv",
"time": "202301011230" # YYYYMMDDHHmm
}
)
# Response: 302 redirect to file download
# Follow redirect to get file content
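The `time` parameter has to be formatted as YYYYMMDDHHmm, which is easy to derive from a `datetime`. The helper below is a sketch with a hypothetical name, `get_file_url`; it assumes the datetime passed in already represents GMT+8 local time.

```python
from datetime import datetime
from urllib.parse import urlencode

GET_FILE_ENDPOINT = "https://app.data.gov.hk/v1/historical-archive/get-file"


def get_file_url(resource_url, when):
    """Build a get-file URL for resource_url as of `when` (assumed to be GMT+8)."""
    params = {"url": resource_url, "time": when.strftime("%Y%m%d%H%M")}
    return f"{GET_FILE_ENDPOINT}?{urlencode(params)}"


print(get_file_url(
    "http://www.cr.gov.hk/datagovhk/psi/ml_licensees.csv",
    datetime(2023, 1, 1, 12, 30),
))
```

`requests.get` follows redirects by default, so passing this URL to it should return the file content after the 302 without extra handling.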
Download Schema or Data Dictionary
# Schema
response = requests.get(
"https://app.data.gov.hk/v1/historical-archive/get-schema",
params={
"url": "DATASET_URL", # Dataset URL, not resource URL
"date": "20230101"
}
)
# Data Dictionary
response = requests.get(
"https://app.data.gov.hk/v1/historical-archive/get-data-dictionary",
params={
"url": "DATASET_URL",
"date": "20230101"
}
)
Nearest Facilities API
Endpoint: https://api.data.gov.hk/v1/nearest-schools
response = requests.get(
"https://api.data.gov.hk/v1/nearest-schools",
params={
"lat": 22.2812, # WGS84 latitude
"long": 114.1659, # WGS84 longitude
"max": 5 # Max results (optional, default: all, limit: 100)
}
)
schools = response.json()["results"]
# Results ordered by distance ascending
Related Dataset: School Location and Information
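Validating coordinates and the result cap before calling the endpoint can catch simple mistakes early. This sketch builds the `params` dictionary for the request above; `nearest_schools_params` is a hypothetical helper, and the only API-specific rule it encodes is the limit of 100 noted in the example.

```python
def nearest_schools_params(lat, long, max_results=None):
    """Return query parameters for the nearest-schools endpoint after basic checks."""
    if not -90 <= lat <= 90:
        raise ValueError("lat must be between -90 and 90")
    if not -180 <= long <= 180:
        raise ValueError("long must be between -180 and 180")
    params = {"lat": lat, "long": long}
    if max_results is not None:
        if not 1 <= max_results <= 100:
            raise ValueError("max must be between 1 and 100")
        params["max"] = max_results
    return params
```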
Common Mistakes
| Mistake | Fix |
|---|---|
| Using v1 Filtering API | Use https://api.data.gov.hk/v2/filter |
| 0-indexed columns | Columns are 1-indexed |
| Missing section parameter | If dataset has sections, always include "section": 1 |
| Wrong language endpoint | Use en-data for English, tc-data for Traditional Chinese, sc-data for Simplified Chinese |
| Querying today's historical data | The latest historical data is from yesterday, not today |
| Not following redirects | Historical file download returns 302 redirect - follow it |
| Passing filter operands as numbers | All filter operands must be strings: "100" not 100 |
| Forgetting URL encoding | Always urllib.parse.quote(json.dumps(query)) |
| Exceeding 10,000 result limit | Historical File Version API limits to 10,000 results |
Workflow: Getting Started
- Discover datasets using CKAN package_list or group_show
- Get metadata using package_show to find resource URLs
- Query data using v2 Filtering API with proper filters and sections
- For historical data use Historical Archive APIs
Error Handling
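# Filtering API returns HTTP 200 with JSON on success
response = requests.get(url)
if response.status_code == 200:
    data = response.json()
else:
    # Error details are expected in the JSON body; fall back to raw text otherwise
    try:
        error = response.json()
        print(error.get("error", {}).get("message", "Unknown error"))
    except ValueError:
        print(f"HTTP {response.status_code}: {response.text}")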
# Historical Archive API
# 200 - Success
# 302 - Redirect (for file downloads)
# 400 - Syntax error (check required parameters)
# 404 - File not available (for historical versions)
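The Historical Archive status codes listed above can be turned into readable messages with a simple lookup. This is a sketch; `describe_archive_status` is a hypothetical helper, and codes outside the documented four are reported as unexpected rather than interpreted.

```python
# Meanings copied from the status list above.
ARCHIVE_STATUS = {
    200: "Success",
    302: "Redirect (for file downloads)",
    400: "Syntax error (check required parameters)",
    404: "File not available (for historical versions)",
}


def describe_archive_status(status_code):
    """Return a short description for a Historical Archive API status code."""
    return ARCHIVE_STATUS.get(status_code, f"Unexpected status {status_code}")
```

Calling `describe_archive_status(response.status_code)` after an archive request gives a message suitable for logging.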
Provider IDs (Common)
| Provider ID | Name |
|---|---|
| hk-dpo | Digital Policy Office |
| hk-hko | Hong Kong Observatory |
| hk-td | Transport Department |
| hk-fehd | Food and Environmental Hygiene Department |
| mtr | MTR Corporation Limited |
| hospital | Hospital Authority |
Full provider list: 90+ providers available in API documentation.
Quick Reference
| Task | API | Endpoint |
|---|---|---|
| List all datasets | CKAN | /api/3/action/package_list |
| Get dataset metadata | CKAN | /api/3/action/package_show?id=... |
| List categories | CKAN | /api/3/action/group_list |
| Get datasets by category | CKAN | /api/3/action/group_show?id=... |
| Filter/query data | Filtering | https://api.data.gov.hk/v2/filter?q=... |
| List historical files | Archive | /v1/historical-archive/list-files?... |
| Get file versions | Archive | /v1/historical-archive/list-file-versions?... |
| Download historical file | Archive | /v1/historical-archive/get-file?... |
| Find nearest schools | Facilities | /v1/nearest-schools?lat=...&long=... |
Real-World Example: Traffic Data
import requests
import urllib.parse
import json
# Step 1: Discover traffic datasets
response = requests.get(
"https://data.gov.hk/en-data/api/3/action/group_show",
params={"id": "transport"}
)
datasets = response.json()["result"]["packages"]
# Step 2: Get specific dataset metadata
dataset_id = "hk-td-tis_2-traffic-snapshot-images"
response = requests.get(
"https://data.gov.hk/en-data/api/3/action/package_show",
params={"id": dataset_id}
)
resources = response.json()["result"]["resources"]
resource_url = resources[0]["url"]
# Step 3: Filter the data
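query = {
"resource": resource_url,
"section": 1,
"format": "json",
"filters": [
[1, "eq", ["HIGHWAY"]], # Illustrative column and value; check the resource's actual columns
[5, "gt", ["2024-01-01"]] # gt is a number-only operator, so confirm it suits this column
],
"sorts": [[3, "desc"]]
}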
url = f"https://api.data.gov.hk/v2/filter?q={urllib.parse.quote(json.dumps(query))}"
response = requests.get(url)
traffic_data = response.json()