UN Comtrade provides detailed bilateral trade statistics, including import, export, and re-export flows by commodity, country, and year, for more than 200 reporting nations. Data engineers use it for international trade analytics, supply chain intelligence pipelines, and global commerce dashboards built in Python.
The `comtradeapicall` Python library wraps the UN Comtrade API with pagination and rate-limit handling. Engineers query trade flows by specifying the reporter, partner, HS codes, and year range. Large queries are split by year, and the per-year results are combined in pandas for analysis across the full period.
UN Comtrade bilateral trade data enables AI systems that map global supply chain dependencies and predict trade disruption impacts. RAG pipelines built on Comtrade data help LLM analysts answer 'How much semiconductor equipment does China import from the Netherlands?' with precise official trade statistics.
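One way a RAG pipeline might prepare Comtrade rows for retrieval is to render each record as a short text passage that can be embedded or indexed. The sketch below is only an illustration: the helper name `record_to_passage` is hypothetical, the field names are assumed to match the columns returned by the API, and the sample record uses a placeholder value, not a real statistic.

```python
def record_to_passage(rec: dict) -> str:
    """Render one trade record as a sentence a retriever can index."""
    # Map Comtrade flow codes to wording; other flows fall back to a neutral phrase.
    flow = {"M": "imported from", "X": "exported to"}.get(
        rec["flowCode"], "traded with"
    )
    return (
        f"In {rec['period']}, {rec['reporterDesc']} {flow} "
        f"{rec['partnerDesc']}: {rec['cmdDesc']} "
        f"(HS {rec['cmdCode']}), primaryValue {rec['primaryValue']:,.0f}."
    )


# Placeholder record for illustration only; the value is NOT a real statistic.
# HS heading 8486 is assumed here for semiconductor manufacturing machinery.
sample = {
    "period": "2023",
    "reporterDesc": "China",
    "partnerDesc": "Netherlands",
    "flowCode": "M",
    "cmdCode": "8486",
    "cmdDesc": "semiconductor manufacturing machines",
    "primaryValue": 1234.0,
}

passage = record_to_passage(sample)
print(passage)
```

Keeping the reporter, partner, commodity code, and period in every passage lets the retriever match a question to the exact row, and lets the answer cite the figure rather than paraphrase it.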
# pip install comtradeapicall pandas
import comtradeapicall as comtrade

# Preview China's total exports to the US for 2023
df = comtrade.previewFinalData(
    typeCode="C",        # C = commodities
    freqCode="A",        # A = annual
    clCode="HS",         # Harmonized System classification
    period="2023",
    reporterCode="156",  # China
    cmdCode="TOTAL",     # all commodities
    flowCode="X",        # X = exports
    partnerCode="842",   # United States
    partner2Code=None,
    customsCode=None,
    motCode=None,
    maxRecords=10,
    format_output="JSON",
    includeDesc=True,    # populate the *Desc columns printed below
)
print(df[["reporterDesc", "partnerDesc", "primaryValue"]])
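The per-year splitting mentioned earlier can be sketched as a small helper that calls a fetch function once per year and stacks the results in pandas. This is a minimal sketch, not the library's own mechanism: `fetch_years` and `total_by_period` are hypothetical names, and the `period` and `primaryValue` column names are assumed to be present in the returned frames.

```python
from typing import Callable, Iterable, Optional

import pandas as pd


def fetch_years(
    fetch_one: Callable[[str], Optional[pd.DataFrame]],
    years: Iterable[int],
) -> pd.DataFrame:
    """Call fetch_one once per year and stack the non-empty results."""
    frames = []
    for year in years:
        df = fetch_one(str(year))
        # Skip years where the call returned nothing.
        if df is not None and not df.empty:
            frames.append(df)
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


def total_by_period(df: pd.DataFrame) -> pd.Series:
    """Sum primaryValue for each period in the combined frame."""
    return df.groupby("period")["primaryValue"].sum()


# Possible wiring with comtradeapicall (needs network access, so not run here):
# import comtradeapicall as comtrade
# fetch = lambda period: comtrade.previewFinalData(
#     typeCode="C", freqCode="A", clCode="HS", period=period,
#     reporterCode="156", cmdCode="TOTAL", flowCode="X", partnerCode="842",
#     partner2Code=None, customsCode=None, motCode=None,
#     maxRecords=10, format_output="JSON")
# combined = fetch_years(fetch, range(2019, 2024))
```

Passing the fetch function in as an argument keeps the splitting logic independent of the API client, so it can be exercised offline with a stub before being pointed at the real service.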
More datasets used by Python data engineers.
Data.gov hosts 300,000+ datasets from US federal agencies covering health, education, environment, agriculture, finance, and transportation. It is used in data engineering for government analytics pipelines, public health research, geospatial analysis, and civic data applications built with Python.
Data.gov.uk provides datasets from UK central and local government covering crime, transport, planning, health, and environment. It is used in data engineering for public sector analytics, policy research pipelines, geospatial visualisation, and civic technology applications built in Python.
Google's Open Images Dataset contains 9 million images annotated with object bounding boxes, segmentation masks, visual relationships, and image-level labels across 600 categories. It is used in computer vision data engineering pipelines for model training, benchmark evaluation, and building image classification datasets in Python.