The World Bank's World Development Indicators (WDI) database provides 1,600+ time-series indicators covering poverty, health, education, infrastructure, and environment for 217 countries from 1960 onwards. It is used in data engineering for global development dashboards, longitudinal analysis pipelines, and economic research systems in Python.
The `wbdata` and `world_bank_data` Python libraries provide access to the WDI API. Engineers use `wbdata.get_dataframe(indicators_dict, country=country_codes)` to pull multiple indicators in one call. The result is a DataFrame indexed by country and date with one column per indicator; any single indicator can then be pivoted into a wide layout with year rows and country columns.
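The call above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes `wbdata` is installed and that `get_dataframe` returns a frame whose index has country as its first level and date as its second, with one column per indicator label (check this against your installed version). The helper names `fetch_indicators` and `to_year_by_country` are hypothetical and exist only for this example.

```python
import pandas as pd


def fetch_indicators(indicators, countries):
    """Fetch several WDI indicators via wbdata (requires network access)."""
    import wbdata  # imported lazily so the reshaping helper works offline

    # Assumption: the result is indexed by (country, date) and has
    # one column per label in the `indicators` dict.
    return wbdata.get_dataframe(indicators, country=countries)


def to_year_by_country(df, column):
    """Pivot one indicator column into year rows and country columns."""
    # Assumption: index level 0 holds the country and level 1 the date.
    return df[column].unstack(level=0).sort_index()


# Example usage (uncomment to run against the live API):
# indicators = {"SP.POP.TOTL": "population", "NY.GDP.PCAP.CD": "gdp_per_capita"}
# df = fetch_indicators(indicators, ["USA", "IND", "BRA"])
# print(to_year_by_country(df, "population").tail())
```

Keeping the fetch and the pivot separate means the reshaping step can be checked on a small hand-built frame before any API call is made.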
WDI is one of the most comprehensive public datasets for building AI development analytics tools. Indexing all 1,600+ indicators for a RAG system lets it answer a wide range of factual development questions with World Bank data. AI forecasting models can use WDI historical indicators as features for predicting future development outcomes.
# pip install world_bank_data pandas
import pandas as pd
import world_bank_data as wb

# Fetch multiple WDI indicators for all countries
indicators = {
    "SP.POP.TOTL": "population",
    "NY.GDP.PCAP.CD": "gdp_per_capita",
    "SH.DYN.MORT": "child_mortality",
}

# get_series fetches one indicator per call, so request each one
# separately and combine them column-wise on the (Country, Year) index.
df = pd.concat(
    {name: wb.get_series(code, simplify_index=True) for code, name in indicators.items()},
    axis=1,
)

# Show the last rows that have values for all three indicators
print(df.dropna().tail(10))
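The forecasting use case mentioned above usually starts by turning historical indicator values into lagged features. The following is a minimal sketch under stated assumptions: the frame is indexed by (country, year) with one row per consecutive year, so shifting by rows equals shifting by years. The function name `add_lag_features` and the toy numbers are hypothetical and are not real WDI data.

```python
import pandas as pd


def add_lag_features(df, columns, lags=(1, 5)):
    """Add per-country lagged copies of `columns` to a (country, year) frame."""
    out = df.sort_index().copy()
    for col in columns:
        for lag in lags:
            # Shift within each country (index level 0) so values
            # never leak from one country into another.
            out[f"{col}_lag{lag}"] = out.groupby(level=0)[col].shift(lag)
    return out


# Toy values for illustration only, not real WDI data.
index = pd.MultiIndex.from_product(
    [["A", "B"], [2000, 2001, 2002]], names=["country", "year"]
)
toy = pd.DataFrame(
    {"gdp_per_capita": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]}, index=index
)
features = add_lag_features(toy, ["gdp_per_capita"], lags=(1,))
print(features)
```

If a country has gaps in its yearly series, the row-based shift no longer corresponds to a fixed number of years, so reindexing to a complete year range first would be needed.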
More datasets used by Python data engineers.
The WHO Global Health Observatory offers datasets on a wide range of health-related indicators, including disease prevalence, mortality rates, healthcare access and more.
Access datasets on child well-being, education enrolment, nutrition, immunisation, child mortality, and child protection indicators worldwide from UNICEF. These are used in data engineering for humanitarian analytics pipelines, SDG progress tracking, and building global child health indicator dashboards in Python.
New York City's open data portal provides 3,000+ datasets covering taxi trips, 311 complaints, crime statistics, building permits, health inspections, and transit data. These are used in urban data engineering pipelines for city analytics, transportation modelling, and building geospatial dashboards in Python.