The World Inequality Database (WID) provides long-run historical data on income and wealth distribution, top income shares, and inequality indices for 100+ countries. It is used in data engineering for economic inequality analytics, distributional analysis pipelines, and income tracking dashboards built in Python.
The `wid` Python package provides API access to the World Inequality Database. Engineers use `wid.download()` with indicator codes (e.g., 'sptinc_p90p100_z' for top 10% income share) to retrieve country-year time-series into pandas DataFrames for inequality research.
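The two indicator codes quoted in this section, 'sptinc_p90p100_z' and 'sptinc_p99p100_z', differ only in their percentile range. The helper below is a minimal sketch that builds such codes, assuming other ranges follow the same naming pattern; `share_indicator` is a hypothetical name used for illustration and is not part of the `wid` package.

```python
def share_indicator(lower: float, upper: float = 100) -> str:
    """Build an income-share indicator code for the percentile range lower..upper.

    Assumption: the naming pattern is inferred from the two codes quoted in
    this section; check it against the WID codebook before relying on it.
    """
    if not 0 <= lower < upper <= 100:
        raise ValueError("expected 0 <= lower < upper <= 100")
    # The ':g' format drops a trailing '.0', so 90 and 90.0 both render as '90'.
    return f"sptinc_p{lower:g}p{upper:g}_z"


print(share_indicator(90))  # code for the top 10% income share
print(share_indicator(99))  # code for the top 1% income share
```

Generating the code from the percentile bounds keeps the range in one place when a pipeline requests several shares in a loop.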
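WID data can also ground AI analysis tools that track wealth concentration and inequality trends. For example, a retrieval-augmented generation (RAG) system indexed on WID country profiles lets an LLM answer questions such as 'How has income inequality changed in the US since 1980?' with statistics compiled by Thomas Piketty's research group rather than figures recalled from the model's training data. The same series can serve as inputs to policy models that estimate the effects of redistribution.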
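# pip install wid pandas
import pandas as pd
import wid

# Top 1% income share in the US (share of pre-tax national income)
df = wid.download(
    indicators="sptinc_p99p100_z",
    areas=["US"],
    years=range(1980, 2023),  # 1980 through 2022
    ages=992,  # WID age code 992: adults aged 20 and over
    pop="j",   # WID population code j: equal-split adults
)
# Assumes the result has exactly two columns: year and value
df.columns = ["year", "top1_share"]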
print(df.tail(10))
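For the RAG use case described earlier, each country series usually has to be turned into short text passages before it can be indexed. The following is a minimal sketch of that step, assuming a DataFrame with `year` and `top1_share` columns where shares are stored as fractions between 0 and 1. The function name `describe_change` is hypothetical, and the sample values are placeholders, not real WID figures.

```python
import pandas as pd


def describe_change(df: pd.DataFrame, country: str, start: int, end: int) -> str:
    """Summarise the move in the top 1% share between two years as one sentence."""
    series = df.set_index("year")["top1_share"]
    first, last = series.loc[start], series.loc[end]
    # Shares are assumed to be fractions (0-1), so scale to percentage points.
    change_pp = (last - first) * 100
    if change_pp > 0:
        direction = "rose"
    elif change_pp < 0:
        direction = "fell"
    else:
        direction = "was unchanged"
    return (
        f"{country}: top 1% income share {direction} from {first:.1%} in {start} "
        f"to {last:.1%} in {end} ({change_pp:+.1f} percentage points)."
    )


# Placeholder values for illustration only, NOT real WID figures.
sample = pd.DataFrame({"year": [1980, 2000, 2020], "top1_share": [0.10, 0.15, 0.20]})
print(describe_change(sample, "US", 1980, 2020))
```

Sentences like this can be stored next to the country code and year range as metadata, so a retriever can return the passage and the LLM can quote the figures instead of guessing them.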
More datasets used by Python data engineers.
The Federal Reserve Bank of St. Louis FRED database provides over 800,000 economic time series from 100+ sources, including interest rates, inflation, GDP, and employment data. It is widely used in financial and economic data pipelines, where the `fredapi` Python library loads macro data into analytical systems.
The Federal Election Commission (FEC) provides access to campaign finance data, including political contributions, campaign expenditures, fundraising activities, and financial disclosures filed by political candidates, parties, and committees in the United States.
FiveThirtyEight publishes the datasets behind its data journalism articles covering US politics, sports analytics, economics, and culture. The datasets are available on GitHub as clean, analysis-ready CSV files, which makes them ideal for practising data loading, statistical analysis pipelines, and exploratory data workflows in Python.