Access demographic, economic, social, and geographic datasets from the US Census Bureau, including the American Community Survey (ACS), the decennial census, and the economic census. Used in data engineering for population analysis pipelines, market research, geospatial enrichment, and building socioeconomic dashboards in Python.
The `census` Python library provides clean access to the Census Bureau API with variable name lookups. Engineers query ACS 5-Year estimates by state, county, or tract, building DataFrames of demographic variables for geographic market analysis and target audience sizing.
Census data enables AI systems that reason about US demographics and market characteristics. Build a RAG system indexed on census tract profiles so an LLM agent can answer questions such as "What is the median household income in this ZIP code?" Demographic features from the Census are also key inputs to location-based AI models.
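As a rough illustration of the tract-profile idea, the sketch below turns tract-level rows into short text profiles keyed by GEOID. The sample rows, the helper names (`tract_geoid`, `build_profile`, `build_index`, `retrieve`), and the exact-key lookup are assumptions made for this example. A real RAG system would typically embed the profile text and search it with a vector index rather than a dictionary.

```python
# Hypothetical sample rows shaped like the dicts the `census` library returns
# for a tract-level query; all names and values are made up for illustration.
SAMPLE_ROWS = [
    {"NAME": "Census Tract 1, Example County, Example State",
     "B19013_001E": 72500.0,   # median household income
     "B01003_001E": 4310.0,    # total population
     "state": "01", "county": "001", "tract": "000100"},
    {"NAME": "Census Tract 2, Example County, Example State",
     "B19013_001E": 48200.0,
     "B01003_001E": 2875.0,
     "state": "01", "county": "001", "tract": "000200"},
]


def tract_geoid(row):
    """Join state (2), county (3), and tract (6) codes into an 11-digit GEOID."""
    return row["state"] + row["county"] + row["tract"]


def build_profile(row):
    """Render one row as a sentence an LLM can quote from."""
    return (
        f"{row['NAME']} (GEOID {tract_geoid(row)}): "
        f"median household income ${row['B19013_001E']:,.0f}; "
        f"total population {row['B01003_001E']:,.0f}."
    )


def build_index(rows):
    """Map GEOID -> profile text. Stand-in for an embedding index."""
    return {tract_geoid(r): build_profile(r) for r in rows}


def retrieve(index, geoid):
    """Exact-key lookup; returns None when the tract is not indexed."""
    return index.get(geoid)


index = build_index(SAMPLE_ROWS)
print(retrieve(index, "01001000100"))
```

Answering a question phrased in terms of a ZIP code would additionally require mapping the ZIP code to tracts, which this sketch does not attempt.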
# pip install census pandas
from census import Census
import pandas as pd

# Replace with your own Census Bureau API key
c = Census("YOUR_API_KEY")

# Median household income (B19013_001E) for every county in every state
data = c.acs5.state_county(
    ("NAME", "B19013_001E"),
    Census.ALL,  # all states
    Census.ALL,  # all counties
)

df = pd.DataFrame(data).rename(columns={"B19013_001E": "median_income"})
# Five counties with the highest median household income
print(df.nlargest(5, "median_income")[["NAME", "median_income"]])
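Before ranking or joining county-level results, it can help to clean them. The ACS API can return large negative placeholder values (such as -666666666) where an estimate could not be computed, and the `state` and `county` codes can be joined into a 5-digit GEOID for geospatial enrichment. The following is a minimal sketch under those assumptions. The sample rows and the `clean_county_income` helper are hypothetical, and treating every negative value as missing is a simplification.

```python
import pandas as pd

# Hypothetical rows shaped like the output of c.acs5.state_county(...);
# names and values are made up for illustration.
rows = [
    {"NAME": "Alpha County, Example State", "B19013_001E": 81234.0,
     "state": "01", "county": "001"},
    {"NAME": "Beta County, Example State", "B19013_001E": -666666666.0,
     "state": "01", "county": "003"},
    {"NAME": "Gamma County, Example State", "B19013_001E": 55210.0,
     "state": "02", "county": "013"},
]


def clean_county_income(rows):
    """Return a DataFrame with missing-data codes nulled and a county GEOID."""
    df = pd.DataFrame(rows).rename(columns={"B19013_001E": "median_income"})
    # Coerce to numeric in case values arrive as strings
    df["median_income"] = pd.to_numeric(df["median_income"], errors="coerce")
    # Assumption: a negative income is a placeholder code, not a real estimate
    df["median_income"] = df["median_income"].where(df["median_income"] >= 0)
    # 2-digit state FIPS + 3-digit county FIPS = 5-digit county GEOID
    df["geoid"] = df["state"].str.zfill(2) + df["county"].str.zfill(3)
    return df


df = clean_county_income(rows)
print(df.nlargest(2, "median_income")[["geoid", "NAME", "median_income"]])
```

The `geoid` column can then serve as a join key against county boundary files or other county-level tables.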
More datasets used by Python data engineers.
The Federal Reserve Bank of St. Louis FRED database provides over 800,000 economic time series from 100+ sources, including interest rates, inflation, GDP, and employment data. Widely used in financial and economic data pipelines via the `fredapi` Python library for loading macro data into analytical systems.
The FEC provides access to campaign finance data, including information on political contributions, campaign expenditures, fundraising activities and financial disclosures filed by political candidates, parties and committees in the United States.
Access 16,000+ development indicators from the World Bank covering GDP, poverty, health, education, infrastructure, and environment for 200+ countries. Used in data engineering for building global development dashboards, time-series analysis pipelines, and cross-country economic comparison systems in Python.