The United States Department of Labor, through its Bureau of Labor Statistics (BLS), provides a wide range of datasets on labor market conditions, employment trends, wages, prices, productivity, and workplace safety.
BLS bulk data is available as flat text files from download.bls.gov. Engineers parse the tab-delimited files with `pandas.read_csv(sep='\t')` (fixed-width files need `pandas.read_fwf` instead), then join each survey's series metadata file to its data file on the series ID to build labeled time-series datasets.
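The join described above can be sketched as follows. This is a minimal illustration, not an official BLS workflow: the rows are made-up sample data in a BLS-like tab-delimited layout, the `EXAMPLE…` series IDs and titles are hypothetical, and `read_bls` and `label_series` are helper names introduced here. Real files would be read from download.bls.gov instead of from in-memory strings.

```python
import io

import pandas as pd

# Made-up sample rows in a BLS-like tab-delimited layout (fields padded with spaces).
SERIES_TXT = (
    "series_id\tseries_title\n"
    "EXAMPLE0001      \tHypothetical series A\n"
    "EXAMPLE0002      \tHypothetical series B\n"
)
DATA_TXT = (
    "series_id\tyear\tperiod\tvalue\tfootnote_codes\n"
    "EXAMPLE0001      \t2022\tA01\t       100.5\t\n"
    "EXAMPLE0001      \t2023\tA01\t       104.2\t\n"
    "EXAMPLE0002      \t2023\tA01\t        87.0\t\n"
)


def read_bls(source):
    """Read a tab-delimited file as text and strip padding from headers and values."""
    df = pd.read_csv(source, sep="\t", dtype=str, keep_default_na=False)
    df.columns = df.columns.str.strip()
    return df.apply(lambda col: col.str.strip())


def label_series(data, series):
    """Attach series metadata to each observation with a left join on series_id."""
    labeled = data.merge(series, on="series_id", how="left", validate="many_to_one")
    labeled["value"] = pd.to_numeric(labeled["value"], errors="coerce")
    return labeled


labeled = label_series(
    read_bls(io.StringIO(DATA_TXT)),
    read_bls(io.StringIO(SERIES_TXT)),
)
print(labeled[["series_id", "series_title", "year", "value"]])
```

Stripping whitespace before the merge matters because padded keys would otherwise fail to match, and `validate="many_to_one"` raises an error if the metadata file ever contains a duplicated series ID.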
BLS bulk data provides decades of US labor statistics for training AI economic forecasting models. Fine-tune time-series models on employment and wage data, or build a RAG system indexed on BLS industry summaries so AI assistants can answer occupational questions with authoritative statistics.
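# pip install pandas
import pandas as pd
# BLS bulk data: Occupational Employment and Wage Statistics (survey prefix "oe")
url = "https://download.bls.gov/pub/time.series/oe/oe.data.0.Current"
# The file ships with its own header row, so header=0 swaps it for the names below.
# BLS may reject requests that lack a User-Agent; the address here is a placeholder.
df = pd.read_csv(
    url, sep="\t", header=0,
    names=["series_id", "year", "period", "value", "footnote"],
    storage_options={"User-Agent": "your-name@example.com"},
)
df["series_id"] = df["series_id"].str.strip()  # fields are padded with spaces
# Filter for software developers (SOC 15-1252); series IDs are assumed to embed the code without the hyphen
dev = df[df["series_id"].str.contains("151252", na=False)]
print(dev.tail(5))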
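Before feeding BLS data to a time-series model, the `year` and `period` columns usually need to become a proper date index. The sketch below assumes the common BLS convention in which `M01` to `M12` denote months and `M13` denotes an annual average. Surveys published annually use codes such as `A01` and would need different handling. The function name `monthly_series` and the sample rows are illustrative only.

```python
import pandas as pd


def monthly_series(df):
    """Return a date-indexed numeric Series built from monthly period codes.

    Keeps M01..M12 and drops M13 (annual average) and any non-monthly codes.
    """
    period = df["period"].astype(str).str.strip()
    month = pd.to_numeric(period.str[1:], errors="coerce")
    keep = period.str.startswith("M") & month.between(1, 12)
    rows = df.loc[keep]
    parts = pd.DataFrame(
        {
            "year": rows["year"].astype(int),
            "month": month[keep].astype(int),
            "day": 1,  # anchor each observation at the first day of its month
        }
    )
    dates = pd.DatetimeIndex(pd.to_datetime(parts), name="date")
    values = pd.to_numeric(rows["value"], errors="coerce")
    return pd.Series(values.to_numpy(), index=dates, name="value").sort_index()


# Made-up observations, deliberately out of order and mixed with non-monthly codes.
sample = pd.DataFrame(
    {
        "year": [2023, 2023, 2023, 2023],
        "period": ["M02", "M01", "M13", "A01"],
        "value": ["101.0", "100.0", "100.5", "99.0"],
    }
)
ts = monthly_series(sample)
print(ts)
```

Dropping `M13` avoids mixing annual averages into a monthly series, which would otherwise add a spurious thirteenth observation per year.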
More datasets used by Python data engineers.
The Federal Reserve Bank of St. Louis FRED database provides over 800,000 economic time series from 100+ sources, including interest rates, inflation, GDP, and employment data. It is widely used in financial and economic data pipelines, typically through the fredapi Python library, to load macro data into analytical systems.
The Federal Election Commission (FEC) provides campaign finance data, including political contributions, campaign expenditures, fundraising activities, and financial disclosures filed by political candidates, parties, and committees in the United States.
Access demographic, economic, social, and geographic datasets from the US Census Bureau, including the American Community Survey, decennial census, and economic census. These datasets are used in data engineering for population analysis pipelines, market research, geospatial enrichment, and building socioeconomic dashboards in Python.