The European Centre for Disease Prevention and Control (ECDC) publishes datasets on infectious disease surveillance, outbreak monitoring, antimicrobial resistance, and vaccination coverage across Europe. These datasets are used in public health data pipelines, epidemiological analysis, and disease monitoring dashboards built in Python.
ECDC publishes data as CSV and Excel files through its Surveillance Atlas of Infectious Diseases and direct download links. Engineers use `pandas.read_csv()` or `pandas.read_excel()` to ingest disease surveillance data, which is typically structured by disease, country, year, and age group.
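As a minimal sketch of the ingestion step described above, the snippet below parses a small inline CSV shaped like a surveillance export and totals cases per country and year. The column names (`disease`, `country`, `year`, `age_group`, `cases`) and all values are illustrative placeholders, not an actual ECDC schema or real figures; real exports may use different headers.

```python
import io
import pandas as pd

# Placeholder rows shaped like a surveillance export. Column names and
# values are assumptions for illustration, not real ECDC data.
SAMPLE_CSV = """disease,country,year,age_group,cases
Measles,Austria,2022,0-4,3
Measles,Austria,2022,5-14,2
Measles,Austria,2023,0-4,7
Measles,Belgium,2022,0-4,1
Measles,Belgium,2023,5-14,4
"""

def load_surveillance(source) -> pd.DataFrame:
    """Read a CSV path, URL, or file-like object into a DataFrame."""
    df = pd.read_csv(source)
    df["year"] = df["year"].astype(int)
    return df

def cases_by_country_year(df: pd.DataFrame) -> pd.DataFrame:
    """Sum cases over age groups: one row per country, one column per year."""
    return df.pivot_table(index="country", columns="year",
                          values="cases", aggfunc="sum", fill_value=0)

df = load_surveillance(io.StringIO(SAMPLE_CSV))
summary = cases_by_country_year(df)
print(summary)
```

Because `pd.read_csv` accepts a path, URL, or file-like object, the same `load_surveillance` helper could be pointed at a downloaded file once the real column names are known.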
ECDC disease surveillance data can be used to train epidemic forecasting models and to build public health knowledge bases for retrieval-augmented generation (RAG) systems. An AI-powered health dashboard can draw on ECDC data to answer a question such as 'What is the current measles situation in Europe?' from official epidemiological data rather than from outdated training knowledge.
# pip install pandas
import pandas as pd

# ECDC daily COVID-19 cases and deaths for EU/EEA countries
url = "https://opendata.ecdc.europa.eu/covid19/nationalcasedeath_eueea_daily_ei/csv/data.csv"
df = pd.read_csv(url)
# dateRep is formatted as day/month/year
df["dateRep"] = pd.to_datetime(df["dateRep"], format="%d/%m/%Y")
# Keep the most recent report row for each country
latest = df.loc[df.groupby("countriesAndTerritories")["dateRep"].idxmax()]
print(latest[["countriesAndTerritories", "cases", "deaths"]].head(10))
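To feed a RAG knowledge base of the kind described above, each country's most recent row can be rendered as a short, dated text passage. The following is a rough sketch under stated assumptions: the helper `row_to_passage` is hypothetical, the sample rows are placeholders rather than real ECDC figures, and the column names mirror those in the ECDC CSV used above.

```python
import pandas as pd

# Placeholder rows using the same column names as the ECDC CSV above.
# The numbers are illustrative, not real surveillance figures.
latest = pd.DataFrame({
    "countriesAndTerritories": ["Austria", "Belgium"],
    "dateRep": pd.to_datetime(["2022-10-25", "2022-10-24"]),
    "cases": [5000, 1200],
    "deaths": [4, 2],
})

def row_to_passage(row: pd.Series) -> str:
    """Render one report row as a dated sentence (hypothetical helper)."""
    return (f"{row['countriesAndTerritories']} reported {int(row['cases'])} cases "
            f"and {int(row['deaths'])} deaths on {row['dateRep']:%Y-%m-%d} "
            f"(source: ECDC).")

# One passage per country, ready to be embedded or indexed downstream.
passages = [row_to_passage(row) for _, row in latest.iterrows()]
for passage in passages:
    print(passage)
```

Keeping the report date and the source inside each passage lets a downstream system cite how current the figure is instead of presenting it as timeless.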
More datasets used by Python data engineers.
Thousands of publicly available datasets are hosted in GitHub repositories, covering social media, finance, healthcare, sports, and scientific domains. They are accessible directly via the GitHub API or raw download URLs, which makes them well suited to practising version-controlled data ingestion and automated dataset pipelines in Python.
UNICEF provides datasets on child well-being, education enrolment, nutrition, immunisation, child mortality, and child protection indicators worldwide. They are used in data engineering for humanitarian analytics pipelines, SDG progress tracking, and global child health indicator dashboards built in Python.
New York City's open data portal provides 3,000+ datasets covering taxi trips, 311 complaints, crime statistics, building permits, health inspections, and transit data. These datasets are used in urban data engineering pipelines for city analytics, transportation modelling, and geospatial dashboards built in Python.