Access confirmed exoplanet data collected by NASA's Kepler, K2, and TESS missions, including orbital parameters, stellar properties, and discovery methods. Useful for scientific data pipelines, astronomy datasets, and practising complex query-based API ingestion in Python.
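The NASA Exoplanet Archive accepts ADQL (SQL-like) queries through its Table Access Protocol (TAP) service. Engineers use astroquery's NASA Exoplanet Archive module (`astroquery.ipac.nexsci.nasa_exoplanet_archive` in current releases; the older `astroquery.nasa_exoplanet_archive` path is deprecated) or direct HTTP requests to load planet parameters such as mass, radius, and orbital period into pandas DataFrames.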
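As a rough sketch of the direct-HTTP route, a synchronous TAP request is an ADQL string passed URL-encoded in a `query` parameter. The helper `build_tap_url` below is hypothetical (it is not part of any library), and the `default_flag = 1` filter is an added assumption intended to keep one default parameter set per planet in the `ps` table. Nothing here touches the network; it only assembles the URL.

```python
from urllib.parse import urlencode

TAP_SYNC_URL = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"


def build_tap_url(columns, table="ps", where=None, order_by=None, fmt="json"):
    """Assemble a synchronous TAP request URL for an ADQL query (hypothetical helper)."""
    adql = f"select {','.join(columns)} from {table}"
    if where:
        adql += f" where {where}"
    if order_by:
        adql += f" order by {order_by}"
    # urlencode percent-encodes commas and operators and turns spaces into '+'
    return f"{TAP_SYNC_URL}?{urlencode({'query': adql, 'format': fmt})}"


url = build_tap_url(
    ["pl_name", "pl_rade", "pl_bmasse", "pl_orbper"],
    where="pl_rade < 2 and default_flag = 1",  # default_flag filter is an assumption
    order_by="pl_rade",
)
print(url)
```

The resulting URL can be passed to `requests.get(url)` or pasted into a browser to inspect the response before wiring it into a pipeline.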
NASA exoplanet data is used to train ML models for planet detection and habitability classification. RAG systems built on the exoplanet archive let LLMs ground their answers to detailed questions, such as 'How many super-Earths orbit in the habitable zones of their stars?', in catalog data.
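# pip install requests pandas
import pandas as pd
import requests

# ADQL query against the Planetary Systems (ps) table: planets under 2 Earth radii
query = "select pl_name,pl_rade,pl_bmasse,pl_orbper from ps where pl_rade < 2 order by pl_rade"
resp = requests.get(
    "https://exoplanetarchive.ipac.caltech.edu/TAP/sync",
    params={"query": query, "format": "json"},
    timeout=60,
)
resp.raise_for_status()  # fail early on an HTTP error instead of parsing an error body
df = pd.DataFrame(resp.json())  # expects the JSON payload to be an array of row objects
print(df.head(10))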
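To make the super-Earth question above concrete, here is a minimal sketch of the counting step applied to rows shaped like the JSON records the request returns. The radius and insolation thresholds are illustrative assumptions, not archive definitions. The `pl_insol` column (insolation relative to Earth) is not in the original query and would need adding to the select list. The sample rows are made-up placeholders, not catalog values.

```python
# Illustrative thresholds (assumptions, not official definitions)
SUPER_EARTH_RADIUS = (1.25, 2.0)  # planet radius, in Earth radii
HZ_INSOLATION = (0.25, 1.5)       # stellar flux received, relative to Earth's


def count_hz_super_earths(rows):
    """Count distinct planet names whose radius and insolation fall in both ranges."""
    names = set()
    for row in rows:
        rade, insol = row.get("pl_rade"), row.get("pl_insol")
        if rade is None or insol is None:
            continue  # skip rows where either value is missing
        in_radius = SUPER_EARTH_RADIUS[0] <= rade <= SUPER_EARTH_RADIUS[1]
        in_zone = HZ_INSOLATION[0] <= insol <= HZ_INSOLATION[1]
        if in_radius and in_zone:
            names.add(row["pl_name"])  # a set de-duplicates repeated rows per planet
    return len(names)


# Made-up placeholder rows for demonstration only
sample_rows = [
    {"pl_name": "toy-b", "pl_rade": 1.6, "pl_insol": 0.9},
    {"pl_name": "toy-b", "pl_rade": 1.6, "pl_insol": 0.9},   # repeated row, same planet
    {"pl_name": "toy-c", "pl_rade": 1.1, "pl_insol": 1.0},   # below the radius range
    {"pl_name": "toy-d", "pl_rade": 1.8, "pl_insol": 40.0},  # far too much flux
    {"pl_name": "toy-e", "pl_rade": 1.9, "pl_insol": None},  # insolation missing
]
print(count_hz_super_earths(sample_rows))
```

Counting distinct names matters here because a table like `ps` can hold more than one row per planet, so a plain row count would overstate the answer.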
More datasets used by Python data engineers.
A word-finding query engine that returns words related by meaning, sound, spelling, and context. Used in NLP data engineering pipelines for synonym expansion, keyword generation, text augmentation datasets, and building linguistic feature engineering workflows in Python.
Access NASA's extensive collection of space data including the Astronomy Picture of the Day, Mars rover photos, near-Earth object tracking, satellite imagery, and Earth observation datasets. Commonly used in scientific data pipelines, geospatial analysis workflows, and educational data engineering projects with Python.
Access Wolfram Alpha's computational knowledge engine for structured answers to mathematical, scientific, and factual queries. Used in data engineering for data enrichment pipelines, automated fact-checking workflows, and generating computed features from natural language questions in Python.