Build a complete dbt project with staging models, core business logic, dashboard models, and tests to transform e-commerce data in a PostgreSQL warehouse. Master the modern data transformation workflow used by data teams worldwide.
This example demonstrates how to:
Ensure you have dbt installed:
pip install dbt-core dbt-postgres
Create a new dbt project:
dbt init dbt_ecommerce_project
cd dbt_ecommerce_project
Set up your profiles.yml file with your database credentials.
my_profile: target: dev outputs: dev: type: postgres host: your_host user: your_user pass: your_password port: 5432 dbname: your_database schema: your_schema threads: 4
models/
sources.yml
staging/
stg_orders.sql
stg_customers.sql
core/
fct_daily_sales.sql
dim_customers.sql
dashboards/
dashboard_daily_sales.sql
tests/
assert_positive_total_amount.sql
dbt_project.yml
This structure organizes the project into staging, core, and dashboard models, with separate files for sources and tests.
# models/sources.yml version: 2 sources: - name: raw_data database: your_database schema: public tables: - name: orders - name: customers
This file defines the raw data sources, specifying the database, schema, and table names.
-- models/staging/stg_orders.sql {{ config( materialized='view' ) }} SELECT order_id, customer_id, order_date, status, amount FROM {{ source('raw_data', 'orders') }} WHERE amount > 0
Staging models clean and prepare raw data. They're configured as views for efficiency.
-- models/core/fct_daily_sales.sql {{ config( materialized='table', partition_by={ "field": "order_date", "data_type": "date" } ) }} SELECT o.order_date, COUNT(DISTINCT o.order_id) as total_orders, COUNT(DISTINCT o.customer_id) as unique_customers, SUM(o.amount) as total_amount FROM {{ ref('stg_orders') }} o GROUP BY o.order_date
Core models represent key business logic. They're materialized as tables for better query performance.
-- models/dashboards/dashboard_daily_sales.sql {{ config( materialized='view' ) }} SELECT ds.order_date, ds.total_orders, ds.unique_customers, ds.total_amount, LAG(ds.total_amount) OVER (ORDER BY ds.order_date) as prev_day_amount, (ds.total_amount - LAG(ds.total_amount) OVER (ORDER BY ds.order_date)) / LAG(ds.total_amount) OVER (ORDER BY ds.order_date) * 100 as daily_growth_percent FROM {{ ref('fct_daily_sales') }} ds
Dashboard models create aggregated views for reporting, including trend analysis.
-- tests/assert_positive_total_amount.sql SELECT order_date, total_amount FROM {{ ref('fct_daily_sales') }} WHERE total_amount <= 0
This test ensures data quality by checking for negative total amounts.
# dbt_project.yml name: "dbt_ecommerce_project" version: "1.0.0" config-version: 2 profile: "dbt_ecommerce_project" model-paths: ["models"] test-paths: ["tests"] models: dbt_ecommerce_project: staging: +materialized: view core: +materialized: table dashboards: +materialized: view seeds: dbt_ecommerce_project: +enabled: true
The dbt_project.yml file sets project-wide and folder-specific configurations.
To run this example:
Run the models:
dbt run
Run the tests:
dbt test
Generate documentation:
dbt docs generate
dbt docs serve
This will create the models in your database, run the tests, and generate documentation for your project.
You do need a PostgreSQL database up and running to use the postgres profile. Here's what you need to do:
Install PostgreSQL:
sudo apt-get install postgresqlbrew install postgresqlStart the PostgreSQL service:
sudo systemctl start postgresqlbrew services start postgresqlCreate a database:
sudo -u postgres psql
CREATE DATABASE your_database;
CREATE USER your_user WITH PASSWORD 'your_password';
GRANT ALL PRIVILEGES ON DATABASE your_database TO your_user;
\q
Create the raw data tables. Connect to your database and run:
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
customer_id INTEGER,
order_date DATE,
status VARCHAR(50),
amount DECIMAL(10,2)
);
CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(100),
created_at TIMESTAMP
);
Insert some sample data:
INSERT INTO customers (first_name, last_name, email, created_at)
VALUES
('John', 'Doe', 'john@example.com', '2023-01-01 10:00:00'),
('Jane', 'Smith', 'jane@example.com', '2023-01-02 11:00:00');
INSERT INTO orders (customer_id, order_date, status, amount)
VALUES
(1, '2023-01-03', 'completed', 100.50),
(2, '2023-01-04', 'completed', 200.75),
(1, '2023-01-05', 'pending', 50.25);
Ensure that your profiles.yml file (located in ~/.dbt/ or the appropriate directory) contains the correct connection details for PostgreSQL. Update your profiles.yml. Make sure the database details match what you've set up:
dbt_ecommerce_project:
target: dev
outputs:
dev:
type: postgres
host: localhost
user: your_user
pass: your_password
port: 5432
dbname: your_database
schema: public
threads: 4
Now you have a PostgreSQL database set up with some sample data, ready for use with dbt. Remember to replace your_database, your_user, and your_password with the actual values you used when setting up your database.