dbt Star Schema Tutorial

Austin Bikeshare Demo

Interactive Tutorial

Learn dbt Data Warehouse Design

Master dbt configuration and star schema design through the Austin Bikeshare project. Follow the workflow, explore annotated code, and understand data transformations visually.

5 Config Files
4 SQL Models
SCD Type 2 Snapshots
Sample Data

Data Mart Design

🎯Business Goal
Enable historical analysis of Austin B-cycle bike trips with accurate station context at the time of each trip.

Business Questions This Enables:

  • •How did station closures affect ridership patterns?
  • •What was the average trip duration when a station had 15 docks vs 10 docks?
  • •Which subscriber types used stations before vs after status changes?

Key Challenge

Station attributes (status, dock count) change over time, but source data only shows the current state. We need to preserve historical versions to analyze trips accurately.

Platform Architecture
Data flows from BigQuery public data through dbt layers to the final mart
Source Layer
BigQuery Public Data
bikeshare_trips

Raw trip events

bikeshare_stations

Station metadata

Staging Layer
austin_bikeshare_demo_staging
stg_bikeshare_trips

Cleaned trips with date parts

Snapshot Layer
austin_bikeshare_demo_snapshots
snap_bikeshare_stations

SCD2 station history

Marts Layer
austin_bikeshare_demo_marts
dim_stations_scd

Station dimension with versions

fct_trips_with_station_history

Trip facts with historical context

Entity Relationship Diagram
Visual representation of the star schema: sources → dimensions → facts
Primary Key
Foreign Key
Source
Dimension
Fact

Trip Data Path

bikeshare_trips
source

Raw trip records from BigQuery public dataset. Each row is one bike rental.

trip_idINTEGER
subscriber_typeSTRING
start_timeTIMESTAMP
start_station_idINTEGER
end_station_idINTEGER
duration_minutesINTEGER
stg_bikeshare_trips
fct_trips_with_station_history
fact

Trip facts joined with historical station data. Station attributes are AS OF trip date.

trip_idINTEGER
trip_dateDATE
subscriber_typeSTRING
start_station_idINTEGER
start_station_status_at_tripSTRING
start_station_docks_at_tripINTEGER
duration_minutesINTEGER
start_hourINTEGER
day_of_weekINTEGER

Station Data Path (with SCD2 history)

bikeshare_stations
source

Station metadata. Contains current state only - no history.

station_idINTEGER
nameSTRING
statusSTRING
addressSTRING
number_of_docksINTEGER
modified_dateDATE
snap_bikeshare_stations
dim_stations_scd
dimension

Station dimension with full version history (SCD Type 2). Multiple rows per station.

station_version_idSTRING
station_idINTEGER
station_nameSTRING
statusSTRING
dbt_valid_fromTIMESTAMP
dbt_valid_toTIMESTAMP
is_currentBOOLEAN
version_numberINTEGER
→ fct_trips

KPI Measures (from Fact Table)

trip_count
COUNT(*)

Total number of trips

avg_duration
AVG(duration_minutes)

Average trip length in minutes

peak_hour_trips
COUNT(*) WHERE start_hour BETWEEN 7 AND 9

Morning rush hour trips

subscriber_ratio
COUNT(subscriber_type=Annual) / COUNT(*)

Percentage of annual subscribers

Data Flow Summary

bikeshare_tripsstg_bikeshare_tripsfct_trips_with_station_history
bikeshare_stationssnap_bikeshare_stationsdim_stations_scdfct_trips_with_station_history
Source Data
Raw tables from BigQuery public dataset
bikeshare_trips
Source
Each row = one bike rental
trip_idstart_timestart_station_idsubscriber_typeduration_minutes
99000012019-06-15 09:30:002575Annual12
99000022021-02-20 14:15:002712Walk Up8
99000032020-08-10 11:00:002575Local36525
Simulated data for illustration
bikeshare_stations
Source
Current snapshot only - no history!
station_idnamestatusnumber_of_docksmodified_date
25755th & Bowieactive152020-06-15
2712Toomey Rd @ S. Lamarclosed132021-01-10
Simulated data for illustration
Final Data Mart: fct_trips_with_station_history
Trip facts enriched with historically accurate station attributes
Column Origins:
From bikeshare_trips
From bikeshare_stations
SQL calculated
dbt auto-generated
trip_idtrip_datesubscriber_typestart_station_idstart_station_namestart_station_status_at_tripstart_station_docks_at_tripduration_minutesstart_hourday_of_weekdbt_valid_fromdbt_valid_to
Final fact table with historically accurate station attributes
trip_idtrip_datesubscriber_typestart_station_idstart_station_namestart_station_status_at_tripstart_station_docks_at_trip
99000012019-06-15Annual25755th & Bowieactive15
99000022021-02-20Walk Up2712Toomey Rd @ S. Lamarclosed13
99000032020-08-10Local36525755th & Bowieactive15
Simulated data for illustration