NBA Game ETL & Interactive Dashboard
ETL practice

This project is a complete ETL pipeline: it takes raw NBA game data from a public web API and turns it into an interactive dashboard. The main goal was to practice end-to-end Extract ∙ Transform ∙ Load skills.
Key Steps
- Extract — A Python script calls the free balldontlie API, paginates safely, respects rate-limits, and stores raw JSON data.
- Transform — Using pandas, the raw data is flattened: columns are renamed, dates parsed, and season/team features engineered into a tidy clean_games.csv.
- Load — The cleaned data is pushed into a PostgreSQL database (nba_data.games) via SQLAlchemy + psycopg2.
- Explore — A Streamlit app connects to Postgres, lets users filter by season and team, and visualizes score differentials.
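The Extract step's "paginate safely, respect rate limits" idea can be sketched roughly as below. This is a minimal illustration, not the project's actual script: the `paginate` helper, the `RateLimited` exception, and the response shape (`data` plus `meta.next_cursor`) are assumptions made for the example, and the real HTTP call is replaced by an in-memory stand-in so the sketch runs offline. Check the balldontlie documentation for the actual response format and limits.

```python
import time
from typing import Any, Callable, Dict, Iterator, Optional


class RateLimited(Exception):
    """Hypothetical signal that the API answered HTTP 429 (too many requests)."""


def paginate(
    fetch_page: Callable[[Optional[int]], Dict[str, Any]],
    max_retries: int = 3,
    backoff_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Dict[str, Any]]:
    """Yield every record from a cursor-paginated endpoint.

    fetch_page(cursor) is a hypothetical callable that returns a dict shaped
    like {"data": [...], "meta": {"next_cursor": <int or None>}} (assumed).
    """
    cursor: Optional[int] = None
    while True:
        for attempt in range(max_retries + 1):
            try:
                payload = fetch_page(cursor)
                break
            except RateLimited:
                if attempt == max_retries:
                    raise  # give up after the last allowed retry
                # Exponential backoff: 1x, 2x, 4x ... the base delay.
                sleep(backoff_seconds * 2 ** attempt)
        yield from payload.get("data", [])
        cursor = payload.get("meta", {}).get("next_cursor")
        if cursor is None:
            break  # no further pages


# In-memory stand-in for the real HTTP call, so the sketch runs offline.
FAKE_PAGES = {
    None: {"data": [{"id": 1}, {"id": 2}], "meta": {"next_cursor": 2}},
    2: {"data": [{"id": 3}], "meta": {"next_cursor": None}},
}
calls = {"count": 0}


def fake_fetch(cursor: Optional[int]) -> Dict[str, Any]:
    calls["count"] += 1
    if calls["count"] == 1:
        raise RateLimited()  # simulate one 429 before succeeding
    return FAKE_PAGES[cursor]


# sleep is replaced with a no-op so the demo does not actually wait.
games = list(paginate(fake_fetch, sleep=lambda seconds: None))
```

Passing the fetcher in as a callable keeps the retry and cursor logic separate from the HTTP client, which makes it easy to test without touching the network.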
What I Practiced
- API pagination & rate-limit handling
- Data cleaning + feature engineering with pandas
- Relational loading & schema design in PostgreSQL
- Secure credential management (.env + .gitignore)
- Rapid dashboarding with Streamlit
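As an illustration of the credential-management point, here is one way a connection URL could be assembled from environment variables rather than hard-coded. It is a sketch under assumptions: the `postgres_url_from_env` helper and the variable names (`PGUSER`, `PGPASSWORD`, `PGHOST`, `PGPORT`, `PGDATABASE`, borrowed from libpq's conventions) are illustrative, not taken from the project. In practice the values would typically be loaded from the git-ignored `.env` file into the environment first, for example with a loader such as python-dotenv.

```python
import os
from typing import Mapping
from urllib.parse import quote_plus


def postgres_url_from_env(env: Mapping[str, str] = os.environ) -> str:
    """Build a SQLAlchemy-style Postgres URL from environment variables.

    The variable names are assumptions for this sketch.
    """
    user = env["PGUSER"]
    # URL-encode the password so characters like "@" or "/" stay valid.
    password = quote_plus(env["PGPASSWORD"])
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    database = env["PGDATABASE"]
    return f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}"


# Example with placeholder values; real secrets would come from the environment.
example_env = {"PGUSER": "etl", "PGPASSWORD": "p@ss/word", "PGDATABASE": "nba"}
url = postgres_url_from_env(example_env)
```

The resulting string is in the form SQLAlchemy's `create_engine` accepts, and because nothing sensitive lives in the source file, the code can be committed while `.env` stays out of version control.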
All code, requirements, and setup instructions are available on GitHub, making it easy to reproduce, schedule nightly refreshes, or plug into Tableau.