Analytics Quickstart

Get from dataset download to a real answer in a few minutes. This page is the opening possession: install a lightweight stack, open the DuckDB file, and leave with one leaderboard, one shot read, and one standings query you can reuse.

If you want to rehearse the SQL shape before downloading anything, warm up in SQL Playground. It runs DuckDB-WASM in the browser with self-contained NBA-flavored sample rows, so you can practice the structure first and connect to the full dataset second.

Choose one first possession

If you have today…	Start here	First useful action	Stop when
A browser tab and no local setup	SQL Playground	Run one sample query and change one filter, sort, or aggregate so the SQL shape is yours	You understand the query skeleton you want to port
A local file or download path	Possession 3	Connect to `nba.duckdb` and run `SHOW TABLES`	You can see `analytics_`, `dim_`, and `fact_*` surfaces
A basketball question and need one reusable answer	Possession 4	Run one option end to end and keep the output dataframe or notebook cell	You have one working pattern you can reuse

analytics_*Fastest first read

DuckDBLocal engine

PolarsFrame layer

3First wins

Connection

Win 1

Prove that the warehouse opens cleanly and that SHOW TABLES returns the surfaces you expect.

Pattern

Win 2

Leave with one pattern you can reuse: a leaderboard, a shot-zone slice, or a season snapshot.

Branch

Win 3

Know whether the next possession belongs in DuckDB Query Examples, Player Comparison, Shot Chart Analysis, or Schema Reference.

Pick your route

Start here

Rehearse the query shape first

Open SQL Playground if you want browser-only reps before touching the local warehouse.

Start here

Prove the dataset is usable locally

Start at Possession 1 and stop once SHOW TABLES returns the surfaces you expect.

Start here

Grab one working pattern fast

Jump to Possession 4 if you already have the file and just need one reusable analysis pattern.

Start here

Branch deeper after the first result

Finish on Where to go next once you know whether the next possession is SQL, visualization, comparison, or schema reading.

Coaching note: Pick one lane only. Browser-only reps belong in SQL Playground; live-data validation starts with SHOW TABLES; deeper analysis starts after you have one working query in hand. Start with the denormalized analytics_* surfaces when you need the fastest first read, then drop to fact tables once you know which possession deserves deeper inspection.

Opening possession

Run the opening four possessions

Possession 1 — Install dependencies

Lightweight local stack

$ pip install polars duckdb kagglehub

This gives you one SQL engine (duckdb), one DataFrame tool (polars), and one download path (kagglehub) without a heavier local stack.

Possession 2 — Download the data

import kagglehub

path = kagglehub.dataset_download("wyattowalsh/basketball")
print(f"Data at: {path}")

You want the printed path because the next possession reuses it directly.

Possession 3 — Connect and confirm the floor

import duckdb

conn = duckdb.connect(f"{path}/nba.duckdb")

# See all tables
tables = conn.sql("SHOW TABLES").fetchall()
for t in tables:
    print(t[0])

What good looks like: you can see familiar surfaces such as dimension tables (dim_*), fact tables (fact_*), and convenience analytics outputs (analytics_*).

Possession 4 — Run your first question

Pick one of these three reads before you branch out. Each one gives you a reusable pattern for joins, filters, and aggregation.

Leaderboard

Option A

Use this when you need an all-time or season-scoped ranking and want a clean fact-plus-dimension aggregation pattern.

Shot map

Option B

Use this when you need a player slice with court coordinates you can later plot or group by zone.

Standings

Option C

Use this when you need a current-season team snapshot that already lives in a purpose-built fact table.

Option A — Who are the all-time scoring leaders?

df = conn.sql("""
    SELECT
        p.full_name AS player,
        COUNT(*) AS games,
        SUM(b.pts) AS total_points,
        ROUND(AVG(b.pts), 1) AS ppg
    FROM fact_player_game_traditional b
    JOIN dim_player p ON b.player_id = p.player_id
    JOIN dim_game g ON b.game_id = g.game_id
    WHERE p.is_current = true
      AND g.season_type = 'Regular Season'
    GROUP BY 1
    ORDER BY total_points DESC
    LIMIT 10
""").pl()

print(df)

Use this pattern when you need a leaderboard: join a fact table, add player or team labels from dimensions, then aggregate.

Option B — What does a player's shot chart look like?

import polars as pl

shots = conn.sql("""
    SELECT
        sc.loc_x, sc.loc_y,
        sc.shot_made_flag,
        sc.shot_zone_basic
    FROM fact_shot_chart sc
    JOIN dim_player p ON sc.player_id = p.player_id
    WHERE p.is_current = true
      AND p.full_name = 'LeBron James'
      AND sc.season_year = '2024-25'
""").pl()

print(f"Total shots: {len(shots)}")
print(shots.group_by("shot_zone_basic").agg([
    pl.count().alias("attempts"),
    pl.col("shot_made_flag").sum().alias("makes"),
]))

Use this pattern when you need a player-level slice with location columns you can later plot.

Option C — How do teams compare this season?

standings = conn.sql("""
    SELECT
        t.full_name AS team,
        s.conference,
        s.wins, s.losses,
        ROUND(s.win_pct, 3) AS pct,
        s.conf_rank
    FROM fact_standings s
    JOIN dim_team t ON s.team_id = t.team_id
    WHERE s.season_year = '2024-25'
      AND s.season_type = 'Regular Season'
    ORDER BY s.conference, s.conf_rank
""").pl()

print(standings)

Use this pattern when you need a current-season snapshot that already lives in a purpose-built fact table.

Where to go next

Next question

Browser-only sandbox

Go to SQL Playground before touching the local file.

Next question

More SQL examples

Go to DuckDB Query Examples for copy-ready patterns.

Next question

Visualize shooting locations

Go to Shot Chart Analysis for court-map work.

Next question

Run side-by-side player reads

Go to Player Comparison for head-to-head patterns.

Next question

Read table-level documentation

Go to Schema Reference before you write more SQL.

Analytics Quickstart

Choose one first possession

If you have today…	Start here	First useful action	Stop when
A browser tab and no local setup	SQL Playground	Run one sample query and change one filter, sort, or aggregate so the SQL shape is yours	You understand the query skeleton you want to port
A local file or download path	Possession 3	Connect to `nba.duckdb` and run `SHOW TABLES`	You can see `analytics_`, `dim_`, and `fact_*` surfaces
A basketball question and need one reusable answer	Possession 4	Run one option end to end and keep the output dataframe or notebook cell	You have one working pattern you can reuse

analytics_*Fastest first read

DuckDBLocal engine

PolarsFrame layer

3First wins

Connection

Win 1

Prove that the warehouse opens cleanly and that SHOW TABLES returns the surfaces you expect.

Pattern

Win 2

Leave with one pattern you can reuse: a leaderboard, a shot-zone slice, or a season snapshot.

Branch

Win 3

Know whether the next possession belongs in DuckDB Query Examples, Player Comparison, Shot Chart Analysis, or Schema Reference.

Pick your route

Start here

Rehearse the query shape first

Open SQL Playground if you want browser-only reps before touching the local warehouse.

Start here

Prove the dataset is usable locally

Start at Possession 1 and stop once SHOW TABLES returns the surfaces you expect.

Start here

Grab one working pattern fast

Jump to Possession 4 if you already have the file and just need one reusable analysis pattern.

Start here

Branch deeper after the first result

Finish on Where to go next once you know whether the next possession is SQL, visualization, comparison, or schema reading.

Opening possession

Run the opening four possessions

Possession 1 — Install dependencies

Lightweight local stack

$ pip install polars duckdb kagglehub

This gives you one SQL engine (duckdb), one DataFrame tool (polars), and one download path (kagglehub) without a heavier local stack.

Possession 2 — Download the data

import kagglehub

path = kagglehub.dataset_download("wyattowalsh/basketball")
print(f"Data at: {path}")

You want the printed path because the next possession reuses it directly.

Possession 3 — Connect and confirm the floor

import duckdb

conn = duckdb.connect(f"{path}/nba.duckdb")

# See all tables
tables = conn.sql("SHOW TABLES").fetchall()
for t in tables:
    print(t[0])

What good looks like: you can see familiar surfaces such as dimension tables (dim_*), fact tables (fact_*), and convenience analytics outputs (analytics_*).

Possession 4 — Run your first question

Pick one of these three reads before you branch out. Each one gives you a reusable pattern for joins, filters, and aggregation.

Leaderboard

Option A

Use this when you need an all-time or season-scoped ranking and want a clean fact-plus-dimension aggregation pattern.

Shot map

Option B

Use this when you need a player slice with court coordinates you can later plot or group by zone.

Standings

Option C

Use this when you need a current-season team snapshot that already lives in a purpose-built fact table.

Option A — Who are the all-time scoring leaders?

df = conn.sql("""
    SELECT
        p.full_name AS player,
        COUNT(*) AS games,
        SUM(b.pts) AS total_points,
        ROUND(AVG(b.pts), 1) AS ppg
    FROM fact_player_game_traditional b
    JOIN dim_player p ON b.player_id = p.player_id
    JOIN dim_game g ON b.game_id = g.game_id
    WHERE p.is_current = true
      AND g.season_type = 'Regular Season'
    GROUP BY 1
    ORDER BY total_points DESC
    LIMIT 10
""").pl()

print(df)

Use this pattern when you need a leaderboard: join a fact table, add player or team labels from dimensions, then aggregate.

Option B — What does a player's shot chart look like?

import polars as pl

shots = conn.sql("""
    SELECT
        sc.loc_x, sc.loc_y,
        sc.shot_made_flag,
        sc.shot_zone_basic
    FROM fact_shot_chart sc
    JOIN dim_player p ON sc.player_id = p.player_id
    WHERE p.is_current = true
      AND p.full_name = 'LeBron James'
      AND sc.season_year = '2024-25'
""").pl()

print(f"Total shots: {len(shots)}")
print(shots.group_by("shot_zone_basic").agg([
    pl.count().alias("attempts"),
    pl.col("shot_made_flag").sum().alias("makes"),
]))

Use this pattern when you need a player-level slice with location columns you can later plot.

Option C — How do teams compare this season?

standings = conn.sql("""
    SELECT
        t.full_name AS team,
        s.conference,
        s.wins, s.losses,
        ROUND(s.win_pct, 3) AS pct,
        s.conf_rank
    FROM fact_standings s
    JOIN dim_team t ON s.team_id = t.team_id
    WHERE s.season_year = '2024-25'
      AND s.season_type = 'Regular Season'
    ORDER BY s.conference, s.conf_rank
""").pl()

print(standings)

Use this pattern when you need a current-season snapshot that already lives in a purpose-built fact table.

Where to go next

Next question

Browser-only sandbox

Go to SQL Playground before touching the local file.

Next question

More SQL examples

Go to DuckDB Query Examples for copy-ready patterns.

Next question

Visualize shooting locations

Go to Shot Chart Analysis for court-map work.

Next question

Run side-by-side player reads

Go to Player Comparison for head-to-head patterns.

Next question

Read table-level documentation

Go to Schema Reference before you write more SQL.

Analytics Quickstart

Win 1

Win 2

Win 3

Rehearse the query shape first

Prove the dataset is usable locally

Grab one working pattern fast

Branch deeper after the first result

Option A

Option B

Option C

Browser-only sandbox

More SQL examples

Visualize shooting locations

Run side-by-side player reads

Read table-level documentation

Stay in the same possession

SQL Playground

DuckDB Query Examples

On this page

Analytics Quickstart

Win 1

Win 2

Win 3

Rehearse the query shape first

Prove the dataset is usable locally

Grab one working pattern fast

Branch deeper after the first result

Option A

Option B

Option C

Browser-only sandbox

More SQL examples

Visualize shooting locations

Run side-by-side player reads

Read table-level documentation

Stay in the same possession

SQL Playground

DuckDB Query Examples

On this page