nbadbArena Data Lab
GuidesSet MenuAnalytics Quickstart11 waypoints

Playbook

Analytics Quickstart

Opening possession for analysts who want to get from download to first query in minutes.

Analytics Quickstart

Get from dataset download to a real answer in a few minutes. This page is the opening possession: install a lightweight stack, open the DuckDB file, and leave with one leaderboard, one shot read, and one standings query you can reuse.

Pick your route

If you need to…Start hereStop when…
Prove the dataset is usable locallyInstall dependenciesSHOW TABLES returns the surfaces you expect
Grab one working analysis pattern fastRun your first questionYou have one query you can now swap filters into
Go deeper after the first resultWhere to go nextYou are on the guide that matches the next question

Coaching note: Start with the denormalized analytics_* surfaces when you need the fastest first read, then drop to fact tables once you know which possession deserves deeper inspection.

Run the opening four possessions

Possession 1 — Install dependencies

pip install polars duckdb kagglehub

This gives you one SQL engine (duckdb), one DataFrame tool (polars), and one download path (kagglehub) without a heavier local stack.

Possession 2 — Download the data

import kagglehub

path = kagglehub.dataset_download("wyattowalsh/basketball")
print(f"Data at: {path}")

You want the printed path because the next possession reuses it directly.

Possession 3 — Connect and confirm the floor

import duckdb

conn = duckdb.connect(f"{path}/nba.duckdb")

# See all tables
tables = conn.sql("SHOW TABLES").fetchall()
for t in tables:
    print(t[0])

What good looks like: you can see familiar surfaces such as dimension tables (dim_*), fact tables (fact_*), and convenience analytics outputs (analytics_*).

Possession 4 — Run your first question

Pick one of these three reads before you branch out. Each one gives you a reusable pattern for joins, filters, and aggregation.

Option A — Who are the all-time scoring leaders?

df = conn.sql("""
    SELECT
        p.first_name || ' ' || p.last_name AS player,
        COUNT(*) AS games,
        SUM(b.pts) AS total_points,
        ROUND(AVG(b.pts), 1) AS ppg
    FROM fact_box_score_player b
    JOIN dim_player p ON b.player_id = p.player_id
    JOIN dim_game g ON b.game_id = g.game_id
    WHERE g.season_type = 'Regular Season'
    GROUP BY 1
    ORDER BY total_points DESC
    LIMIT 10
""").pl()

print(df)

Use this pattern when you need a leaderboard: join a fact table, add player or team labels from dimensions, then aggregate.

Option B — What does a player's shot chart look like?

import polars as pl

shots = conn.sql("""
    SELECT
        sc.loc_x, sc.loc_y,
        sc.shot_made_flag,
        sc.shot_zone_basic
    FROM fact_shot_chart sc
    JOIN dim_player p ON sc.player_id = p.player_id
    JOIN dim_game g ON sc.game_id = g.game_id
    WHERE p.last_name = 'James'
      AND p.first_name = 'LeBron'
      AND g.season_year = 2024
""").pl()

print(f"Total shots: {len(shots)}")
print(shots.group_by("shot_zone_basic").agg([
    pl.count().alias("attempts"),
    pl.col("shot_made_flag").sum().alias("makes"),
]))

Use this pattern when you need a player-level slice with location columns you can later plot.

Option C — How do teams compare this season?

standings = conn.sql("""
    SELECT
        t.city || ' ' || t.nickname AS team,
        s.conference,
        s.wins, s.losses,
        ROUND(s.win_pct, 3) AS pct,
        s.playoff_rank
    FROM fact_standings s
    JOIN dim_team t ON s.team_id = t.team_id
    WHERE s.season_id = '2024-25'
    ORDER BY s.conference, s.playoff_rank
""").pl()

print(standings)

Use this pattern when you need a current-season snapshot that already lives in a purpose-built fact table.

Where to go next

Your next questionBest route
“Give me more SQL examples I can copy.”DuckDB Query Examples
“Now I want to visualize shooting locations.”Shot Chart Analysis
“I need side-by-side player reads.”Player Comparison
“I need table-level documentation before I write more SQL.”Schema Reference

Keep moving

Stay in the same possession

Keep the mental model warm with adjacent pages, section hubs, and search-friendly routes into the same topic cluster.

Section hub

On this page