Jump Ball
Getting Started
The tip-off and control tower for nbadb's public analytical surface.
nbadb Documentation
Welcome to the Arena Data Lab for nbadb: the control tower for getting from first install to first query, first refresh, and first production handoff.
nbadb turns the NBA stats surface into a 141-table public analytical model with DuckDB staging, SQL-first transforms, and exports for DuckDB, SQLite, Parquet, and CSV.
Quick navigation
Start the install
Jump to Installation for prerequisites, install routes, env defaults, and the first sanity checks.
Understand the pipeline
Open Architecture for the raw → staging → star flow, internal state boundaries, and why the warehouse is shaped this way.
Find the right command
Use CLI Reference when the question is "what command does this?" or "what flags are accepted right now?"
Find the right table
Start with Schema Reference, then move to the Data Dictionary when you need exact fields and meanings.
Use this page when…
| If you need to… | Go here first | Why this is the shortest route |
|---|---|---|
| Get nbadb onto a machine and prove it works | Installation | It covers prerequisites, install paths, NBADB_ defaults, and first-run checks |
| Understand how raw extraction becomes queryable warehouse tables | Architecture | It explains the raw → staging → star flow, validation tiers, and internal state boundaries |
| Find the exact command or flag surface | CLI Reference | It maps command intent, shared behaviors, and current signatures in one place |
| Find the right public table family or field | Schema Reference and Data Dictionary | Use schema pages for table discovery, then switch to field-level definitions |
First five minutes
If you just want to get oriented without reading the whole site, take this route:
- Install nbadb from Installation.
- Run a first build or pull a published dataset.
- Check CLI Reference for the exact command surface.
- Use Schema Reference to find the right table family.
- Move into the guide lane that matches your job.
pip install nbadb
nbadb init
nbadb schema

nbadb init is the full historical build and usually takes hours, not minutes. Once your data directory is seeded, nbadb daily becomes the standard game-day possession for refreshing the current season.
Choose your route
I am installing or evaluating nbadb
Start with Installation. It covers the PyPI route, the source route, the NBADB_ settings that matter first, and what should appear in your data directory when things work.
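Before a first run, it can help to see which NBADB_ overrides are already present in the current shell. The following is a minimal sketch, assuming only that settings are read from environment variables sharing the NBADB_ prefix. The helper name is hypothetical and no specific variable names are implied.

```python
import os


def nbadb_env_overrides(environ=None):
    """Return every NBADB_-prefixed variable in the mapping, sorted by name.

    Hypothetical helper: it only inspects the environment and knows
    nothing about which settings nbadb actually recognizes.
    """
    env = os.environ if environ is None else environ
    return dict(sorted((k, v) for k, v in env.items() if k.startswith("NBADB_")))


if __name__ == "__main__":
    overrides = nbadb_env_overrides()
    if not overrides:
        print("No NBADB_ variables are set in this environment.")
    for name, value in overrides.items():
        print(f"{name}={value}")
```

Passing a plain dictionary instead of relying on os.environ keeps the helper easy to test and makes it safe to run in CI without touching real settings.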
I need to understand how the warehouse works
Read Architecture next. That page explains the pipeline stages, validation tiers, internal pipeline tables, export lanes, and the design choices behind the public model.
I need to query or model against the data
Use Schema Reference to find the right table family, then switch to the Data Dictionary for field-level meaning and the DuckDB Query Examples guide for working SQL.
I need recurring operator workflows
Keep Daily Updates, Kaggle Setup, and the Troubleshooting Playbook open together. Those are the practical runbook pages.
Reader routes in one glance
| Reader | Start here | Keep this open next |
|---|---|---|
| Evaluator or first-time installer | Installation | CLI Reference |
| Analyst looking for usable tables fast | Schema Reference | DuckDB Query Examples |
| Builder changing pipeline or docs behavior | Architecture | CLI Reference |
| Operator running recurring refreshes | Daily Updates | Troubleshooting Playbook |
What's on the floor
| Surface | Count | Reach for it when… | Start here |
|---|---|---|---|
| Dimensions | 17 | You need stable entities like players, teams, games, seasons, and identity/history context | Dimensions |
| Facts | 102 | You need grain-specific measurements across player, team, game, play, shot, and specialty surfaces | Facts & Bridges |
| Bridges | 2 | You are resolving many-to-many relationships between public entities | Facts & Bridges |
| Aggregates (agg_*) | 16 | You want pre-rolled summaries instead of rebuilding the same rollups in downstream SQL | Derived Aggregations |
| Analytics outputs (analytics_*) | 4 | You want the shortest route to analysis-ready convenience surfaces | Analytics Views |
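One way to compare a local export against the counts in the table above is to group table names by prefix. This is a rough sketch against a SQLite export using only the Python standard library. The agg_ and analytics_ prefixes come from the table; dim_, fact_, and bridge_ are assumed naming conventions, and the export path is whatever you pass on the command line.

```python
import sqlite3
import sys
from collections import Counter
from contextlib import closing

# agg_ and analytics_ appear in the table above; dim_, fact_, and bridge_
# are assumptions about naming and may not match the real export.
FAMILY_PREFIXES = ("dim_", "fact_", "bridge_", "agg_", "analytics_")


def count_table_families(conn):
    """Count tables and views per name prefix on an open SQLite connection."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type IN ('table', 'view')"
    ).fetchall()
    counts = Counter()
    for (name,) in rows:
        # Anything that matches no known prefix is bucketed as "other".
        family = next((p for p in FAMILY_PREFIXES if name.startswith(p)), "other")
        counts[family] += 1
    return dict(counts)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Open read-only so a mistyped path fails instead of creating an empty file.
    uri = f"file:{sys.argv[1]}?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        for family, n in sorted(count_table_families(conn).items()):
            print(f"{family:<12}{n}")
```

If the prefixes hold, the per-family counts should line up with the Count column; a large "other" bucket suggests the assumed prefixes are wrong rather than that the export is broken.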
Fast lanes by reader
Analyst lane
Begin with Analytics Quickstart, keep DuckDB Query Examples open for working SQL, and jump to Analytics Views when you want the shortest route to a usable surface.
Builder lane
Start with Installation, then read CLI Reference and Architecture before changing pipeline behavior, schemas, or docs generation flows.
Operator lane
Use Daily Updates for the recurring refresh cadence, Kaggle Setup for distribution handoffs, and the Troubleshooting Playbook when artifacts or CI go sideways.
Generated vs. hand-written docs
Some docs pages are hand-authored. Others are generated from schema metadata and lineage information.
uv run nbadb docs-autogen --docs-root docs/content/docs

That command owns schema/{raw,staging,star}-reference.mdx, data-dictionary/{raw,staging,star}.mdx, diagrams/er-auto.mdx, lineage/lineage-auto.mdx, and lineage/lineage.json. Regenerate those outputs when code changes; do not hand-edit them.
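The "do not hand-edit" rule can be enforced with a small path check, for example in a pre-commit hook or review script. The sketch below expands the brace groups from the list of generated outputs and tests whether a repository-relative path falls inside that set. The function name is hypothetical, and the default docs root is taken from the --docs-root value in the command.

```python
from pathlib import PurePosixPath

TIERS = ("raw", "staging", "star")

# Generated outputs as listed in the text, with the brace groups expanded.
GENERATED_DOCS = frozenset(
    [f"schema/{tier}-reference.mdx" for tier in TIERS]
    + [f"data-dictionary/{tier}.mdx" for tier in TIERS]
    + ["diagrams/er-auto.mdx", "lineage/lineage-auto.mdx", "lineage/lineage.json"]
)


def is_generated(path, docs_root="docs/content/docs"):
    """Return True if a repo-relative path is one of the generated docs files."""
    try:
        rel = PurePosixPath(path).relative_to(docs_root)
    except ValueError:
        # The path is not under the docs root at all.
        return False
    return rel.as_posix() in GENERATED_DOCS


if __name__ == "__main__":
    for candidate in (
        "docs/content/docs/schema/star-reference.mdx",
        "docs/content/docs/index.mdx",
    ):
        label = "generated" if is_generated(candidate) else "hand-written"
        print(f"{candidate}: {label}")
```

Keeping the list as data rather than a glob means a new generated file has to be added deliberately, which matches the intent of treating these outputs as owned by the generator.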
Jump to the rest of the arena
Core entry pages
- Installation — prerequisites, install routes, config defaults, and first checks
- Architecture — pipeline stages, validation tiers, internal state, and design decisions
- CLI Reference — exact command signatures plus operator and CI notes
Reference surfaces
- Schema Reference — table families, join lanes, and generated reference coverage
- Data Dictionary — column-level definitions and glossary terms
- Endpoints — upstream endpoint grouping and extraction coverage
Guided routes
- Guides — onboarding, recurring ops, and analysis drills
- Visual Asset Prompt Pack — basketball-native prompts for hero art, Open Graph cards, icons, and subtle atmosphere assets