Stat Legend

Data Dictionary

Scorer's-table lexicon for column meanings, naming patterns, and generated field inventories

Think of this section as the scorer's table for the warehouse: the place you go when a stat abbreviation, column suffix, or layer name looks familiar but not familiar enough.

The split in this section is intentional. The hand-authored pages explain meaning, naming habits, and reading strategy. The generated pages answer the contract question: which exact fields exist on a given tier right now.

  • Best first read: Curated. Start with the glossary or field reference before diving into generated inventories.
  • Command-owned pages: 3. Raw, staging, and star field inventories are generated from schema metadata.
  • Core question: Meaning + owner. What does this field mean, and which tier owns it?

Use the curated pages for interpretation and navigation. Use the generated tier pages when you need the exact field inventory for a specific schema-backed layer. If you are deciding what a field means, start curated. If you are checking whether a field exists, go generated.

Start at the scorer's table

Start by question

If you only have one minute, pick the lane first and the page second.

Curated quick route

Decode a metric, acronym, or formula

Start with Glossary when the question is about basketball shorthand such as TS%, PIE, PPP, or box-score abbreviations.

Curated quick route

Read a dense schema page faster

Open Field Reference when the question is really about field patterns: keys, suffixes, home/visitor labels, rating names, or discriminator columns.

Generated quick route

Verify raw-layer field names

Use Raw for endpoint-shaped field inventories closest to source naming and payload structure.

Generated quick route

Verify normalized staging columns

Use Staging when you need the cleaned, typed, warehouse-ready field list feeding transforms.

Generated quick route

Verify public warehouse columns

Use Star for the final schema-backed surface exposed to analysis across dimensions, facts, bridges, aggregates, and analytics outputs.

2am path

Need the shortest possible route?

If you do not know where to start, check Glossary for meaning, Field Reference for patterns, then drop into the generated tier page that matches the layer you are touching.

Fast path by situation

Choose the right lookup sheet

| If you need to answer... | Start here | Why |
| --- | --- | --- |
| What does this metric or shorthand mean? | Glossary | It decodes basketball analytics terms, formulas, and common abbreviations |
| How should I read key names, suffixes, and recurring column patterns? | Field Reference | It explains the warehouse's high-signal field families and naming habits |
| Which exact fields exist in the raw tier? | Raw | Generated inventory for extraction-shaped schemas |
| Which exact fields exist after normalization? | Staging | Generated inventory for DuckDB-ready staging schemas |
| Which exact fields exist on the public schema-backed surface? | Star | Generated inventory for final star-tier schemas |

Curated pages vs generated pages

Curated pages: how to read the warehouse

These are the hand-authored pages in this section:

  • Glossary for metric meaning, formulas, and stat-family shorthand
  • Field Reference for keys, suffixes, role labels, and discriminator columns
  • this index page for route-finding, section boundaries, and maintenance expectations

Generated pages: what the schema exposes

These are the command-owned pages in this section:

  • Raw for source-shaped extraction fields
  • Staging for cleaned, typed normalization fields
  • Star for the final analytics-facing surface

Read the layer labels

Layer and prefix guide

| Prefix | Layer | Example | What it usually means |
| --- | --- | --- | --- |
| raw_ | Raw schema/reference object | raw_boxscoretraditionalv3 | Endpoint-shaped payload contract closest to source naming |
| stg_ | DuckDB staging target | stg_boxscoretraditionalv3__player_stats | Cleaned, typed, warehouse-ready input layer |
| dim_ | Dimension | dim_player, dim_team, dim_game | Context and controlled vocabulary around the facts |
| fact_ | Fact | fact_play_by_play, fact_team_game | Measured events, game lines, dashboards, or specialty outputs |
| bridge_ | Bridge | bridge_game_official, bridge_play_player | Many-to-many helper that prevents repeated columns or duplicate joins |
| agg_ | Aggregate | agg_player_season, agg_team_season | Reusable rollups for common analytical questions |
| analytics_ | Analytics view/table | analytics_player_game_complete | Pre-joined shortcut for everyday analyst workflows |
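
The prefix guide above can be turned into a small lookup when you are scanning a long list of object names. The following is a minimal sketch, not part of nbadb itself: the `LAYER_PREFIXES` mapping and the `layer_for` helper are hypothetical names, and the labels are copied from the guide.

```python
from typing import Optional

# Prefix-to-layer labels copied from the layer and prefix guide.
# Hypothetical helper for illustration; nbadb may not expose anything like it.
LAYER_PREFIXES = {
    "raw_": "Raw schema/reference object",
    "stg_": "DuckDB staging target",
    "dim_": "Dimension",
    "fact_": "Fact",
    "bridge_": "Bridge",
    "agg_": "Aggregate",
    "analytics_": "Analytics view/table",
}


def layer_for(table_name: str) -> Optional[str]:
    """Return the layer label for a table name, or None if no prefix matches."""
    for prefix, layer in LAYER_PREFIXES.items():
        if table_name.startswith(prefix):
            return layer
    return None


if __name__ == "__main__":
    for name in ("raw_boxscoretraditionalv3", "dim_player", "agg_team_season"):
        print(f"{name}: {layer_for(name)}")
```

A name that returns `None` is a hint that the object sits outside the documented layers, or that the guide needs a new row.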

Column naming conventions

The short version

  • Business keys: <entity>_id such as player_id, team_id, and game_id
  • History-aware surrogate keys: <entity>_sk where SCD Type 2 handling matters
  • Percentages and rates: _pct suffix for decimal percentages, plus names like pace, pie, or ppp for established metrics
  • Aggregations: total_ for totals, avg_ for averages, _rank for rank fields, and rolling-window names such as pts_roll5
  • Flags: is_ prefixes for booleans like is_current and is_weekend
  • Context labels: discriminator columns such as split_type, detail_type, summary_type, and tracking_type often define the meaning of a row
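
As a rough illustration of how these conventions compose, the sketch below tags a column name with the family it appears to belong to. It assumes only the patterns listed above; `classify_column` and its label strings are hypothetical, the discriminator set contains only the four names mentioned here, and the `_roll<N>` pattern is inferred from the single example `pts_roll5`.

```python
import re

# Names taken from the conventions above; the real warehouse may have more.
DISCRIMINATORS = {"split_type", "detail_type", "summary_type", "tracking_type"}
NAMED_METRICS = {"pace", "pie", "ppp"}
# Assumption: rolling windows end in "_roll" plus a window size, as in pts_roll5.
ROLLING = re.compile(r"_roll\d+$")


def classify_column(name: str) -> str:
    """Return a coarse family label for a column name (illustrative only)."""
    if name in DISCRIMINATORS:
        return "discriminator"
    if name in NAMED_METRICS:
        return "named metric"
    if name.startswith("is_"):
        return "flag"
    if name.endswith("_id"):
        return "business key"
    if name.endswith("_sk"):
        return "surrogate key"
    if name.endswith("_pct"):
        return "percentage"
    if name.endswith("_rank"):
        return "rank"
    if ROLLING.search(name):
        return "rolling window"
    if name.startswith("total_"):
        return "total"
    if name.startswith("avg_"):
        return "average"
    return "other"


if __name__ == "__main__":
    for col in ("player_id", "is_current", "pts_roll5", "split_type"):
        print(f"{col}: {classify_column(col)}")
```

The check order is a design choice: exact-name matches run first so that a discriminator such as `split_type` is never mistaken for a generic suffix match.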

The 2am reading order

  1. Find the grain anchor: player_id, team_id, game_id, season_year, or lineup/group keys.
  2. Check the row-type columns: split_type, detail_type, summary_type, tracking_type, or similar.
  3. Only then read the measures: percentages, totals, ratings, ranks, and rolling windows.
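
The three steps above can be sketched as a function that sorts a column list into the same reading order. This is an assumption-laden illustration: `reading_order` is a hypothetical helper, the anchor and row-type tuples hold only the names given in the steps, and lineup/group keys would have to be added by hand because they are not named here.

```python
# Column names taken from the reading order above; extend as needed.
GRAIN_ANCHORS = ("player_id", "team_id", "game_id", "season_year")
ROW_TYPE_COLUMNS = ("split_type", "detail_type", "summary_type", "tracking_type")


def reading_order(columns):
    """Group columns into grain anchors, row-type columns, then measures."""
    present = set(columns)
    anchors = [c for c in GRAIN_ANCHORS if c in present]
    row_types = [c for c in ROW_TYPE_COLUMNS if c in present]
    seen = set(anchors) | set(row_types)
    # Everything that is neither an anchor nor a row-type column is read last,
    # in the order it appeared in the input.
    measures = [c for c in columns if c not in seen]
    return {"grain": anchors, "row_type": row_types, "measures": measures}


if __name__ == "__main__":
    # fg_pct is a made-up measure name used only for this example.
    cols = ["fg_pct", "split_type", "player_id", "pts_roll5", "season_year"]
    for step, names in reading_order(cols).items():
        print(f"{step}: {names}")
```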

Command-owned inventories

Generated pages

The generated tier pages in this section come from schema metadata and should be refreshed with:

uv run nbadb docs-autogen --docs-root docs/content/docs

That command regenerates:

  • data-dictionary/raw.mdx
  • data-dictionary/staging.mdx
  • data-dictionary/star.mdx

It also refreshes the matching schema reference pages plus ER and lineage artifacts. Keep this index, the glossary, and the field reference hand-authored; treat the generated tier pages as command-owned outputs with a different job and a different maintenance path.
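
After a refresh, a quick check that the three tier pages exist can catch a wrong `--docs-root`. The sketch below is hypothetical and not part of the nbadb CLI: it assumes the generated pages resolve relative to the docs root passed to the command, and `missing_generated_pages` is an illustrative name.

```python
from pathlib import Path

# Page paths listed above, assumed to be relative to the --docs-root value.
GENERATED_PAGES = (
    "data-dictionary/raw.mdx",
    "data-dictionary/staging.mdx",
    "data-dictionary/star.mdx",
)


def missing_generated_pages(docs_root):
    """Return the generated tier pages that are absent under docs_root."""
    root = Path(docs_root)
    return [page for page in GENERATED_PAGES if not (root / page).is_file()]


if __name__ == "__main__":
    missing = missing_generated_pages("docs/content/docs")
    print("missing:", missing or "none")
```

An empty result only confirms the files are present; it says nothing about whether their contents match the current schema metadata.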
