Ball Movement

Data Lineage

Follow ball movement from NBA API sources through staging and into the star schema

Lineage is the film room for nbadb. Instead of watching a possession from the broadcast angle, you watch every touch: the inbound pass at the NBA API, the outlet into raw capture, the half-court reset in staging, and the finish in the analytical warehouse.

Replay-review note: Start here when the question is "where did this come from?" or "what breaks if I change this?"

Tip-off (stats + static + live): The full covered nba_api runtime surface starts each lineage chain.
Passing lanes (raw -> stg -> star): Each touch adds naming, validation, and dependency context.
Finish (dim / fact / agg / analytics): Downstream tables either set the floor, record the action, or summarize the result.
Read the floor

Choose Your Lens

| Lens | Best for | What you will see |
| --- | --- | --- |
| Table Lineage | Impact analysis and dependency tracing | Full possession chains from endpoints to downstream tables |
| Column Lineage | Debugging a field, rename, or constraint issue | Passing-lane examples for important columns |
| Generated Lineage Map | Exhaustive dependency lookup | Auto-generated transformer graph sourced from code metadata |

lineage-auto.mdx is generator-owned. Use the curated pages in this section for orientation, then use the generated map when you need exhaustive, code-sourced dependency detail.

Quick navigation

Entry surface

Trace full table chains

Start with Table Lineage when you need to replay the whole possession from source feed to downstream table.

Entry surface

Debug one field

Use Column Lineage when the breakage is local to one key, metric, rename, or constraint.

Generated companion

Open the exhaustive replay map

Jump to Generated Lineage Map when curated examples are not wide enough and you need code-sourced coverage.

Next route

Leave replay review for shape or contracts

Skip to Next steps when you are ready to switch from dependency tracing to diagrams, schema docs, or endpoint scouting.

Use this section when…

| If you need to answer… | Start here |
| --- | --- |
| “Where did this table come from?” | Table Lineage |
| “Which upstream field fed this column?” | Column Lineage |
| “What else breaks if I change this staging schema or transformer?” | Table Lineage |
| “Where is the exhaustive code-derived dependency map?” | Generated Lineage Map |

Curated lens

Replay the whole possession

Use Table Lineage when a table, dashboard, or output surface looks wrong and you need to trace the full dependency chain back to the inbound feed.

Curated lens

Slow the tape down to one field

Open Column Lineage when the breakage is local to a single key, metric, rename, or constraint rather than the whole table.

Generated lens

Check the full code-sourced map

Go to Generated Lineage Map when you need exhaustive dependency coverage generated directly from transformer metadata.

Why Lineage Matters

  1. Debugging: When a value looks wrong in a fact table, trace it back to the source API endpoint
  2. Impact analysis: Before changing a staging schema, see which downstream tables are affected
  3. Coverage: Identify which API endpoints feed which warehouse tables
  4. Documentation: Understand the complete data flow without reading transform code
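
The first two uses above come down to walking the same dependency graph in opposite directions. The sketch below illustrates that idea on a hand-written edge map; every table and endpoint name in it is hypothetical and chosen for illustration, not taken from nbadb.

```python
from collections import deque

# Hypothetical edges: each table maps to the inputs it reads from.
# None of these names are taken from nbadb.
DEPENDS_ON = {
    "raw_box_score": ["api_box_score_endpoint"],
    "stg_box_score": ["raw_box_score"],
    "dim_player": ["stg_box_score"],
    "fact_player_game": ["stg_box_score", "dim_player"],
    "agg_player_season": ["fact_player_game"],
}


def reachable(start, edges):
    """Breadth-first walk: every node reachable from `start` via `edges`."""
    seen, queue = set(), deque([start])
    while queue:
        for nxt in edges.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def upstream(table, depends_on):
    """Debugging: everything `table` was built from, back to the source."""
    return reachable(table, depends_on)


def downstream(table, depends_on):
    """Impact analysis: everything that reads from `table`, directly or not."""
    readers = {}
    for child, parents in depends_on.items():
        for parent in parents:
            readers.setdefault(parent, []).append(child)
    return reachable(table, readers)


print(sorted(upstream("fact_player_game", DEPENDS_ON)))
print(sorted(downstream("stg_box_score", DEPENDS_ON)))
```

Inverting the edge map once and reusing the same walk keeps both questions on a single graph representation.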

Possession Map

Mermaid diagram

flowchart LR
    A["Tip-off<br/>NBA API"] --> B["Outlet pass<br/>Raw capture"]
    B --> C["Half-court set<br/>Staging validation"]
    C --> D["Finish<br/>Star schema"]
    D --> E["Kick-out<br/>Aggregates, analytics, export"]
    style A fill:#e1f5fe
    style B fill:#fff8e1

Read it left to right: sources start the action, raw preserves the original shape, staging organizes the possession, and the star surface makes the result queryable.

Read the replay by question

| Question | Focus on | Then route to |
| --- | --- | --- |
| “Where did the chain start?” | The source and raw touches in the possession map | Endpoints if you need source-family detail |
| “Where was the shape normalized?” | The staging touch and validation table below | Schema Reference if you need exact contracts |
| “What table or view finished the play?” | The star and export touches | Table Lineage or Column Lineage for the detailed replay |

NBA API --> Raw capture --> Staging validation --> Star surface --> Export
 source      preserve feed    normalize + type      dim/fact/agg        SQLite /
                                                 + dependency flow      DuckDB / Parquet / CSV

Each stage applies progressively stricter validation:

| Stage | Schema Layer | Validation | Column Names |
| --- | --- | --- | --- |
| Extract | Raw | Structural only | UPPER_CASE (API native) |
| Stage | Staging | Types + nullability + ranges | snake_case |
| Transform | Star | Full constraints + FK refs | snake_case |
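
To make the progression concrete, here is a minimal sketch of the three validation levels applied to one row. It assumes plain dictionaries rather than nbadb's actual schema classes; the column names, the rules, and the helper functions are all hypothetical.

```python
# Hypothetical staging rules: column -> (type, nullable, (min, max)).
# A bound of None means "unbounded" on that side.
STAGING_RULES = {
    "player_id": (int, False, (1, None)),
    "pts": (int, True, (0, None)),
}


def check_raw(row, required):
    """Raw layer: structural check only. Returns missing column names."""
    return sorted(required - set(row))


def stage(row):
    """Rename API-native UPPER_CASE columns to snake_case."""
    return {name.lower(): value for name, value in row.items()}


def check_staging(row, rules):
    """Staging layer: types, nullability, and ranges."""
    errors = []
    for col, (typ, nullable, (lo, hi)) in rules.items():
        value = row.get(col)
        if value is None:
            if not nullable:
                errors.append(f"{col}: null not allowed")
            continue
        if not isinstance(value, typ):
            errors.append(f"{col}: expected {typ.__name__}")
            continue
        if (lo is not None and value < lo) or (hi is not None and value > hi):
            errors.append(f"{col}: out of range")
    return errors


def check_star(row, known_player_ids):
    """Star layer: foreign-key style check against a dimension's keys."""
    if row["player_id"] in known_player_ids:
        return []
    return ["player_id: no matching dim row"]


raw_row = {"PLAYER_ID": 7, "PTS": 31}
staged = stage(raw_row)
print(check_raw(raw_row, {"PLAYER_ID", "PTS"}))
print(check_staging(staged, STAGING_RULES))
print(check_star(staged, known_player_ids={7}))
```

Each check returns a list of problems rather than raising, so a row can be reported against all three layers in one pass.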

Generation

Lineage documentation can be regenerated from transform code:

uv run nbadb docs-autogen
# or: uv run python -m nbadb.docs_gen

This introspects BaseTransformer.depends_on and staging schema metadata["source"] to build lineage graphs automatically.
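
As a rough illustration of that introspection, the sketch below collects edges from stand-in classes and a stand-in metadata dictionary, then prints them as Mermaid source. Only `depends_on` and `metadata["source"]` are named in the docs; the `output_table` attribute, the class bodies, the table names, and the source values are assumptions made for the example, and the real generator may work differently.

```python
class BaseTransformer:
    """Stand-in for the real base class, not nbadb's implementation."""
    output_table = ""  # hypothetical attribute, assumed for this sketch
    depends_on = []


class DimPlayer(BaseTransformer):
    output_table = "dim_player"
    depends_on = ["stg_player_info"]


class FactPlayerGame(BaseTransformer):
    output_table = "fact_player_game"
    depends_on = ["stg_box_score", "dim_player"]


# Hypothetical staging schema metadata, keyed by staging table name.
STAGING_METADATA = {
    "stg_player_info": {"source": "player_info_endpoint"},
    "stg_box_score": {"source": "box_score_endpoint"},
}


def build_lineage(base, staging_metadata):
    """Collect (upstream, downstream) edges from metadata and subclasses."""
    edges = []
    # Source -> staging edges come from each schema's metadata["source"].
    for table, meta in sorted(staging_metadata.items()):
        edges.append((meta["source"], table))
    # Staging/star -> star edges come from each transformer's depends_on.
    for cls in sorted(base.__subclasses__(), key=lambda c: c.output_table):
        for dep in cls.depends_on:
            edges.append((dep, cls.output_table))
    return edges


def to_mermaid(edges):
    """Render edges as a left-to-right Mermaid flowchart."""
    lines = ["flowchart LR"]
    lines += [f"    {src} --> {dst}" for src, dst in edges]
    return "\n".join(lines)


print(to_mermaid(build_lineage(BaseTransformer, STAGING_METADATA)))
```

Note that `__subclasses__()` returns only direct subclasses, so a deeper class hierarchy would need a recursive walk.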

Run the next replay

Next steps from lineage

Next stop

Switch from dependency to warehouse shape

Move to Diagrams when you understand the chain of custody and now need the faster visual board for schema shape, pipeline flow, or endpoint coverage.

Next stop

Verify exact contracts after the replay

Continue to Schema Reference or the Data Dictionary when the lineage answer still needs an exact column contract, field meaning, or naming convention check.

Next stop

Reconnect the replay to source scouting reports

Jump to Endpoints when the upstream question is really about the nba_api family, result set, or extractor surface that starts the possession.
