Ball Movement
Data Lineage
Follow ball movement from NBA API sources through staging and into the star schema
Lineage is the film room for nbadb. Instead of watching a possession from the broadcast angle, you watch every touch: the inbound pass at the NBA API, the outlet into raw capture, the half-court reset in staging, and the finish in the analytical warehouse.
Replay-review note: Start here when the question is "where did this come from?" or "what breaks if I change this?"
Choose Your Lens
| Lens | Best for | What you will see |
|---|---|---|
| Table Lineage | Impact analysis and dependency tracing | Full possession chains from endpoints to downstream tables |
| Column Lineage | Debugging a field, rename, or constraint issue | Passing-lane examples for important columns |
| Generated Lineage Map | Exhaustive dependency lookup | Auto-generated transformer graph sourced from code metadata |
lineage-auto.mdx is generator-owned. Use the curated pages in this section
for orientation, then use the generated map when you need exhaustive,
code-sourced dependency detail.
Quick navigation
Trace full table chains
Start with Table Lineage when you need to replay the whole possession from source feed to downstream table.
Debug one field
Use Column Lineage when the breakage is local to one key, metric, rename, or constraint.
Open the exhaustive replay map
Jump to Generated Lineage Map when curated examples are not wide enough and you need code-sourced coverage.
Leave replay review for shape or contracts
Skip to Next steps when you are ready to switch from dependency tracing to diagrams, schema docs, or endpoint scouting.
Use this section when…
| If you need to answer… | Start here |
|---|---|
| “Where did this table come from?” | Table Lineage |
| “Which upstream field fed this column?” | Column Lineage |
| “What else breaks if I change this staging schema or transformer?” | Table Lineage |
| “Where is the exhaustive code-derived dependency map?” | Generated Lineage Map |
Replay the whole possession
Use Table Lineage when a table, dashboard, or output surface looks wrong and you need to trace the full dependency chain back to the inbound feed.
Slow the tape down to one field
Open Column Lineage when the breakage is local to a single key, metric, rename, or constraint rather than the whole table.
Check the full code-sourced map
Go to Generated Lineage Map when you need exhaustive dependency coverage generated directly from transformer metadata.
Why Lineage Matters
- Debugging: When a value looks wrong in a fact table, trace it back to the source API endpoint
- Impact analysis: Before changing a staging schema, see which downstream tables are affected
- Coverage: Identify which API endpoints feed which warehouse tables
- Documentation: Understand the complete data flow without reading transform code
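
The debugging and impact-analysis cases above are both graph walks: one runs upstream toward the source, the other downstream toward consumers. Here is a minimal sketch, assuming a plain dictionary of dependencies. The table names and the `DEPENDS_ON` map are hypothetical illustrations, not taken from nbadb.

```python
from collections import deque

# Hypothetical dependency map: each table lists the tables it reads from.
# These names are illustrative only and are not real nbadb tables.
DEPENDS_ON = {
    "raw_box_score": [],
    "stg_box_score": ["raw_box_score"],
    "dim_player": ["stg_box_score"],
    "fact_player_game": ["stg_box_score", "dim_player"],
    "agg_player_season": ["fact_player_game"],
}


def upstream_of(table, depends_on):
    """Debugging: every table that `table` reads from, directly or transitively."""
    seen = set()
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for upstream in depends_on.get(current, []):
            if upstream not in seen:
                seen.add(upstream)
                queue.append(upstream)
    return sorted(seen)


def downstream_of(table, depends_on):
    """Impact analysis: every table that reads from `table`, directly or transitively."""
    # Invert the edges so the walk runs from a source toward its consumers.
    consumers = {name: [] for name in depends_on}
    for name, upstreams in depends_on.items():
        for upstream in upstreams:
            consumers.setdefault(upstream, []).append(name)
    return upstream_of(table, consumers)
```

With this toy map, `downstream_of("stg_box_score", DEPENDS_ON)` lists the dimension, fact, and aggregate tables that a staging change would touch, and `upstream_of("agg_player_season", DEPENDS_ON)` replays the chain back to raw capture.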
Possession Map
flowchart LR
    A["Tip-off<br/>NBA API"] --> B["Outlet pass<br/>Raw capture"]
    B --> C["Half-court set<br/>Staging validation"]
    C --> D["Finish<br/>Star schema"]
    D --> E["Kick-out<br/>Aggregates, analytics, export"]
    style A fill:#e1f5fe
    style B fill:#fff8e1
    …

Read it left to right: sources start the action, raw preserves the original shape, staging organizes the possession, and the star surface makes the result queryable.
Read the replay by question
| Question | Focus on | Then route to |
|---|---|---|
| “Where did the chain start?” | The source and raw touches in the possession map | Endpoints if you need source-family detail |
| “Where was the shape normalized?” | The staging touch and validation table below | Schema Reference if you need exact contracts |
| “What table or view finished the play?” | The star and export touches | Table Lineage or Column Lineage for the detailed replay |
NBA API --> Raw capture   --> Staging validation --> Star surface      --> Export
source      preserve feed     normalize + type       dim/fact/agg          SQLite / DuckDB /
                                                     + dependency flow     Parquet / CSV

Each stage applies progressively stricter validation:
| Stage | Schema Layer | Validation | Column Names |
|---|---|---|---|
| Extract | Raw | Structural only | UPPER_CASE (API native) |
| Stage | Staging | Types + nullability + ranges | snake_case |
| Transform | Star | Full constraints + FK refs | snake_case |
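
To make the tightening concrete, here is a minimal sketch of one row passing through all three checks. The column names (`PLAYER_ID`, `PTS`), the specific rules, and the function names are assumptions chosen for illustration. The real nbadb schemas and validation tooling may look quite different.

```python
# Hypothetical raw contract: only the presence of columns is checked.
RAW_COLUMNS = ("PLAYER_ID", "PTS")


def check_raw(row):
    """Extract stage: structural check only, names stay UPPER_CASE."""
    missing = [column for column in RAW_COLUMNS if column not in row]
    if missing:
        raise ValueError(f"missing raw columns: {missing}")
    return row


def stage(row):
    """Stage: rename to snake_case, then enforce types, nullability, ranges."""
    renamed = {key.lower(): value for key, value in check_raw(row).items()}
    if renamed["player_id"] is None:
        raise ValueError("player_id must not be null")
    player_id = int(renamed["player_id"])
    # In this sketch pts is nullable, but a present value must be a non-negative int.
    pts = None if renamed["pts"] is None else int(renamed["pts"])
    if pts is not None and pts < 0:
        raise ValueError("pts must be non-negative")
    return {"player_id": player_id, "pts": pts}


def check_star(fact_row, known_player_ids):
    """Transform stage: foreign-key style check against a dimension's keys."""
    if fact_row["player_id"] not in known_player_ids:
        raise ValueError(f"unknown player_id: {fact_row['player_id']}")
    return fact_row
```

A row such as `{"PLAYER_ID": "23", "PTS": "31"}` passes `check_raw` untouched, leaves `stage` with snake_case keys and integer values, and clears `check_star` only if `23` is among the known dimension keys.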
Generation
Lineage documentation can be regenerated from transform code:
uv run nbadb docs-autogen
# or: uv run python -m nbadb.docs_gen

This introspects BaseTransformer.depends_on and staging schema metadata["source"] to build lineage graphs automatically.
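
The introspection idea can be sketched in a few lines. Everything below is a stand-in under stated assumptions: the `BaseTransformer` class here is a hypothetical stub with only a `depends_on` tuple and an assumed `output_table` attribute, the staging metadata is a hand-written dictionary, and the endpoint labels are invented. The actual generator in nbadb may work differently.

```python
class BaseTransformer:
    """Hypothetical stand-in; the real nbadb class may differ."""

    output_table = ""   # assumed attribute naming the table this transformer writes
    depends_on = ()     # tables this transformer reads from


class DimPlayer(BaseTransformer):
    output_table = "dim_player"
    depends_on = ("stg_player",)


class FactPlayerGame(BaseTransformer):
    output_table = "fact_player_game"
    depends_on = ("stg_box_score", "dim_player")


# Hypothetical staging metadata: staging table -> {"source": endpoint label}.
STAGING_SOURCES = {
    "stg_player": {"source": "hypothetical_player_endpoint"},
    "stg_box_score": {"source": "hypothetical_box_score_endpoint"},
}


def build_edges(transformers, staging_metadata):
    """Collect (upstream, downstream) pairs from both metadata sources."""
    edges = []
    for table, metadata in staging_metadata.items():
        edges.append((metadata["source"], table))
    for transformer in transformers:
        for upstream in transformer.depends_on:
            edges.append((upstream, transformer.output_table))
    return sorted(edges)


def to_mermaid(edges):
    """Render the edges as a left-to-right Mermaid flowchart."""
    lines = ["flowchart LR"]
    lines += [f"    {source} --> {target}" for source, target in edges]
    return "\n".join(lines)
```

Calling `to_mermaid(build_edges([DimPlayer, FactPlayerGame], STAGING_SOURCES))` yields a flowchart in the same style as the possession map, with one arrow per declared dependency.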
Next steps from lineage
Switch from dependency to warehouse shape
Move to Diagrams when you understand the chain of custody and now need the faster visual board for schema shape, pipeline flow, or endpoint coverage.
Verify exact contracts after the replay
Continue to Schema Reference or the Data Dictionary when the lineage answer still needs an exact column contract, field meaning, or naming convention check.
Reconnect the replay to source scouting reports
Jump to Endpoints when the upstream question is really about the nba_api family, result set, or extractor surface that starts the possession.