SIADS 521: Visual Exploration of Data · University of Michigan · November 2025
This project was completed as part of SIADS 521: Visual Exploration of Data at the University of Michigan. The course focuses on the principles and practice of visual data analysis — choosing the right chart types, encoding data effectively, and building interactive tools that let analysts explore data rather than just report it.
The central question: how has NBA shot selection changed over the past 14 seasons, and which shooting zones are actually efficient? To answer this, I built an interactive Plotly dashboard that analyzes shot-by-shot data for the top 25 NBA scorers by points per game in each season from 2011 to 2025. The dashboard uses four complementary chart types, each chosen for a specific analytical purpose, and lets users filter by player and season across all views at once.
Skills demonstrated in this project
The dataset contains shot-by-shot records for the top 25 NBA players by points per game in each season from 2011 to 2025, for a total of 828,368 attempts. Each record includes the shot's court coordinates, distance from the basket, shot type (two-pointer or three-pointer), outcome (made or missed), and the points scored. Because players are selected by PPG ranking within each season, the dataset captures the league's most prolific scorers in their highest-scoring seasons rather than across their full careers.
Prior to visualization, the data was cleaned to remove outlier attempts (shots beyond 35 feet from the basket, or outside court boundaries), standardize column names, and calculate a points-scored field based on shot type and outcome. This preprocessing made it possible to compute efficiency metrics — points per attempt and field goal percentage — across any player, season, or distance range selected in the dashboard.
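The cleaning and scoring steps described above can be sketched in plain Python. This is a minimal illustration, not the project's actual pipeline: the `Shot` fields, the `"2PT"`/`"3PT"` labels, and the court-boundary constants are assumptions made for the example. Only the 35-foot cutoff comes from the text.

```python
from dataclasses import dataclass

MAX_DISTANCE_FT = 35  # cutoff stated in the write-up

# Assumed half-court bounds in feet (hypothetical coordinate system).
COURT_X_RANGE = (-25.0, 25.0)
COURT_Y_RANGE = (0.0, 47.0)


@dataclass
class Shot:
    x: float
    y: float
    distance_ft: float
    shot_type: str  # "2PT" or "3PT" (assumed labels)
    made: bool


def is_valid(shot):
    """Keep shots within 35 ft that fall inside the assumed court bounds."""
    in_x = COURT_X_RANGE[0] <= shot.x <= COURT_X_RANGE[1]
    in_y = COURT_Y_RANGE[0] <= shot.y <= COURT_Y_RANGE[1]
    return shot.distance_ft <= MAX_DISTANCE_FT and in_x and in_y


def points_scored(shot):
    """Derive points from shot type and outcome; misses score zero."""
    if not shot.made:
        return 0
    return 3 if shot.shot_type == "3PT" else 2


def efficiency(shots):
    """Points per attempt and field goal percentage for any selection."""
    attempts = len(shots)
    if attempts == 0:
        return {"attempts": 0, "ppa": None, "fg_pct": None}
    makes = sum(1 for s in shots if s.made)
    points = sum(points_scored(s) for s in shots)
    return {"attempts": attempts, "ppa": points / attempts, "fg_pct": makes / attempts}


raw = [
    Shot(0.0, 1.0, 1.0, "2PT", True),
    Shot(10.0, 15.0, 18.0, "2PT", False),
    Shot(-22.0, 5.0, 23.0, "3PT", True),
    Shot(0.0, 45.0, 45.0, "3PT", True),    # dropped: beyond 35 ft
    Shot(30.0, 10.0, 31.0, "3PT", False),  # dropped: outside assumed bounds
]
clean = [s for s in raw if is_valid(s)]
summary = efficiency(clean)
```

Because `efficiency` accepts any list of shots, the same function can serve a single player, a single season, or a distance range.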
The dashboard pairs each visualization type with a specific analytical purpose. A player dropdown and season slider filter all four charts simultaneously, so any change in selection propagates across the full view. This allows direct comparison of, for example, a specific player's shot chart and efficiency profile against the league-wide pattern, or of the same player across different seasons.
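One way to keep four charts in sync is to route every selection through a single filter and build each chart from its result. The sketch below shows only that shared filtering step, under assumed record keys (`player`, `season`) and an assumed inclusive season range. It does not reproduce the dashboard's actual callback code.

```python
def filter_shots(shots, player=None, seasons=None):
    """Return shots matching the selection.

    player=None means all players; seasons is an inclusive
    (first, last) pair, or None for every season.
    """
    selected = []
    for shot in shots:
        if player is not None and shot["player"] != player:
            continue
        if seasons is not None and not (seasons[0] <= shot["season"] <= seasons[1]):
            continue
        selected.append(shot)
    return selected


# Hypothetical records; player names and keys are placeholders.
shots = [
    {"player": "Player A", "season": 2012, "shot_type": "2PT", "made": True},
    {"player": "Player A", "season": 2024, "shot_type": "3PT", "made": False},
    {"player": "Player B", "season": 2024, "shot_type": "3PT", "made": True},
]

everyone_2024 = filter_shots(shots, seasons=(2024, 2024))
player_a_all = filter_shots(shots, player="Player A")
```

Feeding the same filtered list to every chart means no view can fall out of step with the current selection.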
The scatter plot maps every shot attempt onto a half-court coordinate plane, color-coded by outcome (green for makes, red for misses). This serves as the dashboard's spatial foundation — showing where shots are taken and immediately surfacing hotspots, dead zones, and player tendencies. Scatter plots are the right choice here because both axes (x and y court coordinates) carry meaningful spatial information, and encoding shot outcome as color adds a third dimension without cluttering the view.
Filtering by season reveals how league-wide tendencies shifted over time. In the earliest seasons, shot attempts are spread across the entire court, mid-range included. By the most recent seasons, the mid-range has nearly emptied out, replaced by a concentrated preference for three-point attempts and shots at the rim.
The histogram bins shot attempts into five distance ranges (0–3 ft, 3–10 ft, 10–16 ft, 16–24 ft, and 24+ ft) and calculates points per attempt for each zone. A red-yellow-green color scale makes the efficiency gradient immediately readable. This transforms raw spatial data into a quantitative efficiency profile — revealing not just where players shoot, but which zones actually pay off.
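The binning step can be illustrated with the standard library's `bisect` module. The five ranges come from the text, but the boundary convention is an assumption: here a shot at exactly 3 ft falls in the 3-10 ft zone, and likewise for the other edges. The `(distance_ft, points)` tuples are a hypothetical input shape.

```python
from bisect import bisect_right

BIN_EDGES = [3, 10, 16, 24]  # upper edges of the first four zones, in feet
BIN_LABELS = ["0-3 ft", "3-10 ft", "10-16 ft", "16-24 ft", "24+ ft"]


def zone_for(distance_ft):
    """Map a distance to its zone label (left-closed bins, by assumption)."""
    return BIN_LABELS[bisect_right(BIN_EDGES, distance_ft)]


def points_per_attempt_by_zone(shots):
    """shots: iterable of (distance_ft, points) pairs, with 0 points for a miss."""
    totals = {label: [0, 0] for label in BIN_LABELS}  # [points, attempts]
    for distance_ft, points in shots:
        bucket = totals[zone_for(distance_ft)]
        bucket[0] += points
        bucket[1] += 1
    return {
        label: (pts / att if att else None)
        for label, (pts, att) in totals.items()
    }


sample = [
    (1.5, 2), (2.0, 0),             # at the rim: one make, one miss
    (8.0, 2),                       # short range
    (18.0, 0), (18.0, 2),           # long two-pointers
    (24.0, 3), (25.0, 3), (26.0, 0),  # three-point range
]
ppa = points_per_attempt_by_zone(sample)
```

Zones with no attempts return `None` rather than zero, so an empty zone is not mistaken for an inefficient one when the bars are colored.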
The line chart plots the percentage of two-point vs. three-point attempts in each season, with both series on the same axes for direct comparison. A line chart suits this view because the data is time-ordered and the story is about the trend rather than a single snapshot, so continuous lines read more intuitively than bars. Hovering over any point shows the exact split for that season, and filtering to a single player reveals how that player's shot selection evolved across the seasons in the dataset.
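The two series behind a chart like this are per-season attempt shares. A minimal sketch of that calculation follows, assuming each shot is reduced to a `(season, shot_type)` pair with seasons stored as integers and shot types labeled `"2PT"` or `"3PT"`. Those representations are illustrative choices, not the project's schema.

```python
from collections import Counter, defaultdict

SHOT_TYPES = ("2PT", "3PT")


def attempt_share_by_season(shots):
    """Return {season: {"2PT": pct, "3PT": pct}}, ordered by season."""
    counts = defaultdict(Counter)
    for season, shot_type in shots:
        counts[season][shot_type] += 1

    shares = {}
    for season in sorted(counts):
        total = sum(counts[season].values())
        shares[season] = {
            t: 100 * counts[season][t] / total for t in SHOT_TYPES
        }
    return shares


sample = [
    (2012, "2PT"), (2012, "2PT"), (2012, "2PT"), (2012, "3PT"),
    (2025, "2PT"), (2025, "3PT"),
]
shares = attempt_share_by_season(sample)
```

The two percentages for a season always sum to 100, so the lines mirror each other. Plotting both still helps, because readers can follow whichever shot type they care about without subtracting.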
The stacked bar chart displays cumulative points from two-pointers and three-pointers for each season, with each bar divided by shot type. This view shifts the lens from shot attempts to actual scoring output — showing not just how shot selection changed, but how that shift affected total points generated. Stacked bars are well-suited for proportional composition across discrete categories, making the growing share of three-point scoring easy to read at a glance.
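Each stacked bar reduces to two totals per season: points from made two-pointers and points from made three-pointers. The sketch below computes those segments and the three-point share of each bar. The `(season, shot_type, made)` tuples are a hypothetical input shape chosen for the example.

```python
from collections import defaultdict


def points_by_season_and_type(shots):
    """Return {season: {"2PT": points, "3PT": points}}, ordered by season."""
    totals = defaultdict(lambda: {"2PT": 0, "3PT": 0})
    for season, shot_type, made in shots:
        row = totals[season]  # touch the season so all-miss seasons still appear
        if made:
            row[shot_type] += 3 if shot_type == "3PT" else 2
    return dict(sorted(totals.items()))


def three_point_share(segments):
    """Fraction of a season's points that came from three-pointers."""
    total = segments["2PT"] + segments["3PT"]
    return segments["3PT"] / total if total else None


sample = [
    (2012, "2PT", True), (2012, "2PT", True),
    (2012, "3PT", False), (2012, "3PT", True),
    (2025, "2PT", True), (2025, "3PT", True), (2025, "3PT", True),
]
stacked = points_by_season_and_type(sample)
share_2025 = three_point_share(stacked[2025])
```

Working from points rather than attempts is what separates this view from the line chart: a missed three adds to the attempt share but contributes nothing here.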