Analyzing a LeBron dunk using decomposition and motion graphs – the code

Analyzing LeBron's speed in Python using Visual Studio

This post dives into some of the technical nitty-gritty powering our lesson on analyzing a basketball play using decomposition and motion graphs. If you haven’t had a chance to check out that post, definitely start there. As stated in that post, 95% of the credit belongs to Savvas Tjortjoglou and his incredibly helpful blog post describing how to access and manipulate NBA movement data.

First things first – we need to pick what play we’d like to analyze. I chose this dunk from Game 3 of the 2015 NBA Finals, since it’s awe-inspiring and seemed like it could be broken down into a relatively simple play. To be able to read the correct set of movement data, we need two IDs: the game ID and the event ID. The game ID is easy to retrieve – Game 3 took place on June 9, so we can go to the Scores page on stats.nba.com for June 9, 2015 and click on Box Score under the only game there. This leads us to http://stats.nba.com/game/#!/0041400403/, and that last part of the path is our game ID – 0041400403.

The event ID is a little trickier to get. Here’s one method, from Savvas:

  1. Click on Play by Play on the game’s stat page (http://stats.nba.com/game/#!/0041400403/)
  2. Find the play you want and note the text description – in this case, we want “James REBOUND (Off:1 Def:11)” at 4:20 in Q4.
  3. Plug the game ID into the following URL, which returns the JSON data for the game: http://stats.nba.com/stats/playbyplayv2?StartPeriod=1&EndPeriod=10&RangeType=2&GameID=0041400403
  4. Finally, search for the text description from step 2. This leads us to the following entry:
    ["0041400403",445,4,0,4,"11:20 PM","4:20","James REBOUND (Off:1 Def:11)",...]

    The first element is the game ID, and the second is the event ID: 445. It’s actually referred to as “EVENTNUM” in the JSON, but they are the same ID.
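
Steps 3 and 4 can also be done in code instead of searching the raw JSON by eye. The sketch below is not from Savvas' post: it assumes the response is shaped like {"resultSets": [{"headers": [...], "rowSet": [...]}]} with an "EVENTNUM" column, and find_event_ids is a hypothetical helper name. The sample rows are hand-made for illustration, not real API output.

```python
def find_event_ids(play_by_play, description):
    """Return the EVENTNUM of every play whose text contains `description`."""
    # Assumed layout: {"resultSets": [{"headers": [...], "rowSet": [[...], ...]}]}
    result = play_by_play["resultSets"][0]
    event_col = result["headers"].index("EVENTNUM")
    matches = []
    for row in result["rowSet"]:
        # Description cells hold strings or None, so check every string cell in the row
        if any(isinstance(cell, str) and description in cell for cell in row):
            matches.append(row[event_col])
    return matches


# Hand-made sample in the assumed layout (the first row is a placeholder play)
sample = {
    "resultSets": [{
        "headers": ["GAME_ID", "EVENTNUM", "PERIOD", "PCTIMESTRING", "HOMEDESCRIPTION"],
        "rowSet": [
            ["0041400403", 444, 4, "4:24", "placeholder play"],
            ["0041400403", 445, 4, "4:20", "James REBOUND (Off:1 Def:11)"],
        ],
    }]
}

print(find_event_ids(sample, "James REBOUND (Off:1 Def:11)"))  # prints [445]
```

If the layout assumption holds, passing the parsed JSON from the URL in step 3 in place of sample should return the same event ID.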

Now that we have the two IDs we need, we’re ready to code! Setting up Python and all the requisite libraries was easily the most challenging part of this project for me. I downloaded the NumPy, pandas, Matplotlib, Requests, SciPy and Seaborn libraries from http://www.lfd.uci.edu/~gohlke/pythonlibs/ and installed them with pip. All of the scripts below are available here.

The core data loading and manipulation part of our Python scripts comes directly from Savvas’ blog, linked above:

print("Loading data...")

import requests # used for loading data from NBA
import pandas as pd # used for data storage and manipulation
import numpy as np # used for math manipulation

import matplotlib.pyplot as plt # used for plotting
import matplotlib.animation as animation # used for animation
import seaborn as sns # used for graph formatting

url = "http://stats.nba.com/stats/locations_getmoments/?eventid=445&gameid=0041400403" # set id parameters as appropriate

response = requests.get(url) # load data from NBA
print("Done loading data")

data = response.json() # parse the JSON response once
home = data["home"] # dictionary of home players
visitor = data["visitor"] # dictionary of visiting players
moments = data["moments"] # player moment data

headers = ["team_id", "player_id", "x_loc", "y_loc", 
           "radius", "moment", "game_clock", "shot_clock"] # column headers

player_moments = [] # create an empty list to populate

for index, moment in enumerate(moments): # enumerate gives each moment's index without a list search
    for player in moment[5]: # for every player and the ball, at each moment in time...
        player.extend((index, moment[2], moment[3])) # add index, game clock and shot clock
        player_moments.append(player)

df = pd.DataFrame(player_moments, columns=headers) # create a Pandas dataframe with this data

players = home["players"] # create a list of players, starting with home players...
players.extend(visitor["players"]) # and then adding visiting players

id_dict = {} # create a new dictionary to map player IDs to their jerseys

for player in players:
    id_dict[player['playerid']] = [player["firstname"]+" "+player["lastname"],
                                   player["jersey"]] # populate dictionary with players' names and jersey numbers

id_dict.update({-1: ['ball', np.nan]}) # add the ball

df["player_name"] = df.player_id.map(lambda x: id_dict[x][0]) # replace ids with names in dataframe
df["player_jersey"] = df.player_id.map(lambda x: id_dict[x][1]) # replace ids with jersey number in dataframe

time_mask = (df.game_clock <= 266) & (df.game_clock >= 260) # narrow down the play to the time we care about
time_df = df[time_mask]
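
To see what the flattening loop and the time window do without calling the NBA endpoint, here is a small offline sketch in plain Python. The moments are made up, and the values at positions 0, 1 and 4 of each moment are placeholders, since the loading code only reads positions 2 (game clock), 3 (shot clock) and 5 (the ball and players). flatten_moments is a hypothetical helper, not part of the original scripts.

```python
def flatten_moments(moments):
    """Turn nested moment records into one flat row per player (or ball) per moment."""
    rows = []
    for index, moment in enumerate(moments):
        game_clock, shot_clock = moment[2], moment[3]
        for entity in moment[5]:
            # Build a new list instead of extending in place, so the input is left untouched
            rows.append(list(entity) + [index, game_clock, shot_clock])
    return rows


# Three made-up moments: the ball (ids of -1) and one hypothetical player (id 101)
moments = [
    [4, 0, 266.04, 12.04, None, [[-1, -1, 46.5, 25.0, 5.0], [1, 101, 39.5, 19.5, 0.0]]],
    [4, 0, 266.00, 12.00, None, [[-1, -1, 47.0, 25.0, 5.0], [1, 101, 40.0, 20.0, 0.0]]],
    [4, 0, 265.96, 11.96, None, [[-1, -1, 47.5, 25.0, 5.0], [1, 101, 40.5, 20.5, 0.0]]],
]

rows = flatten_moments(moments)
# Column 6 is game_clock, matching the order of the headers list
in_window = [row for row in rows if 260 <= row[6] <= 266]
print(len(rows), len(in_window))  # prints 6 4
```

Each flat row has the same eight fields as the headers list, and the first moment falls just outside the 260 to 266 second window, so its two rows are filtered out.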

I used the data-loading code above in three different scripts. The first creates the player movement animation, the second creates the animated speed graphs for each player, and the third creates the static speed graphs with larger fonts.

The player movement animation script is admittedly not the ideal implementation – I hardcoded the player names for the sake of time, knowing I’d only need to use this once. It converts each player’s x and y locations to lists and scatterplots them over time on top of an image of the court (converted to PNG). I used TeamColors to figure out the correct hex codes to use for each player.

james = time_df[time_df.player_name=="LeBron James"] # create a new dataframe with just James' data
jamesx = james.x_loc.tolist() # convert James' x-locations to a list
jamesy = james.y_loc.tolist() # convert James' y-locations to a list
dellavedova = time_df[time_df.player_name=="Matthew Dellavedova"] # repeat for other players and the ball
dellavedovax = dellavedova.x_loc.tolist()
dellavedovay = dellavedova.y_loc.tolist()
curry = time_df[time_df.player_name=="Stephen Curry"]
curryx = curry.x_loc.tolist()
curryy = curry.y_loc.tolist()
thompson = time_df[time_df.player_name=="Klay Thompson"]
thompsonx = thompson.x_loc.tolist()
thompsony = thompson.y_loc.tolist()
ball = time_df[time_df.player_name=="ball"]
ballx = ball.x_loc.tolist()
bally = ball.y_loc.tolist()

court = plt.imread("fullcourt.png") # load the background image of the court

fig = plt.figure(figsize=(15, 11.5)) # create the figure to plot on
plt.imshow(court, zorder=0, extent=[0,94,50,0]) # draw the background image
plt.xlim(0,94) # match the axes to the image
plt.ylim(0,50)

def animate(nframe): # function used to animate the motion
    print("Percent complete: " + str(round(100*nframe/len(jamesx),0)) + "%") # show progress
    plt.plot(jamesx[nframe], 50-jamesy[nframe], 'o', ms=30, color='#FDBB30', zorder=1) # draw each player as a large dot matching their team color
    plt.plot(dellavedovax[nframe], 50-dellavedovay[nframe], 'o', ms=30, color='#FDBB30')
    plt.plot(curryx[nframe], 50-curryy[nframe], 'o', ms=30, color='#006BB6')
    plt.plot(thompsonx[nframe], 50-thompsony[nframe], 'o', ms=30, color='#006BB6')
    plt.plot(ballx[nframe], 50-bally[nframe], 'o', ms=20, color='#FA8320') # draw the ball as a smaller dot in orange

anim = animation.FuncAnimation(fig, animate, frames=len(jamesx), interval=0) # generate the animation
anim.save('animation_realtime.mp4', writer='ffmpeg', fps=25, bitrate=2000, codec="libxvid") # save the animation

print('Animation complete!')
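
One possible way to avoid hardcoding each player is to group the coordinates by name first and then loop over the groups when drawing. This is only a sketch under assumptions: tracks_by_player and COURT_WIDTH are hypothetical names, the records are made up, and the y-flip mirrors the 50-y used in the script above.

```python
from collections import defaultdict

COURT_WIDTH = 50  # feet; the script above flips y with 50 - y to match the court image


def tracks_by_player(records):
    """Group (name, x, y) records into {name: ([x, ...], [flipped y, ...])}."""
    tracks = defaultdict(lambda: ([], []))
    for name, x, y in records:
        xs, ys = tracks[name]
        xs.append(x)
        ys.append(COURT_WIDTH - y)
    return dict(tracks)


# Made-up records in time order, two moments for one player and the ball
records = [
    ("LeBron James", 40.0, 20.0),
    ("ball", 47.0, 25.0),
    ("LeBron James", 40.5, 20.5),
    ("ball", 47.5, 25.0),
]

tracks = tracks_by_player(records)
print(tracks["LeBron James"])  # prints ([40.0, 40.5], [30.0, 29.5])
```

With the real data, the records could come from time_df[["player_name", "x_loc", "y_loc"]].itertuples(index=False), and the animate function could then loop over tracks.items() with a dictionary of team colors instead of repeating one plt.plot call per player.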

The animated speed graph script uses a function to calculate player speed over time using the positional data, and another function to animate a smoothed speed curve over time:

def travel_dist(player_locations_x, player_locations_y,player_locations_clock): # calculate distance over time
    diff_x = np.diff(player_locations_x) # create a new list with x-distance traveled between each moment
    diff_y = np.diff(player_locations_y) # create a new list with y-distance traveled between each moment
    diff_time = 0 - np.diff(player_locations_clock) # determine the length of each moment
    time = player_locations_clock
    dist = np.sqrt(diff_x*diff_x + diff_y*diff_y) # calculate distance for each moment
    return [time, dist, diff_time]

window = 10 # smoothing window (larger is smoother)

fig = plt.figure() # create a figure to plot on

def plot_speed(player_name, teamcolor, name):
    plt.cla() # clear the plot so each speed curve stands alone
    player_df = time_df[time_df['player_name'].str.contains(player_name)] # select the given player's rows once
    [time, dist, diff_time] = travel_dist(player_df.x_loc, player_df.y_loc, player_df.game_clock) # calculate distance over time for the given player
    smoothed_speed = np.convolve(dist/diff_time,np.ones(int(window))/float(window), 'valid') # smooth the speed curve
    time_middle = time.tail(len(smoothed_speed)) # keep the last clock values so time and smoothed speed have the same length
    time_plot = time_middle.tolist() # convert time to a list
    speed_plot = smoothed_speed.tolist() # convert speed to a list
    plt.xlim(0,6) # lock x-axis to the correct time window
    plt.ylim(0,20) # lock y-axis to a reasonable max speed
    plt.xlabel("Time (seconds)", size=14) # set axis labels and title 
    plt.ylabel("Speed (miles per hour)", size=14)
    plt.title(name, size=24)

    def animate(nframe): # function used to animate the motion
        print(name + " percent complete: " + str(round(100*nframe/len(time_plot),0)) + "%") # show progress
        plt.plot(max(time_plot)-time_plot[nframe], speed_plot[nframe]*0.681818, 'o', color=teamcolor) # plot speed curve, shifting time to start at 0 and converting speed to MPH
        
    anim = animation.FuncAnimation(fig, animate, frames=len(time_middle), interval=0) # generate the animation
    anim.save('speed_animation_'+name+'.mp4', writer='ffmpeg', fps=25, bitrate=2000, codec="libxvid") # save the animation

plot_speed('Curry', '#006BB6', 'Curry')
plot_speed('James', '#FDBB30', 'James')
plot_speed('Dellavedova', '#FDBB30', 'Dellavedova')
plot_speed('Klay', '#006BB6', 'Thompson')

print('Animation complete!')
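
The speed calculation is easier to check on data where the answer is known in advance. The sketch below is a condensed stand-in for travel_dist plus the smoothing step, not the script itself: smoothed_speed_mph is a hypothetical name, and the track is made up (a player moving at a constant 22 feet per second, sampled every 0.04 seconds). Positions are in feet, so distance over time is in feet per second, and multiplying by 3600/5280 (about 0.681818) converts to miles per hour.

```python
import numpy as np

FT_PER_SEC_TO_MPH = 3600 / 5280  # about 0.681818, the factor used in the script above


def smoothed_speed_mph(x, y, clock, window):
    """Moving-average speed in mph from positions in feet and a countdown clock in seconds."""
    dist = np.hypot(np.diff(x), np.diff(y))  # feet covered between consecutive samples
    dt = -np.diff(clock)                     # the game clock counts down, so negate
    speed = dist / dt                        # feet per second
    kernel = np.ones(window) / window        # equal weights that sum to 1
    return np.convolve(speed, kernel, "valid") * FT_PER_SEC_TO_MPH


# Made-up track: constant 22 ft/s along x, 26 samples spaced 0.04 s apart
n = 26
clock = 266.0 - 0.04 * np.arange(n)
x = 22.0 * 0.04 * np.arange(n)
y = np.full(n, 25.0)

mph = smoothed_speed_mph(x, y, clock, window=10)
print(len(mph), round(float(mph[0]), 1))  # prints 16 15.0
```

The 26 positions give 25 speeds, and a 'valid' moving average with a window of 10 keeps 25 - 10 + 1 = 16 of them, which is why the script trims the time series to the length of the smoothed speed before plotting. The constant 22 ft/s comes out as 15 mph, as expected.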

Finally, the static speed graph script uses the same speed calculation, but plots one static graph formatted a bit more legibly for analysis purposes. In the code below, I’m plotting only James’ graph, and I’m not explicitly saving it in code.

def travel_dist(player_locations_x, player_locations_y,player_locations_clock): # calculate distance over time
    diff_x = np.diff(player_locations_x) # create a new list with x-distance traveled between each moment
    diff_y = np.diff(player_locations_y) # create a new list with y-distance traveled between each moment
    diff_time = 0 - np.diff(player_locations_clock) # determine the length of each moment
    time = player_locations_clock
    dist = np.sqrt(diff_x*diff_x + diff_y*diff_y) # calculate distance for each moment
    return [time, dist, diff_time]

window = 10 # smoothing window (larger is smoother)

plt.rc('xtick', labelsize=30) # increase font size of axis ticks
plt.rc('ytick', labelsize=30) 

def plot_speed(player_name, teamcolor, name): # plot the speed of a given player
    player_df = time_df[time_df['player_name'].str.contains(player_name)] # select the given player's rows once
    [time, dist, diff_time] = travel_dist(player_df.x_loc, player_df.y_loc, player_df.game_clock) # calculate distance over time for the given player
    smoothed_speed = np.convolve(dist/diff_time,np.ones(int(window))/float(window), 'valid') # smooth the speed curve
    time_middle = time.tail(len(smoothed_speed)) # keep the last clock values so time and smoothed speed have the same length
    plt.plot(max(time_middle)-time_middle, smoothed_speed*0.681818, 'o', color=teamcolor, label=name) # plot speed curve, shifting time to start at 0 and converting speed to MPH

#plot_speed('Curry', '#FDB927', 'Curry') # right now this will only plot James' speed graph
plot_speed('James', '#FDBB30', 'James')
#plot_speed('Dellavedova', '#002D62', 'Dellavedova')
#plot_speed('Klay', '#006BB6', 'Thompson')

plt.xlabel("Time (seconds)", size=30) # set axis labels and title
plt.ylabel("Speed (miles per hour)", size=30)
plt.title("James' speed over time", size=40)
plt.show()

Let me take this chance to reiterate that I am not a software engineer – both to excuse any inefficient coding practices and to reinforce that you don’t need a CS degree to do some really cool things with code, just the resources available online!

If you’d like to download this VS project (even if it’s just for the .py scripts), you can find that here. If you have any questions, leave a comment below or reach out on Twitter!
