Creating a soccer-style table for the NBA
As someone who only watches basketball and the English Premier League, I love getting to combine the two in some way. Even though I'm not the biggest Boston Celtics fan in the world, seeing my favorite footballer, Bukayo Saka, taking in a Celtics game was one of the bright spots in my malaise after Arsenal's late-season loss to Tottenham Hotspur ended their hopes of an improbable top-4 finish.
So when I had the idea to convert the NBA standings into a soccer-style table, I thought it would be a great little project. I'm going to break down the necessary parts of the short R script that I wrote to accomplish this. It involves scraping data from Basketball Reference.
First off, I wanted to make this script portable. It will work as-is on any recent NBA season. If you want to adapt it for a season from before the Hornets became the Pelicans and the Bobcats became the Hornets, you'll just have to change which team abbreviations go into the "teams" vector.
Here I will load the necessary packages:
library(dplyr)
library(stringr)
library(rvest)
This is a vector that holds all 30 teams' abbreviations on Basketball Reference:
teams <- c('BOS','TOR','PHI','NYK','BRK',
'ATL','CHO','WAS','MIA','ORL',
'MIL','CHI','IND','DET','CLE',
'MEM','DAL','HOU','SAS','NOP',
'LAL','LAC','SAC','GSW','PHO',
'MIN','UTA','DEN','OKC','POR')
Now we set the year and initialize the "schedule" data frame. Basketball Reference labels each season by the year in which it ends, so the current season, 2022-23, is 2023. The final table will be accurate through the games on Friday, December 16th, 2022.
year <- 2023
schedule <- data.frame()
Now I'll pull all 30 teams' schedules using this for-loop:
for(i in seq_along(teams)){
  # each team's schedule page follows the same URL pattern
  url <- paste0('https://www.basketball-reference.com/teams/',teams[i],'/',year,'_games.html')
  webpage <- read_html(url)
  tbls_ls <- webpage %>%
    html_nodes("table") %>%
    html_table(fill = TRUE)
  # only pulling a few columns from the first table in each list
  sched <- data.frame(tbls_ls[[1]])[c(2,6,7,9:11)]
  # drop the header rows that repeat partway down the table
  sched <- sched %>% filter(Date!="Date")
  names(sched) <- c("Date","Home_Away","Opponent","Overtime",
                    "Points_For","Points_Against")
  sched$Team <- teams[i]
  schedule <- rbind(schedule,sched)
}
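The loop above sends 30 requests back to back, and sites like Basketball Reference may throttle or block rapid automated requests, so it is worth checking their current policy. Here is a minimal sketch of a more defensive fetch, assuming a fixed pause between requests is acceptable. The helper names `build_schedule_url` and `fetch_schedule_page` and the 4-second default are my own placeholders, not part of the original script.

```r
# Hypothetical helper: builds the same URL the loop pastes together.
build_schedule_url <- function(team, year) {
  paste0("https://www.basketball-reference.com/teams/", team, "/", year, "_games.html")
}

# Hypothetical helper: waits, then tries to read one team's schedule page.
# Returns NULL (with a message) instead of stopping the whole loop on an error.
# The pause length is an arbitrary assumption, not a documented requirement.
fetch_schedule_page <- function(team, year, pause_seconds = 4) {
  Sys.sleep(pause_seconds)
  tryCatch(
    rvest::read_html(build_schedule_url(team, year)),
    error = function(e) {
      message("Could not fetch ", team, ": ", conditionMessage(e))
      NULL
    }
  )
}
```

Inside the loop, `webpage <- fetch_schedule_page(teams[i], year)` followed by `if (is.null(webpage)) next` would skip a team whose page failed to load rather than aborting the run.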
Because we scraped everything from online, it's all text, so we'll convert the dates to dates and the numbers to numbers. Revolutionary.
schedule$Date <- as.Date(schedule$Date,format = "%a, %b %d, %Y")
schedule$Points_For <- as.numeric(schedule$Points_For)
schedule$Points_Against <- as.numeric(schedule$Points_Against)
schedule$Overtime <- ifelse(is.na(schedule$Overtime)," ",schedule$Overtime)
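One caveat with the date conversion: `%a` and `%b` are matched against the day and month names of the current locale, so on a non-English system the conversion above can come back as NA. Here is a rough sketch of a locale-independent alternative, assuming the dates always look like "Tue, Oct 18, 2022". The function name `parse_bref_date` is hypothetical.

```r
# Hypothetical helper: parse dates such as "Tue, Oct 18, 2022" without
# depending on the system locale. month.abb is R's built-in vector of
# English month abbreviations ("Jan", "Feb", ...).
parse_bref_date <- function(x) {
  pattern <- "^[A-Za-z]+, ([A-Za-z]+) ([0-9]+), ([0-9]{4})$"
  month <- match(sub(pattern, "\\1", x), month.abb)
  # strings that do not match the pattern become NA rather than an error
  day   <- suppressWarnings(as.integer(sub(pattern, "\\2", x)))
  year  <- suppressWarnings(as.integer(sub(pattern, "\\3", x)))
  as.Date(ISOdate(year, month, day))
}

parse_bref_date(c("Tue, Oct 18, 2022", "Fri, Dec 16, 2022"))
```

Swapping this in would mean replacing the `as.Date(...)` line with `schedule$Date <- parse_bref_date(schedule$Date)`.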
Now is where the tough decision comes in. Soccer is famous for having ties, while basketball is famous for going into as many overtimes as it takes to decide a winner and a loser in every single game. If we don't incorporate a tie in some way, the "points" on the table will just be 3 for every win and 0 for every loss, and then the table will be no different from the standings.
So I arbitrarily decided that a "draw" is any game that went to overtime or finished within 3 points. The cutoff could just as easily be 4 or 5, or you could count only the games that went into overtime. Feel free to change the draw threshold around and see how different the table becomes!
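# any overtime game counts as a draw; grepl also catches multi-overtime labels such as "2OT"
schedule$Result <- ifelse(grepl("OT", schedule$Overtime), 'D',
  ifelse(abs(schedule$Points_Against-schedule$Points_For)<=3, 'D',
    ifelse(schedule$Points_For > schedule$Points_Against,'W','L')))
# 3 points for a win, 1 for a draw, 0 for a loss
schedule$points <- ifelse(schedule$Result=='D',1,ifelse(schedule$Result=='W',3,0))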
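To make the draw threshold easier to experiment with, the rule can be wrapped in a function with the margin as a parameter. This is only a sketch of that idea. The names `classify_result`, `result_points`, and `draw_margin` are mine, and it assumes that any overtime label contains the text "OT".

```r
# Hypothetical helper: classify each game as W, D, or L.
# A game is a draw if it went to overtime or ended within draw_margin points.
# Games that have not been played yet (NA scores) stay NA.
classify_result <- function(points_for, points_against, overtime, draw_margin = 3) {
  went_to_ot <- grepl("OT", overtime, fixed = TRUE)
  margin <- points_for - points_against
  ifelse(is.na(margin), NA_character_,
    ifelse(went_to_ot | abs(margin) <= draw_margin, "D",
      ifelse(margin > 0, "W", "L")))
}

# Hypothetical helper: soccer-style points, 3 for a win, 1 for a draw, 0 for a loss.
result_points <- function(result) {
  unname(c(W = 3, D = 1, L = 0)[result])
}

# Made-up scores: a 10-point win, a 3-point loss, a double-overtime loss,
# a 4-point loss, and a game that has not been played yet.
res <- classify_result(c(110, 101, 120, 95, NA),
                       c(100, 104, 121, 99, NA),
                       c(" ", " ", "2OT", " ", " "))
res
result_points(res)
```

Calling `classify_result(..., draw_margin = 5)` would then show how the table shifts with a looser definition of a draw.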
Now we will put everything together in a nice clean table in the same column order as the soccer table. (Here's where I switch from base R to dplyr full-time.)
table <- schedule %>%
  filter(!is.na(points)) %>%
  group_by(Team) %>%
  summarize(
    `Matches Played` = sum(!is.na(points)),
    Wins = sum(points==3),
    Draws = sum(points==1),
    Losses = sum(points==0),
    Scored = sum(Points_For,na.rm=TRUE),
    Conceded = sum(Points_Against,na.rm=TRUE),
    Differential = Scored-Conceded,
    Points = sum(points,na.rm=TRUE)) %>%
  arrange(desc(Points),desc(Differential)) %>%
  mutate(Rank=row_number()) %>%
  print(n=30)
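Because every game shows up twice in "schedule" (once for each team), a finished table should satisfy a few simple invariants. Here is a small sketch of those checks on a made-up three-team table. The function name `check_table` and the toy numbers are assumptions for illustration, and the checks only hold when both teams' rows for every game were scraped at the same time.

```r
# Hypothetical helper: returns one TRUE/FALSE per sanity check.
check_table <- function(tbl) {
  c(
    # 3 points per win plus 1 per draw
    points_formula    = all(tbl$Points == 3 * tbl$Wins + tbl$Draws),
    # every point scored by one team is conceded by another
    scoring_balances  = sum(tbl$Scored) == sum(tbl$Conceded),
    # every win for one team is a loss for another
    wins_match_losses = sum(tbl$Wins) == sum(tbl$Losses),
    # a draw is recorded for both teams, so the total is even
    draws_are_paired  = sum(tbl$Draws) %% 2 == 0
  )
}

# Made-up table using the same column names as the summary above
toy_table <- data.frame(
  Team     = c("AAA", "BBB", "CCC"),
  Wins     = c(2, 1, 1),
  Draws    = c(1, 2, 1),
  Losses   = c(1, 1, 2),
  Scored   = c(430, 420, 410),
  Conceded = c(415, 420, 425),
  Points   = c(7, 5, 4)
)

check_table(toy_table)
```

Running `check_table(table)` on the real result is a quick way to catch a team whose page failed to scrape or a game that was classified differently for its two teams.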
But we also want to show the "Form" column that shows the 5 most recent results for each team. So we'll make that and join it into the final table, first initializing the blank "form" data frame:
form <- data.frame()
for(i in seq_along(teams)){
  # keep this team's five most recent played games,
  # then paste the results together with the most recent one first
  L5 <- schedule %>%
    filter(Team==teams[i] & !is.na(points)) %>%
    tail(5) %>%
    mutate('Last 5' = paste0(Result[5],
                             Result[4],
                             Result[3],
                             Result[2],
                             Result[1]))
  form[i,1] <- teams[i]
  form[i,2] <- L5$`Last 5`[1]
}
names(form) <- c("Team",'Form')
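The loop above assumes every team has played at least five games. Early in a season a missing game would be pasted in as the text "NA". Here is a small sketch of a helper that handles short schedules, assuming the results are in chronological order as scraped. The name `form_string` is hypothetical.

```r
# Hypothetical helper: build a form string from a team's results,
# most recent game first, using at most the last n played games.
form_string <- function(results, n = 5) {
  played <- results[!is.na(results)]       # drop games not yet played
  paste(rev(tail(played, n)), collapse = "")
}

# Made-up results in chronological order; the final NA is an unplayed game.
form_string(c("W", "L", "D", "W", "W", "L", NA))
# A team with only two games played gets a two-letter form string.
form_string(c("W", "D"))
```

Something like `tapply(schedule$Result, schedule$Team, form_string)` could then produce one form string per team without an explicit loop.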
Now we left-join the table with "form" and rearrange a couple of columns to get our full result.
full_table <- table %>%
  left_join(form, by = "Team") %>%
  # move Rank (column 10) to the front and keep Form (column 11) at the end
  select(10,1:9,11)
As of the day I'm writing this up, this is the current soccer-style table for the NBA. If the NBA relegated and promoted teams the way soccer leagues do, the Boston Celtics would be clearly winning the league, the Memphis Grizzlies, Cleveland Cavaliers, and New Orleans Pelicans would be joining them in the Champions League, and the San Antonio Spurs, Detroit Pistons, and Charlotte Hornets would be facing relegation.
Rank | Team | Matches Played | Wins | Draws | Losses | Scored | Conceded | Differential | Points | Form |
---|---|---|---|---|---|---|---|---|---|---|
1 | Boston Celtics | 30 | 20 | 5 | 5 | 3572 | 3383 | 189 | 65 | LDLLW |
2 | Memphis Grizzlies | 28 | 17 | 4 | 7 | 3261 | 3117 | 144 | 55 | WWWWW |
3 | Cleveland Cavaliers | 30 | 16 | 6 | 8 | 3327 | 3144 | 183 | 54 | WWDWL |
4 | New Orleans Pelicans | 28 | 16 | 6 | 6 | 3287 | 3129 | 158 | 54 | DLDWW |
5 | Milwaukee Bucks | 28 | 17 | 3 | 8 | 3142 | 3047 | 95 | 54 | LWLDW |
6 | Phoenix Suns | 29 | 15 | 8 | 6 | 3343 | 3211 | 132 | 53 | WLDLL |
7 | Brooklyn Nets | 30 | 15 | 5 | 10 | 3381 | 3333 | 48 | 50 | DWDWW |
8 | Philadelphia 76ers | 28 | 14 | 5 | 9 | 3120 | 3027 | 93 | 47 | WWWDL |
9 | Sacramento Kings | 28 | 14 | 5 | 9 | 3298 | 3233 | 65 | 47 | WDLLW |
10 | Denver Nuggets | 28 | 13 | 7 | 8 | 3239 | 3213 | 26 | 46 | LWWDD |
11 | Los Angeles Clippers | 31 | 13 | 6 | 12 | 3322 | 3350 | -28 | 45 | LWWWL |
12 | Golden State Warriors | 30 | 13 | 5 | 12 | 3511 | 3505 | 6 | 44 | LLLWD |
13 | New York Knicks | 29 | 12 | 6 | 11 | 3315 | 3260 | 55 | 42 | WDWWW |
14 | Toronto Raptors | 29 | 12 | 6 | 11 | 3233 | 3214 | 19 | 42 | DDLLW |
15 | Indiana Pacers | 30 | 12 | 6 | 12 | 3447 | 3474 | -27 | 42 | LWLDW |
16 | Utah Jazz | 31 | 11 | 8 | 12 | 3654 | 3589 | 65 | 41 | DWLLD |
17 | Portland Trail Blazers | 29 | 11 | 7 | 11 | 3262 | 3256 | 6 | 40 | LWWWD |
18 | Atlanta Hawks | 30 | 12 | 4 | 14 | 3417 | 3451 | -34 | 40 | WLLDL |
19 | Minnesota Timberwolves | 29 | 12 | 4 | 13 | 3313 | 3360 | -47 | 40 | DLLLW |
20 | Los Angeles Lakers | 28 | 11 | 5 | 12 | 3230 | 3255 | -25 | 38 | WDWDL |
21 | Chicago Bulls | 28 | 11 | 5 | 12 | 3157 | 3187 | -30 | 38 | LDDWW |
22 | Dallas Mavericks | 29 | 8 | 13 | 8 | 3241 | 3188 | 53 | 37 | WLWLD |
23 | Washington Wizards | 29 | 9 | 6 | 14 | 3220 | 3303 | -83 | 33 | LLLLL |
24 | Miami Heat | 30 | 7 | 11 | 12 | 3241 | 3277 | -36 | 32 | DDWLW |
25 | Oklahoma City Thunder | 29 | 8 | 6 | 15 | 3344 | 3400 | -56 | 30 | DDLLL |
26 | Houston Rockets | 28 | 8 | 3 | 17 | 3086 | 3230 | -144 | 27 | DWWLW |
27 | Orlando Magic | 30 | 7 | 5 | 18 | 3281 | 3406 | -125 | 26 | WWWWD |
28 | San Antonio Spurs | 28 | 7 | 3 | 18 | 3085 | 3370 | -285 | 24 | LDWWL |
29 | Detroit Pistons | 31 | 5 | 4 | 22 | 3441 | 3649 | -208 | 19 | LDLLL |
30 | Charlotte Hornets | 29 | 4 | 7 | 18 | 3194 | 3403 | -209 | 19 | LDLLL |
Note: I cut out the parts of my script that automatically push the results to Google Sheets. To get this table into a clean HTML format, I copied and pasted the spreadsheet result into this tool.
If you're curious, here are the top 4 teams of the past few seasons:
2021-22:
Phoenix Suns, 180 points
Memphis Grizzlies, 160 points
Miami Heat, 160 points
Golden State Warriors, 153 points
2020-21:
Utah Jazz, 155 points
Brooklyn Nets, 140 points
Los Angeles Clippers, 139 points
Philadelphia 76ers, 135 points
2019-20:
Milwaukee Bucks, 166 points
Toronto Raptors, 151 points
Los Angeles Lakers, 141 points
Boston Celtics, 138 points