Add sections to function reference #12

Open
wants to merge 20 commits into
base: master
1 change: 1 addition & 0 deletions .github/.gitignore
@@ -1 +1,2 @@

*.html
1 change: 1 addition & 0 deletions .github/workflows/pkgdown.yaml
@@ -34,3 +34,4 @@ jobs:
git config --local user.email "[email protected]"
Rscript -e 'roxygen2::roxygenize()'
Rscript -e 'pkgdown::deploy_to_branch(new_process = FALSE)'

2 changes: 1 addition & 1 deletion .github/workflows/pkgdown_test.yaml
@@ -32,4 +32,4 @@ jobs:
git config --local user.name "GitHub Actions"
Rscript -e 'roxygen2::roxygenise()'
Rscript -e 'pkgdown::deploy_to_branch(new_process = FALSE, branch = "gh-pages-test")'

1 change: 1 addition & 0 deletions .gitignore
@@ -4,3 +4,4 @@
.Ruserdata
.DS_Store
docs

6 changes: 1 addition & 5 deletions DESCRIPTION
@@ -2,7 +2,7 @@ Package: puntr
Type: Package
Title: Analysis of Punting
License: MIT
Version: 1.3
Version: 1.4
Authors@R: c(
person("Dennis", "Brookner", role = c("aut", "cre"), email = "[email protected]"),
person("Raphael", "LadenGuindon", role = "aut"))
@@ -13,12 +13,8 @@ URL: https://puntalytics.github.io/puntr, https://github.com/Puntalytics/puntr
Encoding: UTF-8
LazyData: true
Imports:
cfbfastR,
dplyr,
forcats,
ggimage,
ggplot2,
ggrepel,
glue,
magrittr,
nflfastR,
162 changes: 81 additions & 81 deletions R/college.R
@@ -1,84 +1,84 @@
#' Import college punting data
#' @description Import college punting data for seasons in the scope of \code{cfbfastR} (back to 2014).
#' This function is a wrapper around \code{cfbfastR::load_cfb_pbp}.
#' @param years A year or range of years to be scraped
#' @return A tibble \code{punts} of punts in the \code{cfbfastR} format
#' @examples
#' \dontrun{
#' import_college_punts(2018:2021)
#' #' Import college punting data
#' #' @description Import college punting data for seasons in the scope of \code{cfbfastR} (back to 2014).
#' #' This function is a wrapper around \code{cfbfastR::load_cfb_pbp}.
#' #' @param years A year or range of years to be scraped
#' #' @return A tibble \code{punts} of punts in the \code{cfbfastR} format
#' #' @examples
#' #' \dontrun{
#' #' import_college_punts(2018:2021)
#' #' }
#' #' @export
#' import_college_punts <- function(years) {
#' punts <- purrr::map_df(years, function(x){
#' cfbfastR::load_cfb_pbp(x) %>%
#' dplyr::filter(punt == 1) %>%
#' dplyr::mutate(season = x)
#' })
#' return(punts)
#' }
#' @export
import_college_punts <- function(years) {
punts <- purrr::map_df(years, function(x){
cfbfastR::load_cfb_pbp(x) %>%
dplyr::filter(punt == 1) %>%
dplyr::mutate(season = x)
})
return(punts)
}

#' Convert college data to \code{puntr} format
#'
#' @description Rename columns and process data such that the output can be plugged directly into \code{puntr::calculate_all},
#' and the output of that can be plugged directly into \code{puntr::create_mini} (or \code{puntr::create_miniY}).
#' @param punts A data frame containing punts in the cfbfastR format
#' @param power_five Logical, defaults to TRUE to include only punters from Power 5 teams
#' @return A tibble \code{punts} in a format usable for \code{puntr::calculate_all}
#' @examples
#' \dontrun{
#' college_to_pro(punts)
#' #' Convert college data to \code{puntr} format
#' #'
#' #' @description Rename columns and process data such that the output can be plugged directly into \code{puntr::calculate_all},
#' #' and the output of that can be plugged directly into \code{puntr::create_mini} (or \code{puntr::create_miniY}).
#' #' @param punts A data frame containing punts in the cfbfastR format
#' #' @param power_five Logical, defaults to TRUE to include only punters from Power 5 teams
#' #' @return A tibble \code{punts} in a format usable for \code{puntr::calculate_all}
#' #' @examples
#' #' \dontrun{
#' #' college_to_pro(punts)
#' #' }
#' #' @export
#' college_to_pro <- function(punts, power_five = TRUE) {
#'
#' punts <- punts %>%
#' #dplyr::rename(season = year) %>%
#' dplyr::mutate(GrossYards = play_text %>%
#' stringr::str_extract("punt for [:digit:]+") %>%
#' stringr::str_extract("[:digit:]+") %>%
#' as.numeric()) %>%
#' dplyr::filter(!is.na(GrossYards)) %>%
#' dplyr::mutate(return_yards = as.numeric(yds_punt_return),
#' return_yards = ifelse(is.na(return_yards),0,return_yards),
#' return_yards = ifelse(stringr::str_detect(play_text,"loss"),-1*return_yards,return_yards)) %>%
#' #dplyr::filter(!is.na(return_yards)) %>%
#' dplyr::mutate(punter_player_name = play_text %>%
#' stringr::str_extract(".+(?= punt for)")) %>%
#' dplyr::mutate(YardsFromOwnEndZone = as.integer(100 - yards_to_goal)) %>%
#' dplyr::filter(YardsFromOwnEndZone <= 70) %>%
#' dplyr::mutate(touchback = play_text %>% stringr::str_detect("ouchback")) %>%
#' dplyr::mutate(return_yards = dplyr::if_else(touchback, 0, return_yards)) %>%
#' dplyr::mutate(NetYards = GrossYards - return_yards) %>%
#' dplyr::mutate(GrossYards = dplyr::if_else(touchback, as.numeric(GrossYards-20), as.numeric(GrossYards))) %>%
#' dplyr::mutate(punt_out_of_bounds = play_text %>% stringr::str_detect("out.of.bounds")) %>%
#' dplyr::mutate(punt_fair_catch = play_text %>% stringr::str_detect("air catch")) %>%
#' dplyr::mutate(punt_downed = play_text %>% stringr::str_detect("downed")) %>%
#' dplyr::mutate(PD = dplyr::if_else(YardsFromOwnEndZone >=41, 1, 0)) %>%
#' #rename columns to avoid breaking calculate_all()
#' dplyr::rename(ep_before_cfb = ep_before,
#' ep_after_cfb = ep_after)
#'
#'
#'
#' # Pull from data-repo to avoid requiring cfbd API key
#' #team_info <- cfbfastR::cfbd_team_info()
#' team_info <- readRDS(url("https://github.com/saiemgilani/cfbfastR-data/blob/master/team_info/rds/cfb_team_info_2020.rds?raw=true"))
#'
#' if(power_five) {
#' team_info <- team_info %>%
#' dplyr::filter(conference %in% c("Pac-12","SEC","Big Ten","Big 12","ACC"))
#' }
#'
#' team_info <- team_info %>%
#' dplyr::mutate(logo = purrr::map(logos,magrittr::extract2,1),
#' logo = as.character(logo)) %>%
#' dplyr::select(school, logo, color, alt_color) %>%
#' dplyr::rename(team_abbr = school,
#' team_logo_espn = logo,
#' team_color = color,
#' team_color2 = alt_color)
#'
#' punts <- punts %>% dplyr::inner_join(team_info, by = c("pos_team" = "team_abbr"))
#'
#' return(punts)
#' }
#' @export
college_to_pro <- function(punts, power_five = TRUE) {

punts <- punts %>%
#dplyr::rename(season = year) %>%
dplyr::mutate(GrossYards = play_text %>%
stringr::str_extract("punt for [:digit:]+") %>%
stringr::str_extract("[:digit:]+") %>%
as.numeric()) %>%
dplyr::filter(!is.na(GrossYards)) %>%
dplyr::mutate(return_yards = as.numeric(yds_punt_return),
return_yards = ifelse(is.na(return_yards),0,return_yards),
return_yards = ifelse(stringr::str_detect(play_text,"loss"),-1*return_yards,return_yards)) %>%
#dplyr::filter(!is.na(return_yards)) %>%
dplyr::mutate(punter_player_name = play_text %>%
stringr::str_extract(".+(?= punt for)")) %>%
dplyr::mutate(YardsFromOwnEndZone = as.integer(100 - yards_to_goal)) %>%
dplyr::filter(YardsFromOwnEndZone <= 70) %>%
dplyr::mutate(touchback = play_text %>% stringr::str_detect("ouchback")) %>%
dplyr::mutate(return_yards = dplyr::if_else(touchback, 0, return_yards)) %>%
dplyr::mutate(NetYards = GrossYards - return_yards) %>%
dplyr::mutate(GrossYards = dplyr::if_else(touchback, as.numeric(GrossYards-20), as.numeric(GrossYards))) %>%
dplyr::mutate(punt_out_of_bounds = play_text %>% stringr::str_detect("out.of.bounds")) %>%
dplyr::mutate(punt_fair_catch = play_text %>% stringr::str_detect("air catch")) %>%
dplyr::mutate(punt_downed = play_text %>% stringr::str_detect("downed")) %>%
dplyr::mutate(PD = dplyr::if_else(YardsFromOwnEndZone >=41, 1, 0)) %>%
#rename columns to avoid breaking calculate_all()
dplyr::rename(ep_before_cfb = ep_before,
ep_after_cfb = ep_after)



# Pull from data-repo to avoid requiring cfbd API key
#team_info <- cfbfastR::cfbd_team_info()
team_info <- readRDS(url("https://github.com/saiemgilani/cfbfastR-data/blob/master/team_info/rds/cfb_team_info_2020.rds?raw=true"))

if(power_five) {
team_info <- team_info %>%
dplyr::filter(conference %in% c("Pac-12","SEC","Big Ten","Big 12","ACC"))
}

team_info <- team_info %>%
dplyr::mutate(logo = purrr::map(logos,magrittr::extract2,1),
logo = as.character(logo)) %>%
dplyr::select(school, logo, color, alt_color) %>%
dplyr::rename(team_abbr = school,
team_logo_espn = logo,
team_color = color,
team_color2 = alt_color)

punts <- punts %>% dplyr::inner_join(team_info, by = c("pos_team" = "team_abbr"))

return(punts)
}
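
The text parsing in `college_to_pro()` above leans on a few `stringr` patterns applied to `play_text`. A minimal sketch of how those patterns behave, assuming `stringr` is installed; the play descriptions are made-up strings for illustration and may not match the exact wording in `cfbfastR` data:

```r
library(stringr)

# Hypothetical play descriptions (not taken from real cfbfastR data)
play_text <- c(
  "J. Doe punt for 45 yds, fair catch by A. Smith at the 20",
  "J. Doe punt for 52 yds for a touchback",
  "Timeout Example State"
)

# Gross yards: grab "punt for <digits>", then keep only the digits
gross_yards <- as.numeric(
  str_extract(str_extract(play_text, "punt for [:digit:]+"), "[:digit:]+")
)

# Punter name: everything before " punt for" (lookahead keeps the match clean)
punter <- str_extract(play_text, ".+(?= punt for)")

# Outcome flags; the leading letter is dropped so capitalisation does not matter
touchback  <- str_detect(play_text, "ouchback")
fair_catch <- str_detect(play_text, "air catch")

# Same touchback adjustment as in the function: subtract 20 yards from gross
adjusted_gross <- ifelse(touchback, gross_yards - 20, gross_yards)

print(data.frame(punter, gross_yards, adjusted_gross, touchback, fair_catch))
```

Rows that are not punts come back as `NA` for the extracted fields, which is why the function filters on `!is.na(GrossYards)` right after the extraction.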
6 changes: 3 additions & 3 deletions R/import.R
@@ -34,9 +34,9 @@ import_punts <- function(years, local = FALSE, path = NULL) {
}

#' Import play-by-play data
#' @description Grab all play-by-play data, not just punts. This function pulls data directly from the \code{nflfastR-data} repo, and
#' is purely a wrapper for ease of use.
#' @param years A year or range of years between 1999 and 2020, inclusive
#' @description DEPRECATED: Grab all play-by-play data, not just punts, from the \code{nflfastR-data} repo. This will be much slower than
#' using the \code{nflfastR::load_pbp()} or \code{nflfastR::update_pbp()} functions.
#' @param years A year or range of years between 1999 and 2021, inclusive
#' @return A tibble \code{pbp} containing play-by-play data for the specified years
#' @examples
#' \dontrun{
25 changes: 17 additions & 8 deletions R/mini2.R
@@ -99,15 +99,24 @@ custom_summary <- function(data, ...) {
SHARP_RERUN_OF = mean(SHARP_RERUN_OF, na.rm = TRUE),
SHARP_RERUN_PD = mean(SHARP_RERUN_PD, na.rm = TRUE),
...,
team = getmode_local(posteam),
team_logo_espn = getmode_local(team_logo_espn),
team_color = getmode_local(team_color),
team_color2 = getmode_local(team_color2),
team = puntr::getmode(posteam),
team_logo_espn = puntr::getmode(team_logo_espn),
team_color = puntr::getmode(team_color),
team_color2 = puntr::getmode(team_color2),
)
return(.summary)
}

getmode_local <- function(v) {
uniqv <- unique(v)
uniqv[which.max(tabulate(match(v, uniqv)))]
#' Find the mode of a column
#' @description Hilariously, R does not have a built-in method for finding the mode of a column. \code{puntr} has included an internal
#' helper function for this purpose for a while, and now it can be yours too!
#' @param column The dataframe column (or any other list) from which you would like to extract the most frequently occurring item
#' @return The mode of \code{column}
#' @examples
#' \dontrun{
#' players_most_common_team <- getmode(punts_by_punter$posteam)
#' }
#' @export
getmode <- function(column) {
uniqv <- unique(column)
uniqv[which.max(tabulate(match(column, uniqv)))]
}
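
As a quick illustration of the newly exported `getmode()` helper, here is a sketch that copies the function body from the diff above so it runs without `puntr` installed. The team abbreviations are made-up sample data:

```r
# Standalone copy of the helper from the diff, so the snippet is self-contained
getmode <- function(column) {
  uniqv <- unique(column)
  uniqv[which.max(tabulate(match(column, uniqv)))]
}

# Hypothetical data: the team listed on each of one punter's punts
teams <- c("SEA", "LAR", "SEA", "SF", "LAR", "SEA")
most_common <- getmode(teams)
print(most_common)

# On a tie, which.max() picks the first maximum, so the value that
# appears first in the input wins
tie_result <- getmode(c("b", "a", "a", "b"))
print(tie_result)
```

The tie behaviour matters for `custom_summary()`: a punter with an even split between two teams is assigned whichever team shows up first in the data.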
23 changes: 23 additions & 0 deletions _pkgdown.yml
@@ -6,3 +6,26 @@ navbar:
data:
text: Data from past seasons
href: https://github.com/Puntalytics/puntr-data/tree/master/data
reference:
- title: Get data
desc: Import punting play-by-play data and calculate [Puntalytics metrics](https://puntalytics.github.io/metrics.html)
- contents:
- import_punts
- trust_the_process
- punt_trim
- calculate_all
- subtitle: College data
- contents:
- import_college_punts
- college_to_pro
- title: Summarize data
desc: Turn a play-by-play dataframe into a dataframe summarizing player stats
- contents:
- starts_with("by_punter")
- title: Miscellaneous
contents:
- getmode
- title: Deprecated
- contents:
- starts_with("create")
- import_seasons
9 changes: 6 additions & 3 deletions vignettes/puntr.Rmd
@@ -23,8 +23,9 @@ library(tidyverse) # always a good idea to do this too
For speed, we've already scraped (using [`nflfastR`](https://www.nflfastr.com/)) and saved punting data for the 1999-2020 seasons. The easiest thing to do is download the `puntr-data` repo [here](https://github.com/Puntalytics/puntr-data/tree/master/data), and then point `puntr::import_punts()` to your local copy of the data. You can also download the data directly each time; this takes around 15 minutes.
Import, clean, and calculate as follows:
```{r imports, message=FALSE}
#punts_raw <- import_punts(1999:2020, local=TRUE, path=your_local_path) # recommended
punts_raw <- import_punts(2018:2020) # This takes ~15 minutes
# punts_raw <- import_punts(1999:2020, local=TRUE, path=your_local_path) # recommended
# punts_raw <- import_punts(1999:2020) # This takes ~15 minutes
punts_raw <- import_punts(2018:2020)
punts_cleaned <- trust_the_process(punts_raw) # clean
punts <- calculate_all(punts_cleaned) # calculate custom Puntalytics metrics
```
@@ -83,6 +84,7 @@ Let's take a look at some of the columns in this data frame:
punters %>%
arrange(desc(pEPA)) %>%
select(punter_player_name, Gross, Net, pEPA) %>%
mutate(across(where(is.numeric), round, 3)) %>%
rmarkdown::paged_table()
```
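
The rounding step added to the vignette passes `round, 3` through the `...` of `across()`, a form that dplyr 1.1.0 and later flag as deprecated. A sketch of the equivalent anonymous-function form, using a small made-up data frame in place of the real `punters` table:

```r
library(dplyr)

# Hypothetical stand-in for the `punters` summary table
punters_demo <- tibble(
  punter_player_name = c("A. Example", "B. Sample"),
  Gross = c(45.12345, 47.98765),
  pEPA  = c(0.123456, -0.04567)
)

rounded <- punters_demo %>%
  arrange(desc(pEPA)) %>%
  # Round every numeric column to 3 decimals; character columns are untouched
  mutate(across(where(is.numeric), function(x) round(x, 3)))

print(rounded)
```

Either spelling produces the same table; the explicit function only avoids the deprecation warning on newer dplyr versions.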

@@ -95,6 +97,7 @@ which gives every unique punter season a row.
punter_seasons %>%
arrange(desc(pEPA)) %>%
select(punter_player_name, season, Gross, Net, pEPA) %>%
mutate(across(where(is.numeric), round, 3)) %>%
rmarkdown::paged_table()
```
And finally, to compare punter **games**, use
@@ -109,7 +112,7 @@ These dataframes - `punts`, `punters`, `punter_seasons` and `punter_games` - sho


## Using `puntr` with college data
***NOTE: `puntr` was successfully migrated from `cfbscrapR` to `cfbfastR` in version 1.2.2***
***NOTE: `puntr` was successfully migrated from `cfbscrapR` to `cfbfastR` in version 1.2.2***
***NOTE: The `by_` family of summary functions have not yet been tested for `cfbfastR` data, but might work.***

`puntr` can also handle punting data for college football, piggybacking off of the scraping abilities of the [`cfbfastR`](https://saiemgilani.github.io/cfbfastR/) package. You need at least 3 seasons' worth of data to run `calculate_all()`. Import and clean as follows: