August 12th, 2015

Acquiring PITCHf/x

library(pitchRx)
# returns a list of related tables (see diagram below)
dat <- scrape(start = "2008-01-01", end = Sys.Date())

Storing PITCHf/x

db <- dplyr::src_sqlite("pitchRx.sqlite3", create = TRUE)
pitchRx::scrape(start = "2008-01-01", end = Sys.Date(), connect = db$con)
  • Any database connection should work!
  • Writes data in streaming chunks to avoid exhausting memory.
  • Keeping your database up-to-date is also easy!
    update_db(db$con)
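
The idea of writing in chunks can be sketched on toy data. This is only an illustration, not pitchRx's internal code: `fake_chunk()` and the `gid_` identifiers are hypothetical stand-ins for one day of scraped pitches, and the database lives in memory (this assumes the DBI and RSQLite packages are installed).

```r
library(DBI)

# In-memory SQLite database, so nothing is written to disk
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical stand-in for one day's worth of scraped pitches
fake_chunk <- function(day) {
  data.frame(gameday_link = paste0("gid_", day),
             num = 1:3,
             px = c(-0.5, 0, 0.5),
             stringsAsFactors = FALSE)
}

days <- c("2013_04_24", "2013_04_25", "2013_04_26")
for (day in days) {
  chunk <- fake_chunk(day)
  # append = TRUE adds rows to the table (creating it on the first pass)
  dbWriteTable(con, "pitch", chunk, append = TRUE)
  rm(chunk)  # only one chunk is held in memory at a time
}

n_rows <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM pitch")$n
dbDisconnect(con)
```

Because each chunk is appended and then discarded, peak memory use depends on the chunk size rather than on the full date range.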

Animating PITCHf/x

Query/Animate PITCHf/x

Player and date information is recorded at the at-bat level.

library(dplyr)
atbats <- tbl(db, 'atbat') %>%
  filter(pitcher_name == 'Yu Darvish', batter_name == 'Albert Pujols', 
         date == '2013_04_24')
  • Join at-bats with pitches and animate!
    tbl(db, 'pitch') %>%
      inner_join(atbats, by = c('num', 'gameday_link')) %>%
      collect() %>% pitchRx::animateFX()
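
Why the join needs both keys can be seen on a toy example. The data frames below are made up for illustration (the `gid_` values and "Other Batter" are hypothetical); the point is that `num` restarts in every game, so it only identifies an at-bat together with `gameday_link`.

```r
library(dplyr)

# Toy at-bat table: one row per at-bat
atbats <- data.frame(
  gameday_link = c("gid_1", "gid_1", "gid_2"),
  num          = c(1L, 2L, 1L),
  batter_name  = c("Albert Pujols", "Other Batter", "Albert Pujols"),
  stringsAsFactors = FALSE
)

# Toy pitch table: several rows per at-bat
pitches <- data.frame(
  gameday_link = c("gid_1", "gid_1", "gid_1", "gid_2", "gid_3"),
  num          = c(1L, 1L, 2L, 1L, 1L),
  px           = c(-0.3, 0.1, 0.8, 0.0, 0.4),
  stringsAsFactors = FALSE
)

wanted <- filter(atbats, batter_name == "Albert Pujols")

# Keep only pitches belonging to the selected at-bats
joined <- inner_join(pitches, wanted, by = c("num", "gameday_link"))
```

Joining on `num` alone would also match the pitch from `gid_3`, which belongs to an at-bat that was never selected.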

Modeling called strike decisions

Inspired by Brian Mills' work

    # condition on umpire decisions
    pitches <- tbl(db, "pitch") %>%
      filter(des %in% c("Called Strike", "Ball")) %>%
      mutate(strike = as.numeric(des == "Called Strike"))
    # goal is to compare 2008 to 2014
    atbats <- tbl(db, "atbat") %>%
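      # date looks like '2013_04_24'; in SQLite, substr(date, 5, -4) returns
      # the 4 characters before position 5, i.e. the year
      mutate(year = substr(date, 5L, -4L)) %>%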
      filter(year %in% c("2008", "2014"))
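    # inner join keeps only 2008/2014 pitches; collect() pulls the rows into R for bam()
    dat <- inner_join(pitches, atbats) %>% collect()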
    library(mgcv)
    # 48 (2 x 2 x 12) surfaces!
    m <- bam(strike ~ interaction(stand, year, count) +
                s(px, pz, by = interaction(stand, year, count)),
              data = dat, family = binomial(link = 'logit'))
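
The `by =` construction can be tried without the database. The sketch below fits the same kind of model on simulated data with a single two-level factor. Everything about the simulation is an assumption made for illustration: the zone centre at (0, 2.5), the invented zone radii, and the prediction location are not estimates from PITCHf/x.

```r
library(mgcv)

set.seed(2015)
n <- 4000
sim <- data.frame(
  px   = runif(n, -2, 2),      # horizontal location (ft)
  pz   = runif(n, 0.5, 4.5),   # vertical location (ft)
  year = factor(sample(c("2008", "2014"), n, replace = TRUE))
)

# Made-up truth: strike probability decays with distance from (0, 2.5),
# with a slightly larger zone in 2014
radius <- ifelse(sim$year == "2014", 1.2, 1.0)
dist   <- sqrt(sim$px^2 + (sim$pz - 2.5)^2)
sim$strike <- rbinom(n, 1, plogis(4 * (radius - dist)))

# One intercept shift and one smooth surface per level of `year`
fit <- bam(strike ~ year + s(px, pz, by = year),
           data = sim, family = binomial(link = "logit"))

# Predicted strike probability for a low pitch over the middle of the plate
newd <- data.frame(px = 0, pz = 1.6,
                   year = factor(c("2008", "2014"), levels = levels(sim$year)))
p <- predict(fit, newd, type = "response")
```

With `by = year`, the factor also has to appear as a main effect, because each by-level smooth is centred; that is the role of the `interaction(stand, year, count)` term in the model above.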

Visualizing differences

strikeFX(dat, model = m, density1 = list(year = "2008"),
          density2 = list(year = "2014"), 
          layer = facet_grid(count ~ stand))

Middle of the plate at the knees

Aside on interval estimation

  • These intervals are approximate and point-wise, and they assume the smoothness parameters are known.
  • Wood (2006) obtains simultaneous intervals that do not condition on the smoothness parameters, using a parametric bootstrap.
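
The gap between point-wise and simultaneous intervals can be illustrated on simulated data. The sketch below is a simpler device than the parametric bootstrap mentioned above: it simulates coefficients from the approximate posterior returned by mgcv, so it still conditions on the estimated smoothness parameters. The data and the grid are made up for illustration.

```r
library(mgcv)

set.seed(1)
n <- 400
x <- runif(n)
y <- rbinom(n, 1, plogis(3 * sin(2 * pi * x)))
m <- gam(y ~ s(x), family = binomial, method = "REML")

grid <- data.frame(x = seq(0.05, 0.95, length.out = 50))
Xp   <- predict(m, grid, type = "lpmatrix")  # maps coefficients to the linear predictor
Vb   <- vcov(m)                              # posterior covariance, smoothness parameters held fixed
se   <- sqrt(rowSums((Xp %*% Vb) * Xp))      # point-wise standard errors

# Point-wise 95% critical value
crit_pw <- qnorm(0.975)

# Simulate coefficient vectors and record the largest standardized
# deviation over the whole grid for each draw
B        <- 2000
beta_sim <- rmvn(B, coef(m), Vb)                  # B x p matrix
dev      <- Xp %*% t(sweep(beta_sim, 2, coef(m))) # 50 x B deviations from the fit
max_abs  <- apply(abs(dev / se), 2, max)
crit_sim <- unname(quantile(max_abs, 0.95))
```

Using `crit_sim` in place of `crit_pw` widens the band so that it is intended to cover the whole curve at once rather than each grid point separately.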

Quantifying homefield bias in called strikes

Thank you!

  • Special thanks to:
    • Brian Mills for comments/discussions on pitchRx and GAMs.
    • Mike Lopez for the invitation.