Practical tools for exploratory web graphics

Why interactive graphics?

Why should presentation graphics be interactive?
- Helps demonstrate your point.
Why should exploratory graphics be interactive?
- Generates insight faster (so iteration time is crucial!).

Technique	Related Question(s)	Examples
Identification	What is this point/mark?	Hover for additional info
Filter	How does one group compare to another? What happened during this time period?	`shiny::selectInput()`, `shiny::sliderInput()`, click on legend entries
Zoom & pan	Is there local structure?	Click & drag to alter x/y limits
Linked highlighting	How does the marginal/joint compare to a conditional?	Linked brushing on a scatterplot matrix
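
As a rough illustration of the identification row, here is a minimal sketch that attaches a label to every point so it appears on hover. The choice of `mtcars` and the object name `p_hover` are assumptions made for this example, not part of the original slides.

```r
library(plotly)

# Identification: show the car model when hovering over a point.
# `mtcars` and `p_hover` are illustrative choices only.
p_hover <- plot_ly(
  mtcars, x = ~wt, y = ~mpg,
  type = "scatter", mode = "markers",
  text = rownames(mtcars),     # label attached to each marker
  hoverinfo = "text+x+y"       # show the label plus both coordinates
)
p_hover
```

Zoom & pan come for free here: click & drag on the rendered plot changes the x/y limits.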

Why web graphics? Portability!

Why web graphics? Composability!

Web graphics usually aren't practical for exploring data

Good for presentation (viz type is known), bad for exploration (viz type is unknown)

Identification, zoom & pan w/ ggplotly

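library(plotly)
# ~54,000 points, so use heavy transparency
p <- ggplot(diamonds, aes(price, carat, color = cut)) + 
  geom_point(alpha = 0.05)
# convert to plotly, then render with WebGL instead of SVG for speed
p %>% ggplotly() %>% toWebGL()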

Extending James' ggplot

Linked highlighting via ggplotly
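
One way to get this behavior is sketched below, assuming a plotly version that provides `highlight_key()` and `highlight()`. The dataset (`txhousing`), the hover trigger, and the highlight color are choices made for this example rather than the ones used in the talk.

```r
library(plotly)

# Declare `city` as the unit of selection: hovering on one line
# highlights every mark that shares the same city.
tx <- highlight_key(txhousing, ~city)

p_lines <- ggplot(tx, aes(date, median, group = city)) +
  geom_line(alpha = 0.3)

# Convert to plotly and turn on highlighting (hover and red are arbitrary here)
linked <- ggplotly(p_lines, tooltip = "city") %>%
  highlight(on = "plotly_hover", color = "red")
linked
```

Because the key lives on the data rather than on a single plot, several plots built from the same `tx` object can be linked to one another.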

That's great, but…

ggplot2's interface wasn't designed for interactive graphics.
ggplot2 requires data frame(s) and can be inefficient (especially for time series).
plot_ly() provides a low-level R interface to plotly.js (now open source!).
Smarter defaults, informative warnings/errors, and higher-level functionality are on their way to plot_ly().
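
To make the data frame point concrete, here is a small sketch of plot_ly() drawing a time series straight from vectors, with no data frame involved. The built-in `AirPassengers` series and the name `p_ts` are assumptions picked for illustration.

```r
library(plotly)

# plot_ly() accepts plain vectors, so a time series needs no reshaping.
# `AirPassengers` is a built-in monthly ts object, used here only as an example.
p_ts <- plot_ly(
  x = as.numeric(time(AirPassengers)),  # decimal years
  y = as.numeric(AirPassengers)         # passenger counts
) %>%
  add_lines()
p_ts
```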

Smart defaults

plot_ly(z = ~volcano)
#> No trace type specified. Applying `add_heatmap()`.
#> Read more about this trace type here -> https://plot.ly/r/reference/#heatmap

Informative warnings/errors

plot_ly(puppies = rnorm(100)) %>% add_boxplot()
#> Error: Must supply either `x` or `y` attribute
plot_ly(x = rnorm(100)) %>% add_boxplot(name = "puppies")

plot_ly loves dplyr verbs

txhousing %>%
  group_by(city) %>%
  plot_ly(x = ~date, y = ~median) %>%
  add_lines()

data-plot-pipeline

txhousing %>%
  group_by(city) %>%
  plot_ly(x = ~date, y = ~median) %>%
  add_lines(name = "Texan Cities") %>%
  filter(city == "Houston") %>%
  add_lines(name = "Houston")

data-model-plot-pipeline

library(broom)
mtcars %>%
    lm(mpg ~ wt, data = .) %>%
    # recent broom versions only return .se.fit when se_fit = TRUE
    augment(se_fit = TRUE) %>%
    mutate(l = .fitted - 1.96 * .se.fit, u = .fitted + 1.96 * .se.fit) %>%
    plot_ly(x = ~wt, y = ~mpg) %>%
    # I() sets a constant color instead of mapping it as a data value
    add_markers(color = I("black")) %>%
    add_lines(y = ~.fitted, color = I("steelblue")) %>%
    add_ribbons(ymin = ~l, ymax = ~u, color = I("gray"), alpha = 0.3)

split-apply-subplot

s <- split(mpg, mpg$drv)
plots <- lapply(s, function(d) plot_ly(d, x = ~cty, name = ~drv))
subplot(plots, nrows = length(plots), shareX = TRUE)

Thank you!

GitHub: https://github.com/cpsievert
Twitter: https://twitter.com/cpsievert
Email: cpsievert1 @ gmail dot com