
Interactive dataviz on the web with R & plotly

Carson Sievert

Slides: bit.ly/useR18

@cpsievert
cpsievert1@gmail.com
https://cpsievert.me/

Slides released under Creative Commons

1 / 74

Your turn

Open the slides bit.ly/useR18

Go to this address rstudio.cloud/project/14090


Time: 2 minutes

2 / 74

About me

  • PhD in statistics at Iowa State with Heike Hofmann & Di Cook (Dec 2016)

  • CEO of Sievert Consulting LLC (Jan 2017 - Present)

    • Services: product development, statistical modeling, and R training
    • Clients: plotly, Sandia Labs, NOAA, LL Bean
  • Expert at Library of Congress (June 2018 - Present)

    • Provide statistical expertise for DoD, BJS, and other US Federal Government projects
  • I ❤️ interactive data visualization

    • Maintain/author R 📦s: plotly, LDAvis, animint
3 / 74

Data science workflow

4 / 74

Web graphics are great for presentation!
















Sharable, portable, composable (e.g., reports, dashboards, etc.)

5 / 74

Web technologies aren't designed for this iteration!

















Follow-up questions (ignited through visualization) may rely on sophisticated computations

6 / 74

...but interactivity augments exploration


[1]: Worried about inference? See the visual (Majumder et al. 2013) and post-selection (Berk et al. 2013) inference frameworks.

7 / 74

Interactive graphics can augment exploratory analysis, but are only practical when we can iterate quickly

8 / 74

Interactive graphics can enhance presentation, but are only practical when easily distributed

9 / 74

When is a web application necessary?

10 / 74

Easier to share, scale, and maintain

11 / 74

Client-side technologies: HTML, JavaScript, CSS

12 / 74

DYK: Many R packages generate HTML/JavaScript...

...via htmlwidgets and htmltools

13 / 74

plotly can do a lot in a standalone page!

14 / 74

plotly's client-side reactivity options

  1. Graphical (database) queries

  2. Respond to plotly sliders, buttons, and dropdowns via plotly.js functions

  3. Custom JavaScript via htmlwidgets::onRender()

15 / 74


Make ggplot2 interactive

library(plotly)
p <- ggplot(txhousing) + geom_line(aes(date, median, group = city))
ggplotly(p)
17 / 74

Customize tooltip

library(plotly)
p <- ggplot(txhousing) + geom_line(aes(date, median, group = city, text = city))
ggplotly(p, tooltip = "text")
18 / 74

Highlight a key (e.g. city) column

library(plotly)
tx <- highlight_key(txhousing, ~city)
p <- ggplot(tx) + geom_line(aes(date, median, group = city, text = city))
gg <- ggplotly(p, tooltip = "text")
highlight(gg, on = "plotly_click")
19 / 74

Direct/indirect manipulation & persistent highlighting!

gg <- ggplotly(p, tooltip = "text")
highlight(gg, on = "plotly_hover", selectize = TRUE, dynamic = TRUE)
20 / 74
21 / 74

"Linking as a
database query"

22 / 74
SELECT * FROM table
WHERE city = 'South Padre Island'
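
The same query can be read as an ordinary row filter in R. A minimal sketch in base R, assuming the txhousing data that ships with ggplot2 (the variable name `selected` is only illustrative):

```r
# txhousing ships with the ggplot2 package
txhousing <- ggplot2::txhousing

# Clicking a line keyed by city behaves like this row filter
selected <- subset(txhousing, city == "South Padre Island")

# Every remaining row belongs to the selected city
unique(selected$city)
nrow(selected)
```

The key column passed to highlight_key() plays the role of the WHERE clause: whatever value is selected in one view is used to pick matching rows in every linked view.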
23 / 74

Works with 3D!

24 / 74

Works with aggregates (& mapbox)!

25 / 74

Works with list-columns & animation!

26 / 74

Works with other htmlwidgets

27 / 74

Choose Your Turn

(A) Think of a question you'd like to ask of your data via a (linked) interactive graphic (bonus: draw it)!

(B) Study the code for generating this visual. How does it work?


Time: 5 minutes

28 / 74

Texas housing prices

library(dplyr)
library(ggplot2)  # provides the txhousing data
tx <- txhousing %>%
  select(city, year, month, median) %>%
  filter(city %in% c("Galveston", "Midland", "Odessa", "South Padre Island"))
tx
#> # A tibble: 748 x 4
#>    city       year month median
#>    <chr>     <int> <int>  <dbl>
#>  1 Galveston  2000     1  95000
#>  2 Galveston  2000     2 100000
#>  3 Galveston  2000     3  98300
#>  4 Galveston  2000     4 111100
#>  5 Galveston  2000     5  89200
#>  6 Galveston  2000     6 108600
#>  7 Galveston  2000     7  99000
#>  8 Galveston  2000     8  96200
#>  9 Galveston  2000     9 104000
#> 10 Galveston  2000    10 118800
#> # ... with 738 more rows

How does price differ across cities?

29 / 74

Price versus month, by city & year

library(ggplot2)
ggplot(tx, aes(month, median, group = year)) +
  geom_line() +
  facet_wrap(~city, ncol = 2)
30 / 74

Query specific years

library(plotly)
TX <- highlight_key(tx, ~year)
p <- ggplot(TX, aes(month, median, group = year)) +
  geom_line() +
  facet_wrap(~city, ncol = 2)
ggplotly(p, tooltip = "year")

31 / 74

Set selection mode and default selections

highlight(.Last.value, on = "plotly_hover", defaultValues = 2006)

32 / 74

Make comparisons with dynamic brush

highlight(.Last.value, dynamic = TRUE, persistent = TRUE, selectize = TRUE)

33 / 74

Customize the appearance of selections

highlight(
  .Last.value, dynamic = TRUE, persistent = TRUE,
  selected = attrs_selected(mode = "markers+lines", marker = list(symbol = "x"))
)

34 / 74

Automate queries via animation

p <- ggplot(tx, aes(month, median)) +
  geom_line(aes(group = year), alpha = 0.2) +
  geom_line(aes(frame = year), color = "red") +
  facet_wrap(~city, ncol = 2)
ggplotly(p)

35 / 74

Generally useful for comparing within/across panels!

36 / 74

Your Turn

Run the following R code to generate the soccer visualization:

demo("crosstalk-highlight-epl-2", package = "plotly")
  1. Compare the performance of 'Liverpool' with 'Chelsea'.
  2. Run plotly_json(). What does this return?

Time: 3 minutes

37 / 74

Inspect the underlying data

plotly_json() returns the underlying JSON of any plotly graph -- nice way to learn how ggplotly() maps to plotly.js!
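
As a minimal sketch of what that inspection can look like outside the JSON viewer: the example below assumes the mtcars data and uses `jsonedit = FALSE` so the JSON comes back as text, which is then parsed with jsonlite. The names `p`, `json`, and `fig` are illustrative.

```r
library(plotly)

# A small graph to inspect; plot_ly() infers the trace type here
p <- plot_ly(mtcars, x = ~wt, y = ~mpg)

# jsonedit = FALSE returns the JSON text instead of opening a viewer
json <- plotly_json(p, jsonedit = FALSE)

# Parse the JSON back into an R list to explore the plotly.js spec
fig <- jsonlite::fromJSON(json, simplifyVector = FALSE)
names(fig)           # top-level pieces, including "data" and "layout"
fig$data[[1]]$type   # the trace type that was inferred
```

The same call works on a ggplotly() result, which is a quick way to see which traces a given geom was translated into.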

38 / 74
39 / 74

Want to work with plotly.js directly?

schema() returns the official plotly.js figure reference tied to the R package
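
A short sketch of browsing that reference programmatically rather than through the viewer. It assumes `jsonedit = FALSE` so that schema() simply returns the nested list; the name `s` is illustrative.

```r
library(plotly)

# jsonedit = FALSE skips the interactive viewer and returns the list
s <- schema(jsonedit = FALSE)

# Trace types known to the bundled plotly.js
head(names(s$traces))

# Attributes that a scatter trace accepts (e.g., mode, marker, line)
head(names(s$traces$scatter$attributes))
```

Layout attributes live under `s$layout$layoutAttributes`, which is the same path a later slide uses to look up the available mapbox styles.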

40 / 74

plotly is much more than ggplotly()!

Initiate a plotly graph (without ggplot2):

  • plot_ly(): 'flexible' interface to plotly.js
  • plot_mapbox(): plot_ly() wrapper/shortcut for scattermapbox
  • plot_geo(): plot_ly() wrapper/shortcut for scattergeo

Add data (i.e., traces) to a graph

Modify a graph (before printing)

  • style() to modify traces of an existing plotly graph
  • layout() to add or modify a layout component

Tools for talking with plotly cloud

  • Send data and/or graphs with api_create()
  • Retrieve data and/or graphs with api_download_plot()/api_download_grid()
  • Do anything plotly's server API supports with api()
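
To make the "modify a graph" functions concrete, here is a hedged sketch that applies style() and layout() to a basic scatterplot. The data (mtcars), the colors, and the object names `p`, `p2`, and `b` are assumptions chosen for illustration.

```r
library(plotly)

# Start from a basic scatterplot
p <- plot_ly(mtcars, x = ~wt, y = ~mpg) %>% add_markers()

# style() edits trace attributes; layout() edits layout attributes
p2 <- p %>%
  style(marker = list(color = "red", size = 10), traces = 1) %>%
  layout(title = "Weight vs. miles per gallon")

# Build the figure to confirm both changes landed in the spec
b <- plotly_build(p2)
b$x$data[[1]]$marker$color
b$x$layout$title
```

The `traces` argument restricts style() to specific traces by index, which is handy for ggplotly() output where one ggplot2 layer can become several traces.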
41 / 74

'Smart' trace type defaults

subplot(
  plot_ly(diamonds, x = ~cut, y = ~clarity),
  plot_ly(diamonds, x = ~cut, color = ~clarity),
  nrows = 2, shareX = TRUE
)
42 / 74

Your Turn

Use plotly_json() to study how the R code (on the last slide) maps to JSON.

NOTE: We don't have time to cover plot_ly() in depth. Learn more in the plotly cookbook chapter of the plotly for R book.


Time: 3 minutes

43 / 74

Aggregating selections

44 / 74
library(plotly)
d <- highlight_key(mpg)
dots <- plot_ly(d, color = ~class, x = ~displ, y = ~cyl)
boxs <- plot_ly(d, color = ~class, x = ~class, y = ~cty) %>%
  add_boxplot()
bars <- plot_ly(d, colors = "Set1", x = ~class, color = ~class)
subplot(dots, boxs, titleX = TRUE, titleY = TRUE) %>%
  subplot(bars, nrows = 2, titleX = TRUE, titleY = TRUE) %>%
  layout(
    barmode = "overlay",
    showlegend = FALSE
  ) %>%
  highlight("plotly_selected")
45 / 74

Aggregating selections (continued)

46 / 74
library(plotly)
d <- highlight_key(mtcars)
sp <- plot_ly(d, x = ~mpg, y = ~disp) %>%
  add_markers(color = I("black"))
# 'statistical trace types'
hist <- plot_ly(d, x = ~factor(cyl)) %>%
  add_histogram(color = I("black"))
box <- plot_ly(d, y = ~disp, color = I("black")) %>%
  add_boxplot(name = " ")
violin <- plot_ly(d, y = ~disp, color = I("black")) %>%
  add_trace(type = "violin", name = " ")
subplot(sp, box, violin, shareY = TRUE, titleX = TRUE, titleY = TRUE) %>%
  subplot(hist, widths = c(.75, .25), titleX = TRUE, titleY = TRUE) %>%
  layout(
    barmode = "overlay",
    title = "Click and drag scatterplot",
    showlegend = FALSE
  ) %>%
  highlight("plotly_selected")
47 / 74

Aggregating selections (continued)

48 / 74
library(plotly)
tx <- highlight_key(txhousing, ~city)
p1 <- ggplot(tx, aes(date, median, group = city)) + geom_line()
gg1 <- ggplotly(p1, tooltip = c("city", "date", "median"))
p2 <- plot_ly(tx, x = ~median, color = I("black")) %>%
  add_histogram(histnorm = "probability density")
subplot(gg1, p2, titleX = TRUE, titleY = TRUE) %>%
  layout(barmode = "overlay") %>%
  highlight(dynamic = TRUE, selected = attrs_selected(opacity = 0.3))
49 / 74

Talk with other htmlwidgets

library(plotly)
library(leaflet)
sd <- highlight_key(quakes)
p <- plot_ly(sd, x = ~depth, y = ~mag) %>%
  add_markers(alpha = 0.5) %>%
  highlight("plotly_selected")
map <- leaflet(sd) %>% addTiles() %>% addCircles()
crosstalk::bscols(p, map)
50 / 74

Linking plotly with DT

library(plotly)
data(trails, package = "mapview")
tsd <- highlight_key(trails)
crosstalk::bscols(
  plot_mapbox(tsd, text = ~FKN, hoverinfo = "text"),
  DT::datatable(tsd)
)
51 / 74

Expectations vs reality





plotly has advanced support for highlight events (e.g., persistent, dynamic, selectize)

Other crosstalk-enabled htmlwidgets likely won't respect (non-default) highlight() options.

However, filter events should generally be supported.

52 / 74

Filter vs highlight

Highlight events dim the opacity of existing marks.

Filter events completely remove existing marks and rescale axes.

At least currently, filter events must be fired from crosstalk widgets.

53 / 74
54 / 74

Crosstalk's filtering widgets

library(plotly)  # provides highlight_key() and loads ggplot2 (txhousing)
library(crosstalk)
tx <- highlight_key(txhousing)
widgets <- bscols(
  widths = c(12, 12, 12),
  filter_select("city", "Cities", tx, ~city),
  filter_slider("sales", "Sales", tx, ~sales),
  filter_checkbox("year", "Years", tx, ~year, inline = TRUE)
)
widgets
55 / 74

Filtering

bscols(
  widths = c(4, 8), widgets,
  plot_ly(tx, x = ~date, y = ~median, showlegend = FALSE) %>%
    add_lines(color = ~city, colors = "black")
)
56 / 74

Your turn

  1. Use htmlwidgets::saveWidget() to save a plotly graph (e.g. plot_ly()). What's the size of the HTML file it creates?
  2. Use htmltools::save_html() to save the plotly+leaflet example. What's the size of the HTML file it creates?
  3. What's the difference between using saveWidget() and save_html()? When is one preferred to the other?

Time: 5 minutes

57 / 74
58 / 74

plotly's client-side reactivity options

  1. Graphical (database) queries

  2. Respond to plotly sliders, buttons, and dropdowns via plotly.js functions

  3. Custom JavaScript via htmlwidgets::onRender()

59 / 74
60 / 74

The implementation

library(plotly)
# one button per mapbox style listed in the plotly.js schema
styles <- schema()$layout$layoutAttributes$mapbox$style$values
style_buttons <- lapply(styles, function(s) {
  list(label = s, method = "relayout", args = list("mapbox.style", s))
})
storms <- sf::st_read(system.file("shape/storms_xyz.shp", package = "sf"), quiet = TRUE)
plot_mapbox(storms, color = I("red")) %>%
  layout(
    title = "Changing the base layer",
    updatemenus = list(list(y = 0.8, buttons = style_buttons))
  )
61 / 74

plotly's client-side reactivity options

  1. Graphical (database) queries

  2. Respond to plotly sliders, buttons, and dropdowns via plotly.js functions

  3. Custom JavaScript via htmlwidgets::onRender()

62 / 74

Google search on click

The customdata attribute provides a way to attach "meta-data" to visual attributes that you can access with JavaScript

library(plotly)
plot_ly(mtcars, x = ~wt, y = ~mpg) %>%
  add_markers(customdata = ~paste0("http://google.com/#q=", rownames(mtcars))) %>%
  htmlwidgets::onRender("function(el, x) {
    // open the URL stored in customdata for the clicked point
    el.on('plotly_click', function(d) {
      var url = d.points[0].customdata;
      window.open(url);
    });
  }")
63 / 74

Demo

In the RStudio cloud project, open the 'customdata.R' script:

file.edit("~/customdata.R")
64 / 74

Hello 👋

shiny::runApp("~/tutorials/20180711/shiny/01", display.mode = "showcase")
65 / 74

Accessing plotly user events

shiny::runApp("~/tutorials/20180711/shiny/02", display.mode = "showcase")
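
The tutorial app lives in the RStudio Cloud project, so the following is only a self-contained sketch of the general pattern, not the app shown on the slide. It assumes the mtcars data; the output IDs ("plot", "click") and the source label "cars" are made up for illustration.

```r
library(shiny)
library(plotly)

ui <- fluidPage(
  plotlyOutput("plot"),
  verbatimTextOutput("click")
)

server <- function(input, output, session) {
  output$plot <- renderPlotly({
    # source ties this graph's events to the event_data() call below
    plot_ly(mtcars, x = ~wt, y = ~mpg, source = "cars") %>% add_markers()
  })

  output$click <- renderPrint({
    # NULL until the user clicks a point, then a data frame describing it
    d <- event_data("plotly_click", source = "cars")
    if (is.null(d)) "Click a point" else d
  })
}

app <- shinyApp(ui, server)
# Launch interactively with: runApp(app)
```

Because event_data() is reactive, any output that calls it re-runs whenever the corresponding event fires in the browser.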
66 / 74

Your turn

  1. Modify the last app to use plot_ly() instead of ggplotly()

  2. Add output blocks that print out data from the following events:

  • "plotly_hover"
  • "plotly_click"
  • "plotly_relayout"

Time: 5 minutes

67 / 74

Targeting events

shiny::runApp("~/tutorials/20180711/shiny/03", display.mode = "showcase")
68 / 74

plotly proxies

By default, shiny updates require a full redraw, but proxies allow us to leverage the plotly.js API to modify/update graphs more efficiently

shiny::runApp("~/tutorials/20180711/shiny/04", display.mode = "showcase")
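
As a rough sketch of the proxy pattern (again, not the tutorial app itself): the graph is rendered once, and an observer then sends a relayout call to the existing graph instead of re-rendering it. The input ID "title", the output ID "plot", and the name `proxy_app` are hypothetical.

```r
library(shiny)
library(plotly)

ui <- fluidPage(
  textInput("title", "Plot title", "Weight vs. mpg"),
  plotlyOutput("plot")
)

server <- function(input, output, session) {
  # Drawn once; this expression does not depend on input$title
  output$plot <- renderPlotly({
    plot_ly(mtcars, x = ~wt, y = ~mpg) %>% add_markers()
  })

  # When the title changes, invoke plotly.js relayout on the existing graph
  observeEvent(input$title, {
    plotlyProxy("plot", session) %>%
      plotlyProxyInvoke("relayout", list(title = input$title))
  })
}

proxy_app <- shinyApp(ui, server)
# Launch interactively with: runApp(proxy_app)
```

Had the title been read inside renderPlotly(), every keystroke would rebuild and redraw the whole figure; the proxy sends only the changed attribute to the browser.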
69 / 74

Streaming data

shiny::runApp("~/tutorials/20180711/shiny/05", display.mode = "showcase")
70 / 74

Your turn

Open the last example

file.edit("~/tutorials/20180711/shiny/05/app.R")

Modify it to do the following:

  1. Add sliderInput() for controlling the streaming interval.

  2. Add widgets to:

    • Change only the width of the line
    • Change only the color of the line
    • Only add/remove markers (e.g. points) for each data point

Time: 10 minutes

Hint: some of these shiny example apps will be helpful (e.g. proxy_restyle_economics) -- https://github.com/ropensci/plotly/tree/master/inst/examples/shiny

71 / 74

Your turn

Remember this "your turn"? Let's try to implement it!

72 / 74

Ask me anything!!

74 / 74
