class: middle center hide-slide-number monash-bg-gray80

.info-box.w-50.bg-white[
These slides are viewed best by Chrome or Firefox and occasionally need to be refreshed if elements did not load properly. See <a href=lecture-10A.pdf>here for the PDF <i class="fas fa-file-pdf"></i></a>.
]

<br>

.white[Press the **right arrow** to progress to the next slide!]

---
class: title-slide
count: false
background-image: url("images/bg-12.png")

# .monash-blue[ETC5521: Exploratory Data Analysis]

<h1 class="monash-blue" style="font-size: 30pt!important;"></h1>

<br>

<h2 style="font-weight:900!important;">Exploring data having a space and time context</h2>

.bottom_abs.width100[

Lecturer: *Di Cook*

<i class="fas fa-envelope"></i> ETC5521.Clayton-x@monash.edu

<i class="fas fa-calendar-alt"></i> Week 10 - Session 1

<br>

]

<style type="text/css">
.gray80 { color: #505050!important; font-weight: 300; }
.bg-gray80 { background-color: #DCDCDC!important; }
</style>

---
background-image: \url(images/week10A/southwestrocks.jpg)
background-position: 20% 80%
class: middle center

# .monash-white[You show me continents, I see the islands,]

# .monash-white[You count the centuries, I blink my eyes]

<br>
<br>
<br>
<br>
<br>
<br>

.monash-white[[Björk](https://www.bjork.com)]

---
# Outline

### .monash-orange2[First part]

- Breaking up data by time, and by space
- Maps of space over time
- Exploring time over space with glyph maps

<br>

### .monash-orange2[Second part]

- Capturing spatial trend
- Bending the choropleth map
- A flashback to the 1970s: Tukey's median polish

---
# .orange[Case study] .bg-orange.circle[1] Temperature change in Americas

.panelset[
.panel[.panel-name[data]

6 years of monthly measurements of a 24x24 spatial grid from Central America collated by Paul Murrell, U. Auckland.

.s400[

| time | y | x | lat | long | date | cloudhigh | cloudlow | cloudmid | ozone | pressure | surftemp | temperature | id | day | month | year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | -21.2 | -113.80000 | 1995-01-01 | 0.5 | 31.0 | 2.0 | 260 | 1000 | 297.4 | 296.9 | 1-1 | 0 | 1 | 1995 |
| 1 | 1 | 2 | -21.2 | -111.29565 | 1995-01-01 | 1.5 | 31.5 | 2.5 | 260 | 1000 | 297.4 | 296.5 | 2-1 | 0 | 1 | 1995 |
| 1 | 1 | 3 | -21.2 | -108.79130 | 1995-01-01 | 1.5 | 32.5 | 3.5 | 260 | 1000 | 297.4 | 296.0 | 3-1 | 0 | 1 | 1995 |
| 1 | 1 | 4 | -21.2 | -106.28696 | 1995-01-01 | 1.0 | 39.0 | 4.0 | 258 | 1000 | 296.9 | 296.5 | 4-1 | 0 | 1 | 1995 |
| 1 | 1 | 5 | -21.2 | -103.78261 | 1995-01-01 | 0.5 | 48.0 | 4.5 | 258 | 1000 | 296.5 | 295.5 | 5-1 | 0 | 1 | 1995 |
| 1 | 1 | 6 | -21.2 | -101.27826 | 1995-01-01 | 0.0 | 50.0 | 2.5 | 258 | 1000 | 296.5 | 295.0 | 6-1 | 0 | 1 | 1995 |
| 1 | 1 | 7 | -21.2 | -98.77391 | 1995-01-01 | 0.0 | 51.0 | 4.5 | 256 | 1000 | 295.5 | 295.5 | 7-1 | 0 | 1 | 1995 |
| 1 | 1 | 8 | -21.2 | -96.26957 | 1995-01-01 | 0.0 | 52.5 | 5.0 | 258 | 1000 | 295.5 | 295.0 | 8-1 | 0 | 1 | 1995 |
| 1 | 1 | 9 | -21.2 | -93.76522 | 1995-01-01 | 0.5 | 54.0 | 8.5 | 256 | 1000 | 295.0 | 295.0 | 9-1 | 0 | 1 | 1995 |
| 1 | 1 | 10 | -21.2 | -91.26087 | 1995-01-01 | 1.0 | 56.0 | 11.5 | 258 | 1000 | 294.6 | 294.6 | 10-1 | 0 | 1 | 1995 |
| 1 | 1 | 11 | -21.2 | -88.75652 | 1995-01-01 | 1.0 | 56.0 | 11.5 | 258 | 1000 | 294.6 | 294.6 | 11-1 | 0 | 1 | 1995 |
| 1 | 1 | 12 | -21.2 | -86.25217 | 1995-01-01 | 1.0 | 60.5 | 12.5 | 254 | 1000 | 294.1 | 294.6 | 12-1 | 0 | 1 | 1995 |
| 1 | 1 | 13 | -21.2 | -83.74783 | 1995-01-01 | 1.5 | 57.0 | 19.5 | 254 | 1000 | 293.2 | 294.6 | 13-1 | 0 | 1 | 1995 |
| 1 | 1 | 14 | -21.2 | -81.24348 | 1995-01-01 | 2.5 | 57.0 | 20.0 | 252 | 1000 | 293.2 | 294.6 | 14-1 | 0 | 1 | 1995 |
| 1 | 1 | 15 | -21.2 | -78.73913 | 1995-01-01 | 1.5 | 53.5 | 22.0 | 254 | 1000 | 293.2 | 294.6 | 15-1 | 0 | 1 | 1995 |
| 1 | 1 | 16 | -21.2 | -76.23478 | 1995-01-01 | 1.0 | 51.5 | 24.5 | 254 | 995 | 293.6 | 295.5 | 16-1 | 0 | 1 | 1995 |
| 1 | 1 | 17 | -21.2 | -73.73043 | 1995-01-01 | 1.0 | 52.0 | 16.5 | 256 | 1000 | 295.0 | 296.0 | 17-1 | 0 | 1 | 1995 |
| 1 | 1 | 18 | -21.2 | -71.22609 | 1995-01-01 | 1.5 | 37.5 | 8.0 | 256 | 995 | 296.9 | 296.9 | 18-1 | 0 | 1 | 1995 |
| 1 | 1 | 19 | -21.2 | -68.72174 | 1995-01-01 | 14.0 | 12.5 | 22.5 | 258 | 690 | 294.6 | 288.3 | 19-1 | 0 | 1 | 1995 |
| 1 | 1 | 20 | -21.2 | -66.21739 | 1995-01-01 | 29.5 | 7.0 | 30.0 | 256 | 680 | 292.7 | 288.8 | 20-1 | 0 | 1 | 1995 |

]
]
.panel[.panel-name[R]

```r
data(nasa)
nasa %>% slice_head(n=20) %>% gt()
```

]
]

---
# Spatiotemporal object in R: cubble

.flex[
.w-45[
.s300.f5[

```r
nasa_cb <- as_cubble(as_tibble(nasa),
                     key=id,
                     index=time,
                     coords=c(long, lat))
nasa_cb
```

```
## # cubble:   key: id [576], index: time, nested form
## # spatial:  [-113.8, -21.2, -56.2, 36.2], Missing CRS!
## # temporal: time [int], date [dttm], cloudhigh [dbl], cloudlow [dbl], cloudmid [dbl], ozone [int], pressure [int], surftemp [dbl], temperature [dbl], day [dbl], month [dbl], year [dbl]
##        y     x   lat   long id    ts
##    <int> <int> <dbl>  <dbl> <chr> <list>
##  1     1     1 -21.2 -114.  1-1   <tibble [72 × 12]>
##  2     1     2 -21.2 -111.  2-1   <tibble [72 × 12]>
##  3     1     3 -21.2 -109.  3-1   <tibble [72 × 12]>
##  4     1     4 -21.2 -106.  4-1   <tibble [72 × 12]>
##  5     1     5 -21.2 -104.  5-1   <tibble [72 × 12]>
##  6     1     6 -21.2 -101.  6-1   <tibble [72 × 12]>
##  7     1     7 -21.2  -98.8 7-1   <tibble [72 × 12]>
##  8     1     8 -21.2  -96.3 8-1   <tibble [72 × 12]>
##  9     1     9 -21.2  -93.8 9-1   <tibble [72 × 12]>
## 10     1    10 -21.2  -91.3 10-1  <tibble [72 × 12]>
## # ℹ 566 more rows
```

]

<img src="https://huizezhang-sherry.github.io/cubble/reference/figures/cubble-operations.png" style="width:500px">

]
.w-5[
.white[space]
]
.w-45[
.s300.f5[

```r
nasa_cb %>% face_temporal()
```

```
## # cubble:   key: id [576], index: time, long form
## # temporal: 1 -- 72 [1], no gaps
## # spatial:  y [int], x [int], lat [dbl], long [dbl]
##    id     time date                cloudhigh cloudlow cloudmid ozone pressure surftemp temperature   day month  year
##    <chr> <int> <dttm>                  <dbl>    <dbl>    <dbl> <int>    <int>    <dbl>       <dbl> <dbl> <dbl> <dbl>
##  1 1-1       1 1995-01-01 00:00:00       0.5     31        2     260     1000     297.        297.     0     1  1995
##  2 1-1       2 1995-02-01 00:00:00       1       33.5      3     254     1000     299.        298.    31     2  1995
##  3 1-1       3 1995-03-01 00:00:00       2       25.5      4     254     1000     298.        298.    59     3  1995
##  4 1-1       4 1995-04-01 00:00:00       4       33.5      5.5   244     1000     299.        299.    90     4  1995
##  5 1-1       5 1995-05-01 00:00:00       6.5     36.5     14     250     1000     298.        297.   120     5  1995
##  6 1-1       6 1995-06-01 00:00:00       3.5     36       10     256     1000     295         296.   151     6  1995
##  7 1-1       7 1995-07-01 00:00:00       1       42.5      6     268     1000     296.        296    181     7  1995
##  8 1-1       8 1995-08-01 00:00:00       0       45.5      1     276     1000     295.        294.   212     8  1995
##  9 1-1       9 1995-09-01 00:00:00       0       43.5      3.5   274     1000     295         294.   243     9  1995
## 10 1-1      10 1995-10-01 00:00:00       1       40        5     282     1000     295         295.   273    10  1995
## # ℹ 41,462 more rows
```

]

<br><br>

Like a `tibble` but can .monash-blue2[pivot back and forth between spatial and temporal] components.

]
]

---
# .orange[Case study] .bg-orange.circle[1] Temperature change in Americas

.flex[
.w-50[
.panelset[
.panel[.panel-name[plot]

<img src="images/lecture-10A/spatial-1.png" width="90%" style="display: block; margin: auto;" />

]
.panel[.panel-name[R]
.f5[

```r
nasa_cb %>%
  ggplot() +
  geom_point(aes(x=long, y=lat)) +
  geom_point(data=filter(nasa_cb, x==5, y==20),
             aes(x=long, y=lat),
             colour="orange", size=4) +
  geom_point(data=filter(nasa_cb, x==20, y==2),
             aes(x=long, y=lat),
             colour="turquoise", size=4)
```

]
]
]
]
.w-50[
.panelset[
.panel[.panel-name[plot]

<img src="images/lecture-10A/temporal-1.png" width="90%" style="display: block; margin: auto;" />

]
.panel[.panel-name[R]
.f5[

```r
nasa_cb_f <- nasa_cb %>% face_temporal()
ggplot(nasa_cb_f) +
  geom_line(aes(x=date, y=surftemp, group=id), alpha=0.2) +
  geom_line(data=filter(nasa_cb_f , id=="5-20"),
            aes(x=date, y=surftemp, group=id),
            colour="orange", linewidth=2) +
  geom_line(data=filter(nasa_cb_f , id=="20-2"),
            aes(x=date, y=surftemp, group=id),
            colour="turquoise", linewidth=2)
```

]
]
]
]
]

---
# Pre-processing of time and space

<br>
<br>
<center>
.info-box[Think of .monash-orange2[time] and .monash-orange2[space] as .monash-orange2[ordered categorical variables].]
</center>

<br>
<br>

- Time may need to be converted to categories.
- Spatial variable *might* need to be discretised, or gridded.

<br>
<br>

For the nasa data, this is already done.
Time is an integer from 1 to 72 (6 years of 12 months), and is also provided as a date, month and year. Space is a 24x24 grid of longitude and latitude, also provided as integers from 1 to 24 in both x and y.

---
# Slice in time and create a spatial map

.panelset[
.panel[.panel-name[plot]

<img src="images/lecture-10A/raster-1.png" width="40%" style="display: block; margin: auto;" />

]
.panel[.panel-name[learn]

<br>
<br>

In January 1995, temperatures are

- cool over land in the north
- cool over the Andes in South America
- warm on the equator, and along the coastline

There are 12*6=72 maps to make!!

]
.panel[.panel-name[R]
.f5[

```r
# Get the map
sth_america <- map_data("world") %>%
  filter(between(long, -115, -53), between(lat, -20.5, 41))
nasa_cb %>%
  face_temporal() %>%
  filter(month == 1, year == 1995) %>%
  select(id, time, surftemp) %>%
  unfold(long, lat) %>%
  ggplot() +
  geom_tile(aes(x=long, y=lat, fill=surftemp)) +
  geom_path(data=sth_america,
            aes(x=long, y=lat, group=group),
            colour="white", linewidth=1) +
  scale_fill_viridis_c("", option = "magma") +
  theme_map() +
  theme(legend.position = "bottom") +
  ggtitle("January 1995")
```

]
]
]

---
# Explore spatial trend over time

.panelset[
.panel[.panel-name[plot]

<img src="images/lecture-10A/space_time-1.png" width="75%" style="display: block; margin: auto;" />

]
.panel[.panel-name[learn]

<br><br>

- Spatial trend over time is explored by .monash-blue2[faceting the maps by time].
- Can you see El Nino in 1997? Can you see summer vs winter in the different hemispheres?
]
.panel[.panel-name[R]
.f5[

```r
nasa_cb %>%
  face_temporal() %>%
  select(id, time, month, year, surftemp) %>%
  unfold(long, lat) %>%
  ggplot() +
  geom_tile(aes(x=long, y=lat, fill=surftemp)) +
  facet_grid(year~month) +
  scale_fill_viridis_c("", option = "magma") +
  theme_map() +
  theme(legend.position = "bottom")
```

]
]
]

---
# glyphmap: time across space

.flex[
.w-50[
.panelset[
.panel[.panel-name[plot]

<img src="images/lecture-10A/time_space-1.png" width="90%" style="display: block; margin: auto;" />

]
.panel[.panel-name[learn]

<br><br>

.monash-blue2[Global scale]: temperature min/max across all spatial locations and time

- Different seasonality at different locations, particularly further north and further south.

]
.panel[.panel-name[R]
.f5[

```r
nasa_cb %>%
  face_temporal() %>%
  select(id, time, month, year, surftemp) %>%
  unfold(long, lat) %>%
  ggplot() +
  geom_polygon(data=sth_america,
               aes(x=long, y=lat, group=group),
               fill="#014221", alpha=0.2, colour="#ffffff") +
  cubble::geom_glyph_box(data=nasa,
                         aes(x_major = long, x_minor = day,
                             y_major = lat, y_minor = surftemp),
                         fill=NA) +
  cubble::geom_glyph(data=nasa,
                     aes(x_major = long, x_minor = day,
                         y_major = lat, y_minor = surftemp)) +
  theme_map()
```

]
]
]
]
.w-50[
.panelset[
.panel[.panel-name[plot]

<img src="images/lecture-10A/time_space2-1.png" width="90%" style="display: block; margin: auto;" />

]
.panel[.panel-name[learn]

<br><br>

.monash-blue2[Local scale]: min/max for each spatial location

- El Nino year in equatorial region may be visible.
- Notice also odd patterns on the west (Andes mountains) of South America.
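
The difference between the global and local scales can be sketched on toy data. This is only an illustration of the idea, not how `cubble` computes its glyphs: the `toy` data and the `rescale01()` helper are assumptions made up for this example.

```r
library(dplyr)

# Toy data (an assumption, not the nasa data): two locations, 12 months each.
toy <- tibble(
  id       = rep(c("cool", "warm"), each = 12),
  month    = rep(1:12, times = 2),
  surftemp = c(280 + 5 * sin(2 * pi * (1:12) / 12),  # larger seasonal swing
               295 + 1 * sin(2 * pi * (1:12) / 12))  # smaller seasonal swing
)

# Hypothetical helper: map values linearly onto [0, 1].
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

scaled <- toy %>%
  mutate(global = rescale01(surftemp)) %>%  # one min/max over all locations
  group_by(id) %>%
  mutate(local = rescale01(surftemp)) %>%   # separate min/max per location
  ungroup()

# How much of the [0, 1] glyph height does each location use?
ranges <- scaled %>%
  group_by(id) %>%
  summarise(global_range = max(global) - min(global),
            local_range  = max(local) - min(local))
ranges
```

On the global scale the location with the small swing uses only a sliver of the glyph height, so magnitudes can be compared across locations. On the local scale every location fills its glyph, so shapes can be compared but magnitudes cannot.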
]
.panel[.panel-name[R]
.f5[

```r
nasa_cb %>%
  face_temporal() %>%
  select(id, time, month, year, surftemp) %>%
  unfold(long, lat) %>%
  ggplot() +
  geom_polygon(data=sth_america,
               aes(x=long, y=lat, group=group),
               fill="#014221", alpha=0.2, colour="#ffffff") +
  cubble::geom_glyph_box(data=nasa,
                         aes(x_major = long, x_minor = day,
                             y_major = lat, y_minor = surftemp),
                         fill=NA) +
  cubble::geom_glyph(data=nasa,
                     aes(x_major = long, x_minor = day,
                         y_major = lat, y_minor = surftemp),
                     global_rescale = FALSE) +
  theme_map()
```

]
]
]
]
]

---
# glyphmap: time across space

.flex[
.w-50[
.panelset[
.panel[.panel-name[plot]

<img src="images/lecture-10A/unnamed-chunk-12-1.png" width="90%" style="display: block; margin: auto;" />

]
.panel[.panel-name[learn]

<br><br>

.monash-blue2[Global scale]: temperature min/max across all spatial locations and time

- Different seasonality at different locations, particularly further north and further south.

]
.panel[.panel-name[R]
.f5[

```r
nasa_cb %>%
  face_temporal() %>%
  select(id, time, month, year, surftemp) %>%
  unfold(long, lat) %>%
  ggplot() +
  geom_polygon(data=sth_america,
               aes(x=long, y=lat, group=group),
               fill="#014221", alpha=0.2, colour="#ffffff") +
  cubble::geom_glyph_box(data=nasa,
                         aes(x_major = long, x_minor = day,
                             y_major = lat, y_minor = surftemp),
                         fill=NA) +
  cubble::geom_glyph(data=nasa,
                     aes(x_major = long, x_minor = day,
                         y_major = lat, y_minor = surftemp)) +
  theme_map()
```

]
]
]
]
.w-50[
.panelset[
.panel[.panel-name[plot]

<img src="images/lecture-10A/time_space3-1.png" width="90%" style="display: block; margin: auto;" />

]
.panel[.panel-name[learn]

<br><br>

.monash-blue2[Polar coordinates]

- Seasonal pattern may be more visible.
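
A rough sketch of why a polar glyph can make seasonality easier to see: the within-year position becomes an angle, so the same month in different years lands at the same angle. The toy series and the names `step`, `theta`, `radius`, `gx` and `gy` are assumptions for illustration, and this is not necessarily how `cubble` draws its polar glyphs.

```r
# Toy series (an assumption): two years of monthly values at one location.
step     <- 1:24
month    <- (step - 1) %% 12 + 1
surftemp <- 290 + 5 * cos(2 * pi * (month - 1) / 12)

# Angle: one full turn per year, so month 1 of each year shares an angle.
theta <- 2 * pi * (month - 1) / 12

# Radius: the measured value rescaled to [0, 1].
radius <- (surftemp - min(surftemp)) / (max(surftemp) - min(surftemp))

# Glyph coordinates relative to the glyph centre (month 1 at the top).
glyph <- data.frame(step, month,
                    gx = radius * sin(theta),
                    gy = radius * cos(theta))
head(glyph, 3)
```

Adding `gx` and `gy` (suitably shrunk) to each location's longitude and latitude would place the glyph on the map.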
]
.panel[.panel-name[R]
.f5[

```r
nasa_cb %>%
  face_temporal() %>%
  select(id, time, month, year, surftemp) %>%
  unfold(long, lat) %>%
  ggplot() +
  geom_polygon(data=sth_america,
               aes(x=long, y=lat, group=group),
               fill="#014221", alpha=0.2, colour="#ffffff") +
  cubble::geom_glyph_box(data=nasa,
                         aes(x_major = long, x_minor = day,
                             y_major = lat, y_minor = surftemp),
                         fill=NA) +
  cubble::geom_glyph(data=nasa,
                     aes(x_major = long, x_minor = day,
                         y_major = lat, y_minor = surftemp),
                     polar=TRUE) +
  theme_map()
```

]
]
]
]
]

---
# .orange[Case study] .bg-orange.circle[1] Temperature change in Americas

## Exploring El Nino

Slice in space and show the time series, and the pattern is very clear: the seasonal decrease in water temperature doesn't happen in 1997, and water in this area stays unseasonably warm.

<img src="images/lecture-10A/over_time-1.png" width="100%" style="display: block; margin: auto;" />

---
# <i class="fas fa-wrench faa-wrench animated-hover faa-slow " style=" color:#D93F00;"></i> Your turn using tsibbletalk

.s500.f5[

```r
library(tsibble)
library(tsibbletalk)
library(lubridate)

# Link the spatial and temporal views through a shared tsibble
nasa_shared <- nasa %>%
  mutate(date = ymd(date)) %>%
  select(long, lat, date, surftemp, id) %>%
  as_tsibble(index=date, key=id) %>%
  as_shared_tsibble()

p1 <- nasa_shared %>%
  ggplot(aes(x = long, y = lat)) +
  geom_point(aes(group = id))

p2 <- nasa_shared %>%
  ggplot(aes(x = date, y = surftemp)) +
  geom_line(aes(group = id), alpha = 0.5)

library(plotly)
subplot(
  ggplotly(p1, tooltip = "Region", width = 100),
  ggplotly(p2, tooltip = "Region", width = 900),
  nrows = 1, widths=c(0.4, 0.6)) %>%
  highlight(dynamic = TRUE)
```

]

---
background-image: url(images/week10A/space-time.jpg)
background-size: 60%
background-position: 95% 35%

.w-35[

# Thinking about spatiotemporal data

Space is a continuous variable, and in theory it fills out a square (or polygon).

Often, though, it is .monash-blue2[measured irregularly]. Sensor stations are sporadically placed. This affects making a density display of the measured variable.

The strategy is to .monash-blue2[plot it at the resolution given], before trying to make a regular grid.

]

---
background-image: url(images/week10A/usa-lin-legend.jpg)
background-size: 20%
background-position: 95% 80%

# Glyphmaps of irregular space data

Linear model fit to temperature recorded at historical weather stations across the USA ([USHCN](https://www.ncei.noaa.gov/products/land-based-station/us-historical-climatology-network))

<img src="images/week10A/usa-lin-overlap.png" width="70%">

---
background-image: url(images/week10A/usa-lin-collapse-legend.jpg)
background-size: 20%
background-position: 95% 80%

# Regularising

Measurements from nearby stations have been merged.

Note that some [areas of the USA have been cooling](https://www.epa.gov/climate-indicators/climate-change-indicators-us-and-global-temperature). This is interesting!
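
One simple way to merge nearby stations is to snap each station to a regular grid cell and average within the cell. The slides do not say how the merging was actually done, so the sketch below is only an assumption: the `stations` data, the `trend` column and the 1-degree `cell` size are all made up for illustration.

```r
library(dplyr)

# Toy station data (an assumption): irregular locations, one trend value each.
stations <- tibble(
  long  = c(-100.2, -100.4, -96.6, -96.8, -96.9),
  lat   = c(40.1, 40.3, 35.2, 35.4, 35.1),
  trend = c(0.02, 0.04, -0.01, -0.03, -0.02)
)

cell <- 1  # hypothetical grid resolution, in degrees

gridded <- stations %>%
  # Snap each station to the centre of the grid cell that contains it
  mutate(long_g = floor(long / cell) * cell + cell / 2,
         lat_g  = floor(lat / cell) * cell + cell / 2) %>%
  group_by(long_g, lat_g) %>%
  # One merged value per cell, plus the number of stations merged
  summarise(trend = mean(trend), n = n(), .groups = "drop")
gridded
```

Stations that are close together but fall on either side of a cell boundary are not merged by this approach, so the choice of `cell` and grid origin matters.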
<img src="images/week10A/usa-lin-collapse.png" width="70%">

---
background-image: url(images/week10A/usa-season-legend.jpg)
background-size: 20%
background-position: 95% 80%

# Seasonality

Deconstructing time into a seasonal component, and representing this on the map, is also reasonable.

<img src="images/week10A/usa-season-collapsed.png" width="70%">

---
# Resources and Acknowledgement

- [cubble](https://huizezhang-sherry.github.io/cubble): A Vector Spatio-Temporal Data Structure for Data Analysis
- [sf](https://r-spatial.github.io/sf/): Simple Features for R
- [Healy (2018) Data Visualization](https://socviz.co/maps.html#maps)
- [Perpinan Lamigueiro (2018) Displaying time series, spatial and space-time data with R](https://oscarperpinan.github.io/bookvis/)
- Wikle, Zammit-Mangion, Cressie (2018) [Spatio-Temporal Statistics with R](https://spacetimewithr.org)
- [Moraga, Paula. (2019). Geospatial Health Data](https://www.paulamoraga.com/book-geospatial/index.html)
- [Visualising spatial data using R](https://rspatialdata.github.io)

---
background-size: cover
class: title-slide
background-image: url("images/bg-12.png")

<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.

.bottom_abs.width100[

Lecturer: *Di Cook*

<i class="fas fa-envelope"></i> ETC5521.Clayton-x@monash.edu

<i class="fas fa-calendar-alt"></i> Week 10 - Session 1

<br>

]