All consultations will be held both in person and on Zoom. Check Moodle for the links.
| Week | Slides | Tutorial | Topic | Readings | Assessments | 
|---|---|---|---|---|---|
| 00 | A: | | Course information | | |
| 01 (Jul 24) | A: ; B: | | Overview. Why this course? What is EDA? | The Landscape of R Packages for Automated Exploratory Data Analysis | Tutorial preparation quizzes due each week. |
| 02 (Jul 31) | A: ; B: | | Learning from history | EDA Case Study: Bay area blues | |
| 03 (Aug 7) | A: ; B: | | Initial data analysis and model diagnostics: Model dependent exploration and how it differs from EDA | The initial examination of data | |
| 04 (Aug 14) | A: ; B: | | Using computational tools to determine whether what is seen in the data can be assumed to apply more broadly | Wickham et al. (2010) Graphical inference for Infovis | |
| 05 (Aug 21) | A: ; B: | | Working with a single variable, making transformations, detecting outliers, using robust statistics | Unwin (2015) Graphical Data Analysis Ch 3-4; Wilke (2019) Ch 6 Visualizing Amounts; Ch 7 Visualizing distributions | Assignment 1 (individual) due on Fri Aug 25, 4:30pm |
| 06 (Aug 28) | A: ; B: | | Bivariate dependencies and relationships, transformations to linearise | Unwin (2015) Graphical Data Analysis Ch 5; Wilke (2019) Ch 12 Visualising associations | |
| 07 (Sep 4) | A: ; B: | | Making comparisons between groups and strata | Wilke (2019) Ch 9, 10.2-4, 11.2; Unwin (2015) Graphical Data Analysis Ch 10 | |
| 08 (Sep 11) | A: ; B: | | Going beyond two variables, exploring high dimensions | Unwin (2015) Graphical Data Analysis Ch 6; Cook and Laa (2023) Interactively exploring high-dimensional data and models in R Chapter 1 | Assignment 2 (individual) due on Fri Sep 15, 4:30pm |
| 09 (Sep 18) | A: ; B: | | Exploring data having a space and time context Part I | Reintroducing tsibble: data tools that melt the clock; brolgar: An R package to BRowse Over Longitudinal Data Graphically and Analytically in R; Listen to Nick talking about longitudinal data; Unwin (2015) Graphical Data Analysis Ch 11 | |
| Mid-semester Break (1 week) - no lectures or tutorials | | | | | |
| 10 (Oct 2) | A: ; B: | | Exploring data having a space and time context Part II | Moraga (2019) Spatial data and R packages for mapping; cubble: A Vector Spatio-Temporal Data Structure for Data Analysis; Making maps plot faster Simplify spatial polygons; sf: Simple Features for R | Assignment 3 (part 1) due on Fri Oct 6, 4:30pm |
| 11 (Oct 9) | A: ; B: | | Sculpting data using models, checking assumptions, co-dependency and performing diagnostics | Cook & Weisberg (1994) An Introduction to Regression Graphics Ch 6; Cleveland (1993) Visualising Data Ch 4; How to use a tour to check if your model suffers from multicollinearity | |
| 12 (Oct 16) | | | Extending beyond the data, what can and cannot be inferred more generally, given the data collection | | Assignment 3 (part 2) due on Fri Oct 27, 4:30pm |
On successful completion of this unit, you should be able to:
- use modern data exploration tools with real data to uncover interesting structure and unusual observations
- map out appropriate analyses and define what you would expect to see in the data
- compute null samples to test apparent patterns, and interpret the results of visual inference
- critically assess the strength and adequacy of a data analysis.
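
To make the idea of "null samples" in the outcomes above concrete, here is a minimal sketch in Python. It is not the unit's own material or tooling. It assumes permutation of one variable as the null-generating mechanism and a 20-panel lineup, and the function names `null_samples` and `make_lineup` are hypothetical, chosen only for illustration.

```python
import random


def null_samples(x, y, n_nulls=19, seed=0):
    """Return n_nulls copies of (x, y) in which y has been randomly permuted.

    Permuting y breaks any association with x while keeping both
    marginal distributions unchanged.
    """
    rng = random.Random(seed)
    nulls = []
    for _ in range(n_nulls):
        y_perm = list(y)
        rng.shuffle(y_perm)
        nulls.append((list(x), y_perm))
    return nulls


def make_lineup(x, y, n_nulls=19, seed=0):
    """Hide the observed data at a random position among the null samples.

    Returns the list of panels and the index of the observed data.
    """
    panels = null_samples(x, y, n_nulls, seed)
    # Separate generator so the position does not depend on the shuffles.
    position = random.Random(seed + 1).randrange(n_nulls + 1)
    panels.insert(position, (list(x), list(y)))
    return panels, position


# Toy data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]
panels, position = make_lineup(x, y)
print(f"{len(panels)} panels; the observed data is hidden in one of them")
```

Each panel would then be plotted in the same way and shown to a viewer who does not know `position`. If the viewer picks out the observed data from among the nulls, that is evidence the apparent pattern is unlikely to arise when there is no association.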