ETC5521 Tutorial 1

Introduction to exploratory data analysis

Author

Prof. Di Cook

Published

July 27, 2023

🎯 Objectives

This is the first tutorial meeting of the semester. The goals are to get to know your classmates and tutors, to check that you have the right skills to get started, and to begin thinking about exploratory data analysis.

🔧 Preparation

  • Complete the weekly quiz, before the deadline!
  • Have git installed on your laptop so that you can access the test classroom.
  • Have the latest versions of RStudio and R installed on your laptop.
  • Install this list of R packages:
install.packages(c("fun", "dplyr", "here"))
  • Create an RStudio Project for this unit, called eda or ETC5521. All your work in the tutorials should be conducted in this project. Ideally, your project is organised into folders, one for data, one for tutorial_XX, … Each week when you begin your tutorial, open the project.
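The folder layout described above could be set up from the R console along these lines. This is only a sketch: it assumes the working directory is the project root (the case when the RStudio Project is open), and the folder name tutorial_01 is an assumed instance of the tutorial_XX pattern for the first week.

```r
# Run from the project root (the folder that contains the .Rproj file).
# "tutorial_01" is an assumed name following the tutorial_XX pattern.
folders <- c("data", "tutorial_01")

for (f in folders) {
  # showWarnings = FALSE keeps this quiet if the folder already exists
  dir.create(f, showWarnings = FALSE)
}

# Show which of the folders now exist
folders[dir.exists(folders)]
```

Re-running the code is harmless, because dir.create() leaves existing folders untouched.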

Exercise 1.1: Ice breaker

  • Grab your name tag
  • Follow the instructions of your tutor to get to know your classmates.
  • Which R package is your favourite: ggplot2 or plotly?
  • Have you used the R package purrr before? Y or N
  • What does the package profvis do? Visualising data or profiling code?

Exercise 1.2: How good are your detective skills?

Being good at noticing something unexpected or unusual is an important skill for exploratory data analysis. This exercise is designed to practise your detective skills.

Play the game alzheimer_test from the fun package by running this code:

library(fun)
# Play the game and store the results in x for the questions below
x <- alzheimer_test()

You will be given 6 tasks to complete. Each one is to find a specific letter hidden among a 10 × 30 grid of letters. When you are finished, answer these questions:

  1. Which task did you THINK was the most difficult?
  2. Which task does the DATA say was the most difficult, based on the time taken to answer (tm1.1.j. in your results data)?
  3. Save the dataset to an .rda file.
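# Load the saved results from the project's data folder (restores the object x)
load(here::here("data/alzheimers.rda"))
library(dplyr)
# Keep the two character columns and the time taken, slowest task first
x %>%
  select(char1.1.j., char2.1.j., tm1.1.j.) %>%
  arrange(desc(tm1.1.j.))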
           char1.1.j. char2.1.j.  tm1.1.j.
ans.user.2          M          N 30.839718
ans.user.3          I          T 19.695932
ans.user.5          D          O 17.189302
ans.user.1          O          C 16.534676
ans.user.4          F          E  4.424869
ans.user            9          6  3.812386
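Step 3 asks for the results to be saved to an .rda file. A minimal sketch of one way to do this with base R is given here. It assumes the results are stored in x and that the working directory is the project root, so that the relative path data/alzheimers.rda points to the same file as here::here("data/alzheimers.rda"). The small stand-in data frame is only there so the example runs on its own; its values are copied from two rows of the output shown above.

```r
# Stand-in for the game results, used only if x does not already exist.
# Values are copied from two rows of the example output; your own will differ.
if (!exists("x")) {
  x <- data.frame(
    char1.1.j. = c("M", "I"),
    char2.1.j. = c("N", "T"),
    tm1.1.j.   = c(30.839718, 19.695932)
  )
}

# Assumes the working directory is the project root
dir.create("data", showWarnings = FALSE)
save(x, file = file.path("data", "alzheimers.rda"))

# Check the round trip: load() restores the object under its saved name, x
saved <- x
rm(x)
load(file.path("data", "alzheimers.rda"))
stopifnot(identical(x, saved))
```

Because save() stores the object together with its name, loading the file later recreates x without any assignment.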

Exercise 1.3: Get started using GitHub Classroom

  1. In Moodle go to the Assignment 0 instructions to find the invitation to a GitHub Classroom. Accept this invitation.
  2. Clone the “test” template repo.
  3. Upload your results data file to your repo.

👋 Finishing up

Make sure you say thanks and good-bye to your tutor. This is also a good time to report what you enjoyed and what you found difficult.