class: middle center hide-slide-number monash-bg-gray80

.info-box.w-50.bg-white[
These slides are best viewed in Chrome or Firefox and occasionally need to be refreshed if elements do not load properly. See <a href=lecture-08A.pdf>here for the PDF <i class="fas fa-file-pdf"></i></a>.
]

<br>

.white[Press the **right arrow** to progress to the next slide!]

---
class: title-slide
count: false
background-image: url("images/bg-12.png")

# .monash-blue[ETC5521: Exploratory Data Analysis]

<h1 class="monash-blue" style="font-size: 30pt!important;"></h1>

<br>

<h2 style="font-weight:900!important;">Going beyond two variables, exploring high dimensions</h2>

.bottom_abs.width100[

Lecturer: *Di Cook*

<i class="fas fa-envelope"></i> ETC5521.Clayton-x@monash.edu

<i class="fas fa-calendar-alt"></i> Week 8 - Session 1

<br>

]

---

Read about the original book and movie on [Wikipedia](https://en.wikipedia.org/wiki/Flatland).

<center>
<iframe width="980" height="551" src="https://www.youtube.com/embed/C8oiwnNlyE4" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</center>

.footnote[Trailer for "FLATLAND 2: SPHERELAND"]

---
class: transition middle

# More than two continuous variables?

# Use a scatterplot matrix

synonyms: splom, draughtsman plot

---

# .orange[Case study] .bg-orange.circle[1] Olive oils

.panelset[
.panel[.panel-name[data]
.scroll-800[
id
region
area
palmitic
palmitoleic
stearic
oleic
linoleic
linolenic
arachidic
eicosenoic
1.North-Apulia
1
1
1075
75
226
7823
672
36
60
29
2.North-Apulia
1
1
1088
73
224
7709
781
31
61
29
3.North-Apulia
1
1
911
54
246
8113
549
31
63
29
4.North-Apulia
1
1
966
57
240
7952
619
50
78
35
5.North-Apulia
1
1
1051
67
259
7771
672
50
80
46
6.North-Apulia
1
1
911
49
268
7924
678
51
70
44
7.North-Apulia
1
1
922
66
264
7990
618
49
56
29
8.North-Apulia
1
1
1100
61
235
7728
734
39
64
35
9.North-Apulia
1
1
1082
60
239
7745
709
46
83
33
10.North-Apulia
1
1
1037
55
213
7944
633
26
52
30
11.North-Apulia
1
1
1051
35
219
7978
605
21
65
24
12.North-Apulia
1
1
1036
59
235
7868
661
30
62
44
13.North-Apulia
1
1
1074
70
214
7728
747
50
79
33
14.North-Apulia
1
1
875
52
243
8018
655
41
79
32
15.North-Apulia
1
1
952
49
254
7795
780
50
75
41
16.North-Apulia
1
1
1155
98
201
7606
816
32
60
29
17.North-Apulia
1
1
943
94
183
7840
788
42
75
31
18.North-Apulia
1
1
1278
69
205
7344
957
45
70
28
19.North-Apulia
1
1
961
70
195
7958
742
46
75
30
20.North-Apulia
1
1
952
77
258
7820
736
43
78
33
21.North-Apulia
1
1
1074
67
236
7692
716
56
83
45
22.North-Apulia
1
1
995
46
288
7806
679
56
86
40
23.North-Apulia
1
1
1056
53
247
7703
700
54
89
51
24.North-Apulia
1
1
1065
39
234
7876
703
42
74
26
25.North-Apulia
1
1
1065
45
245
7779
696
47
82
38
26.Calabria
1
2
1315
139
230
7299
832
42
60
32
27.Calabria
1
2
1321
136
217
7174
950
43
63
30
28.Calabria
1
2
1359
115
246
7234
874
45
63
18
29.Calabria
1
2
1378
111
272
7127
940
46
64
23
30.Calabria
1
2
1295
109
245
7253
903
43
62
38
31.Calabria
1
2
1275
121
215
7285
892
40
68
41
32.Calabria
1
2
1336
120
318
7083
915
50
70
38
33.Calabria
1
2
1309
122
241
7257
870
46
72
35
34.Calabria
1
2
1340
114
189
7337
820
48
72
21
35.Calabria
1
2
1299
116
253
7309
823
40
69
27
36.Calabria
1
2
1221
107
221
7441
798
54
70
28
37.Calabria
1
2
1245
72
283
7395
829
44
67
28
38.Calabria
1
2
1285
129
244
7323
819
57
65
36
39.Calabria
1
2
1248
107
313
7299
840
46
66
33
40.Calabria
1
2
1356
106
236
7209
866
48
75
36
41.Calabria
1
2
1260
102
228
7354
870
49
64
28
42.Calabria
1
2
1261
121
312
7238
877
47
65
25
43.Calabria
1
2
1304
124
279
7160
928
48
61
37
44.Calabria
1
2
1344
117
287
7129
897
51
65
41
45.Calabria
1
2
1323
96
300
7351
757
47
54
26
46.Calabria
1
2
1292
117
215
7351
839
48
61
32
47.Calabria
1
2
1254
118
244
7394
786
46
71
24
48.Calabria
1
2
1312
131
259
7167
939
41
69
20
49.Calabria
1
2
1213
109
301
7261
925
47
65
31
50.Calabria
1
2
1359
98
351
7262
780
41
56
16
51.Calabria
1
2
1266
97
263
7435
743
45
69
29
52.Calabria
1
2
1298
99
311
7311
787
45
67
23
53.Calabria
1
2
1272
116
279
7258
872
43
72
27
54.Calabria
1
2
1278
87
332
7379
771
44
53
24
55.Calabria
1
2
1184
112
311
7391
819
48
57
28
56.Calabria
1
2
1382
110
268
7241
828
39
60
30
57.Calabria
1
2
1183
146
292
7580
618
38
51
23
58.Calabria
1
2
1261
153
219
7355
818
52
70
26
59.Calabria
1
2
1198
136
239
7639
633
27
55
19
60.Calabria
1
2
1225
134
232
7658
616
36
49
26
61.Calabria
1
2
1339
166
208
7190
923
40
69
25
62.Calabria
1
2
1132
157
240
7641
638
45
60
31
63.Calabria
1
2
1381
183
245
7385
609
47
70
25
64.Calabria
1
2
1409
128
257
7257
759
43
57
16
65.Calabria
1
2
1306
127
250
7254
869
47
68
24
66.Calabria
1
2
1372
120
250
7355
702
44
68
28
67.Calabria
1
2
1336
113
242
7293
855
38
60
18
68.Calabria
1
2
1401
151
238
7164
857
45
72
36
69.Calabria
1
2
1390
119
234
7236
823
40
62
41
70.Calabria
1
2
1432
152
281
7029
949
39
55
25
71.Calabria
1
2
1412
124
298
7182
790
45
68
28
72.Calabria
1
2
1366
147
291
7197
783
51
70
34
73.Calabria
1
2
1383
118
273
7282
738
45
68
29
74.Calabria
1
2
1283
102
263
7400
763
54
65
28
75.Calabria
1
2
1296
136
260
7380
780
48
51
18
76.Calabria
1
2
1287
108
287
7343
826
44
44
23
77.Calabria
1
2
1351
159
296
7229
810
36
60
22
78.Calabria
1
2
1241
97
268
7499
709
52
69
36
79.Calabria
1
2
1267
101
300
7230
898
74
65
34
80.Calabria
1
2
1235
138
252
7322
861
54
66
36
81.Calabria
1
2
1255
103
223
7395
848
47
56
30
82.South-Apulia
1
3
1454
183
196
7057
1014
27
46
19
83.South-Apulia
1
3
1347
194
197
7277
895
25
46
15
84.South-Apulia
1
3
1364
204
225
6929
1084
21
50
14
85.South-Apulia
1
3
1410
199
216
7130
955
21
48
19
86.South-Apulia
1
3
1384
178
208
7105
999
29
67
26
87.South-Apulia
1
3
1412
185
217
6842
1203
34
72
32
88.South-Apulia
1
3
1410
232
280
6715
1233
32
60
24
89.South-Apulia
1
3
1509
209
257
6647
1240
42
62
30
90.South-Apulia
1
3
1317
197
256
7036
1067
40
60
22
91.South-Apulia
1
3
1286
192
203
7132
1053
38
65
28
92.South-Apulia
1
3
1273
191
202
6862
1303
43
70
28
93.South-Apulia
1
3
1463
183
183
6747
1307
36
60
24
94.South-Apulia
1
3
1399
187
191
6861
1233
38
60
17
95.South-Apulia
1
3
1413
193
208
6875
1202
30
60
18
96.South-Apulia
1
3
1369
206
203
6953
1168
35
50
16
97.South-Apulia
1
3
1488
172
170
6920
1144
37
54
14
98.South-Apulia
1
3
1323
160
205
6911
1298
24
50
17
99.South-Apulia
1
3
1311
166
170
6902
1312
41
69
28
100.South-Apulia
1
3
1286
163
183
7040
1230
29
57
12
101.South-Apulia
1
3
1380
173
188
7038
1139
31
44
14
102.South-Apulia
1
3
1394
164
223
7086
1042
24
43
23
103.South-Apulia
1
3
1324
174
198
6863
1289
36
70
21
104.South-Apulia
1
3
1290
157
192
7000
1263
26
51
19
105.South-Apulia
1
3
1361
163
196
6888
1273
37
58
24
106.South-Apulia
1
3
1387
182
242
6913
1101
44
68
30
107.South-Apulia
1
3
1369
180
181
7000
1130
39
45
24
108.South-Apulia
1
3
1303
165
175
7025
1243
31
41
16
109.South-Apulia
1
3
1346
160
169
7072
1151
39
48
15
110.South-Apulia
1
3
1369
171
184
6937
1246
30
48
15
111.South-Apulia
1
3
1305
172
169
7004
1260
28
50
11
112.South-Apulia
1
3
1351
179
186
6935
1243
36
50
19
113.South-Apulia
1
3
1283
151
182
7000
1271
40
52
21
114.South-Apulia
1
3
1449
175
198
6883
1162
40
70
22
115.South-Apulia
1
3
1310
180
183
7054
1202
26
32
12
116.South-Apulia
1
3
1360
163
176
6901
1280
28
65
27
117.South-Apulia
1
3
1300
187
196
6920
1253
41
76
25
118.South-Apulia
1
3
1368
171
218
7010
1057
41
54
26
119.South-Apulia
1
3
1207
151
156
7159
1234
27
51
14
120.South-Apulia
1
3
1348
154
183
6917
1277
48
56
16
121.South-Apulia
1
3
1334
186
229
7261
827
34
56
20
122.South-Apulia
1
3
1301
156
207
7003
1229
41
48
14
123.South-Apulia
1
3
1226
181
213
6961
1230
47
74
26
124.South-Apulia
1
3
1201
168
190
7100
1216
43
64
16
125.South-Apulia
1
3
1297
153
177
7004
1260
35
60
16
126.South-Apulia
1
3
1248
163
158
7103
1222
31
60
14
127.South-Apulia
1
3
1335
159
197
6974
1220
36
60
17
128.South-Apulia
1
3
1219
167
171
7087
1254
35
50
16
129.South-Apulia
1
3
1318
179
177
7030
1194
35
42
25
130.South-Apulia
1
3
1264
167
166
7130
1187
22
52
12
131.South-Apulia
1
3
1201
175
201
7129
1193
36
49
15
132.South-Apulia
1
3
1252
180
181
7055
1214
31
59
38
133.South-Apulia
1
3
1273
182
209
6965
1191
43
74
23
134.South-Apulia
1
3
1351
179
170
7034
1154
35
66
10
135.South-Apulia
1
3
1336
155
212
7103
1086
33
55
20
136.South-Apulia
1
3
1499
201
182
6803
1204
30
56
24
137.South-Apulia
1
3
1425
198
193
7032
1041
31
52
17
138.South-Apulia
1
3
1358
204
227
6962
1109
41
65
34
139.South-Apulia
1
3
1346
181
257
7147
933
40
60
36
140.South-Apulia
1
3
1392
186
256
6732
1278
53
64
29
141.South-Apulia
1
3
1311
166
222
7006
1147
41
80
27
142.South-Apulia
1
3
1314
171
229
6923
1198
47
76
42
143.South-Apulia
1
3
1409
200
207
6842
1224
31
60
27
144.South-Apulia
1
3
1342
174
221
6993
1147
36
64
23
145.South-Apulia
1
3
1387
182
206
7100
1020
34
54
17
146.South-Apulia
1
3
1413
202
205
6920
1165
36
46
13
147.South-Apulia
1
3
1430
209
225
6800
1200
32
59
27
148.South-Apulia
1
3
1336
185
223
6956
1155
56
73
16
149.South-Apulia
1
3
1372
200
200
6916
1189
33
50
22
150.South-Apulia
1
3
1330
157
228
7055
1108
42
55
25
151.South-Apulia
1
3
1412
207
208
6822
1239
36
51
28
152.South-Apulia
1
3
1321
209
217
6948
1178
42
62
23
153.South-Apulia
1
3
1401
200
217
6980
1073
40
68
21
154.South-Apulia
1
3
1401
214
217
6734
1293
44
69
27
155.South-Apulia
1
3
1457
168
242
6724
1266
54
59
30
156.South-Apulia
1
3
1451
199
221
6835
1177
37
51
29
157.South-Apulia
1
3
1438
206
248
6806
1183
34
57
28
158.South-Apulia
1
3
1462
204
237
6644
1309
42
54
28
159.South-Apulia
1
3
1529
215
203
6602
1310
45
69
27
160.South-Apulia
1
3
1510
189
245
6752
1188
36
52
28
161.South-Apulia
1
3
1437
222
184
6803
1240
43
56
16
162.South-Apulia
1
3
1327
129
247
7024
1157
38
56
22
163.South-Apulia
1
3
1438
172
252
6630
1380
40
64
24
164.South-Apulia
1
3
1447
176
189
6849
1180
42
64
26
165.South-Apulia
1
3
1355
144
214
6972
1198
33
60
24
166.South-Apulia
1
3
1369
156
241
6890
1209
42
63
30
167.South-Apulia
1
3
1471
188
276
6697
1269
34
51
16
168.South-Apulia
1
3
1456
179
240
6738
1267
41
65
14
169.South-Apulia
1
3
1314
140
207
7020
1220
28
59
12
170.South-Apulia
1
3
1408
176
192
6909
1195
45
50
25
171.South-Apulia
1
3
1397
172
191
7107
1018
36
50
29
172.South-Apulia
1
3
1413
191
186
6937
1180
31
46
13
173.South-Apulia
1
3
1539
194
213
6764
1178
38
58
16
174.South-Apulia
1
3
1304
159
234
7019
1174
38
53
19
175.South-Apulia
1
3
1341
160
231
7033
1069
40
67
33
176.South-Apulia
1
3
1508
208
249
6641
1311
25
43
20
177.South-Apulia
1
3
1515
226
257
6595
1287
41
63
16
178.South-Apulia
1
3
1262
165
235
7120
1113
32
51
21
179.South-Apulia
1
3
1307
197
238
7003
1144
37
50
24
180.South-Apulia
1
3
1294
159
253
7009
1190
30
52
13
181.South-Apulia
1
3
1460
187
215
6843
1172
35
56
32
182.South-Apulia
1
3
1476
187
203
6837
1197
36
48
22
183.South-Apulia
1
3
1482
178
197
6814
1201
40
64
24
184.South-Apulia
1
3
1388
176
185
7008
1111
48
53
31
185.South-Apulia
1
3
1367
172
235
7066
1054
35
45
26
186.South-Apulia
1
3
1272
207
205
7152
1098
37
52
22
187.South-Apulia
1
3
1323
157
234
7132
1022
38
58
31
188.South-Apulia
1
3
1206
218
242
7193
1002
37
54
25
189.South-Apulia
1
3
1383
157
217
7018
1090
40
60
37
190.South-Apulia
1
3
1521
190
238
6956
986
36
50
23
191.South-Apulia
1
3
1350
168
227
6986
1165
29
58
17
192.South-Apulia
1
3
1422
181
218
6813
1230
30
59
21
193.South-Apulia
1
3
1298
166
224
6986
1162
34
65
31
194.South-Apulia
1
3
1447
236
245
6607
1336
33
51
21
195.South-Apulia
1
3
1347
197
211
6795
1300
32
59
34
196.South-Apulia
1
3
1339
170
253
6989
1110
29
63
23
197.South-Apulia
1
3
1388
183
216
6867
1208
28
61
21
198.South-Apulia
1
3
1527
260
232
6488
1370
31
45
20
199.South-Apulia
1
3
1495
237
236
6571
1318
32
58
26
200.South-Apulia
1
3
1487
246
251
6504
1390
29
53
19
201.South-Apulia
1
3
1399
180
232
6855
1190
32
66
22
202.South-Apulia
1
3
1489
215
242
6777
1145
30
60
22
203.South-Apulia
1
3
1339
166
226
6928
1198
30
60
23
204.South-Apulia
1
3
1482
246
238
6444
1462
27
50
20
205.South-Apulia
1
3
1434
172
255
6646
1354
27
59
25
206.South-Apulia
1
3
1347
156
214
6850
1313
25
48
19
207.South-Apulia
1
3
1340
158
233
6848
1272
32
63
25
208.South-Apulia
1
3
1453
180
244
6752
1238
34
54
23
209.South-Apulia
1
3
1306
149
226
7082
1097
33
61
24
210.South-Apulia
1
3
1349
161
217
6997
1138
31
62
23
211.South-Apulia
1
3
1254
151
205
7319
947
28
54
23
212.South-Apulia
1
3
1168
144
220
7230
1109
31
52
28
213.South-Apulia
1
3
1346
167
224
6959
1111
30
49
23
214.South-Apulia
1
3
1390
184
212
6898
1189
29
44
19
215.South-Apulia
1
3
1283
149
224
7077
1104
30
57
32
216.South-Apulia
1
3
1214
137
232
7269
1005
32
55
23
217.South-Apulia
1
3
1491
227
205
6941
988
33
68
34
218.South-Apulia
1
3
1479
218
207
7039
887
36
65
36
219.South-Apulia
1
3
1445
174
228
6875
1123
29
69
31
220.South-Apulia
1
3
1439
183
218
6775
1226
32
66
29
221.South-Apulia
1
3
1387
154
204
6991
1090
34
74
32
222.South-Apulia
1
3
1426
169
192
7025
1043
31
64
27
223.South-Apulia
1
3
1451
200
208
6980
1006
30
62
31
224.South-Apulia
1
3
1493
204
188
6913
1044
32
61
35
225.South-Apulia
1
3
1419
192
207
6996
1014
36
70
36
226.South-Apulia
1
3
1342
177
199
7172
952
34
65
33
227.South-Apulia
1
3
1349
152
236
7145
949
35
75
29
228.South-Apulia
1
3
1440
196
208
6938
1070
32
61
26
229.South-Apulia
1
3
1460
215
197
6918
1081
28
55
23
230.South-Apulia
1
3
1249
133
205
7417
827
33
72
33
231.South-Apulia
1
3
1348
159
238
7017
1081
31
67
25
232.South-Apulia
1
3
1341
155
244
6958
1144
32
68
26
233.South-Apulia
1
3
1398
149
204
7182
907
29
76
30
234.South-Apulia
1
3
1454
200
199
6910
1090
30
62
25
235.South-Apulia
1
3
1334
153
219
6928
1214
33
66
24
236.South-Apulia
1
3
1438
204
189
7107
910
33
63
27
237.South-Apulia
1
3
1303
138
212
7170
1016
34
69
25
238.South-Apulia
1
3
1323
147
210
7108
1070
33
61
20
239.South-Apulia
1
3
1417
169
207
6875
1184
34
57
27
240.South-Apulia
1
3
1360
167
225
6883
1220
31
55
27
241.South-Apulia
1
3
1420
179
214
6923
1121
33
56
27
242.South-Apulia
1
3
1472
218
214
6724
1238
29
53
23
243.South-Apulia
1
3
1368
174
205
7042
1066
31
57
26
244.South-Apulia
1
3
1367
173
228
6948
1141
32
53
24
245.South-Apulia
1
3
1403
173
209
6843
1210
33
63
33
246.South-Apulia
1
3
1413
197
206
6737
1387
34
60
31
247.South-Apulia
1
3
1201
138
207
7011
1269
37
64
35
248.South-Apulia
1
3
1359
180
207
6895
1203
33
61
30
249.South-Apulia
1
3
1518
198
225
6681
1243
29
57
24
250.South-Apulia
1
3
1434
185
189
6771
1269
30
62
25
251.South-Apulia
1
3
1367
162
179
6772
1368
33
64
27
252.South-Apulia
1
3
1461
181
197
6783
1246
26
57
23
253.South-Apulia
1
3
1368
161
198
7030
1095
33
59
31
254.South-Apulia
1
3
1419
159
215
6862
1193
35
60
31
255.South-Apulia
1
3
1514
162
298
6725
1119
45
93
30
256.South-Apulia
1
3
1328
171
253
6987
1030
38
83
39
257.South-Apulia
1
3
1469
160
337
6675
1127
44
94
36
258.Sicily
1
4
1222
133
227
7425
824
36
69
35
259.Sicily
1
4
1639
172
331
6510
1124
46
91
32
260.Sicily
1
4
1345
133
272
6801
1194
48
83
37
261.Sicily
1
4
1339
170
275
6838
1060
46
88
43
262.Sicily
1
4
1194
135
263
7277
889
44
95
41
263.Sicily
1
4
1112
68
375
7770
448
52
69
45
264.Sicily
1
4
1222
70
329
7605
566
48
67
43
265.Sicily
1
4
1136
72
341
7616
661
49
65
32
266.Sicily
1
4
926
41
277
7815
784
45
65
25
267.Sicily
1
4
1105
69
373
7714
532
51
68
37
268.Sicily
1
4
1109
79
305
7576
763
45
64
36
269.Sicily
1
4
1284
93
265
7235
893
43
77
46
270.Sicily
1
4
1120
69
277
7416
946
42
59
36
271.Sicily
1
4
916
52
281
7870
694
42
64
58
272.Sicily
1
4
905
49
288
7747
812
49
71
56
273.Sicily
1
4
1206
55
287
7329
935
44
74
42
274.Sicily
1
4
1457
182
267
7020
863
41
84
37
275.Sicily
1
4
1327
140
193
7328
823
36
87
35
276.Sicily
1
4
1303
100
251
7045
1049
40
86
40
277.Sicily
1
4
1444
175
259
6876
1027
34
78
32
278.Sicily
1
4
1505
243
226
6962
858
30
72
27
279.Sicily
1
4
1429
162
223
6917
1041
37
77
40
280.Sicily
1
4
1491
162
211
6994
928
37
97
38
281.Sicily
1
4
1393
128
211
7189
870
38
93
40
282.Sicily
1
4
1404
134
210
7110
923
40
101
43
283.Sicily
1
4
1222
130
214
7374
856
38
89
45
284.Sicily
1
4
1153
74
316
7593
705
42
64
32
285.Sicily
1
4
1169
76
307
7553
728
43
69
32
286.Sicily
1
4
1369
104
237
7375
775
39
70
15
287.Sicily
1
4
993
58
267
7743
773
41
62
44
288.Sicily
1
4
980
53
254
7719
815
44
69
47
289.Sicily
1
4
967
55
273
7692
833
45
63
47
290.Sicily
1
4
1128
73
354
7527
728
44
76
38
291.Sicily
1
4
1188
85
273
7445
814
44
73
42
292.Sicily
1
4
1257
95
247
7405
812
43
70
35
293.Sicily
1
4
1262
88
301
7471
704
43
71
31
294.South-Apulia
1
3
1283
153
196
7107
1115
37
60
28
295.South-Apulia
1
3
1263
155
199
7140
1148
31
42
18
296.South-Apulia
1
3
1369
158
215
7160
958
38
69
32
297.South-Apulia
1
3
1353
172
175
6965
1212
28
75
19
298.South-Apulia
1
3
1187
139
185
7427
952
29
56
22
299.South-Apulia
1
3
1732
231
156
6437
1313
45
62
23
300.South-Apulia
1
3
1620
255
166
6628
1212
29
62
27
301.South-Apulia
1
3
1543
172
193
6740
1157
52
87
34
302.South-Apulia
1
3
1498
170
195
6804
1206
35
66
23
303.South-Apulia
1
3
1399
169
171
7011
1100
36
72
16
304.South-Apulia
1
3
1293
156
191
7101
1111
32
60
31
305.South-Apulia
1
3
1420
175
152
7004
1149
27
50
20
306.South-Apulia
1
3
1721
238
255
6300
1350
35
70
28
307.South-Apulia
1
3
1742
221
156
6415
1315
43
82
23
308.South-Apulia
1
3
1391
187
189
6975
1062
52
70
45
309.South-Apulia
1
3
1517
206
249
6680
1205
33
80
27
310.South-Apulia
1
3
1269
157
193
7140
1148
31
40
18
311.South-Apulia
1
3
1577
204
208
6732
1183
20
52
20
312.South-Apulia
1
3
1590
241
195
6705
1149
27
68
21
313.South-Apulia
1
3
1621
280
197
6608
1179
28
58
27
314.South-Apulia
1
3
1753
275
236
6367
1214
23
61
27
315.South-Apulia
1
3
1679
260
177
6568
1191
30
59
33
316.South-Apulia
1
3
1419
203
176
6973
1083
38
78
27
317.South-Apulia
1
3
1693
236
174
6499
1204
51
102
37
318.South-Apulia
1
3
1692
270
234
6499
1196
31
59
15
319.South-Apulia
1
3
1638
252
215
6570
1199
39
53
29
320.South-Apulia
1
3
1497
247
219
6621
1270
36
73
32
321.South-Apulia
1
3
1442
222
194
6677
1314
36
72
38
322.South-Apulia
1
3
1680
270
170
6440
1310
31
62
28
323.South-Apulia
1
3
1463
164
185
6909
1154
49
58
17
324.Inland-Sardinia
2
5
1129
120
222
7272
1112
43
98
2
325.Inland-Sardinia
2
5
1042
135
210
7376
1116
35
90
3
326.Inland-Sardinia
2
5
1103
96
210
7380
1085
32
94
3
327.Inland-Sardinia
2
5
1118
97
221
7279
1154
35
94
2
328.Inland-Sardinia
2
5
1052
95
215
7388
1126
31
92
1
329.Inland-Sardinia
2
5
1116
102
231
7290
1168
26
66
1
330.Inland-Sardinia
2
5
1108
132
231
7319
1101
20
66
2
331.Inland-Sardinia
2
5
1129
108
212
7386
1074
28
62
3
332.Inland-Sardinia
2
5
1085
91
223
7384
1126
28
62
3
333.Inland-Sardinia
2
5
1104
103
233
7322
1147
27
61
2
334.Inland-Sardinia
2
5
1098
88
212
7338
1140
28
67
1
335.Coast-Sardinia
2
6
1135
98
251
7120
1314
20
61
2
336.Coast-Sardinia
2
6
1158
108
245
7065
1326
22
75
1
337.Coast-Sardinia
2
6
1133
110
241
7080
1342
21
68
3
338.Coast-Sardinia
2
6
1095
125
250
7120
1305
21
83
1
339.Coast-Sardinia
2
6
1201
87
238
6990
1383
25
75
3
340.Coast-Sardinia
2
6
1213
112
245
7007
1335
22
65
3
341.Inland-Sardinia
2
5
1108
92
231
7367
1110
29
62
3
342.Inland-Sardinia
2
5
1075
103
207
7413
1096
32
68
2
343.Inland-Sardinia
2
5
1059
96
228
7386
1128
25
72
2
344.Inland-Sardinia
2
5
1176
92
207
7347
1057
35
82
1
345.Inland-Sardinia
2
5
1159
98
213
7320
1108
38
64
1
346.Inland-Sardinia
2
5
1132
80
201
7398
1095
27
67
2
347.Inland-Sardinia
2
5
1107
75
220
7399
1096
29
90
1
348.Inland-Sardinia
2
5
1092
104
234
7355
1126
28
58
2
349.Inland-Sardinia
2
5
1119
81
219
7409
1057
33
81
2
350.Inland-Sardinia
2
5
1106
93
212
7381
1104
35
68
1
351.Inland-Sardinia
2
5
1047
101
238
7385
1120
28
89
1
352.Inland-Sardinia
2
5
1165
99
214
7331
1101
22
67
3
353.Inland-Sardinia
2
5
1158
84
201
7327
1123
29
77
2
354.Inland-Sardinia
2
5
1095
88
203
7415
1093
37
78
1
355.Inland-Sardinia
2
5
1176
75
205
7396
1107
33
74
2
356.Inland-Sardinia
2
5
1103
109
220
7335
1140
28
59
2
357.Inland-Sardinia
2
5
1112
92
209
7356
1125
32
73
2
358.Inland-Sardinia
2
5
1091
93
222
7377
1113
20
53
2
359.Inland-Sardinia
2
5
1080
98
219
7371
1125
33
78
1
360.Inland-Sardinia
2
5
1051
108
227
7403
1114
30
66
3
361.Inland-Sardinia
2
5
1096
84
211
7415
1091
30
71
2
362.Inland-Sardinia
2
5
1142
97
225
7341
1101
28
65
1
363.Inland-Sardinia
2
5
1047
96
236
7399
1107
32
80
3
364.Inland-Sardinia
2
5
1114
86
210
7359
1116
31
83
2
365.Inland-Sardinia
2
5
1140
93
241
7324
1098
23
74
1
366.Inland-Sardinia
2
5
1075
91
200
7410
1107
36
80
1
367.Inland-Sardinia
2
5
1092
106
219
7427
1125
33
77
1
368.Inland-Sardinia
2
5
1076
95
204
7408
1130
27
79
2
369.Inland-Sardinia
2
5
1178
89
201
7381
1099
34
87
2
370.Inland-Sardinia
2
5
1095
104
223
7367
1111
43
56
2
371.Coast-Sardinia
2
6
1166
97
272
6971
1390
20
83
3
372.Coast-Sardinia
2
6
1154
119
257
7130
1253
22
61
1
373.Coast-Sardinia
2
6
1177
111
241
6882
1470
22
95
2
374.Coast-Sardinia
2
6
1160
96
240
7043
1357
24
79
2
375.Coast-Sardinia
2
6
1122
104
241
7145
1313
15
58
1
376.Coast-Sardinia
2
6
1132
99
257
7065
1362
24
90
3
377.Coast-Sardinia
2
6
1096
100
260
7162
1282
25
74
2
378.Coast-Sardinia
2
6
1131
87
233
7144
1307
25
72
3
379.Coast-Sardinia
2
6
1184
105
258
7020
1340
26
66
2
380.Coast-Sardinia
2
6
1135
94
235
7123
1320
24
67
2
381.Coast-Sardinia
2
6
1084
96
240
7164
1330
28
57
1
382.Coast-Sardinia
2
6
1086
127
252
7159
1285
28
62
2
383.Coast-Sardinia
2
6
1140
95
258
7085
1347
23
71
3
384.Coast-Sardinia
2
6
1138
101
254
7103
1310
25
68
1
385.Coast-Sardinia
2
6
1159
110
261
7068
1297
27
77
2
386.Inland-Sardinia
2
5
1051
78
211
7421
1146
30
82
2
387.Inland-Sardinia
2
5
1048
79
213
7439
1130
28
61
2
388.Inland-Sardinia
2
5
1061
86
220
7421
1102
29
79
3
389.Inland-Sardinia
2
5
1105
88
210
7353
1142
28
72
1
390.Inland-Sardinia
2
5
1145
35
237
7208
1118
20
46
2
391.Inland-Sardinia
2
5
1049
96
219
7303
1168
22
47
2
392.Inland-Sardinia
2
5
1105
120
218
7302
1158
23
45
3
393.Inland-Sardinia
2
5
1030
84
214
7403
1177
21
70
1
394.Inland-Sardinia
2
5
1070
98
215
7280
1240
28
68
3
395.Inland-Sardinia
2
5
1103
81
208
7310
1177
30
90
3
396.Inland-Sardinia
2
5
1040
101
205
7368
1176
25
85
3
397.Inland-Sardinia
2
5
1100
95
210
7320
1113
22
72
3
398.Inland-Sardinia
2
5
1118
85
199
7415
1060
36
86
3
399.Inland-Sardinia
2
5
1065
98
230
7345
1163
24
74
1
400.Inland-Sardinia
2
5
1131
78
221
7358
1120
22
69
2
401.Inland-Sardinia
2
5
1080
120
218
7296
1145
35
105
2
402.Inland-Sardinia
2
5
1075
86
231
7403
1109
22
73
3
403.Inland-Sardinia
2
5
1040
103
228
7364
1173
25
66
2
404.Inland-Sardinia
2
5
1128
82
203
7320
1148
30
88
1
405.Inland-Sardinia
2
5
1060
111
231
7363
1149
20
65
1
406.Inland-Sardinia
2
5
1103
78
220
7365
1149
20
65
2
407.Inland-Sardinia
2
5
1110
91
201
7318
1185
24
74
2
408.Inland-Sardinia
2
5
1091
108
218
7383
1183
28
88
3
409.Inland-Sardinia
2
5
1094
96
220
7341
1127
26
96
2
410.Coast-Sardinia
2
6
1131
87
208
7170
1308
28
57
2
411.Coast-Sardinia
2
6
1175
108
214
7076
1307
33
85
2
412.Coast-Sardinia
2
6
1076
77
202
7243
1305
29
67
1
413.Coast-Sardinia
2
6
1120
90
240
7068
1383
23
75
1
414.Coast-Sardinia
2
6
1152
111
238
7080
1372
25
81
2
415.Coast-Sardinia
2
6
1141
95
250
7035
1388
22
68
2
416.Coast-Sardinia
2
6
1098
103
267
7135
1301
24
76
2
417.Coast-Sardinia
2
6
1126
100
236
7062
1380
26
69
1
418.Coast-Sardinia
2
6
1087
89
243
7200
1302
18
60
1
419.Coast-Sardinia
2
6
1115
96
236
7085
1372
20
75
2
420.Coast-Sardinia
2
6
1178
92
241
7006
1376
22
84
1
421.Coast-Sardinia
2
6
1162
106
242
7025
1368
25
71
2
422.Umbria
3
9
1085
70
180
7955
605
20
50
1
423.Umbria
3
9
1085
70
185
7955
600
25
55
1
424.Umbria
3
9
1090
60
190
7950
600
28
47
2
425.Umbria
3
9
1080
65
189
7960
602
35
20
1
426.Umbria
3
9
1090
60
195
7955
600
28
42
2
427.Umbria
3
9
1105
55
200
7900
600
37
55
2
428.Umbria
3
9
1060
75
175
7975
610
20
55
2
429.Umbria
3
9
1050
70
170
7977
605
28
65
1
430.Umbria
3
9
1100
55
198
7905
600
35
50
3
431.Umbria
3
9
1065
65
178
7965
605
22
65
2
432.Umbria
3
9
1085
60
188
7955
602
30
50
2
433.Umbria
3
9
1080
65
180
7960
605
25
55
1
434.Umbria
3
9
1085
60
190
7955
602
30
53
1
435.Umbria
3
9
1075
68
195
7960
602
20
40
3
436.Umbria
3
9
1090
58
192
7950
600
35
40
3
437.Umbria
3
9
1095
60
198
7945
600
38
34
2
438.Umbria
3
9
1090
58
195
7950
600
30
42
2
439.Umbria
3
9
1095
58
198
7950
602
35
32
1
440.Umbria
3
9
1090
58
195
7940
600
35
42
2
441.Umbria
3
9
1095
58
198
7945
600
35
34
1
442.Umbria
3
9
1095
55
200
7940
600
35
45
3
443.Umbria
3
9
1080
70
188
7965
608
28
36
3
444.Umbria
3
9
1090
60
195
7950
600
32
38
2
445.Umbria
3
9
1105
55
200
7900
595
39
56
1
446.Umbria
3
9
1110
50
205
7900
595
40
52
1
447.Umbria
3
9
1075
70
198
7978
608
28
33
2
448.Umbria
3
9
1075
65
185
7980
608
35
42
3
449.Umbria
3
9
1065
75
180
7975
610
25
50
3
450.Umbria
3
9
1070
75
188
7980
602
22
45
2
451.Umbria
3
9
1070
75
188
7980
602
22
45
1
452.Umbria
3
9
1100
70
200
7910
610
39
44
1
453.Umbria
3
9
1075
70
185
7960
610
22
58
2
454.Umbria
3
9
1050
78
175
7990
610
18
59
3
455.Umbria
3
9
1090
60
198
7945
600
32
35
2
456.Umbria
3
9
1050
78
188
7990
608
28
23
3
457.Umbria
3
9
1075
70
190
7975
605
28
27
3
458.Umbria
3
9
1098
54
202
7945
595
42
32
2
459.Umbria
3
9
1105
15
198
8005
575
52
20
2
460.Umbria
3
9
1110
75
220
7915
510
55
65
2
461.Umbria
3
9
1058
50
178
7988
626
40
55
3
462.Umbria
3
9
1115
30
225
7955
600
55
15
2
463.Umbria
3
9
1105
30
198
7995
570
52
20
3
464.Umbria
3
9
1072
49
178
7980
615
48
48
2
465.Umbria
3
9
1110
15
210
7990
570
50
20
2
466.Umbria
3
9
1110
80
215
7910
525
50
60
1
467.Umbria
3
9
1055
60
175
7985
620
45
50
1
468.Umbria
3
9
1100
80
215
7930
535
45
60
2
469.Umbria
3
9
1105
55
205
7965
600
25
20
2
470.Umbria
3
9
1095
50
210
7948
600
25
35
2
471.Umbria
3
9
1110
50
220
7950
600
52
10
2
472.Umbria
3
9
1092
37
210
7955
600
40
40
3
473.East-Liguria
3
7
1290
60
260
7550
670
70
100
2
474.East-Liguria
3
7
1170
80
230
7690
720
40
70
1
475.East-Liguria
3
7
1100
90
250
7680
760
30
80
2
476.East-Liguria
3
7
1120
70
240
7720
730
40
80
2
477.East-Liguria
3
7
1160
70
250
7650
750
30
90
1
478.East-Liguria
3
7
1200
50
210
7770
690
20
50
3
479.East-Liguria
3
7
1140
50
200
7990
580
10
20
1
480.East-Liguria
3
7
1220
80
240
7610
760
30
60
2
481.East-Liguria
3
7
1180
90
250
7520
800
50
100
2
482.East-Liguria
3
7
1210
70
250
7560
780
40
90
2
483.East-Liguria
3
7
1220
80
220
7540
770
60
100
2
484.East-Liguria
3
7
1180
100
190
7520
820
50
100
1
485.East-Liguria
3
7
1160
90
220
7580
790
40
90
1
486.East-Liguria
3
7
1130
100
240
7620
780
30
90
1
487.East-Liguria
3
7
1080
100
260
7710
750
20
70
2
488.East-Liguria
3
7
1090
90
280
7730
720
50
100
1
489.East-Liguria
3
7
1020
100
270
7770
710
40
90
1
490.East-Liguria
3
7
1090
90
250
7680
760
60
80
1
491.East-Liguria
3
7
1120
100
260
7720
680
30
80
2
492.East-Liguria
3
7
1080
80
240
7830
670
30
70
2
493.East-Liguria
3
7
1160
70
230
7860
640
10
20
1
494.East-Liguria
3
7
1100
80
240
7820
670
20
70
2
495.East-Liguria
3
7
1050
100
250
7930
630
10
30
3
496.East-Liguria
3
7
1090
90
270
7780
690
30
50
3
497.East-Liguria
3
7
1120
80
260
7750
680
30
80
3
498.East-Liguria
3
7
1120
100
250
7680
730
40
70
2
499.East-Liguria
3
7
1190
90
230
7670
710
30
80
2
500.East-Liguria
3
7
1170
110
250
7620
740
20
90
1
501.East-Liguria
3
7
1120
100
230
7720
730
20
70
1
502.East-Liguria
3
7
1190
80
270
7690
720
10
40
2
503.East-Liguria
3
7
1400
90
270
7420
800
0
20
2
504.East-Liguria
3
7
1350
80
250
7520
760
10
30
1
505.East-Liguria
3
7
1090
60
220
7890
670
10
60
2
506.East-Liguria
3
7
1150
90
230
7790
650
30
60
1
507.East-Liguria
3
7
1240
90
220
7820
590
10
30
1
508.East-Liguria
3
7
1220
100
240
7890
530
0
10
2
509.East-Liguria
3
7
1180
80
250
7870
580
10
30
2
510.East-Liguria
3
7
1170
110
240
7730
630
30
90
1
511.East-Liguria
3
7
1170
100
280
7710
640
20
70
3
512.East-Liguria
3
7
1180
80
220
7790
680
10
40
1
513.East-Liguria
3
7
1200
90
240
7820
590
10
50
2
514.East-Liguria
3
7
1140
90
240
7880
570
20
60
3
515.East-Liguria
3
7
1160
70
210
7870
580
30
80
3
516.East-Liguria
3
7
1130
80
250
7780
650
40
60
3
517.East-Liguria
3
7
1150
80
240
7800
630
30
70
2
518.East-Liguria
3
7
1110
70
240
7820
670
20
70
3
519.East-Liguria
3
7
1150
70
220
7850
620
20
40
2
520.East-Liguria
3
7
1180
80
240
7760
670
20
50
2
521.East-Liguria
3
7
1020
80
250
7920
680
10
30
3
522.East-Liguria
3
7
610
80
230
8410
650
0
20
3
523.West-Liguria
3
8
1190
150
290
7340
1020
0
10
2
524.West-Liguria
3
8
1110
130
210
7550
1000
0
0
1
525.West-Liguria
3
8
1020
100
220
7530
1030
0
0
3
526.West-Liguria
3
8
1070
120
210
7600
990
0
10
3
527.West-Liguria
3
8
1010
90
350
7480
1050
10
10
1
528.West-Liguria
3
8
1060
140
240
7680
830
10
40
2
529.West-Liguria
3
8
1060
140
270
7620
880
10
20
1
530.West-Liguria
3
8
1030
100
230
7740
900
0
0
2
531.West-Liguria
3
8
1120
130
250
7530
970
0
0
3
532.West-Liguria
3
8
1030
110
220
7760
980
0
0
2
533.West-Liguria
3
8
1070
100
230
7600
990
10
0
1
534.West-Liguria
3
8
1140
180
220
7610
850
10
10
2
535.West-Liguria
3
8
1090
180
230
7590
860
10
40
2
536.West-Liguria
3
8
980
110
300
7720
910
10
0
3
537.West-Liguria
3
8
980
90
330
7540
1040
0
0
2
538.West-Liguria
3
8
960
90
200
7810
940
0
0
2
539.West-Liguria
3
8
990
90
210
7780
930
0
0
2
540.West-Liguria
3
8
1060
120
210
7600
1010
0
0
1
541.West-Liguria
3
8
1240
150
250
7610
730
10
10
1
542.West-Liguria
3
8
1060
90
310
7850
690
0
0
2
543.West-Liguria
3
8
1020
100
290
7620
960
0
10
2
544.West-Liguria
3
8
970
90
220
7700
1020
0
0
3
545.West-Liguria
3
8
1180
130
220
7450
1010
0
10
2
546.West-Liguria
3
8
1060
140
240
7690
850
10
10
1
547.West-Liguria
3
8
990
100
250
7630
1030
0
0
3
548.West-Liguria
3
8
1010
90
350
7630
940
10
0
3
549.West-Liguria
3
8
1040
90
250
7780
820
10
10
1
550.West-Liguria
3
8
1040
90
250
7810
810
10
10
2
551.West-Liguria
3
8
1020
90
350
7620
920
10
0
3
552.West-Liguria
3
8
1020
90
260
7620
1010
0
0
3
553.West-Liguria
3
8
1010
90
350
7610
930
10
0
3
554.West-Liguria
3
8
920
110
340
7720
910
0
0
3
555.West-Liguria
3
8
1030
100
250
7710
900
0
10
2
556.West-Liguria
3
8
960
90
300
7820
830
0
0
3
557.West-Liguria
3
8
1030
110
210
7810
840
0
0
1
558.West-Liguria
3
8
1010
100
240
7710
910
10
20
2
559.West-Liguria
3
8
1020
90
240
7800
850
0
0
2
560.West-Liguria
3
8
1120
90
300
7650
830
0
10
1
561.West-Liguria
3
8
1090
90
290
7710
800
10
0
2
562.West-Liguria
3
8
1100
120
280
7630
770
10
10
2
563.West-Liguria
3
8
1090
80
240
7820
760
10
0
2
564.West-Liguria
3
8
1150
90
250
7720
810
0
10
3
565.West-Liguria
3
8
1110
90
230
7810
750
0
10
2
566.West-Liguria
3
8
1010
110
210
7720
950
0
0
1
567.West-Liguria
3
8
1070
100
220
7730
870
10
10
2
568.West-Liguria
3
8
1280
110
290
7490
790
10
10
2
569.West-Liguria
3
8
1060
100
270
7740
810
10
10
3
570.West-Liguria
3
8
1010
90
210
7720
970
0
0
2
571.West-Liguria
3
8
990
120
250
7750
870
10
10
2
572.West-Liguria
3
8
960
80
240
7950
740
10
20
2
]
]
.panel[.panel-name[description]

*Source*: Forina, M., Armanino, C., Lanteri, S. & Tiscornia, E. (1983), Classification of Olive Oils from their Fatty Acid Composition, in Martens, H. and Russwurm Jr., H., eds, Food Research and Data Analysis, Applied Science Publishers, London, pp. 189–214. It was brought to our attention by Glover & Hopke (1992).

*Number of rows*: 572; *Number of variables*: 10

<br>
*Description*: This data consists of the percentage composition of fatty acids found in the lipid fraction of Italian olive oils. The data arises from a study to determine the authenticity of an olive oil.

<table>
<tr> <td> region </td> <td> Three "super-classes" of Italy: North, South, and the island of Sardinia </td></tr>
<tr> <td> area</td> <td> Umbria, East and West Liguria (North); North and South Apulia, Calabria, and Sicily (South); inland and coastal Sardinia (Sardinia) </td></tr>
<tr> <td> palmitic, palmitoleic, stearic, oleic, linoleic, linolenic, arachidic, eicosenoic</td> <td> fatty acids, % `\(\times\)` 100</td></tr>
</table>

*Primary question*: How do we distinguish the oils from different regions and areas in Italy based on their combinations of the fatty acids?
]
.panel[.panel-name[R]

```r
library(tidyverse)  # read_csv(), rename(), %>%
library(gt)         # gt() renders the data as a table
olive <- read_csv("http://ggobi.org/book/data/olive.csv") %>%
  rename(id = `...1`)  # the first CSV column has no name
olive %>% gt()
```

]
]

---

# .orange[Case study] .bg-orange.circle[1] Olive oils

.panelset[
.panel[.panel-name[🖼️]

<img src="images/lecture-08A/olivepairs-1.png" width="48%" style="display: block; margin: auto;" />

]
.panel[.panel-name[learn]

.grid[
.item[

### Differences between groups

- region 1 (South) is separated from the other two using just one variable, eicosenoic acid
- regions 2 (Sardinia) and 3 (North) are separated from each other by a combination of variables: oleic, linoleic, linolenic, arachidic.
]
.item[
### General association between variables, and univariate distributions

- Strong negative association between some variables (oleic, linoleic and other variables)
- Some clustering of observations (linoleic, arachidic, eicosenoic)
- Outliers (linolenic, oleic)
- Discreteness in linolenic and arachidic for region 3 (only), suggesting different recording protocols by region
]]
]
.panel[.panel-name[R]
```r
library(GGally)  # provides ggscatmat()
ggscatmat(olive, columns=7:11, color = "region") +
  scale_colour_brewer(palette="Set1")
```
]
]

---
# .orange[Case study] .bg-orange.circle[2] PISA

.panelset[
.panel[.panel-name[🖼️]

.grid[
.item[
<br>
The Programme for International Student Assessment (PISA) is a triennial survey conducted by the Organisation for Economic Co-operation and Development (OECD) that assesses the performance of 15-year-old students in reading, mathematics and science.
<br>
<br>
.monash-blue2[Math scores for Australia for 2018.] (Only 6 of the 10 shown.)
]
.item[
<img src="images/lecture-08A/pisapairs-1.png" width="90%" style="display: block; margin: auto;" />
]
]
]
.panel[.panel-name[learn]

.grid[
.item[
<br>
<br>
Association is uniformly positive, linear and moderately strong.
<br>
<br>
.monash-blue2[Compare to a simulated sample from a multivariate normal] 👉
<br>
<br>
.monash-orange2[Can you tell that it is synthetic data?]
]
.item[
<img src="images/lecture-08A/mvnorm-1.png" width="90%" style="display: block; margin: auto;" />
]
]
]
.panel[.panel-name[R]
```r
pisa_oz <- readRDS(here::here("data/pisa2018_oz.rds"))
ggscatmat(pisa_oz, columns=6:11, alpha = 0.1)
```

```r
library(mvtnorm)  # provides rmvnorm()
# 10 x 10 variance-covariance matrix: unit variances, all correlations 0.8
vc <- matrix(rep(0.80, 100), ncol=10)
diag(vc) <- rep(1, 10)
sim <- rmvnorm(7000, mean=rep(0, 10), sigma=vc)
sim <- as_tibble(sim)
ggscatmat(sim, columns=1:6, alpha = 0.1)
```
]
]

---
class: informative middle

# Diversion

This is an example of fraudulent synthetic data, presented in a Lancet article in May 2020 claiming that hydroxychloroquine increased the risk of death.
.footnote[Note: This does not mean that I support the use of HCQ.]

---
<img src="images/week7/Lancet_HCQ.png" width="100%">

.footnote[Ellis (2020) [Surgisphere data fraud fiasco](https://docs.google.com/presentation/d/1Ls-SsFuFJsGBfvQIcQt7HcojkznaGQN-/edit#slide=id.p5)]

---
background-image: \url(images/week7/covid_HCQ.png)
background-size: 80%
background-position: 50% 5%

<div id="rectangle" style="width: 1250px; height: 5px; background-color: red; position: absolute; top: 410px; left: 0;"></div>

---
background-image: \url(images/week7/covid_HCQ.png)
background-size: 80%
background-position: 50% 5%

<br>
.think-box[*Another rather remarkable aspect is how beautifully uniform the aggregated data are across continents*]

--

<br>
<br>
.think-box[*For example, smoking is almost between 9.4-10% in 6 continents. As they don’t tell us which countries are involved, hard to see how this matches known smoking prevalences. Antiviral use is 40.5, 40.4, 40.7, 40.2, 40.8, 38.4%. Remarkable! I didn’t realise that treatment was so well coordinated across the world. Diabetes and other co-morbidities don’t vary much either.*]

.footnote[The 28 May open letter to The Lancet coordinated by James Watson]

---
class: transition middle

# Generalised pairs plot

If the types of variables are not both quantitative, there are some other choices of mapping.

---
# .orange[Case study] .bg-orange.circle[3] Tips

.panelset[
.panel[.panel-name[🖼️]

<img src="images/lecture-08A/tips-pairs-1.png" width="40%" style="display: block; margin: auto;" />

]
.panel[.panel-name[learn]

.grid[
.item[
- moderate positive linear relationship between tip and total
- size also has a weak positive linear association with tip and total
- total bills paid by males are slightly higher
- total bills are more variable for smoking parties
- total bill is lower on Thursday, and increases through Sunday
- total bill is higher at night
]
.item[
- some outliers in tips on Saturday nights, paid by males.
- more bills paid by males
- smaller number of female non-smokers paying the bill
- pretty similar number of men and women paying the bill at lunchtime, but more men in the evening
- no diners on Thu/Fri during the day!!
]
]
]
.panel[.panel-name[R]
```r
library(GGally)  # provides ggpairs()
tips <- read_csv("http://ggobi.org/book/data/tips.csv") %>%
  mutate(day = factor(day, levels=c("Thu", "Fri", "Sat", "Sun")))
ggpairs(tips[, c(2,3,8,5,6)])
```
]
]

---
class: transition middle

# Scagnostics

Has your data got too many pairs of variables to scan easily?

---
background-image: \url(https://upload.wikimedia.org/wikipedia/commons/1/14/1980s_computer_worker%2C_Centers_for_Disease_Control.jpg)
background-size: cover

<center>
<img src="images/week7/cognostics.png" width="600px">
</center>

.footnote[Friedman and Stuetzle (2002) [John W. Tukey's work on interactive graphics](https://projecteuclid.org/download/pdf_1/euclid.aos/1043351250)]

---
background-image: \url(https://upload.wikimedia.org/wikipedia/commons/1/14/1980s_computer_worker%2C_Centers_for_Disease_Control.jpg)
background-size: cover

<br>
<br>
<br>
<br>
<center>
<img src="images/week7/scagnostics.png" width="600px">
</center>

.footnote[Friedman and Stuetzle (2002) [John W.
Tukey's work on interactive graphics](https://projecteuclid.org/download/pdf_1/euclid.aos/1043351250)] --- # Scagnostics <table class=" lightable-paper" style='font-family: "Arial Narrow", arial, helvetica, sans-serif; width: auto !important; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> plot </th> <th style="text-align:left;"> set </th> <th style="text-align:right;"> outlying </th> <th style="text-align:right;"> stringy </th> <th style="text-align:right;"> striated </th> <th style="text-align:right;"> clumpy </th> <th style="text-align:right;"> sparse </th> <th style="text-align:right;"> monotonic </th> <th style="text-align:right;"> dcor </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> <html><body><img src="images/lecture-07A/scagplots-1.png" width="80" height="80"></body></html> </td> <td style="text-align:left;"> line </td> <td style="text-align:right;"> 0.000 </td> <td style="text-align:right;"> 1.000 </td> <td style="text-align:right;"> 0.600 </td> <td style="text-align:right;"> 0.368 </td> <td style="text-align:right;"> 0.157 </td> <td style="text-align:right;"> 0.997 </td> <td style="text-align:right;"> 0.991 </td> </tr> <tr> <td style="text-align:left;"> <html><body><img src="images/lecture-07A/scagplots-2.png" width="80" height="80"></body></html> </td> <td style="text-align:left;"> norm </td> <td style="text-align:right;"> 0.190 </td> <td style="text-align:right;"> 0.789 </td> <td style="text-align:right;"> 0.330 </td> <td style="text-align:right;"> 0.603 </td> <td style="text-align:right;"> 0.095 </td> <td style="text-align:right;"> 0.013 </td> <td style="text-align:right;"> 0.160 </td> </tr> <tr> <td style="text-align:left;"> <html><body><img src="images/lecture-07A/scagplots-3.png" width="80" height="80"></body></html> </td> <td style="text-align:left;"> circle </td> <td style="text-align:right;"> 0.000 </td> <td style="text-align:right;"> 1.000 </td> <td style="text-align:right;"> 0.980 </td> <td 
style="text-align:right;"> 0.966 </td> <td style="text-align:right;"> 0.065 </td> <td style="text-align:right;"> 0.009 </td> <td style="text-align:right;"> 0.248 </td> </tr>
<tr> <td style="text-align:left;"> <html><body><img src="images/lecture-07A/scagplots-4.png" width="80" height="80"></body></html> </td> <td style="text-align:left;"> stripes </td> <td style="text-align:right;"> 0.129 </td> <td style="text-align:right;"> 0.698 </td> <td style="text-align:right;"> 0.338 </td> <td style="text-align:right;"> 0.985 </td> <td style="text-align:right;"> 0.094 </td> <td style="text-align:right;"> 0.665 </td> <td style="text-align:right;"> 0.632 </td> </tr>
<tr> <td style="text-align:left;"> <html><body><img src="images/lecture-07A/scagplots-5.png" width="80" height="80"></body></html> </td> <td style="text-align:left;"> clumps </td> <td style="text-align:right;"> 0.038 </td> <td style="text-align:right;"> 0.608 </td> <td style="text-align:right;"> 0.233 </td> <td style="text-align:right;"> 0.992 </td> <td style="text-align:right;"> 0.107 </td> <td style="text-align:right;"> 0.375 </td> <td style="text-align:right;"> 0.502 </td> </tr>
</tbody>
</table>

---
# How are scagnostics calculated?

The building blocks are: convex hull, alpha hull, and minimum spanning tree (MST)

<center>
<img src="images/week7/draw1.png" width="100%">
</center>

.footnote[Sketches made by Harriet Mason]

---
.pull-left[
**Convex:** A measure of how convex the shape of the data is. Computed as the ratio between the area of the alpha hull (A) and the area of the convex hull (C), where `\(w\)` is a weight that adjusts for sample size.

`$$s_{convex}=w\frac{area(A)}{area(C)}$$`

<img src="images/week7/drawconvex.png" width="100%">
]
.pull-right[
**Skinny:** A measure of how "thin" the shape of the data is. It is calculated from the ratio between the area and the perimeter of the alpha hull (A), normalised so that 0 corresponds to a perfect circle and values close to 1 indicate a skinny polygon.
`$$s_{skinny}= 1-\frac{\sqrt{4\pi area(A)}}{perimeter(A)}$$`

<img src="images/week7/drawskinny.png" width="100%">
]

.footnote[Sketches made by Harriet Mason]

---
.pull-left[
**Outlying:** A measure of the proportion and severity of outliers in the dataset. Calculated by comparing the total length of the MST edges attached to outlying points with the total length of the entire MST.

`$$s_{outlying}=\frac{length(M_{outliers})}{length(M)}$$`

<img src="images/week7/drawoutlying.png" width="100%">
]
.pull-right[
**Stringy:** This measure identifies a "stringy" shape with no branches, such as a thin line of data. It is calculated by comparing the number of vertices of degree two `\((V^{(2)})\)` with the total number of vertices `\((V)\)`, dropping those of degree one `\((V^{(1)})\)`.

`$$s_{stringy} = \frac{|V^{(2)}|}{|V|-|V^{(1)}|}$$`

<img src="images/week7/drawstringy.png" width="100%">
]

.footnote[Sketches made by Harriet Mason]

---
.pull-left[
**Skewed:** A measure of skewness in the edge lengths of the MST (not in the distribution of the data). It is calculated from the ratio between the upper 40% quantile range `\((q_{90}-q_{50})\)` and the 80% quantile range `\((q_{90}-q_{10})\)` of the edge lengths, adjusted for sample size dependence through the weight `\(w\)`.

`$$s_{skewed} = 1-w\left(1-\frac{q_{90}-q_{50}}{q_{90}-q_{10}}\right)$$`

<img src="images/week7/drawskewed.png" width="100%">
]
.pull-right[
**Clumpy:** This measure is used to detect clustering and is calculated through an iterative process. First an edge J is selected and removed from the MST. From the two spanning trees that are created by this break, we select the largest edge from the smaller tree (K). The length of this edge (K) is compared to the length of the removed edge (J), giving a clumpy measure for this edge. This process is repeated for every edge in the MST and the final clumpy measure is the maximum of this value over all edges.
`$$s_{clumpy}=\max_{j}\left(1-\frac{\max_{k}(length(e_k))}{length(e_j)}\right)$$`

<img src="images/week7/drawclumpy.png" width="100%">
]

.footnote[Sketches made by Harriet Mason]

---
.pull-left[
**Striated:** This measure identifies features such as discreteness by finding parallel lines, or smooth algebraic functions. Calculated by counting the proportion of vertices with exactly two edges whose adjacent edges are close to collinear: the cosine of the angle between them is below -0.75, i.e. the path through the vertex bends by no more than about 40 degrees.

`$$s_{striated}=\frac1{|V|}\sum_{v \in V^{(2)}}I(\cos\theta_{e(v,a)e(v,b)}<-0.75)$$`

<img src="images/week7/drawstriated.png" width="100%">
]
.pull-right[
**Monotonic:** Checks if the data has an increasing or decreasing trend. Calculated as the squared Spearman correlation coefficient, where the Spearman coefficient is the Pearson correlation between the ranks of x and y.

`$$s_{monotonic} = r^2_{spearman}$$`

<img src="images/week7/drawmonotonic.png" width="100%">
]

.footnote[Sketches made by Harriet Mason]

---
.pull-left[
**Splines:** Measures the functional non-linear dependence by fitting a penalised splines model on X using Y, and on Y using X. The residual variance of each fit is scaled by the variance of its response so that the two are comparable, and finally the maximum is taken. Therefore the value will be closer to 1 if either relationship can be decently explained by a splines model.

`$$s_{splines}=\max_{i\in \{x,y\}}\left[1-\frac{Var(Residuals_{model~i=.})}{Var(i)}\right]$$`

<img src="images/week7/drawsplines.png" width="100%">
]
.pull-right[
**Dcor:** A measure of non-linear dependence which is 0 if and only if the two variables are independent. Computed using an ANOVA-like calculation on the pairwise distances between observations.
`\(s_{dcor}= \sqrt{\frac{V(X,Y)}{\sqrt{V(X,X)V(Y,Y)}}}\)`

where `\(V(X,Y)=\frac{1}{n^2}\sum_{k=1}^n\sum_{l=1}^nA_{kl}B_{kl}\)`,

`\(A_{kl}=a_{kl}-\bar{a}_{k.}-\bar{a}_{.l}+\bar{a}_{..}\)`

`\(B_{kl}=b_{kl}-\bar{b}_{k.}-\bar{b}_{.l}+\bar{b}_{..}\)`

and `\(a_{kl}\)`, `\(b_{kl}\)` are the pairwise distances between observations `\(k\)` and `\(l\)` in X and Y respectively, with bars denoting row, column and overall means.

<img src="images/week7/drawdcor.png" width="100%">
]

.footnote[Sketches made by Harriet Mason]

---
# Scagnostics from familiar measures

There are many more ways to numerically characterise association that can be used as scagnostics too:

- We used those available in the [vaast]() R package
- Slope, intercept, and error estimate from a simple linear model
- Correlation
- Principal component analysis: first eigenvalue
- Linear discriminant analysis: ratio of between-group SS to within-group SS
- Cluster metrics
- Also see
    - tignostics for time series ([feasts](https://feasts.tidyverts.org) R package)
    - longnostics for longitudinal data ([brolgar](http://brolgar.njtierney.com) R package)

---
# Resources

- Friendly and Denis "Milestones in History of Thematic Cartography, Statistical Graphics and Data Visualisation" available at http://www.datavis.ca/milestones/
- Schloerke et al (2020). GGally: Extension to 'ggplot2'. https://ggobi.github.io/ggally.
- Wilkinson, Anand, Grossman (2005) Graph-Theoretic Scagnostics, http://papers.rgrossman.com/proc-094.pdf
- Grimm, K. (2016). Kennzahlenbasierte grafikauswahl (pp. III, 210) [Doctoral thesis]. Universität Augsburg.
- Hofmann et al (2020) binostics: Compute Scagnostics. R package version 0.1.2. https://CRAN.R-project.org/package=binostics
- O'Hara-Wild, Hyndman, Wang (2020).
https://CRAN.R-project.org/package=fabletools - Tierney, Cook, Prvan (2020) https://github.com/njtierney/brolgar --- background-size: cover class: title-slide background-image: url("images/bg-12.png") <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>. .bottom_abs.width100[ Lecturer: *Di Cook* <i class="fas fa-envelope"></i> ETC5521.Clayton-x@monash.edu <i class="fas fa-calendar-alt"></i> Week 8 - Session 1 <br> ]