This is a brief introduction to data visualization and basic statistical analysis with R. You will learn:
- dplyr for data frame manipulation
- ggplot2 to visualize data
dplyr and ggplot2 are both packages from the tidyverse. The tidyverse offers a syntax that is more human-interpretable and allows for greater customizability of your plots.
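R variable types: string (character), integer, float (numeric), and boolean (logical).
a <- "this is a string"
b <- 3L  # the L suffix makes this an integer; a bare 3 is stored as a float
c <- 3.14
d <- TRUE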
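To check which type R has assigned to a value, the built-in class() and typeof() functions can be used. The short sketch below uses its own illustrative variable names (they are not part of the tutorial data). Note that a bare number such as 3 is stored as a double unless it is written with the L suffix.

```r
# Illustrative values; the names are made up for this sketch
a_string  <- "this is a string"
an_int    <- 3L      # the L suffix requests an integer
a_double  <- 3.14
a_logical <- TRUE

class(a_string)   # "character"
class(an_int)     # "integer"
class(a_double)   # "numeric"
class(a_logical)  # "logical"

# A bare 3 is stored as a double, not an integer
typeof(3)         # "double"
typeof(3L)        # "integer"
```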
Printing in R:
print(a)
## [1] "this is a string"
You could also just type the variable or data frame name into the console without the print() function.
a
## [1] "this is a string"
Combining strings is a bit annoying. We need to use the paste function.
d <- "this is"
e <- "R"
paste(d, e)
## [1] "this is R"
Use paste0 if you want to concatenate strings without a space.
paste0(d, e)
## [1] "this isR"
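paste also accepts a sep argument for a custom separator and a collapse argument for joining the elements of one vector. A minimal sketch, using made-up variable names:

```r
first  <- "this is"
second <- "R"

# sep sets the separator placed between the arguments
paste(first, second, sep = "-")    # "this is-R"

# collapse joins the elements of a single vector into one string
words <- c("one", "two", "three")
paste(words, collapse = ", ")      # "one, two, three"
```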
Creating a vector of strings with c():
list_one <- c("one", "two", "three", "four", "five")
list_one
## [1] "one" "two" "three" "four" "five"
Important: R starts indexing at 1.
list_one[1]
## [1] "one"
Updating an element:
list_one[2] <- "cat"
list_one
## [1] "one" "cat" "three" "four" "five"
Slicing:
list_one[c(3:5)]
## [1] "three" "four" "five"
list_one[c(1,2,5)]
## [1] "one" "cat" "five"
The following prints the vector without the elements at index 1 and index 3. This is different from Python’s negative indices, which count from the end!
list_one[c(-1,-3)]
## [1] "cat" "four" "five"
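Since negative indices drop elements in R, getting the last element needs a different idiom than in Python. One possible sketch, using a made-up vector named items:

```r
items <- c("one", "two", "three", "four", "five")

# A negative index drops that position rather than counting from the end
items[-1]               # "two" "three" "four" "five"

# Last element: index with length(), or use tail()
items[length(items)]    # "five"
tail(items, 1)          # "five"
```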
Appending to a vector:
list_two <- append(list_one, "apple")
list_two
## [1] "one" "cat" "three" "four" "five" "apple"
Various kinds of numeric vectors can be easily generated with R, such as a sequence of integers or random draws from a normal distribution:
list_three <- 1:100
list_four <- rnorm(100)
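A few other base R generators are often handy. The sketch below uses illustrative names; set.seed() is included only so that the random draws are reproducible.

```r
set.seed(1)  # makes the random draws repeatable

evens  <- seq(2, 10, by = 2)          # 2 4 6 8 10
grid   <- seq(0, 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00
threes <- rep(3, times = 4)           # 3 3 3 3

# 100 draws from a normal distribution with mean 50 and sd 10
draws <- rnorm(100, mean = 50, sd = 10)
length(draws)                         # 100
```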
Looping over the elements with a for loop:
for (item in list_two) {
  print(item)
}
## [1] "one"
## [1] "cat"
## [1] "three"
## [1] "four"
## [1] "five"
## [1] "apple"
Conditionals with if, else if, and else:
a_number <- 33
if (a_number > 5) {
  print(paste(a_number, "is greater than 5"))
} else if (a_number < 5) {
  print(paste(a_number, "is smaller than 5"))
} else {
  print(paste(a_number, "is equal to 5"))
}
## [1] "33 is greater than 5"
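The if / else chain above tests a single value. For a whole vector, base R’s ifelse() applies the test element by element. A minimal sketch with made-up numbers:

```r
numbers <- c(2, 5, 33)

# One label per element of numbers
labels <- ifelse(numbers > 5, "greater than 5", "not greater than 5")
labels
# "not greater than 5" "not greater than 5" "greater than 5"
```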
When starting an R project, the first two steps are 1) installing and loading the necessary packages and 2) setting up your working directory so that your analysis doesn’t get lost.
First, let’s install the tidyverse packages:
install.packages("tidyverse")
This may take a while.
Now let’s load the tidyverse package.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
This library already includes the dplyr and ggplot2 packages, so we don’t need to load them separately.
You should set R’s working directory to the directory you’re working from. This is the directory where you will save your R scripts, any processed data frames, other outputs, figures, etc.
setwd("/Users/aletheia/Documents/PennCourses/Spring2025/LING2220/R/R1")
Similarly, you can use the getwd() command to figure out what your current working directory is.
getwd()
## [1] "/Users/aletheia/Documents/PennCourses/Spring2025/LING2220/R/R1"
A typical setup for a working directory has the following structure:
- Working directory
  - data
  - figs
  - output
The data folder should contain the raw, unprocessed data files. The figs folder will contain any figures you generate with R and decide to save. The output folder is for any other types of output you generate in your analysis process.
You can create the subfolders with your computer’s file explorer, or you can use the following R commands:
dir.create('figs')
dir.create('output')
dir.create('data')
These folders will be created in your working directory.
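dir.create() gives a warning if a folder already exists, so a script that is run more than once may want to check first. The sketch below is one way to do that. It uses a temporary folder (the hypothetical name demo_project) as a stand-in for your working directory so that it can be run anywhere.

```r
# A temporary folder stands in for the working directory in this sketch
project_root <- file.path(tempdir(), "demo_project")
dir.create(project_root, showWarnings = FALSE)

for (folder in c("data", "figs", "output")) {
  path <- file.path(project_root, folder)
  # Only create the folder if it is not already there
  if (!dir.exists(path)) {
    dir.create(path)
  }
}

list.files(project_root)   # "data" "figs" "output"
```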
We will be working with the Hillenbrand vowel data. Let’s first read in the data table.
vowels = read.table('data/htable.csv', header = TRUE, sep = ",")
The read.table() command uses the argument header = TRUE to tell R that this data file has header names (i.e., each column has a name), and sep = "," tells R that the columns are separated by commas. A shorthand for this command is read.csv('data/htable.csv').
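To see that the two calls read a simple file the same way, the sketch below writes a tiny made-up table to a temporary CSV file and reads it back with both functions. The file contents are invented for illustration. read.csv() also differs from read.table() in a few other defaults (quote, fill, comment.char), which do not matter for a plain file like this one.

```r
# Write a tiny made-up table to a temporary CSV file
tmp <- tempfile(fileext = ".csv")
writeLines(c("talker,vowel,F1", "1,ae,663", "2,ae,628"), tmp)

from_table <- read.table(tmp, header = TRUE, sep = ",")
from_csv   <- read.csv(tmp)

# Both calls give the same data frame for this file
all.equal(from_table, from_csv)   # TRUE
```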
head(vowels)
## mwbg talker vowel dur F0 F1 F2 F3 F4 F1.20 F2.20 F3.20 F1.50 F2.50
## 1 m 1 ae 323 174 663 2012 2659 3691 669 2008 2671 671 1992
## 2 m 2 ae 250 102 628 1871 2477 3489 627 1871 2456 636 1881
## 3 m 3 ae 344 99 605 1812 2570 0 608 1812 2572 618 1789
## 4 m 4 ae 312 124 627 1910 2488 3463 629 1882 2460 720 1750
## 5 m 6 ae 254 115 647 1864 2561 3506 642 1866 2557 666 1829
## 6 m 7 ae 254 96 582 1999 2567 3754 592 1958 2568 624 1925
## F3.50 F1.80 F2.80 F3.80
## 1 2659 685 1773 2680
## 2 2455 628 1793 2451
## 3 2618 632 1708 2693
## 4 2435 757 1563 2527
## 5 2499 689 1696 2556
## 6 2569 626 1791 2577
What kind of information do we have from the table?
Dimensions of the data frame:
dim(vowels)
## [1] 1668 18
Number of rows:
nrow(vowels)
## [1] 1668
Number of columns:
ncol(vowels)
## [1] 18
Get column names:
colnames(vowels)
## [1] "mwbg" "talker" "vowel" "dur" "F0" "F1" "F2" "F3"
## [9] "F4" "F1.20" "F2.20" "F3.20" "F1.50" "F2.50" "F3.50" "F1.80"
## [17] "F2.80" "F3.80"
Select a single column:
vowels$mwbg
## [1] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [19] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [37] "m" "m" "m" "m" "m" "m" "m" "m" "m" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [55] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [73] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [91] "w" "w" "w" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [109] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "g" "g" "g" "g" "g" "g"
## [127] "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "m" "m" "m" "m" "m"
## [145] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [163] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [181] "m" "m" "m" "m" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [199] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [217] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "b" "b"
## [235] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [253] "b" "b" "b" "b" "b" "b" "b" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g"
## [271] "g" "g" "g" "g" "g" "g" "g" "g" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [289] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [307] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "w"
## [325] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [343] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [361] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "b" "b" "b" "b" "b" "b" "b"
## [379] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [397] "b" "b" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g"
## [415] "g" "g" "g" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [433] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [451] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "w" "w" "w" "w" "w" "w"
## [469] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [487] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [505] "w" "w" "w" "w" "w" "w" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [523] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "g" "g" "g"
## [541] "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "m" "m"
## [559] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [577] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [595] "m" "m" "m" "m" "m" "m" "m" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [613] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [631] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [649] "w" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [667] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "g" "g" "g" "g" "g" "g" "g" "g"
## [685] "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "m" "m" "m" "m" "m" "m" "m"
## [703] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [721] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [739] "m" "m" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [757] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [775] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "b" "b" "b" "b"
## [793] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [811] "b" "b" "b" "b" "b" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g"
## [829] "g" "g" "g" "g" "g" "g" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [847] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [865] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "w" "w" "w"
## [883] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [901] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [919] "w" "w" "w" "w" "w" "w" "w" "w" "w" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [937] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [955] "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g"
## [973] "g" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [991] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [1009] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "w" "w" "w" "w" "w" "w" "w" "w"
## [1027] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [1045] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [1063] "w" "w" "w" "w" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [1081] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "g" "g" "g" "g" "g"
## [1099] "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "m" "m" "m" "m"
## [1117] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [1135] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [1153] "m" "m" "m" "m" "m" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [1171] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [1189] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "b"
## [1207] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [1225] "b" "b" "b" "b" "b" "b" "b" "b" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g"
## [1243] "g" "g" "g" "g" "g" "g" "g" "g" "g" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [1261] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [1279] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [1297] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [1315] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [1333] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "b" "b" "b" "b" "b" "b"
## [1351] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [1369] "b" "b" "b" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g"
## [1387] "g" "g" "g" "g" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [1405] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [1423] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "w" "w" "w" "w" "w"
## [1441] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [1459] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [1477] "w" "w" "w" "w" "w" "w" "w" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [1495] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "g" "g"
## [1513] "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "m"
## [1531] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [1549] "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m" "m"
## [1567] "m" "m" "m" "m" "m" "m" "m" "m" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [1585] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [1603] "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w" "w"
## [1621] "w" "w" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b"
## [1639] "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "g" "g" "g" "g" "g" "g" "g"
## [1657] "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g" "g"
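Printing a whole column of 1668 values fills the console. Base R’s head(), unique(), and length() give a quicker overview. The sketch uses a short made-up vector, group, as a stand-in for vowels$mwbg:

```r
# Made-up stand-in for a long column
group <- rep(c("m", "w", "b", "g"), times = c(3, 3, 2, 2))

head(group, 5)    # "m" "m" "m" "w" "w"
unique(group)     # "m" "w" "b" "g"
length(group)     # 10
```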
With categorical variables like speaker group (mwbg) and vowel, we should usually convert them to factors.
vowels$mwbg = as.factor(vowels$mwbg)
vowels$vowel = as.factor(vowels$vowel)
table(vowels$mwbg)
##
## b g m w
## 324 228 540 576
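as.factor() orders the levels alphabetically, which is why the table above lists b, g, m, w. If a different order is wanted (for example in plots), factor() with an explicit levels argument is one option. A minimal sketch on a made-up vector:

```r
group <- c("m", "w", "b", "m")

# Default: levels are sorted alphabetically
default_factor <- as.factor(group)
levels(default_factor)    # "b" "m" "w"

# Explicit order of levels
ordered_factor <- factor(group, levels = c("m", "w", "b"))
levels(ordered_factor)    # "m" "w" "b"

# Counts follow the level order: m = 2, w = 1, b = 1
table(ordered_factor)
```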
summary(vowels$F0)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 90 147 213 197 236 330
sd(vowels$F0)
## [1] 51.76277
Using the summary() and sd() functions, get the group means, sds, and medians for dur, F1, F2, F3.
dplyr operations
To select column(s):
vowels %>% select(talker, vowel, F1) %>% head
## talker vowel F1
## 1 1 ae 663
## 2 2 ae 628
## 3 3 ae 605
## 4 4 ae 627
## 5 6 ae 647
## 6 7 ae 582
Using the select function from dplyr, get the following subsets of columns:
To filter rows:
vowels %>% filter(vowel == 'uw' & F1 < 350)
## mwbg talker vowel dur F0 F1 F2 F3 F4 F1.20 F2.20 F3.20 F1.50 F2.50
## 1 m 2 uw 257 114 319 938 2091 2957 320 938 2092 320 931
## 2 m 7 uw 231 113 326 997 2384 3463 348 1020 2358 322 999
## 3 m 10 uw 237 156 338 1087 2515 0 382 1210 2475 328 1080
## 4 m 11 uw 327 161 319 936 2187 3346 317 893 2159 314 931
## 5 m 17 uw 210 174 339 860 2013 0 338 860 2000 367 837
## 6 m 30 uw 288 105 313 861 2374 3366 330 915 2390 309 860
## 7 m 40 uw 205 151 316 893 2385 3947 365 975 2378 316 893
## F3.50 F1.80 F2.80 F3.80
## 1 2111 331 1038 2073
## 2 2373 321 1047 2318
## 3 2513 317 1108 2342
## 4 2125 323 988 2144
## 5 1999 340 868 2084
## 6 2383 302 870 2388
## 7 2385 323 967 2357
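For comparison, the same kind of row filter can be written with base R’s bracket indexing: the condition goes before the comma and the column name(s) after it. The sketch below uses a small made-up data frame named toy rather than the vowel data.

```r
# Made-up data for illustration only
toy <- data.frame(
  talker = c(1, 2, 3, 4),
  vowel  = c("uw", "uw", "ae", "uw"),
  F1     = c(319, 400, 663, 338)
)

# Rows where vowel is "uw" and F1 is below 350, all columns
low_uw <- toy[toy$vowel == "uw" & toy$F1 < 350, ]
low_uw

# Same rows, but only the F1 column, returned as a plain vector
toy[toy$vowel == "uw" & toy$F1 < 350, "F1"]   # 319 338
```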
Using the filter function from dplyr, get the following subsets of data:
It’s usually easier to make categorical variables into factors so R doesn’t treat them as strings or continuous variables.
While base R has decent plotting functions, we will be using the ggplot2 package. It allows for greater customizability and generates prettier plots.
The typical first step in data analysis is visualization. Let’s make a basic boxplot to visualize group differences.
ggplot(data = vowels, aes(x = mwbg, y = F0)) +
  geom_boxplot() +
  xlab('Speaker group') +
  ylab('F0 (hertz)')
What do you observe?
Another way of visualizing group differences is the violin plot. It’s very similar to a boxplot, except that it also shows you the density of data points at different values.
ggplot(data = vowels, aes(x = mwbg, y = F0)) +
  geom_violin() +
  xlab('Speaker group') +
  ylab('F0 (hertz)')
What do you observe from the violin plot?
A density plot shows the distribution of F0 for each group as a smoothed curve:
ggplot(data = vowels, aes(x = F0, color = mwbg)) +
  geom_density() +
  xlab('F0 (hertz)') +
  ylab('density')
Create a boxplot visualizing the differences in F1 between the mwbg groups.
Means and standard deviations are common descriptive statistics to get started with.
vowels %>%
  group_by(mwbg) %>%
  summarise(f0.mean = mean(F0), f0.sd = sd(F0))
## # A tibble: 4 × 3
## mwbg f0.mean f0.sd
## <fct> <dbl> <dbl>
## 1 b 236. 28.3
## 2 g 238. 20.9
## 3 m 131. 22.0
## 4 w 220. 23.2
For the girls in the data set, what are the means and sds for dur, F1, F2, F3?
For the vowel ‘ae’, what are the means and sds for dur, F1, F2, F3?
First, a note about the normal distribution and parametric statistical methods. What is the normal distribution?
Let’s generate some normally distributed data and plot it to convince ourselves that the data points are normally distributed.
normal.data = data.frame(value = rnorm(1000))
ggplot(normal.data, aes(x = value)) +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
We can use the quantile-quantile plot (QQ plot) to visually assess whether our data is normal.
ggplot(data = normal.data, aes(sample = value)) + stat_qq() + stat_qq_line()
Normal data look like a straight line on the QQ plot.
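To see why normal data fall on a straight line, it can help to build the two axes of a QQ plot by hand: the sorted sample values against the quantiles a normal distribution would give at the same positions. The sketch below is a rough base R illustration with made-up samples. It summarizes "straightness" with a correlation, which is only an informal check and not part of the tutorial’s method.

```r
set.seed(42)  # reproducible random samples

normal_sample <- rnorm(1000)   # normally distributed
skewed_sample <- rexp(1000)    # clearly not normal (right-skewed)

# Theoretical normal quantiles at 1000 evenly spread probabilities
theoretical <- qnorm(ppoints(1000))

# Sorted sample values are the observed quantiles
qq_cor_normal <- cor(theoretical, sort(normal_sample))
qq_cor_skewed <- cor(theoretical, sort(skewed_sample))

qq_cor_normal   # very close to 1: points lie near a straight line
qq_cor_skewed   # noticeably lower: points bend away from the line
```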
Is our F0 data normal? Let’s do an overall plot.
ggplot(data = vowels, aes(sample = F0)) + stat_qq() + stat_qq_line()
Normally distributed data should look like a straight line. What might be causing the sharp rise in this plot?
Let’s separate the data by groups.
ggplot(data = vowels, aes(sample = F0, color = mwbg)) +
  stat_qq() +
  stat_qq_line()
Yes, within each group our data roughly follow the normal distribution, so we can use parametric statistical tests. Why?
Remember our research question: Do the different groups (men, women, boys, girls) have different f0 values?
What do p-values really tell us?
# null hypothesis: Men and women have the same group mean f0.
# reject if p < 0.05
t.test(vowels %>% filter(mwbg=="m") %>% select(F0), vowels %>% filter(mwbg=="w") %>% select(F0))
##
## Welch Two Sample t-test
##
## data: vowels %>% filter(mwbg == "m") %>% select(F0) and vowels %>% filter(mwbg == "w") %>% select(F0)
## t = -65.832, df = 1113.8, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -91.84056 -86.52449
## sample estimates:
## mean of x mean of y
## 131.2185 220.4010
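t.test() also accepts a formula of the form value ~ group, which avoids filtering the two groups by hand. The sketch below runs on simulated data (the data frame toy and its numbers are made up, not the Hillenbrand measurements) and shows that the formula call and the two-vector call give the same result.

```r
set.seed(1)  # reproducible simulated data

# Made-up F0 values for two groups of 50 speakers each
toy <- data.frame(
  group = rep(c("m", "w"), each = 50),
  F0    = c(rnorm(50, mean = 130, sd = 20), rnorm(50, mean = 220, sd = 20))
)

# Formula interface: compare F0 between the two levels of group
by_formula <- t.test(F0 ~ group, data = toy)

# Equivalent call with two plain numeric vectors
by_vectors <- t.test(toy$F0[toy$group == "m"], toy$F0[toy$group == "w"])

by_formula$p.value   # reject the null hypothesis if this is below 0.05
by_formula$estimate  # the two group means
```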
Let’s translate it into statistics-speak. Statistics is about whether we reject the null hypothesis or not.
Statistically, our null hypothesis is: Men, women, boys, and girls have the same f0 means.
Let’s do a nonparametric test just for fun.
wilcox.test(vowels[vowels$mwbg=="m", "F0"], vowels[vowels$mwbg=="w", "F0"])
##
## Wilcoxon rank sum test with continuity correction
##
## data: vowels[vowels$mwbg == "m", "F0"] and vowels[vowels$mwbg == "w", "F0"]
## W = 1938.5, p-value < 2.2e-16
## alternative hypothesis: true location shift is not equal to 0
Use a t.test to test the following null hypothesis: Boys and girls have the same mean F0. Do we reject this null hypothesis? Why or why not?
# null hypothesis: all groups have the same mean
anova.mod = aov(F0~mwbg, data=vowels)
summary(anova.mod)
## Df Sum Sq Mean Sq F value Pr(>F)
## mwbg 3 3536645 1178882 2110 <2e-16 ***
## Residuals 1664 929889 559
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
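The F value in the table can be reproduced from the other columns: each Mean Sq is Sum Sq divided by Df, and F is the ratio of the two mean squares. The sketch below copies the numbers from the ANOVA table above; the variable names are illustrative.

```r
# Numbers copied from the ANOVA table above
ss_between <- 3536645
df_between <- 3
ss_resid   <- 929889
df_resid   <- 1664

ms_between <- ss_between / df_between   # about 1178882
ms_resid   <- ss_resid / df_resid       # about 559
f_value    <- ms_between / ms_resid     # about 2110

# Upper-tail probability of the F distribution gives the p-value,
# which is extremely small here, consistent with the <2e-16 in the table
p_value <- pf(f_value, df_between, df_resid, lower.tail = FALSE)
```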
Use an anova test to test the following null hypothesis: all groups have the same mean dur. Do we reject this null hypothesis? Why or why not?
Let’s make a basic scatterplot:
ggplot(data = vowels) +
  geom_point(aes(x = F1, y = F2))
First, it seems that there are quite a few zero measurements. We want to exclude these. One way to do it is to set all the 0s to NA and then use R’s na.rm argument or the drop_na() function to exclude them.
vowels[vowels==0] = NA
ggplot(data = vowels %>% drop_na, aes(x = F1, y = F2)) +
  geom_point()
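The na.rm argument mentioned above belongs to summary functions such as mean() and sd(). The sketch below shows its effect on a short vector whose values are copied from the first four F4 entries in the head(vowels) output, where the third measurement is 0.

```r
# First four F4 values from head(vowels); the third one is a zero measurement
f4 <- c(3691, 3489, 0, 3463)

mean(f4)                 # 2660.75, pulled down by the zero

# Mark the zero as missing
f4[f4 == 0] <- NA

mean(f4)                 # NA, because one value is missing
mean(f4, na.rm = TRUE)   # 3547.667, computed from the remaining three values
```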
However, this scatterplot isn’t very informative. Let’s color the points by vowel:
ggplot(data = vowels %>% drop_na, aes(x = F1, y = F2, color = vowel)) +
  geom_point()
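There are many ways to proceed. One is to put F2 on the x-axis and F1 on the y-axis and reverse both axes, which gives the layout of a traditional vowel chart. We can then split the plot by speaker group and label the points by vowel.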
ggplot(data = vowels %>% drop_na, aes(x = F2, y = F1, color = vowel)) +
  geom_point() +
  scale_x_reverse() +
  scale_y_reverse()
ggplot(data = vowels %>% drop_na, aes(x = F2, y = F1, color = vowel)) +
  geom_point() +
  scale_x_reverse() +
  scale_y_reverse() +
  facet_wrap(. ~ mwbg)
ggplot(data = vowels %>% drop_na, aes(x = F2, y = F1, color = vowel)) +
  # alpha is a fixed setting, not a data mapping, so it goes outside aes()
  geom_text(aes(label = vowel), alpha = 0.5) +
  scale_x_reverse() +
  scale_y_reverse() +
  facet_wrap(. ~ mwbg) +
  stat_ellipse()
ggplot(vowels, aes(x = vowel, y = F0)) +
  geom_boxplot()
vowels$height = "high"
# Note: this only prints the ae/ah rows with height set to 'low'; vowels itself is not changed
vowels %>% filter(vowel %in% c('ae', 'ah')) %>% mutate(height='low')
## mwbg talker vowel dur F0 F1 F2 F3 F4 F1.20 F2.20 F3.20 F1.50 F2.50
## 1 m 1 ae 323 174 663 2012 2659 3691 669 2008 2671 671 1992
## 2 m 2 ae 250 102 628 1871 2477 3489 627 1871 2456 636 1881
## 3 m 3 ae 344 99 605 1812 2570 NA 608 1812 2572 618 1789
## 4 m 4 ae 312 124 627 1910 2488 3463 629 1882 2460 720 1750
## 5 m 6 ae 254 115 647 1864 2561 3506 642 1866 2557 666 1829
## 6 m 7 ae 254 96 582 1999 2567 3754 592 1958 2568 624 1925
## 7 m 8 ae 289 122 602 1880 2539 3584 606 1873 2539 612 1843
## 8 m 9 ae 339 120 545 1872 2630 NA 544 1872 2628 562 1694
## 9 m 10 ae 282 153 552 2027 2737 3588 533 2027 2721 570 1941
## 10 m 11 ae 319 136 629 1871 2574 3657 635 1879 2612 622 1885
## 11 m 13 ae 279 100 685 1807 2611 NA 687 1803 2642 685 1795
## 12 m 14 ae 272 106 559 1983 2560 3817 558 1997 2570 608 1877
## 13 m 16 ae 212 138 544 1819 2511 3266 549 1816 2511 551 1773
## 14 m 17 ae 257 135 685 1592 2308 NA 686 1602 2304 682 1695
## 15 m 18 ae 264 133 514 2031 2815 3618 514 2031 2815 545 2045
## 16 m 19 ae 244 149 612 1937 2605 3646 613 1941 2648 633 1828
## 17 m 20 ae 286 140 544 2114 2971 NA 544 2114 2971 556 1991
## 18 m 21 ae 252 108 556 1854 2679 3628 559 1877 2695 565 1817
## 19 m 22 ae 249 119 619 2023 2716 NA 619 2023 2716 632 1983
## 20 m 23 ae 274 112 601 1997 2499 4272 601 1997 2499 628 1904
## 21 m 24 ae 270 120 572 2072 2648 3949 574 2070 2636 625 1880
## 22 m 25 ae 301 115 546 1820 2548 3930 551 1792 2565 554 1819
## 23 m 26 ae 259 120 562 1895 2513 2937 567 1878 2509 561 1858
## 24 m 27 ae 257 105 594 2000 2550 3448 594 2001 2550 638 1863
## 25 m 28 ae 212 131 662 1934 2649 NA 640 1940 2698 627 1899
## 26 m 29 ae 241 114 566 2041 2500 3862 586 2005 2605 565 2040
## 27 m 30 ae 302 96 559 1873 2509 3673 560 1866 2541 567 1786
## 28 m 31 ae 208 143 579 1982 2639 NA 572 1945 2618 624 1874
## 29 m 32 ae 208 149 623 1937 2526 NA 627 1948 2571 680 1867
## 30 m 33 ae 328 136 555 1996 2609 3666 558 1989 2637 560 1945
## 31 m 34 ae 267 129 555 1849 2502 3511 555 1849 2502 569 1767
## 32 m 35 ae 326 110 634 1730 2456 3220 638 1739 2462 651 1683
## 33 m 36 ae 270 121 570 1856 2630 3473 569 1865 2638 621 1790
## 34 m 37 ae 207 130 614 1768 2393 3372 615 1762 2384 617 1721
## 35 m 38 ae 213 131 553 2140 2327 4278 552 2062 2332 569 1953
## 36 m 39 ae 311 153 605 2002 2666 3563 591 2005 2606 642 1870
## 37 m 40 ae 273 140 597 1989 2677 4252 608 1973 2647 623 1929
## 38 m 41 ae 230 139 561 1772 2597 3547 559 1781 2602 615 1624
## 39 m 42 ae 334 128 615 1885 2709 3581 617 1875 2717 646 1876
## 40 m 44 ae 276 98 577 1952 2588 NA 570 1937 2555 614 1788
## 41 m 45 ae 283 187 629 2436 3022 NA 626 2368 2987 657 2324
## 42 m 47 ae 292 129 511 1811 2529 3556 514 1813 2515 563 1750
## 43 m 48 ae 245 119 590 1830 2512 3997 580 1832 2476 577 1812
## 44 m 49 ae 266 123 622 1957 2841 NA 622 1956 2832 615 1966
## 45 m 50 ae 289 116 565 2055 2575 3380 566 2043 2537 617 1918
## 46 w 1 ae 305 225 678 2293 2861 4412 681 2295 2868 711 2160
## 47 w 2 ae 486 214 624 2442 3091 5306 621 2470 3042 755 2237
## 48 w 3 ae 293 192 666 2370 2814 3706 701 2328 2768 799 2074
## 49 w 4 ae 353 233 743 2230 3055 4476 759 2193 3073 812 1948
## 50 w 5 ae 338 223 677 2320 2987 5230 678 2263 2987 804 2056
## 51 w 6 ae 362 223 627 2266 2875 4085 628 2224 2918 730 1983
## 52 w 7 ae 313 176 690 2327 2771 4089 695 2277 2837 805 2088
## 53 w 8 ae 284 238 658 2650 3471 4199 668 2608 3520 719 2268
## 54 w 9 ae 253 251 685 2299 2930 4119 688 2330 2808 714 2245
## 55 w 10 ae 356 227 621 2249 2873 3978 609 2261 2917 649 2148
## 56 w 11 ae 385 188 868 2004 2797 NA 801 2012 2772 878 1919
## 57 w 12 ae 355 229 682 2486 3207 4450 674 2484 3140 694 2442
## 58 w 13 ae 315 237 668 2252 2790 4181 676 2246 2776 727 1921
## 59 w 14 ae 272 214 726 2350 2996 4523 697 2358 3047 789 2326
## 60 w 15 ae 400 226 620 2316 2849 4151 602 2378 2931 687 1962
## 61 w 16 ae 356 202 634 2596 3255 4605 651 2545 3169 778 2276
## 62 w 17 ae 323 212 672 2145 2764 4205 678 2218 2772 688 2136
## 63 w 19 ae 363 237 698 2339 3075 4353 691 2357 3073 831 2101
## 64 w 20 ae 307 187 586 2299 2748 NA 593 2299 2740 724 2132
## 65 w 21 ae 382 217 616 2156 2865 3896 616 2151 2865 676 1997
## 66 w 22 ae 338 186 576 2429 3101 4327 573 2431 3104 685 2308
## 67 w 23 ae 365 213 674 2256 2818 4355 669 2297 2795 772 2131
## 68 w 24 ae 331 248 738 2378 3305 4314 737 2417 3316 754 2233
## 69 w 25 ae 461 220 646 2406 3283 4709 652 2322 3268 802 2084
## 70 w 26 ae 346 168 746 1944 2927 4355 744 2008 2912 749 1956
## 71 w 27 ae 272 246 734 2518 3176 NA 728 2530 3177 727 2504
## 72 w 28 ae 260 225 662 2276 2955 4143 662 2270 2938 746 2068
## 73 w 29 ae 333 238 696 2447 3122 5024 700 2456 3136 747 2344
## 74 w 30 ae 310 238 687 2378 2913 3973 703 2379 2942 922 2125
## 75 w 31 ae 364 205 645 2154 3139 4027 671 2166 3036 749 2098
## 76 w 32 ae 239 208 626 2374 2836 NA 620 2373 2795 747 2090
## 77 w 33 ae 261 235 689 2701 3490 4513 689 2774 3534 905 2495
## 78 w 34 ae 277 218 557 2586 3202 NA 606 2514 2902 689 2104
## 79 w 35 ae 328 211 668 2296 2711 NA 666 2281 2685 745 2078
## 80 w 36 ae 413 200 746 2371 2984 4115 746 2385 2983 789 2204
## 81 w 37 ae 405 192 893 2070 3024 5118 898 2088 2916 907 2051
## 82 w 38 ae 294 219 665 2408 3034 4290 659 2436 3026 734 2114
## 83 w 39 ae 365 192 564 2442 NA 4038 565 2446 NA 710 2057
## 84 w 40 ae 301 197 714 2254 2625 NA 727 2248 2604 747 2069
## 85 w 41 ae 301 216 625 2594 3146 4003 625 2625 3158 653 2464
## 86 w 42 ae 312 222 552 2227 2978 NA 569 2204 3004 654 2070
## 87 w 44 ae 353 230 685 2205 2813 NA 681 2232 2839 728 1978
## 88 w 45 ae 333 208 657 2192 2654 4122 649 2235 2698 754 2028
## 89 w 46 ae 365 156 649 2508 3050 NA 612 2532 2973 736 2419
## 90 w 47 ae 327 211 817 2102 2711 4076 818 2122 2679 866 1989
## 91 w 48 ae 310 210 626 2331 2826 4005 623 2266 2852 735 1897
## 92 w 49 ae 319 209 706 2400 2923 NA 724 2327 2929 754 2244
## 93 w 50 ae 357 209 751 2432 2896 4181 750 2432 2896 816 2093
## 94 b 1 ae 257 238 630 2423 3166 4495 651 2413 3115 683 2295
## 95 b 2 ae 359 286 829 2495 3218 NA 778 2461 3424 835 2491
## 96 b 3 ae 335 214 631 2801 3508 NA 602 2760 3453 589 2686
## 97 b 4 ae 398 239 712 2608 3247 NA 712 2608 3247 690 2416
## 98 b 5 ae 267 200 748 2589 3042 5074 752 2562 3033 815 2498
## 99 b 7 ae 323 262 769 2203 3126 4128 760 2169 3144 862 2154
## 100 b 8 ae 316 216 870 2281 3077 NA 820 2239 3181 869 2267
## 101 b 9 ae 245 220 709 2565 3526 NA 709 2565 3526 683 2476
## 102 b 10 ae 396 205 634 2555 3121 4492 642 2559 3126 710 2498
## 103 b 11 ae 298 209 630 2509 3112 4573 627 2513 3098 693 2411
## 104 b 12 ae 415 252 736 2505 3332 4874 736 2504 3307 771 2326
## 105 b 13 ae 281 216 634 2535 3260 4479 630 2532 3248 673 2481
## 106 b 14 ae 314 198 697 2418 3371 4322 657 2471 3376 760 2320
## 107 b 15 ae 382 272 607 2620 3350 4534 617 2599 3369 752 2382
## 108 b 16 ae 367 187 753 2227 3064 NA 750 2233 3042 746 2235
## 109 b 17 ae 352 246 726 2231 2932 3843 742 2246 2902 767 2003
## 110 b 18 ae 307 249 741 2444 3043 4430 746 2455 3021 819 2119
## 111 b 19 ae 312 209 674 2663 3243 NA 693 2672 3256 713 2559
## 112 b 21 ae 352 205 769 2234 2910 4034 771 2215 2889 771 2047
## 113 b 22 ae 256 229 678 2524 3418 4460 678 2501 3424 595 2370
## 114 b 23 ae 346 267 809 2592 3331 NA 796 2595 3344 978 2309
## 115 b 24 ae 216 206 545 2690 NA 4362 536 2698 NA 674 2319
## 116 b 25 ae 284 223 669 2440 NA 4173 675 2441 NA 765 2275
## 117 b 26 ae 451 220 643 2434 3326 NA 643 2422 3335 791 2094
## 118 b 27 ae 243 211 634 2410 3303 4666 675 2370 3261 787 2152
## 119 b 28 ae 284 227 676 2253 3121 4122 681 2257 3146 744 2053
## 120 b 29 ae 291 212 860 2503 3002 4345 867 2468 3022 907 2232
## 121 g 1 ae 295 242 741 2433 3341 4110 712 2329 3328 875 2176
## 122 g 2 ae 283 196 729 2878 3792 NA 729 2878 3792 654 2804
## 123 g 4 ae 385 255 932 2523 3644 NA 905 2512 3704 977 2325
## 124 g 5 ae 456 227 682 2638 3510 4372 678 2616 3481 723 2494
## 125 g 6 ae 292 222 799 2397 3125 4312 804 2410 3126 867 2441
## 126 g 7 ae 341 242 722 2418 3017 NA 701 2412 3021 683 2355
## 127 g 8 ae 384 266 695 2704 3731 4633 702 2702 3693 761 2550
## 128 g 9 ae 382 265 758 2373 3327 4766 757 2406 3363 758 2358
## 129 g 10 ae 311 237 946 2540 3229 NA 938 2561 3205 887 2566
## 130 g 11 ae 353 230 648 2334 2834 NA 648 2334 2834 864 2213
## 131 g 12 ae 322 240 730 2668 3564 NA 753 2680 3559 876 2465
## 132 g 13 ae 301 251 752 2501 NA NA 750 2509 NA 832 2383
## 133 g 14 ae 306 194 717 2313 3150 NA 717 2313 3150 753 2277
## 134 g 15 ae 354 226 888 2605 3651 NA 887 2619 3592 1018 2511
## 135 g 17 ae 308 208 727 2394 3320 4450 749 2408 3317 832 2131
## 136 g 18 ae 317 219 591 2632 NA NA 556 2610 NA 591 2504
## 137 g 19 ae 238 248 747 2703 3829 NA 744 2708 3830 792 2405
## 138 g 20 ae 290 227 672 2484 3492 NA 754 2523 3470 1070 2100
## 139 g 21 ae 250 227 555 2569 3424 4677 570 2549 3424 923 2115
## 140 m 1 ah 316 159 813 1283 2687 3739 809 1280 2687 839 1259
## 141 m 2 ah 249 101 749 1060 2842 3792 718 1049 2804 742 1109
## 142 m 3 ah 373 97 755 1133 2695 3297 750 1117 2692 766 1161
## 143 m 4 ah 302 127 832 1222 2624 4050 831 1183 2555 819 1261
## 144 m 6 ah 230 112 871 1204 2595 3480 873 1209 2583 852 1211
## 145 m 7 ah 265 98 786 1341 2403 3717 761 1330 2422 755 1341
## 146 m 8 ah 302 115 748 1293 2446 3383 749 1288 2454 734 1318
## 147 m 9 ah 330 122 738 1394 2522 NA 738 1372 2542 694 1399
## 148 m 10 ah 271 152 763 1147 2840 NA 763 1147 2840 762 1228
## 149 m 11 ah 321 148 829 1444 2241 NA 807 1382 2167 834 1470
## 150 m 13 ah 256 104 825 1429 2701 3443 789 1413 2635 802 1413
## 151 m 14 ah 278 100 737 1298 2323 3320 737 1298 2323 723 1369
## 152 m 16 ah 192 145 679 1208 2630 3506 679 1208 2630 645 1215
## 153 m 17 ah 243 132 689 1064 2303 NA 654 973 2280 686 1001
## 154 m 18 ah 342 135 702 1364 2498 3421 732 1418 2484 707 1404
## 155 m 19 ah 269 148 811 1355 2599 NA 756 1349 2700 811 1356
## 156 m 20 ah 272 135 744 1489 2586 NA 745 1489 2586 753 1506
## 157 m 21 ah 206 114 758 1363 2421 3520 758 1363 2401 771 1347
## 158 m 22 ah 287 115 784 1345 2522 3942 808 1359 2568 804 1395
## 159 m 23 ah 278 107 802 1297 2765 4049 790 1295 2771 757 1316
## 160 m 24 ah 258 131 803 1234 2430 3862 810 1227 2443 816 1342
## 161 m 25 ah 256 116 683 1238 2428 3826 682 1176 2399 683 1238
## 162 m 26 ah 271 124 697 1370 2597 NA 710 1368 2585 690 1351
## 163 m 27 ah 212 106 743 1423 2494 3682 720 1423 2477 716 1458
## 164 m 28 ah 154 173 710 1084 2753 NA 695 1032 2758 762 1069
## 165 m 29 ah 220 123 825 1438 2434 3788 822 1432 2382 830 1453
## 166 m 30 ah 290 101 673 1301 2433 3886 673 1301 2432 678 1342
## 167 m 31 ah 184 160 707 1421 2403 4126 700 1430 2378 693 1440
## 168 m 32 ah 216 146 697 1293 2461 NA 702 1303 2487 740 1259
## 169 m 33 ah 272 135 816 1361 2493 3846 835 1360 2491 808 1422
## 170 m 34 ah 250 115 722 1411 2264 3368 735 1423 2249 693 1420
## 171 m 35 ah 269 109 748 1274 2406 3201 748 1274 2406 771 1308
## 172 m 36 ah 252 116 744 1388 2541 3321 762 1393 2541 727 1433
## 173 m 37 ah 219 127 700 1295 2310 NA 700 1287 2310 700 1296
## 174 m 38 ah 192 148 705 1323 2643 3820 703 1324 2653 701 1328
## 175 m 39 ah 287 139 963 1524 2552 3494 970 1490 2491 954 1565
## 176 m 40 ah 254 126 711 1428 2413 3978 705 1446 2407 686 1438
## 177 m 41 ah 213 138 712 1325 2624 3574 712 1328 2619 718 1314
## 178 m 42 ah 292 140 821 1283 2498 4190 821 1283 2498 850 1347
## 179 m 44 ah 286 101 822 1315 2708 3689 837 1343 2694 791 1303
## 180 m 45 ah 285 180 753 1200 2677 NA 772 1203 2712 932 1161
## 181 m 47 ah 301 120 662 1183 2374 NA 662 1190 2343 661 1234
## 182 m 48 ah 207 122 682 1259 2618 3833 679 1263 2618 677 1281
## 183 m 49 ah 270 124 818 1315 2697 NA 821 1337 2695 806 1315
## 184 m 50 ah 250 110 710 1483 2575 3748 713 1511 2561 721 1489
## 185 w 1 ah 265 211 1012 1603 2767 4281 1001 1637 2762 1058 1692
## 186 w 2 ah 443 209 883 1682 2962 4059 908 1677 2903 885 1706
## 187 w 3 ah 257 192 1025 1548 2748 5478 1053 1574 2748 1014 1619
## 188 w 4 ah 350 216 804 1484 2789 4355 840 1474 2764 798 1498
## 189 w 5 ah 365 226 935 1377 2598 4753 938 1423 2626 961 1499
## 190 w 6 ah 329 211 804 1363 2803 NA 794 1408 2756 827 1382
## 191 w 7 ah 306 173 939 1233 2564 4119 920 1191 2670 942 1237
## 192 w 8 ah 317 243 827 1701 3056 3892 845 1702 3010 817 1734
## 193 w 9 ah 282 240 913 1436 2589 4097 913 1436 2589 961 1496
## 194 w 10 ah 367 223 856 1540 2667 3942 855 1467 2625 840 1541
## 195 w 11 ah 380 191 882 1380 2834 4120 933 1458 2872 877 1421
## 196 w 12 ah 343 216 887 1743 2557 NA 885 1711 2504 902 1738
## 197 w 13 ah 329 226 869 1495 2731 3937 897 1525 2748 929 1545
## 198 w 14 ah 275 211 994 1609 2930 NA 997 1613 2934 996 1663
## 199 w 15 ah 340 226 955 1615 2678 3681 908 1640 2628 947 1623
## 200 w 16 ah 305 189 1085 1687 2870 4578 1149 1762 2890 1072 1733
## 201 w 17 ah 267 205 937 1623 2844 3898 939 1617 2844 941 1608
## 202 w 19 ah 316 239 918 1640 2801 4234 939 1584 2687 918 1640
## 203 w 20 ah 365 193 818 1351 2933 4160 842 1320 2935 922 1384
## 204 w 21 ah 402 216 769 1451 2898 3949 763 1379 2861 771 1472
## 205 w 22 ah 350 178 1008 1495 3018 4151 998 1482 3065 1005 1498
## 206 w 23 ah 329 207 905 1514 2848 4537 930 1494 2849 851 1536
## 207 w 24 ah 303 259 1063 1680 2785 4405 1042 1667 2791 1042 1740
## 208 w 25 ah 430 212 1053 1677 2868 4142 1066 1679 2866 994 1649
## 209 w 26 ah 333 170 798 1351 2851 4228 805 1363 2803 782 1370
## 210 w 27 ah 284 250 869 1751 2762 4207 891 1749 2780 930 1864
## 211 w 28 ah 217 235 863 1538 3038 4298 863 1538 3038 866 1625
## 212 w 29 ah 303 248 938 1462 2825 4715 914 1395 2836 1064 1469
## 213 w 30 ah 293 232 957 1591 2733 5540 967 1579 2713 903 1637
## 214 w 31 ah 315 202 827 1543 2980 4355 878 1591 3082 823 1566
## 215 w 32 ah 240 213 708 1547 2426 NA 695 1541 2418 715 1571
## 216 w 33 ah 283 241 1163 1685 3250 NA 1152 1676 3352 1117 1707
## 217 w 34 ah 247 211 1011 1541 2803 4256 1013 1548 2798 980 1616
## 218 w 35 ah 310 201 1035 1486 2865 4155 1026 1490 2866 1004 1538
## 219 w 36 ah 455 208 952 1676 2862 4116 939 1616 2909 892 1681
## 220 w 37 ah 419 175 810 1314 3236 NA 833 1238 3279 838 1330
## 221 w 38 ah 336 217 993 1671 2751 4150 997 1671 2758 928 1647
## 222 w 39 ah 338 212 931 1348 2698 4540 958 1301 2603 947 1363
## 223 w 40 ah 297 198 822 1371 3130 4176 816 1390 3205 823 1367
## 224 w 41 ah 298 200 875 1491 2979 4155 890 1513 2984 864 1537
## 225 w 42 ah 268 226 884 1568 2819 NA 900 1612 2812 880 1561
## 226 w 44 ah 340 212 938 1492 2673 3930 942 1492 2677 954 1531
## 227 w 45 ah 312 195 914 1383 2854 4456 944 1379 2810 930 1438
## 228 w 46 ah 334 150 901 1509 2670 5312 956 1515 2783 909 1613
## 229 w 47 ah 302 207 975 1534 2658 4104 963 1473 2607 975 1534
## 230 w 48 ah 346 218 796 1502 2810 3801 818 1491 2778 796 1502
## 231 w 49 ah 326 224 1145 NA 3272 4370 1149 NA 3276 974 NA
## 232 w 50 ah 357 234 968 1433 2850 4296 965 1427 2844 944 1520
## 233 b 1 ah 212 241 831 1676 2602 5616 845 1684 2583 863 1696
## 234 b 2 ah 328 276 1020 1555 2742 NA 1023 1555 2809 1003 1501
## 235 b 3 ah 298 214 982 1823 2865 4624 997 1834 2813 984 1850
## 236 b 4 ah 357 250 932 1874 2994 4636 936 1889 3054 935 1897
## 237 b 5 ah 270 227 1052 1659 3433 4784 1084 1677 3347 1044 1679
## 238 b 7 ah 294 308 881 1543 3076 4050 810 1605 3061 882 1542
## 239 b 8 ah 340 202 762 1604 2991 NA 969 1609 2978 981 1674
## 240 b 9 ah 285 230 987 1610 3010 4334 1011 1589 3055 1012 1646
## 241 b 10 ah 413 209 1123 1640 2684 3752 1147 1662 2728 1131 1672
## 242 b 11 ah 282 203 832 1602 2551 3921 849 1604 2585 822 1564
## 243 b 12 ah 425 258 968 1808 3600 4732 972 1850 3626 976 1858
## 244 b 13 ah 273 211 933 1694 2860 4486 914 1686 2822 945 1673
## 245 b 14 ah 263 201 998 1560 2745 3923 970 1529 2800 992 1559
## 246 b 15 ah 340 287 871 2042 3410 NA 838 1968 3271 869 2045
## 247 b 16 ah 364 187 859 NA 2789 5556 836 NA 2774 854 1401
## 248 b 17 ah 327 245 898 1544 2862 3810 920 1491 2814 945 1498
## 249 b 18 ah 281 254 1010 1705 2753 3899 1007 1719 2828 1031 1668
## 250 b 19 ah 292 218 951 1982 2807 NA 937 2048 2859 854 1979
## 251 b 21 ah 363 226 1010 1604 2588 3810 979 1590 2574 978 1686
## 252 b 22 ah 350 263 854 1518 2988 4522 903 1499 2991 871 1655
## 253 b 23 ah 292 277 1119 1588 3010 3974 1115 1577 3013 1188 1462
## 254 b 24 ah 238 228 1205 NA 2708 3883 1219 NA 2732 1196 NA
## 255 b 25 ah 295 220 872 1565 2531 4389 868 1587 2531 885 1746
## 256 b 26 ah 314 210 776 1490 2730 3929 765 1481 2726 785 1488
## 257 b 27 ah 270 223 1147 1553 2877 3946 1143 1553 2873 1077 1562
## 258 b 28 ah 271 221 1065 1483 2723 3726 1068 1516 2780 1048 1455
## 259 b 29 ah 285 227 944 1540 2896 4298 942 1548 2892 931 1587
## 260 g 1 ah 276 238 894 1420 2828 4068 873 1416 2866 906 1453
## 261 g 2 ah 298 208 1312 1820 3308 4288 1333 1853 3231 1364 1915
## 262 g 4 ah 324 246 1007 1742 2982 NA 933 1709 3102 1036 1742
## 263 g 5 ah 375 204 978 1817 2843 4425 1030 1758 2864 978 1818
## 264 g 6 ah 275 225 1026 1544 2536 4225 1025 1558 2532 1014 1521
## 265 g 7 ah 384 227 843 1803 2562 NA 872 1782 2583 860 1817
## 266 g 8 ah 407 249 993 2000 3174 4186 998 2005 3198 971 2070
## 267 g 9 ah 321 245 1197 1734 3187 4712 1180 1637 3243 1127 1717
## 268 g 10 ah 282 218 855 1845 3055 4493 855 1845 3055 1003 1933
## 269 g 11 ah 298 240 1047 1488 2879 NA 1062 1514 2846 1066 1575
## 270 g 12 ah 317 200 1154 1932 3044 4564 1103 1885 3192 1159 1945
## 271 g 13 ah 334 241 931 1749 3205 4326 987 1727 3322 931 1749
## 272 g 14 ah 308 189 954 1472 2844 3811 972 1395 2898 971 1490
## 273 g 15 ah 299 233 1316 1752 3113 4180 1245 1731 3177 1246 1787
## 274 g 17 ah 304 248 911 1624 3230 4155 923 1617 3222 933 1691
## 275 g 18 ah 416 214 1033 1929 3428 NA 1064 1931 3357 1066 1922
## 276 g 19 ah 220 235 1129 2005 2826 3914 1129 1988 2709 980 2030
## 277 g 20 ah 259 234 1145 1655 3062 NA 1072 1622 3245 1150 1726
## 278 g 21 ah 186 215 1021 1720 3186 4359 1075 1745 3268 1050 1773
## F3.50 F1.80 F2.80 F3.80 height
## 1 2659 685 1773 2680 low
## 2 2455 628 1793 2451 low
## 3 2618 632 1708 2693 low
## 4 2435 757 1563 2527 low
## 5 2499 689 1696 2556 low
## 6 2569 626 1791 2577 low
## 7 2509 620 1677 2553 low
## 8 2614 563 1616 2668 low
## 9 2711 637 1517 2718 low
## 10 2555 679 1736 2582 low
## 11 2619 710 1748 2790 low
## 12 2520 618 1708 2550 low
## 13 2457 564 1633 2516 low
## 14 2295 601 1693 2391 low
## 15 2800 628 1870 2796 low
## 16 2603 646 1701 2665 low
## 17 3050 587 1735 2986 low
## 18 2566 643 1646 2582 low
## 19 2699 682 1798 2621 low
## 20 2515 665 1748 2725 low
## 21 2576 632 1698 2627 low
## 22 2565 567 1650 2597 low
## 23 2483 566 1667 2505 low
## 24 2513 663 1750 2499 low
## 25 2694 625 1728 2711 low
## 26 2546 575 1870 2598 low
## 27 2494 570 1681 2566 low
## 28 2604 647 1813 2605 low
## 29 2486 637 1759 2559 low
## 30 2572 590 1753 2443 low
## 31 2371 565 1711 2536 low
## 32 2413 629 1635 2421 low
## 33 2612 626 1677 2617 low
## 34 2327 599 1655 2366 low
## 35 2345 624 1707 2317 low
## 36 2488 658 1744 2519 low
## 37 2618 684 1745 2616 low
## 38 2511 566 1596 2514 low
## 39 2767 702 1859 2704 low
## 40 2650 624 1561 2741 low
## 41 2887 802 2165 2848 low
## 42 2516 563 1603 2544 low
## 43 2352 619 1749 2511 low
## 44 2831 631 1814 2774 low
## 45 2491 619 1676 2542 low
## 46 2867 705 1968 2906 low
## 47 3033 712 2000 3069 low
## 48 2595 925 1983 2695 low
## 49 2951 805 1923 3119 low
## 50 2808 806 1988 2810 low
## 51 2791 763 1788 2760 low
## 52 2709 816 1938 2753 low
## 53 3435 734 2119 3525 low
## 54 2975 774 1870 2741 low
## 55 2934 734 1826 2858 low
## 56 2746 928 1707 2691 low
## 57 2994 803 2051 2762 low
## 58 2727 781 1875 2757 low
## 59 3055 812 2035 2826 low
## 60 2749 709 1774 2819 low
## 61 2954 842 1913 3064 low
## 62 2751 794 1884 2784 low
## 63 3008 889 1950 3021 low
## 64 2697 930 1817 2610 low
## 65 2821 746 1874 2867 low
## 66 2970 866 1811 2938 low
## 67 2712 791 2008 2877 low
## 68 3270 816 2140 3263 low
## 69 3154 776 1935 3083 low
## 70 2940 721 1891 2920 low
## 71 3095 835 2028 3029 low
## 72 3004 749 1890 3093 low
## 73 3013 982 2027 2830 low
## 74 2926 902 1858 2995 low
## 75 3030 756 1920 2931 low
## 76 2600 779 1871 2559 low
## 77 3527 962 2174 3602 low
## 78 2802 731 1916 2854 low
## 79 2654 769 1871 2680 low
## 80 2999 807 1944 2915 low
## 81 3016 874 2026 3028 low
## 82 2950 747 2005 2859 low
## 83 2591 988 1777 2717 low
## 84 2698 790 1895 2748 low
## 85 3146 773 2080 3187 low
## 86 2844 766 1746 2809 low
## 87 2743 770 1494 2604 low
## 88 2574 880 1809 2706 low
## 89 3082 824 2126 3018 low
## 90 2704 832 1900 2720 low
## 91 2817 741 1811 2888 low
## 92 2940 697 2084 3077 low
## 93 2927 829 2044 2973 low
## 94 2888 806 2049 2961 low
## 95 3212 897 2133 3384 low
## 96 3389 946 2272 3284 low
## 97 3270 809 2149 2907 low
## 98 3123 900 2130 3337 low
## 99 3193 873 2052 3149 low
## 100 3103 862 2331 3144 low
## 101 3326 826 2225 3022 low
## 102 3106 832 2305 2951 low
## 103 3058 743 2066 2988 low
## 104 3178 797 2242 3112 low
## 105 3187 830 2136 2943 low
## 106 3276 908 2040 2841 low
## 107 3399 809 2185 3361 low
## 108 2931 825 2080 2912 low
## 109 2873 921 1870 2958 low
## 110 2942 888 2109 2858 low
## 111 3233 764 2313 3107 low
## 112 2753 760 1911 2718 low
## 113 3381 614 2051 3333 low
## 114 3264 1010 1896 3137 low
## 115 2733 795 2060 2739 low
## 116 NA 838 2023 NA low
## 117 2958 917 1897 3062 low
## 118 3175 745 2128 3236 low
## 119 3010 789 1783 2876 low
## 120 2886 1010 1973 2896 low
## 121 2888 889 2008 2939 low
## 122 3636 1000 2236 3285 low
## 123 3434 949 2100 3110 low
## 124 3459 868 2112 3159 low
## 125 3052 886 2023 3152 low
## 126 2963 552 2211 3007 low
## 127 3655 777 2129 3495 low
## 128 3313 850 2122 3223 low
## 129 3147 878 2279 3172 low
## 130 2842 881 2032 2911 low
## 131 3251 899 2284 3112 low
## 132 NA 894 2159 NA low
## 133 3150 755 2159 3170 low
## 134 3367 1055 2134 2867 low
## 135 3165 812 2063 3271 low
## 136 NA 944 2297 NA low
## 137 3547 916 2177 3339 low
## 138 3162 1048 2009 3316 low
## 139 3286 969 2057 3427 low
## 140 2629 752 1496 2620 low
## 141 2804 760 1138 2758 low
## 142 2740 754 1159 2691 low
## 143 2607 804 1320 2592 low
## 144 2607 799 1325 2557 low
## 145 2412 695 1581 2522 low
## 146 2457 697 1440 2477 low
## 147 2544 622 1501 2602 low
## 148 2548 683 1400 2297 low
## 149 2244 670 1590 2526 low
## 150 2703 702 1542 2739 low
## 151 2306 657 1492 2397 low
## 152 2571 634 1369 2553 low
## 153 2274 641 1376 2324 low
## 154 2493 684 1506 2652 low
## 155 2581 698 1497 2666 low
## 156 2515 661 1586 2504 low
## 157 2382 691 1432 2408 low
## 158 2506 760 1544 2507 low
## 159 2763 703 1542 2809 low
## 160 2462 721 1571 2501 low
## 161 2428 634 1342 2404 low
## 162 2461 639 1375 2486 low
## 163 2492 680 1611 2433 low
## 164 2861 749 1301 2807 low
## 165 2499 787 1538 2596 low
## 166 2422 624 1465 2400 low
## 167 2425 681 1431 2648 low
## 168 2557 715 1441 2545 low
## 169 2496 747 1572 2498 low
## 170 2208 644 1443 2206 low
## 171 2423 730 1457 2441 low
## 172 2491 656 1509 2502 low
## 173 2320 679 1394 2258 low
## 174 2595 679 1471 2591 low
## 175 2460 946 1586 2484 low
## 176 2378 683 1515 2489 low
## 177 2627 598 1532 2606 low
## 178 2438 816 1468 2479 low
## 179 2715 675 1432 2624 low
## 180 2955 870 1691 2869 low
## 181 2349 628 1302 2279 low
## 182 2607 642 1372 2664 low
## 183 2677 753 1514 2624 low
## 184 2517 710 1557 2542 low
## 185 2801 886 1784 2807 low
## 186 3048 783 1863 3081 low
## 187 2745 970 1787 2900 low
## 188 2803 759 1574 2758 low
## 189 2615 864 1697 2640 low
## 190 2811 752 1603 2786 low
## 191 2560 884 1476 2721 low
## 192 3092 814 1933 3153 low
## 193 2624 905 1597 2604 low
## 194 2687 804 1701 2831 low
## 195 2827 838 1571 2784 low
## 196 2571 836 1867 2666 low
## 197 2716 817 1736 2753 low
## 198 2867 927 1745 2786 low
## 199 2682 803 1743 2726 low
## 200 2898 775 1910 2973 low
## 201 2870 838 1658 2860 low
## 202 2801 890 1899 2867 low
## 203 2971 814 1585 2872 low
## 204 2863 761 1635 2801 low
## 205 3016 912 1573 2952 low
## 206 2817 735 1896 2885 low
## 207 2965 993 1808 3152 low
## 208 2841 898 1797 2744 low
## 209 2875 707 1491 2900 low
## 210 2817 916 1941 2833 low
## 211 3078 771 1920 3118 low
## 212 2797 1030 1697 2836 low
## 213 2697 896 1797 2785 low
## 214 2958 822 1684 2889 low
## 215 2459 747 1613 2521 low
## 216 3357 1046 2078 3433 low
## 217 2815 931 1808 2852 low
## 218 2817 946 1676 2754 low
## 219 2906 825 1836 2928 low
## 220 3170 907 1719 3115 low
## 221 2724 810 1819 2859 low
## 222 2671 873 1646 2723 low
## 223 3179 833 1554 3176 low
## 224 3000 847 1782 2933 low
## 225 2820 869 1722 2917 low
## 226 2641 834 1627 2730 low
## 227 2829 907 1651 2817 low
## 228 2679 846 1743 2709 low
## 229 2658 845 1635 2671 low
## 230 2810 688 1720 2839 low
## 231 3193 892 1829 3249 low
## 232 2864 898 1849 2821 low
## 233 2576 807 1980 2893 low
## 234 2665 980 1744 2818 low
## 235 2775 982 1848 2773 low
## 236 3027 871 1989 2945 low
## 237 3327 885 2006 3507 low
## 238 3072 877 1837 3031 low
## 239 3049 810 1954 3218 low
## 240 2900 931 1816 2950 low
## 241 2637 1016 1765 2466 low
## 242 2644 761 1734 2723 low
## 243 3660 908 2123 3684 low
## 244 2862 841 2004 2846 low
## 245 2783 934 1796 2898 low
## 246 3354 869 2038 3407 low
## 247 2791 874 1671 2850 low
## 248 2783 855 1674 2779 low
## 249 2748 920 1896 2731 low
## 250 2875 856 2078 2988 low
## 251 2602 834 1827 2635 low
## 252 2968 831 1764 2837 low
## 253 2934 1070 1955 2885 low
## 254 2664 930 1643 2547 low
## 255 2621 861 1966 2721 low
## 256 2662 814 1693 NA low
## 257 2870 838 1806 2911 low
## 258 2652 885 1590 2957 low
## 259 2862 881 1856 2833 low
## 260 2912 833 1631 2920 low
## 261 3319 995 2112 3449 low
## 262 2965 983 1883 2927 low
## 263 2841 935 1901 2800 low
## 264 2600 914 1736 3063 low
## 265 2636 704 1888 2719 low
## 266 3229 928 1995 3257 low
## 267 3217 1055 1858 3144 low
## 268 3009 961 2078 2875 low
## 269 2893 897 1916 3082 low
## 270 2974 1011 2151 3116 low
## 271 3205 892 1928 3342 low
## 272 2890 933 1640 2781 low
## 273 3152 1063 1901 3024 low
## 274 3225 865 1860 3165 low
## 275 3286 846 2031 3408 low
## 276 2741 1085 2096 2848 low
## 277 3067 981 1992 3329 low
## 278 3250 947 2146 3333 low
install.packages("lmerTest")  # one-time install; lmerTest adds p-values to lmer() summaries
# Mean and standard deviation of F0, F1, and F2 for each vowel within each mwbg group
vowels.means <- vowels %>%
  group_by(vowel, mwbg) %>%
  select(vowel, mwbg, F0, F1, F2) %>%
  summarise_all(list(mean = mean, sd = sd))
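The printed data contains NA entries, and mean() and sd() return NA for any group that includes one unless na.rm = TRUE is set. The following is a minimal sketch of the same kind of grouped summary, assuming dplyr 1.0 or later so that across() is available. The toy data frame and its values are made up for illustration and only stand in for vowels.

```r
library(dplyr)

# Toy data standing in for `vowels` (values are made up for illustration)
toy <- data.frame(
  vowel = c("ah", "ah", "ah", "iy", "iy", "iy"),
  mwbg  = c("m", "m", "w", "m", "w", "w"),
  F0    = c(130, 120, 210, 140, 220, 230),
  F1    = c(800, 700, 1000, 300, 400, NA)
)

# Grouped mean and sd that skip missing values
toy.means <- toy %>%
  group_by(vowel, mwbg) %>%
  summarise(
    across(c(F0, F1), list(mean = ~ mean(.x, na.rm = TRUE),
                           sd   = ~ sd(.x, na.rm = TRUE))),
    .groups = "drop"
  )

toy.means
```

The result has one row per vowel and mwbg combination, with columns named F0_mean, F0_sd, F1_mean, and F1_sd. Groups with a single non-missing value still get NA for the standard deviation, because sd() needs at least two values.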
lm(F0 ~ F1, data = vowels) %>% summary()
##
## Call:
## lm(formula = F0 ~ F1, data = vowels)
##
## Residuals:
## Min 1Q Median 3Q Max
## -113.49 -44.28 10.03 36.87 132.11
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.511e+02 4.712e+00 32.07 <2e-16 ***
## F1 7.682e-02 7.609e-03 10.10 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 50.26 on 1666 degrees of freedom
## Multiple R-squared: 0.05766, Adjusted R-squared: 0.05709
## F-statistic: 101.9 on 1 and 1666 DF, p-value: < 2.2e-16
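In the output above, the F1 estimate of 7.682e-02 means that predicted F0 rises by about 0.077 units for each one-unit increase in F1, and the multiple R-squared of 0.05766 means that F1 accounts for roughly 6% of the variance in F0. To work with these numbers in code rather than read them off the printout, something like the following sketch could be used. The toy data frame is made up for illustration, and the object names (fit, coefs, slope) are placeholders.

```r
# Toy data (made-up values) standing in for the F0 and F1 columns
toy <- data.frame(
  F1 = c(300, 400, 500, 600, 700, 800),
  F0 = c(170, 185, 190, 200, 205, 215)
)

fit <- lm(F0 ~ F1, data = toy)
fit.summary <- summary(fit)

# Coefficient table as a matrix: Estimate, Std. Error, t value, Pr(>|t|)
coefs <- coef(fit.summary)
slope <- coefs["F1", "Estimate"]

# Proportion of variance in F0 explained by F1
r.squared <- fit.summary$r.squared

# Predicted change in F0 for a 100-unit increase in F1
slope * 100
```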
library(lmerTest)
## Loading required package: lme4
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
##
## Attaching package: 'lmerTest'
## The following object is masked from 'package:lme4':
##
## lmer
## The following object is masked from 'package:stats':
##
## step
mod <- lmer(F0 ~ F1 + (1 | talker), data = vowels)
summary(mod)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: F0 ~ F1 + (1 | talker)
## Data: vowels
##
## REML criterion at convergence: 17694.7
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.5042 -0.8309 0.2182 0.7353 2.6261
##
## Random effects:
## Groups Name Variance Std.Dev.
## talker (Intercept) 289.8 17.02
## Residual 2254.4 47.48
## Number of obs: 1668, groups: talker, 49
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 1.550e+02 5.100e+00 4.925e+02 30.393 <2e-16 ***
## F1 6.549e-02 7.269e-03 1.634e+03 9.009 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## F1 -0.846
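One thing to check in this model: the printed data shows talker numbers restarting within each mwbg group (for example, talker 1 appears under w, b, and g), and the summary reports 49 talker groups. If the same number in different groups refers to different people, the random intercept pools them together. The sketch below shows one possible way to build a unique ID per talker. The toy data and the talker_id column name are hypothetical and not part of the original analysis.

```r
# Toy data (made-up rows): talker numbers repeat across the m/w/b/g groups
toy <- data.frame(
  mwbg   = c("m", "m", "w", "w", "b", "g"),
  talker = c(1, 2, 1, 2, 1, 1)
)

# Number of grouping levels if talker is used as-is
length(unique(toy$talker))   # 2

# Hypothetical helper column: one ID per (group, talker) pair
toy$talker_id <- interaction(toy$mwbg, toy$talker, drop = TRUE)
nlevels(toy$talker_id)       # 6

# The model could then use the new column as the grouping factor, e.g.
# lmer(F0 ~ F1 + (1 | talker_id), data = vowels)
```

Whether this change is appropriate depends on how talkers were numbered when the data was collected, so it is worth confirming against the dataset's documentation before refitting.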