Description of corpora

English (L1 and L2)

Description

The 'EPG Set 1' word list was partially based on Peter Ladefoged's 'American English' chapter of the Handbook of the IPA, illustrating all consonant and vowel phonemes, with additional words included to illustrate allophonic variation in consonants. In this corpus, the set was used for L1 speakers only.
The 'EPG Set 2' word list was originally designed by Laura Colantoni and Jeffrey Steele to study the acquisition of English consonants, vowels, and consonant clusters by L2 learners. In the current corpus it was used for both L1 and L2 speakers.
The current data are from 3 L1 Canadian English speakers and 11 L2 speakers whose L1 is French, Japanese, Korean, or Spanish.
The simultaneous EPG and audio recordings were collected in the Linguistics Phonetics Lab in 2009-11 (EPG Set 1) and 2015-18 (EPG Set 2).

Speaker codes

L1 English: ENf01, ENm01, ENm02
L2 English, L1 French: FRCf01, FRCf02, FRQf01, FRQf02 (C = Continental/France, Q = Quebec)
L2 English, L1 Japanese: JPf03, JPf04, JPf05
L2 English, L1 Korean: KRm02
L2 English, L1 Spanish: SPAf04, SPCf01, SPPf01 (A = Argentine, C = Cuban, P = Peninsular/Spain)
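
The speaker codes above follow a regular pattern, so they can be split into their parts automatically. The following is a minimal Python sketch, not part of the corpus tooling: it assumes each code consists of a two-letter L1 code, an optional variety letter (documented only for French and Spanish), a single lowercase letter 'f' or 'm' (its meaning is not stated here, so it is kept as a raw letter), and a two-digit speaker number. The function name parse_speaker_code and the returned field names are hypothetical.

```python
import re

# L1 codes and variety letters as listed in the speaker-code section.
L1_NAMES = {"EN": "English", "FR": "French", "JP": "Japanese",
            "KR": "Korean", "SP": "Spanish"}
VARIETIES = {
    "FR": {"C": "Continental/France", "Q": "Quebec"},
    "SP": {"A": "Argentine", "C": "Cuban", "P": "Peninsular/Spain"},
}

# Assumed pattern: L1 code, optional variety letter, 'f' or 'm', two digits.
SPEAKER_RE = re.compile(r"^(EN|FR|JP|KR|SP)([A-Z]?)([fm])(\d{2})$")


def parse_speaker_code(code):
    """Split a speaker code such as 'FRQf02' into its components."""
    match = SPEAKER_RE.match(code)
    if match is None:
        raise ValueError(f"unrecognised speaker code: {code!r}")
    lang, variety, letter, number = match.groups()
    if variety and variety not in VARIETIES.get(lang, {}):
        raise ValueError(f"unknown variety {variety!r} for {lang}")
    return {
        "l1": L1_NAMES[lang],
        # L1 English speakers are the only non-L2 group in the corpus.
        "group": "L1" if lang == "EN" else "L2",
        "variety": VARIETIES[lang][variety] if variety else None,
        "letter": letter,  # 'f' or 'm'; kept as-is, meaning not documented
        "number": int(number),
    }


if __name__ == "__main__":
    for code in ["ENm02", "FRQf02", "SPAf04"]:
        print(code, parse_speaker_code(code))
```

Codes that do not fit the assumed pattern raise a ValueError rather than being guessed at, which makes it easier to spot mislabelled files.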

Materials

EPG Set 1 words (alphabetical)
EPG Set 2 words (alphabetical and phonetic), produced in the carrier phrase 'I say __ again' and in isolation.
Text: The North Wind and the Sun, based on Ladefoged (1999)
Multiple repetitions of all the materials were elicited (on average 6 for words in carrier sentences, 2 for single words, and 4 for the text).
In file names, 'c' refers to words in carrier sentences, 's' to words in isolation, and 'p' to the text (e.g. chin_c1_epENm01: the first repetition of the word 'chin' in the carrier sentence by ENm01).
All files are coded for segments (consonants and vowels) and graphemes.
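
The file-naming convention described above can be decoded with a short script. The sketch below is illustrative only and generalises from the single example chin_c1_epENm01: it assumes every name has the form item_XN_epSPEAKER, where X is 'c', 's', or 'p', N is the repetition number, and 'ep' is a fixed prefix before the speaker code. The 'ep' prefix, the absence of a file extension, and the function name parse_file_name are assumptions, not documented properties of the corpus.

```python
import re

# Context letters as described in the Materials section.
CONTEXTS = {"c": "carrier sentence", "s": "isolation", "p": "text"}

# Assumed layout, inferred from 'chin_c1_epENm01':
#   <item>_<context letter><repetition>_ep<speaker code>
FILE_RE = re.compile(
    r"^(?P<item>.+)_(?P<context>[csp])(?P<repetition>\d+)"
    r"_ep(?P<speaker>[A-Za-z]+\d+)$"
)


def parse_file_name(name):
    """Decode a corpus file name (without extension) into its parts."""
    match = FILE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised file name: {name!r}")
    return {
        "item": match.group("item"),
        "context": CONTEXTS[match.group("context")],
        "repetition": int(match.group("repetition")),
        "speaker": match.group("speaker"),
    }


if __name__ == "__main__":
    print(parse_file_name("chin_c1_epENm01"))
```

Names that deviate from the assumed layout raise a ValueError, so the pattern should be checked against the actual files before it is used for batch processing.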