The spooky ambiguity of Halloween
Consider these two transcriptions from the online Oxford Advanced Learner’s Dictionary:
The first word, of course, is bullsh*t — as the dictionary puts it, a taboo, uncountable, slang noun meaning “ideas, statements or beliefs that you think are silly or not true”. The second word, however, is not ratsh*t. It’s ratchet, “a wheel or bar with teeth along the edge and a metal piece that fits between the teeth, allowing movement in one direction only”.
The relevant point here is that ratchet and ratsh*t sound different, but the OALD would transcribe them identically. Ratchet contains an affricate phoneme /tʃ/, while ratsh*t contains a plosive phoneme /t/, followed by a fricative phoneme /ʃ/ which is louder and longer (at least in fairly careful speech) than the fricative part of the affricate. Transcribing the two words identically is an example of transcriptional ambiguity.
In principle, transcriptional ambiguity is undesirable. If two words sound different enough for natives to tell them apart, they ought to be different in transcription. The OALD tolerates a bit of ambiguity because it prioritizes transcriptional simplicity. But if we’re willing to accept more complex transcription, we can eliminate the ambiguity: there are various ways to show the difference between /tʃ/ as a single affricate and /tʃ/ as a two-phoneme sequence.
The Cambridge dictionary employs a syllable-boundary symbol: bullsh*t /ˈbʊl.ʃɪt/ versus ratchet /ˈrætʃ.ɪt/, so ratsh*t (if they included it) would be /ˈræt.ʃɪt/. The big problem with this is that there’s disagreement about the location (and perhaps even the existence) of syllable boundaries. As the Cambridge Introduction puts it, “No completely satisfactory scheme of syllable division can be produced”. The Merriam-Webster dictionary, for example, puts the /tʃ/ of ratchet in the second syllable. (An alternative not generally used by dictionaries would be to show in compounds the morpheme boundary from which the alleged syllable boundary is derived. The plus sign is widely used for this, e.g. ratchet /ˈratʃɪt/ versus ratsh*t /ˈrat+ʃɪt/.)
In the case of a pair like ratchet and ratsh*t, there’s a difference in levels of stress. The second syllable of ratchet has the lowest level of stress, widely referred to as being weak. The second syllable of a compound like ratsh*t is not weak, and some would transcribe it with secondary stress, like the Macmillan dictionary, where we find /ˈlæpˌtɒp/ and /ˈbʊlˌʃɪt/. This would give us ratchet /ˈratʃɪt/ versus ratsh*t /ˈratˌʃɪt/.
According to the IPA chart, affricates “can be represented by two symbols joined by a tie bar if necessary.” And to quote the Wikipedia article on the voiceless postalveolar affricate, it is “transcribed in the International Phonetic Alphabet with ⟨t͡ʃ⟩, ⟨t͜ʃ⟩ or ⟨tʃ⟩ (formerly the ligature ⟨ʧ⟩).” This would give us ratchet /ˈrat͡ʃɪt/ versus ratsh*t /ˈratʃɪt/.
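To make the contrast concrete, here is a minimal Python sketch, not anything the dictionaries themselves use. It assumes a toy representation in which each word is a list of phonemes and the affricate is a single item written with two symbols; the names RATCHET, COMPOUND, TIE_BAR and render are invented for illustration.

```python
# Toy phoneme lists (an assumption for illustration, not a dictionary's data model).
# A tuple stands for ONE phoneme that is written with two symbols (the affricate).
RATCHET = ["ˈ", "r", "a", ("t", "ʃ"), "ɪ", "t"]
COMPOUND = ["ˈ", "r", "a", "t", "ʃ", "ɪ", "t"]  # plosive /t/ followed by fricative /ʃ/

TIE_BAR = "\u0361"  # COMBINING DOUBLE INVERTED BREVE, placed between the two symbols


def render(phonemes, tie_bar=False):
    """Join phonemes into a slash-delimited transcription string."""
    joiner = TIE_BAR if tie_bar else ""
    parts = [joiner.join(p) if isinstance(p, tuple) else p for p in phonemes]
    return "/" + "".join(parts) + "/"


# Without the tie bar the two words collapse into the same string.
print(render(RATCHET) == render(COMPOUND))  # True
# With the tie bar only the affricate is marked, so the strings differ.
print(render(RATCHET, tie_bar=True) == render(COMPOUND, tie_bar=True))  # False
```

The point of the sketch is simply that the information distinguishing the two words exists at the phoneme level and is lost only when the symbols are written out without any marking.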
The use of two symbols for a single phoneme can be shown by bringing them into closer proximity. As mentioned above, the postalveolar affricate used to have its own single character, the ligature ʧ. I still tend to write it like this when transcribing by hand. The change in IPA policy was not for phonetic reasons, but to avoid a proliferation of special characters for all the other possible affricates. However, following the principle of “one sound, one symbol”, this solution is phonetically sensible, and it’s the one we follow in the CUBE dictionary:
Alternative symbols
The ambiguity of /tʃ/ is, to be frank, a problem of the IPA’s own making, resulting from the decision to show affricates as sequences. Plosives also involve a sequence of phonetic events, but the IPA doesn’t show these events sequentially. Languages often use single letters for affricates, like ‘z’ or ‘c’ for t͡s and č for t͡ʃ. Linguists, particularly in America, have commonly used /č/ for the postalveolar affricate, and this kind of transcription is one of the many options available in CUBE:
Let’s now turn from affricate consonants to vowels, where symbol combinations are typically used to represent diphthongs, even when these are considered single phonemes, as in English.
Take the FACE diphthong. In the familiar transcription system used by most dictionaries, this is /eɪ/. In the transcription system that I favour, which is used in the CUBE dictionary and which I explain in my new video, FACE is /ɛj/. In both transcriptions, the component symbols are used independently: /e/ and /ɛ/ as the checked DRESS vowel, /ɪ/ as the KIT vowel and /j/ as the approximant in words like yes and young. However, the potential ambiguity isn’t really an issue here, because /e/ and /ɛ/ can’t otherwise occur before /ɪ/ or /j/. Whenever the reader sees /eɪ/ or /ɛj/, it can only be the FACE diphthong.
Two rather more serious cases of ambiguity can be found in the original version of the familiar transcription system for Received Pronunciation (RP). These are /ɪə/ and /ʊə/, which in classic mid-20th century RP represented not only diphthongs but also sequences of two phonemes, because /ɪ/ and /ʊ/ could then occur before vowels. You can still find transcriptions of this type online, in the quaint “British English” transcriptions of the Collins dictionary. Examples:
dear /dɪə/ (/ɪə/ = 1 syllable)
India /ˈɪndɪə/ (/ɪə/ = 2 syllables)
sure /ʃʊə/ (/ʊə/ = 1 syllable)
Joshua /ˈdʒɒʃʊə/ (/ʊə/ = 2 syllables)
The CUBE dictionary has very different transcriptions for these pairs of words, which in contemporary pronunciation sound very different:
Conservative dictionaries like Cambridge, Longman and the OALD also distinguish the pairs, but only at the cost of complicating the system with special symbols /i/ and /u/, which aren’t even phonemes but rather represent a range of pronunciations:
(In fact the /i/ and /u/ symbols only superficially cover up the ambiguity, since their multiple pronunciations include /ɪ/ and /ʊ/, returning us to the original problem.)
And so to an interesting potential ambiguity in CUBE’s default transcription system, namely the /əw/ which we favour for the GOAT diphthong (again, see my new video for justification). Of course, /ə/ and /w/ have independent lives in the phonology of English, and these independent phonemes can co-occur fairly commonly in that order. Examples: away, aware, await, awake, towards, Hawaii. And, like the other six closing diphthongs, GOAT can occur before vowels: oasis, poetic, koala, Joanna, Croatia, etc. So, according to this transcription system, both away and oasis begin with /əw/.
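The ambiguity can be illustrated with a small Python sketch that lists every way of cutting a transcription string into known symbols. The symbol inventory here is a deliberately tiny assumption (just enough for away and oasis, with stress marks left out), and the names INVENTORY and segmentations are invented for the example.

```python
# A deliberately tiny symbol inventory (an assumption; stress marks are omitted).
# "əw" is GOAT and "ɛj" is FACE, as in the transcription system described above.
INVENTORY = ("ə", "w", "əw", "ɛj", "s", "ɪ")


def segmentations(text, inventory=INVENTORY):
    """Return every way of splitting `text` into symbols from `inventory`."""
    if not text:
        return [[]]
    results = []
    for symbol in inventory:
        if text.startswith(symbol):
            for rest in segmentations(text[len(symbol):], inventory):
                results.append([symbol] + rest)
    return results


for word in ("əwɛj", "əwɛjsɪs"):  # away, oasis
    print(word, segmentations(word))
```

Each string comes back with two readings, one starting with schwa plus /w/ and one starting with GOAT, so the written form alone does not say which word has which.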
We have much the same options for disambiguation that we had with ratchet and ratsh*t. Those who believe in syllable boundaries would generally syllabify away and oasis as /ə.wɛ́j/ and /əw.ɛ́jsɪs/. But, surprise surprise, Longman and Cambridge disagree about where to put the second syllable boundary in oasis…
The tie bar option would give us /əw/ in away and /ə͡w/ in oasis. And the spacing option, actually used in CUBE, gives us:
As with /tʃ/, there’s the option of choosing alternative symbols, and this is supported by the fact that the starting qualities of away and oasis are different, contrary to both the traditional and the CUBE systems: away begins with a more back quality than oasis. In the following clip you can hear 1. the initial /əw/ from five dictionary recordings of away; 2. the initial GOAT vowel from five recordings of oasis; 3. the first 50 milliseconds of the away recordings; 4. the first 50 ms of the oasis recordings:
This could be due to the backing effect of the following /w/ on weak schwa, while GOAT, being a strong vowel, is more able to retain its opening quality. To reflect the quality difference, it would be possible to select different vowel symbols, for example /ə/ for independent schwa and /ɜw/ for GOAT. (Indeed, /ɜʊ/ would have been a more consistent choice for GOAT in the traditional system, keeping /ə/ as strictly a weak vowel symbol and using /ɜ/ in the strong vowels NURSE /ɜː/ and GOAT /ɜʊ/.) This would give us away /əwɛ́j/ and oasis /ɜwɛ́jsɪs/.
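As a rough Python sketch of the alternative-symbol idea, the two words can be stored as phoneme lists and written out under two symbol sets. The phoneme labels, the dictionary names and the transcribe function are all assumptions made for illustration, and stress marks are left out.

```python
# Phoneme-level analyses with keyword-style labels (assumed for illustration;
# stress marks are omitted).
AWAY = ["SCHWA", "W", "FACE"]
OASIS = ["GOAT", "FACE", "S", "KIT", "S"]

DEFAULT_SYMBOLS = {"SCHWA": "ə", "W": "w", "FACE": "ɛj", "GOAT": "əw", "S": "s", "KIT": "ɪ"}
ALTERNATIVE_SYMBOLS = {**DEFAULT_SYMBOLS, "GOAT": "ɜw"}  # only GOAT is respelled


def transcribe(phonemes, symbols):
    """Write out a phoneme list using the given symbol set."""
    return "".join(symbols[p] for p in phonemes)


for symbols in (DEFAULT_SYMBOLS, ALTERNATIVE_SYMBOLS):
    away, oasis = transcribe(AWAY, symbols), transcribe(OASIS, symbols)
    print(away, oasis, "same onset:", away[:2] == oasis[:2])
```

Under the default symbols both words start with the same two characters; once GOAT is respelled with /ɜ/, the onsets differ and no boundary or tie bar is needed to tell them apart.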
But this quality difference seems to be far less pronounced when /əw/ follows a stressed syllable, in words like Iowa and Halloween. In words of this type, we often seem to be dealing not with an ambiguous transcription, where the symbols fail to show a contrastive pronunciation difference, but with neutralization, where a single transcription appropriately reflects the loss of a true difference in pronunciation.
American dictionaries are more decisive than British dictionaries about words like Iowa and Halloween. Of course the onset of GOAT is more distinct from schwa in AmE, the traditional transcription for GOAT being /oʊ/ (though the onset quality actually tends to be more like [ɔ] or [ʌ]). So dictionaries like the American Heritage and both of the Merriam-Webster dictionaries (which have different transcription systems) agree that Iowa and Halloween contain not GOAT but /əw/, placing a stress mark (and thereby, presumably, a syllable boundary) between the two symbols in Halloween:
But the picture in British dictionaries is less clear. Collins states that BrE Iowa contains GOAT /əʊ/, but the OALD gives it /əw/; while Longman and Cambridge allow both. These four dictionaries all give /əw/ in AmE Iowa, and all insist on GOAT /əʊ/ for BrE Halloween; but for AmE Halloween, Longman and the OALD give GOAT /oʊ/, Collins gives /əw/, while Cambridge allows both.
Furthermore, the British dictionaries’ blanket preference for GOAT in BrE Halloween seems belied by real data, as provided by YouGlish. These British speakers all seem to have a very weak middle syllable in Halloween, which I for one find hard to identify straightforwardly with the strong GOAT vowel:
My guess is that if you asked those speakers what the last syllable of the word is, they’d be more likely to say ween than een (regardless of the etymology “all hallows’ evening”).
The GOAT diphthong often weakens to schwa over time, as in domain and homophobic. And when this happens before another vowel, the diphthong’s semivowel glide survives because, as my new video discusses, English hates hiatus (abutting vowels). Sometimes AmE retains GOAT where BrE doesn’t, e.g. domain, and sometimes AmE seems to be further ahead in weakening: the Merriam-Webster dictionaries have weakening not only in Halloween but also in following /ˈfɑːləwɪŋ/ and widower /ˈwɪdəwɚ/.
Have British speakers en masse switched Halloween from GOAT to American-style /əˈw/, while British dictionaries, as is so often the case, have failed to keep up with contemporary pronunciation?
Or is it just that the distinction often evaporates? For my money, it’s actually an advantage of using /əw/ for GOAT that it shows how naturally this strong diphthong reduces to weak schwa plus /w/, and how hard it can sometimes be to tell the difference.
Happy New Year, Prof. Lindsey!
Thank you, and same to you!
I believe I pronounce the second syllable of ‘ratchet’ with something closer to /ə/ than /ɪ/; the same applies to other words with ‘-et’, e.g. ‘piglet’, ‘wallet’, ‘magnet’, ‘fillet’. (I would, however, pronounce ‘basket’, ‘limpet’ and possibly a few other words with /ɪ/, not /ə/ – I cannot explain why.) So that would be one way to differentiate the mechanical device from the organic waste product.
Another thing I have noticed is variation between British English speakers in the pronunciation of intervocalic /tʃ/ – some (including myself) precede it with a glottal stop, whilst others do not. So, for example, ‘Richard’ and ‘catching’ could be pronounced thus:
/ˈrɪʔtʃəd/ or /ˈrɪtʃəd/
/ˈkæʔtʃɪŋ/ or /ˈkætʃɪŋ/
(I have not seen any in-depth study on this, but my impression is that the variant *with* the glottal stop is more common in RP.)
Those that pronounce intervocalic /tʃ/ without the accompanying glottal stop are still likely to insert it before word-final /tʃ/, thus:
The same could, perhaps, be another differentiating phonetic feature between ‘ratchet’ /ˈrætʃɪt/ and ‘rat sh*t’ /ˈræʔtʃɪt/.