(A longer version of my Words of the Week post. The additional material starts here.)
The name of Sblended, a British chain of milkshake bars, is a very nice pun. On the one hand, it’s short for ‘it’s blended’. ‘It’s’ can be shortened to ’ts or just ’s, as in the title of the famous Gershwin song ’S Wonderful (the song begins one minute into the clip).
On the other hand, the name sounds just like the adjective splendid. If you remove the s from a recording of the word splendid, the remainder sounds like blended – beginning with a b, and not with a p as in plenty. Here are recordings of blended and splendid, some of which have been made by removing s from splendid or by adding s to blended:
This is not just a fact about the words blended and splendid. It applies to all syllables which begin with s followed by a ‘plosive’ consonant, ie one written with p, t, or k/c.
So speech sounds like s + beach, not like s + peach; stuck sounds like s + duck, not like s + tuck; and Scott sounds like s + got, not like s + cot.
Words like peach, tuck and cot begin with aspirated consonants. (I explain aspiration here.) Many non-native speakers of English correctly use aspirated consonants in such words, but unfortunately use the same aspirated p, t, k in s-words like speech, stuck, Scott (and splendid). This is wrong. Here I am saying Scott correctly, then repeating it with a gap after the s to show that the remainder sounds like got:
And here I am saying Scott incorrectly, with aspirated k as in cot:
Some may be wondering, since ’s blended sounds like splendid, whether they should be transcribed similarly. My view is that the most insightful phonemic transcription of both would be /sblɛndɪd/. However, in the CUBE online dictionary which I co-edit, we ultimately decided to show /splɛndɪd/ by default, with /sblɛndɪd/ only as an option. Here I’ll dis/g/uss some pros and cons.
English has six contrasting plosives which fall into two sets, most widely symbolized as /p t k/ and /b d g/. When they precede the vowel of their syllable, /p t k/ exhibit aspiration, or appreciable ‘Voice Onset Time’ – a delay between the plosive release and the start of vocal cord vibration. /b d g/, on the other hand, have a VOT close to zero.
The plosives in /s/-clusters have a VOT close to zero; in phonemic analysis, therefore, the criterion of phonetic similarity should assign them to the /b d g/ phonemes rather than /p t k/ phonemes. And, from a practical point of view, non-native users of transcriptions should be guided to pronounce plosives in /s/-clusters like /b d g/, and not with the long VOT of /p t k/.
It’s often said that plosives are aspirated in the onset of stressed syllables. I say this myself in my video on aspiration. Indeed, aspiration tends to be longer and more noticeable in stressed syllables. But the VOT difference between /p t k/ and /b d g/ is also found in unstressed syllables, eg police and balloon. And again, the first plosive of spaghetti patterns with the /b/ of balloon rather than the /p/ of police:
Similarly, the ending of rusty is interchangeable with that of Rushdie:
It’s not only /s/, but fricatives in general, which restrict a following plosive to the /b d g/ kind (unless a major boundary intervenes). So the endings of larder and laughter are also interchangeable:
In other words, the most phonetically and phonologically insightful phonemicizations of these words are /rʌsdiː/ and /rʌʃdiː/, /lɑːdə/ and /lɑːfdə/. The same thing can be found in final position: the final plosives of words like pleased and least are typically interchangeable:
Admittedly, those final plosives are both voiceless and /t/-like. But when followed by vowels they will have the short VOT of /b d g/. That is, pleased‿about it and least‿of all may both contain the /də/ of domestic. And so it would make sense to transcribe both pleased and least as ending in /d/. As John Wells conceded in an excellent blog post, “If someone wants to transcribe English this way, I find it difficult to argue forcefully against the idea.”
The relevant question, it seems to me, is why a phonetician or phonologist would not transcribe English this way, when the reasons to do so are strong. Well, there are two reasons why we decided against this kind of transcription as the default in the CUBE dictionary. One reason is fairly good, the other is frankly not so good, but important.
The goodish reason, as I see it, is that a fair number of transcription users have mother tongues with voicing assimilation. These include Russian and Dutch. Such speakers, if presented with /sb sd sg/, will tend to pronounce them [zb zd zg], assimilating the voicing of /s/ to the following plosive. This is why such speakers will often pronounce Facebook as Phasebook, or textbook as tegzbook.
The weaker-but-important reason is tradition. Both lay people and academics have been conditioned by spelling and by traditional transcription to think in terms of ‘sp st sk’. CUBE allows the user to search not only by spelling but also by sounds. If users wish to search for /s/-clusters, it would be unreasonable (we decided) to demand that they search for /sb sd sg/ rather than the traditional /sp st sk/. This is why CUBE will, by default, show splendid as /splɛndɪd/.
However, we retain the more insightful /sb sd sg/ as an optional alternative, eg /sblɛndɪd/. This option may be selected under ANALYSES.
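The relationship between the two styles of transcription can be put as a simple rewrite rule. What follows is only a rough sketch of that rule, not the code behind CUBE: the function name, the set of fricative symbols and the handling of boundaries are my own assumptions for illustration. It turns p, t, k into b, d, g whenever the symbol immediately before is a fricative, so a space or other boundary mark between the two blocks the change.

```python
# Illustrative sketch only; names and symbol sets are assumptions, not CUBE's code.

# Assumed fricative symbols (h is left out, since it does not precede plosives).
FRICATIVES = set("fvszʃʒθð")

# The /p t k/ -> /b d g/ correspondence discussed in the post.
LENIS = {"p": "b", "t": "d", "k": "g"}


def to_lenis_after_fricative(transcription: str) -> str:
    """Rewrite p/t/k as b/d/g when the immediately preceding symbol is a fricative.

    Any intervening character (for example a space marking a word boundary)
    stops the rule from applying. Affricates such as tʃ are not treated
    specially in this sketch.
    """
    out = []
    for symbol in transcription:
        if symbol in LENIS and out and out[-1] in FRICATIVES:
            out.append(LENIS[symbol])
        else:
            out.append(symbol)
    return "".join(out)


if __name__ == "__main__":
    for word in ["splɛndɪd", "rʌstiː", "lɑːftə", "liːst", "ðɪs taɪm"]:
        print(word, "->", to_lenis_after_fricative(word))
```

Run on the traditional forms, this gives sblɛndɪd, rʌsdiː, lɑːfdə and liːsd, while ðɪs taɪm is left alone because the space stands in for a boundary. A real dictionary would of course need proper information about boundaries rather than relying on spaces.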
Transcribing with /b, d, g/ (and also /gw/) certainly feels right. But don’t we need to know about more than voice onset time and aspiration here, as far as contrast is concerned? Namely tenseness, and – in the case of a word like ‘best’ – the length of the preceding vowel?
It’s true that VOT/aspiration isn’t the only difference between /p, t, k/ and /b, d, g/. Typically the former are stronger (acoustically more intense; possibly more ‘tense’ in some sense). But in terms of signalling a contrast, the plosives in /s/-clusters are far more like /b, d, g/ than /p, t, k/: interchanging the plosives in /s/-clusters with tokens of /b, d, g/ (as I’ve done in many of the clips in this post) doesn’t alter the perceived identity.
You’re also right that the voiced-voiceless contrast is realized partly by effects on a preceding vowel, but that’s not particularly relevant here. The /s/ in best will ‘clip’ the length of the preceding vowel regardless of whether the following plosive is phonemicized as /t/ or /d/.
I see, thanks.
Piotr Gąsiorowski has a paper arguing that these clusters, which he calls ‘presigmatised stops’, have a unique status in terms of syllable structure in English and some other Indo-European languages, essentially patterning like single phonemes rather than clusters, and so allowing the violation of phonotactic constraints, including in English the ban on trisyllabic onsets (as in /str-/).
He also presents some evidence that this special status goes all the way back to PIE – where the clusters are able to override the usual root structure constraints.
These clusters also behave oddly with regard to reduplication, which normally copies only the initial consonant. On his blog Gąsiorowski writes:
“They are copied whole in Germanic, unlike other sC clusters, where only the /s/ is copied; Vedic copies the stop; Greek, Avestan and Old Irish, the /s/. Tocharian A copies /s/; Tocharian B, either the stop or the whole cluster; Latin has a crazy pattern: either sVsT- or sTVs- (simplifying the onset of the base!); Hittite evidence is scanty. Götz Keydana and Andrew Byrd argue that the sVsT- template is original. I am inclined towards the traditional view that sTVsT- is original and the other variants are due to independent branch-specific dissimilations.”
If he’s right on that last statement, it would imply that PIE grammar did treat them as single segments.
Interesting; thanks for the reference.
An alternative explanation which some give for the anomaly of these clusters is that the s is in a separate syllable.
Sorry, I think that should be ‘trisegmental onsets’ rather than ‘trisyllabic’ – an error in the Gąsiorowski paper which I copied over.
Very fascinating. It seems like an alternate transcription of /b d g/ would make things less misleading for speakers of languages with voicing assimilation. There really should be dedicated fortis and lenis diacritics. Phonetically, the voiced stops are lenis, and the most invariable feature is lack of aspiration. None of the makeshift transcriptions of fortis and lenis consonants would work for English. Transcribing fortis consonants as aspirated, as in Icelandic and Scottish Gaelic, would not be accurate, since final fortis stops after a vowel are often not aspirated. Transcribing lenis consonants using a voiced stop symbol with a devoicing diacritic would also be inappropriate for English, since the lenis stops are sometimes voiced. There really ought to be fortis and lenis diacritics. Perhaps if /b/ in /sblendid/ were replaced with /p/ with a lenis diacritic, it would be less confusing for speakers with voicing assimilation. Unless they still ended up interpreting lenis as meaning voiced.
Thanks, Gabriel. From a theoretical point of view, the status of fortis/lenis is interesting. How independent from voicing is it? Is it a phonological cover term, like ‘stress’?
From a practical point of view, I’m a bit cynical about diacritics. In strict phonemic terms, English clearly has only a binary system, typically written /ptk/ and /bdg/, and equally clearly the stops in s-clusters belong to the latter set. Very few transcription users will pay attention to or understand diacritics. With or without diacritics, some learners/users will wrongly aspirate sp, st, sk, and some learners/users will wrongly voice sb, sd, sg. I tell those I teach to make a good strong ssssss and then add /bdg/; of course, that won’t work in a dictionary!
But isn’t it more likely to be perceived as unaspirated ptk by those with voicing assimilation, than as bdg?
P. S. For me, there is a perfect Russian p sound in the given examples of sp.
Many years ago in Sheffield I worked with a lady whose accent was very RP, and she was regularly confused by my pronunciation. I would say pin and she would hear bin, and for tot she would hear dot. In my dialect p, t and k had little or no aspiration initially but had it finally. A word such as taut had no initial aspiration but a slight final one. This final aspiration disappeared before a following vowel. Locally (North East Derbyshire and Bassetlaw) this is becoming obsolete, though aspiration is still weak for most speakers. Sheffield speakers only a few miles away were and are quite heavily aspirated (witness Sean Bean). Rotherham and Barnsley had a similar pattern to North East Derbyshire.
So would you say that e.g. there’s no contrast between ‘discussed’ and ‘disgust’?