This article attempts to describe the vowels of Standard Southern British (SSB) in a way that is phonetically explicit and accurately represents their phonological categorization. According to the Handbook of the International Phonetic Association,
Standard Southern British… is the modern equivalent of what has been called ‘Received Pronunciation’ (‘RP’). It is an accent of the south east of England which operates as a prestige norm there and (to varying degrees) in other parts of the British Isles and beyond. (p. 4)
The approach here will be to point out the many and varied differences between the vowels of contemporary SSB and those embodied in the symbols which A. C. Gimson chose in 1962 as a description, “explicit on the phonetic level”, of the upper class speech of that era. The tumultuous sociophonetic change of the 1960s meant that Gimson’s description became out of date almost the moment it was published (see my article 1962); but out-of-touch British publishers unfortunately chose to standardize and enforce Gimson’s obsolescent symbols, which then became still more entrenched thanks to dissemination across the internet.
It’s therefore imperative that users of materials containing these old symbols should be aware of the discrepancy between what they imply and the actual sounds of SSB today. Non-natives are particularly unlikely to be aware of how absurdly old-fashioned the IPA values of the familiar symbols now are to native ears.
Phoneticians describe vowels by use of reference qualities or ‘cardinal’ vowels which map out the auditory vowel space. As systematized by Daniel Jones, there are eight primary cardinals. These were conventionally plotted on a quadrilateral according to the estimated positions of the tongue body in the mouth (with the front of the mouth to the left). Here is Jones’s quadrilateral, along with a triangular arrangement corresponding more closely to acoustic reality, with demonstrations recorded by Jones himself, starting with the first cardinal vowel i and running anticlockwise to number 8, u.
Sometimes we encounter less common vowels which require symbols other than the primary cardinals. For example some languages, including French and German, have contrastive rounding of the non-low front vowels. For these, phoneticians turn to a ‘secondary’ set of cardinal symbols, eg y, the rounded counterpart of i. And occasionally phoneticians feel the need to use entirely non-cardinal symbols to capture more idiosyncratic qualities.
RP as classically described had a vowel system which was large and phonologically rather disorganized. To transcribe it, A. C. Gimson chose IPA symbols comprising most of the primary cardinals, two secondary cardinals ɒ and ʌ, plus the non-cardinals æ, ɪ and ʊ, and the neutral vowel schwa, ə, which belongs in the centre of the vowel space (a further symbol ɜ was added for schwa’s long counterpart). It’s important to understand that Gimson chose these IPA symbols as ‘explicit’ descriptors of upper class speech circa 1962.
By contrast, the vowel system of contemporary Standard Southern British (SSB) can be described with a high degree of accuracy using just the eight primary cardinal vowels, plus the two central vowels ɵ and ə:
I’ll now discuss these vowel qualities, paying attention to points of divergence from RP, and making one or two minor modifications.
START, TRAP, PRICE, MOUTH
A comparison of SSB vowels with those of RP indicates what can be thought of broadly as an anticlockwise vowel shift. This had three distinct components: lowering of the front vowels, raising of the mid back vowels, and fronting of the high back vowels.
The first two of these components are plausibly attributed to the fact that RP had crowding towards both the upper front area (FLEECE, KIT, DRESS and TRAP all being non-low) and the lower back area (START, LOT, NORTH and STRUT all being non-high). These two areas of crowding required the use of, respectively, two non-cardinal symbols ɪ and æ, and two secondary cardinals ɒ and ʌ, all four of which tend to be problematic for foreign learners.
The START vowel has raised somewhat in the direction of ʌ, but cardinal 5 ɑ remains a fairly accurate description of it; it is of course a long vowel, ɑː.
ɑ is also the first element of SSB’s PRICE diphthong, ɑj. Here is the word by, as said by Deputy Prime Minister Nick Clegg, followed by its first half, which might be from the word balm or bar:
And the same thing with Daniel Radcliffe saying try:
When the PRICE diphthong is followed by schwa, as in variety, the sequence may undergo smoothing, the contemporary form of which I described in another post. This replaces a diphthong-schwa sequence with a long form of the diphthong’s first element; so ɑjə may be smoothed to ɑː, the START vowel. Here is opera singer Simon Keenlyside pronouncing variety with ɑː, making a rhyme with party:
And Prince Harry pronouncing aisle not as ɑjəl but as ɑːl:
By contrast with the stable ɑː of START, the a of contemporary TRAP represents arguably the flagship change from RP. The TRAP vowel varied among RP speakers, from a quality halfway between cardinals 3 and 4, all the way up to ɛ, and it was responsible for the old joke that “sex is what posh people get their coal in” (ie sacks). In comedic parodies of old RP, the TRAP vowel is generally a focus of attention. (Jack Windsor Lewis beautifully summarizes the saga of the ash vowel – as he more traditionally calls it – here.)
The RP TRAP vowel was certainly considered different enough from cardinal 4 that the non-cardinal symbol æ had to be drafted in. This is no longer the case. Here are some contemporary TRAP qualities from Nick Clegg and Kate Winslet:
Clegg’s balance of power additionally shows that the a of TRAP also functions as the first element of the MOUTH diphthong. And, when the MOUTH diphthong is immediately followed by schwa, smoothing is again a possibility, producing a lengthened version of the diphthong’s first element, ie long TRAP. Here is chef Sam Clark saying sourdough with a MOUTH-schwa sequence in the syllable sour:
But he also produces the word with a smoothed first syllable containing the TRAP quality a, producing something very like sad:
True æ survives outside SSB, for example in London-Estuary and of course in America. This pronunciation of rag as ɹæg is characteristic of General American:
– although that clip is actually from the BBC’s Watch With Mother in 1957:
SSB TRAP has moved closer to the a of languages like Italian, which typically lies between cardinals 4 and 5. How close? Here, courtesy of Google Translate, are Italian pasta and English bastion:
If we swap the initial consonant-vowel sequences of the two words
the results are more or less equally acceptable.
So contemporary SSB TRAP overlaps with Italian a. Therefore, while æ is phonetically appropriate for American English, it’s more accurate to use a for SSB, even though cardinal 4 is if anything more front. (A very back TRAP vowel is stereotyped as a young, posh feature in this comedy video.)
It’s unsurprising that TRAP lowered, since the decidedly non-low quality maintained in old RP left a gulf between it and START. Given the large number of vowels in the RP system, a gap that wide – at least as wide as the gap between the e and a of Spanish, with only five vowels – quite naturally failed to endure. (This is less of an issue in General American, with its less back START.)
DRESS, FACE, SQUARE
The vowel of DRESS also lowered, attaining a contemporary value very much like the third cardinal, ɛ. In fact this quality can do triple duty, as the long monophthong of SQUARE, ɛː, and as the first element of the FACE diphthong. By contrast, RP as described by Jones had ɛ in SQUARE but e in DRESS and FACE. Contemporary FACE is on average less open than DRESS or SQUARE, but it’s often just as open. Illustrating SSB, Nick Clegg and Daniel Craig, then the letter A from a couple of dictionaries:
Again, the presence of ɛ in both FACE and SQUARE is confirmed by evidence from smoothing. When the FACE diphthong is immediately followed by schwa, the sequence is optionally smoothed into a lengthened version of the diphthong’s first element. Here is UCL’s Prof. Mark Miodownik saying thinner layers with a FACE-schwa sequence in layers:
But in an atomic layer of graphene, he smooths layer to lɛː, the same as lair:
LOT, NORTH, CHOICE
RP’s crowding of the lower back area was eased by the raising of LOT and NORTH to modern qualities which are better described by cardinals 6 and 7 respectively. The low LOT vowel of old RP, ɒ, was quite similar in quality to START. Here is Princess Elizabeth in 1940, saying Margaret and tomorrow:
And here are along the path and to the log from Watch With Mother in 1957:
Contemporary LOT is similar to the NORTH of RP, though of course shorter. Here is saw with RP ɔː from Watch With Mother in 1957, followed by its first part, which sounds rather like the beginning of contemporary sock or sob:
Likewise the beginning of the Queen’s 1957 loyalty with RP ɔɪ sounds like the start of contemporary lot:
SSB LOT is also very similar to the ɔ of Italian. Here, again from Google Translate, are English bossy and Italian posso, then repeated with the initial consonant-vowel sequences swapped. I’ll let you decide whether the originals or the edited versions come first:
Incidentally, transcribing SSB LOT as ɔ reinstates Daniel Jones’s practice: “In broad transcription of particular languages it is generally convenient to use the symbol ɔ in place of ɒ” (Outline of English Phonetics, 1962). Perhaps Jones found that the symbol ɒ caused confusion among foreign learners commensurate with its oddity and scarcity in the world’s languages. Today, describing SSB LOT with the more natural ɔ is no longer mere convenience.
The same applies to o for NORTH, though it is of course long, oː. The SSB vowel is somewhat more open than cardinal 7, but closer to it than to 6. Here are Clegg and actress Emma Thompson:
The contemporary NORTH vowel is very similar to the oː of German. This clip could easily be a native pronunciation of contemporary SSB torn and dorm:
But in fact it’s a German speaker saying Ton and Dom.
And again, there’s some similarity to the o of Italian, though the latter tends to be more exactly cardinal 7-like. Compare English sauna from the online Oxford Advanced Learner’s Dictionary with Italian sono from the Dizionario italiano multimediale e multilingue d’ortografia e di pronunzia:
As in RP, the NORTH vowel also functions as the first element of the CHOICE diphthong. Here from a 1945 newsreel is RP voice, followed by its beginning, vɔ.
And here is contemporary voice from the online Macmillan dictionary, again followed by its beginning, vo.
Once again, evidence from smoothing confirms that CHOICE begins with the NORTH quality. A sequence of CHOICE and schwa may optionally become a lengthened version of the diphthong’s first element. Here is telecoms honcho Richard Hooper saying the finances of the Royal Mail need to be stabilized with ɹojəl mɛjəl smoothed to ɹoːl mɛːl.
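The smoothing pattern seen so far (PRICE, MOUTH, FACE and CHOICE followed by schwa) can be written as a simple rewrite rule. The following is only a minimal sketch, assuming broad transcriptions written as plain strings in the symbols used in this article; the names `SMOOTHABLE_DIPHTHONGS` and `smooth` are hypothetical, and the rule is applied everywhere it can apply, although in real speech smoothing is optional.

```python
# Sketch of optional smoothing: diphthong + schwa -> long first element.
# The diphthong symbols follow this article's transcriptions; the helper
# names are illustrative assumptions, not an established tool.

SCHWA = "ə"
LENGTH = "ː"

# PRICE, MOUTH, FACE, CHOICE as transcribed in the text above.
SMOOTHABLE_DIPHTHONGS = ["ɑj", "aw", "ɛj", "oj"]

# Map each diphthong + schwa sequence to its smoothed form,
# e.g. "ɑjə" -> "ɑː".
SMOOTHING = {d + SCHWA: d[0] + LENGTH for d in SMOOTHABLE_DIPHTHONGS}


def smooth(transcription: str) -> str:
    """Apply smoothing to every eligible sequence in a transcription."""
    for sequence, smoothed in SMOOTHING.items():
        transcription = transcription.replace(sequence, smoothed)
    return transcription


print(smooth("ɑjəl"))         # aisle
print(smooth("lɛjə"))         # layer
print(smooth("ɹojəl mɛjəl"))  # Royal Mail
```

A transcription with no diphthong-schwa sequence, such as wɛjawt, passes through unchanged.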
The upper corners of the vowel space, cardinals 1 i and 8 u, were used by RP as long monophthongs in FLEECE and GOOSE. Both of these pronunciations are now old-fashioned, the latter extremely so. Here is at least for a few minutes said by the Queen with old-fashioned liːst and fjuː in her 1957 Christmas message:
And, from Watch With Mother the same year, big leaves and usually with old-fashioned liːvz and juːsually:
In diphthongizing these two vowels, SSB has phonologically rationalized its long vowel system. Whereas the long monophthongs of RP (FLEECE, NURSE, START, NORTH and GOOSE) did not pattern together phonologically, the long monophthongs of SSB (SQUARE, NURSE, START, NORTH, NEAR, PURE) are a natural class: they exhibit r-linking (r-liaison) with a following vowel. Meanwhile FLEECE has joined the class of front-closing diphthongs (FLEECE, FACE, PRICE, CHOICE) which exhibit j-linking with a following vowel, and GOOSE has joined the class of back-closing diphthongs (GOAT, MOUTH, GOOSE) which exhibit w-linking with a following vowel. (Note that whereas r-liaison involves the insertion of an extrinsic ɹ, in j-linking and w-linking the j and w are intrinsic to the first vowel.)
This is a basic part of English phonology; failing to observe it is a glaring but routine characteristic of non-native speech, even with advanced learners. For example, foreigners tend to pronounce initialisms like B.A. or U.R.L. as BʔA and UʔRʔL rather than the native pronunciations, BjA and UwRɹL.
So in SSB, the upper corners of the vowel space function as the endpoints of its two subclasses of diphthongs. I prefer to transcribe them as the semivowels j and w, which are exactly equivalent in terms of the vowel space, but have additional benefits. First, j and w make explicit that the English diphthongs are falling, ie moving from greater to lesser prominence. Second, they reinforce English vowel linking for the foreign learner, eg way out wɛjawt, show off ʃəwɔf, as well as initialisms like those in the previous paragraph. Additionally, they provide by far the best way to teach cardinals 1 and 8 to native English speakers: simply pronounce intervocalic j or w, then “hit the pause button” in the middle of the semivowel.
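The three linking classes can be made concrete with a small sketch. It assumes transcriptions in the style used here, where front-closing diphthongs end in j, back-closing diphthongs end in w, and long monophthongs end in the length mark; the function names and the vowel inventory `VOWELS` are hypothetical choices for illustration. Only the long-monophthong case of r-liaison described above is handled.

```python
# Sketch of SSB vowel linking, assuming this article's transcriptions.
# j and w are already part of the first vowel, so only r-liaison
# requires inserting an extra segment.

VOWELS = set("aɑɛɪɔoɵʉə")  # vowel qualities used in this article
LENGTH = "ː"
LINKING_R = "ɹ"


def linking_class(long_vowel: str) -> str:
    """Classify a long vowel by how it links to a following vowel."""
    if long_vowel.endswith("j"):
        return "j-linking"
    if long_vowel.endswith("w"):
        return "w-linking"
    if long_vowel.endswith(LENGTH):
        return "r-linking"
    raise ValueError(f"not a long vowel or diphthong: {long_vowel!r}")


def link(*words: str) -> str:
    """Join transcribed words, inserting ɹ after a long monophthong
    when the next word begins with a vowel."""
    result = words[0]
    for following in words[1:]:
        if following and following[0] in VOWELS and result.endswith(LENGTH):
            result += LINKING_R
        result += following
    return result


print(link("wɛj", "awt"))       # way out
print(link("ʃəw", "ɔf"))        # show off
print(link("jɵw", "ɑː", "ɛl"))  # U.R.L.
```

Because the j and w are intrinsic to the diphthongs, way out and show off need no inserted segment; only the long monophthong in the letter R triggers an extrinsic ɹ.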
In final position, especially when heavily accented, SSB diphthongs are sometimes followed by a short, schwa-like sound which highlights the final j or w. Here are me and loo said by Kate Winslet:
Such pronunciations clearly differ from the closing diphthongs of old RP with their distinctively not-quite-high endpoints ɪ and ʊ. These sound very old-fashioned, especially in final accented position, e.g. May in this 1932 newsreel clip:
or the Queen saying Christmas Day in 1957:
or the narrator of the 1948 film Quartet saying 24 plays:
Of course, in the less accented contexts of continuous speech the contemporary j and w endpoints will often not be reached. This kind of target undershoot is a universal phenomenon in running speech.
The first element of the GOOSE diphthong brings us into the central area of the vowel space, which is characterized by considerable variability. (As with visual colours, it seems that vowels along the periphery of the perceptual space are more narrowly defined; see my post on the vowel space.) The start of the GOOSE diphthong is, I think, phonologically equivalent to the FOOT vowel. This, as I’ve discussed in another post, is no longer the German kaputt vowel ʊ, as exemplified in old RP by took and good from Watch With Mother, 1957:
The FOOT of contemporary SSB is very like “French schwa”, and is well described as the rounded close-mid vowel ɵ. Here is Kate Winslet’s could:
If we remove the final d and reduce the initial aspiration, it makes a pretty convincing French que:
And if we isolate the first part of her loo (so to speak), we again hear a similarity to French le:
In narrow phonetic terms, the start of GOOSE is usually a bit closer than the FOOT vowel, and the IPA symbol ʉ exists to denote this; however, I’m not aware that any language contrasts ɵ and ʉ, and for that reason I consider ɵw and ʉw equally acceptable transcriptions of SSB GOOSE.
Certainly the widespread transcriptions ʊ and uː are fossils as far as contemporary SSB is concerned (though the ʊ of kaputt of course survives in the north of England and in Australia). I commend those Germans (and others) who trust their ears more than their dictionaries, and use their ü rather than their u when speaking English.
The fronting of FOOT and GOOSE may well be due to the raising of LOT and NORTH. It may also be associated with the inexorable rise of l-vocalization in southern England. This not only creates new w-diphthongs, but in its syllabic form also creates a new high back rounded vowel in words like hospital and model, arguably helping to push FOOT and GOOSE forward.
The KIT vowel is very similar to cardinal 2 e, but descriptions of RP used the symbol e for RP’s relatively close DRESS vowel, turning for KIT to the non-cardinal ɪ. I don’t have a recording of Daniel Jones demonstrating ɪ, so here are both e and ɪ performed by John Wells and by Paul Meier:
I think I got those demonstrations in the right order; frankly I find it hard to tell them apart. The KIT vowel is probably no further from cardinal 2 e than the START vowel is from cardinal 5 ɑ. So from a phonetic point of view it would make perfectly good sense to use e for the KIT lexical set, and this would be supported also by its frequent correspondence in unstressed syllables to orthographic e, as in wanted, between, knowledge, decide, basket, glasses, etc. But the potential for confusing foreign learners is perhaps too great. So, for practical rather than phonetic reasons, it’s reasonable to continue using the special non-cardinal symbol ɪ for KIT.
The FLEECE diphthong is a narrow one, but when pronounced emphatically it can certainly be heard in its full form, ɪj. Here is Kate Winslet’s me again, followed by its first part, audibly ɪ.
Daniel Radcliffe’s me has a first element which is still more open, especially in the second of two utterances here:
Like all diphthongs, FLEECE is liable in continuous speech to be compressed towards a monophthong. But foreign learners should practice its full form; Germans, for example, should definitely avoid using their iː for FLEECE.
commA-STRUT, NURSE, GOAT
I explored the mid and lower central area of the vowel space at considerable length in another post. Schwa ə is at least as key to the sound of SSB as it was to that of RP, occurring not only as the ubiquitous pause vowel, but also across the lexical sets commA, NURSE, GOAT and STRUT. Many will prefer to give STRUT its own position in the vowel space, lower and/or backer than schwa (but hard to pin down since it is so variable). In strict phonemic terms, however, STRUT and commA are not truly contrastive, so no confusion is caused by pronouncing them similarly, as millions of British and American speakers do. On the other hand, learners who aim for a distinctive, more peripheral STRUT risk confusing it with TRAP, START or LOT. Acquiring a good schwa is a vastly higher priority, and learners are far better off saying Pizza H[ə]t than either Pizza Hat or Pizza Hot, which are the usual errors of those striving after the mercurial grail of “ʌ”, a chalice not of silver but quicksilver.
Finally, the lexical sets NEAR and PURE. In RP, these patterned with SQUARE as centring diphthongs: ɪə, ʊə, eə. In SSB, on the other hand, SQUARE is now one of the long monophthongs, while a large and ever-increasing number of PURE words have switched to another of the long monophthongs, NORTH – including for many speakers the word cure, which therefore no longer serves as an effective keyword for the set. Here is cure pronounced as kjoː from the online Cambridge dictionary:
In SSB, NEAR words and the surviving PURE words are widely “varisyllabic”. In strongly accented phrase-final position, they’re sometimes heard as sequences of FLEECE/GOOSE + schwa, ie ɪjə and ʉwə. More generally they tend to be heard in the ‘smooth’ monosyllabic forms ɪː and ɵː.
To illustrate the difference, here are year as jɪjə from the Cambridge online dictionary, and jɪː from the BBC’s Andrew Plant in this year:
And here are secure as sɪkjʉwə from the Macmillan online dictionary and cure as kjɵː from the BBC’s Elise Wicker in cure seekers and cure a variety:
In less strongly articulated positions, the smooth ɪː and ɵː predominate in SSB (remembering, of course, that the PURE set is dwindling). What’s clear is that the centring diphthongs of RP are decreasingly heard as such in SSB, and that a distinctive lexical set of PURE words is becoming marginal.
Here, then, is a slightly modified vowel space for SSB:
And here are the vowels listed by lexical set, cross-classified, with a commA-STRUT distinction and, where different, more familiar Gimsonian symbols between slanting brackets:
To sum up. The vowels of SSB are spread more evenly around the vowel space than RP’s were. They employ fewer qualities, and fewer non-primary qualities, making them easier to learn. Each of the short vowels except LOT (and, if distinct, STRUT) functions also in the long vowel system, which was not true of RP. The long vowels of SSB fall into three phonological classes according to linking behaviour, an increase in pattern regularity compared with RP. RP’s set of centring diphthongs is no longer needed.
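The claim about short vowels recurring in the long vowel system can be checked mechanically. The sketch below assumes the transcriptions argued for in this article, collected by hand into two dictionaries (the NURSE and GOAT entries are my reading of the schwa discussion, and ʌ stands in for a distinct STRUT); the function name `unpaired_short_vowels` is hypothetical.

```python
# Sketch: which short vowels never serve as the first element of a
# long vowel or diphthong? Symbols are this article's transcriptions,
# gathered by hand; treat the tables as an illustrative assumption.

SHORT_VOWELS = {
    "KIT": "ɪ", "DRESS": "ɛ", "TRAP": "a",
    "LOT": "ɔ", "FOOT": "ɵ", "commA": "ə",
}

LONG_VOWELS = {
    # front-closing diphthongs (j-linking)
    "FLEECE": "ɪj", "FACE": "ɛj", "PRICE": "ɑj", "CHOICE": "oj",
    # back-closing diphthongs (w-linking)
    "GOAT": "əw", "MOUTH": "aw", "GOOSE": "ɵw",
    # long monophthongs (r-linking)
    "NEAR": "ɪː", "SQUARE": "ɛː", "START": "ɑː",
    "NORTH": "oː", "PURE": "ɵː", "NURSE": "əː",
}


def unpaired_short_vowels(short: dict, long: dict) -> list:
    """Return the short-vowel sets whose quality starts no long vowel."""
    long_first_elements = {symbol[0] for symbol in long.values()}
    return sorted(
        name for name, quality in short.items()
        if quality not in long_first_elements
    )


print(unpaired_short_vowels(SHORT_VOWELS, LONG_VOWELS))

# Treating STRUT as distinct from commA, with the traditional symbol ʌ:
with_strut = {**SHORT_VOWELS, "STRUT": "ʌ"}
print(unpaired_short_vowels(with_strut, LONG_VOWELS))
```

Under these assumptions the only short vowels left without a long partner are LOT and, when it is kept distinct, STRUT.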
Daniel Jones happened to codify the empire’s upper class speech during a phase of considerable phonetic unnaturalness. The social authority of the accent, and the scholarly authority of Jones’s and Gimson’s description of it, enshrined and perpetuated that description throughout and despite what has ironically been a period of quite exceptional sociophonetic change – change that was driven by shifts in British society and culture, and perhaps also by the very unnaturalness, and hence instability, of RP itself.
Today we can still identify a relatively prestigious accent which will serve those foreign learners who want a British model. This contemporary BrE accent, described above, embodies various developments away from the relative unnaturalness of RP; quite aside from the desirability of recording contemporary speech accurately, it makes overwhelming sense to avail foreign learners of this greater naturalness.