Eminem and the “gay lisp”

In a recent post I referred to Eminem’s use of two kinds of [s] and [z] sounds. One is the plain kind which he uses when speaking in interviews:

When he raps, on the other hand, his [s] and [z] are shifted up to higher acoustic frequencies:

Such high-frequency sibilants (hi-f for short) are generally articulated by shifting the tongue-blade closer to the teeth, and so can be described as “fronted s and z”, or [s̟] and [z̟].

Having encountered on YouTube some recordings of Eminem rapping a cappella (i.e. free of musical accompaniment), I thought it might be worth comparing measurements of his rapping [s̟] with the plain [s] in his speaking voice. For the latter, I chose his October 2010 interview with Anderson Cooper on CBS TV’s 60 Minutes. For the rap I chose ‘Till I Collapse because most of it is a single-voice track with no multiple dubs.

I restricted my measurements to the fortis fricative “s”, leaving aside its lenis counterpart “z”, which is sometimes accompanied by voicing (vibrating vocal cords). Here are the plain [s] sounds from the interview:

And here are the fronted [s̟] sounds from the rap:

It’s easy to hear that the rap sibilants are “higher” than the speech sibilants. To show this graphically, I made long term average spectra (LTAS) of the two groups of sounds. The spectrum of a sound shows where its acoustic energy is located in the frequency range, which is the horizontal axis in the following graph. The average range for rapping [s̟], in red, is clearly higher (shifted to the right) compared with the average range for speaking [s]:
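For readers who want to try something similar, here is a minimal sketch of how a long term average spectrum can be computed. It is not necessarily the procedure behind the graph in this post: the function name `ltas_db`, the 512-sample Hann-windowed frames and the 22050 Hz sample rate are all assumptions made for illustration, and the two pure tones at 5000 Hz and 8000 Hz are synthetic stand-ins, not measurements of anyone's sibilants.

```python
import numpy as np

def ltas_db(tokens, sample_rate, frame_len=512):
    """Long term average spectrum (dB) pooled over a list of 1-D sample arrays.

    Each token is cut into half-overlapping Hann-windowed frames; the power
    spectra of all frames from all tokens are averaged together.
    """
    window = np.hanning(frame_len)
    hop = frame_len // 2
    power_sum = np.zeros(frame_len // 2 + 1)
    n_frames = 0
    for samples in tokens:
        samples = np.asarray(samples, dtype=float)
        for start in range(0, len(samples) - frame_len + 1, hop):
            frame = samples[start:start + frame_len] * window
            power_sum += np.abs(np.fft.rfft(frame)) ** 2
            n_frames += 1
    if n_frames == 0:
        raise ValueError("no token is long enough for one frame")
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    mean_power = power_sum / n_frames
    # Small constant avoids log(0) for empty bins.
    return freqs, 10.0 * np.log10(mean_power + 1e-12)

# Synthetic stand-ins (assumed values, not real speech data).
sr = 22050
t = np.arange(int(0.1 * sr)) / sr
plain_tokens = [np.sin(2 * np.pi * 5000 * t)]
fronted_tokens = [np.sin(2 * np.pi * 8000 * t)]

freqs, plain_db = ltas_db(plain_tokens, sr)
_, fronted_db = ltas_db(fronted_tokens, sr)
plain_peak = freqs[np.argmax(plain_db)]
fronted_peak = freqs[np.argmax(fronted_db)]
print(plain_peak, fronted_peak)
```

With real recordings, each list would hold the excised [s] tokens from one speaking style, and plotting the two dB curves against `freqs` gives the kind of comparison described here.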

Eminem’s speaking [s] is acoustically typical for the plain [s] of English speakers, with most of the energy concentrated above 4000 Hz. (This is why [s] sounds are not well transmitted by conventional telephones, which have an upper frequency limit of about 3400 Hz.) But the rapping [s̟] is obviously far higher still – in fact it extends above the 10 kHz upper limit of the recordings I had to work with.
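The telephone point can be made concrete by asking what share of a sound's energy lies above a given cutoff. The sketch below is only an illustration: `energy_fraction_above` is a hypothetical helper, and the test signal is an artificial mix of two equal-amplitude tones (1000 Hz and 6000 Hz), not a recording. For a real [s], one would pass in the excised fricative samples instead.

```python
import numpy as np

def energy_fraction_above(samples, sample_rate, cutoff_hz):
    """Fraction of total spectral energy located above cutoff_hz."""
    samples = np.asarray(samples, dtype=float)
    power = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    total = power.sum()
    if total == 0:
        raise ValueError("signal is silent")
    return power[freqs > cutoff_hz].sum() / total

# Artificial signal (assumed values): one tone below and one above
# the roughly 3400 Hz telephone limit mentioned in the text.
sr = 20000
t = np.arange(2000) / sr  # 0.1 s, a whole number of cycles of both tones
signal = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 6000 * t)

lost_share = energy_fraction_above(signal, sr, 3400)
print(round(lost_share, 3))
```

A sound whose energy sits mostly above the cutoff would give a value close to 1, meaning that a channel limited to about 3400 Hz would pass very little of it.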

(One difference between Eminem’s speaking [s] and rapping [s̟] is “smeared” away by the long term average spectra, namely the greater consistency of the rapping [s̟], compared with the audible variability of the speaking [s]. This follows from Eminem’s style of rapping, which involves a constantly high-volume, harsh-voice delivery in a high pitch range with a narrow intonational repertory.)

As I said in my other post, Eminem’s clearly hi-f sibilants in his rapping

struck me particularly because the long-standing association of hi-f [s] and [z] with “femininity” in the anglophone world seemed somewhat at odds with the machismo of rap culture.

It’s this association, with what I called heightened or stereotypical “femininity”, that has given rise to the widespread description of hi-f sibilants or fronted s/z as a “gay lisp”. I think most or many people “get” what the term refers to; over a quarter of a million YouTube viewers seem to know what’s going on in this satirical clip.

Clearly it’s true that hi-f sibilants are used by some gay men, throughout the English speaking world and, to some extent, beyond it.  A British example of a well-known gay man with hi-f sibilants is Harry Derbidge from the TV reality soap The Only Way Is Essex.  Here he’s saying “In Essex it’s, I’d say it is pretty easy”:

The gay character Kurt Hummel on TV’s Glee has hi-f sibilants, but to my ear they’re less consistently so in the interview speech of Chris Colfer, the (gay) actor who plays him. The phonetic situation is slightly complicated by the fact that Colfer often uses a grooved dental [s̪], with the tongue tip audibly and visibly raised to the upper incisors, whereas classic hi-f sibilants have a lowered tongue-tip. Here’s Colfer:

Nonetheless, considerations of political correctness aside, there are clearly a number of things that make the term “gay lisp” wrong.

Firstly, it’s not a lisp. A lisp is the pronunciation of a target sibilant which fails to realise the sibilance; it’s a sporadic phenomenon characteristic of an individual, not of a community. The most common lisp is dental, pronouncing “s” like “th”. Although hi-f sibilants typically have fronted articulation towards the teeth, they retain sibilance.

Secondly, many gay men don’t have hi-f sibilants, and in fact are not linguistically distinct in any way from non-gay people. I don’t hear any “gay” features in Sir Elton John’s speech:

Thirdly, hi-f sibilants are not associated with gay women. (The title of the satirical clip above refers relatively accurately to “gay man’s lisp”.)  But – fourthly – “feminine” women are as capable as men of markedly fronting their sibilants. The various Kardashians seem to have hi-f sibilants:

An example of a British “hi-f woman” is Amy Childs, the beautician from The Only Way Is Essex. Here she’s saying “Does it fit nice? I feel like a goddess.”

So hi-f sibilance is associated with heightened or stereotypical “femininity” independently of speaker sex or sexual orientation.  The sound-symbolic basis for this association can presumably be traced to the fact that females are on average smaller than males, and smaller physical bodies produce acoustically higher frequencies than larger ones.  This is why violins and bongos sound higher, cellos and kettle drums lower.
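The size-to-frequency relation can be illustrated with the textbook formula for a tube closed at one end, whose lowest resonance is f = c / (4L), where c is the speed of sound and L the tube length. The sketch below is a simplification offered under stated assumptions: the function name is hypothetical, the speed of sound is taken as 343 m/s (air at about 20 °C), and the two lengths are arbitrary example values rather than measurements of any instrument or vocal tract.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees C

def quarter_wave_resonance_hz(length_m, speed=SPEED_OF_SOUND_M_PER_S):
    """Lowest resonance of an idealized tube closed at one end: f = c / (4 L)."""
    if length_m <= 0:
        raise ValueError("length must be positive")
    return speed / (4.0 * length_m)

# Arbitrary example lengths (assumptions for illustration only).
longer_tube_hz = quarter_wave_resonance_hz(0.02)
shorter_tube_hz = quarter_wave_resonance_hz(0.01)
print(longer_tube_hz, shorter_tube_hz)
```

Halving the length doubles the resonance, which is the sense in which smaller resonating bodies sound “higher”.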

But then there’s Eminem. He not only “code-switches” between plain and hi-f sibilants, but uses the hi-f variety in an aggressive “macho” genre which has been accused of sexism and homophobia.

Of course most speech sounds are not sound-symbolic at all, but purely arbitrary. The general public may believe that French “sounds romantic”, that Italian “sounds emotional”, that German “sounds harsh”, etc. But this is just a displacement onto those languages of stereotypical attitudes and prejudices which cling, for better or worse, to the people who speak them.  Generally, languages and accents have the sounds they do through a combination of universal properties and historical accident, not because they symbolize their speakers.

So we find that hi-f sibilants turn up in linguistic systems without any sexual baggage at all. As I said in my earlier post, French is to my ear a hi-f language. Many (though not all) French speakers use hi-f sibilants without any “feminine” connotation. In this interview with French President Sarkozy, the first interviewer clearly has hi-f sibilants. Sarkozy’s [s] and [z] are generally less hi-f, but for example his emphatic savez-vous at 0:17 begins with an [s] which is decidedly hi-f by cross-linguistic standards:

And hi-f sibilants seem to be characteristic of MLE (Multicultural London English), which has succeeded Cockney as the dialect of London’s East End. (See here for my post and here for an excellent introduction to MLE.)

My hunch is that the hi-f sibilants of today’s young inner-city Londoners share a common ancestry with Eminem’s – in African American hip hop culture. Here are Grandmaster Flash and the Furious Five in the early 80s:

So although the “gay” association may have predominated in people’s minds, hi-f sibilants seem to have been around unnoted in hip hop for decades.

5 replies
  1. RA says:

    It’s my impression that “hi-f” sibilants are also common in Southern Irish English (as opposed to Ulster English). Listen to this clip of Irish comedian Dara Ó Briain and note the way he pronounces his [s] and [z]. Also listen to the [s] and [z] of Irish linguist Raymond Hickey here. It makes me wonder if the motivation there is to have a larger phonetic distinction between [s] and [z] and the lenited allophones of /t/ and /d/.

    • Geoff Lindsey says:

      Nice observation – Dara’s hi-f sibilants passed me by. Your explanation stands to reason, too; though it’s my impression that the Merseyside variety of t-lenition does have the potential to neutralize t and s. My Latin teacher was a Merseysider, and in my first lesson I thought he said amo, amas, amas.

      I also suspect that those Merseysiders who have hi-f sibilants also have hi-f lenited stops. http://www.youtube.com/watch?v=UlJadzQv5hk

  2. Dinora says:

    Hello, Geoff!

    I’m intrigued about a phenomenon that’s not a gay lisp, but somehow I find it related to this, perhaps because it’s about sibilants: is there anywhere in English-phonology books a mention of the kinds of sibilants used by Brian Sewell and Maggie Smith as the Dowager Countess of Grantham – for example, the kind used by the Dowager Countess when she says “indignation” here


    or Brian Sewell’s /ʃ, ʒ, tʃ, dʒ/:


    Are those retracted consonants or retroflexes? Apicals or laminals? Are they labialized or pronounced with lip protrusion? How widespread are these realizations and are they present in any of the dialects of English?

    • Geoff Lindsey says:

      Yes, those postalveolars were apical for many speakers of conservative RP. I find them very like the corresponding Russian consonants, of which Wikipedia says “/ʂ/ and /ʐ/ are somewhat concave apical postalveolar. They may be described as retroflex, e.g. by Hamann (2004), but this is to indicate that they are not laminal nor palatalized”.

      I can’t think of any varieties of English today where such realizations are basic or common. However, I do hear something auditorily not dissimilar from a few contemporary Southern Brit TV presenters, like Konnie Huq and Dermot O’Leary. Listen to Konnie Huq saying ‘nerd-ish’ at 0:40 here:
      You can see that this newer kind of articulation is unrounded, whereas I think the old RP apicals could be rounded. I’m not sure about the tongue shape in the newer sound. Perhaps it’s a grooved palatal.

    • Rick Bailey says:

      @ Dinora, Geoff:

      I associate those “dark” /ʃ, ʒ, tʃ, dʒ/ sounds with U-RP (Upper Class Received Pronunciation) or at least “very posh and/or old RP.” I personally thought a discussion of those sounds was something missing from linguist J.C. Wells’s description of U-RP in his Accents of English 2 (1982).

      If you watch this clip from The Story of English Episode 7 “The Muvver Tongue” (1986), you’ll notice that Wells himself uses the type of /ʃ/ you’re referring to at about 4m10s into the video in the word socially (the 1st time he uses that word; “…spreading out geographically and socially…”). Wells’s /ʃ/ there is rounded (I slowed the video down to 0.25 speed). Harry Potter’s doppelgänger also has a “dark”-sounding /ʃ/ at about 4m45s (“…called very good English…”). The boy’s /ʃ/ there is unrounded. I didn’t even have to slow down the video for that one. He even looks to be smiling there; that’s a dead giveaway.

      We can also listen to the accent of MP for North East Somerset Jacob Rees-Mogg. I’ve heard a lot of British people comment on the poshness or “plumminess” of his accent. I think part of what gives people that impression is his “dark” /ʃ, ʒ, tʃ, dʒ/ (or at least /ʃ/). He uses a nice, long (almost Welsh-sounding) /ʃ/ in discombobulation (~ 18s) that sounds “dark.” His lips are rounded there.

      Listen to historian David Starkey say “protection” (8:46) and “extortion” (8:51) on Question Time. Both of those /ʃ/’s are rounded.

      There is also comedian Russell Brand, whose accent isn’t anywhere near RP. I’ve always noticed his dark /ʃ, ʒ, tʃ, dʒ/. Listen at 0:52: “…shouldn’t destroy the planet…” That /ʃ/ is clearly rounded. Interestingly, his /ʃ/ in his 1st repetition of shouldn’t right before then (“Here’s the thing that you shouldn’t do…”) doesn’t have that same dark quality, even though it is also rounded. So it must not be lip rounding that’s responsible for the auditory difference between those 2 tokens of /ʃ/.
