Eminem and the “gay lisp”

In a recent post I referred to Eminem’s use of two kinds of [s] and [z] sounds. One is the plain kind which he uses when speaking in interviews:

When he raps, on the other hand, his [s] and [z] are shifted up to higher acoustic frequencies:

Such high-frequency sibilants (hi-f for short) are generally articulated by shifting the tongue-blade closer to the teeth, and so can be described as “fronted s and z”, or [s̟] and [z̟].

Having encountered on YouTube some recordings of Eminem rapping a cappella (i.e. free of musical accompaniment), I thought it might be worth comparing measurements of his rapping [s̟] with the plain [s] in his speaking voice. For the latter, I chose his October 2010 interview with Anderson Cooper on CBS TV’s 60 Minutes. For the rap I chose ‘Till I Collapse because most of it is a single-voice track with no multiple dubs.

I restricted my measurements to the fortis fricative “s”, leaving out its lenis counterpart “z”, which is sometimes accompanied by voicing (vibrating vocal cords). Here are the plain [s] sounds from the interview:

And here are the fronted [s̟] sounds from the rap:

It’s easy to hear that the rap sibilants are “higher” than the speech sibilants. To show this graphically, I made long-term average spectra (LTAS) of the two groups of sounds. The spectrum of a sound shows where its acoustic energy is located in the frequency range, which is the horizontal axis in the following graph. The average range for rapping [s̟], in red, is clearly higher (shifted to the right) than the average range for speaking [s]:
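For readers curious about what an LTAS involves, here is a minimal Python sketch. It is not the procedure used for the graph in this post: the function names, the 128-sample frame length and the 20 kHz sample rate are all assumptions made for illustration, and the two "recordings" are synthetic pure tones standing in for real sibilants (which are noise, not tones). The idea is simply to cut the signal into short frames, take the power spectrum of each, and average them.

```python
import cmath
import math

SAMPLE_RATE = 20000  # assumed rate; it gives a 10 kHz upper frequency limit
FRAME_LEN = 128      # assumed frame length, in samples


def power_spectrum(frame):
    """Power in each non-negative frequency bin of one frame (naive DFT)."""
    n = len(frame)
    # A Hann window tapers the frame edges to reduce spectral leakage.
    windowed = [x * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    powers = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        powers.append(abs(total) ** 2)
    return powers


def ltas(samples, frame_len=FRAME_LEN):
    """Average the power spectra of consecutive, non-overlapping frames."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    if not frames:
        raise ValueError("signal is shorter than one frame")
    spectra = [power_spectrum(f) for f in frames]
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def peak_frequency(avg_power, sample_rate=SAMPLE_RATE, frame_len=FRAME_LEN):
    """Frequency (Hz) of the bin holding the most average power."""
    k = max(range(len(avg_power)), key=avg_power.__getitem__)
    return k * sample_rate / frame_len


def tone(freq_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Synthetic stand-in signal: a pure sine tone, not a real sibilant."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


# Two stand-in signals, one with its energy lower and one higher.
low_peak = peak_frequency(ltas(tone(5000, 1024)))
high_peak = peak_frequency(ltas(tone(7500, 1024)))
print(low_peak, high_peak)
```

A real analysis would run on the excerpted [s] tokens rather than on tones, and would normally use an FFT library or a phonetics package instead of the slow hand-written transform shown here.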

Eminem’s speaking [s] is acoustically typical for the plain [s] of English speakers, with most of the energy concentrated above 4000 Hz. (This is why [s] sounds are not well transmitted by conventional telephones, which have an upper frequency limit of about 3400 Hz.) But the rapping [s̟] is obviously far higher still – in fact it extends above the 10 kHz upper limit of the recordings I had to work with.
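The telephone remark can be made concrete with a small Python sketch. The spectrum below is a set of invented toy numbers, not measurements from the recordings, and the function name is hypothetical; the only figure taken from the text is the 3400 Hz limit. Given the power at a series of frequencies, the function reports what share of the total lies above a cutoff, and would therefore be lost on a channel that stops there.

```python
TELEPHONE_LIMIT_HZ = 3400  # upper limit of a conventional telephone, per the text


def fraction_above(freqs_hz, powers, cutoff_hz):
    """Share of total power found at frequencies above cutoff_hz."""
    total = sum(powers)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    above = sum(p for f, p in zip(freqs_hz, powers) if f > cutoff_hz)
    return above / total


# Invented toy spectrum, shaped roughly like an [s] with its
# energy concentrated above 4000 Hz. These are NOT measured values.
toy_freqs = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000]
toy_powers = [1, 1, 2, 6, 10, 10, 6, 4]

lost = fraction_above(toy_freqs, toy_powers, TELEPHONE_LIMIT_HZ)
print(f"{lost:.0%} of this toy spectrum lies above the telephone limit")
```

The same function could be pointed at an averaged spectrum of real [s] tokens to compare how much of each variety a telephone channel would discard.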

(One difference between Eminem’s speaking [s] and rapping [s̟] is “smeared” away by the long-term average spectra, namely the greater consistency of the rapping [s̟], compared with the audible variability of the speaking [s]. This follows from Eminem’s style of rapping, which involves a constantly high-volume, harsh-voice delivery in a high pitch range with a narrow intonational repertory.)

As I said in my other post, Eminem’s clearly hi-f sibilants in his rapping

struck me particularly because the long-standing association of hi-f [s] and [z] with “femininity” in the anglophone world seemed somewhat at odds with the machismo of rap culture.

It’s this association, with what I called heightened or stereotypical “femininity”, that has given rise to the widespread description of hi-f sibilants or fronted s/z as a “gay lisp”. I think many, perhaps most, people “get” what the term refers to; over a quarter of a million YouTube viewers seem to know what’s going on in this satirical clip.

Clearly it’s true that hi-f sibilants are used by some gay men, throughout the English speaking world and, to some extent, beyond it.  A British example of a well-known gay man with hi-f sibilants is Harry Derbidge from the TV reality soap The Only Way Is Essex.  Here he’s saying “In Essex it’s, I’d say it is pretty easy”:

The gay character Kurt Hummel on TV’s Glee has hi-f sibilants, but to my ear they’re less consistently so in the interview speech of Chris Colfer, the (gay) actor who plays him. The phonetic situation is slightly complicated by the fact that Colfer often uses a grooved dental [s̪], with the tongue tip audibly and visibly raised to the upper incisors, whereas classic hi-f sibilants have a lowered tongue-tip. Here’s Colfer:

Nonetheless, considerations of political correctness aside, there are clearly a number of things that make the term “gay lisp” wrong.

Firstly, it’s not a lisp. A lisp is the pronunciation of a target sibilant which fails to realise the sibilance; it’s a sporadic phenomenon characteristic of an individual, not of a community. The most common lisp is dental, pronouncing “s” like “th”. Although hi-f sibilants typically have fronted articulation towards the teeth, they retain sibilance.

Secondly, many gay men don’t have hi-f sibilants, and in fact are not linguistically distinct in any way from non-gay people. I don’t hear any “gay” features in Sir Elton John’s speech:

Thirdly, hi-f sibilants are not associated with gay women. (The title of the satirical clip above refers relatively accurately to “gay man’s lisp”.)  But – fourthly – “feminine” women are as capable as men of markedly fronting their sibilants. The various Kardashians seem to have hi-f sibilants:

An example of a British “hi-f woman” is Amy Childs, the beautician from The Only Way Is Essex. Here she’s saying “Does it fit nice? I feel like a goddess.”

So hi-f sibilance is associated with heightened or stereotypical “femininity” independently of speaker sex or sexual orientation.  The sound-symbolic basis for this association can presumably be traced to the fact that females are on average smaller than males, and smaller physical bodies produce acoustically higher frequencies than larger ones.  This is why violins and bongos sound higher, cellos and kettle drums lower.

But then there’s Eminem. He not only “code-switches” between plain and hi-f sibilants, but uses the hi-f variety in an aggressive “macho” genre which has been accused of sexism and homophobia.

Of course most speech sounds are not sound-symbolic at all, but purely arbitrary. The general public may believe that French “sounds romantic”, that Italian “sounds emotional”, that German “sounds harsh”, etc. But this is just a displacement onto those languages of stereotypical attitudes and prejudices which cling, for better or worse, to the people who speak them.  Generally, languages and accents have the sounds they do through a combination of universal properties and historical accident, not because they symbolize their speakers.

So we find that hi-f sibilants turn up in linguistic systems without any sexual baggage at all. As I said in my earlier post, French is to my ear a hi-f language. Many (though not all) French speakers use hi-f sibilants without any “feminine” connotation. In this interview with French President Sarkozy, the first interviewer clearly has hi-f sibilants. Sarkozy’s [s] and [z] are generally less hi-f, but for example his emphatic savez-vous at 0:17 begins with an [s] which is decidedly hi-f by cross-linguistic standards:

And hi-f sibilants seem to be characteristic of MLE (Multicultural London English), which has succeeded Cockney as the dialect of London’s East End. (See here for my post and here for an excellent introduction to MLE.)

My hunch is that the hi-f sibilants of today’s young inner-city Londoners share a common ancestry with Eminem’s – in African American hip hop culture. Here are Grandmaster Flash and the Furious Five in the early 80s:

So although the “gay” association may have predominated in people’s minds, hi-f sibilants seem to have been around unnoted in hip hop for decades.