March 2016 - english speech services

Phonetics people often have pet sounds, pronunciations which particularly interest them and catch their attention. One of mine is the bunched or molar r, particularly the clear variety heard in Lancashire, the Caribbean and traditional New York City. (A dark, pharyngealized variety is common in General American.) I even wrote a blog post about it some years ago, with special reference to its idiosyncratic and distinctive use by Daniel 007 Craig.

So when I saw the long-awaited new Star Wars film, The Force Awakens, a part of my mind registered the bunched-sounding r of a white-helmeted stormtrooper who appears in a single brief scene halfway through.

It was only some time later that I came across rumours to the effect that this stormtrooper was a cameo performance by none other than Daniel Craig, whose Bond epic Spectre had been simultaneously filming next door at Pinewood Studios.

The Force Awakens is about to become available to buy or rent for home viewing, but audio recordings of the stormtrooper scene have sneaked onto YouTube. Here’s one line of dialogue (with a *minor spoiler alert* as the line hints at a plot development, but if you know enough about the world of Star Wars to take the hint, you’ve probably seen the film already):

I will remove these restraints and leave the cell with the door open

Our recognition of familiar voices (and faces, etc.) is something that happens automatically, without conscious effort. Primed as I now am for the possibility that the voice is Craig’s, I find that the clip does trigger a reaction that it’s him. On a more considered evaluation, the pitch range seems right, the sibilants sound right, and spectrographic analysis suggests that those rʼs have about the same meeting-point of F2 and F3 as in the known recordings of Craig from my earlier post (roughly 1500 Hz). Here are some smaller bits of Craig and stormtrooper, respectively, so you can make your own ear-comparisons.

report / remove

I’m writing / I will

ribs / restraints
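
For readers who like to see the measurement idea spelled out, here is a minimal Python sketch of what "the meeting-point of F2 and F3" could mean in practice: given two formant tracks sampled frame by frame, find the frame where they come closest and take the midpoint frequency. The function name and the numbers are hypothetical, invented purely for illustration; they are not measurements from the Craig or stormtrooper recordings, and this is not necessarily how the comparison described above was carried out.

```python
from typing import List, Tuple


def formant_meeting_point(
    f2_track: List[float], f3_track: List[float]
) -> Tuple[int, float, float]:
    """Return (frame index, midpoint in Hz, gap in Hz) at the frame
    where F2 and F3 are closest together."""
    if not f2_track or len(f2_track) != len(f3_track):
        raise ValueError("tracks must be non-empty and of equal length")
    # Distance between the two formants in each frame
    gaps = [abs(f3 - f2) for f2, f3 in zip(f2_track, f3_track)]
    # Frame with the smallest F2-F3 gap
    i = min(range(len(gaps)), key=gaps.__getitem__)
    midpoint = (f2_track[i] + f3_track[i]) / 2
    return i, midpoint, gaps[i]


# Invented illustrative values in Hz, NOT real measurements
f2 = [1100.0, 1250.0, 1400.0, 1450.0, 1300.0]
f3 = [2400.0, 2000.0, 1700.0, 1550.0, 1900.0]

frame, midpoint, gap = formant_meeting_point(f2, f3)
print(f"closest at frame {frame}: midpoint {midpoint:.0f} Hz, gap {gap:.0f} Hz")
```

In real work the tracks would come from a formant tracker in a speech analysis tool, and a single number like this would be only one small part of the comparison.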

Do I think it’s Daniel Craig? Yes.

Is this identification beyond reasonable doubt? No. I certainly wouldn’t make such an assertion if this were professional forensic casework, of which I’ve done a fair bit.

For one thing, much of my personal confidence that this is Craig derives from non-phonetic evidence. Fellow Force Awakens actor Simon Pegg apparently revealed the Craig cameo (though Craig himself had made a show of denying it). Media websites say it’s now ‘confirmed’ that the stormtrooper is Craig, but as far as I’m aware an official source hasn’t been named. Perhaps the DVD/Blu-ray extras will settle the matter.

But a forensic phonetic opinion needs to be based on the phonetic evidence. In the present case, this is scanty. The stormtrooper scene is brief, its vocabulary limited. Acoustically, the stormtrooper’s speech is filtered, by a physical helmet or by digital tinkering, or by both. This is like a lot of speaker-comparison casework, where the samples are often telephone recordings, or from other low-quality sources. Further, there’s an accent discrepancy between the Englishman Craig and the American-sounding stormtrooper; in the forensic situation, this would be called ‘accent disguise’.

A fundamental issue is the plasticity of speech. Even for a single individual, speech doesn’t stay still, like a fingerprint. It fluctuates rapidly from sound to sound, word to word, day to day; it’s affected by style, mood, fatigue, health and other factors, as well as being susceptible to deliberate manipulation. As a result there are serious limitations on the ability of experts to ‘identify’ voices. These loom large in the Code of Practice of the International Association for Forensic Phonetics and Acoustics:

2. …Members should maintain awareness of the limits of their knowledge and competencies when agreeing to carry out work.
4. Members should make clear, both in their reports and in giving evidence in court, the limitations of forensic phonetic and acoustic analysis.

The very term ‘identification’ is dangerous, conjuring ideas of 100% confidence in the minds of jurors (and of police officers, and lawyers, and judges) who for years have been watching spy films in which a few spoken words unlock access to top secret Stuff. Note how J P French Associates, Britain’s longest-established forensic speech and acoustics lab, describe their speaker comparison services:

Speaker comparison conclusions are stated in terms of support for the view that speakers in different recordings were the same, against the view that they were different speakers: in the form ‘the evidence provides strong support for the view that the questioned caller is the suspect’, for example. The scale of support statements used is that recommended by the (UK) Association of Forensic Science Providers.

Speaker comparison cannot provide the same strength of evidence as DNA, for example, so terms like ‘a match’ or ‘identification’ are not used.

Such caution and reserve can seem exasperating to police officers who’ve spent so long on a case that they can easily ‘recognize’ the voice of a suspect in the recordings they serve up to an expert witness.

But I always keep in mind the times I’ve mistakenly thought I’d recognized a voice – occasions which have brought home the limits of ‘identification’ rather more vividly than a Code of Practice. Most recently, writing my Words of the Week post on Great testes of America, I thought I’d check the identity of the voice in McDonald’s UK TV ads. My brain had always categorized the voice, without any real thought, as that of TV-radio presenter Paul Ross (brother of the more famous talk show host Jonathan). But an online search suggested that it’s really writer-director Dexter Fletcher, whose voice I was less familiar with. Despite the similar accent, comparison shows clear differences between the two voices, and shows that the McDonald’s voice is more like Fletcher’s. Brief examples:

Fletcher / Ross

Ross / McDonald’s man

Categorizing familiar voices (and faces) on autopilot is valuable in our day-to-day lives. The expert witness has to listen and measure more carefully. There may be cases where dissimilarities make two speech samples very unlikely to be from the same individual. But when it comes to identification, doubt generally remains.

––
For further reading on speaker identification, a much-cited 2001 paper by Francis Nolan, ‘Speaker identification evidence: its forms, limitations, and roles’, is available here, and a more recent chapter by Dominic Watt is available here. A remarkably reasonable newspaper article is this one from 2013:

The race to fingerprint the human voice

“You do not have to say anything, but it may harm your defence if you do not mention, when questioned, something you later rely on in court. Anything you do say may be given in evidence.”
