It is well established that multiple speakers add significant complexity to the task of transcribing audio. Two or more speakers can be hard to tell apart, especially on a sound recording. The situation is made worse when speakers interrupt or talk over one another, which can make it impossible to discern exactly what was said and by whom. Interrupted speech is often fast-paced, and interruptions themselves tend to increase the pace of a conversation as each individual competes to be heard. This in turn can create a nightmare for a transcriptionist, who must listen to the audio repeatedly to tease out the precise language and to identify each speaker. There is also the formatting task of starting a new labelled paragraph each time the speaker changes, which can itself become time-consuming when two or more speakers with short utterances are involved.
One obvious (but often overlooked) factor that affects the ease of identification is how similar the speakers’ voices sound. Two similar-sounding male or female voices can be incredibly hard to distinguish, especially once background noise and the nuances of the recording environment are introduced. If the sound quality of the recording isn’t crystal clear, it can become nigh impossible in some cases. This is unsurprising when you consider that an outsourced transcriptionist will be unfamiliar with the speakers and their speech idiosyncrasies. The greater the number of speakers, the more challenging the transcription becomes.
Despite these factors, TTP has a significant advantage over automatic voice recognition software (which is rarely accurate with multiple speakers), and over much of our competition, as we are specialists in multi-speaker audio. Our experienced transcriptionists are trained to handle audio containing more than one speaker and are skilled in identifying individual voices. TTP transcriptionists take the time to distinguish speakers, replaying the audio several times if necessary, and are required to format the document in a professional and standardised way that clearly identifies each speaker. In instances where a transcriptionist is unable to discern speech or distinguish between speakers, our practice is to use time stamps. This allows clients to quickly cue the audio to the exact moment of missing speech so that they can attempt to fill in the blanks themselves.
Call to Action!
If you have audio recordings of meetings, forums, interviews, or presentations containing multiple speakers, why not defer to the experts? To find out more, or for a quote, please contact us today.