Georgina Brown (Lancaster University, UK)
Title: Synthetic speech is more than a threat.
Abstract: Speech synthesis systems make up a substantial group of “spoofing” techniques that could, in principle, generate speech samples for fraudulent or other malicious purposes. Rapid developments within speech synthesis undoubtedly mean that these technologies will continue to contribute to widespread concerns around “deepfakes”. The threat of synthetic speech now has a firm presence in the media, but there is also a body of academic research that has accumulated over the last decade. Research attention has largely been afforded to how automatic speaker recognition systems perform when presented with different forms of spoofed speech. There has also been research on how well spoofed speech can be detected by automatic classifiers and human listeners. As the need for ongoing “spoofing-aware” research is clear, it would not come as a surprise to see further work on spoofed speech in presentations at IAFPA 2023. One aspect that unites much of the existing research literature on spoofed speech is the impressive performance of one particular speech synthesis technique. This talk places this method under the microscope; however, instead of focusing on the threatening aspect of synthetic speech, this talk will zoom in on the opportunities that have become available to forensic speech science. Specifically, this talk demonstrates how speech synthesis methods could help to make automatic speaker recognition systems more explainable.
Catalin Grigoras (UC Denver, USA)
Title: Digital Audio Authentication: Framework, Challenges and Solutions.
Abstract: The goal of this presentation is to provide a summary review of the latest developments in conducting comprehensive examinations of digital audio authenticity, which rely on the results of multiple analyses to inform an ultimate scientific finding or unbiased opinion. Digital audio authentication is the process of establishing the provenance of a questioned recording to determine whether it is consistent with an original or whether there is evidence of editing. This presentation proposes a logical organization of several techniques for the authentication of digital audio recordings. Special attention is given to interpreting results from individual analyses and incorporating them into a holistic view of a recording’s authenticity, where a finding can be corroborated against the results of other analyses. Only in this way can an examiner present a conclusion with confidence and assurance that all possible hypotheses have been exhausted in the execution of this important endeavor. The framework for digital audio authentication that will be discussed involves accurate, repeatable, reliable, unbiased, and scientific analyses derived from peer-reviewed publications in order to meet court guidelines or case precedent, best practice recommendations, and the appropriate criteria for international legal systems. The presentation will also cover some current challenges and solutions.
Title: Speaker verification and identification for criminal investigations and forensics.
Abstract: The talk will present innovative technologies that aim to help law enforcement agencies process large-scale, diverse cases more efficiently, with the goal of unmasking criminal networks and their members and revealing the true identity of perpetrators. The recently concluded EC project ROXANNE, which combines speech and language technologies and visual analysis with network analysis, will be introduced, and its achievements will be demonstrated on various realistic data. Particular focus will be given to speech processing applications (i.e., speaker identification and verification) deployed by investigators to automatically analyse and cluster lawfully intercepted communications, specifically for cross-border organised crime. The talk will also introduce technological solutions for deploying speaker recognition in forensic cases.