Deepfake smiles matter less
The perceptual and emotional impact of deepfakes
In the digital age, where artificial intelligence (AI) crafts deceptively realistic human faces, the emergence of deepfake technology may blur the boundaries between reality and digital fabrication. These AI-generated faces, though technologically astounding, carry a weight of societal implications that demand a thorough examination. A recent study, published in Scientific Reports and conducted by Science of Intelligence (SCIoI) scientists Anna Eiserbeck, Martin Maier, Julia Baum, and Rasha Abdel Rahman, delves into the psychological and neural repercussions tied to the perception of AI-generated faces, focusing especially on the emotional expressions they portray.
Products of generative artificial intelligence, such as deceptively real-looking photos and videos of people, known as deepfakes, are becoming increasingly common. Until now, however, it was unclear how knowing that a face might or might not be real affects how we perceive it and respond to it emotionally. In their new study, the researchers analyzed facial expression ratings and brain responses to smiling, angry, and neutral faces that participants assumed were either real or computer-generated. The results show that a computer-generated smile matters less to us on several levels: it is perceived as less intense, elicits a weaker emotional response in the brain, and appears to give us pause. Angry faces, on the other hand, remain equally threatening, whether we believe them to be genuine or not. These fundamental new findings have implications for how we as a society will deal with deepfakes, both when they are used for good and for ill.
Deepfakes and the Human Brain: A Study of Perception and Emotional Evaluation
The study, involving 30 participants and utilizing EEG technology, explored how the belief that a portrayed individual is either real or a deepfake affects psychological and neural measures of face perception. In the words of SCIoI researcher Martin Maier, “When confronted with smiling faces marked as deepfakes, participants showed reduced perceptual and emotional responses, and a slower evaluative process as opposed to when the faces were marked real. Intriguingly, this impact was not mirrored in the perception of negative expressions, which remained consistent whether believed to be real or fake.” The findings highlight a complex interplay of emotional valence and presumed authenticity, and mark the first time a distinction has been drawn in the psychological impact between positive and negative expressions portrayed by deepfakes. “When real and fake faces are otherwise indistinguishable, perception and emotional responses may crucially depend on the prior belief that what you are seeing is, in fact, real or fake,” added Rasha Abdel Rahman, principal investigator of the study.
To reach these conclusions, the researchers looked at how the brain’s response to images of faces evolves over time, focusing on three stages: early visual perception (up to 200 milliseconds after a face was shown, before we are even aware of seeing it), reflexive emotional processing (at 200 to 350 milliseconds, reflecting our immediate emotional reactions), and higher-level evaluative processing (at 350 milliseconds and later, marking a more thoughtful consideration). They used a method called Event-Related Potentials (ERPs) to track these stages. The findings showed that when people looked at smiles they thought were created by deepfake technology, their typical early visual and emotional responses were weaker. Understanding this has direct implications for different situations in which we may encounter deepfakes. When deepfakes are used, for instance, to bring back younger versions of movie characters, the hope is that the emotional expressions of artificially generated characters appear just as lively and genuine as those of a real actor. In these situations, the study results suggest, knowing that the character is artificially generated may compromise its impact, especially for positive emotions. When deepfakes are used in misinformation campaigns, the results suggest that artificially generated negative content may stick, even though observers may suspect that the images are fake.
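As an illustration of how the three processing stages described above translate into an analysis step, the following minimal sketch averages an ERP waveform within each time window. It is not the study's analysis pipeline: the function name, the 800-millisecond upper bound for the evaluative stage, the sampling rate, and the synthetic waveforms are all assumptions made for this example.

```python
import numpy as np

# Time windows in milliseconds after face onset, following the stages in the text.
# The 800 ms upper bound is an assumption; the article only says "350 milliseconds and later".
WINDOWS_MS = {
    "early_visual": (0, 200),
    "reflexive_emotional": (200, 350),
    "evaluative": (350, 800),
}

def mean_amplitude_by_stage(erp, times_ms, windows=WINDOWS_MS):
    """Average an ERP waveform within each window (start inclusive, end exclusive)."""
    erp = np.asarray(erp, dtype=float)
    times_ms = np.asarray(times_ms, dtype=float)
    result = {}
    for stage, (start, end) in windows.items():
        mask = (times_ms >= start) & (times_ms < end)
        result[stage] = float(erp[mask].mean())
    return result

# Synthetic, hypothetical data: one sample every 4 ms (250 Hz) from 0 to 796 ms.
times_ms = np.arange(0, 800, 4)
# Step-shaped waveform for faces believed to be real (arbitrary units).
erp_real = np.where(times_ms < 200, 1.0, np.where(times_ms < 350, 2.0, 3.0))
# Made-up weaker response for faces believed to be deepfakes.
erp_fake = 0.8 * erp_real

stages_real = mean_amplitude_by_stage(erp_real, times_ms)
stages_fake = mean_amplitude_by_stage(erp_fake, times_ms)
for stage in WINDOWS_MS:
    print(stage, stages_real[stage], stages_fake[stage])
```

Comparing the per-stage averages between the two belief conditions is the kind of contrast the article describes; real ERP analyses would additionally involve many trials, electrode selection, and statistical testing.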
Implications and Future Directions in Deepfake Technology
The findings of this study serve as a cornerstone in understanding the behavioral and neural dynamics of human interaction with AI-generated faces, and they underscore the need for a nuanced approach in devising policies and strategies to navigate the growing sphere of deepfake technology. The results also provide a starting point for further explorations into other domains of AI-generated content such as text, visual art, or music. As deepfake technology continues to evolve, a thorough understanding of its psychological and neural impact becomes central both to using its potential to benefit society and to fortifying societal resilience against the various challenges it poses.
Defending your voice against deepfakes
McKelvey engineers prevent synthesis of deceptive speech by making it more difficult for AI tools to read voice recordings
Recent advances in generative artificial intelligence have spurred developments in realistic speech synthesis. While this technology has the potential to improve lives through personalized voice assistants and accessibility-enhancing communication tools, it has also led to the emergence of deepfakes, in which synthesized speech is misused to deceive humans and machines for nefarious purposes.
In response to this evolving threat, Ning Zhang, an assistant professor of computer science and engineering at the McKelvey School of Engineering at Washington University in St. Louis, developed a tool called AntiFake, a novel defense mechanism designed to thwart unauthorized speech synthesis before it happens. Zhang presented AntiFake Nov. 27 at the Association for Computing Machinery’s Conference on Computer and Communications Security in Copenhagen, Denmark.
Unlike traditional deepfake detection methods, which are used to evaluate and uncover synthetic audio as a post-attack mitigation tool, AntiFake takes a proactive stance. It employs adversarial techniques to prevent the synthesis of deceptive speech by making it more difficult for AI tools to read necessary characteristics from voice recordings. The code is freely available to users.
“AntiFake makes sure that when we put voice data out there, it’s hard for criminals to use that information to synthesize our voices and impersonate us,” Zhang said. “The tool uses a technique of adversarial AI that was originally part of the cybercriminals’ toolbox, but now we’re using it to defend against them. We mess up the recorded audio signal just a little bit, distort or perturb it just enough that it still sounds right to human listeners, but it’s completely different to AI.”
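The general idea Zhang describes, a small bounded change to the signal that produces a large change in what a model extracts from it, can be illustrated with a toy sketch. This is not AntiFake's algorithm: the random linear "encoder" `W`, the function `perturb_against_encoder`, the amplitude bound `eps`, and the test tone are all hypothetical stand-ins, and a real system would target neural speech models and use perceptual constraints rather than a simple amplitude limit.

```python
import numpy as np

def perturb_against_encoder(x, W, eps=0.01, step=0.002, iters=50, seed=0):
    """Return x + delta with |delta| <= eps, chosen to move the embedding W @ x.

    Uses projected sign-gradient ascent on ||W @ delta||^2, the squared
    distance between the original and the perturbed embedding.
    """
    rng = np.random.default_rng(seed)
    # Random start inside the allowed box: the gradient is zero at delta = 0.
    delta = rng.uniform(-eps, eps, size=x.shape)
    for _ in range(iters):
        grad = 2.0 * W.T @ (W @ delta)  # gradient of ||W @ delta||^2 w.r.t. delta
        # Step in the ascent direction, then project back into [-eps, eps].
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
    return x + delta

rng = np.random.default_rng(42)
# Hypothetical "recording": 0.1 s of a 220 Hz tone sampled at 16 kHz.
x = np.sin(2 * np.pi * 220 * np.arange(1600) / 16000)
# Hypothetical linear stand-in for a speaker encoder (32-dimensional embedding).
W = rng.standard_normal((32, x.size)) / np.sqrt(x.size)

x_protected = perturb_against_encoder(x, W)
max_change = np.max(np.abs(x_protected - x))
embedding_shift = np.linalg.norm(W @ x_protected - W @ x)

# Baseline: random noise with the same amplitude bound, for comparison.
noise = np.random.default_rng(0).uniform(-0.01, 0.01, size=x.shape)
random_shift = np.linalg.norm(W @ noise)
print(max_change, embedding_shift, random_shift)
```

In this sketch no sample changes by more than `eps`, yet the optimized perturbation shifts the embedding at least as far as the random noise it starts from, because each ascent step can only increase the convex objective.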
To ensure AntiFake can stand up against an ever-changing landscape of potential attackers and unknown synthesis models, Zhang and first author Zhiyuan Yu, a graduate student in Zhang’s lab, built the tool to be generalizable and tested it against five state-of-the-art speech synthesizers. AntiFake achieved a protection rate of over 95%, even against unseen commercial synthesizers. They also tested AntiFake’s usability with 24 human participants to confirm the tool is accessible to diverse populations.
Currently, AntiFake can protect short clips of speech, taking aim at the most common type of voice impersonation. But, Zhang said, there’s nothing to stop this tool from being expanded to protect longer recordings, or even music, in the ongoing fight against disinformation.
“Eventually, we want to be able to fully protect voice recordings,” Zhang said. “While I don’t know what will be next in AI voice tech — new tools and features are being developed all the time — I do think our strategy of turning adversaries’ techniques against them will continue to be effective. AI remains vulnerable to adversarial perturbations, even if the engineering specifics may need to shift to maintain this as a winning strategy.”
***
Yu Z, Zhai S, and Zhang N. AntiFake: Using adversarial audio to prevent unauthorized speech synthesis. Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, Nov. 26-30, 2023. DOI: https://doi.org/10.1145/3576915.3623209. Code: https://sites.google.com/view/yu2023antifake.
This work was supported by the National Science Foundation and Army Research Office.
Originally published on the McKelvey School of Engineering website.