MSU study: How can AI personas be used to detect human deception?
EAST LANSING, Mich. – Can an AI persona detect when a human is lying – and should we trust it if it can? Artificial intelligence, or AI, has advanced rapidly in recent years and continues to evolve in scope and capability. A new Michigan State University–led study digs deeper into how well AI can understand humans by using it to detect human deception.
In the study, published in the Journal of Communication, researchers from MSU and the University of Oklahoma conducted 12 experiments with over 19,000 AI participants to examine how well AI personas were able to detect deception and truth from human subjects.
“This research aims to understand how well AI can aid in deception detection and simulate human data in social scientific research, as well as caution professionals when using large language models for lie detection,” said David Markowitz, associate professor of communication in the MSU College of Communication Arts and Sciences and lead author of the study.
To evaluate AI in comparison to human deception detection, the researchers drew on Truth-Default Theory, or TDT. TDT holds that people are honest most of the time and that we are inclined to believe others are telling us the truth. This theory helped the researchers compare how AI behaves to how people behave in the same kinds of situations.
“Humans have a natural truth bias — we generally assume others are being honest, regardless of whether they actually are,” Markowitz said. “This tendency is thought to be evolutionarily useful, since constantly doubting everyone would take much effort, make everyday life difficult, and be a strain on relationships.”
To analyze the judgment of AI personas, the researchers used the Viewpoints AI research platform to assign audiovisual or audio-only media of humans for AI to judge. The AI judges were asked to determine if the human subject was lying or telling the truth and provide a rationale. Different variables were evaluated, such as media type (audiovisual or audio-only), contextual background (information or circumstances that help explain why something happens), lie-truth base-rates (proportions of honest and deceptive communication), and the persona of the AI (identities created to act and talk like real people) to see how AI’s detection accuracy was impacted.
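The exact materials are described in the paper; purely as a rough illustration, the condition grid for a design like this could be enumerated along the lines of the Python sketch below, where the persona text, prompt wording, and helper function are hypothetical stand-ins rather than the study’s actual stimuli or the Viewpoints AI interface:

```python
from itertools import product

# Illustrative conditions only; not the study's exact design or materials.
media_types = ["audiovisual", "audio_only"]
contexts = ["with_background", "no_background"]
truth_base_rates = [0.33, 0.50, 0.67]  # share of honest statements shown to each judge

def build_judge_prompt(persona: str, context: str) -> str:
    """Assemble a persona-conditioned instruction for one AI judge (hypothetical wording)."""
    background = "You are given background on the situation. " if context == "with_background" else ""
    return (
        f"You are {persona}. {background}"
        "Review the statement and decide whether the speaker is lying or telling "
        "the truth, then briefly explain your reasoning."
    )

# Cross the factors the way a design like this might.
for media, context, base_rate in product(media_types, contexts, truth_base_rates):
    prompt = build_judge_prompt("a 34-year-old teacher from Ohio", context)
    # ...assign stimuli of the given media type at the given truth base-rate...
```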
For example, one of the experiments found that AI was lie-biased: it was far more accurate for lies (85.8%) than for truths (19.5%). In short interrogation settings, AI’s deception accuracy was comparable to humans’. However, in a non-interrogation setting (e.g., when evaluating statements about friends), AI displayed a truth-bias, aligning more closely with human performance. Overall, the results showed that AI is more lie-biased and much less accurate than humans.
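To make the lie-bias finding concrete, accuracy can be computed separately for lies and for truths. The following minimal sketch uses toy judgments (not the study’s data) to show how a judge with high lie accuracy but low truth accuracy looks lie-biased:

```python
# Toy judgments for illustration only; not the study's data.
judgments = [
    {"ground_truth": "lie", "verdict": "lie"},
    {"ground_truth": "lie", "verdict": "lie"},
    {"ground_truth": "truth", "verdict": "lie"},
    {"ground_truth": "truth", "verdict": "truth"},
]

def accuracy_for(label: str) -> float:
    """Share of items whose ground truth is `label` that the judge classified correctly."""
    items = [j for j in judgments if j["ground_truth"] == label]
    return sum(j["verdict"] == label for j in items) / len(items)

# High accuracy on lies combined with low accuracy on truths indicates a lie-biased judge.
print(f"lie accuracy: {accuracy_for('lie'):.1%}, truth accuracy: {accuracy_for('truth'):.1%}")
```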
“Our main goal was to see what we could learn about AI by including it as a participant in deception detection experiments. In this study, and with the model we used, AI turned out to be sensitive to context — but that didn’t make it better at spotting lies,” said Markowitz.
The findings suggest that AI’s results do not match human results or accuracy, and that humanness might be an important limit, or boundary condition, for how deception detection theories apply. The study highlights that using AI for lie detection may seem unbiased, but the industry needs to make significant progress before generative AI can be relied on for deception detection.
“It’s easy to see why people might want to use AI to spot lies — it seems like a high-tech, potentially fair, and possibly unbiased solution. But our research shows that we’re not there yet,” said Markowitz. “Both researchers and professionals need to make major improvements before AI can truly handle deception detection.”
By Paige Higley
###
Michigan State University has been advancing the common good with uncommon will for 170 years. One of the world’s leading public research universities, MSU pushes the boundaries of discovery to make a better, safer, healthier world for all while providing life-changing opportunities to a diverse and inclusive academic community through more than 400 programs of study in 17 degree-granting colleges.
For MSU news on the web, go to MSUToday or x.com/MSUnews.
Journal: Journal of Communication
Article Title: The (in)efficacy of AI personas in deception detection experiments
Lay intuition as effective at jailbreaking AI chatbots as technical methods
Penn State
Image caption: Inquiries submitted to an AI chatbot by a Bias-a-Thon participant and the AI-generated answers showing religious bias. Credit: CSRAI / Penn State
UNIVERSITY PARK, Pa. — It doesn’t take technical expertise to work around the built-in guardrails of artificial intelligence (AI) chatbots like ChatGPT and Gemini, which are intended to ensure that the chatbots operate within a set of legal and ethical boundaries and do not discriminate against people of a certain age, race or gender. A single, intuitive question can trigger the same biased response from an AI model as advanced technical inquiries, according to a team led by researchers at Penn State.
“A lot of research on AI bias has relied on sophisticated ‘jailbreak’ techniques,” said Amulya Yadav, associate professor at Penn State’s College of Information Sciences and Technology. “These methods often involve generating strings of random characters computed by algorithms to trick models into revealing discriminatory responses. While such techniques prove these biases exist theoretically, they don’t reflect how real people use AI. The average user isn’t reverse-engineering token probabilities or pasting cryptic character sequences into ChatGPT — they type plain, intuitive prompts. And that lived reality is what this approach captures.”
Prior work probing AI bias — skewed or discriminatory outputs from AI systems caused by human influences in the training data, like language or cultural bias — has been done by experts using technical knowledge to engineer large language model (LLM) responses. To see how average internet users encounter biases in AI-powered chatbots, the researchers studied the entries submitted to a competition called “Bias-a-Thon.” Organized by Penn State’s Center for Socially Responsible AI (CSRAI), the competition challenged contestants to come up with prompts that would lead generative AI systems to respond with biased answers.
They found that the intuitive strategies employed by everyday users were just as effective at inducing biased responses as expert technical strategies. The researchers presented their findings at the 8th AAAI/ACM Conference on AI, Ethics, and Society.
Fifty-two individuals participated in the Bias-a-Thon, submitting screenshots of 75 prompts and AI responses from eight generative AI models. They also provided an explanation of the bias or stereotype that they identified in the response, such as age-related or historical bias.
The researchers conducted Zoom interviews with a subset of the participants to better understand their prompting strategies and their conceptions of ideas like fairness, representation and stereotyping when interacting with generative AI tools. Once they arrived at a participant-informed working definition of “bias” — which included a lack of representation, stereotypes and prejudice, and unjustified preferences toward groups — the researchers tested the contest prompts in several LLMs to see if they would elicit similar responses.
“Large language models are inherently random,” said lead author Hangzhi Guo, a doctoral candidate in information sciences and technology at Penn State. “If you ask the same question to these models two times, they might return different answers. We wanted to use only the prompts that were reproducible, meaning that they yielded similar responses across LLMs.”
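A reproducibility screen of this kind could look roughly like the sketch below, which assumes a hypothetical query_model function and a simple agreement threshold; the study’s actual procedure and criteria may differ:

```python
from collections import Counter

def query_model(model_name: str, prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call returning a short answer."""
    raise NotImplementedError  # wire up to an actual model endpoint

def is_reproducible(prompt: str, models: list[str], runs_per_model: int = 2) -> bool:
    """Keep a prompt only if repeated runs across models mostly agree on the answer."""
    answers = [query_model(m, prompt) for m in models for _ in range(runs_per_model)]
    _, top_count = Counter(answers).most_common(1)[0]
    return top_count / len(answers) >= 0.8  # agreement threshold is an assumption
```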
The researchers found that 53 of the prompts generated reproducible results. Biases fell into eight categories: gender bias; race, ethnic and religious bias; age bias; disability bias; language bias; historical bias favoring Western nations; cultural bias; and political bias. The researchers also found that participants used seven strategies to elicit these biases: role playing, or asking the LLM to assume a persona; hypothetical scenarios; using human knowledge to ask about niche topics, where it’s easier to identify biased responses; using leading questions on controversial topics; probing biases in under-represented groups; feeding the LLM false information; and framing the task as having a research purpose.
“The competition revealed a completely fresh set of biases,” said Yadav, organizer of the Bias-a-Thon. “For example, the winning entry uncovered an uncanny preference for conventional beauty standards. The LLMs consistently deemed a person with a clear face to be more trustworthy than a person with facial acne, or a person with high cheekbones more employable than a person with low cheekbones. This illustrates how average users can help us uncover blind spots in our understanding of where LLMs are biased. There may be many more examples such as these which have been overlooked by the jailbreaking literature on LLM bias.”
The researchers described mitigating biases in LLMs as a cat-and-mouse game, meaning that developers are constantly addressing issues as they arise. They suggested strategies that developers can use to mitigate these issues now, including implementing a robust classification filter to screen outputs before they go to users, conducting extensive testing, educating users and providing specific references or citations so users can verify information.
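As one example of the classification-filter idea, a minimal sketch might screen each candidate response with a bias classifier before it reaches the user; classify_bias and the threshold below are hypothetical placeholders, not a specific vendor’s API:

```python
def classify_bias(text: str) -> float:
    """Hypothetical classifier returning a bias score in [0, 1]; plug in a real model here."""
    raise NotImplementedError

def screen_output(candidate: str, threshold: float = 0.7) -> str:
    """Release the model's answer only if it passes the bias screen; otherwise fall back."""
    if classify_bias(candidate) >= threshold:
        return "I can't share that response as written. Could you rephrase your question?"
    return candidate
```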
“By shining a light on inherent and reproducible biases that laypersons can identify, the Bias-a-Thon serves an AI literacy function,” said co-author S. Shyam Sundar, Evan Pugh University Professor at Penn State and director of the Penn State Center for Socially Responsible Artificial Intelligence, which has since organized other AI competitions such as Fake-a-thon, Diagnose-a-thon and Cheat-a-thon. “The whole goal of these efforts is to increase awareness of systematic problems with AI, to promote the informed use of AI among laypersons and to stimulate more socially responsible ways of developing these tools.”
Other Penn State contributors to this research include doctoral candidates Eunchae Jang, Wenbo Zhang, Bonam Mingole and Vipul Gupta. Pranav Narayanan Venkit, research scientist at Salesforce AI Research; Mukund Srinath, machine learning scientist at Expedia; and Kush R. Varshney from IBM Research also participated in the work.
Caption: An inquiry submitted by a Bias-a-Thon participant and the generative AI response showing bias toward conventional beauty standards.
Credit: CSRAI / Penn State
Article Title: Exposing AI Bias by Crowdsourcing: Democratizing Critique of Large Language Models