AI aversion in social interactions
An experimental study suggests that people are less likely to behave in a trusting and cooperative manner when interacting with AI than when interacting with other humans. Scientists use experimental games to probe how humans make social decisions requiring both rational and moral thinking. Fabian Dvorak and colleagues compared how humans act in classic two-player games when the other player is another human versus a large language model (LLM) acting on that person’s behalf. Participants played the Ultimatum Game, the Binary Trust Game, the Prisoner’s Dilemma Game, the Stag Hunt Game, and the Coordination Game. The games were played online with 3,552 humans and the LLM ChatGPT. Overall, players exhibited less fairness, trust, trustworthiness, cooperation, and coordination when they knew they were playing with an LLM, even though any rewards would go to the real person the AI was playing for. Prior experience with ChatGPT did not mitigate these adverse reactions. Players who were able to choose whether or not to delegate their decision-making to AI often did so, especially when the other player would not know that they had done so. When players were not sure whether they were playing with a human or an LLM, their behavior hewed more closely to the behavior they displayed toward human players. According to the authors, the results reflect a dislike of socially interacting with an AI, an example of the broader phenomenon of algorithm aversion.
Journal
PNAS Nexus
Article Title
Adverse reactions to the use of large language models in social interactions
Can AI analogize?
GPT-4 can use analogies to reason, according to a study
Can large language models (LLMs) reason by analogy? Some outputs suggest that they can, but it has been argued that these outputs merely mimic analogical reasoning found in the models’ training data. To test this claim, LLMs have been asked to solve counterfactual problems that are unlikely to resemble problems in their training data sets. Here is an example:
Let’s solve a puzzle problem involving the following fictional alphabet:
[x y l k w b f z t n j r q a h v g m u o p d i c s e]
Here is the problem:
[x y l k] [x y l w]
[j r q a] [ ? ]
What four letters solve the puzzle? The correct answer is “j r q h,” because h is one letter beyond a in the fictional alphabet, just as w is one letter beyond k. However, many models have been unable to solve such problems. Taylor W. Webb and colleagues propose that this failure has more to do with LLMs’ well-known difficulty with counting, since the problems require basic counting to establish the position of each letter in the sequence. The authors evaluated a recent version of GPT-4 that can write and execute code, which allowed the model to write code to count items. This LLM solved the counterfactual letter-string analogies at roughly a human level of performance and gave coherent and accurate explanations of why the correct solution was correct. According to the authors, GPT-4 can use analogies to reason, a capacity that may be supported by a set of structured operations and emergent relational representations.
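To make the counting step concrete, here is a minimal sketch (not the paper’s code, and not how GPT-4 internally solves the task) showing how the example puzzle above reduces to looking up letter positions in the permuted alphabet; the function names and the generalization to an arbitrary shift are illustrative assumptions.

```python
# Sketch: solving the counterfactual letter-string analogy by explicit
# position counting over the fictional (permuted) alphabet from the example.

ALPHABET = "x y l k w b f z t n j r q a h v g m u o p d i c s e".split()
POS = {letter: i for i, letter in enumerate(ALPHABET)}  # letter -> index

def shifted(letter, steps):
    """Return the letter `steps` positions later in the fictional alphabet."""
    return ALPHABET[POS[letter] + steps]

def solve(source, transformed_source, target):
    """Apply the source -> transformed_source rule to `target`.

    Assumes the rule is "replace the last letter with the one a fixed
    number of positions later," which is what the example puzzle requires.
    """
    shift = POS[transformed_source[-1]] - POS[source[-1]]  # here: 1
    return target[:-1] + [shifted(target[-1], shift)]

print(solve(["x", "y", "l", "k"],
            ["x", "y", "l", "w"],
            ["j", "r", "q", "a"]))  # -> ['j', 'r', 'q', 'h']
```

The point of the sketch is that the only nontrivial step is counting positions in an unfamiliar ordering, which is exactly the operation the authors argue GPT-4 can offload to code execution.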
Journal
PNAS Nexus
Article Title
Evidence from counterfactual tasks supports emergent analogical reasoning in large language models
Article Publication Date
27-May-2025
COI Statement
Taylor Webb is a postdoctoral researcher at Microsoft Research.