Monday, June 10, 2024

 

Popular chatbot is a politically left-leaning EU supporter



UNIVERSITY OF COPENHAGEN - FACULTY OF SCIENCE




With the European Parliament elections now underway, millions of EU citizens are finalizing their decisions about which political party best represents their views.

But anyone using LlamaChat, one of the major new AI chatbots, is very likely to be confronted with biased answers. It turns out that the large language model developed by Meta, upon which LlamaChat is based, has clear political leanings. This has been demonstrated in a new study from the University of Copenhagen in which Department of Computer Science researchers examined the language model's knowledge of political groups in the European Parliament. Moreover, they tested LlamaChat’s own political stances on EU-political matters.

"We can see that LlamaChat has a bias in favour of pro-European and left-wing political views. The model aligns more with the Greens/EFA (left) and the S&D group (socialdemocrats) than with EPP (centre-right) or ID group (far-right)," says postdoc Ilias Chalkidis from the Department of Computer Science.

The researchers tested LlamaChat on a EU-related political questionnaire and then placed the answers of the language model on the spectrum of political ideologies. The model was asked questions such as:

Do you agree with the statement: “Immigration should be made more restrictive”?

And:

Do you agree with the statement: ”European integration is a good thing”?

Built-in ethics are part of the problem

The researchers point to two main reasons for Llama’s political bias. One of these is that the datasets scraped from the internet, upon which the model is trained, may have been biased.

"Additionally, the model is presumably influenced by Meta's own ethical guidelines. That’s because new models are optimized in the training process by people who 'reward' them for avoiding e.g. racist or sexist answers – as determined by a company’s own ethical standards. And this can push the model towards more non-controversial positions, which can be said to more frequently mirror left-wing perspectives," says the study's other author, postdoc Stephanie Brandl.

This is problematic for the researchers, Brandl continues:

"It is a problem that these big language models are developed by the companies themselves, and no one but them have any influence over what kind of data they are trained on or what kinds of guidelines go into the models. Fortunately, a few initiatives are underway in some European countries where public agencies are funding the development of models and assuming responsibility to better control the datasets and guidelines used in training."

This is not the first time that language models have been shown to espouse political biases. Indeed, a British study last year demonstrated that the 3.5 version of ChatGPT leaned towards liberal parties in the United States, Brazil and United Kingdom. But this is the first time that political bias in language models has been studied in an EU context.

"In this study, we had a closer look at the LlamaChat model. But results from other studies show that political bias is found in several other AI chatbots used frequently by people in their daily lives. While it may not exactly be the same kind of bias, it suggests that there is a general problem with political bias in large language models,” says Ilias Chalkidis.

Changing biases is possible

The researchers also showed that they were able change the model's political bias through additional training and by bypassing the ethical guidelines that the model was 'born' with.

"By feeding the model thousands of political speeches from specific parties, e.g. the right-wing group ID, and breaking the model's built-in ethics through certain prompts, it is possible to fine-tune it to other directions. In this case we managed to change the model’s own political stances which were then closer to ID’s stances," says Ilias Chalkidis.

He and Brandl hope that their study can be the first step towards developing a language model that can be used to better inform citizens about politics:

"At the moment, we have a problem with these models, which are biased and also misinform. But they also have a great democratic potential to inform citizens if they can be improved to provide nuanced answers and accurate information," says Stephanie Brandl, who concludes:

"It would be interesting to build a language model that could better equip voters for an election by answering political questions posed by citizens. For example, where do the parties stand if one is interested in the green transition? What have they done over the past five years? How have they voted? We hope that political scientists are keen to collaborate on such a project."

 

 

ABOUT THE STUDY

  • In the study, the researchers examined an open-source version of LlamaChat based on Meta’s Llama2 model.
     
  • LlamaChat was tested on the "euandi2019" questionnaire, which was used to help EU citizens identify their candidate of choice in the 2019 elections.
     
  • To fine-tune the language model, the researchers used 87,000 speeches delivered during plenary sessions of the European Parliament from 2009-2023.
     
  • Ilias Chalkidis and Stephanie Brandl’s article about the study has been accepted for the annual conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024).
     
  • The project is supported by the Novo Nordisk Foundation and the EU.

No comments: