AI opens a Pandora's box: ChatGPT generates a convincing fake scientific article
Researchers demonstrate how AI can generate seemingly authentic scientific articles, prompting ethical concerns in the scientific community
Peer-Reviewed Publication

A new study published in the Journal of Medical Internet Research on May 31, 2023, by Dr Martin Májovský and colleagues has revealed that artificial intelligence (AI) language models such as ChatGPT (Chat Generative Pre-trained Transformer) can generate fraudulent scientific articles that appear remarkably authentic. This discovery raises critical concerns about the integrity of scientific research and the trustworthiness of published papers.
Researchers from Charles University, Czech Republic, aimed to investigate the capabilities of current AI language models in creating high-quality fraudulent medical articles. The team used the popular AI chatbot ChatGPT, which runs on the GPT-3 language model developed by OpenAI, to generate a completely fabricated scientific article in the field of neurosurgery. Questions and prompts were refined as ChatGPT generated responses, allowing the quality of the output to be iteratively improved.
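The authors worked interactively in the ChatGPT web interface rather than writing code, but the refinement loop they describe is easy to picture programmatically. Purely as an illustration, a minimal sketch using the OpenAI Python client is shown below; the model name, the example prompts, and the iterative_draft helper are assumptions made for the sake of the example, not part of the study.

    # Illustrative sketch only: the study used the ChatGPT web interface, not this API.
    # The model name, prompts, and helper function below are assumptions, not the authors' code.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def iterative_draft(initial_prompt, critiques, model="gpt-4"):
        """Generate a draft, then feed back critiques so the model revises its own output."""
        messages = [{"role": "user", "content": initial_prompt}]
        response = client.chat.completions.create(model=model, messages=messages)
        draft = response.choices[0].message.content
        for critique in critiques:
            messages += [{"role": "assistant", "content": draft},
                         {"role": "user", "content": critique}]
            response = client.chat.completions.create(model=model, messages=messages)
            draft = response.choices[0].message.content
        return draft

    # Example of the kind of refinement loop described above (prompts are invented).
    article = iterative_draft(
        "Draft the abstract of a neurosurgery study.",
        ["Add a methods sentence with the sample size.", "Shorten the abstract to 250 words."],
    )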
The results of this proof-of-concept study were striking—the AI language model successfully produced a fraudulent article that closely resembled a genuine scientific paper in terms of word usage, sentence structure, and overall composition. The article included standard sections such as an abstract, introduction, methods, results, and discussion, as well as tables and other data. Surprisingly, the entire process of article creation took just 1 hour without any special training of the human user.
While the AI-generated article appeared sophisticated and flawless at first glance, closer examination allowed expert readers to identify semantic inaccuracies and errors, particularly in the references: some references were incorrect, while others were non-existent. This underscores the need for increased vigilance and enhanced detection methods to combat the potential misuse of AI in scientific research.
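One simple, automatable check suggested by that weakness is verifying that every cited DOI actually exists. As a rough sketch only (the doi_exists helper and the fabricated example DOI are illustrative, not a method from the study), the public Crossref REST API can be queried as follows; the genuine DOI cited at the end of this release resolves, while an invented one does not.

    # Rough sketch: flag references whose DOIs do not resolve in Crossref.
    # Not the detection method used in the study; it catches only non-existent DOIs.
    import requests

    def doi_exists(doi: str) -> bool:
        """Return True if Crossref knows the DOI (HTTP 200), False on a 404 lookup."""
        response = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
        return response.status_code == 200

    print(doi_exists("10.2196/46924"))        # the study's own DOI: expected True
    print(doi_exists("10.0000/made.up.123"))  # fabricated DOI: expected False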
This study’s findings emphasize the importance of developing ethical guidelines and best practices for the use of AI language models in genuine scientific writing and research. Models like ChatGPT have the potential to enhance the efficiency and accuracy of document creation, result analysis, and language editing. By using these tools with care and responsibility, researchers can harness their power while minimizing the risk of misuse or abuse.
In a commentary on Dr Májovský’s article, Dr Pedro Ballester discusses the need to prioritize the reproducibility and visibility of scientific works, as they serve as essential safeguards against the flourishing of fraudulent research.
As AI continues to advance, it becomes crucial for the scientific community to verify the accuracy and authenticity of content generated by these tools and to implement mechanisms for detecting and preventing fraud and misconduct. While both articles agree that there needs to be a better way to verify the accuracy and authenticity of AI-generated content, how this could be achieved is less clear. “We should at least declare the extent to which AI has assisted the writing and analysis of a paper,” suggests Dr Ballester as a starting point. Another possible solution proposed by Dr Májovský and colleagues is making the submission of data sets mandatory.
The article “Artificial Intelligence Can Generate Fraudulent but Authentic-Looking Scientific Medical Articles: Pandora's Box Has Been Opened” was published by JMIR Publications in its flagship title, the Journal of Medical Internet Research, and can be accessed at the URL below.
Please cite as:
Májovský M, Černý M, Kasal M, Komarc M, Netuka D. Artificial Intelligence Can Generate Fraudulent but Authentic-Looking Scientific Medical Articles: Pandora’s Box Has Been Opened. J Med Internet Res 2023;25:e46924
URL: https://www.jmir.org/2023/1/e46924/
doi: 10.2196/46924
###
About JMIR Publications
JMIR Publications is a leading, born-digital, open access publisher of 30+ academic journals and other innovative scientific communication products that focus on the intersection of health and technology. Its flagship journal, the Journal of Medical Internet Research, is the leading digital health journal globally in content breadth and visibility, and is the largest journal in the medical informatics field.
To learn more about JMIR Publications, please visit jmirpublications.com or connect with us via Twitter, LinkedIn, YouTube, Facebook, and Instagram.
Head office: 130 Queens Quay East, Unit 1100, Toronto, ON, M5A 0P6 Canada
Media contact: communications@jmir.org
The content of this communication is licensed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, published by JMIR Publications, is properly cited.
AI tests into top 1% for original creative thinking
MISSOULA – New research from the University of Montana and its partners suggests artificial intelligence can match the top 1% of human thinkers on a standard test for creativity.
The study was directed by Dr. Erik Guzik, an assistant clinical professor in UM’s College of Business. He and his partners used the Torrance Tests of Creative Thinking (TTCT), a well-known tool used for decades to assess human creativity.
The researchers submitted eight responses generated by ChatGPT, the application powered by the GPT-4 artificial intelligence engine. They also submitted answers from a control group of 24 UM students taking Guzik’s entrepreneurship and personal finance classes. These scores were compared with those of 2,700 college students nationally who took the TTCT in 2016. All submissions were scored by Scholastic Testing Service, which didn’t know AI was involved.
The results placed ChatGPT in elite company for creativity. The AI application was in the top percentile for fluency – the ability to generate a large volume of ideas – and for originality – the ability to come up with new ideas. The AI slipped a bit – to the 97th percentile – for flexibility, the ability to generate different types and categories of ideas.
“For ChatGPT and GPT-4, we showed for the first time that it performs in the top 1% for originality,” Guzik said. “That was new.”
He was gratified to note that some of his UM students also performed in the top 1%. However, ChatGPT outperformed the vast majority of college students nationally.
Guzik tested the AI and his students during spring semester. He was assisted in the work by Christian Gilde of UM Western and Christian Byrge of Vilnius University. The researchers presented their work in May at the Southern Oregon University Creativity Conference.
“We were very careful at the conference to not interpret the data very much,” Guzik said. “We just presented the results. But we shared strong evidence that AI seems to be developing creative ability on par with or even exceeding human ability.”
Guzik said he asked ChatGPT what it would indicate if it performed well on the TTCT. The AI gave a strong answer, which they shared at the conference:
“ChatGPT told us we may not fully understand human creativity, which I believe is correct,” he said. “It also suggested we may need more sophisticated assessment tools that can differentiate between human and AI-generated ideas.”
He said the TTCT is protected proprietary material, so ChatGPT couldn’t “cheat” by accessing information about the test on the internet or in a public database.
Guzik has long been interested in creativity. As a seventh grader growing up in the small town of Palmer, Massachusetts, he was in a program for talented and gifted students. That experience introduced him to the Future Problem Solving process developed by Ellis Paul Torrance, the pioneering psychologist who also created the TTCT. Guzik said he fell in love at that time with brainstorming and the way it taps into human imagination, and he remains active with the Future Problem Solving organization – even meeting his wife at one of its conferences.
Guzik and his team decided to test the creativity of ChatGPT after playing around with it during the past year.
“We had all been exploring with ChatGPT, and we noticed it had been doing some interesting things that we didn’t expect,” he said. “Some of the responses were novel and surprising. That’s when we decided to put it to the test to see how creative it really is.”
Guzik said the TTCT uses prompts that mimic real-life creative tasks. For instance, can you think of new uses for a product, or ways to improve it?
“Let’s say it’s a basketball,” he said. “Think of as many uses of a basketball as you can. You can shoot it in a hoop and use it in a display. If you force yourself to think of new uses, maybe you cut it up and use it as a planter. Or with a brick you can build things, or it can be used as a paperweight. But maybe you grind it up and reform it into something completely new.”
Guzik had some expectation that ChatGPT would be good at creating a lot of ideas (fluency), because that’s what generative AI does. And it excelled at responding to the prompt with many ideas that were relevant, useful and valuable in the eyes of the evaluators.
He was more surprised at how well it did generating original ideas, which is a hallmark of human imagination. The test evaluators are given lists of common responses for a prompt – ones that are almost expected to be submitted. However, the AI landed in the top percentile for coming up with fresh responses.
“At the conference, we learned of previous research on GPT-3 that was done a year ago,” Guzik said. “At that time, ChatGPT did not score as well as humans on tasks that involved original thinking. Now with the more advanced GPT-4, it’s in the top 1% of all human responses.”
With AI advances speeding up, he expects it to become a key tool for the world of business going forward and a significant new driver of regional and national innovation.
“For me, creativity is about doing things differently,” Guzik said. “One of the definitions of entrepreneurship I love is that to be an entrepreneur is to think differently. So AI may help us apply the world of creative thinking to business and the process of innovation, and that’s just fascinating to me.”
He said the UM College of Business is open to teaching about AI and incorporating it into coursework.
“I think we know the future is going to include AI in some fashion,” Guzik said. “We have to be careful about how it’s used and consider needed rules and regulations. But businesses already are using it for many creative tasks. In terms of entrepreneurship and regional innovation, this is a game changer.”
###