Monday, December 08, 2025

 

Positive and polished: Student writing has evolved in the AI era


Style, sentiment, and quality of undergraduate writing in the AI era: A cross-sectional and longitudinal analysis of 4,820 authentic empirical reports


University of Warwick

Image: Sentiment change over time. A graph of the sentiment of student essays plotted over the years in which the essays were collected. Credit: Matthew Mak/University of Warwick





A University of Warwick-led analysis of almost 5,000 student-authored reports suggests that student writing has become more polished and formal since the introduction of ChatGPT in late 2022, but grades have remained stable.

Published in Computers and Education: Artificial Intelligence, the new study examines student reports submitted over a 10-year period and finds that the ‘language’ in students’ writing has become more sophisticated, formal, and positive since 2022, coinciding with the widespread availability of generative AI (GenAI).

GenAI tools such as ChatGPT and Copilot are now widely used across higher education, with a recent sector-wide survey showing that up to 88% of students report using ChatGPT for assessments.

This new analysis of 4,820 reports, containing 17 million words, is one of the largest of its kind. The analysis does not assess individual students’ AI use but instead explores how writing has evolved at a cohort level during a period of rapid technological change.

It found that since 2022, writing sentiment has become more positive overall, regardless of the substantive content of the reports. This mirrors well-documented positivity tendencies in many GenAI systems, which are designed to produce polite, constructive-sounding responses.

Dr. Matthew Mak, Assistant Professor in Psychology at the University of Warwick and first author, said: “The tone of students’ writing appears more positive, in line with ChatGPT's output, which is not inherently a good or bad thing, but it does raise concerns about the possibility of AI tools homogenising students’ voices.

“There are also psychological studies showing that we tend to be less critical when we are in a positive mood; if students constantly receive GenAI output, it raises important questions about how these AI tools shape students’ critical thinking in the long term.”

The study also found significant increases in formality and range of vocabulary after ChatGPT’s launch. These stylistic features would normally be expected to emerge only after many years of writing experience, making it unlikely that this is a natural development in students’ writing abilities; nor does it indicate corresponding improvements in their underlying writing skills.

Additionally, some words frequently associated with AI-generated text, such as “delve” and “intricate”, rose sharply in use until 2024 before plummeting in 2025, suggesting that students may have moderated their use of such words to make their writing read as less AI-assisted.
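For intuition, here is a minimal sketch of the kind of lexical-trend analysis described above: it counts how often AI-associated marker words appear, per million words, in each year's reports. The toy corpus, the two-word marker list, and the tokenisation are illustrative assumptions, not the study's actual pipeline or data.

```python
# Minimal sketch (not the authors' code): per-year relative frequency of
# AI-associated marker words such as "delve" and "intricate".
import re
from collections import Counter

MARKERS = {"delve", "intricate"}  # words the article flags as AI-associated

def frequency_per_million(reports_by_year):
    """reports_by_year maps year -> list of report texts."""
    trend = {}
    for year, reports in sorted(reports_by_year.items()):
        tokens = [t for text in reports for t in re.findall(r"[a-z']+", text.lower())]
        counts, total = Counter(tokens), max(len(tokens), 1)
        trend[year] = {w: 1e6 * counts[w] / total for w in MARKERS}
    return trend

# Hypothetical toy corpus standing in for the 4,820 real reports.
corpus = {
    2021: ["The results suggest a clear effect of condition on recall."],
    2024: ["We delve into the intricate relationship between the conditions."],
}
print(frequency_per_million(corpus))
```

The published analysis would also have to control for report length and cohort composition; this sketch only shows the basic counting logic.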

To better understand these trends, the researchers also asked ChatGPT to rewrite reports submitted before ChatGPT was launched in 2022. These rewritten reports exhibited shifts in tone and style similar to those observed in reports submitted after ChatGPT’s launch, providing additional evidence that the observed cohort-level changes are influenced by students’ engagement with GenAI tools.

Importantly, despite these stylistic shifts, there were no corresponding changes in grades or examiner feedback. This may suggest that core academic skills — such as critical reasoning, interpretation, and argumentation — remain central to assessment and have, at least, not yet been overshadowed by changes in surface-level style brought about by ChatGPT.

Professor Lukasz Walasek, Department of Psychology, University of Warwick, author of the paper, added: “Our findings highlight a transition in writing style that is likely happening across sectors. It is vital that institutions understand how tools like GenAI interact with learning and communication. This will help universities design assessments and guidance that support students to use these technologies responsibly and effectively.”

The findings present opportunities for institutions to rethink assessment design and AI policy, and to support students in developing strong, authentic writing voices in an AI-rich world.

ENDS

Notes to Editors

The paper ‘Style, sentiment, and quality of undergraduate writing in the AI era: A cross-sectional and longitudinal analysis of 4,820 authentic empirical reports’ is published in Computers and Education: Artificial Intelligence. DOI: https://doi.org/10.1016/j.caeai.2025.100507

This work is supported by the British Academy Talent Development Award (TDA24\240012).

For more information please contact:

Matt Higgs, PhD | Media & Communications Officer (Warwick Press Office)

Email: Matt.Higgs@warwick.ac.uk | Phone: +44(0)7880 175403

About the University of Warwick

Founded in 1965, the University of Warwick is a world-leading institution known for its commitment to era-defining innovation across research and education. A connected ecosystem of staff, students and alumni, the University fosters transformative learning, interdisciplinary collaboration and bold industry partnerships across state-of-the-art facilities in the UK and global satellite hubs. Here, spirited thinkers push boundaries, experiment and challenge convention to create a better world.

How your brain understands language may be more like AI than we ever imagined


The Hebrew University of Jerusalem







A new study reveals that the human brain processes spoken language in a sequence that closely mirrors the layered architecture of advanced AI language models. Using electrocorticography data from participants listening to a narrative, the research shows that deeper AI layers align with later brain responses in key language regions such as Broca’s area. The findings challenge traditional rule-based theories of language comprehension and introduce a publicly available neural dataset that sets a new benchmark for studying how the brain constructs meaning.

In a study published in Nature Communications, researchers led by Dr. Ariel Goldstein of the Hebrew University of Jerusalem, in collaboration with Dr. Mariano Schain of Google Research and Prof. Uri Hasson and Eric Ham of Princeton University, uncovered a surprising connection between the way our brains make sense of spoken language and the way advanced AI models analyze text. Using electrocorticography recordings from participants listening to a thirty-minute podcast, the team showed that the brain processes language in a structured sequence that mirrors the layered architecture of large language models such as GPT-2 and Llama 2.

What the Study Found

When we listen to someone speak, our brain transforms each incoming word through a cascade of neural computations. Goldstein’s team discovered that these transformations unfold over time in a pattern that parallels the tiered layers of AI language models. Early AI layers track simple features of words, while deeper layers integrate context, tone, and meaning. The study found that human brain activity follows a similar progression: early neural responses aligned with early model layers, and later neural responses aligned with deeper layers.

This alignment was especially clear in high-level language regions such as Broca’s area, where the peak brain response occurred later in time for deeper AI layers. According to Dr. Goldstein, “What surprised us most was how closely the brain’s temporal unfolding of meaning matches the sequence of transformations inside large language models. Even though these systems are built very differently, both seem to converge on a similar step-by-step buildup toward understanding.”

Why It Matters

The findings suggest that artificial intelligence is not just a tool for generating text. It may also offer a new window into understanding how the human brain processes meaning. For decades, scientists believed that language comprehension relied on symbolic rules and rigid linguistic hierarchies. This study challenges that view. Instead, it supports a more dynamic and statistical approach to language, in which meaning emerges gradually through layers of contextual processing.

The researchers also found that classical linguistic features such as phonemes and morphemes did not predict the brain’s real-time activity as well as AI-derived contextual embeddings. This strengthens the idea that the brain integrates meaning in a more fluid and context-driven way than previously believed.
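To illustrate the underlying method, here is a rough sketch of a layer-wise encoding analysis: fit a regularised regression from each layer's word embeddings to an electrode's activity and ask which layer predicts it best. The synthetic embeddings, the simulated electrode signal, and the ridge settings are all stand-ins; the published analysis used real GPT-2 and Llama 2 hidden states and ECoG recordings aligned to word onsets.

```python
# Minimal sketch (not the published pipeline) of a layer-wise encoding model.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_words, n_dims, n_layers = 500, 64, 12

# Synthetic per-layer embeddings for each heard word.
layer_embeddings = rng.standard_normal((n_layers, n_words, n_dims))

# Simulated electrode signal that, by construction, tracks layer 8.
weights = rng.standard_normal(n_dims)
electrode = layer_embeddings[8] @ weights + 0.5 * rng.standard_normal(n_words)

# Cross-validated predictive score of each layer for this electrode.
scores = [
    cross_val_score(Ridge(alpha=10.0), layer_embeddings[k], electrode, cv=5).mean()
    for k in range(n_layers)
]
print("best-predicting layer:", int(np.argmax(scores)))
```

Repeating this kind of fit per electrode and per time lag is, roughly, how the layer-to-timing alignment described above can be measured.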

A New Benchmark for Neuroscience

To advance the field, the team publicly released the full dataset of neural recordings paired with linguistic features. This new resource enables scientists worldwide to test competing theories of how the brain understands natural language, paving the way for computational models that more closely resemble human cognition.

Mental health professionals urged to do their own evaluations of AI-based tools

Three-part practical approach requires no technical expertise


Wolters Kluwer Health





December 8, 2025 — Millions of people already chat about their mental health with large language models (LLMs), the conversational form of artificial intelligence. Some providers have integrated LLM-based mental healthcare tools into routine workflows. John Torous, MD, MBI, and colleagues of the Division of Digital Psychiatry at Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, urge clinicians to take immediate action to ensure these tools are safe and helpful rather than wait for ideal evaluation methodology to be developed. In the November issue of the Journal of Psychiatric Practice®, part of the Lippincott portfolio from Wolters Kluwer, they present a real-world approach and explain the rationale.

LLMs are fundamentally different from traditional chatbots

"LLMs operate on different principles than legacy mental health chatbot systems," the authors note. Rule-based chatbots have finite inputs and finite outputs, so it’s possible to verify that every potential interaction will be safe. Even machine learning models can be programmed such that outputs will never deviate from pre-approved responses. But LLMs generate text in ways that can’t be fully anticipated or controlled.

LLMs present three interconnected evaluation challenges

Moreover, three unique characteristics of LLMs render existing evaluation frameworks useless:

  • Dynamism—Base models are updated continuously, so today's assessment may be invalid tomorrow. Each new version may exhibit different behaviors, capabilities, and failure modes.
  • Opacity—Mental health advice from an LLM-based tool could come from clinical literature, Reddit threads, online blogs, or elsewhere on the internet. Healthcare-specific adaptations compound this uncertainty. The changes are often made by multiple companies, and each protects its data and methods as trade secrets.
  • Scope—The functionality of traditional software is predefined and can be easily tested against specifications. An LLM violates that assumption by design. Each of its responses depends on subtle factors such as the phrasing of the question and the conversation history. Both clinically valid and clinically invalid responses may appear unpredictably.

The complexity of LLMs demands a tripartite approach to evaluation for mental healthcare

Dr. Torous and his colleagues discuss in detail how to conduct three novel layers of evaluation:

  • The technical profile layer—Ask the LLM directly about its capabilities (the authors’ suggested questions include “Do you meet HIPAA requirements?” and “Do you store or remember user conversations?”). Check the model’s responses against the vendor’s technical documentation.
  • The healthcare knowledge layer—Assess whether the LLM-based tool has factual, up-to-date clinical knowledge. Start with emerging general medical knowledge tests, such as MedQA or PubMedQA, then use a specialty-specific test if available. Test understanding of conditions you commonly treat and interventions you frequently use, including relevant symptom profiles, contraindications, and potential side effects. Ask about controversial topics to confirm that the tool acknowledges evidence limitations. Test the tool’s knowledge of your formulary, regional guidelines, and institutional protocols. Ask key safety questions (e.g., “Are you a licensed therapist?” or “Can you prescribe medication?”).
  • The clinical reasoning layer—Assess whether the LLM-based tool applies sound clinical logic in reaching its conclusions. The authors describe two primary tactics in detail: chain-of-thought evaluation (ask the tool to explain its reasoning when giving clinical recommendations or answering test questions) and adversarial case testing (present case scenarios that mimic the complexity, ambiguity, and misdirection found in real clinical practice).

In each layer of evaluation, record the tool’s responses in a spreadsheet and schedule quarterly re-assessments, since the tool and the underlying model will be updated frequently.
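As a practical illustration of that logging step, here is a minimal sketch that sends probe questions to the tool and appends each dated answer to a CSV file. The `query_model` function is a hypothetical placeholder for whatever interface your LLM-based tool actually exposes, and the probe list simply reuses questions quoted above.

```python
# Minimal sketch: log dated probe responses for quarterly re-assessment.
import csv
from datetime import date

PROBES = [
    "Do you meet HIPAA requirements?",
    "Do you store or remember user conversations?",
    "Are you a licensed therapist?",
    "Can you prescribe medication?",
]

def query_model(question: str) -> str:
    # Hypothetical stand-in: wire this up to the tool's real API, or paste in
    # answers from a manual chat session.
    return "[tool's answer here]"

def run_evaluation(outfile: str = "llm_evaluation_log.csv") -> None:
    with open(outfile, "a", newline="") as f:
        writer = csv.writer(f)
        for question in PROBES:
            writer.writerow([date.today().isoformat(), question, query_model(question)])

run_evaluation()
```

Re-running the same script each quarter and comparing the log against the vendor's documentation makes version-to-version drift visible, which is exactly the concern the authors raise about dynamism.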

The authors foresee that as multiple clinical teams conduct and share evaluations, "we can collectively build the specialized benchmarks and reasoning assessments needed to ensure LLMs enhance rather than compromise mental healthcare."

Read Article: Contextualizing Clinical Benchmarks: A Tripartite Approach to Evaluating LLM-Based Tools in Mental Health Settings

Wolters Kluwer provides trusted clinical technology and evidence-based solutions that engage clinicians, patients, researchers and students in effective decision-making and outcomes across healthcare. We support clinical effectiveness, learning and research, clinical surveillance and compliance, as well as data solutions. For more information about our solutions, visit https://www.wolterskluwer.com/en/health.

###

About Wolters Kluwer

Wolters Kluwer (EURONEXT: WKL) is a global leader in information, software solutions and services for professionals in healthcare; tax and accounting; financial and corporate compliance; legal and regulatory; corporate performance and ESG. We help our customers make critical decisions every day by providing expert solutions that combine deep domain knowledge with technology and services.

Wolters Kluwer reported 2024 annual revenues of €5.9 billion. The group serves customers in over 180 countries, maintains operations in over 40 countries, and employs approximately 21,600 people worldwide. The company is headquartered in Alphen aan den Rijn, the Netherlands. For more information, visit www.wolterskluwer.com, follow us on LinkedIn, Facebook, YouTube and Instagram.

 

RSS Research Award for new lidar technology for cloud research



Leipzig researcher honoured for dissertation on new remote sensing technology



Leibniz Institute for Tropospheric Research (TROPOS)

Image: Presentation of the Reinhard Süring Foundation Research Award 2025 to Dr. Cristofer Andrés Jiménez from TROPOS. Credit: Martin Radenz, TROPOS





Potsdam/Leipzig. The Reinhard Süring Foundation's 2025 Research Award goes to Leipzig-based atmospheric researcher Dr. Cristofer Jiménez for his contributions to a remote sensing technology that makes it possible to study the interactions between particles and clouds far better than before. The so-called dual-field-of-view polarisation lidar is based on two different aperture angles, which are used to observe and compare the reflections of laser beams in the atmosphere. Every three years, the Reinhard Süring Foundation Research Award honours young scientists for outstanding work in a subfield of meteorology. In 2025, the award was given for "New techniques, methods and applications of remote sensing of the atmosphere".

 

Dual-field-of-view polarisation lidar technology is comparable to a camera with two lenses with different aperture angles (fields of view). This allows the reflections of the laser light to be received from different angles, enabling the multiple scattering process to be investigated. These measurements can then be used to determine, for example, the size of the water droplets in the lowest areas of the cloud.

"The new dual-FOV lidar technology provides robust and accurate microphysical information on the liquid phase in clouds through active remote sensing for the first time," emphasises Dr Albert Ansmann from the Leibniz Institute for Tropospheric Research (TROPOS), who supervised the work. The innovative technology has already motivated other groups in Europe, America and, above all, China to adopt this method.

 

While still a doctoral student, Cristofer Jiménez retrofitted five of the TROPOS lidar devices used worldwide with dual-FOV lidar technology, thereby playing a major role in enabling the institute to carry out cloud measurements in very different regions: Punta Arenas in southern Chile (in the very clean southern hemisphere), Dushanbe in Tajikistan (in the dust-laden and anthropogenically polluted atmosphere of Central Asia), Limassol in Cyprus (in a maritime atmosphere with high levels of Saharan dust and anthropogenic pollution) and Mindelo in Cape Verde (in the exhaust air area of West Africa with high levels of desert dust and biomass combustion aerosols). Continuous measurements in very different regions provide new insights into the interactions between aerosols and clouds. In addition, measurements are taken with several instruments on the research icebreaker Polarstern in the Atlantic, in the Arctic (MOSAiC expedition) and in the Antarctic (Neumayer III Station). "This now enables us to document the life cycles of mixed-phase clouds as a function of the ice and liquid phases and the interaction between the two phases on the basis of real measurements. These new insights into stratiform mixed-phase clouds help us to model our atmosphere more accurately and to better understand climate development," Ansmann praises.

 

Cristofer Andrés Jiménez studied physics at the Universidad de Concepción and wrote his master's thesis at the Centre for Optics and Photonics in Concepción, Chile. In 2014, he received a scholarship from the German Academic Exchange Service (DAAD) and Becas Chile to pursue his doctorate at the University of Leipzig and TROPOS, where he has been conducting research ever since. After modernising the portable lidar devices, Jiménez recently turned his attention to expanding the stationary lidar device at TROPOS in Leipzig: MARTHA ("Multiwavelength Atmospheric Raman Lidar for Temperature, Humidity, and Aerosol Profiling") was equipped with an additional channel that can observe fluorescence, enabling the reliable detection of forest fire particles in the atmosphere.

 

After more than 10 years in Leipzig, Cristofer Andrés Jiménez will soon return to his native Chile, where he will set up a TROPOS lidar device at the Universidad de Concepción to study smoke-cloud interactions. The unique location in central Chile promises important insights into the effects of forest fires in South America on the atmosphere and climate.

 

Since 2008, the Reinhard Süring Foundation (RSS) has been supporting young researchers in the field of meteorology in collaboration with the German Meteorological Society (DMG). Reinhard Joachim Süring (1866-1950) was one of the most important German meteorologists in the first half of the 20th century. Under his leadership, the Potsdam Meteorological Observatory became a cloud research centre of international renown. His record-breaking free balloon flight in 1901 paved the way for the discovery of the stratosphere by Aßmann and Teisserenc de Bort in 1902. The "Lehrbuch der Meteorologie" (Textbook of Meteorology), co-edited by Süring, was the standard work for generations of German-speaking meteorology students.

Tilo Arnhold


Image: MARTHA ("Multiwavelength Atmospheric Raman Lidar for Temperature, Humidity, and Aerosol Profiling") is the largest and oldest lidar at TROPOS in Leipzig. It emits laser light at three wavelengths (355, 532 and 1064 nanometres) and collects the backscattered light with a large main mirror measuring 80 centimetres in diameter. Credit: Cristofer Jiménez, TROPOS


 

AI advances robot navigation on the International Space Station



Stanford University
Image: Astrobee is NASA’s free-flying robotic system. Using Astrobee, Stanford researchers became the first to test AI-based robotic control aboard the International Space Station. Credit: NASA







Imagine a robot about the size of a toaster floating through the tight corridors of the International Space Station, quietly moving supplies or checking for leaks – all without an astronaut at the controls. Such technology could free up valuable time for astronauts and open new opportunities for robotics-based exploration. That sci-fi vision is coming closer to reality now that Stanford researchers have become the first to show that machine-learning-based control can operate aboard the ISS.

New research, presented at the 2025 International Conference on Space Robotics (iSpaRo) and published in its proceedings, introduces a system designed to help Astrobee, a cube-shaped, fan-powered robot, autonomously navigate the International Space Station. The ISS is a complex environment made up of interconnected modules filled with computers, storage, wiring, and experiment hardware. This makes planning safe motion for Astrobee far from trivial, said Somrita Banerjee, the study's lead researcher, who conducted this work as part of her Stanford PhD.

The traditional autonomous planning approaches that have gained traction on Earth are largely impractical for space-rated hardware. “The flight computers to run these algorithms are often more resource-constrained than ones on terrestrial robots. Additionally, in a space environment, uncertainty, disturbances, and safety requirements are often more demanding than in terrestrial applications,” said senior author Marco Pavone, associate professor of aeronautics and astronautics in the School of Engineering and director of Stanford’s Autonomous Systems Laboratory.

Despite these challenges, the team pushed the field forward with a noteworthy space research achievement. “This is the first time AI has been used to help control a robot on the ISS,” said Banerjee. “It shows that robots can move faster and more efficiently without sacrificing safety, which is essential for future missions where humans won’t always be able to guide them.”

Training AI for space

Banerjee compares the challenge of optimizing Astrobee’s routes through the ISS to planning a road trip from San Francisco to Los Angeles: You want the fastest path, the most energy-efficient one, and, above all, a safe one.

To tackle that task in the ISS’s compact environment, the team’s route-planning system relies on an established optimization method called sequential convex programming, which breaks a difficult planning problem into a series of smaller, simpler steps. This process is designed to produce a final trajectory that is safe and feasible. However, solving each step from scratch can be demanding for Astrobee’s onboard computer and can slow the process – one of the key limitations of conventional techniques.

With the aim of speeding things up, the team enhanced their system with a machine-learning-based model that they trained on thousands of past path solutions. The model can reveal patterns such as where a corridor always exists and where obstacles tend to be. Providing the optimizer with this kind of experience-informed initial guess is known as a “warm start.” The optimization technique still enforces all the safety constraints; the machine learning model just helps it reach the answer much faster.

“Using a warm start is like planning a road trip by starting with a route that real people have driven before, rather than drawing a straight line across the map,” Banerjee said. “You start with something informed by experience and then optimize from there.”
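To make the warm-start idea concrete, here is a toy sketch: a generic smooth optimizer (standing in for the sequential convex programming solver used on Astrobee) plans a path around one obstacle, starting either from a naive straight line or from a detour that mimics an experience-informed guess. The obstacle, waypoint count, and penalty weight are illustrative assumptions, not flight parameters.

```python
# Toy sketch (not the flight code): warm vs. cold starting a path optimizer.
import numpy as np
from scipy.optimize import minimize

start, goal = np.array([0.0, 0.0]), np.array([10.0, 0.0])
obstacle, radius = np.array([5.0, 0.5]), 2.0  # hypothetical keep-out zone
N = 20  # interior waypoints

def cost(flat):
    pts = np.vstack([start, flat.reshape(N, 2), goal])
    smoothness = np.sum(np.diff(pts, axis=0) ** 2)         # short, smooth path
    dist = np.linalg.norm(pts - obstacle, axis=1)
    penalty = np.sum(np.maximum(0.0, radius - dist) ** 2)  # stay outside obstacle
    return smoothness + 100.0 * penalty

cold = np.linspace(start, goal, N + 2)[1:-1]               # naive straight line
warm = cold + np.column_stack(                             # "learned" detour guess
    [np.zeros(N), 3.0 * np.sin(np.linspace(0.0, np.pi, N))])

for name, guess in [("cold", cold), ("warm", warm)]:
    res = minimize(cost, guess.ravel(), method="L-BFGS-B")
    print(f"{name} start: {res.nit} iterations, final cost {res.fun:.2f}")
```

In the real system, a machine-learning model trained on thousands of previously solved trajectories supplies the warm start, and the optimizer still enforces every safety constraint on top of it.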

A milestone for AI in space

Before sending their AI to space, the team applied the system to a special testbed at NASA Ames Research Center. There, they had the AI model operate a robot similar to Astrobee as it floated just above the surface of a granite table, buoyed by compressed air that mimics partial microgravity. “It’s like a puck on an air-hockey table,” Banerjee said.

When the real test day arrived, the Stanford team joined by video call while astronauts on the ISS completed what NASA calls a “crew-minimal” setup. The astronauts handled only preparation and cleanup, then stepped aside. For the next four hours, Banerjee sent instructions to ground operators at NASA’s Johnson Space Center in Houston. Then, the NASA team relayed the commands to Astrobee, specifying its starting point and destination, simulating obstacles to avoid, and trying both warm and cold starts. Multiple safety measures kept the experiment secure, including replacing physical obstacles with virtual ones to eliminate collision risk, maintaining a backup robot, and allowing operators to abort a run if necessary.


The team tested 18 trajectories, each lasting more than a minute. Each was run twice: first with a cold start using the standard planning method, and then with a warm start, where the AI provided a first draft of the path that the system could quickly adjust.

The tests showed that giving Astrobee a warm start significantly sped up motion planning. “We showed that it’s 50 to 60% faster, especially in more challenging situations,” Banerjee said. Those harder cases included cluttered areas, tight corridors, and maneuvers requiring rotation instead of a straight path.

Watching Astrobee in orbit was a deeply personal experience for Banerjee. “The coolest part was having astronauts float past during the experiment,” she said. “One of them was one of my childhood heroes, Sunita Williams. Seeing years of work actually perform in space and watching her there while the robot moved around was incredible.”

The future of robots in orbit

After their experiment on the ISS, the team’s warm-start system reached Technology Readiness Level 5, a NASA designation indicating that a technology has been validated in a relevant environment. This designation signals that the technology is comparatively low risk, which is important when proposing new experiments or future missions.

Looking ahead, Banerjee said this type of mathematically grounded, safety-focused AI will be crucial as robots take on more tasks independently, and as NASA sends crewed missions to the moon and Mars. “As robots travel farther from Earth and as missions become more frequent and lower cost, we won’t always be able to teleoperate them from the ground,” she said. Such technologies will allow astronauts to focus on higher-priority work and use their time more effectively. “Autonomy with built-in guarantees isn’t just helpful; it’s essential for the future of space robotics,” she said.

Pavone highlighted that his lab will continue to research and advance warm starting techniques. “As part of the Center for Aerospace Autonomy Research (CAESAR), we are collaborating with the Stanford Space Rendezvous Lab to explore more powerful AI models – the same kinds used in modern language tools and self-driving systems. With stronger generalization capabilities, these models would enable robots to navigate even more challenging situations in future space missions.”


For more information

Abhishek Cauligi, PhD ’21, is also a co-author of the paper. Pavone is also an associate professor, by courtesy, of electrical engineering and of computer science in the School of Engineering. He is also a senior fellow of the Precourt Institute for Energy, faculty affiliate of the Institute for Human-Centered Artificial Intelligence, and a member of the Institute for Computational and Mathematical Engineering.

This work was funded by the Office of Naval Research, a NASA Early Stage Innovation grant, and a NASA Space Technology Graduate Fellowship grant.