Thursday, May 15, 2025

 

AI analysis of labor, delivery notes finds racial disparities in biased language

Columbia University Irving Medical Center





NEW YORK, NY (May 13, 2025) - Black patients admitted to the hospital for labor and delivery are more likely to have stigmatizing language documented in their clinical notes than White patients, Columbia University School of Nursing researchers report in JAMA Network Open.

Veronica Barcelona, PhD, an assistant professor at Columbia Nursing, and her colleagues also found differences in how Hispanic and Asian/Pacific Islander (API) patients were described compared to White patients. 

Clinicians’ documentation can both reflect bias and perpetuate it, the authors note, and may contribute to racial and ethnic disparities in health and health care. Barcelona and her colleagues used a form of artificial intelligence called natural language processing to analyze clinical notes for 18,646 patients admitted to two large hospitals for labor and birth in 2017-2019, identifying instances of both stigmatizing and positive language in their electronic health records. 

Four categories of stigmatizing language were included: showing bias toward a patient’s marginalized language/identity; suggesting a patient was “difficult”; indicating unilateral/authoritarian clinical decisions; and questioning a patient’s credibility.  

The researchers also considered two types of positive language: preferred/autonomy language, which portrays the patient as an active, decision-making participant in childbirth and presents the patient’s viewpoint from an impartial perspective; and power/privilege language, which notes markers of a patient’s status or higher psychological or socioecological position. 
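The release does not describe the team’s NLP pipeline in detail. As a rough sketch of how lexicon-based flagging along categories like these could work in principle (the keyword lists and the flag_note helper below are invented for illustration and cover only a subset of the categories; they are not the study’s actual lexicons or code):

```python
import re

# Hypothetical, illustrative lexicons -- NOT the study's actual word lists.
CATEGORY_LEXICONS = {
    "difficult": ["refuses", "demanding", "uncooperative", "insists"],
    "credibility": ["claims", "adamant", "alleges"],
    "unilateral_decisions": ["will not be given", "not permitted"],
    "positive_autonomy": ["prefers", "declines", "requests", "chooses"],
}

def flag_note(note_text: str) -> dict:
    """Return which language categories appear in a clinical note."""
    text = note_text.lower()
    return {
        category: any(re.search(rf"\b{re.escape(term)}\b", text) for term in terms)
        for category, terms in CATEGORY_LEXICONS.items()
    }

print(flag_note("Patient refuses monitoring; claims pain is severe."))
# {'difficult': True, 'credibility': True, 'unilateral_decisions': False, ...}
```

A production pipeline would also need negation and context handling, so that, say, “not uncooperative” is not flagged as stigmatizing.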

Language conveying bias was found for 49.3% of patients overall, and 54.9% of Black patients. The most common type of stigmatizing language, describing a patient as “difficult,” was seen in 28.6% of patients’ charts overall and 33% of Black patients’ charts.  

Compared to White patients, Black patients were 22% more likely to have any type of stigmatizing language in their clinical notes. Black patients were also 19% more likely to have positive documentation in their charts than White patients.  

Hispanic patients were 9% less likely to be documented as “difficult” patients than White patients and 15% less likely to have positive language overall. API patients were 28% less likely to have language in the marginalized language/identity category, and their charts were 31% less likely to include power/privilege language. 

“These findings underscore the importance of implementing targeted interventions to mitigate biases in perinatal care and to foster documentation practices that are both equitable and culturally sensitive,” the authors conclude. 

Barcelona’s Columbia Nursing co-authors include data manager Ismael Ibrahim Hulchafo, MD; doctoral student Sarah Harkins, BS; and Associate Professor Maxim Topaz, PhD. 

The study was funded with a grant from the Columbia University Data Science Institute Seed Funds Program and a grant from the Gordon and Betty Moore Foundation.  

About Columbia University School of Nursing  

Columbia University School of Nursing advances nursing education, research, and practice to improve health for all. As one of the top nursing schools in the country, we offer direct-entry master’s, advanced practice nursing, and doctoral programs with the goal of shaping and setting standards for nursing everywhere. And, as a top recipient of NIH research funding, we address health disparities for under-resourced populations and advance equitable health policy and delivery.  

Through our expansive network of clinical collaborations in New York City and around the world —including our dedicated faculty practice, the ColumbiaDoctors Nurse Practitioner Group — we cultivate a culture of innovation and diversity and champion a community-centered approach to care. Across the Columbia Nursing community, we encourage active listening, big thinking, and bold action, so that, together, we’re moving health forward.  

Columbia University School of Nursing is part of Columbia University Irving Medical Center, which also includes the Columbia University Vagelos College of Physicians and Surgeons, the Mailman School of Public Health, and the College of Dental Medicine. 

 

From sidelines to spreadsheets: UF doctoral students take AI coaching research from the court to Japan



University of Florida





Shortly after the confetti settled over the University of Florida’s basketball championship, two graduate students studying artificial intelligence traveled to Japan to discuss how coaches are using data and technology to maximize player performance and safety. 

Accomplished athletes themselves, UF engineering students Mollie Brewer and Kevin Childs are co-primary investigators on a paper exploring how coaches analyze data – often from wearable sensors – to shape training and strategy and, ultimately, win more games.

If a player comes off an intense workout, for example, coaches can look at the data and determine if that player needs rest before the next game. This means successful coaches – like those coaching the championship basketball team – are adding “data analyst” to their roles. 

Brewer and Childs’ paper was selected for presentation at the renowned Association for Computing Machinery CHI conference in Yokohama, Japan, which ran April 26 to May 2. It was a big deal, not just because UF students are getting an international spotlight on their AI in Athletics research but also because the pair’s very first research paper was chosen for a highly competitive world conference.

“This is a look into how coaches use technology within collegiate athletics,” Childs said. “We have a lot of studies talking to recreational athletes. We have some studies within the human-computer interaction area looking at professional sports. But we don't really have an understanding of how technology is used in systems like collegiate athletics.”

Brewer added: “And we’re not going to lie, bringing it to a conference on the back of a national championship is even more exciting.” 

Brewer and Childs are UF Ph.D. engineering students and key players in the $2.5 million UF & Sport Collaborative initiative that, among other things, explores how AI data can maximize athletic performance and reduce injuries.

Known as the AI-Powered Athletics project, this partnership between the Herbert Wertheim College of Engineering and the University Athletic Association (UAA) delves deep into wearables such as fitness trackers and other sensors attached to athletes to provide information for AI databases.  

“This paper was on the coach's perspective of what types of technology and data are being used in collegiate athletics,” Brewer said. “We're presenting how the landscape of data flows in this environment and also finding the opportunities to improve technology and data usage among these top-level coaches.”

The researchers worked with five teams and 17 coaches. For privacy reasons, they could not reveal which teams participated. 

But a March article on floridagators.com reported UF’s basketball team utilizes data for decisions on the court and in training. Heading into the SEC Tournament in March, for example, coaches used the data to increase intensity at practices to ensure optimum stamina if the team had to play three games in three days (which they did). 

As data analysts, coaches are figuring out what the numbers mean for individual athletes, particularly the relationship between intensity and injury.
 
“They're intaking data from dozens of sources and processing this to figure out the optimal training plan,” Childs said. “For example, there are GPS IMUs – inertial measurement units – being worn by a lot of student athletes. They are little vests with sensors between the shoulder blades. The vests capture all their position data: how fast they jump, how fast they're moving.” 
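For a flavor of the kind of processing involved, here is a minimal sketch that reduces a session of trunk-mounted accelerometer readings to a single load number. The player_load metric and its formula are assumptions made for illustration, not the actual analytics used by UF’s teams:

```python
import numpy as np

def player_load(accel: np.ndarray) -> float:
    """Crude session-load metric: accumulated change in acceleration
    magnitude. `accel` is an (n_samples, 3) array of x/y/z readings
    from a trunk-mounted IMU (metric invented for illustration)."""
    magnitudes = np.linalg.norm(accel, axis=1)
    return float(np.sum(np.abs(np.diff(magnitudes))))

# Synthetic 10-minute session sampled at 100 Hz.
rng = np.random.default_rng(0)
session = rng.normal(0.0, 1.0, size=(10 * 60 * 100, 3))
print(f"Session load: {player_load(session):.0f} (arbitrary units)")
```

A coach comparing such a number across sessions could spot unusually heavy workloads and schedule rest before the next game.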

Another conclusion: Coaching is not a one-person job in collegiate athletics. 

“There's an entire interdisciplinary team, and that information is shared among everybody to make decisions, and sometimes the same technology output is used by different roles,” Brewer said. “A dietician may use it one way. An athletic trainer may use the output for return to play, but they're all communicating together.”

Brewer is a cyclist, and Childs was a competitive swimmer. They know the ins and outs of competition and results. This research sharpens the competitive edge, that “last-second buzzer beater – anything you can do to get that extra 1%,” Childs said.
 
“We are so proud of Mollie and Kevin's innovative thinking and hard work,” said UF Department of Computer & Information Science and Engineering Professor Kristy Boyer, Ph.D., one of the collaborators on the paper. “This project exemplifies what can happen when university faculty and innovators within the athletic association come together with a common goal.”
  
In addition to Brewer, Childs, and Boyer, the paper’s collaborators are Kevin Butler, Ph.D., of the Department of Computer & Information Science and Engineering; Garrett F. Beatty, Ph.D., of UF’s College of Health and Human Performance; Spencer Thomas of the UAA; and Jennifer Nichols, Ph.D., Daniel Ferris, Ph.D., and Celeste Wilkins of UF’s J. Crayton Pruitt Family Department of Biomedical Engineering.

As for exploring Japan?

“The experience was amazing,” Childs said. “Some of the highlights were traveling to random spots on a map with Mollie and trying out different food. Conveyor-belt sushi offered a fun game of guessing what variety of seafood was on the plate, and I think I satisfied my yearly quota of ramen consumption.”

“We loved connecting with new faces at CHI and getting inspired by the exciting research on the horizon,” Brewer added. “After our presentation, we ran out to join our new friends for a jog around the bay, just in time to catch a rare view of Mt. Fuji at sunset. It felt like the universe's quiet nod of ‘well done.’” 

 

First machine learning model developed to calculate the volume of all glaciers on Earth




Università Ca' Foscari Venezia
Image: Alaska glaciers. Credit: Niccolò Maffezzoli, https://nmaffe.github.io/iceboost_webapp/




VENICE – A team of researchers led by Niccolò Maffezzoli, “Marie Curie” fellow at Ca’ Foscari University of Venice and the University of California, Irvine, and an associate member of the Institute of Polar Sciences of the National Research Council of Italy, has developed the first global model based on artificial intelligence to calculate the ice thickness distribution of all the glaciers on Earth. The model has been published in the journal Geoscientific Model Development and is expected to become a reference tool for those studying future glacier melt scenarios.

Accurate knowledge of glacier volumes is essential for projecting future sea level rise, managing water resources, and assessing societal impacts linked to glacier retreat. However, estimating their absolute volume remains a major scientific challenge. Over the years, more than 4 million in situ measurements of glacier thickness have been collected, thanks in large part to NASA’s Operation IceBridge. Despite this extensive dataset, current modelling approaches have yet to exploit its full potential.

AI applied to glacier data
Direct measurements of glacier thickness cover less than 1% of the planet’s glaciers, highlighting the need for models capable of providing global-scale estimates of ice thickness and volume. This newly published study is the first to leverage such observational data in conjunction with the power of machine learning algorithms.

“Our model combines two decision tree algorithms,” explains Maffezzoli, “trained on thickness measurements and 39 features including ice velocity, mass balance, temperature fields, and geometric and geodetic variables. The trained model shows errors that are up to 30-40% lower than current traditional global models, particularly at polar latitudes and along the peripheries of the ice sheets, where the majority of the planet’s ice is located.”
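The release does not name the two algorithms. Purely as a hedged sketch of the general approach (training tree-ensemble regressors on thickness measurements with dozens of features, then averaging their predictions), here is a toy version using scikit-learn’s gradient boosting and random forest as stand-ins, on synthetic data with 39 features:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real inputs (ice velocity, mass balance,
# temperature fields, geometric and geodetic variables, ...): 39 features.
rng = np.random.default_rng(42)
X = rng.normal(size=(5000, 39))
y = 200 + 50 * X[:, 0] - 30 * X[:, 1] + rng.normal(0, 10, size=5000)  # "thickness" in m

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two tree-based learners whose predictions are averaged into one estimate.
gb = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
pred = (gb.predict(X_test) + rf.predict(X_test)) / 2

print(f"Mean absolute error: {np.mean(np.abs(pred - y_test)):.1f} m")
```

Averaging two different tree ensembles is a common way to reduce the variance of either model alone; the real model additionally had to generalise from the less than 1% of glaciers with direct measurements to the rest of the globe.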

Improving maps and projections of sea level rise
In polar regions and along the margins of Greenland and Antarctica, accurate ice thickness estimates are particularly important. These estimates serve as initial conditions for numerical models that simulate ice flow and its interactions with the ocean, interactions that are key to projecting sea level rise under future climate scenarios. The model demonstrates strong generalisation capabilities in these regions and, the researchers believe, may help to refine current maps of subglacial topography in specific areas of the ice sheets, such as the Geikie Plateau or the Antarctic Peninsula.

This work represents an initial step towards producing updated estimates of global glacier volumes that will be useful to modellers, the IPCC, and policymakers.

“We aim to release two datasets totalling half a million ice thickness maps by the end of 2025,” announces Maffezzoli. “There is still a long way to go, but this work shows that AI and machine learning approaches are opening up new and exciting possibilities for ice modelling.”

The significance of glaciers
At present, glaciers contribute approximately 25-30% of observed global sea level rise, and their melting is accelerating. This is particularly significant in arid regions such as the Andes or the major mountain ranges of the Himalaya and Karakoram, where glacial headwaters support the livelihoods of billions. It is also critical for understanding the stability of the polar ice sheets in Greenland and Antarctica, where peripheral interactions with the ocean influence ice sheet dynamics.

 

AI tools may be weakening the quality of published research, study warns




University of Surrey





Artificial intelligence could be affecting the scientific rigour of new research, according to a study from the University of Surrey. 

The research team has called for a range of measures to reduce the flood of "low-quality" and "science fiction" papers, including stronger peer review processes and the use of statistical reviewers for complex datasets. 

In a study published in PLOS Biology, researchers reviewed papers published between 2014 and 2024 that proposed an association between a predictor and a health condition using an American government dataset, the National Health and Nutrition Examination Survey (NHANES).  

NHANES is a large, publicly available dataset used by researchers around the world to study links between health conditions, lifestyle and clinical outcomes. The team found that between 2014 and 2021, just four NHANES association-based studies were published each year – but this rose to 33 in 2022, 82 in 2023, and 190 in 2024. 

Dr Matt Spick, co-author of the study from the University of Surrey, said: 

“While AI has the clear potential to help the scientific community make breakthroughs that benefit society, our study has found that it is also part of a perfect storm that could be damaging the foundations of scientific rigour. 

“We’ve seen a surge in papers that look scientific but don’t hold up under scrutiny – this is ‘science fiction’ using national health datasets to masquerade as science fact. The use of these easily accessible datasets via APIs, combined with large language models, is overwhelming some journals and peer reviewers, reducing their ability to assess more meaningful research – and ultimately weakening the quality of science overall.” 

The study found that many post-2021 papers used a superficial and oversimplified approach to analysis – often focusing on single variables while ignoring more realistic, multi-factor explanations of the links between health conditions and potential causes. Some papers cherry-picked narrow data subsets without justification, raising concerns about poor research practice, including data dredging or changing research questions after seeing the results. 
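The mechanics of that criticism are easy to demonstrate. In the hedged toy example below, a confounder (age) drives both the predictor and the outcome, so an unadjusted single-variable logistic regression finds a strong “association” that largely vanishes once the confounder is included; all variables and effect sizes are invented for illustration:

```python
import numpy as np
import statsmodels.api as sm

# Synthetic data: age drives both the predictor and the outcome,
# creating a spurious predictor-outcome link if age is ignored.
rng = np.random.default_rng(1)
n = 5000
age = rng.normal(50, 10, n)
predictor = 0.05 * age + rng.normal(0, 1, n)
outcome = ((0.1 * age + rng.normal(0, 3, n)) > 8).astype(int)

# Unadjusted: the "single variable" shortcut.
unadj = sm.Logit(outcome, sm.add_constant(predictor)).fit(disp=0)

# Adjusted: the same predictor plus the confounder.
adj = sm.Logit(outcome, sm.add_constant(np.column_stack([predictor, age]))).fit(disp=0)

print("Unadjusted predictor coefficient:", round(unadj.params[1], 3))
print("Adjusted predictor coefficient:  ", round(adj.params[1], 3))
```

Because such analyses can be generated almost mechanically from an open dataset, reviewers with statistical expertise are often the only check on whether plausible confounders were considered.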

Tulsi Suchak, post-graduate researcher at the University of Surrey and lead author of the study, added: 

“We’re not trying to block access to data or stop people using AI in their research – we’re asking for some common sense checks. This includes things like being open about how data is used, making sure reviewers with the right expertise are involved, and flagging when a study only looks at one piece of the puzzle. These changes don’t need to be complex, but they could help journals spot low-quality work earlier and protect the integrity of scientific publishing.” 

To help tackle the issue, the team has laid out a number of practical steps for journals, researchers and data providers. They recommend that researchers use the full datasets available to them unless there’s a clear and well-explained reason to do otherwise, and that they are transparent about which parts of the data were used, over what time periods, and for which groups.  

For journals, the authors suggest strengthening peer review by involving reviewers with statistical expertise and making greater use of early desk rejection to reduce the number of formulaic or low-value papers entering the system. Finally, they propose that data providers assign unique application numbers or IDs to track how open datasets are used – a system already in place for some UK health data platforms. 

Anietie E Aliu, co-author of the study and post-graduate student at the University of Surrey, said: 

“We believe that in the AI era, scientific publishing needs better guardrails. Our suggestions are simple things that could help stop weak or misleading studies from slipping through, without blocking the benefits of AI and open data. These tools are here to stay, so we need to act now to protect trust in research.” 

 

AI overconfidence mirrors human brain condition



A similarity between language models and aphasia points to diagnoses for both




University of Tokyo

Image: Energy landscape analysis. The dynamics of signals in both the brains of people with aphasia and in large language models (LLMs) proved strikingly similar when represented visually. Credit: ©2025 Watanabe et al. CC-BY-ND




Agents, chatbots and other tools based on artificial intelligence (AI) are increasingly used in everyday life by many. So-called large language model (LLM)-based agents, such as ChatGPT and Llama, have become impressively fluent in the responses they form, but quite often provide convincing yet incorrect information. Researchers at the University of Tokyo draw parallels between this issue and a human language disorder known as aphasia, where sufferers may speak fluently but make meaningless or hard-to-understand statements. This similarity could point toward better forms of diagnosis for aphasia, and even provide insight to AI engineers seeking to improve LLM-based agents.

This article was written by a human being, but the use of text-generating AI is on the rise in many areas. As more and more people come to use and rely on such tools, there’s an ever-increasing need to make sure they deliver correct and coherent responses and information to their users. Many familiar tools, including ChatGPT, appear very fluent in whatever they deliver, but their responses cannot always be relied upon due to the amount of essentially made-up content they produce. If users are not sufficiently knowledgeable about the subject area in question, they can easily be misled into assuming this information is correct, especially given the high degree of confidence ChatGPT and others show.

“You can’t fail to notice how some AI systems can appear articulate while still producing often significant errors,” said Professor Takamitsu Watanabe from the International Research Center for Neurointelligence (WPI-IRCN) at the University of Tokyo. “But what struck my team and me was a similarity between this behavior and that of people with Wernicke’s aphasia, where such people speak fluently but don’t always make much sense. That prompted us to wonder if the internal mechanisms of these AI systems could be similar to those of the human brain affected by aphasia, and if so, what the implications might be.”

To explore this idea, the team used a method called energy landscape analysis, a technique originally developed by physicists seeking to visualize energy states in magnetic metal, but which was recently adapted for neuroscience. They examined patterns in resting brain activity from people with different types of aphasia and compared them to internal data from several publicly available LLMs. And in their analysis, the team did discover some striking similarities. The way digital information or signals are moved around and manipulated within these AI models closely matched the way some brain signals behaved in the brains of people with certain types of aphasia, including Wernicke’s aphasia.

“You can imagine the energy landscape as a surface with a ball on it. When there’s a curve, the ball may roll down and come to rest, but when the curves are shallow, the ball may roll around chaotically,” said Watanabe. “In aphasia, the ball represents the person’s brain state. In LLMs, it represents the continuing signal pattern in the model based on its instructions and internal dataset.”
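In neuroscience applications, energy landscape analysis typically fits a pairwise maximum entropy (Ising-type) model to binarized activity, and the “landscape” is the fitted energy function over all possible states. A minimal sketch under that assumption, with random rather than fitted parameters:

```python
import numpy as np

def energy(state: np.ndarray, h: np.ndarray, J: np.ndarray) -> float:
    """Ising-type energy of a binarized activity pattern `state` (+1/-1),
    given local biases `h` and pairwise couplings `J`."""
    return float(-h @ state - 0.5 * state @ J @ state)

# Toy example: 5 units with random (not fitted) parameters.
rng = np.random.default_rng(7)
n = 5
h = rng.normal(size=n)
J = rng.normal(size=(n, n))
J = (J + J.T) / 2          # couplings are symmetric
np.fill_diagonal(J, 0)     # no self-coupling

# Enumerate all 2^n states; low-energy states are the "valleys"
# where the ball in Watanabe's analogy comes to rest.
states = np.array([[1 if (i >> k) & 1 else -1 for k in range(n)]
                   for i in range(2 ** n)])
energies = np.array([energy(s, h, J) for s in states])
best = states[np.argmin(energies)]
print("Lowest-energy state:", best, "E =", round(float(energies.min()), 3))
```

Deep valleys correspond to states the system settles into; shallow ones let it wander chaotically, which is the rigidity-versus-flexibility picture the team used to compare LLMs with aphasic brains.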

The research has several implications. For neuroscience, it offers a possible new way to classify and monitor conditions like aphasia based on internal brain activity rather than just external symptoms. For AI, it could lead to better diagnostic tools that help engineers improve the architecture of AI systems from the inside out. Though, despite the similarities the researchers discovered, they urge caution not to make too many assumptions.

“We’re not saying chatbots have brain damage,” said Watanabe. “But they may be locked into a kind of rigid internal pattern that limits how flexibly they can draw on stored knowledge, just like in receptive aphasia. Whether future models can overcome this limitation remains to be seen, but understanding these internal parallels may be the first step toward smarter, more trustworthy AI too.”

Journal article: Takamitsu Watanabe, Katsuma Inoue, Yasuo Kuniyoshi, Kohei Nakajima, Kazuyuki Aihara, “Comparison of large language model with aphasia,” Advanced Science, https://doi.org/10.1002/advs.202414016


Funding: This work was supported by Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (19H03535, 21H05679, 23H04217, JP20H05921), The University of Tokyo Excellent Young Researcher Project, Showa University Medical Institute of Developmental Disabilities Research, the JST Moonshot R&D Program (JPMJMS2021), the JST FOREST Program (24012854), the Institute of AI and Beyond of UTokyo, and the Cross-ministerial Strategic Innovation Promotion Program (SIP) on “Integrated Health Care System” (JPJ012425).

Research Contact:

International Research Center for Neurointelligence (WPI-IRCN) - https://ircn.jp/en/


Public Relations Group, The University of Tokyo,
7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
press-releases.adm@gs.mail.u-tokyo.ac.jp
 

About The University of Tokyo:

The University of Tokyo is Japan's leading university and one of the world's top research universities. The vast research output of some 6,000 researchers is published in the world's top journals across the arts and sciences. Our vibrant student body of around 15,000 undergraduate and 15,000 graduate students includes over 4,000 international students. Find out more at www.u-tokyo.ac.jp/en/ or follow us on X (formerly Twitter) at @UTokyo_News_en.