Thursday, March 19, 2026


Can a specialized AI model steer doctors toward the right scan?



Fine-tuned GPT-enhanced system outperforms general-purpose models on radiology guideline alignment—with important caveats for clinical use




Intelligent Medicine

Similarity score distribution.

Response similarity score distributions for (a) AMIR-GPT, (b) GPT-4, (c) GPT-3.5, and (d) Gemini, displayed from left to right.

Credit: Intelligent Medicine





Medical imaging is essential to modern healthcare, but its overuse wastes resources and can harm patients. While guidelines for its appropriate utilization exist, their adoption remains a challenge. Now, a new study in Intelligent Medicine finds that domain-specific adaptation may help improve AI-assisted imaging recommendations, pointing to a new direction for value-based clinical decision support.

Every year, up to 30% of medical imaging studies ordered in the United States are considered unnecessary. This overuse wastes resources, strains healthcare systems, and exposes patients to avoidable radiation risks. Despite the existence of evidence-based appropriateness guidelines, translating them consistently into day-to-day clinical decisions remains difficult. A new study published in the journal Intelligent Medicine this February suggests that large language models adapted to specific clinical domains may offer a meaningful path forward.

The research team, based at Beijing Friendship Hospital and collaborating institutions, developed a model called the Appropriate Medical Imaging Recommendations Generative Pre-trained Transformer (AMIR-GPT). Rather than relying on a general-purpose AI system, they asked whether targeted fine-tuning on structured radiology guidance could produce more accurate, guideline-aligned imaging recommendations for common clinical scenarios.

“Overutilization of medical imaging is not just a cost problem. It reflects a gap between the best available evidence and what happens in practice. Our goal was to explore whether a domain-specific AI model could help bridge that gap in a way that supports clinicians, not replaces them,” says Han Lyu, M.D., corresponding author of the study and associate professor at the Department of Radiology, Beijing Friendship Hospital, Capital Medical University.

Building and testing the model

To train AMIR-GPT, the researchers curated 1,036 question-and-answer pairs derived from 26 guidelines in the American College of Radiology Appropriateness Criteria (ACR AC), covering a broad range of common clinical indications, including low back pain, trauma, fractures, abdominal pain, cancer screening and staging, gastrointestinal bleeding, hearing related complaints, and pediatric fever. Of the 1,036 entries, 932 were used for model training across four iterations, with the remaining 104 reserved for testing.

AMIR-GPT was benchmarked against GPT-4, GPT-3.5, and Gemini using the same test questions. Responses were scored on a 1 to 5 scale for similarity to standard answers through an automated assessment by GPT-3.5 and by two expert radiologists.
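The benchmark loop described above can be sketched in a few lines. This is a hypothetical illustration, not the authors' code: the dataset, model callables, and the toy token-overlap scorer are stand-ins for the study's actual GPT-3.5 and radiologist grading.

```python
# Hypothetical sketch of the benchmarking protocol: every model answers the
# same held-out questions, and each answer gets a 1-5 similarity score.

def score_similarity(response: str, reference: str) -> int:
    """Stand-in for the 1-5 similarity grading (done by GPT-3.5 and two
    radiologists in the study). Here: a crude token-overlap heuristic."""
    resp, ref = set(response.lower().split()), set(reference.lower().split())
    overlap = len(resp & ref) / max(len(ref), 1)
    return 1 + round(overlap * 4)  # no overlap -> 1, full overlap -> 5

def benchmark(models: dict, test_set: list) -> dict:
    """Score every model on the same test questions."""
    scores = {name: [] for name in models}
    for question, reference in test_set:
        for name, ask in models.items():
            scores[name].append(score_similarity(ask(question), reference))
    return scores
```

In the study itself, scores came from an automated GPT-3.5 assessment cross-checked by two expert radiologists rather than any lexical heuristic.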

What the results show

In the most stringent performance category, perfect agreement with standard guideline answers (score 5 out of 5), AMIR-GPT achieved the highest proportion among all models evaluated, at 33.3% of test responses, compared with 16.7% for GPT-4, 6.2% for GPT-3.5, and 6.2% for Gemini. The overall difference among models was statistically significant (ANOVA: F = 6.49, P = 0.0004), and pairwise testing confirmed a significant advantage for AMIR-GPT over GPT-3.5 (P = 0.018).
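The reported one-way ANOVA compares the mean similarity score across the four models. A minimal pure-Python version of the F-statistic makes the test concrete; the score lists below are invented for illustration and are not the study's data.

```python
# Minimal one-way ANOVA F-statistic, illustrating the kind of test reported
# above (F = 6.49, P = 0.0004). The toy scores are NOT the study's data.

def one_way_anova_f(groups):
    """F = between-group mean square / within-group mean square."""
    all_scores = [x for g in groups for x in g]
    n, k = len(all_scores), len(groups)
    grand_mean = sum(all_scores) / n
    # Between-group sum of squares: each group mean vs. the grand mean
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: each score vs. its own group mean
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

# Made-up similarity scores for four hypothetical models:
amir, gpt4, gpt35, gemini = [5, 5, 4, 3], [4, 3, 3, 2], [3, 2, 2, 1], [3, 2, 2, 2]
f_stat = one_way_anova_f([amir, gpt4, gpt35, gemini])
```

A large F indicates that between-model variation dominates within-model variation; the P-value then comes from the F-distribution with (k−1, n−k) degrees of freedom.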

However, the picture was more nuanced across the other performance bands. When high matches (score 4 out of 5), medium matches (score 3 out of 5), and low matches (score below 3) are considered, the general-purpose models remained competitive with AMIR-GPT. This finding matters for interpreting the study's claims. In medical AI evaluation, model ranking depends on whether the benchmark emphasizes exact guideline adherence or partial alignment. In clinical practice, that distinction is not merely academic: a fluent answer is not the same as a clinically appropriate one.

Qualitative review reinforced this point. In one higher-scoring example, AMIR-GPT correctly identified magnetic resonance imaging (MRI) without intravenous contrast as the appropriate first-line imaging study for a surgical candidate with subacute low back pain after six weeks of conservative management. This is consistent with ACR guidance and clinically meaningful. However, lower-scoring outputs revealed familiar risks in medical AI: omissions and deviations from standard recommendations, and in one case, an incorrect characterization of computed tomography (CT) enterography that failed to account for the potential masking of upper gastrointestinal bleeding by oral contrast agents.

Promising direction, preliminary evidence

The study positions domain-specific fine-tuning as a potentially useful strategy for improving AI performance in specialized clinical tasks. But the authors are careful not to overstate the implications.

The dataset covered only a subset of published ACR criteria, limiting the model's exposure to rarer or more complex clinical scenarios. Outputs that are inaccurate, fabricated, or off-target remain a barrier to unsupervised clinical deployment.

“This is a step toward AI as a collaborative tool in medicine, but responsible integration requires broader datasets, stronger evaluation methods, and validation across diverse real-world settings before these systems can be trusted more widely,” says Dr. Lyu.

Future work will focus on expanding training data to cover a broader range of ACR guidelines and more complex cases, incorporating real-time error correction mechanisms, and exploring applicability in electronic health record analysis and broader clinical decision support.

Importance

The findings contribute to a growing body of evidence suggesting that high performance in healthcare AI may require more than scaling general-purpose models. Domain-specific adaptation, that is, disciplined alignment with the standards, evidence structures, and reasoning patterns of a particular medical field, may be just as important as model size.

About the authors

Dr. Han Lyu (吕晗) is an Associate Chief Physician and Associate Professor of Radiology at Beijing Friendship Hospital, Capital Medical University. He specializes in advanced neuroimaging, brain structural-functional networks, tinnitus mechanisms, cerebral perfusion, and AI-enhanced medical diagnostics, with notable contributions to brain aging and neurodegeneration research. He is a former visiting scholar at Stanford University. Email: chrislvhan@126.com

Prof. Wang Zhenchang (王振常) is a distinguished medical imaging expert and Academician of the Chinese Academy of Engineering. He is affiliated with the Department of Radiology, Beijing Friendship Hospital, Capital Medical University, and leads pioneering work in ultra-high-resolution CT (world’s first 50 μm bone-specific scanner) for auditory and visual systems, as well as AI integration in medical imaging and diagnostics. Email: cjr.wzhch@vip.163.com

About the journal
Intelligent Medicine is a peer-reviewed, open-access journal focusing on the integration of AI, data science, and digital technology in clinical medicine and public health. It is published by the Chinese Medical Association in partnership with Elsevier. To learn more about Intelligent Medicine, please visit https://www.sciencedirect.com/journal/intelligent-medicine

Funding information
This study was partially supported by the National Natural Science Foundation of China (62171297, 61931013). The funder had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

HKUMed develops innovative AI tool: A single blood test can predict heart diseases up to 15 years before onset





The University of Hong Kong

HKUMed develops a cardiovascular risk prediction tool that can accurately predict the future risk of six major cardiovascular diseases with a single blood test. The system can provide early warning signals up to 15 years before clinical onset. The research is led by Professor Zhang Qingpeng (left).

Credit: HKU




A research team from the Department of Pharmacology and Pharmacy at the LKS Faculty of Medicine of the University of Hong Kong (HKUMed) has developed an innovative AI-based cardiovascular risk prediction tool, called CardiOmicScore. With a single blood test, the system can accurately forecast the future risk of six major cardiovascular diseases (CVDs): coronary artery disease, stroke, heart failure, atrial fibrillation, peripheral artery disease and venous thromboembolism. It can also provide early warning signals up to 15 years before clinical onset. The findings were published in Nature Communications.

AI-based multiomics integration reflects the body's real-time health status
CVDs remain the leading cause of death worldwide, accounting for approximately 19.8 million fatalities in 2022 alone. In routine health assessments, physicians typically evaluate cardiovascular risk based on age, blood pressure, smoking and other conventional clinical indicators. However, these measures often fail to capture subtle and early biological changes before the disease becomes clinically apparent, leading to many patients missing the optimal window for preventive intervention. Although polygenic risk scores have become popular in recent years, genetic predisposition is largely fixed at birth and does not change over time. Consequently, polygenic risk scores cannot reflect the immediate impact on health conditions resulting from lifestyle or environmental changes. This creates an urgent need for tools that can capture a person’s current biological state and provide accurate, early warnings for CVDs.

To address this problem, the HKUMed research team applied deep learning techniques to integrate multiomics data, including genomics, metabolomics and proteomics, to develop the CardiOmicScore tool. The study was based on large-scale population data from the UK Biobank, analysing 2,920 circulating proteins and 168 metabolites measured from blood samples. These molecular signals act as ‘real-time recorders’ of the body, sensitively reflecting subtle changes in the immune system, metabolism, and vascular health.

Professor Zhang Qingpeng, Associate Professor in the Department of Pharmacology and Pharmacy at HKUMed, explained, ‘Genes determine where we start—they define our baseline health risk. However, proteins and metabolites reflect our current physical health. Our AI tool is designed to decode these complex molecular signals, enabling doctors and patients to identify risks much earlier, which can potentially change the trajectory of disease through timely lifestyle modifications and early prevention.’

Accurate prediction of six major cardiovascular diseases with 15‑year advance warning in high-risk groups
The results showed that CardiOmicScore transforms complex multiomics measurements into personalised risk scores with substantially improved predictive performance compared with conventional polygenic risk scores. When combined with clinical information such as age and gender, the model significantly enhanced the risk prediction accuracy of six common CVDs and can even flag elevated risk up to 15 years before symptoms appear.
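Combining a molecular score with clinical covariates can be pictured as a simple logistic model. The function name, coefficients, and covariates below are invented for illustration; the study's actual model is a deep-learning system trained on UK Biobank data and is not reproduced here.

```python
# Illustrative sketch: fold a multiomics risk score together with clinical
# factors (age, sex) into one probability. All weights are made up.
import math

def combined_risk(omics_score, age, is_male, weights, intercept):
    """Logistic combination of a molecular score and clinical covariates."""
    linear = (intercept
              + weights["omics"] * omics_score
              + weights["age"] * age
              + weights["male"] * is_male)
    return 1.0 / (1.0 + math.exp(-linear))  # probability in (0, 1)

w = {"omics": 1.2, "age": 0.03, "male": 0.4}  # hypothetical coefficients
risk = combined_risk(omics_score=0.8, age=55, is_male=1, weights=w, intercept=-4.0)
```

The point of the sketch is only the structure: the omics score contributes on top of conventional factors, so a higher molecular score raises the predicted probability even when age and sex are held fixed.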

This study marks a shift in precision medicine from a static, gene-centric paradigm towards a more dynamic, multiomics-based approach. In the future, a small-volume blood sample may be sufficient to generate a comprehensive cardiovascular risk profile for multiple diseases. Professor Zhang added, ‘We aim to leverage technology to identify and prevent diseases before they develop. By shifting health management from reactive treatment to proactive prediction and intervention, we aim to create a lasting impact for both public health and individual patient care.’

About the research team
The study was led by Professor Zhang Qingpeng, Associate Professor in the Department of Pharmacology and Pharmacy, HKUMed, and the HKU Musketeers Foundation Institute of Data Science (IDS). The first author is Luo Yan from the HKU IDS.

Media enquiries
Please contact LKS Faculty of Medicine of The University of Hong Kong by email (medmedia@hku.hk).

Using AI to improve standard-of-care cardiac imaging 



UCSF-led research with deep neural networks enhances echocardiogram views of major cardiac conditions.




University of California San Francisco Medical Center





Heart disease is the leading cause of adult death worldwide, making cardiovascular disease diagnosis and management a global health priority. An echocardiogram, or cardiac ultrasound, is one of the most commonly used imaging tools employed by physicians to diagnose a variety of heart diseases and conditions.  

Most standard echocardiograms provide two-dimensional visual images (2D) of the three-dimensional (3D) cardiac anatomy. These echocardiograms often capture hundreds of 2D slices or views of a beating heart that can enable physicians to make clinical assessments about the function and structure of the heart. 

To improve diagnostic accuracy of cardiac conditions, researchers from UC San Francisco set out to determine whether deep neural networks (DNNs), a type of AI algorithm, could be re-designed to better capture complex 3D anatomy and physiology from multiple imaging views simultaneously. They developed a new “multiview” DNN structure—or architecture—to enable it to draw information from multiple imaging views at once, rather than the current approach of using only a single view. They then trained demonstration DNNs using this architecture to detect disease states for three cardiovascular conditions: left and right ventricular abnormalities, diastolic dysfunction, and valvular regurgitation.  
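The multiview idea, encode each view separately, then let a single head reason over all the embeddings at once, can be sketched schematically. This toy version only illustrates the data flow; the UCSF architecture is a deep neural network, and every function and number below is a placeholder.

```python
# Schematic sketch of multiview fusion: per-view embeddings are
# concatenated so one joint head can weigh evidence across views.

def embed_view(frames, dim=4):
    """Stand-in per-view encoder: reduce one view's pixel values to a
    fixed-length feature vector (here, crude summary statistics)."""
    n = len(frames)
    return [sum(frames) / n, max(frames), min(frames), n][:dim]

def multiview_logit(views, weights, bias=0.0):
    """Fuse all views by concatenating their embeddings, then apply a
    single linear head - so the head sees every view simultaneously."""
    fused = [x for view in views for x in embed_view(view)]
    assert len(fused) == len(weights)
    return bias + sum(w * x for w, x in zip(weights, fused))

# Two hypothetical echo views (e.g. A4c and A2c) as flat pixel lists:
a4c, a2c = [0.1, 0.5, 0.3], [0.2, 0.4, 0.9]
logit = multiview_logit([a4c, a2c], weights=[0.5] * 8)
```

A single-view model, by contrast, would apply a head to one embedding at a time and could never learn relationships between features that only appear in different views.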

In a study published March 17 in Nature Cardiovascular Research, the researchers compared the performance of DNNs that analyzed data from either single view or multiple views of the echocardiograms from UCSF and the Montreal Heart Institute. They found that DNNs trained on multiple views improved diagnostic accuracy compared to DNNs trained on any single view, demonstrating that AI models combining information from multiple imaging views simultaneously better captured the disease state of these heart conditions.  

“Until now, AI has primarily been used to analyze one 2D view at a time—from either images or videos—which limits an AI algorithm’s ability to learn disease-relevant information between views,” said senior study author Geoffrey Tison, MD, MPH, a cardiologist and co-director of the UCSF Center for Biosignal Research. “DNN architectures that can integrate information across multiple high-resolution views represent a significant step toward maximizing AI performance in medical imaging. In the case of echocardiography, most diagnoses necessitate considering information from more than one view because the information from any single view tells only part of the story.” 

For example, for the assessment of left ventricle (LV) size or function, the echocardiogram view showing all the chambers of the heart at once (A4c) best captures certain left ventricular walls (inferoseptal and anterolateral walls), whereas another perpendicular echo view (A2c) captures other important walls (anterior and inferior walls). Often the function of LV walls may appear completely normal in one view but have significant dysfunction in another view. For the echocardiogram tasks they examined, such as identifying left and right ventricular abnormalities and diastolic dysfunction, the researchers’ results suggest that the multiview DNNs likely learn interrelated information between features from each view to achieve higher overall performance.  

“Our multi-view neural network architecture is explicitly designed to enable the model to learn complex relationships between information in multiple imaging views,” said study first author Joshua Barrios, PhD, an assistant professor in the UCSF Division of Cardiology. “We find that this approach improves performance for diagnostic tasks in echocardiography, but this new AI architecture can also be applied to other medical imaging modalities where multiple views contain complementary information.”

The researchers also found that averaging the predictions of three single-view DNNs improves performance beyond a single-view DNN while also being less computationally expensive, thus providing a viable alternative to training a multiview DNN. Comparatively, however, the multiview DNN provided the strongest performance.  They suggest that future research should examine how multiview DNN architectures may assist other medical tasks or imaging modalities.  
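The single-view ensemble baseline amounts to averaging the probability outputs of independently trained models. A minimal sketch, with invented view names and probabilities:

```python
# Sketch of the single-view ensemble baseline: element-wise average of
# disease probabilities from three independently trained single-view
# models. Names and numbers are illustrative, not from the paper.

def ensemble_average(per_view_probs):
    """Element-wise mean across per-view probability vectors."""
    n_views = len(per_view_probs)
    return [sum(p[i] for p in per_view_probs) / n_views
            for i in range(len(per_view_probs[0]))]

# Hypothetical disease probabilities from three single-view models:
a4c_probs, a2c_probs, plax_probs = [0.2, 0.8], [0.4, 0.6], [0.6, 0.4]
avg = ensemble_average([a4c_probs, a2c_probs, plax_probs])  # ≈ [0.4, 0.6]
```

Averaging reduces the variance of any one view's errors but, unlike the multiview architecture, cannot learn interactions between views, which is consistent with the multiview DNN still performing best.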

Additional Authors: Minhaj U. Ansari, MS, Jeffrey E. Olgin, MD, Sean Abreau, MS, Jacques Delfrate, MS, Elodie L. Langlais, Robert Avram, MD, MS. 

Funding: Support for this work was received from the National Institutes of Health: K23HL135274 (G.H.T.), R56HL161475 (G.H.T.), and DP2HL174046 (G.H.T.). 

Disclosures: Please see the study. 

About UCSF Health: UCSF Health is recognized worldwide for its innovative patient care, reflecting the latest medical knowledge, advanced technologies and pioneering research. It includes the flagship UCSF Medical Center, which is a top-ranked specialty hospital, as well as UCSF Benioff Children’s Hospitals, with campuses in San Francisco and Oakland; two community hospitals, UCSF Health St. Mary's and UCSF Health Saint Francis; Langley Porter Psychiatric Hospital; UCSF Benioff Children’s Physicians; and the UCSF Faculty Practice. These hospitals serve as the academic medical center of the University of California, San Francisco, which is world-renowned for its graduate-level health sciences education and biomedical research. UCSF Health has affiliations with hospitals and health organizations throughout the Bay Area. Visit http://www.ucsfhealth.org/. Follow UCSF Health on Facebook, Threads or LinkedIn.
