Saturday, April 11, 2026

 

Scientists develop spatiotemporal correlation-based deep learning framework for bias correction of atmospheric and oceanic variables




Institute of Atmospheric Physics, Chinese Academy of Sciences
Image: Overview of the proposed bias correction architecture. Credit: Yuze Sun





Daily travel plans and early warnings for extreme weather all rely on traditional numerical weather prediction. However, both traditional numerical weather prediction and large AI forecasting models have long suffered from systematic biases, which compromise forecast accuracy.

To address this challenge, the research group led by Prof. Xiaomeng Huang from Tsinghua University, China, in collaboration with the National Climate Centre, China, has developed an AI bias correction framework based on spatiotemporal correlation deep learning. This framework accurately corrects forecast biases, achieving a maximum 20% reduction in the root-mean-square error of 7-day 2-meter air temperature forecasts. In addition, it can support the bias correction of oceanic variables, making forecasts in meteorological and oceanic scenarios more accurate. The findings were recently published in Atmospheric and Oceanic Science Letters.

The research team systematically integrated three key innovations into their model design: dynamic climatological normalization, ConvLSTM with temporal causality constraints, and residual self-attention mechanisms, enabling systematic bias correction of European Centre for Medium-Range Weather Forecasts (ECMWF) numerical forecasts.

The model was trained and validated using 41 years (1981–2021) of global atmospheric data, with ERA5 (fifth generation ECMWF atmospheric reanalysis) data serving as the ground truth. A decadal stratified sampling strategy, i.e., five non-consecutive years (1981, 1991, 2001, 2011, 2021) selected at 10-year intervals as a testing set, was employed to ensure the model’s generalization capability across distinct climate phases.
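The decadal stratified sampling described above can be sketched in a few lines of Python. This is only an illustration of the year selection. The variable names are hypothetical, and the release does not say how the remaining years were divided between training and validation, so they are kept together here.

```python
# Years covered by the dataset, per the text: 1981 through 2021 inclusive.
all_years = list(range(1981, 2022))

# Testing set: one year per decade, taken at 10-year intervals from 1981.
test_years = all_years[::10]

# Assumption: every other year is available for training and validation.
# The split between those two roles is not stated in the release.
train_val_years = [year for year in all_years if year not in test_years]

print(test_years)
print(len(all_years), len(test_years), len(train_val_years))
```

Holding out years spread across the whole record, rather than a single block at the end, means the testing set samples each decade of the 41-year period.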

Results show that the model boasts outstanding generalization capability. After being trained on the air temperature variable, it only takes 20 minutes to perform cross-variable correction for wind fields and air pressure, cutting the retraining time by 85%. Integrated as a plug-in into existing AI forecasting models, it further improves the forecast skill by 10%. Moreover, the corrected atmospheric data can significantly enhance the prediction performance of ocean models, enabling cross-domain empowerment from meteorology to oceanography.

This research was supported by the National Natural Science Foundation of China. The model code has been made publicly available, providing a reproducible technical solution for meteorological AI research.

 

New AI technology to speed drug development



University of Virginia Health System





University of Virginia School of Medicine scientists have developed a bold new approach to drug development and discovery that could dramatically accelerate the creation of new medicines.

UVA’s Nikolay V. Dokholyan, PhD, and colleagues have developed a suite of artificial intelligence-powered tools, called YuelDesign, YuelPocket and YuelBond, that work together to transform how new drugs are created. The centerpiece, YuelDesign, uses a cutting-edge form of AI called diffusion models to design new drug molecules tailored to fit their protein targets exactly, even accounting for the way proteins flex and shift shape during binding.

A companion tool, YuelPocket, identifies exactly where on a protein a drug can attach, while YuelBond ensures the chemical bonds in designed molecules are accurate. Together, the tools are poised to improve both how new drugs are designed and how quickly and efficiently existing drugs can be evaluated for new purposes.

“Think of it this way: Other methods try to design a key for a lock that's sitting perfectly still, but in your body, that lock is constantly jiggling and changing shape. Our AI designs the key while the lock is moving, so the fit is much more realistic,” said Dokholyan, of UVA’s Department of Neurology. “This could make a real difference for patients with cancer, neurological disorders and many other conditions where we desperately need better drugs targeting these wiggly proteins but keep hitting dead ends.”

The Pitfalls of Drug Development

The average cost of developing a new drug has been estimated to reach or exceed $2.6 billion, and almost 90% of new drugs fail when they reach human testing. That is due, in no small part, to the difficulty of predicting how molecules in a drug will interact, or bind, with their targets in the body. If a molecule doesn’t bind exactly as intended, at exactly the right spot, the drug won’t work, or could have unwanted, harmful side effects.

Artificial intelligence has helped address this problem, greatly accelerating drug design, but Dokholyan’s work takes it to the next level. His YuelDesign overcomes limitations of the existing options by designing drug molecules while treating proteins as flexible, dynamic structures, not the rigid and frozen snapshots used by other methods. This is critical because proteins often change shape when a drug binds to them, a phenomenon known as "induced fit." Ignoring this flexibility can lead to drugs that look promising on a computer screen but fail in reality.

Dokholyan and his team designed YuelDesign specifically to overcome this problem. Using advanced AI “diffusion models,” the technology simultaneously generates both the protein pocket structure and the small molecule that can slot into it – the key that will turn the lock, allowing both to adapt to each other during the design process.

A companion tool, YuelPocket, uses graph neural networks to identify precisely where on a protein a drug should bind, even on predicted protein structures from existing tools such as AlphaFold. “Most existing AI tools treat the protein as a frozen statue, but that's not how biology works. Our approach lets the protein and the drug candidate evolve together during the design process, just as they would in the body,” said researcher Dr. Jian Wang. “We showed, for example, that when designing molecules for a well-known cancer-related protein called CDK2, only YuelDesign could capture the critical structural changes that happen when a drug binds.”

Mapping out protein pockets is critical to “virtually every aspect of modern development,” the researchers note in a new scientific paper outlining their YuelPocket testing. The promising results have Dokholyan hopeful that the technology can reduce drug development costs, improve the success rate of new drug candidates and accelerate how quickly new treatments and cures can reach patients. (Accelerating how quickly lab discoveries can be turned into medicines to benefit patients is the primary mission of UVA’s new Paul and Diane Manning Institute of Biotechnology.)

“Our ultimate goal is to make drug discovery faster, cheaper and more likely to succeed, so that promising treatments can reach patients sooner,” Dokholyan said, adding that he wants to “democratize” drug discovery by putting new tools at scientists’ fingertips. “We've made all of our tools freely available to the scientific community. We want researchers anywhere in the world to be able to use them to tackle the diseases that matter most to their patients.”

Findings Published

Dokholyan and his team have described the development and results of these tools in papers in the scientific journals PNAS, JCIM and Science Advances. The research team includes Wang, Dong Yan Zhang, Shreshty Budakoti and Dokholyan. The scientists have no financial interest in the work.

The research has been supported by the National Institutes of Health, grant 1R35 GM134864; the National Science Foundation, grant 2210963; the Huck Institutes of the Life Sciences; and the Passan Foundation.

To keep up with the latest medical discoveries from the UVA School of Medicine and the Manning Institute, bookmark the Making of Medicine blog at https://makingofmedicine.virginia.edu.

 

Penn researchers use AI to surface unreported GLP-1 side effects in Reddit posts




An AI analysis of more than 400,000 Reddit posts found discussions of menstrual changes, fatigue and temperature-related complaints that may not be fully captured in clinical trials or drug labeling.




University of Pennsylvania School of Engineering and Applied Science

Image: Analyzing Reddit Posts About GLP-1s with AI. A close-up of the process the researchers used to analyze Reddit posts: at left is an example of the type of post the researchers fed into an AI-powered analysis, part of which is shown at right. Credit: Sylvia Zhang





By using AI to analyze more than 400,000 Reddit posts, Penn researchers have identified patient-reported symptoms associated with GLP-1s, the popular weight-loss and diabetes drugs semaglutide and tirzepatide, that may not be fully captured in clinical trials or regulatory documents.  

The new study, published in Nature Health, covers more than half a decade of posts from nearly 70,000 Reddit users and highlights two main classes of symptoms that warrant further study: reproductive symptoms, including irregular menstrual cycles, and temperature-related complaints, such as chills and hot flashes. 

“Some of the side effects we found, like nausea, are well known, and that shows that the method is picking up a real signal,” says Sharath Chandra Guntuku, Research Associate Professor in Computer and Information Science (CIS) at Penn Engineering and the study’s senior author. “The underreported symptoms are leads that came from patients themselves, unprompted, and clinicians could potentially pay attention to them.”  

“Clinical trials generally identify the most dangerous side effects of drugs,” adds Lyle Ungar, Professor in CIS and a co-author on the study. “But they can fail to find what symptoms patients are most concerned about; even though social media is not necessarily representative, a large collection of posts may reflect additional concerns.” 

The researchers caution that their findings are not causal. “We can’t say that GLP-1s are actually causing these symptoms,” notes Neil Sehgal, the study’s first author and a doctoral student in CIS advised by Guntuku and Ungar. “But nearly 4% of the Reddit users in our sample reported menstrual irregularities, which would be even higher in a female-only sample. We think that’s a signal worth investigating.” 

Studying Social Media for Health

In 2011, Ungar participated in one of the earliest efforts to mine online, user-created content for information about drugs’ adverse effects. 

“Online patient communities work a lot like a neighborhood grapevine,” says Ungar. “People who are living with these medications are swapping notes with each other in real time, sharing experiences that rarely make it into a doctor's office visit or an official report.”

In the years since, social media use has only grown, making data from these platforms increasingly promising as a source of information about the side effects of medications, even as the platforms themselves have made accessing the data more difficult. (Guntuku has also published research on strategies for adapting to changes in platform access.)

“Clinical trials are the gold standard, but by design, they are slow,” says Guntuku. “This is not a replacement for trials, but it can move much faster, and that speed matters when a drug goes from niche to mainstream almost overnight.” 

Leveraging AI to Analyze Social Media

Until now, the most challenging part of this process, which Guntuku calls “computational social listening,” has been scale.

Because users vary in how they describe their symptoms, the effort required to map individual social media posts to language in the Medical Dictionary for Regulatory Activities (MedDRA), which clinicians use to describe symptoms, limited the amount of data this approach could handle.

Now, large language models like GPT or Gemini have enabled the systematic analysis of social media posts at unprecedented scale. “Large language models have made it possible to do this kind of analysis much faster with a level of standardization that could be difficult to achieve before,” says Sehgal. 
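To see why varied wording made this mapping hard to scale before language models, consider a toy phrase lookup in Python. This is not the study's method, which relied on large language models. The phrase table, the labels (which are illustrative and not taken from MedDRA) and the example posts are all invented for this sketch.

```python
import re

# Hypothetical lookup table: colloquial phrase -> standardized label.
# Labels are illustrative only; they are not MedDRA terms from the study.
PHRASE_TO_TERM = {
    "nauseous": "nausea",
    "queasy": "nausea",
    "exhausted": "fatigue",
    "wiped out": "fatigue",
    "period is late": "menstrual irregularity",
    "freezing all the time": "feeling cold",
}

def normalize(post: str) -> set[str]:
    """Return the labels whose trigger phrases occur verbatim in the post."""
    text = re.sub(r"\s+", " ", post.lower())
    return {term for phrase, term in PHRASE_TO_TERM.items() if phrase in text}

# Made-up posts for illustration.
posts = [
    "Week 3 and I feel queasy and totally wiped out.",
    "Anyone else? My period is late and I'm freezing all the time",
    "So tired I can barely get off the couch.",  # fatigue, but no phrase matches
]
for post in posts:
    print(sorted(normalize(post)))
```

The third post describes fatigue in words the table does not contain, so the lookup returns nothing. Every new phrasing would need another hand-written entry, which is the scaling limit the paragraph above describes.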

Unreported Symptoms 

While the population the researchers studied is admittedly not representative — Reddit users are younger, more likely to be male and disproportionately based in the United States — the symptoms described in their collective accounts largely match the known side effects of semaglutide and tirzepatide: about 44% of users in the study described at least one side effect, most commonly some form of gastrointestinal distress. 

What stood out was the nontrivial percentage of users who reported symptoms that may not be fully reflected in current drug labeling or routine adverse-event reporting. Nearly 4% of users who reported side effects described reproductive symptoms, including menstrual changes such as intermenstrual bleeding, heavy bleeding and irregular cycles. 

Others reported temperature-related complaints, such as chills, feeling cold, hot flashes and fever-like symptoms.

In addition, fatigue ranked as the second most common complaint among Reddit users, despite reaching reporting thresholds in relatively few clinical trials.

“These drugs are thought to work by engaging part of the brain called the hypothalamus, which helps regulate a wide variety of hormones,” says Jena Shaw Tronieri, Senior Research Investigator at Penn’s Center for Weight and Eating Disorders and a co-author of the study. “That doesn’t mean the medications are necessarily causing these symptoms, but it could suggest that reports of menstrual changes and body temperature fluctuations are worth studying more systematically.” 

Future Directions

In the near term, the researchers hope their findings will encourage clinicians and researchers to take a closer look at the side effects patients are discussing online. “They’re clearly on patients’ minds, and that’s worth paying attention to,” says Sehgal.

The team also hopes to expand the work beyond Reddit and beyond English-language communities to test whether the same patterns appear across different platforms and populations. 

“We don’t really know yet whether what we’re seeing on Reddit reflects the experience of GLP-1 users globally, or whether it’s particular to the kind of person who posts on Reddit in the United States,” Ungar says. 

Ultimately, the researchers believe this kind of rapid, AI-assisted social media analysis could become a useful way to spot early warning signs around emerging drugs and wellness trends. 

For substances that trend quickly online, especially those sold in loosely regulated or unregulated markets, like injectable peptides, patient discussions on platforms like Reddit and TikTok may offer one of the earliest clues to what users are actually experiencing. 

“The whole point of this kind of approach is that it can move quickly, and that’s exactly when it’s most valuable,” says Guntuku.

This study was conducted at the University of Pennsylvania School of Engineering and Applied Science. The authors report no outside funding. Tronieri reports receiving an investigator-initiated grant, on behalf of the University of Pennsylvania, from Novo Nordisk and receiving consulting fees from Currax Pharmaceuticals, LLC. The other authors report no conflicts of interest. 



AI gives doctors early warning of disease “tipping points” — often from a single patient sample



New Intelligent Medicine editorial details how dynamics-driven models are enabling real-time, individualized disease forecasting




Intelligent Medicine






The editorial, "Dynamics-driven medical big data mining: dynamic approaches to early disease forecasting and individualized care," published in Intelligent Medicine (February 2026, Volume 6, Issue 1), was written by Lu Wang (Tianjin Medical University), Han Lyu (Beijing Friendship Hospital, Capital Medical University), and Bin Sheng (Shanghai Jiao Tong University). It argues that the future of medical AI lies not only in diagnosing disease once it is visible, but in detecting the early dynamic changes that happen before symptoms fully appear. By analyzing how health data evolve over time, from omics and medical records to imaging and wearable devices, AI may help identify “tipping points” when the body is moving toward disease. The authors also stress that these systems must be rigorously validated and used to support, not replace, clinical judgment.

 

From population averages to individual tipping points

At the heart of this framework is dynamic network biomarker (DNB) theory, which detects impending disease transitions by monitoring sharp rises in fluctuations and correlations within biomolecular networks. Prior work summarized in the editorial has validated DNB-based approaches across two clinically important scenarios: flagging heightened gene-expression instability in influenza infection days before symptoms appear, and identifying genomic tipping points where cells shift from benign to malignant states, with tumor progression prediction accuracies exceeding 80%.

For busy clinicians, the most immediately relevant advance may be individual-specific edge-network analysis (iENA), which transforms molecular data into edge networks and assesses critical transitions using a single patient's own longitudinal data, without requiring a control group. In transcriptomic applications, this single-sample approach has achieved area-under-the-curve (AUC) values greater than 0.9, bringing real-time, bedside-applicable dynamic assessment within reach for the first time in this class of methods.

 

Hybrid AI narrows the gap between models and patients

The editorial also presents evidence that combining mechanistic physiological knowledge with deep learning, rather than relying on data-driven models alone, substantially improves clinical utility. In type 1 diabetes management, physiology-informed long short-term memory (LSTM) networks reduced mean absolute error in blood-glucose prediction to 35.0 mg/dL, compared with 79.7 mg/dL for traditional simulators, achieving a reduction of more than 55%. These models create patient-specific digital twins that can be used to test therapeutic strategies in silico before clinical application.
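The "more than 55%" figure follows directly from the two reported errors. As a quick check in Python, using only the numbers quoted above (the variable names are illustrative):

```python
# Mean absolute errors reported in the text, in mg/dL.
mae_hybrid = 35.0      # physiology-informed LSTM
mae_simulator = 79.7   # traditional simulator

# Relative reduction = (baseline - new) / baseline.
relative_reduction = (mae_simulator - mae_hybrid) / mae_simulator
print(f"{relative_reduction:.1%}")  # prints 56.1%
```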

Beyond metabolic disease, the editorial describes parallel advances across data modalities: temporal graph neural networks applied to EHRs improved diagnosis prediction accuracy by 10–15% on the MIMIC-III dataset; dynamic graph models derived from functional MRI predicted treatment outcomes in tinnitus; and Transformer-based architectures trained on longitudinal EHRs have shown capacity to forecast multi-disease risks, including diabetes and hypertension, through hierarchical attention mechanisms.

 

Augmenting, not replacing, clinical judgment

"These dynamics-driven approaches are designed to augment, not replace, clinical expertise," said Professor Bin Sheng, corresponding author and professor at the School of Computer Science, Shanghai Jiao Tong University. "They provide timely early-warning signals that empower proactive intervention, moving medicine from reactive treatment to genuine prevention, while preserving the irreplaceable role of human judgment in complex medical decision-making."

 

Current limitations demand careful deployment

The editorial is equally direct about the challenges that must be resolved before these tools can deliver equitable, real-world benefits. Data heterogeneity and missing values can produce false positives in critical transition detection, inflating network fluctuations in ways that generate erroneous alerts. A more fundamental challenge is that current methods excel at identifying statistical associations but cannot reliably distinguish correlation from causation without incorporating medical domain knowledge and experimental validation. Interpretability remains a significant barrier: although tools such as SHAP and LIME provide partial explanations for model decisions, full transparency in deep architectures is yet to be achieved, and opaque predictions risk eroding the clinical trust that adoption requires.

Ethical and regulatory concerns also demand attention. Privacy risks persist in federated learning despite distributed training architectures, and algorithmic bias is a particular concern when models trained on specific populations are deployed in underrepresented groups, with the potential to widen rather than narrow healthcare inequalities.
 

The path forward: multimodal integration and prospective validation

Looking ahead, the editorial identifies two priorities. The first is multimodal integration: fusing omics, imaging, EHR, and wearable data through advanced Transformers, graph neural networks, and causal inference methods, including instrumental variables and counterfactual simulations, to construct comprehensive, causal models of individual disease trajectories. The second, and arguably more critical, is rigorous prospective validation. The authors stress that the gap between theoretical promise and clinical implementation can only be closed through well-designed prospective clinical trials and real-world deployment studies across diverse populations and healthcare settings.

Published as open access, the editorial serves as both a state-of-the-field reference and a practical roadmap for clinicians, researchers, and healthcare leaders working at the intersection of medicine and artificial intelligence.

 

***

 

Reference
DOI: 10.1016/j.imed.2025.10.001
 

About the Corresponding Author
Professor Bin Sheng received his Ph.D. in Computer Science and Engineering from The Chinese University of Hong Kong in 2011. He currently serves as a full professor at the School of Computer Science of Shanghai Jiao Tong University. His research focuses on virtual reality, computer graphics, and medical artificial intelligence. Sheng has published extensively in leading journals, including JAMA, Nature Medicine, The Lancet Digital Health, and IEEE Transactions on Pattern Analysis and Machine Intelligence. He is the Managing Editor of The Visual Computer and has co-chaired multiple international conferences and AI challenges.


About the Journal
Intelligent Medicine is a peer-reviewed, open-access journal focusing on the integration of artificial intelligence, data science, and digital technology in clinical medicine and public health. It is published by the Chinese Medical Association in partnership with Elsevier. To learn more about Intelligent Medicine, please visit: https://www.sciencedirect.com/journal/intelligent-medicine


Funding information
This work was supported by the Youth Fund of the National Natural Science Foundation of China (Grant Nos. 32300519, 62522119, and T2525004).

 

Pharma.AI Spring Kickoff 2026: Drive the future of pharmaceutical intelligence




Insilico Medicine

Image: Pharma.AI Spring Kickoff 2026: Drive the Future of Pharmaceutical Intelligence. Credit: Insilico Medicine




As the AI era becomes increasingly shaped by foundation models, the pharmaceutical industry is entering a new phase of opportunity for discovery, design, and decision-making driven by AI for science. To explore these advancements, Insilico Medicine (03696.HK), a clinical-stage generative AI–driven drug discovery company, today announced that the Pharma.AI Spring Kickoff 2026 will be held at 10:00 AM ET on April 14, with registration and event details available at: https://insilico.zoom.us/webinar/register/WN_h7tujok6SdmfDWzkZwRgNg.

The 2026 season of the Pharma.AI webinar series will showcase the ongoing AI revolution in life sciences, including the growing interest in foundation models; why specialized models remain essential for biology, chemistry, and translational research; how Pharma.AI brings together foundation models and scientific AI agents within a unified AI-driven workflow for drug R&D and scientific research; and how Insilico’s “AI trains AI” approach may enable foundation models to be better adapted for scientific and drug discovery applications, accelerating the evolution of AI decision-making systems.

More specifically, the upcoming event will highlight new capabilities across the Pharma.AI ecosystem, including the MMAI Gym for Science and updates to core modules such as PandaOmics, Generative Biologics, and Chemistry42.

"As we kick off 2026, our focus is on moving beyond simple AI-driven toward a truly AI-decision ecosystem," says Alex Aliper, PhD, president at Insilico Medicine. "With the introduction of the continued evolution of Pharma.AI, we are building the foundation for pharmaceutical superintelligence systems that can reason more effectively, adapt to real scientific workflows, and generate meaningful impact across drug discovery and development. The upcoming webinar brings together exciting new updates and is designed to provide researchers with the latest tools and best practices for tackling the most challenging problems in human health."

 

Highlights at a Glance

  • MMAI Gym: Turning Foundation Models into High-Performance Drug Discovery Engines

The MMAI Gym for Science, a foundation model training framework, was introduced by Insilico in January 2026. Leveraging over 1,000 drug R&D benchmarks and approximately 120 billion tokens of public and proprietary drug discovery data, the framework utilizes multi-task fine-tuning and reinforcement learning to significantly enhance the performance of foundation models across specialized tasks in drug discovery.

Validating the power of this framework, we demonstrate that MMAI-trained foundation models achieved up to 10X performance gains on key drug discovery benchmarks compared to general-purpose foundation models, which fell short on approximately 75–95% of tasks. Moreover, in March 2026, Insilico and Liquid AI jointly delivered LFM2-2.6B-MMAI (v0.2.1), the first model trained through their first MMAI Gym collaboration. Despite its lightweight, on-premise design, the model delivered SOTA performance across several key tasks. The paper detailing the training process and final performance was accepted at ICLR 2026.

During the upcoming event, attendees will learn how this supervised fine-tuning (SFT) and reinforcement fine-tuning (RFT) training and benchmarking system can significantly improve the performance of causal LLMs on real-world drug discovery tasks, and how to access the platform.

  • PandaOmics: Target Prioritization with Single-Cell and PandaClaw

PandaOmics is Insilico Medicine’s AI-driven platform for therapeutic target discovery and indication expansion. It integrates and analyzes large-scale multi-omics and biomedical datasets to help researchers to identify and prioritize disease-specific drug targets and to expand the therapeutic indications of targets of interest.

Recent upgrades to PandaOmics include the incorporation of comprehensive single-cell datasets, which provide enhanced resolution for target identification. In addition, PandaClaw is an agentic AI tool that allows scientists to conduct complex, real-time multi-omics analyses, generate research hypotheses, and perform target evaluations via a simple natural-language interface.

  • Chemistry42: Multi-Target and Advanced Alchemistry

Chemistry42 is Insilico Medicine’s AI-driven platform for designing and discovering novel small molecules. It combines generative model ensembles and advanced physics-based methods to help researchers create and optimize novel compounds. A core part of Chemistry42 is Nach01, an AI model trained on billions of data points to understand both natural and chemical language, enabling hundreds of professional tasks and laying the groundwork for a “prompt-to-drug” future.

The latest updates include multitarget support for molecule generation, enhanced results visualization for smoother analysis, Nach01-MMAI for molecule generation, and new Absolute Binding Free Energy (ABFE) calculations in Alchemistry.

  • Generative Biologics: Cyclic Peptide Design & Linear Peptide Optimization

Generative Biologics is a cutting-edge biologics engineering platform. It uses advanced multi-parameter optimization to tackle complex challenges in the design of antibodies, peptides, and other biologic drugs. Powered by more than 10 generative and predictive models and enhanced by precise physics-based tools, Generative Biologics enables the rapid creation of diverse, optimized biologics, allowing scientists to generate viable binder candidates in less than 72 hours.

The platform now includes major updates for peptide design. It introduces a completely new workflow for cyclic peptides, supporting both head-to-tail and disulfide-bond architectures, generating hundreds of candidates in just hours with AI- and physics-based prioritization. In parallel, researchers have successfully optimized linear peptides using the platform to refine the lead candidate, P3, against GLP-1R and to produce dozens of new candidates, with the top variant, P3-1, achieving a sixfold improvement over the original lead.

Pharma.AI is an end-to-end AI platform for drug discovery and development, integrating target discovery, generative chemistry, biologics design, and predictive clinical modeling into a unified AI-driven workflow for pharmaceutical R&D. We hope to see you at our first event as we kick off 2026.

 

Date: April 14, 2026

Time: 10:00 AM ET

Link: Register via Zoom at https://insilico.zoom.us/webinar/register/WN_h7tujok6SdmfDWzkZwRgNg

 

About Insilico Medicine

Insilico Medicine is a pioneering global biotechnology company dedicated to integrating artificial intelligence and automation technologies to accelerate drug discovery, drive innovation in the life sciences, and extend healthy longevity to people around the world. The company was listed on the Main Board of the Hong Kong Stock Exchange on December 30, 2025, under the stock code 03696.HK.

By integrating AI and automation technologies and deep in-house drug discovery capabilities, Insilico is delivering innovative drug solutions for unmet needs including fibrosis, oncology, immunology, pain, and obesity and metabolic disorders. Additionally, Insilico extends the reach of Pharma.AI across diverse industries, such as advanced materials, agriculture, nutritional products and veterinary medicine. For more information, please visit www.insilico.com