It’s possible that I shall make an ass of myself. But in that case one can always get out of it with a little dialectic. I have, of course, so worded my proposition as to be right either way (K.Marx, Letter to F.Engels on the Indian Mutiny)
Tuesday, May 05, 2026
No digital content is safe from generative AI, researchers say
Cybersecurity researchers discovered that simple artificial intelligence tools can defeat security techniques meant to protect authentic content from use in deepfakes, facial identity theft, or artistic style mimicry
A research team led by Virginia Tech cybersecurity expert Bimal Viswanath has found a critical blind spot in today's image protection techniques designed to prevent bad actors from stealing online content for unauthorized artificial intelligence training, style mimicry, and deepfake manipulations.
The research team found that attackers can defeat existing security using off-the-shelf artificial intelligence (AI) models and simple commands. Furthermore, “there is currently no foolproof, mathematically guaranteed way for users to protect publicly posted images against an adversary using off-the-shelf GenAI models,” Viswanath said.
The work was presented at the fourth IEEE Conference on Secure and Trustworthy Machine Learning, in Munich, Germany. The authors include Viswanath, doctoral students Xavier Pleimling and Sifat Muhammad Abdullah, Assistant Professor Peng Gao, Murtuza Jadliwala of the University of Texas at San Antonio, and Gunjan Balde and Mainack Mondal of Indian Institute of Technology, Kharagpur.
As AI tools become more powerful and accessible, this work highlights the growing need for stronger cybersecurity, trustworthy AI, privacy, and digital forensics protections.
GenAI makes fraud easier
Previously, fraudsters needed to use specialized, purpose-built methods to circumvent image security techniques that made it difficult for bad actors to use authentic content for deepfakes, facial identity theft, or artistic style mimicry.
"But using today’s off-the-shelf, image-to-image generative AI models and a simple text prompt, our researchers easily and effectively removed a wide range of these protections,” Viswanath said.
They demonstrated this security weakness across eight case studies spanning six diverse protection schemes. The vulnerabilities impact a wide spectrum of defenses, including perturbations meant to protect specific semantic properties, like a person's facial identity, invisible "protective noise" applied through an AI's latent space, and even robust protections specifically designed to survive downstream fine-tuning tasks.
“Our general-purpose attack not only circumvents these defenses but actually outperforms existing specialized attacks, while preserving the image's utility for the adversary," Viswanath said.
Racing to solve a growing problem
This work has exposed a critical and widespread vulnerability in the current landscape of image protection, proving that simply adding imperceptible protective noise to an image is no longer enough to stop data scrapers and forgers.
“It is especially concerning because current security methods can give a false sense of security,” Viswanath said. “We urgently need to develop robust defenses and establish that any future protection mechanism can defend against attacks from off-the-shelf generative AI models.”
This means the cybersecurity community must wholly re-evaluate its approach to secure visual content.
“Any future protection mechanism must be strictly benchmarked against simple, text-guided attacks from widely available, off-the-shelf GenAI models, not just evaluating them against specialized, purpose-built attacks,” Viswanath said. “Researchers should also note that GenAI image-to-image models will continue to improve over time, potentially making defense efforts harder.”
Article Title
Off-The-Shelf Image-to-Image Models Are All You Need To Defeat Image Protection Schemes
A toy model to understand how AI learns
A simple physics-inspired model sheds light on how neural networks learn, offering new clues to the surprising efficiency and stability of modern AI systems
Artificial intelligence systems based on neural networks — such as ChatGPT, Claude, DeepSeek or Gemini — are extraordinarily powerful, yet their internal workings remain largely a “black box”. To better understand how these systems produce their responses, a group of physicists at Harvard University has developed a simplified mathematical model of learning in neural networks that can be analysed mathematically using the tools of statistical physics.
“Toy models”, like the one presented in the study just published in the Journal of Statistical Mechanics: Theory and Experiment (JSTAT), provide researchers with a controlled theoretical laboratory for investigating the fundamental mechanisms of neural networks. A deeper understanding of how these systems work could help design artificial intelligence systems that are more efficient and reliable, while also addressing some of the current challenges.
The laws of AI
It’s a bit like when Kepler described the laws governing the motion of the planets. “The way Newton’s laws of gravity were discovered was first by identifying scaling laws between the orbital periods of planets and their radii,” explains Alexander Atanasov, a PhD student in theoretical physics at Harvard University and first author of the new study. Kepler formulated his laws by observing planetary motion, without fully understanding the mechanisms behind it. Yet that work proved crucial: it later enabled Newton to uncover gravity, leading to a much deeper understanding of the universe.
In studies of deep learning—the branch of artificial intelligence based on neural networks—we may still be in a similar Keplerian phase. Today researchers have identified several empirical laws that describe how neural networks behave, but we still lack a kind of “theory of gravity” explaining why they behave that way.
Scientists, for example, know about the scaling laws. “We know that if we take a model and make it bigger, or give it more data, its performance increases,” explains Cengiz Pehlevan, Associate Professor of Applied Mathematics at Harvard University and senior author of the study. These laws make performance predictable, but they do not yet reveal the deeper mechanisms behind it. Scaling models up without understanding them is not only inefficient (today’s AI systems consume enormous amounts of energy) but also does little to advance our understanding of how these systems actually work.
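The scaling laws Pehlevan describes are typically power laws, which appear as straight lines in log-log space. A minimal sketch of how such a law is fit from measurements, using entirely synthetic numbers (the exponent 0.076 is illustrative, not a value from the study):

```python
import numpy as np

# Synthetic losses following a power law L(N) = c * N^(-alpha),
# standing in for measured performance at growing model sizes.
# The exponent 0.076 is illustrative, not taken from the study.
N = np.array([1e6, 1e7, 1e8, 1e9])   # model sizes (parameters)
L = 3.0 * N ** -0.076                # hypothetical test loss

# A power law is a straight line in log-log space, so fitting
# one reduces to linear regression on the logarithms.
slope, intercept = np.polyfit(np.log(N), np.log(L), 1)
alpha = -slope
print(round(alpha, 3))  # -> 0.076
```

This is exactly the Keplerian situation the researchers describe: the exponent can be measured precisely without any theory explaining why it takes that value.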
Neural networks as biological organisms
“Deep learning models are not algorithms written by hand as a set of rules. They’re not engineered manually,” explains Atanasov. “It’s much more similar to an organism being grown in a lab.”
Generative AI chatbots rely on neural networks, a technology that — in a very distant way — resembles the functioning of a biological brain. They consist of many small processing units, called artificial neurons, each performing simple operations but connected together in a complex network.
It is this networked structure that allows “intelligent” behaviour to emerge. Although we know the mathematical operations performed by each individual component, predicting and mechanistically explaining the behaviour of the system as a whole remains extremely difficult: as the number of components grows, the complexity increases rapidly.
A toy model
Since it is currently impossible to analyse a full-scale neural network with exact mathematical methods, Atanasov and his colleagues chose to work with a simplified model that still captures many key features of more complex systems.
“The model we’re studying is simple enough to be solved mathematically,” explains Jacob Zavatone-Veth, Junior Fellow at the Harvard Society of Fellows and co-author of the study. “At the same time, it reproduces several of the key phenomena seen in large neural networks.”
The toy model used in the study is ridge regression, a variant of linear regression.
Linear regression is a statistical method used to estimate relationships between variables. For example, if we know the height and weight of 100 people, we can use linear regression to identify a mathematical relationship between the two and estimate the height of a new person based only on their weight.
The mystery of overfitting — and why it often doesn’t happen
Ridge regression is a type of regression that helps reduce the phenomenon known as overfitting. When models are trained on large datasets, a neural network — a bit like a very diligent but perhaps not particularly insightful student — may end up simply memorising the training data instead of learning patterns that allow it to generalise and make reliable predictions on new data.
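Ridge regression curbs overfitting by penalising large coefficients. A minimal NumPy sketch of its closed-form solution in an overparameterised setting (more features than samples); all numbers here are synthetic and only illustrate the general technique, not the study's specific setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparameterised toy setting: 50 features, only 20 samples.
n, d = 20, 50
X = rng.normal(size=(n, d))
true_w = np.zeros(d)
true_w[:5] = 1.0                      # only 5 features carry signal
y = X @ true_w + 0.1 * rng.normal(size=n)

# Closed-form ridge solution: w = (X^T X + lam * I)^(-1) X^T y.
# The penalty lam keeps the ill-posed system solvable and shrinks
# the coefficients, which is what tames overfitting.
lam = 1.0
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Evaluate on fresh data the model never saw.
X_new = rng.normal(size=(1000, d))
test_mse = np.mean((X_new @ w_ridge - X_new @ true_w) ** 2)
```

Even with more parameters than data points, the penalised fit generalises better than a trivial predictor, echoing in miniature the benign behaviour of large networks that the study analyses.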
Yet deep learning models often behave in a surprising way. “Despite being extremely large, these models can learn from the data without overfitting,” explains Atanasov, calling it “one of the great mysteries of deep learning.”
At first glance this seems counterintuitive. In theory, larger models should be more prone to overfitting. Instead, the scaling laws show that performance often improves as more data are used during training.
New insights
The new study offers one possible piece of that explanation. According to the researchers, the ability of neural networks to learn without overfitting may arise from principles related to renormalization theory, a framework widely used in statistical physics.
To see why, it helps to consider the dimensionality of the data processed by modern AI systems. In the earlier example of linear regression we considered only two variables — height and weight. Real systems such as ChatGPT, however, operate in spaces with thousands or even millions of variables, making an exact mathematical analysis extremely difficult.
Here ideas from statistical physics become useful. In very high-dimensional data, small random variations — known as statistical fluctuations — naturally appear. Renormalization theory shows that many microscopic details can be effectively absorbed into a small number of parameters, meaning that even very complex systems can display relatively simple large-scale behaviour.
Using this framework and their simplified toy model, the researchers show how these high-dimensional fluctuations can actually stabilise learning rather than destabilise it.
“This is something we can understand by analysing simpler linear models,” explains Pehlevan, suggesting that the same mechanism may explain why current neural networks avoid overfitting even when they are highly over-parameterised.
The simplified model may also serve another purpose. As Zavatone-Veth notes, it could be a kind of baseline for understanding how learning might behave in very high-dimensional systems. By studying a model that is simple enough to analyse mathematically, researchers can identify which aspects of learning are likely to be generic—that is, expected to appear across many different neural networks—and which instead depend on the details of a specific model. In this sense, studies like this may help clarify some of the more fundamental principles underlying learning in complex systems.
Method of Research
Computational simulation/modeling
Article Title
Scaling and renormalization in high-dimensional regression
Article Publication Date
5-May-2026
New AI model reads DNA sequences to reconstruct ancestry
Borrowing from chatbots, researchers create first language model for population genetics
Researchers at the University of Oregon have developed an artificial intelligence tool that can read genetic code the way large language models like ChatGPT read text. Scanning the genome for biological mutation patterns, the computer model traces pairs of genes back in time to their last common ancestor.
It’s the first language model designed for population genetics, said Andrew Kern, a computational biologist in the UO College of Arts and Sciences. As described in a paper published April 10 in the Proceedings of the National Academy of Sciences, the AI tool offers scientists a fast and flexible alternative to classical methods for reconstructing evolutionary history.
In practice, it can help researchers like Kern understand when disease-resistance genes emerged in a population, for example, or when species evolved key traits.
“Advances in generative AI and the architectures behind them are potentially useful to a number of fields outside a chatbot,” said Kern, an Evergreen professor of biology. “We’re borrowing strengths from the world of AI and applying them in this different context that’s largely been untapped.”
Genomes are often compared to a written language, with combinations of DNA’s four-letter alphabet — A, T, C and G — forming the basis for genes and chromosomes. Kern and his lab are most interested in what’s misspelled, which scientists call mutations: changes in DNA sequences, like swapped or missing letters, that accumulate over time as part of evolution.
Often harmless, mutations can be passed down from generation to generation, leaving a trail of breadcrumbs for tracing ancestral relationships.
Traditional methods based on math and statistics are the gold standard for translating mutations into ancestry. They’re difficult to beat in most cases, said Kevin Korfmann, lead author of the study and former postdoctoral researcher at the UO. But those classical probabilistic approaches can be slow and struggle with large or incomplete genomic datasets, he added.
So, the researchers looked to AI to efficiently interpret the language of life by modifying a GPT-2 model, the older machine learning architecture behind ChatGPT. But instead of being trained on large volumes of English text, the language model was trained on simulations of genetic evolution across different species — including bacteria, rodents, mosquitoes and primates — to learn and recognize mutation patterns.
“We can’t repeat evolution, so one of the key workflows we have is developing simulations,” Korfmann said. “The simulations mimic evolutionary processes, and then we use the outcomes as training data for our deep learning models.”
In general, stretches of DNA with many mutations likely trace back to a distant common ancestor, whereas those with few mutations are likely to share a more recent ancestor. This helps explain why chimpanzees are considered humans’ closest living relatives, with similar DNA, while sea sponges are the most distant, having diverged genetically more than 700 million years ago.
Based on those mutation patterns and other biological principles, the AI model can predict when gene pairs last shared a common ancestor, known as the “coalescence time.”
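The intuition that more mutations imply a more distant ancestor can be made concrete with a back-of-the-envelope moment estimate under the classical infinite-sites model; the mutation rate, sequence length, and difference count below are illustrative assumptions, not values from the paper:

```python
# Back-of-the-envelope coalescence-time estimate under the
# infinite-sites model. All numbers are illustrative assumptions,
# not values from the paper.
mu = 1e-8        # assumed mutation rate per site per generation
L = 1_000_000    # length of the compared sequence, in base pairs
k = 200          # observed differences between the two sequences

# Mutations accumulate along BOTH lineages since the common
# ancestor, so E[k] = 2 * T * mu * L. Solving for T gives a
# simple moment estimate of the coalescence time:
T = k / (2 * mu * L)
print(T)   # roughly 10,000 generations
```

Classical methods refine this kind of estimate with full probabilistic machinery; the AI model instead learns the mapping from mutation patterns to coalescence times during training.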
In tests, the tool performed as well as state-of-the-art statistical methods, which was surprising to the research team.
“You never really know what’s going to work when you’re essentially borrowing techniques from a totally different world and applying them to a new problem,” Kern said. “But this was a case where things worked really well.”
The computer model was also dramatically faster. While traditional methods can take hours or even days to decode a single mosquito chromosome, the new approach can do it in minutes. That efficiency is especially beneficial for scientists handling large amounts of genetic sequence data.
“Compared to classical inferential approaches, the AI tool doesn’t have to reason about every mutation individually,” Korfmann said. “It just reads the patterns because all of the expensive statistical work was done up front, during training, which sidesteps the bottleneck.”
The model’s simulation-based training also enables scientists to use DNA datasets that are incomplete or missing genetic code — an issue Kern frequently faces when working with mosquito genetic databases for his research on malaria transmission.
That versatility comes at a crucial moment for malaria control, Kern said. For decades, insecticides have been a cornerstone of efforts to control malaria-spreading mosquitoes. But evolution, as Kern puts it, “did its thing.”
“Insecticide resistance is being observed in all of these mosquito populations today,” he said. “A major challenge in preventing the spread of malaria has been understanding the evolution of insecticide resistance. Now, we can go in with our AI model, ask how long ago these resistance genes arose in the population, and learn about the evolutionary history of this critical carrier of malaria.”
Looking ahead, Kern and Korfmann aim to advance the biological model beyond tracing shared ancestry between two lineages towards reconstructing full genealogical trees across multiple lineages. Some traditional methods can already do this, but Kern said they’d like to chase that goal from a machine-learning angle.
“There’s so much going on in the machine learning field that we haven’t applied yet in our field,” Korfmann said. “There’s tons of translational work to do to get these novel algorithms working in biology.”
HOBOKEN, N.J., May 4, 2026 — In a novel attempt to improve how large language models learn and make them more capable and energy-efficient, Stevens researchers have devised an algorithm that improves how AI models share data, boosts performance and reduces power consumption.
Large language models like ChatGPT are huge. Letting many people train them together without sharing users’ private data — an approach called federated learning — is slow and inefficient. To collaborate, the participating models must constantly exchange updated copies of the entire model, and that is a huge amount of information. The process consumes a great deal of network bandwidth and memory and is energy intensive. As a result, models can’t be synchronized as often as necessary, leaving participants working with outdated versions.
“It’s too much data to share,” says Stevens PhD candidate Yide Ran, who was the driving force behind the effort to improve the process. “It's like sending in an entire encyclopedia when you only need to change a few entries. But you really don’t need to do that.”
Ran worked with his advisors Zhaozhuo Xu, Assistant Professor of Computer Science at the School of Engineering who studies machine learning, and Denghui Zhang, Assistant Professor of Information Systems and Analytics at the School of Business, to improve how language models share their updates.
The team built upon the previously known concept that effective learning in large language models is often driven by a surprisingly small but well-chosen subset of parameters. The result is a more agile, faster-working model that also uses less energy. The researchers named the model MEERKAT after the animal, known for its dexterity and speed.
Instead of sharing the entire giant AI model, MEERKAT shares updates to only 0.1 percent of the model, the parameters that matter most. “So you are no longer sending the entire encyclopedia when only a few key definitions have changed,” explains Zhang. That cuts communication volume by a factor of more than 1,000. “Updates that used to be gigabytes are now just a few megabytes,” Zhang says.
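The idea of transmitting only the most important 0.1 percent of an update can be illustrated with generic top-k sparsification. This is a sketch of the general technique only; MEERKAT's actual parameter-selection rule (its "transferable sparsity") may differ:

```python
import numpy as np

def sparsify_update(delta, keep_frac=0.001):
    """Keep only the largest-magnitude entries of an update vector.

    Returns (indices, values) -- the only data a client would need
    to transmit. Generic top-k sparsification; MEERKAT's actual
    selection rule may differ.
    """
    k = max(1, int(delta.size * keep_frac))
    idx = np.argpartition(np.abs(delta), -k)[-k:]
    return idx, delta[idx]

def apply_sparse_update(params, idx, values):
    """Merge a received sparse update into the local parameters."""
    out = params.copy()
    out[idx] += values
    return out

rng = np.random.default_rng(0)
delta = rng.normal(size=1_000_000)    # a million-parameter update
idx, vals = sparsify_update(delta)
print(len(idx))   # 1,000 entries: 0.1 percent of the update
```

Sending index-value pairs for a thousand entries instead of a million dense values is where the three-orders-of-magnitude communication saving comes from.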
MEERKAT’s other efficiency secret is a different way of measuring errors. Standard AI training relies on an intense mathematical process called backpropagation (short for backward propagation of errors), in which the network’s prediction errors are traced backward through every layer to adjust each parameter. Although backpropagation is the core algorithm used to train neural networks by minimizing the difference between predicted and actual outputs, it consumes huge amounts of memory and energy. MEERKAT instead tweaks the model slightly and checks how the results change, bypassing backpropagation entirely.
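The "tweak and check" idea corresponds to zeroth-order (SPSA-style) gradient estimation, which needs only forward evaluations of the loss. Below is a sketch of the generic two-point estimator on a toy quadratic loss, not the paper's exact method:

```python
import numpy as np

def zeroth_order_grad(loss_fn, w, eps=1e-3, seed=0):
    """Two-point zeroth-order gradient estimate (SPSA-style).

    Needs only two forward evaluations of the loss -- no backward
    pass, so none of backpropagation's activation memory. A sketch
    of the generic technique; the paper's estimator may differ.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=w.shape)          # random probe direction
    g = (loss_fn(w + eps * z) - loss_fn(w - eps * z)) / (2 * eps)
    return g * z                          # gradient estimate along z

# Toy quadratic loss; its true gradient at w is 2 * w.
loss = lambda w: np.sum(w ** 2)
w = np.array([1.0, -2.0, 3.0])
est = zeroth_order_grad(loss, w)
```

In expectation over many random probe directions, this estimate points the same way as the true gradient, which is why perturb-and-evaluate training can work without ever running a backward pass.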
Finally, small updates allow for more frequent synchronization of data, which is another breakthrough, as it keeps models up to date. “Because updates are so tiny, data can be now sent back and forth more often,” says Xu. “The result is a much better shared model.”
This new approach substantially reduces computational and communication costs, helping make advanced AI adaptation more feasible for resource-constrained institutions, researchers say. Their work will also support more equitable deployment of AI in domains such as healthcare, education and cross-institutional collaboration, where centralized data collection is often difficult to achieve due to privacy and other issues.
About Stevens Institute of Technology
Stevens is a premier, private research university situated in Hoboken, New Jersey. Since our founding in 1870, technological innovation has been the hallmark of Stevens’ education and research. Within the university’s three schools and one college, more than 8,000 undergraduate and graduate students collaborate closely with faculty in an interdisciplinary, student-centric, entrepreneurial environment. Academic and research programs spanning business, computing, engineering, the arts and other disciplines actively advance the frontiers of science and leverage technology to confront our most pressing global challenges. The university continues to be consistently ranked among the nation’s leaders in career services, post-graduation salaries of alumni and return on tuition investment.
Article Title
Mitigating Non-IID Drift in Zeroth-Order Federated LLM Fine-Tuning with Transferable Sparsity