Tuesday, May 05, 2026

 

No digital content is safe from generative AI, researchers say

Cybersecurity researchers discovered that simple artificial intelligence tools can defeat security techniques meant to protect authentic content from use in deepfakes, facial identity theft, or artistic style mimicry




Virginia Tech

(From left) Peng Gao and Bimal Viswanath. Credit: Photos by Tonia Moxley for Virginia Tech.





A research team led by Virginia Tech cybersecurity expert Bimal Viswanath has found a critical blind spot in today's image protection techniques designed to prevent bad actors from stealing online content for unauthorized artificial intelligence training, style mimicry, and deepfake manipulations.

The team found that attackers can defeat these protections using off-the-shelf artificial intelligence (AI) models and simple text prompts. Furthermore, “there is currently no foolproof, mathematically guaranteed way for users to protect publicly posted images against an adversary using off-the-shelf GenAI models,” Viswanath said.

The work was presented at the Fourth IEEE Conference on Secure and Trustworthy Machine Learning in Munich, Germany. The authors include Viswanath, doctoral students Xavier Pleimling and Sifat Muhammad Abdullah, Assistant Professor Peng Gao, Murtuza Jadliwala of the University of Texas at San Antonio, and Gunjan Balde and Mainack Mondal of the Indian Institute of Technology Kharagpur.

As AI tools become more powerful and accessible, this work highlights the growing need for stronger cybersecurity, trustworthy AI, privacy, and digital forensics protections.

GenAI makes fraud easier

Previously, fraudsters needed to use specialized, purpose-built methods to circumvent image security techniques that made it difficult for bad actors to use authentic content for deepfakes, facial identity theft, or artistic style mimicry.

"But using today’s off-the-shelf, image-to-image generative AI models and a simple text prompt, our researchers easily and effectively removed a wide range of these protections,” Viswanath said.

They demonstrated this security weakness across eight case studies spanning six diverse protection schemes. The vulnerabilities affect a wide spectrum of defenses, including perturbations meant to protect specific semantic properties, such as a person's facial identity; invisible “protective noise” applied through an AI model's latent space; and even robust protections specifically designed to survive downstream fine-tuning.

“Our general-purpose attack not only circumvents these defenses but actually outperforms existing specialized attacks, while preserving the image's utility for the adversary," Viswanath said.
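The paper's actual attack code is not reproduced here, but the general recipe the team describes (passing a protected image through an off-the-shelf image-to-image diffusion model with a plain text prompt) can be sketched in a few lines with a public library. The sketch below uses the open-source diffusers package; the model ID, prompt, and strength setting are illustrative assumptions, not the researchers' actual configuration.

    # Hypothetical sketch of a text-guided image-to-image "purification" pass.
    # Regenerating the image this way can wash out fragile protective
    # perturbations; every setting here is an illustrative assumption.
    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",   # any off-the-shelf img2img model
        torch_dtype=torch.float16,
    ).to("cuda")

    protected = Image.open("protected_photo.png").convert("RGB")

    # A low strength keeps the output close to the original (preserving its
    # utility for the adversary) while the diffusion process overwrites the
    # imperceptible protective noise.
    result = pipe(
        prompt="a high-quality photo",      # a simple, generic text prompt
        image=protected,
        strength=0.3,
        guidance_scale=7.5,
    ).images[0]

    result.save("purified_photo.png")

The point of the sketch is how little expertise it assumes: no knowledge of which protection scheme was applied, and no purpose-built removal tool.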

Racing to solve a growing problem

This work has exposed a critical and widespread vulnerability in the current landscape of image protection, proving that simply adding imperceptible protective noise to an image is no longer enough to stop data scrapers and forgers.

“It is especially concerning because current security methods can give a false sense of security,” Viswanath said. “We urgently need to develop robust defenses and establish that any future protection mechanism can defend against attacks from off-the-shelf generative AI models.”

This means the cybersecurity community must wholly re-evaluate its approach to secure visual content.

“Any future protection mechanism must be strictly benchmarked against simple, text-guided attacks from widely available, off-the-shelf GenAI models, not just against specialized, purpose-built attacks,” Viswanath said. “Researchers should also note that GenAI image-to-image models will continue to improve over time, potentially making defense efforts harder.”


A toy model to understand how AI learns




A simple physics-inspired model sheds light on how neural networks learn, offering new clues to the surprising efficiency and stability of modern AI systems




Sissa Medialab





Artificial intelligence systems based on neural networks — such as ChatGPT, Claude, DeepSeek or Gemini — are extraordinarily powerful, yet their internal workings remain largely a “black box”. To better understand how these systems produce their responses, a group of physicists at Harvard University has developed a simplified model of learning in neural networks that can be analysed exactly with the tools of statistical physics.

“Toy models”, like the one presented in the study just published in the Journal of Statistical Mechanics: Theory and Experiment (JSTAT), provide researchers with a controlled theoretical laboratory for investigating the fundamental mechanisms of neural networks. A deeper understanding of how these systems work could help design artificial intelligence systems that are more efficient and reliable, while also addressing some of the current challenges.

The laws of AI

It’s a bit like when Kepler described the laws governing the motion of the planets. “The way Newton’s laws of gravity were discovered was first by identifying scaling laws between the orbital periods of planets and their radii,” explains Alexander Atanasov, a PhD student in theoretical physics at Harvard University and first author of the new study. Kepler formulated his laws by observing planetary motion, without fully understanding the mechanisms behind it. Yet that work proved crucial: it later enabled Newton to uncover gravity, leading to a much deeper understanding of the universe.

In studies of deep learning—the branch of artificial intelligence based on neural networks—we may still be in a similar Keplerian phase. Today researchers have identified several empirical laws that describe how neural networks behave, but we still lack a kind of “theory of gravity” explaining why they behave that way.

Scientists, for example, know about the scaling laws. “We know that if we take a model and make it bigger, or give it more data, its performance increases,” explains Cengiz Pehlevan, Associate Professor of Applied Mathematics at Harvard University and senior author of the study. These laws make performance predictable, but they do not yet reveal the deeper mechanisms behind it. Scaling models up by brute force is not only inefficient—today’s AI systems consume enormous amounts of energy—but it also does little to advance our understanding of how these systems actually work.
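The study itself does not hinge on any one scaling law, but a concrete example helps. One widely cited empirical form (fitted by Hoffmann and colleagues in the “Chinchilla” analysis, quoted here only as an illustration and not taken from this paper) writes the loss L of a model with N parameters trained on D tokens as

    L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}}

where E, A, B, α and β are constants fitted to experiments: a Kepler-style regularity that describes the data well, with no Newton-style derivation yet behind it.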

Neural networks as biological organisms

“Deep learning models are not algorithms written by hand as a set of rules. They’re not engineered manually,” explains Atanasov. “It’s much more similar to an organism being grown in a lab.”

Generative AI chatbots rely on neural networks, a technology that — in a very distant way — resembles the functioning of a biological brain. They consist of many small processing units, called artificial neurons, each performing simple operations but connected together in a complex network.

It is this networked structure that allows “intelligent” behaviour to emerge. Although we know the mathematical operations performed by each individual component, predicting and mechanistically explaining the behaviour of the system as a whole remains extremely difficult: as the number of components grows, the complexity increases rapidly. 

A toy model

Since it is currently impossible to analyse a full-scale neural network with exact mathematical methods, Atanasov and his colleagues chose to work with a simplified model that still captures many key features of more complex systems.

“The model we’re studying is simple enough to be solved mathematically,” explains Jacob Zavatone-Veth, Junior Fellow at the Harvard Society of Fellows and co-author of the study. “At the same time, it reproduces several of the key phenomena seen in large neural networks.”

The toy model used in the study is ridge regression, a variant of linear regression. 

Linear regression is a statistical method used to estimate relationships between variables. For example, if we know the height and weight of 100 people, we can use linear regression to identify a mathematical relationship between the two and estimate the height of a new person based only on their weight.
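A minimal sketch of that height-and-weight example, using standard Python tools (the data here are synthetic and purely illustrative, not from the study), also shows the one ingredient ridge regression adds: a penalty, alpha below, that shrinks the fitted coefficients.

    # Linear vs. ridge regression on made-up height/weight data.
    # Synthetic, illustrative data only; not code or data from the study.
    import numpy as np
    from sklearn.linear_model import LinearRegression, Ridge

    rng = np.random.default_rng(0)
    weight = rng.uniform(50, 100, size=(100, 1))                 # kg
    height = 100 + 0.9 * weight[:, 0] + rng.normal(0, 4, 100)    # cm, noisy rule

    linear = LinearRegression().fit(weight, height)
    ridge = Ridge(alpha=1.0).fit(weight, height)   # alpha is the ridge penalty

    new_person = np.array([[72.0]])                # a 72 kg person
    print(linear.predict(new_person))              # estimated height, plain fit
    print(ridge.predict(new_person))               # estimated height, ridge fit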

The mystery of overfitting — and why it often doesn’t happen

Ridge regression is designed to curb the phenomenon known as overfitting. When trained on a large dataset, a neural network — a bit like a very diligent but not particularly insightful student — may end up simply memorising the training data instead of learning patterns that allow it to generalise and make reliable predictions on new data.

Yet deep learning models often behave in a surprising way. “Despite being extremely large, these models can learn from the data without overfitting,” explains Atanasov, calling it “one of the great mysteries of deep learning.”

At first glance this seems counterintuitive: in theory, larger models should be more prone to overfitting. Instead, the scaling laws show that performance often improves as models grow and more data are used during training.
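The stabilising role of the ridge penalty can be seen directly in an over-parameterised toy experiment. The following sketch (synthetic data, illustrative only, not the authors' experiment) fits 200 coefficients from just 50 noisy training points and compares an unregularised least-squares fit with a ridge fit on held-out data; the ridge version typically generalises markedly better.

    # Over-parameterised regression: 200 features, only 50 training samples.
    # Synthetic and illustrative only; not the authors' experiment.
    import numpy as np

    rng = np.random.default_rng(1)
    n_train, n_test, d = 50, 500, 200

    w_true = rng.normal(size=d) / np.sqrt(d)       # hidden "teacher" rule
    X_train = rng.normal(size=(n_train, d))
    X_test = rng.normal(size=(n_test, d))
    y_train = X_train @ w_true + 0.5 * rng.normal(size=n_train)  # noisy labels
    y_test = X_test @ w_true

    # Unregularised fit: minimum-norm least squares interpolates the noise.
    w_ols = np.linalg.pinv(X_train) @ y_train

    # Ridge fit: the penalty lam shrinks coefficients and damps the noise.
    lam = 1.0
    w_ridge = np.linalg.solve(X_train.T @ X_train + lam * np.eye(d),
                              X_train.T @ y_train)

    for name, w in [("least squares", w_ols), ("ridge", w_ridge)]:
        print(name, "test error:", np.mean((X_test @ w - y_test) ** 2))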

New insights

The new study offers one possible piece of the missing explanation. According to the researchers, the ability of neural networks to learn without overfitting may arise from principles related to renormalization theory, a framework widely used in statistical physics.

To see why, it helps to consider the dimensionality of the data processed by modern AI systems. In the earlier example of linear regression we considered only two variables — height and weight. Real systems such as ChatGPT, however, operate in spaces with thousands or even millions of variables, making an exact mathematical analysis extremely difficult.

Here ideas from statistical physics become useful. In very high-dimensional data, small random variations — known as statistical fluctuations — naturally appear. Renormalization theory shows that many microscopic details can be effectively absorbed into a small number of parameters, meaning that even very complex systems can display relatively simple large-scale behaviour.
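To give a flavour of what “absorbed into a small number of parameters” means here: in standard random-matrix analyses of ridge regression (a textbook-style result of that literature, not an equation quoted from the JSTAT paper), the full spectrum of the data's covariance matrix Σ enters the predicted errors only through a single renormalised ridge parameter κ, fixed by a self-consistency condition of the form

    \kappa = \lambda + \frac{\kappa}{n} \, \operatorname{tr}\left[ \Sigma \left( \Sigma + \kappa I \right)^{-1} \right]

where λ is the explicit ridge penalty and n the number of training samples. All the microscopic detail of the data distribution is summarised in one effective constant, which is the kind of simplification renormalization theory formalises.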

Using this framework and their simplified toy model, the researchers show how these high-dimensional fluctuations can actually stabilise learning rather than destabilise it.

“This is something we can understand by analysing simpler linear models,” explains Pehlevan, suggesting that the same mechanism may account for why current neural networks avoid overfitting even when they are highly over-parameterised.

The simplified model may also serve another purpose. As Zavatone-Veth notes, it could be a kind of baseline for understanding how learning might behave in very high-dimensional systems. By studying a model that is simple enough to analyse mathematically, researchers can identify which aspects of learning are likely to be generic—that is, expected to appear across many different neural networks—and which instead depend on the details of a specific model. In this sense, studies like this may help clarify some of the more fundamental principles underlying learning in complex systems.
 
