Why AI can’t understand a flower the way humans do
AI needs to touch, feel and smell to have a sense of the world
Even with all its training and computer power, an artificial intelligence (AI) tool like ChatGPT can’t represent the concept of a flower the way a human does, according to a new study.
That’s because the large language models (LLMs) that power AI assistants are usually trained on language alone, and sometimes on images as well.
“A large language model can’t smell a rose, touch the petals of a daisy or walk through a field of wildflowers,” said Qihui Xu, lead author of the study and postdoctoral researcher in psychology at The Ohio State University.
“Without those sensory and motor experiences, it can’t truly represent what a flower is in all its richness. The same is true of some other human concepts.”
The study was published this week in the journal Nature Human Behaviour.
Xu said the findings have implications for how AI and humans relate to each other.
“If AI construes the world in a fundamentally different way from humans, it could affect how it interacts with us,” she said.
Xu and her colleagues compared humans and LLMs in their knowledge representation of 4,442 words – everything from “flower” and “hoof” to “humorous” and “swing.”
They compared the similarity of representations between humans and two state-of-the-art LLM families from OpenAI (GPT-3.5 and GPT-4) and Google (PaLM and Gemini).
Humans and LLMs were tested on two measures. One, the Glasgow Norms, asks for ratings of words on nine dimensions, such as arousal, concreteness and imageability. For example, the measure asks for ratings of how emotionally arousing a flower is, and how easily one can mentally visualize a flower (how imageable it is).
The other measure, the Lancaster Norms, examined how word concepts are related to sensory information (such as touch, hearing, smell and vision) and to motor information, which involves actions performed with the mouth, hand, arm and torso.
For example, the measure asks for ratings on how much one experiences flowers by smelling, and how much one experiences flowers using actions from the torso.
The goal was to see how the LLMs and humans were aligned in their ratings of the words. In one analysis, the researchers examined how much humans and AI were correlated on concepts. For example, do the LLMs and humans agree that some concepts have higher emotional arousal than others?
In a second analysis, researchers investigated how humans compared to LLMs on deciding how different dimensions may jointly contribute to a word’s overall conceptual representation and how different words are interconnected.
For example, the concepts of pasta and roses might both receive high ratings for how much they involve the sense of smell. However, pasta is considered more similar to noodles than to roses – at least for humans – not just because of its smell, but also its visual appearance and taste.
Overall, the LLMs aligned closely with humans in representing words that have no connection to the senses or to motor actions. But for words tied to things we see, taste or interact with using our bodies, the AI failed to capture human concepts.
“From the intense aroma of a flower, the vivid silky touch when we caress petals, to the profound joy evoked, human representation of ‘flower’ binds these diverse experiences and interactions into a coherent category,” the researchers say in the paper.
The issue is that most LLMs are dependent on language, and “language by itself can’t fully recover conceptual representation in all its richness,” Xu said.
Even though LLMs can approximate some human concepts, particularly when they don’t involve senses or motor actions, this kind of learning is not efficient.
“They obtain what they know by consuming vast amounts of text – orders of magnitude larger than what a human is exposed to in their entire lifetimes – and still can’t quite capture some concepts the way humans do,” Xu said.
“The human experience is far richer than words alone can hold.”
But Xu noted that LLMs are continually improving, and it’s likely they will get better at capturing human concepts. The study found that LLMs trained on images as well as text did better than text-only models at representing concepts related to vision.
And when future LLMs are augmented with sensor data and robotics, they may be able to actively make inferences about and act upon the physical world, she said.
Co-authors on the study were Yingying Peng, Ping Li and Minghua Wu of the Hong Kong Polytechnic University; Samuel Nastase of Princeton University; and Martin Chodorow of the City University of New York.
Journal
Nature Human Behaviour
Method of Research
Data/statistical analysis
Subject of Research
People
Article Title
'Large language models without grounding recover non-sensorimotor but not sensorimotor features of human concepts'
Article Publication Date
4-Jun-2025
AI-generated memes funnier on average, but humans' gags rated funniest
KTH Royal Institute of Technology
Can AI do humor? A new study suggests artificial intelligence can create internet memes as funny as those made by humans. But when it comes to gags that truly connect with viewers, people still have the edge.
Researchers from KTH Royal Institute of Technology, LMU Munich, and TU Darmstadt recently conducted the first large-scale study exploring how humans and AI collaborate to create internet memes. The team compared three groups: humans working alone, humans co-creating with a state-of-the-art large language model (LLM), and the LLM generating memes entirely on its own.
Participants created memes using classic templates like Doge, Futurama Fry, and Boromir’s iconic “One does not simply…” line. A second group of nearly 100 people then rated the memes for creativity, humor, and shareability.
The researchers found that on average, memes made entirely by AI scored higher than those made by humans or human-AI teams. But the top-performing memes told a different story: humans were funniest, while human-AI collaborations stood out in creativity and shareability.
Published in the ACM Digital Library, the paper was presented at the 2025 International Conference on Intelligent User Interfaces in Cagliari, Italy.
“AI is great at generating lots of ideas quickly,” says the study’s co-author, Zhikun Wu, a master’s candidate at KTH Royal Institute of Technology. “But quantity doesn’t always mean quality.”
Able to draw from vast datasets, AI models can produce content that appeals to a wide audience, the authors wrote. But many of the top-rated memes were created with human involvement, suggesting that AI models primarily produce “solid but average quality.”
“The best results came when humans curated and refined what the AI produced,” Wu says.
Participants who worked with the AI assistant generated more ideas and reported less effort—but many didn’t fully engage with the system. Fewer than half interacted with the AI more than once, and only a handful used it iteratively. This limited use may have held back the potential of true co-creativity.
The study highlighted a key challenge in human-AI collaboration: while AI can produce content that appeals to a broad audience, human creativity is still essential for content that resonates deeply—especially in humor, Wu says.
“Humor isn’t just about punchlines,” Wu says. “It’s about surprise, cultural context, and emotional nuance—things AI doesn’t fully grasp.”
The researchers argue that future AI tools should better support iterative, dialog-based creativity, helping users stay connected to their work while amplifying their ideas. In other words, systems shouldn’t just generate content, but help people shape it into something meaningful.
“While AI can increase productivity and produce content that appeals to a wide audience, human creativity is still key for creating content that connects more deeply in certain areas,” the authors wrote.
A meme rated among the best created through human-AI collaboration in the study.
Credit
Zhikun Wu, Thomas Weber, Florian Müller
Method of Research
Observational study