A.I.
Researchers in Portugal develop an image analysis AI platform to boost worldwide research
DL4MicEverywhere empowers life scientists to harness cutting-edge deep learning techniques for biomedical research
INSTITUTO GULBENKIAN DE CIENCIA
A team of researchers from the Instituto Gulbenkian de Ciência (IGC) in Portugal, together with Åbo Akademi University in Finland, the AI4Life consortium, and other collaborators, have developed an innovative open-source platform called DL4MicEverywhere, published today in the journal Nature Methods*. This platform provides life scientists with easy access to advanced artificial intelligence (AI) for the analysis of microscopy images. It enables researchers, regardless of their computational expertise, to easily train and use deep learning models on their own data.
Deep learning, a subfield of AI, has revolutionised the analysis of large and complex microscopy datasets, allowing scientists to automatically identify, track and analyse cells and subcellular structures. However, a lack of computing resources and AI expertise prevents some researchers in the life sciences from taking advantage of these powerful techniques in their own work. DL4MicEverywhere addresses these challenges by providing an intuitive interface that lets researchers apply deep learning models to any experiment requiring image analysis, on diverse computing infrastructures, from simple laptops to high-performance clusters.
"Our platform establishes a bridge between AI technological advances and biomedical research," said Iván Hidalgo-Cenalmor, first author of the study and researcher at the IGC. "With it, researchers gain access to cutting-edge microscopy methods regardless of their expertise in AI, enabling them to automatically analyse their results and potentially discover new biological insights."
The DL4MicEverywhere platform builds upon the team's previous work, ZeroCostDL4Mic, to allow the training and use of models across various computational environments. The platform also includes a user-friendly interface and expands the collection of available methodologies that users can apply to common microscopy image analysis tasks.
"DL4MicEverywhere aims to democratise AI for microscopy by promoting community contributions and adhering to FAIR principles for scientific research software - making resources findable, accessible, interoperable and reusable", explained Dr. Estibaliz Gómez-de-Mariscal, co-lead of the study and researcher at IGC. "We hope this platform will empower researchers worldwide to harness these powerful techniques in their work, regardless of their resources or expertise".
The development of DL4MicEverywhere is a prime example of collaborative science. First, it was developed to allow any researcher worldwide to take advantage of the most advanced technologies in microscopy, helping to accelerate scientific discoveries. Second, it was made possible only through an international collaboration of experts in computer science, image analysis, and microscopy, with key contributions from the AI4Life consortium. The project was co-led by Ricardo Henriques at the IGC and Guillaume Jacquemet at Åbo Akademi University.
"This work represents an important milestone in making AI more accessible and reusable for the microscopy community", said Professor Jacquemet. "By enabling researchers to share their models and analysis pipelines easily, we can accelerate discoveries and enhance reproducibility in biomedical research".
"DL4MicEverywhere has the potential to be transformative for the life sciences," added Professor Henriques. "It aligns with our vision in AI4Life to develop sustainable AI solutions that empower researchers and drive innovation in healthcare and beyond".
The DL4MicEverywhere platform is freely available as an open-source resource, reflecting the teams' commitment to open science and reproducibility. The researchers believe that by lowering the barriers to advanced microscopy image analysis, DL4MicEverywhere will enable breakthrough discoveries in fields ranging from basic cell biology to drug discovery and personalised medicine.
*Iván Hidalgo-Cenalmor, Joanna W Pylvänäinen, Mariana G Ferreira, Craig T Russell, Ignacio Arganda-Carreras, AI4Life Consortium, Guillaume Jacquemet, Ricardo Henriques, Estibaliz Gómez-de-Mariscal (2024) DL4MicEverywhere: Deep learning for microscopy made flexible, shareable, and reproducible. Nature Methods. DOI: 10.1038/s41592-024-02295-6
JOURNAL
Nature Methods
DOI
10.1038/s41592-024-02295-6
METHOD OF RESEARCH
Imaging analysis
SUBJECT OF RESEARCH
Not applicable
ARTICLE TITLE
DL4MicEverywhere: deep learning for microscopy made flexible, shareable and reproducible
ARTICLE PUBLICATION DATE
17-May-2024
Model disgorgement: the key to fixing AI bias and copyright infringement?
By now, the challenges posed by generative AI are no secret. Models like OpenAI’s ChatGPT, Anthropic’s Claude and Meta’s Llama have been known to “hallucinate,” inventing potentially misleading responses, as well as divulge sensitive information, like copyrighted materials.
One potential solution to some of these issues is “model disgorgement,” a set of techniques that force models to purge themselves of content that leads to copyright infringement or biased responses.
In a recent paper in Proceedings of the National Academy of Sciences (PNAS), Michael Kearns, National Center Professor of Management & Technology in Computer and Information Science (CIS), and three fellow researchers at Amazon share their perspective on the potential for model disgorgement to solve some of the issues facing AI models today.
In the following Q&A, Kearns discusses the paper and its implications for improving AI.
What is model disgorgement?
Model disgorgement is the name for a broad set of techniques and the problems that those techniques are trying to solve. The goal is to mitigate or eradicate the effects of particular pieces of training data from the behavior of a trained model.
You expect individual pieces of training data or collections of training data to influence the behavior of the model. But this can lead to privacy leaks, copyright violations and other issues that aren’t covered by the law yet.
How is model disgorgement different from efforts to ensure data privacy, like Europe’s General Data Protection Regulation?
These are different but related concerns. If I ask Facebook to delete all of my stored Facebook activity from their servers, the GDPR requires that to be done on request.
Laws like the GDPR are less clear about what happens before your data is deleted. Your data was used to train a predictive model, and that predictive model is still out there, operating in the world. That model will still have been trained on your data even after your data is deleted from Facebook’s servers. This can lead to a number of problems.
For one, if your data was private, a third-party adversary might be able to reverse-engineer sensitive aspects of your private data. This is certainly an instance where you would want model disgorgement techniques to remove that sensitive data from the model.
In addition, there are also issues with copyright, as we’re seeing in The New York Times’ lawsuit against OpenAI. ChatGPT can regurgitate verbatim copyrighted articles from the Times. It’s pretty clear that OpenAI used those articles in training ChatGPT.
To be clear, the Times doesn’t want those articles to be private; it wants them to be accessible to the public. But it also wants to control their use and reproduction.
Finally, there’s another issue that I might call ‘stylistic infringement,’ where a user can say, ‘Give me a painting in the style of Andy Warhol of a cat skateboarding in Rittenhouse Square.’ The model is able to do a good job because it’s been trained on the entire output of Andy Warhol’s career. If you’re the executor of Andy Warhol’s estate, you might take issue with this.
Even though these are very different issues, the technical ways of addressing them are quite similar, and involve model disgorgement techniques. In other words, it’s not that model disgorgement is different from efforts to ensure data privacy, it’s more that model disgorgement techniques can be used in certain situations where current approaches to privacy like the GDPR fall short.
The Ethical Algorithm, which you co-wrote with Aaron Roth, Henry Salvatori Professor of Computer & Cognitive Science in CIS, and which you recently referenced in the context of AI, describes how to embed ethical considerations into algorithm design. Would that approach be feasible with AI models?
When we wrote the book, generative AI didn’t exist, at least not like it does today. Our book focused on traditional machine learning, which involves more targeted predictions—like taking the information on a loan application and coming up with an assessment of the risk that a particular person would default if given a loan.
When an application is that targeted, it becomes much more feasible to bake into the training process defenses against various harms that you’re concerned about, like demographic bias in the performance of the model or leaking the private training data.
For now, we’ve lost that ability in training generative models because of the extreme open-ended nature of their outputs.
Would it be possible to filter the training data for AI models to reduce the likelihood of biased or copyright-breaching responses?
That’s hard for a few reasons.
The way you train a competitive large language model is by scraping the entire internet—literally. That’s table stakes. You also need a lot of other more proprietary data sources. When that is the starting point, there’s so much you don’t know about your training data.
In principle, we know how to train huge neural networks in a way that will avoid all of these problems. You can train a neural network under the constraint of differential privacy, a method of intentionally corrupting data to shield private information, for instance, and fewer of these problems will occur.
Nobody’s tried. I think the general feeling is that the degradation in performance you would get by training a large language model under the constraint of differential privacy would kind of obviate the point in the first place.
In other words, the quality would be so bad that you’d start generating nonsensical, nongrammatical outputs. The amount of noise that you would need to add to the training process, which is how differential privacy works—it just wouldn’t work at scale.
Can you provide a few examples of model disgorgement techniques? How do they work?
One conceptually straightforward solution is retraining from scratch. This is clearly infeasible given the scale and size of these networks and the compute time and resources it takes to train them. At the same time, retraining is kind of a gold standard—what you would like to achieve in a more efficient, scalable way.
Then there are “algorithmic” solutions. One of these is machine “unlearning.” Instead of retraining the whole network, we could just modify it in some way that mitigates or reduces the effects of your data on the training process.
Another algorithmic approach is training under the constraint of differential privacy: adding noise to the training process in a way that minimizes the effects of any particular piece of training data, while still letting you use the aggregate properties of the data set.
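For readers who want a concrete picture, here is a minimal, self-contained sketch of the kind of noisy training Kearns describes, in the style of DP-SGD: clip each example's gradient so no single record can dominate an update, then add Gaussian noise before applying it. The toy model, synthetic data, and hyperparameters are illustrative assumptions, not anything specified in the PNAS paper.

```python
# Minimal DP-SGD-style sketch: per-example gradient clipping plus Gaussian noise.
# Toy model and synthetic data; hyperparameters are illustrative only.
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(20, 2)                       # toy classifier
loss_fn = nn.CrossEntropyLoss()
clip_norm, noise_multiplier, lr = 1.0, 1.1, 0.1

X = torch.randn(256, 20)                       # synthetic training data
y = torch.randint(0, 2, (256,))

for step in range(50):
    idx = torch.randint(0, len(X), (32,)).tolist()   # sample a minibatch
    per_example_grads = []
    for i in idx:                              # per-example gradients (naive loop)
        model.zero_grad()
        loss = loss_fn(model(X[i:i+1]), y[i:i+1])
        loss.backward()
        grads = torch.cat([p.grad.flatten() for p in model.parameters()])
        # Clip so no single example can move the model too much
        grads = grads * min(1.0, clip_norm / (grads.norm().item() + 1e-12))
        per_example_grads.append(grads)
    summed = torch.stack(per_example_grads).sum(dim=0)
    # Gaussian noise masks any individual example's contribution
    noisy = (summed + torch.randn_like(summed) * noise_multiplier * clip_norm) / len(idx)
    with torch.no_grad():                      # manual SGD update
        offset = 0
        for p in model.parameters():
            n = p.numel()
            p -= lr * noisy[offset:offset + n].view_as(p)
            offset += n
```

The clipping bound and noise level together determine the privacy guarantee; as Kearns notes above, pushing the noise high enough for meaningful guarantees is what degrades quality at large-language-model scale.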
Then there are what I might call system-level techniques. One of these is “sharding.” If I divided my training data into 100 “shards,” I could train a different model on each of those 100 shards and then produce an overall model by averaging those 100 models.
If we’re lucky enough that your data was only in one of those 100 shards, and you wanted to remove your data, we could just remove that model entirely from the average. Or we could retrain just that model, which used only one percent of the overall data.
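As a rough illustration of that system-level idea, the sketch below assigns synthetic data to shards, trains one small model per shard, averages their predictions, and "forgets" a contributor by retraining only the shard that held their records. The scikit-learn models and synthetic dataset are stand-ins; the paper does not prescribe this particular implementation.

```python
# Sharding sketch: one model per shard, averaged predictions, and removal of a
# contributor's data by retraining only the affected shard. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

NUM_SHARDS = 10
shard_of = rng.integers(0, NUM_SHARDS, size=len(X))   # which shard each example lands in

def train_shard(s):
    mask = shard_of == s
    return LogisticRegression().fit(X[mask], y[mask])

models = [train_shard(s) for s in range(NUM_SHARDS)]

def ensemble_predict(x):
    # Average the per-shard probabilities; this plays the role of the "overall model".
    probs = np.mean([m.predict_proba(x)[:, 1] for m in models], axis=0)
    return (probs > 0.5).astype(int)

preds = ensemble_predict(X[:5])

# To remove examples 100-119 (say, one contributor's data), retrain only their shard(s).
to_forget = np.arange(100, 120)
keep = np.ones(len(X), dtype=bool)
keep[to_forget] = False
for s in set(shard_of[to_forget]):
    models[s] = LogisticRegression().fit(X[(shard_of == s) & keep], y[(shard_of == s) & keep])
```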
Your data’s contribution to something like ChatGPT is quite minuscule. If you did a sharding approach, your data would likely fall entirely within one, maybe at most two, of these 100 shards.
The bigger concern is for really large data sets. How do you make sure that every organization whose data you’re using is kind of only in one of the 100 shards?
To arrange this, you have to know what the organizations are in advance—and this gets back to my earlier point that often you don’t know what’s in your training data.
If my training data is some massive file, which is a crawl of the entire internet, and I break it into 100 pieces, I have no idea where Getty Images’ data might be distributed amongst those hundred pieces.
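If each record did carry a source label, one simple, hypothetical arrangement would be to hash that label to a shard, so that all of an organization's data always lands together. The sketch below shows the idea; the field names are invented, and as Kearns says, scraped corpora usually lack any such reliable label, which is exactly where the approach breaks down.

```python
# Hypothetical organization-aware sharding: hash a source label to a shard so one
# organization's records stay together. Field names are illustrative assumptions.
import hashlib

NUM_SHARDS = 100

def shard_for(source: str) -> int:
    # Deterministic: the same organization always maps to the same shard.
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

records = [
    {"source": "gettyimages.com", "text": "..."},
    {"source": "nytimes.com", "text": "..."},
    {"source": "gettyimages.com", "text": "..."},
]
shards = [[] for _ in range(NUM_SHARDS)]
for r in records:
    shards[shard_for(r["source"])].append(r)
```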
If we could go back in time and change the way the internet was designed, could we make sure that every piece of data online was tagged or identified with different levels of protection so that scraping the internet would yield metadata to inform what AI models can and can’t use in training?
My gut reaction is that this approach might help solve the problems that we’re discussing here, but would have possibly resulted in very different challenges elsewhere.
One of the great successes of the consumer internet was its openness and the lack of structure and rules for how data is organized and how data can cross reference other data. You could imagine setting up the rules differently. But you can also imagine the internet maybe never happening because it would just be too onerous to build on it.
The great success story of the internet has come from basically the lack of rules. You pay for the lack of rules, in the areas that we’re discussing here today.
Most people who think seriously about privacy and security would probably agree with me that a lot of the biggest problems in those topics come from the lack of rules, the design of the internet, but that’s also what made it so accessible and successful.
In short, it’s hard to avoid these trade-offs.
In your recent PNAS paper, you and your co-authors organize the model disgorgement methods discussed above into a taxonomy, classifying them according to when they take action and how they work. What do you hope the paper offers future researchers and industry professionals?
It’s a non-technical paper in many ways, and it’s meant for a broader audience. We hope that the paper will help frame thinking about these issues—in particular, the trade-offs among the different technical methods for model disgorgement. This felt like a topic that was important enough societally and nascent enough scientifically that it was a good time to kind of step up and survey the landscape.
JOURNAL
Proceedings of the National Academy of Sciences
DOI
METHOD OF RESEARCH
Literature review
SUBJECT OF RESEARCH
Not applicable
ARTICLE TITLE
AI model disgorgement: Methods and choices
ARTICLE PUBLICATION DATE
19-May-2024
Automated news video production is better with a human touch
AI-generated videos for short messages are only as well received as manually created ones if they are edited by humans.
News organizations—including Bloomberg, Reuters, and The Economist—have been using AI-powered video services to meet growing audience demand for audio-visual material. A study recently published in the journal Journalism now shows that the automated production of news videos is better with human supervision.
Technology providers like Wochit and Moovly allow publishers to mass-produce videos at scale. But what do audiences think of the results? Researchers led by LMU communication scientist Professor Neil Thurman have found that only automated videos that were post-edited by humans were as well liked as fully human-made videos.
“Our research shows that, on average, news consumers liked short-form, automated news videos as much as manually made ones, as long as the automation process involved human supervision”, says Neil Thurman, from LMU’s Department of Media and Communication.
Together with Dr. Sally Stares (London School of Economics) and Dr. Michael Koliska (Georgetown University), Thurman evaluated the reactions of 4,200 UK news consumers to human-made, highly-automated, and partly-automated videos covering a variety of topics, including Cristiano Ronaldo, Donald Trump, and the Wimbledon tennis championships. The partly-automated videos were post-edited by humans after the initial automation process.
The results show that there were no significant differences in how much news audiences liked the human-made and partly-automated videos overall. By contrast, highly-automated videos were liked significantly less. In other words, the results show that news video automation is better with human supervision.
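For readers curious what such a comparison looks like in practice, the toy sketch below runs a one-way ANOVA over simulated liking scores for the three video conditions. The numbers are invented for illustration; the study's actual data and population-based survey analysis are more involved.

```python
# Toy comparison of mean "liking" ratings across three video conditions.
# Synthetic scores only; not the study's data or analysis.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
human   = rng.normal(3.6, 1.0, 500)   # simulated 1-5 liking scores, human-made
partial = rng.normal(3.6, 1.0, 500)   # partly automated, post-edited by humans
auto    = rng.normal(3.3, 1.0, 500)   # highly automated

f_stat, p_value = stats.f_oneway(human, partial, auto)
print(f"ANOVA across conditions: F = {f_stat:.2f}, p = {p_value:.4f}")

# Pairwise follow-up, e.g. human-made vs. partly-automated:
t_stat, p_pair = stats.ttest_ind(human, partial)
print(f"human vs. partly-automated: t = {t_stat:.2f}, p = {p_pair:.4f}")
```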
According to Thurman, “one key takeaway of the study is that video automation output may be best when it comes in a hybrid form, meaning a human-machine collaboration. Such hybridity involves more human supervision, ensuring that automated video production maintains quality standards while taking advantage of computers’ strengths, such as speed and scale.”
JOURNAL
Journalism
DOI
ARTICLE TITLE
Audience evaluations of news videos made with various levels of automation: A population-based survey experiment
ARTICLE PUBLICATION DATE
8-May-2024
NUS researchers and industry partners demonstrate cutting-edge chip technology for ultra-low power AI connected devices
Dramatic improvements in chip energy efficiency will turbocharge Singapore’s AI and semiconductor industry with new capabilities in always-on AI devices
Researchers from NUS, together with industry partners Soitec and NXP Semiconductors, have demonstrated a new class of silicon systems that promises to enhance the energy efficiency of AI connected devices by leaps and bounds. These technological breakthroughs will significantly advance the capabilities of the semiconductor industry in Singapore and beyond.
This innovation has been demonstrated in fully-depleted silicon-on-insulator (FD-SOI) technology, and can be applied to the design and fabrication of advanced semiconductor components for AI applications. The new chip technology has the potential to extend the battery life of wearables and smart objects by a factor of 10, support intense computational workloads for use in Internet of Things applications, and halve the power consumption associated with wireless communications with the cloud.
The new suite of disruptive chip technologies will be promoted through the FD-SOI & IoT Industry Consortium to accelerate industry adoption by lowering the design barrier to entry in FD-SOI chips. An industry workshop titled “Next-gen energy-efficient FD-SOI systems” was held on 3 May 2024 for participants from the industry and research community to share and discuss the latest developments in FD-SOI technologies, and showcase the new capabilities with state-of-the-art demonstrations.
“IoT devices often operate on a very limited power budget, and hence require extremely low average power to efficiently perform regular tasks such as physical signal monitoring. At the same time, high peak performance is demanded to process occasional signal events with computationally-intensive AI algorithms. Our research uniquely allows us to simultaneously reduce the average power and improve the peak performance,” said Professor Massimo Alioto, who is from the NUS College of Design and Engineering’s Department of Electrical and Computer Engineering and is also the Director of the FD-fAbrICS (FD-SOI Always-on Intelligent & Connected Systems) joint lab where the new suite of technologies was engineered.
“The applications are wide-ranging and include smart cities, smart buildings, Industry 4.0, wearables and smart logistics. The remarkable energy improvements obtained in the FD-fAbrICS program are a game changer in the area of battery-powered AI devices, as they ultimately allow us to move intelligence from conventional cloud to smart miniaturised devices,” said Prof Alioto, who also heads the Green IC group (www.green-ic.org) at the Department of Electrical and Computer Engineering.
Powering AI devices with ultra-energy efficient chips
Research conducted by the NUS FD-fAbrICS joint lab showed that their FD-SOI chip technology can be deployed at scale with enhanced design and system integration productivity for lower cost, faster market reach, and rapid industry adoption.
“This innovation has the potential to accelerate the time to market for key players in Singapore’s semiconductor ecosystem,” said Prof Alioto. “We hope to facilitate the adoption and deployment of our design technologies at scale through the FD-SOI & IoT Industry Consortium. This is a significant contribution to the AI and semiconductor industry in Singapore, as it enables a competitive advantage while reducing the overall development cost of FD-SOI systems.”
The research breakthroughs from the NUS FD-fAbrICS joint lab leverage the combined NUS expertise and capabilities from different domains, such as digital circuits (Prof Massimo Alioto), wireless communications (Assoc Prof Heng Chun Huat), system architectures (Asst Prof Trevor Carlson), and AI models (Prof Li Haizhou). Industry leaders such as Soitec, NXP and Dolphin Design contributed to the research efforts at the joint lab, which is also supported by the Agency for Science, Technology and Research.
The NUS research team is now looking into developing new classes of intelligent and connected silicon systems that could support larger AI model sizes (“large models”) for generative AI applications. The resulting decentralisation of AI computation from cloud to distributed devices will simultaneously preserve privacy, keep latency at a minimum, and avoid wireless data deluge under the simultaneous presence of a plethora of devices.
Accelerating industry adoption of FD-SOI technologies
The industry workshop, which delved into the cutting-edge advancements and applications of FD-SOI technology, aimed to foster an environment of knowledge sharing as well as catalyse collaborations within, and between, the FD-SOI research community and the semiconductor industry in Singapore working on intelligent and connected silicon systems.
Another objective of the workshop was to facilitate rapid FD-SOI adoption and lower the design barrier to entry, by sharing the research outcomes from the FD-fAbrICS joint lab. Speakers from Soitec, GlobalFoundries, NXP, and the NUS FD-fAbrICS research team shared their perspectives on the current development of related technologies – for example, in manufacturing and microchip design – and future disruptive technologies for next-generation ultra-low power AI systems.
FD-SOI & IoT Industry Consortium
The FD-SOI & IoT Consortium was established to extend the impact of the NUS FD-fAbrICS joint lab on the semiconductor ecosystem in Singapore. Soitec and NXP are founding members of the Consortium.
Consortium members will have access to innovative FD-SOI design IP and methodologies, which will help to accelerate their next-generation prototyping and development cycle with highly energy efficient processes, especially in the fast-growing area of AI-connected chips.
The FD-SOI & IoT Consortium will support the near-term needs of industry for rapid technology road mapping and accelerated innovation cycle. At the same time, to assure sustained scalability and differentiation across the Consortium members in the longer term, the technologies developed in synergy with the FD-fAbrICS industry partners will be further expanded by some of the Consortium members.
ARTICLE TITLE
NUS researchers and industry partners demonstrate cutting-edge chip technology for ultra-low power AI connected devices
ARTICLE PUBLICATION DATE
17-May-2024
AI-powered headphones filter only unwanted noise #ASA186
Neural network categorizes ambient sounds, giving users the power to choose what to hear
OTTAWA, Ontario, May 16, 2024 – Noise-canceling headphones are a godsend for living and working in loud environments. They automatically identify background sounds and cancel them out for much-needed peace and quiet. However, typical noise-canceling fails to distinguish between unwanted background sounds and crucial information, leaving headphone users unaware of their surroundings.
Shyam Gollakota, from the University of Washington, is an expert in using AI tools for real-time audio processing. His team created a system for targeted speech hearing in noisy environments and developed AI-based headphones that selectively filter out specific sounds while preserving others. He will present his work Thursday, May 16, at 1:20 p.m. EDT as part of a joint meeting of the Acoustical Society of America and the Canadian Acoustical Association, running May 13-17 at the Shaw Centre located in downtown Ottawa, Ontario, Canada.
“Imagine you are in a park, admiring the sounds of chirping birds, but then you have the loud chatter of a nearby group of people who just can’t stop talking,” said Gollakota. “Now imagine if your headphones could grant you the ability to focus on the sounds of the birds while the rest of the noise just goes away. That is exactly what we set out to achieve with our system.”
Gollakota and his team combined noise-canceling technology with a smartphone-based neural network trained to identify 20 different environmental sound categories. These include alarm clocks, crying babies, sirens, car horns, and birdsong. When a user selects one or more of these categories, the software identifies and plays those sounds through the headphones in real time while filtering out everything else.
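As a rough illustration of that selection step, the sketch below classifies short audio frames into categories and mutes any frame whose category the user has not chosen. The stand-in classifier is random, and a real system must also separate overlapping sources and meet tight latency budgets, so this is only a toy picture of the gating logic, not the team's implementation.

```python
# Toy sketch of category-based gating: keep only frames whose predicted sound
# category the user selected. The classifier is a random stand-in.
import numpy as np

CATEGORIES = ["birdsong", "siren", "car_horn", "crying_baby", "alarm_clock", "speech"]

def classify_frame(frame: np.ndarray) -> str:
    # Placeholder for the trained neural network: predicts one category per frame.
    rng = np.random.default_rng(int(abs(frame).sum() * 1e6) % (2**32))
    return CATEGORIES[rng.integers(len(CATEGORIES))]

def filter_stream(frames: np.ndarray, keep: set) -> np.ndarray:
    # Mute every frame whose predicted category the user did not select.
    out = frames.copy()
    for i, frame in enumerate(frames):
        if classify_frame(frame) not in keep:
            out[i] = 0.0
    return out

# 100 frames of 10 ms audio at 16 kHz (160 samples each); user keeps only birdsong.
audio = np.random.default_rng(0).normal(size=(100, 160)).astype(np.float32)
cleaned = filter_stream(audio, keep={"birdsong"})
```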
Making this system work seamlessly was not an easy task, however.
“To achieve what we want, we first needed a high-level intelligence to identify all the different sounds in an environment,” said Gollakota. “Then, we needed to separate the target sounds from all the interfering noises. If this is not hard enough, whatever sounds we extracted needed to sync with the user’s visual senses, since they cannot be hearing someone two seconds too late. This means the neural network algorithms must process sounds in real time in under a hundredth of a second, which is what we achieved.”
The team employed this AI-powered approach to focus on human speech. Relying on similar content-aware techniques, their algorithm can identify a speaker and isolate their voice from ambient noise in real time for clearer conversations.
Gollakota is excited to be at the forefront of the next generation of audio devices.
“We have a very unique opportunity to create the future of intelligent hearables that can enhance human hearing capability and augment intelligence to make lives better,” said Gollakota.
###
----------------------- MORE MEETING INFORMATION -----------------------
Main meeting website: https://acousticalsociety.org/ottawa/
Technical program: https://eppro02.ativ.me/src/EventPilot/php/express/web/planner.php?id=ASASPRING24
ASA PRESS ROOM
In the coming weeks, ASA's Press Room will be updated with newsworthy stories and the press conference schedule at https://acoustics.org/asa-press-room/.
LAY LANGUAGE PAPERS
ASA will also share dozens of lay language papers about topics covered at the conference. Lay language papers are summaries (300-500 words) of presentations written by scientists for a general audience. They will be accompanied by photos, audio, and video. Learn more at https://acoustics.org/lay-language-papers/.
PRESS REGISTRATION
ASA will grant free registration to credentialed and professional freelance journalists. If you are a reporter and would like to attend the hybrid / in-person meeting or virtual press conferences, contact AIP Media Services at media@aip.org. For urgent requests, AIP staff can also help with setting up interviews and obtaining images, sound clips, or background information.
ABOUT THE ACOUSTICAL SOCIETY OF AMERICA
The Acoustical Society of America is the premier international scientific society in acoustics devoted to the science and technology of sound. Its 7,000 members worldwide represent a broad spectrum of the study of acoustics. ASA publications include The Journal of the Acoustical Society of America (the world's leading journal on acoustics), JASA Express Letters, Proceedings of Meetings on Acoustics, Acoustics Today magazine, books, and standards on acoustics. The society also holds two major scientific meetings each year. See https://acousticalsociety.org/.
ABOUT THE CANADIAN ACOUSTICAL ASSOCIATION/ASSOCIATION CANADIENNE D’ACOUSTIQUE
• fosters communication among people working in all areas of acoustics in Canada
• promotes the growth and practical application of knowledge in acoustics
• encourages education, research, protection of the environment, and employment in acoustics
• is an umbrella organization through which general issues in education, employment and research can be addressed at a national and multidisciplinary level
The CAA is a member society of the International Institute of Noise Control Engineering (I-INCE) and the International Commission for Acoustics (ICA) and is an affiliate society of the International Institute of Acoustics and Vibration (IIAV). Visit https://caa-aca.ca/.
###