Friday, May 05, 2023

University of Toronto researchers use generative AI to design novel proteins

Peer-Reviewed Publication

UNIVERSITY OF TORONTO

IMAGE: Professor Philip Kim and Jin Sub (Michael) Lee. CREDIT: University of Toronto

Researchers at the University of Toronto have developed an artificial intelligence system that can create proteins not found in nature using generative diffusion, the same technology behind popular image-creation platforms such as DALL-E and Midjourney.

The system will help advance the field of generative biology, which promises to speed drug development by making the design and testing of entirely new therapeutic proteins more efficient and flexible.

“Our model learns from image representations to generate fully new proteins, at a very high rate,” says Philip M. Kim, a professor in the Donnelly Centre for Cellular and Biomolecular Research at U of T’s Temerty Faculty of Medicine. “All our proteins appear to be biophysically real, meaning they fold into configurations that enable them to carry out specific functions within cells.” 

The findings, published today in the journal Nature Computational Science, are the first of their kind to appear in a peer-reviewed journal. Kim’s lab had posted a preprint on the model last summer through the open-access server bioRxiv, ahead of two similar preprints released last December: RF Diffusion from the University of Washington and Chroma from Generate Biomedicines.

Proteins are made from chains of amino acids that fold into three-dimensional shapes, which in turn dictate protein function. Those shapes evolved over billions of years and are varied and complex, but also limited in number. With a better understanding of how existing proteins fold, researchers have begun to design folding patterns not produced in nature.

But a major challenge, says Kim, has been to imagine folds that are both possible and functional. “It’s been very hard to predict which folds will be real and work in a protein structure,” says Kim, who is also a professor in the departments of molecular genetics and computer science at U of T. “By combining biophysics-based representations of protein structure with diffusion methods from the image generation space, we can begin to address this problem.”

The new system, which the researchers call ProteinSGM, draws from a large set of image-like representations of existing proteins that encode their structure accurately. The researchers feed these images into a generative diffusion model, which gradually adds noise until each image becomes all noise. The model tracks how the images become noisier and then runs the process in reverse, learning how to transform random pixels into clear images that correspond to fully novel proteins.
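
To make the add-noise-then-reverse process concrete, here is a minimal sketch of one diffusion training step on image-like inputs. It is illustrative only, under stated assumptions: ProteinSGM itself is a score-based model, and the tiny network, noise schedule and 64x64 single-channel 'protein image' shape below are placeholders, not the paper's implementation.

```python
# A minimal sketch of one diffusion training step on image-like protein
# representations. Placeholder assumptions throughout: ProteinSGM itself is a
# score-based (SDE) model, and the tiny conv net, schedule and 64x64
# single-channel image shape here are illustrative, not the paper's.
import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)             # noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)     # cumulative signal fraction

denoiser = nn.Sequential(                         # stand-in for a real U-Net
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
optimizer = torch.optim.Adam(denoiser.parameters(), lr=1e-4)

def train_step(x0):
    """x0: a batch of image-like protein representations, shape (B, 1, 64, 64)."""
    t = torch.randint(0, T, (x0.shape[0],))
    ab = alpha_bar[t].view(-1, 1, 1, 1)
    noise = torch.randn_like(x0)
    xt = ab.sqrt() * x0 + (1.0 - ab).sqrt() * noise  # forward: corrupt with noise
    predicted = denoiser(xt)    # reverse: learn to recover the added noise
    loss = ((predicted - noise) ** 2).mean()  # (a real model also conditions on t)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

train_step(torch.randn(8, 1, 64, 64))             # dummy batch just to run
```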

Jin Sub (Michael) Lee, a doctoral student in the Kim lab and first author on the paper, says that optimizing the early stage of this image generation process was one of the biggest challenges in creating ProteinSGM. “A key idea was the proper image-like representation of protein structure, such that the diffusion model can learn how to generate novel proteins accurately,” says Lee, who is from Vancouver but did his undergraduate degree in South Korea and master’s in Switzerland before choosing U of T for his doctorate.
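
As one concrete example of what an image-like representation of structure can be, the sketch below turns a protein's alpha-carbon coordinates into a pairwise distance matrix, a common 2D encoding of a 3D fold. The paper's actual multi-channel features may differ; this is an assumption for illustration.

```python
# One common image-like encoding of structure: the pairwise C-alpha distance
# matrix. An illustrative simplification -- ProteinSGM's actual feature
# channels are described in the paper and may differ.
import numpy as np

def distance_matrix(ca_coords: np.ndarray) -> np.ndarray:
    """ca_coords: (L, 3) alpha-carbon positions of an L-residue protein.
    Returns an (L, L) 'image' whose pixel (i, j) is the i-j residue distance."""
    diff = ca_coords[:, None, :] - ca_coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

dummy = np.random.rand(64, 3) * 30.0   # a fake 64-residue structure, in angstroms
image = distance_matrix(dummy)         # 64x64 grayscale "picture" of the fold
```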

Also difficult was validation of the proteins produced by ProteinSGM. The system generates many structures, often unlike anything found in nature. Almost all of them look real according to standard metrics, says Lee, but the researchers needed further proof.

To test their new proteins, Lee and his colleagues first turned to OmegaFold, a newer structure-prediction tool that builds on the advances of DeepMind’s AlphaFold 2. Both platforms use AI to predict the structure of proteins based on amino acid sequences.

With OmegaFold, the team confirmed that almost all of their novel sequences fold into the desired, and equally novel, protein structures. They then chose a smaller number to produce physically in test tubes, to confirm that the structures were real proteins and not just stray strings of chemical compounds.
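
In outline, this validation is a self-consistency check: predict a structure for each designed sequence, then measure how closely the prediction matches the structure the model intended. The sketch below is hedged accordingly: predict_structure is a hypothetical stand-in rather than OmegaFold's real API, and agreement is scored with a simple Kabsch-aligned RMSD, which may not be the exact metric used in the paper.

```python
# Self-consistency check, sketched. `predict_structure` is a hypothetical
# stand-in for a tool like OmegaFold (not its real API), and agreement is
# scored with Kabsch-aligned RMSD, which may differ from the paper's metric.
import numpy as np

def kabsch_rmsd(P: np.ndarray, Q: np.ndarray) -> float:
    """RMSD between two (L, 3) coordinate sets after optimal superposition."""
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(U @ Vt))          # guard against reflections
    R = U @ np.diag([1.0, 1.0, d]) @ Vt         # optimal rotation
    return float(np.sqrt(((P @ R - Q) ** 2).sum() / len(P)))

def predict_structure(sequence: str) -> np.ndarray:
    """Hypothetical stand-in for a structure predictor; returns (L, 3) coords.
    Dummy random output here, purely so the sketch runs end to end."""
    rng = np.random.default_rng(abs(hash(sequence)) % 2**32)
    return rng.random((len(sequence), 3)) * 30.0

def self_consistent(designed_coords, sequence, threshold_angstroms=2.0):
    predicted = predict_structure(sequence)
    return kabsch_rmsd(designed_coords, predicted) < threshold_angstroms
```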

“With matches in OmegaFold and experimental testing in the lab, we could be confident these were properly folded proteins. It was amazing to see validation of these fully new protein folds that don’t exist anywhere in nature,” Lee says.

Next steps based on this work include further development of ProteinSGM for antibodies and other proteins with the most therapeutic potential, Kim says. “This will be a very exciting area for research and entrepreneurship,” he adds. 

Lee says he would like to see generative biology move toward joint design of protein sequences and structures, including protein side-chain conformations. Most research to date has focused on the generation of backbones, the primary chemical structures that hold proteins together.

“Side-chain configurations ultimately determine protein function, and although designing them means an exponential increase in complexity, it may be possible with proper engineering,” Lee says. “We hope to find out.”

UMD leads new $20M NSF Institute for Trustworthy AI in Law and Society


The institute will take a holistic approach, integrating broader participation in artificial intelligence design, new technology development, and more informed governance of AI-infused systems

Grant and Award Announcement

UNIVERSITY OF MARYLAND

IMAGE: The NSF Institute for Trustworthy AI in Law & Society (TRAILS), led by the University of Maryland, was announced on May 4, 2023. CREDIT: NSF Institute for Trustworthy AI in Law & Society

The University of Maryland has been chosen to lead a multi-institutional effort supported by the National Science Foundation (NSF) that will develop new artificial intelligence (AI) technologies designed to promote trust and mitigate risks, while simultaneously empowering and educating the public.

The NSF Institute for Trustworthy AI in Law & Society (TRAILS), announced on May 4, 2023, unites specialists in AI and machine learning with social scientists, legal scholars, educators and public policy experts. The multidisciplinary team will work with impacted communities, private industry and the federal government to determine what trust in AI looks like, how to develop technical solutions for AI that can be trusted, and which policy models best create and sustain trust.

Funded by a $20 million award from NSF, the new institute is expected to transform the practice of AI from one driven primarily by technological innovation to one that is driven by ethics, human rights, and input and feedback from communities whose voices have previously been marginalized.

“As artificial intelligence continues to grow exponentially, we must embrace its potential for helping to solve the grand challenges of our time, as well as ensure that it is used both ethically and responsibly,” said UMD President Darryll J. Pines. “With strong federal support, this new institute will lead in defining the science and innovation needed to harness the power of AI for the benefit of the public good and all humankind.”

In addition to UMD, TRAILS will include faculty members from George Washington University (GW) and Morgan State University, with more support coming from Cornell University, the National Institute of Standards and Technology (NIST), and private sector organizations like the DataedX Group, Arthur AI, Checkstep, FinRegLab and Techstars.

At the heart of establishing the new institute is the consensus that AI is currently at a crossroads. AI-infused systems have great potential to enhance human capacity, increase productivity, catalyze innovation, and mitigate complex problems, but today’s systems are developed and deployed through processes that are opaque and insular to the public, and therefore often untrustworthy to those affected by the technology.

“We’ve structured our research goals to educate, learn from, recruit, retain and support communities whose voices are often not recognized in mainstream AI development,” said Hal Daumé III, a UMD professor of computer science who is lead principal investigator of the NSF award and will serve as the director of TRAILS.

Inappropriate trust in AI can result in many negative outcomes, Daumé said. People often “overtrust” AI systems to do things they’re fundamentally incapable of. This can lead to people or organizations giving up their own power to systems that are not acting in their best interest. At the same time, people can also “undertrust” AI systems, leading them to avoid using systems that could ultimately help them.

Given these conditions—and the fact that AI is increasingly being deployed to mediate society’s online communications, determine health care options, and offer guidelines in the criminal justice system—it has become urgent to ensure that people’s trust in AI systems matches those same systems’ level of trustworthiness.

IMAGE: (From left) University of Maryland doctoral student Lovely-Frances Domingo, Professor Hal Daumé III and Associate Professor Katie Shilton discuss some of Shilton’s work on ethics and policy for the design of information technologies. Daumé and Shilton are helping lead the new $20M NSF Institute for Trustworthy AI in Law & Society. CREDIT: Maria Herd

TRAILS has identified four key research thrusts to promote the development of AI systems that can earn the public’s trust through broader participation in the AI ecosystem.

The first, known as participatory AI, advocates involving human stakeholders in the development, deployment and use of these systems. It aims to create technology in a way that aligns with the values and interests of diverse groups of people, rather than being controlled by a few experts or solely driven by profit.

Leading the efforts in participatory AI is Katie Shilton, an associate professor in UMD’s College of Information Studies who specializes in ethics and sociotechnical systems. Tom Goldstein, a UMD associate professor of computer science, will lead the institute’s second research thrust, developing advanced machine learning algorithms that reflect the values and interests of the relevant stakeholders.

Daumé, Shilton and Goldstein all have appointments in the University of Maryland Institute for Advanced Computer Studies, which is providing administrative and technical support for TRAILS.

David Broniatowski, an associate professor of engineering management and systems engineering at GW, will lead the institute’s third research thrust of evaluating how people make sense of the AI systems that are developed, and the degree to which their levels of reliability, fairness, transparency and accountability will lead to appropriate levels of trust. Susan Ariel Aaronson, a research professor of international affairs at GW, will use her expertise in data-driven change and international data governance to lead the institute’s fourth thrust of participatory governance and trust.

Virginia Byrne, an assistant professor of higher education and student affairs at Morgan State, will lead community-driven projects related to the interplay between AI and education. According to Daumé, the TRAILS team will rely heavily on Morgan State’s leadership—as Maryland’s preeminent public urban research university—in conducting rigorous, participatory community-based research with broad societal impacts.

Additional academic support will come from Valerie Reyna, a professor of human development at Cornell, who will use her expertise in human judgment and cognition to advance efforts focused on how people interpret their use of AI.

Federal officials at NIST will collaborate with TRAILS in the development of meaningful measures, benchmarks, test beds and certification methods—particularly as they apply to important topics essential to trust and trustworthiness such as safety, fairness, privacy, transparency, explainability, accountability, accuracy and reliability.

“The ability to measure AI system trustworthiness and its impacts on individuals, communities and society is limited. TRAILS can help advance our understanding of the foundations of trustworthy AI, ethical and societal considerations of AI, and how to build systems that are trusted by the people who use and are affected by them,” said Under Secretary of Commerce for Standards and Technology and NIST Director Laurie E. Locascio.

Today’s announcement is the latest in a series of federal grants establishing a cohort of National Artificial Intelligence Research Institutes. This recent investment in seven new AI institutes, totaling $140 million, follows two previous rounds of awards.

“Maryland is at the forefront of our nation’s scientific innovation thanks to our talented workforce, top-tier universities, and federal partners,” said U.S. Sen. Chris Van Hollen (D-Md.). “This National Science Foundation award for the University of Maryland—in coordination with other Maryland-based research institutions including Morgan State University and NIST—will promote ethical and responsible AI development, with the goal of helping us harness the benefits of this powerful emerging technology while limiting the potential risks it poses. This investment entrusts Maryland with a critical priority for our shared future, recognizing the unparalleled ingenuity and world-class reputation of our institutions.” 

The NSF, in collaboration with government agencies and private sector leaders, has now invested close to half a billion dollars in the AI institutes ecosystem—an investment that expands a collaborative AI research network into almost every U.S. state.

“The National AI Research Institutes are a critical component of our nation’s AI innovation, infrastructure, technology, education and partnerships ecosystem,” said NSF Director Sethuraman Panchanathan. “[They] are driving discoveries that will ensure our country is at the forefront of the global AI revolution.”


AI could run a million microbial experiments per year

Automation uncovers combinations of amino acids that feed two bacterial species and could tell us much more about the 90% of bacteria that humans have hardly studied

Peer-Reviewed Publication

UNIVERSITY OF MICHIGAN

An artificial intelligence system enables robots to conduct autonomous scientific experiments—as many as 10,000 per day—potentially driving a drastic leap forward in the pace of discovery in areas from medicine to agriculture to environmental science.

The study, reported today in Nature Microbiology, was led by a professor now at the University of Michigan.

That artificial intelligence platform, dubbed BacterAI, mapped the metabolism of two microbes associated with oral health—with no baseline information to start with. Bacteria consume some combination of the 20 amino acids needed to support life, but each species requires specific nutrients to grow. The U-M team wanted to know which amino acids the beneficial microbes in our mouths need, so that researchers can promote their growth.

"We know almost nothing about most of the bacteria that influence our health. Understanding how bacteria grow is the first step toward reengineering our microbiome," said Paul Jensen, U-M assistant professor of biomedical engineering who was at the University of Illinois when the project started.

Figuring out the combination of amino acids that bacteria like is tricky, however. Those 20 amino acids yield more than a million possible combinations, just based on whether each amino acid is present or not. Yet BacterAI was able to discover the amino acid requirements for the growth of both Streptococcus gordonii and Streptococcus sanguinis. 
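
For concreteness, the "more than a million" figure is simply the number of present-or-absent subsets of 20 ingredients:

```python
# Each of the 20 amino acids is either present or absent in a candidate
# growth medium, so the number of possible media is 2**20.
print(2 ** 20)  # 1048576 -- just over a million combinations
```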

To find the right formula for each species, BacterAI tested hundreds of combinations of amino acids per day, honing its focus and changing combinations each morning based on the previous day's results. Within nine days, it was producing accurate predictions 90% of the time. 

Unlike conventional approaches that feed labeled data sets into a machine-learning model, BacterAI creates its own data set through a series of experiments. By analyzing the results of previous trials, it comes up with predictions of what new experiments might give it the most information. As a result, it figured out most of the rules for feeding bacteria with fewer than 4,000 experiments.
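
As a rough sketch of that kind of closed loop (not the authors' published code), the snippet below simulates an agent that retrains on its own accumulating results and then picks the candidate experiments its model is least certain about. The simulated robot, the hidden amino-acid requirement and the random-forest model are all illustrative assumptions.

```python
# A hedged sketch of a BacterAI-style active-learning loop, not the authors'
# code. The 'robot' is simulated, the hidden requirement [0, 3, 7] is
# invented, and the random-forest learner is an illustrative stand-in.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

N_AA = 20                 # each medium is a 20-bit presence/absence vector
REQUIRED = [0, 3, 7]      # hypothetical amino acids this microbe needs to grow
rng = np.random.default_rng(42)

def robot(media):
    """Simulated assay: growth (1) iff every required amino acid is present."""
    return media[:, REQUIRED].all(axis=1).astype(int)

# Bootstrap with a small random batch of experiments.
X = rng.integers(0, 2, size=(100, N_AA))
y = robot(X)
model = RandomForestClassifier(n_estimators=100, random_state=0)

for day in range(9):      # mirrors the nine-day run described above
    model.fit(X, y)
    # Propose random candidate media and keep those the model is least sure
    # about (predicted growth probability nearest 0.5): the most informative.
    candidates = rng.integers(0, 2, size=(5000, N_AA))
    uncertainty = np.abs(model.predict_proba(candidates)[:, 1] - 0.5)
    batch = candidates[np.argsort(uncertainty)[:200]]
    X = np.vstack([X, batch])
    y = np.concatenate([y, robot(batch)])

model.fit(X, y)
test = rng.integers(0, 2, size=(2000, N_AA))
print("accuracy:", (model.predict(test) == robot(test)).mean())
```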

"When a child learns to walk, they don’t just watch adults walk and then say 'Ok, I got it,' stand up, and start walking. They fumble around and do some trial and error first," Jensen said.

"We wanted our AI agent to take steps and fall down, to come up with its own ideas and make mistakes. Every day, it gets a little better, a little smarter."

Little to no research has been conducted on roughly 90% of bacteria, and the amount of time and resources needed to learn even basic scientific information about them using conventional methods is daunting. Automated experimentation can drastically speed up these discoveries. The team ran up to 10,000 experiments in a single day.

But the applications go beyond microbiology. Researchers in any field can set up questions as puzzles for AI to solve through this kind of trial and error.

"With the recent explosion of mainstream AI over the last several months, many people are uncertain about what it will bring in the future, both positive and negative," said Adam Dama, a former engineer in the Jensen Lab and lead author of the study. "But to me, it's very clear that focused applications of AI like our project will accelerate everyday research."

The research was funded by the National Institutes of Health with support from NVIDIA.

Study: BacterAI maps microbial metabolism without prior knowledge
