Comparing the genes of 240 species of mammals—and one famous dog—offers a powerful new approach for understanding biology and evolutionary history
Ever since scientists first read the complete genetic codes of creatures like fruit flies and humans more than two decades ago, the field of genomics has promised major leaps forward in understanding basic questions in biology.
And now comes a major installment of that promise. In what Howard Hughes Medical Institute Investigator and HHMI Professor Beth Shapiro calls a treasure trove of research, more than 150 researchers from 50 institutions are publishing 11 different papers in the April 28, 2023, issue of Science. The research brings new insights from the Zoonomia Project, an unprecedented collaborative effort led by Elinor Karlsson, director of the Vertebrate Genomics Group at the Broad Institute of MIT and Harvard, that compares and analyzes the complete genomes of 240 different mammalian species, from aardvarks to zebus.
The findings from this enormous amount of genetic data include pinpointing genes that underlie the ability to hibernate or how brains grew larger, as well as identifying the small fraction of genes that makes humans unique. “These 11 papers are just a sampling of the type of science that can be done with the new genetic data,” says Shapiro, professor of ecology and evolutionary biology at the University of California, Santa Cruz. “They show how important these large consortia and foundational datasets really are.”
Two of the papers, co-authored by Shapiro and her Santa Cruz team, break new ground by showing how much valuable information can be found in genomes of a single species, such as endangered orcas, or even in the DNA of an individual. That individual is a sled dog named Balto, who has been immortalized in movies and a statue for helping to bring lifesaving diphtheria antitoxin to Nome, Alaska, in an epic journey across the Alaskan wilderness in the winter of 1925. With just a snippet of the dog’s preserved skin and “these amazing new techniques we didn’t have before, we were able to do this cool scientific thing,” says HHMI postdoc Katie Moon, lead author of the Balto paper and a member of Shapiro’s team.
Mass extinctions
One of Shapiro’s new papers tackles a high-stakes, urgent question in conservation. Humans are now causing mass extinctions and a serious loss of biodiversity across the planet. But which species are most at risk? Traditionally, conservationists tackled the question by painstakingly counting how many individuals are in a population and estimating how much habitat remains. Such efforts show that some species, like pumas in California, which Shapiro’s team has also worked on, are seriously endangered.
But what if the animal in question is one of many thousands of species for which no good population or habitat data exist? For those, Shapiro’s team wondered, might it be possible instead to estimate the threat of extinction simply by looking through the creatures’ genomes for “bad” genes or genetic evidence of inbreeding—the tell-tale signs of trouble?
To answer the question, co-lead authors Megan Supple, an HHMI scientist, and Aryn Wilder of the San Diego Zoo Wildlife Alliance used the International Union for Conservation of Nature’s “Red List of Threatened Species” to rank the 240 mammals in the Zoonomia Project along a continuum from “least concern” to “critically endangered.” Then they looked for the worrisome signals in each animal’s genome.
The results show that the genomes are remarkably revealing. “Information encoded within even a single genome can provide a risk assessment in the absence of adequate ecological or population census data,” the paper reports. No good data exist on numbers or habitats for the Upper Galilee Mountains blind mole rat, a small tunnel-digging rodent, for example. But its genome shows the species is doing just fine, thanks. In contrast, both the genomic and ecological data for orcas confirm that killer whales are in serious danger.
The genomes’ predictive power can be harnessed in the effort to identify and save endangered species, Shapiro suggests. “We know we’ll never have enough conservation dollars to go around, but by using even one genome, we can triage species,” she explains—quickly and inexpensively identifying those creatures most at risk.
Champion sled dog racer
The stakes were lower for the second paper from Shapiro’s team, the sled dog effort, but it was a lot more fun, the researchers say. “I hope people enjoy reading about Balto as much as I enjoyed working on the project,” says Moon.
The origins of the project actually go back a few years. Heather Huson, a champion sled dog racer turned Cornell University animal geneticist, was giving a talk at a meeting of sled dog veterinarians when one of the vets in the audience wondered if it would be possible to extract and analyze DNA from preserved hide. He even had a potential study subject in mind—Balto, whose taxidermied body is displayed in a glass case at the Cleveland Museum of Natural History.
Huson was hooked on the idea. “I grew up on the stories about Balto,” she recalls. But she had no experience working with old DNA, “and I wasn’t going to screw this up,” she says. So she reached out to the ancient DNA research community. The path quickly led to Beth Shapiro, a pioneer in revealing the genetic secrets of extinct creatures like mastodons and of ancient humans in the field called paleogenomics. “I reached out to Beth, and she said, ‘We can do this,’” says Huson.
The researchers got a sample of Balto’s skin from the Cleveland Museum and extracted the dog’s DNA from the sample. Moon then did the heavy genetic lifting in UC Santa Cruz’s high-tech ancient DNA lab, reading the code of Balto’s snippets of DNA enough times to cover his entire genome 40 times over.
Normally, scientists would learn about the genetics of a species in part by looking at genetic variations among different individuals. Balto was just one individual, though, so “the challenge was how to make a research project out of one dog,” says Huson. But the team had an ace up their sleeve. In addition to being able to compare the sled dog’s genome to the 240 mammals in the Zoonomia Project, they also could tap a genetic repository created by the Broad Institute’s Karlsson that has complete genomes of 682 dogs from a wide variety of breeds. “It’s an incredible dataset,” says Moon. Because of the information it contains “we know so much about dogs—what parts of the genome make them look the way they do or perform the way they do,” Moon explains. Or as Shapiro adds, the Balto project “was an opportunity to bring these two datasets together.”
Exciting moment
Using just the information in Balto’s genes, Kathleen Morrill, then a PhD student in Karlsson’s lab at the University of Massachusetts Chan Medical School, was able to predict both the dog’s precise height and the fact that his black coat had tan highlights at the edges—which don’t even show up in most pictures. A talented artist, Morrill was able to draw a rendering, based on the genetics, that was more accurate than many pictures. “Her drawing was what Balto would have looked like,” says Moon. “It was the first time anyone has done this on an individual that’s been gone for almost 100 years—and it was a really exciting moment for me.” It also validates the idea that scientists can use genomics to accurately envision what long-extinct species—for which no pictures exist—really looked like. “It shows we can do a pretty good job predicting their physical appearance,” says Huson.
There were plenty of other scientific nuggets in Balto’s DNA as well. Born in the kennel of famous sled dog breeder Leonhard Seppala in 1919, Balto was descended from dogs imported from Siberia. “But one of the coolest things is how close Balto is to modern Alaskan sled dogs as well as to the Siberian husky,” says Huson. His genome shows a mix of ancestors, with fewer deleterious genes compared to modern purebred breeds like Siberian huskies and Alaskan Malamutes. His DNA is also rich in so-called tissue development genes, which are involved in functions like muscle growth, metabolism, and oxygen consumption. “That’s exactly what you would need in a working dog,” says Moon.
Yet the genetics also reveal Balto’s limitations. Sled dogs were originally bred for great endurance, but since Balto’s time, breeders added in more speed. “Balto might have been a tough sled dog with a lot of endurance, but he wouldn’t have been very fast,” says Huson.
In fact, sled dog experts know that Balto wasn’t actually the real hero of the lifesaving 1925 journey. That honor belongs to a dog named Togo, who led Seppala’s team on the longest leg of the 674-mile trek, an astonishing 264 miles (compared to Balto’s 53 miles on the final segment). “Balto was the 2nd string dog,” says Huson. Not being prime progenitor material, he was neutered, in contrast to Togo, “who is the dog, the foundation of a lot of sled dogs,” says Huson. So, the next step, she suggests, is getting a sample from Togo’s remains, now preserved in Nome, in order to reveal the next chapter in this canine genetic drama.
JOURNAL
Science
METHOD OF RESEARCH
Experimental study
SUBJECT OF RESEARCH
Animal tissue samples
ARTICLE TITLE
Comparative genomics of Balto, a famous historic dog, captures lost diversity of 1920s sled dogs
ARTICLE PUBLICATION DATE
28-Apr-2023
Genome of famed sled dog Balto reveals genetic adaptations of working dogs
Still a good boy nearly 100 years after historic sled run, Balto has now helped scientists explore the genetics of working dogs and demonstrate the power of comparative genomics
The sled dog Balto has been celebrated in books and movies for his role in delivering desperately needed diphtheria antitoxin to Nome, Alaska, in 1925. Now, his DNA has enabled scientists to explore the genetics of 1920s sled dogs in Alaska and understand how they compare to modern dogs.
Scientists at UC Santa Cruz sequenced Balto’s genome as part of a large collaborative effort in comparative genomics leading to several papers published in the April 28 issue of Science. For the Balto study, the UCSC team extracted DNA from tissue samples of Balto’s taxidermied remains provided by the Cleveland Museum of Natural History and worked with colleagues at Cornell University and other institutions to investigate his ancestry and genetic traits.
“Balto’s fame and the fact that he was taxidermied gave us this cool opportunity 100 years later to see what that population of sled dogs would have looked like genetically and to compare him to modern dogs,” said Katherine Moon, a postdoctoral researcher at UC Santa Cruz and first author of the paper on the team’s findings published in Science.
The researchers found that Balto shared just part of his diverse ancestry with the Siberian husky breed. He belonged to a population of working sled dogs that were more genetically diverse than modern breeds and differed not only from today’s Siberian huskies but also from modern Alaskan sled dogs. The study found evidence that his population was genetically healthier than modern breeds and carried gene variants that may have helped the dogs survive their extreme environment.
“Balto came from a population of working dogs that were different from modern breeds and were adapted to harsh conditions,” said coauthor Beth Shapiro, professor of ecology and evolutionary biology at UC Santa Cruz and a Howard Hughes Medical Institute investigator.
The analysis of Balto’s genome involved comparing it to a dataset of 682 genomes from modern dogs and wolves, as well as an alignment of 240 mammalian genomes developed by the Zoonomia Consortium, which was the foundation for most of the new comparative genomics studies published in the special issue of Science.
“We were able to take advantage of both the Zoonomia alignment and the huge amount of work that has gone into collecting the genomes of dogs,” said Shapiro, who is a member of the Zoonomia Consortium.
She explained that a key innovation behind these new studies is the ability to align the genomes of hundreds of species so that corresponding positions in different genomes can be compared. Comparative genomics can then reveal DNA sequences that are the same across species, having remained unchanged over millions of years of evolution—an indication that these are important parts of the genome where mutations could be harmful.
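The idea of comparing corresponding positions can be illustrated with a toy sketch in Python. This is only an illustration under stated assumptions, not the method used by the Zoonomia Consortium: the four short sequences are invented, they are assumed to be already aligned (with `-` marking a gap), and the function names `column_conservation` and `conserved_positions` are hypothetical. The score is simply the share of sequences that agree on the most common base at each position.

```python
from collections import Counter


def column_conservation(alignment):
    """For each aligned position, return the fraction of sequences that
    share the most common base. Gaps ('-') never count as agreement."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")

    scores = []
    for column in zip(*alignment):
        bases = [base for base in column if base != "-"]
        if not bases:
            scores.append(0.0)
            continue
        top_count = Counter(bases).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def conserved_positions(alignment, threshold=1.0):
    """Return the 0-based positions whose score meets the threshold."""
    scores = column_conservation(alignment)
    return [i for i, score in enumerate(scores) if score >= threshold]


# Invented, pre-aligned toy sequences standing in for four species.
toy_alignment = [
    "ACGTAC",
    "ACGTTC",
    "ACG-AC",
    "ACGAAC",
]

print(column_conservation(toy_alignment))
print(conserved_positions(toy_alignment))
```

In this toy case, the positions where every sequence carries the same base are the ones flagged as conserved. Real analyses work on whole-genome alignments and use statistical models of evolution rather than a simple majority count.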
“A gene that’s on one chromosome in us is on a completely different chromosome in another species,” Shapiro said. “You need a tool that can line them up so you can see which parts of these genomes are the same and which are different. Without that it’s just a bunch of genomes of species that are very divergent.”
The genome alignment tool that made this possible was developed by researchers at the UC Santa Cruz Genomics Institute led by Benedict Paten, professor of biomolecular engineering and a member of the Zoonomia Consortium. The new papers include a study that identified thousands of elements in the human genome that are highly conserved across species, and another showing how this information could make it easier to find genetic changes that increase disease risk.
“When we do genome sequencing of humans, it can be difficult to tell which genetic variants are significant,” explained Paten, who is a coauthor of both papers. “If it’s at a highly conserved site in the genome, that’s a good sign the variant may have functional effects and increase the risk of disease.”
The Balto paper also used this approach to characterize genetic variation in Balto compared to modern dogs. Balto and populations of working sled dogs had lower burdens of rare, potentially damaging variation than breed dogs, indicating they represent genetically healthier populations. The researchers also identified protein-altering, evolutionarily constrained variants in Balto in genes related to tissue development, which may represent beneficial adaptations.
“Balto had variants in genes related to things like weight, coordination, joint formation, and skin thickness, which you would expect for a dog bred to run in that environment,” Moon said.
Raised in the kennel of breeder Leonhard Seppala, Balto belonged to a population of small, fast sled dogs imported from Siberia that became known as Siberian huskies. The modern Siberian husky breed, however, is quite different from Balto and from modern sled dogs. In addition to Siberian huskies and Alaskan sled dogs, other living dog lineages that share common ancestry with Balto include Greenland sled dogs, Vietnamese village dogs, and Tibetan mastiffs.
“It’s really interesting to see the evolution of dogs like Balto, even in just the past 100 years,” Moon said. “Balto’s population was different from modern Siberian huskies, which have since been bred for a physical standard, but also from modern working Alaskan sled dogs.”
One interesting trait identified in Balto’s genome is a better ability to digest starch compared to wolves and Greenland sled dogs (an isolated population), but not as good as modern dogs, which easily digest starchy foods.
Researchers were also able to use Balto’s genome to reconstruct his physical appearance, including his stature and coat color, in more detail than even historic photos could reveal.
“This project gives everyone an idea of what’s starting to be possible as more high-quality genomes become available to compare,” Moon said. “It’s an exciting moment because these are things we haven’t done before. I feel like an explorer, and once again Balto is leading the way.”
Another paper from Shapiro’s lab and the Zoonomia Consortium, in collaboration with the San Diego Zoo Wildlife Alliance and other institutions, used genomics to predict which mammal species are more likely to face extinction. Genomic analysis can reveal evidence of inbreeding and other indicators of a population on the brink. Although ecological data provide the best predictors of extinction risk, genomics can help identify species that need more attention.
“In conservation, there are more species that need attention than we have time or resources to study, and it turns out that just having one good DNA sample can be enough for us to say either they’re probably alright, or now we need to focus on this species,” Shapiro said.
In addition to Moon and Shapiro, the coauthors of the Balto study include Heather Huson and Krishnamoorthy Srikanth at Cornell University; Kathleen Morrill, Xue Li, and Elinor Karlsson at UMass Chan Medical School; Ming-Shan Wang at UC Santa Cruz; Kerstin Lindblad-Toh at the Broad Institute of Harvard and MIT; Gavin Svenson at the Cleveland Museum of Natural History; and the Zoonomia Consortium. This work was supported by the National Institutes of Health and the Siberian Husky Club of America.
One of the Cleveland Museum of Natural History’s iconic residents continues to contribute to science
New research reveals insights into one of the world’s most famous dogs
CLEVELAND—The Cleveland Museum of Natural History is proud to count legendary sled dog Balto among the residents of its permanent collection, especially as he continues to contribute to science nearly 100 years after his lifesaving mission.
In new studies published today in a special issue of Science, researchers have demonstrated how comparative genomics (the comparison of genomes across species) can not only shed light on how certain species achieve extraordinary feats, but also help scientists better understand the parts of the human genome (the basic recipe for building a human being) that are functional and how they might influence health and disease.
Through these studies, researchers identified DNA that has remained the most unchanged across mammalian species and millions of years of evolution—and is likely biologically important. They also found part of the genetic basis for unique mammalian traits, such as the ability to hibernate or sniff out faint scents from miles away. And they pinpointed species that may be particularly susceptible to extinction, as well as genetic variants that are more likely to play causal roles in rare and common human diseases. The findings come from analyses of DNA samples collected by more than 50 different institutions worldwide, including from Balto at the Cleveland Museum of Natural History.
“The fact that the DNA from a tiny sample of Balto’s skin can provide new scientific insights is a powerful reminder of how advances in science continually allow us to glean new information from museum collections,” says Dr. Gavin Svenson, the Museum’s Chief Science Officer. “Every one of the millions of objects in our Museum has the potential to reveal an important clue to a future scientist, who in turn can enhance our understanding of the past, present, and future of the world around us.”
In 1925, Balto gained worldwide fame after leading a 13-dog team on the final leg of a 674-mile dogsled relay to deliver lifesaving medicine to Nome, Alaska, during an outbreak of diphtheria. Because Nome’s port was icebound and inaccessible by sea, the remote outpost’s only reliable link to the outside world was via dogsled. Completed by 20 mushers and more than 150 dogs in a record 127 hours, the so-called Serum Run was an amazing feat of endurance. Despite -50°F temperatures and a raging blizzard, Balto and his team traversed the last 53 miles to Nome—saving the town and surrounding communities. After delivering the serum to Nome’s hospital, musher Gunnar Kaasen went straight back to Balto, hugging him and repeating the praise “Damn fine dog.”
The courage of Balto and his teammates made them famous, but that fame was fleeting. After touring the United States vaudeville circuit with Kaasen for two years, Balto and his team were sold and put on display in a dime museum in Los Angeles. Ill and mistreated, Balto and six of his surviving teammates caught the attention of George Kimble, a businessman visiting from Cleveland. Familiar with the story of the Serum Run and outraged by the deplorable conditions, Kimble arranged to purchase the dogs for $1,500. The only catch was that he had just two weeks to raise the money.
Kimble returned to Cleveland and established a Balto Fund, taking to national radio and the local newspaper, The Plain Dealer, to appeal for donations. The response of the Cleveland community was unprecedented. The money was raised in just 10 days, with donations pouring in from individuals and businesses across the city, including schoolchildren, factory workers, out-of-town visitors, hotels, local stores, and the Western Reserve Kennel Club.
On March 19, 1927, Balto and his remaining teammates, Fox, Billy, Tillie, Sye, Old Moctoc, and Alaska Slim, received a hero’s welcome in Cleveland, complete with a parade through downtown’s Public Square. The sled dogs spent the rest of their days under the care of Cleveland’s Brookside Zoo, which was subsequently managed by the Cleveland Museum of Natural History. Balto became a local Cleveland celebrity.
Following Balto’s natural death in 1933, his mount was put on display at the Museum. A shining example of triumph in the face of incredible odds, Balto also serves as a reminder of Cleveland’s philanthropic tradition—a spirit of generosity that endures in the community today. One of the Museum’s most treasured attractions, Balto continues to inspire visitors and captivate the popular imagination through his story.
“Preserving Balto’s legacy is something we take very seriously,” says Sonia Winner, the Museum’s President & CEO. “We are thrilled that such a beloved figure of the past continues to have relevance in the present. Balto’s story is also a great story about Cleveland. The generosity of Clevelanders allowed Balto and his teammates to spend the rest of their lives at the Brookside Zoo, which at one time was operated by the Museum.”
Balto was an unlikely hero. Born in Nome in 1919, he was always a bit of a disappointment to Leonhard Seppala, his original owner. Seppala was in the business of breeding small, fast huskies for racing, and Balto was stout and strong—a “freight-hauling dog.” As a result, Balto was neutered, and no specific records were saved about him or his litter.
Now, in one of the new studies published in Science, Dr. Beth Shapiro, Professor of Ecology and Evolutionary Biology at the University of California, Santa Cruz, and Dr. Heather Huson, Associate Professor of Animal Science at Cornell University, demonstrate that Balto possessed more genetic diversity than modern breeds. This genetic diversity may have contributed to his being a hardier canine, well adapted to the extreme Alaskan environment.
Balto is most often referred to as a Siberian husky, and one myth even claims that he was part wolf. But the DNA work completed in Dr. Shapiro’s UCSC lab revealed that he was only part Siberian husky.
“Balto also had ancestry related to several other living dog lineages, including Alaskan sled dogs, village dogs, Greenland dogs, and Tibetan mastiffs,” Dr. Shapiro says.
“In short, Balto lived in a time when there was more diversity in dogs than there is today in modern breeds, likely making Balto better equipped to thrive in that environment,” Dr. Huson adds.
The Balto study was conducted using data compiled by The Zoonomia Project, which sequences and compares the genomes of 240 diverse mammals to discover both the genomic basis of traits essential for all animals and changes that underlie the unique traits of individual species. The Zoonomia Project is a powerful resource for connecting genomic data to population phenotypes—the set of observable characteristics of an organism that result from the interaction of its genotype with the environment. Not only does this study offer more insight into the legendary Balto, but it also provides a road map for future scientific investigations based on the comparison of genomic traits.
More information, including a copy of the paper, can be found online at the Science press package at https://www.eurekalert.org/press/scipak/.
About the Cleveland Museum of Natural History Transformation Project
The Cleveland Museum of Natural History opened part of its transformed campus in December 2022, introducing a new Wade Oval Entrance, modernized Education Wing, and updated galleries. This opening is the latest milestone in the Museum’s $150 million transformation project, which features a LEED-certified expansion, a complete reimagining of the Museum campus and all its exhibits, and the addition of new public spaces. Pioneering a new model for natural history museums, the redesigned exhibits will place visitors at the center of the Museum experience—allowing them to better understand their connection with the natural world and the relevance of science to their daily lives. Slated for completion in late 2024, the transformation will showcase the Museum’s world-class assets while reflecting its role as a trusted resource that prioritizes engagement and responsiveness to its community. The Transforming the World of Discovery campaign has raised more than $123 million for this project, which will expand the Museum's building and outdoor visitor areas to more than 375,000 square feet. The Museum appreciates the generous support from community members, corporations, foundations, and government grants that has helped to make this transformation a reality.
About the Cleveland Museum of Natural History
The Cleveland Museum of Natural History illuminates the world around us and inspires visitors to engage with the natural forces that shape their lives. Since its founding in 1920, the Museum has pioneered scientific research to advance knowledge across diverse fields of study and used its outstanding collections, which encompass more than 5 million artifacts and specimens, to deepen the public’s understanding of the dynamic connections between humans and nature. Through its Natural Areas Program, the Museum stewards more than 12,000 acres of protected ecosystems across northern Ohio. A community gathering place, educational center, and research institution, the Museum is a vital resource that serves Cleveland and the nation. For more information, visit CMNH.org.
Zoonomia Project: Genomes from 240 mammalian species explain human disease risks, uncommon mammalian traits, and beyond
From the two-gram bumblebee bat to whales weighing many tons, the more than 6,000 species of mammal on the planet – including humans – are highly divergent. Over the past 100 million years, they have adapted to nearly every environment on Earth. Now, an international collaboration of scientists with the Zoonomia Project – the largest comparative mammalian genomics resource in the world – has cataloged the diversity in the genomes of 240 mammalian species, representing over 80% of mammalian families. Their findings across 11 papers in this issue of Science pinpoint parts of the human genome that have remained unchanged after millions of years of evolution, providing information that may shed light on human health and disease. The authors’ work also reveals how certain uncommon mammalian traits – like the ability to hibernate – came to be. They say these analyses, and the breadth of questions they answer, show only a fraction of what is possible with these data for understanding both genome evolution and human disease.
The Zoonomia Project is an international effort in which researchers sequenced a range of mammal genomes and then aligned them – a massive computational task. Using the alignment, the researchers identified regions of the genomes, sometimes just single letters of DNA, that are most conserved, or unchanged, across mammalian species and millions of years of evolution — regions that they hypothesized were biologically important. These regions – while they don’t give rise to proteins – may contain instructions that direct where, when, and how much protein is produced. Mutations in these regions could play an important role in the origin of diseases or in the distinctive features of mammal species, the authors hypothesized.
Through their analyses, the researchers tested this hypothesis and were also able to ascertain that at least 10% of the human genome is functional, ten times as much as the approximately one per cent that codes for proteins. The findings further revealed genetic variants likely to play causal roles in rare and common human diseases, including cancer. In one paper in the package, researchers studying patients with medulloblastoma identified mutations in evolutionarily conserved positions of the human genome they believe could be causing brain tumors to grow faster or to resist treatment. The results show how using this data and approach in disease studies could make it easier to find genetic changes that increase disease risk.
In other papers in the package, the researchers pinpointed parts of the genome linked to a few exceptional traits in the mammalian world, such as extraordinary brain size, superior sense of smell, and the ability to hibernate during the winter. The authors also used the genomes to confirm that estimates of effective population size and genetic diversity can help predict extinction risk in species that are hard to monitor and sample.
Another study in the package shows that mammals had begun to change and diverge even before the Earth was hit by the asteroid that killed the dinosaurs, approximately 65 million years ago. A different study examined more than 10,000 genetic deletions specific to humans using both Zoonomia data and experimental analysis and linked some of them to the function of neurons. Other Zoonomia papers in the package uncovered a genetic explanation for why a famous sled dog from the 1920s named Balto was able to survive the harsh landscape of Alaska; discovered human-specific changes to genome organization; used machine learning to identify regions of the genome associated with brain size; described the evolution of regulatory sequences in the human genome; focused on sequences of DNA that move around the genome; discovered that species with smaller populations historically are at higher risk of extinction today; and compared genes between nearly 500 species of mammals.
The special issue is accompanied by two Perspectives that provide further insights into the Zoonomia Project’s approach, findings, and future impacts.
Genomes from 240 mammal species explain human disease risks
Why is it that certain mammals have an exceptional sense of smell, some hibernate, and yet others, including humans, are predisposed to disease? A major international research project, jointly led by Uppsala University, Sweden and the Broad Institute, USA, has surveyed and analysed the genomes of 240 different mammals. The results, now published in 11 articles in the journal Science, show how the genomes of humans and other mammals have developed over the course of evolution. The research shows which regions have important functions in mammals, which genetic changes have led to specific characteristics in different species and which mutations can cause disease.
“In combination, the 11 articles we are now publishing in Science provide an enormous amount of information about the function and development of mammalian genomes,” says Kerstin Lindblad-Toh, Professor of Comparative Genomics at Uppsala University and one of two leaders of the international consortium of researchers. “Moreover, we have produced data that can be used for studies of evolution and medical research for many years to come.”
More than 30 research teams took part in the project.
The human genome contains approximately 20,000 genes that constitute the code for manufacturing all the proteins in the body. The genome also contains instructions that direct where, when and how much of the proteins are produced. These parts of the genome, which are called regulatory elements, are much more difficult to identify than the parts that give rise to proteins. However, studying a great many mammals’ genomes makes it possible to figure out which parts of the genome are functionally important.
The hypothesis shared by the researchers behind the publications in Science has been that if a position in the genome has been preserved throughout 100 million years of evolution, it likely serves a function in all mammals. For the first time, they have been able to test this hypothesis on a large scale. By making a detailed survey and systematic comparison of the genomes of 240 mammals, the researchers have identified regions of the human genome with previously uncharacterised function. These regions are likely regulatory elements and are significant for the correct functioning of the genome. Mutations in these can play an important role in the origin of diseases or in the distinctive features of mammal species.
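The idea of reading function from conservation can be illustrated with a deliberately simplified sketch. It assumes a tiny, made-up alignment of four species and scores each position by the fraction of species that share the most common base. The species sequences, function names and threshold below are hypothetical and chosen only for illustration; the sketch ignores evolutionary relationships and alignment gaps, and the methods used in the Zoonomia studies are considerably more sophisticated.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """For each alignment column, return the fraction of sequences
    that carry the most common base at that position."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for i in range(length):
        column = [seq[i] for seq in aligned_seqs]
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


def conserved_positions(aligned_seqs, threshold=1.0):
    """Return 0-based positions whose conservation score meets the threshold."""
    scores = column_conservation(aligned_seqs)
    return [i for i, score in enumerate(scores) if score >= threshold]


# Hypothetical toy alignment (not real genomic data).
toy_alignment = {
    "human": "ACGTAC",
    "mouse": "ACGTTC",
    "dog":   "ACGAAC",
    "cow":   "ACGTAC",
}

print(conserved_positions(list(toy_alignment.values())))  # prints [0, 1, 2, 5]
```

In this toy case, positions 3 and 4 differ in one species each and so fall below the threshold, while the other four positions are identical in every species and would be flagged as candidates for functional importance.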
The researchers identified more than three million important regulatory elements in the human genome, about half of which were previously unknown. They were also able to ascertain that at least 10 per cent of the genome is functional, ten times as much as the approximately one per cent that codes for proteins.
The 240 different mammals in the study vary widely in their characteristics, such as the acuteness of their sense of smell or the size of their brain. The researchers were able to find regions in the genomes that lead to some species having a superior sense of smell or to certain species hibernating.
“It’s exciting to now have a picture of which mutations have steered the development of specific traits in these widely divergent mammals,” says Matthew Christmas, researcher and co-first author of one of the articles focusing on the function of the genome and how it affects distinctive features in different species.
One of the studies shows that mammals had begun to change and diverge even before the Earth was hit by the asteroid that killed the dinosaurs, approximately 65 million years ago.
“Our results can also provide important information about whether mammals are at risk of extinction, depending on how much variation they have in their genome. This is information that can lay the foundation for understanding how to manage a species to help it survive,” says Professor Lindblad-Toh.
The new knowledge also helps researchers understand how diseases arise, by linking the positions in the genome conserved by evolution to known conditions. This can be done for any species and can also be applied to human diseases.
“Our analyses of 240 mammals give us a better insight into the regulatory signals in the genome. We calibrated our results on positions that are known to contribute to disease, and then could use these to suggest additional positions which could be prioritised for neurological traits, such as schizophrenia, or immune conditions including asthma or eczema,” says Jennifer Meadows, researcher and co-first author of the second article, which focuses on how the project’s data can contribute to knowledge about diseases.
The genomes of healthy and sick people are compared to understand which mutations lead to disease. This produces a picture of the region in the genome that may be important, but does not reveal exactly which mutation causes the disease.
“A large proportion of the mutations that lead to common diseases, like diabetes or obsessive-compulsive disorder, lie outside the genes and have to do with gene regulation. Our studies make it easier to identify the mutations that lead to disease and to understand what goes wrong,” says Lindblad-Toh.
The researchers also studied the cancer medulloblastoma, which is the most common type of malignant brain tumour in children. Although modern treatments have improved the prognosis, not all children can be cured. Moreover, those who survive often experience lifelong side-effects from the aggressive treatment.
“In patients with medulloblastoma, we found many new mutations in evolutionarily conserved positions. We hope that analysis of these mutations will lay the ground for new diagnostics and therapies,” says Karin Forsberg-Nilsson, Professor of Stem Cell Research at Uppsala University, who led the cancer part of the study.
This work was supported in part by the National Institutes of Health (US), the Swedish Research Council (SWE), the Knut and Alice Wallenberg Foundation (SWE), and the National Science Foundation (US).
Journal: Science
Method of research: Experimental study
Subject of research: Animals
Article publication date: 27-Apr-2023
Genomes from 240 mammalian species reveal what makes the human genome unique
Studies from the Zoonomia Project pinpoint key parts of the human genome that have remained unchanged after millions of years of evolution and may shed light on disease and unusual traits.
Peer-Reviewed Publication
Additional photos available at: http://broad.io/zoonomiaphotos
Over the past 100 million years, mammals have adapted to nearly every environment on Earth. Scientists with the Zoonomia Project have been cataloging the diversity in mammalian genomes by comparing DNA sequences from 240 species that exist today, from the aardvark and the African savanna elephant to the yellow-spotted rock hyrax and the zebu.
This week, in several papers in a special issue of Science, the Zoonomia team has demonstrated how comparative genomics can not only shed light on how certain species achieve extraordinary feats, but also help scientists better understand the parts of our genome that are functional and how they might influence health and disease.
In the new studies, the researchers identified regions of the genomes, sometimes just single letters of DNA, that are most conserved, or unchanged, across mammalian species and millions of years of evolution; these regions are likely biologically important. They also found part of the genetic basis for uncommon mammalian traits such as the ability to hibernate or sniff out faint scents from miles away. And they pinpointed species that may be particularly susceptible to extinction, as well as genetic variants that are more likely to play causal roles in rare and common human diseases. The findings come from analyses of DNA samples collected by more than 50 different institutions worldwide, including the San Diego Zoo Wildlife Alliance, which provided many genomes from species that are threatened or endangered.
More than 150 people across seven time zones have contributed to the Zoonomia Project, which is the largest comparative mammalian genomics resource in the world. The effort is led by Elinor Karlsson, director of the vertebrate genomics group at the Broad Institute of MIT and Harvard and a professor of bioinformatics and integrative biology at the UMass Chan Medical School, and Kerstin Lindblad-Toh, scientific director of vertebrate genomics at the Broad and a professor of comparative genomics at Uppsala University in Sweden.
“One of the biggest problems in genomics is that humans have a really big genome and we don’t know what all of it does,” said Karlsson. “This package of papers really shows the range of what you can do with this kind of data, and how much we can learn from studying the genomes of other mammals.”
Exceptional traits
In one of the studies published today, co-first authors Matthew Christmas, a researcher at Uppsala University, and Irene Kaplow, a postdoctoral researcher at Carnegie Mellon University, along with Karlsson, Lindblad-Toh, and collaborators, found that at least 10 percent of the human genome is highly conserved across species, with many of these regions occurring outside of protein-coding genes. More than 4,500 elements are almost perfectly conserved across more than 98 percent of the species studied.
Most of the conserved regions — which have changed more slowly than random fluctuations in the genome — are involved in embryonic development and regulation of RNA expression. Regions that changed more frequently shaped an animal’s interaction with its environment, such as through immune responses or the development of its skin.
The researchers also pinpointed parts of the genome linked to a few exceptional traits in the mammalian world, such as extraordinary brain size, superior sense of smell, and the ability to hibernate during the winter.
With an eye toward preserving biodiversity, the researchers found that mammals with fewer genetic changes at conserved sites in the genome were at greater risk for extinction. Karlsson and Lindblad-Toh say that even having just one reference genome per species could help scientists identify at-risk species, as less than 5 percent of all mammalian species have reference genomes, though more work is needed to develop these methods.
Disease insights
In another study, Karlsson, Lindblad-Toh, and colleagues used the mammalian genomes to study human traits and diseases. They focused on some of the most conserved single-letter genomic regions uncovered in the first paper and compared them to genetic variants that scientists have previously linked to diseases such as cancer using other methods.
The team found that their annotations of the genome based on evolutionary conservation revealed more connections between genetic variants and their function than the other methods. They also identified mutations that are likely causal in both rare and common diseases including cancer, and showed that using conservation in disease studies could make it easier to find genetic changes that increase risk of disease.
The co-first authors of this study were Patrick Sullivan, director of the Center for Psychiatric Genomics at the University of North Carolina Medical School, Chapel Hill and a professor of psychiatric genetics at Karolinska Institutet in Sweden; Jennifer Meadows, a genetics researcher at Uppsala University in Sweden; and Steven Gazal, an assistant professor of population and public health sciences at the Keck School of Medicine at the University of Southern California.
A world of questions
A third study, co-led by Steven Reilly, an assistant professor of genetics at Yale University, and Pardis Sabeti, an institute member at the Broad, examined more than 10,000 genetic deletions specific to humans using both Zoonomia data and experimental analysis, and linked some of them to the function of neurons.
Other Zoonomia papers published today revealed that mammals diversified before the mass dinosaur extinction; uncovered a genetic explanation for why a famous sled dog from the 1920s named Balto was able to survive the harsh landscape of Alaska; discovered human-specific changes to genome organization; used machine learning to identify regions of the genome associated with brain size; described the evolution of regulatory sequences in the human genome; focused on sequences of DNA that move around the genome; discovered that species with smaller populations historically are at higher risk of extinction today; and compared genes between nearly 500 species of mammals.
For Karlsson, Lindblad-Toh, and the researchers who have been sequencing mammalian genomes for Zoonomia or its precursor projects since 2005, these studies — and the breadth of questions they answer — are only a fraction of what is possible.
“We’re very enthusiastic about sequencing mammalian species,” said Lindblad-Toh. “And we’re excited to see how we and other researchers can work with this data in new ways to understand both genome evolution and human disease.”