SPRTA: a smarter way to measure evolution uncertainty
A new method from EMBL-EBI and collaborators offers fast, easy-to-interpret confidence scores for phylogenetic trees to support pandemic preparedness
European Molecular Biology Laboratory
Image: SPRTA: a smarter way to measure evolution uncertainty. Credit: Karen Arnott/EMBL-EBI
When COVID-19 arrived, researchers tried to build evolutionary family trees – known as phylogenetic trees – of the virus. These help scientists understand when new virus strains appear and how they are linked to each other. But with millions of genomes to analyse, checking how reliable those trees were proved impossible.
To address this gap, researchers at EMBL’s European Bioinformatics Institute (EMBL-EBI) and colleagues at the Australian National University have developed SPRTA (SPR-based Tree Assessment), an interpretable and efficient way to score the reliability of each branch in a phylogenetic tree. SPRTA is the first such tool that is scalable to pandemic-sized datasets.
Re-inventing phylogenetic assessment
Since 1985, scientists have relied on a method called Felsenstein’s bootstrap to measure confidence in phylogenetic trees. But because this method works by repeating the same analysis hundreds or even thousands of times, it becomes too slow to handle the millions of viral genomes sequenced during a pandemic.
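The cost of the bootstrap comes from its resampling step, which can be sketched as follows. This is a minimal illustration, assuming a toy three-sequence alignment and a hypothetical helper named `bootstrap_alignment`; it is not code from any phylogenetics package. Each replicate redraws the alignment columns with replacement, and a full tree would then have to be inferred again for every replicate.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement."""
    n_sites = len(next(iter(alignment.values())))
    # Draw as many column indices as there are sites, allowing repeats.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


# Toy data invented for illustration only.
toy_alignment = {"seq1": "ACGTAC", "seq2": "ACGTTC", "seq3": "ACCTTC"}
rng = random.Random(0)

# A typical analysis uses hundreds to thousands of replicates,
# each followed by a complete tree inference (not shown here).
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(100)]
```

Because every replicate requires a fresh tree search, the total running time grows with the number of replicates, which is what makes the approach impractical for millions of genomes.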
A recent paper, published in the journal Nature, introduces SPRTA, a modern, scalable alternative capable of handling the huge datasets generated during large disease outbreaks. SPRTA enables researchers to track reliably and rapidly how pathogens spread and evolve, informing better decisions during outbreaks and supporting pandemic preparedness.
“For nearly 40 years, scientists have relied on the same method to measure confidence in evolutionary trees, but when faced with the scale of data we saw during the COVID-19 pandemic, the old method simply couldn’t cope,” said Nick Goldman, Group Leader at EMBL-EBI. “SPRTA gives us a fast, reliable way to understand which parts of these massive trees we can trust and to find the most plausible alternatives in regions of low confidence. This is exactly the kind of tool we’ll need to respond faster and smarter in the next pandemic.”
A smarter way to measure confidence
Traditional methods, such as Felsenstein’s bootstrap, focus on whether groups of samples, known as clades, are strongly supported by the data collected. But for outbreak analysis, that’s not always enough. SPRTA takes a different approach. It analyses how likely it is that a virus strain descends from a particular ancestor, and which alternative evolutionary paths are possible.
To do this, SPRTA tests many possible scenarios by virtually rearranging branches of the phylogenetic tree and comparing how well each one fits the data. It then assigns a simple probability score showing how confident researchers can be in each connection.
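The scoring idea described above can be sketched in a few lines. This is a hedged illustration, not the MAPLE or IQ-TREE implementation: it assumes that each candidate placement of a branch already has a log-likelihood, that all placements are equally plausible beforehand, and that the score is the share of total likelihood held by each placement. The function name `placement_support` and the numbers are invented for the example.

```python
import math


def placement_support(log_likelihoods):
    """Normalise log-likelihoods of alternative placements into probabilities."""
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    peak = max(log_likelihoods.values())
    weights = {name: math.exp(value - peak) for name, value in log_likelihoods.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical log-likelihoods: the placement in the inferred tree
# and two alternatives reached by moving the same branch elsewhere.
toy_log_likelihoods = {"current": -1000.0, "alt_1": -1001.0, "alt_2": -1005.0}
support = placement_support(toy_log_likelihoods)
```

In this toy case the current placement receives most of the support, while `alt_1` remains a credible alternative and `alt_2` is very unlikely.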
“With SPRTA, we’re not just making phylogenetic tree-building faster, we're making it smarter,” said Nicola De Maio, Senior Scientist at EMBL-EBI. “It helps researchers understand which relationships are solid and where they need to be cautious, even when working with millions of genomes.”
Designed for pandemic-scale data
Using more than two million SARS-CoV-2 genomes, the researchers demonstrated that SPRTA can:
highlight which parts of a phylogenetic tree are highly reliable,
flag uncertain sample placements, often due to incomplete or noisy data,
reveal credible alternative origins for specific branches.
SPRTA is built into MAPLE, a tool developed at EMBL-EBI for building massive phylogenetic trees efficiently. SPRTA is also available in IQ-TREE, one of the most widely used phylogenetic software packages.
Integrating SPRTA into these established tools makes the method open, accessible, and ready for researchers worldwide to apply in outbreak tracking, genomic surveillance, and evolutionary studies.
Funding
This work was supported by EMBL core funds and the Medical Research Council (MRC). Australian collaborators received support from the Chan Zuckerberg Initiative.
Journal
Nature
Method of Research
Data/statistical analysis
Article Title
Assessing phylogenetic confidence at pandemic scales
Article Publication Date
5-Nov-2025
‘Jumping genes’ help scientists resolve tree of life
Termite study provides researchers with template to solve ancient evolutionary mysteries
Genomes are key to unlocking life’s evolutionary history. The presence and absence of certain genetic sequences and mutations can give us clues to the order in which species diverge. However, even state-of-the-art methods struggle to accurately map evolutionary events from hundreds of millions of years ago. Published in Current Biology, a new method from scientists at the Okinawa Institute of Science and Technology (OIST) harnesses ‘jumping genes’ to recreate the termite tree of life, showcasing a new way for researchers to solve ancient evolutionary mysteries.
“Phylogenetic trees, which map the relationships between different organisms, are pivotal to the field of evolutionary biology. They help us to understand the origins of modern biodiversity and inform conservation strategies”, says Professor Thomas Bourguignon, author on the study and head of the OIST Evolutionary Genomics Unit. “However, challenges arise when trying to predict evolution across deep history. Phylogenetic signals are often weak, and radiation events, where species rapidly diversify over a short period, add complexity, making it difficult to identify the order that individual species emerged. Our new method supports researchers to tackle these tricky scenarios”.
What makes a gene ‘jump’? Transposons explained
Certain DNA sequences, called ‘transposable elements’, or ‘transposons’, can move from one place to another, causing mutations and increasing genetic variability. Transposons are abundant in the genomes of eukaryotes—organisms having cells with nuclei encapsulating their genomes, including animals, plants and fungi. In fact, they form up to 50% of human genomes, and more in some other eukaryotes.
Despite their abundance, transposons have been somewhat overlooked in favor of other DNA marker sequences for tree of life construction. “Until recent advances in sequencing technologies and bioinformatics annotation tools, transposon characterization at genome level was difficult,” explains first author Cong Liu, PhD student at OIST. “Phylogenetics has tended to focus on conserved genes, such as those encoding proteins critical for life, which are common across different species. These usually only change slowly over time, so are good for examining changes over evolutionary timescales.”
This slow rate of change comes with a downside; it can become difficult to resolve rapid radiation events, as there may be very limited differences in these conserved genes between species. In such cases, transposons may provide helpful information on species divergence, given their active movement across the genome.
A new way to build phylogenetic trees
To prove the usefulness of transposons, the team first had to collect data, sequencing 45 termite and two cockroach genomes. They selected a diverse range of species to represent the different families and subfamilies of the insect lineage, studying each genome carefully to identify almost 38,000 transposon families across the different species.
By analyzing the presence and absence of transposons across the 47 species, the team built a tree of life, mapping when each species seemed to diverge from earlier ancestors. They then compared their tree to previously published termite trees of life. “We achieved similar accuracy to trees built from thousands of protein marker sequence alignments”, notes Prof. Bourguignon.
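One simple way to turn presence and absence data into evolutionary signal is to measure how much of their transposon content two species share. The sketch below uses Jaccard distance on toy data. It is a swapped-in, distance-based illustration and not the method of the study, whose inference procedure is not described here; the species labels, transposon family names and the helper `jaccard_distance` are all hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 minus the fraction of shared items between two sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Toy presence data: the transposon families detected in each genome.
toy_presence = {
    "species_A": {"TE1", "TE2", "TE3"},
    "species_B": {"TE1", "TE2", "TE4"},
    "species_C": {"TE5", "TE6"},
}

# Pairwise distances; a distance-based tree builder could take these as input.
distances = {
    (x, y): jaccard_distance(toy_presence[x], toy_presence[y])
    for x, y in combinations(sorted(toy_presence), 2)
}
closest_pair = min(distances, key=distances.get)
```

Species that share more transposon families end up closer together, so in this toy example `species_A` and `species_B` would be grouped before either joins `species_C`.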
Overcoming DNA degradation challenges
Although this study used relatively rich genomic information, the methods may hold for more limited data, opening new possibilities for research based on older specimens such as historical museum collections.
“It’s often not possible to get ‘nice’ genomic data,” says Mr. Liu. “DNA degrades naturally over time, and faster in hotter and more humid climates, as many biodiversity hotspots tend to be. This can be a problem even in short timescales, just going from collecting a specimen to sequencing. But it’s particularly an issue when working with historical samples from old collections.”
Methods that work with fragmented data help researchers extract useful information, supporting both evolutionary studies and biodiversity mapping efforts. Since transposons are very short sequences, they could even be retrieved from fragmented DNA samples.
Termites and beyond
The team isn’t done studying termites and is using their genomic data to bring about new insights into termite physiology, social structures and even dietary evolution. However, they hope this study will inspire researchers across wider fields to explore biodiversity and evolution throughout the animal kingdom.
“Our methods are complementary to existing phylogenetic techniques. We hope to inspire others to look towards transposons to unlock new evolutionary information and clarify longstanding mysteries within trees of life”, says Prof. Bourguignon.
Journal
Current Biology
Method of Research
Computational simulation/modeling
Article Title
Robust termite phylogenies built using transposable element composition and insertion events
Article Publication Date
5-Nov-2025