Using written records – and tweets – as a roadmap for plant disease spread
North Carolina State University researchers used text analytics on both historic and modern writing to reveal more information about the effects and spread of the plant pathogen – now known as Phytophthora infestans – that caused the 1840s Irish potato famine and that continues to vex breeders of potatoes and tomatoes.
The study examined keyword terms like “potato rot” and “potato disease” after digitizing historic farm reports, news accounts and U.S. Patent Office agricultural records from 1843 to 1845 to show how the pathogen first spread across the northeast United States before causing the devastating famine in Ireland in 1845. The study also used text analysis to track social media feeds for the modern-day spread of late blight.
Textual analysis holds promise as a useful tool to help researchers track and visualize both historic and current plant diseases, the researchers say.
“We went back to original descriptions of the potato disease outbreaks in the United States because they occurred between 1843 and 1845, before outbreaks occurred in Europe,” says Jean Ristaino, William Neal Reynolds Distinguished Professor of Plant Pathology at North Carolina State University and corresponding author of a paper in Scientific Reports that describes the study. “We searched those descriptions by keywords, and by doing that we were able to recreate the original outbreak maps using location coordinates mentioned in the documents.
“We were also trying to learn what people were thinking about the disease at the time and where it came from.”
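The keyword search described above can be illustrated with a minimal sketch. It assumes the digitized reports are already available as records with a year, a place name and a block of text; the record layout, the sample entries and the `find_mentions` helper are hypothetical and are not taken from the study. Only the two keyword terms come from the release.

```python
import re
from collections import defaultdict

# Keyword terms named in the release; the regex matching is an assumed approach.
KEYWORDS = ["potato rot", "potato disease"]
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)

# Hypothetical records for illustration only; not quoted from any archive.
records = [
    {"year": 1843, "place": "Example County, New York",
     "text": "The Potato Rot has injured the crop."},
    {"year": 1844, "place": "Example Town, Vermont",
     "text": "Wheat yields were good this season."},
    {"year": 1844, "place": "Example Town, Maine",
     "text": "Farmers report the potato disease widely."},
]


def find_mentions(records):
    """Group place names by year for records whose text mentions a keyword."""
    by_year = defaultdict(list)
    for rec in records:
        if PATTERN.search(rec["text"]):
            by_year[rec["year"]].append(rec["place"])
    return dict(by_year)


mentions = find_mentions(records)
print(mentions)
```

Each matched place name would still need to be converted to coordinates before it could be plotted on an outbreak map; that geocoding step is left out here.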
The analysis documents late blight disease on potatoes in five states – New York, Delaware, Massachusetts, New Jersey and Pennsylvania – before it spread to the rest of the northeastern U.S. and into Canada between 1843 and 1845. The pathogen later wreaked havoc on Europe – especially Ireland.
The paper also examined tweets from 2012 to 2022 to learn more about the modern spread of P. infestans. The researchers mined tweets for both common and scientific names of the pathogen and were able to geolocate the sources.
“The social media mining was interesting because we found that most people talking about this disease are scientists in developed countries promoting their own work on Twitter (now X),” Ristaino said. “It was also interesting to note that states where the disease appeared all those many years ago still have the disease now.”
A Google Ngram search also produced a surprising finding: a spike in reports of late blight disease in documents from the 1950s. Drilling down into the relevant academic literature cited in the documents, Ristaino saw evidence of a large late blight outbreak in tomatoes in the United States after World War II.
“That could have been the emergence of a new North American strain of the pathogen, known as U.S. 1, that became really widespread after that,” Ristaino said.
Ristaino added that she and her team plan to continue this type of work and expand the analytic tools to other plant diseases and pests.
Co-authors Ariel Saffer, Laura Tateosian and Yi-Peng Yang are part of NC State’s Center for Geospatial Analytics. Amanda C. Saville, a research specialist in Ristaino’s lab, also co-authored the paper. Funding was provided by the Triangle Center for Evolutionary Medicine Seed Grant; the U.S. Dept. of Agriculture’s NIFA under grant number 2015-2370; and by the National Science Foundation PIPP Phase 1 grant number 2022-1191.
-kulikowski-
Note to editors: The abstract of the paper follows.
“Reconstructing Historic and Modern Potato Late Blight Outbreaks Using Text Analytics”
Authors: Ariel Saffer, Laura Tateosian, Amanda C. Saville, Yi-Peng Yang and Jean B. Ristaino, NC State University
Published: Feb. 15, 2024 in Scientific Reports
DOI: 10.1038/s41598-024-52870-2
Abstract: In 1843, a hitherto unknown plant pathogen entered the U.S. and spread to potato fields in the northeast. By 1845, the pathogen had reached Ireland leading to devastating famine. Questions arose immediately about the source of the outbreaks and how the disease should be managed. The pathogen, now known as Phytophthora infestans, still continues to threaten food security globally. A wealth of untapped knowledge exists in both archival and modern documents, but is not readily available because the details are hidden in descriptive text. We 1) used text analytics of unstructured historical reports (1843-1845) to map U.S. late blight outbreaks; 2) characterized theories on the source of the pathogen and remedies for control; and 3) created modern late blight intensity maps using Twitter feeds. The disease spread from 5 to 17 states and provinces in the U.S. and Canada between 1843-45. Crop losses, Andean sources of the pathogen, possible causes and potential treatments were discussed. Modern disease discussion on Twitter included near-global coverage and local disease observations. Topic modeling revealed general disease information, published research, and outbreak locations. The tools described will help researchers explore and map unstructured text to track and visualize pandemics.
JOURNAL
Scientific Reports
METHOD OF RESEARCH
Data/statistical analysis
SUBJECT OF RESEARCH
Not applicable
ARTICLE TITLE
Reconstructing Historic and Modern Potato Late Blight Outbreaks Using Text Analytics
ARTICLE PUBLICATION DATE
15-Feb-2024
Plant disease: Mapping the spread of potato blight prior to the Irish potato famine
A study published in Scientific Reports presents the first accurate maps of potato blight outbreaks in the USA between 1843 and 1845. The disease, caused by the fungus-like pathogen Phytophthora infestans, was responsible for the Irish potato famine between 1845 and 1852. The findings improve our understanding of the spread of potato blight before the disease reached Europe.
Jean Ristaino and colleagues mapped outbreaks of potato blight in North America between 1843 and 1845 by analysing historic agricultural reports published in the USA during this period. The authors found that the disease was first reported in 1843 in five locations in the states of New York, Pennsylvania, New Jersey, Delaware, and Connecticut. By the end of 1844 the disease had spread to 107 additional locations, a further six US states (Ohio, Massachusetts, Rhode Island, Vermont, New Hampshire, and Maine) and the Canadian province of Nova Scotia. In 1845 the disease spread to 53 new locations, including in four additional US states (Michigan, Illinois, Indiana, Maryland) and the Canadian province of New Brunswick. Contemporary reports suggested that the disease led to crop losses of between 33 and 50%.
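As a quick consistency check, the yearly figures above can be added up. The sketch below assumes that the locations and regions reported for each year are new and do not overlap with earlier years, which is how the wording ("additional", "new") reads; the variable names are illustrative.

```python
from itertools import accumulate

# Newly affected US states and Canadian provinces per year, as listed above.
new_regions = {
    1843: ["New York", "Pennsylvania", "New Jersey", "Delaware", "Connecticut"],
    1844: ["Ohio", "Massachusetts", "Rhode Island", "Vermont",
           "New Hampshire", "Maine", "Nova Scotia"],
    1845: ["Michigan", "Illinois", "Indiana", "Maryland", "New Brunswick"],
}

# Newly reported outbreak locations per year, as stated above.
new_locations = {1843: 5, 1844: 107, 1845: 53}

years = sorted(new_regions)

# Running totals, assuming no region or location is counted twice.
cumulative_regions = dict(
    zip(years, accumulate(len(new_regions[y]) for y in years))
)
cumulative_locations = dict(
    zip(years, accumulate(new_locations[y] for y in years))
)

print(cumulative_regions)    # running total of states and provinces
print(cumulative_locations)  # running total of reported locations
```

Under that assumption the running total reaches 17 states and provinces by 1845, which agrees with the "5 to 17 states and provinces" stated in the paper's abstract, and 165 reported locations overall.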
The authors also used their approach to characterise historical theories on the source of potato blight and remedies for treating the disease between 1843 and 1845. Proposed causes of the disease during this period included insects, weather conditions, poor quality potato varieties, and a fungus. In addition, the authors identified a widely described debate in reports about whether the fungus was the cause or a consequence of the disease. Suggested treatments for the disease included calcium oxide (known as lime), sulfur, copper sulfate (known as bluestone copper), and salt. Infected imported potato seed tubers from locations including Nova Scotia, France, and Bogota, Colombia were suspected as sources of the disease.
Together, the findings provide insight into the spread of potato blight in the USA and into public understanding of disease in the mid-19th century.
###
Article details
Reconstructing historic and modern potato late blight outbreaks using text analytics
DOI: 10.1038/s41598-024-52870-2
Corresponding Author:
Jean Ristaino
North Carolina State University, Raleigh, NC, USA
Email: jean_ristaino@ncsu.edu
Please link to the article in online versions of your report (the URL will go live after the embargo ends): https://www.nature.com/articles/s41598-024-52870-2.