Friday, March 28, 2025

 

Mapping the Earth’s crops





National Center for Supercomputing Applications




As agricultural research continues to become more entwined with technology, smart farming – a phrase that encompasses research computing tools that help farmers to better address issues like crop disease, drought and sustainability – has quickly become a ubiquitous term in Ag labs across the country. The availability of NCSA resources like Delta for researchers, both nationally and on the University of Illinois Urbana-Champaign (U. of I.) campus, has fostered a hotbed of cutting-edge research projects in the agricultural domain.

Yi-Chia Chang, a Ph.D. student at the U. of I., focuses his research on machine learning (ML) and remote sensing. His most recent research, posted on arXiv and accepted to IEEE IGARSS 2025, concerns crop mapping.

Imagine you’re a farmer, and you’re planning what to grow this season. You may want to know what crop would be most valuable to grow. If you’re a policymaker, you might want to know if there would be a shortage of a particular crop and incentivize farmers to grow it through subsidies. To make those decisions, you’d have to know what’s currently growing – that’s where crop mapping comes into play.

Crop mapping uses satellite imagery to create a map of all the crop types in a particular region. Crop maps are essential tools for monitoring crops and regional food supplies, and they help farmers plan which crops to plant in a growing season. The maps can also help with smart farming – applications built on them can monitor crop growth, precipitation conditions and even disease, and can support yield predictions.

All these tools are great for farmers, but they also help at a larger scale, allowing policymakers and organizations to determine how much food, and of what types, is being produced in a given area. Machine learning is essential to keeping these crop maps up to date. In the U.S. alone, there are millions of acres of farmland to analyze, label and map. There aren’t enough experts to keep up with the data and produce current, accurate crop maps, so training machines to scan satellite images and label crops is far more efficient.

Researchers have had great success training machines to recognize not only crops but many other elements of farming from satellite imagery. They’ve created accurate models for crop mapping in well-researched regions like the U.S. However, there has been little research on how well these models work in new geographic areas, especially in regions where data is lacking. This raises concerns about "geospatial bias," meaning models trained on data from well-developed countries may not perform well in less-developed regions.

Our research will enable better-informed agricultural systems for policymakers and stakeholders to support global food security.

–Yi-Chia Chang, University of Illinois

Chang’s study, which was inspired by his team’s previous related research published in NeurIPS 2023 proceedings, looks at how popular Earth observation models work when applied to new regions, particularly in agriculture, where differences in farming practices and uneven data availability make it harder to transfer knowledge between areas. To do this, Chang chose four major crops – maize, soybean, rice and wheat – and then tested three widely used pre-trained models, comparing their performance on data they had seen before (in-distribution) versus data from new regions (out-of-distribution).
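The in-distribution versus out-of-distribution comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the region names, the toy predictions and the helper functions `accuracy` and `id_vs_ood_accuracy` are hypothetical and are not taken from the study or from TorchGeo.

```python
from collections import defaultdict


def accuracy(pairs):
    """Fraction of (true, predicted) pairs that agree; None if there are none."""
    pairs = list(pairs)
    if not pairs:
        return None
    return sum(truth == pred for truth, pred in pairs) / len(pairs)


def id_vs_ood_accuracy(records, train_regions):
    """Score predictions separately for regions seen and not seen in training.

    records: iterable of (region, true_label, predicted_label) tuples
    train_regions: set of region names the model was trained on (hypothetical)
    """
    buckets = defaultdict(list)
    for region, truth, pred in records:
        key = "in_distribution" if region in train_regions else "out_of_distribution"
        buckets[key].append((truth, pred))
    return {
        "in_distribution": accuracy(buckets["in_distribution"]),
        "out_of_distribution": accuracy(buckets["out_of_distribution"]),
    }


# Toy predictions invented purely for illustration (not results from the study).
records = [
    ("region_a", "maize", "maize"),
    ("region_a", "wheat", "wheat"),
    ("region_a", "rice", "rice"),
    ("region_a", "soybean", "maize"),
    ("region_b", "maize", "wheat"),
    ("region_b", "rice", "rice"),
]
scores = id_vs_ood_accuracy(records, train_regions={"region_a"})
print(scores)
```

A gap between the two scores is the kind of signal that would point to the "geospatial bias" discussed earlier.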

The results showed that models pre-trained on Sentinel-2 satellite imagery (the SSL4EO-S12 dataset) performed better than those pre-trained on general image datasets like ImageNet.

“By harmonizing crop type datasets across five continents, we found that foundation models pre-trained on full spectral bands of Sentinel-2 perform better for crop-type mapping,” said Chang. “Our research also shows that training with out-of-distribution data can boost performance when the in-distribution data is scarce. In the long run, we still hope to acquire larger and more balanced labeled datasets since those can help achieve the best crop-type mapping results. I am excited to see how foundation models and transfer learning can benefit food security.”

Chang’s work has been fully integrated with TorchGeo, an open-source library for geospatial machine learning, so future research can easily develop further based on his results. As his team looks ahead, they plan to build upon the results of this study and apply their methodology to new smart-farming models.

“Our future work will focus on expanding crop-type datasets and developing agriculture-specific pre-trained models,” said Chang. “We will also establish benchmarks for agricultural applications of foundation models, such as crop-type mapping and crop-yield prediction, bridging the gap between GeoAI and food security solutions.”

Chang’s work required massive amounts of storage and compute power to complete. GPUs were necessary for the machine-learning aspect of the project to be completed in a timely manner, but a lot of space was also needed for all that satellite imagery.

HPC resources significantly accelerate the machine learning workflows using GPUs, reducing model training time from hours on CPUs to minutes on GPUs. Additionally, the large data-storage allocation enables us to efficiently manage the training datasets, pre-trained weights and model outputs in the cluster.

–Yi-Chia Chang, University of Illinois

Chang has experience using research computing. Prior to this project, he utilized the campus cluster hosted by a research group led by Arindam Banerjee, a professor of computer science at U. of I. Even with his previous experience with high-performance computing (HPC), Chang was happy to report that moving his project onto Delta was relatively simple.

“My experience using Delta has been smooth and user-friendly. The admin staff was responsive, approving token exchange for GPU hours and storage allocations within a few days. The technical staff efficiently helped with troubleshooting. I’d like to send a special thanks to Brett Bode for helping to allocate over 50 TB of storage for satellite imagery.”

For more information about getting an allocation on Delta, University of Illinois researchers can make a request through Illinois Computes. For larger allocations, or for researchers from outside the U. of I., visit the ACCESS allocations page to request time on Delta or DeltaAI.


ABOUT DELTA AND DELTAAI
NCSA’s Delta and DeltaAI are part of the national cyberinfrastructure ecosystem through the U.S. National Science Foundation ACCESS program. Delta (OAC 2005572) is a powerful computing and data-analysis resource combining next-generation processor architectures and NVIDIA graphics processors with forward-looking user interfaces and file systems. The Delta project partners with the Science Gateways Community Institute to empower broad communities of researchers to easily access Delta and with the University of Illinois Division of Disability Resources & Educational Services and the School of Information Sciences to explore and reduce barriers to access. DeltaAI (OAC 2320345) maximizes the output of artificial intelligence and machine learning (AI/ML) research. Tripling NCSA’s AI-focused computing capacity and greatly expanding the capacity available within ACCESS, DeltaAI enables researchers to address the world’s most challenging problems by accelerating complex AI/ML and high-performance computing applications running terabytes of data. Additional funding for DeltaAI comes from the State of Illinois.



New geospatial intelligence methodology makes land use management more accurate and faster




A technique developed by researchers was tested in the Brazilian state of Mato Grosso and more accurately delineated areas of natural vegetation and agricultural production by crop type; the results showed 95% accuracy in mapping




News Release 
Fundação de Amparo à Pesquisa do Estado de São Paulo


Image: The researchers applied the new methodology in Mato Grosso using data from the 2016/2017 strategic harvest

Credit: Research Progress and Challenges of Agricultural Information Technology




Researchers from São Paulo State University (UNESP), at its Tupã campus in Brazil, have developed and tested a new geospatial intelligence methodology that can contribute more quickly and accurately to land use management and territorial planning projects. With this tool, it was possible to precisely delineate areas of Amazon rainforest, Cerrado vegetation (the Brazilian savannah-like biome), pastures and agricultural crops in a double-cropping system, something that can provide support for public policies aimed at agricultural production and environmental conservation.

By combining analysis-ready data cube architecture, disseminated in Brazil through the Brazil Data Cube project led by the National Institute for Space Research (INPE), with the Geobia (Geographic Object-Based Image Analysis) approach, the scientists were able to identify vegetation and double cropping – for example, soy and corn – over the course of a harvest in the state of Mato Grosso. They used time series of satellite images from NASA’s Modis (Moderate Resolution Imaging Spectroradiometer) sensor.

The results showed that the proposed combination, coupled with machine learning (artificial intelligence) algorithms, achieved 95% mapping accuracy.

Geobia is a technique that processes satellite images by segmenting them, grouping similar pixels into geo-objects whose characteristics, such as shape, texture and reflectance, can then be studied. In many cases, this allows for a more realistic interpretation. Data cubes, on the other hand, store information in dimensions – time and place – making it easier to aggregate and visualize information related to a specific location in a specific time period, such as crop areas in a harvest year.
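To make the idea of grouping similar pixels into geo-objects concrete, here is a minimal sketch in Python. It uses a simple region-growing pass over a tiny grid of reflectance-like values. This is an assumption chosen for illustration, not the segmentation algorithm used in the study, and the grid values, the `tolerance` parameter and the function names are hypothetical.

```python
def segment(grid, tolerance):
    """Label 4-connected regions by growing each region across neighbouring
    pixels whose values differ by at most `tolerance`.

    grid: list of rows of reflectance-like values.
    Returns a same-shaped grid of integer object ids (0, 1, 2, ...).
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[None] * cols for _ in range(rows)]
    next_id = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] is not None:
                continue  # pixel already belongs to an object
            labels[r0][c0] = next_id
            stack = [(r0, c0)]
            while stack:
                r, c = stack.pop()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] is None
                            and abs(grid[nr][nc] - grid[r][c]) <= tolerance):
                        labels[nr][nc] = next_id
                        stack.append((nr, nc))
            next_id += 1
    return labels


def object_means(grid, labels):
    """Mean value of each geo-object, one simple per-object characteristic."""
    sums, counts = {}, {}
    for row_values, row_labels in zip(grid, labels):
        for value, label in zip(row_values, row_labels):
            sums[label] = sums.get(label, 0.0) + value
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


# Toy 3x3 "image": a bright patch (top left) next to a dark patch.
grid = [
    [0.80, 0.82, 0.30],
    [0.81, 0.79, 0.31],
    [0.80, 0.29, 0.30],
]
labels = segment(grid, tolerance=0.05)
means = object_means(grid, labels)
print(labels)
print(means)
```

Each object id then stands for one geo-object, and per-object statistics such as the mean replace the individual pixel values as the inputs to classification.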

Currently, mapping typically analyzes each pixel in isolation, which creates edge problems, with blurring in some areas. “Scientific work has highlighted spectral confusion in border zones between different land uses as an area for improvement. So we decided to segment the images and evaluate the geographical object as the minimum unit of analysis, rather than the pixel. It’s as if the image were broken down and classified according to each piece. In this way, we were able to reduce recurring edge errors and accurately identify the targets, even with moderate spatial resolution,” Michel Eustáquio Dantas Chaves, professor at the Faculty of Science and Engineering of UNESP and corresponding author of the article, told Agência FAPESP.
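The effect of treating the object, rather than the pixel, as the unit of analysis can be illustrated with a small Python sketch. The article describes classifying the geo-objects themselves. The sketch below substitutes a simpler stand-in, a majority vote over per-pixel labels within each segment, to show how object-level decisions can absorb isolated edge errors. The labels, the segment map and the function name are all hypothetical.

```python
from collections import Counter


def object_based_labels(pixel_labels, segments):
    """Relabel every pixel with the majority class of the segment it belongs to."""
    votes = {}
    for row_labels, row_segments in zip(pixel_labels, segments):
        for label, seg in zip(row_labels, row_segments):
            votes.setdefault(seg, Counter())[label] += 1
    winner = {seg: counts.most_common(1)[0][0] for seg, counts in votes.items()}
    return [[winner[seg] for seg in row] for row in segments]


# Hypothetical per-pixel predictions; pixel (1, 1) sits on the boundary
# between the two land uses and has been mislabelled as "soy".
pixel_labels = [
    ["forest", "forest", "soy"],
    ["forest", "soy",    "soy"],
    ["forest", "soy",    "soy"],
]
# Hypothetical segment ids, e.g. produced by a prior segmentation step.
segments = [
    [0, 0, 1],
    [0, 0, 1],
    [0, 1, 1],
]
cleaned = object_based_labels(pixel_labels, segments)
print(cleaned)
```

In this toy case the stray "soy" pixel inside segment 0 is outvoted by its four "forest" neighbours, so the boundary follows the segment outline instead of the noisy pixel predictions.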

Chaves has been using data cube architecture for several years to develop tools that contribute to analyses focused on the advancement of the agricultural frontier, especially in the Cerrado.

According to the professor, the methodology can be replicated to evaluate images from other Earth observation satellites, such as Landsat and Sentinel, which provide data for scientific studies, mapping and monitoring. Images from both are now being processed by the team coordinated by the professor.

The article describing the methodology was published in the special issue Research Progress and Challenges of Agricultural Information Technology of the scientific journal AgriEngineering. The study was supported by FAPESP through three projects (21/07382-2, 23/09903-5 and 24/08083-7).

Application in practice

Mato Grosso leads national grain production with 31.4% of the country’s total, followed by the states of Paraná (12.8%) and Rio Grande do Sul (11.8%). The state is expected to reach 97.3 million tons in the 2024/2025 harvest, an increase of 4.4% over the previous harvest, according to the National Supply Company (CONAB). Almost half of this production (46.1 million tons) is expected to be soybeans.
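The figures in the paragraph above can be checked with a little arithmetic. The short Python sketch below uses only the numbers quoted there. The variable names are illustrative, and the previous-harvest value is merely what the stated 4.4% increase implies, not a figure reported by CONAB.

```python
# Figures quoted in the text, in million tons.
total_2024_25 = 97.3      # expected Mato Grosso grain harvest, 2024/2025
soybeans_2024_25 = 46.1   # expected soybean portion of that harvest
growth = 0.044            # stated increase over the previous harvest

# Soybean share of the state's expected production ("almost half").
soy_share = soybeans_2024_25 / total_2024_25

# Previous harvest implied by a 4.4% increase: previous * (1 + growth) = total.
implied_previous = total_2024_25 / (1 + growth)

print(f"Soybean share: {soy_share:.1%}")
print(f"Implied previous harvest: {implied_previous:.1f} million tons")
```

The soybean share works out to roughly 47%, which is consistent with the "almost half" wording.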

In addition, Mato Grosso is one of the most biodiverse states in the country, containing parts of three of Brazil’s six biomes. Around 53% of its territory is in the Amazon, 40% in the Cerrado and 7% in the Pantanal.

Due to this heterogeneity of land uses and vegetation types in the territory, the researchers applied the new methodology in Mato Grosso using data from the 2016/2017 strategic harvest, in which Brazil produced 115 million tons of soybeans, of which 30.7 million tons were in the state. Land use classifications were associated with agricultural land (fallow-cotton, soybean-cotton, soybean-corn, soybean-fallow, soybean-millet and soybean-sunflower), as well as sugarcane crops, urban areas and water bodies.

The results showed an overall accuracy of 95%, demonstrating the potential of the approach to provide mapping that optimizes forest and agricultural land delineation. “Since the approach manages to identify the targets in a consistent manner, the methodology can be applied to the estimation of areas within the same harvest, favoring productivity estimates; in territorial planning actions and anything that deals with land use and land cover for decision-making,” explains Chaves about the application of the tool.
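Overall accuracy, as the term is commonly used in remote sensing, is the share of reference samples whose mapped class matches the reference class. In other words, it is the sum of the diagonal of the confusion matrix divided by the total count. The Python sketch below shows that calculation. The three classes and the counts are invented for illustration and are not the study's data.

```python
def overall_accuracy(confusion):
    """Correctly classified samples (the diagonal) divided by all samples.

    confusion[i][j] = number of samples of reference class i mapped as class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical counts for three classes (e.g. forest, pasture, cropland).
confusion = [
    [45, 3, 2],
    [4, 44, 2],
    [1, 3, 46],
]
oa = overall_accuracy(confusion)
print(f"Overall accuracy: {oa:.1%}")
```

The off-diagonal cells show which classes are confused with one another, which the single overall figure does not reveal.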

The professor explains that the methodology also makes it possible to analyze disturbances in forests and other types of natural vegetation. “It’s quicker to detect deforestation than degradation. This method allowed us to detect these variations more quickly.”

In the article, the scientists pay tribute to Professor Ieda Del’Arco Sanches, a remote sensing researcher at INPE who died in January. “This article is a way of thanking her for her teachings and following her legacy. Ieda always worked to accurately assess the Earth’s surface and to treat the data ethically and responsibly, showing how they can contribute to the construction of public policies,” adds Chaves.

About FAPESP

The São Paulo Research Foundation (FAPESP) is a public institution with the mission of supporting scientific research in all fields of knowledge by awarding scholarships, fellowships and grants to investigators linked with higher education and research institutions in the state of São Paulo, Brazil. FAPESP is aware that the very best research can only be done by working with the best researchers internationally. Therefore, it has established partnerships with funding agencies, higher education, private companies, and research organizations in other countries known for the quality of their research and has been encouraging scientists funded by its grants to further develop their international collaboration.


