
Mapping the Earth's crops with the help of AI can help farmers and policymakers improve planning

By Peter Fitzgibbon - 29th March 2025 - 13:13

As agriculture continues to become more entwined with technology, researchers are exploiting new computing techniques and resources to bring greater precision to crop mapping


Smart farming—a phrase that encompasses research computing tools that help farmers to better address issues like crop disease, drought and sustainability—has quickly become a ubiquitous term in Ag labs across the country. The availability of NCSA resources such as Delta for researchers, both nationally and on the University of Illinois Urbana-Champaign (U. of I.) campus, has fostered a hotbed of cutting-edge research projects in the agricultural domain.

Yi-Chia Chang, a Ph.D. student at the U. of I., focuses his research on machine learning (ML) and remote sensing. His most recent research, published on the arXiv preprint server and accepted for presentation at the IEEE IGARSS 2025 conference, concerns crop mapping.

Imagine you're a farmer, and you're planning what to grow this season. You may want to know what crop would be most valuable to grow. If you're a policymaker, you might want to know if there would be a shortage of a particular crop and incentivize farmers to grow it through subsidies. To do this, you'd have to know what's currently growing to make those decisions. That's where crop mapping comes into play.

Crop mapping uses satellite imagery to create a map of all the crop types in a particular region. Crop maps are essential tools for monitoring crops and regional food supplies, and they help farmers plan which crops to plant in a growing season. The maps can also support smart farming: applications built on them can monitor crop growth, precipitation conditions and disease, and can predict yields.

All these tools are great for farmers, but they also help at a larger scale, letting policymakers and organizations determine how much food, and what types, are being produced in a given area. Machine learning is an essential component in keeping these crop maps up to date.

In the U.S. alone, there are millions of acres of farmland to analyze, label and map. There aren't enough experts to analyze and keep up with data to create up-to-date, accurate crop maps, so training machines to scan satellite images and label crops is far more efficient and useful.

Researchers have had great success training machines to recognize not only crops but many other elements of farming from satellite imagery. They've created accurate models for crop mapping in well-researched regions like the U.S. However, there has been little research on how well these models work in new geographic areas, especially in regions where data is lacking. This raises concerns about "geospatial bias," meaning models trained on data from well-developed countries may not perform well in less-developed regions.

Farmer
While there is relatively good global data on cropland distribution, data on crop types, yields, and management at fine resolution and global scale is still lacking. Yet agriculture plays a direct role in achieving the second Sustainable Development Goal (SDG 2), Zero Hunger, which seeks to simultaneously address food security challenges and global environmental sustainability. The availability of accurate crop mapping models will be an important factor in achieving this goal in less-developed regions.

"Our research will enable better-informed agricultural systems for policymakers and stakeholders to support global food security," says Yi-Chia Chang, University of Illinois.

Chang's study, which was inspired by his team's previous related research published in the NeurIPS 2023 proceedings, looks at how popular Earth observation models work when applied to new regions, particularly in agriculture, where differences in farming practices and uneven data availability make it harder to transfer knowledge between areas.

To do this, Chang chose four major crops (maize, soybean, rice and wheat), tested three widely used pre-trained models, and compared their performance on data from the regions they were trained on (in-distribution) with their performance on data from new regions (out-of-distribution).
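The in-distribution versus out-of-distribution comparison can be sketched in a few lines of Python. This is a minimal illustration, not the study's evaluation code: it assumes mean intersection-over-union (mIoU) as the score, a common segmentation metric that this article does not name, and the per-pixel labels are toy values invented for the example. The helper names `mean_iou` and `generalization_gap` are hypothetical.

```python
from typing import Sequence

# Class indices 0..3; the ordering is illustrative only.
CROPS = ["maize", "soybean", "rice", "wheat"]


def mean_iou(pred: Sequence[int], truth: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union over classes present in pred or truth."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


def generalization_gap(id_score: float, ood_score: float) -> float:
    """Positive values mean the model scores lower on the new region."""
    return id_score - ood_score


# Toy per-pixel labels (assumed data, not results from the study).
id_truth = [0, 0, 1, 1, 2, 2, 3, 3]   # region the model was trained on
id_pred = [0, 0, 1, 1, 2, 2, 3, 0]
ood_truth = [0, 0, 1, 1, 2, 2, 3, 3]  # region the model has never seen
ood_pred = [0, 1, 1, 1, 2, 0, 0, 3]

id_miou = mean_iou(id_pred, id_truth, len(CROPS))
ood_miou = mean_iou(ood_pred, ood_truth, len(CROPS))
gap = generalization_gap(id_miou, ood_miou)
print(f"ID mIoU={id_miou:.3f}  OOD mIoU={ood_miou:.3f}  gap={gap:.3f}")
```

A large positive gap is the kind of signal that would point to the "geospatial bias" described earlier: the model looks accurate at home but degrades in a region it was not trained on.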

The results showed that models pre-trained on Sentinel-2 satellite imagery (the SSL4EO-S12 weights) performed better than those pre-trained on general image datasets such as ImageNet.
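To make the difference in inputs concrete: ImageNet photographs have three channels (red, green, blue), whereas the Sentinel-2 instrument records 13 spectral bands. The sketch below shows how the RGB subset relates to the full band stack. The band names and their Level-1C ordering are standard, but the pixel values are made up and `select_bands` is a hypothetical helper; how the study actually fed inputs to each model is not described in this article.

```python
from typing import List, Sequence

# Sentinel-2 MSI band names in the conventional Level-1C order (13 bands).
SENTINEL2_BANDS = [
    "B01", "B02", "B03", "B04", "B05", "B06", "B07",
    "B08", "B8A", "B09", "B10", "B11", "B12",
]
# The three visible bands an RGB-only model would use: red, green, blue.
RGB_BANDS = ["B04", "B03", "B02"]


def select_bands(
    pixel: Sequence[float],
    wanted: Sequence[str],
    available: Sequence[str] = SENTINEL2_BANDS,
) -> List[float]:
    """Pick the values for the `wanted` bands, in that order, from one pixel."""
    if len(pixel) != len(available):
        raise ValueError("pixel must have one value per available band")
    index = {name: i for i, name in enumerate(available)}
    return [pixel[index[name]] for name in wanted]


# Hypothetical pixel: one placeholder value per band, for illustration only.
pixel = [float(i) for i in range(len(SENTINEL2_BANDS))]

rgb_input = select_bands(pixel, RGB_BANDS)         # 3 values
full_input = select_bands(pixel, SENTINEL2_BANDS)  # all 13 values
print(len(rgb_input), len(full_input))
```

The ten bands left out of the RGB subset include the red-edge, near-infrared and short-wave infrared measurements, which are widely used in vegetation monitoring.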

Crop Mapping
Visualization of example input Sentinel-2 images, ground truth masks, and model predictions using SSL4EO-S12 pre-trained weights. Overall results are promising, with the models capturing the general class distribution and correctly identifying most fields.

"By harmonizing crop-type datasets across five continents, we found that foundation models pre-trained on full spectral bands of Sentinel-2 perform better for crop-type mapping," said Chang.

"Our research also shows that training with out-of-distribution data can boost performance when the in-distribution data is scarce. In the long run, we still hope to acquire larger and more balanced labeled datasets since those can help achieve the best crop-type mapping results. I am excited to see how foundation models and transfer learning can benefit food security."

Sentinel-2
Imagery derived from the Sentinel-2 multi-spectral satellite instrument, made freely available under the European Union’s Copernicus programme, gave promising results in Chang’s research. This illustration shows the various processing steps following image acquisition by the Sentinel-2 ground segment, with only Level-1C and Level-2A products being released to all users. Level-1B products are available for expert users only on request.

Chang's work has been fully integrated with TorchGeo, an open-source library for geospatial machine learning, so future research can easily develop further based on his results. As his team looks ahead, they plan to build upon the results of this study and apply their methodology to new smart-farming models.

TorchGeo
To help realise the potential of deep learning for remote sensing applications, TorchGeo provides a Python library for integrating geospatial data into the PyTorch deep learning ecosystem. TorchGeo provides data loaders for a variety of benchmark datasets, composable datasets for generic geospatial data sources, samplers for geospatial data, and transforms that work with multispectral imagery.

"Our future work will focus on expanding crop-type datasets and developing agriculture-specific pre-trained models," said Chang. "We will also establish benchmarks for agricultural applications of foundation models, such as crop-type mapping and crop-yield prediction, bridging the gap between GeoAI and food security solutions."

Chang's work required massive amounts of storage and compute power to complete. GPUs were necessary for the machine-learning aspect of the project to be completed in a timely manner, but a lot of space was also needed for all that satellite imagery.

"HPC resources significantly accelerate the machine learning workflows using GPUs, reducing model training time from hours on CPUs to minutes on GPUs. Additionally, the large data-storage allocation enables us to efficiently manage the training datasets, pre-trained weights and model outputs in the cluster," says Chang.

Chang has experience using research computing. Prior to this project, he utilized the campus cluster hosted by a research group led by Arindam Banerjee, a professor of computer science at U. of I. Even with his previous experience with high-performance computing (HPC), Chang was happy to report that moving his project onto Delta was relatively simple.

Delta
Delta is a dedicated computing resource designed by Hewlett Packard Enterprise (HPE) and the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. It delivers a highly capable GPU-focused compute environment for GPU and CPU workloads and is the most performant GPU computing resource in the National Science Foundation’s portfolio.

"My experience using Delta has been smooth and user-friendly. The admin staff was responsive, approving token exchange for GPU hours and storage allocations within a few days. The technical staff efficiently helped with troubleshooting. I'd like to send a special thanks to Brett Bode for helping to allocate over 50 TB of storage for satellite imagery."

More information: Yi-Chia Chang et al, On the Generalizability of Foundation Models for Crop Type Mapping, arXiv (2024). DOI: 10.48550/arxiv.2409.09451

Journal information: arXiv

Story Source: University of Illinois at Urbana-Champaign

