Abstract

Crop maps are crucial for agricultural monitoring and food management and can additionally support domain-specific applications, such as planning cold supply chain infrastructure in developing countries. Machine learning (ML) models, combined with freely available satellite imagery, can be used to produce cost-effective crop maps at high spatial resolution. However, obtaining ground truth data for supervised learning is especially challenging in developing countries because of factors such as smallholding and fragmented geography, which often results in a lack of crop type maps or even of reliable cropland maps. Our area of interest for this study lies in Himachal Pradesh, India, where we aim to produce an open-access binary cropland map at 10-m resolution for the Kullu, Shimla, and Mandi districts. To this end, we developed an ML pipeline that relies on Sentinel-2 satellite image time series. We investigated two pixel-based supervised classifiers, support vector machines (SVM) and random forest (RF), which classify per-pixel time series for binary cropland mapping. The ground truth data used for training, validation, and testing were manually annotated from a combination of field survey reference points and visual interpretation of very high resolution (VHR) imagery. We trained and validated the models via spatial cross-validation to account for local spatial autocorrelation and to improve their generalization capability. We tested the models on held-out test sets for each district, achieving an average accuracy of 87% for the RF, our best model. We found the near-infrared (NIR) band at the early and late stages of the apple harvest season (apple being the main crop in the region) to be of critical importance for the model. Finally, we used this model to generate a cropland map for the three districts of Himachal Pradesh, spanning 14,600 km², which improves on the resolution and quality of existing public maps, and we made the code open source.
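
To illustrate the idea behind spatial cross-validation, the following minimal sketch assigns pixels to grid blocks and keeps each block entirely within one fold, so that spatially neighbouring (and therefore autocorrelated) pixels are not split between training and validation. It is not the pipeline used in this study: the function name `spatial_block_folds`, the block size, the number of folds, and the synthetic coordinates are all assumptions made for illustration.

```python
from collections import defaultdict


def spatial_block_folds(coords, block_size, n_folds):
    """Yield (train_idx, val_idx) with every spatial block kept in a single fold.

    Hypothetical helper: coords is a list of (x, y) positions in metres.
    """
    # Group sample indices by the grid block that contains them.
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        blocks[(int(x // block_size), int(y // block_size))].append(i)

    # Assign whole blocks to folds round-robin, in a deterministic order.
    fold_members = [[] for _ in range(n_folds)]
    for k, key in enumerate(sorted(blocks)):
        fold_members[k % n_folds].extend(blocks[key])

    # Each fold serves once as the validation set; the rest form the training set.
    for k in range(n_folds):
        val = sorted(fold_members[k])
        train = sorted(
            i
            for j, members in enumerate(fold_members)
            if j != k
            for i in members
        )
        yield train, val


# Assumed example: a 20 x 20 grid of pixels at 10 m spacing, 50 m blocks, 4 folds.
coords = [(x * 10.0, y * 10.0) for x in range(20) for y in range(20)]
folds = list(spatial_block_folds(coords, block_size=50.0, n_folds=4))
```

In practice, each `(train, val)` index pair would select the per-pixel time series used to fit and evaluate a classifier such as an RF; larger blocks give a stricter test of spatial generalization at the cost of fewer independent blocks per fold.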

Details