1. Introduction
Landslides play an important role in the landscape evolution of the Loess Plateau in northwestern China. Every year, one third of the geohazards in China occur in the Loess Plateau [
1], and most of them are landslides, which cause substantial damage to buildings, farmland, gas and oil pipelines, highways and railways, and even human life [
2,
3,
4]. It has been determined that more than 14,544 landslides have occurred in the Chinese Loess Plateau [
5]. Field investigation of geological hazards in the Loess Plateau suggests that earthquakes, rainfall, and human activities are common triggers for loess landslides [
6]. In addition to new landslides, there is a risk that old landslides could slide again. An old landslide is the result of prolonged and intricate geological processes occurring on slopes [
7]. Although most old landslides are stable, triggers such as human activities, earthquakes, and rainfall can reactivate them. In the 1950s, the Wolongsi old landslide in the Xi’an-Baoji section of the Longhai Railway slid again, with a sliding area of about 33 × 104
and a volume of about 2.0 ×
, pushing the Longhai Railway southward by more than 100 m and interrupting the railway for several days. From the 1950s to the 1970s, more than 170 large and medium-sized landslides occurred along the 98 km from Baoji Gorge to Changxing, nearly half of which were old landslides [
8]. Recently, Hu et al. [
9] investigated the Beiguo landslide in Heyang County, Shaanxi Province, China, and found that since 2011, there have been several signs of local reactivation. In 2017, the landslide was completely triggered by rainfall. Zhang et al. [
10] presented a typical case (the Zhongzhai landslide) triggered by a succession of torrential rainfall events in October 2021 in Niangniangba town, Tianshui, Gansu, China, which buried two houses and damaged another two houses. To reduce the losses caused by the reactivation of old landslides, it is necessary to detect old landslides in the Loess Plateau accurately and efficiently, so that the results can support early warning of reactivation.
At present, researchers pay more attention to the detection of new landslides. Compared with old loess landslides, new loess landslides generally have obvious signs, such as bare ground and discontinuity of vegetation. For old landslides, because a relatively long time has passed since their occurrence, their shapes have typically changed greatly and they may be covered by dense vegetation. Therefore, detecting old loess landslides effectively remains a challenging problem.
Remote sensing data have been applied to agriculture [
11,
12,
13], forestry [
14,
15], meteorology [
16,
17], and other fields successfully, including images from satellites [
18,
19] and unmanned aerial vehicles (UAVs) [
20,
21]. Remote sensing offers a wide observation range, fast acquisition, and a short cycle for obtaining data with high spatial resolution, so detecting landslides with remote sensing technology has become a trend [
22].
Traditional landslide detection methods are mainly based on visual interpretation [
23]. Landslides are identified through interpretation signs such as discontinuities in vegetation texture, the landslide backwall, and shear cracks. Most visual interpretation is conducted directly on aerial or satellite images [
24]. This approach relies heavily on expert knowledge. Manual interpretation is therefore labor intensive and time consuming when a large amount of data must be interpreted, and it is inefficient for detecting old loess landslides across a large area [
25].
Subsequently, machine learning (ML) methods were proposed for automatic detection. Colors, textures, and edges in the image were used as landslide-detection features for these methods. Bui et al. [
26] used a support vector machine (SVM) to detect landslides in tropical environments with a combination of airborne synthetic aperture radar (AIRSAR) data and susceptibility mapping based on a geographic information system. Furthermore, Dou et al. [
27] proposed an ensemble method consisting of four models (SVM-Stacking, SVM, SVM-Bagging and SVM-Boosting) to obtain landslide susceptibility data. Similarly, Tavakkoli et al. [
28] proposed a method that combines object-based image analysis (OBIA) with three machine learning methods for landslide detection. These machine learning methods have greatly improved the efficiency of landslide detection. However, they share a disadvantage: their features are manually designed rather than learned, which leads to a lack of generalization [
29].
Recently, an increasing number of deep learning methods have been used in the field of remote sensing [
30,
31,
32] and to detect landslides [
33,
34,
35,
36,
37]. Ye et al. [
38] used deep belief networks (DBN) to predict landslide susceptibility. Ji et al. [
39] used convolutional neural network (CNN)-based methods to detect landslides with high accuracy. Li et al. [
40] used Faster-RCNN (Region-CNN) to detect landslides within large-scale satellite images. Wang et al. [
41] proposed a novel deep learning method for landslide identification, combining YOLO and U-Net models. However, CNN-based models have some limitations in modeling global information due to their use of convolutional kernels. In 2017, the transformer, which uses a self-attention mechanism to learn global features well, was proposed and first used for natural language processing (NLP) [
42]. Then, Dosovitskiy et al. [
43] proposed the vision transformer (ViT), which was the first successful application of the transformer to image classification tasks. After that, an increasing number of transformer-based object detection models were proposed, and these were used for landslide detection with remote sensing images. Tang et al. [
44] used the transformer-based semantic segmentation model SegFormer to identify coseismic landslides, and it performed better than CNNs in landslide detection. Lv et al. [
45] proposed a pyramid vision transformer (PVT) model for landslide detection, which directly models the global information of different scales in remote sensing images. These transformer-based models detect landslides well because they learn global features more effectively. However, there are still great challenges for old loess landslide detection using high-resolution remote sensing images, mainly including:
Old loess landslides occurred a relatively long time ago, and because loess is loose and porous, their shapes have changed over time and they may be covered with vegetation, which makes them difficult to recognize in high-resolution remote sensing images.
A high-resolution remote sensing image contains only the orthophoto view of old loess landslides, from which it is difficult for trained models to recognize them. In practice, experts usually interpret old landslides by rotating the view angle in order to find more features and recognize them (
Figure 1). There is still no effective automatic method that simulates this process to detect old loess landslides intelligently.
Detection models based on CNNs or transformers mainly extract local or global features of remote sensing images, respectively. Neither type alone uses the various features in the image effectively, which makes detection more difficult.
In this paper, considering the above challenges and the properties of CNNs and transformers, and inspired by how human experts interpret landslides from different views, a novel optimal-view and multi-view strategic hybrid deep learning (OMV-HDL) method is proposed to detect old loess landslides effectively. The OMV-HDL consists of two steps: a training step and a detection step. During the training step, the optimal-view dataset is established to train the hybrid deep learning (HDL) model. During the detection step, the multi-view images are obtained by multi-view automatic cropping (MAC), and they are then fed to the trained HDL model in parallel to detect old loess landslides independently. After that, detection results from the various views are fused by the weighted boxes fusion (WBF) algorithm to yield the final result. The proposed method has a high detection performance for old loess landslides. The main contributions of this paper are as follows:
An HDL model that combines the advantages of CNNs and transformers is proposed, and it can extract global and local features of images at the same time. As such, it can detect old loess landslides effectively. The proposed model consists of the CNN-based YOLOv5 object detection model and the detection transformer (DETR) model, and weighted boxes fusion (WBF) is introduced to fuse the results of the two models and obtain comprehensive detection results.
The optimal and multi-view (OMV) strategy was proposed to detect old landslides effectively and efficiently. During the training process, more obvious features of old landslides can be learned from optimal-view images, while traditional learning methods only use orthophoto images, in which old landslides cannot be observed clearly. During detection in a new area, because the optimal view is unknown, we propose the multi-view strategy instead to detect old landslides with a trained model, which can be implemented in parallel without increasing detection time.
An optical remote sensing dataset with optimal images from the Yan’an area (YA-OP) was constructed as a benchmark for old landslide detection, and it can be used for related research about old landslides in the Loess Plateau.
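The fusion step named in the contributions can be illustrated with a short sketch. The code below is a simplified, single-class version of weighted boxes fusion written for illustration only; it is not the authors' implementation, and the function names (`box_iou`, `fuse_cluster`, `weighted_boxes_fusion`), the box layout `(x1, y1, x2, y2, confidence)`, and the default thresholds are assumptions. Boxes from all detectors are sorted by confidence, each box joins the first cluster whose fused box it overlaps sufficiently, and each cluster is replaced by a confidence-weighted average box.

```python
from typing import List, Tuple

# Hypothetical box layout for this sketch: (x1, y1, x2, y2, confidence)
Box = Tuple[float, float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def fuse_cluster(cluster: List[Box]) -> Box:
    """Confidence-weighted mean of coordinates; mean confidence."""
    total = sum(b[4] for b in cluster)
    x1, y1, x2, y2 = (sum(b[i] * b[4] for b in cluster) / total
                      for i in range(4))
    return (x1, y1, x2, y2, total / len(cluster))


def weighted_boxes_fusion(detections_per_model: List[List[Box]],
                          iou_thr: float = 0.8,
                          conf_thr: float = 0.5) -> List[Box]:
    """Fuse the boxes predicted by several detectors (or several views)."""
    # Drop low-confidence boxes, then process the rest from most confident.
    boxes = sorted((b for dets in detections_per_model for b in dets
                    if b[4] >= conf_thr),
                   key=lambda b: b[4], reverse=True)
    clusters: List[List[Box]] = []
    fused: List[Box] = []
    for box in boxes:
        best, best_iou = -1, iou_thr
        for i, f in enumerate(fused):
            v = box_iou(box, f)
            if v > best_iou:
                best, best_iou = i, v
        if best < 0:
            clusters.append([box])      # start a new cluster
            fused.append(box)
        else:
            clusters[best].append(box)  # update the matched cluster
            fused[best] = fuse_cluster(clusters[best])
    # Down-weight boxes that only some of the detectors agreed on.
    n = len(detections_per_model)
    return [(f[0], f[1], f[2], f[3], f[4] * min(len(c), n) / n)
            for f, c in zip(fused, clusters)]


# Made-up detections from two detectors on the same image
yolo_boxes = [(10.0, 10.0, 50.0, 50.0, 0.9)]
detr_boxes = [(12.0, 12.0, 52.0, 52.0, 0.7), (100.0, 100.0, 140.0, 140.0, 0.6)]
fused_boxes = weighted_boxes_fusion([yolo_boxes, detr_boxes])
```

In this toy input, the two overlapping boxes are merged into one box whose coordinates lean toward the more confident prediction, while the box found by only one detector is kept with a reduced confidence. The same routine could, under the same assumptions, be applied to boxes coming from different views once they are expressed in a common coordinate system.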
The rest of this paper is organized as follows:
Section 2 illustrates the details of the study area,
Section 3 describes the proposed method for old loess landslide detection,
Section 4 presents the experimental results, and conclusions are given in
Section 5.
2. Description of the Study Area
The study area is located in the north of Shaanxi province, China, and includes four counties: Wuqi, Ansai, Zhidan, and Jingbian. Among them, Wuqi, Zhidan, and Ansai counties belong to the jurisdiction of Yan’an City, while Jingbian county belongs to the jurisdiction of Yulin City. The location of this area is between the latitudes of
N–
N and the longitudes of
E–
E, in the central part of the Loess Plateau (
Figure 2).
This area has an inland arid and semi-arid climate with four distinct seasons, sufficient sunlight, and a large temperature difference between day and night, with an annual average daily temperature range of 10.9–14.9 °C across the entire area. The average annual temperature is 7.7–10.6 °C, with an average annual sunshine of 2300–2700 h and an average annual precipitation of about 500 mm.
The thick loess accumulation in this area leads to severe soil erosion, crisscrossing gullies, fragmented terrain, and the frequent occurrence of geological disasters such as landslides and collapses. Because the area is covered by loess, most landslides here are loess landslides, and rock landslides are rare. Loess landslides mainly develop at shallow to medium depths, with few deep landslides.
The main development characteristics of loess landslides in this area include cracks on the slope, multi-level terraces, and small-scale collapses and landslides at the front edge. For old loess landslides, global features including double groove with same source (
Figure 3a) and armchair-shape (
Figure 3b) are usually observed.
Double groove with same source refers to two grooves that form on both sides of the landslide body and merge into the same ditch upstream. This is due to the erosive effect of water flow: when it rains, the water on the landslide body runs toward both sides of the slope and cuts these grooves. Armchair-shape refers to the armchair-like form that the backwall of the landslide usually presents.
In addition to these global features, old loess landslides also have some local features, such as the landslide backwall, radial cracks on the slope, and differences between the vegetation on the landslide and that in the surrounding areas (
Figure 3b). These local features can help with the detection of old loess landslides.
5. Discussion
Observations from the results of the experiment using the Yan’an optimal-view dataset and orthophoto-view dataset are summarized as follows:
First, DETR performs much better than YOLOv5 on both datasets. This is because the DETR model pays more attention to the global features of the image, such as the overall shape of the landslide, the surface deformation around the landslide, and the geomorphological features, whereas the CNN-based YOLO model pays more attention to the local features of the image, such as the local optical features of the landslide body, the landslide tongue, and the backwall of the landslide, as well as the local vegetation discontinuity and surface deformation at the edge of the landslide. For old loess landslides, after wind, sand, and water erosion, the overall shape and geomorphological features of the landslide have not changed much, but the local features tend to become less noticeable, so a model that relies mainly on local features has more difficulty identifying them. Second, HDL-WBF performs better than the DETR model on both datasets. This indicates that the WBF algorithm fuses the results of DETR and YOLOv5 effectively and that the two sets of results are complementary, that is, the two models pay attention to different features. Third, DETR, YOLOv5, and HDL-WBF all perform better on the optimal-view dataset than on the orthophoto-view dataset. This suggests that optimal-view images contain more abundant optical features than orthophoto images, which helps with the detection of old loess landslides that do not have obvious optical features in the orthophoto view.
From the results of the experiment on multi-view images in Jingbian county, it is observed that although the HDL-WBF model was not trained using multi-view images, it still obtained good detection results. This indicates that the optimal-view and multi-view strategy we proposed is effective at detecting old loess landslides.
Finally, in the WBF experiments, the F1 score generally increases as the IoU and confidence thresholds increase. This is because, as the IoU threshold and confidence threshold increase, the detected results become more accurate. However, when the confidence threshold is greater than 0.5 and the IoU threshold is greater than 0.8, the F1 score decreases slightly. This occurs because, as the accuracy increases, additional boxes that may be close to the ground truth are discarded, which lowers the recall rate and in turn the F1 score. Therefore, from the perspective of the overall variation of the F1 score, the optimal threshold for IoU is 0.8, and the optimal threshold for confidence is in the range of 0.5 to 0.9.
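The trade-off between accuracy and recall described above can be reproduced on a toy example. The sketch below is illustrative only: the function names, the box layout, and the data are made up, and the IoU threshold is used here as the criterion for matching a predicted box to a ground-truth box, which is an assumption and not necessarily how the thresholds enter the WBF experiments. It computes precision, recall, and F1 = 2PR / (P + R) for a given pair of thresholds.

```python
from typing import List, Sequence, Tuple


def box_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union; only the first four values (x1, y1, x2, y2) are used."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def f1_at_thresholds(predictions: List[Tuple[float, float, float, float, float]],
                     ground_truth: List[Tuple[float, float, float, float]],
                     iou_thr: float,
                     conf_thr: float) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) after confidence filtering and IoU matching."""
    kept = sorted((p for p in predictions if p[4] >= conf_thr),
                  key=lambda p: p[4], reverse=True)
    matched = set()
    true_pos = 0
    for pred in kept:
        best_j, best_v = -1, iou_thr
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue  # each ground-truth box can be matched only once
            v = box_iou(pred, gt)
            if v >= best_v:
                best_j, best_v = j, v
        if best_j >= 0:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(kept) if kept else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Made-up ground truth and predictions (x1, y1, x2, y2[, confidence])
truth = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0)]
preds = [(0.0, 0.0, 10.0, 10.0, 0.9),     # correct, high confidence
         (20.0, 20.0, 30.0, 31.0, 0.4),   # correct, low confidence
         (50.0, 50.0, 60.0, 60.0, 0.6)]   # false alarm

loose = f1_at_thresholds(preds, truth, iou_thr=0.5, conf_thr=0.3)
strict = f1_at_thresholds(preds, truth, iou_thr=0.5, conf_thr=0.7)
```

With the loose confidence threshold, both landslides are found at the cost of one false alarm; with the strict threshold, precision reaches 1 but the low-confidence correct box is discarded, so recall and F1 fall. This mirrors the slight decrease in F1 reported at high thresholds.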
Although the proposed model has achieved good results in testing and detection, it has some drawbacks. First, compared with orthophoto-view images, the process of interpreting optimal-view images still requires more labor. If self-supervised methods were used in interpretation, labor costs could be reduced further. Second, the optimal IoU threshold and confidence threshold of WBF in our method were obtained by analyzing the images used in the experiment, not by automatic selection. It may be necessary to conduct experiments to update the optimal thresholds when the proposed method is used in areas such as Sichuan and Yunnan provinces, where landslides differ in vegetation type and size.