1. Introduction
In recent years, tourism has been regarded as a viable path to sustainable development, given its potential to bring in significant foreign exchange earnings, generate local employment, and promote economic growth [1,2,3]. Urbanization has progressively made rural tourism a popular trend [4,5]. Rural tourism has retained its prominent position in the tourism industry, as visitors increasingly prioritize health and wellness in the post-COVID-19 era [6,7].
Rural tourism destinations are of significant interest to the academic community [3,8,9,10,11]. Research on rural tourism concentrates on its contribution to the development of rural areas and its transformative effects on these regions, often through case studies [7,12,13]. The rapid growth of rural tourism has led to an increasingly competitive market environment, which has in turn created a need for scientific advancement in the field [14]. Scholars have increasingly turned their attention to the spatial distribution patterns of rural tourism development and the spatial mechanisms driving these patterns [15]. The intricate spatial distribution patterns observed in rural tourism villages arise from the interplay of numerous influencing factors [16]. Hence, it is essential to investigate both the overall spatial arrangement of rural tourism villages and the specific contextual factors that influence their distribution, in order to understand the dynamics that shape the spatial patterns of rural tourism.
The Seventh National Population Census, conducted in 2020, revealed that more than one-third of China’s population continues to reside in rural areas. Economic development and living conditions in rural areas have therefore been a longstanding concern [17,18]. Shifting market trends have triggered significant changes in the industrial composition of numerous rural regions in China. Consequently, new sectors have emerged, with rural tourism taking center stage as a prevailing mechanism for rural regeneration and conservation. Operating as a multi-industrial sector, rural tourism drives comprehensive economic, social, and spatial transformations and reconstructions within rural areas [3,19,20,21]. By stimulating industrial integration, enhancing the value of agricultural products, augmenting farmers’ earnings, and reinforcing the foundation of the rural collective economy, rural tourism can facilitate notable advancements [22]. In 2018, China introduced the Rural Revitalization Strategy, and the role of rural tourism in stimulating rural revitalization has since received renewed attention [18].
In 2019, the Ministry of Culture and Tourism and the National Development and Reform Commission jointly initiated a national campaign to identify key villages for promoting rural tourism. The screening of key rural tourism villages (key villages) is a multi-level process that begins at the municipal level. Municipal cultural and tourism bureaus first identify villages that have a high concentration of tourism resources, a diversity of tourism products, comprehensive tourism facilities, and attractive rural settings. These villages are then recommended to the provincial cultural and tourism department, after which the Ministry of Culture and Tourism conducts a rigorous evaluation to determine the final list of key villages. Once selected, these villages receive significant government support, including financial aid and training opportunities. As of 2022, four batches of lists had been released, designating a total of 1399 key villages. These designated villages serve as exemplars for advancing the development of high-quality rural tourism [23], contributing to the optimization of rural tourism supply and spearheading the growth of the rural tourism industry. As a result, these key villages have assumed a pivotal role in examining the progress and transformative dynamics of rural tourism in China.
This research investigates the spatial mechanisms underlying key villages in China and identifies the principal factors influencing their development. It applies spatial statistical methods to data derived from the catalog of key rural tourism villages. The study provides a comprehensive overview of rural tourism in China and offers insights into the sustainable advancement of rural tourism.
The subsequent sections of this article are structured as follows:
Section 2 provides a brief review of spatial statistical research on rural tourism in China.
Section 3 presents a comprehensive overview of the research design and analytical approach utilized in this study.
Section 4 examines the spatial configuration of key villages, offering detailed insights into the factors and associated mechanisms that exert influence on these rural tourism destinations and enable the prediction of suitable tourism zones.
Section 5 provides a concise discussion of the preceding sections. Lastly, the implications of the findings for future research and limitations of the research are summarized in the conclusion.
2. Literature Review
Scholars in China have conducted extensive analyses of the spatiotemporal distribution patterns of rural tourism, with particular emphasis on notable instances such as beautiful leisure villages [24], traditional villages [25], and key villages of rural tourism [16,26]. These analyses have been conducted at multiple levels, including national [16,24,25], regional [26,27], and provincial [10,28], with the objective of elucidating the determinants of these distributions. The distribution of rural tourism villages exhibits a pronounced spatial imbalance, with a notable concentration in the eastern region of the country, predominantly southeast of the Hu line [29]. The density of these villages also varies markedly between provinces [10]. Resource endowment and environmental conditions play a decisive role in defining the geographical distribution of rural tourism destinations [16,25]. Moreover, the spatial characteristics and structural arrangements of rural tourism villages are influenced by multiple factors. These include socio-economic elements such as economic status, transit accessibility, tourist demand, policy orientation, and service quality, as well as the availability of natural and cultural resources and geographical factors. The influence of these factors varies in degree, shaping the spatial patterns and organizational models observed in rural tourism villages [29,30]. Despite these valuable findings, empirical research on the spatial mechanisms shaping rural tourism remains limited, warranting further investigation.
Machine learning is a subset of artificial intelligence in which computers learn from data without being explicitly programmed. Using statistical techniques, machine learning algorithms analyze data to identify patterns and generate predictions or decisions [31]. Among these algorithms, the random forest (RF) is an ensemble learning method that constructs a multitude of decision trees during training and outputs the mode of the classes (for classification) or the mean prediction (for regression) of the constituent trees [32,33,34]. Each tree is trained using a random selection of variables and a random subsample of the training dataset [35]. The RF methodology has been used successfully both to quantify the factors influencing various phenomena and to estimate spatial distributions, for example of homestays in Beijing [36]. The RF also has the advantage of being able to evaluate the relative importance of independent variables for the dependent variable [37].
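The ensemble idea behind the RF can be illustrated with a minimal sketch. It is not the implementation used in this study: to stay short and free of external libraries, it assumes one-split trees ("stumps"), each grown on a bootstrap sample with a single randomly chosen variable, whereas a full random forest grows deeper trees and samples candidate variables at every split. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean

def fit_stump(rows, targets, feature):
    """Fit a one-split regression tree on a single feature (illustrative only)."""
    values = sorted({row[feature] for row in rows})
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # The feature is constant in this sample: predict the overall mean.
        overall = mean(targets)
        return feature, float("inf"), overall, overall
    return (feature,) + best[1:]

def fit_forest(rows, targets, n_trees=200, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(n_features)            # random variable choice
        forest.append(fit_stump([rows[i] for i in sample],
                                [targets[i] for i in sample], feature))
    return forest

def predict(forest, row):
    """Regression output: the mean prediction of the constituent trees."""
    return mean(left if row[f] <= thr else right for f, thr, left, right in forest)

# Hypothetical toy data: the target depends on the first variable only.
rows = [(i / 20, ((i * 7) % 20) / 20) for i in range(20)]
targets = [10.0 if x0 > 0.5 else 0.0 for x0, _ in rows]
forest = fit_forest(rows, targets)

# Two query points that differ only in the first variable.
high = predict(forest, (0.9, 0.5))
low = predict(forest, (0.1, 0.5))
```

In this toy setting `high` exceeds `low`, because the two query points differ only in the variable that drives the target; trees grown on the uninformative variable contribute the same value to both predictions.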
4. Results
4.1. Spatial Distribution Characteristics
4.1.1. Spatial Clusters of Key Rural Tourism Villages
Cluster analysis is a data-driven approach for grouping data points into clusters such that the points within each cluster are more similar to each other than to points in other clusters [49]. This section investigates the overall distribution of key rural tourism villages in China and examines their spatial distribution pattern using density-based spatial clustering. The HDBSCAN algorithm identifies 26 distinct clusters, of which clusters 4, 2, and 1 are the largest, together accounting for a substantial proportion (32.78%) of all key villages (Figure 4). Cluster 4 is situated in the Yangtze Delta Plain, which has traditionally functioned as the epicenter of China’s economic and cultural activities. Cluster 2 is found in the middle and lower reaches of the Yellow River, an area well known for its historical and cultural significance. Cluster 1 is located around Beijing, the capital city, which serves as the country’s political, economic, and cultural center. In addition, many villages exhibit a dispersed layout: 396 villages are classified as noise, that is, they belong to no cluster.
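How a density-based method separates clusters from noise can be sketched as follows. The sketch uses plain DBSCAN rather than HDBSCAN: HDBSCAN builds a hierarchy over varying density levels, while DBSCAN works with one fixed neighbourhood radius, which keeps the example short. The coordinates, the radius `eps`, and the threshold `min_pts` are hypothetical, and distances are treated as planar for simplicity.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels

# Hypothetical coordinates: two compact groups and one isolated village.
villages = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11),
            (50, 50)]
labels = dbscan(villages, eps=1.5, min_pts=3)
```

With these settings the two compact groups receive the labels 0 and 1, and the isolated point is labelled as noise, which mirrors the distinction drawn above between clustered and dispersed villages.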
4.1.2. Spatial Distribution Density of Key Rural Tourism Villages
The spatial distribution density of key villages in China can be further delineated through kernel density analysis. The calculated density values were divided into five categories using the natural breaks method, and the kernel density distribution map of villages was then generated (Figure 5). The spatial distribution density varies significantly between regions. Areas with low village density are concentrated northwest of the Hu line, where the population is sparser. On the whole, high village density regions are concentrated around Beijing and Shanghai, with several scattered regions elsewhere. This is likely due to a number of factors, such as the availability of natural resources, the presence of historical and cultural sites, and the level of economic development.
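The principle of kernel density analysis can be sketched in a few lines. The text does not state which kernel or bandwidth was used, so the Gaussian kernel, the bandwidth, and the coordinates below are assumptions chosen for illustration only.

```python
from math import dist, exp, pi

def kernel_density(location, villages, bandwidth):
    """Two-dimensional Gaussian kernel density estimate at `location`."""
    total = 0.0
    for village in villages:
        u = dist(location, village) / bandwidth   # scaled distance
        total += exp(-0.5 * u * u) / (2 * pi)     # standard 2-D Gaussian kernel
    return total / (len(villages) * bandwidth ** 2)

# Hypothetical coordinates: four villages close together and one farther away.
villages = [(0, 0), (1, 0), (0, 1), (1, 1), (8, 8)]
dense = kernel_density((0.5, 0.5), villages, bandwidth=1.0)
sparse = kernel_density((8, 8), villages, bandwidth=1.0)
empty = kernel_density((20, 20), villages, bandwidth=1.0)
```

The estimate is highest in the middle of the compact group, lower at the isolated village, and close to zero far from any village. Evaluating the function on a regular grid would give the kind of density surface that is then classified and mapped.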
4.2. Spatial Interaction between Key Rural Tourism Villages and Influential Factors
In this study, village density is employed as a surrogate measure for evaluating the spatial distribution of rural villages. Multi-factor correlation analysis is used to identify the impact of various factors on this distribution across different dimensions and to examine the significance of these factors. For instance, if a positive correlation is observed between village density and natural resources, it suggests a higher likelihood of villages being situated in areas rich in natural resources.
A scatter plot matrix can effectively illustrate bivariate relationships for several variable pairings. The matrix shows positive and negative associations, as well as non-significant ones. The significance of the relationship between variables is indicated next to the linear fit slope above each scatter plot, with one asterisk (*) denoting p < 0.05 and two asterisks (**) denoting p < 0.01. The histograms on the diagonal of the matrix show the distribution and shape of each variable individually.
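The asterisk convention can be made concrete with a short sketch that computes a Pearson correlation and attaches the corresponding marker. The text does not specify the significance test, so a permutation test is assumed here in place of the usual t-test, which keeps the example within the standard library; the data and function names are hypothetical.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(sxx * syy)

def correlation_with_stars(x, y, n_permutations=2000, seed=0):
    """Return (r, p, stars), with p estimated by a two-sided permutation test."""
    r = pearson(x, y)
    rng = random.Random(seed)
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= abs(r):
            extreme += 1
    p = (extreme + 1) / (n_permutations + 1)
    stars = "**" if p < 0.01 else "*" if p < 0.05 else ""
    return r, p, stars

# Hypothetical data: one strongly related pair and one unrelated pair.
x = list(range(12))
strong = correlation_with_stars(x, [2 * v + 1 for v in x])
weak = correlation_with_stars(x, [1, -1] * 6)
```

The strongly related pair receives two asterisks, whereas the alternating series, whose correlation with `x` is close to zero, receives none.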
The relationship between the natural dimension and key village distribution is depicted in Figure 6. The results reveal a significant negative correlation between geographic conditions and village density (−0.294). Put simply, as geographic conditions become more challenging, village density tends to decrease. Interestingly, the findings indicate a positive association between village density and air quality (0.345), measured by PM2.5 levels. This implies that higher PM2.5 levels are accompanied by higher village density, which may reflect increased rural tourism demand in more polluted areas. Furthermore, a positive correlation is observed between the natural environment, represented by the presence of rivers, and village density (0.231). This suggests a tendency for villages to be situated close to river systems.
Figure 7 visually depicts the correlation between the social dimension and the spatial distribution of key villages. The investigation uncovered a relatively weaker correlation between village density and influential factors such as tourism development (0.249), economic development (0.233), economic vitality (0.196), and economic potential (0.214). In contrast, tourism potential (0.492) displayed a more prominent influence on village density compared with the other dimensions. The research outcomes suggest that tourism potential plays a pivotal role in determining the placement of villages. This observation can be attributed to the propensity of villages situated in areas characterized by significant tourism potential to attract heightened tourist activity, thus promoting economic development and growth.
Furthermore, Figure 8 presents an overview of the correlation between the rural dimension and the spatial distribution of key villages. Within this dimension, a relatively low correlation is observed between livelihood diversity (0.064) and village density. In contrast, rural accessibility (0.565), cultural resources (0.831), and tourism resources (0.748) exhibit a stronger influence on village density. This can be attributed to the fact that villages situated in regions with easy access, rich cultural heritage, and enticing tourism attractions are more likely to attract tourists, fostering tourism growth and development.
4.3. Relative Importance of Influential Factors
In this study, the random forest algorithm was employed to conduct regression analysis, aiming to assess the relative influence of various factors on village density. The outcomes of this analysis, as illustrated in Figure 9, indicate that cultural resources (1.67), tourism resources (0.9), tourism potential (0.47), and rural accessibility (0.34) emerge as the most significant determinants of village density, collectively accounting for 78% of the explanatory power. This finding indicates that key villages are situated in areas with abundant cultural heritage, appealing tourism destinations, and convenient access, as such areas are more likely to attract tourists and thus foster rural tourism development. Conversely, tourism development, economic vitality, livelihood diversity, geographic conditions, economic development, natural environment, air quality, and economic potential exhibit comparatively lower levels of explanatory power.
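One common way of obtaining such relative importance scores is permutation importance: a variable is shuffled, and the resulting increase in prediction error is recorded. The text does not state which importance measure was used, so the following is only a sketch of that general idea, with a hypothetical stand-in model and toy data rather than the fitted forest.

```python
import random

def mse(predict, rows, targets):
    """Mean squared error of `predict` over the given rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Average increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, targets)
    importances = []
    for feature in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[feature] for r in rows]
            rng.shuffle(column)
            permuted = [r[:feature] + (v,) + r[feature + 1:]
                        for r, v in zip(rows, column)]
            increases.append(mse(predict, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def hypothetical_model(row):
    # Stand-in for a fitted model: strong on feature 0, weak on 1, ignores 2.
    return 3.0 * row[0] + 0.5 * row[1]

rows = [(float(i), float((i * 3) % 10), float((i * 7) % 10)) for i in range(10)]
targets = [hypothetical_model(r) for r in rows]
importances = permutation_importance(hypothetical_model, rows, targets)
shares = [value / sum(importances) for value in importances]
```

Normalizing the scores, as in `shares`, gives each variable's portion of the total, which is one way a statement such as "collectively accounting for 78%" can be derived from raw importance values.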
4.4. Potential Area of Rural Tourism Development
The primary objective of this study is to identify suitable zones for the development of rural tourism through an analysis of the pivotal factors influencing village arrangement within tourist regions. To accomplish this, a random forest algorithm was employed to establish regression models that assess the suitability for rural tourism development in alternative areas. The predicted village density values were classified into five tiers using the natural breaks method: most suitable, more suitable, moderately suitable, less suitable, and least suitable areas (Figure 10).
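The natural breaks classification can be illustrated with a small sketch. It searches exhaustively for the class boundaries that minimize the within-class sum of squared deviations, which is the criterion behind the method; the exhaustive search is only practical for small inputs, and the scores below are hypothetical.

```python
from itertools import combinations

def natural_breaks(values, n_classes):
    """Upper bound of each class under the minimum within-class variance split."""
    data = sorted(values)

    def sse(chunk):
        centre = sum(chunk) / len(chunk)
        return sum((v - centre) ** 2 for v in chunk)

    best_cost, best_bounds = None, None
    # Try every way of cutting the sorted data into n_classes contiguous runs.
    for cuts in combinations(range(1, len(data)), n_classes - 1):
        bounds = (0, *cuts, len(data))
        cost = sum(sse(data[a:b]) for a, b in zip(bounds, bounds[1:]))
        if best_cost is None or cost < best_cost:
            best_cost, best_bounds = cost, bounds
    return [data[b - 1] for b in best_bounds[1:]]

def tier(value, breaks):
    """Index of the first class whose upper bound is at least `value`."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1

# Hypothetical suitability scores with five visibly separate groups.
scores = [1, 2, 3, 10, 11, 12, 20, 21, 22, 40, 41, 80, 81]
breaks = natural_breaks(scores, 5)
```

For these scores the class upper bounds fall at 3, 12, 22, 41, and 81, so that, for example, `tier(11, breaks)` places a score of 11 in the second of the five tiers.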
In general, the areas deemed suitable for the development of rural tourism are predominantly found south of the Hu Line, where there is a high population density. The most optimal locations are situated within the urban circles of Beijing and Shanghai. Notably, Shanghai exhibits a wider range of influence, encompassing numerous comparatively suitable areas in its vicinity. Conversely, in the northern region, substantial appropriate areas are primarily distributed across Shandong, Henan, Shanxi, and Hebei provinces, owing to their dense populations and abundant cultural resources, rendering them suitable for rural tourism. Although the southern region boasts economic prosperity, this area lacks a concentrated cluster of suitable areas, resulting in their dispersal among various cities across the country. The eastern section of Hainan Island is an example of a region well suited to rural tourism. On the other hand, advancements in rural tourism are being witnessed in regions such as Sichuan, Guizhou, Hubei, Fujian, and Guangdong. Comparatively unsuitable areas tend to be dispersed along provincial boundaries and inter-provincial junctions, such as the intersection of Shaanxi, Sichuan, and Chongqing. Moreover, extensive areas in the northwest, namely Qinghai, ** village distribution. These findings underscore the critical role played by inherent cultural and tourism assets within rural areas, as well as their advantageous proximity to major urban markets, in facilitating rural tourism development. Additionally, the study establishes a positive correlation between enhanced rural accessibility and the advancement of rural tourism.
The assessment of rural tourism development potential emphasizes the Beijing urban circle and the Shanghai urban circle as highly favorable regions for such endeavors. Notably, the Shanghai urban circle exhibits a broader sphere of influence, extending its impact over a wider geographic area. In the northern region, contiguous provinces with dense populations present extensive tracts suitable for rural tourism development. In contrast, the southern region features areas of moderate suitability interspersed with scattered pockets of more favorable locations. The western region, particularly the Hexi Corridor, concentrates the majority of suitable areas, whereas other regions exhibit relatively fewer opportunities. Overall, these distribution patterns of potential areas underscore the paramount importance of well-established tourism markets and abundant cultural and tourism resources in propelling rural tourism development.
6. Conclusions
The scientific arrangement of rural tourism is regarded as a fundamental element in achieving its sustainable development. Key villages, emblematic of China’s policies aimed at revitalizing rural areas, offer insight into the spatial patterns and mechanisms that underpin the growth of rural tourism in the country. This understanding, in turn, can inform decision-making aimed at fostering the long-term sustainability of rural tourism.
By pinpointing suitable areas and elucidating the determinants that shape the spatial distribution of rural tourism villages, this research offers insights for regional planners, policymakers, and tourism developers, facilitating the promotion and advancement of rural tourism in China. In comparison to previous studies conducted at the provincial level, this research extends the investigation to the national scale. The findings on the primary determinants of rural tourism development may also inform rural tourism planning around the world. Nonetheless, this study is subject to several limitations. First, data are missing for certain regions, and the handling of these gaps may affect the research outcomes. In addition, the selection of influencing factors is biased towards supply-side factors, with limited consideration given to demand-side factors. Finally, future research should consider machine learning algorithms other than RF.