1. Introduction
With the rapid development of China’s economy and continuous urbanization, multiple elements are gathering and diffusing in space, forming functional zoning at different regional scales [1, 2]. The urban spatial structure is the projection of urban functional areas and their connections on a regional scale [3]. The spatial distribution of urban functions is an important factor in measuring urban development, but the evolution of regional functions may be inconsistent with urban planning and development. Therefore, the study of urban functional zones is of great significance for the effective functioning [4], rational allocation of spatial elements [5], economic development, and livability of a city. Information about urban functional zones is important to the departments that make decisions on urban planning, management, and future development [6]. In the past, traditional manual surveys, remote sensing images, and other data were important sources for the identification and division of urban functional areas [7, 8]. These methods can accurately divide urban functional areas and contribute to land use management. However, urban planning requires the analysis of large amounts of data, and these traditional methods are subjective and require a lot of manpower, material resources, and financial expense, meaning they are not suitable for routine monitoring or longitudinal comparisons [9]. With the integration of urban regional planning and management, there is an urgent need for timely information describing urban development. The rapid and accurate identification of urban functional areas is a necessary condition for improving urban planning and management [10].
The division of functional areas is one of the basic steps of urban planning and management [5, 11]. The advent of the big-data era has led to the development of new methods for the study of urban spatial patterns. Novel geographical datasets represented by mobile phone signaling, point-of-interest (POI) data, and floating vehicle tracking have the advantages of large sample sizes, fast processing speeds, simple and convenient acquisition, and a diverse range of sources [12], providing new information for quantifying urban functional areas and making more refined research possible. Jiao et al. visualized mixed functional areas by using the RGB color addition method [13] and were the first scholars in China to study urban functional zoning using POI data. Miao et al. analyzed urban spatial structures by using microblog check-in data, OpenStreetMap (OSM) road network data, and a spatial clustering method, and found that a street view segmentation method described urban heterogeneity more effectively than a uniform grid method [14]. Long et al. found that using online data, such as OSM and POI, could generate similar results to traditional manual methods and concluded that the integrity of OSM road network data in big cities is far better than that in small cities [9]. Yu took the traffic community as the minimum research unit to verify the effectiveness of combining floating car tracking data and POI data to identify urban functional areas [15]. Yang et al. used mobile phone signaling data as a supplement to the day–night intensity of urban population heat to better quantify the space–time characteristics of urban functional areas [16].
The majority of these previous studies used POI data and other large geographical datasets [17, 18] and showed that it is reasonable to use the OSM road network to divide research units. Mobile phone signaling data and floating car track data are highly confidential and costly to obtain. In contrast, POI data are simple to acquire and convenient to process, and are therefore widely used in geographical research. At present, applied research on POI mainly focuses on recommendation models [19, 20, 21] and urban functional zoning [22], while research on POI data themselves mainly focuses on automatic classification [23] and on classification accuracy using machine learning. Nak introduced POI into eye-tracking data analysis [24], broadening the research scope of POI.
POI data mainly describe the spatial location and attribute information of geographical entities [25] and exclude information about the scale of land use. A function is the social attribute undertaken by geographical entities [26]. An urban functional area is composed of a series of geographical entities in a region; such areas are the result of the long-term development of the city [27, 28] and have always been named according to their dominant function. Many previous studies of urban functional areas have assigned weights to POI data, taking public awareness and surface area as influencing factors, to improve accuracy [29]. These weights are mainly determined by questionnaires and expert evaluation, which can be inefficient or produce biased results [30].
The random forest model is one of the most suitable machine learning methods [31] for describing the importance of features that cannot be explained using manual data analysis [32]. Zhao et al. identified the production–living–ecological spatial pattern of Zhengzhou [33] and introduced the random forest model to assign weight values to POI data. These new methods and large geographic datasets promote research to better characterize urban functional areas. Most scholars have reduced sample bias by coupling multi-source data, but there is a lack of in-depth research on the data themselves [16]. The classification results of different data combinations can differ, and there is no method to test the accuracy of the results. The development of machine learning has drawn more scholarly attention to the accuracy of POI data. Word2vec and Block2vec have been used to study the relationship between POI categories [10, 34], improve the classification accuracy of POI, and examine the applicability of the data [35], but there is currently a lack of research on POI weighting.
To address the subjectivity, complexity, and lack of accuracy of weight assignments in previous POI studies, and considering that it is unreasonable to apply data from plain cities directly to mountainous cities, this paper adopts the random forest method to assign POI weights. We then use an urban functional area calculation model, combined with POI and OSM road network data from the central urban area of Chongqing in 2020, to identify its functional areas and quantify the mixing degree of their internal functions [1]. Google satellite maps and Baidu Street View maps are used to verify the recognition results. In addition, the sensitivity of the model is tested by data thinning to make the research results more practical for urban planning. The findings of this study allow for a better characterization of regional development, provide a reference for adjusting the regional functional structure [36], and help strengthen the role of the Chongqing city center in the Chengdu–Chongqing economic circle.
3. Results
3.1. POI Weight Determination Based on the Random Forest Model
In this paper, the functional zoning map of Chongqing developed by Professor Gong Peng’s team at Tsinghua University [40] was geographically calibrated against the POI data to obtain the weighted proportion of each POI. After several attempts, 60% of the data were taken as the training set, with mtry = 10 and ntree = 800, which gave the lowest out-of-bag estimation error (29.16%). We then used the mean decrease in accuracy (MDA) in the Importance function to calculate the relative importance of the different POI categories (Figure 3).
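The permutation idea behind MDA can be illustrated with a minimal sketch. It is not the implementation used in this paper: a random forest package computes MDA on the out-of-bag samples of each tree, whereas the sketch below simply shuffles one feature column at a time on a fixed sample and records the average drop in accuracy. The function names, the toy classifier, and the toy data are all hypothetical.

```python
import random
from typing import Callable, List, Sequence

Row = List[float]


def accuracy(predict: Callable[[Row], int], X: Sequence[Row], y: Sequence[int]) -> float:
    """Share of rows whose predicted label equals the true label."""
    hits = sum(1 for row, label in zip(X, y) if predict(row) == label)
    return hits / len(y)


def mean_decrease_accuracy(predict, X, y, n_repeats=20, seed=0):
    """Average accuracy drop per feature when that feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    drops = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += baseline - accuracy(predict, permuted, y)
        drops.append(total / n_repeats)
    return drops


# Hypothetical toy model: the label depends only on feature 0.
def toy_predict(row: Row) -> int:
    return 1 if row[0] > 0.5 else 0


demo_rng = random.Random(42)
X_demo = [[demo_rng.random(), demo_rng.random()] for _ in range(200)]
y_demo = [toy_predict(row) for row in X_demo]
mda_demo = mean_decrease_accuracy(toy_predict, X_demo, y_demo)
```

In this toy case, shuffling feature 0 destroys most of the predictive accuracy, while shuffling feature 1 changes nothing, so only feature 0 receives a positive importance. In the paper's setting the features would correspond to POI categories.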
The MDA values were standardized to a total score of 100 points and combined with the POI types to obtain the POI weight index classification table (Table 1).
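As a rough illustration of the standardization step, the sketch below rescales a set of importance scores so that they sum to 100. The category names and MDA values are invented for the example and are not taken from Table 1; the sketch also assumes that all scores are non-negative, which the text does not state.

```python
def standardize_to_100(mda_scores):
    """Rescale importance scores so that the weights sum to 100 points."""
    if any(value < 0 for value in mda_scores.values()):
        # Assumption: negative importances are not expected in this sketch.
        raise ValueError("negative importance scores are not handled here")
    total = sum(mda_scores.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {name: 100.0 * value / total for name, value in mda_scores.items()}


# Hypothetical MDA scores for three POI categories.
demo_scores = {"catering": 10.0, "residential": 20.0, "education": 10.0}
demo_weights = standardize_to_100(demo_scores)
```

With these invented scores, the residential category receives half of the 100 points and the other two categories a quarter each.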
3.2. Identification of Single Dominant Functional Areas
GIS spatial technology was used to identify and quantify the urban functional areas of the central urban area of Chongqing. Different types of POI data were spatially joined with the plot data, and plots in which a single function accounted for a fraction of more than 50% were identified as single dominant functional areas and visualized (Figure 4). At the street scale, this paper identified 1636 traffic service functional areas, 3968 industrial functional areas, 233 medical care functional areas, 777 residential service functional areas, 240 administration and public services functional areas, 7915 life service functional areas, 1010 education functional areas, 32 financial and insurance functional areas, 129 green spaces and squares functional areas, and 585 catering service functional areas. The top three categories were life services, industrial services, and transportation services. There were many non-functional areas, which were mainly distributed in the Gele and Zhongliang mountainous areas, due to the small-scale division of the study area.
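The 50% rule can be sketched as follows. This is a simplified illustration under stated assumptions rather than the calculation model used in the paper: each POI in a plot contributes the weight of its category, the share of each category is its summed weight divided by the plot total, and a plot with no weighted POIs is treated as non-functional. The function name, category names, and weights are hypothetical.

```python
from collections import Counter


def classify_plot(poi_categories, weights, threshold=0.5):
    """Label a plot as single-dominant, mixed, or non-functional."""
    scores = Counter()
    for category in poi_categories:
        scores[category] += weights.get(category, 0.0)
    total = sum(scores.values())
    if total == 0:
        return ("non-functional", None)
    top_category, top_score = max(scores.items(), key=lambda item: item[1])
    if top_score / total > threshold:  # strictly more than 50%
        return ("single", top_category)
    return ("mixed", None)


# Hypothetical category weights (not the values in Table 1).
demo_weights = {"life": 10.0, "catering": 5.0, "education": 8.0}

single_demo = classify_plot(["life", "life", "catering"], demo_weights)
mixed_demo = classify_plot(["life", "catering", "catering"], demo_weights)
empty_demo = classify_plot([], demo_weights)
```

In the first example the life service share is 20 / 25 = 80%, so the plot is single-dominant; in the second the two categories each hold exactly 50%, so neither exceeds the threshold and the plot counts as mixed.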
Life service accounted for the largest area and had a wide distribution, mainly in the Guanyinqiao group in the southwest of Jiangbei District, the Shapingba group in the southeast of Shapingba District and the University City area, the Dayangshi group in the southeast of Jiulongpo District, and the Nanping group in the west of Nan’an District. Life service covered the majority of Guanyinqiao, Jiefangbei, Shapingba, and other major business districts in the central urban area.
The distribution of industrial functional zones was uniform, including in the Airport group in Yubei District, the Yuzui group in Jiangbei District, and the Shapingba District due to Chongqing West Railway Station, Shapingba Station, and West New Town Passenger Station. In other central urban areas, the transportation function is dispersed to serve daily commuters.
Catering services are needed in people’s daily lives, and their distribution is consistent with the location of companies and education facilities. Residential service functional areas are distributed evenly throughout the central city and are concentrated on the periphery of the business circle. Accommodation services that are close to the business circle mainly serve tourists. The number of green spaces and squares is much lower compared with other functional areas, and scenic spots such as Lijia Wisdom Park and Zhaomu Mountain Park can be clearly identified. Education functional areas are concentrated in the Sha**ba District. There are some clusters of education functional areas in the vicinity of Southwest University and Chongqing University of Technology. There is a lower distribution of education areas near the business zone of each district, which is consistent with the distribution of residential functional areas.
As a special function, the administration and public service functional area occupies a relatively small amount of land and is scattered across all municipal districts. Checking the results against the government address of each district showed that the recognition was good; however, because the plot division is very fine, these areas are not obvious in the figure. The number of financial and insurance functional areas is the lowest of all categories and is small everywhere except Yuzhong District, although every district contains some. The number of medical care functional areas is also small, and these areas are generally far from the center of each city district. This result is consistent with the large area and quiet environment required by medical and health care facilities.
In conclusion, the distribution of single dominant functional areas (Figure 5) in the central urban area of Chongqing conforms to the location conditions required by the construction of various functions. This finding shows development characteristics that differ from those of plain areas, and the distribution is consistent with the spatial structure of “one main area, six sub-areas, multi-centers and groups” in Chongqing planning. A previous study found that a 200 × 200 scale is more suitable for the study of single functional areas than a 500 × 500 scale [53]. The research scale of this paper was close to 200 × 200, but the single functional areas obtained still covered less than two-thirds of the total area, indicating that the overall mixing degree of functional areas in central urban Chongqing is high.
3.3. Evaluation of the Mixing Degree of Functional Areas
Chongqing, as a typical mountain city, has attracted a lot of recent attention as the “8D Magic city”. Three-dimensional transportation is the embodiment of comprehensive land use, and mountains and waters are important factors causing the polycentric pattern of Chongqing [54]. A total of 4510 mixed functional areas were found. This paper used information entropy to quantify the mixing degree of the mixed functional areas (Figure 6). The maximum value of the functional mixing degree was 2.02, the standard deviation was 0.22, and the average value was 1.38. In general, the functional mixing degree is affected by the mountainous terrain and has a polycentric distribution. Medium and high mixing degree areas are mainly distributed in the center of each administrative region. Yuzhong District has the highest overall mixing degree, with the maximum street mixing degree reaching 2.02 in Lianglukou street. Caiyuanba Station and Chongqing North Railway Station have low values, and the surrounding buildings have more room for improvement. Shapingba, Jiulongpo, Nan’an, and Banan are characterized by a low overall mixing degree, inadequate comprehensive urban development, and inefficient land use. The mixing degree of the inner ring region is generally high, and the aggregation characteristics of the outer ring region are weaker.
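For readers unfamiliar with the entropy-based measure, a minimal sketch is given here. It assumes the common Shannon form H = -Σ p_i ln p_i with the natural logarithm, where p_i is the share of function i in a plot; the text above does not state the logarithm base, so this is an assumption. Under it, the upper bound for ten functional categories would be ln 10 ≈ 2.30, which the reported maximum of 2.02 does not exceed.

```python
import math


def mixing_degree(shares):
    """Shannon entropy (natural log) of the function shares within one plot."""
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must have a positive sum")
    proportions = [s / total for s in shares if s > 0]
    return sum(-p * math.log(p) for p in proportions)


# Hypothetical plots: one function only, two equal functions, ten equal functions.
single_function = mixing_degree([1.0])
two_functions = mixing_degree([0.5, 0.5])
ten_functions = mixing_degree([1.0] * 10)
```

A plot with a single function has a mixing degree of zero, and the value grows as more functions share the plot more evenly.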
To study the 4510 plots with mixed functions, GIS spatial technology was used to identify mixed-function zones dominated by two functions and comprehensive functional zones with three or more functions (Figure 7). There are 3068 mixed plots with both residential and catering services, which are mainly distributed in the central business circle of each administrative region. As shown in Figure 7, the distribution of single-function areas in these regions is also concentrated, reflecting the intensification of land use in Chongqing. The mixed-function areas dominated by other functions mainly contain industrial, transportation, and life services. A total of 117 plots are comprehensive functional areas. By checking their properties, we found that they contain a certain number of POIs of various types, mainly for residential and transportation services.
Through further study of mixed-function areas, it can be seen that POIs with small volumes and large spatial extents have the characteristics of comprehensive utilization. The mixed-function areas are still distributed as a group, and the multi-center features of the mixing degree are more significant than for single-function areas.
3.4. Verifying the Accuracy of the Results
Some previous studies have examined urban functional zoning without verifying the accuracy of their results [4, 7, 16, 43, 51, 52, 55]. In addition, some scholars have verified accuracy by comparing their findings with planning maps [13, 14], which deviates from this paper’s purpose of studying the functions arising from urban development. Therefore, Google satellite and Baidu Street View maps were selected in this paper to judge the identification accuracy of single functional areas. ArcGIS was used to generate random points, select test samples, and overlay shapefiles on the Google satellite and Baidu Street View maps to test the recognition of single functional areas. A total of 682 sample plots were randomly selected as detection objects. The POI partitioning results were compared with the Google satellite and Baidu Street View maps to judge their correctness. Sampling accuracy was used to estimate the recognition effect, and a confusion matrix was established to evaluate the accuracy of the results (Table 2).
Comparing the functional areas with the Google satellite and Baidu Street View maps (Figure 8), the identification of residential services and of green spaces and squares was the best, with an accuracy of 94%. The recognition of catering services was less accurate: of the 50 samples, 36 (72%) were correct. Most of the sample plots that failed to meet the standard were open residential areas, with catering downstairs and living space upstairs. A residential building is usually represented by one point, while the catering outlets downstairs can be represented by multiple points [17], so the large number of catering POIs causes such plots to be assigned to the catering area. Compared with Google satellite and Baidu Street View maps, the POI identification results using single-source data had an overall accuracy of 82% and a Kappa coefficient of 0.80, indicating high consistency. The identification accuracy was slightly better than that of Ding Yanwen et al. (81.3%), who verified the accuracy of both single and mixed functional areas using planning maps [56].
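The overall accuracy and Kappa coefficient reported above are both derived from a confusion matrix. The sketch below shows the standard calculation on a small invented two-class matrix; the numbers are hypothetical and are not those of Table 2. Rows are taken to be the reference classes and columns the identified classes, which is an assumption about the table layout.

```python
def overall_accuracy(matrix):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's kappa: agreement beyond what is expected by chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_sums = [sum(matrix[i]) for i in range(n)]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class confusion matrix (100 samples).
demo_matrix = [
    [40, 10],
    [5, 45],
]
demo_accuracy = overall_accuracy(demo_matrix)
demo_kappa = cohen_kappa(demo_matrix)
```

For this invented matrix the observed agreement is 0.85 and the chance agreement is 0.50, giving a kappa of 0.70. Kappa is lower than the raw accuracy because it discounts the agreement that random assignment would already produce.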
4. Discussion
4.1. Analysis of Model Sensitivity
This paper used GIS to thin the various POI data, randomly retaining 90%, 70%, 50%, and 30% of the original data, to study the sensitivity of the model. After using the thinned data to re-identify the functional areas, 614, 573, 525, and 524 sample plots, respectively, were randomly selected for accuracy verification. It can be seen from Figure 9 that the overall accuracy of the model was correlated with the amount of data. A reduction of less than 50% in the amount of data had little influence on the accuracy, and the overall stability of the model was relatively high. For the individual single-function areas, the accuracy of green spaces and squares and of residential service areas was stable and showed an overall upward trend. The accuracy of the catering service, administration and public service, and financial insurance functional areas fluctuated significantly, and these categories were the decisive factors affecting the accuracy of the model.
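The thinning step can be sketched as simple random sampling without replacement. The paper performed this step in GIS; the code below is only an illustrative equivalent, with a hypothetical function name and placeholder point identifiers. To thin each POI category separately, the same function could be applied to each category's list in turn.

```python
import random


def thin_points(points, keep_fraction, seed=0):
    """Randomly retain a given fraction of the points, without replacement."""
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the thinning is repeatable
    keep_count = round(len(points) * keep_fraction)
    return rng.sample(list(points), keep_count)


# Hypothetical POI identifiers standing in for real point records.
demo_points = list(range(1000))
demo_levels = {
    fraction: len(thin_points(demo_points, fraction))
    for fraction in (0.9, 0.7, 0.5, 0.3)
}
```

Each thinned set would then be passed through the same functional area identification and accuracy check as the full dataset, so that accuracy can be compared across the retention levels.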
4.2. Analysis of Recognition Results
Among the single dominant functional areas, the distribution of the company function in Banan was sparser. This result is likely because these areas were the location of the first development in Chongqing, and the infrastructure is relatively old, making it more difficult to meet the geographical requirements of new companies. In addition, because there is abundant natural vegetation in the south of Yudong (due to ecological protection laws), the development plan proposes that large-scale development and construction will not be carried out there. Company POIs are dense in the Yubei Airport Group, which mainly benefits from the airport industrial park, which was put into operation in 2011 and has experienced stable development.
The Yuzui Group in Jiangbei District is positioned in the eastern industrial new town of Chongqing and is an important part of the eastern new town development in the central city. In this group, companies and industrial parks are widely distributed, and the POI visualization is obvious. Jiulongpo District has a solid industrial foundation, and the industrial park of Shapingba District is positioned as the municipal science, education, and culture center, and so contains the most concentrated distribution of science and educational services. The nine districts of the main city all have obvious education functional areas. The area of educational services in Yubei district is the largest, and the distribution of education functions is the most extensive, which reflects the uniform distribution of educational resources in Chongqing.
Among the mixed functional areas dominated by the two functions, areas with both residential and catering services account for the largest proportion. This result is because residential and catering services both show spatially discrete distributions, lower mutual exclusivity with other functions, and can have a vertical spatial distribution. The combination of these two kinds of POI is not only related to the daily life of residents but also reflects the development of tourism in Chongqing. The mixing degree of Banan District and Jiulongpo District is relatively low, and these districts still have a lot of space available.
4.3. Innovations and Shortcomings
In this paper, the random forest algorithm was used to calculate the POI weights, which avoids the subjectivity of social recognition and has the advantage of efficient quantification. The form of OSM road networks and LSMS were combined to divide the minimum research unit, which is more consistent with the actual situation of the block. The results of urban function identification were accurate, which supports the use of the random forest algorithm to calculate POI weights and of small-scale research units to identify urban functional areas. Based on the identification of single functional areas and mixed functional areas, this paper identified and discussed mixed functional areas dominated by two functions and comprehensive functional areas dominated by three or more functions. Finally, the recognition results were verified by visual interpretation, which is significant relative to existing research [51, 52, 53, 54].
Due to the unique topography and related policies, the group development mode of Chongqing’s downtown area is relatively mature and is consistent with the development plan. As part of the Chengdu–Chongqing economic circle, the territory development plan of Chongqing proposes to build a “multi-center, multi-level and multi-node” network city, which means that the urban development model of central Chongqing will transition from a group pattern to a networked city. How to undertake this transformation smoothly and steadily poses new challenges for the internal spatial structure planning of the central urban area. The identification of urban functional areas helps to further understand the development of central Chongqing. In addition, the study of the mixing degree of urban functions can help reveal potential problems with urban planning and development and provide a valuable reference for future planning strategies.
This study provides a simple and reproducible way to assign POI weights. However, verification relied mainly on visual interpretation, which took a lot of time. If the third national land resource survey data can be obtained, deep learning could be used for verification, which would make verification more robust and convenient. The latest round of that survey was completed on 26 August 2021; because it is so recent, the data are still difficult to acquire, but they can be used in future studies.
In addition, the paper mainly considered objectivity when assigning weights to POI. Expert evaluations, questionnaire surveys, and other information could be added in future research to combine subjective and objective methods to quantify POI weights and improve POI accuracy. In the present study, POI data describe spatial entity information and are classified using the Standard of Urban Land Classification and Planning Construction Land, which lacks detailed and accurate standards. With the development of information technology such as machine learning, some scholars have studied the semantic relationship between POIs to obtain a more objective classification system without manual annotation. Interdisciplinary subjects provide opportunities for deepening the research content [50, 57]. How to further explore the internal information of POI data remains a key difficulty for future research.
5. Conclusions
In this paper, POI data were analyzed, the random forest algorithm was used to assign weights to POIs, and minimum research units were delineated to identify green spaces and squares, industrial, catering service, life service, residential service, transportation, administration and public service, education, medical care, and financial insurance areas in central Chongqing. A total of 10 kinds of functional areas were identified, and their distribution characteristics were investigated. The functional mixing degree of the central urban area was studied, and the following conclusions were drawn:
(1) The spatial distribution of mountains and water in Chongqing makes the polycentric development obvious. This pattern of development mainly radiates outward from the business circle of each administrative region, and the land use is intensive. Excluding non-functional areas, single functional areas account for 60% of all functional areas and are mainly residential service areas. Mixed-function areas account for about 40% of the total area and are mainly composed of residential and catering services;
(2) The functional mixing degree is relatively high in the inner ring and relatively low in the outer ring. The maximum value is located on Lianglukou street. Within plots, the functions of residential, catering, and industrial services are highly dependent on each other and have little mutual exclusivity with other functions. However, administration and public services, green spaces and squares, and other functions have strong exclusivity, resulting in many single functional areas of these categories in downtown Chongqing;
(3) The identification of residential service areas and of green spaces and squares was the best, with a sampling accuracy of 94%. The identification of catering services was somewhat worse. The overall identification accuracy was 82%, with a Kappa coefficient of 0.80. Overall, these results show that the model has good consistency and high stability.