Systems (Aug 2024)
A Patent Mining Approach to Accurately Identifying Innovative Industrial Clusters Based on the Multivariate DBSCAN Algorithm
Abstract
Innovative Industrial Clusters (IIC), characterized by geographical aggregation and technological collaboration among technology enterprises and institutions, serve as pivotal drivers of regional economic competitiveness and technological advancement. Prior research on cluster identification, which is crucial for IIC analysis, has predominantly emphasized geographical dimensions while overlooking technological proximity. To address this limitation, this study introduces a comprehensive framework incorporating multiple indices and methods for accurately identifying IIC using patent data. To unearth latent technological insights within patent documents, Latent Dirichlet Allocation (LDA) is employed to generate topics from a collection of terms. Using the applicants’ names and addresses recorded in patents, a map system Application Programming Interface (API) is used to extract geographic locations. Subsequently, a Multivariate Density-Based Spatial Clustering of Applications with Noise (MDBSCAN) algorithm, which accounts for both technological and spatial distances, is deployed to delineate IIC. Moreover, a bipartite network model based on the geographic information collected from the patents is constructed to analyze the geographical distribution of technologies and the development mode of IIC. The use of the model and methodologies is demonstrated through a case study on China's flexible electronics industry (FEI). The findings reveal that the clusters identified via this novel approach are significantly correlated with both technological innovation and geographical factors. Moreover, in the case analysis, the MDBSCAN algorithm outperforms the other algorithms considered in terms of computational precision and efficiency.
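To make the idea of clustering on both technological and spatial distances concrete, the following is a minimal sketch and not the MDBSCAN implementation used in this study. It assumes that two applicants are neighbours only when their great-circle distance is at most `eps_km` and the cosine distance between their topic vectors (for example, LDA topic proportions) is at most `eps_tech`. The dual-threshold rule, the choice of cosine distance, and all function and parameter names are illustrative assumptions.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def cosine_distance(p, q):
    """1 - cosine similarity between two non-zero topic vectors."""
    dot = sum(x * y for x, y in zip(p, q))
    norm = math.sqrt(sum(x * x for x in p)) * math.sqrt(sum(y * y for y in q))
    return 1.0 - dot / norm


def mdbscan_sketch(coords, topics, eps_km, eps_tech, min_pts):
    """DBSCAN with an assumed dual-threshold neighbourhood (spatial AND technological).

    Returns one label per point: 0, 1, ... for clusters and -1 for noise.
    """
    n = len(coords)

    def neighbors(i):
        # A point counts as its own neighbour, as in standard DBSCAN.
        return [j for j in range(n)
                if haversine_km(coords[i], coords[j]) <= eps_km
                and cosine_distance(topics[i], topics[j]) <= eps_tech]

    labels = [None] * n  # None = unvisited, -1 = noise
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:  # j is itself a core point, so keep expanding
                queue.extend(reach)
    return labels
```

Under this assumed rule, applicants that are co-located but work on unrelated topics, or that share topics but sit in distant cities, are not merged into one cluster. A weighted combination of the two distances would be an alternative design; the abstract does not specify which formulation the study adopts.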
Keywords