Nongye tushu qingbao xuebao (Apr 2024)
Methodology for Assessing the Influence of Technical Topics Based on PhraseLDA-SNA and Machine Learning
Abstract
[Purpose/Significance] Accurately measuring the influence of technical topics is crucial for decision-makers seeking to understand development trends in the technology sector, and it is an important step in identifying emerging, cutting-edge, and disruptive technical topics. Traditional methods of measuring technical topic influence are strongly affected by delays in patent approval and citation, lack a forward-looking view of a topic's potential influence, and extract technical topics with insufficient semantic richness. This paper presents a method for measuring technical topic influence based on PhraseLDA-SNA and machine learning. It aims to mitigate the impact of patent approval and citation delays while improving the interpretability and accuracy of the resulting influence assessments. [Method/Process] First, the explicit and implicit determinants of technical topic influence were analyzed, and an index system for measuring technical topic influence was constructed on that basis. Then, the PhraseLDA model was used to extract semantically rich technical topics from a large corpus of pre-processed patent texts and to compute the topic-patent association probabilities. PhraseLDA-SNA enhances the semantic richness of the extracted topics and deepens the analysis of topic content, while machine learning methods are used to predict the high-citation potential of the patents associated with each topic. By integrating PhraseLDA-SNA with machine learning, the method measures the importance and leading nature of technical topics in advancing a field, and from these it derives the influence of each topic.
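The abstract does not state how the topic-patent association probabilities and the predicted citation potential are combined. As a minimal sketch of one plausible aggregation, the block below scores each topic by the association-weighted average of the predicted high-citation probabilities of its patents. The function name `topic_forward_score`, the weighting scheme, and the toy numbers are assumptions made for illustration and are not taken from the paper.

```python
from typing import List


def topic_forward_score(theta: List[List[float]], p_high: List[float]) -> List[float]:
    """Association-weighted mean of predicted high-citation probability per topic.

    theta[d][k]: association probability between patent d and topic k
                 (for example, a row of a topic model's document-topic matrix).
    p_high[d]:   a classifier's predicted probability that patent d becomes
                 highly cited (hypothetical model output).
    """
    if len(theta) != len(p_high):
        raise ValueError("theta and p_high must describe the same patents")
    n_topics = len(theta[0]) if theta else 0
    weighted = [0.0] * n_topics  # sum over patents of theta[d][k] * p_high[d]
    mass = [0.0] * n_topics      # sum over patents of theta[d][k]
    for row, p in zip(theta, p_high):
        for k, w in enumerate(row):
            weighted[k] += w * p
            mass[k] += w
    # Topics with no associated probability mass get a score of 0.0.
    return [s / m if m > 0 else 0.0 for s, m in zip(weighted, mass)]


# Toy data (illustrative only): three patents, two topics.
theta = [[0.8, 0.2], [0.5, 0.5], [0.1, 0.9]]
p_high = [0.9, 0.5, 0.1]
scores = topic_forward_score(theta, p_high)
```

Because the score depends on predicted rather than observed citations, a topic dominated by recent, not yet cited patents can still rank highly, which is the forward-looking property the method aims for.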
Finally, an empirical study was conducted in the field of cellulose biodegradation to compare the high-impact technical topics identified by the proposed method with those identified by the traditional method. Several experts with high academic influence and extensive research experience in cellulose biodegradation were invited to evaluate the high-impact technical topics identified in this study, thereby validating the effectiveness of the proposed method. [Results/Conclusions] Compared with the traditional method, the technical topic influence measurement approach based on PhraseLDA-SNA and machine learning reveals topic content in greater depth. It also analyzes the importance and leading nature of technical topics, which gives it an advantage in quantitative analysis. When the distribution across years of the patents related to the high-impact technical topics identified by the two methods is compared, the topics identified by the proposed method have a higher association ratio in the most recent data, indicating that the impact of patent approval and citation delays is significantly reduced.
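The year-distribution comparison can be illustrated with a small sketch. It assumes that the "association ratio in the most recent data" can be approximated as the share of a method's topic-related patents filed in or after a cutoff year; the paper may define the ratio differently. The helper `recent_share`, the patent identifiers, the years, and the cutoff are all hypothetical.

```python
from typing import Dict, Iterable


def recent_share(patent_years: Dict[str, int],
                 topic_patents: Iterable[str],
                 since_year: int) -> float:
    """Share of the given patents whose year is >= since_year."""
    years = [patent_years[p] for p in topic_patents]
    if not years:
        return 0.0
    return sum(1 for y in years if y >= since_year) / len(years)


# Hypothetical filing years and hypothetical patent sets linked to the
# high-impact topics selected by each method.
patent_years = {"P1": 2015, "P2": 2019, "P3": 2022, "P4": 2023}
proposed_patents = ["P2", "P3", "P4"]
traditional_patents = ["P1", "P2", "P3"]

proposed_ratio = recent_share(patent_years, proposed_patents, since_year=2022)
traditional_ratio = recent_share(patent_years, traditional_patents, since_year=2022)
```

A higher value for the proposed method than for the traditional one would correspond to the pattern reported above, in which the selected topics are less tied to older patents that have already accumulated citations.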
Keywords