Journal of Big Data (May 2023)
Characterizing patent big data upon IPC: a survey of triadic patent families and PCT applications
Abstract
Research objective: Triadic patent (TP) families and Patent Cooperation Treaty (PCT) applications are both widely used as datasets for measuring innovation capability or R&D internationalization, but the degree of concordance between them is unclear. This study addresses that question.

Methods: We collect global TP and PCT data from the Derwent Innovations Index (DII), retrieving a total of 1,589,172 TP families and 4,067,389 PCT applications. Using International Patent Classification (IPC) codes, we compare the two datasets in three respects: IPC distribution, the IPC co-occurrence network, and the nation-IPC co-occurrence network. To capture the overall similarities and differences between TP and PCT, we compute basic statistics for the global data and for the w-core, which is defined from the w-index. We then visualize the w-cores and calculate global similarities to examine the concordance and differences in detail.

Findings: The results show that the w-core is suitable for selecting the core part of big data, and that TP and PCT are highly concordant. A few differences nevertheless appear in technological convergence, in some specific technical fields (e.g. chemistry, medicine, electronic communication, and lighting technology), and in some countries/regions (e.g. Germany, Japan, China, and Korea).

Practical implications: TP families are very similar to PCT applications in reflecting innovation capability or R&D internationalization at the macro level. For technological convergence, specific research topics, or specific countries/regions, however, the choice between the two may depend on the purpose of the research.
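
To make the IPC co-occurrence network more concrete, the following is a minimal sketch of how co-occurrence edge weights could be counted. It assumes each patent record is available as a list of IPC codes. The function name `ipc_cooccurrence` and the toy records are hypothetical and are not taken from the study's data or pipeline.

```python
from collections import Counter
from itertools import combinations


def ipc_cooccurrence(records):
    """Count how often each unordered pair of IPC codes shares a record.

    `records` is an iterable of IPC-code lists, one list per patent
    family or application (an assumed input format).
    """
    edges = Counter()
    for codes in records:
        # Deduplicate and sort so (a, b) and (b, a) map to one edge key.
        unique = sorted(set(codes))
        edges.update(combinations(unique, 2))
    return edges


# Hypothetical toy records, for illustration only.
records = [
    ["A61K", "A61P", "C07D"],
    ["A61K", "A61P"],
    ["H04L", "H04W"],
]
edges = ipc_cooccurrence(records)
```

Each key in `edges` is a pair of IPC codes and each value is the number of records in which both codes appear, which can serve as the edge weight in a weighted co-occurrence network. Under the same assumption, a nation-IPC network could be counted by pairing each record's country with its IPC codes instead of pairing the codes with each other.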
Keywords