IEEE Access (Jan 2020)

An Automated Process for the Repository-Based Analysis of Ontology Structural Metrics

  • Jose Antonio Bernabe-Diaz,
  • Manuel Franco-Nicolas,
  • Juana Maria Vivo-Molina,
  • Manuel Quesada-Martinez,
  • Astrid Duque-Ramos,
  • Jesualdo Tomas Fernandez-Breis

DOI
https://doi.org/10.1109/ACCESS.2020.3015789
Journal volume & issue
Vol. 8
pp. 148722 – 148743

Abstract


Scientists generally apply quantitative metrics to measure and assess the properties of data and knowledge resources. In ontology engineering, a number of metrics have been developed in the last few years to analyse different features of ontologies. However, this community has not produced a standard framework for studying the properties of ontologies, nor sufficient knowledge about the usefulness and validity of these metrics as measurement instruments for evaluating and comparing ontologies. Recently, 19 ontology structural metrics were studied using the OBO Foundry and AgroPortal ontology repositories. That study was based on how each metric partitioned the two datasets into five groups by applying the k-means algorithm. The results suggested that using five clusters for every metric might be suboptimal. In this paper, we propose an automated process for the study of ontology structural metrics that includes the selection of an optimal number of clusters for each metric. This optimal number is obtained automatically from statistical properties of the generated clusters. Moreover, cosine similarity is used to estimate the similarity of two repositories from the perspective of the behaviour of the same set of metrics. The results on the two datasets give a more realistic perspective on the behaviour of the metrics. We show and discuss the differences observed in the comparative behaviour of the metrics on the two repositories when using the optimal number of clusters rather than a predetermined number for every metric. The proposed method is not specific to ontology metrics and can therefore be applied to other types of metrics.
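
As a rough illustration of the two ideas named in the abstract (choosing a number of clusters per metric, and comparing repositories by cosine similarity), the following minimal Python sketch may help. It is not the authors' implementation. The abstract does not say which statistical properties of the clusters are used, so the mean silhouette width is assumed here purely as a stand-in criterion. The function names, the quantile-based initialisation, the candidate range of k, and the toy values are all hypothetical.

```python
import math


def kmeans_1d(values, k, iters=100):
    """Lloyd's k-means on scalar values; returns one cluster label per value.

    Centres are initialised at evenly spaced quantiles of the sorted data
    (an assumption made here so that the sketch is deterministic).
    """
    n = len(values)
    s = sorted(values)
    centers = [s[int((i + 0.5) * n / k)] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each centre becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def mean_silhouette(values, labels):
    """Mean silhouette width; singleton clusters contribute 0 by convention."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        b = min(sum(abs(v - u) for u in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(values)


def best_k(values, k_range=range(2, 7)):
    """Pick the k with the highest mean silhouette (assumed criterion)."""
    scores = {k: mean_silhouette(values, kmeans_1d(values, k)) for k in k_range}
    return max(scores, key=scores.get)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


# Hypothetical values of one metric over nine ontologies, in three visible groups.
toy_metric = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9, 9.0, 9.1, 8.9]
chosen_k = best_k(toy_metric)
```

Under these assumptions, one would run `best_k` once per metric and per repository, then summarise each repository as a vector with one entry per metric and compare the two vectors with `cosine_similarity`. What the entries of those vectors should be is not specified in the abstract, so that choice is left open here.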

Keywords