Dissimilarity measures based on the application of Hamming distance to generate controlled probabilistic tests

V. N. Yarmolik; V. V. Petrovskaya; N. A. Shevchenko

doi:10.37661/1816-0301-2024-21-2-54-72

Informatika (Jun 2024)

Affiliations

V. N. Yarmolik: Belarusian State University of Informatics and Radioelectronics
V. V. Petrovskaya: Belarusian State University of Informatics and Radioelectronics
N. A. Shevchenko: Technical University of Darmstadt

DOI: https://doi.org/10.37661/1816-0301-2024-21-2-54-72
Journal volume & issue: Vol. 21, no. 2
pp. 54 – 72

Abstract

Objectives. This article addresses the problem of constructing dissimilarity measures, based on the Hamming distance, for generating controlled random binary test sets. The main goal is to develop ways of applying the Hamming distance that can distinguish between test sets when other difference measures give identical estimates for them.

Methods. Building on the Hamming distance as used in the theory and practice of controlled random test generation, new dissimilarity measures are proposed for two n-bit binary test patterns. The proposed measures are based on forming sets of Hamming distances for the original patterns represented as sequences of characters from different alphabets.

Results. It is shown that pairs of binary test sets Ti and Tk can be indistinguishable under a dissimilarity measure based on the Hamming distance, because different pairs of sets may have identical Hamming distance values. To construct new dissimilarity measures, the original binary test sequences are represented as sequences of characters belonging to different alphabets. Several strategies are proposed for applying the new measures, each based on one of three rules, when generating controlled probabilistic tests. For all three dissimilarity measures, only the first few components are informative, as a rule no more than two or three. Accordingly, the computational complexity of all three options is comparable and does not exceed 3n comparison operations. The experimental studies confirm the effectiveness of the proposed dissimilarity measures and their low computational complexity.

Conclusion. The proposed dissimilarity measures expand the options for generating test sets when forming controlled random tests. It is shown that test sets that are indistinguishable when the Hamming distance is used as a dissimilarity measure have different values of the proposed measures, which makes it possible to classify randomly generated candidate test sets more accurately.
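
The indistinguishability described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the alphabets are built, so the code below assumes one plausible reading: the bit string is split into consecutive blocks of 1, 2 and 4 bits, and each block is treated as one character of a larger alphabet. The function names `hamming` and `distance_vector`, the block sizes, and the example patterns are hypothetical and are not taken from the article.

```python
def hamming(a: str, b: str, block: int = 1) -> int:
    """Hamming distance between two equal-length bit strings.

    The strings are compared block by block; with block=1 this is the
    ordinary bitwise Hamming distance. Larger blocks treat each group of
    `block` bits as a single character (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("patterns must have equal length")
    if len(a) % block != 0:
        raise ValueError("pattern length must be a multiple of the block size")
    return sum(a[i:i + block] != b[i:i + block] for i in range(0, len(a), block))


def distance_vector(a: str, b: str, blocks=(1, 2, 4)) -> tuple:
    """Hamming distances of the same pair over several assumed alphabets."""
    return tuple(hamming(a, b, k) for k in blocks)


# Hypothetical 8-bit test patterns.
t_i = "00000000"
t_k = "11000000"
t_m = "10100000"

# Both candidates are at bitwise Hamming distance 2 from t_i ...
print(hamming(t_i, t_k), hamming(t_i, t_m))   # 2 2

# ... but the second component already separates them.
print(distance_vector(t_i, t_k))              # (2, 1, 1)
print(distance_vector(t_i, t_m))              # (2, 2, 1)
```

In this sketch each component takes at most n block comparisons, so three components stay within the 3n bound on comparison operations stated in the abstract.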

Published in Informatika

ISSN: 1816-0301 (Print)
Publisher: National Academy of Sciences of Belarus, the United Institute of Informatics Problems
Country of publisher: Belarus
LCC subjects: Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: http://inf.grid.by
