Сучасний стан наукових досліджень та технологій в промисловості (Jun 2020)

METRICS FOR EVALUATING CONSISTENCY IN DISTRIBUTED DATASTORES

  • Galyna Zholtkevych

DOI
https://doi.org/10.30837/2522-9818.2020.12.040
Journal volume & issue
no. 2 (12)

Abstract

The subject of the paper is metrics for evaluating the consistency of a distributed datastore, consistency being one of the main CAP guarantees and, more precisely, a criterion for a reliable distributed datastore. The goal of the research is to investigate whether such metrics can be developed at an early stage of building a distributed network, and to build some components of a decision-making algorithm whose purpose is to construct an optimal network topology. This decision-making algorithm should be suitable for any business model and its requirements. To this end, the following tasks were completed: a mathematical model for a stochastic consistency metric in a distributed datastore was built, and the conditions governing consistency convergence time were investigated in an initially perfect datastore environment. The methods used are the theory of number partitions, the basics of graph theory and probability theory, computer modeling, and a program for running sets of experiments. As a result, it is established that, in the absence of data loss, the consistency convergence time after the first write request is less than or equal to the diameter of the graph that represents the topology of the distributed network. This convergence time has the same unit of measure as the link cost of each link in the network. A stochastic model is also proposed for the metric that evaluates consistency. In conclusion, the metric makes it possible to investigate or monitor the current state of the system over a given time interval. This research is the basis for forming some elements of a decision-making algorithm for building the topology of a distributed network, and the elements of an algorithm for monitoring such a system. In addition, based on trends in the frequency of data modification and read requests, a strategy for allocating nodes in the topology is suggested, which can improve the response time and the speed with which the distributed storage converges to a fully consistent state or one close to it.
The practical role of the components of the decision-making algorithm is that a network architect can apply the algorithm at the stage of building the network for a distributed database, so that the CAP characteristics are optimized in the context of specific business needs. The mathematical model for the stochastic metric of distributed storage consistency can be applied both at the system design stage, to test whether the level of consistency is satisfactory, and at the system operation stage, as a component of the network monitoring system.
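
The bound stated in the abstract (convergence time after the first write is at most the graph diameter, measured in link-cost units) can be illustrated with a minimal sketch. It is not the paper's program: it assumes a connected topology with no data loss, assumes that an update spreads along cheapest paths so that a replica receives it at its shortest-path distance from the writer, and uses a hypothetical four-node topology with made-up link costs. The names `propagation_times`, `convergence_time`, `diameter` and `TOPOLOGY` are introduced here for illustration only.

```python
import heapq


def propagation_times(graph, source):
    """Earliest time each node receives an update written at `source`.

    Assumption: the update travels along cheapest paths (Dijkstra),
    and no message is lost.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def convergence_time(graph, writer):
    """Time until every reachable replica holds the write (writer's eccentricity)."""
    return max(propagation_times(graph, writer).values())


def diameter(graph):
    """Largest shortest-path distance between any two nodes (connected graph assumed)."""
    return max(convergence_time(graph, node) for node in graph)


# Hypothetical undirected topology: node -> {neighbor: link cost}.
TOPOLOGY = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"b": 5, "c": 1},
}

if __name__ == "__main__":
    d = diameter(TOPOLOGY)
    for writer in TOPOLOGY:
        t = convergence_time(TOPOLOGY, writer)
        print(f"write at {writer}: converged after {t} (diameter {d})")
```

Under these assumptions the convergence time for a write at a given node is that node's eccentricity, so it can never exceed the diameter, and both quantities carry the unit of the link costs.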

Keywords