IEEE Access (Jan 2024)

On the Generative Power of ReLU Network for Generating Similar Strings

  • Mamoona Ghafoor,
  • Tatsuya Akutsu

DOI
https://doi.org/10.1109/ACCESS.2024.3387306
Journal volume & issue
Vol. 12
pp. 52603 – 52622

Abstract

Generative networks are now widely used in applied fields, including computational biology, for data augmentation, DNA sequence generation, and drug discovery. The core idea of these networks is to generate new data instances that resemble a given set of data. However, it is unclear how many nodes and layers are required to generate the desired data. In this context, we study the problem of generating strings within a given Hamming distance or edit distance of a reference string; both distances are commonly used in computational biology for sequence comparison and for error detection and correction, and help to comprehend genetic variations, mutations, and evolutionary changes. More precisely, for a given string $e$ of length $n$ over a symbol set $\Sigma$ with $m = |\Sigma|$, we prove that all strings over $\Sigma$ with Hamming distance and edit distance at most $d$ from $e$ can be generated by generative networks with the rectified linear unit (ReLU) function as the activation function. These networks have constant depth and are of size $\mathcal{O}(nd)$ and $\mathcal{O}(\max(md, nd))$.
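
To make the target sets in the abstract concrete, the following minimal Python sketch computes the two distances and enumerates, by brute force, every string over $\Sigma$ within Hamming distance $d$ of $e$. It is not the ReLU network construction from the paper; the function names (`hamming_distance`, `edit_distance`, `hamming_ball`) are hypothetical and chosen only for illustration, and the edit distance is assumed to be the standard Levenshtein distance with unit-cost insertion, deletion, and substitution.

```python
from itertools import combinations, product


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def hamming_ball(e: str, sigma: str, d: int) -> set[str]:
    """All strings over sigma with Hamming distance at most d from e.

    Brute-force enumeration (illustrative only): choose k <= d positions,
    then replace each chosen position with a different symbol.
    """
    n = len(e)
    result = set()
    for k in range(min(d, n) + 1):
        for positions in combinations(range(n), k):
            choices = [[s for s in sigma if s != e[p]] for p in positions]
            for replacement in product(*choices):
                chars = list(e)
                for p, s in zip(positions, replacement):
                    chars[p] = s
                result.add("".join(chars))
    return result
```

For example, `hamming_ball("ACG", "ACGT", 2)` returns the reference string together with every string that differs from it in one or two positions. The size of this set grows quickly with $n$, $m$, and $d$, which is why a bound on the network size needed to generate it is of interest.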

Keywords