4 open (Jan 2019)
Stochastic profile of Epstein-Barr virus in nasopharyngeal carcinoma settings
Abstract
We build a profile of the Epstein-Barr virus (EBV) from genomic sequences obtained from patients with nasopharyngeal carcinoma (NPC). We consider a set of sequences taken from the free NCBI source, and we assume that this set is a collection of independent samples of stochastic processes related by an equivalence relation. Given a collection $\{(X_t^j)_{t\in\mathbb{Z}}\}_{j=1}^p$ of p independent discrete-time Markov processes with finite alphabet A and state space S, we say that the elements (i, s) and (j, r) in {1, 2,…, p} × S are equivalent if and only if they share the same transition probability for every element of the alphabet. The equivalence reduces the number of parameters to be estimated in the model without deleting states of S to achieve that reduction. Through the equivalence relation, we build a global profile for all the EBV in NPC sequences; this model represents the common stochastic law underlying the set of sequences. The equivalence classes define an optimal partition of {1, 2,…, p} × S, and it is in relation to this partition that we define the profile of the set of genomic sequences.
Keywords