| P. A. Pevzner, M. Y. Borodovski, and A. A. Mironov. Linguistic of nucleotide sequences: The significance of deviation from mean statistical characteristics and prediction of the frequencies of occurrence of words. J. Biomol. Struct. Dyn, 6:1013-1026, 1989. |
....have the same length $m$. Notation: The probability of a character $X$ is denoted $p_X$. Given a pattern $H$, one denotes $P(H) = \prod_{i=1}^{m} p_{H_i}$, where $H_i$ is the $i$-th character of the pattern. Given a set $\mathcal{H}$, one denotes $P(\mathcal{H}) = \sum_{H \in \mathcal{H}} P(H)$. Denote $n$ the length of the random sequence. It was observed in [Pevzner et al. 1991] that the variance depends on the structure of the words. More precisely: Definition 1. Given two strings $H$ and $F$, the overlapping set is the set of $H$ suffixes that are $F$ prefixes. Associated $F$ suffixes form the correlation set $\mathcal{A}_{H,F}$. One denotes: $A_{H,F} = \sum_{w \in \mathcal{A}_{H,F}} P(w)$. Intuitively, when a word in ....
Pevzner, P., Borodovski, M., & Mironov, A. (1991). Linguistic of Nucleotide sequences: The Significance of Deviations from the Mean: Statistical Characteristics and Prediction of the Frequency of Occurrences of Words. J. Biomol. Struct. Dynam. 6, 1013-1026.
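The correlation set in Definition 1 above can be illustrated with a minimal Python sketch. It assumes independent, identically distributed characters (the Bernoulli model), and it assumes that the empty word belongs to the correlation set when a whole word overlaps itself. The function names `pattern_probability`, `correlation_set`, and `correlation_weight` are hypothetical and do not come from the citing paper.

```python
from math import prod


def pattern_probability(word, probs):
    """P(word) under independent characters; the empty word has probability 1."""
    return prod(probs[c] for c in word)


def correlation_set(h, f):
    """Suffixes of f left over after a suffix of h matches a prefix of f.

    Assumption: a full overlap (h == f) contributes the empty word.
    """
    longest = min(len(h), len(f))
    return [f[k:] for k in range(1, longest + 1) if h.endswith(f[:k])]


def correlation_weight(h, f, probs):
    """A_{H,F}: sum of P(w) over the words w of the correlation set."""
    return sum(pattern_probability(w, probs) for w in correlation_set(h, f))


uniform = {c: 0.25 for c in "ACGT"}
print(correlation_set("TAT", "ATT"))              # ['T']
print(correlation_weight("ATA", "ATA", uniform))  # 1.0625
```

For H = TAT and F = ATT, only the suffix AT of H is a prefix of F, so the correlation set is {T}. For the self-overlapping word ATA, the set is {TA, empty word}, which gives 0.0625 + 1 under uniform character probabilities.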
....on the letters of $S$, the probability of $H$ is defined as $P(H) = \prod_{i=1}^{|H|} p_{h_i}$, where $h_i$ denotes the $i$-th character of $H$. By convention, the empty string has probability 1. Finding a pattern in a random text is, in some sense, correlated to the previous occurrences of the same or other patterns [13]. Hence, for example, the probability of finding $H_1 = ATT$ knowing that one has just found $H_2 = TAT$ is intuitively rather good, since a T right after $H_2$ is enough to give $H_1$. Correlation polynomials and correlation functions give a way to formalize this intuition. Definition 2. The ....
P.A. Pevzner, M. Borodovski, and A. Mironov. Linguistic of Nucleotide sequences: The Significance of Deviations from the Mean: Statistical Characteristics and Prediction of the Frequency of Occurrences of Words. J. Biomol. Struct. Dynam., 6:1013-1026, 1991.
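The ATT/TAT intuition in the context above can be checked numerically. The sketch below is one illustrative reading of that example, not the correlation polynomial the citing paper goes on to define: it compares the unconditional probability of ATT with the probability that ATT is completed by the single character following an occurrence of TAT. Independent characters are assumed, and the names `word_probability` and `prob_completed_by_next_char` are hypothetical.

```python
from math import prod


def word_probability(word, probs):
    """P(word) as the product of its character probabilities (1 for the empty word)."""
    return prod(probs[c] for c in word)


def prob_completed_by_next_char(previous, target, probs):
    """Probability that the one character following `previous` completes `target`.

    Assumes independent characters; sums the probability of every character c
    for which previous + c ends with target.
    """
    return sum(p for c, p in probs.items() if (previous + c).endswith(target))


uniform = {c: 0.25 for c in "ACGT"}
print(word_probability("ATT", uniform))                    # 0.015625
print(prob_completed_by_next_char("TAT", "ATT", uniform))  # 0.25
```

Under uniform character probabilities, seeing TAT raises the chance of ATT ending at the next position from 1/64 to 1/4, since only a T is needed.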
.... $i \in [1 : n-m+1]$ to be 1 if $y$ occurs in $x$ starting at position $i$, 0 otherwise, so that $Z_y = \sum_{i=1}^{n-m+1} Z_i$ is the random variable for $f(y)$. Expressions for the expectation and variance for the number of occurrences in the Bernoulli model have been given by several authors (see, e.g., [9, 12, 6, 5, 10]). Here we adopt derivations in [1]. With $p_a$ the probability of symbol $a \in \Sigma$ and $p = \prod_{i=1}^{m} p_{y[i]}$ we have
$$E(Z_y) = (n-m+1)\,p$$
$$\mathrm{Var}(Z_y) = \begin{cases} (1-p)\,E(Z_y) - p^2 (2n-3m+2)(m-1) + 2p\,B(y) & \text{if } m \le (n+1)/2 \\ (1-p)\,E(Z_y) - p^2 (n-m+1)(n-m) + 2p\,B(y) & \text{otherwise} \end{cases}$$
where $B(y) = \sum$ ....
Pevzner, P. A., Borodovsky, M. Y., and Mironov, A. A. Linguistics of nucleotides sequences I: The significance of deviations from mean statistical characteristics and prediction of the frequencies of occurrence of words. J. Biomol. Struct. Dynamics 6 (1989), 1013-1026.
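The expectation and variance quoted in the context above can be sketched in Python. The snippet cuts off before the definition of B(y), so the code fills it in with an assumption: B(y) is taken to be the sum, over every period d of y with d <= n - m, of (n - m + 1 - d) times the probability of the last d characters of y. This is the overlap term obtained by expanding the covariances of the indicators Z_i directly under the Bernoulli model; it may differ in form from the expression in the citing paper. The names `periods` and `count_moments` are hypothetical.

```python
from math import prod


def periods(y):
    """All d in [1, m-1] such that y overlaps itself when shifted by d."""
    m = len(y)
    return [d for d in range(1, m) if y[d:] == y[:m - d]]


def count_moments(y, n, probs):
    """Mean and variance of the number of occurrences of y in a random text of length n.

    Assumes independent characters with probabilities `probs` and m <= n.
    The overlap term b is an assumed form of B(y); see the lead-in.
    """
    m = len(y)
    if m > n:
        raise ValueError("pattern is longer than the text")
    p = prod(probs[c] for c in y)
    positions = n - m + 1
    mean = positions * p
    # Assumed B(y): periods d that fit in the text, weighted by the pair count.
    b = sum((positions - d) * prod(probs[c] for c in y[m - d:])
            for d in periods(y) if d < positions)
    if m <= (n + 1) / 2:
        pairs = (2 * n - 3 * m + 2) * (m - 1)
    else:
        pairs = positions * (n - m)
    variance = (1 - p) * mean - p * p * pairs + 2 * p * b
    return mean, variance


fair = {"A": 0.5, "B": 0.5}
print(count_moments("AA", 3, fair))   # (0.5, 0.5)
print(count_moments("AAA", 4, fair))  # (0.25, 0.3125)
```

Both printed cases can be verified by hand from the two indicator variables involved: the first falls in the m <= (n+1)/2 branch and the second in the other branch.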
No context found.
P. A. Pevzner, M. Y. Borodovski, and A. A. Mironov. Linguistic of nucleotide sequences: The significance of deviation from mean statistical characteristics and prediction of the frequencies of occurrence of words. J. Biomol. Struct. Dyn, 6:1013-1026, 1989.
No context found.
Pevzner, P. A., Borodovski, M. Y., and Mironov, A. A. Linguistic of nucleotide sequences: The significance of deviation from mean statistical characteristics and prediction of the frequencies of occurrence of words. J. Biomol. Struct. Dyn 6 (1989), 1013-1026.