Abstract: We present a complete analysis of the statistics of the number of occurrences of a regular expression pattern in a random text. This covers "motifs" widely used in computational biology. Our approach is based on: (i) a constructive approach to classical results in theoretical computer science (automata and formal language theory), in particular, the rationality of generating functions of regular languages; (ii) analytic combinatorics, which is used for deriving asymptotic properties from generating...
Context of citations to this paper:
...turns out to be .1370028784e-2 = 1747198482e-2 . Interestingly enough, this would change positive Z score 22 in (Nicodème et al. 1999) into negative Z score -6.020723691. if dinucleotide frequencies are the same in both experiments) Regulatory sites We...
...the l 1 previous letters. The process is clearly linear according to the length of the sequence. In fact, in this model, recent works [17, 20] allow to compute analytically some parameters (expected number of occurrences, variance, etc) concerning the frequencies of appearance...
Cited by:
Reliable Detection of Episodes in Event Sequences - Gwadera, Atallah, Szpankowski (2004)
Three Variations on Word Counting - Régnier, Lifanov, Makeev
Random Generation of Words of Algebraic Languages.. - Denise, Roques, Termier
Similar documents (at the sentence level):
64.8%: Motif Statistics - Nicodème, Salvy, Flajolet (1999)
Active bibliography (related documents):
0.5: Structured Documents Processing Using Lex and Yacc - Timoshkina, Bogoyavlenskiy, .. (2001)
0.3: Checking and Bounding the Solutions - Of Some Recurrence
0.2: Regexpcount, a Symbolic Package for Counting Problems on.. - Nicodème (2003)
Similar documents based on text:
1.2: On Generating Functions of Generating Trees - Banderier.. (1999)
1.0: Fast Computation with Two Algebraic Numbers - Bostan, Flajolet, Salvy, Schost (2002)
0.8: Fast Approximate Motif Statistics - Nicodème (2001)
Related documents from co-citation:
2: Systematic and Automated Discovery of Patterns in PROSITE Families - Hart, Royyuru et al. - 2000
2: Complexity of Unusual Words Counting - Régnier
2: Selection of DNA binding sites by regulatory proteins - Berg, von Hippel - 1987
BibTeX entry:
Nicodème, P., Salvy, B., & Flajolet, P. (1999). Motif statistics. In: ESA'99 volume 1643 of Lecture Notes in Computer Science pp. 194--211, Springer-Verlag. Proc. European Symposium on Algorithms-ESA'99, Prague. http://citeseer.ist.psu.edu/eme99motif.html
@inproceedings{ nicodeme99motif,
author = "P. Nicod\`eme and B. Salvy and P. Flajolet",
title = "Motif statistics",
booktitle = "Proc. European Symposium on Algorithms --- ESA'99, Prague",
volume = "1643",
series = "Lecture Notes in Computer Science",
pages = "194--211",
publisher = "Springer-Verlag",
year = "1999",
url = "citeseer.ist.psu.edu/eme99motif.html" }
Citations (may not include all citations):
2003: The art of computer programming - Knuth - 1981
1911: Introduction to automata theory - Hopcroft, Ullman - 1979
466: Probability and Measure - Billingsley - 1986
101: Central and local limit theorems applied to asymptotic enume.. - Bender - 1973
101: Central and local limit theorems applied to asymptotic enume.. - Bender, Richmond et al. - 1983
64: From regular expressions to deterministic automata - Berry, Sethi - 1986
60: Gfun: a Maple package for the manipulation of generating and.. - Salvy, Zimmermann - 1994
52: Introduction to Computational Biology: Maps - Waterman - 1995
36: Théorèmes limites pour les structures combinatoires et les foncti.. - Hwang - 1994
35: The algebraic theory of context-free languages - Chomsky, Schützenberger - 1963
30: On pattern frequency occurrences in a Markovian sequence - Régnier, Szpankowski - 1998
29: New-York - Annual, on et al. - 1998
17: The distribution of subword counts is usually normal - Bender, Kochman - 1993
8: The average case analysis of algorithms: Multivariate asympt.. - Flajolet, Sedgewick - 1997
8: Problems and theorems in linear algebra - Prasolov - 1994
8: Nucleic Acids Res - database, in - 1997
7: Method for calculation of probability of matching a bounded .. - Sewell, Durbin - 1995
5: String overlaps - Guibas, Odlyzko - 1981
4: Dynamical sources in information theory: Fundamental interva.. - Vallée - 1998
3: Effective asymptotics of linear recurrences with rational coeffic.. - Gourdon, Salvy - 1996
2: A first course in formal language theory - Rayward-Smith - 1983
2: and Tichy - Flajolet, Kirschenhofer - 1988
2: and Hofman - Bairoch, Bucher - 1997
2: Computer Science and Information Processing - Seminumerical - 1997
2: Finding words with unexpected frequencies in deoxyribonuclei.. - the, byD et al. - 1995
2: Automata and formal languages - Ecole, Palaiseau et al. - 1995
1: Calculating the exact probability of language-like patterns i.. - Atteson - 1998
1: Regular expressions into finite automata - Brüggemann-Klein - 1993
1: Probability Theory and Related Fields - uniformity, strings - 1959
1: Linguistic of nucleotide sequences: The significance of devia.. - Pevzner, Borodovski et al. - 1989
1: A unified approach to words statistics - appear, Régnier et al. - 1998
1: Exceptional motifs in different Markov chain models for a stat.. - Schbath, Prum et al. - 1995
Documents on the same site (http://algo.inria.fr/flajolet/Publications/publist.html):
Dynamical Sources in Information Theory: A General .. - Clément.. (1999)
Hidden Pattern Statistics - Flajolet, Guivarc'h, al. (2001)
Generating Functions for Generating Trees - Banderier.. (1999)