28 citations found.
Pearson, W.R., "Searching Protein Sequence Libraries: Comparison of the Sensitivity and Selectivity of the Smith Waterman and FASTA algorithms." Genomics, 11: 635--650, 1991.


This paper is cited in the following contexts:


Bootstrapping and Normalization for Enhanced Evaluations of.. - Green, Brenner (2002)   (1 citation)  (Correct)

.... algorithms for database searching are derivatives of the Needleman-Wunsch dynamic programming algorithm [13] as modified for local alignment by Smith and Waterman [14]. The Smith-Waterman algorithm guarantees the optimal alignment under a given scoring scheme, and the SSEARCH program [15] provides a full implementation. Heuristics that speed up pairwise alignment have been introduced in BLAST [1] and FASTA [3], the two most popular algorithms. WU-BLAST and NCBI BLAST are both implementations of the BLAST algorithm, differing in the way score statistics are generated, as well as ....

W. R. Pearson, "Searching protein sequence libraries: Comparison of the sensitivity and selectivity of the Smith--Waterman and FASTA algorithms," Genomics, vol. 11, pp. 635--650, 1991.
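
The local-alignment recurrence that the excerpt above attributes to Smith and Waterman can be sketched in a few lines. This is a minimal illustration, not the SSEARCH implementation: the function name, the match/mismatch scores and the linear gap penalty are assumptions chosen for brevity (real searches typically use a substitution matrix and affine gap costs).

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of strings a and b.

    Assumed scoring (illustrative only): fixed match/mismatch scores
    and a linear gap penalty.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The floor at 0 is what makes the alignment local:
            # a poorly scoring prefix is simply dropped.
            curr[j] = max(0,
                          prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                          prev[j] + gap,       # gap in b
                          curr[j - 1] + gap)   # gap in a
            best = max(best, curr[j])
        prev = curr
    return best
```

Because every cell of the table is filled, the result is the optimum under the chosen scoring scheme, at quadratic cost in the sequence lengths; heuristics such as BLAST and FASTA trade that guarantee for speed.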


Using Hybrid Alignment for Iterative Sequence Database.. - Li, Lauria, Bundschuh (2003)   (Correct)

....theory also describes this dependence. Thus, an E value can be assigned to a gapless alignment without any further need for computation, which made the original version of BLAST so successful. However, in order to detect weak sequence homologies, it is crucial to allow gaps in an alignment [24]. In the presence of gaps, the E values still follow, according to numerical studies, the universal form (Eq. 1) [31, 10, 19, 33, 34, 2, 23]. However, the numerical values of the two parameters λ and K are not known. There are various approaches to solve this dilemma: for large gap costs there are ....

W. R. Pearson. Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics, 11(3):635--650, November 1991.
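
The excerpt above refers to a universal form for E values governed by two parameters, but Eq. 1 itself is not reproduced here. The sketch below assumes the standard Karlin-Altschul form, E = K m n exp(-lambda S), for a score S between sequences of lengths m and n. The function names and any parameter values passed in are hypothetical, not taken from the cited paper.

```python
import math

def e_value(score, m, n, lam, K):
    """Expected number of chance alignments scoring at least `score`.

    Assumes the Karlin-Altschul form E = K * m * n * exp(-lam * score);
    lam and K must be supplied by the caller.
    """
    return K * m * n * math.exp(-lam * score)

def p_value(score, m, n, lam, K):
    """Probability of at least one chance alignment scoring >= `score`,
    i.e. 1 - exp(-E), computed stably with expm1."""
    return -math.expm1(-e_value(score, m, n, lam, K))
```

For gapless alignment both parameters follow from the scoring scheme; the excerpt's point is that for gapped alignment they have to be estimated in some other way.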


Modelling Expressed Sequence Tags with a Hidden Markov Model - Lottaz   (Correct)

....The method we suggest here is based on mappings between ESTs and mRNAs. These mappings can be compared with annotations for coding sequences on the mRNAs and thus allow deduction of transition probabilities. More precisely, ESTs are aligned with their respective full-length mRNA using ssearch [5]. For each UniGene cluster referring to an mRNA with complete coding sequence annotated, all clustered ESTs are analysed using ssearch. This program applies the Smith-Waterman algorithm to determine optimal local alignments between the mRNA and the corresponding ESTs [8]. An example output of ....

W. R. Pearson. Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics, 11(3):635--650, November 1991.


Speeding up Genome Computations with a Systolic Accelerator - Lavenier   (Correct)

....3) Samba implements a parameterized version of the Smith and Waterman algorithm. By varying a few parameters, local or global comparisons can be performed, with or without gap penalty. A variety of comparison approaches represented by software packages such as blast [1], fasta [10] or ssearch [11] may be sped up via the accelerator. Performing sequence comparison on a systolic array is not a new idea. Other systems based on these structures have been described in the literature. Related projects using a dedicated systolic array are the Bisp [4] and the BioScan [13] machines. Other ....

....Swiss-Prot (version 34: 59,021 sequences, 21,210,389 aa) using the Smith and Waterman algorithm for different lengths of a protein query sequence. The first two lines give, respectively, the execution time (in minute:second) on Samba and on a 150 MHz Dec Alpha workstation running ssearch [11]. Note that the longer the query sequence, the better the speedup. This is mainly due to the restricted bandwidth of the I/O disk system, which prevents the array from being fed at its maximum rate: a short query sequence does not require the computation to be split into several passes. ....

W.R. Pearson, Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith and Waterman and FASTA algorithms, Genomics, 11(1991), pp: 635--650.


Near Optimal Multiple Alignment Within a Band In Polynomial Time - Li, Ma, Wang (2000)   (2 citations)  (Correct)

.... under the rubric of cutting corners in [17]. Alignment within a band is used in the final stage of the FASTA program for rapid searching of protein and DNA sequence databases [13; 14]. Pearson has shown that alignment within a band gives very good results for many protein superfamilies [15]. Other references can be found in [1; 3; 6; 20]. Spouge gives a survey on this topic in [18]. We now define the problem. Definition 3 (c-Diagonal Alignment). Let S = {s_1, s_2, ..., s_n} be a set of n sequences, each of length m, and M an .... [Figure 1: One insertion ....]

W. R. Pearson, Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms, Genomics, 11, pp. 635-650, 1991.
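
The idea of alignment within a band can be illustrated for the simpler pairwise case: only cells within a fixed distance of the main diagonal are evaluated. This is a minimal sketch under assumed unit scores, not the multiple-alignment approximation scheme of the cited paper and not the FASTA implementation. The function name and the `band` parameter are hypothetical.

```python
def banded_alignment_score(a, b, band, match=1, mismatch=-1, gap=-1):
    """Global alignment score of a and b using only cells with |i - j| <= band.

    Returns None when the end cell (len(a), len(b)) lies outside the band,
    i.e. when no alignment within the band exists.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None
    # Row 0: only columns inside the band are stored.
    prev = {j: j * gap for j in range(0, min(m, band) + 1)}
    for i in range(1, n + 1):
        curr = {}
        for j in range(max(0, i - band), min(m, i + band) + 1):
            if j == 0:
                curr[j] = i * gap
                continue
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cands = [prev[j - 1] + sub]         # diagonal is always in the band
            if j in prev:
                cands.append(prev[j] + gap)      # gap in b
            if j - 1 in curr:
                cands.append(curr[j - 1] + gap)  # gap in a
            curr[j] = max(cands)
        prev = curr
    return prev[m]
```

Each row touches at most 2 * band + 1 cells, so the cost grows linearly with sequence length for a fixed band, at the price of missing alignments that wander further from the diagonal.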


Power-Laws in the Size Distribution of Gene Families in.. - Huynen, van Nimwegen (1997)   (1 citation)  (Correct)

....fate of specific gene families, but rather to find possible patterns in the size distribution of the gene families in genomes, and to present models that could account for such a distribution. Methods: The Smith-Waterman algorithm (Smith and Waterman, 1981) as implemented in the FASTA package (Pearson, 1991) was used to compare the protein coding regions within genomes. A previous analysis of this algorithm, in which its predictions were compared to a similarity analysis based on the 3D structure of proteins, showed that the algorithm produced no false positives for E < 0.001 (the E value is the ....

Pearson, W. R. (1991). Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics, 11, 635--650.


Near Optimal Multiple Alignment Within a Band In Polynomial Time - Ma, Li (1997)   (2 citations)  (Correct)

....University of Hong Kong; Ming Li, University of Waterloo. Abstract: Multiple sequence alignment is a fundamental problem in computational biology. Because of its notorious difficulties, aligning sequences within a constant band is a popular practice in bioinformatics with good practical results [17, 13, 14, 15, 1, 3, 6, 20, 18]. However, the problem is still NP-hard for multiple sequences. In this paper, we present a theoretical study of this problem. In particular, we present polynomial time approximation schemes (PTAS) for multiple sequence alignment within a constant band, under standard models of SP alignment and ....

.... under the rubric of cutting corners in [17]. Alignment within a band is used in the final stage of the FASTA program for rapid searching of protein and DNA sequence databases [13, 14]. Pearson has shown that alignment within a band gives very good results for many protein superfamilies [15]. Other references can be found in [1, 3, 6, 20]. Spouge gives a survey on this topic in [18]. We first define our problem. Definition 3 (c-Diagonal Alignment). Let S = {s_1, s_2, ..., s_n} be a set of n sequences, each of length m, and M an alignment of the n sequences. The length of the ....

W. R. Pearson, Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms, Genomics, 11, pp. 635-650, 1991.


Sequence Comparison Significance and Poisson Approximation - Waterman, Vingron (1994)   (8 citations)  (Correct)

....(Altschul et al. 1990) for rapid database searches are very well known and widely used. Both these algorithms are faster than the quadratic time algorithms presented below and both can be considered heuristic approximations for the comparison score we compute here using dynamic programming (Pearson, 1991). Thus the statistical methods we present can be used for these rapid search techniques, and in the case of BLAST are already an integral part of the algorithm. Let us set the stage. Given are two sequences x = x_1 x_2 ... x_n and y = y_1 y_2 ... y_m over a finite ....

PEARSON, W.R. (1991). Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics 11, 635.


SAMBA - Systolic Accelerators For Molecular Biological Applications - Lavenier (1996)   (3 citations)  (Correct)

....in a few tens of seconds on Samba. Samba implements a parameterized Smith and Waterman algorithm [14], [7]. By setting a few parameters differently, local or global comparisons can be performed, with or without gap penalty. Thus, a variety of software, such as blast [1], fasta [12] or ssearch [13], may be implemented on that accelerator. The complete Samba system comprises a workstation, a systolic array of 128 full custom hardwired 12-bit processors, and an FPGA-based interface (see Fig. 1). The FPGA interface is the PeRLe 1 board developed by Vuillemin et al. [3]; it acts as a hardware ....

W.R. Pearson. Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith and Waterman and FASTA algorithms. Genomics, 11:635--650, 1991.


Motif Identification Neural Design For Rapid And Sensitive.. - Cathy Wu Hsi-Lien (1996)   (1 citation)  (Correct)

....a negative (non-member) sequence is accurately predicted (i.e. true negative) if either score is lower than the threshold. Note that both neural network score and P score range between 0.0 (no match) and 1.0 (perfect match). The SSEARCH program (version 1.7A, July 1994) [Smith and Waterman, 1981; Pearson, 1991] was used to determine the overall sequence similarity of a query sequence to the neural network training sequences. Comparative studies. The MOTIFIND results were compared to those obtained by the PROSITE, BLAST and BLIMPS search methods. Different cutoff scores were selected for every method ....

Pearson, W. R. (1991) Searching protein sequence libraries: comparison of the sensitivity and the selectivity of the Smith-Waterman and FASTA algorithms. Genomics, 11, 635-650.


Classifying Molecular Sequences Using a Linkage Graph.. - Matsuda, Ishihara.. (1999)   (2 citations)  (Correct)

....functions as a methyltransferase. RhaS and Ogt function as a transcription activator and as a methyltransferase, respectively. The multidomain structure among these proteins reflects the bifunctionality of Ada. In Fig. 1, the similarities among these proteins are calculated by the SSEARCH program [20], an implementation of the Smith-Waterman algorithm [22], with the BLOSUM 50 score matrix [13] for computing #(p_i, q_j) in Definition 2.2. In this example, the triangle inequality: D(P, Q) ≤ D(P, R) + D(R, Q) for all P, Q, R: ....

W.R. Pearson, Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith--Waterman and FASTA algorithms, Genomics 11 (1991) 635--650.


The Frequency Distribution of Gene Family Sizes in Complete .. - Huynen, van Nimwegen (1998)   (4 citations)  (Correct)

....families, but rather to find possible patterns in the size distribution of the gene families in genomes, and to present a general model that could account for such a distribution. Materials and Methods: The Smith-Waterman algorithm (Smith and Waterman, 1981) as implemented in the FASTA package (Pearson, 1991) was used to compare the protein coding regions within genomes. A previous analysis of this algorithm, in which its predictions were compared to a similarity analysis based on the 3D structure of proteins, showed that the algorithm produced no false positives for E < 0.001 (the E value is the ....

W. R. Pearson (1991). Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics, 11, 635--650.


An Analytic Approach to Significance Assessment in Local.. - Bundschuh (1999)   (Correct)

....trees [32, 10]. There are two classes of alignment algorithms. The simpler gapless alignment, as it was implemented e.g. in the original BLAST [1], is very fast and theoretically very well understood. However, in order to detect weakly homologous sequences, gaps have to be allowed in an alignment [25], which leads to the more sophisticated Smith-Waterman algorithm [28]. Both types of alignment algorithms have the drawback that they will find an optimal alignment and an optimal score for any pair of sequences, even randomly chosen and thus completely unrelated ones. Thus, it is necessary to ....

.... in the Gumbel distribution (Eq. 4) of the maximal alignment scores. In order to detect weak similarities between sequences separated by a large evolutionary distance, gaps have to be allowed within an alignment to compensate for insertions or deletions that occurred during the course of evolution [25]. Here, we will specifically consider Smith-Waterman local alignment [28]. In this case, a possible alignment A still consists of two substrings of the two original sequences a and b. But now, these subsequences may have different lengths, since gaps may be inserted in the alignment. For ....

Pearson, W.R. 1991. Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics 11, 635-650.


Rapid Assessment of Extremal Statistics for Gapped Local.. - Olsen, Bundschuh, Hwa (1999)   (2 citations)  (Correct)

....For database searches, the most commonly used are gapless alignments such as the original BLAST (Altschul et al. 1990). More sophisticated is the Smith-Waterman algorithm (Smith and Waterman 1981), which allows for the insertion of gaps. The latter is needed to detect weakly homologous sequences (Pearson 1991). Both alignments with and without gaps are designed to work in the local alignment regime, where the alignment scores of unrelated sequences are typically very small, so that the ....

Pearson, W.R. 1991. Searching Protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics 11:635--650.


A New Method for Database Searching and Clustering - Krause, Vingron (1997)   (Correct)

.... engrailed in their annotation with the exception of two entries. These two were annotated engrailed-like and one of them was a fragment of only 60 amino acids length. In Figure 2 we compare the SYSTERS result to the search output generated by a rigorous Smith-Waterman ([17], SSEARCH program [14]) alignment of the seed to the database. Many of these comparisons were studied in order to determine down to which significance level in the Smith-Waterman search cluster members were identified and also which higher scoring sequences were not included into the SYSTERS clusters. The statistical ....

Pearson, W.R., "Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith--Waterman and FASTA algorithms", Genomics, 11(3):635-650, 1991.


Speeding Up Genome Computations With a Systolic Accelerator - Lavenier (1998)   (Correct)

....Smith and Waterman algorithm. By varying a few parameters it is possible to perform local or global comparisons, with or without gap penalties. The accelerator can be used to speed up a variety of comparison approaches, represented by such software packages as blast [1], fasta [10] and ssearch [11]. Performing sequence comparisons on a systolic array is not a new idea. Other systems based on these structures have been described in the literature. Related projects that have used dedicated systolic arrays are the Bisp [4] and the BioScan [13] machines. Other machines, such as Kestrel [7] or ....

....protein bank (version 34, which contains 59,021 sequences and 21,210,389 amino acids) for protein query sequences of various lengths with the Smith and Waterman algorithm. The first two rows of the table give the execution times for Samba and for a 150 MHz Dec Alpha workstation running ssearch [11]. As the times show, the longer the query sequence, the better the speedup. This is due mainly to the restricted bandwidth of the Samba I/O disk system, which prevents the array from being fed at its maximum rate: a short query sequence does not require the computation to be split into several ....

W.R. Pearson, Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith and Waterman and FASTA algorithms, Genomics, 11(1991), pp: 635--650.


Aligning Sequences with Non-Affine Gap Penalty: - Plains Algorithm Practical   (Correct)

No context found.

Pearson, W.R., "Searching Protein Sequence Libraries: Comparison of the Sensitivity and Selectivity of the Smith Waterman and FASTA algorithms." Genomics, 11: 635--650, 1991.


PatternHunter II: Highly Sensitive and Fast Homology Search - Li, Ma, Kisman, Tromp (2003)   (8 citations)  (Correct)

No context found.

Pearson, W.R., Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms, Genomics, 11:635--650, 1991.


Genome Informatics 14: 464--465 (2003) Analysis of.. - Prokaryotic Genomes..   (Correct)

No context found.

Pearson W.R., Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms, Genomics, 11(3):635--650, 1991.


Homology Search Methods - Brown, Li, Ma (2003)   (Correct)

No context found.

W. R. Pearson. Searching protein sequence libraries: Comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics, 11:635-650, 1991.


PatternHunter II: Highly Sensitive and Fast Homology.. - Li, Ma, Kisman, Tromp   (8 citations)  (Correct)

No context found.

Pearson, W.R., Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms, Genomics 11, 635-650, 1991.


PatternHunter II: Highly Sensitive and Fast Homology Search - Li, Ma, al. (2004)   (8 citations)  (Correct)

No context found.

W. R. Pearson, "Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms", Genomics 11, 635--650 (1991).


Coupling Hundreds of Workstations for Parallel Molecular.. - Strumpen (1995)   (10 citations)  (Correct)

No context found.

W. R. Pearson, `Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith--Waterman and FASTA algorithms', Genomics, 11, (3), 635--650 (1991).


Human Proton/Oligopeptide Transporter (POT) Genes.. - Botka, Wittig, C. (2000)   (Correct)

No context found.

Pearson WR. Searching protein sequence libraries: Comparison of the sensitivity and selectivity of the Smith-Waterman algorithms. Genomics 1991;11:635-650.



CiteSeer.IST - Copyright Penn State and NEC