
Tandem repeats finder: a program to analyze DNA sequences (1999)

by G Benson
Venue: Nucleic Acids Res
Results 1 - 10 of 961

The diploid genome sequence of an individual human

by Samuel Levy, Granger Sutton, Pauline C. Ng, Lars Feuk, Aaron L. Halpern, Brian P. Walenz, Nelson Axelrod, Jiaqi Huang, Ewen F. Kirkness, Gennady Denisov, Yuan Lin, Jeffrey R. MacDonald, Andy Wing, Chun Pang, Mary Shago, Timothy B. Stockwell, Alexia Tsiamouri, Vineet Bafna, Vikas Bansal, Saul A. Kravitz, Dana A. Busam, Karen Y. Beeson, Tina C. McIntosh, Karin A. Remington, Josep F. Abril, John Gill, Jon Borman, Yu-Hui Rogers, Marvin E. Frazier, Stephen W. Scherer, Robert L. Strausberg, J. Craig Venter - PLoS Biol
Abstract - Cited by 293 (6 self)
Presented here is a genome sequence of an individual human. It was produced from ~32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2–206 bp), 292,102 heterozygous insertion/deletion events (indels) (1–571 bp), 559,473 homozygous indels (1–82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, it involves 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of
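The 22% figure quoted in the abstract can be checked from the event counts it lists. The sketch below is only a sanity check, and it assumes that the "events" are exactly the five counted categories named in the abstract (segmental duplications and copy number variation regions are described only as "numerous", so they are left out).

```python
# Variant counts as quoted in the abstract above.
snps = 3_213_401
non_snp_counts = {
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}

# Assumption: only the categories with explicit counts are included.
non_snp_total = sum(non_snp_counts.values())
all_events = snps + non_snp_total
non_snp_share = non_snp_total / all_events

print(f"non-SNP events:     {non_snp_total:,}")   # 905,488
print(f"all counted events: {all_events:,}")      # 4,118,889
print(f"non-SNP share:      {non_snp_share:.1%}") # 22.0%
```

Under that assumption the total of 4,118,889 events also agrees with the "more than 4.1 million DNA variants" stated in the abstract.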

Citation Context

...]. The values of θindel in repetitive sequence regions are 1.2 × 10⁻⁴ for regions identified by RepeatMasker (http://www.repeatmasker.org) and 4.9 × 10⁻⁴ for regions identified by TandemRepeatFinder [48], respectively. Thus, the indel diversity in repetitive regions is between 1.4 and 5.8 times higher than the genome-wide rate. This suggests that the high value of θindel over all loci is likely media...

SSAHA: a fast search method for large DNA databases

by Zemin Ning, Anthony J. Cox, James C. Mullikin - Genome Res , 2001
Cited by 211 (10 self)

mreps: efficient and flexible detection of tandem repeats in DNA

by Roman Kolpakov, Ghizlane Bana, Gregory Kucherov - Nucleic Acids Res , 2003
Abstract - Cited by 95 (3 self)
The presence of repeated sequences is a fundamental feature of genomes. Tandemly repeated DNA appears in both eukaryotic and prokaryotic genomes; it is associated with various regulatory mechanisms and plays an important role in genomic fingerprinting. In this paper, we describe mreps, a powerful software tool for fast identification of tandemly repeated structures in DNA sequences. mreps is able to identify all types of tandem repeats within a single run on a whole genomic sequence. It has a resolution parameter that allows the program to identify ‘fuzzy’ repeats. We introduce the main algorithmic solutions behind mreps, describe its usage, give some execution time benchmarks and present several case studies to illustrate its capabilities. The mreps web interface is accessible through

Citation Context

...ndem repeats. The mini-satellite database (http://minisatellites.u-psud.fr/) (14) collects and stores short tandem repeats of a certain number of species, computed with the Tandem Repeats Finder tool (15). A more specialized STDR database Short Tandem Repeat DNA Internet Database (http://www.cstl.nist.gov/biotech/strbase/) focuses on short tandem repeats involved in genetic mapping and identity testin...

YASS: enhancing the sensitivity of DNA similarity search

by Laurent Noé, Gregory Kucherov - Nucleic Acids Res , 2005
Abstract - Cited by 93 (17 self)
YASS is a DNA local alignment tool based on an efficient and sensitive filtering algorithm. It applies transition-constrained seeds to specify the most probable conserved motifs between homologous sequences, combined with a flexible hit criterion used to identify groups of seeds that are likely to exhibit significant alignments. A web interface (http://www.loria.fr/projects/YASS/) is available to upload input sequences in fasta format, query the program and visualize the results obtained in several forms (dot-plot, tabular output and others). A standalone version is available for download from the web page.

An Algorithm for Approximate Tandem Repeats

by Gad M. Landau, Jeanette P. Schmidt, Dina Sokol - In Proceedings of the 4th Annual Symposium on Combinatorial Pattern Matching (CPM), volume 684 of Lecture Notes in Computer Science , 1993
Abstract - Cited by 88 (3 self)
A perfect single tandem repeat is defined as a nonempty string that can be divided into two identical substrings, e.g. abcabc. An approximate single tandem repeat is one in which the substrings are similar, but not identical, e.g. abcdaacd.
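The two definitions in this abstract can be illustrated with a short sketch. The abstract does not say how "similar" is measured, so the second function counts mismatched positions between the two halves (Hamming distance); that measure, and both function names, are assumptions made for illustration and not the algorithm of the paper.

```python
def is_perfect_tandem_repeat(s: str) -> bool:
    """True if s is nonempty and splits into two identical halves."""
    half, odd = divmod(len(s), 2)
    return len(s) > 0 and odd == 0 and s[:half] == s[half:]


def half_mismatches(s: str) -> int:
    """Count positions where the two halves of an even-length string differ.

    Assumption: similarity is measured by Hamming distance; the source
    abstract does not specify the measure.
    """
    if len(s) == 0 or len(s) % 2:
        raise ValueError("expected a nonempty even-length string")
    half = len(s) // 2
    return sum(a != b for a, b in zip(s[:half], s[half:]))


print(is_perfect_tandem_repeat("abcabc"))    # True
print(is_perfect_tandem_repeat("abcdaacd"))  # False
print(half_mismatches("abcdaacd"))           # 1 ("abcd" vs "aacd")
```

With this measure, the abstract's example abcdaacd is one mismatch away from a perfect repeat.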

CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats

by Ibtissem Grissa, Gilles Vergnaud, Christine Pourcel - Nucleic Acids Res , 2007
Cited by 87 (5 self)

Citation Context

... of background was necessary, and generally some CRISPR clusters were missed or neglected, especially the shortest one (less than three motifs). This is the case, for example, of Tandem Repeat Finder (17) when considering a motif (DR + spacer) as a degenerate repeat (10,18), or Locating Uniform poly-Nucleotide Areas (LUNA), a program for finding degenerate repeats in microbial genomes on a desktop com...

Genome annotation assessment in Drosophila melanogaster

by Martin G. Reese, George Hartzell, Nomi L. Harris, Uwe Ohler, Josep F. Abril, Suzanna E. Lewis - Genome Res 10 , 2000
Abstract - Cited by 60 (6 self)
Computational methods for automated genome annotation are critical to our community’s ability to make full use of the large volume of genomic sequence being generated and released. To explore the accuracy of these automated feature prediction tools in the genomes of higher organisms, we evaluated their performance on a large, well-characterized sequence contig from the Adh region of Drosophila melanogaster. This experiment, known as the Genome Annotation Assessment Project (GASP), was launched in May 1999. Twelve groups, applying state-of-the-art tools, contributed predictions for features including gene structure, protein homologies, promoter sites, and repeat elements. We evaluated these predictions using two standards, one based on previously unreleased high-quality full-length cDNA sequences and a second based on the set of annotations generated as part of an in-depth study of the region by a group of Drosophila experts. Although these standard sets only approximate the unknown distribution of features in this region, we believe that when taken in context the results of an evaluation based on them are meaningful. The results were presented as a tutorial at the conference on Intelligent Systems in Molecular Biology (ISMB-99) in August 1999. Over 95% of the coding nucleotides in the region were correctly identified by the majority of the gene finders, and the correct intron/exon structures were predicted for >40% of the genes. Homology-based annotation techniques recognized and associated functions with almost half of the genes in the region; the remainder were only

Citation Context

...equence structure produced predominantly by repetitive elements. Repeats also play a major role in evolution (for review, see Jurka 1998). Two groups, Gary Benson [tandem repeats finder v. 2.02 (TRF; Benson 1999)] and the MAGPIE team using two programs Calypso (D. Field, unpubl.) and REPuter (Kurtz and Schleiermacher 1999) submitted repetitive sequence annotations. TRF (Benson 1999) locates approximate tande...

Improved hit criteria for DNA local alignment

by Laurent Noé, Gregory Kucherov , 2004
Abstract - Cited by 55 (12 self)
The hit criterion is a key component of heuristic local alignment algorithms. It specifies a class of patterns assumed to witness a potential similarity, and this choice is decisive for the selectivity and sensitivity of the whole method. In this paper, we propose two ways to improve the hit criterion. First, we define the group criterion combining the advantages of the single-seed and double-seed approaches used in existing algorithms. Second, we introduce transition-constrained seeds that extend spaced seeds by the possibility of distinguishing transition and transversion mismatches. We provide analytical data as well as experimental results, obtained with our YASS software, supporting both improvements.
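To make the idea of a transition-constrained seed concrete, here is a minimal sketch of testing one seed against two aligned words. The symbol alphabet is an assumption chosen for illustration: `#` requires an exact match, `@` accepts a match or a transition (A↔G, C↔T), and `-` accepts anything. The function name `seed_hits` is hypothetical, and this is not the YASS implementation.

```python
# Transition mismatches: purine <-> purine or pyrimidine <-> pyrimidine.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def seed_hits(seed: str, x: str, y: str) -> bool:
    """Check whether aligned words x and y satisfy a seed pattern.

    Assumed seed symbols (illustrative only):
      '#'  positions must match exactly
      '@'  positions must match or differ by a transition
      '-'  positions are unconstrained
    """
    if not (len(seed) == len(x) == len(y)):
        raise ValueError("seed and both words must have equal length")
    for symbol, a, b in zip(seed, x, y):
        if symbol == "#":
            if a != b:
                return False
        elif symbol == "@":
            if a != b and (a, b) not in TRANSITIONS:
                return False
        elif symbol != "-":
            raise ValueError(f"unknown seed symbol: {symbol!r}")
    return True


print(seed_hits("#@-#", "ACGT", "ATAT"))  # True: C/T is a transition
print(seed_hits("#@-#", "ACGT", "AAGT"))  # False: C/A is a transversion
```

A plain spaced seed corresponds to using only `#` and `-`; the `@` symbol is what lets the seed tolerate transitions while still rejecting transversions.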

microRNA target predictions across seven Drosophila species and comparison to mammalian targets

by Dominic Grün, Yi-lu Wang, David Langenberger, Kristin C. Gunsalus, Nikolaus Rajewsky - PLoS Comput. Biol , 2005
Abstract - Cited by 54 (3 self)
microRNAs are small noncoding genes that regulate the protein production of genes by binding to partially complementary sites in the mRNAs of targeted genes. Here, using our algorithm PicTar, we exploit cross-species comparisons to predict, on average, 54 targeted genes per microRNA above noise in Drosophila melanogaster. Analysis of the functional annotation of target genes furthermore suggests specific biological functions for many microRNAs. We also predict combinatorial targets for clustered microRNAs and find that some clustered microRNAs are likely to coordinately regulate target genes. Furthermore, we compare microRNA regulation between insects and vertebrates. We find that the widespread extent of gene regulation by microRNAs is comparable between flies and mammals but that certain microRNAs may function in clade-specific modes of gene regulation. One of these microRNAs (miR-210) is predicted to contribute to the regulation of fly oogenesis. We also list specific regulatory relationships that appear to be conserved between flies and mammals. Our findings provide the most extensive microRNA target predictions in Drosophila to date, suggest specific functional roles for most microRNAs, indicate the existence of coordinate gene regulation executed by clustered microRNAs, and shed light on the evolution of microRNA function across large evolutionary distances. All predictions are freely accessible at our searchable Web site

Citation Context

.... The coverage of genes is thus roughly comparable between both sets. Additionally we masked repeats in the unique alignments using the UCSC repeat masks for set 1 and using the Tandem Repeat Remover [22] following Rajewsky et al. [23] for set 2. The nucleotide space of the various alignment sets is listed in Table 2 and comprises for each set a total of 2.2–4.1 Mb per species for the repeat-masked un...

PILER: identification and classification of genomic repeats

by Robert C. Edgar, Eugene W. Myers - Bioinformatics , 2005
Abstract - Cited by 46 (0 self)
Repeated elements such as satellites and transposons are ubiquitous in eukaryotic genomes. De novo computational identification and classification of such elements is a challenging problem, so repeat annotation of sequenced genomes has historically largely relied on sequence similarity to hand-curated libraries of known repeat families. We present a new approach to de novo repeat annotation that exploits characteristic patterns of local alignments induced by certain classes of repeats. We describe PILER, a package of efficient search algorithms for identifying such patterns. Novel repeats found using PILER are reported for H. sapiens, A. thaliana and D. melanogaster. The software is freely available at

Citation Context

..., which is rich in these classes of repeat. As our current implementation of PILER is not designed to find short elements (< 50 bases), we first masked the sequence using Tandem Repeats Finder (TRF) (Benson, 1999), which locates tandem repeats of motifs from 1 to 500 bases. We then performed two search and masking steps. In each step, a library was constructed (Section 2.8) and masking performed by using blas...
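The masking step described in this excerpt can be sketched in a simplified form. The function below masks only perfect tandem repeats with a naive scan; it is not the TRF algorithm, which detects approximate repeats. The function name, the small default `max_period`, the `min_copies` threshold, and the use of `N` as the mask character are all assumptions for illustration.

```python
def mask_perfect_tandem_repeats(seq: str, max_period: int = 6,
                                min_copies: int = 3,
                                mask_char: str = "N") -> str:
    """Replace runs of perfect tandem repeats with mask_char.

    Naive illustration only: a run is masked when a motif of length
    `period` repeats for at least `min_copies` full copies (a trailing
    partial copy is masked along with the run).
    """
    masked = list(seq)
    n = len(seq)
    for period in range(1, max_period + 1):
        i = 0
        while i + period <= n:
            # Extend the run while each base equals the base one period earlier.
            j = i + period
            while j < n and seq[j] == seq[j - period]:
                j += 1
            run_length = j - i
            if run_length >= period * min_copies:
                masked[i:j] = mask_char * run_length
                i = j
            else:
                i += 1
    return "".join(masked)


print(mask_perfect_tandem_repeats("TTGACACACACGGA"))  # TTGNNNNNNNNGGA
```

In the example, the run ACACACAC (four copies of the motif AC) is masked, while the flanking bases are left untouched for a later search step.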
