Inferring non-coding RNA families and classes by means of genome-scale structure-based clustering
- PLoS Comp. Biol
Cited by 147 (48 self)
The RFAM database defines families of ncRNAs by means of sequence similarities that are sufficient to establish homology. In some cases, such as microRNAs, box H/ACA snoRNAs, functional commonalities define classes of RNAs that are characterized by structural similarities, and typically consist of multiple RNA families. Recent advances in high-throughput transcriptomics and comparative genomics have produced very large sets of putative non-coding RNAs and regulatory RNA signals. For many of them, evidence for stabilizing selection acting on their secondary structures has been derived, and at least approximate models of their structures have been computed. The overwhelming majority of these hypothetical RNAs cannot be assigned to established families or classes. We present here a structure-based clustering approach that is capable of extracting putative RNA classes from genome-wide surveys for structured RNAs. The LocARNA tool implements a novel variant of the Sankoff algorithm that is sufficiently fast to deal with several thousand candidate sequences. The method is also robust against false positive predictions, i.e., a contamination of the input data with unstructured or non-conserved sequences. We have successfully tested the LocARNA-based clustering approach on the sequences of the RFAM-seed
A benchmark of multiple sequence alignment programs upon structural RNAs
- Nucleic Acids Res
, 2005
Cited by 144 (20 self)
To date, few attempts have been made to benchmark the alignment algorithms upon nucleic acid sequences. Frequently, sophisticated PAM- or BLOSUM-like models are used to align proteins, yet equivalents are not considered for nucleic acids; instead, rather ad hoc models are generally favoured. Here, we systematically test the performance of existing alignment algorithms on structural RNAs. This work was aimed at achieving the following goals: (i) to determine conditions where it is appropriate to apply common sequence alignment methods to the structural RNA alignment problem. This indicates where and when researchers should consider augmenting the alignment process with auxiliary information, such as secondary structure, and (ii) to determine which sequence alignment algorithms perform well under the broadest range of conditions. We find that sequence alignment alone, using the current algorithms, is generally inappropriate below 50–60% sequence identity. Second, we note that the probabilistic method ProAlign and the aging Clustal algorithms generally outperform other sequence-based algorithms, under the broadest range of applications.
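The sequence identity figure quoted in this abstract depends on how identity is counted, and the abstract does not say which convention the benchmark uses. The sketch below shows one common convention as an illustration only: matches divided by the number of alignment columns in which neither row has a gap. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumption: identity is counted over columns where neither row has a gap.
    Other conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if x == y:
            matches += 1
    # Return 0.0 when no gap-free column exists, to avoid division by zero.
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Two made-up aligned RNA fragments, used only to exercise the function.
    print(round(percent_identity("GGCA-UCC", "GGCAAUGC"), 1))
```

Under this convention, a pair of aligned rows scoring under roughly 50–60 would fall in the range where the abstract reports that sequence-only alignment is generally inappropriate.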
Non-coding RNA
- Hum. Mol. Genet., 15(Spec No
, 2006
Cited by 75 (2 self)
The term non-coding RNA (ncRNA) is commonly employed for RNA that does not encode a protein, but this does not mean that such RNAs do not contain information nor have function. Although it has been generally assumed that most genetic information is transacted by proteins, recent evidence suggests that the majority of the genomes of mammals and other complex organisms is in fact transcribed into ncRNAs, many of which are alternatively spliced and/or processed into smaller products. These ncRNAs include microRNAs and snoRNAs (many if not most of which remain to be identified), as well as likely other classes of yet-to-be-discovered small regulatory RNAs, and tens of thousands of longer transcripts (including complex patterns of interlacing and overlapping sense and antisense transcripts), most of whose functions are unknown. These RNAs (including those derived from introns) appear to comprise a hidden layer of internal signals that control various levels of gene expression in physiology and development, including chromatin architecture/epigenetic memory, transcription, RNA splicing, editing, translation and turnover. RNA regulatory networks may determine most of our complex characteristics, play a significant role in disease and constitute an unexplored world of genetic variation both within and between species.
Hairpins in a Haystack: recognizing microRNA precursors in comparative genomics data
- Bioinformatics
, 2006
"... doi:10.1093/bioinformatics/btl257 ..."
VARNA: interactive drawing and editing of the RNA secondary structure
- Bioinformatics 25: 1974–1975
, 2009
Cited by 70 (1 self)
Description: VARNA is a tool for the automated drawing, visualization and annotation of the secondary structure of RNA, designed as companion software for web servers and databases. Features: VARNA implements four drawing algorithms, supports input/output using the classic formats dbn, ct, bpseq and RNAML, and exports the drawing in five picture formats, either pixel-based (JPEG, PNG) or vector-based (SVG, EPS and XFIG). It also allows manual modification and structural annotation of the resulting drawing, using an interactive point-and-click approach, within a web server, or through command-line arguments. Availability: VARNA is free software, released under the terms of the GPLv3.0 license and available at
Experimental approaches to identify non-coding RNAs
- Nucleic Acids Res
, 2006
Cited by 65 (0 self)
Cellular RNAs that do not function as messenger RNAs (mRNAs), transfer RNAs (tRNAs) or ribosomal RNAs (rRNAs) comprise a diverse class of molecules that are commonly referred to as non-protein-coding RNAs (ncRNAs). These molecules have been known for quite a while, but their importance was not fully appreciated until recent genome-wide searches discovered thousands of these molecules and their genes in a variety of model organisms. Some of these screens were based on biocomputational prediction of ncRNA candidates within entire genomes of model organisms. Alternatively, direct biochemical isolation of expressed ncRNAs from cells, tissues or entire organisms has been shown to be a powerful approach to identify ncRNAs both at the level of individual molecules and at a global scale. In this review, we will survey several such wet-lab strategies, i.e. direct sequencing of ncRNAs, shotgun cloning of small-sized ncRNAs (cDNA libraries), microarray analysis and genomic SELEX to identify novel ncRNAs, and discuss the advantages and limits of these approaches.
SnoReport: Computational identification of snoRNAs with unknown targets
, 2007
Cited by 50 (16 self)
Summary: Unlike tRNAs and microRNAs, both classes of snoRNAs, which direct two distinct types of chemical modifications of uracil residues, have proved to be surprisingly difficult to find in genomic sequences. Most computational approaches so far have explicitly used the fact that snoRNAs predominantly target ribosomal RNAs and spliceosomal RNAs. The target is specified by a short stretch of sequence complementarity between the snoRNA and its target. This sequence complementarity to known targets crucially contributes to sensitivity and specificity of snoRNA gene finding algorithms. The discovery of “orphan” snoRNAs, which either have no known target, or which target ordinary protein-coding mRNAs, however, begs the question whether this class of “housekeeping” non-coding RNAs is much more wide-spread and might have a diverse set of regulatory functions. In order to approach this question, we present here a combination
Systematic discovery and characterization of fly microRNAs using 12 Drosophila genomes
- Genome Res.
, 2007
Cited by 48 (10 self)
MicroRNAs (miRNAs) are short regulatory RNAs that inhibit target genes by complementary binding in 3’ untranslated regions (3’ UTRs). They are one of the most abundant classes of regulators, targeting a large fraction of all genes, making their comprehensive study a requirement for understanding regulation and development. Here we use 12 Drosophila genomes to define structural and evolutionary signatures of miRNA hairpins, which we use for their de novo discovery. We predict more than 41 novel miRNAs, which encompass many unique families, and 28 of which we validate experimentally. We also define precise signals for the start position of mature miRNAs, which we use to correct the annotation of previously known miRNAs, often leading to drastic changes in their target spectrum. We show that miRNA discovery power scales with the number and divergence of species compared, suggesting that such approaches can be successful in human as dozens of mammalian genomes become available. Interestingly, for some miRNAs sense and anti-sense hairpins score highly and mature miRNAs from both strands can indeed be found in vivo. Similarly, we find that multiple starts are indeed processed in the absence of precise signals for the miRNA start, which strongly correlate with few target sites for these miRNAs. Lastly, we show that several miRNA star sequences score highly and are likely functional. For mir-10 in particular, both arms show abundant processing, and both
miRNAMap: genomic maps of microRNA genes and their target genes in mammalian genomes
- Nucleic Acids Res 34
, 2006
"... doi:10.1093/nar/gkj135 ..."