Splice site prediction with quadratic discriminant analysis using diversity measure (0)

by L Zhang, L Luo
Venue: Nucleic Acids Research

Results 1 - 10 of 10

iNuc-PhysChem: A sequence-based predictor for identifying nucleosomes via physicochemical properties

by Wei Chen, Hao Lin, Peng-Mian Feng, Chen Ding, Yong-Chun Zuo, Kuo-Chen Chou - PLoS One 2012
Cited by 12 (4 self)
Nucleosome positioning has important roles in key cellular processes. Although intensive efforts have been made in this area, the rules defining nucleosome positioning are still elusive and debated. In this study, we carried out a systematic comparison among the profiles of twelve DNA physicochemical features between the nucleosomal and linker sequences in the Saccharomyces cerevisiae genome. We found that nucleosomal sequences have some position-specific physicochemical features, which can be used for in-depth study of nucleosomes. Meanwhile, a new predictor, called iNuc-PhysChem, was developed for identification of nucleosomal sequences by incorporating these physicochemical properties into a 1788-D (dimensional) feature vector, which was further reduced to an 884-D vector via the IFS (incremental feature selection) procedure to optimize the feature set. It was observed by a cross-validation test on a benchmark dataset that the overall success rate achieved by iNuc-PhysChem was over 96% in identifying nucleosomal or linker sequences. As a web-server, iNuc-PhysChem is freely accessible to the public at

Citation Context

...some Identification PLOS ONE | www.plosone.org 2 October 2012 | Volume 7 | Issue 10 | e47843 coding region identification [38], protein subcellular location prediction [39,40], splice site prediction [41], membrane protein type and location prediction [42], out membrane protein prediction [43], enzyme family class prediction [44], antimicrobial peptide classification [45], and prediction of protein ce...

The organization of nucleosomes around splice sites

by Wei Chen, Liaofu Luo, Lirong Zhang - Nucleic Acids Res, 2010
Cited by 10 (0 self)
The occupancy of nucleosomes along chromosome is a key factor for gene regulation. However, except promoter regions, genome-wide properties and functions of nucleosome organization remain unclear in mammalian genomes. Using the computational model of Increment of Diversity with Quadratic Discriminant (IDQD) trained from the microarray data, the nucleosome occupancy score (NOScore) was defined and applied to splice junction regions of constitutive, cassette exon, alternative 3′ and 5′ splicing events in the human genome. We found an interesting relation between NOScore and RNA splicing: exon regions have higher NOScores compared with their flanking intron sequences in both constitutive and alternative splicing events, indicating the stronger nucleosome occupation potential of exon regions. In addition, NOScore valleys present at 25 bp upstream of the acceptor site in all splicing events. By defining folding diversity-to-energy ratio to describe RNA structural flexibility, we demonstrated that primary RNA transcripts from nucleosome occupancy regions are relatively rigid and those from nucleosome depleted regions are relatively flexible. The negative correlation between nucleosome occupation/depletion of DNA sequence and structural flexibility/rigidity of its primary transcript around splice junctions may provide clues to the deeper understanding of the unexpected role for nucleosome organization in the regulation of RNA splicing.

Citation Context

...Introduction to IDQD The Increment of Diversity with Quadratic Discriminant (IDQD) method was proposed and successfully applied in the prediction of exon–intron splice sites for several model genomes (22). The method has also been used in the prediction of transcription start sites (23). In this method, the sequence features are converted into the increment of diversity (ID), defined by the relation o...

Perspective Bioinformatics in China: A Personal Perspective

by Liping Wei, Jun Yu
Cited by 3 (0 self)
In this personal perspective, we recall the history of bioinformatics and computational biology in China, review current research and education, and discuss future prospects and challenges. The field of bioinformatics in China has grown significantly in the past decade despite a delayed and patchy start at the end of the 1980s by a few scientists from other disciplines, most noticeably physics and mathematics, where China’s traditional strength has been. In the late 1990s and early 2000s, rapid expansion of the field was fueled by the Internet boom and genomics boom worldwide and in China. Today bioinformatics research in China is characterized by a great variety of biological questions addressed and the close collaborative efforts between computational scientists and biologists, with a full spectrum of focuses ranging from database building and algorithm development to hypothesis generation and biological discoveries. Although challenges remain, the future of bioinformatics in China is promising thanks to advances in both computing infrastructure and experimental biology research, a steady increase of governmental funding, and most importantly a critical mass of bioinformatics scientists consisting of not only converts from other disciplines but also formally trained overseas returnees and a new generation of domestically trained bioinformatics Ph.D.s.

Citation Context

...methods. Increased prediction accuracy of splice sites was achieved by using quadratic discriminant analysis with diversity measure or by introducing a competition mechanism of splice sites selection [22,101]. The impact of very short alternative splicing on protein structures and functions was studied [21]. An interesting work identified 2,695 newly evolved exons in rodents and calculated the new exon or...

Perspective Bioinformatics in China: A Personal Perspective

by Liping Wei, Jun Yu

Citation Context

...methods. Increased prediction accuracy of splice sites was achieved by using quadratic discriminant analysis with diversity measure or by introducing a competition mechanism of splice sites selection [22,101]. The impact of very short alternative splicing on protein structures and functions was studied [21]. An interesting work identified 2,695 newly evolved exons in rodents and calculated the new exon or...

Title of Dissertation: FEATURE GENERATION AND ANALYSIS APPLIED TO SEQUENCE CLASSIFICATION FOR SPLICE-SITE PREDICTION

by Rezarta Islamaj
Sequence classification is an important problem in many real-world applications. Sequence data often contain no explicit "signals," or features, to enable the construction of classification algorithms. Extracting and interpreting the most useful features is challenging, and hand construction of good features is the basis of many classification algorithms. In this thesis, I address this problem by developing a feature-generation algorithm (FGA). FGA is a scalable method for automatic feature generation for sequences; it identifies sequence components and uses domain knowledge, systematically constructs features, explores the space of possible features, and identifies the most useful ones. In the domain of biological sequences, splice-sites are locations in DNA sequences that signal the boundaries between genetic information and intervening non-coding regions. Only when splice-sites are identified with nucleotide precision can the genetic information be translated to produce functional proteins. In this thesis, I address this fundamental process by developing a highly accurate splice-site prediction model that employs our sequence feature-generation framework. The FGA

Citation Context

...ll neighborhood around the splice sites was considered. Zhang et al. [66] proposed a recursive-feature elimination approach using SVM. Splice-site prediction has been the focus of other works, such as [2,17,61], that reported promising results when compared with GeneSplicer, but it is difficult for a biologist to interpret the features employed in those models. Especially, it is challenging to relate them t...

Part I

by Wei Chen, Liaofu Luo, Lirong Zhang
Splice junction sequences, deduction and validation of IDQD method

Citation Context

...ons. The Increment of Diversity with Quadratic Discriminant (IDQD) method was proposed and successfully applied in the prediction of exon-intron splice sites for several model genomes including human (1). The detailed deduction was given as follows: 1. Increment of Diversity (ID) Suppose the characters of a sample, a sequence or a group of sequences, are described by a set of numbers and the i-th cha...
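
The excerpt above breaks off before the definition of ID is stated. As a rough illustration only, the Python sketch below uses a commonly cited form of the diversity measure, D(X) = N log N − Σ n_i log n_i, with ID(X, Y) = D(X + Y) − D(X) − D(Y). That form, the function names, and the use of single-base counts as features are assumptions made here and are not transcribed from the cited paper; the quadratic discriminant stage of IDQD is not shown.

```python
import math
from collections import Counter


def base_counts(seq, alphabet="ACGT"):
    """Count each letter of `alphabet` in `seq` (hypothetical feature choice)."""
    counts = Counter(seq.upper())
    return [counts.get(base, 0) for base in alphabet]


def diversity(counts):
    """Assumed diversity measure: N*log(N) - sum(n_i*log(n_i)), with 0*log(0) = 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """Assumed ID(X, Y) = D(X + Y) - D(X) - D(Y) for two equal-length count vectors."""
    if len(x) != len(y):
        raise ValueError("count vectors must have the same length")
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


if __name__ == "__main__":
    # Toy sequences, not real splice-site data.
    reference = base_counts("GTAAGTGTGAGT")
    candidate = base_counts("GTAAGTATCCGT")
    print(increment_of_diversity(reference, candidate))
```

Under this assumed definition, two samples with identical composition give an ID of zero, and the value grows as the compositions diverge, so a small ID indicates that a candidate resembles the reference set.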

FAST FEATURE SUBSET SELECTION IN BIOLOGICAL SEQUENCE ANALYSIS (International Journal of Pattern Recognition and Artificial Intelligence, © World Scientific Publishing Company)

by Rainer Pudimat, Rolf Backofen, Ernst G. Schukat-Talamazzini
Motivation: Biological research produces a wealth of measured data. Neither it is easy for biologists to postulate hypotheses about the behaviour or structure of the observed entity because the relevant properties measured are not seen in the ocean of measurements. Nor it is easy to design machine learning algorithms to classify or cluster the data items for the same reason. Algorithms for automatically selecting a highly predictive subset of the measured features can help to overcome these difficulties. Results: We present an efficient feature selection strategy which can be applied to arbitrary feature selection problems. The core technique is a new method for estimating the quality of subsets from previously calculated qualities for smaller subsets by minimising the mean standard error of estimated values with an approach common to support vector machines. This method can be integrated in many feature subset search algorithms. We have applied it with sequential search algorithms and have been able to reduce the number of quality calculations for finding accurate feature subsets by about 70%. We show these improvements by applying our approach to the problem of finding highly predictive feature subsets for transcription factor binding sites.

The Analysis and Identification of Protein-coding Sequences for Yeast Using a Free Energy Model

by Chuanhua Xing (under the direction of Dr. Donald L. Bitzer and Dr. Winser E. Alexander)
Biological systems are information rich systems. This means that it is reasonable to propose signal processing techniques to detect, extract, and decode the information provided by biological systems. Free energy is used to measure the interactions of molecules in my research. If a biological process consists of molecular interactions along a time or space (position) continuum, a variable free energy pattern could be produced in which the variation is the physical manifestation of the encoded information, or signal. Signal processing techniques can possibly be used to extract information from these signals. In my dissertation, I used signal processing approaches to analyze and identify DNA sequences that encode protein coding information and splice sites in pre-mRNA molecules. In the first part of my dissertation, I used free energy to measure the interaction of the 3′ tail of 18S rRNA and mRNA for detecting the period-3 signal in coding regions. The extraction of the period-3, free energy signal using signal processing techniques was used to analyze and identify protein-coding sequences. Two species were tested, including Saccharomyces cerevisiae (S. cerevisiae) and Schizosaccharomyces pombe (S. pombe). The experiments produced

Citation Context

...dependency graph model and its derivatives for splice sites prediction, and made an attempt to fully capture the intrinsic interdependency between base positions in a splice site. Zhang et al. (2003) [171] generalized the diversity increment method and combined it with the quadratic discriminant analysis [170] (called IDQD, increment of diversity combined with quadratic discriminant analysis) to identi...
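
The dissertation abstract in this entry refers to extracting a period-3 signal with signal processing techniques but does not say which transform is used. One common way to quantify such periodicity is the power of the discrete Fourier transform at a frequency of one cycle per three positions. The Python sketch below assumes that choice; the function name and the toy input are hypothetical stand-ins for a per-position free energy profile and are not taken from the dissertation.

```python
import cmath


def period3_power(signal):
    """Power of the mean-removed signal at 1/3 cycles per position (assumed measure)."""
    n = len(signal)
    if n == 0:
        return 0.0
    mean = sum(signal) / n
    # Single DFT coefficient at frequency 1/3, evaluated directly.
    coeff = sum(
        (value - mean) * cmath.exp(-2j * cmath.pi * index / 3)
        for index, value in enumerate(signal)
    )
    return abs(coeff) ** 2 / n


if __name__ == "__main__":
    periodic = [1.0, 0.0, 0.0] * 10   # toy profile repeating every 3 positions
    flat = [0.5] * 30                 # toy profile with no periodicity
    print(period3_power(periodic), period3_power(flat))
```

In practice such a measure is usually computed over a sliding window so that regions with strong period-3 power can be located along the sequence; the window length would be a further assumption.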

Splice site prediction using stochastic regular grammars

by A. Y. Kashiwabara, D. C. G. Vieira, A. Machado-Lima, São Paulo, 2006
Splice site prediction using stochastic regular grammars

Citation Context

...ese et al., 1997), SpliceView (Hubbard et al., 1999), GeneID (Guigo et al., 1992), FGENEH (Salamov and Solovyev, 2000), Grail (Uberbacher and Mural, 1991), Genscan (Burge and Karlin, 1997), and MZEF (Zhang and Luo, 2003). All these programs utilize intrinsic methods, that is, they are programs that try to recognize statistical patterns in the signal sequences: promoters, start and stop codons, splice sites, etc. Amo...

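Analysis and prediction of exon, intron, intergenic region and splice sites for A. thaliana and C. elegans genomes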

by unknown authors, 2009
Analysis and prediction of exon, intron, intergenic region and splice sites for A. thaliana and C. elegans genomes

Citation Context

...osis protein [31] and secretory protein prediction [32]. For the purpose of improving prediction capability, ID combined with other predictive model was applied in exon/introns splice sites prediction [33], human PolII promoter prediction [34] and protein predictions [35,36,37,38,39,40,41,42]. For reader’s conveniences, the theory of diversity is introduced as follows. Definition 1. For a state space X...
