Download:
by Christina L. Zheng, Virginia R. Sa, Michael Gribskov, T. Murlidharan Nair
http://hc.ims.u-tokyo.ac.jp/JSBi/journal/GIW03/GIW03F008.pdf
Abstract:
Precisely recognizing splice junctions computationally is a challenge in the analysis of newly sequenced genomes, because the distribution of sequence patterns in these regions is not always distinct. Our objective is to understand the sequence signatures at the splice junctions, not simply to create an artificial recognition system. To this end, we combine a neural-network-based calliper randomization approach with an information-theoretic feature selection approach, in order to identify regions that harbor information content and to extract features relevant to splice junction prediction. The calliper randomization analysis revealed regions that are important in the internal representation of the network model, and it captured both correlated and independently important features. The feature selection approach captured features that are independently informative. The two methods can therefore capture features with different properties, and comparing their results helps to infer the kind of information present in a region.
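As a rough illustration of what "independently informative" features can mean, the sketch below scores each sequence position by its mutual information with the class label. The abstract does not state which information-theoretic measure the authors used, so mutual information is an assumption here, and the function names and the toy sequences are hypothetical, made up purely for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(column, labels):
    """Mutual information I(X;Y) in bits between one sequence
    position (X) and the class label (Y), estimated from counts."""
    n = len(labels)
    joint = Counter(zip(column, labels))
    px = Counter(column)
    py = Counter(labels)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_positions(sequences, labels):
    """Score every aligned position independently and return
    (position, score) pairs sorted from most to least informative."""
    columns = list(zip(*sequences))  # one tuple of nucleotides per position
    scores = [mutual_information(col, labels) for col in columns]
    return sorted(enumerate(scores), key=lambda t: t[1], reverse=True)


# Toy, made-up data: 1 = junction, 0 = non-junction (not from the paper)
sequences = ["AGGTAA", "CGGTGA", "TAGTAC", "AGCATC", "CTACGA", "TACAAG"]
labels = [1, 1, 1, 0, 0, 0]

ranking = rank_positions(sequences, labels)
for position, score in ranking:
    print(f"position {position}: {score:.3f} bits")
```

Because every position is scored on its own, a ranking of this kind cannot detect features that are informative only in combination. That matches the abstract's contrast with the calliper randomization approach, which is described as also capturing correlated features.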
Citations
4364 | Elements of Information Theory – Cover, Thomas – 1991
3051 | Neural Networks for Pattern Recognition – Bishop – 1995
431 | Learning representation by back propagating errors – Rumelhart, Hinton, et al. – 1986
255 | Toward optimal feature selection – Koller, Sahami – 1996
92 | Comparison of the predicted and observed secondary structure of T4 phage lysozyme – Matthews – 1975
53 | Prediction of human mRNA donor and acceptor sites from the DNA sequence – Brunak, Engelbrecht, et al. – 1991
39 | Long range correlations in nucleotide sequences – Peng, Buldyrev, et al. – 1992
35 | Neural network prediction of translation initiation sites in eukaryotes: perspectives for EST and genome analysis – Pedersen, Nielsen – 1997
30 | Clustering of high-dimensional microarray data via iterative feature filtering using normalized cuts – Xing, Karp, et al.
20 | A computational analysis of sequence features involved in recognition of short introns – Lim, Burge – 2001
17 | Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information – Hebsgaard, Korning, et al. – 1996
15 | Computational prediction of eukaryotic protein-coding genes – Zhang – 2002
13 | Statistical features of human exons and their flanking regions – Zhang – 1998
5 | A clean data set of EST-confirmed splice sites from Homo sapiens and standards for clean-up procedures – Thanaraj – 1999
4 | Splicing of Messenger RNA Precursors – Sharp – 1987
3 | Statistical analysis and prediction of the exonic structure of human genes – Gelfand – 1992
3 | The role of small nuclear ribonucleoprotein particles in pre-mRNA splicing – Maniatis, Reed – 1987
2 | Application of artificial neural networks for prokaryotic transcription terminator – Nair, Tambe, et al. – 1994
2 | Interfering contexts of regulatory sequence elements – Trifonov – 1995
1 | Using feature selection to find inputs that work better as outputs – Caruana, Sa – 1998
1 | Calliper randomization: an artificial neural network based analysis of E. coli ribosome binding sites – Nair – 1997
1 | Decision trees and neural nets: two complementary representations for learning – Paredis – 1991
1 | A statistical analytical approach to decipher information from biological sequences: application to murine splice-site analysis and prediction – Reddy, Pandit – 1995