Horton P, Nakai K. Better prediction of protein cellular localization sites with the k nearest neighbors classifier. Proc Int Conf Intell Syst Mol Biol 1997;5:147-152.
....of the sequences of 1101 cytoplasmic proteins, etc. The same proteins from different species are removed from the data files to reduce the redundancy of proteins. The second step is the searching signal peptide and the checking physical property of peptide sequence in question (Fig. 1). PSORT II [4] and GCG are used to check the signal peptide, and TopPred [3] is used to check the existence of the transmembrane domain. The final result is given by the linear combination of the scores calculated by these programs, and the parameters for weighting were optimized to give the best prediction ....
Horton, P. and Nakai, K., Better Prediction of Protein Cellular Localization Sites with the k Nearest Neighbors Classifier, Intelligent Systems for Molecular Biology, 5:147--152, 1997.
.... are based on an underlying assumption that patterns can be found only within divergent families [81]. The few convergently related patterns for functional motifs [29] such as nuclear localization signals are usually not sufficiently specific and cannot be readily discovered by alignment [55]. Ideally, one wishes to carry out pattern discovery in an unsupervised manner. When one subselects a set of proteins from a given database and uses any of the available tools to discover the patterns present in this set, one has made the following two implicit assumptions: first that the members ....
P. Horton and K. Nakai. Better prediction of protein cellular localization sites with the k nearest neighbors classifier. International Conference on Intelligent Systems for Molecular Biology, 5:147--152, 1997.
....subjective, and highly variable. While each method can yield important information, they do not provide unambiguous information on location that can be entered into databases. There have been pioneering efforts to predict subcellular location from protein sequence (Eisenhaber and Bork 1998; Horton and Nakai 1997; Nakai and Horton 1999; Nakai and Kanehisa 1992). These efforts have been modestly successful, correctly classifying approximately 60% of proteins whose locations are currently known. A major limitation of the usefulness of these systems is that only broad categories of subcellular locations were ....
Horton, P., and Nakai, K. 1997. Better Prediction of Protein Cellular Localization Sites with the k Nearest Neighbors Classifier. Intelligent Systems for Molecular Biology, 5:147-152.
....map against a protein dataset that contains 336 entries of proteins, with an average length of 401 amino acids, belonging to E. coli bacteria. This dataset has been used to learn a k nearest neighbors classifier in research on classifying proteins based on their cellular localization sites [7]. The dataset is available from the UC Irvine Machine Learning Data Repository [5] under the name ecoli. [Figure 4. Parallel similarity calculation for E. coli proteins; x-axis: Number of Processors; legend: ecoli Alpha, ecoli RWC PC II] For our ....
Paul Horton and Kenta Nakai. Better prediction of protein cellular localization sites with the k nearest neighbors classifier. In Proceeding of the Fifth International Conference on Intelligent Systems for Molecular Biology, pages 147--152, Menlo Park, 1997. AAAI Press.
Horton, P., and Nakai, K.: Better Prediction of Protein Cellular Localization Sites with the k Nearest Neighbors Classifier. Proceedings of Intelligent Systems in Molecular Biology. (1997) 368-383