BibTeX
@MISC{She03association-rule-basedprediction,
author = {Rong She},
title = {ASSOCIATION-RULE-BASED PREDICTION OF OUTER MEMBRANE PROTEINS},
year = {2003}
}
Abstract
A class of medically important disease-causing bacteria, collectively known as Gram-negative bacteria, has been shown to have a distinctive cell structure: in addition to the "inner" membrane that is present in the cells of most other organisms, these bacteria have an extra "outer" membrane. Proteins resident in this outer membrane (outer membrane proteins) are of primary research interest because they are exposed on the surface of the bacterial cell and are therefore priority targets for drug development. Determining the biological patterns that discriminate outer membrane proteins from non-outer membrane proteins could also provide insights into the biology of this important class of proteins. To date, it remains difficult to predict outer membrane proteins with high precision. Existing protein localization prediction algorithms either do not predict outer membrane proteins at all, or concentrate on overall accuracy or recall when identifying them. However, because the study of a potential drug or vaccine takes a great amount of time and effort in the laboratory, it is more appropriate to give priority to high precision in outer membrane protein prediction. In this thesis, we address the problem of protein localization classification, with performance measured mainly by the precision of outer membrane protein prediction. We apply association-rule-based classification and propose several optimization techniques to speed up the rule-mining process. In addition, we introduce a framework for building classifiers with multiple levels, which we call the refined classifier, to further improve classification performance over the single-level classifier. Our experimental results show that our algorithms are efficient and produce high precision while maintaining the corresponding recall at a good level. The idea of refined classification also improves the performance of the final classifier. Furthermore, our classification rules turn out to be helpful to biologists in improving their understanding of the functions and structures of outer membrane proteins.
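The abstract's preference for precision over recall can be illustrated with a small sketch. This is not code from the thesis: the function name `precision_recall`, the class labels "outer" and "other", and the toy label lists are all assumptions made up for illustration. It shows a cautious predictor that misses half of the true outer membrane proteins, yet never flags a protein wrongly, which is the trade-off the abstract argues suits costly laboratory follow-up.

```python
def precision_recall(actual, predicted, positive="outer"):
    """Return (precision, recall) for the `positive` class.

    precision = TP / (TP + FP): of the proteins predicted positive, how many are.
    recall    = TP / (TP + FN): of the truly positive proteins, how many were found.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for eight proteins (illustrative only, not thesis data).
actual = ["outer", "outer", "outer", "outer", "other", "other", "other", "other"]
predicted = ["outer", "outer", "other", "other", "other", "other", "other", "other"]

precision, recall = precision_recall(actual, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case every predicted outer membrane protein is a true one, so no laboratory effort would be spent on a false lead, at the cost of missing two of the four true positives.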
Keyphrases
membrane protein; association-rule-based prediction; outer membrane protein; high precision; outer membrane protein prediction; inner membrane; overall accuracy; non-outer membrane protein; classification performance; potential drug; classification rule; gram-negative bacteria; biological pattern; great amount; primary research interest; final classifier; important class; multiple level; outer membrane; good level; bacterial cell; rule-mining process; prioritized target; protein localization classification; extra outer membrane; propose several important optimization technique; distinct cell structure; important disease-causing bacteria; protein localization prediction; refined classification; single-level classifier; experimental result