Classifier Ensembles: Select Real-World Applications
, 2008
Cited by 24 (0 self)
Broad classes of statistical classification algorithms have been developed and applied successfully to a wide range of real-world domains. In general, ensuring that the particular classification algorithm matches the properties of the data is crucial in providing results that meet the needs of the particular application domain. One way in which the impact of this algorithm/application match can be alleviated is by using ensembles of classifiers, where a variety of classifiers (either different types of classifiers or different instantiations of the same classifier) are pooled before a final classification decision is made. Intuitively, classifier ensembles allow the different needs of a difficult problem to be handled by classifiers suited to those particular needs. Mathematically, classifier ensembles provide an extra degree of freedom in the classical bias/variance tradeoff, allowing solutions that would be difficult (if not impossible) to reach with only a single classifier. Because of these advantages, classifier ensembles have been applied to many difficult real-world problems. In this paper, we survey select applications of ensemble methods to problems that have historically been most representative of the difficulties in classification. In particular, we survey applications of ensemble methods to remote sensing, person recognition, one-vs.-all recognition, and medicine.
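The pooling step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which combination rule the surveyed systems use, so plain majority voting is assumed here, and the three threshold classifiers are hypothetical stand-ins for real models.

```python
from collections import Counter

def majority_vote(classifiers, x):
    """Pool the label predicted by each classifier and return the most common one."""
    votes = [clf(x) for clf in classifiers]
    # most_common(1) returns the label with the highest vote count
    return Counter(votes).most_common(1)[0][0]

# Hypothetical base classifiers: the same rule with different thresholds,
# i.e. "different instantiations of the same classifier".
def low_threshold(x):
    return "positive" if x > 0.3 else "negative"

def mid_threshold(x):
    return "positive" if x > 0.5 else "negative"

def high_threshold(x):
    return "positive" if x > 0.7 else "negative"

ensemble = [low_threshold, mid_threshold, high_threshold]
label = majority_vote(ensemble, 0.6)
```

With an odd number of two-class voters no tie can occur; other pooling rules (weighted votes, averaging of scores) would replace only the body of `majority_vote`.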
Modeling the Bioconcentration Factors and Bioaccumulation Factors of Polychlorinated Biphenyls with Posetic Quantitative Super Structure/Activity Relationship (QSSAR)
 Mol Div
Cited by 12 (3 self)
Summary: During bioconcentration, chemical pollutants from water are absorbed by aquatic animals via the skin or a respiratory surface, while the entry routes of chemicals during bioaccumulation are both directly from the environment (skin or a respiratory surface) and indirectly from food. The bioconcentration factor (BCF) and the bioaccumulation factor (BAF) for a particular chemical compound are defined as the ratio of the concentration of a chemical inside an organism to the concentration in the surrounding environment. Because the experimental determination of BAF and BCF is time-consuming and expensive, it is useful to develop models that provide reliable activity predictions for a large number of chemical compounds. Polychlorinated biphenyls (PCBs) released from industrial activities are persistent environmental pollutants that produce widespread contamination of water and soil. PCBs can bioaccumulate in the food chain, constituting a potential source of exposure for the general population. To predict the bioconcentration and bioaccumulation factors for PCBs, we make use of the biphenyl substitution-reaction network for the sequential substitution of H atoms by Cl atoms. Each PCB structure then occurs as a node of this reaction network, which is a kind of superstructure that turns out mathematically to be a partially ordered set (poset). Rather than dealing with the molecular structure via ordinary QSAR, we use only this poset, making different quantitative superstructure/activity relationships (QSSAR). From this we developed cluster expansion and splinoid QSSAR for PCB bioconcentration and bioaccumulation factors. The predictive ability of the BAF and BCF models generated for 20 data sets (representing different conditions and fish species) was evaluated with leave-one-out cross-validation, which shows that the splinoid QSSAR models (r between 0.903 and 0.935) are better than models computed with the cluster expansion (r between 0.745 and 0.887).
The splinoid QSSAR models for BAF and BCF yield predictions for the missing PCBs in the investigated data sets.
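The leave-one-out evaluation mentioned in this abstract can be sketched as follows. This is not the splinoid or cluster-expansion model: a one-descriptor least-squares line is swapped in purely to show the validation loop, and the descriptor and activity values are made-up numbers, not data from the paper.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_predictions(xs, ys):
    """Predict each point from a model fitted to all the other points."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions

def pearson_r(a, b):
    """Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical data: a structural descriptor and a log-scale activity.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [3.1, 3.9, 4.6, 5.2, 6.1, 6.6]
r_loo = pearson_r(loo_predictions(descriptor, activity), activity)
```

The correlation `r_loo` plays the role of the cross-validated r quoted in the abstract; any other model can be dropped in by replacing `fit_line` and the prediction line inside the loop.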
Classification of small molecules by two- and three-dimensional decomposition kernels
 BIOINFORMATICS
, 2007
Philosophy of Mathematical Chemistry: A Personal Perspective
 HYLE International Journal for Philosophy of Chemistry
Cited by 2 (1 self)
Abstract: This article discusses the nature of mathematical chemistry, discrete mathematical chemistry in particular. Molecules and macromolecules can be represented by model objects using methods of discrete mathematics, e.g., graphs and matrices. Mathematical formalisms are further applied on the model objects to distill various quantitative characteristics. The end product of such an exercise can be a better understanding of chemistry, the development of quantitative scales for qualitative notions of chemistry, or an illumination of the structural basis of chemical and biological properties. The aforementioned aspects of mathematical chemistry are discussed based on my own practitioner’s perspective.
Condensed Matrix: A Tool to Characterize DNA
Motivation. This paper reports the development of a new method for the mathematical characterization of primary DNA sequences. Method. A condensed characterization of the primary sequence is based on 4×4 matrices whose rows and columns are associated with the four nucleic bases A, G, C and T. Results. The condensed matrices for the primary sequences of DNA serve as a source of invariants that allow quantitative comparisons of DNA from different sources. Conclusion. The sensitivity of the descriptor makes it suitable as a parameter for indexing the toxicity levels of various agents that induce changes in DNA. The study was outlined on normal DNA.
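A small sketch may help make the 4×4 construction concrete. The abstract does not state how the matrix entries are defined, so the code below assumes one plausible reading: entry (i, j) counts how often base i is immediately followed by base j in the sequence. The function name and the example sequence are hypothetical.

```python
BASES = "AGCT"  # row and column order, as listed in the abstract

def condensed_matrix(sequence):
    """Return a 4x4 matrix of adjacent-base counts (assumed definition).

    Expects an uppercase string containing only A, G, C and T.
    """
    index = {base: i for i, base in enumerate(BASES)}
    matrix = [[0] * 4 for _ in range(4)]
    # Walk over every pair of neighbouring bases in the sequence
    for first, second in zip(sequence, sequence[1:]):
        matrix[index[first]][index[second]] += 1
    return matrix

example = condensed_matrix("AAGCT")
```

Invariants such as row sums, the trace, or eigenvalues could then be computed from the matrix and compared between sequences.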
Tailored Similarity Spaces for the Prediction of Physicochemical Properties
Motivation. In the past, molecular similarity spaces have been developed from arbitrary sets of molecular properties or theoretical descriptors and the results of property estimation based on these methods have always been inferior to SAR and QSAR models. Tailored QMSA methods attempt to create similarity spaces specific for a property of interest, rather than being purely arbitrary spaces characterizing the general aspects of all chemicals within the space or intuitively selected structure spaces whose elements are chosen subjectively. To this end, we have created three similarity spaces, two tailored and one non-tailored, for a set of 166 chemicals
Nonorthogonality in Ill-Conditioned Systems
Ridge versions of an ill-conditioned system are alleged to “act more like an orthogonal system” than the system itself. Alternatives, called Surrogates and motivated by the conditioning of linear systems, are shown to yield smaller expected mean squares than OLS, and uniformly smaller residual sums of squares than ridge. Ridge and surrogate solutions are compared on several marks of orthogonality, including conditioning of dispersion parameters, variance inflation factors, isotropy of variances, and sphericity of contours of the estimators. On these, ridge typically exhibits erratic divergence from orthogonality as the ridge scalar evolves, often reverting back to OLS in the limit. In contrast, surrogate solutions converge monotonically in the ridge scalar to those from orthogonal systems. Invariance considerations constrain the computations to models in canonical form. Case studies serve to illustrate the central issues. Key words: Ill-conditioned models, ridge regression: properties, anomalies, surrogate models, hallmarks of orthogonality, asymptotics. 2000 MSC: 62J07, 62J20
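For readers unfamiliar with the ridge scalar, the sketch below shows ordinary ridge regression on a two-predictor system, using the standard estimator (X'X + kI)^(-1) X'y. It does not implement the paper's surrogate estimators, which the abstract does not define. The matrix values are hypothetical, and the helper names are illustrative only.

```python
import math

def ridge_2x2(xtx, xty, k):
    """Solve (X'X + k I) b = X'y for a 2x2 system; k = 0 gives OLS."""
    a, b = xtx[0][0] + k, xtx[0][1]
    c, d = xtx[1][0], xtx[1][1] + k
    det = a * d - b * c
    return [(d * xty[0] - b * xty[1]) / det,
            (a * xty[1] - c * xty[0]) / det]

def condition_number(xtx, k):
    """Ratio of largest to smallest eigenvalue of the symmetric X'X + k I."""
    a, b, d = xtx[0][0] + k, xtx[0][1], xtx[1][1] + k
    centre = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b * b)
    return (centre + radius) / (centre - radius)

# Hypothetical ill-conditioned system: two standardized predictors
# with correlation 0.99, and X'y generated from coefficients [1, 2].
xtx = [[1.0, 0.99], [0.99, 1.0]]
xty = [2.98, 2.99]

ols = ridge_2x2(xtx, xty, 0.0)
shrunk = ridge_2x2(xtx, xty, 0.1)
cond_ols = condition_number(xtx, 0.0)
cond_ridge = condition_number(xtx, 0.1)
```

Here the condition number of X'X + kI falls as k grows, which is the usual sense in which the ridge system is better conditioned; the abstract's point is that other marks of orthogonality need not improve in the same orderly way.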
Similarity methods in analog selection, property estimation and clustering of diverse chemicals
This account summarizes Dr. Subhash Basak’s work in the field of molecular similarity. In particular, it looks at the development and application of quantitative molecular similarity analysis (QMSA) techniques using physicochemical properties, topological indices, and atom pairs as descriptors for developing structure- or property-based similarity spaces and the use of a k-nearest neighbors (kNN) technique to estimate the properties or activities of chemicals within the space. Additionally, the account discusses the novel tailored similarity technique pioneered by Dr. Basak’s research group and discusses the future of molecular similarity analysis.
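The kNN estimation step can be sketched in a few lines. The abstract does not specify the distance measure or how neighbor values are combined, so Euclidean distance in descriptor space and an unweighted mean are assumed; the descriptor vectors and property values are hypothetical.

```python
import math

def knn_estimate(query, library, k=3):
    """Estimate a property as the mean over the k most similar chemicals.

    library is a list of (descriptor_vector, property_value) pairs.
    """
    # Rank library chemicals by Euclidean distance to the query descriptors
    ranked = sorted(library, key=lambda item: math.dist(query, item[0]))
    neighbors = ranked[:k]
    return sum(value for _, value in neighbors) / len(neighbors)

# Hypothetical similarity space with two descriptors per chemical.
library = [
    ((0.0, 0.0), 1.0),
    ((1.0, 0.0), 2.0),
    ((0.0, 1.0), 3.0),
    ((10.0, 10.0), 100.0),
]
estimate = knn_estimate((0.0, 0.0), library, k=3)
```

A tailored similarity space, in the sense of the abstract, would change which descriptors enter the vectors rather than the estimation loop itself.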
INTRODUCTION: THE PURPOSE OF QUANTITATIVE MODELS
, 2003
Model fitting is an important part of all sciences that use quantitative measurements. Experimenters often explore the relationships between measures. Two subclasses of relation