Results 1  10
of
136,717
The Nature of Statistical Learning Theory
, 1999
"... Statistical learning theory was introduced in the late 1960’s. Until the 1990’s it was a purely theoretical analysis of the problem of function estimation from a given collection of data. In the middle of the 1990’s new types of learning algorithms (called support vector machines) based on the deve ..."
Abstract

Cited by 13236 (32 self)
 Add to MetaCart
Statistical learning theory was introduced in the late 1960’s. Until the 1990’s it was a purely theoretical analysis of the problem of function estimation from a given collection of data. In the middle of the 1990’s new types of learning algorithms (called support vector machines) based
Learning Bayesian networks: The combination of knowledge and statistical data
 Machine Learning
, 1995
"... We describe scoring metrics for learning Bayesian networks from a combination of user knowledge and statistical data. We identify two important properties of metrics, which we call event equivalence and parameter modularity. These properties have been mostly ignored, but when combined, greatly simpl ..."
Abstract

Cited by 1158 (35 self)
 Add to MetaCart
We describe scoring metrics for learning Bayesian networks from a combination of user knowledge and statistical data. We identify two important properties of metrics, which we call event equivalence and parameter modularity. These properties have been mostly ignored, but when combined, greatly
Combining labeled and unlabeled data with cotraining
, 1998
"... We consider the problem of using a large unlabeled sample to boost performance of a learning algorithm when only a small set of labeled examples is available. In particular, we consider a setting in which the description of each example can be partitioned into two distinct views, motivated by the ta ..."
Abstract

Cited by 1633 (28 self)
 Add to MetaCart
data, but our goal is to use both views together to allow inexpensive unlabeled data to augment amuch smaller set of labeled examples. Speci cally, the presence of two distinct views of each example suggests strategies in which two learning algorithms are trained separately on each view, and then each
InductiveDataType Systems
, 2002
"... In a previous work ("Abstract Data Type Systems", TCS 173(2), 1997), the leI two authors presented a combined lmbined made of a (strongl normal3zG9 alrmal rewrite system and a typed #calA#Ik enriched by patternmatching definitions folnitio a certain format,calat the "General Schem ..."
Abstract

Cited by 821 (23 self)
 Add to MetaCart
more general cler of inductive types, cals, "strictly positive", and to ease the strong normalgAg9Ik proof of theresulGGg system. Thisresul provides a computation model for the combination of anal"DAfGI specification language based on abstract data types and of astrongl typed functional
Data Integration: A Theoretical Perspective
 Symposium on Principles of Database Systems
, 2002
"... Data integration is the problem of combining data residing at different sources, and providing the user with a unified view of these data. The problem of designing data integration systems is important in current real world applications, and is characterized by a number of issues that are interestin ..."
Abstract

Cited by 965 (45 self)
 Add to MetaCart
Data integration is the problem of combining data residing at different sources, and providing the user with a unified view of these data. The problem of designing data integration systems is important in current real world applications, and is characterized by a number of issues
Cluster Ensembles  A Knowledge Reuse Framework for Combining Multiple Partitions
 Journal of Machine Learning Research
, 2002
"... This paper introduces the problem of combining multiple partitionings of a set of objects into a single consolidated clustering without accessing the features or algorithms that determined these partitionings. We first identify several application scenarios for the resultant 'knowledge reuse&ap ..."
Abstract

Cited by 603 (20 self)
 Add to MetaCart
qualitatively different application scenarios: (i) where the original clusters were formed based on nonidentical sets of features, (ii) where the original clustering algorithms worked on nonidentical sets of objects, and (iii) where a common dataset is used and the main purpose of combining multiple
CURE: An Efficient Clustering Algorithm for Large Data sets
 Published in the Proceedings of the ACM SIGMOD Conference
, 1998
"... Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical shapes and similar sizes, or are very fragile in the presence of outliers. We propose a new clustering ..."
Abstract

Cited by 722 (5 self)
 Add to MetaCart
Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical shapes and similar sizes, or are very fragile in the presence of outliers. We propose a new
The EntityRelationship Model: Toward a Unified View of Data
 ACM Transactions on Database Systems
, 1976
"... A data model, called the entityrelationship model, is proposed. This model incorporates some of the important semantic information about the real world. A special diagrammatic technique is introduced as a tool for database design. An example of database design and description using the model and th ..."
Abstract

Cited by 1829 (6 self)
 Add to MetaCart
and the diagrammatic technique is given. Some implications for data integrity, information retrieval, and data manipulation are discussed. The entityrelationship model can be used as a basis for unification of different views of data: t,he network model, the relational model, and the entity set model. Semantic
A comparison of bayesian methods for haplotype reconstruction from population genotype data.
 Am J Hum Genet
, 2003
"... In this report, we compare and contrast three previously published Bayesian methods for inferring haplotypes from genotype data in a population sample. We review the methods, emphasizing the differences between them in terms of both the models ("priors") they use and the computational str ..."
Abstract

Cited by 557 (7 self)
 Add to MetaCart
strategies they employ. We introduce a new algorithm that combines the modeling strategy of one method with the computational strategies of another. In comparisons using real and simulated data, this new algorithm outperforms all three existing methods. The new algorithm is included in the software package
Mixtures of Probabilistic Principal Component Analysers
, 1998
"... Principal component analysis (PCA) is one of the most popular techniques for processing, compressing and visualising data, although its effectiveness is limited by its global linearity. While nonlinear variants of PCA have been proposed, an alternative paradigm is to capture data complexity by a com ..."
Abstract

Cited by 532 (6 self)
 Add to MetaCart
maximumlikelihood framework, based on a specific form of Gaussian latent variable model. This leads to a welldefined mixture model for probabilistic principal component analysers, whose parameters can be determined using an EM algorithm. We discuss the advantages of this model in the context
Results 1  10
of
136,717