Results 1 - 10 of 2,211,745
Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach
 In SIGMOD Conference
, 2001
"... A data-integration system provides access to a multitude of data sources through a single mediated schema. A key bottleneck in building such systems has been the laborious manual construction of semantic mappings between the source schemas and the mediated schema. We describe LSD, a system that empl ..."
Cited by 424 (50 self)
to additional learners that may exploit new kinds of information. We describe a set of experiments on several real-world domains, and show that LSD proposes semantic mappings with a high degree of accuracy.
Domain names - Implementation and Specification
 RFC883, USC/Information Sciences Institute
, 1983
"... This RFC describes the details of the domain system and protocol, and assumes that the reader is familiar with the concepts discussed in a companion RFC, "Domain Names - Concepts and Facilities" [RFC1034]. The domain system is a mixture of functions and data types which are an official pr ..."
Cited by 724 (9 self)
Combining Neural Network Regression
"... this paper is to describe and evaluate an approach for combining regression estimates based on principal components regression. The method, called PCR*, is then evaluated on several real-world domains to demonstrate its robustness versus a collection of existing techniques ..."
K.B.: Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning. In:
 IJCAI.
, 1993
"... Abstract Since most real-world applications of classification learning involve continuous-valued attributes, properly addressing the discretization process is an important problem. This paper addresses the use of the entropy minimization heuristic for discretizing the range of a continuous-valued a ..."
Cited by 831 (7 self)
formally derive a criterion based on the minimum description length principle for deciding the partitioning of intervals. We demonstrate via empirical evaluation on several real-world data sets that better decision trees are obtained using the new multi-interval algorithm.
Optimal Brain Damage
, 1990
"... We have used information-theoretic ideas to derive a class of practical and nearly optimal schemes for adapting the size of a neural network. By removing unimportant weights from a network, several improvements can be expected: better generalization, fewer training examples required, and improved sp ..."
Cited by 509 (5 self)
speed of learning and/or classification. The basic idea is to use second-derivative information to make a tradeoff between network complexity and training set error. Experiments confirm the usefulness of the methods on a real-world application.
Markov Logic Networks
 MACHINE LEARNING
, 2006
"... We propose a simple approach to combining first-order logic and probabilistic graphical models in a single representation. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each formula (or clause). Together with a set of constants representing objects in the ..."
Cited by 816 (39 self)
learned from relational databases by iteratively optimizing a pseudo-likelihood measure. Optionally, additional clauses are learned using inductive logic programming techniques. Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach.
Multitask Learning
, 1997
"... Abstract. Multitask Learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared representation; what is learned for ..."
Cited by 677 (6 self)
, and sketch an algorithm for multitask learning in decision trees. Because multitask learning works, can be applied to many different kinds of domains, and can be used with different learning algorithms, we conjecture there will be many opportunities for its use on real-world problems.
Finding community structure in networks using the eigenvectors of matrices
, 2006
"... We consider the problem of detecting communities or modules in networks, groups of vertices with a higher-than-average density of edges connecting them. Previous work indicates that a robust approach to this problem is the maximization of the benefit function known as “modularity” over possible div ..."
Cited by 501 (0 self)
The algorithms and measures proposed are illustrated with applications to a variety of real-world complex networks.
Knowledge acquisition via incremental conceptual clustering
 Machine Learning
, 1987
"... hill climbing Abstract. Conceptual clustering is an important way of summarizing and explaining data. However, the recent formulation of this paradigm has allowed little exploration of conceptual clustering as a means of improving performance. Furthermore, previous work in conceptual clustering has ..."
Cited by 765 (9 self)
not explicitly dealt with constraints imposed by real-world environments. This article presents COBWEB, a conceptual clustering system that organizes data so as to maximize inference ability. Additionally, COBWEB is incremental and computationally economical, and thus can be flexibly applied in a variety
Very simple classification rules perform well on most commonly used datasets
 Machine Learning
, 1993
"... The classification rules induced by machine learning systems are judged by two criteria: their classification accuracy on an independent test set (henceforth "accuracy"), and their complexity. The relationship between these two criteria is, of course, of keen interest to the machin ..."
Cited by 547 (5 self)
to the machine learning community. There are in the literature some indications that very simple rules may achieve surprisingly high accuracy on many datasets. For example, Rendell occasionally remarks that many real-world datasets have "few peaks (often just one)" and so are