Results 1 - 10
of
23,670
Power-law distributions in empirical data
- ISSN 00361445. doi: 10.1137/ 070710111. URL http://dx.doi.org/10.1137/070710111
, 2009
"... Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the t ..."
Abstract
-
Cited by 607 (7 self)
- Add to MetaCart
demonstrate these methods by applying them to twentyfour real-world data sets from a range of different disciplines. Each of the data sets has been conjectured previously to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data while in others the power law
Combining labeled and unlabeled data with co-training
, 1998
"... We consider the problem of using a large unlabeled sample to boost performance of a learning algorithm when only a small set of labeled examples is available. In particular, we consider a setting in which the description of each example can be partitioned into two distinct views, motivated by the ta ..."
Abstract
-
Cited by 1633 (28 self)
- Add to MetaCart
data, but our goal is to use both views together to allow inexpensive unlabeled data to augment amuch smaller set of labeled examples. Speci cally, the presence of two distinct views of each example suggests strategies in which two learning algorithms are trained separately on each view, and then each
Clustering by passing messages between data points
- Science
, 2007
"... Clustering data by identifying a subset of representative examples is important for processing sensory signals and detecting patterns in data. Such “exemplars ” can be found by randomly choosing an initial subset of data points and then iteratively refining it, but this works well only if that initi ..."
Abstract
-
Cited by 696 (8 self)
- Add to MetaCart
if that initial choice is close to a good solution. We devised a method called “affinity propagation,” which takes as input measures of similarity between pairs of data points. Real-valued messages are exchanged between data points until a high-quality set of exemplars and corresponding clusters gradually emerges
The Entity-Relationship Model: Toward a Unified View of Data
- ACM Transactions on Database Systems
, 1976
"... A data model, called the entity-relationship model, is proposed. This model incorporates some of the important semantic information about the real world. A special diagrammatic technique is introduced as a tool for database design. An example of database design and description using the model and th ..."
Abstract
-
Cited by 1829 (6 self)
- Add to MetaCart
A data model, called the entity-relationship model, is proposed. This model incorporates some of the important semantic information about the real world. A special diagrammatic technique is introduced as a tool for database design. An example of database design and description using the model
A comparison of bayesian methods for haplotype reconstruction from population genotype data.
- Am J Hum Genet
, 2003
"... In this report, we compare and contrast three previously published Bayesian methods for inferring haplotypes from genotype data in a population sample. We review the methods, emphasizing the differences between them in terms of both the models ("priors") they use and the computational str ..."
Abstract
-
Cited by 557 (7 self)
- Add to MetaCart
strategies they employ. We introduce a new algorithm that combines the modeling strategy of one method with the computational strategies of another. In comparisons using real and simulated data, this new algorithm outperforms all three existing methods. The new algorithm is included in the software package
Bagging predictors
, 1996
"... Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making ..."
Abstract
-
Cited by 3650 (1 self)
- Add to MetaCart
by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability
ANALYSIS OF WIRELESS SENSOR NETWORKS FOR HABITAT MONITORING
, 2004
"... We provide an in-depth study of applying wireless sensor networks (WSNs) to real-world habitat monitoring. A set of system design requirements were developed that cover the hardware design of the nodes, the sensor network software, protective enclosures, and system architecture to meet the require ..."
Abstract
-
Cited by 1490 (19 self)
- Add to MetaCart
We provide an in-depth study of applying wireless sensor networks (WSNs) to real-world habitat monitoring. A set of system design requirements were developed that cover the hardware design of the nodes, the sensor network software, protective enclosures, and system architecture to meet
OPTICS: Ordering Points To Identify the Clustering Structure
, 1999
"... Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of ..."
Abstract
-
Cited by 527 (51 self)
- Add to MetaCart
of the well-known clustering algorithms require input parameters which are hard to determine but have a significant influence on the clustering result. Furthermore, for many real-data sets there does not even exist a global parameter setting for which the result of the clustering algorithm describes
A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection
- INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE
, 1995
"... We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), te ..."
Abstract
-
Cited by 1283 (11 self)
- Add to MetaCart
We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection
Reconstruction and Representation of 3D Objects with Radial Basis Functions
- Computer Graphics (SIGGRAPH ’01 Conf. Proc.), pages 67–76. ACM SIGGRAPH
, 2001
"... We use polyharmonic Radial Basis Functions (RBFs) to reconstruct smooth, manifold surfaces from point-cloud data and to repair incomplete meshes. An object's surface is defined implicitly as the zero set of an RBF fitted to the given surface data. Fast methods for fitting and evaluating RBFs al ..."
Abstract
-
Cited by 505 (1 self)
- Add to MetaCart
We use polyharmonic Radial Basis Functions (RBFs) to reconstruct smooth, manifold surfaces from point-cloud data and to repair incomplete meshes. An object's surface is defined implicitly as the zero set of an RBF fitted to the given surface data. Fast methods for fitting and evaluating RBFs
Results 1 - 10
of
23,670