Results 1 - 10 of 126,929
Unsupervised named-entity extraction from the web: An experimental study
- Artificial Intelligence
, 2005
Cited by 372 (39 self)
The KNOWITALL system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an unsupervised, domain-independent, and scalable manner. The paper presents an overview of KNOWITALL's novel architecture and design principles, emphasizing its distinctive ability to extract information without any hand-labeled training examples. In its first major run, KNOWITALL extracted over 50,000 class instances, but suggested a challenge: How can we improve KNOWITALL's recall and extraction rate without sacrificing precision? This paper presents three distinct ways to address this challenge and evaluates their performance. Pattern Learning learns domain-specific extraction rules, which enable additional extractions. Subclass Extraction automatically identifies sub-classes in order to boost recall (e.g., "chemist" and "biologist" are identified as sub-classes of "scientist"). List Extraction locates lists of class instances, learns a "wrapper" for each list, and extracts elements of each list. Since each method bootstraps from KNOWITALL's domain-independent methods, the methods also obviate hand-labeled training examples. The paper reports on experiments, focused on building lists of named entities, that measure the relative efficacy of each method and demonstrate their synergy. In concert, our methods gave KNOWITALL a 4-fold to 8-fold increase in recall at precision of 0.90, and discovered over 10,000 cities missing from the Tipster Gazetteer.
Mediation in experimental and nonexperimental studies: new procedures and recommendations
- PSYCHOLOGICAL METHODS
, 2002
"... Mediation is said to occur when a causal effect of some variable X on an outcome Y is explained by some intervening variable M. The authors recommend that with small to moderate samples, bootstrap methods (B. Efron & R. Tibshirani, 1993) be used to assess mediation. Bootstrap tests are powerful ..."
Cited by 696 (4 self)
... These models are useful for theory development and testing as well as for the identification of possible points of intervention in applied work. Mediation is equally of interest to experimental psychologists as it is to those who study naturally occurring processes through nonexperimental studies. For example ...
Propensity Score Matching Methods For Non-Experimental Causal Studies
, 2002
"... This paper considers causal inference and sample selection bias in non-experimental settings in which: (i) few units in the non-experimental comparison group are comparable to the treatment units; and (ii) selecting a subset of comparison units similar to the treatment units is difficult because uni ..."
Cited by 714 (3 self)
Preference Parameters and Behavioral Heterogeneity: An Experimental Approach in the Health and Retirement Study
- Quarterly Journal of Economics
, 1997
The Coordination of Arm Movements: An Experimentally Confirmed Mathematical Model
- Journal of Neuroscience
, 1985
"... This paper presents studies of the coordination of voluntary human arm movements. A mathematical model is formulated which is shown to predict both the qualitative features and the quantitative details observed experimentally in planar, multijoint arm movements. Coordination is modeled mathematic ..."
Cited by 688 (18 self)
An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2001
"... After [10, 15, 12, 2, 4] minimum cut/maximum flow algorithms on graphs emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time compl ..."
Cited by 1315 (53 self)
complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper
An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization
- Machine Learning
, 2000
"... Bagging and boosting are methods that generate a diverse ensemble of classifiers by manipulating the training data given to a “base” learning algorithm. Breiman has pointed out that they rely for their effectiveness on the instability of the base learning algorithm. An alternative approac ..."
Cited by 610 (6 self)
approach to generating an ensemble is to randomize the internal decisions made by the base algorithm. This general approach has been studied previously by Ali and Pazzani and by Dietterich and Kong. This paper compares the effectiveness of randomization, bagging, and boosting for improving the performance
A comparison of document clustering techniques
- In KDD Workshop on Text Mining
, 2000
"... This paper presents the results of an experimental study of some common document clustering techniques: agglomerative hierarchical clustering and K-means. (We used both a “standard” K-means algorithm and a “bisecting” K-means algorithm.) Our results indicate that the bisecting K-means technique is ..."
Cited by 613 (27 self)
Free Riding on Gnutella
, 2000
"... this paper, Gnutella is no exception to this finding, and an experimental study of its user patterns shows indeed that free riding is the norm rather than the exception. If distributed systems such as Gnutella rely on voluntary cooperation, rampant free riding may eventually render them useless, as ..."
Cited by 614 (2 self)