Results 1  10
of
2,672,184
A New Statistical Parser Based on Bigram Lexical Dependencies
, 1996
"... This paper describes a new statistical parser which is based on probabilities of dependencies between headwords in the parse tree. Standard bigram probability estimation techniques are extended to calculate probabilities of dependencies between pairs of words. Tests using Wall Street Journal ..."
Abstract

Cited by 491 (4 self)
 Add to MetaCart
This paper describes a new statistical parser which is based on probabilities of dependencies between headwords in the parse tree. Standard bigram probability estimation techniques are extended to calculate probabilities of dependencies between pairs of words. Tests using Wall Street
A New Statistical Parser Based on Bigram Lexical Dependencies
, 1996
"... This paper describes a new statistical parser which is based on probabilities of dependencies between headwords in the parse tree. Standard bigram probability estimation techniques are extended to calculate probabilities of dependencies between pairs of words. Tests using Wall Street Journal data s ..."
Abstract
 Add to MetaCart
This paper describes a new statistical parser which is based on probabilities of dependencies between headwords in the parse tree. Standard bigram probability estimation techniques are extended to calculate probabilities of dependencies between pairs of words. Tests using Wall Street Journal data
Estimation and Inference in Econometrics
, 1993
"... The astonishing increase in computer performance over the past two decades has made it possible for economists to base many statistical inferences on simulated, or bootstrap, distributions rather than on distributions obtained from asymptotic theory. In this paper, I review some of the basic ideas o ..."
Abstract

Cited by 1151 (3 self)
 Add to MetaCart
The astonishing increase in computer performance over the past two decades has made it possible for economists to base many statistical inferences on simulated, or bootstrap, distributions rather than on distributions obtained from asymptotic theory. In this paper, I review some of the basic ideas of bootstrap inference. The paper discusses Monte Carlo tests, several types of bootstrap test, and bootstrap confidence intervals. Although bootstrapping often works well, it does not do so in every case.
Estimating the Support of a HighDimensional Distribution
, 1999
"... Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified between 0 and 1. We propo ..."
Abstract

Cited by 766 (29 self)
 Add to MetaCart
Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified between 0 and 1. We
Estimating Wealth Effects without Expenditure Data— or Tears
 Policy Research Working Paper 1980, The World
, 1998
"... Abstract: We use the National Family Health Survey (NFHS) data collected in Indian states in 1992 and 1993 to estimate the relationship between household wealth and the probability a child (aged 6 to 14) is enrolled in school. A methodological difficulty to overcome is that the NFHS, modeled closely ..."
Abstract

Cited by 832 (16 self)
 Add to MetaCart
Abstract: We use the National Family Health Survey (NFHS) data collected in Indian states in 1992 and 1993 to estimate the relationship between household wealth and the probability a child (aged 6 to 14) is enrolled in school. A methodological difficulty to overcome is that the NFHS, modeled
Estimating Continuous Distributions in Bayesian Classifiers
 In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence
, 1995
"... When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality ..."
Abstract

Cited by 489 (2 self)
 Add to MetaCart
When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon
Cumulated Gainbased Evaluation of IR Techniques
 ACM Transactions on Information Systems
, 2002
"... Modem large retrieval environments tend to overwhelm their users by their large output. Since all documents are not of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation to the users. In order to develop IR techniques to this direction, i ..."
Abstract

Cited by 656 (3 self)
 Add to MetaCart
Modem large retrieval environments tend to overwhelm their users by their large output. Since all documents are not of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation to the users. In order to develop IR techniques to this direction
TCP Vegas: New techniques for congestion detection and avoidance
 In SIGCOMM
, 1994
"... Vegas is a new implementation of TCP that achieves between 40 and 70 % better throughput, with onefifth to onehalf the losses, as compared to the implementation of TCP in the Reno distributionof BSD Unix. This paper motivates and describes the three key techniques employed by Vegas, and presents th ..."
Abstract

Cited by 592 (3 self)
 Add to MetaCart
Vegas is a new implementation of TCP that achieves between 40 and 70 % better throughput, with onefifth to onehalf the losses, as compared to the implementation of TCP in the Reno distributionof BSD Unix. This paper motivates and describes the three key techniques employed by Vegas, and presents
SMOTE: Synthetic Minority Oversampling Technique
 Journal of Artificial Intelligence Research
, 2002
"... An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often realworld data sets are predominately composed of ``normal'' examples with only a small percentag ..."
Abstract

Cited by 614 (28 self)
 Add to MetaCart
An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often realworld data sets are predominately composed of ``normal'' examples with only a small percentage of ``abnormal'' or ``interesting'' examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Undersampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of oversampling the minority (abnormal) class and undersampling the majority (normal) class can achieve better classifier performance (in ROC space) than only undersampling the majority class. This paper also shows that a combination of our method of oversampling the minority class and undersampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of oversampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
Results 1  10
of
2,672,184