Results 1  10
of
33,449
Active Learning with Statistical Models
, 1995
"... For manytypes of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992# Cohn, 1994]. We then showhow the same principles may be used to select data for two alternative, statist ..."
Abstract

Cited by 679 (10 self)
 Add to MetaCart
For manytypes of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992# Cohn, 1994]. We then showhow the same principles may be used to select data for two alternative
Improved Statistical Alignment Models
 In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics
, 2000
"... In this paper, we present and compare various singleword based alignment models for statistical machine translation. We discuss the five IBM alignment models, the HiddenMarkov alignment model, smoothing techniques and various modifications. ..."
Abstract

Cited by 607 (12 self)
 Add to MetaCart
In this paper, we present and compare various singleword based alignment models for statistical machine translation. We discuss the five IBM alignment models, the HiddenMarkov alignment model, smoothing techniques and various modifications.
Data Mining: Concepts and Techniques
, 2000
"... Our capabilities of both generating and collecting data have been increasing rapidly in the last several decades. Contributing factors include the widespread use of bar codes for most commercial products, the computerization of many business, scientific and government transactions and managements, a ..."
Abstract

Cited by 3142 (23 self)
 Add to MetaCart
of data and information. This explosive growth in stored data has generated an urgent need for new techniques and automated tools that can intelligently assist us in transforming the vast amounts of data into useful information and knowledge. This book explores the concepts and techniques of data mining
Statistical techniques
, 2014
"... • R book, page 43: The sample mean of the numeric data set, is • R book, page 44: The sample median, , of is the middle value of the sorted values. Let the sorted data be denoted by. Then • R book, page 49: The sample variance and sample standard deviation. For a numeric data set, the sample varian ..."
Abstract
 Add to MetaCart
• R book, page 43: The sample mean of the numeric data set, is • R book, page 44: The sample median, , of is the middle value of the sorted values. Let the sorted data be denoted by. Then • R book, page 49: The sample variance and sample standard deviation. For a numeric data set, the sample variance is, and sample standard deviation, is the square root of sample variance. • R book, page 50: The th quantile is at position in the sorted data. If this is not an integer, a weighted average is used. For example, as the median is the 0.5 quantile, if is even, the position is a fraction, and hence an average is used as shown above in the definition of median. The th quantile splits the dataset so that 100 % is smaller and 100 % is larger. • R book, page 50: Percentiles do the same thing as quantiles, except that on a scale of 0 to 100 is used instead of 0 to 1. Page 171, Devore book: Order the sample observations from smallest to largest. Then the th smallest observation in the list is taken to be the th sample percentile.
Statistical pattern recognition: A review
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2000
"... The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques ..."
Abstract

Cited by 1035 (30 self)
 Add to MetaCart
techniques and methods imported from statistical learning theory have bean receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection
Blind Signal Separation: Statistical Principles
, 2003
"... Blind signal separation (BSS) and independent component analysis (ICA) are emerging techniques of array processing and data analysis, aiming at recovering unobserved signals or `sources' from observed mixtures (typically, the output of an array of sensors), exploiting only the assumption of mut ..."
Abstract

Cited by 529 (4 self)
 Add to MetaCart
Blind signal separation (BSS) and independent component analysis (ICA) are emerging techniques of array processing and data analysis, aiming at recovering unobserved signals or `sources' from observed mixtures (typically, the output of an array of sensors), exploiting only the assumption
Making the most of statistical analyses: Improving interpretation and presentation
 American Journal of Political Science
, 2000
"... Social scientists rarely take full advantage of the information available in their statistical results. As a consequence, they miss opportunities to present quantities that are of greatest substantive interest for their research and express the appropriate degree of certainty about these quantities. ..."
Abstract

Cited by 600 (26 self)
 Add to MetaCart
. In this article, we offer an approach, built on the technique of statistical simulation, to extract the currently overlooked information from any statistical method and to interpret and present it in a readerfriendly manner. Using this technique requires some expertise,
Assessing agreement on classification tasks: the kappa statistic
 Computational Linguistics
, 1996
"... Currently, computational linguists and cognitive scientists working in the area of discourse and dialogue argue that their subjective judgments are reliable using several different statistics, none of which are easily interpretable or comparable to each other. Meanwhile, researchers in content analy ..."
Abstract

Cited by 846 (9 self)
 Add to MetaCart
Currently, computational linguists and cognitive scientists working in the area of discourse and dialogue argue that their subjective judgments are reliable using several different statistics, none of which are easily interpretable or comparable to each other. Meanwhile, researchers in content
Cumulated Gainbased Evaluation of IR Techniques
 ACM Transactions on Information Systems
, 2002
"... Modem large retrieval environments tend to overwhelm their users by their large output. Since all documents are not of equal relevance to their users, highly relevant documents should be identified and ranked first for presentation to the users. In order to develop IR techniques to this direction, i ..."
Abstract

Cited by 694 (3 self)
 Add to MetaCart
methods for their ability to retrieve highly relevant documents and allow testing of statistical significance of effectiveness differences. The graphs based on the measures also provide insight into the performance IR techniques and allow interpretation, e.g., from the user point of ...
Estimating the number of clusters in a dataset via the Gap statistic
, 2000
"... We propose a method (the \Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. kmeans or hierarchical), comparing the change in within cluster dispersion to that expected under an appropriate reference ..."
Abstract

Cited by 502 (1 self)
 Add to MetaCart
We propose a method (the \Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. kmeans or hierarchical), comparing the change in within cluster dispersion to that expected under an appropriate reference
Results 1  10
of
33,449