Results 1  10
of
51,261
Information Theory and Statistics
, 1968
"... Entropy and relative entropy are proposed as features extracted from symbol sequences. Firstly, a proper Iterated Function System is driven by the sequence, producing a fractaMike representation (CSR) with a low computational cost. Then, two entropic measures are applied to the CSR histogram of th ..."
Abstract

Cited by 1805 (2 self)
 Add to MetaCart
Entropy and relative entropy are proposed as features extracted from symbol sequences. Firstly, a proper Iterated Function System is driven by the sequence, producing a fractaMike representation (CSR) with a low computational cost. Then, two entropic measures are applied to the CSR histogram
Accurate Methods for the Statistics of Surprise and Coincidence
 COMPUTATIONAL LINGUISTICS
, 1993
"... Much work has been done on the statistical analysis of text. In some cases reported in the literature, inappropriate statistical methods have been used, and statistical significance of results have not been addressed. In particular, asymptotic normality assumptions have often been used unjustifiably ..."
Abstract

Cited by 1057 (1 self)
 Add to MetaCart
Much work has been done on the statistical analysis of text. In some cases reported in the literature, inappropriate statistical methods have been used, and statistical significance of results have not been addressed. In particular, asymptotic normality assumptions have often been used
A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics
 in Proc. 8th Int’l Conf. Computer Vision
, 2001
"... This paper presents a database containing ‘ground truth ’ segmentations produced by humans for images of a wide variety of natural scenes. We define an error measure which quantifies the consistency between segmentations of differing granularities and find that different human segmentations of the s ..."
Abstract

Cited by 954 (14 self)
 Add to MetaCart
of the same image are highly consistent. Use of this dataset is demonstrated in two applications: (1) evaluating the performance of segmentation algorithms and (2) measuring probability distributions associated with Gestalt grouping factors as well as statistics of image region properties. 1.
Semantic similarity based on corpus statistics and lexical taxonomy
 Proc of 10th International Conference on Research in Computational Linguistics, ROCLING’97
, 1997
"... This paper presents a new approach for measuring semantic similarity/distance between words and concepts. It combines a lexical taxonomy structure with corpus statistical information so that the semantic distance between nodes in the semantic space constructed by the taxonomy can be better quantifie ..."
Abstract

Cited by 873 (0 self)
 Add to MetaCart
This paper presents a new approach for measuring semantic similarity/distance between words and concepts. It combines a lexical taxonomy structure with corpus statistical information so that the semantic distance between nodes in the semantic space constructed by the taxonomy can be better
Assessing agreement on classification tasks: the kappa statistic
 Computational Linguistics
, 1996
"... Currently, computational linguists and cognitive scientists working in the area of discourse and dialogue argue that their subjective judgments are reliable using several different statistics, none of which are easily interpretable or comparable to each other. Meanwhile, researchers in content analy ..."
Abstract

Cited by 846 (9 self)
 Add to MetaCart
analysis have already experienced the same difficulties and come up with a solution in the kappa statistic. We discuss what is wrong with reliability measures as they are currently used for discourse and dialogue work in computational linguistics and cognitive science, and argue that we would be better off
Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms
, 1998
"... This article reviews five approximate statistical tests for determining whether one learning algorithm outperforms another on a particular learning task. These tests are compared experimentally to determine their probability of incorrectly detecting a difference when no difference exists (type I err ..."
Abstract

Cited by 723 (8 self)
 Add to MetaCart
This article reviews five approximate statistical tests for determining whether one learning algorithm outperforms another on a particular learning task. These tests are compared experimentally to determine their probability of incorrectly detecting a difference when no difference exists (type I
Estimating the number of clusters in a dataset via the Gap statistic
, 2000
"... We propose a method (the \Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. kmeans or hierarchical), comparing the change in within cluster dispersion to that expected under an appropriate reference ..."
Abstract

Cited by 502 (1 self)
 Add to MetaCart
We propose a method (the \Gap statistic") for estimating the number of clusters (groups) in a set of data. The technique uses the output of any clustering algorithm (e.g. kmeans or hierarchical), comparing the change in within cluster dispersion to that expected under an appropriate reference
SelfSimilarity Through HighVariability: Statistical Analysis of Ethernet LAN Traffic at the Source Level
 IEEE/ACM TRANSACTIONS ON NETWORKING
, 1997
"... A number of recent empirical studies of traffic measurements from a variety of working packet networks have convincingly demonstrated that actual network traffic is selfsimilar or longrange dependent in nature (i.e., bursty over a wide range of time scales)  in sharp contrast to commonly made tr ..."
Abstract

Cited by 743 (24 self)
 Add to MetaCart
A number of recent empirical studies of traffic measurements from a variety of working packet networks have convincingly demonstrated that actual network traffic is selfsimilar or longrange dependent in nature (i.e., bursty over a wide range of time scales)  in sharp contrast to commonly made
Bayesian measures of model complexity and fit
 Journal of the Royal Statistical Society, Series B
, 2002
"... [Read before The Royal Statistical Society at a meeting organized by the Research ..."
Abstract

Cited by 466 (4 self)
 Add to MetaCart
[Read before The Royal Statistical Society at a meeting organized by the Research
Measurement, Modeling, and Analysis of a PeertoPeer FileSharing Workload
, 2003
"... Peertopeer (P2P) file sharing accounts for an astonishing volume of current Internet tra#c. This paper probes deeply into modern P2P file sharing systems and the forces that drive them. By doing so, we seek to increase our understanding of P2P file sharing workloads and their implications for futu ..."
Abstract

Cited by 487 (7 self)
 Add to MetaCart
parameters. Our model, which we parameterize with statistics from our trace, lets us confirm various hypotheses about filesharing behavior observed in the trace. Third, we explore the potential impact of localityawareness in Kazaa.
Results 1  10
of
51,261