| Sridhar Ramaswamy, Rajeev Rastogi, and Kyuseok Shim. Efficient algorithms for mining outliers from large data sets. In Proceedings of the 2000. |
....many different fields and have no easy way of characterizing the multivariate distribution of examples. Other researchers, beginning with the work by Knorr and Ng [16], have taken a nonparametric approach and proposed using an example's distance to its nearest neighbors as a measure of unusualness [2, 19, 17, 10]. Although distance is an effective nonparametric approach to detecting outliers, the drawback is the amount of computation time required. Straightforward algorithms, such as those based on nested loops, typically require O(N^2) distance computations. This quadratic scaling means that it will ....
....problem for many real databases where there are often millions of records. Recently, researchers have presented many different algorithms for efficiently finding distance-based outliers. These approaches vary from spatial indexing trees to partitioning of the feature space with clustering algorithms [19]. The common goal is developing algorithms that scale to large real data sets. In this paper, we show that one can modify a simple algorithm based on nested loops, which would normally have quadratic scaling behavior, to yield near-linear time mining on real, large, and high-dimensional data ....
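To make the quadratic cost mentioned in the excerpt concrete, here is a minimal sketch of a plain nested-loop scorer. It is not the algorithm of any cited paper: the function names `knn_distance_scores` and `top_outliers` are hypothetical, and scoring each example by the distance to its k-th nearest neighbor is an assumed choice among the nearest-neighbor distance measures the excerpt alludes to.

```python
import math


def knn_distance_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbor.

    Assumes 1 <= k <= len(points) - 1. The two nested passes over
    `points` perform N * (N - 1) distance computations, i.e. O(N^2).
    """
    scores = []
    for i, p in enumerate(points):
        # Inner loop: distance from p to every other point.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def top_outliers(points, k, n):
    """Return the indices of the n points with the largest scores."""
    scores = knn_distance_scores(points, k)
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return order[:n]


# Illustrative data (made up): four clustered points and one far away.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
print(top_outliers(points, k=1, n=1))
```

Every point is compared against every other point, which is why this straightforward form becomes impractical at millions of records.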
[Article contains additional citation context not shown here]
S. Ramaswamy, R. Rastogi, and K. Shim. Efficient algorithms for mining outliers from large data sets. In Proceedings of the ACM SIGMOD Conference, pages 427--438, 2000.
....We measure outlyingness by ranking data according to the magnitude of the reconstruction error. This compares to SmartSifter [22] which similarly builds models to identify outliers but scores the individuals depending on the degree to which they perturb the model. Following [22], [4], and [17], when dealing with large databases, we consider it more meaningful to assign each datum an outlyingness score. The continuous score reflects the fuzzy nature of outlyingness and also allows the investigation of outliers to be automatically prioritised for analysis. 2 Related Work We classify ....
S. Ramaswamy, R. Rastogi, and K. Shim. Efficient algorithms for mining outliers from large data sets. In Proceedings of International Conference on Management of Data, ACM-SIGMOD, Dallas, 2000.
No context found.
Sridhar Ramaswamy, Rajeev Rastogi, and Kyuseok Shim. Efficient algorithms for mining outliers from large data sets. In Proceedings of the 2000.
No context found.
Sridhar Ramaswamy, Rajeev Rastogi, and Kyuseok Shim. Efficient algorithms for mining outliers from large data sets. In Proceedings of the 2000.
No context found.
Sridhar Ramaswamy, Rajeev Rastogi, and Kyuseok Shim. Efficient algorithms for mining outliers from large data sets. In ACM SIGMOD Conference, pages 427--438, 2000.
No context found.
S. Ramaswamy, R. Rastogi, and K. Shim. Efficient algorithms for mining outliers from large data sets. In Proc. of SIGMOD'2000, pages 427--438, 2000.