Results 1-10 of 12
Top 10 algorithms in data mining
, 2007
Cited by 113 (2 self)
Abstract This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide a description of the algorithm, discuss the impact of the algorithm, and review current and further research on the algorithm. These 10 algorithms cover classification,
yaImpute: an R package for kNN imputation
 Environmental Systems Research Institute [ESRI
, 2008
Cited by 32 (1 self)
This article introduces yaImpute, an R package for nearest neighbor search and imputation. Although nearest neighbor imputation is used in a host of disciplines, the methods implemented in the yaImpute package are tailored to imputation-based forest attribute estimation and mapping. The impetus for writing yaImpute is a growing interest in nearest neighbor imputation methods for spatially explicit forest inventory, and a need within this research community for software that facilitates comparison among different nearest neighbor search algorithms and subsequent imputation techniques. yaImpute provides directives for defining the search space, subsequent distance calculation, and imputation rules for a given number of nearest neighbors. Further, the package offers a suite of diagnostics for comparison among results generated from different imputation analyses and a set of functions for mapping imputation results.
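The general idea of nearest neighbor imputation mentioned in this abstract can be sketched in a few lines. The sketch below is in Python, not R, and is not the yaImpute API: the function name `knn_impute`, the Euclidean distance, and the "mean of the k neighbours" rule are all assumptions chosen for illustration.

```python
import math

def knn_impute(x_ref, y_ref, x_target, k=1):
    """Impute a value for each target as the mean y of its k nearest references.

    x_ref:    predictor vectors of reference observations (y is known)
    y_ref:    observed values for the reference observations
    x_target: predictor vectors of target observations (y is missing)
    Hypothetical helper for illustration only; not part of yaImpute.
    """
    imputed = []
    for target in x_target:
        # Rank reference observations by Euclidean distance to this target.
        ranked = sorted(range(len(x_ref)),
                        key=lambda i: math.dist(target, x_ref[i]))
        neighbours = ranked[:k]
        # Imputation rule (an assumption): average the neighbours' values.
        imputed.append(sum(y_ref[i] for i in neighbours) / len(neighbours))
    return imputed

x_ref = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]]
y_ref = [1.0, 3.0, 100.0]
x_target = [[0.2, 0.1], [9.0, 9.0]]
print(knn_impute(x_ref, y_ref, x_target, k=1))
print(knn_impute(x_ref, y_ref, x_target, k=2))
```

Changing the distance function or the rule applied to the neighbours gives different imputation variants, which is the kind of comparison the abstract says the package is meant to support.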
Watermarking with Retrieval Systems
Cited by 3 (0 self)
We are interested in the problem of associating messages with multimedia content. This problem can be addressed by a watermarking system which embeds the associated messages into the multimedia content (also called Works). A drawback of watermarking is that the content will be distorted during embedding. On the other hand, if we assume that the database is available, the problem can be addressed by a retrieval system. Although no undesirable distortion is created, searching in large databases is fundamentally difficult (a problem known as the curse of dimensionality).
A: Resampling-based approaches to study variation in morphological modularity. PLoS One 2013
Cited by 3 (0 self)
Modularity has been suggested to be connected to evolvability because a higher degree of independence among parts allows them to evolve as separate units. Recently, the Escoufier RV coefficient has been proposed as a measure of the degree of integration between modules in multivariate morphometric datasets. However, it has been shown, using randomly simulated datasets, that the value of the RV coefficient depends on sample size. Also, so far there is no statistical test for the difference in the RV coefficient between a priori defined groups of observations. Here, we (1) show, using a rarefaction analysis, that the value of the RV coefficient depends on sample size also in real geometric morphometric datasets; (2) propose a permutation procedure to test for the difference in the RV coefficient between a priori defined groups of observations; (3) show, through simulations, that such a permutation procedure has an appropriate Type I error; (4) suggest that a rarefaction procedure could be used to obtain sample-size-corrected values of the RV coefficient; and (5) propose a nearest-neighbor procedure that could be used when studying the variation of modularity in geographic space. The approaches outlined here, readily extendable to non-morphometric datasets, allow study of the variation in the degree of integration between a priori defined modules. A Java application – that will allow performance of the proposed test using software with a graphical user interface – has also been developed and is available at the Morphometrics at Stony Brook Web
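As a rough illustration of the quantities named in this abstract, the sketch below computes the Escoufier RV coefficient, RV = tr(S12 S21) / sqrt(tr(S11 S11) tr(S22 S22)), and runs a generic label-permutation test on the difference in RV between two groups. The function names, the absolute-difference test statistic, and the pooling scheme are assumptions; the authors' exact permutation procedure may differ.

```python
import random

def cross_cov(a, b):
    """Cross-covariance matrix between the columns of a and the columns of b."""
    n = len(a)
    mean_a = [sum(col) / n for col in zip(*a)]
    mean_b = [sum(col) / n for col in zip(*b)]
    return [[sum((ra[i] - mean_a[i]) * (rb[j] - mean_b[j])
                 for ra, rb in zip(a, b)) / (n - 1)
             for j in range(len(mean_b))]
            for i in range(len(mean_a))]

def sq_frobenius(m):
    """Sum of squared entries; equals tr(M M^T)."""
    return sum(v * v for row in m for v in row)

def rv_coefficient(a, b):
    """Escoufier RV coefficient between two blocks of variables (modules)."""
    s12 = cross_cov(a, b)
    s11 = cross_cov(a, a)
    s22 = cross_cov(b, b)
    return sq_frobenius(s12) / (sq_frobenius(s11) * sq_frobenius(s22)) ** 0.5

def rv_difference_test(group1, group2, n_perm=999, seed=0):
    """Permutation p-value for |RV(group1) - RV(group2)|.

    Each group is a list of (module_a_row, module_b_row) pairs.
    Group labels are shuffled across the pooled observations (an assumption).
    """
    def rv_of(group):
        return rv_coefficient([p[0] for p in group], [p[1] for p in group])

    observed = abs(rv_of(group1) - rv_of(group2))
    pooled = list(group1) + list(group2)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(rv_of(pooled[:len(group1)]) - rv_of(pooled[len(group1):]))
        if diff >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)
```

Because RV is computed from sample covariance matrices, rerunning `rv_coefficient` on random subsamples of decreasing size is one simple way to see the sample-size dependence the abstract describes.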
Improving massive experiments with threshold blocking
, 2015
Inferences from randomized experiments can be improved by blocking: assigning treatment in fixed proportions within groups of similar units. However, the use of the method is limited by the difficulty in deriving these groups. Current blocking methods are restricted to special cases or run in exponential time; are not sensitive to clustering of data points; and are often heuristic, providing an unsatisfactory solution in many common instances. We present an algorithm that implements a new, widely applicable class of blocking, called threshold blocking, that solves these problems. Given a minimum required group size and a distance metric, we study the blocking problem of minimizing the maximum distance between any two units within the same group. We prove this is an NP-hard problem and derive an approximation algorithm that yields a blocking where the maximum distance is guaranteed to be at most four times the optimal value. This algorithm runs in O(n log n) time with O(n) space complexity. This makes it the first blocking method with an ensured level of performance that works in massive experiments. While many commonly used algorithms form pairs of units, our algorithm constructs the groups flexibly for any chosen minimum size. This facilitates complex experiments with several treatment arms and clustered data. A simulation study demonstrates the efficiency and efficacy of the algorithm; tens of millions of units can be blocked using a desktop computer in a few minutes.
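The objective stated in this abstract can be made concrete with a small sketch. The code below only evaluates a given blocking: it checks the minimum group size constraint and computes the maximum within-group distance. It does not implement the paper's approximation algorithm, and the function names and the Euclidean metric are assumptions.

```python
import math
from itertools import combinations

def bottleneck_distance(points, blocks):
    """Largest distance between any two units that share a block."""
    return max((math.dist(points[i], points[j])
                for block in blocks
                for i, j in combinations(block, 2)),
               default=0.0)

def is_threshold_blocking(n_units, blocks, min_size):
    """True if every unit is in exactly one block and no block is too small."""
    assigned = sorted(i for block in blocks for i in block)
    covers_all_once = assigned == list(range(n_units))
    return covers_all_once and all(len(block) >= min_size for block in blocks)

# Five units with two covariates each; blocks are lists of unit indices.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 7.0), (6.0, 5.0)]
blocks = [[0, 1], [2, 3, 4]]
print(is_threshold_blocking(len(points), blocks, min_size=2))
print(bottleneck_distance(points, blocks))
```

Under the guarantee quoted in the abstract, the value returned by `bottleneck_distance` for the algorithm's output would be at most four times the smallest value achievable by any valid blocking with the same minimum size.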
Computer-Aided Design ( ) – Contents lists available at ScienceDirect Computer-Aided Design
journal homepage: www.elsevier.com/locate/cad A collaborative platform for integrating and optimising Computational Fluid
chaotic
Combining analog method and ensemble data assimilation: application to the Lorenz-63
DOI 10.1007/s10115-007-0114-2 SURVEY PAPER Top 10 algorithms in data mining
, 2007
Abstract This paper presents the top 10 data mining algorithms identified by the IEEE
PII: S0301-5629(02)00550-1 ● Original Contribution FRACTAL DIMENSION ESTIMATION OF CAROTID ATHEROSCLEROTIC PLAQUES FROM B-MODE ULTRASOUND: A PILOT STUDY
, 2002
Abstract: In this paper, a pilot study regarding carotid atherosclerotic plaque instability using B-mode ultrasound (US) images was carried out. The fractal dimension of plaques obtained from ten symptomatic subjects (i.e., subjects having experienced neurological symptoms) as well as from nine asymptomatic subjects was estimated using a novel method, called the kth nearest neighbour (KNN) method. The results indicated a significant difference in fractal dimension between the two groups, with a significantly lower value for the asymptomatic group. Moreover, the phase of the cardiac cycle (systole/diastole) during which the fractal dimension was estimated had no systematic effect on the calculations. The fractal dimension of the plaques was also estimated using a well-known method, namely the box-counting (BC) method. No significant differences in fractal dimension between the two groups were observed using the BC method. The presented pilot study suggests that the fractal dimension, estimated by the proposed method, could be used as a single determinant for the discrimination of symptomatic and asymptomatic subjects.
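For readers unfamiliar with the box-counting (BC) method used as the baseline here, a minimal sketch follows. It works on a set of 2D points rather than on ultrasound images, and it is not the authors' implementation or their KNN estimator; the function name and the choice of box sizes are assumptions.

```python
import math

def box_counting_dimension(points, box_sizes):
    """Estimate fractal dimension as the slope of log N(s) against log(1/s).

    N(s) is the number of grid boxes of side s that contain at least one point.
    """
    xs, ys = [], []
    for size in box_sizes:
        # Map each point to the grid cell it falls in and count distinct cells.
        occupied = {(math.floor(x / size), math.floor(y / size))
                    for x, y in points}
        xs.append(math.log(1.0 / size))
        ys.append(math.log(len(occupied)))
    # Ordinary least-squares slope of ys on xs.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Sanity checks on sets with known dimension: a line segment and a filled square.
line = [(i / 1000, 0.0) for i in range(1000)]
square = [(i / 100, j / 100) for i in range(100) for j in range(100)]
sizes = [0.1, 0.05, 0.025]
print(box_counting_dimension(line, sizes))
print(box_counting_dimension(square, sizes))
```

The estimate depends on the range of box sizes chosen, which is one reason two estimators applied to the same images can disagree.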