Results 1-10 of 19
Probabilistic Characterization of Nearest Neighbor Classifier
Cited by 5 (4 self)
The k-Nearest Neighbor classification algorithm (kNN) is one of the simplest yet most effective classification algorithms in use. It finds major applications in text categorization, outlier detection, handwritten character recognition, fraud detection and other related areas. Though sound theoretical results exist regarding convergence of the Generalization Error (GE) of this algorithm to the Bayes error, these results are asymptotic in nature. The understanding of the behavior of the kNN algorithm in real-world scenarios is limited. In this paper, assuming categorical attributes, we provide a principled way of studying the non-asymptotic behavior of the kNN algorithm. In particular, we derive exact closed-form expressions for the moments of the GE for this algorithm. The expressions are functions of the sample, and hence can be computed given any joint probability distribution defined over the input-output space. These expressions can be used as a tool that aids in unveiling the statistical behavior of the algorithm in settings of interest, viz. an acceptable value of k for a given sample size and distribution. Moreover, Monte Carlo approximations of such closed-form expressions have been shown in [5,4] to be a superior alternative in terms of speed and accuracy when compared with computing the moments directly using Monte Carlo. This work employs the semi-analytical methodology that was proposed recently to better understand the non-asymptotic behavior of learning algorithms.
Adaptive human activity recognition and fall detection using wearable sensors
MSc Thesis, Jozef Stefan International Postgraduate School, 2011
Cited by 3 (3 self)
ADAPTIVE ACTIVITY RECOGNITION AND FALL DETECTION WITH BODY-WORN SENSORS. Master's thesis. Supervisor: Prof. Dr. Matjaž Gams. Ljubljana, Slovenia, August 2011.
FINDING SIMPLICES CONTAINING THE ORIGIN IN TWO AND THREE DIMENSIONS
2010
Cited by 2 (0 self)
We show that finding the simplices containing a fixed given point among those defined on a set of n points can be done in O(n + k) time for the two-dimensional case, and in O(n^2 + k) time for the three-dimensional case, where k is the number of these simplices. As a byproduct, we give an alternative (to the algorithm in [4]) O(n log r) algorithm that finds the red-blue boundary for n bichromatic points on the line, where r is the size of this boundary. Another byproduct is an O(n^2 + t) algorithm that finds the intersections of line segments having two red endpoints with those having two blue endpoints defined on a set of n bichromatic points in the plane, where t is the number of these intersections.
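To make the two-dimensional problem concrete, the following is a minimal brute-force sketch that enumerates every triangle on a point set and keeps those that strictly contain the origin. It runs in O(n^3) time and is only a reference baseline, not the O(n + k) algorithm the abstract describes. The function names (`cross`, `contains`, `triangles_containing`) and the sample points are illustrative assumptions, and the points are assumed to be in general position.

```python
from itertools import combinations


def cross(o, a, b):
    """Signed area (times 2) of the triangle o, a, b; positive if counter-clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def contains(triangle, p):
    """True if p lies strictly inside the triangle (boundary points are excluded)."""
    a, b, c = triangle
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    all_positive = d1 > 0 and d2 > 0 and d3 > 0
    all_negative = d1 < 0 and d2 < 0 and d3 < 0
    return all_positive or all_negative


def triangles_containing(points, p=(0.0, 0.0)):
    """Brute force: test each of the C(n, 3) triangles against p."""
    return [t for t in combinations(points, 3) if contains(t, p)]


# Hypothetical sample input, chosen so that no point is collinear with the origin
# and another point.
sample_points = [(1, 0), (-1, 1), (-1, -1), (2, 3)]
found = triangles_containing(sample_points)
```

A baseline like this is mainly useful for checking the output of a faster, output-sensitive implementation on small inputs.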
NETWORK INTRUSION DETECTION SYSTEM BASED ON MACHINE LEARNING ALGORITHMS
"... mvvnssrikanth ..."
Dynamic Committees for Handling Concept Drift in Databases
Concept drift refers to a problem in data mining that is caused by a change in the data distribution. This leads to a reduction in the accuracy of the current model that is used to examine the underlying data distribution of the concept to be discovered. A number of techniques have been introduced to address this issue in a supervised learning (or classification) setting, where the target concept (or class) to be learned is known. One of these techniques is “ensemble learning”, which refers to using multiple trained classifiers, combined through some voting scheme, in order to get better predictions. In a traditional ensemble, the underlying base classifiers are all of the same type. Recent research extends ensemble learning to committees, where a committee consists of diverse classifiers; this is the main difference between regular ensemble classifiers and committee learning algorithms. Committees are able to use diverse learning methods simultaneously and dynamically take advantage of the most accurate classifiers as the data change. In addition, some committees are able to replace their members when they perform poorly.
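The idea of a committee that favors whichever members are currently most accurate can be sketched as accuracy-weighted voting over a sliding window. This is a minimal illustration under stated assumptions, not the method of the paper: the `Committee` class, the window size, and the three toy members are all hypothetical, and member replacement is left out.

```python
from collections import defaultdict, deque


class Committee:
    """Accuracy-weighted voting over diverse members (illustrative sketch only)."""

    def __init__(self, members, window=5):
        # members: dict mapping a name to a callable x -> predicted label
        self.members = members
        # Keep only the most recent `window` hit/miss outcomes per member,
        # so weights follow the data as it drifts.
        self.history = {name: deque(maxlen=window) for name in members}

    def accuracy(self, name):
        outcomes = self.history[name]
        # Members with no history yet get full weight.
        return sum(outcomes) / len(outcomes) if outcomes else 1.0

    def predict(self, x):
        scores = defaultdict(float)
        for name, classifier in self.members.items():
            scores[classifier(x)] += self.accuracy(name)
        return max(scores, key=scores.get)

    def update(self, x, true_label):
        # Record whether each member was right on the newly labeled example.
        for name, classifier in self.members.items():
            self.history[name].append(1 if classifier(x) == true_label else 0)


# Hypothetical members: two threshold rules and a constant predictor.
committee = Committee({
    "low_threshold": lambda x: int(x > 0.3),
    "high_threshold": lambda x: int(x > 0.7),
    "always_zero": lambda x: 0,
})
before = committee.predict(0.5)   # unweighted majority at the start
for _ in range(5):
    committee.update(0.5, 1)      # the stream now labels x = 0.5 as class 1
after = committee.predict(0.5)    # the accurate member dominates the vote
```

A sliding window is one simple way to let the weights track recent data; a scheme that also replaces poorly performing members would need an additional rule for when to retire and retrain them.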
Computing gender difference using Fisher-Rao metric from facial surface normals
The aim in this paper is to explore whether the Fisher-Rao metric can be used to characterise the shape changes due to gender difference. We work using a 2.5D representation based on facial surface normals (or facial needle-maps) for gender classification. The needle-map is a shape representation which can be acquired from 2D intensity images using shape-from-shading (SFS). Using the von Mises-Fisher distribution, we compute the elements of the Fisher information matrix, and use this to compute geodesic distance between fields of surface normals to construct a shape-space. We embed the fields of facial surface normals into a low-dimensional pattern space using a number of alternative methods including multidimensional scaling, heat kernel embedding and commute time embedding. We present results on clustering the embedded faces using the Max Planck and EAR database.
Breast Cancer Diagnosis by using k-Nearest Neighbor with Different Distances and Classification Rules
Cancer diagnosis is one of the most studied problems in the medical domain. Several researchers have focused on improving performance and obtaining satisfactory results. Breast cancer is one of the leading cancer killers in the world, and its diagnosis is a major problem in cancer diagnosis research. In artificial intelligence, machine learning is a discipline which allows the machine to evolve through a process. Machine learning is widely used in bioinformatics and particularly in breast cancer diagnosis. One of the most popular methods is k-nearest neighbors (KNN), which is a supervised learning method. Using KNN in medical diagnosis is very interesting. The quality of the results depends largely on the distance and on the value of the parameter “k”, which represents the number of nearest neighbors. In this paper, we study and evaluate the performance of different distances that can be used in the KNN algorithm. We also analyze these distances using different values of the parameter “k” and several classification rules (the rule used to decide how to classify a sample). Our work is performed on the WBCD database (Wisconsin Breast Cancer Database) obtained from the University of Wisconsin Hospital. Keywords:
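The dependence of KNN on the distance function and on “k” can be illustrated with a small sketch in which both are parameters. This is a hedged example under stated assumptions: the training points are made-up two-dimensional toy data, not WBCD records, and the majority-vote rule shown is only one of the classification rules such a study might compare.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k, distance):
    """Majority vote among the k training samples closest to `query`.

    train: list of (feature_tuple, label) pairs.
    distance: any callable taking two feature tuples and returning a number.
    """
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy data for illustration only (hypothetical, not taken from WBCD).
train = [
    ((1.0, 1.0), "benign"), ((1.5, 2.0), "benign"), ((2.0, 1.0), "benign"),
    ((6.0, 6.0), "malignant"), ((7.0, 5.5), "malignant"), ((6.5, 7.0), "malignant"),
]

label_euclidean = knn_predict(train, (1.2, 1.4), k=3, distance=euclidean)
label_manhattan = knn_predict(train, (6.4, 6.1), k=3, distance=manhattan)
```

Passing the distance as a callable keeps the comparison loop simple: the same `knn_predict` can be run over a grid of distances and values of k and scored on held-out samples.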
Rule Discovery for Binary Classification Problem using ACO based Antminer
Data mining can be performed in a number of ways, and classification is one of them. Classification is a data mining technique that assigns items to predefined categories, classes or labels. The aim of classification is to predict the target class for the input data. Biology-inspired algorithms such as Genetic Algorithms (GA) and swarm-based approaches like Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) have been used to solve many data mining problems and are currently the most prominent choice in the area of swarm intelligence. In this paper, binary classification is the problem area considered, and a modified AntMiner is used to solve it. The basic AntMiner algorithm has been modified with a different classification accuracy function.
Abstract — Data Mining (DM) and Knowledge Discovery (KD)
"... requires certain steps or rules that determine the data access ..."