Probabilistic Characterization of Nearest Neighbor Classifier
Cited by 5 (4 self)
The k-Nearest Neighbor classification algorithm (kNN) is one of the simplest yet most effective classification algorithms in use. It finds major applications in text categorization, outlier detection, handwritten character recognition, fraud detection and other related areas. Though sound theoretical results exist regarding convergence of the Generalization Error (GE) of this algorithm to the Bayes error, these results are asymptotic in nature, and the understanding of the behavior of the kNN algorithm in real-world scenarios is limited. In this paper, assuming categorical attributes, we provide a principled way of studying the non-asymptotic behavior of the kNN algorithm. In particular, we derive exact closed-form expressions for the moments of the GE for this algorithm. The expressions are functions of the sample, and hence can be computed given any joint probability distribution defined over the input-output space. These expressions can be used as a tool that aids in unveiling the statistical behavior of the algorithm in settings of interest, such as finding an acceptable value of k for a given sample size and distribution. Moreover, Monte Carlo approximations of such closed-form expressions have been shown in [5,4] to be a superior alternative, in terms of speed and accuracy, to computing the moments directly using Monte Carlo. This work employs the semi-analytical methodology that was proposed recently to better understand the non-asymptotic behavior of learning algorithms.
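To make the quantities in this abstract concrete, the following is a minimal sketch of the direct Monte Carlo baseline it mentions, not of the paper's closed-form expressions. It assumes a small, hypothetical joint distribution over two binary attributes and a binary label, Hamming distance between categorical tuples, and arbitrary tie-breaking; the function names are illustrative only.

```python
import random
from collections import Counter

def hamming(a, b):
    """Number of attribute positions where two categorical tuples differ."""
    return sum(u != v for u, v in zip(a, b))

def knn_predict(train, x, k):
    """Majority label among the k training points closest to x (ties broken arbitrarily)."""
    neighbors = sorted(train, key=lambda pair: hamming(pair[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def generalization_error(train, joint, k):
    """Exact GE of the kNN classifier built on `train`, under the known joint distribution."""
    return sum(p for (x, y), p in joint.items() if knn_predict(train, x, k) != y)

def ge_moments(joint, n, k, trials=500, seed=0):
    """Direct Monte Carlo estimate of the mean and variance of GE over samples of size n."""
    rng = random.Random(seed)
    outcomes = list(joint)
    weights = [joint[o] for o in outcomes]
    errors = []
    for _ in range(trials):
        train = rng.choices(outcomes, weights=weights, k=n)
        errors.append(generalization_error(train, joint, k))
    mean = sum(errors) / trials
    var = sum((e - mean) ** 2 for e in errors) / trials
    return mean, var

# Hypothetical joint distribution: ((attribute_1, attribute_2), label) -> probability.
joint = {
    ((0, 0), 0): 0.30, ((0, 0), 1): 0.05,
    ((0, 1), 0): 0.10, ((0, 1), 1): 0.10,
    ((1, 0), 0): 0.05, ((1, 0), 1): 0.10,
    ((1, 1), 0): 0.05, ((1, 1), 1): 0.25,
}

for k in (1, 3, 5):
    mean, var = ge_moments(joint, n=20, k=k)
    print(f"k={k}: mean GE = {mean:.3f}, variance = {var:.5f}")
```

Comparing the estimated moments across values of k for a fixed sample size is one way to explore the "acceptable value of k" question raised above, under the stated assumptions.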
Adaptive human activity recognition and fall detection using wearable sensors
MSc Thesis, Jozef Stefan International Postgraduate School, 2011
Cited by 3 (3 self)
ADAPTIVE ACTIVITY RECOGNITION AND FALL DETECTION WITH BODY-WORN SENSORS. Master's thesis. Supervisor: Prof. Dr. Matjaž Gams. Ljubljana, Slovenia, August 2011.
FINDING SIMPLICES CONTAINING THE ORIGIN IN TWO AND THREE DIMENSIONS
2010
Cited by 2 (0 self)
We show that finding the simplices containing a fixed given point among those defined on a set of n points can be done in O(n + k) time for the two-dimensional case, and in O(n^2 + k) time for the three-dimensional case, where k is the number of these simplices. As a byproduct, we give an alternative (to the algorithm in [4]) O(n log r) algorithm that finds the red-blue boundary for n bichromatic points on the line, where r is the size of this boundary. Another byproduct is an O(n^2 + t) algorithm that finds the intersections of line segments having two red endpoints with those having two blue endpoints defined on a set of n bichromatic points in the plane, where t is the number of these intersections.
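For orientation, the two-dimensional problem can be stated as a brute-force baseline. The sketch below enumerates all triangles in O(n^3) time and is not the O(n + k) algorithm of the abstract; it assumes the fixed point is the origin, counts only triangles that contain it strictly in their interior, and uses illustrative function names.

```python
from itertools import combinations

def cross(a, b):
    """z-component of the 2D cross product a x b."""
    return a[0] * b[1] - a[1] * b[0]

def contains_origin(a, b, c):
    """True if the origin lies strictly inside triangle abc.

    The origin is inside exactly when the triples (O, a, b), (O, b, c)
    and (O, c, a) all have the same orientation.
    """
    d1, d2, d3 = cross(a, b), cross(b, c), cross(c, a)
    return (d1 > 0 and d2 > 0 and d3 > 0) or (d1 < 0 and d2 < 0 and d3 < 0)

def triangles_containing_origin(points):
    """Brute-force enumeration over all point triples."""
    return [t for t in combinations(points, 3) if contains_origin(*t)]

# Hypothetical input points.
points = [(1, 0), (-1, 1), (-1, -1), (2, 1)]
for triangle in triangles_containing_origin(points):
    print(triangle)
```

An output-sensitive algorithm such as the one described above avoids testing the triples that do not contain the point, which is where the brute-force version spends most of its time.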
NETWORK INTRUSION DETECTION SYSTEM BASED ON MACHINE LEARNING ALGORITHMS
"... mvvnssrikanth ..."
Dynamic Committees for Handling Concept Drift in Databases
In data mining, concept drift refers to a problem caused by a change in the data distribution. Such a change reduces the accuracy of the current model that is used to examine the underlying data distribution of the concept to be discovered. A number of techniques have been introduced to address this issue in a supervised learning (or classification) setting, in which the target concept (or class) to be learned is known. One of these techniques is called "ensemble learning", which refers to using multiple trained classifiers in order to get better predictions by using some voting scheme. In a traditional ensemble, the underlying base classifiers are all of the same type. Recent research extends the idea of ensemble learning to the idea of using committees, where a committee consists of diverse classifiers. This is the main difference between regular ensemble classifiers and committee learning algorithms. Committees are able to use diverse learning methods simultaneously and dynamically take advantage of the most accurate classifiers as the data change. In addition, some committees are able to replace their members when they perform poorly.
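The committee idea described above can be sketched as follows. This is an illustrative toy under stated assumptions, not the method of the paper: the three base learners, the sliding-window size, the accuracy-weighted vote and the replacement threshold are all hypothetical choices, and the data stream is synthetic.

```python
import random
from collections import Counter, deque

class MajorityClass:
    """Predicts the most frequent label seen so far."""
    def __init__(self):
        self.counts = Counter()
    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else 0
    def learn(self, x, y):
        self.counts[y] += 1

class LastLabel:
    """Predicts the label of the most recent example."""
    def __init__(self):
        self.last = 0
    def predict(self, x):
        return self.last
    def learn(self, x, y):
        self.last = y

class NearestMean:
    """Predicts the class whose running mean of x is closest to the query."""
    def __init__(self):
        self.sums, self.counts = {}, {}
    def predict(self, x):
        if not self.counts:
            return 0
        return min(self.counts, key=lambda c: abs(x - self.sums[c] / self.counts[c]))
    def learn(self, x, y):
        self.sums[y] = self.sums.get(y, 0.0) + x
        self.counts[y] = self.counts.get(y, 0) + 1

class DynamicCommittee:
    """Diverse members, accuracy-weighted voting, replacement of poor members."""
    def __init__(self, factories, window=20, min_accuracy=0.4):
        self.factories = factories
        self.members = [make() for make in factories]
        self.history = [deque(maxlen=window) for _ in factories]
        self.min_accuracy = min_accuracy

    def accuracy(self, i):
        h = self.history[i]
        return sum(h) / len(h) if h else 0.5  # neutral weight before any feedback

    def predict(self, x):
        votes = Counter()
        for i, member in enumerate(self.members):
            votes[member.predict(x)] += self.accuracy(i)
        return votes.most_common(1)[0][0]

    def learn(self, x, y):
        for i, member in enumerate(self.members):
            self.history[i].append(member.predict(x) == y)
            member.learn(x, y)
            window_full = len(self.history[i]) == self.history[i].maxlen
            if window_full and self.accuracy(i) < self.min_accuracy:
                self.members[i] = self.factories[i]()  # replace with a fresh member
                self.history[i].clear()

# Synthetic stream whose concept flips halfway through.
rng = random.Random(0)
committee = DynamicCommittee([MajorityClass, LastLabel, NearestMean])
correct = 0
for t in range(400):
    x = rng.random()
    y = int(x > 0.5) if t < 200 else int(x < 0.5)
    correct += committee.predict(x) == y  # test first, then train
    committee.learn(x, y)
print(f"prequential accuracy: {correct / 400:.2f}")
```

Weighting votes by recent accuracy lets the committee lean on whichever member currently fits the data, while the replacement step discards a member whose accumulated state no longer matches the concept.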
Computing gender difference using Fisher-Rao metric from facial surface normals
The aim in this paper is to explore whether the Fisher-Rao metric can be used to characterise the shape changes due to gender difference. We work using a 2.5D representation based on facial surface normals (or facial needle-maps) for gender classification. The needle-map is a shape representation which can be acquired from 2D intensity images using shape-from-shading (SFS). Using the von Mises-Fisher distribution, we compute the elements of the Fisher information matrix, and use this to compute geodesic distance between fields of surface normals to construct a shape-space. We embed the fields of facial surface normals into a low-dimensional pattern space using a number of alternative methods including multidimensional scaling, heat kernel embedding and commute time embedding. We present results on clustering the embedded faces using the Max Planck and EAR databases.
Breast Cancer Diagnosis by using k-Nearest Neighbor with Different Distances and Classification Rules
Cancer diagnosis is one of the most studied problems in the medical domain, and many researchers have worked to improve diagnostic performance and obtain satisfactory results. Breast cancer is one of the leading cancer killers in the world, and its diagnosis is a major problem in cancer diagnosis research. In artificial intelligence, machine learning is a discipline that allows a machine to evolve through a learning process. Machine learning is widely used in bioinformatics, and particularly in breast cancer diagnosis. One of the most popular methods is K-nearest neighbors (K-NN), which is a supervised learning method and is of particular interest for medical diagnosis. The quality of the results depends largely on the distance and on the value of the parameter "k", which represents the number of nearest neighbors. In this paper, we study and evaluate the performance of different distances that can be used in the K-NN algorithm. We also analyze these distances using different values of the parameter "k" and several classification rules (the rule used to decide how to classify a sample). Our work is performed on the WBCD database (Wisconsin Breast Cancer Database) obtained from the University of Wisconsin Hospital.
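The three ingredients varied in this study (the distance, the value of k and the classification rule) can be kept separate in code. The sketch below is only an illustration: the abstract does not list which distances or rules were evaluated, so the Euclidean, Manhattan and Chebyshev distances and the majority and inverse-distance rules are assumed examples, and the six training points are invented toy data, not WBCD records.

```python
import math
from collections import Counter, defaultdict

def euclidean(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def manhattan(a, b):
    return sum(abs(u - v) for u, v in zip(a, b))

def chebyshev(a, b):
    return max(abs(u - v) for u, v in zip(a, b))

def majority_rule(neighbors):
    """neighbors: list of (distance, label). Each neighbor casts one vote."""
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

def inverse_distance_rule(neighbors, eps=1e-9):
    """Each neighbor votes with weight 1 / distance, so closer points count more."""
    scores = defaultdict(float)
    for dist, label in neighbors:
        scores[label] += 1.0 / (dist + eps)
    return max(scores, key=scores.get)

def knn_classify(train, x, k, distance, rule):
    """Classify x from its k nearest training points under the given distance and rule."""
    scored = [(distance(point, x), label) for point, label in train]
    neighbors = sorted(scored, key=lambda pair: pair[0])[:k]
    return rule(neighbors)

# Invented toy data with two numeric features per sample.
train = [
    ((1, 1), "benign"), ((2, 1), "benign"), ((1, 2), "benign"),
    ((6, 6), "malignant"), ((7, 5), "malignant"), ((4, 4), "malignant"),
]
query = (3.5, 3.5)

for distance in (euclidean, manhattan, chebyshev):
    for rule in (majority_rule, inverse_distance_rule):
        for k in (1, 3, 5):
            label = knn_classify(train, query, k, distance, rule)
            print(f"{distance.__name__:9s} {rule.__name__:21s} k={k}: {label}")
```

On this toy query with k = 3 and Euclidean distance, the two rules disagree, because the single nearest neighbor outweighs two farther ones under inverse-distance weighting; that sensitivity is the kind of effect such a comparison is meant to expose.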
Rule Discovery for Binary Classification Problem using ACO based Antminer
Data mining can be performed in a number of ways, and classification is one of them. Classification is a data mining technique that assigns items to predefined categories, classes or labels. The aim of classification is to predict the target class for the input data. Biology-inspired algorithms such as Genetic Algorithms (GA), and swarm-based approaches such as Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), have been used to solve many data mining problems and are currently the most prominent choices in the area of swarm intelligence. In this paper, binary classification is considered as the problem area, and a modified AntMiner is used to solve it. The basic AntMiner algorithm has been modified with a different classification accuracy function.
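The abstract does not say which accuracy function replaces the original one, so the following sketch only illustrates what such a function scores. It assumes rules of the form "IF attribute = value AND ... THEN class", evaluates one rule on invented records, and contrasts the sensitivity-times-specificity quality measure usually associated with the original Ant-Miner with plain accuracy as one hypothetical alternative.

```python
def confusion(conditions, target, records):
    """Count TP, FP, FN, TN for the rule 'IF conditions THEN target'.

    conditions: dict mapping attribute name to required value.
    records: list of (attributes_dict, label) pairs.
    """
    tp = fp = fn = tn = 0
    for attrs, label in records:
        covered = all(attrs.get(name) == value for name, value in conditions.items())
        if covered and label == target:
            tp += 1
        elif covered:
            fp += 1
        elif label == target:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def sensitivity_times_specificity(tp, fp, fn, tn):
    """Rule quality as TP / (TP + FN) multiplied by TN / (TN + FP)."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity * specificity

def accuracy(tp, fp, fn, tn):
    """Hypothetical alternative: fraction of records the rule treats correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0

# Invented records for a binary problem.
records = [
    ({"outlook": "sunny", "windy": "no"}, "yes"),
    ({"outlook": "sunny", "windy": "yes"}, "yes"),
    ({"outlook": "rainy", "windy": "no"}, "no"),
    ({"outlook": "rainy", "windy": "yes"}, "no"),
    ({"outlook": "sunny", "windy": "no"}, "no"),
    ({"outlook": "rainy", "windy": "no"}, "yes"),
]

counts = confusion({"outlook": "sunny"}, "yes", records)
print("TP, FP, FN, TN:", counts)
print("sensitivity x specificity:", round(sensitivity_times_specificity(*counts), 3))
print("accuracy:", round(accuracy(*counts), 3))
```

Because the quality function decides which candidate rules the ants reinforce, swapping it changes which rules are discovered even when the rest of the search is left untouched.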
Abstract — Data Mining (DM) and Knowledge Discovery (KD)
"... requires certain steps or rules that determine the data access ..."