Results 1–10 of 19
A. Kannan, "Survey on internal validity measure for cluster validation," International Journal of Computer Science and Engineering Survey (IJCSES), Vol. 1, No. 2, 2010.
Cited by 4 (0 self)
Abstract: Data clustering is a technique of finding similar characteristics among the data set which ...
Clustering avatars behaviours from virtual worlds interactions
In Proceedings of the 4th International Workshop on Web Intelligence & Communities, WI&C '12, 2012.
Cited by 2 (1 self)
Abstract: Virtual World (VW) platforms and applications provide a practical implementation of the Metaverse concept. These applications, as highly immersive and interactive 3D environments, have become very popular in the social networks and games domains. The existence of open platforms like OpenSim or OpenCobalt has played a major role in the popularization of this technology, and they open exciting new research areas. One of these areas is behaviour analysis. In a virtual world, the user (or avatar) can move and interact within an artificial world with a high degree of freedom. The movements and interactions of the avatar can be monitored, and hence this information can be analysed to obtain interesting behavioural patterns. Usually, only the information related to the avatars' conversations (textual chat logs) is directly available for processing. However, these open platforms allow other kinds of information to be captured, such as the exact position of an avatar in the VW, what the avatar is looking at (eye-gazing), or which actions it performs inside these worlds. This paper studies how this information can be extracted, processed, and later used by clustering methods to detect behaviours or group formations in the world. With the correct data preprocessing and modelling, clustering techniques can automatically detect hidden patterns in these data.
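The group-formation detection described in this abstract can be sketched with a plain k-means pass over avatar positions. This is a hypothetical minimal illustration, not the authors' pipeline; the 2-D points stand in for logged avatar coordinates.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on 2-D points (e.g. avatar positions in a virtual world)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # assign each point to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            clusters[i].append(p)
        # recompute each center as the mean of its cluster
        for i, cl in enumerate(clusters):
            if cl:
                centers[i] = (sum(p[0] for p in cl) / len(cl),
                              sum(p[1] for p in cl) / len(cl))
    return centers, clusters

# two synthetic "group formations" of avatars
group_a = [(x / 10, y / 10) for x in range(5) for y in range(5)]
group_b = [(5 + x / 10, 5 + y / 10) for x in range(5) for y in range(5)]
centers, clusters = kmeans(group_a + group_b, k=2, seed=1)
```

In practice the feature vectors would also carry eye-gazing targets and action counts, not just position.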
Mining High Quality Association Rules Using Genetic Algorithms
Cited by 2 (0 self)
Abstract: Association rule mining (ARM) is a structured mechanism for unearthing hidden facts in large data sets and drawing inferences on how a subset of items influences the presence of another subset. ARM is computationally very expensive because the number of rules grows exponentially as the number of items in the database increases. This exponential growth is exacerbated further when data dimensions increase. The association rule mining problem becomes even more complex when different rule quality metrics must be taken into account. In this paper, we propose a genetic algorithm (GA) to generate high quality association rules with five rule quality metrics. We study the performance of the algorithm, and the experimental results show that it produces high quality rules in good computational time.
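The abstract does not name its five metrics, but the standard rule quality measures a GA fitness function typically combines can be sketched as follows. This is an illustrative example (support, confidence, lift), not necessarily the metrics used in the paper.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent.
    These are the kind of quality metrics a GA fitness function can combine."""
    n = len(transactions)
    a = sum(1 for t in transactions if antecedent <= t)          # antecedent count
    c = sum(1 for t in transactions if consequent <= t)          # consequent count
    both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = both / n
    confidence = both / a if a else 0.0
    lift = confidence / (c / n) if c else 0.0
    return support, confidence, lift

tx = [{"bread", "milk"}, {"bread", "butter"},
      {"milk", "butter"}, {"bread", "milk", "butter"}]
s, conf, lift = rule_metrics(tx, {"bread"}, {"milk"})
# s = 0.5, conf = 2/3, lift = 8/9
```

A GA would encode candidate rules as chromosomes and use a weighted sum of such metrics as the fitness to maximize.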
Combining information from distributed evolutionary k-means
In Proceedings of the Brazilian Symposium on Neural Networks, IEEE Computer Society, 2012.
Cited by 1 (1 self)
Abstract: One of the challenges for clustering lies in dealing with huge amounts of data, which creates the need to distribute large data sets across separate repositories. However, most clustering techniques require the data to be centralized. One of them, k-means, has been elected one of the most influential data mining algorithms. Although exact distributed versions of the k-means algorithm have been proposed, the algorithm is still sensitive to the selection of the initial cluster prototypes and requires that the number of clusters be specified in advance. This work tackles the problem of generating an approximated model for distributed clustering, based on k-means, for scenarios where the number of clusters in the distributed data is unknown. We propose a collection of algorithms that generate and select k-means clusterings for each distributed subset of the data and combine them afterwards. The variants of the algorithm are compared from two perspectives: the theoretical one, through asymptotic complexity analyses; and the experimental one, through a comparative evaluation of results obtained from a collection of experiments and statistical tests.
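The "combine them afterwards" step can be illustrated by merging per-site centroids weighted by cluster size. This is a simple agglomerative approximation for illustration only, not the paper's actual combination algorithms.

```python
def combine_local_centroids(local_results, k):
    """Merge per-site (centroid, size) pairs: repeatedly merge the two closest
    centroids (size-weighted mean) until k remain. An approximation sketch,
    not the exact distributed k-means of the paper."""
    items = [(c, n) for site in local_results for (c, n) in site]
    while len(items) > k:
        best = None
        for i in range(len(items)):                 # find the closest pair
            for j in range(i + 1, len(items)):
                d = sum((a - b) ** 2 for a, b in zip(items[i][0], items[j][0]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        (ci, ni), (cj, nj) = items[i], items[j]
        merged = tuple((a * ni + b * nj) / (ni + nj) for a, b in zip(ci, cj))
        items = [it for idx, it in enumerate(items) if idx not in (i, j)]
        items.append((merged, ni + nj))
    return items

# each site reports (centroid, cluster size) pairs from its local k-means run
site1 = [((0.0, 0.0), 10), ((5.0, 5.0), 8)]
site2 = [((0.2, 0.1), 12), ((5.1, 4.9), 9)]
combined = combine_local_centroids([site1, site2], k=2)
```

Size weighting matters here: a centroid backed by many points should pull a merged center further than one backed by few.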
Harmony Search-based Cluster Initialization for Fuzzy C-Means Segmentation of MR Images
Cited by 1 (0 self)
Abstract: We propose a new approach to tackle the well-known fuzzy c-means (FCM) initialization problem. Our approach uses a metaheuristic search method, the Harmony Search (HS) algorithm, to produce near-optimal initial cluster centers for the FCM algorithm. We then demonstrate the effectiveness of our approach on an MRI segmentation problem. In order to dramatically reduce the computation time needed to find near-optimal cluster centers, we use an alternate representation of the search space. Our experiments indicate encouraging results in producing stable clusterings for the given problem, compared to an FCM with randomly initialized cluster centers.
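The Harmony Search mechanics behind this approach can be sketched generically: a memory of candidate solutions, recombination at rate HMCR, and small pitch adjustments at rate PAR. The toy objective below is a stand-in, not the paper's FCM initialization cost or its alternate search-space representation.

```python
import random

def harmony_search(objective, bounds, hms=10, hmcr=0.9, par=0.3,
                   iters=200, seed=0):
    """Minimal Harmony Search (minimization): build new candidates from the
    harmony memory (rate hmcr) with pitch adjustments (rate par), and replace
    the worst member whenever the new candidate improves on it."""
    rng = random.Random(seed)
    lo, hi = bounds
    dim = len(lo)
    memory = [[rng.uniform(lo[d], hi[d]) for d in range(dim)] for _ in range(hms)]
    scores = [objective(h) for h in memory]
    for _ in range(iters):
        new = []
        for d in range(dim):
            if rng.random() < hmcr:
                x = rng.choice(memory)[d]          # memory consideration
                if rng.random() < par:             # pitch adjustment
                    x += rng.uniform(-0.1, 0.1) * (hi[d] - lo[d])
            else:
                x = rng.uniform(lo[d], hi[d])      # random selection
            new.append(min(max(x, lo[d]), hi[d]))
        s = objective(new)
        worst = max(range(hms), key=lambda i: scores[i])
        if s < scores[worst]:
            memory[worst], scores[worst] = new, s
    best = min(range(hms), key=lambda i: scores[i])
    return memory[best], scores[best]

# toy objective: squared distance to (3, -2), standing in for an FCM
# initialization cost over a candidate cluster center
sol, val = harmony_search(lambda v: (v[0] - 3) ** 2 + (v[1] + 2) ** 2,
                          ([-5.0, -5.0], [5.0, 5.0]), seed=42)
```

In the paper's setting, each harmony would encode a full set of candidate FCM cluster centers and the objective would score their clustering quality.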
A Methodology of Swarm Intelligence Application in Clustering Based on Neighborhood Construction, 2011.
Asset Allocation under Hierarchical Clustering
Abstract: This paper proposes a clustering asset allocation scheme which provides better risk-adjusted portfolio performance than traditional asset allocation approaches such as the equal-weight strategy and the Markowitz minimum-variance allocation. The clustering criterion used, which involves maximization of the in-sample Sharpe ratio (SR), is different from traditional clustering criteria reported in the literature. Two evolutionary methods, namely Differential Evolution and the Genetic Algorithm, are employed to search for such an optimal clustering structure given a cluster number. To explore the clustering impact on the SR, the in-sample and out-of-sample SR distributions of the portfolios are studied using bootstrapped data as well as simulated paths from the single-index market model. It was found that the SR distributions of the portfolios under the clustering asset allocation structure have higher mean values and skewness but approximately the same standard deviation and kurtosis as those in the non-clustered case. The Genetic Algorithm is suggested as a more efficient approach than Differential Evolution for solving the clustering problem.
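The in-sample Sharpe ratio used as the clustering criterion is straightforward to compute; a minimal sketch (mean excess return over its standard deviation, with an assumed risk-free rate of zero):

```python
import statistics

def sharpe_ratio(returns, risk_free=0.0):
    """In-sample Sharpe ratio: mean excess return divided by the standard
    deviation of excess returns. Used here (maximized over candidate cluster
    structures) as the clustering criterion for asset allocation."""
    excess = [r - risk_free for r in returns]
    sd = statistics.pstdev(excess)
    return statistics.mean(excess) / sd if sd else 0.0

# hypothetical per-period portfolio returns
portfolio = [0.02, 0.01, -0.005, 0.015, 0.0]
sr = sharpe_ratio(portfolio)
```

An evolutionary search over cluster assignments would evaluate each candidate structure by building the implied portfolio and scoring it with this function.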
Performance Evaluation of Learning by Example Techniques over Different Datasets
Abstract: Clustering is an unsupervised learning activity which coalesces data into segments. Data are grouped by identifying common characteristics, labelled as similarities among the data. The performance of selected clustering algorithms over different chosen data sets is evaluated here. Burst time is the performance parameter chosen for evaluating the various selected clustering-based machine learning algorithms. The experimental results are presented in a table. In our investigation we also suggest a clustering algorithm that performs faster over a selected data set with respect to burst time.
Evolutionary k-means for distributed data sets
Abstract: One of the challenges for clustering lies in dealing with data distributed in separate repositories, because most clustering techniques require the data to be centralized. One of them, k-means, has been elected one of the most influential data mining algorithms for being simple, scalable, and easily modifiable to a variety of contexts and application domains. Although distributed versions of k-means have been proposed, the algorithm is still sensitive to the selection of the initial cluster prototypes and requires the number of clusters to be specified in advance. In this paper, we propose the use of evolutionary algorithms to overcome the k-means limitations and, at the same time, to deal with distributed data. Two different distribution approaches are adopted: the first obtains a final model identical to that of the centralized version of the clustering algorithm; the second generates and selects clusters for each distributed data subset and combines them afterwards. The algorithms are compared from two perspectives: the theoretical one, through asymptotic complexity analyses; and the experimental one, through a comparative evaluation of results obtained from a collection of experiments and statistical tests. The obtained results indicate which variant is more adequate for each application scenario.
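The core idea of evolving cluster prototypes instead of fixing them up front can be sketched with a toy evolutionary loop over candidate center sets, using the k-means objective (sum of squared errors) as fitness. This is an illustration only, with a hypothetical 1-D data set, and not the algorithms proposed in the paper.

```python
import random

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center:
    the clustering objective an evolutionary k-means can use as fitness."""
    return sum(min((p - c) ** 2 for c in centers) for p in points)

def evolve_centers(points, k, pop=8, gens=40, seed=3):
    """Toy evolutionary search over 1-D center sets: Gaussian mutation with
    greedy replacement of each individual by its fitter child."""
    rng = random.Random(seed)
    population = [[rng.choice(points) for _ in range(k)] for _ in range(pop)]
    for _ in range(gens):
        for i, ind in enumerate(population):
            child = [c + rng.gauss(0, 0.2) for c in ind]   # Gaussian mutation
            if sse(points, child) < sse(points, ind):
                population[i] = child                       # greedy replacement
        # (a full algorithm would also mutate k itself, crossing over
        #  center sets of different sizes)
    return min(population, key=lambda ind: sse(points, ind))

data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
best = evolve_centers(data, k=2)
```

Unlike plain k-means, the evolutionary search is not trapped by one initial prototype choice, which is the sensitivity the abstract refers to.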