| E.-H. Han, G. Karypis, and V. Kumar, "Clustering in a high-dimensional space using hypergraph models," Tech. Rep. 97-063, University of Minnesota, 1998. |
....access method (GET, POST), URL, transmission protocol (HTTP, FTP), etc., which are all nonnumeric, the correlation between two user sessions and, hence, their clustering is best handled using a fuzzy set approach. Other techniques for clustering web data include those using hypergraph-based clustering [58]. Association Rule Mining: Some algorithms for mining association rules using FL techniques have been suggested in [59]. They deal with the problem of mining fuzzy association rules understandable to humans from a database containing both quantitative and categorical attributes. Association rules ....
B. Mobasher, V. Kumar, and E. H. Han, Clustering in a High Dimensional Space Using Hypergraph Models. Minneapolis: Univ. Minnesota, 1997, Tech. Rep. TR-97-063.
....access method (GET, POST), URL, transmission protocol (HTTP, FTP), etc., which are all nonnumeric; the correlation between two user sessions, and hence their clustering, is best handled using a fuzzy set approach. Other techniques for clustering web data include those using hypergraph-based clustering [58]. Association Rule Mining: Some algorithms for mining association rules using fuzzy logic techniques have been suggested in [59]. They deal with the problem of mining fuzzy association rules understandable to humans from a database containing both quantitative and categorical attributes. ....
B. Mobasher, V. Kumar, and E. H. Han, "Clustering in a high dimensional space using hypergraph models," Technical Report TR-97-063, University of Minnesota, Minneapolis, 1997.
....the algorithm is first applied on a random sample from the dataset, and then the remaining data is assigned to the clusters discovered. The main disadvantages of ROCK are that it uses a static model for the overall data, and the quality of the final clustering is sensitive to the sample chosen. Han et al. (1997) have given a clustering algorithm based on the large itemsets discovered in the database. Here, each itemset is considered as a relation over some set of items, and a hypergraph model is used to represent these relations. A hypergraph partitioning algorithm is then used to obtain a set of item ....
....are disjoint itemsets. A is called the antecedent and B is called the consequent of the rule. The support of an association rule is the support of the itemset (A ∪ B). The confidence of the rule, denoted as conf(A ⇒ B), is given as the ratio of sup(A ∪ B) to sup(A). To assign a weight for a pattern, Han et al. (1997) used an approach based on confidence values. According to this, the weight of a pattern is the average confidence value of the association rules that include all the items in the corresponding pattern and that have a singleton as a consequent. For example, the weight of a pattern {a, b, c} is equal ....
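The support, confidence, and pattern-weight definitions in the excerpt above can be sketched in a few lines of Python. This is a minimal illustration, not the code of Han et al.: the function names and the toy transactions are hypothetical, support is taken as the fraction of transactions containing an itemset, and every rule of the form (pattern minus one item) ⇒ (that item) is included without any minimum-confidence filter.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """conf(A => B) = sup(A u B) / sup(A); returns 0.0 if A never occurs."""
    sup_a = support(antecedent, transactions)
    if sup_a == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / sup_a


def pattern_weight(pattern, transactions):
    """Average confidence of the rules (pattern - {x}) => {x}, one per item x.

    Assumption: all such rules count, with no confidence threshold applied.
    """
    pattern = frozenset(pattern)
    confs = [confidence(pattern - {x}, {x}, transactions) for x in pattern]
    return sum(confs) / len(confs)


# Hypothetical toy data: five market-basket transactions.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

weight_abc = pattern_weight({"a", "b", "c"}, transactions)
```

On this toy data each of the three rules has confidence (2/5) / (3/5) = 2/3, so the weight of {a, b, c} is also 2/3. In a hypergraph model such a weight would be attached to the hyperedge that connects the items a, b, and c.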
Han, E. H., Karypis, G., and Mobasher, B. 1997: `Clustering in a high dimensional space using hypergraph models', In Technical Report, University of Minnesota, Department of Computer Science, 97-019, 1997.
....partitioning technique to the problem of hypergraph clustering. The categorical data set is modeled as a hypergraph and a clustering method based on nonlinear dynamical systems is applied to the hypergraph. A hypergraph partitioning algorithm is used to cluster a categorical data set in [HKKM97]. ROCK [GRS99] is a categorical clustering algorithm which clusters a sampled data set using a novel concept called the links between the data points. Data points are considered as neighbors if their similarity exceeds a threshold, and the number of links between two data points is the number of ....
E.H. Han, G. Karypis, V. Kumar, and B. Mobasher. Clustering in a high-dimensional space using hypergraph models (1997). In Technical Report, University of Minnesota, Department of Computer Science, 97-019, 1997.
....of yielding competitive clusters, as shown in Fig. 2, but the PDDP method with norm scaling is faster by at least an order of magnitude compared to HAC, as illustrated in Table 2. We have also had a few comparisons with a recently developed association rule hypergraph clustering method (ARHP) [24, 25, 31], based on association rule hypergraphs [2, 6], but this last method is harder to compare since it eliminates some documents, so that the result is not, strictly speaking, a partitioning of the entire document set. The resulting entropies for 33 clusters on the J1 document set were PDDP: 0.497; ....
E. Han, G. Karypis, V. Kumar, and B. Mobasher. Clustering in a high-dimensional space using hypergraph models. Technical Report TR-97-063, Department of Computer Science, University of Minnesota, Minneapolis, 1997.
....HAC, the features in each document vector are usually weighted using the TFIDF scaling [SM83], which is an increasing function of the feature's text frequency and its inverse document frequency in the document space. 4.1 Association Rule Hypergraph Partitioning Algorithm. The ARHP method [HKKM97a, HKKM97b] is used for clustering related items in transaction-based databases, such as supermarket bar code data, using association rules and hypergraph partitioning. This method first finds sets of items that occur frequently together in transactions using association rule discovery methods [AMS 96a] ....
E.H. Han, G. Karypis, V. Kumar, and B. Mobasher. Clustering in a high-dimensional space using hypergraph models. Technical Report TR-97-063, Department of Computer Science, University of Minnesota, Minneapolis, 1997.
....association rules based on this new definition. We have implemented Min Apriori based on the Apriori algorithm proposed in [AS94]. We have applied the Min Apriori algorithm to find association rules for web document data and used association rules to find clusters of words and documents [HBG 97, HKKM97]. The results show that we can find meaningful and useful association rules among documents and words without discretizing the values in the data. The results also suggest that whenever a higher value in the data indicates a stronger relation, Min Apriori is applicable. We plan to apply ....
E.H. Han, G. Karypis, V. Kumar, and B. Mobasher. Clustering in a high-dimensional space using hypergraph models. Technical Report TR-97-063, Department of Computer Science, University of Minnesota, Minneapolis, 1997.
....the vertices such that the corresponding data items in each partition are highly related and the weight of the hyperedges cut by the partitioning is minimized. To test the applicability and robustness of our scheme, we evaluated it on a wide variety of data sets [HKKM97a, MHB 97, HBG 98, HKKM97b]. We present a summary of results on two different data sets: S&P 500 stock data for the period of 1994-1996 and protein coding data. These experiments demonstrate that our approach is applicable and effective in a wide range of domains. More specifically, our approach performed much better than ....
....these 40 partitions, only 20 of them satisfy the fitness function. Out of 20 clusters, 16 clusters were clean clusters, as they contain stocks primarily from one industry group. Some of the clean clusters found from this data are shown in Table 1, and the complete list of clusters is available in [HKKM97b]. Looking at these clusters, we can see that our item clustering algorithm was very successful in grouping together stocks that belong to the same industry group. For example, our algorithm was able to find technology, financial, oil, gold, and metal related stock clusters. Also, it is ....
E.H. Han, G. Karypis, V. Kumar, and B. Mobasher. Clustering in a high-dimensional space using hypergraph models. Technical Report TR-97-063, Department of Computer Science, University of Minnesota, Minneapolis, 1997.
No context found.
E.-H. Han, G. Karypis, and V. Kumar, "Clustering in a high-dimensional space using hypergraph models," Tech. Rep. 97-063, University of Minnesota, 1998.