
## Modeling hidden topics on document manifold (2008)

### Download Links

- [www.cs.uiuc.edu]
- [web.engr.illinois.edu]
- [hanj.cs.illinois.edu]
- [www.cad.zju.edu.cn]
- DBLP

### Other Repositories/Bibliography

Venue: In Proceedings of the ACM Conference on Information and Knowledge Management (CIKM)

Citations: 29 (6 self)

### Citations

11684 | Maximum Likelihood from Incomplete Data via the EM Algorithm
- Dempster, Laird, et al.
- 1977
Citation Context: ... maximize the log-likelihood L = Σ_{i=1}^{N} Σ_{j=1}^{M} n(di, wj) log Σ_{k=1}^{K} P(wj|zk) P(zk|di), where n(di, wj) is the number of occurrences of term wj in document di. The above optimization problem can be solved using the standard EM algorithm [9]. Notice that there are NK + MK parameters {P(wj|zk), P(zk|di)} which are independently estimated in the PLSI model. It is easy to see that the number of parameters in PLSI grows linearly with the number of ...
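The standard EM updates for PLSI that this excerpt refers to can be sketched as follows; this is an illustrative toy implementation in plain Python (function and variable names are our own, not from the paper):

```python
import random

def plsi_em(counts, K, iters=50, seed=0):
    """Toy PLSI trained with EM on an N x M term-count matrix.

    counts[i][j] = n(d_i, w_j); returns P(w|z) and P(z|d).
    Illustrative sketch only -- no smoothing, no convergence check.
    """
    rng = random.Random(seed)
    N, M = len(counts), len(counts[0])
    norm = lambda v: [x / sum(v) for x in v]
    # Random initialisation of the NK + MK parameters, then normalise.
    p_w_z = [norm([rng.random() for _ in range(M)]) for _ in range(K)]
    p_z_d = [norm([rng.random() for _ in range(K)]) for _ in range(N)]

    for _ in range(iters):
        # E-step: posterior P(z_k | d_i, w_j).
        post = [[norm([p_w_z[k][j] * p_z_d[i][k] for k in range(K)])
                 for j in range(M)] for i in range(N)]
        # M-step: closed-form re-estimation weighted by n(d_i, w_j).
        for k in range(K):
            p_w_z[k] = norm([sum(counts[i][j] * post[i][j][k]
                                 for i in range(N)) + 1e-12
                             for j in range(M)])
        for i in range(N):
            p_z_d[i] = norm([sum(counts[i][j] * post[i][j][k]
                                 for j in range(M)) + 1e-12
                             for k in range(K)])
    return p_w_z, p_z_d
```

Each M-step here has a closed form precisely because the parameters {P(wj|zk), P(zk|di)} are estimated independently, which is also why their count grows linearly with the corpus size.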

4163 | Latent Dirichlet allocation.
- Blei, Ng, et al.
- 2003
Citation Context: ... distribution of each document on the hidden topics independently, and the number of parameters in the model grows linearly with the size of the corpus. This leads to serious problems with overfitting [16][5][19]. Latent Dirichlet Allocation (LDA) was then proposed to overcome this problem by treating the probability distribution of each document over topics as a K-parameter hidden random variable rather than ...

3721 | Normalized cuts and image segmentation
- Shi, Malik
Citation Context: ...rithms as follows. • Canonical k-means clustering method (k-means in short). • Two representative spectral clustering methods: Average Association (AA in short) [22] and Normalized Cut (NC in short) [18][15]. Spectral clustering methods have recently emerged as one of the most effective document clustering ...

3702 | Indexing by latent semantic analysis
- Deerwester, Dumais, et al.
- 1990
Citation Context: ...October 26–30, 2008, Napa Valley, California, USA. Copyright 2008 ACM 978-1-59593-991-3/08/10. 1. INTRODUCTION: Document representation has been a key problem for document analysis and processing [8][10][11]. The Vector Space Model (VSM) might be one of the most popular models for document representation. In VSM, each document is represented as a bag of words. Correspondingly, the inner product (...

1674 | On Spectral Clustering: Analysis and an Algorithm
- Ng, Jordan, et al.
- 2001
Citation Context: ...ms as follows. • Canonical k-means clustering method (k-means in short). • Two representative spectral clustering methods: Average Association (AA in short) [22] and Normalized Cut (NC in short) [18][15]. Spectral clustering methods have recently emerged as one of the most effective document clustering ...

1191 | Probabilistic latent semantic indexing
- Hofmann
- 1999
Citation Context: ... 1. INTRODUCTION: Document representation has been a key problem for document analysis and processing [8][10][11]. The Vector Space Model (VSM) might be one of the most popular models for document representation. In VSM, each document is represented as a bag of words. Correspondingly, the inner product (or cosine ...

979 | A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants
- Neal, Hinton
- 1998
Citation Context: ...do not have a closed-form re-estimation equation for P(zk|di). In this case, the traditional EM algorithm cannot be applied. In the following, we discuss how to use the generalized EM algorithm (GEM) [14] to maximize the regularized log-likelihood of LapPLSI in Eqn. (6). The major difference between GEM and traditional EM is in the M-step. Instead of finding the globally optimal solutions for Ψ which ...
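The GEM idea described here, replacing the exact M-step with any parameter update that merely increases the expected log-likelihood, can be sketched generically; `q_value` and `q_grad` are hypothetical callables standing in for the regularized objective and its gradient, not functions from the paper:

```python
def gem_m_step(theta, q_grad, q_value, step=0.1, shrink=0.5, tries=20):
    """One generalised-EM M-step: improve Q rather than maximise it.

    Backtracking gradient ascent: shrink the step size until the
    expected log-likelihood Q strictly increases. Illustrative sketch
    only; theta is a plain list of parameters.
    """
    base = q_value(theta)
    g = q_grad(theta)
    for _ in range(tries):
        cand = [t + step * gi for t, gi in zip(theta, g)]
        if q_value(cand) > base:
            return cand          # any strict improvement suffices for GEM
        step *= shrink           # otherwise backtrack and retry
    return theta                 # no improving step found; keep parameters
```

Because the E-step is unchanged, the usual EM monotonicity argument still applies: any Q-increasing M-step yields a non-decreasing (regularized) log-likelihood.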

652 | Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering
- Belkin, Niyogi
- 2001
Citation Context: ...er. In other words, the conditional probability distribution P(z|d) varies smoothly along the geodesics in the intrinsic geometry of PD. This assumption is also referred to as the manifold assumption [3], which plays an essential role in developing various kinds of algorithms, including dimensionality reduction algorithms [3][10] and semi-supervised learning algorithms [4][24]. Let fk(d) = P(zk|d) be ...

605 | Unsupervised learning by probabilistic latent semantic analysis
- Hofmann
- 2001
Citation Context: ... years, and many variants of LSI have been proposed [1][20]. Despite its remarkable success in different domains, LSI has a number of deficits, mainly due to its unsatisfactory statistical formulation [12]. To address this issue, Hofmann [11] proposed a generative probabilistic model named Probabilistic Latent Semantic Indexing (PLSI). PLSI models each word in a document as a sample from a mixture model ...

557 | Manifold regularization: A geometric framework for learning from labeled and unlabeled examples
- Belkin, Niyogi, et al.
- 2004
Citation Context: ...referred to as the manifold assumption [3], which plays an essential role in developing various kinds of algorithms, including dimensionality reduction algorithms [3][10] and semi-supervised learning algorithms [4][24]. Let fk(d) = P(zk|d) be the conditional probability distribution function (PDF); we use ‖fk‖²M to measure the smoothness of fk along the geodesics in the intrinsic geometry of PD. When we consider ...

319 | Document clustering based on non-negative matrix factorization
- Xu, Liu, et al.
- 2003
Citation Context: ...obtained label of each document with that provided by the document corpus. Two metrics, the accuracy (AC) and the normalized mutual information metric (MI), are used to measure the clustering performance [21][6]. Given a document xi, let ri and si be the obtained cluster label and the label provided by the corpus, respectively. The AC is defined as AC = (Σ_{i=1}^{n} δ(si, map(ri))) / n, where n is the total ...
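The AC metric quoted here can be sketched as follows; a brute-force search over label permutations stands in for the Kuhn-Munkres algorithm, which is acceptable only for a small number of clusters (and assumes there are at least as many true labels as predicted ones):

```python
from itertools import permutations

def clustering_accuracy(true_labels, pred_labels):
    """AC = (1/n) * sum_i delta(s_i, map(r_i)).

    map(.) is the best permutation mapping from predicted cluster
    labels to corpus labels; found here by exhaustive search rather
    than the Kuhn-Munkres algorithm. Sketch only.
    """
    n = len(true_labels)
    true_set = sorted(set(true_labels))
    pred_set = sorted(set(pred_labels))
    best = 0
    for perm in permutations(true_set, len(pred_set)):
        mapping = dict(zip(pred_set, perm))
        hits = sum(1 for s, r in zip(true_labels, pred_labels)
                   if mapping[r] == s)
        best = max(best, hits)
    return best / n
```

For realistic numbers of clusters, the Hungarian/Kuhn-Munkres algorithm cited as [13] finds the same optimal mapping in polynomial time.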

194 | Spectral Relaxation for k-Means Clustering
- Zha, He, et al.
- 2001
Citation Context: ...d four state-of-the-art clustering algorithms as follows. • Canonical k-means clustering method (k-means in short). • Two representative spectral clustering methods: Average Association (AA in short) [22] and Normalized Cut (NC in short) [18][15]. Spectral clustering methods have recently emerged as one of the most effective document clustering ...

191 | Spectral Graph Theory, volume 92
- Chung
- 1997
Citation Context: ...old M, and the integral is taken over the distribution PD. In reality, the document manifold is usually unknown. Thus, ‖fk‖²M in Eqn. (3) cannot be computed. Recent studies on spectral graph theory [7] and manifold learning theory [2] have demonstrated that ‖fk‖²M can be discretely approximated through a nearest-neighbor graph on a scatter of data points. Consider a graph with N vertices where each ...
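The discrete approximation mentioned here has the standard form fᵀLf = (1/2) Σ_ij Wij (fi − fj)², where L = D − W is the graph Laplacian of the nearest-neighbor graph; a minimal pure-Python sketch (W is assumed to be a dense symmetric weight matrix, nonzero only for neighboring documents):

```python
def laplacian_smoothness(f, W):
    """Discrete approximation of ||f||_M^2 on a nearest-neighbour graph:

        (1/2) * sum_ij W[i][j] * (f[i] - f[j])^2  =  f^T L f,

    where L = D - W is the graph Laplacian. f[i] plays the role of
    f_k(d_i) = P(z_k | d_i). Sketch only; dense W for clarity.
    """
    n = len(f)
    return 0.5 * sum(W[i][j] * (f[i] - f[j]) ** 2
                     for i in range(n) for j in range(n))
```

A constant f has zero penalty, and the penalty grows with how sharply P(zk|d) changes between neighboring documents, which is exactly the smoothness-along-the-manifold intuition.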

168 | Probabilistic models for unified collaborative and content-based recommendation in sparse-data environments
- Popescul, Ungar, et al.
- 2001
Citation Context: ... distribution of each document on the hidden topics independently, and the number of parameters in the model grows linearly with the size of the corpus. This leads to serious problems with overfitting [16][5][19]. Latent Dirichlet Allocation (LDA) was then proposed to overcome this problem by treating the probability distribution of each document over topics as a K-parameter hidden random variable rather ...

101 | Matching Theory, Akadémiai Kiadó
- Lovász, Plummer
- 1986
Citation Context: ...erwise, and map(ri) is the permutation mapping function that maps each cluster label ri to the equivalent label from the data corpus. The best mapping can be found by using the Kuhn-Munkres algorithm [13]. Let C denote the set of clusters obtained from the ground truth and C′ the set obtained from our algorithm. Their mutual information metric MI(C, C′) is defined as MI(C, C′) = Σ_{ci∈C, c′j∈C′} ...
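The normalized mutual information metric referenced here can be computed directly from two label lists; a minimal sketch using empirical probabilities and base-2 logarithms (normalizing by max(H(C), H(C′)) is one common convention, and is an assumption on our part rather than a detail quoted from the excerpt):

```python
from collections import Counter
from math import log2

def normalized_mutual_information(labels_a, labels_b):
    """MI(C, C') / max(H(C), H(C')) from two equal-length label lists.

    MI(C, C') = sum over label pairs of p(a, b) * log2(p(a, b) /
    (p(a) * p(b))), with all probabilities estimated empirically.
    Sketch only.
    """
    n = len(labels_a)
    pa = Counter(labels_a)
    pb = Counter(labels_b)
    pab = Counter(zip(labels_a, labels_b))
    mi = sum((c / n) * log2((c / n) / ((pa[a] / n) * (pb[b] / n)))
             for (a, b), c in pab.items())
    entropy = lambda cnt: -sum((c / n) * log2(c / n) for c in cnt.values())
    denom = max(entropy(pa), entropy(pb))
    return mi / denom if denom > 0 else 1.0
```

Unlike AC, this score needs no explicit label mapping: identical partitions score 1 regardless of how the cluster labels are named.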

79 | Problems of learning on manifolds
- Belkin
- 2003
Citation Context: ...ver the distribution PD. In reality, the document manifold is usually unknown. Thus, ‖fk‖²M in Eqn. (3) cannot be computed. Recent studies on spectral graph theory [7] and manifold learning theory [2] have demonstrated that ‖fk‖²M can be discretely approximated through a nearest-neighbor graph on a scatter of data points. Consider a graph with N vertices where each vertex corresponds to a document ...

77 | Document clustering using locality preserving indexing
- Cai, He, et al.
- 2005
Citation Context: ...obtained label of each document with that provided by the document corpus. Two metrics, the accuracy (AC) and the normalized mutual information metric (MI), are used to measure the clustering performance [21][6]. Given a document xi, let ri and si be the obtained cluster label and the label provided by the corpus, respectively. The AC is defined as AC = (Σ_{i=1}^{n} δ(si, map(ri))) / n, where n is the total ...

48 | Harmonic mixtures: combining mixture models and graph-based methods for inductive and scalable semisupervised learning. ICML
- Zhu, Lafferty
- 2005
Citation Context: ...referred to as the manifold assumption [3], which plays an essential role in developing various kinds of algorithms, including dimensionality reduction algorithms [3][10] and semi-supervised learning algorithms [4][24]. Let fk(d) = P(zk|d) be the conditional probability distribution function (PDF); we use ‖fk‖²M to measure the smoothness of fk along the geodesics in the intrinsic geometry of PD. When we consider ...

41 | Latent Semantic Space: Iterative Scaling Improves Precision of Inter-document Similarity Measurement
- Ando
- 2000
Citation Context: ...more reliably estimated in the reduced latent space representation than in the original representation. LSI has received a lot of attention over the years, and many variants of LSI have been proposed [1][20]. Despite its remarkable success in different domains, LSI has a number of deficits, mainly due to its unsatisfactory statistical formulation [12]. To address this issue, Hofmann [11] proposed a g...

37 | Locality preserving indexing for document representation
- He, Cai, et al.
- 2004
Citation Context: ... 1. INTRODUCTION: Document representation has been a key problem for document analysis and processing [8][10][11]. The Vector Space Model (VSM) might be one of the most popular models for document representation. In VSM, each document is represented as a bag of words. Correspondingly, the inner product (or, ...

26 | Latent semantic analysis for multiple-type interrelated data objects
- Wang, Sun, et al.
- 2006
Citation Context: ...reliably estimated in the reduced latent space representation than in the original representation. LSI has received a lot of attention over the years, and many variants of LSI have been proposed [1][20]. Despite its remarkable success in different domains, LSI has a number of deficits, mainly due to its unsatisfactory statistical formulation [12]. To address this issue, Hofmann [11] proposed a gener...

20 |
Text classification with kernels on the multinomial manifold
- Zhang, Chen, et al.
Citation Context: ...lly sampled from a Euclidean space. Recent studies suggest that documents are usually sampled from a nonlinear low-dimensional manifold which is embedded in the high-dimensional ambient space [10][23]. Thus, the local geometric structure is essential to reveal the hidden semantics in the corpora. In this paper, we propose a new algorithm called Laplacian Probabilistic Latent Semantic Indexing (LapPLSI) ...

13 | Adjusting mixture weights of Gaussian mixture model via regularized probabilistic latent semantic analysis
- Si, Jin
- 2005
Citation Context: ...bution of each document on the hidden topics independently, and the number of parameters in the model grows linearly with the size of the corpus. This leads to serious problems with overfitting [16][5][19]. Latent Dirichlet Allocation (LDA) was then proposed to overcome this problem by treating the probability distribution of each document over topics as a K-parameter hidden random variable rather than ...