## PLDA: Parallel Latent Dirichlet Allocation for Large-scale Applications

Citations: 46 (5 self)

### Citations

4167 | Latent Dirichlet Allocation
- Blei, Ng, et al.
Citation Context: ...leased MPI-PLDA to open source at http://code.google.com/p/plda under the Apache License. 1 Introduction Latent Dirichlet Allocation (LDA) was first proposed by Blei, Ng and Jordan to model documents [1]. Each document is modeled as a mixture of K latent topics, where each topic, k, is a multinomial distribution φk over a V-word vocabulary. For any document d, its topic mixture θd is a probability ...
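The generative process this snippet describes can be sketched in a few lines. This is an illustrative single-document sampler, not part of PLDA's code; the function and parameter names (`generate_doc`, `sample_dirichlet`, default sizes) are our own assumptions:

```python
import random

def sample_dirichlet(rng, alphas):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    xs = [rng.gammavariate(a, 1.0) for a in alphas]
    s = sum(xs)
    return [x / s for x in xs]

def generate_doc(K=4, V=20, doc_len=50, alpha=0.1, beta=0.01, seed=0):
    """One document from the LDA generative process: draw the K topic-word
    multinomials phi_k over a V-word vocabulary, draw this document's topic
    mixture theta_d, then draw a topic z and a word w for each position."""
    rng = random.Random(seed)
    phi = [sample_dirichlet(rng, [beta] * V) for _ in range(K)]  # topics phi_k
    theta = sample_dirichlet(rng, [alpha] * K)                   # mixture theta_d
    words = []
    for _ in range(doc_len):
        z = rng.choices(range(K), weights=theta)[0]   # topic assignment
        words.append(rng.choices(range(V), weights=phi[z])[0])
    return theta, phi, words
```

In learning, this process is inverted: only the words are observed, and θd, φk, and the per-word topic assignments must be inferred.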

3208 | MapReduce: Simplified data processing on large clusters
- Dean, Ghemawat
- 2008
Citation Context: ...it works via a simple example. We then present our two fault-tolerant PLDA implementations (the current core algorithm of PLDA is the AD-LDA algorithm [2]), one on MPI [3] and the other on MapReduce [4]. Section 4 uses two large-scale applications to demonstrate the scalability of PLDA. Finally, we discuss future research plans in Section 5. 2 Learning Algorithms for LDA Blei, Ng and Jordan [1] prop...

1059 | Finding scientific topics
- Griffiths, Steyvers
- 2004
Citation Context: ...for approximate inference. Minka and Lafferty proposed a comparable algorithm [5], which uses another approximate inference method, Expectation Propagation (EP), in the E-step. Griffiths and Steyvers [6] proposed using Gibbs sampling, a Markov-chain Monte Carlo method, to perform inference. By assuming a Dirichlet prior, β, on model parameters Φ = {φk} (a set of topics), Φ can be integrated (hence r...
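The collapsed Gibbs sampler of Griffiths and Steyvers that this snippet refers to can be sketched as a minimal single-machine illustration (it is the per-word resampling loop that PLDA distributes; `gibbs_lda` and the variable names are our own, not PLDA's API):

```python
import random

def gibbs_lda(docs, K, V, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Minimal collapsed Gibbs sampler for LDA.
    docs: list of documents, each a list of word ids in [0, V)."""
    rng = random.Random(seed)
    n_kw = [[0] * V for _ in range(K)]   # topic-word counts
    n_dk = [[0] * K for _ in docs]       # document-topic counts
    n_k = [0] * K                        # per-topic totals
    z = []                               # topic assignment for every word position
    for d, doc in enumerate(docs):       # random initialization
        zd = []
        for w in doc:
            k = rng.randrange(K)
            zd.append(k)
            n_kw[k][w] += 1; n_dk[d][k] += 1; n_k[k] += 1
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # remove the current assignment from the counts
                n_kw[k][w] -= 1; n_dk[d][k] -= 1; n_k[k] -= 1
                # full conditional: P(z=t | rest) ∝ (n_kw+β)/(n_k+Vβ) · (n_dk+α)
                weights = [(n_kw[t][w] + beta) / (n_k[t] + V * beta)
                           * (n_dk[d][t] + alpha) for t in range(K)]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                n_kw[k][w] += 1; n_dk[d][k] += 1; n_k[k] += 1
    return z, n_kw, n_dk
```

Because Φ and θ are integrated out, only these count tables need to be maintained, which is what makes the counts easy to partition and merge across machines.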

1008 | An Introduction to Latent Semantic Analysis
- Landauer, Foltz
- 1998
Citation Context: ...3.2 Illustrative Example We use a two-category, nine-document example, originally presented in [12] for explaining LSA, to illustrate how PLDA works. Table 1 shows nine documents separated into two categories, where symbol h stands for human-computer interaction, and m for mathematical graph theory...

220 | Map-reduce for machine learning on multicore
- Chu, Kim, et al.
- 2006
Citation Context: ...distributed computing of the VEM algorithm for LDA [1]. – Newman et al. [2] presented two synchronous methods, AD-LDA and HD-LDA, to perform distributed Gibbs sampling. AD-LDA is similar to distributed EM [8] from a data-flow perspective; HD-LDA is theoretically equivalent to learning a mixture of LDA models but suffers from high computation cost. – Asuncion, Smyth and Welling [9] presented an asynchronou...
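The synchronization step that makes AD-LDA resemble distributed EM from a data-flow perspective can be sketched as a reduce over per-worker count deltas. `merge_counts` below is a hypothetical single-process stand-in for what an MPI AllReduce or a MapReduce reduce pass would perform, not PLDA's actual implementation:

```python
def merge_counts(global_counts, worker_counts_list):
    """AD-LDA-style synchronization sketch: each worker Gibbs-samples its
    document shard against a private copy of the topic-word counts; the
    per-worker deltas are then summed back into the global table."""
    K, V = len(global_counts), len(global_counts[0])
    merged = [row[:] for row in global_counts]
    for worker in worker_counts_list:
        for k in range(K):
            for v in range(V):
                # add only this worker's change relative to the
                # start-of-iteration global copy it was given
                merged[k][v] += worker[k][v] - global_counts[k][v]
    return merged
```

Because each worker only samples its own documents, the document-topic counts need no merging; only the shared topic-word table must be reconciled each iteration.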

168 | Probabilistic models for unified collaborative and content-based recommendation in sparse-data environments
- Popescul, Ungar, et al.
- 2001

155 | Expectation-propagation for the generative aspect model
- Minka, Lafferty
- 2002
Citation Context: ...ters using the inference result. Unfortunately, this inference is intractable, so variational Bayes is used in the E-step for approximate inference. Minka and Lafferty proposed a comparable algorithm [5], which uses another approximate inference method, Expectation Propagation (EP), in the E-step. Griffiths and Steyvers [6] proposed using Gibbs sampling, a Markov-chain Monte Carlo method, to perform ...

155 | Learning to Probabilistically Identify Authoritative Documents
- Cohn, Chang
- 2000
Citation Context: ...c) = P(wd|c)P(c|wd), where c denotes the product category of reviews. char(wd; c) is a natural extension of the topic-sensitive characteristic measure, char(wd; z) = P(wd|z)P(z|wd), proposed by [13]. Expanding char(wd; c), we obtain: char(wd; c) = P(wd|c)P(c|wd) = [Σz P(wd|z)P(z|c)] · [Σz P(z|wd)P(c|z)], where P(z|c) and P(c|z) come from the learning result and P(wd|z) and P(z|wd) com...
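The expanded characteristic measure can be computed directly from the learned distributions. The function and argument names below are illustrative assumptions about how the probability tables might be stored (lists indexed by topic z, word w, and category c), not an interface from the paper:

```python
def char_measure(p_w_given_z, p_z_given_w, p_z_given_c, p_c_given_z, w, c):
    """char(w; c) = P(w|c) * P(c|w), expanded over topics z.
    Assumed (hypothetical) table shapes:
      p_w_given_z[z][w] = P(w|z),  p_z_given_w[w][z] = P(z|w),
      p_z_given_c[c][z] = P(z|c),  p_c_given_z[z][c] = P(c|z)."""
    Z = len(p_w_given_z)
    p_w_c = sum(p_w_given_z[z][w] * p_z_given_c[c][z] for z in range(Z))  # P(w|c)
    p_c_w = sum(p_z_given_w[w][z] * p_c_given_z[z][c] for z in range(Z))  # P(c|w)
    return p_w_c * p_c_w
```

P(z|c) and P(c|z) come from the learned model, while P(w|z) and P(z|w) come from the topic-word estimates, so the measure needs no access to the raw documents at scoring time.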

107 | Fast Collapsed Gibbs Sampling for Latent Dirichlet Allocation
- Porteous, Newman, et al.
- 2008
Citation Context: ...A model learning times by reducing the total computational cost: – Gomes, Welling and Perona [10] presented an enhancement of the VEM algorithm using a bounded amount of memory. – Porteous et al. [11] proposed a method to accelerate the computation of (Eq. 1). The acceleration is achieved with no approximations, but by using the property that the probability vectors, θd, are sparse in most cases. 3 PLDA ...

85 | Optimization of Collective Communication Operations in MPICH
- Thakur, Rabenseifner, et al.
- 2005
Citation Context: ...llel LDA (PLDA) and explain how it works via a simple example. We then present our two fault-tolerant PLDA implementations (the current core algorithm of PLDA is the AD-LDA algorithm [2]), one on MPI [3] and the other on MapReduce [4]. Section 4 uses two large-scale applications to demonstrate the scalability of PLDA. Finally, we discuss future research plans in Section 5. 2 Learning Algorithms for L...

82 | Distributed inference for latent dirichlet allocation
- Newman, Asuncion, et al.
- 2007
Citation Context: ...3 we present parallel LDA (PLDA) and explain how it works via a simple example. We then present our two fault-tolerant PLDA implementations (the current core algorithm of PLDA is the AD-LDA algorithm [2]), one on MPI [3] and the other on MapReduce [4]. Section 4 uses two large-scale applications to demonstrate the scalability of PLDA. Finally, we discuss future research plans in Section 5. 2 Learning...

74 | OPTIMOL: Automatic Online Picture Collection via Incremental Model Learning
- Li, Wang, et al.
- 2007

63 | Asynchronous distributed learning of topic models
- Asuncion, Smyth, et al.
- 2008
Citation Context: ...imilar to distributed EM [8] from a data-flow perspective; HD-LDA is theoretically equivalent to learning a mixture of LDA models but suffers from high computation cost. – Asuncion, Smyth and Welling [9] presented an asynchronous distributed Gibbs sampling algorithm. In addition to these parallelization techniques, the following optimizations can reduce LDA model learning times by reducing the total ...

31 | Parallelized variational EM for latent Dirichlet allocation: An experimental evaluation of speed and scalability
- Nallapati, Cohen, et al.
- 2007
Citation Context: ...p LDA, including both parallelizing LDA across multiple machines and reducing the total amount of work required to build an LDA model. Relevant parallelization efforts include: – Nallapati et al. [7] reported distributed computing of the VEM algorithm for LDA [1]. – Newman et al. [2] presented two synchronous methods, AD-LDA and HD-LDA, to perform distributed Gibbs sampling. AD-LDA is similar ...

30 | PSVM: Parallelizing support vector machines on distributed computers
- Chang, Zhu, et al.
- 2007
Citation Context: ...computation time reduced by parallelization cannot compensate for the increase in the communication time, and speedup actually decreases. On the one hand, as we previously stated and also observed in [18], when the dataset size increases, and hence the computation time increases, we can add more machines to productively improve speedup. On the other hand, a job will eventually be dominated by the comm...

26 | Collaborative filtering for orkut communities: discovery of user latent behavior
- Chen
- 2009
Citation Context: ...or implicitly by joining communities. When the number of communities grows over time, finding an interesting community to join can be time-consuming. We use PLDA to model users' community membership [16]. On a matrix formed by users as rows and communities as columns, all values in user-community cells are initially unknown. When a user joins a community, the corresponding user-community cell is set ...

6 | Memory bounded inference in topic models
- Gomes, Welling, et al.
- 2008
Citation Context: ...pling algorithm. In addition to these parallelization techniques, the following optimizations can reduce LDA model learning times by reducing the total computational cost: – Gomes, Welling and Perona [10] presented an enhancement of the VEM algorithm using a bounded amount of memory. – Porteous et al. [11] proposed a method to accelerate the computation of (Eq. 1). The acceleration is achieved by n...

1 | Category-focused ranking for keyword extraction and document summarization. Google Technical Report, submitted for publication (2009)
- Liu, Wang, et al.
Citation Context: ...ss than 100 words. The forum set consists of 2,450,379 entries extracted from http://www.tianya.cn. While the effectiveness of PLDA on document summarization on the Wikipedia dataset is reported in [17], we report here our experimental results on scalability conducted upon both datasets. We measured and compared the speedup of MPI-PLDA and MapReduce-PLDA using these two datasets. The dataset size an...

1 | Avoiding communication in dense and sparse linear algebra (invited talk)
- Demmel
Citation Context: ...be dominated by the communication overhead, and adding more machines may be counter-productive. Therefore, the next logical step in performance enhancement is to consider communication time reduction [19] (discussed further in the concluding remarks). 3 Since the time of running N Gibbs sampling iterations is the same as N times the time of running one iteration, we do not need to run PLDA to convergence ...