Results 1 - 9 of 9
The effect of new links on Google PageRank
Stoch. Models, 2006
Cited by 19 (10 self)
PageRank is one of the principal criteria according to which Google ranks Web pages. PageRank can be interpreted as the frequency with which a random surfer visits a Web page, and thus it reflects the popularity of a Web page. We study the effect of newly created links on Google PageRank. We discuss to what extent a page can control its PageRank. Using asymptotic analysis, we provide simple conditions that show whether or not new links result in increased PageRank for a Web page and its neighbors. Furthermore, we show that there exists an optimal (although impractical) linking strategy. We conclude that a Web page benefits from links inside its Web community and that, on the other hand, irrelevant links penalize the Web pages and their Web communities.
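The random-surfer interpretation above can be made concrete with a small numerical experiment. The sketch below is not the asymptotic analysis of the paper; it is a plain power iteration on a hypothetical four-page graph, with an assumed damping factor of 0.85, that compares PageRank before and after page `a` adds a link to an unrelated community.

```python
def pagerank(links, damping=0.85, iterations=200):
    """Power iteration. `links` maps each page to the list of pages it links to.

    The damping factor 0.85 is an assumed, commonly used value.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every page receives the teleportation share first.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            # A page without out-links spreads its rank over all pages.
            targets = links[v] if links[v] else nodes
            share = damping * rank[v] / len(targets)
            for w in targets:
                new_rank[w] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: two communities {a, b} and {c, d}.
before = pagerank({"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"]})
# Page a adds a link to page c, which lies outside its own community.
after = pagerank({"a": ["b", "c"], "b": ["a"], "c": ["d"], "d": ["c"]})

print("a before:", round(before["a"], 4), "a after:", round(after["a"], 4))
```

In this toy setting the new outgoing link moves rank from the community of `a` to the other community: the score of `a` falls from 0.25 to roughly 0.11, while `c` gains. This is only one illustrative configuration, not a general result.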
Scalable matrix computations on large scalefree graphs using 2d graph partitioning
in Supercomputing
Cited by 6 (1 self)
Scalable parallel computing is essential for processing large scale-free (power-law) graphs. The distribution of data across processes becomes important on distributed-memory computers with thousands of cores. It has been shown that two-dimensional layouts (edge partitioning) can have significant advantages over traditional one-dimensional layouts. However, simple 2D block distribution does not use the structure of the graph, and more advanced 2D partitioning methods are too expensive for large graphs. We propose a new two-dimensional partitioning algorithm that combines graph partitioning with 2D block distribution. The computational cost of the algorithm is essentially the same as 1D graph partitioning. We study the performance of sparse matrix-vector multiplication (SpMV) for scale-free graphs from the web and social networks using several different partitioners and both 1D and 2D data layouts. We show that SpMV run time is reduced by exploiting the graph's structure. Contrary to popular belief, we observe that current graph and hypergraph partitioners often yield relatively good partitions on scale-free graphs. We demonstrate that our new 2D partitioning method consistently outperforms the other methods considered, for both SpMV and an eigensolver, on matrices with up to 1.6 billion nonzeros using up to 16,384 cores.
Keywords: parallel computing, graph partitioning, scale-free graphs, sparse matrix-vector multiplication, two-dimensional distribution
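The difference between 1D and 2D layouts can be sketched on a tiny example. The code below shows only the simple 2D block distribution that the abstract mentions as a baseline, not the proposed algorithm that combines it with graph partitioning. The matrix pattern, the process counts, and the helper names are all hypothetical.

```python
from collections import Counter


def block(index, n, parts):
    """Map an index in range(n) to one of `parts` contiguous blocks."""
    return index * parts // n


def owner_1d(i, j, n, p):
    # 1D row layout: the process that owns row i owns every nonzero in it.
    return block(i, n, p)


def owner_2d(i, j, n, pr, pc):
    # 2D block layout on a pr x pc process grid: nonzero (i, j) is placed
    # by the row block of i and the column block of j.
    return block(i, n, pr) * pc + block(j, n, pc)


def max_load(nonzeros, owner):
    """Largest number of nonzeros assigned to any single process."""
    counts = Counter(owner(i, j) for i, j in nonzeros)
    return max(counts.values())


n = 16
# Hypothetical scale-free-like pattern: vertex 0 is a hub linking to every
# vertex (a dense row), and every other vertex has one diagonal entry.
nonzeros = [(0, j) for j in range(n)] + [(i, i) for i in range(1, n)]

load_1d = max_load(nonzeros, lambda i, j: owner_1d(i, j, n, 4))
load_2d = max_load(nonzeros, lambda i, j: owner_2d(i, j, n, 2, 2))
print("max nonzeros per process, 1D:", load_1d, "2D:", load_2d)
```

With four processes in both cases, the 1D layout places the entire hub row on one process, whereas the 2D layout splits it across the processes of one grid row. The remaining imbalance in the 2D case hints at why a block layout that ignores graph structure leaves room for improvement.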
Parallel Algorithms for Hypergraph Partitioning
2006
Cited by 5 (1 self)
Near-optimal decomposition is central to the efficient solution of numerous problems in scientific computing and computer-aided design. In particular, intelligent a priori partitioning of input data can greatly improve the runtime and scalability of large-scale parallel computations. Discrete data structures such as graphs and hypergraphs are used to formalise such partitioning problems, with hypergraphs typically preferred for their greater expressiveness. Optimal graph and hypergraph partitioning are NP-complete problems; however, serial heuristic algorithms that run in low-order polynomial time have been studied extensively and good tool support exists. Yet, to date, only graph partitioning algorithms have been parallelised. This thesis presents the first parallel hypergraph partitioning algorithms, enabling both partitioning of much larger hypergraphs, and computation of partitions with significantly reduced runtimes. In the multilevel approach which we adopt, the coarsening and refinement phases are performed in parallel while the initial
Site-Based Partitioning and Repartitioning Techniques for Parallel PageRank Computation
Cited by 3 (1 self)
The PageRank algorithm is an important component in effective web search. At the core of this algorithm are repeated sparse matrix-vector multiplications where the involved web matrices grow in parallel with the growth of the web and are stored in a distributed manner due to space limitations. Hence, the PageRank computation, which is frequently repeated, must be performed in parallel with high efficiency and low preprocessing overhead while considering the initial distributed nature of the web matrices. Our contributions in this work are twofold. We first investigate the application of state-of-the-art sparse matrix partitioning models in order to attain high efficiency in parallel PageRank computations with a particular focus on reducing the preprocessing overhead they introduce. For this purpose, we evaluate two different compression schemes on the web matrix using the site information inherently available in links. Second, we consider the more realistic scenario of starting with initially distributed data and extend our algorithms to cover the repartitioning of such data for efficient PageRank computation. We report performance results using our parallelization of a state-of-the-art PageRank algorithm on two different PC clusters with 40 and 64 processors. Experiments show that the proposed techniques achieve considerably high speedups while incurring a preprocessing overhead of several iterations (for some instances even less than a single iteration) of the underlying sequential PageRank algorithm.
Index Terms: PageRank, sparse matrix-vector multiplication, web search, parallelization, sparse matrix partitioning, graph partitioning, hypergraph partitioning, repartitioning.
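The idea of using the site information available in links can be sketched as a coarsening step. The abstract does not spell out its two compression schemes, so the code below is only one plausible reading: pages are grouped by the host name in their URL, and page-level links are collapsed into weighted site-level links. The URLs and function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url):
    """Return the host name of a URL, used here as the site identifier."""
    return urlsplit(url).hostname


def compress_to_sites(page_links):
    """Collapse page-level links into weighted site-level links.

    `page_links` is an iterable of (source_url, target_url) pairs.
    Links inside one site are dropped; the weight of a site-level link
    counts the page-level links it replaces.
    """
    weights = Counter()
    for source, target in page_links:
        s, t = site_of(source), site_of(target)
        if s != t:
            weights[(s, t)] += 1
    return weights


# Hypothetical page-level link list.
page_links = [
    ("http://a.example/index.html", "http://a.example/about.html"),
    ("http://a.example/index.html", "http://b.example/home.html"),
    ("http://a.example/about.html", "http://b.example/news.html"),
    ("http://b.example/home.html", "http://a.example/index.html"),
]
site_graph = compress_to_sites(page_links)
print(dict(site_graph))
```

A partitioner would then work on the much smaller site-level graph, which is the kind of saving in preprocessing overhead that the abstract targets. How the actual schemes weight or retain intra-site structure is not stated here.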
Web-Site-Based Partitioning Techniques for Reducing the Preprocessing Overhead before the Parallel PageRank Computations
Cited by 2 (1 self)
The efficiency of the PageRank computation is important since the constantly evolving nature of the Web requires this computation to be repeated many times. Due to the enormous size of the Web's hyperlink structure, PageRank computations are usually carried out on parallel computers. Recently, a hypergraph-partitioning-based formulation for parallel sparse-matrix vector multiplication was proposed as a preprocessing step that minimizes the communication overhead of the parallel PageRank computations. Based on this work, we propose Web-site-based partitioning approaches in order to reduce the overhead of this preprocessing step. The conducted experiments show that the proposed approach produces comparable performance results for PageRank computation while achieving lower preprocessing overheads.
Optimizing web structures using web mining techniques
Intelligent Data Engineering and Automated Learning - IDEAL 2007, volume 4881 of Lecture Notes in Computer Science, 2007
Cited by 2 (0 self)
With a vibrant and rapidly growing web, website complexity is constantly increasing, making it more difficult for users to quickly locate the information they are looking for. Locating information quickly, in turn, becomes more and more important due to the widespread reliance on the many services available on the Internet nowadays. Web mining techniques have been used successfully for quite some time, for example in search engines like Google, to facilitate retrieval of relevant information. This paper takes a different approach, as we believe that finding the information one is looking for can be facilitated not only by search engines but also by optimizing a website's internal structure based on previously recorded user behavior. In this paper, we present a novel approach to identifying problematic structures in websites. This method compares user behavior, derived via web log mining techniques, to an analysis of the website's link structure obtained by applying the Weighted PageRank algorithm (see
A REVIEW ON: DYNAMIC LINK BASED RANKING
Dynamic authority-based ranking methods such as personalized PageRank and ObjectRank rank nodes in a data graph dynamically using an expensive matrix-multiplication method, so the online execution time increases rapidly as the size of the data graph grows. ObjectRank spends 20-40 seconds to compute query-specific relevance scores, which is unacceptable. We introduce a novel approach, BinRank, that approximates dynamic link-based ranking scores efficiently. BinRank partitions a dictionary into bins of relevant keywords and then constructs materialized subgraphs (MSGs) per bin in a preprocessing stage. In query time, to produce
THE EFFECT OF NEW LINKS ON GOOGLE PAGERANK
2007
PageRank is one of the principal criteria according to which Google ranks Web pages. PageRank can be interpreted as the frequency with which a random surfer visits a Web page, and thus it reflects the popularity of a Web page. We study the effect of newly created links on Google PageRank. We discuss to what extent a page can control its PageRank. Using asymptotic analysis, we provide simple conditions that show whether or not new links result in increased PageRank for a Web page and its neighbors. Furthermore, we show that there exists an optimal (although impractical) linking strategy. We conclude that a Web page benefits from links inside its Web community and that, on the other hand, irrelevant links penalize the Web pages and their Web communities.
Generating Msg’s by Binrank for Scaling in Dynamic Authority Based Search
BinRank is a system that approximates ObjectRank results by utilizing a hybrid approach inspired by materialized views in traditional query processing. A number of relatively small subsets of the data graph are materialized in such a way that any keyword query can be answered by running ObjectRank on only one of the subgraphs. BinRank generates the subgraphs by partitioning all the terms in the corpus based on their co-occurrence, executing ObjectRank for each partition using the terms to generate a set of random walk starting points, and keeping only those objects that receive non-negligible scores. The intuition is that a subgraph that contains all objects and links relevant to a set of related terms should have all the information needed to rank objects with respect to one of these terms. We demonstrate that BinRank can achieve sub-second query execution time on the English Wikipedia data set, while producing high-quality search results that closely approximate the results of ObjectRank on the original graph. The Wikipedia link graph contains about 10^8 edges, which is at least two orders of magnitude larger than what prior state-of-the-art dynamic authority-based search systems have been able to demonstrate. Experimental evaluation investigates the tradeoff between query execution time, quality of the results, and storage requirements of BinRank.
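The term-partitioning step described above can be sketched as a greedy packing problem. The abstract states only that terms are partitioned by co-occurrence; the size limit `max_bin_size`, the greedy rule, and the toy posting lists below are assumptions made for illustration, not the authors' procedure.

```python
def pack_bins(postings, max_bin_size):
    """Greedily group terms whose document sets overlap.

    `postings` maps each term to the set of document ids containing it.
    A term joins the existing bin with which it shares the most documents,
    provided the bin's combined document set stays within `max_bin_size`;
    otherwise it starts a new bin.
    """
    bins = []
    # Process larger posting lists first; ties are broken alphabetically.
    for term in sorted(postings, key=lambda t: (-len(postings[t]), t)):
        docs = postings[term]
        best, best_overlap = None, -1
        for candidate in bins:
            fits = len(candidate["docs"] | docs) <= max_bin_size
            overlap = len(candidate["docs"] & docs)
            if fits and overlap > best_overlap:
                best, best_overlap = candidate, overlap
        if best is None:
            bins.append({"terms": [term], "docs": set(docs)})
        else:
            best["terms"].append(term)
            best["docs"] |= docs
    return bins


# Hypothetical posting lists.
postings = {
    "pagerank": {1, 2, 3},
    "ranking": {2, 3, 4},
    "sparse": {7, 8},
    "matrix": {8, 9},
}
bins = pack_bins(postings, max_bin_size=4)
print([b["terms"] for b in bins])
```

In the system described by the abstract, each bin would then seed an ObjectRank run whose non-negligible results form one materialized subgraph; that step is not modelled in this sketch.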