
## Generalizing PageRank: Damping functions for link-based ranking algorithms

Venue: In Proceedings of ACM SIGIR

Citations: 43 (10 self)

### Citations

3632 | Authoritative sources in a hyperlinked environment
- Kleinberg
- 1998
Citation Context ...ith linear damping we have presented can provide a good approximation with few iterations. Moreover, the approach we have presented could be also applied to multivalued ranking functions such as HITS =-=[23]-=- and topic-sensitive PageRank [20] to obtain, for instance, a method for approximating the hubs and authority scores using less iterations and a linear damping function. Our approach also helps to und... |

3329 | Collective dynamics of ‘small-world’ networks
- Watts, Strogatz
- 1998
Citation Context ...irs of nodes follow a Gaussian distribution [1] (the average is not given in their paper). Analytic estimations for the average distance of a graph of scale-free network of n nodes include: O(log(n)) =-=[35]-=-; O(log(n) /log(np)) in sparse graphs with p links [12]; 1 + log(n/z1)/log(z2/z1) where z1 is the average indegree, and z2 is the average number of nodes at distance 2 [30]; and O(log(n)/log(log(n))) ... |

3269 | The PageRank citation ranking: Bringing order to the Web. Work in progress. URL: http://google.stanford.edu/backrub/pageranksub.ps
- Page, Brin, et al.
Citation Context ...was not in scientific citations either), because it is very easy to manipulate in the context of the web, where creating a page costs nearly nothing. The PageRank technique, introduced by Page et al. =-=[31]-=-, actually tries to mend this problem by looking at the importance of a page in a recursive manner: “a page with high PageRank is a page referenced by many pages with high PageRank”. The algorithm not... |

3232 | Modern Information Retrieval
- Baeza-Yates, Ribeiro-Neto
- 1999
Citation Context ... 3, 4, or 5 links. Then, we sampled 12,000 pairs at each minimum distance at random, and computed their similarities with the original nodes. Similarity was measured using the normalization of TF.IDF =-=[4]-=-, without stemming or stopword removal. The resulting averages are shown in Figure 2, with standard deviation error bars. Text similarity clearly decreases with distance, and in some applications the ... |

2573 | Real and Complex Analysis
- Rudin
- 1987
Citation Context ...s: $T = \int_0^1 r(\alpha)\,d\alpha$. T can be written as: $\int_0^1 r(\alpha)\,d\alpha = \frac{1}{N}\sum_{t=0}^{\infty}\int_0^1 (1-\alpha)\alpha^t\,\mathbf{1}\cdot P^t\,d\alpha = \frac{1}{N}\sum_{t=0}^{\infty}\frac{1}{(t+1)(t+2)}\,\mathbf{1}\cdot P^t$, where the first equality is obtained applying Theorem 1.27 of =-=[33]-=-. Provided that P is not singular and $P \neq I$, we can write TotalRank using the definition of the logarithm of a matrix: $\ln(I-P) = -\sum_{k=1}^{\infty}\frac{P^k}{k}$, $T = P^{-1}(I + (I - P^{-1})\ln(I-P))$. TotalRank is a w... |
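The weight $\frac{1}{(t+1)(t+2)}$ in the excerpt above follows from a one-line integral. The derivation below is a worked sketch added for illustration; it assumes only the exponential damping $(1-\alpha)\alpha^t$ quoted in the excerpt, and the truncation point $T_{\max}$ is a symbol introduced here.

```latex
\begin{align*}
\int_0^1 (1-\alpha)\,\alpha^t \, d\alpha
  &= \int_0^1 \alpha^t \, d\alpha - \int_0^1 \alpha^{t+1} \, d\alpha \\
  &= \frac{1}{t+1} - \frac{1}{t+2}
   = \frac{1}{(t+1)(t+2)}, \\[4pt]
\sum_{t=0}^{T_{\max}-1} \frac{1}{(t+1)(t+2)}
  &= \sum_{t=0}^{T_{\max}-1} \left( \frac{1}{t+1} - \frac{1}{t+2} \right)
   = 1 - \frac{1}{T_{\max}+1} \;\longrightarrow\; 1 .
\end{align*}
```

The telescoping sum shows that these weights add up to 1, so they behave like the other damping functions, and that the mass left beyond path length $T_{\max}$ shrinks only as $1/(T_{\max}+1)$.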

634 | Random Graphs with Arbitrary Degree Distributions and their Applications
- Newman, Strogatz, et al.
- 2001
Citation Context ... $r(\alpha) = (1-\alpha)\sum_{t=0}^{\infty}\alpha^t\,\frac{1}{N}\mathbf{1}P^t$, or in matricial form: $r(\alpha) = (1-\alpha)\,\frac{1}{N}\mathbf{1}(I-\alpha P)^{-1}$, $\|\alpha P\| < 1$. There is an equivalent, and actually very intriguing way of rewriting this formula, mentioned in =-=[30]-=- that leads to a conclusion similar to those of [10]: given a path, that is, a sequence of edges in the graph $p = \langle x_1, x_2, \ldots, x_k\rangle$, such that node $x_i$ is connected to node $x_{i+1}$, we define its branching c... |

543 | Topic-Sensitive PageRank
- Haveliwala
- 2002
Citation Context ...ed can provide a good approximation with few iterations. Moreover, the approach we have presented could be also applied to multivalued ranking functions such as HITS [23] and topic-sensitive PageRank =-=[20]-=- to obtain, for instance, a method for approximating the hubs and authority scores using less iterations and a linear damping function. Our approach also helps to understand how easy or difficult it i... |

471 | Improved algorithms for topic distillation in a hyperlinked environment.
- Bharat, Henzinger
- 1998
Citation Context ...e, etc. [17]. Also, links within the same site can be considered self-links and as such do not confer as much authority as a link between different sites; indeed, there are ranking methods like BHITS =-=[6]-=- that treat them differently. Other characteristics of links, such as the exploration level at which they appear in Web sites [27], or if they are at the beginning or the bottom of individual pages, o... |

430 | A new status index derived from sociometric analysis.
- Katz
- 1953
Citation Context ... seen, generic PageRank is a functional ranking where the damping function $\mathrm{damping}(t) = (1-\alpha)\alpha^t$ decays exponentially fast (something similar was first considered in citation analysis back in 1953! =-=[22]-=-). The next section shows several functional rankings by describing their damping functions. 3. DAMPING FUNCTIONS Formula (1) defines a form of ranking that is parametrized by a damping function; the ... |
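The idea of a ranking parametrized by a damping function can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the functional ranking is the damped sum of $\frac{1}{N}\mathbf{1}P^t$ over path lengths $t$ (as in the PageRank series quoted elsewhere on this page), truncates that sum at a hypothetical `max_len`, and uses a made-up 3-node matrix `P` with no dangling nodes.

```python
def functional_rank(P, damping, max_len):
    """Truncated functional ranking: sum over t < max_len of damping(t) * (1/N) * 1 * P^t."""
    n = len(P)
    walk = [1.0 / n] * n   # row vector (1/N) * 1 * P^0
    rank = [0.0] * n
    for t in range(max_len):
        weight = damping(t)
        rank = [r + weight * x for r, x in zip(rank, walk)]
        # Advance one step along the links: row vector times P.
        walk = [sum(walk[i] * P[i][j] for i in range(n)) for j in range(n)]
    return rank


def exponential_damping(alpha):
    """PageRank-style damping: (1 - alpha) * alpha**t."""
    return lambda t: (1.0 - alpha) * alpha ** t


# Hypothetical row-normalized link matrix (every row sums to 1).
P = [[0.0, 0.5, 0.5],
     [1.0, 0.0, 0.0],
     [0.5, 0.5, 0.0]]

scores = functional_rank(P, exponential_damping(0.85), 200)
```

Swapping `exponential_damping` for another function of `t` changes the ranking without touching the loop, which is the sense in which the damping function acts as the parameter.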

291 | Stochastic Models for the Web Graph
- Kumar, Raghavan, et al.
- 2000
Citation Context ...napshots from the Web, including the .uk, .it and .eu.int domains. For comparison, we also considered a synthetic scale-free network produced according to the evolving model described by Kumar et al. =-=[24]-=- (a combination of preferential attachment and random links) with the parameters suggested by Pandurangan et al. [32]. As far as the latter is concerned, in the generated graph the exponents for the p... |

200 | The Indexable web is more than 11.5 billion pages
- Gulli, Signorini
- 2005
Citation Context ... $= (\alpha_1^*)^{\frac{L_1+1}{L_2+1}} \approx (\alpha_1^*)^{\frac{\log(N_1)}{\log(N_2)}}$ An example that can be used in practice is the following: let’s consider a web graph with $N_1 = 11.5 \times 10^9$ pages (the size of the full Web estimated by =-=[16]-=-), and another graph with only $N_2 = 50 \times 10^6$ pages (the size of the Web of a large country); the second graph is roughly 3 orders of magnitude smaller. If it is shown empirically that $\alpha_1^* = 0.85$ is a g... |
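The numbers in this example can be plugged in directly. The sketch below assumes the truncated expression above reads $(\alpha_1^*)^{\log(N_1)/\log(N_2)}$ and that its value is the damping factor suggested for the smaller graph; both readings are assumptions, since the left-hand side is cut off in the excerpt. The function name `rescaled_alpha` is hypothetical.

```python
import math


def rescaled_alpha(alpha_1, n_1, n_2):
    """Evaluate alpha_1 ** (log(n_1) / log(n_2)), the approximation quoted in the excerpt."""
    return alpha_1 ** (math.log(n_1) / math.log(n_2))


# Values taken from the excerpt: full Web vs. the Web of a large country.
alpha_2 = rescaled_alpha(0.85, 11.5e9, 50e6)
print(round(alpha_2, 2))
```

Under these assumptions the value comes out slightly below 0.85, in line with the smaller graph having shorter paths.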

172 | The Diameter of a Scale-Free Random Graph
- Bollobás, Riordan
Citation Context ...; O(log(n) /log(np)) in sparse graphs with p links [12]; 1 + log(n/z1)/log(z2/z1) where z1 is the average indegree, and z2 is the average number of nodes at distance 2 [30]; and O(log(n)/log(log(n))) =-=[9]-=-. We did the following experiment: starting from a node picked at random, we followed the links backwards and counted the number of nodes at different distances. The average distances found, appear to... |
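The experiment described at the end of this excerpt (follow links backwards from a starting node and count the nodes found at each distance) amounts to a breadth-first search on the reversed graph. The sketch below is an illustration under assumptions: the adjacency-dictionary format, the function name, and the 4-node toy graph are all hypothetical and not taken from the paper.

```python
from collections import deque


def backward_distance_counts(out_links, start):
    """Count the nodes found at each backward link distance from `start`."""
    # Invert the graph so that links can be followed backwards.
    in_links = {node: [] for node in out_links}
    for src, targets in out_links.items():
        for dst in targets:
            in_links.setdefault(dst, []).append(src)

    # Plain breadth-first search over incoming links.
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for pred in in_links.get(node, []):
            if pred not in dist:
                dist[pred] = dist[node] + 1
                queue.append(pred)

    counts = {}
    for d in dist.values():
        counts[d] = counts.get(d, 0) + 1
    return counts


# Toy graph: a -> b -> c -> a, and c -> d.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}
counts = backward_distance_counts(graph, "d")
```

Repeating this from several random starting nodes and averaging the distances gives the kind of empirical path-length estimate the excerpt refers to.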

156 | Topical locality in the Web.
- Davison
- 2000
Citation Context ... by following longer paths in the real web graph. This cannot be known exactly, but we can attempt to measure it indirectly. Pages that link to each other are more similar than pages chosen at random =-=[13]-=-; evidence from topical crawlers [34] shows that when doing breadth-first exploring, the topic “drifts” as the distance increases. On the same line of thought, we propose to use the decrease of text s... |

114 | Ranking the web frontier
- Eiron, McCurley, et al.
- 2004
Citation Context ... filled with zeroes. Dangling nodes can be dealt with by adding an extra node that is linked to and from all other nodes, or by introducing new arcs from each dangling node to every node in the graph =-=[14]-=-. In our analysis, we shall assume that all dangling nodes have been eliminated already in some way, so that we do not have to worry about their presence. All the algorithms we will present can be mod... |
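One of the two fixes mentioned above, adding arcs from each dangling node to every node, can be sketched as a preprocessing step. This is a minimal illustration under assumptions: the dense 0/1 adjacency lists, the function name `transition_matrix`, and the uniform row normalization (each of a page's d out-links gets weight 1/d) are choices made here for the example.

```python
def transition_matrix(adjacency):
    """Row-normalize a 0/1 adjacency matrix, patching dangling nodes first."""
    n = len(adjacency)
    P = []
    for row in adjacency:
        if sum(row) == 0:
            # Dangling node: pretend it links to every node in the graph.
            row = [1] * n
        d = sum(row)
        # Uniform normalization: each out-link gets weight 1/d.
        P.append([x / d for x in row])
    return P


# Hypothetical 3-node graph in which node 1 has no out-links.
A = [[0, 1, 1],
     [0, 0, 0],
     [1, 0, 0]]
P = transition_matrix(A)
```

After this step every row of `P` sums to 1, which is the "dangling nodes have been eliminated" assumption the excerpt relies on.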

114 | Using pagerank to characterize web structure.
- Pandurangan, Raghavan, et al.
- 2006
Citation Context ...le-free network produced according to the evolving model described by Kumar et al. [24] (a combination of preferential attachment and random links) with the parameters suggested by Pandurangan et al. =-=[32]-=-. As far as the latter is concerned, in the generated graph the exponents for the powerlaw in the center part of the distributions are -2.1 for in-degree and PageRank, and -2.7 for out-degree; we gene... |

112 | The quest for correct information of the web: hyper search engines.
- Marchiori
- 1997
Citation Context ...imation. One of the measures of importance of a scientific paper is the number of citations that the article receives. Following this idea, several authors proposed to use links for ranking web pages =-=[28, 21, 25]-=-; however, it quickly became clear that just counting the links was not a very reliable measure of authoritativeness (it was not in scientific citations either), because it is very easy to manipulate ... |

90 | The second eigenvalue of the Google matrix.
- Haveliwala, Kamvar
- 2003
Citation Context ...ive algorithm gives good approximations (both in norm and with respect to the induced node order) in few iterations, even though convergence speed and numerical stability decay when α gets close to 1 =-=[19, 18]-=-. 3.2 Linear damping As an (extreme) alternative to PageRank, let us consider a simple damping function such as: $\mathrm{damping}(t) = \begin{cases} \frac{2(L-t)}{L(L+1)} & t < L \\ 0 & t \geq L \end{cases}$ that is, a damping function that decreases l... |
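The linear damping function quoted above can be checked numerically. The sketch below assumes the piecewise form $2(L-t)/(L(L+1))$ for $t < L$ and 0 otherwise; the choice `L = 10` and the helper name `linear_damping` are illustrative. Exact fractions are used so the checks do not depend on floating-point rounding.

```python
from fractions import Fraction


def linear_damping(L):
    """Linear damping: 2(L - t) / (L(L + 1)) for t < L, and 0 for t >= L."""
    def damping(t):
        if t < L:
            return Fraction(2 * (L - t), L * (L + 1))
        return Fraction(0)
    return damping


L = 10
damping = linear_damping(L)
weights = [damping(t) for t in range(L + 2)]

# The weights sum to 1, so paths of length 0..L-1 share the whole mass.
total = sum(weights)

# Ratio between consecutive weights for k = 1 .. L-1.
ratios = [weights[k] / weights[k - 1] for k in range(1, L)]
```

The first weight equals 2/(L+1) and each ratio equals (L−k)/(L−(k−1)), which matches the START and DAMP(k) values quoted in the TotalRank entry further down this page.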

60 | Pagerank as a function of the damping factor.
- Boldi, Santini, et al.
- 2005
Citation Context ...h most of our results can be easily restated with a non-uniform preference vector v, for the sake of clarity we shall only consider the uniform preference 1/N in the rest of the paper. As observed in =-=[15, 8]-=-, the PageRank vector r(α) can be written as: $r(\alpha) = (1-\alpha)\sum_{t=0}^{\infty}\alpha^t\,\frac{1}{N}\mathbf{1}P^t$, or in matricial form: $r(\alpha) = (1-\alpha)\,\frac{1}{N}\mathbf{1}(I-\alpha P)^{-1}$, $\|\alpha P\| < 1$. There is an equivalent, and actually very intriguing... |

60 | Information flow in social groups.
- Wu, Huberman, et al.
- 2004
Citation Context ...s the empirical distribution of text similarity versus distance could be used as an “empirical” damping function. Different measures of text similarity can yield different distributions; for instance =-=[36]-=- uses the number of repeated words and phrases between pages and obtains a faster decrease in similarity. Our results show that a linear damping with L = 8 or L = 9 approximates better text similarity... |

47 | A general evaluation framework for topical crawlers.
- Srinivasan, Menczer, et al.
- 2005
Citation Context ...l web graph. This cannot be known exactly, but we can attempt to measure it indirectly. Pages that link to each other are more similar than pages chosen at random [13]; evidence from topical crawlers =-=[34]-=- shows that when doing breadth-first exploring, the topic “drifts” as the distance increases. On the same line of thought, we propose to use the decrease of text similarity as an approximation to an “... |

44 | Using rank propagation and probabilistic counting for link-based spam detection.
- Becchetti
- 2006
Citation Context ...an to analyze them. In particular, under the assumption that it is easier to “spam” closer links, PageRank damping is more affected by collusion than the rest of the damping functions presented here. In =-=[5]-=- a truncated exponential damping, combined with other link-analysis techniques, is used for spam detection. Acknowledgements: we would like to thank Dániel Fogaras for a valuable discussion about Total... |

42 | Lexical and semantic clustering by web links
- Menczer
- 2004
Citation Context ...irst exploring, the topic “drifts” as the distance increases. On the same line of thought, we propose to use the decrease of text similarity as an approximation to an “empirical” damping function. In =-=[29]-=- it is shown that text similarity and link distance are anticorrelated up to 4-5 links. To find out which is the correlation between link-distance and similarity, we performed the following experiment... |

40 | The diameter of random sparse graphs
- Chung, Lu
- 2001
Citation Context ...erage is not given in their paper). Analytic estimations for the average distance of a graph of scale-free network of n nodes include: O(log(n)) [35]; O(log(n) /log(np)) in sparse graphs with p links =-=[12]-=-; 1 + log(n/z1)/log(z2/z1) where z1 is the average indegree, and z2 is the average number of nodes at distance 2 [30]; and O(log(n)/log(log(n))) [9]. We did the following experiment: starting from a n... |

39 | Toward a qualitative search engine
- Li
- 1998
Citation Context ...imation. One of the measures of importance of a scientific paper is the number of citations that the article receives. Following this idea, several authors proposed to use links for ranking web pages =-=[28, 21, 25]-=-; however, it quickly became clear that just counting the links was not a very reliable measure of authoritativeness (it was not in scientific citations either), because it is very easy to manipulate ... |

33 | Web Page Ranking Using Link Attributes.
- Baeza-Yates, Davis
- 2004
Citation Context ...ation level at which they appear in Web sites [27], or if they are at the beginning or the bottom of individual pages, or inside a certain HTML element, can also be used for non-uniform normalization =-=[3]-=-. To simplify our treatment, we will assume uniform normalization, so if a page has d out-links, each of those links has a weight of 1/d, but the results of this paper can be applied to other forms of... |

31 |
- Albert, Jeong, Barabási
- 2000
Citation Context ...eights between LinearRank and PageRank, for various parameter combinations. 5.1 Characteristic path lengths In scale-free networks, the distances between pairs of nodes follow a Gaussian distribution =-=[1]-=- (the average is not given in their paper). Analytic estimations for the average distance of a graph of scale-free network of n nodes include: O(log(n)) [35]; O(log(n) /log(np)) in sparse graphs with ... |

29 | Local methods for estimating pagerank values.
- Chen, Gan, et al.
- 2004
Citation Context ...mping function decays exponentially: $\mathrm{damping}(t) = (1-\alpha)\alpha^t$. Since longer paths have less importance in the calculation of PageRank, it could be approximated by using only a few levels of links. In =-=[11]-=-, it is shown that by using only the nodes at distance 1 from a target node (equivalent to linear damping with L = 2), PageRank values can be approximated with 30% of average error. Using nodes at dis... |

20 | PageRank revisited.
- Brinkmeier
- 2006
Citation Context ...l form: $r(\alpha) = (1-\alpha)\,\frac{1}{N}\mathbf{1}(I-\alpha P)^{-1}$, $\|\alpha P\| < 1$. There is an equivalent, and actually very intriguing way of rewriting this formula, mentioned in [30] that leads to a conclusion similar to those of =-=[10]-=-: given a path, that is, a sequence of edges in the graph $p = \langle x_1, x_2, \ldots, x_k\rangle$, such that node $x_i$ is connected to node $x_{i+1}$, we define its branching contribution as follows $\mathrm{branching}(p) = \frac{1}{d_1 d_2 \cdots d_{k-1}}$... |

16 | Totalrank: ranking without damping
- Boldi
- 2005
Citation Context ...gorithm shown in Figure 1 can be used, with: START: $2v[i]/(L+1)$; STOP: $k = L$; DAMP(k): $(L-k)/(L-(k-1))$. 3.3 Quadratic hyperbolic damping: TotalRank Recently, a ranking method called TotalRank =-=[7]-=- has been proposed. The method aims at eliminating the necessity for an arbitrary parameter by integrating PageRank over the entire range of α. If r(α) is the vector of PageRank, then TotalRank is def... |

15 | Where to Start Browsing the Web?
- Fogaras
- 2003
Citation Context ...h most of our results can be easily restated with a non-uniform preference vector v, for the sake of clarity we shall only consider the uniform preference 1/N in the rest of the paper. As observed in =-=[15, 8]-=-, the PageRank vector r(α) can be written as: $r(\alpha) = (1-\alpha)\sum_{t=0}^{\infty}\alpha^t\,\frac{1}{N}\mathbf{1}P^t$, or in matricial form: $r(\alpha) = (1-\alpha)\,\frac{1}{N}\mathbf{1}(I-\alpha P)^{-1}$, $\|\alpha P\| < 1$. There is an equivalent, and actually very intriguing... |

15 | The condition number of the pagerank problem
- Haveliwala, Kamvar
- 2003
Citation Context ...ive algorithm gives good approximations (both in norm and with respect to the induced node order) in few iterations, even though convergence speed and numerical stability decay when α gets close to 1 =-=[19, 18]-=-. 3.2 Linear damping As an (extreme) alternative to PageRank, let us consider a simple damping function such as: $\mathrm{damping}(t) = \begin{cases} \frac{2(L-t)}{L(L+1)} & t < L \\ 0 & t \geq L \end{cases}$ that is, a damping function that decreases l... |

13 | Voting model for ranking Web pages
- Lifantsev
- 2000
Citation Context ...s: Normalization. In the Web, creating an out-link is free, so there is an incentive for web page authors to create pages with many out-links; this is the reason why a metaphor of “voting” is enforced =-=[26]-=- in which each page has only one “vote” that has to be split among its linked pages. This is typically done in link-based ranking by normalizing A row-wise: the normalization process means that every ... |

8 | Page and link classifications: connecting diverse resources
- Haas, Grams
- 1998
Citation Context ...link the same value, as there is evidence that web links have different purposes such as navigating in a multi-page set, expanding the contents of the current page, pointing to another resource, etc. =-=[17]-=-. Also, links within the same site can be considered self-links and as such do not confer as much authority as a link between different sites; indeed, there are ranking methods like BHITS [6] that tre... |

7 | Webpage importance analysis using conditional markov random walk
- Liu, Ma
- 2005
Citation Context ... between different sites; indeed, there are ranking methods like BHITS [6] that treat them differently. Other characteristics of links, such as the exploration level at which they appear in Web sites =-=[27]-=-, or if they are at the beginning or the bottom of individual pages, or inside a certain HTML element, can also be used for non-uniform normalization [3]. To simplify our treatment, we will assume uni... |

3 | Improving retrieval effectiveness with hyperlink information
- Joo, Myaeng
- 1998
Citation Context ...imation. One of the measures of importance of a scientific paper is the number of citations that the article receives. Following this idea, several authors proposed to use links for ranking web pages =-=[28, 21, 25]-=-; however, it quickly became clear that just counting the links was not a very reliable measure of authoritativeness (it was not in scientific citations either), because it is very easy to manipulate ... |