SpamRank - Fully Automatic Link Spam Detection (2005)  (4 citations)
Andras A. Benczur, Karoly Csalogany, Tamas Sarlos, Mate Uher



View or download:
lehigh.edu/2005/benczur.pdf

From:  lehigh.edu/program


Abstract: Spammers intend to increase the PageRank of certain spam pages by creating a large number of links pointing to them. We propose a novel method based on the concept of personalized PageRank that detects pages with an undeserved high PageRank value without the need of any kind of white or blacklists or other means of human intervention. We assume that spammed pages have a biased distribution of pages that contribute to the undeserved high PageRank value. We define SpamRank by penalizing pages ...
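
The abstract names personalized PageRank as the underlying concept. The block below is a minimal sketch of that general technique only (power iteration with a custom teleport distribution), not the SpamRank definition itself, which is truncated in the abstract. The function name, the toy graph, the damping factor of 0.85, and the iteration count are all illustrative assumptions.

```python
def personalized_pagerank(links, teleport, damping=0.85, iterations=50):
    """Power iteration for PageRank with a custom teleport distribution.

    links: dict mapping each page to the list of pages it links to
           (every page must appear as a key).
    teleport: dict mapping pages to teleport probabilities summing to 1.
    """
    pages = list(links)
    rank = {p: teleport.get(p, 0.0) for p in pages}
    for _ in range(iterations):
        # Every page starts each round with its share of the teleport mass.
        nxt = {p: (1.0 - damping) * teleport.get(p, 0.0) for p in pages}
        for p in pages:
            out = links[p]
            if out:
                share = damping * rank[p] / len(out)
                for q in out:
                    nxt[q] += share
            else:
                # Dangling page: return its mass to the teleport distribution.
                for q, w in teleport.items():
                    nxt[q] += damping * rank[p] * w
        rank = nxt
    return rank


# Hypothetical toy graph: "s1".."s3" all link to "t" (a link-farm shape),
# while "a" and "b" link to each other and "a" also links to "t".
toy_links = {
    "s1": ["t"], "s2": ["t"], "s3": ["t"],
    "a": ["b", "t"], "b": ["a"], "t": [],
}
uniform = {p: 1.0 / len(toy_links) for p in toy_links}
scores = personalized_pagerank(toy_links, uniform)      # ordinary PageRank
from_a = personalized_pagerank(toy_links, {"a": 1.0})   # personalized to "a"
```

Comparing `scores` with single-source vectors such as `from_a` gives a rough view of which pages contribute to a target's rank, which is the kind of contributor distribution the abstract refers to.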

Cited by:
Link-Based Characterization and Detection of Web Spam - Becchetti, Castillo.. (2006)
Propagating Trust and Distrust to Demote Web Spam - Baoning Wu Vinay (2006)
Using Rank Propagation and Probabilistic Counting.. - Becchetti.. (2006)

Active bibliography (related documents):
4.3:   SpamRank - Fully Automatic Link Spam Detection - Benczur, Csalogany, Sarlos, Uher (2005)
0.7:   Undue Influence: Eliminating the Impact of Link Plagiarism on.. - Wu, Davison (2006)
0.7:   Site Level Noise Removal for Search Engines - Carvalho, Chirita, de Moura.. (2006)

Similar documents based on text:
0.4:   Parallel and Fast Sequential Algorithms for Undirected Edge.. - Benczur (1999)
0.2:   Scientific Committee: - Miklós Bartha Tibor
0.2:   Conference of Phd Students in Computer Science - Csendes (2002)

Related documents from co-citation:
4:   Combating web spam with trustrank - Gyongyi, Garcia-Molina et al. - 2004
4:   and statistics: using statistical analysis to locate spam web pages - Fetterly, Manasse et al. - 2004
4:   The PageRank citation ranking: Bringing order to the Web - Page, Brin et al.

BibTeX entry:

A. A. Benczur, K. Csalogany, T. Sarlos, and M. Uher. SpamRank - fully automatic link spam detection. In First International Workshop on Adversarial Information Retrieval on the Web, 2005. http://citeseer.ist.psu.edu/article/benczur05spamrank.html

@inproceedings{benczur05spamrank,
  author = "A. A. Benczur and K. Csalogany and T. Sarlos and M. Uher",
  title = "{SpamRank} - Fully Automatic Link Spam Detection",
  booktitle = "First International Workshop on Adversarial Information Retrieval on the Web",
  year = "2005",
  url = "http://citeseer.ist.psu.edu/article/benczur05spamrank.html" }
Citations (may not include all citations):
641   The anatomy of a large-scale hypertextual Web search engine - Brin, Page - 1998
576   Authoritative sources in a hyperlinked environment - Kleinberg - 1999
344   The PageRank citation ranking: Bringing order to the web - Page, Brin et al. - 1998
163   Improved algorithms for topic distillation in a hyperlinked .. - Bharat, Henzinger - 1998
97   Assessing agreement on classification tasks: the kappa stati.. - Carletta - 1996
61   Scaling personalized web search - Jeh, Widom - 2003
59   Stochastic models for the web graph - Kumar, Raghavan et al. - 2000
49   Mining the Web's link structure - Chakrabarti, Dom et al. - 1999
44   Rank aggregation methods for the web - Dwork, Kumar et al. - 2001
35   A large-scale study of the evolution of web pages - Fetterly, Manasse et al. - 2003
32   The stochastic approach for link-structure analysis - Lempel, Moran - 2000
31   Finding authorities and hubs from link structures on the wor.. - Borodin, Roberts et al. - 2001
30   Design and implementation of a high-performance distributed .. - Suel, Shkapenyuk - 2002
26   Scale-free characteristics of random networks: the topology .. - Barabási, Albert et al. - 2000
21   Using PageRank to Characterize Web Structure - Pandurangan, Raghavan et al. - 2002
19   Ranking the web frontier - Eiron, McCurley et al. - 2004
17   Deeper inside PageRank - Langville, Meyer - 2004
16   Web search using automatic classification - Chekuri, Goldwasser et al. - 1997
14   Recognizing nepotistic links on the web - Davison - 2000
14   and statistics -- Using statistical analysis to locate spam .. - Fetterly, Manasse et al. - 2004
11   Web spam taxonomy - Gyöngyi, Garcia-Molina - 2005
11   The Connectivity Sonar: Detecting site functionality by stru.. - Amitay, Carmel et al. - 2003
11   Shilling recommender systems for fun and profit - Lam, Riedl - 2004
10   HITS and a unified framework for link analysis - Ding, He et al. - 2002
10   Making eigenvector-based reputation systems robust to collus.. - Zhang, Goel et al. - 2004
8   ACM Transactions on Internet Technology - Bianchini, Gori et al. - 2005
7   Online at http://www - Perkins, The et al. - 2001
6   Towards scaling fully personalized PageRank - Fogaras, Rácz - 2004
6   Downweighting tightly knit communities in world wide web ran.. - Roberts, Rosenthal - 2003
4   Algorithms and experiments for the Web graph - Laura, Leonardi et al. - 2003
4   PageRank increase under different collusion topologies - Baeza-Yates, Castillo et al. - 2005
3   Combating web spam with TrustRank - Gyöngyi, Garcia-Molina et al. - 2004
3   Where to start browsing the web - Fogaras - 2003
2   Identifying link farm pages - Wu, Davison - 2005
2   Challenges in running a commercial search engine - Singhal - 2004
2   Challenges in web search engines - Henzinger, Motwani et al. - 2002
http://en.pr10.info/pagerank0-badrank/
http://www.wordspy.com/words/spamdexing.asp

Documents on the same site (http://airweb.cse.lehigh.edu/program.html):
Blocking Blog Spam with Language Model Disagreement - Mishne, Carmel, Lempel (2005)
Cloaking and Redirection: A Preliminary Study - Wu, Davison (2005)
Web Spam, Propaganda and Trust - Metaxas, DeStefano (2005)


CiteSeer.IST - Copyright Penn State and NEC