| G. Flake, S. Lawrence, and C. L. Giles. Efficient identification of web communities. In Sixth ACM SIGKDD Conference, pages 150--160, Boston, MA, August 2000. |
....first facet, discovery of online communities, can take the form of finding a community where none (seemingly) exists, that is, bringing together people into a community; or it may be in unearthing a community that does exist, an extension of the information retrieval problem. Flake et al. [25] define a community on the web as a set of web pages that link to more pages in the community than to pages outside of the community. A graph theory algorithm, maximum flow minimum cut, is applied to determine web communities that have not been formally identified. Their work uses only the ....
Flake, G.W., Lawrence, S. and Giles, C.L., Efficient Identification of Web Communities. in KDD, (Boston, MA, 2000), ACM, 150-160.
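The community definition quoted in the context above (pages that link to more pages inside the community than outside it) can be checked directly on a small link graph. The following is a minimal sketch, not the cited authors' code. It assumes the condition is applied per page, with a strict inequality, over outgoing links only; the function name `is_community` and the toy graph are hypothetical.

```python
def is_community(links, members):
    """Return True if every member links to strictly more pages
    inside `members` than outside (an assumed reading of the definition)."""
    members = set(members)
    for page in members:
        targets = links.get(page, set())
        inside = len(targets & members)
        outside = len(targets - members)
        if inside <= outside:
            return False
    return True


# Hypothetical toy link graph: page -> set of pages it links to.
toy_links = {
    "a": {"b", "c"},
    "b": {"a", "c", "x"},
    "c": {"a", "b"},
    "x": {"y"},
    "y": {"x"},
}

print(is_community(toy_links, {"a", "b", "c"}))  # every member links mostly inward
print(is_community(toy_links, {"a", "x"}))       # "a" links only outward
```

A check like this only verifies a candidate set; finding such a set is the harder problem that the cited work addresses with maximum flow.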
....and adaptation to the environment. Computer networks or distributed systems in general may be regarded as communities similar to the above examples. Most obviously, the Internet or Web forms entities that can be characterized as communities. Many approaches to define communities on the Web [6, 8, 10] are based on the use of existing link patterns and they therefore lack the characteristic community properties to adapt to the current context and to dynamically evolve. Implicit information [9, 13] other than link patterns are necessary to achieve this. A number of applications have been ....
Gary Flake, Steve Lawrence, and C. Lee Giles. Efficient identification of web communities. In Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 150--160, Boston, MA, August 20--23 2000.
....$G = (V, E)$ which is directed, we define matrix $A$ to be: $A_{ij} = 1$ if $(i, j) \in E$ or $(j, i) \in E$. $A$ is an adjacency matrix of the graph. Link structure alone provides us with rich information on the topic. By exploring the link structure, we are able to extract useful information from the web [5, 17, 6, 16, 21, 24, 22]. One of the most popular algorithms to retrieve information from the link structure is credited to Jon Kleinberg. We will briefly discuss his HITS algorithm in Appendix A. 5.2. Textual information. It is known that clustering documents based entirely on text information is not effective in ....
G. W. Flake, S. Lawrence, and C. L. Giles. Efficient identification of web communities. KDD, 2000.
.... features are ranked using query independent link analysis [4]. Links are also used in conjunction with text to identify hub and authority pages for a certain subject [17], guide search agents crawling on behalf of users or topical search engines [25, 26, 5, 27, 32], and identify Web communities [13, 18, 10, 11]. The hidden assumption behind all of these retrieval, ranking and crawling algorithms that use link analysis to make semantic inferences is a correlation between the graph topology of the Web and the meaning of pages, or more precisely the conjecture that one can infer what a page is about by ....
G. Flake, S. Lawrence, and C. Giles. Efficient identification of Web communities. In Proc. 6th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 150--160, Boston, MA, August 20--23 2000.
.... for such an image to be produced, and for such a scientific discovery to be claimed and substantiated [31, 35]. An alternative scheme for automatically discovering the membership of a Web-based community uses graph traversal (crawling linked web pages) and s-t maximum flow network algorithms [13]. .... occasion to author and publish technical reports or scholarly research papers about their software development efforts, which are publicly available for subsequent examination, review, and secondary analysis. Each of ....
FLAKE, G.W., LAWRENCE, S., and GILES, C.L.: 'Efficient Identification of Web Communities', Proc. Sixth Intern. Conf. Knowledge Discovery and Data Mining, (ACM SIGKDD-2000), Boston, MA, pp. 150-160, August 2000.
....in a DBG(F,C,p,q) each node in F is allowed to form an edge with a few other nodes of C, in a similar manner as a member in a community forms a relationship with a few other members. This differs from a CBG(F,C,p,q) in which each node in F is forced to form an edge with all the nodes of F. In [28], given a set of the crawled pages on some topic, the problem of detecting a community is abstracted to a maximum flow minimum cut framework, where the source is composed of known members and the sink consists of well-known non-members. Given the set of pages on some topic, a community is defined ....
G.W.Flake, Steve Lawrence, C.Lee Giles, Efficient identification of web communities, in proc. of 6th ACM SIGKDD, August 2000, pp.150-160.
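The flow-based abstraction described in the context above (known members as the source, known non-members as the sink) can be illustrated with a small minimum-cut computation. This is a simplified sketch under stated assumptions, not the algorithm of the cited paper: links are treated as undirected with unit capacity, the seed and non-member sets are assumed disjoint, a virtual source and sink are attached with infinite capacity, and the maximum flow is found with a plain Edmonds-Karp search. The function name `min_cut_community` and the toy graph are hypothetical.

```python
from collections import defaultdict, deque


def min_cut_community(edges, seeds, non_members):
    """Return the nodes on the source side of a minimum cut.

    edges: iterable of (u, v) links, treated as undirected, capacity 1 each.
    seeds: known members, attached to a virtual source.
    non_members: known non-members, attached to a virtual sink.
    """
    src, snk = object(), object()
    inf = float("inf")
    cap = defaultdict(lambda: defaultdict(int))  # residual capacities
    for u, v in edges:
        cap[u][v] += 1
        cap[v][u] += 1
    for s in seeds:
        cap[src][s] = inf
    for t in non_members:
        cap[t][snk] = inf

    def reachable_from_source():
        """Breadth-first search over edges with spare residual capacity."""
        parent = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable_from_source()
        if snk not in parent:
            # No augmenting path left: reachable nodes form the source side.
            return {n for n in parent if n is not src}
        # Find the bottleneck capacity along the path found.
        bottleneck, v = inf, snk
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        # Push flow along the path and update residual capacities.
        v = snk
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u


# Hypothetical toy graph: two triangles joined by the single link c-x.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("c", "x"),
             ("x", "y"), ("y", "z"), ("x", "z")]
print(min_cut_community(toy_edges, seeds={"a"}, non_members={"z"}))
```

In the toy graph the cheapest cut is the single bridge between the two triangles, so the pages grouped with the seed are the seed's own triangle.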
....reasonably large collection of pages, there is no guarantee that each community formation is reflected as a CBG core. Also, it rarely happens that a page creator puts links to all the pages of interest in a particular domain, because a data set may not contain the potential pages to form a CBG. In [27], given a set of crawled pages on some topic, the problem of detecting a community is abstracted to a maximum flow minimum cut framework, where the source is composed of known members and the sink consists of well-known non-members. Given the set of pages on some topic, a community is defined as ....
G.W.Flake, Steve Lawrence, C.Lee Giles, Efficient identification of web communities, in proc. of 6th ACM SIGKDD, August 2000, pp.150-160.
....We then perform an iterative pruning technique to extract a DBG structure. The complexity of the proposed approach is linear, as the amount of computation time to find all communities increases linearly with the number of pages in a page collection. Further, it can be easily parallelized. Flow based approach: In [12], given a set of crawled pages on some topic, the problem of detecting a community is abstracted to a maximum flow minimum cut framework, where the source is composed of known members and the sink consists of well-known non-members. Given the set of pages on some topic, a community is defined as ....
G.W.Flake, Steve Lawrence, C.Lee Giles, Efficient identification of web communities, The Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2000, pp.150-160.
....INTRODUCTION numerous web communities on the Web. A web community is a collection of web pages created by individuals or any kind of associations with a common interest on a specific topic, such as fan pages of a baseball team, and official pages of computer vendors. Some link analysis techniques [8, 5, 12, 10, 7] consider the Web as a graph, which nodes are web pages and edges are hyperlinks, and automatically identify such web communities by extracting distinctive graph structures. Web communities slightly differ from real communities. That is, web communities may consist of competitors or authors who do ....
....related pages, and improved the precision by exploiting link weighting and the order of links in a page. Companion first builds a subgraph of the Web near the seed, and extracts authorities and hubs in the graph using HITS. Then authorities are returned as related pages. In addition, Flake et al. [7] redefined a community including given seed pages as a subgraph that is separated from the Web using a maximum flow minimum cut framework. These techniques can automatically identify individual communities, however, have not concerned the relationship between communities. To build the web ....
Gary W. Flake, Steve Lawrence, and C. Lee Giles. Efficient Identification of Web Communities. In Proceedings of KDD 2000, 2000.
....web page as a node and link as an edge between two nodes). Given a large collection of pages, the trawling algorithm extracts community cores by extracting all the potential CBGs. The proposed approach is different in that we use DBGs to extract and relate the potential communities. In [9], given a set of crawled pages on some topic, the problem of detecting a community is abstracted to a maximum flow minimum cut framework, where the source is composed of known members and the sink consists of well-known non-members. Given the set of pages on some topic, a community is defined as ....
G.W.Flake, Steve Lawrence, C.Lee Giles, Efficient identification of web communities, The Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2000, pp.150-160.
....Given a large collection of pages, the trawling algorithm extracts community cores by extracting all the potential CBGs. In this paper we relax the criteria of existence of a community by defining a DBG structure. Also, the DBG abstraction is extended to relate the extracted communities. In [12], given a set of crawled pages on some topic, the problem of detecting a community is abstracted to a maximum flow minimum cut framework, where the source is composed of known members and the sink consists of well-known non-members. Given the set of pages on some topic, a community is defined as ....
G.W.Flake, Steve Lawrence, C.Lee Giles, Efficient identification of web communities, in Proc. 6th ACM SIGKDD, August 2000, pp.150-160.
....We propose a method for observing evolution of web communities. A web community is a collection of web pages created by individuals or associations with a common interest on a topic, such as fan pages of a baseball team, and official pages of computer vendors. Recent research on link analysis [2, 5, 4, 1] shows that we can identify a web community on a topic from densely connected structure of the web graph, in which nodes are web pages and edges are hyperlinks. Although these techniques can automatically identify communities, they have not considered evolution of communities yet. Since a web ....
....in 1999, 2000, and 2001. This system first extracts whole web communities and their relevances from each archive. This is based on our previous work of web community chart [6] that is a graph of communities, in which relevant communities are connected by edges. Compared with prior work such as [2, 5, 4, 1], the main advantage of our community chart is existence of relevances between communities. We can navigate through related communities, and can locate evolution around a particular community. Finally, our system provides an evolution viewer that allows the user to extract when and how communities ....
G. W. Flake, S. Lawrence, and C. L. Giles. Efficient Identification of Web Communities. In Proceedings of KDD 2000.
....a community centered on a given topic of interest: the pages in the first set are pages of fans while pages in the second one are pages of stars. See Figure 2. Another definition has been recently introduced in which a community is a set of pages which have more links inside the set than outside [FLG00]. Some other kinds of local structures have been discovered in the Web graph, and studied from this point of view [ERC 00]. [Fig. 2: A structure which often appears in the Web graph, and which is interpreted as a set of fans having links to their stars.] At a macroscopic level, ....
Gary Flake, Steve Lawrence, and C. Lee Giles. Efficient identification of web communities. In Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 150--160, Boston, MA, August 20--23 2000.
....are pointed at by other users, our friends and neighbors on the web. These links can represent anything from friendship, to collaboration, to general interest in the material on the other user's homepage. In this way individual homepages become part of a large community structure. Recent work [6, 7, 10] has attempted to use analysis of link topology to find web communities. These web communities are web page collections with a shared topic. For example, any page dealing with data mining and linking to other pages on the same topic would be part of the data mining page collection. ....
G. Flake, S. Lawrence, and C. Lee Giles. "Efficient identification of web communities". In Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA, August 20-23 2000, pp.150-160.
....generalized to the most general level. .... The relevant pages we have found contain a community, that is, a subset of web pages that are more closely linked to each other than to pages not belonging to this subset [FLG00]. Note that only the first criterion, which is not realistic to be met, and the last one, which is very costly to test, are nonparametric. All the other criteria require at least one parameter which, in general, must be specified by the user based on his domain knowledge. 4 Implementation The ....
Flake G.W., Lawrence S., Giles C.L.: "Efficient Identification of Web Communities", Proc. Int. Conf. on Knowledge Discovery and Data Mining (KDD `00), 2000, pp. 150-160.
....free floating in the web, but point to and are pointed at by other users. These links can represent anything from friendship, to collaboration, to general interest in the material on the other user's homepage. In this way individual homepages become part of a large community structure. Recent work [6, 7, 9] has attempted to find communities of web pages by performing analysis on their graph structure. That is, given a graph, this method extracts clusters of users in the same community. Rather than attempting to extract communities, in our research we attempt to gain an understanding of the ....
G. Flake, S. Lawrence, and C. Lee Giles. "Efficient identification of web communities". In Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA, August 20-23 2000, pp.150-160.
G. W. Flake, S. Lawrence, and C. L. Giles. Efficient identification of web communities. In Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD-2000).
....to other documents (through hyperlinks). Hyperlinks are increasingly being used to improve the ability to organize, search and analyze the web. Hyperlinks (or citations) are being actively used to improve web search engine ranking [4], improve web crawlers [6], discover web communities [8], organize search results into hubs and authorities [13], make predictions about similarity between research papers [16], and even to classify target web pages [20, 9, 2, 5, 3]. The basic assumption made by citation or link analysis is that a link is often created because of a subjective connection ....
Gary W. Flake, Steve Lawrence, and C. Lee Giles. Efficient identification of web communities. In Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD-2000).
....of the two classes. In the nonlinear case, the margin of the classifier is maximized in the kernel function space, which results in a nonlinear classification boundary. Some research has focused on using hyperlinks, in addition to text and HTML, as a means of clustering or classifying web pages [3, 6]. Our work assumes the need to determine the class of a page based solely on its raw contents, and does not have access to the inbound links for that page. QUIP(A, B, SE, Q, n) INPUT: Training examples A(pos) and B(neg), Set of search engines SE, test queries Q, The number of results to ....
Gary W. Flake, Steve Lawrence, and C. Lee Giles. Efficient identification of web communities. In Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD-2000).
....Eric J. Glover, Kostas Tsioutsiouliklis, Steve Lawrence, David M. Pennock, Gary W. Flake. {compuman, kt, lawrence, dpennock, flake}@research.nj.nec.com, kt@cs.princeton.edu. NEC Research Institute, 4 Independence Way, Princeton, NJ 08540. Computer Science Department, Princeton University, Princeton, NJ 08540. ABSTRACT The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents (documents that link to the target document) for search. We analyze the relative ....
....Tsioutsiouliklis, Steve Lawrence, David M. Pennock, Gary W. Flake. {compuman, kt, lawrence, dpennock, flake}@research.nj.nec.com, kt@cs.princeton.edu. NEC Research Institute, 4 Independence Way, Princeton, NJ 08540. Computer Science Department, Princeton University, Princeton, NJ 08540. ABSTRACT The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents (documents that link to the target document) for search. We analyze the relative utility of document ....
G.W. Flake, S. Lawrence, and C. L. Giles. Efficient identification of web communities. In Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD-2000).
....of the two classes. In the nonlinear case, the margin of the classifier is maximized in the kernel function space, which results in a nonlinear classification boundary. Some research has focused on using hyperlinks, in addition to text and HTML, as a means of clustering or classifying web pages [3, 6]. Our work assumes the need to determine the class of a page based solely on its raw contents, and does not have access to the inbound links for that page. QUIP(A, B, SE, Q, n) INPUT: Training examples A(pos) and B(neg) Set of search engines SE, test queries Q The number of results to consider ....
Gary W. Flake, Steve Lawrence, and C. Lee Giles. Efficient identification of web communities. In Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD-2000), Boston, MA, 2000. ACM Press.
G. Flake, S. Lawrence, and C. L. Giles. Efficient identification of web communities. In Sixth ACM SIGKDD Conference, pages 150--160, Boston, MA, August 2000.
Gary William Flake, Steve Lawrence, C. Lee Giles, Efficient Identification of Web Communities, in Proceedings of the 6th International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD-2000).
G. W. Flake, S. Lawrence, and C. L. Giles. Efficient identification of web communities. In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 150--160, New York, NY, USA, 2000. ACM Press.