Ming-Syan Chen, Jong Soo Park, and Philip S. Yu. Data mining for path traversal patterns in a web environment. In ICDCS, pages 385--392, 1996.
....tied to specific Web site properties that are not found in general. Academic and commercial tools for parsing, cleaning, and sessionizing Web server logs are abundant [CTS99, SF98, HLC99, ACI99], as are data mining algorithms for discovering patterns or trends from a clean set of usage data [AS94, CPY96, GS99]. One reason for the limited success has been a component of Web Usage Mining that is often overlooked: the need to understand a Web site's content and structure. The processing and quantification of a Web site's content and structure for all but completely static and single frame Web sites ....
M.S. Chen, J.S. Park, and P.S. Yu. Data mining for path traversal patterns in a web environment. In 16th International Conference on Distributed Computing Systems, pages 385-392, 1996.
....on support. [Figure 1: An Example Tree with Subtrees] Example 1. Consider Figure 1, which shows an example tree T with node labels drawn from the set L = {1, 2, 3}. The figure shows for each ....
.... [Figure 1: An Example Tree with Subtrees] Example 1. Consider Figure 1, which shows an example tree T with node labels drawn from the set L = {1, 2, 3}. The figure shows for each node, its label (circled), its number according to depth first ....
M.S. Chen, J.S. Park, and P.S. Yu. Data mining for path traversal patterns in a web environment. In International Conference on Distributed Computing Systems, 1996.
....dissertation we approach the problem for trees. This approach consists in reducing the access cost of the Web site by reducing the access cost of its leaves. Now, how do we determine which pages are leaf pages? Some authors use maximal forward paths in their approach to improve Web sites, e.g. [14], [52] and [7]. A forward path is a sequence of pages visited by a single user in a single session until the user goes back to a previously visited page in the same session. We may use maximal forward paths to determine which pages are leaf pages. A leaf page would be a page that is at the end of ....
Ming-Syan Chen, Jong Soo Park, and Philip S. Yu. Data mining for path traversal patterns in a Web environment. In Sixteenth International Conference on Distributed Computing Systems, pages 385--392, 1996.
.... length upper bound associated with them [8]. Another similar approach is to first divide the time dimension of a client sequence into equal length time intervals which are no larger than a specified parameter, then to group all Web requests within each time interval together to form a transaction [6]. We call this approach window, which assumes that interesting patterns happen within a certain time duration. window can produce variable-length subsequences, but still they do not make semantic meanings. A popular approach is to cut a client sequence into subsequences on some idle points ....
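The window approach described in this excerpt can be sketched as follows. This is a minimal illustration, not the cited authors' code: the function name `window_transactions`, the `(timestamp, page)` input format, and the choice to measure intervals from the first request are all assumptions made for the example.

```python
from itertools import groupby


def window_transactions(requests, window):
    """Group time-ordered (timestamp, page) requests into fixed-length windows.

    `requests` is assumed to be sorted by timestamp; `window` is the interval
    length in the same time unit. Both names are hypothetical.
    """
    if not requests:
        return []
    start = requests[0][0]
    # Tag every request with the index of the interval it falls into.
    keyed = [(int((t - start) // window), page) for t, page in requests]
    # Consecutive requests sharing an interval index form one transaction.
    return [[page for _, page in group]
            for _, group in groupby(keyed, key=lambda kp: kp[0])]


clicks = [(0, "A"), (5, "B"), (12, "C"), (31, "D")]
print(window_transactions(clicks, window=10))
```

Empty intervals produce no transaction in this sketch, so the number of transactions can be smaller than the number of intervals spanned by the session.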
....obtained by analyzing the density distribution of time interval between page references. The log preprocessing approach used in [13, 25] is to first obtain a set of transactions using timeout. Then by using a moving window of length n, they derive rules from the set of transactions. Chen et al. [6] proposed an approach that identified user transactions as maximal forward references. We call it forward. The same idea is applied in [7, 16]. A forward reference is defined to be the one that references a page that is not already visited in the current transaction. A backward reference is ....
Ming-Syan Chen, Jong Soo Park, and Philip S. Yu. Data mining for path traversal patterns in a web environment. In Proceedings of the 16th International Conference on Distributed Computing Systems, pages 385--392, May 1996.
....are not designed for very high traffic Web servers, and usually provide little analysis of data relationships among accessed files, which is essential to fully utilizing the data gathered in the server logs. The concept of applying data mining techniques to Web server logs was first proposed in [6], [16] and [29]. Mannila et al. [16] use page accesses from a Web server log as events for discovering frequent episodes [17]. Chen et al. [6] introduce the concept of using the maximal forward references in order to break down user sessions into transactions for the mining of traversal ....
....to fully utilizing the data gathered in the server logs. The concept of applying data mining techniques to Web server logs was first proposed in [6], [16] and [29]. Mannila et al. [16] use page accesses from a Web server log as events for discovering frequent episodes [17]. Chen et al. [6] introduce the concept of using the maximal forward references in order to break down user sessions into transactions for the mining of traversal patterns. A maximal forward reference is the last page requested by a user before backtracking occurs, where the user requests a page previously viewed ....
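To make the idea of breaking a session at backtracking points concrete, here is a minimal sketch that splits one session into forward paths, each ending at the last page requested before the user backtracks. It follows the description in the excerpts above rather than the cited paper's own pseudocode; the function name and the single-letter page labels are hypothetical.

```python
def maximal_forward_references(session):
    """Split a page sequence into forward paths that end where backtracking starts."""
    path = []               # current forward path from the session start
    results = []
    moving_forward = False  # True if the previous request extended the path
    for page in session:
        if page in path:
            # Backward reference: the page was already visited on this path.
            if moving_forward:
                results.append(list(path))
            # Cut the path back to the revisited page.
            path = path[: path.index(page) + 1]
            moving_forward = False
        else:
            # Forward reference: a page not yet on the current path.
            path.append(page)
            moving_forward = True
    if moving_forward and path:
        results.append(list(path))
    return results


session = list("ABCDCBEGHGWAOUOV")
for p in maximal_forward_references(session):
    print("".join(p))
```

For this illustrative session the sketch yields the paths ABCD, ABEGH, ABEGW, AOU and AOV; the final page of each path is the maximal forward reference in the sense quoted above.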
M.S. Chen, J.S. Park, and P.S. Yu. Data mining for path traversal patterns in a Web environment. In Proceedings of the 16th International Conference on Distributed Computing Systems, pages 385--392, 1996.
....in the matching clusters. In the context of Web personalization this task involves clustering user transactions identified in the preprocessing stage. Recent work in Web usage mining has focused on the extraction of usage patterns from Web logs for the purpose of deriving marketing intelligence [1 4, 12, 19, 20], as well as the discovery of aggregate profiles for the customization or optimization of Web sites [9, 11, 14, 17, 18]. For an up-to-date survey of Web usage mining systems see [13]. Despite the advantages, usage-based personalization can be problematic when little usage data is available ....
M. S. Chen, J. S. Park, and P. S. Yu. Data mining for path traversal patterns in a Web environment. In Proceedings of 16th International Conference on Distributed Computing Systems, 1996.
....set to have its support larger than mins. We call large page sets those page sets with minimum support condition satisfied. Once all large page sets are obtained, we use them to generate the desired rules. There are proposed various algorithms for solving the mining association rules problem ([1, 2, 4]). For what we want to do it is sufficient to limit ourselves at finding the large page sets. We find suitable for this the algorithms described in [2]. 2.3. How We Synthesize Orientation Pages. Since we choose to mine the mining transactions obtained from content transactions in D, we are entitled to ....
....are obtained, we use them to generate the desired rules. There are proposed various algorithms for solving the mining association rules problem ([1, 2, 4]). For what we want to do it is sufficient to limit ourselves at finding the large page sets. We find suitable for this the algorithms described in [2]. 2.3. How We Synthesize Orientation Pages. Since we choose to mine the mining transactions obtained from content transactions in D, we are entitled to say that we will obtain large content page sets. What signifies, in practice, such a large content page set? Discovering an association rule X ⇒ ....
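A rough sketch of finding large page sets, that is, page sets whose support exceeds mins, is given below. It uses a simple level-wise search in the spirit of Apriori, which is an assumption: the excerpt does not say which algorithm from [2] is meant. Support is taken here as an absolute transaction count, and the names `large_page_sets` and `min_support` are hypothetical.

```python
from itertools import combinations


def large_page_sets(transactions, min_support):
    """Return {page_set: support} for every page set with support > min_support."""
    transactions = [frozenset(t) for t in transactions]

    def support(page_set):
        # Number of transactions containing every page of the set.
        return sum(1 for t in transactions if page_set <= t)

    pages = sorted({p for t in transactions for p in t})
    current = [frozenset([p]) for p in pages if support(frozenset([p])) > min_support]
    result = {}
    size = 1
    while current:
        for s in current:
            result[s] = support(s)
        size += 1
        previous = set(current)
        remaining = sorted({p for s in current for p in s})
        # Keep a candidate only if all of its smaller subsets were large.
        current = [
            c for c in map(frozenset, combinations(remaining, size))
            if all(frozenset(sub) in previous for sub in combinations(c, size - 1))
            and support(c) > min_support
        ]
    return result


visits = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
for page_set, count in large_page_sets(visits, min_support=1).items():
    print(sorted(page_set), count)
```

The candidate generation here enumerates combinations directly, which is fine for a handful of pages but would need the usual join-and-prune step for realistic log sizes.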
Chen M.-S., Park J. S., Yu P.S., Data Mining for Path Traversal Patterns in a Web Environment, In Proc. of the 16th International Conference on Distributed Computing Systems, pp. 385-392, 1996 (http://citeseer.nj.nec.com/128354.html).
....The LOGML database and web graph information can also be used for web characterization, providing detailed statistics on top k pages, addresses, browsers, and so on. It should be noted that association and sequence mining have also been applied to web usage mining in the past. Chen et al. [2] introduced the notion of a maximal forward chain of web pages and gave an algorithm to mine them. The WUM system [9] applies sequence mining to analyze the navigational behavior of users in a web site. WUM also supports an integrated environment for log preparation, querying and visualization. ....
M. Chen, J. Park, and P. Yu. Data mining for path traversal patterns in a web environment. In International Conference on Distributed Computing Systems, 1996.
....The LOGML database and web graph information can also be used for web characterization, providing detailed statistics on top k pages, addresses, browsers, and so on. It should be noted that association and sequence mining have also been applied to web usage mining in the past. Chen et al. [3] introduced the notion of a maximal forward chain of web pages and gave an algorithm to mine them. The WUM system [26] applies sequence mining to analyze the navigational behavior of users in a web site. WUM also supports an integrated environment for log preparation, querying and visualization. ....
M.S. Chen, J.S. Park, and P.S. Yu. Data mining for path traversal patterns in a web environment. In International Conference on Distributed Computing Systems, 1996.
....interesting navigation patterns. The interestingness criteria for navigation patterns are dynamically specified by the human expert using WUM's mining language which supports the specification of statistical, structural and textual criteria. Other related work includes the following: Chen et al. [1] present an algorithm for converting the original sequence of log data into a set of maximal forward references and filtering out the effect of some backward references which are mainly made for ease of traveling. Pei et al. [3] propose a novel data structure, called Web access pattern tree for ....
M.-S. Chen, J. S. Park, and P. S. Yu. Data mining for path traversal patterns in a web environment. In Proc. of the 16th International Conference on Distributed Computing Systems, pages 385--392, May 1996.
....have been applied to a wide range of applications. Projects such as WebSIFT [41, 43, 44, 45], WUM [103, 104], SpeedTracer [108] and Shahabi's work [99] have focused on Web Usage Mining in general, without extensive tailoring of the process towards one of the various sub categories. Chen et al. [35] introduced the concept of a maximal forward reference to characterize user episodes for the mining of traversal patterns. A maximal forward reference is the last page requested before backtracking occurs during a particular server session. The SpeedTracer project [108] from IBM Watson is built ....
....forward reference to characterize user episodes for the mining of traversal patterns. A maximal forward reference is the last page requested before backtracking occurs during a particular server session. The SpeedTracer project [108] from IBM Watson is built on the work originally reported in [35]. In addition to episode identification, SpeedTracer makes use of referrer and agent information in the preprocessing routines to identify users and server sessions in the absence of additional client side information. The Web Utilization Miner (WUM) system [103] provides a robust mining language ....
M.S. Chen, J.S. Park, and P.S. Yu. Data mining for path traversal patterns in a web environment. In 16th International Conference on Distributed Computing Systems, pages 385-392, 1996.
....purchases at the checkout. In this application, an association might be 80% of customers who purchase milk and bread also buy butter. Discovering all such rules can be very useful for planning and marketing. Other applications include spatial data mining [7] and web access patterns discovery [5, 8]. The problem of mining association rules was first investigated in [1]. In this pioneering work, it is shown that mining association rules can be decomposed into two subtasks. First, we need to identify all subsets of items (itemsets) that are contained in a sufficient number of The research is ....
M.-S. Chen, J. S. Park, and P. S. Yu, Data Mining for Path Traversal Patterns in a Web Environment, Proceedings of the 16th International Conference on Distributed Computing Systems, IEEE CS Press, 385-392, 1996.
....site to provide customized content for the users, thereby making it more sticky and enhancing user experience. The business implications of such an ability are huge, especially for portals, personalized content providers and e-tailers. Several techniques have been proposed for this problem [1], [2], [3], [4], but a definitive solution is yet to emerge. The footprint that a webuser generates at a particular website is his cowpath in that site. Identifying the category of a user from his cowpath is a very difficult problem since the cowpath is across multiple webpages. Moreover the time spent ....
M. Chen, J. S. Park, and P. S. Yu. Data mining for path traversal patterns in a web environment. In Proc. 16th Intl. Conf on Distributed Computing Systems, pages 385-392, 1996.
....are not designed for very high traffic Web servers, and usually provide little analysis of data relationships among accessed files, which is essential to fully utilize the data gathered in the server logs. The concept of applying data mining techniques to Web server logs was first proposed in [6, 16, 29]. Mannila et al. [16] use page accesses from a Web server log as events for discovering frequent episodes [17]. Chen et al. [6] introduce the concept of using the maximal forward references in order to break down user sessions into transactions for the mining of traversal patterns. A maximal ....
....which is essential to fully utilize the data gathered in the server logs. The concept of applying data mining techniques to Web server logs was first proposed in [6, 16, 29]. Mannila et al. [16] use page accesses from a Web server log as events for discovering frequent episodes [17]. Chen et al. [6] introduce the concept of using the maximal forward references in order to break down user sessions into transactions for the mining of traversal patterns. A maximal forward reference is the last page requested by a user before backtracking occurs, where the user requests a page previously viewed ....
M.S. Chen, J.S. Park, P.S. Yu. Data mining for path traversal patterns in a Web environment. In: Proc. 16th International Conference on Distributed Computing Systems, 1996, pp. 385--392.
....of analyses will be necessary. Web Usage Mining, which is the application of data mining techniques to large Web data repositories, adds powerful techniques to the tools available to a Web site administrator for analyzing Web site usage. Web Usage Mining techniques developed in [8, 9, 11, 16, 19, 25, 27, 30] have been used to discover frequent itemsets, association rules [5], clusters of similar pages and users, sequential patterns [15], and perform path analysis [9]. Several research efforts [17, 13] have considered usage information for performing Web Content Mining [10]. An overview of some of the ....
....available to a Web site administrator for analyzing Web site usage. Web Usage Mining techniques developed in [8, 9, 11, 16, 19, 25, 27, 30] have been used to discover frequent itemsets, association rules [5], clusters of similar pages and users, sequential patterns [15], and perform path analysis [9]. Several research efforts [17, 13] have considered usage information for performing Web Content Mining [10]. An overview of some of the challenges involved in Web Content Mining is given in [28]. The notion of what makes discovered knowledge interesting has been addressed in [14, 18, 20, 26]. A ....
M.S. Chen, J.S. Park, and P.S. Yu. Data mining for path traversal patterns in a web environment. In 16th International Conference on Distributed Computing Systems, pages 385--392, 1996.
....types of analyses will be necessary. Web Usage Mining, which is the application of data mining techniques to large Web data repositories, adds powerful techniques to the tools available to a Web site administrator for analyzing Web site usage. Web Usage Mining techniques developed in [BM98, CMS99, CPY96, SZAS97, SF98, ZXH98, PE98] have been used to discover frequent itemsets, association rules, clusters of similar pages and users, sequential patterns, and to perform path analysis. Several research efforts [NW97, JFM97] have considered usage information in order to perform Web Content Mining [CMS97]. In Web Usage Mining, as ....
M.S. Chen, J.S. Park, and P.S. Yu. Data mining for path traversal patterns in a web environment. In 16th International Conference on Distributed Computing Systems, pages 385--392, 1996.
Ming-Syan Chen, Jong Soo Park, and Philip S Yu. Data mining for path traversal patterns in a web environment. In ICDCS, pages 385--392, 1996.
M. Chen, J. Park, and P. Yu, "Data Mining for Path Traversal Patterns in a Web Environment", Proc. 16th Intl. Conf. Distributed Computing Systems, May 1996.
M.S. Chen, J.S. Park and P.S. Yu, "Data mining for path traversal patterns in a web environment", In Proc. 16th Int. Conf. on Distributed Computing Systems, Hong Kong, pp. 385--392, (1996).
M. S. Chen, J. S. Park, and P. S. Yu. Data mining for path traversal patterns in a web environment. In Sixteenth International Conference on Distributed Computing Systems, pages 385--392, 1996.
M. Chen, J. Park, and P. Yu. Data mining for path traversal patterns in a web environment. In International Conference on Distributed Computing Systems, 1996.
M.S. Chen, J.S. Park, and P.S. Yu, Data Mining for path traversal patterns in a Web environment, in Proc. 16th Int. Conf. On Distributed Computing Systems, pp. 385-392, 1996
M.S. Chen, J.S. Park, and P.S. Yu, Data Mining for path traversal patterns in a Web environment, in Proc. 16th Int. Conf. On Distributed Computing Systems, pp. 385-392, 1996
M. Chen, J. S. Park, and P. S. Yu. Data Mining for Path Traversal Patterns in a Web Environment. In Proc. of the 16th Conference on Distributed Computing Systems, May 1996, pp. 385-392.
M. S. Chen, J.S. Park and P.S. Yu. Data Mining for Path Traversal Patterns in a Web Environment. Proceedings of the 16th International Conference on Distributed Computing Systems, Hong Kong, May, 1996, pp. 385-392.