R. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. PhD thesis, University of Minnesota, 2000.
....may be induced by usage where a different relationship was intended. For example, sequence mining may show that many of the users who visited page C later went to page D, along paths that indicate a prolonged search (frequent visits to help and index pages, frequent backtracking, etc.) [14, 27]. This can be interpreted to mean that visitors wish to reach D from C, but that this was not foreseen in the information architecture, and hence that there is at present no hyperlink from C to D. This insight can be used for static site improvement for all users (adding a link from C to D) or for ....
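The C-to-D observation in the context above can be illustrated with a small sketch. It is not the method of the cited thesis or of the citing paper: the session lists, the `links` set, and the `min_sessions` threshold are all hypothetical, and "later went to" is read simply as "appears later in the same session".

```python
from collections import Counter


def later_visit_counts(sessions):
    """Count, per ordered pair (a, b), the sessions in which a is visited before b."""
    counts = Counter()
    for session in sessions:
        seen_pairs = set()
        for i, a in enumerate(session):
            for b in session[i + 1:]:
                if a != b:
                    seen_pairs.add((a, b))
        # Each session contributes at most once per pair.
        counts.update(seen_pairs)
    return counts


def missing_link_candidates(sessions, links, min_sessions=2):
    """Pairs that are frequently visited in order but have no direct hyperlink."""
    counts = later_visit_counts(sessions)
    return sorted(
        pair for pair, n in counts.items()
        if n >= min_sessions and pair not in links
    )


# Hypothetical sessions: users wander through help and index pages between C and D.
sessions = [
    ["C", "help", "index", "C", "D"],
    ["A", "C", "index", "D"],
    ["C", "help", "D"],
]
# Hypothetical site structure: there is no direct link from C to D.
links = {("C", "help"), ("C", "index"), ("help", "D"), ("index", "D"), ("A", "C")}

candidates = missing_link_candidates(sessions, links, min_sessions=3)
print(candidates)
```

A real analysis would also weigh the path characteristics mentioned in the excerpt (help-page visits, backtracking) rather than rely on order alone.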
R. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. PhD thesis, University of Minnesota, Faculty of the Graduate School, 2000.
....like association rule mining, clustering, or sequential pattern discovery are being used to identify co-occurring items in browsing and shopping histories, different user segments, navigation strategies, etc. This knowledge can be exploited to improve site design and navigation opportunities [5, 7, 12], to develop marketing strategies including recommender systems [18, 15], etc. However, because the primary focus of this kind of usage recording is technical, an interpretation of URLs in terms of user behavior, interests, and intentions is not always straightforward. For example, the site ....
R. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. PhD thesis, University of Minnesota, Faculty of the Graduate School, 2000.
....may be induced by usage where a different relationship was intended. For example, sequence mining may show that most of those users who visited page C later went to page D, along paths that indicated a prolonged search (frequent visits to help and index pages, frequent backtracking, etc.) [10, 25]. This can be interpreted to mean that visitors wish to reach D from C, but that this was not foreseen in the information architecture, and hence that there is at present no hyperlink from C to D. This insight can be used for static site improvement for all users (adding a link from C to D) or for ....
R. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. PhD thesis, University of Minnesota, Faculty of the Graduate School, 2000.
.... data mining method evaluation of the data selection and preprocessing steps is important for Web usage mining [24, 17]. Correctly observing user behavior on the Web requires automatically handling difficulties, e.g. with filtering automatic Web robots [25], detecting sessions [7], and path completion [8]. For example, for evaluating session identification heuristics, Cooley recommends comparing the results of log files with session identification with the results of analyzing the same log files stripped (see [8, pp. 118-123]). Evaluation of path completion heuristics requires instrumented Web ....
....the data collected with instrumented Web browsers can be compared to the results of path completion algorithms applied to server logs. Another, last-resort evaluation method is testing preprocessing statistics on synthetic data. For episode identification this approach has been used by Cooley [8]. An expensive alternative is to run experiments in usability labs, where the behavior of users is, e.g., videotaped and then manually analyzed. 3. Most researchers in recommender systems focus on the evaluation of mining algorithms with methods known from machine learning. A common way from machine ....
Robert Walker Cooley. Web usage mining: Discovery and application of interesting patterns from web data. Ph. d. thesis, Graduate School of the University of Minnesota, University of Minnesota, 2000.
....only; combining dynamic with static information, however, leads to an improved user labelling. 3.2 Automatic categorization of web logs We applied an mHMM to (an excerpt of) a one-day log file from a large commercial web site in The Netherlands. The raw entries in the file were sessionized [3] and irrelevant entries (like images) were removed. A training set was created out of clickstreams from 400 users. Then we trained a 4-component mHMM with 12 states and common observation matrix (with 134 different observables) for 500 cycles on the training set and we inspected the shared ....
R. W. Cooley. Web usage mining: discovery and application of interesting patterns from web data. PhD thesis, University of Minnesota, USA, 2000.
....only; combining dynamic with static information, however, leads to an improved user labelling. 3.2 Automatic categorization of web logs We applied an mHMM to (an excerpt of) a one-day log file from a large commercial web site in The Netherlands. The raw entries in the file were sessionized [2] and irrelevant entries (like images) were removed. A training set was created out of clickstreams from 400 users. Then we trained a 4-component mHMM with 12 states and common observation matrix (with 134 different observables) for 500 cycles on the training set and we inspected the shared ....
R. W. Cooley. Web usage mining: discovery and application of interesting patterns from web data. PhD thesis, University of Minnesota, USA, 2000.
....[19, 22]. Data mining algorithms have recently been applied to the user sessions to discover higher level trends. For example, researchers have applied these algorithms to discover which pages are often accessed together by doing sequential pattern, frequent itemset, or association analysis [3]. These techniques have been helpful in personalization applications [21] and Web caching and prefetching [11]. A more recent development in these analysis tools is to offer basic summarization by grouping user actions into activities [7, 1, 4, 22, 17] such as reading bulletin board messages, ....
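As a rough illustration of "which pages are often accessed together", the sketch below counts page pairs that co-occur in at least `min_support` sessions. It is only the pair-level case of frequent itemset analysis, not a full Apriori-style miner, and the page names, sessions, and threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_page_pairs(sessions, min_support):
    """Return unordered page pairs that co-occur in at least min_support sessions."""
    counts = Counter()
    for session in sessions:
        # De-duplicate pages within a session, then count each unordered pair once.
        counts.update(combinations(sorted(set(session)), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical user sessions, each a list of visited pages.
sessions = [
    ["home", "news", "forum"],
    ["home", "forum"],
    ["news", "home"],
    ["forum", "home", "forum"],
]

pairs = frequent_page_pairs(sessions, min_support=2)
print(pairs)
```

Sorting each session's page set keeps every pair in a canonical order, so ("forum", "home") and ("home", "forum") are counted as the same itemset.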
Cooley, R. Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. Ph.D. Thesis, University of Minnesota, May 2000.
....of items bought by a customer in a single purchase. The sequential pattern mining later introduced in [AS95, SA96] defines a sequence as a time-ordered set of transactions, whereas in sequential mining of Web traversal patterns, each Web log entry is a separate customer transaction. Like Cooley in [C00], we identify a user visit as a set of page views that are sufficiently close over time by using a maximum time gap t_max specified by the user. We identify a page view as an html or a dynamically generated file that is sufficiently apart over time from the previously identified page view ....
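The time-gap rule for identifying visits can be sketched as follows. This is a minimal illustration, assuming that each request of one user is reduced to a timestamp in seconds and that a new visit starts whenever the gap to the previous request exceeds the maximum; the function name, the sample timestamps, and the 1800-second value are hypothetical.

```python
def split_into_visits(timestamps, max_gap):
    """Group one user's request timestamps into visits.

    A request joins the current visit if it follows the previous request
    by at most max_gap seconds; otherwise it starts a new visit.
    """
    visits = []
    for t in sorted(timestamps):
        if visits and t - visits[-1][-1] <= max_gap:
            visits[-1].append(t)
        else:
            visits.append([t])
    return visits


# Hypothetical request times (seconds) for a single user.
requests = [0, 40, 100, 2500, 2530, 9000]
visits = split_into_visits(requests, max_gap=1800)
print(visits)
```

The choice of the maximum gap is left to the analyst, as in the excerpt; a smaller value splits the same log into more, shorter visits.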
R. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. Ph.D. Thesis, University of Minnesota, May 2000.
....(see [113] for the detail). It is not our intention to give a complete survey of Web usage mining research here. Interested readers could consult the overview papers by Srivastava et al. [113], Spiliopoulou [112], and Masand and Spiliopoulou [87], and Robert Cooley's Ph.D. thesis [32] for mining user patterns, and the overview paper by Langley [80] for mining user profiles. 6. RELATED WORKS As far as we know, it was Etzioni [41] who first coined the term Web mining. Etzioni starts by making a hypothesis that the information on the Web is sufficiently structured and outlines the ....
R. W. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web data. PhD thesis, Dept. of Computer Science, University of Minnesota, May 2000.
....sets are found using heuristics [Joh et al. 2000b]. Notice that, for this project, the sequences are represented by server sessions and the elements stand for visited web pages. A server session or visit is defined as the click stream of page views for a single visit of a user to a website [Cooley, 2000]. To illustrate the SAM, consider the following sequences. Suppose w_d = w_i = 1 and h = w_d + w_i, with s_1 = (1, 2, 45, 27, 28, 112) and s_2 = (1, 45, 27, 2, 28, 2). The server sessions s_1 and s_2 have five common elements (1, 2, 45, 27 and 28) and one unique element (112). Element 2 needs to be ....
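The count of common and unique elements quoted in this context can be checked with a few lines. This sketch covers only that first step, not the SAM distance itself, whose details are cut off in the excerpt; the function name is hypothetical.

```python
def common_and_unique(a, b):
    """Return (elements present in both sequences, elements present in only one)."""
    common = set(a) & set(b)
    unique = set(a) ^ set(b)
    return common, unique


# The two server sessions from the excerpt, as sequences of page codes.
s1 = [1, 2, 45, 27, 28, 112]
s2 = [1, 45, 27, 2, 28, 2]

common, unique = common_and_unique(s1, s2)
print(sorted(common), sorted(unique))
```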
....files of a Belgian telecom provider collected over a one-week period are used. In order to analyse visiting behaviour on a website, sessions of web clickstream data must be identified. A server session or visit is defined as the click stream of page views for a single visit of a user to a website [Cooley, 2000]. In this paper, we will use server session and visit interchangeably. First, the data stored in the log files are cleaned in such a way that URL page requests of the form GET. html are maintained. Then a unique code is given to each distinct IP address and URL. Third, sessions are identified ....
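The first two preprocessing steps named here (cleaning, then coding) might look like the sketch below. It assumes each log entry has already been parsed into an `(ip, method, url)` tuple and reads the cleaning rule as "keep GET requests whose URL ends in .html"; both are assumptions, as are the function name and the sample entries. The third step is cut off in the excerpt and is deliberately not reproduced.

```python
def clean_and_code(entries):
    """Keep GET requests for .html pages and replace IPs and URLs by integer codes."""
    ip_codes, url_codes, coded = {}, {}, []
    for ip, method, url in entries:
        if method != "GET" or not url.endswith(".html"):
            continue  # Drop images, form posts, and other non-page requests.
        # Codes are assigned in order of first appearance, starting at 1.
        ip_code = ip_codes.setdefault(ip, len(ip_codes) + 1)
        url_code = url_codes.setdefault(url, len(url_codes) + 1)
        coded.append((ip_code, url_code))
    return coded, ip_codes, url_codes


# Hypothetical parsed log entries.
entries = [
    ("10.0.0.1", "GET", "/index.html"),
    ("10.0.0.1", "GET", "/logo.gif"),
    ("10.0.0.2", "GET", "/index.html"),
    ("10.0.0.2", "POST", "/order.html"),
    ("10.0.0.1", "GET", "/offers.html"),
]

coded, ip_codes, url_codes = clean_and_code(entries)
print(coded)
```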
R. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. University of Minnesota http://wwwusers. cs.umn.edu/~cooley/pubs.html.
....there will be strong correlation between the similarity among user clickstreams and the similarity among the users' interests or intentions. Therefore, clustering of the former could be used to predict groupings for the latter. A lot of research has been done in the area of Web Usage Mining [1], which directly or indirectly addresses the issues involved in the extraction of web navigational patterns [2], ordering relationships [3], prediction of web surfing behavior [4], and clustering of web usage sessions [5] based on web logs, possibly supplemented by web content or structure ....
R. Cooley. Web Usage Mining: Discovery and Applications of Interesting Patterns from Web data. PhD thesis, Dept. of Computer Science, University of Minnesota, May 2000.
....of the requested page, the status code of the response message, the size of the document transferred, the referrer page and its User Agent information. During preprocessing, the log entries are grouped into server sessions using a variation of the session identification heuristic proposed in [4]. Unlike [4], our approach is capable of identifying sessions having multiple IP Addresses or User Agents. The session identification technique will be described in the Appendix. 3.2 Feature Vector Construction Once the server sessions are created, the next step is to construct a feature vector ....
....page, the status code of the response message, the size of the document transferred, the referrer page and its User Agent information. During preprocessing, the log entries are grouped into server sessions using a variation of the session identification heuristic proposed in [4]. Unlike [4], our approach is capable of identifying sessions having multiple IP Addresses or User Agents. The session identification technique will be described in the Appendix. 3.2 Feature Vector Construction Once the server sessions are created, the next step is to construct a feature vector to represent ....
R. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. PhD thesis, University of Minnesota, 1999.
....d. Analysis: analyzing the mined pattern. In brief, Web mining is a technique to discover and analyze the useful information from the Web data. The authors of [10] claim the Web involves three types of data: data on the Web (content), Web log data (usage), and Web structure data. The authors of [5] classified the data type as content data, structure data, usage data, and user profile data. M. Spiliopoulou [14] categorized Web mining into Web usage mining, Web text mining and user modeling mining; while today the most recognized categories of the Web data mining are Web content mining, ....
....in the next section about usage mining, based on some up-to-date research works. 3. The Usage Mining on the Web Web usage mining is the application of data mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications [5]. In the same paper, Web usage mining is parsed into three distinctive phases: preprocessing, pattern discovery, and pattern analysis. I think it is an excellent approach to define the usage mining procedure. It also clarified the research sub-direction of the Web usage mining, which ....
R. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web data. PhD thesis, Dept. of Computer Science, University of Minnesota, May 2000
....the Web, which is a hot research topic. Most notably, Dumais and Chen described a recent effort on achieving good hierarchical clustering of Web search results using a technique called Support Vector Machines [12]. Most relevant to our project is research on the clustering of usage of a Web site [23,11]. Cooley describes an algorithm that clusters users using a hypergraph partitioning technique [11]. The system is used successfully to identify particularly interesting and similar path histories. It does not come up with significant category groupings and describe the composition of every user ....
.... good hierarchical clustering of Web search results using a technique called Support Vector Machines [12]. Most relevant to our project is research on the clustering of usage of a Web site [23,11]. Cooley describes an algorithm that clusters users using a hypergraph partitioning technique [11]. The system is used successfully to identify particularly interesting and similar path histories. It does not come up with significant category groupings and describe the composition of every user profile. Thus, that system will not be able to gain an overall picture of all usage of a Web site. ....
Cooley, R. (2000) Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. Ph.D. Thesis. University of Minnesota. May 2000.
....is then the sequence of page views that are accessed by a user. A server session is the click stream for a single visit of a user to a Web site. A brief overview of the necessary steps for processing Web server logs will be provided in Section 5 and further details can be found in [CMS99, Coo00]. Processing the structure and content of a Web site are inter-related tasks. The answer to the question of what links are available from a given page view depends on how the page view is defined. The degree of difficulty in performing content and structure processing is highly dependent on the ....
....frame and image tags that populate a particular page view (referred to as intra-page structure). Several usage preprocessing steps cannot be completed without the site structure. In addition, the site structure is useful for identifying potentially interesting rules. As has been described in [CMS99, Coo00], the Web site structure is required for page view identification, and may be needed to identify users in the absence of a unique user identifier such as cookies. Due to the presence of frames, the number of potential page views for a Web site can be vast. It is not uncommon for every page view ....
Robert Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. Phd, University of Minnesota, 2000.
No context found.
R. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web data. PhD thesis, University of Minnesota, 2000.
No context found.
Cooley R., Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. Ph. D. Thesis, Department of Computer Science, University of Minnesota, USA, 2000.
No context found.
R. W. Cooley. Web usage mining: Discovery and application of interesting patterns from web data. Ph. d. thesis, Graduate School of the University of Minnesota, University of Minnesota, 2000.
No context found.
R. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web data. PhD thesis, University of Minnesota, 2000.
No context found.
R. Cooley. Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. PhD thesis, University of Minnesota, 2000.
No context found.
Cooley, R. (2000). Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. University of Minnesota, Faculty of the Graduate School: Ph.D. dissertation. http://www.cs.umn.edu/research/websift/papers/- rwc thesis.ps
No context found.
Cooley R., Web Usage Mining: Discovery and Application of Interesting patterns from Web Data, Ph. D. Thesis, Department of Computer Science, University of Minnesota, 2000.
No context found.
Cooley R., Web Usage Mining: Discovery and Application of Interesting patterns from Web Data, Ph. D. Thesis, Department of Computer Science, University of Minnesota, 2000.
No context found.
Cooley, R.W.: Web usage mining: Discovery and application of interesting patterns from web data. Ph. d. thesis, Graduate School of the University of Minnesota, University of Minnesota (2000)
No context found.
Cooley R., 2000. Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data. Ph. D. Thesis, Department of Computer Science, University of Minnesota, Minnesota, USA.