Davison, B. Web Traffic Logs: An Imperfect Resource for Evaluation. in Ninth Annual Conference of the Internet Society (INET'99). 1999. San Jose.
....time consuming and tends to reflect only a small number of users in abnormal usage environments. Analysis of server logs is frequently used to quantify what large numbers of users are doing on a site, but it is difficult to comprehend the actions, goals, and motivations of individual users [7, 10]. In the past few years, a number of online survey and logging tools [21, 29] have been developed in an attempt to quickly gather more data than traditional lab style testing and higher quality data than that provided via server log analysis. These remote usability testing tools rely on recruited ....
Davison, B. Web Traffic Logs: An Imperfect Resource for Evaluation. In Proceedings of Ninth Annual Conference of the Internet Society (INET '99). San Jose, CA, 1999.
....of the system of caches. Two different methods can be used to obtain these traces. The obvious one is to use log files that all caching software generates. However, this technique is overly simple because those logs often lack information and, even worse, sometimes contain inaccurate information [1]. To circumvent these drawbacks, one needs to instrument the software; however, this is not always possible: source code may not be available, or it may require a large amount of work. Another approach to build such traces is to use passive traffic monitoring. Several tools performing HTTP traffic ....
Brian D. Davison, "Web traffic logs: An imperfect resource for evaluation," in Proceedings of the INET'99 Conference, June 1999, http://www.isoc.org/inet99/proceedings/4n/4n_1.htm.
....cache hit. Moreover, a trace collected at a proxy normally fails to capture any user requests that hit in the client (browser) caches. Other problems with conventional proxy logs include inadequate detail, low resolution timestamps, and poor clock synchronization in multiple host proxy arrays [12, 15, 17, 28]. In principle, one could avoid such problems by collecting traces using an instrumented client; this could capture every user reference. Instrumented browsers have been used to collect traces from small user populations [14, 16]. It is difficult to instrument popular browsers today because ....
B. D. Davison. Web traffic logs: An imperfect resource for evaluation. In Proc. 9th Annual Conf. of the Internet Society, June 1999.
....cache hit. Moreover, a trace collected at a proxy normally fails to capture any user requests that hit in the client (browser) caches. Other problems with conventional proxy logs include inadequate detail, low resolution timestamps, and poor clock synchronization in multiple host proxy arrays [15, 12, 17, 28]. In principle, one could avoid such problems by collecting traces using an instrumented client; this could capture every user reference. Instrumented browsers have been used to collect traces from small user populations [14, 16]. It is difficult to instrument popular browsers today because ....
B. D. Davison. Web traffic logs: An imperfect resource for evaluation. In Proc. 9th Annual Conf. of the Internet Society, June 1999.
....90 research, commercial, and freeware tools currently out there [11]. However, from the perspective of the web design team, there are some problems with server logs. Interpreting the actions of an individual user is extremely difficult, as pointed out by Etgen and Cantor [12] and by Davison [13]. Web caches, both client browser caches and Intranet or ISP caches, can intercept requests for web pages. If the requested page is in the cache then the request will never reach the server and is thus not logged. Multiple people can also share the same IP address, making it difficult to ....
Davison, B., Web Traffic Logs: An Imperfect Resource for Evaluation. In Ninth Annual Conference of the Internet Society (INET'99). 1999. San Jose.
....In particular, they fail to record cache related HTTP metadata in reply headers and META http-equiv tags within HTML files, reply entity bodies or their hashes, and accurate, high resolution timestamps. Davison and Caceres et al. have described the shortcomings of conventional proxy log formats [14, 18]. In a few cases researchers have instrumented browsers to collect Web client traces [15, 17]. In principle, such traces support arbitrarily realistic bottom up explorations of cache hierarchies and shed light on user interactions invisible outside the client. Unfortunately, today most ....
B. D. Davison. Web traffic logs: An imperfect resource for evaluation. In 9th Conf. Internet Society, June 1999.
....web usage logs: server side and client side logging. Server side logs have the advantage of being easy to capture and generate, since all transactions go through the server. However, there are several downsides to server side logging, as pointed out by Etgen and Cantor [9] and by Davison [8]. One problem is that web caches, both client browser caches and Intranet or ISP caches, can intercept requests for web pages. If the requested page is in the cache then the request will never reach the server and is thus not logged. Another problem is that multiple people can also share the same ....
Davison, B. Web Traffic Logs: An Imperfect Resource for Evaluation. In Proceedings of Ninth Annual Conference of the Internet Society (INET'99). San Jose, June 1999.
....and the last byte of response. These timestamps provide a sample of real world response times that can be used to validate our simulator. 4.2 Trace Preparation. Most researchers have found that web traces need to be checked and often cleaned before using them in a simulator or for evaluation [17, 5]. The UCB Home IP Trace is no exception. This trace does not record the HTTP response code associated with each object. Thus, we are unable to distinguish between valid responses (e.g. code 200), error responses (e.g. 404), and file not modified responses (304). For the purpose of simulation, we ....
B. D. Davison. Web traffic logs: An imperfect resource for evaluation. In Proceedings of the Ninth Annual Conference of the Internet Society (INET'99), June 1999.
....likely skew pageview statistics. This is also the case for other kinds of non interactive retrievals, such as those generated by Web crawlers. Today, when analyzing server logs for interesting patterns of usage, researchers must first separate (if possible) the retrievals from automated crawlers [29, 18]. Additionally, if the server were maintaining a user model based on history, undistinguished prefetching requests could influence that model, which could generate incorrect hints that would cause additional requests for useless resources which would likewise (incorrectly) influence the server's ....
B. D. Davison. Web traffic logs: An imperfect resource for evaluation. In Proceedings of the Ninth Annual Conference of the Internet Society (INET'99), June 1999.
....collecting full content web traffic logs (for a small number of users) using a custom proxy for off line analysis. These logs include all HTTP request and response headers and the content of all HTML pages, since traditional logs are insufficient for analysis of content based prefetching systems [Dav99d]. By combining different sources of information, we expect to be able to make predictions of actions that have never been taken by the user and to make predictions that reflect current user interests. Our conjecture is that the appropriate combination of information from sources such as these ....
B. D. Davison (1999) Web traffic logs: An imperfect resource for evaluation. To be published in the Proceedings of the Ninth Annual Conference of the Internet Society (INET'99), June, San Jose, CA.
....which the socket is closed. Finally, proxy cache trace logs are inaccurate when they return stale objects since they may not have the same characteristics as current objects. Note that logs generated from non caching proxies or from HTTP packet sniffing may not have this drawback. In other work [Dav99b] we further examine the drawbacks of using standard trace logs and investigate what can be learned from more complete logs that include additional information such as page content. Current request stream workloads. Using a live request stream produces experiments that are not reproducible ....
Brian D. Davison. Web traffic logs: An imperfect resource for evaluation. In Proceedings of the Ninth Annual Conference of the Internet Society (INET'99), June 1999. To appear.
No context found.
Davison, B. Web Traffic Logs: An Imperfect Resource for Evaluation. in Ninth Annual Conference of the Internet Society (INET'99). 1999. San Jose.
No context found.
B. D. Davison. Web Traffic Logs: An Imperfect Resource for Evaluation. Proc. of the 9th Annual Conference of the Internet Society (INET'99), June 1999.
No context found.
B. D. Davison, "Web Traffic logs: An imperfect resource for evaluation," in Proc. of the Ninth Annual Conference of the Internet Society (INET'99), San Jose, 1999.
No context found.
B. D. Davison. Web Traffic Logs: An Imperfect Resource for Evaluation. Proc. of the 9th Annual Conference of the Internet Society (INET'99), June 1999.
No context found.
B. D. Davison. Web traffic logs: An imperfect resource for evaluation. In Ninth Annual Conference of the Internet Society (INET'99), pages 22--25, June 1999.
No context found.
Davison, B. Web Traffic Logs: An Imperfect Resource for Evaluation. In Proceedings of Ninth Annual Conference of the Internet Society (INET '99). San Jose, CA, 1999.
CiteSeer.IST - Copyright Penn State and NEC