Results 1 - 10
of
96
A Hierarchical Internet Object Cache
- IN PROCEEDINGS OF THE 1996 USENIX TECHNICAL CONFERENCE
, 1995
"... This paper discusses the design andperformance of a hierarchical proxy-cache designed to make Internet information systems scale better. The design was motivated by our earlier trace-driven simulation study of Internet traffic. We believe that the conventional wisdom, that the benefits of hierarch ..."
Abstract
-
Cited by 436 (5 self)
- Add to MetaCart
This paper discusses the design andperformance of a hierarchical proxy-cache designed to make Internet information systems scale better. The design was motivated by our earlier trace-driven simulation study of Internet traffic. We believe that the conventional wisdom, that the benefits of hierarchical file caching do not merit the costs, warrants reconsideration in the Internet environment. The cache implementation supports a highly concurrent stream of requests. We present performance measurements that show that the cache outperforms other popular Internet cache implementations by an order of magnitude under concurrent load. These measurements indicate that hierarchy does not measurably increase access latency. Our software can also be configured as a Web-server accelerator; we present data that our httpd-accelerator is ten times faster than Netscape's Netsite and NCSA 1.4 servers. Finally, we relate our experience fitting the cache into the increasingly complex and operational world of Internet information systems, including issues related to security, transparency to cache-unaware clients, and the role of file systems in support of ubiquitous wide-area information systems.
The Case for Geographical Push-Caching
, 1995
"... Most wide-area caching schemes are client initiated. Decisions on when and where to cache information are made without the benefit of the server's global knowledge of the situation. We believe that the server should play a role in making these caching decisions, and we propose geographical push-cach ..."
Abstract
-
Cited by 228 (1 self)
- Add to MetaCart
Most wide-area caching schemes are client initiated. Decisions on when and where to cache information are made without the benefit of the server's global knowledge of the situation. We believe that the server should play a role in making these caching decisions, and we propose geographical push-caching as a way of bringing the server back into the loop. The World Wide Web is an excellent example of a widearea system that will benefit from geographical pushcaching, and we present an architecture that allows a Web server to autonomously replicate HTML pages. 1 Introduction The World-Wide Web [1] operates for the most part as a cache-less distributed system. When two neighboring clients retrieve a document from the same server, the document is sent twice. This is inefficient, especially considering the ease with which Web browsers allow users to transfer large multimedia documents. To combat this problem, some Web browsers have begun to add local client caches. These prevent one client...
A survey of web caching schemes for the internet
- ACM Computer Communication Review
, 1999
"... The World Wide Web can be considered as a large distributed information system that provides access to shared data objects. As one of the most popular applications currently running on the Internet, the World Wide Web is of an exponential growth in size, which results in network congestion and serve ..."
Abstract
-
Cited by 200 (1 self)
- Add to MetaCart
The World Wide Web can be considered as a large distributed information system that provides access to shared data objects. As one of the most popular applications currently running on the Internet, the World Wide Web is of an exponential growth in size, which results in network congestion and server overloading. Web caching has been recognized as one of the effective schemes to alleviate the service bottleneck and reduce the network traffic, thereby minimize the user access latency. In this paper, we first describe the elements of a Web caching system and its desirable properties. Then, we survey the state-of-art techniques which have been used in Web caching systems. Finally, we discuss the research frontier
Finding similar files in a large file system
- in Proceedings of the USENIX Winter 1994 Technical Conference
, 1994
"... We present a tool, called sif, for finding all similar files in a large file system. Files are considered similar if they have significant number of common pieces, even if they are very different otherwise. For example, one file may be contained, possibly with some changes, in another file, or a fil ..."
Abstract
-
Cited by 176 (3 self)
- Add to MetaCart
We present a tool, called sif, for finding all similar files in a large file system. Files are considered similar if they have significant number of common pieces, even if they are very different otherwise. For example, one file may be contained, possibly with some changes, in another file, or a file may be a reorganization of another file. The run-ning time for finding all groups of similar files, even for as little as 25 % similarity, is on the order of 500MB to 1GB an hour. The amount of similarity and several other customized parameters can be determined by the user at a post-processing stage, which is very fast. Sif can also be used to very quickly identify all similar files to a query file using a preprocessed index. Application of sif can be found in file management, information collecting (to remove duplicates), program reuse, file synchronization, data compression, and maybe even plagiarism detection. 1.
Maintaining Strong Cache Consistency in the World-Wide Web
- In Proceedings of the Seventeenth International Conference on Distributed Computing Systems
, 1998
"... As the Web continues to explode in size, caching becomes increasingly important. With caching comes the problem of cache consistency. Conventional wisdom holds that strong cache consistency is too expensive for the Web, and weak consistency methods such as Time-To-Live (TTL) are most appropriate. Th ..."
Abstract
-
Cited by 173 (2 self)
- Add to MetaCart
As the Web continues to explode in size, caching becomes increasingly important. With caching comes the problem of cache consistency. Conventional wisdom holds that strong cache consistency is too expensive for the Web, and weak consistency methods such as Time-To-Live (TTL) are most appropriate. This study compares three consistency approaches: adaptive TTL, polling-every-time and invalidation, through analysis, implementation and trace replay in a simulated environment. Our analysis shows that weak consistency methods save network bandwidth mostly at the expense of returning stale documents to users. Our experiments show that invalidation generates a comparable amount of network traffic and server workload to adaptive TTL and has similar average client response times, while polling-every-time results in more control messages, higher server workload and longer client response times. We show that, contrary to popular belief, strong cache consistency can be maintained for the Web with ...
World-Wide Web Cache Consistency
- IN PROCEEDINGS OF THE 1996 USENIX TECHNICAL CONFERENCE
, 1996
"... The bandwidth demands of the World Wide Web continue to grow at a hyper-exponential rate. Given this rocketing growth, caching of web objects as a means to reduce network bandwidth consumption is likely to be a necessity in the very near future. Unfortunately, many Web caches do not satisfactorily m ..."
Abstract
-
Cited by 138 (1 self)
- Add to MetaCart
The bandwidth demands of the World Wide Web continue to grow at a hyper-exponential rate. Given this rocketing growth, caching of web objects as a means to reduce network bandwidth consumption is likely to be a necessity in the very near future. Unfortunately, many Web caches do not satisfactorily maintain cache consistency. This paper presents a survey of contemporary cache consistency mechanisms in use on the Internet today and examines recent research in Web cache consistency. Using trace-driven simulation, we show that a weak cache consistency protocol (the one used in the Alex ftp cache) reduces network bandwidth consumption and server load more than either time-to-live fields or an invalidation protocol and can be tuned to return stale data less than 5% of the time.
A Toolkit for User-Level File Systems
- In Proc. Usenix Technical Conference
, 2001
"... This paper describes a C toolkit for easily extending the Unix file system. The toolkit exposes the NFS interface, allowing new file systems to be implemented portably at user level. A number of programs have implemented portable, user-level file systems. However, they have been plagued by low-perfo ..."
Abstract
-
Cited by 124 (10 self)
- Add to MetaCart
This paper describes a C toolkit for easily extending the Unix file system. The toolkit exposes the NFS interface, allowing new file systems to be implemented portably at user level. A number of programs have implemented portable, user-level file systems. However, they have been plagued by low-performance, deadlock, restrictions on file system structure, and the need to reboot after software errors. The toolkit makes it easy to avoid the vast majority of these problems. Moreover, the toolkit also supports user-level access to existing file systems through the NFS interface---a heretofore rarely employed technique. NFS gives software an asynchronous, low-level interface to the file system that can greatly benefit the performance, security, and scalability of certain applications. The toolkit uses a new asynchronous I/O library that makes it tractable to build large, event-driven programs that never block.
Scalable Internet Resource Discovery: Research Problems and Approaches
, 1994
"... Over the past several years, a number of information discovery and access tools have been introduced in the Internet, including Archie, Gopher, Netfind, and WAIS. These tools have become quite popular, and are helping to redefine how people think about wide-area network applications. Yet, they ar ..."
Abstract
-
Cited by 121 (3 self)
- Add to MetaCart
Over the past several years, a number of information discovery and access tools have been introduced in the Internet, including Archie, Gopher, Netfind, and WAIS. These tools have become quite popular, and are helping to redefine how people think about wide-area network applications. Yet, they are not well suited to supporting the future information infrastructure, which will be characterized by enormous data volume, rapid growth in the user base, and burgeoning data diversity. In this paper we indicate trends in these three dimensions and survey problems these trends will create for current approaches. We then suggest several promising directions of future resource discovery research, along with some initial results from projects carried out by members of the Internet Research Task Force Research Group on Resource Discovery and Directory Service.
WebOS: Operating System Services for Wide Area Applications
"... In this paper, we demonstrate the power of providing a common set of Operating System services to wide-area applications, including mechanisms for naming, persistent storage, remote process execution, resource management, authentication, and security. On a single machine, application developers can ..."
Abstract
-
Cited by 106 (16 self)
- Add to MetaCart
In this paper, we demonstrate the power of providing a common set of Operating System services to wide-area applications, including mechanisms for naming, persistent storage, remote process execution, resource management, authentication, and security. On a single machine, application developers can rely on the local operating system to provide these abstractions. In the wide area, however, application developers are forced to build these abstractions themselves or to do without. This ad-hoc approach often results in individual programmers implementing non-optimal solutions, wasting both programmer effort and system resources. To address these problems, we are building a system, WebOS, that provides basic operating systems services needed to build applications that are geographically distributed, highly available, incrementally scalable, and dynamically reconfigurable. Experience with a number of applications developed under WebOS indicates that it simplifies system development and improves resource utilization. In particular, we use WebOS to implement Rent-A-Server to provide dynamic replication of overloaded Web services across the wide area in response to client demands.
Improving end-to-end performance of the web using server volumes and proxy lters
- Tech. Rep. 980206-01, AT&T Labs { Research
, 1998
"... The rapid growth of the World Wide Web has caused serious performance degradation on the Internet. This paper o ers an end-to-end approach to improving Web performance by collectively examining the Web components { clients, proxies, servers, and the network. Our goal is to reduce userperceived laten ..."
Abstract
-
Cited by 91 (13 self)
- Add to MetaCart
The rapid growth of the World Wide Web has caused serious performance degradation on the Internet. This paper o ers an end-to-end approach to improving Web performance by collectively examining the Web components { clients, proxies, servers, and the network. Our goal is to reduce userperceived latency and the number of TCP connections, improve cache coherency and cache replacement, and enable prefetching of resources that are likely to be accessed in the near future. In our scheme, server response messages include piggybacked information customized to the requesting proxy. Our enhancement to the existing request-response protocol does not require per-proxy state at a server, and a very small amount of transient per-server state at the proxy, and can be implemented without changes to HTTP 1.1. The server groups related resources into volumes (based on access patterns and the le system's directory structure) and applies a proxy-generated lter (indicating the type of information of interest to the proxy) to tailor the piggyback information. We present e cient data structures for constructing server volumes and applying proxy lters, and a transparent way to perform volume maintenance and piggyback generation at a router along the path between the proxy and the server. We demonstrate the e ectiveness of our end-to-end approach byevaluating various volume construction and ltering techniques across a collection of large client and server logs.

