We study several transparent techniques for scaling dynamic content web sites, and we evaluate their relative impact when used in combination. Full transparency implies strong data consistency as perceived by the user, no modifications to existing dynamic content site tiers, and no additional programming effort from the user or site administrator upon deployment. We study strategies for scheduling and load balancing queries on a cluster of replicated database back-ends. We also investigate transparent query caching as a means of enhancing database replication. Our work shows that, on an experimental platform with up to 8 database replicas, the various techniques work in synergy to improve overall scaling for the e-commerce TPC-W benchmark. We rank the techniques necessary for high performance in order of impact as follows. Most important are scheduling strategies, such as conflict-aware scheduling, that minimize consistency maintenance overheads. The choice of load balancing strategy is less important. Transparent query result caching increases performance significantly at any given cluster size for a mostly-read workload. Its benefits are limited for write-intensive workloads, where content-aware scheduling is the only scaling option.