### Abstract

This manuscript presents FuSnap, a fuzzy-logic-based controller that monitors and controls the snapshot process of a logical storage volume in a disk array. Because disks do not respond linearly to the arrival rate of user accesses, FuSnap uses fuzzy logic to achieve better control of their response time. The goal of the FuSnap controller is to reduce the response-time penalty caused by the copy-on-writes that occur while a logical storage volume is being snapped. Based on the response time of user accesses, the controller decides whether to proceed with a copy-on-write or a redirect-on-write while a source logical volume is being copied to a snapshot logical volume. The benefits of the FuSnap approach are twofold. First, it significantly reduces the response time of user requests compared with the traditional copy-on-write snapshot approach. Second, this reduction makes the point-in-time copy of data less disruptive for database users. FuSnap was verified with two setups using HPUX workstations, one with 8 disks and the other with 32 disks.

Index Terms: Disk Arrays, Embedded Systems, Fuzzy Control, Fuzzy Logic.

Manuscript received January 13, 2009. Accepted for publication June 17, 2010. Copyright © 2010 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org. Guillermo Navarro is with Hewlett Packard, Boise, ID 83714 USA (e-mail: guillermo.navarro@hp.com). Milos Manic is with the University of Idaho, Idaho Falls, ID 83402 USA (e-mail: misko@ieee.org).

### I. INTRODUCTION

Snapshot of logical volumes is an area of research of high interest for storage companies that aim at improving the availability of data while at the same time providing data replication [1]. By using the snapshot feature, users can create a point-in-time copy of a logical volume or LU (Logical Unit). From the user's standpoint, the snapshot feature creates an instant copy of the original logical volume. This gives users the means to preserve a point-in-time copy (the snapshot) of the data in a source logical volume. If the data in the source gets corrupted or lost, the user can go back to the snapshot and recover the data from that point in time. The original volume with the data to be replicated will be referred to as the source volume, or just the source. The copy of the original volume will be referred to as the snapshot volume, or just the snapshot.

Improvements in the management of snapshot replication have been proposed in [9]-.

Fuzzy control has been used with Proportional and Integral (PI) control before. Perry et al. [20] proposed the design of a PI fuzzy logic controller for dc-dc conversion. Precup et al. [21] presented a merge of fuzzy and iterative learning control for fuzzy control systems, focused on Takagi-Sugeno PI fuzzy controllers. Luo et al. [22] proposed a fuzzy-PI-based control strategy for a static synchronous compensator used in electric power distribution systems. Sun et al. [23] used fuzzy logic reasoning to optimize the gains of a PI controller as part of the fuzzy-logic-based control of flywheel energy storage equipment. Fuzzy control can also be applied to other types of controllers, such as proportional and derivative (PD) and proportional, integral and derivative (PID) controllers.

This manuscript presents FuSnap, a fuzzy PI control algorithm that drastically improves the response time of user requests (reads or writes) during the snapshot process.
The organization of this paper is as follows. Section II presents the copy-on-write and redirect-on-write snapshot techniques. Section III presents a model for the snapshot and the modified process. Section IV presents the fuzzy control algorithm. Section V presents the experimental results. Section VI presents the conclusions.

### II. BACKGROUND OF POINT-IN-TIME COPY TECHNOLOGIES

The FuSnap controller improves the response time during the snapshot process by providing an intelligent way of combining two snapshot technologies: 1) Copy-on-Write (CoW) and 2) Redirect-on-Write (RoW). These two technologies are described in the following two subsections. The classification of snapshot techniques is based mostly on the classification provided by Simitci.

#### A. Copy-on-Write (CoW)

Source logical volumes are divided into data blocks $D_1, \dots, D_{B_v}$, where $B_v$ is the total number of data blocks composing the source volume. Right after the snapshot volume is created, the pointers to data blocks on each volume (source and snapshot) point to the source volume (in some papers these pointers to data blocks are also referred to as metadata).

If a first user write occurs to one of the data blocks in the source volume, for example $D_j$, then this block must be copied to the snapshot volume before the write is carried out, so that the original point-in-time data block $D_j$ is preserved. Once the first user write occurs, the $D_j$ data block in the source volume is modified, so it is now referred to as the updated data block $D_j'$. This snapshot technology is called copy-on-write because every first user write to the source volume causes the disk array to copy the original data block from the source to the snapshot volume before proceeding with the user write. Copying a data block to the snapshot volume before the first user write can occur adds an extra delay to that write, as it has to wait for the copy. This extra delay is called the copy-on-write penalty.
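
The copy-on-write behavior described above can be illustrated with a small in-memory sketch. It is only a toy model: the class name `CowVolume`, its fields, and the use of Python lists and dictionaries in place of disk blocks and metadata pointers are assumptions made for illustration, not the authors' implementation.

```python
class CowVolume:
    """Toy copy-on-write snapshot over a list of data blocks (illustrative only)."""

    def __init__(self, blocks):
        self.source = list(blocks)  # live data, modified by user writes
        self.snapshot = {}          # block index -> preserved point-in-time copy
        self.copies = 0             # number of copy-on-writes performed so far

    def write(self, j, data):
        # First write to block j: preserve the original before overwriting it.
        if j not in self.snapshot:
            self.snapshot[j] = self.source[j]
            self.copies += 1        # this extra copy is the copy-on-write penalty
        self.source[j] = data

    def read_snapshot(self, j):
        # Blocks that were never snapped are still shared with the source volume.
        return self.snapshot.get(j, self.source[j])


vol = CowVolume(["a", "b", "c"])
vol.write(1, "B1")  # first write to block 1 triggers a copy
vol.write(1, "B2")  # later writes to the same block do not
```

Only the first write to a block pays for the copy, which is why the penalty is concentrated at the start of a snapshot, when few blocks have been snapped.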
When a data block from the source volume has been copied to the snapshot volume, the original data block is said to have been snapped. After the copy-on-write is accomplished, the pointers to the respective data blocks must be updated (that is, the metadata must be updated).

#### B. Redirect-on-Write (RoW)

In the case of RoW, new user writes to the source volume are redirected to another volume, set aside for the snapshot.

### III. MODELING OF THE COPY-ON-WRITE SNAPSHOT

#### A. Markov Chain Model of the Probability of a Snap

The purpose of modeling the traditional copy-on-write snapshot was to understand how the probability of a snap changes as the snapshot takes place under an on-line transaction processing (OLTP) workload with a constant arrival rate. The probability of a snap is one of the state variables of the process to be controlled (the snapshot process). Also, the change of the probability of a snap as a function of time has implications for the stability of the FuSnap controller, as explained at the end of Section IV.

To understand how the probability of a snap changes, the equations that estimate it are derived. The snapshot process can be modeled as a process characterized by the binomial distribution, using a Markov Chain (MC) with a finite customer population. With $b$ data blocks already snapped out of $B_v$, the probability that a user write causes a snap is $(B_v - b)/B_v$. This formula corresponds to the intuitive expectation. If no data blocks have been snapped, then $b = 0$ and the probability of a user write causing a snap is 1. If all of the data blocks have been snapped, then $b = B_v$ and the probability of a write causing a snap is zero, which means no more snaps will occur.
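
The two boundary cases above can be checked with a short sketch. It assumes user writes land uniformly at random over the volume, so that the probability of a snap is simply the fraction of blocks not yet snapped; the function name `snap_probability` and the 1,000-block volume are hypothetical.

```python
def snap_probability(snapped_blocks: int, total_blocks: int) -> float:
    """Probability that a uniformly random user write hits an unsnapped block.

    Assumes uniform random access over the volume (an illustrative assumption).
    """
    if total_blocks <= 0 or not 0 <= snapped_blocks <= total_blocks:
        raise ValueError("need 0 <= snapped_blocks <= total_blocks and total_blocks > 0")
    return (total_blocks - snapped_blocks) / total_blocks


B_V = 1000  # hypothetical number of data blocks in the source volume

p_start = snap_probability(0, B_V)    # nothing snapped yet: every write snaps
p_end = snap_probability(B_V, B_V)    # everything snapped: no more snaps
p_late = snap_probability(900, B_V)   # 90% snapped: few writes still snap
```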
The MC models those probabilities. To derive the equation for the transient analysis of the MC, differential equations were obtained assuming equilibrium between the input and output flow of each state. The differential equations for the state probabilities $P_1(t)$ and $P_2(t)$ are solved in turn (the differential equation for $P_2(t)$ is (6)), and, by induction, their solutions give the probability of being in state $b$.

#### B. Practical Snapshot Probability Equation

The problem with (11) is that, in practice, the number of blocks $B_v$ that make up a volume is large. For example, a 64 GB source volume is made up of $B_v$ = 64 GB / 128 KB = 524,288 blocks. Computing the factorial of such large numbers can make (11) impractical. That is why the authors propose the use of the equivalent terms $p$ and $q$ of the binomial p.m.f. The probability of not causing a snap is described by (14), and it can be taken as the probability of not having a snapshot.

#### C. Model of the CoW Process

The model of the copy-on-write process is based on the response time delivered by disk drives under an OLTP workload. The two most important measures of the OLTP workload imposed on the disk array are the arrival rate in IOs per second (IO/s) and the response time in milliseconds (ms). Assuming the write cache memory is in write-through mode, the response time that the disk drives deliver under a given IO/s arrival rate is the key feature that determines the response time of the user accesses (reads or writes).
The response time of an access (read or write) from a disk, $t_{acc}$, is a function of the arrival rate on the disk, $\lambda_d$:

$$t_{acc} = f(\lambda_d)$$

The response time introduced by the copy-on-write process, $T_{cow}$, is caused by the delay of a read of the data block, $T_r$, from the disk where the source data block is located, plus the delay of the write of that data block, $T_w$, to the disks where the snapshot data block will be located:

$$T_{cow}(\lambda_d) = T_r(\lambda_d) + T_w(\lambda_d)$$

The capital "T" indicates that the response time is for large block transfers. The data blocks copied during the copy-on-write process are large compared to the user writes. For example, data blocks can be 128 KB in size, whereas user writes can be 8 KB in size.

A flow of user writes is received by a disk array. Some of the user writes, according to the probability $p_{snap}$, will cause a snap and will therefore have to wait for the copy-on-write before being carried out (the copy-on-write penalty). The other writes, according to the probability $1 - p_{snap}$, will be carried out directly. The arrival rate of the user writes, $\lambda_w$, along with the probability $p_{snap}$, determines the arrival rate that all disks in the disk array will receive, $\lambda_D$.

The copy-on-write process causes extra disk accesses on the disk array. If a write to a data block causes a snap that triggers a copy-on-write, then a data block (for example, 128 KB in size) has to be read from a disk and written to one or more other disks, depending on the RAID level used by the snapshot volume. For example, if RAID1 is used on the snapshot volume, then a copy-on-write generates one read of a data block from a disk and two writes to different disks. Therefore, three more disk accesses are generated in the background. The accesses generated by the copy-on-write that depend on the RAID level of the snapshot volume are defined by the factor $\alpha_{RL}$.
For RAID1, $\alpha_{RL} = 2$, which is the number of writes needed for each data write. Since each snap adds one read and $\alpha_{RL}$ writes, the total extra arrival rate on the disk array generated by the copy-on-writes, $\lambda_{cow}$, is:

$$\lambda_{cow} = p_{snap}\,\lambda_w\,(1 + \alpha_{RL})$$

The total arrival rate on the disk array, $\lambda_D$, combines this extra load with the load of the user reads and user writes. For the sake of simplicity, it was assumed that the arrival rate is balanced across all $N_d$ disks in the disk array, so the arrival rate on each disk is:

$$\lambda_d = \frac{\lambda_D}{N_d}$$

The snapshot process occurs while users are accessing the disk array. If a user write causes a snap, the user write has to wait for the snap to take place before proceeding (the copy-on-write penalty). Therefore, besides the normal response time for a user write, $t_w$, the response time is increased by the copy-on-write delay. In other words, the final response time that the user write experiences with a copy-on-write, $t_{cow}$, is the sum of the two:

$$t_{cow} = t_w + T_{cow}$$

The average response time for the user writes is:

$$\bar{t}_w = (1 - p_{snap})\,t_w + p_{snap}\,t_{cow}$$

This can be expressed more simply as:

$$\bar{t}_w = t_w + p_{snap}\,T_{cow}$$

#### D. Model of the Proposed CoW-RoW Process

This manuscript presents a snapshot process that reduces the response time during the snapping of the source volume. The presented snapshot process is a combination of the CoW and RoW processes, facilitated by the fuzzy controller. The snapshot process is modified by introducing a control input parameter named the snap throttle factor, $u_{th}$. This actuating variable (control input) represents the percentage of copy-on-writes that will be allowed out of all the snaps generated by user writes. The other snaps will generate a redirect-on-write. The total arrival rate on the disk array, $\lambda_D$, including user reads, changes accordingly.

The user writes will now experience a smaller response time, since the delay introduced by the redirect-on-writes, $t_{row}$, is significantly lower than $t_{cow}$.
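
The effect of the snap throttle factor on user writes can be sketched by weighting each write path by its probability: a write either causes no snap, or causes a snap that is served as a copy-on-write with probability $u_{th}$ and as a redirect-on-write otherwise. The function name `mean_write_time` and all timing values below are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def mean_write_time(p_snap, u_th, t_w, t_cow, t_row):
    """Probability-weighted response time of a user write, in ms.

    p_snap: probability that a write causes a snap
    u_th:   fraction of snaps served with a copy-on-write (the rest are redirected)
    t_w, t_cow, t_row: response times of a plain write, a write with
    copy-on-write, and a write with redirect-on-write
    """
    if not (0.0 <= p_snap <= 1.0 and 0.0 <= u_th <= 1.0):
        raise ValueError("p_snap and u_th must lie in [0, 1]")
    snap_path = u_th * t_cow + (1.0 - u_th) * t_row
    return (1.0 - p_snap) * t_w + p_snap * snap_path


# Hypothetical timings in ms: plain write, copy penalty, redirected write.
T_W, T_COPY, T_ROW = 8.0, 40.0, 9.0
T_COW = T_W + T_COPY  # a write that waits for a copy-on-write

cow_only = mean_write_time(0.5, 1.0, T_W, T_COW, T_ROW)   # traditional CoW
throttled = mean_write_time(0.5, 0.4, T_W, T_COW, T_ROW)  # 40% CoW, 60% RoW
```

With $u_{th} = 1$ the expression reduces to the plain copy-on-write average, so the throttled value can never be larger as long as $t_{row} \le t_{cow}$.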
The average response time experienced by user writes with the modified CoW-RoW process is expressed in the following equation, where, to make the equation more readable, the dependency on $\lambda_d$ is assumed for the delays $t_w$, $t_{cow}$ and $t_{row}$:

$$\bar{t}_w = (1 - p_{snap})\,t_w + p_{snap}\,\left[\,u_{th}\,t_{cow} + (1 - u_{th})\,t_{row}\,\right]$$

One possible simplification is to assume, for practical purposes, that redirect-on-writes cost the same as user writes, since the user write is redirected to the snapshot volume instead of the source volume with no other extra step in the process. This entails that $t_{row} \approx t_w$, and:

$$\bar{t}_w = (1 - p_{snap}\,u_{th})\,t_w + p_{snap}\,u_{th}\,t_{cow}$$

This equation shows why the response time is better with the CoW-RoW process if the snap throttle factor, $u_{th}$, is less than 1. This is one fundamental part of the process. The determination of the control input $u_{th}$ and the control of the snapshot process with the fuzzy controller are explained in the next section.

### IV. SNAPSHOT FUZZY CONTROL

#### A. Purpose and Rationale of FuSnap

The FuSnap controller can be considered a dynamic and optimal Takagi-Sugeno fuzzy-logic-based controller. Modeling hard disk drives has been an area of research for a long time, and some authors, like Shriver et al., have proposed analytical models for hard disk drives.

#### B. High-Level Modeling of FuSnap

The controlled system has two inputs: the arrival rate of writes, $\lambda_w$, and the arrival rate of reads, $\lambda_r$. The total arrival rate, $\lambda$, is the sum of the input parameters of the controlled system (the disk array):

$$\lambda = \lambda_w + \lambda_r$$

The outputs of the system to be controlled (the disk array) are the average response times experienced by the user accesses (reads or writes), $t_r$ and $t_w$. The state variables required by the FuSnap controller are 1) the probability of a snap in the volume, $p_{snap}$, which is a value in the [0,1] range; and 2) the number of copy-on-writes per time unit, in other words, the arrival rate of copy-on-writes in the disk array, $\lambda_{cow}$.
The control input variable is the snap throttle factor, $u_{th}$. The FuSnap controller also requires a reference variable: the reference response time, $w_{rt}$. The reference response time represents the maximum acceptable response time during the snapshot process. The maximum response time used in this paper was 30 ms. This value comes from the Oracle performance tuning guide [28], which gives it as a response time that is a good indication of an overly active I/O system.

In order to control the outputs, they have to be monitored periodically, every $T_m$ seconds. The decision on how often to monitor can be based on the maximum acceptable response time and the performance of the disk array controller. Each sample is identified by its sampling time $t_i$, the time at which the $i$-th sample of the output occurred, as in $y(t_i)$.

The equations for the outputs are based on the arrival rate imposed on the disks. Equation (28) can be used for the first output of the controller, the user write response time. The other output is the average response time for reads, $t_r$.

#### C. Decision Logic

If a user write causes a snap, FuSnap chooses among three possible actions: 1) perform a copy-on-write at the time when the user write is being served; 2) defer the copy-on-write operation by executing a redirect-on-write; 3) perform a copy-on-write of the target data block if a redirect-on-write already took place for that data block. The fuzzy controller throttles the snapshot process by controlling the percentage of copy-on-writes that are caused by user writes (option 1) versus the percentage of user writes with a deferred copy-on-write (option 2). This percentage is the output of the snapshot fuzzy controller and is named the snap throttle factor, $u_{th}$. For example, if $u_{th} = 0.4$, only 40% of the user writes that cause a snap will also generate a copy-on-write.
The other 60% of the user writes that cause a snap will generate a redirect-on-write.

#### D. Estimation and Fuzzification of the Probability of a Snap

The probability of a snap is used as part of the determination of the snap throttle factor. The fraction of snapped blocks, $f_{snap}(t_i)$, in addition to indicating the percentage of blocks snapped at time $t_i$, also determines the probability of further snaps. For example, if 90% of the blocks in a volume have been snapped, the probability of user accesses causing further snaps is only 10% (assuming random user access over the volume). The probability of a snap at time $t_i$ is:

$$p_{snap}(t_i) = 1 - f_{snap}(t_i)$$

The probability of a snap $p_{snap}(t_i)$, the error $e(t_i)$, and the change in error $\Delta e(t_i)$ are the three variables used by the fuzzy controller to compute the snap throttle factor, $u_{th}(t_i)$. In order to be used by FuSnap, these three variables first need to be fuzzified. The final fuzzification of the $p_{snap}$ value is denoted by $F_{psnap}(\mu_{snap})$.

#### E. Control Error Computation and Fuzzification

The output $y(t_i)$ is compared with the reference response time $w_{rt}$ to compute the control error, $e$:

$$e(t_i) = y(t_i) - w_{rt} \tag{41}$$

The change in the control error, $\Delta e$, is also computed:

$$\Delta e(t_i) = e(t_i) - e(t_{i-1})$$

The final goal in the fuzzification of the control error $e$ and the change in the control error $\Delta e$ is to map each of them to one of three fuzzy descriptors: Zero (ZE), Positive Error (PE), and Negative Error (NE). These fuzzy descriptors apply to both $e$ and $\Delta e$. They indicate when the control error is close to zero or, when an error does exist, whether it is positive or negative. This fuzzification is first performed via three triangular membership functions, $\mu_{ZE}$, $\mu_{NE}$ and $\mu_{PE}$, based on the reference response time $w_{rt}$.
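
The error computation and its mapping to a fuzzy descriptor can be sketched as follows. The membership shapes are assumptions made for illustration: a triangular ZE centered at zero and saturating NE and PE shoulders, all with a hypothetical half-width of $w_{rt}/2$. They are not the membership functions of the paper, whose exact definitions are not reproduced here. The descriptor is chosen as the one with the largest membership value.

```python
def control_error(y, w_rt):
    """Control error: measured response time minus the reference, in ms."""
    return y - w_rt


def error_change(e_now, e_prev):
    """Change in the control error between two consecutive samples."""
    return e_now - e_prev


def memberships(err, w_rt):
    """Membership degrees of an error value (assumed shapes, see lead-in)."""
    half = w_rt / 2.0                            # hypothetical half-width
    ze = max(0.0, 1.0 - abs(err) / half)         # triangle centered at zero
    pe = min(1.0, max(0.0, err / half))          # shoulder, saturates for large +err
    ne = min(1.0, max(0.0, -err / half))         # shoulder, saturates for large -err
    return {"NE": ne, "ZE": ze, "PE": pe}


def fuzzify(err, w_rt):
    """Map an error (or change in error) to the descriptor with maximum membership."""
    degrees = memberships(err, w_rt)
    return max(degrees, key=degrees.get)


W_RT = 30.0                      # reference response time in ms, as in the text
e1 = control_error(45.0, W_RT)   # response time above the reference
label = fuzzify(e1, W_RT)
```

The same `fuzzify` function serves both $e$ and $\Delta e$, mirroring the statement that the descriptors apply to both variables.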
The membership functions are described using a dummy error variable, $\varepsilon$, since they are the same for both $e$ and $\Delta e$.

To finish the fuzzification, the control error $e$ and the change in control error $\Delta e$ are mapped into one of the fuzzy descriptors (NE, ZE, or PE). This is accomplished by comparing the values obtained for the three membership functions (43), (44), and (45). Depending on which of the three has the maximum value, the fuzzy value of the error, $F_e$, and the fuzzy value of the change in error, $F_{\Delta e}$, are mapped into one of the fuzzy descriptors NE, ZE or PE. For example, if the output $y(t_1)$ is 45 ms, then using (41) the error $e$ is 15 ms.

#### F. Rule Base to Obtain $u_{th}$

The rule base can now be built based on the following heuristic criteria. The first criterion is: if the user response time is high, then the control error, $e$, is a fuzzy positive error, PE, and the controller needs to reduce the number of copy-on-writes occurring. Therefore, the snap throttle factor $u_{th}$ is reduced. The second criterion is: if the user response time is low, then the controller can increase the number of copy-on-writes occurring. Therefore, the snap throttle factor $u_{th}$ is increased. The probability of more copy-on-writes and the change in error are also taken into account.

Once the three fuzzified input variables $e$, $\Delta e$, and $p_{snap}$ are estimated, the next step is the evaluation of the fuzzy rules. The output of the fuzzy rules is the change in the snap throttle factor, $\Delta u_{th}(t_i)$. This value denotes the change in the snap throttle factor for the current iteration.

#### G. Stability of the Fuzzy Controller

The fuzzy system presented here is globally asymptotically stable because it meets the required condition on the state variables.

### V. EXPERIMENTAL RESULTS

#### A. Results on a Small Setup with 8 Disks

The FuSnap controller was tested with a setup that consisted of an HP 7640 Itanium workstation with 64 GB of memory and with HPUX 11.23 installed. An MC534C fibre channel disk enclosure was filled with eight BF072255B2C disks. The traditional copy-on-write and FuSnap were implemented in C and compiled with HP cc. The implementation was executed as a parent process in user space and not as a part of the kernel. The parent process performed the following functions: 1) spawned user requests at a constant rate using the fork() Unix function; 2) kept track of the data blocks written, snapped, or redirected with a redirect-on-write (the data block table was in shared memory so that it could be updated by the spawned user requests); 3) monitored the response time of the user requests; 4) implemented the FuSnap control logic.

Using this setup, a comparison was run with an 8 KB workload, 50% reads, at 500 IO/s. The source volume was a 4 GB RAID1 volume using data blocks of 128 KB laid out evenly over all 8 disks.

#### B. Results on a Setup with 32 Disks

The FuSnap controller was also tested with a setup that consisted of an HP 7640 Itanium workstation with 64 GB of memory and with RH Linux 2.6.18 installed. Four M6412A fibre channel disk enclosures were filled with twelve BF146DA47C disks. The traditional copy-on-write and FuSnap were implemented in C and compiled with gcc. The implementation details were the same as those used in the previous setup with eight disks.

Using this setup, a comparison was run with an 8 KB workload, 50% reads, at 1,000 IO/s. The source volume was a 16 GB RAID1 volume using data blocks of 128 KB laid out evenly over all 32 disks.

### VI. CONCLUSIONS

The greatest benefit FuSnap delivers is to avoid the high response time peak at the beginning of a snapshot process, as predicted by the equations. The FuSnap controller can provide two benefits: 1) help in ensuring quality-of-service (QoS) where a database needs constant access and 2) make the backup of data a less disruptive process for the users of a database.

Milos Manic has over 20 years of academic and industrial experience, including an appointment at the ECE Dept. and Neuroscience program at the University of Idaho, Moscow. As university collaborator or principal investigator he has led a number of research grants with the Idaho National Laboratory, NSF, EPSCoR, Dept. of Air Force, and Hewlett-Packard, in the area of data mining and computational intelligence applications in process control, network security and infrastructure protection. Dr. Manic has published over one hundred refereed articles in international journals, books, and conferences.

### REFERENCES