C.K. Wong and M.C. Easton. An Efficient Method for Weighted Sampling without Replacement. SIAM Journal on Computing, 9(1):111--113, February 1980.
....I employ some of these techniques for query sampling. In Table 3 I list the major results, with citations to the relevant algorithms.

| Type of sampling | Citation | Expected Disk Accesses |
|---|---|---|
| SRSWR | [Yao77] | $O(s)$ |
| SRSWR, variable blocking | [ORX90, OR90] | $O(s(b_{max}/b_{avg}))$ |
| SRSWOR | [EN82] | $O(s)$ |
| Weighted RS | [WE80] | $O(s \log n)$ |
| Sequential RS, known population size | [FMR62] | $O(n/b_{avg})$ |
| | [Vit84] | $O(s)$ |
| Sequential RS, unknown population size | | $O(n/b_{avg})$ |
| | [Vit85] | $O(s(1 + \log(n/s)))$ |

Table 2.1: Basic Sampling Techniques from a single file. Assume each sample is taken from a distinct disk page, i.e. $s \ll n/b_{avg}$. For ....
....updated. Other methods, such as the partial sum tree method discussed below, require preprocessing the entire table of weights. Acceptance/rejection sampling has long been used for non-uniform pseudorandom number generation [Rub81]. 2.7.2 Partial Sum Trees. Wong and Easton [WE80] proposed using binary partial sum trees to expedite weighted sampling. As above, consider the file of N records, in which each record r_j has inclusion probability w_j in a sample of size 1. Binary partial sum trees are simply binary trees with N leaves, each containing one record r_j and its ....
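The partial sum tree idea described in this excerpt can be sketched as follows. This is a minimal illustration rather than Wong and Easton's exact formulation: it assumes non-negative integer weights, stores the tree in an array padded to a power of two, and the names `PartialSumTree`, `find`, `update`, and `sample_without_replacement` are hypothetical. Each draw uses a single uniform random integer below the total weight, and a drawn record is removed by setting its weight to zero.

```python
import random


class PartialSumTree:
    """Array-based binary tree: leaves hold weights, each internal node
    holds the sum of its two children (assumed layout, not from the paper)."""

    def __init__(self, weights):
        self.n = len(weights)
        # Round the leaf count up to a power of two; padding leaves weigh 0.
        self.size = 1
        while self.size < self.n:
            self.size *= 2
        self.tree = [0] * (2 * self.size)
        for j, w in enumerate(weights):
            self.tree[self.size + j] = w
        for node in range(self.size - 1, 0, -1):
            self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]

    def total(self):
        return self.tree[1]

    def update(self, j, w):
        """Set the weight of record j to w and repair the sums up to the root."""
        node = self.size + j
        self.tree[node] = w
        node //= 2
        while node >= 1:
            self.tree[node] = self.tree[2 * node] + self.tree[2 * node + 1]
            node //= 2

    def find(self, u):
        """Return the record index whose cumulative-weight interval contains u,
        for an integer u with 0 <= u < total()."""
        node = 1
        while node < self.size:
            left = 2 * node
            if u < self.tree[left]:
                node = left
            else:
                u -= self.tree[left]
                node = left + 1
        return node - self.size

    def sample(self, rng=random):
        # One uniform integer in [0, total weight) per generated variate.
        return self.find(rng.randrange(self.total()))


def sample_without_replacement(weights, s, rng=random):
    """Draw s distinct record indices, each draw proportional to remaining weight."""
    tree = PartialSumTree(weights)
    chosen = []
    for _ in range(s):
        j = tree.sample(rng)
        chosen.append(j)
        tree.update(j, 0)  # zero weight removes the record from later draws
    return chosen
```

Both `find` and `update` touch one node per tree level, which is where the O(log N) generation and update costs come from. In this sketch `sample_without_replacement` raises `ValueError` if `s` exceeds the number of records with positive weight, because `randrange(0)` is not defined.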
[Article contains additional citation context not shown here]
C.K. Wong and M.C. Easton. An Efficient Method for Weighted Sampling without Replacement. SIAM Journal on Computing, 9(1):111--113, February 1980.
....measures of efficiency are the generation time and the update time. We can rerun Walker's algorithm each time a weight is updated, but the update cost O(N) is too high. Up until recently, the best known algorithm for the dynamic problem was the binary-tree-based scheme developed by Wong and Easton [26], whose generation and update times are both O(log N). Each generation requires one call to a random number generator that provides a uniform random integer in the range $[0, \sum_{1 \le i \le N} w_i)$. Recently, Rajasekaran and Ross [21] and Greenberg and Vitter [12] developed different algorithms for the ....
....upper bound on the expected Step 2 cost: $\sum_{1 \le k \le n} k \cdot 2^{j_k - j_1 + 1} \le \sum_{1 \le k \le n} k \, 2^{-k+2}$. In Step 3 we walk down the levels from $R_j$ in constant expected time per level, using the rejection method, using a total of O(log N) expected time. The dynamic scheme of Wong and Easton [26] uses O(log N) time per generation, but it requires only one call to a random number generator that outputs a uniform number in the range $[0, \sum_{1 \le i \le N} w_i)$. Our algorithm uses an average of at most about 2L calls to a uniform random number generator, primarily due to Step 3. It may be possible to ....
[Article contains additional citation context not shown here]
C. K. Wong and M. C. Easton. An Efficient Method for Weighted Sampling without Replacement, SIAM Journal on Computing, 9(1):111--114, 1980.
....some type of randomized probing operation. Augmented trees can be used, as can some variation of acceptance/rejection (A/R) sampling, or some combination. This subsection shows how to emulate both types of sampling. Sampling is easy using trees augmented with ranks [KNUT73] or other weighting [WONG80] information. To sample from an n-record index, we choose a random number k ∈ [1, n] and return the kth record by following the pointers whose corresponding ranges contain k (see Figure 4(a)). This is discussed in undergraduate textbooks [CORM90]. A/R sampling is more complex. Conceptually, we ....
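The rank-based probing described in this excerpt can be sketched as below. This is an illustrative assumption, not the cited paper's index structure: it uses a plain balanced binary search tree in which every node stores its subtree size, and the names `Node`, `build`, `select`, and `sample_uniform` are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: object
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    size: int = 1  # number of records stored in this subtree


def build(sorted_keys, lo=0, hi=None):
    """Build a balanced search tree over sorted_keys[lo:hi], recording subtree sizes."""
    if hi is None:
        hi = len(sorted_keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    node = Node(sorted_keys[mid])
    node.left = build(sorted_keys, lo, mid)
    node.right = build(sorted_keys, mid + 1, hi)
    node.size = hi - lo
    return node


def select(node, k):
    """Return the k-th record (1-based) by following the child whose rank range contains k."""
    while node is not None:
        left_size = node.left.size if node.left else 0
        if k <= left_size:
            node = node.left
        elif k == left_size + 1:
            return node.key
        else:
            k -= left_size + 1  # skip the left subtree and this node
            node = node.right
    raise IndexError("k out of range")


def sample_uniform(root, rng=random):
    """Pick k uniformly from 1..n and return the k-th record."""
    k = rng.randint(1, root.size)
    return select(root, k)
```

For example, `select(build([10, 20, 30, 40, 50]), 4)` returns `40`. Replacing the subtree sizes with subtree weight sums turns the same descent into the weighted variant.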
C. K. Wong and M. C. Easton, "An Efficient Method for Weighted Sampling Without Replacement," SIAM J. Computing 9, 1 (Feb. 1980), 111-113.
....R-tree. The (explicitly stored) cardinality estimate c 0 for each node entry appears next to its pointer, along with the corresponding (derived) values for the bounds c and c.[2] (Figure 1(b) will be discussed in Section 3.1.) Footnote 1: What we describe here is essentially a partial sum tree [KNUT75, WONG80]. The term "ranks" arose from the use of cardinality information in cumulative partial sum trees [KNUT73] and comes to the database literature via [OLKE89]. Footnote 2: In Figure 1(a), the values for c and c are computed using the example formula from [ANTO92] using parameter values A = 1 2 and ....
C.K. Wong and M.C. Easton, "An Efficient Method for Weighted Sampling Without Replacement," SIAM J. Comp. 9, 1 (Feb.
....of efficiency are the generation time and the update time. We can rerun Walker's algorithm each time a weight is updated, but the update cost O(N) is too high. Up until recently, the best known algorithm for the dynamic problem was the binary-tree-based scheme developed by Wong and Easton [11], whose generation and update times are both O(log N). Each generation requires one call to a random number generator that provides a uniform random number in the range $[0, \sum_{1 \le i \le N} w_i)$. Recently, Rajasekaran and Ross [8] and Greenberg and Vitter [5] developed different algorithms for the dynamic ....
....using a total of O(log N) expected time and O(log N) expected calls to a uniform random number generator. Theorem 1. The expected cost for generating a random variate according to the current weights is O(log N), where N is the number of elements. The dynamic scheme of Wong and Easton [11] uses O(log N) time per generation, but it requires only one call to a random number generator that outputs a uniform number in the range $[0, \sum_{1 \le i \le N} w_i)$. Our algorithm uses an average of at most about 2L calls to a uniform random number generator, primarily due to Step 3. It may be possible to ....
C. K. Wong and M. C. Easton. An Efficient Method for Weighted Sampling without Replacement, SIAM Journal on Computing, 9(1):111--114, 1980.
....measures of efficiency are the generation time and the update time. We can rerun Walker's algorithm each time a weight is updated, but the update cost O(N) is too high. Up until recently, the best known algorithm for the dynamic problem was the binary-tree-based scheme developed by Wong and Easton [26], whose generation and update times are both O(log N). Each generation requires one call to a random number generator that provides a uniform random integer in the range $[0, \sum_{1 \le i \le N} w_i)$. Recently, Rajasekaran and Ross [21] and Greenberg and Vitter [12] developed different algorithms for the ....
....bound on the expected Step 2 cost: $\sum_{1 \le k \le n} k \cdot 2^{j_k - j_1 + 1} \le \sum_{1 \le k \le n} k \, 2^{-k+2}$. In Step 3 we walk down the levels from $R_j$ in constant expected time per level, using the rejection method, using a total of O(log N) expected time. The dynamic scheme of Wong and Easton [26] uses O(log N) time per generation, but it requires only one call to a random number generator that outputs a uniform number in the range $[0, \sum_{1 \le i \le N} w_i)$. Our algorithm uses an average of at most about 2L calls to a uniform random number generator, primarily due to Step 3. It may be possible to ....
[Article contains additional citation context not shown here]
C. K. Wong and M. C. Easton. An Efficient Method for Weighted Sampling without Replacement, SIAM Journal on Computing, 9(1):111--114, 1980.
....I employ some of these techniques for query sampling. In Table 3 I list the major results, with citations to the relevant algorithms.

| Type of sampling | Citation | Expected Disk Accesses |
|---|---|---|
| SRSWR | [Yao77] | $O(s)$ |
| SRSWR, variable blocking | [ORX90, OR90] | $O(s(b_{max}/b_{avg}))$ |
| SRSWOR | [EN82] | $O(s)$ |
| Weighted RS | [WE80] | $O(s \log n)$ |
| Sequential RS, known population size | [FMR62] | $O(n/b_{avg})$ |
| | [Vit84] | $O(s)$ |
| Sequential RS, unknown population size | | $O(n/b_{avg})$ |
| | [Vit85] | $O(s(1 + \log(n/s)))$ |

Table 2.1: Basic Sampling Techniques from a single file. Assume each sample is taken from a distinct disk page, i.e. $s \ll n/b_{avg}$. For ....
....as the partial sum tree method discussed below, require preprocessing the entire table of weights. Acceptance/rejection sampling has long been used for non-uniform pseudorandom number generation [Rub81]. 2.7.2 Partial Sum Trees. Wong and Easton [WE80] proposed using binary partial sum trees to expedite weighted sampling. As above, consider the file of N records, in which each record r_j has inclusion probability w_j in a sample of size 1. Binary partial sum trees are simply binary trees with N leaves, each containing one record r_j and its ....
[Article contains additional citation context not shown here]
C.K. Wong and M.C. Easton. An Efficient Method for Weighted Sampling without Replacement. SIAM Journal on Computing, 9(1):111--113, February 1980.
No context found.
C.K. Wong and M.C. Easton, "An efficient method for weighted sampling without replacement," SIAM Journal on Computing 9: 111-113 (1980).
No context found.
C. K. Wong and M. C. Easton. An Efficient Method for Weighted Sampling Without Replacement. SIAM Journal on Computing, 9(1):111-113, February 1980.