Results 1-10 of 128
Generalized binary search
In Proceedings of the 46th Allerton Conference on Communications, Control, and Computing, 2008
Cited by 59 (0 self)
This paper addresses the problem of noisy Generalized Binary Search (GBS). GBS is a well-known greedy algorithm for determining a binary-valued hypothesis through a sequence of strategically selected queries. At each step, a query is selected that most evenly splits the hypotheses under consideration into two disjoint subsets, a natural generalization of the idea underlying classic binary search. GBS is used in many applications, including fault testing, machine diagnostics, disease diagnosis, job scheduling, image processing, computer vision, and active learning. In most of these cases, the responses to queries can be noisy. Past work has provided a partial characterization of GBS, but existing noise-tolerant versions of GBS are suboptimal in terms of query complexity. This paper presents an optimal algorithm for noisy GBS and demonstrates its application to learning multidimensional threshold functions.
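The query-selection rule described in this abstract can be illustrated with a minimal sketch. It covers only the noiseless case, not the noise-tolerant algorithm the paper presents, and the function name, the +1/-1 response encoding, and the threshold example are assumptions made for illustration.

```python
def generalized_binary_search(hypotheses, oracle):
    """Noiseless GBS sketch (hypothetical helper, not the paper's algorithm).

    hypotheses: list of equal-length tuples; hypotheses[h][q] is the +1/-1
                response that hypothesis h predicts for query q.
    oracle:     callable q -> +1/-1 giving the true, noise-free response.
    Returns (indices of hypotheses still consistent, number of queries asked).
    """
    candidates = list(range(len(hypotheses)))
    num_queries = len(hypotheses[0])
    asked = 0
    while len(candidates) > 1:
        # A query splits the candidates evenly when their predicted
        # responses sum to (nearly) zero.
        def imbalance(q):
            return abs(sum(hypotheses[h][q] for h in candidates))

        best_q = min(range(num_queries), key=imbalance)
        if imbalance(best_q) == len(candidates):
            break  # no query distinguishes the remaining candidates
        answer = oracle(best_q)
        candidates = [h for h in candidates if hypotheses[h][best_q] == answer]
        asked += 1
    return candidates, asked


# Example: one-dimensional thresholds h_t(x) = +1 if x >= t else -1.
thresholds = [tuple(1 if x >= t else -1 for x in range(8)) for t in range(8)]
true_h = 5
remaining, asked = generalized_binary_search(
    thresholds, lambda q: thresholds[true_h][q]
)
print(remaining, asked)  # [5] 3
```

For eight threshold hypotheses the sketch needs three queries, matching classic binary search, which is the special case that GBS generalizes.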
Boosting multicore reachability performance with shared hash tables
In: Formal Methods in Computer-Aided Design, 2010
Cited by 15 (4 self)
This paper focuses on data structures for multicore reachability, which is a key component in model checking algorithms and other verification methods. A cornerstone of an efficient solution is the storage of visited states. In related work, static partitioning of the state space was combined with thread-local storage and resulted in reasonable speedups, but left open whether improvements are possible. In this paper, we present a scaling solution for shared state storage which is based on a lockless hash table implementation. The solution is specifically designed for the cache architecture of modern CPUs. Because model checking algorithms impose loose requirements on the hash table operations, their design can be streamlined substantially compared to related work on lockless hash tables. Still, an implementation of the hash table presented here has dozens of sensitive performance parameters (bucket size, cache line size, data layout, probing sequence, etc.). We analyzed their impact and compared the resulting speedups with related tools. Our implementation outperforms two state-of-the-art multicore model checkers (SPIN and DiVinE) by a substantial margin, while placing fewer constraints on the load balancing and search algorithms.
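To make the role of visited-state storage concrete, here is a single-threaded sketch of a reachability search over an open-addressing hash table. It does not reproduce the paper's lockless, cache-aware design. The combined `find_or_put` operation, the linear probing, and all names are assumptions made for illustration.

```python
class VisitedSet:
    """Fixed-capacity open-addressing table (sequential sketch only)."""

    def __init__(self, capacity=1024):
        self.slots = [None] * capacity

    def find_or_put(self, state):
        """Return True if state was already stored; otherwise store it
        and return False. Uses linear probing (an assumed probe sequence)."""
        n = len(self.slots)
        i = hash(state) % n
        for _ in range(n):
            if self.slots[i] is None:
                self.slots[i] = state
                return False
            if self.slots[i] == state:
                return True
            i = (i + 1) % n
        raise MemoryError("visited-state table is full")


def count_reachable(initial, successors, capacity=1024):
    """Depth-first reachability; returns the number of distinct states."""
    visited = VisitedSet(capacity)
    visited.find_or_put(initial)
    stack = [initial]
    count = 1
    while stack:
        state = stack.pop()
        for nxt in successors(state):
            if not visited.find_or_put(nxt):
                stack.append(nxt)
                count += 1
    return count


# Toy state space: integers modulo 10 with two successor rules.
print(count_reachable(0, lambda s: [(s + 1) % 10, (s * 2) % 10]))  # 10
```

The search never deletes or resizes, which is one plausible reading of the "loose requirements" mentioned in the abstract. A concurrent version would need atomic slot updates, which this sketch omits.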
Markov Random Field Modeling, Inference & Learning in Computer Vision & Image Understanding: A Survey
2013
Scalable diversified ranking on large graphs
In ICDM, 2011
Cited by 13 (3 self)
Enhancing diversity in ranking on graphs has been identified as an important retrieval and mining task. Nevertheless, many existing diversified ranking algorithms do not scale to large graphs because they have high time or space complexity. In this paper, we propose a scalable algorithm to find the top-K diversified ranking list on graphs. The key idea of our algorithm is that we first compute the PageRank of the nodes of the graph, and then run a carefully designed vertex selection algorithm to find the top-K diversified ranking list. Specifically, we first present a new diversified ranking measure that captures both relevance and diversity. Second, we prove the submodularity of the proposed measure. We then propose an efficient greedy algorithm, with time and space complexity linear in the size of the graph, that achieves near-optimal diversified ranking. Finally, we evaluate the proposed method through extensive experiments on four real networks. The experimental results indicate that the proposed method outperforms existing diversified ranking algorithms in both the diversity of the ranking and efficiency.
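The greedy selection step can be sketched as follows. The abstract does not define the diversified ranking measure, so this sketch assumes a coverage-style objective: the total relevance score of the selected nodes and their neighbors. The scores are taken as given rather than computed by PageRank. This simple version rescans all nodes in every round, so it does not achieve the linear complexity the paper claims.

```python
def greedy_diversified_top_k(neighbors, score, k):
    """Greedily pick k nodes maximizing an assumed coverage objective.

    neighbors: dict node -> iterable of adjacent nodes.
    score:     dict node -> relevance score (e.g. a precomputed PageRank).
    """
    covered = set()
    selected = []
    for _ in range(min(k, len(neighbors))):
        best, best_gain = None, -1.0
        for v in neighbors:
            if v in selected:
                continue
            # Marginal gain: score of nodes newly covered by v and its neighbors.
            newly_covered = ({v} | set(neighbors[v])) - covered
            gain = sum(score[u] for u in newly_covered)
            if gain > best_gain:
                best, best_gain = v, gain
        selected.append(best)
        covered |= {best} | set(neighbors[best])
    return selected


# Two clusters: {a, b, c} and {d, e, f}. Scores are illustrative values.
graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"],
         "d": ["e", "f"], "e": ["d"], "f": ["d"]}
scores = {"a": 0.3, "b": 0.25, "c": 0.2, "d": 0.15, "e": 0.1, "f": 0.05}
print(greedy_diversified_top_k(graph, scores, 2))  # ['a', 'd']
```

Ranking by score alone would return `a` and `b`, both from the same cluster. The coverage objective picks `d` second because `b` adds nothing once `a` is selected.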
SimMatrix: SIMulator for MAny-Task computing execution fabRIc at eXascale
Cited by 10 (8 self)
Exascale computers (expected to be composed of millions of nodes and billions of threads of execution) will enable the unraveling of significant scientific mysteries. Many-task computing is a distributed paradigm, which can potentially address three of the four major challenges of exascale computing, namely Memory/Storage, Concurrency/Locality, and Resiliency. Exascale computing will require efficient job scheduling/management systems that are several orders of magnitude beyond the state-of-the-art, which tend to have centralized architecture and are relatively heavyweight. This paper proposes a lightweight discrete event simulator, SimMatrix, which simulates job scheduling systems comprising millions of nodes and billions of cores/tasks. SimMatrix supports both centralized (e.g. first-in-first-out) and distributed (e.g. work stealing) scheduling. We validated SimMatrix against two real systems, Falkon and MATRIX, with up to 4K cores, running on an IBM Blue Gene/P system, and compared SimMatrix with SimGrid and GridSim in terms of resource consumption at scale. Results show that SimMatrix consumes up to two orders of magnitude lower memory per task, and at least one order of magnitude (and up to four orders of magnitude) lower time per task overheads. For example, running a workload of 10 billion tasks on 1 million nodes and 1 billion cores required 142 GB memory and 163 CPU-hours. These relatively low costs at exascale levels of concurrency will lead to innovative studies in scheduling algorithms at unprecedented scales.
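A discrete event simulation of centralized first-in-first-out scheduling can be sketched in a few lines. This is not SimMatrix's design. The function name and the model (identical workers, no dispatch overhead, all tasks queued at time zero) are assumptions made for illustration.

```python
import heapq


def simulate_fifo(task_durations, num_workers):
    """Simulate a central FIFO queue feeding identical workers.

    The event list is a min-heap of (time the worker becomes free, worker id);
    simulated time jumps from one completion event to the next.
    Returns the makespan (time at which the last task finishes).
    """
    free_at = [(0.0, w) for w in range(num_workers)]
    heapq.heapify(free_at)
    makespan = 0.0
    for duration in task_durations:  # tasks are dispatched in arrival order
        start, worker = heapq.heappop(free_at)  # earliest available worker
        end = start + duration
        makespan = max(makespan, end)
        heapq.heappush(free_at, (end, worker))
    return makespan


print(simulate_fifo([3, 1, 2, 4], num_workers=2))  # 7.0
```

Each task costs one heap pop and one push, so the sketch keeps only one small record per worker in memory rather than one per task.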
Memory Trace Oblivious Program Execution
Cited by 10 (3 self)
Cloud computing allows users to delegate data and computation to cloud service providers, at the cost of giving up physical control of their computing infrastructure. An attacker (e.g., insider) with physical access to the computing platform can perform various physical attacks, including probing memory buses and cold-boot-style attacks. Previous work on secure (co)processors provides hardware support for memory encryption and prevents direct leakage of sensitive data over the memory bus. However, an adversary snooping on the bus can still infer sensitive information from the memory access traces. Existing work on Oblivious RAM (ORAM) provides a solution in which users put all data in an ORAM, and accesses to the ORAM are obfuscated such that no information leaks through memory access traces. This method, however, incurs significant memory access overhead. This work is the first to leverage programming language techniques to offer efficient memory-trace oblivious program execution, while providing formal security guarantees. We formally define the notion of memory-trace obliviousness, and provide a type system for verifying that a program satisfies this property. We also describe a compiler that transforms a program into a structurally similar one that satisfies memory-trace obliviousness. To achieve optimal efficiency, our compiler partitions variables into several small ORAM banks rather than one large one, without risking security. We use several example programs to demonstrate the efficiency gains our compiler achieves in comparison with the naive method of placing all variables in the same ORAM.
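The property of memory-trace obliviousness can be illustrated with a toy model: an array read whose sequence of touched addresses does not depend on the secret index. This linear-scan read is a textbook baseline, not the paper's ORAM-bank compiler or type system. The explicit `trace` list standing in for the memory bus is an assumption, and Python itself offers no real side-channel guarantees.

```python
def direct_read(array, secret_index, trace):
    """Ordinary read: the trace reveals the secret index."""
    trace.append(secret_index)
    return array[secret_index]


def oblivious_read(array, secret_index, trace):
    """Linear-scan read: every slot is touched regardless of the secret,
    so the recorded trace is identical for all secret indices."""
    result = 0
    for i, value in enumerate(array):
        trace.append(i)  # model of the address seen on the memory bus
        is_target = int(i == secret_index)
        # Arithmetic select instead of a secret-dependent branch.
        result = is_target * value + (1 - is_target) * result
    return result


data = [10, 20, 30, 40]
trace_a, trace_b = [], []
value_a = oblivious_read(data, 1, trace_a)
value_b = oblivious_read(data, 3, trace_b)
print(value_a, value_b, trace_a == trace_b)  # 20 40 True
```

The scan costs time linear in the array length per access. The abstract's point about splitting variables into several small banks can be read as reducing this kind of per-access overhead.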
AsDroid: Detecting stealthy behaviors in android applications by user interface and program behavior contradiction
In Proceedings of the IEEE/ACM International Conference on Software Engineering (ICSE), 2014
Cited by 9 (1 self)
Android smartphones are becoming increasingly popular. The open nature of Android allows users to install miscellaneous applications, including malicious ones, from third-party marketplaces without rigorous sanity checks. A large portion of existing malware performs stealthy operations such as sending short messages, making phone calls and HTTP connections, and installing additional malicious components. In this paper, we propose a novel technique to detect such stealthy behavior. We model stealthy behavior as program behavior that mismatches the user interface, which denotes the user’s expectation of program behavior. We use static program analysis to associate a top-level function, usually a user interaction function, with the behavior it performs. Then we analyze the text extracted from the user interface component associated with the top-level function. A semantic mismatch between the two indicates stealthy behavior. To evaluate AsDroid, we download a pool of 182 apps that are potentially problematic, judging by their permissions. Among the 182 apps, AsDroid reports stealthy behaviors in 113 apps, with 28 false positives and 11 false negatives.
Kernels for Global Constraints
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, 2011
Cited by 8 (6 self)
Bessière et al. (AAAI’08) showed that several intractable global constraints can be efficiently propagated when certain natural problem parameters are small. In particular, the complete propagation of a global constraint is fixed-parameter tractable in k, the number of holes in domains, whenever bound consistency can be enforced in polynomial time; this applies to the global constraints ATMOST-NVALUE and EXTENDED GLOBAL CARDINALITY (EGC). In this paper we extend this line of research and introduce the concept of reduction to a problem kernel, a key concept of parameterized complexity, to the field of global constraints. In particular, we show that the consistency problem for ATMOST-NVALUE constraints admits a linear-time reduction to an equivalent instance on O(k^2) variables and domain values. This small kernel can be used to speed up the complete propagation of NVALUE constraints. We contrast this result by showing that the consistency problem for EGC constraints does not admit a reduction to a polynomial problem kernel unless the polynomial hierarchy collapses.
Reconstruction of complete interval tournaments. II.
2010
Cited by 7 (3 self)
Let a, b (b ≥ a) and n (n ≥ 2) be nonnegative integers and let T(a, b, n) be the set of generalised tournaments in which every pair of distinct players is connected by at least a and at most b arcs. In [40] we gave a necessary and sufficient condition to decide whether a given sequence of nonnegative integers D = (d_1, d_2, ..., d_n) can be realized as the outdegree sequence of a T ∈ T(a, b, n). Extending the results of [40], we show that for any sequence of nonnegative integers D there exist f and g such that some element T ∈ T(g, f, n) has D as its outdegree sequence, and for any (a, b, n)-tournament T′ with the same outdegree sequence D, a ≤ g and b ≥ f hold. We propose a Θ(n) algorithm to determine f and g and an O(d_n n^2) algorithm to construct a corresponding tournament T.
Efficient resource oblivious algorithms for multicores with false sharing
In Proc. IEEE IPDPS, 2012
Cited by 6 (1 self)
We consider algorithms for a multicore environment in which each core has its own private cache and false sharing can occur. False sharing happens when two or more processors access the same block (i.e., cache line) in parallel, and at least one processor writes into a location in the block. False sharing causes different processors to have inconsistent views of the data in the block, and many of the methods currently used to resolve these inconsistencies can cause large delays. We analyze the cost of false sharing both for variables stored on the execution stacks of the parallel tasks and for output variables. Our main technical contribution is to establish a low cost for this overhead for the class of multithreaded block-resilient HBP (Hierarchical Balanced Parallel) computations. Using this and other techniques, we develop block-resilient HBP algorithms with low false sharing costs for several fundamental problems including scans, matrix multiplication, FFT, sorting, and hybrid block-resilient HBP algorithms for list ranking and graph connected components. Most of these algorithms are derived from known multicore algorithms, but are further refined to achieve a low false sharing overhead. Our algorithms make no mention of machine parameters, and our analysis of the false sharing overhead is mostly in terms of the number of tasks generated in parallel during the computation, and thus applies to a variety of schedulers. Keywords: false sharing; cache efficiency; multicores.
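The definition of false sharing given in this abstract can be turned into a small checker over a log of memory accesses. This is only an illustration of the definition, not the paper's cost analysis. The access-log format, the 64-byte block size, and the assumption that all logged accesses happen in parallel are hypothetical.

```python
def blocks_with_false_sharing(accesses, block_size=64):
    """Return the sorted block indices that match the abstract's definition:
    accessed by two or more tasks, with at least one write.

    accesses: list of (task_id, byte_address, is_write) tuples, all assumed
              to occur in parallel.
    """
    by_block = {}
    for task, address, is_write in accesses:
        by_block.setdefault(address // block_size, []).append((task, is_write))

    flagged = []
    for block, entries in sorted(by_block.items()):
        tasks = {task for task, _ in entries}
        any_write = any(is_write for _, is_write in entries)
        if len(tasks) > 1 and any_write:
            flagged.append(block)
    return flagged


log = [
    (0, 0, True),     # task 0 writes byte 0   -> block 0
    (1, 8, False),    # task 1 reads byte 8    -> block 0 (same block)
    (0, 128, True),   # task 0 writes byte 128 -> block 2
    (1, 200, True),   # task 1 writes byte 200 -> block 3
]
print(blocks_with_false_sharing(log))  # [0]
```

In the example, moving task 1's data from byte 8 to byte 64 would place it in its own block and clear the flag, which is the usual padding remedy.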