Results 11-20 of 22
Tree compression with top trees.
In Proc. ICALP, 2013
Cited by 4 (0 self)
We introduce a new compression scheme for labeled trees based on top trees. Our compression scheme is the first to simultaneously take advantage of internal repeats in the tree (as opposed to the classical DAG compression that only exploits rooted subtree repeats) while also supporting fast navigational queries directly on the compressed representation. We show that the new compression scheme achieves close to optimal worst-case compression, can compress exponentially better than DAG compression, is never much worse than DAG compression, and supports navigational queries in logarithmic time.
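For orientation, the classical DAG compression that this abstract uses as its baseline can be sketched in a few lines. This is not the top-tree scheme from the paper. The `(label, children)` tuple encoding and the `dag_compress` name are assumptions made only for this illustration.

```python
# Minimal sketch of classical DAG compression: identical rooted subtrees
# are stored once. The (label, [children]) encoding is hypothetical.

def dag_compress(root):
    """Return (root_id, table), where table[i] = (label, tuple_of_child_ids)."""
    ids = {}     # maps a subtree signature to its id in the DAG
    table = []   # DAG nodes in order of first appearance (post-order)

    def visit(node):
        label, children = node
        # A subtree is identified by its label plus the ids of its children.
        key = (label, tuple(visit(child) for child in children))
        if key not in ids:
            ids[key] = len(table)
            table.append(key)
        return ids[key]

    # Recursive for brevity; very deep trees would need an explicit stack.
    return visit(root), table


leaf = ("b", [])
sub = ("a", [leaf, leaf])
tree = ("r", [sub, sub, ("a", [leaf])])   # 9 nodes in total
root_id, table = dag_compress(tree)
```

In this example the two copies of `sub` and all five `b` leaves collapse, so the 9-node tree is stored as 4 DAG nodes. Repeats that are not complete rooted subtrees are not shared, which is the limitation the abstract refers to.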
A framework for succinct labeled ordinal trees over large alphabets
In: ISAAC, 2012
Cited by 3 (2 self)
We consider succinct representations of labeled ordinal trees that support a rich set of operations. Our new representations support a much broader collection of operations than previous work [10, 8, 1]. In our approach, labels of nodes are stored in a preorder label sequence, which can be compressed using any succinct index for strings that supports rank_α and select_α operations. In other words, we present a framework for succinct representations of labeled ordinal trees that allows alphabets to be large. This answers an open problem presented by Geary et al. [10]. We further extend our work and present the first succinct representation of dynamic labeled ordinal trees that supports several label-based operations including finding the level ancestor with a given label.
Space Efficient Data Structures for Dynamic Orthogonal Range Counting
In: Proc. WADS, 2011
Cited by 3 (2 self)
We present a linear-space data structure that maintains a dynamic set of n points with coordinates of real numbers on the plane to support orthogonal range counting, as well as insertions and deletions, in O((lg n / lg lg n)^2) time. This provides faster support for updates than previous results with the same bounds on space cost and query time. We also obtain two other new results by considering the same problem for points on a U × U grid, and by designing the first succinct data structures for a dynamic integer sequence to support range counting.
The wavelet matrix: An efficient wavelet tree for large alphabets
 Information Systems
Cited by 1 (1 self)
The wavelet tree is a flexible data structure that permits representing sequences S[1, n] of symbols over an alphabet of size σ, within compressed space and supporting a wide range of operations on S. When σ is significant compared to n, current wavelet tree representations incur noticeable space or time overheads. In this article we introduce the wavelet matrix, an alternative representation for large alphabets that retains all the properties of wavelet trees but is significantly faster. We also show how the wavelet matrix can be compressed up to the zero-order entropy of the sequence without sacrificing, and actually improving, its time performance. Our experimental results show that the wavelet matrix outperforms all the wavelet tree variants along the space/time tradeoff map.
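The level-by-level layout behind a wavelet matrix can be illustrated with a small sketch. It assumes non-negative integer symbols that fit in `bits` bits, and it stores each level as a plain Python list of prefix counts instead of a succinct bitvector, so it shows only the navigation and none of the space or compression results claimed in the abstract. The class and method names are hypothetical.

```python
class WaveletMatrix:
    """Illustrative wavelet matrix over integers in [0, 2**bits)."""

    def __init__(self, seq, bits):
        self.n = len(seq)
        self.bits = bits
        self.prefix = []   # prefix[l][i] = number of 1-bits in level l before position i
        self.zeros = []    # zeros[l] = number of 0-bits in level l
        cur = list(seq)
        for shift in range(bits - 1, -1, -1):      # most significant bit first
            counts = [0]
            for x in cur:
                counts.append(counts[-1] + ((x >> shift) & 1))
            self.prefix.append(counts)
            self.zeros.append(self.n - counts[-1])
            # Stable partition: symbols with a 0-bit first, then those with a 1-bit.
            cur = ([x for x in cur if not (x >> shift) & 1]
                   + [x for x in cur if (x >> shift) & 1])

    def access(self, i):
        """Return the symbol at 0-based position i of the original sequence."""
        value = 0
        for level in range(self.bits):
            counts = self.prefix[level]
            bit = counts[i + 1] - counts[i]
            value = (value << 1) | bit
            i = self.zeros[level] + counts[i] if bit else i - counts[i]
        return value

    def rank(self, c, i):
        """Return the number of occurrences of symbol c in the first i positions."""
        lo, hi = 0, i
        for level in range(self.bits):
            counts = self.prefix[level]
            if (c >> (self.bits - 1 - level)) & 1:
                lo = self.zeros[level] + counts[lo]
                hi = self.zeros[level] + counts[hi]
            else:
                lo, hi = lo - counts[lo], hi - counts[hi]
        return hi - lo


seq = [3, 1, 4, 1, 5, 2, 6, 5]
wm = WaveletMatrix(seq, bits=3)
```

Each level needs only one bit sequence and one integer (the count of zeros), which is why the layout suits large alphabets: no per-node pointers or boundaries are kept.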
Efficient Path Kernels for Reaction Function Prediction
Cited by 1 (0 self)
Kernels for structured data are rapidly becoming an essential part of the machine learning toolbox. Graph kernels provide similarity measures for complex relational objects, such as molecules and enzymes. Graph kernels based on walks are popular due to their fast computation but their predictive performance is often not satisfactory, while kernels based on subgraphs suffer from high computational cost and are limited to small substructures. Kernels based on paths offer a promising middle ground between these two extremes. However, the computation of path kernels has so far been assumed computationally too challenging. In this paper we introduce an effective method for computing path-based kernels; we employ a Burrows-Wheeler transform based compressed path index for fast and space-efficient enumeration of paths. Unlike many kernel algorithms the index representation retains fast access to individual features. In our experiments with chemical reaction graphs, path-based kernels surpass state-of-the-art graph kernels in prediction accuracy.
Compaction and Compression General Terms: Algorithms
Sequence representations supporting the queries access, select, and rank are at the core of many data structures. There is a considerable gap between the various upper bounds and the few lower bounds known for such representations, and how they relate to the space used. In this article, we prove a strong lower bound for rank, which holds for rather permissive assumptions on the space used, and give matching upper bounds that require only a compressed representation of the sequence. Within this compressed space, the operations access and select can be solved in constant or almost-constant time, which is optimal for large alphabets. Our new upper bounds dominate all of the previous work in the time/space map.
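As a reminder of what the three queries compute, here is a naive reference version. It runs in linear time on an uncompressed sequence and is in no way the representation from the article. The 0-based positions and the 1-based occurrence index for `select` are conventions chosen for this sketch.

```python
def access(seq, i):
    """Return the symbol at 0-based position i."""
    return seq[i]


def rank(seq, c, i):
    """Return how many times c occurs in the first i positions, seq[0:i]."""
    return sum(1 for x in seq[:i] if x == c)


def select(seq, c, j):
    """Return the 0-based position of the j-th occurrence of c (j >= 1), or -1."""
    count = 0
    for pos, x in enumerate(seq):
        if x == c:
            count += 1
            if count == j:
                return pos
    return -1


text = "abracadabra"
```

Under these conventions `rank(text, c, select(text, c, j)) == j - 1` whenever the j-th occurrence exists, which is a handy sanity check for any faster implementation.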
XXS: Efficient XPath Evaluation on . . .
2014
The eXtensible Markup Language (XML) is acknowledged as the de facto standard for semi-structured data representation and data exchange on the Web and many other scenarios. A well-known shortcoming of XML is its verbosity, which increases manipulation, transmission, and processing costs. Various structure-blind and structure-conscious compression techniques can be applied to XML, and some are even access-friendly, meaning that the documents can be efficiently accessed in compressed form. Direct access is necessary to implement the query languages XPath and XQuery, which are the standard ones to exploit the expressiveness of XML. While a good deal of theoretical and practical proposals exist to solve XPath/XQuery operations on XML, only a few are well integrated with a compression format that supports the required access operations on the XML data. In this work we go one step further and design a compression format for XML collections that boosts the performance of XPath queries on the data. This is done by designing compressed representations of the XML data that support some complex operations apart from just accessing the data, and those are exploited to solve key components of the XPath queries. Our system, called XXS, is aimed at XML collections containing natural language text, which are compressed to within 35%-50% of their original size while supporting a large subset of XPath operations in time competitive with, and many times
Algorithms, Performance, Theory
About what is the smallest size we can compress an IP Forwarding Information Base (FIB) down to, while still guaranteeing fast lookup? Is there some notion of FIB entropy that could serve as a compressibility metric? As an initial step in answering these questions, we present a FIB data structure, called Multibit Burrows-Wheeler transform (MBW), that is fundamentally pointerless, can be built in linear time, guarantees theoretically optimal longest prefix match, and compresses to higher-order entropy. Measurements on a Linux prototype provide a first glimpse of the applicability of MBW.
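The longest prefix match operation that a FIB must answer can be shown with a conventional pointer-based binary trie. This is the kind of baseline structure a pointerless design aims to replace, not the MBW structure itself. The sketch handles IPv4 only, and the `Fib` and `Node` names and the next-hop strings are hypothetical.

```python
import ipaddress


class Node:
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]   # child for bit 0 and child for bit 1
        self.next_hop = None           # set only where a prefix ends


class Fib:
    """Pointer-based binary trie for IPv4 longest prefix match."""

    def __init__(self):
        self.root = Node()

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        addr = int(net.network_address)
        node = self.root
        for k in range(net.prefixlen):
            bit = (addr >> (31 - k)) & 1
            if node.children[bit] is None:
                node.children[bit] = Node()
            node = node.children[bit]
        node.next_hop = next_hop

    def lookup(self, address):
        addr = int(ipaddress.ip_address(address))
        node = self.root
        best = node.next_hop           # default route, if one was inserted
        for k in range(32):
            node = node.children[(addr >> (31 - k)) & 1]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop   # remember the deepest match so far
        return best


fib = Fib()
fib.insert("0.0.0.0/0", "default")
fib.insert("10.0.0.0/8", "A")
fib.insert("10.1.0.0/16", "B")
```

A lookup walks at most 32 bits and returns the next hop of the deepest prefix seen on the way, so `10.1.2.3` resolves through the /16 entry while `10.2.0.1` falls back to the /8 entry.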
Compressed Indexes for String Searching in Labeled Graphs
Storing and searching large labeled graphs is becoming a key issue in the design of space/time-efficient online platforms that index modern social networks or knowledge graphs. But, as far as we know, all known results are limited to designing compressed graph indexes that support basic access operations on the link structure of the input graph, such as: given a node u, return the adjacency list of u. This paper takes inspiration from Facebook's Unicorn platform and proposes some compressed-indexing schemes for large graphs whose nodes are labeled with strings of variable length (i.e., node attributes such as a user's (nick)name) that support sophisticated search operations involving both the link structure of the graph and the string content of its nodes. An extensive experimental evaluation over real social networks shows the time and space efficiency of the proposed indexing schemes and their query processing algorithms.