Results 1 - 7 of 7
The discrete basis problem, 2005
Cited by 41 (13 self)
We consider the Discrete Basis Problem, which can be described as follows: given a collection of Boolean vectors, find a collection of k Boolean basis vectors such that the original vectors can be represented using disjunctions of these basis vectors. We show that the decision version of this problem is NP-complete and that the optimization version cannot be approximated within any finite ratio. We also study two variations of this problem, where the Boolean basis vectors must be mutually orthogonal. We show that the other variation is closely related to the well-known Metric k-median Problem in Boolean space. To solve these problems, two algorithms are presented. One is designed for the variations mentioned above and is based solely on solving the k-median problem, while the other is a heuristic intended to solve the general Discrete Basis Problem. We also study the results of extensive experiments made with these two algorithms on both synthetic and real-world data. The results are twofold: on the synthetic data the algorithms did rather well, but on the real-world data the results were not as good.
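To make the problem statement above concrete, the following is a minimal Python sketch of what "represented using disjunctions of basis vectors" means. It is an illustration only, not either of the paper's algorithms: the function names, the toy data, and the choice of counting mismatched entries as the error are all assumptions, since the abstract does not specify an error measure.

```python
def boolean_reconstruct(usage, basis):
    """Rebuild each row as the OR (disjunction) of the basis vectors it selects.

    usage[i][j] == 1 means row i uses basis vector j (hypothetical encoding).
    """
    n_cols = len(basis[0])
    rows = []
    for selected in usage:
        row = [0] * n_cols
        for use, vec in zip(selected, basis):
            if use:
                row = [r | v for r, v in zip(row, vec)]
        rows.append(row)
    return rows


def reconstruction_error(data, usage, basis):
    """Count entries where the reconstruction differs from the data.

    Assumption: error is the number of mismatched 0/1 entries.
    """
    approx = boolean_reconstruct(usage, basis)
    return sum(
        d != a
        for data_row, approx_row in zip(data, approx)
        for d, a in zip(data_row, approx_row)
    )


# Toy example with k = 2 basis vectors (made-up data).
data = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 1],
]
basis = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
usage = [
    [1, 0],
    [0, 1],
    [1, 1],
    [1, 0],
]

error = reconstruction_error(data, usage, basis)
print(error)
```

In this toy instance the first three rows are reproduced exactly, while the last row cannot be covered without also turning on an entry that is 0 in the data, so some error remains for any choice of usage with this basis.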
Mining Bi-sets in Numerical Data
Cited by 6 (0 self)
Thanks to an important research effort over the last few years, inductive queries on set patterns and complete solvers which can evaluate them on large 0/1 data sets have proved extremely useful. However, for many application domains, the raw data is numerical (matrices of real numbers whose dimensions denote objects and properties). Therefore, using efficient 0/1 mining techniques requires tedious Boolean property encoding phases. This is the case, e.g., when considering microarray data mining and its impact on knowledge discovery in molecular biology. We consider the possibility of mining numerical data directly to extract collections of relevant bi-sets, i.e., couples of associated sets of objects and attributes which satisfy some user-defined constraints. Not only do we propose a new pattern domain, but we also introduce a complete solver for computing the so-called numerical bi-sets. Preliminary experimental validation is given.
Application-Independent Feature Construction from Noisy Samples
Cited by 1 (0 self)
When training classifiers, the presence of noise can severely harm their performance. In this paper, we focus on “non-class” attribute noise and we consider how a frequent fault-tolerant (FFT) pattern mining task can be used to support noise-tolerant classification. Our method is based on an application-independent strategy for feature construction based on the so-called δ-free patterns. Our experiments on noisy training data show accuracy improvement when using the computed features instead of the original ones.
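The abstract above does not define δ-free patterns. As a hedged illustration, the sketch below uses the definition common in the pattern-mining literature, which is an assumption here rather than something stated in this listing: an itemset is δ-free if no rule between its own items holds with at most δ exceptions. Under that definition it is enough to check, for each item, the rule from the remaining items to that item. The function names and transactions are hypothetical.

```python
def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def is_delta_free(itemset, transactions, delta):
    """Check delta-freeness under the assumed definition.

    The rule (itemset - {item}) -> item has
    support(itemset - {item}) - support(itemset) exceptions.
    The itemset is delta-free if every such rule has more than delta exceptions.
    """
    itemset = frozenset(itemset)
    full = support(itemset, transactions)
    for item in itemset:
        exceptions = support(itemset - {item}, transactions) - full
        if exceptions <= delta:
            return False
    return True


# Made-up transactions over items a, b, c.
transactions = [
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ab"),
    frozenset("ac"),
    frozenset("b"),
]

free_strict = is_delta_free({"a", "b"}, transactions, delta=0)
free_tolerant = is_delta_free({"a", "b"}, transactions, delta=1)
print(free_strict, free_tolerant)
```

Raising δ makes fewer itemsets count as free, because rules with a few exceptions are then treated as holding; this tolerance for exceptions is what makes such patterns plausible candidates for noisy data.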
New Applications of Formal Concept Analysis: A Need for Original Pattern Domains
"... Abstract. We survey the results obtained by our research group (joint ..."
Mining Local Staircase Patterns in Noisy Data
Most traditional biclustering algorithms identify biclusters with little or no overlap. In this paper, we introduce the problem of identifying staircases of biclusters. Such staircases may be indicative of causal relationships between columns and cannot easily be identified by existing biclustering algorithms. Our formalization relies on a scoring function based on the Minimum Description Length principle. Furthermore, we propose a first algorithm for identifying staircase biclusters, based on a combination of local search and constraint programming. Experiments show that the approach is promising.
Index Terms: Staircase patterns; pattern sets; constraint programming; MDL; biclustering.