OASSIS: query driven crowd mining
- In SIGMOD, 2014
Cited by 5 (3 self)
Crowd data sourcing is increasingly used to gather information from the crowd and to obtain recommendations. In this paper, we explore a novel approach that broadens crowd data sourcing by enabling users to pose general questions, to mine the crowd for potentially relevant data, and to receive concise, relevant answers that represent frequent, significant data patterns. Our approach is based on (1) a simple generic model that captures both ontological knowledge and the individual history or habits of crowd members, from which frequent patterns are mined; (2) a query language in which users can declaratively specify their information needs and the data patterns of interest; (3) an efficient query evaluation algorithm, which enables mining semantically concise answers while minimizing the number of questions posed to the crowd; and (4) an implementation of these ideas that mines the crowd through an interactive user interface. Experimental results with both real-life crowd data and synthetic data demonstrate the feasibility and effectiveness of the approach.
Document Analysis Research in the Year 2021
- In Twenty-fourth International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems (IEA/AIE 2011), 2011
Cited by 3 (2 self)
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
S (2008) An inductive database and query language in the relational model
- L, Manolescu I (eds) Proceedings of the 11th International Conference on extending database technology (EDBT
Cited by 1 (0 self)
In the demonstration, we will present the concepts and an implementation of an inductive database, as proposed by Imielinski and Mannila, in the relational model. The goal is to support all steps of the knowledge discovery process, from pre-processing via data mining to post-processing, on the basis of queries to a database system. The query language SIQL (structured inductive query language), an SQL extension, offers query primitives for feature selection, discretization, pattern mining, clustering, instance-based learning and rule induction. A prototype system processing such queries was implemented as part of the SINDBAD (structured inductive database development) project. Key concepts of this system, among others, are the closure of operators and distances between objects. To support the analysis of multi-relational data, we incorporated multi-relational distance measures based on set distances and recursive descent. The inclusion of rule-based classification models made it necessary to extend the data model and the software architecture significantly. The prototype is applied to three different applications: gene expression analysis, gene regulation prediction and structure-activity relationships (SARs) of small molecules.
A Model for Managing Collections of Patterns
- Author manuscript, published in "ACM Symposium on Applied Computing, Seoul, Republic of Korea (2007)", 2009
Data mining algorithms can now deal efficiently with huge amounts of data. Various kinds of patterns may be discovered and may have a great impact on the general development of knowledge. In many domains, end users may want to have their data mined by data mining tools in order to extract patterns that could impact their business. Nevertheless, those users are often overwhelmed by the large quantity of patterns extracted in such a situation. Moreover, privacy or commercial concerns may prevent users from mining the data themselves. As a result, users may be unable to run many experiments that integrate various constraints in order to focus on the specific patterns they would like to extract. Post-processing of patterns may be an answer to that drawback. In this paper we therefore present a framework that could allow end users to manage collections of patterns. We propose to use an efficient data structure on which algebraic operators may be used to retrieve or access patterns in pattern bases.
A Model for Managing Collections of Patterns
- Baptiste Jeudy
Data mining algorithms can now deal efficiently with huge amounts of data. Various kinds of patterns may be discovered and may have a great impact on the general development of knowledge. In many domains, end users may want to have their data mined by data mining tools in order to extract patterns that could impact their business. Nevertheless, those users are often overwhelmed by the large quantity of patterns extracted in such a situation. Moreover, privacy or commercial concerns may prevent users from mining the data themselves. As a result, users may be unable to run many experiments that integrate various constraints in order to focus on the specific patterns they would like to extract. Post-processing of patterns may be an answer to that drawback. In this paper we therefore present a framework that could allow end users to manage collections of patterns. We propose to use an efficient data structure on which algebraic operators may be used to retrieve or access patterns in pattern bases.
Manuscript Received:
, 2014
Keywords: databases, spatiotemporal data mining, spatiotemporal relationships, spatiotemporal association rules, spatiotemporal queries
Various studies have sought to capture the spatial and time-varying characteristics of data using data mining techniques. Nevertheless, mining frequent spatiotemporal item sets from large SpatioTemporal Data Bases (STDB) is a hard task, mainly because of the presence of hidden relationships. The aim of our proposal is to demonstrate the application of association rule mining to spatiotemporal data. To this end, we focus on mining SpatioTemporal Association Rules (STAR) that contain spatiotemporal relationships in the antecedent and consequent of the rule. In this paper, we propose a three-step approach; the core step extracts the spatiotemporal relationships from the STDB and is handled by spatiotemporal queries. The two other steps are carried out by the Apriori algorithm and are devoted, respectively, to frequent spatiotemporal item set generation and STAR extraction. To prove the applicability of our method, we conducted an experiment on a spatiotemporal database describing the town of Tunis in 1987 and 2001. The resulting STARs show how the spatiotemporal relationships between different geo-referenced objects evolve.
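The last two steps of the approach described above (frequent item set generation and rule extraction with Apriori) can be sketched as follows. This is only a minimal illustration, assuming the spatiotemporal relationships have already been extracted by queries into one set of relationship labels per observation; the labels, the thresholds, and the function names `apriori` and `extract_rules` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {item set: support count} for every frequent item set."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = list(frequent)
        # Join step: unions of frequent (k-1)-sets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


def extract_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    found = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent set is frequent, so the lookup succeeds.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent, confidence))
    return found


# Hypothetical relationship labels, one set per observed region.
A = "near(school, road)"
B = "inside(road, urban_zone)"
C = "adjacent(urban_zone, lake)"
observations = [{A, B, C}, {A, B}, {A, C}, {B, C}, {A, B, C}]

frequent = apriori(observations, min_support=3)
star = extract_rules(frequent, min_confidence=0.7)
```

In this toy data every pair of relationships is frequent but the triple is not, so all extracted rules have a single relationship on each side.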
Towards a General Framework for Data Mining
In this paper, we address the ambitious task of formulating a general framework for data mining. We discuss the requirements that such a framework should fulfill: It should elegantly handle different types of data, different data mining tasks, and different types of patterns/models. We also discuss data mining languages and what they should support: this includes the design and implementation of data mining algorithms, as well as their composition into nontrivial multi-step knowledge discovery scenarios relevant for practical application. We proceed by laying out some basic concepts, starting with (structured) data and generalizations (e.g., patterns and models) and continuing with data mining tasks and basic components of data mining algorithms (i.e., refinement operators, distances, features and kernels). We next discuss how to use these concepts to formulate constraint-based data mining tasks and design generic data mining algorithms. We finally discuss how these components would fit in the overall framework and in particular into a language for data mining and knowledge discovery.
EFFICIENT MANAGEMENT OF NON REDUNDANT RULES IN LARGE PATTERN BASES: A BITMAP APPROACH
Knowledge Discovery from Databases has a growing impact, and various tools are now available to extract knowledge from huge databases efficiently (in time and memory space). Nevertheless, those systems generally produce large pattern bases, and managing these rapidly becomes intractable. Few works have focused on pattern base management systems, and research in that domain is still new. This paper falls within that context and deals with a particular class of patterns, association rules. More precisely, we present the way we have efficiently implemented the search for non-redundant rules thanks to a representation of rules in the form of bitmap arrays. Some experiments show that this technique yields dramatic gains in time and space, allowing us to manage large pattern bases.
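To make the bitmap idea concrete, here is a minimal sketch of how rules encoded as bitmaps allow redundancy checks with a few bitwise operations. The abstract does not give the paper's redundancy criterion or data layout, so this sketch assumes a simple one: a rule is redundant when another rule has a subset of its antecedent and a superset of its consequent. Python integers stand in for bitmap arrays, and all names (`encode`, `covers`, `non_redundant`) are hypothetical.

```python
def encode(items, index):
    """Encode a set of items as an integer bitmap, one bit per item."""
    bits = 0
    for item in items:
        bits |= 1 << index[item]
    return bits


def is_subset(a, b):
    """True when every bit set in a is also set in b."""
    return a & b == a


def covers(general, specific):
    """Assumed criterion: smaller antecedent and larger consequent."""
    general_ante, general_cons = general
    specific_ante, specific_cons = specific
    return (is_subset(general_ante, specific_ante)
            and is_subset(specific_cons, general_cons))


def non_redundant(rules):
    """Keep the rules that no other, different rule covers."""
    unique = list(dict.fromkeys(rules))  # drop exact duplicates, keep order
    return [r for r in unique
            if not any(other != r and covers(other, r) for other in unique)]


# Hypothetical item universe and rules, written as (antecedent, consequent).
index = {"a": 0, "b": 1, "c": 2}
r1 = (encode({"a"}, index), encode({"b", "c"}, index))   # a -> b, c
r2 = (encode({"a", "b"}, index), encode({"c"}, index))   # a, b -> c
r3 = (encode({"c"}, index), encode({"a"}, index))        # c -> a

kept = non_redundant([r1, r2, r3])
```

Under the assumed criterion, `r2` is dropped because `r1` reaches a larger consequent from a smaller antecedent. Each subset test is a single AND plus a comparison, which is where a bitmap encoding saves time over comparing item lists.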