Inductive databases and multiple uses of frequent itemsets: the cInQ approach
- In Database Technologies for Data Mining - Discovering Knowledge with Inductive Queries, volume 2682 of LNCS, 2004
Cited by 17 (9 self)
Abstract. Inductive databases (IDBs) have been proposed to address the problem of knowledge discovery from huge databases. With an IDB, the user/analyst performs a set of very different operations on data using a query language powerful enough to support all the required processing, such as data preprocessing, pattern discovery and pattern postprocessing. We present a synthetic view of important concepts that have been studied within the cInQ European project when considering the pattern domain of itemsets. Mining itemsets has proved useful not only for association rule mining but also for feature construction, classification, clustering, etc. We introduce the concepts of pattern domain, evaluation functions, primitive constraints, inductive queries and solvers for itemsets. We focus on simple high-level definitions that leave aside technical details, which the interested reader will find, among other places, in cInQ publications.
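As an illustration of the itemset pattern domain named in this abstract, the following minimal Python sketch evaluates an inductive query written as a conjunction of two primitive constraints: a minimum frequency and a maximum itemset size. The function names, the toy transactions and the brute-force enumeration are assumptions made for illustration only; they are not taken from the cInQ publications, whose solvers are designed to be far more efficient.

```python
from itertools import combinations

def frequency(itemset, transactions):
    """Evaluation function: number of transactions containing every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def solve(transactions, min_freq, max_size):
    """Hypothetical brute-force solver: return every itemset that satisfies
    frequency >= min_freq AND size <= max_size, mapped to its frequency."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            freq = frequency(candidate, transactions)
            if freq >= min_freq:
                result[candidate] = freq
    return result

# Toy data (assumed): four transactions over the items a, b, c.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
]
answers = solve(transactions, min_freq=2, max_size=2)
```

Here the query is the conjunction of the two constraints, and `answers` maps each itemset that satisfies both to its frequency.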
Intensional Query Answering to XQuery Expressions
- Proc. 16th Int’l Conf. Database and Expert Systems Applications, 2005
Cited by 8 (2 self)
Abstract. XML is a representation of data which may require huge amounts of storage space and query processing time. Summarized representations of XML data provide succinct information which can be directly queried, either when fast yet approximate answers are sufficient, or when the actual dataset is not available. In this work we show which kinds of XQuery expressions admit a partial answer by using association rules extracted from XML datasets. Such partial information provides intensional answers to queries formulated either as XQuery expressions or in a visual fashion.
A Tool for Extracting XML Association Rules from XML Documents
Cited by 3 (0 self)
The recent success of XML as a standard to represent semi-structured data, and the increasing amount of available XML data, pose new challenges to the data mining community. In this paper we present the XMINE operator, a tool we developed to extract XML association rules from XML documents. The operator, which is based on XPath and inspired by the syntax of XQuery, allows us to express complex mining tasks compactly and intuitively. XMINE can be used to specify mining tasks on the content and on the structure of the data, interchangeably and simultaneously, since the distinction in XML is slight.
Measuring semantic centrality based on building consensual ontology on social network
- in Proc. 2nd workshop on semantic network analysis (SNA), Budva (ME, 2006
Cited by 3 (0 self)
Abstract. We have been focusing on a three-layered socialized semantic space, consisting of social, ontology, and concept layers. In this paper, we propose a new measurement of the semantic centrality of people, meaning the power of semantic bridging, on this architecture. The consensual ontologies are discovered by a semantic alignment-based mining process in the ontology and concept layers. They are represented as the maximal semantic substructures among the personal ontologies of a semantically interlinked community. Finally, we show an example of semantic centrality applied to resource annotation on a social network, and discuss the assumptions used in formulating this measurement.
Warehousing complex data from the Web
Cited by 2 (1 self)
Abstract: Data warehousing and OLAP technologies are now moving on to handling complex data that mostly originate from the Web. However, integrating such data into a decision-support process requires representing them in a form that can be processed by OLAP and/or data mining techniques. We present in this paper a complex data warehousing methodology that exploits XML as a pivot language. Our approach includes the integration of complex data into an ODS, in the form of XML documents; their dimensional modeling and storage in an XML data warehouse; and their analysis with combined OLAP and data mining techniques. We also address the crucial issue of performance in XML warehouses.
XAR-Miner: Efficient Association Rules Mining for XML Data
Cited by 1 (0 self)
In this paper, we propose a framework, called XAR-Miner, for mining association rules (ARs) from XML documents efficiently. In XAR-Miner, raw data in the XML document are first preprocessed and transformed into either an Indexed Content Tree (IX-tree) or multi-relational databases (Multi-DB), depending on the size of the XML document and the memory constraints of the system, for efficient data selection and AR mining. Task-relevant concepts are generalized to produce generalized meta-patterns, from which the large ARs that meet the support and confidence levels are generated.
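The support and confidence levels mentioned in this abstract can be illustrated with a short Python sketch. It uses the standard textbook definitions on a flat list of transactions; the function names, the toy data and the thresholds are assumptions for illustration and do not reproduce XAR-Miner's IX-tree or Multi-DB processing.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent ∪ consequent) / support(antecedent); 0.0 if undefined."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / base

def keep_rule(antecedent, consequent, transactions, min_support, min_confidence):
    """True when the rule antecedent -> consequent meets both thresholds."""
    return (
        support(antecedent | consequent, transactions) >= min_support
        and confidence(antecedent, consequent, transactions) >= min_confidence
    )

# Toy data (assumed): each transaction lists the element names found in one XML record.
transactions = [
    frozenset({"author", "title", "year"}),
    frozenset({"author", "title"}),
    frozenset({"author", "year"}),
    frozenset({"title"}),
]
rule_support = support(frozenset({"author", "title"}), transactions)
rule_confidence = confidence(frozenset({"author"}), frozenset({"title"}), transactions)
```

On this toy data the rule {author} -> {title} has support 2/4 and confidence 2/3, so it is kept with thresholds of 0.5 and 0.6 but rejected if the confidence threshold is raised to 0.7.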
A New Model for Discovering XML Association Rules from XML Documents
Abstract—The inherent flexibility of XML in both structure and semantics makes mining XML data a complex task, with more challenges than traditional association rule mining in relational databases. In this paper, we propose a new model for the effective extraction of generalized association rules from an XML document collection. We directly use frequent subtree mining techniques in the discovery process and do not ignore the tree structure of the data in the final rules. The frequent subtrees, found according to the user-provided support, are split into complement subtrees to form the rules. We explain our model in multiple steps, from data preparation to rule generation.
Mining Flexible Association Rules from XML
The role of the eXtensible Markup Language (XML) is becoming very important in research fields that focus on the representation, exchange, and integration of information coming from different data sources and related to various contexts such as, for example, medical and biological data. Extracting knowledge from XML datasets is an important issue that may be difficult because of the intrinsically semistructured nature of XML; indeed, documents can have an implicit and irregular structure that is not defined in advance. In this paper, we propose a novel approach for discovering frequent, but approximate, information in XML documents, based on Flexible Tree Rules that take into account both the structure and the content of the analyzed data. Our proposal is flexible enough to be adapted both to documents with a regular structure and to documents with a highly heterogeneous structure, and can be used to evaluate the similarity of XML documents. Moreover, we describe an algorithm to evaluate the similarity degree of a Flexible Tree Rule with respect to an XML document.
VISUALIZATION OF A SYNTHETIC REPRESENTATION OF ASSOCIATION RULES TO ASSIST EXPERT VALIDATION
In order to help the expert validate association rules, several quality measures have been proposed in the literature. We distinguish two categories: objective and subjective measures. The first depends on a fixed threshold and on the structure of the data from which the rules are extracted. The second has two subcategories. The first subcategory consists in providing the expert with a tool for interactive rule exploration; such tools present the rules in textual form. The second subcategory uses visualization systems to facilitate the task of rule mining. However, this last subcategory assumes that experts have the statistical knowledge needed to interpret and validate association rules. Furthermore, statistical methods lack a semantic representation and cannot help the experts during the validation process. To solve this problem, we propose in this paper a method that presents the experts with a synthetic representation of association rules as a formal conceptual graph (FCG). The FCG represents the expert's area of interest and, thanks to its semantic richness, makes the task of rule mining easier.
A Recent Review on XML data mining and FFP
The goal of data mining is to extract or "mine" knowledge from large amounts of data. Emerging technologies for semi-structured data have attracted wide attention in networks, e-commerce, information retrieval and databases. XML has become very popular for representing semi-structured data and a standard for data exchange over the web. Mining XML data from the web is becoming increasingly important. However, the structure of the XML data can be more complex and irregular than that. Association rule mining plays a key role in the process of mining data for frequent pattern matching. First Frequent Pattern-growth mines the complete set of frequent patterns by pattern fragment growth. First Frequent Pattern-tree based mining adopts a pattern fragment growth method to avoid the costly generation of a large number of candidate sets, and uses a partition-based, divide-and-conquer method. This paper presents a complete review of XML data mining using Fast Frequent Pattern mining in various domains.