Results 1 - 10 of 46
Measuring Similarity between Ontologies
- In Proceedings of the European Conference on Knowledge Acquisition and Management (EKAW)
, 2002
Cited by 136 (11 self)
Ontologies now play an important role for many knowledge-intensive applications for which they provide a source of precisely defined terms. However, with their wide-spread usage there come problems concerning their proliferation. Ontology engineers or users frequently have a core ontology that they use, e.g., for browsing or querying data, but they need to extend it with, adapt it to, or compare it with the large set of other ontologies. For the task of detecting and retrieving relevant ontologies, one needs means for measuring the similarity between ontologies. We present a set of ontology similarity measures and a multiple-phase empirical evaluation.
Mining Distance-Based Outliers in Near Linear Time with Randomization and a Simple Pruning Rule
, 2003
Cited by 84 (4 self)
Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal of finding fast algorithms for this task. We show that a simple nested loop algorithm that in the worst case is quadratic can give near linear time performance when the data is in random order and a simple pruning rule is used. We test our algorithm on real high-dimensional data sets with millions of examples and show that the near linear scaling holds over several orders of magnitude. Our average case analysis suggests that much of the efficiency is because the time to process non-outliers, which are the majority of examples, does not depend on the size of the data set.
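A minimal Python sketch of the idea in the abstract above: a nested loop over randomly ordered data that abandons an example as soon as it can no longer rank among the top outliers. The outlier score (distance to the k-th nearest neighbor), the parameters `k` and `n_outliers`, and the function name `top_outliers` are assumptions made for illustration; the abstract does not specify them.

```python
import heapq
import math
import random


def top_outliers(points, k=2, n_outliers=1, seed=0):
    """Return (score, point) pairs for the n_outliers points with the
    largest distance to their k-th nearest neighbor (assumed score)."""
    data = list(points)
    random.Random(seed).shuffle(data)  # random order makes early pruning likely
    cutoff = 0.0  # score of the weakest outlier currently kept
    top = []      # min-heap of (score, point)
    for i, p in enumerate(data):
        nearest = []  # negated distances: max-heap of the k smallest seen so far
        pruned = False
        for j, q in enumerate(data):
            if i == j:
                continue
            d = math.dist(p, q)
            if len(nearest) < k:
                heapq.heappush(nearest, -d)
            elif d < -nearest[0]:
                heapq.heapreplace(nearest, -d)
            # Pruning rule: the k-th neighbor distance can only shrink, so once
            # it is at or below the cutoff, p cannot enter the top list.
            if (len(nearest) == k and len(top) == n_outliers
                    and -nearest[0] <= cutoff):
                pruned = True
                break
        if pruned or len(nearest) < k:
            continue
        score = -nearest[0]
        if len(top) < n_outliers:
            heapq.heappush(top, (score, p))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, p))
        if len(top) == n_outliers:
            cutoff = top[0][0]
    return sorted(top, reverse=True)
```

Most non-outliers find k close neighbors after scanning only a few examples, so their inner loop stops early regardless of the data set size; this is the intuition behind the near linear behavior the abstract reports.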
A Corpus-based Conceptual Clustering Method for Verb Frames and Ontology Acquisition
- In LREC workshop on
, 1998
Cited by 79 (7 self)
We describe in this paper the ML system, ASIUM, which learns subcategorization frames of verbs and ontologies from syntactic parsing of technical texts in natural language. The restrictions of selection in the subcategorization frames are filled by the concepts of the ontology. Applications requiring subcategorization frames and ontologies are crucial and numerous. The most direct applications are semantic checking of texts and syntactic parsing improvement but also text generation and translation. The inputs of ASIUM result from syntactic parsing of texts; they are subcategorization examples and basic clusters formed by head words that occur with the same verb after the same preposition (or with the same syntactical role). ASIUM successively aggregates the clusters to form new concepts in the form of a generality graph that represents the ontology of the domain. Subcategorization frames are learned in parallel, so that as concepts are formed, they fill restrictions of selection in th...
Relational Instance-Based Learning
- Proceedings of the Thirteenth International Conference on Machine Learning
, 1996
Cited by 65 (1 self)
A relational instance-based learning algorithm, called Ribl, is motivated and developed in this paper. We argue that instance-based methods offer solutions to the often unsatisfactory behavior of current inductive logic programming (ILP) approaches in domains with continuous attribute values and in domains with noisy attributes and/or examples. Three research issues that emerge when a propositional instance-based learner is adapted to a first-order representation are identified: (1) construction of cases from the knowledge base, (2) computation of similarity between arbitrarily complex cases, and (3) estimation of the relevance of predicates and attributes. Solutions to these issues are developed. Empirical results indicate that Ribl is able to achieve high classification accuracy in a variety of domains. To appear in: Proc. 13th International Conference on Machine Learning, L. Saitta (ed.), Morgan Kaufmann, 1996. 1 Introduction. The field of Inductive Logic Programming ...
Structural Regression Trees
, 1996
Cited by 60 (10 self)
In many real-world domains the task of machine learning algorithms is to learn a theory predicting numerical values. In particular several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly non-determinate background knowledge. However, so far no ILP algorithm except one can predict numbers and cope with non-determinate background knowledge. (The only exception is a covering algorithm called FORS.) In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems by integrating the statistical method of regression trees into ILP. SRT constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP syste...
ASIUM: learning subcategorization frames and restrictions of selection
, 1998
Cited by 42 (1 self)
We describe in this paper the ML system, Asium, which learns subcategorization frames of verbs and ontologies from syntactic parsing of technical texts in natural language. The restrictions of selection in the subcategorization frames are filled by the concepts of the ontology. Applications requiring subcategorization frames and ontologies are crucial and numerous. The most direct applications are semantic checking of texts and syntactic parsing improvement but also text generation and translation. The inputs of Asium result from syntactic parsing of texts; they are subcategorization examples and basic clusters formed by head words that occur with the same verb after the same preposition (or with the same syntactical role). Asium successively aggregates the clusters to form new concepts in the form of a generality graph that represents the ontology of the domain. Subcategorization frames are learned in parallel, so that as concepts are formed, they fill restrictions of selection in the ...
Similarity for Ontologies - a Comprehensive Framework
- In Workshop Enterprise Modelling and Ontology: Ingredients for Interoperability, at PAKM 2004
, 2004
Cited by 36 (8 self)
In this paper we present a comprehensive framework for measuring similarity within and between ontologies as a basis for collaboration across various application fields. In order to define such a framework, we base our work on an abstract ontology model that allows adherence to various existing and evolving ontology standards. The main characteristic of the framework is its layered structure: we have defined three levels on which the similarity between two entities (concepts or instances) can be measured: the data layer, the ontology layer, and the context layer, which cope with the data representation, the ontology meaning, and the usage of these entities, respectively. In addition, in each of the layers corresponding background information is used in order to define the similarity more precisely.
Distance Induction in First Order Logic
- Proceedings of the 7th International Workshop on Inductive Logic Programming, ILP97, Volume 1297 of LNAI
, 1997
Cited by 32 (0 self)
This paper tackles the supervised induction of a distance from examples described as Horn clauses or constrained clauses. In opposition to syntax-driven approaches, this approach is discrimination-driven: it proceeds by defining a small set of complex discriminant hypotheses. These hypotheses serve as new concepts, used to redescribe the initial examples. Further, this redescription can be embedded into the space of natural integers, and a distance between examples thus naturally follows. This distance can be used for classification via a k-nearest-neighbor process. Experiments on the mutagenesis dataset validate the approach, in terms of predictive accuracy, computational cost, and robustness with respect to the parameters of the algorithm.
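The pipeline described in the abstract above (redescribe each example by a small set of hypotheses, embed it in an integer space, then classify by k-nearest neighbors) can be sketched roughly as follows. This is an illustrative simplification, not the paper's method: hypotheses are modeled here as plain Python predicates rather than induced Horn clauses, the distance is assumed to be the Manhattan distance, and the names `redescribe`, `distance`, and `knn_classify` are hypothetical.

```python
from collections import Counter


def redescribe(example, hypotheses):
    """Map an example to an integer vector: one 0/1 entry per hypothesis."""
    return tuple(int(h(example)) for h in hypotheses)


def distance(u, v):
    """Manhattan distance between two redescribed examples (an assumption)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def knn_classify(query, labeled, hypotheses, k=3):
    """Majority vote among the k labeled examples closest to the query."""
    q = redescribe(query, hypotheses)
    ranked = sorted(
        labeled,
        key=lambda pair: distance(q, redescribe(pair[0], hypotheses)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration only.
hypotheses = [
    lambda e: e["rings"] >= 2,
    lambda e: e["charge"] < 0,
]
labeled = [
    ({"rings": 3, "charge": -1.0}, "positive"),
    ({"rings": 2, "charge": -2.0}, "positive"),
    ({"rings": 0, "charge": 1.0}, "negative"),
    ({"rings": 1, "charge": 0.5}, "negative"),
    ({"rings": 0, "charge": 2.0}, "negative"),
]
prediction = knn_classify({"rings": 4, "charge": -0.5}, labeled, hypotheses)
```

In this sketch the hypotheses are given by hand; in the approach the abstract describes, they are induced from the training examples so that they discriminate between classes.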
An Integrative Proximity Measure for Ontology Alignment
, 2003
Cited by 31 (3 self)
Integrating heterogeneous resources of the web will require finding agreement between the underlying ontologies. A variety of methods from the literature may be used for this task; basically, they perform pair-wise comparison of entities from each of the ontologies and select the most similar pairs. We introduce a similarity measure that takes advantage of most of the features of OWL-Lite ontologies and integrates many ontology comparison techniques in a common framework. Moreover, we put forth a computation technique to deal with one-to-many relations and circularities in the similarity definitions.
Relational Distance-Based Clustering
, 1998
Cited by 28 (0 self)
Work on first-order clustering has primarily been focused on the task of conceptual clustering, i.e., forming clusters with symbolic generalizations in the given representation language. By contrast, for propositional representations, experience has shown that simple algorithms based exclusively on distance measures can often outperform their concept-based counterparts. In this paper, we therefore build on recent advances in the area of first-order distance metrics and present RDBC, a bottom-up agglomerative clustering algorithm for first-order representations that relies on distance information only and features a novel parameter-free pruning measure for selecting the final clustering from the cluster tree. The algorithm can empirically be shown to produce good clusterings (on the mutagenesis domain) that, when used for subsequent prediction tasks, improve on previous clustering results and approach the accuracies of dedicated predictive learners.
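As a rough illustration of bottom-up agglomerative clustering that relies on distance information only, the sketch below merges clusters using nothing but a precomputed distance matrix. It is not RDBC itself: average linkage is an assumption (the abstract does not name the linkage), the function name `agglomerate` is hypothetical, and the paper's parameter-free pruning measure for picking the final clustering is not reproduced, so the function simply returns every level of the cluster tree.

```python
def agglomerate(dist):
    """Average-linkage bottom-up clustering on a symmetric distance matrix.

    Returns a list of clusterings, from all singletons to a single cluster.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]
    levels = [list(clusters)]
    while len(clusters) > 1:
        best = None  # (linkage, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                link = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or link < best[0]:
                    best = (link, a, b)
        _, a, b = best
        merged = clusters[a] | clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
        levels.append(list(clusters))
    return levels


# Toy distance matrix: items 0 and 1 are close, as are items 2 and 3.
toy = [
    [0, 1, 10, 10],
    [1, 0, 10, 10],
    [10, 10, 0, 1],
    [10, 10, 1, 0],
]
levels = agglomerate(toy)
```

Because only `dist` is consulted, the same routine works for any representation for which a distance can be computed, including a first-order distance of the kind the abstract builds on.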

