Results 1 - 10 of 22
Relational Distance-Based Clustering
, 1998
Cited by 28 (0 self)
Work on first-order clustering has primarily been focused on the task of conceptual clustering, i.e., forming clusters with symbolic generalizations in the given representation language. By contrast, for propositional representations, experience has shown that simple algorithms based exclusively on distance measures can often outperform their concept-based counterparts. In this paper, we therefore build on recent advances in the area of first-order distance metrics and present RDBC, a bottom-up agglomerative clustering algorithm for first-order representations that relies on distance information only and features a novel parameter-free pruning measure for selecting the final clustering from the cluster tree. The algorithm can empirically be shown to produce good clusterings (on the mutagenesis domain) that, when used for subsequent prediction tasks, improve on previous clustering results and approach the accuracies of dedicated predictive learners.
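The bottom-up, distance-only scheme named in this abstract can be illustrated with a minimal sketch. It is not RDBC itself: the abstract does not specify the linkage criterion or the pruning measure, so the average-linkage rule, the function name `agglomerate`, and the toy distance matrix below are all assumptions made for illustration. The sketch only shows that a cluster tree can be built from a precomputed pairwise distance matrix, with no access to the objects' representation.

```python
from typing import List, Sequence, Tuple, FrozenSet

Merge = Tuple[FrozenSet[int], FrozenSet[int], float]


def agglomerate(dist: Sequence[Sequence[float]]) -> List[Merge]:
    """Build a cluster tree bottom-up from a symmetric distance matrix.

    Average linkage is an assumption of this sketch, not taken from the paper.
    Returns the merges in the order they were performed.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges: List[Merge] = []

    def linkage(c1: FrozenSet[int], c2: FrozenSet[int]) -> float:
        # Mean pairwise distance between members of the two clusters.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((clusters[x], clusters[y], d))
        merged = clusters[x] | clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return merges


# Hypothetical distances: objects 0 and 1 are close, as are 2 and 3.
toy = [
    [0.0, 0.1, 1.0, 1.0],
    [0.1, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.2],
    [1.0, 1.0, 0.2, 0.0],
]
tree = agglomerate(toy)
```

Selecting a final clustering then amounts to choosing where to cut the returned merge sequence; the parameter-free pruning measure the abstract mentions for that step is not reproduced here.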
Query answering and ontology population: An inductive approach
- IN PROC. ESWC-2008
Cited by 13 (10 self)
Abstract. In order to overcome the limitations of deductive logic-based approaches to deriving operational knowledge from ontologies, especially when data come from distributed sources, inductive (instance-based) methods may be better suited, since they are usually efficient and noise-tolerant. In this paper we propose an inductive method for improving instance retrieval and enriching the ontology population. By casting retrieval as a classification problem with the goal of assessing the individual class-memberships w.r.t. the query concepts, we propose an extension of the k-Nearest Neighbor algorithm for OWL ontologies based on an entropic distance measure. The procedure can classify the individuals w.r.t. the known concepts, but it can also be used to retrieve individuals belonging to query concepts. Experimentally, we show that the behavior of the classifier is comparable with that of a standard reasoner. Moreover, we show that new knowledge (not logically derivable) is induced. It can be suggested to the knowledge engineer for validation during the ontology population task.
Induction of optimal semi-distances for individuals based on feature sets
- WORKING NOTES OF THE 20TH INTERNATIONAL DESCRIPTION LOGICS WORKSHOP, DL2007, VOLUME 250 OF CEUR WORKSHOP PROCEEDINGS
, 2007
Cited by 12 (12 self)
Abstract. Many activities related to semantically annotated resources can be enabled by a notion of similarity among them. We propose a method for defining a family of semi-distances over the set of individuals in a knowledge base which can be used in these activities. In the line of works on distance-induction on clausal spaces, the family is parameterized on a committee of concepts. Hence, we also present a method based on the idea of simulated annealing to be used to optimize the choice of the best concept committee.
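A semi-distance parameterized on a committee of concepts can be sketched as follows. The abstract does not give the formula, so everything here is an assumption for illustration: each individual is assumed to be summarized by one projection value per committee concept (1.0 if membership is known to hold, 0.0 if it is known not to hold, 0.5 if undetermined), and the distance is taken to be a normalized Minkowski-style aggregate of the per-concept differences. The name `semi_distance` and the example vectors are hypothetical.

```python
from typing import Sequence


def semi_distance(proj_a: Sequence[float],
                  proj_b: Sequence[float],
                  p: float = 2.0) -> float:
    """Normalized Minkowski-style dissimilarity over committee projections.

    proj_a and proj_b hold one value per committee concept (assumed to be
    0.0, 0.5 or 1.0). The formula is an assumption of this sketch.
    """
    if len(proj_a) != len(proj_b) or len(proj_a) == 0:
        raise ValueError("projections must be non-empty and of equal length")
    if p < 1:
        raise ValueError("p must be at least 1")
    m = len(proj_a)
    total = sum(abs(x - y) ** p for x, y in zip(proj_a, proj_b))
    return (total ** (1.0 / p)) / m


# Hypothetical committee of three concepts.
alice = [1.0, 0.0, 0.5]
bob = [1.0, 1.0, 0.5]
carol = [1.0, 0.0, 0.5]  # different individual, same projections as alice

d_ab = semi_distance(alice, bob, p=1)
d_ac = semi_distance(alice, carol, p=1)
```

The example also shows why such a measure is only a semi-distance: `alice` and `carol` are distinct individuals at distance zero, because the committee contains no concept that tells them apart. A search procedure over committees, such as the simulated annealing mentioned in the abstract, would aim to reduce such collisions.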
Cumulativity As Inductive Bias
Cited by 8 (0 self)
An important difference in inductive bias between machine learning approaches is whether they assume the effects of different properties on a target variable to be cumulative or not. We believe this difference may have an important influence on the performance of machine learning or data mining techniques, and hence should be taken into account when deciding which techniques to use. We illustrate this point with some practical cases. We furthermore point out that in Inductive Logic Programming, most algorithms belong to the class that does not assume cumulativity. We argue for the use and/or development of ILP systems that do make this assumption.
Randomized Metric Induction and Evolutionary Conceptual Clustering for Semantic Knowledge Bases
- ACM-CIKM 2007
, 2007
Cited by 6 (4 self)
We present an evolutionary clustering method which can be applied to multi-relational knowledge bases storing resource annotations expressed in the standard languages for the Semantic Web. The method exploits an effective and language-independent semi-distance measure defined for the space of individual resources, which is based on a finite number of dimensions corresponding to a committee of discriminating features (represented by concept descriptions). A maximally discriminating group of features can be obtained with the randomized optimization methods described in the paper. The clustering algorithm represents the possible clusterings as strings of central elements (medoids, w.r.t. the given metric) of variable length. Hence, the number of clusters is not required as a parameter, since the method is able to find an optimal choice by means of the evolutionary operators and of a proper fitness function. We also show how to assign each cluster a newly constructed intensional definition in the employed concept language. Experiments with some ontologies demonstrate the feasibility of our method and its effectiveness in terms of clustering validity indices.
An Evolutionary Approach to Concept Learning
, 1998
Cited by 4 (0 self)
As my thesis is now written, it is time to pause for a while, to take a look back and thank all the people who have been involved in the process. I would like to start by thanking my supervisor Professor Ralph Back. He has had long-sightedness in allowing me to do what I wanted to do and has had patience with the long incubation of my thesis. Next, my thanks go to the whole machine learning group at the University of Turin with whom I spent a year in 1994-95. Especially, I would like to thank Professors Attilio Giordana and Lorenza Saitta who kindly agreed to let me stay in their group. It is during this period that I started the work that finally led to this thesis. I should also thank a few biochemists. Anders Aspnas and Vic Cockcroft are co-authors of one of the papers underlying this thesis. With them, as well as with Professor Mark Johnson and Jukka Lehtonen, I had some inspiring and illuminating discussions. This work has been made possible with financial support from the Department of Computer Science at Åbo Akademi University and from the
Acquiring and adapting probabilistic models of agent conversation
- In Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS)
, 2005
Cited by 4 (4 self)
Communication in multiagent systems (MASs) is usually governed by agent communication languages (ACLs) and communication protocols carrying a clear-cut semantics. With an increasing degree of openness, however, the need arises for more flexible models of communication that can handle the uncertainty associated with the fact that adherence to a supposedly agreed specification of possible conversations cannot be ensured on the side of other agents. As one example of such a model, interaction frames follow an empirical semantics view of communication, where meaning is defined in terms of expected consequences, and allow for a combination of existing expectations with empirical observation of how communication is used in practice. In this paper, we use methods from the fields of case-based reasoning, inductive logic programming and cluster analysis to devise a formal scheme for the acquisition and adaptation of interaction frames from actual conversations, enabling agents to autonomously (i.e. independent of users and system designers) create and maintain a concise model of the different classes of conversation in a MAS on the basis of an initial set of ACL and protocol specifications. This represents the first rigorous attempt to solve this problem, which is decisive for building truly autonomous agents.
A hierarchical clustering procedure for semantically annotated resources
- PROCEEDINGS OF THE 10TH CONGRESS OF THE ITALIAN ASSOCIATION FOR ARTIFICIAL INTELLIGENCE, AI*IA2007, VOLUME 4733 OF LNAI
, 2007
Cited by 3 (3 self)
Abstract. A clustering method is presented which can be applied to relational knowledge bases. It can be used to discover interesting groupings of resources through their (semantic) annotations expressed in the standard languages employed for modeling concepts in the Semantic Web. The method exploits a simple (yet effective and language-independent) semi-distance measure for individuals, which is based on the resource semantics w.r.t. a number of dimensions corresponding to a committee of features represented by a group of concept descriptions (discriminating features). The algorithm is a fusion of the classic Bisecting k-Means with approaches based on medoids, since they are intended to be applied to relational representations. We discuss its complexity and the potential applications to a variety of important tasks.
1 Learning Methods for Concept Languages
In the inherently distributed applications related to the Semantic Web (henceforth SW) there is an extreme need of automatizing those activities which are
Induction of Optimal Semantic Semi-distances for Clausal Knowledge Bases
Cited by 2 (1 self)
Abstract. Several activities related to semantically annotated resources can be enabled by a notion of similarity, spanning from clustering to retrieval, matchmaking and other forms of inductive reasoning. We propose the definition of a family of semi-distances over the set of objects in a knowledge base which can be used in these activities. In the line of works on distance-induction on clausal spaces, the family is parameterized on a committee of concepts expressed with clauses. Hence, we also present a method based on the idea of simulated annealing to be used to optimize the choice of the best concept committee.
Approximate Measures of Semantic Dissimilarity under Uncertainty
Cited by 2 (1 self)
Abstract. We propose semantic distance measures based on the criterion of approximate discernibility and on evidence combination. In the presence of incomplete knowledge, the distance measures the degree of belief in the discernibility of two individuals by combining estimates of basic probability masses related to a set of discriminating features. We also suggest ways to extend this distance for comparing individuals to concepts and concepts to other concepts.

