Results 1 - 10
of
68
Dbpedia spotlight: Shedding light on the web of documents
- In Proceedings of the 7th International Conference on Semantic Systems (I-Semantics
, 2011
"... Interlinking text documents with Linked Open Data enables the Web of Data to be used as background knowledge within document-oriented applications such as search and faceted browsing. As a step towards interconnecting the Web of Documents with the Web of Data, we developed DBpedia Spotlight, a syste ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
Interlinking text documents with Linked Open Data enables the Web of Data to be used as background knowledge within document-oriented applications such as search and faceted browsing. As a step towards interconnecting the Web of Documents with the Web of Data, we developed DBpedia Spotlight, a system for automatically annotating text documents with DBpedia URIs. DBpedia Spotlight allows users to configure the annotations to their specific needs through the DBpedia Ontology and quality measures such as prominence, topical pertinence, contextual ambiguity and disambiguation confidence. We compare our approach with the state of the art in disambiguation, and evaluate our results in light of three baselines and six publicly available annotation systems, demonstrating the competitiveness of our system. DBpedia Spotlight is shared as open source and deployed as a Web Service freely available for public use.
DSNotify: Handling Broken Links in the Web of Data
- In 19th International WWW Conference (WWW2010
"... The Web of Data has emerged as a way of exposing structured linked data on the Web. It builds on the central building blocks of the Web (URIs, HTTP) and benefits from its simplicity and wide-spread adoption. It does, however, also inherit the unresolved issues such as the broken link problem. Broken ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
The Web of Data has emerged as a way of exposing structured linked data on the Web. It builds on the central building blocks of the Web (URIs, HTTP) and benefits from its simplicity and wide-spread adoption. It does, however, also inherit the unresolved issues such as the broken link problem. Broken links constitute a major challenge for actors consuming Linked Data as they require them to deal with reduced accessibility of data. We believe that the broken link problem is a major threat to the whole Web of Data idea and that both Linked Data consumers and providers will require solutions that deal with this problem. Since no general solutions for fixing such links in the Web of Data have emerged, we make three contributions into this direction: first, we provide a concise definition of the broken link problem and a comprehensive analysis of existing approaches. Second, we present DSNotify, a generic framework able to assist human and machine actors in fixing broken links. It uses heuristic feature comparison and employs a time-interval-based blocking technique for the underlying instance matching problem. Third, we derived benchmark datasets from knowledge bases such as DBpedia and evaluated the effectiveness of our approach with respect to the broken link problem. Our results show the feasibility of a time-interval-based blocking approach for systems that aim at detecting and fixing broken links in the Web of Data.
Topicnets: Visual analysis of large text corpora with topic modeling
- ACM Transactions on Intelligent Systems and Technology, 2011
"... We present TopicNets, a web-based system for visual and interactive analysis of large sets of documents using statistical topic models. A range of visualization types and control mechanisms to support knowledge discovery are presented. These include corpus and document specific views, iterative topi ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
We present TopicNets, a web-based system for visual and interactive analysis of large sets of documents using statistical topic models. A range of visualization types and control mechanisms to support knowledge discovery are presented. These include corpus and document specific views, iterative topic modeling, search, and visual filtering. Drill-down functionality is provided to allow analysts to visualize individual document sections and their relations within the global topic space. Analysts can search across a data set through a set of expansion techniques on selected document and topic nodes. Furthermore, analysts can select relevant subsets of documents and perform real-time topic modeling on these subsets to interactively visualize topics at various levels of granularity, allowing for a better understanding of the documents. A discussion of the design and implementation choices for each visual analysis technique is presented. This is followed by a discussion of three diverse use cases in which TopicNets enables fast discovery of information that is otherwise hard to find. These include a corpus of 50,000 successful NSF grant proposals, 10,000 publications from a large research center, and single documents including a grant proposal and a PhD thesis.
Semantic Tags Generation and Retrieval for Online Advertising
"... One of the main problems in online advertising is to display ads which are relevant and appropriate w.r.t. what the user is looking for. Often search engines fail to reach this goal as they do not consider semantics attached to keywords. In this paper we propose a system that tackles the problem by ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
One of the main problems in online advertising is to display ads which are relevant and appropriate w.r.t. what the user is looking for. Often search engines fail to reach this goal as they do not consider semantics attached to keywords. In this paper we propose a system that tackles the problem by two different angles: help (i) advertisers to create more efficient ads campaigns and (ii) ads providers to properly match ads content to keywords in search engines. We exploit semantic relations stored in the DBpedia dataset and use an hybrid ranking system to rank keywords and to expand queries formulated by the user. Inputs of our ranking system are (i) the DBpedia dataset; (ii) external information sources such as classical search engine results and social tagging systems. We compare our approach with other RDF similarity measures, proving the validity of our algorithm with an extensive evaluation involving real users.
LESS- Template-Based Syndication and Presentation of Linked Data
"... Abstract. Recently, the publishing of structured, semantic information as linked data has gained quite some momentum. For ordinary users on the Internet, however, this information is not yet very visible and (re-) usable. With LESS we present an end-to-end approach for the syndication and use of lin ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
Abstract. Recently, the publishing of structured, semantic information as linked data has gained quite some momentum. For ordinary users on the Internet, however, this information is not yet very visible and (re-) usable. With LESS we present an end-to-end approach for the syndication and use of linked data based on the definition of templates for linked data resources and SPARQL query results. Such syndication templates are edited, published and shared by using a collaborative Web platform. Templates for common types of entities can then be combined with specific, linked data resources or SPARQL query results and integrated into a wide range of applications, such as personal homepages, blogs/wikis, mobile widgets etc. In order to improve reliability and performance of linked data, LESS caches versions either for a certain time span or for the case of inaccessibility of the original source. LESS supports the integration of information from various sources as well as any text-based output formats. This allows not only to generate HTML, but also diagrams, RSS feeds or even complete data mashups without any programming involved. 1
Semantic Wonder Cloud: exploratory search in
"... Abstract. Inspired by the Google Wonder Wheel 3, in this paper we present Semantic Wonder Cloud (SWOC): a tool that helps users in knowledge exploration within the DBpedia dataset by adopting a hybrid approach. We describe both the architecture and the user interface. The system exploits not only pu ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
Abstract. Inspired by the Google Wonder Wheel 3, in this paper we present Semantic Wonder Cloud (SWOC): a tool that helps users in knowledge exploration within the DBpedia dataset by adopting a hybrid approach. We describe both the architecture and the user interface. The system exploits not only pure semantic connections in the underlying RDF graph but it mixes the meaning of such information with external non-semantic knowledge sources, such as web search engines and tagging systems. Semantic Wonder Cloud allows the user to explore the relations between resources of knowledge domain via a simple and intuitive graphical interface. 1
Uby - A Large-Scale Unified Lexical-Semantic Resource
- In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012
, 2012
"... We present UBY, a large-scale lexicalsemantic resource combining a wide range of information from expert-constructed and collaboratively constructed resources for English and German. It currently contains nine resources in two languages: ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
We present UBY, a large-scale lexicalsemantic resource combining a wide range of information from expert-constructed and collaboratively constructed resources for English and German. It currently contains nine resources in two languages:
The Impact of Multifaceted Tagging on Learning Tag Relations and Search
"... Abstract. In this paper we present a model for multifaceted tagging, i.e. tagging enriched with contextual information. We present TagMe!, a social tagging front-end for Flickr images, that provides multifaceted tagging functionality: It enables users to attach tag assignments to a specific area wit ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Abstract. In this paper we present a model for multifaceted tagging, i.e. tagging enriched with contextual information. We present TagMe!, a social tagging front-end for Flickr images, that provides multifaceted tagging functionality: It enables users to attach tag assignments to a specific area within an image and to categorize tag assignments. Moreover, TagMe! maps tags and categories to DBpedia URIs to clearly define the meaning of freely-chosen words. Our experiments reveal the benefits of these additional tagging facets. For example, the exploitation of the facets significantly improves the performance of FolkRank-based search. Further, we demonstrate the benefits of TagMe! tagging facets for learning semantics within folksonomies. 1
A Mobile Touchable Application for Online Topic Graph Extraction and Exploration of Web Content
"... We present a mobile touchable application for online topic graph extraction and exploration of web content. The system has been implemented for operation on an iPad. The topic graph is constructed from N web snippets which are determined by a standard search engine. We consider the extraction of a t ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
We present a mobile touchable application for online topic graph extraction and exploration of web content. The system has been implemented for operation on an iPad. The topic graph is constructed from N web snippets which are determined by a standard search engine. We consider the extraction of a topic graph as a specific empirical collocation extraction task where collocations are extracted between chunks. Our measure of association strength is based on the pointwise mutual information between chunk pairs which explicitly takes their distance into account. An initial user evaluation shows that this system is especially helpful for finding new interesting information on topics about which the user has only a vague idea or even no idea at all. 1
Boosting relation extraction with limited closed-world knowledge
- In Proceedings of COLING ’10, Poster Session. Association for Computational Linguistics
, 2010
"... This paper presents a new approach to improving relation extraction based on minimally supervised learning. By adding some limited closed-world knowledge for confidence estimation of learned rules to the usual seed data, the precision of relation extraction can be considerably improved. Starting fro ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
This paper presents a new approach to improving relation extraction based on minimally supervised learning. By adding some limited closed-world knowledge for confidence estimation of learned rules to the usual seed data, the precision of relation extraction can be considerably improved. Starting from an existing baseline system we demonstrate that utilizing limited closed world knowledge can effectively eliminate ”dangerous ” or plainly wrong rules during the bootstrapping process. The new method improves the reliability of the confidence estimation and the precision value of the extracted instances. Although recall suffers to a certain degree depending on the domain and the selected settings, the overall performance measured by F-score considerably improves. Finally we validate the adaptability of the best ranking method to a new domain and obtain promising results. 1

