According to the Hallmarks of Cancer
Abstract
Motivation: The hallmarks of cancer by Hanahan and Weinberg (2000, 2011) have become highly influential in cancer research. They reduce the complexity of cancer into ten principles (e.g. resisting cell death, sustaining proliferative signaling) that explain the biological capabilities acquired during the development of human tumours. Since new research depends crucially on existing knowledge, technology for semantic classification of scientific literature according to the hallmarks of cancer could greatly support literature review, knowledge discovery and applications in cancer research.
Results: We present the first step towards the development of such technology. We introduce a corpus of 1,499 PubMed abstracts annotated according to the scientific evidence they provide for the ten currently known hallmarks of cancer. We use this corpus to train a system that classifies PubMed literature according to the hallmarks. The system uses supervised machine learning and rich features largely based on biomedical text mining. We report good performance in both intrinsic and extrinsic evaluations, demonstrating both the accuracy of the methodology and its potential in supporting practical cancer research. We discuss how this approach could be developed and applied further in the future.
Availability: The corpus of hallmark-annotated PubMed abstracts and the software for classification are available at:
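Since one abstract can provide evidence for several hallmarks at once, the classification task described above is multi-label. The following is only a minimal sketch of that setup, not the authors' system: it trains one independent perceptron per label on bag-of-words tokens, whereas the abstract mentions rich text-mining features and does not name the learner. The label names, the toy training sentences and the helper functions are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical label names standing in for two of the ten hallmarks.
LABELS = ["cell_death", "proliferation"]

# Toy training data invented for this sketch (NOT from the annotated corpus).
TRAIN = [
    ("tumour cells evade apoptosis", {"cell_death"}),
    ("growth factor signalling drives proliferation", {"proliferation"}),
    ("blocked apoptosis with active proliferation signalling",
     {"cell_death", "proliferation"}),
    ("patients were followed after surgery", set()),
]


def tokens(text):
    """Lower-cased whitespace tokens as a set (binary bag of words)."""
    return set(text.lower().split())


def train_perceptron(examples, label, epochs=10):
    """Train a binary perceptron that decides whether `label` applies."""
    weights, bias = defaultdict(int), 0
    for _ in range(epochs):
        for text, labels in examples:
            target = 1 if label in labels else -1
            score = bias + sum(weights[t] for t in tokens(text))
            predicted = 1 if score > 0 else -1
            if predicted != target:
                # Standard perceptron update on a misclassified example.
                for t in tokens(text):
                    weights[t] += target
                bias += target
    return weights, bias


def train_all(examples, labels):
    """One independent binary classifier per label (binary relevance)."""
    return {label: train_perceptron(examples, label) for label in labels}


def classify(models, text):
    """Return the set of labels whose classifier scores the text above zero."""
    return {
        label
        for label, (weights, bias) in models.items()
        if bias + sum(weights.get(t, 0) for t in tokens(text)) > 0
    }


models = train_all(TRAIN, LABELS)
print(classify(models, "apoptosis and proliferation"))
```

Training one binary classifier per hallmark lets an abstract receive zero, one or several labels, which a single multi-class classifier could not express.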
TEES 2.2: Biomedical Event Extraction for Diverse
, 2013
Abstract
Background: The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks.
Results: The TEES system was quickly adapted to the BioNLP’13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP’13 tasks. Further, the scikit-learn machine learning library is integrated into the system, making a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets.
Conclusions: The TEES system was introduced for the BioNLP’09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks, an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction is presented.
cancer
Abstract
Data and text mining Cell line name recognition in support of the identification of synthetic lethality in
Event-based textmining for biology and functional genomics
, 2014
Abstract
The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of ‘events’, i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research.
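To make the idea of an event as a categorised, structured representation more concrete, and to show what a structured query over extracted events might look like, here is a small sketch. The `Entity` and `Event` classes, the `find_events` helper and the placeholder names ProteinA and ProteinB are assumptions made for illustration. The event types and the Theme/Cause roles follow the general style of the BioNLP shared-task annotations, but nothing here reproduces a particular system's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str   # surface string, e.g. a protein name
    type: str   # entity category, e.g. "Protein"


@dataclass(frozen=True)
class Event:
    type: str         # event category, e.g. "Phosphorylation"
    trigger: str      # word in the sentence that signals the event
    arguments: tuple  # (role, Entity or Event) pairs; events may nest


def find_events(events, event_type=None, role=None, entity_text=None):
    """Return events matching an optional type and an optional argument filter."""
    matches = []
    for ev in events:
        if event_type is not None and ev.type != event_type:
            continue
        if role is not None or entity_text is not None:
            # At least one argument must satisfy every filter that was given.
            if not any(
                (role is None or r == role)
                and (entity_text is None
                     or (isinstance(a, Entity) and a.text == entity_text))
                for r, a in ev.arguments
            ):
                continue
        matches.append(ev)
    return matches


# Hypothetical extraction output for "ProteinA induces phosphorylation of
# ProteinB" plus an unrelated expression event.
phos = Event("Phosphorylation", "phosphorylation",
             (("Theme", Entity("ProteinB", "Protein")),))
reg = Event("Positive_regulation", "induces",
            (("Cause", Entity("ProteinA", "Protein")), ("Theme", phos)))
expr = Event("Gene_expression", "expression",
             (("Theme", Entity("ProteinA", "Protein")),))
EVENTS = [phos, reg, expr]

# Structured query: events in which ProteinA acts as the cause.
print(find_events(EVENTS, role="Cause", entity_text="ProteinA"))
```

Unlike a keyword search for "ProteinA", the query distinguishes events where the protein is the cause from events where it is the theme.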
biomedical events from the literature
, 2013
"... graph-based patterns to extract ..."
annotations of gene-cancer relations
Abstract
Background: In order to access the large amount of information in biomedical literature about genes implicated in various cancers both efficiently and accurately, the aid of text mining (TM) systems is invaluable. Current TM systems target either gene-cancer relations or biological processes involving genes and cancers, but the former type produces information not comprehensive enough to explain how a gene affects a cancer, and the latter does not provide a concise summary of gene-cancer relations.
Results: In this paper, we present a corpus for the development of TM systems that specifically target gene-cancer relations but are still able to capture complex information in biomedical sentences. We describe CoMAGC, a corpus with multi-faceted annotations of gene-cancer relations. In CoMAGC, a piece of annotation is composed of four semantically orthogonal concepts that together express 1) how a gene changes, 2) how a cancer changes and 3) the causality between the gene and the cancer. The multi-faceted annotations are shown to have high inter-annotator agreement. In addition, we show that the annotations in CoMAGC allow us to infer the prospective roles of genes in cancers and to classify the genes into three classes according to the inferred roles. We encode the mapping between multi-faceted annotations and gene classes into 10 inference rules. The inference rules produce results with high accuracy as measured against human annotations. CoMAGC consists of 821 sentences on prostate, breast and ovarian cancers. Currently, we deal with changes in gene expression levels among other types of gene changes. The corpus is available at
Nguyen et al. BMC Bioinformatics (2015) 16:107 DOI 10.1186/s12859-015-0538-8 RESEARCH ARTICLE Open Access
"... Wide-coverage relation extraction from MEDLINE using deep syntax ..."