Automatically identifying gene/protein terms in MEDLINE abstracts (0)

by H Yu
Venue: J. of Biomedical Informatics

Results 1 - 10 of 10

A Framework for Schema-Driven Relationship Discovery from Unstructured text

by Cartic Ramakrishnan, Krys J. Kochut, Amit P. Sheth - Proc. 5th International Semantic Web Conference, 2006
Cited by 27 (9 self)
Abstract. We address the issue of extracting implicit and explicit relationships between entities in biomedical text. We argue that entities seldom occur in text in their simple form and that relationships in text relate the modified, complex forms of entities with each other. We present a rule-based method for (1) extraction of such complex entities and (2) relationships between them and (3) the conversion of such relationships into RDF. Furthermore, we present results that clearly demonstrate the utility of the generated RDF in discovering knowledge from text corpora by means of locating paths composed of the extracted relationships. Keywords: Relationship Extraction, Knowledge-Driven Text Mining

Citation Context

...emical relationship) to specific (e.g., regulatory relationships). This becomes clear when we look at the approaches to relationship extraction surveyed in [8]. These include pattern based approaches [11] where patterns such as “also known as” are used to identify synonymy in protein and gene names. Template based approaches have also been investigated in the PASTA system [12]. Natural Language Proces...

The AMTEx Approach in the Medical Document Indexing and Retrieval Application

by Angelos Hliaoutakis, Kalliopi Zervanou, Euripides G. M. Petrakis
Cited by 11 (5 self)
AMTEx is a medical document indexing method, specifically designed for the automatic indexing of documents in large medical collections, such as MEDLINE, the premier bibliographic database of the U.S. National Library of Medicine (NLM). AMTEx combines MeSH, the terminological thesaurus resource of NLM, with a well-established method for extraction of terminology, the C/NC-value method. The performance evaluation of two AMTEx configurations is measured against the current state-of-the-art, the MetaMap Transfer (MMTx) method in four experiments, using two types of corpora: a subset of MEDLINE (PMC) full document corpus and a subset of MEDLINE (OHSUMED) abstracts, for each of the indexing and retrieval tasks respectively. The experimental results demonstrate that AMTEx performs better in indexing in 20-50% of the processing time compared to MMTx, while for the retrieval task, AMTEx performs better in the full text (PMC) corpus.

Citation Context

...ues (e.g. [10], [17], [18]). The extraction of terms for the medical, biological and biomedical domain has greatly motivated research for both indexing, as well as knowledge extraction purposes [15], [19], [20], [21]. In the specific context of term extraction for indexing purposes, the main objective of the term extraction process is the identification of discrete content indicators, namely index ter...

Medical Document Indexing and Retrieval: AMTEx vs. NLM MMTx

by Angelos Hliaoutakis, Kalliopi Zervanou, Euripides G. M. Petrakis
Cited by 4 (2 self)
AMTEx is a medical document indexing method, specifically designed for the automatic indexing of documents in large medical collections, such as MEDLINE, the premier bibliographic database of the U.S. National Library of Medicine (NLM). AMTEx combines MeSH, the terminological thesaurus resource of NLM, with a well-established method for term extraction, the C/NC-value method. The performance evaluation of two AMTEx configurations is measured against the current state-of-the-art, the MMTx method in indexing and retrieval tasks in three experiments. In the first, a subset of MEDLINE (PMC) full document corpus was used for the indexing task. In the second and third, a subset of MEDLINE (OHSUMED) abstracts was used for indexing and retrieval respectively. The experimental results demonstrate that AMTEx achieves better precision in all tasks, in 20-50% of the processing time compared to MMTx.

Citation Context

...ques (e.g. [9], [14], [15]). The extraction of terms for the medical, biological and biomedical domain has greatly motivated research for both indexing, as well as knowledge extraction purposes [12], [16], [17], [18]. In the specific context of term extraction for indexing purposes, the main objective of the term extraction process is the identification of discrete content indicators, namely index ter...

KNOWLEDGE MANAGEMENT, DATA MINING, AND TEXT MINING IN MEDICAL INFORMATICS

by Hsinchun Chen, Sherrilynne S. Fuller, Carol Friedman
Cited by 3 (0 self)
In this chapter we provide a broad overview of selected knowledge management, data mining, and text mining techniques and their use in various emerging biomedical applications. It aims to set the context for subsequent chapters. We first introduce five major paradigms for machine learning and data analysis including: probabilistic and statistical models, symbolic learning and rule induction, neural networks, evolution-based algorithms, and analytic learning and fuzzy logic. We also discuss their relevance and potential for biomedical research. Example applications of relevant knowledge management, data mining, and text mining research are then reviewed in order including: ontologies; knowledge management for health care, biomedical literature, heterogeneous databases, information visualization, and multimedia databases; and data and text mining for health care, literature, and biological data. We conclude the paper with discussions of privacy and confidentiality issues of relevance to biomedical data mining.

Citation Context

...g models had comparable performance. Other studies have investigated the mapping between abbreviations and full names such that these names will not be considered by the system as different entities (Yu et al., 2002). After the entity names have been identified, further analyses are performed to see whether these entities have any relationships, such as gene regulations, metabolic pathways, or protein-protein in...

Using MEDLINE as a Knowledge Source for Disambiguating Abbreviations in Full-Text Biomedical Journal Articles

by Hong Yu, Won Kim, Vasileios Hatzivassiloglou, W. John Wilbur
Biomedical abbreviations and acronyms are widely used in biomedical literature. Since many abbreviations represent important content in biomedical literature, information retrieval and extraction benefits from identifying the meanings of biomedical abbreviations. Since many abbreviations are ambiguous, it would be important to map abbreviations to their full forms, which ultimately represent the meanings of the abbreviations. In this study, we present a novel unsupervised method that applies MEDLINE records as a knowledge source for disambiguating abbreviations in full-text biomedical journal articles. We first automatically generated from MEDLINE records a knowledge source or dictionary of abbreviation-full pairs. We then trained on MEDLINE records and predicted the full forms of abbreviations in full-text journal articles by applying supervised machine-learning algorithms in an unsupervised fashion. We report up to 92% prediction precision and up to 91% coverage. Keywords: Disambiguation, terminology, information retrieval, information extraction, machine learning

Citation Context

...et of rules or patterns [3-8], or a statistical disambiguation method chooses one of the full forms for an abbreviation based on the contexts (nearby words) the abbreviation occurs in [9-13]. GPmarkup [8,13] is one of the systems that applies knowledge-based pattern-matching rules to map biomedical abbreviations to full forms when the full forms are defined. The pattern-matching rules include: Proceeding...

A Framework for Schema-Driven Relationship Discovery from Unstructured text

by Cartic Ramakrishnan, Krys J. Kochut, Amit P. Sheth
Abstract. We address the issue of extracting implicit and explicit relationships between entities in biomedical text. We argue that entities seldom occur in text in their simple form and that relationships in text relate the modified, complex forms of entities with each other. We present a rule-based method for (1) extraction of such complex entities and (2) relationships between them and (3) the conversion of such relationships into RDF. Furthermore, we present results that clearly demonstrate the utility of the generated RDF in discovering knowledge from text corpora by means of locating paths composed of the extracted relationships.

Citation Context

...(ISWC2006), Athens, GA, Nov. 6-9, 2006. regulatory relationships). This becomes clear when we look at the approaches to relationship extraction surveyed in [8]. These include pattern based approaches [11] where patterns such as “also known as” are used to identify synonymy in protein and gene names. Template based approaches have also been investigated in the PASTA system [12]. Natural Language Proces...

From Biomedical Literature to Knowledge: Mining Protein-Protein Interactions

by Deyu Zhou, Yulan He, Chee Keong Kwoh
Summary. To date, more than 16 million citations of published articles in biomedical domain are available in the MEDLINE database. These articles describe the new discoveries which accompany a tremendous development in biomedicine during the last decade. It is crucial for biomedical researchers to retrieve and mine some specific knowledge from the huge quantity of published articles with high efficiency. Researchers have been engaged in the development of text mining tools to find knowledge such as protein-protein interactions, which are most relevant and useful for specific analysis tasks. This chapter provides a road map to the various information extraction methods in biomedical domain, such as protein name recognition and discovery of protein-protein interactions. Disciplines involved in analyzing and processing unstructured-text are summarized. Current work in biomedical information extracting is categorized. Current challenges in the field are also presented and possible solutions are discussed.

Citation Context

.... 99 abstracts were randomly selected from MEDLINE to form the training corpus and 101 MEDLINE abstracts formed the test corpus. Yapex achieved a recall of 66.4% and a precision of 67.8%. In GPmarkup [53], abbreviations were first mapped to full names using a set of guidelines and protein symbols were mapped to the names by a set of pattern-matching rules. The mappings were performed on 11 million MED...

Methodological Review: Term Identification in the Biomedical Literature

by Michael Krauthammer, Goran Nenadić
Sophisticated information technologies are needed for effective data acquisition and integration from a growing body of the biomedical literature. Successful term identification is key to getting access to the stored literature information, as it is the terms (and their relationships) that convey knowledge across scientific articles. Due to the complexities of a dynamically changing biomedical terminology, term identification has been recognized as the current bottleneck in text mining, and – as a consequence – has become an important research topic both in natural language processing and biomedical communities. This article overviews state-of-the-art approaches in term identification. The process of identifying terms is analysed through three steps: term recognition, term classification and term mapping. For each step, main approaches and general trends, along with the major problems, are discussed. By assessing previous work in context of the overall term identification process, the review also tries to delineate needs for future work in the field.

Citation Context

...ntification. Although there are many existing acronym repositories in the biomedical field [52, 53], it has been reported that such resources cover only parts of the acronyms that appear in documents [54]. The discrepancy between curated acronym resources and the wealth of acronyms defined in biomedical articles fostered the development of several acronym recognition systems. In order to locate potent...

Semantic Annotation of Biomedical Literature using

by Rune Sætre, Amund Tveit, Tonje S. Steigedal, Astrid Lægreid
Abstract not found

Citation Context

...of Biomedical Literature Other approaches for (semantic) annotation (mainly for protein and gene names) of biomedical literature include: – Rule-based discovery of names (e.g. of proteins and genes), [13,29,36,35] – Methods for discovering relationships of proteins and genes, [2,16]. – Classifier approaches (machine learning) with textual context as features, [4,5,6,14,1,20,30,21,17] – Other approaches include...

H.3.1 [Content Analysis and Indexing

by Cartic Ramakrishnan, Amit P. Sheth
In this paper we define Semantic Trails as a series of simple and complex relationships connecting entities in documents across a corpus. We propose a novel algorithm that uses a dependency parse of sentences in documents to extract these simple and complex relationships. The extracted relationships are represented in RDF. We evaluate the RDF that is extracted and present an analysis of the errors, their causes and possible future enhancements. Following extraction, we describe how this extracted RDF is superimposed back onto the original text to realize what we term as Semantic Trails through text. Using the TREC 2006 Genomics Track data as a use case we demonstrate how the result of our improved extraction mechanism can be used to build a Semantic Browser that supports the creation and use of Semantic Trails.

Citation Context

...mical relationship) to specific (e.g., regulatory relationships). This becomes clear when we look at the approaches to relationship extraction surveyed in [20]. These include pattern based approaches [23] where patterns such as “also known as” are used to identify synonymy in protein and gene names. Template based approaches have also been investigated in the PASTA system [24]. Natural Language Proces...
