
Robust relational parsing over biomedical literature: Extracting inhibit relations. Pacific Symposium on Biocomputing (2002)

by J Pustejovsky, J Castano, J Zhang, B Cochran, M Kotecki
Results 1 - 10 of 147

Accomplishments and Challenges in Literature Data Mining for Biology

by Lynette Hirschman, Jong C. Park, Junichi Tsujii, Limsoon Wong, Cathy H. Wu, 2002
Cited by 161 (12 self)
We review recent results in literature data mining for biology and discuss the need and the steps for a challenge evaluation for this field. Literature data mining has progressed from simple recognition of terms to extraction of interaction relationships from complex sentences, and has broadened from recognition of protein interactions to a range of problems such as improving homology search, identifying cellular location, and so on. To encourage participation and accelerate progress in this expanding field, we propose creating challenge evaluations, and we describe two specific applications in this context.

Citation Context

...nsiderably more complex sentences [7, 8]. This year, we see the emergence of sophisticated natural language technologies that can handle anaphora, as well as extracting a broader range of information [9, 10, 11, 12, 13, 14]. For the past three years, the Pacific Symposium on Biocomputing series has had a track dedicated to natural language processing and information extraction in biology. The response to the call for pa...

Mining the Biomedical Literature in the Genomic Era: An Overview

by Hagit Shatkay, Ronen Feldman - Journal of Computational Biology, 2003
Cited by 132 (5 self)
The past decade has seen a tremendous growth in the amount of experimental and computational biomedical data, specifically in the areas of Genomics and Proteomics. This growth is accompanied by an accelerated increase in the number of biomedical publications discussing the findings. In the last few years there is a lot of interest within the scientific community in literature-mining tools to help sort through this abundance of literature, and find the nuggets of information most relevant and useful for specific analysis tasks. This paper

Comparative Experiments on Learning Information Extractors for Proteins and their Interactions

by Razvan Bunescu, Ruifang Ge, Rohit J. Kate, Edward M. Marcotte, Raymond J. Mooney, Arun K. Ramani, Yuk Wah Wong, 2004
Cited by 106 (7 self)
Automatically extracting information from biomedical text holds the promise of easily consolidating large amounts of biological knowledge in computer-accessible form. This strategy is particularly attractive for extracting data relevant to genes of the human genome from the 11 million abstracts in Medline. However, extraction efforts have been frustrated by the lack of conventions for describing human genes and proteins. We have developed and evaluated a variety of learned information extraction systems for identifying human protein names in Medline abstracts and subsequently extracting information on interactions between the proteins. We demonstrate that machine learning approaches using support vector machines and maximum entropy are able to identify human proteins with higher accuracy than several previous approaches. We also demonstrate that various rule induction methods are able to identify protein interactions with higher precision than manually-developed rules.

Discovering patterns to extract protein-protein interactions from full texts

by Minlie Huang, et al. - Bioinformatics, 2004
Cited by 97 (6 self)
Motivation: Although there are several databases storing protein–protein interactions, most such data still exist only in the scientific literature. They are scattered in scientific literature written in natural languages, defying data mining efforts. Much time and labor have to be spent on extracting protein pathways from literature. Our aim is to develop a robust and powerful methodology to mine protein–protein interactions from biomedical texts. Results: We present a novel and robust approach for extracting protein–protein interactions from literature. Our method uses a dynamic programming algorithm to compute distinguishing patterns by aligning relevant sentences and key verbs that describe protein interactions. A matching algorithm is designed to extract the interactions between proteins. Equipped only with a dictionary of protein names, our system achieves a recall rate of 80.0% and precision rate of 80.5%.

Protein structures and information extraction from biological texts: The PASTA system

by R. Gaizauskas, G. Demetriou, P. J. Artymiuk, P. Willett - Journal of Bioinformatics, 2003
Cited by 79 (6 self)
Motivation: The rapid increase in volume of protein structure literature means useful information may be hidden or lost in the published literature and the process of finding relevant material, sometimes the rate-determining factor in new research, may be arduous and slow. Results: We describe the Protein Active Site Template Acquisition (PASTA) system, which addresses these problems by performing automatic extraction of information relating to the roles of specific amino acid residues in protein molecules from online scientific articles and abstracts. Both the terminology recognition and extraction capabilities of the system have been extensively evaluated against manually annotated data and the results compare favourably with state-of-the-art results obtained in less challenging domains. PASTA is the first information extraction (IE) system developed for the protein structure domain and one of the most thoroughly evaluated IE system operating on biological scientific text to date. Availability: PASTA makes its extraction results available via a browser-based front end:

Citation Context

...nformation have also been addressed by the bioinformatics community. These include protein–protein interactions (Blaschke et al., 1999; Thomas et al., 2000; Park et al., 2001; Yakushiji et al., 2001; Pustejovsky et al., 2002), relations between genes and drugs (Rindflesh et al., 2000) and identification of metabolic pathways (Humphreys et al., 2000; Leroy and Chen, 2002). The approaches employed vary widely and may be ch...

Classifying semantic relations in bioscience texts

by Barbara Rosario, Marti A. Hearst, 2004
Cited by 71 (1 self)
A crucial step toward the goal of automatic extraction of propositional information from natural language text is the identification of semantic relations between constituents in sentences. We examine the problem of distinguishing among seven relation types that can occur between the entities “treatment” and “disease” in bioscience text, and the problem of identifying such entities. We compare five generative graphical models and a neural network, using lexical, syntactic, and semantic features, finding that the latter help achieve high classification accuracy.

GeneWays: a system for extracting, analyzing, visualizing, and integrating molecular pathway data

by Andrey Rzhetsky, Ivan Iossifov, Tomohiro Koike, Michael Krauthammer, Pauline Kra, Mitzi Morris, Hong Yu, Pablo Ariel Duboué, Wubin Weng, W. John Wilbur, Vasileios Hatzivassiloglou, Carol Friedman - Journal of Biomedical Informatics, 2004
Cited by 65 (4 self)
The immense growth in the volume of research literature and experimental data in the field of molecular biology calls for efficient automatic methods to capture and store information. In recent years, several groups have worked on specific problems in this area, such as automated selection of articles pertinent to molecular biology, or automated extraction of information using natural-language processing, information visualization, and generation of specialized knowledge bases for molecular biology. GeneWays is an integrated system that combines several such subtasks. It analyzes interactions between molecular substances, drawing on multiple sources of information to infer a consensus view of molecular networks. GeneWays is designed as an open platform, allowing researchers to query, review, and critique stored information.

Citation Context

...n Japan [56] that uses knowledge extraction from both article abstracts and full articles to cross-index those articles with Internet-based databases, and the United States-developed MEDSTRACT system [32,57] that extracts relationships of the form “A inhibits B” from journal article abstracts. We have set the context for our own project, GeneWays, by covering briefly the work that other groups have done...

Getting to the (C)Ore of Knowledge: Mining Biomedical Literature

by Berry De Bruijn, Joel Martin - Int J Med Inform
Cited by 56 (1 self)
Literature mining is the process of extracting and combining facts from scientific publications. In recent years, many computer programs have been designed to extract various molecular biology findings from Medline abstracts or full-text articles. The present article describes the range of text mining techniques that have been applied to scientific documents. It divides ‘automated reading’ into four general subtasks: text categorization, named entity tagging, fact extraction, and collection-wide analysis. Literature mining offers powerful methods to support knowledge discovery and the construction of topic maps and ontologies. An overview is given of recent developments in medical language processing. Special attention is given to the domain particularities of molecular biology, and the emerging synergy between literature mining and molecular databases accessible through Internet.

Citation Context

...y, even though automated understanding is not fully possible, important relationships can be discovered by performing a full syntactic parse, where relations between syntactic components are inferred [65,71,72]. This approach is similar to the template searching except that it is not domain specific and attempts to identify many or all relationships in a sentence. Park [73] illustrates the syntactical compl...

A Shallow Parser Based on Closed-Class Words to Capture Relations in Biomedical Text

by Gondy Leroy, Hsinchun Chen, Jesse D. Martinez, 2003
Cited by 51 (6 self)
Natural language processing for biomedical text currently focuses mostly on entity and relation extraction. These entities and relations are usually pre-specified entities, e.g., proteins, and pre-specified relations, e.g., inhibit relations. A shallow parser that captures the relations between noun phrases automatically from free text has been developed and evaluated. It uses heuristics and a noun phraser to capture entities of interest in the text. Cascaded finite state automata structure the relations between individual entities. The automata are based on closed-class English words and model generic relations not limited to specific words. The parser also recognizes coordinating conjunctions and captures negation in text, a feature usually ignored by others. Three cancer researchers evaluated 330 relations extracted from 26 abstracts of interest to them. There were 296 relations correctly extracted from the abstracts resulting in 90% precision of the relations and an average of 11 correct relations per abstract.

Ontology Learning from Text: An Overview

by Paul Buitelaar, Bernardo Magnini - In Buitelaar, P., Cimiano, P., Magnini, B. (Eds.), Ontology Learning from Text: Methods, Applications and Evaluation, 2005
Cited by 41 (0 self)
Abstract not found

Citation Context

.... The goal of this work is to discover new relationships between known concepts (i.e. symptoms, drugs, diseases, . . . ) by analyzing large quantities of biomedical scientific articles (see e.g. [35] [33] [41]). Most of the work on text mining combines statistical analysis with more or less complex levels of linguistic analysis, e.g. by exploiting syntactic structure and dependencies for relation extr...

