Mining the Biomedical Literature in the Genomic Era: An Overview
- JOURNAL OF COMPUTATIONAL BIOLOGY
, 2003
Cited by 72 (2 self)
The past decade has seen tremendous growth in the amount of experimental and computational biomedical data, specifically in the areas of Genomics and Proteomics. This growth is accompanied by an accelerated increase in the number of biomedical publications discussing the findings. In the last few years there has been a lot of interest within the scientific community in literature-mining tools to help sort through this abundance of literature and find the nuggets of information most relevant and useful for specific analysis tasks. This paper
Associating Genes with Gene Ontology Codes Using a Maximum Entropy Analysis of Biomedical Literature
, 2002
Cited by 58 (3 self)
this paper but has been provided elsewhere (Ratnaparkhi 1997; Manning and Schutze 1999)
Mining Medline: Abstracts, Sentences, Or Phrases?
, 2002
Cited by 38 (1 self)
[Table fragment: Sentence pair / Sentence / Phrase; w > 0.511, 0.339 < w < 0.510, w < 0.338]
5 Discussion and Conclusion
In view of the results reported here, it is not surprising that researchers have reported interesting results for text mining in MEDLINE based on abstracts, sentences, and phrases. Tables 2 and 3 and the statistical significance summary in the preceding section indicate that each of these units has advantages and disadvantages compared to the others.
GeneWays: a system for extracting, analyzing, visualizing, and integrating molecular pathway data
- Journal of Biomedical Informatics
, 2004
Cited by 34 (1 self)
The immense growth in the volume of research literature and experimental data in the field of molecular biology calls for efficient automatic methods to capture and store information. In recent years, several groups have worked on specific problems in this area, such as automated selection of articles pertinent to molecular biology, or automated extraction of information using natural-language processing, information visualization, and generation of specialized knowledge bases for molecular biology. GeneWays is an integrated system that combines several such subtasks. It analyzes interactions between molecular substances, drawing on multiple sources of information to infer a consensus view of molecular networks. GeneWays is designed as an open platform, allowing researchers to query, review, and critique stored information.
Text Mining: Generating Hypotheses from MEDLINE
- Journal of the American Society for Information Science and Technology
Cited by 34 (2 self)
Hypothesis generation, a crucial initial step for making scientific discoveries, relies on prior knowledge, experience and intuition. Chance connections made between seemingly distinct subareas sometimes turn out to be fruitful. The goal in text mining is to assist in this process by automatically discovering a small set of interesting hypotheses from a suitable text collection.
Mining MEDLINE for Implicit Links between Dietary Substances and Diseases
- Bioinformatics
, 2004
Cited by 29 (2 self)
This research presents our open discovery algorithm, a text mining algorithm. We demonstrate that this algorithm may be used to uncover information that could form the basis of new hypotheses. In particular, we use it to discover novel uses for Curcumin Longa, a dietary substance highly regarded for its therapeutic properties in Asia. Several diseases are identified as offering novel research contexts for curcumin. We analyze select suggestions: retinal diseases, Crohn's disease and disorders related to the spinal cord. Our analysis suggests that there is strong evidence in favour of a beneficial role for curcumin in these diseases. The evidence is based on curcumin's influence on several genes such as COX-2, TNF-alpha, JNK, p38 MAPK and TGF-beta. This research suggests that our open discovery algorithm may be used to find novel uses for dietary and pharmacologic substances. More generally, open discovery may be used to uncover information that potentially sheds new light on a given topic of interest.
Automatic ontology construction from the literature
- Genome Informatics
, 2002
Cited by 15 (1 self)
Detailed classifications, controlled vocabularies and organised terminology are widely used in different areas of science and technology. Their relatively recent introduction in molecular biology has been crucial for progress in the analysis of genomics and massive proteomics experiments. Unfortunately, the construction of ontologies, including terminology, classification and entity relations, requires considerable effort, including the analysis of massive amounts of literature. We propose here a method that automatically generates classifications of gene-product functions using bibliographic information. The corresponding classification structures mirror the ones constructed by human experts. The analysis of a large structure built for yeast gene-products, and the detailed inspection of various examples, show encouraging properties. In particular, the comparison with the well-accepted GO ontology points to different situations in which the automatically derived classification can be useful for assisting human experts in the annotation of ontologies.
Literature Mining in Molecular Biology
, 2002
Cited by 14 (0 self)
Literature mining is the process of extracting and combining facts from scientific publications. In recent years, many studies have resulted in computer programs to extract various molecular biology findings using Medline abstracts or full text articles. This article describes the range of techniques that have been applied in literature mining. In doing so, it divides automated reading into four general subtasks: text categorization, named entity tagging, fact extraction and collection-wide analysis. Special attention is given to the domain particularities of molecular biology.
The Computational Analysis of Scientific Literature to Define and Recognize Gene Expression Clusters
, 2003
Cited by 14 (1 self)
A limitation of many gene expression analytic approaches is that they do not incorporate comprehensive background knowledge about the genes into the analysis. We present a computational method that leverages the peer-reviewed literature in the automatic analysis of gene expression data sets. Including the literature in the analysis of gene expression data offers an opportunity to incorporate functional information about the genes when defining expression clusters. We have created a method that associates gene expression profiles with known biological functions. Our method has two steps. First, we apply hierarchical clustering to the given gene expression data set. Second, we use text from abstracts about genes to (i) resolve hierarchical cluster boundaries to optimize the functional coherence of the clusters and (ii) recognize those clusters that are most functionally coherent. In the case where a gene has not been investigated and therefore lacks primary literature, articles about well-studied homologous genes are added as references. We apply our method to two large gene expression data sets with different properties. The first contains measurements for a subset of well-studied Saccharomyces cerevisiae genes with multiple literature references, and the second contains newly discovered genes in Drosophila melanogaster; many have no literature references at all. In both cases, we are able to rapidly define and identify the biologically relevant gene expression profiles without manual intervention. In both cases, we identified novel clusters that were not noted by the original investigators.
Medmesh summarizer: text mining for gene clusters
- in the Proceedings of the Second SIAM International Conference on Data Mining
, 2002
Cited by 9 (1 self)
Gene Expression is the process by which a gene’s coded information is translated into the proteins present and operating in the cell. Changes in gene expression are associated with many important biological phenomena, including morphogenesis and aging, cancer and disease states, and adaptive

