| Dragomir R. Radev. Generating Natural Language Summaries from Multiple On-Line Sources: Language Reuse and Regeneration. Unpublished PhD thesis, Columbia University, 1999. |
....a significantly more complete picture of the available information than the latest article. 1. INTRODUCTION Previous work in multidocument summarization has pointed to the importance of identifying differences and discrepancies in the information that is reported across multiple news sources [9, 12]. To our knowledge, however, this problem has not yet been systematically or thoroughly investigated. Radev and McKeown [9], for example, identify discrepancy detection as a potential problem for multidocument summarizers via anecdotal evidence, but provide no empirical evidence to indicate how ....
....in multidocument summarization has pointed to the importance of identifying differences and discrepancies in the information that is reported across multiple news sources [9, 12]. To our knowledge, however, this problem has not yet been systematically or thoroughly investigated. Radev and McKeown [9], for example, identify discrepancy detection as a potential problem for multidocument summarizers via anecdotal evidence, but provide no empirical evidence to indicate how often such differences actually represent significant discrepancies in the available information, vs. simple updates in what ....
D. R. Radev and K. R. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500, 1998.
....applied statistical techniques (frequency analysis, variance analysis, etc.) to linguistic units such as tokens, names, anaphora, etc. (e.g. [27, 19, 9, 18, 2]). Other approaches include the utility of discourse structure [14], the combination of information extraction and language generation [11, 17, 24, 21, 16], and using machine learning to find patterns in text [28, 4, 26]. Several researchers have extended various aspects of the single document approaches to look at multi-document summarization [13, 21, 3, 7, 15]. These include comparing templates filled in by extracting information using ....
.... structure [14], the combination of information extraction and language generation [11, 17, 24, 21, 16], and using machine learning to find patterns in text [28, 4, 26]. Several researchers have extended various aspects of the single document approaches to look at multi-document summarization [13, 21, 3, 7, 15]. These include comparing templates filled in by extracting information using specialized, domain specific knowledge sources from the document, and then generating natural language summaries from the templates [21], comparing named entities extracted using specialized lists between ....
[Article contains additional citation context not shown here]
Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple online sources. Computational Linguistics, 24(3), 1998.
....Abstracting is receiving more and more attention from NLP researchers, along with the IE (Information Extraction), IR (Information Retrieval), and IF (Information Filtering) techniques, recently. Many automatic abstracting systems have been proposed, for example SUMMONS [McKeown et al., 1995; Radev et al., 1998], SUMMARIST [Hovy et al., 1997; Lin, 1998], COSYMATS [Aretoulaki, 1997], SUMMAC [Sanderson, 1998], SJTUCAA [Wang et al., 1996], FDASCT [Wu et al., 1996], and so on. Tombros (1997) presented a general automatic text abstracting model which generates the abstract of the text in two steps: the source ....
Radev, Dragomir R., Kathleen R. McKeown, 1998 Generating Natural Language Summaries from Multiple On-Line Sources, Computational Linguistics, Vol.24, No.3, pp. 469-500.
....Hovy 1997), discourse structure (Marcu 1997; Marcu 1998), and user features from the query (Strzalkowski et al. 1998) to find key sentences. While most of the work to date focuses on summarization of single articles, early work is beginning to emerge on summarization across multiple documents. (Radev and McKeown 1998) use a symbolic approach to summarization, pairing information extraction systems with language generation. The result is a domain dependent system for summarization of multiple news articles on the same event, highlighting how perspective of the event has changed over time. In ongoing work at ....
Dragomir R. Radev and Kathleen R. McKeown. Generating Natural Language Summaries from Multiple On-Line Sources. Computational Linguistics, 24(3):469--500, September 1998.
....problem through the seventies and eighties (e.g. [17, 25]). The resources devoted to addressing this problem grew by several orders of magnitude with the advent of the world wide web and large scale search engines. Several innovative approaches began to be explored: linguistic approaches (e.g. [1, 2, 4, 12, 14, 15, 18]), statistical and information centric approaches (e.g. [6, 9, 16, 24]), and combinations of the two (e.g. [3, 24, 26]). The TIPSTER Phase III Program, an information retrieval initiative of the US Defense Department, funded several of these projects on summarization [27]. Almost all of this work ....
.... approaches (e.g. [6, 9, 16, 24]) and combinations of the two (e.g. [3, 24, 26]). The TIPSTER Phase III Program, an information retrieval initiative of the US Defense Department, funded several of these projects on summarization [27]. Almost all of this work (with the exception of [12, 15, 18, 23]) focused on summarization by text span extraction, with sentences as the most common type of text span. This technique creates document summaries by concatenating selected text span excerpts from the original document. This paradigm transforms the problem of summarization, which in the most ....
Radev, D., and McKeown, K. Generating natural language summaries from multiple online sources. Computational Linguistics (1998).
....in a directed graph. Although no readable summary is generated, keywords indicated how documents are similar or different. Mani and Bloedorn (1997) also relate pairs of documents to each other showing similarities and differences. In addition, work by McKeown and Radev (McKeown and Radev 1995; Radev and McKeown 1998) relies on an assumed system filling and selecting predefined templates used for the final summary. Later work by (McKeown et al. 1999) breaks documents into paragraph based units. These units are compared to each other to identify similar and dissimilar passages. A graph based one pass ....
RADEV, DRAGOMIR R., and KATHLEEN R. MCKEOWN. 1998. Generating Natural Language Summaries from Multiple On-Line Sources. Computational Linguistics, Volume 24, Number 3.
.... inference [247, 248], automatic induction of natural language interfaces for querying data bases [249, 222], information extraction tasks [216, 217, 29, 79, 80, 81, 30, 215], acquisition of verbal properties [153], text categorization [49, 50, 53, 213], and generation of natural language [176]. 2.3 Subsymbolic Machine Learning Approaches 2.3.1 Neural Networks In their relation to NLP, neural networks [94] have been used basically to address low level problems, such as OCR [204], speech recognition and synthesis [206, 121, 155, 113, 229], and PoS tagging [155, 201, 70, 199, 131]. The ....
.... properties [221, 209] General machine translation [10] 100] Spelling correction [133] 86, 88, 89] DLs ILP NNs Clust GAs LSM LogL Acquisition of verbal properties [152, 153, 29] 209] 209, 135] General machine translation [235] Spelling correction [241, 208] 116] 89] Generation [176] Table 5: References corresponding to Machine Translation and other NLP tasks 14 3 Word Sense Disambiguation: A Case Study in Supervised Machine Learning The present section is devoted to explain the comparison between four machine learning algorithms applied to Word Sense Disambiguation. This ....
D. R. Radev and K. McKeown. Generating Natural Language Summaries from Multiple On-Line Sources. Computational Linguistics, 24(3):469--500, 1998.
....that information is condensed by generalizations and elision of repeated material; and finally (iv) it is presented to the reader in the form of a new text. Most approaches to automatic abstracting concentrate on the first and second steps usually ignoring the last two. Notable exceptions being (Radev and McKeown, 1998; Paice and Jones, 1993). The third step, called condensation, can in some situations be addressed without world knowledge as recent work demonstrates (Barzilay et al. 1999). In our work, we are focusing on the overall process of automatic abstracting. Although we do not address the issue of ....
Radev, D. and McKeown, K. (1998). Generating Natural Language Summaries from Multiple OnLine Sources. Computational Linguistics, 24(3):469--500.
....They are not simply coarser views, however, as summaries must capture at least the essential details of the original source according to a user's needs (Flewelling 1999) and sometimes convey new information that is not explicit in the original data. Automated text based summarization procedures (Radev et al. 1998), common statistical approaches to database summarization, and the relational aggregation operators available in most information systems do not yet sufficiently capture the semantics associated with summarized views of data (Roddick et al. 1999) and do not, for example, take time varying phenomena ....
D. Radev and K. McKeown (1998) Generating natural language summaries from multiple on-line sources. Computational Linguistics 24(3): 469-500.
....in a directed graph. Although no readable summary is generated, keywords indicated how documents are similar or different. Mani and Bloedorn (1997) also relate pairs of documents to each other showing similarities and differences. In addition, work by McKeown and Radev (McKeown and Radev 1995; Radev and McKeown 1998) relies on an assumed system filling and selecting predefined templates used for the final summary. Later work by (McKeown et al. 1999) breaks documents into paragraph based units. These units are compared to each other to identify similar and dissimilar passages. A graph based one pass ....
Radev, Dragomir R., and Kathleen R. McKeown. 1998. Generating Natural Language Summaries from Multiple OnLine Sources. Computational Linguistics, Volume 24, Number 3.
.... to the user query [9, 31, 32, 42]. Some systems include sub document relevance assessments and convey this information to the user via techniques such as text tiling [15]. More recently, single document summarization systems provide an automated generic abstract or a query relevant summary [3, 4, 6, 8, 10, 16, 19, 20, 25, 27, 29, 30, 35 38]. Such a summary minimally provides an indication of the information content of a document so that a user can choose whether or not to read it. A more effective ideal summary will contain the content for which the user is searching. However, large scale IR and summarization have not yet been ....
....problem through the seventies and eighties (e.g. [29, 37]). The resources devoted to addressing this problem grew by several orders of magnitude with the advent of the world wide web and large scale search engines. Several innovative approaches began to be explored: linguistic approaches (e.g. [3, 4, 8, 19, 24, 25, 30]), statistical and information centric approaches (e.g. [10, 16, 20, 27, 36]), and combinations of the two (e.g. [6, 36, 38]). The TIPSTER Phase III Program, an information retrieval initiative of the US Defense Department, funded several of these projects on summarization [39]. Human quality ....
[Article contains additional citation context not shown here]
D. Radev and K. McKeown. Generating natural language summaries from multiple online sources. Computational Linguistics, 24(3):469--500, September 1998.
....In [92], three algorithms are described to use information extraction as a basis for high precision text classification. The results suggest that information extraction can support high precision text classification and, in general, using more extracted information improves performance. In [82], a summarization system is presented that uses the output of the information extraction systems developed at MUC conferences. Nevertheless, the performance of these applications depends on the end performance of the information extraction. 2.8 Summary This chapter gave a brief overview of the ....
D. Radev and K. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 1998.
....[Reimer and Hahn, 1988] and SCISOR [Rau et al. 1989] were similar, each experimenting with different aspects (underlying knowledge representation structures, number of features to be considered, etc.). The most recently reported work on generative summarization consists of the Columbia summarizer [Radev and McKeown, 1998], which uses a manually specified generative grammar of English to construct English sentences from an underlying knowledge representation that uses manually crafted rules for content selection. However, none of these systems can: 1) generate summaries that may be a single noun phrase, and not ....
Dragomir Radev and Kathy McKeown. Generating natural language summaries from multiple online sources. Computational Linguistics, 1998.
No context found.
D. Radev and K. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469-500, September 1998.
No context found.
D. Radev and K. McKeown. 1998. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500.
No context found.
D. Radev and K. McKeown. 1998. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500.
No context found.
D. Radev and K. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500, September 1998.
No context found.
D. Radev and K. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469-500, September 1998.
No context found.
Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500, September 1998.
....981 U 380 152 129 661 W 363 172 148 683 Y 284 201 205 690 Z 380 209 225 814 Table 3: Multi document evaluation. CST proposes a taxonomy of the informational relationships between documents in clusters of related documents. Some of the relationships are direct descendants of those used in SUMMONS [8], except that in CST, these relationships are domain independent. CST posits that by identifying these cross document links, one can produce superior multi document summaries. The concept of using CST for multi document summaries relates to that of using Rhetorical Structure Theory (RST) [1] ....
Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500, September 1998.
....We conclude the paper with an evaluation of our approach, a discussion of its scalability and portability, and with a glimpse into current work done at our group to extend the functionality of NewsInEssence. 1.1 Related Work Summarization of multiple documents originated with the SUMMONS system [5, 9]. In it, a series of related stories in a restricted domain were converted to a semantic representation using information extraction and then a summary was produced using natural language generation techniques. Later work on multidocument summarization includes the identification of similarities ....
Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500, September 1998.
....words obeying certain constraints, Cook kept only the words in the collocation and thus avoided a combinatorial explosion when several constraints (of collocational or other nature) needed to be combined. Another text generation system that makes use of a specific type of collocations is SUMMONS [33]. In this case, the authors have tried to capture the collocational information linking an entity (person, place, or organization) with its description (pre modifier, apposition, or relative clause) and to use it for generation of referring expressions. For example, if the system discovers that ....
Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3), September 1998.
....3. The descriptions of individual foci enable the system to find missing information and add it in. Complementary to work by Jing [Jing 1999], whose emphasis is on summary fluency, our approach focuses on ensuring summary informativeness. Other work on summarization at Columbia [Barzilay et al. 1999, Radev and McKeown 1998] focuses on multiple document summarization. In the next section, we describe a classification hierarchy of summarization techniques that situates current systems and show how our strategy constitutes a new category. We then illustrate how each of the three tasks above can be accomplished, by ....
....articles are identified as belonging to a particular domain. The articles are then summarized by inserting extracted, domain specific information into a text template, such as a company's name and the amount of its latest dividend. Current efforts in this arena, such as work by Radev and McKeown [Radev and McKeown 1998], are considerably more sophisticated, using advanced techniques to dynamically add new text not present in the template. But when no template exists for a story, what then? Since there is an infinite variety of domains, we cannot simply exhaustively construct matching templates. Text ....
Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500, September 1998.
No context found.
Dragomir R. Radev and Kathleen R. McKeown. 1998. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469-500, September.
....position [Lin and Hovy 1997], discourse structure [Marcu 1997; Marcu 1998], and user features from the query [Strzalkowski et al. 1998] to find key sentences. While most work to date focuses on summarization of single articles, early work is emerging on summarization across multiple documents. Radev and McKeown [1998] use a symbolic approach, pairing information extraction systems with language generation. The result is a domain dependent system for summarization of multiple news articles on the same event, highlighting how perspective of the event has changed over time. In ongoing work at Carnegie Mellon, ....
Dragomir R. Radev and Kathleen R. McKeown. Generating Natural Language Summaries from Multiple On-Line Sources. Computational Linguistics, 24(3):469--500, September 1998.
....Congress. 5.5 Algorithm for applying operators in sequence The previous section described the types of operators already implemented in summons. In Section 5.2 we mentioned that in order to produce a summary, a sequence of operators is applied on the input. We experimented with several algorithms [Radev and McKeown, 1998] for deciding the order in which they are applied until we finally settled on the greedy algorithm described in Algorithm 2. 59 5.6 Example This section describes how the algorithm is applied to a set of 4 templates by tracing the computational process that transforms the raw source into a final ....
....more entities need to be described output summary Figure 2 describes the skeleton of the algorithm used to incorporate descriptions of entities in the summaries generated by summons. 8.7 Applications and future work We use profile to improve lexical choice in the summary generation component [Radev and McKeown, 1998]. There are two particularly appealing cases: (1) when the extraction component has failed to extract a description, and (2) when the user model (user's interests, knowledge of the entity and personal preferences for sources of information and for either conciseness or verbosity) dictates that a ....
[Article contains additional citation context not shown here]
Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469-500, September 1998.
....actually use as part of our new information finder (NIF). Currently, the output of NIF can be used in two ways: as a stand alone information retrieval system, and more importantly, as a component of summons. summons is a knowledge based text generator which produces summaries of multiple sources [5, 3]. We have also developed a Web based system which performs the two algorithms above at the user's request. First, it clusters articles from its database into events and then highlights the portions of the articles that present new, old, and background information within the cluster. 2 ....
Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469-500, September 1998.
No context found.
Dragomir R. Radev and Kathleen R. McKeown. "Generating natural language summaries from multiple on-line sources". In Computational Linguistics, 24(3):469--500, September 1998.
....Congress. 5.5 Algorithm for applying operators in sequence The previous section described the types of operators already implemented in summons. In Section 5.2 we mentioned that in order to produce a summary, a sequence of operators is applied on the input. We experimented with several algorithms [Radev and McKeown, 1998] for deciding the order in which they are applied until we finally settled on the greedy algorithm described in Algorithm 2. 59 5.6 Example This section describes how the algorithm is applied to a set of 4 templates by tracing the computational process that transforms the raw source into a final ....
....more entities need to be described output summary Figure 2 describes the skeleton of the algorithm used to incorporate descriptions of entities in the summaries generated by summons. 8.7 Applications and future work We use profile to improve lexical choice in the summary generation component [Radev and McKeown, 1998]. There are two particularly appealing cases: (1) when the extraction component has failed to extract a description, and (2) when the user model (user's interests, knowledge of the entity and personal preferences for sources of information and for either conciseness or verbosity) dictates that a ....
[Article contains additional citation context not shown here]
Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500, September 1998.
No context found.
Dragomir R. Radev and Kathleen McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24 (3), pages 469-500, September 1998.
No context found.
Dragomir R. Radev and Kathleen R. McKeown. 1998. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500, September.
No context found.
Dragomir R. Radev. Generating Natural Language Summaries from Multiple On-Line Sources: Language Reuse and Regeneration. Unpublished PhD thesis, Columbia University, 1999.
No context found.
Dragomir Radev and Kathleen McKeown. 1998. Generating Natural Language Summaries from Multiple On-Line Sources. Computational Linguistics, pages 469--500.
No context found.
Radev, Dragomir R. 1999. "Generating Natural Language Summaries from Multiple On-Line Sources: Language Reuse and Regeneration." Ph.D. diss., Columbia University.
No context found.
D. Radev and K. McKeown. Generating natural language summaries from multiple online sources. Computational Linguistics, 1998.
No context found.
D. Radev and K. McKeown. Generating natural language summaries from multiple online sources. Computational Linguistics, 1998.
No context found.
Radev, D. R. and K. R. McKeown. "Generating natural language summaries from multiple on-line sources." Computational Linguistics 24(3): 469--500, 1998.
No context found.
Radev, D. R., McKeown, K. R.: Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3): 469--500, 1998.
No context found.
Dragomir R. Radev and Kathleen R. McKeown. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500, 1998.
No context found.
Radev, D.R., McKeown, K.R., 'Generating Natural Language Summaries from Multiple On-line Sources', Computational Linguistics, Vol. 24, No. 3, 1998.
No context found.
D.R. Radev and K.R. McKeown. 1998. Generating Natural Language Summaries from Multiple On-Line Sources. Computational Linguistics, 24(3):469--500.
No context found.
D.R. Radev and K.R. McKeown. 1998. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469--500.