TextTiling: Segmenting text into multi-paragraph subtopic passages (1997)
| Venue: | Computational Linguistics |
| Citations: | 275 - 1 self |
BibTeX
@ARTICLE{Hearst97texttiling:segmenting,
author = {Marti A. Hearst},
title = {TextTiling: Segmenting text into multi-paragraph subtopic passages},
journal = {Computational Linguistics},
year = {1997},
pages = {33--64}
}
Abstract
TextTiling is a technique for subdividing texts into multi-paragraph units that represent passages, or subtopics. The discourse cues for identifying major subtopic shifts are patterns of lexical co-occurrence and distribution. The algorithm is fully implemented and is shown to produce segmentation that corresponds well to human judgments of the subtopic boundaries of 12 texts. Multi-paragraph subtopic segmentation should be useful for many text analysis tasks, including information retrieval and summarization.
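The abstract does not spell out the algorithm's details, so the following is only a minimal sketch of how patterns of lexical co-occurrence might be used to flag a subtopic shift. It compares the vocabulary of adjacent blocks of text with cosine similarity and marks a boundary where the similarity drops into a valley. The function names, the parameters `w` (tokens per sequence) and `k` (sequences per block), the toy tokenizer, and the cutoff of mean minus half a standard deviation are all assumptions made for illustration; the sketch also reports boundaries as token offsets rather than snapping them to paragraph breaks.

```python
import math
import re
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def gap_scores(tokens, w, k):
    """Similarity at each gap between the k token-sequences on either side."""
    seqs = [tokens[i:i + w] for i in range(0, len(tokens), w)]
    scores = []
    for gap in range(1, len(seqs)):
        left = Counter(t for s in seqs[max(0, gap - k):gap] for t in s)
        right = Counter(t for s in seqs[gap:gap + k] for t in s)
        scores.append(cosine(left, right))
    return scores


def climb(scores, start, step):
    """Walk away from `start` while similarity keeps rising; return the peak."""
    peak = scores[start]
    j = start + step
    while 0 <= j < len(scores) and scores[j] >= peak:
        peak = scores[j]
        j += step
    return peak


def depth_scores(scores):
    """Depth of each valley: how far similarity falls below both neighbouring peaks."""
    depths = []
    for i, s in enumerate(scores):
        left, right = climb(scores, i, -1), climb(scores, i, 1)
        # Only a true valley (lower than a peak on both sides) gets a depth.
        depths.append((left - s) + (right - s) if left > s and right > s else 0.0)
    return depths


def boundaries(text, w=4, k=2):
    """Return token offsets where a subtopic boundary is hypothesised."""
    tokens = re.findall(r"[a-z]+", text.lower())  # toy tokenizer (assumption)
    scores = gap_scores(tokens, w, k)
    if not scores:
        return []
    depths = depth_scores(scores)
    mean = sum(depths) / len(depths)
    std = math.sqrt(sum((d - mean) ** 2 for d in depths) / len(depths))
    cutoff = mean - std / 2  # illustrative threshold (assumption)
    return [(i + 1) * w for i, d in enumerate(depths) if d > 0 and d > cutoff]


if __name__ == "__main__":
    sample = "cat dog " * 8 + "star moon " * 8
    print(boundaries(sample))
```

On the toy input, the vocabulary changes completely after the sixteenth token, so the only valley in the similarity curve falls at that gap. Real text would call for larger values of `w` and `k`, stopword removal, and some smoothing of the similarity curve before valleys are picked.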







