G. Doddington, "The Topic Detection and Tracking Phase 2 (TDT2) Evaluation Plan," Available at http://www.nist.gov/speech/tdt 98.htm, 1998.
....task is similar to IR routing and filtering tasks, the definition of event leads to at least one significant difference. An event is defined as an occurrence at a given place and time covered by the news media. Stories are on topic if they cover the event itself or any outcome (strictly defined in [2]) of the event. By this definition, all stories prior to the occurrence are off topic, which, contrary to the IR tasks mentioned, theoretically provides for unlimited off-topic training material (assuming retrospective corpora are available). We expected to be able to take advantage of these ....
....of closed-class words was also employed. In order to improve word statistics, particularly for the beginning of the test set, we prepended a retrospective corpus (the TDT Pilot Data [3]) of approximately 16 thousand stories. In accordance with the evaluation specification for this project [2], no information is shared across topics.
1.3. Feature Selection
The choice, as well as the number, of features (words) used to represent a topic has a direct effect on the trade-off between miss and false alarm probabilities. We investigated four methods of producing lists of features sorted by their ....
[Article contains additional citation context not shown here]