Abstract:
We consider the task of learning hidden Markov models (HMMs) when only partially (sparsely) labeled observation sequences are available for training. This setting is motivated by the information extraction problem, where only a few tokens in the training documents are given a semantic tag while most tokens are unlabeled. We first describe the partially hidden Markov model together with an algorithm for learning HMMs from partially labeled data. We then present an active learning algorithm that selects "difficult" unlabeled tokens and asks the user to label them. We study empirically by how much active learning reduces the required data labeling effort, or increases the quality of the learned model achievable with a given amount of user effort.
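The selection step can be illustrated with a minimal sketch. The abstract does not say how "difficult" is measured, so the code assumes one common choice: the margin between the two most probable states under the current model's posterior. The function names `posteriors` and `most_difficult`, and the parameters `pi`, `A`, `B`, are hypothetical and not taken from the paper. Labeled tokens are handled by clamping, that is, by giving zero mass to states that contradict a known label.

```python
def posteriors(obs, labels, pi, A, B):
    """Per-token state posteriors from forward-backward with clamped labels.

    obs:    list of observation indices
    labels: list with a state index for labeled tokens, None otherwise
    pi, A, B: initial, transition, and emission probabilities (nested lists)
    Assumes the given labels have nonzero probability under the model.
    """
    n, k = len(obs), len(pi)

    def emit(t, s):
        # Clamp: a state that contradicts a known label gets zero mass.
        if labels[t] is not None and labels[t] != s:
            return 0.0
        return B[s][obs[t]]

    # Forward pass, normalized at every step for numerical stability.
    alpha = [[0.0] * k for _ in range(n)]
    for t in range(n):
        for s in range(k):
            if t == 0:
                prior = pi[s]
            else:
                prior = sum(alpha[t - 1][r] * A[r][s] for r in range(k))
            alpha[t][s] = prior * emit(t, s)
        z = sum(alpha[t])
        alpha[t] = [a / z for a in alpha[t]]

    # Backward pass, also normalized at every step.
    beta = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for s in range(k):
            beta[t][s] = sum(
                A[s][r] * emit(t + 1, r) * beta[t + 1][r] for r in range(k)
            )
        z = sum(beta[t])
        beta[t] = [b / z for b in beta[t]]

    # Combine and renormalize to get P(state at t | observations, labels).
    gamma = []
    for t in range(n):
        g = [alpha[t][s] * beta[t][s] for s in range(k)]
        z = sum(g)
        gamma.append([x / z for x in g])
    return gamma


def most_difficult(gamma, labels, budget):
    """Indices of the unlabeled tokens with the smallest posterior margin.

    The margin criterion is an assumption made for this sketch.
    """
    margins = []
    for t, g in enumerate(gamma):
        if labels[t] is None:
            top = sorted(g, reverse=True)
            margins.append((top[0] - top[1], t))
    return [t for _, t in sorted(margins)[:budget]]
```

In an active learning loop, the tokens returned by `most_difficult` would be shown to the user, their labels written into `labels`, and the model re-estimated before the next round of queries.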