|
248
|
RoadRunner: Towards Automatic Data Extraction from Large Web Sites
– Valter Crescenzi, Giansalvatore Mecca, Paolo Merialdo
- 2001
|
|
164
|
Extracting structured data from web pages
– Arvind Arasu
- 2003
|
|
151
|
A Brief Survey of Web Data Extraction Tools
– Alberto H. F. Laender, Berthier A. Ribeiro-Neto, Altigran S. da Silva, Juliana S. Teixeira
- 2002
|
|
460
|
Wrapper Induction for Information Extraction
– Nicholas Kushmerick
- 1997
|
|
41
|
HTML Page Analysis Based On Visual Cues
– Y Yang, H Zhang
- 2001
|
|
126
|
NoDoSE - A Tool for Semi-Automatically Extracting Structured and Semistructured Data from Text Documents
– Brad Adelberg
- 1998
|
|
191
|
Wrapper Induction: Efficiency and Expressiveness
– Nicholas Kushmerick
- 2000
|
|
61
|
A Fully Automated Object Extraction System for the World Wide Web
– David Buttler, Ling Liu, Calton Pu
- 2001
|
|
130
|
XWRAP: An XML-Enabled Wrapper Construction System for Web Information Sources
– Ling Liu, Calton Pu, Wei Han
- 2000
|
|
56
|
Improving pseudo-relevance feedback in web information retrieval using web page segmentation
– Shipeng Yu, Deng Cai, Ji-Rong Wen, Wei-Ying Ma
- 2003
|
|
277
|
Bottom-Up Relational Learning of Pattern Matching Rules for Information Extraction
– Mary Elaine Califf, Raymond J. Mooney
- 2003
|
|
101
|
Conceptual-Model-Based Data Extraction from Multiple-Record Web Pages
– D. W. Embley, D. M. Campbell, Y. S. Jiang, S. W. Liddle, D. W. Lonsdale, Y.-K. Ng, R. D. Smith
- 1999
|
|
20
|
Record Location and Reconfiguration in Unstructured Multiple-Record Web Documents
– D. W. Embley, L. Xu
|
|
66
|
Overview of the Okapi projects
– S Robertson
- 1997
|
|
73
|
Engineering a multi-purpose test collection for Web retrieval experiments
– Peter Bailey, Nick Craswell, David Hawking
- 2001
|
|
69
|
Accurately and Reliably Extracting Data from the Web: A Machine Learning Approach
– Craig A. Knoblock, Kristina Lerman, Steven Minton, Ion Muslea
- 1999
|
|
1548
|
Conditional random fields: Probabilistic models for segmenting and labeling sequence data
– John Lafferty
- 2001
|
|
56
|
Exploiting Dictionaries in Named Entity Extraction: Combining Semi-Markov Extraction Processes and Data Integration Methods
– William W. Cohen
- 2004
|
|
165
|
Extracting Semistructured Information from the Web
– J. Hammer, H. Garcia-Molina, J. Cho, R. Aranha, A. Crespo
- 1997
|