Information Extraction from HTML: Application of a General Machine Learning Approach (1998)  (73 citations)
Dayne Freitag
AAAI/IAAI



View or download:
cmu.edu/~dayne/ps/webie.ps.Z
cmu.edu/afs/cs/project/th...webie.ps.gz
cmu.edu/people/dayne/ps/webie.ps.Z

From:  cmu.edu/~dayne/cv
From:  cmu.edu/people/dayne/cv

Abstract: Because the World Wide Web consists primarily of text, information extraction is central to any effort that would use the Web as a resource for knowledge discovery. We show how information extraction can be cast as a standard machine learning problem, and argue for the suitability of relational learning in solving it. The implementation of a general-purpose relational learner for information extraction, SRV, is described. In contrast with earlier learning systems for information extraction,...

Cited by:
Hierarchical Wrapper Induction for Semistructured.. - Ion Muslea Steven
Hierarchies in HTML Documents: Linking Text to Concepts - Burget (2004)
Visual HTML Document Modeling for Information Extraction - Burget (2005)

Similar documents (at the sentence level):
5.7%:   Toward General-Purpose Learning for Information Extraction - Freitag (1998)

Active bibliography (related documents):
0.9:   Multistrategy Learning for Information Extraction - Freitag (1998)
0.3:   Book Recommending Using Text Categorization with Extracted.. - Mooney (1998)
0.0:   Learning to Construct Knowledge Bases from the World.. - Craven, DiPasquo.. (1999)

Similar documents based on text:
0.2:   Differential Calculi on Quantum Vector Spaces with Hecke-type.. - Baez
0.2:   Machine Learning for Information Extraction in Informal Domains - Freitag (1998)
0.1:   Information Extraction with HMM Structures Learned by.. - Freitag, McCallum (2000)

Related documents from co-citation:
38:   Wrapper induction for information extraction - Kushmerick, Weld et al. - 1997
34:   Learning information extraction rules for semi-structured and free text - Soderland - 1999
26:   Relational learning of pattern-match rules for information extraction - Califf, Mooney - 1998

BibTeX entry:

Freitag, D. Information extraction from HTML: Application of a general machine learning approach. Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98) (1998), 517--523. http://citeseer.ist.psu.edu/freitag98information.html

@inproceedings{ freitag98information,
    author = "Dayne Freitag",
    title = "Information Extraction from {HTML}: Application of a General Machine Learning Approach",
    booktitle = "{AAAI}/{IAAI}",
    pages = "517--523",
    year = "1998",
    url = "citeseer.ist.psu.edu/freitag98information.html"
}

Citations (may not include all citations):
492   Learning logical definitions from relations - Quinlan - 1990
228   Wrapper Induction for Information Extraction - Kushmerick - 1997
180   The CN2 induction algorithm - Clark, Niblett - 1989
105   Estimating probabilities: A crucial task in machine learning - Cestnik - 1990
70   Relational learning of pattern-match rules for information e.. - Califf, Mooney - 1997
18   Learning Text Analysis Rules for Domain-specific Natural Lan.. - Soderland - 1996
3   Empirical methods in information extraction - Papers, ACL- et al. - 1997
2   Learning to extract text-based information from the world wi.. - University, CS et al. - 1997



Documents on the same site (http://www.cs.cmu.edu/~dayne/cv.html):
Greedy Attribute Selection - Caruana, Freitag (1994)
How Useful Is Relevance? - Caruana, Freitag (1994)
Using Grammatical Inference to Improve Precision in Information.. - Freitag (1997)
