by G. Attardi, S. Di Marco, D. Salvi, F. Sebastiani, Istituto Di Elaborazione
Download: http://faure.iei.pi.cnr.it/~fabrizio/Publications/IIIS98/IIIS98.ps
Abstract:
Assistance in retrieving documents on the World Wide Web is provided either by search engines, through keyword-based queries, or by catalogues, which organise documents into hierarchical collections. Maintaining catalogues manually is becoming increasingly difficult due to the sheer amount of material, so it will be necessary to resort to techniques for automatic classification of documents. Classification is traditionally performed by extracting information for indexing a document from the document itself. The paper describes the technique of categorisation by context, which exploits the context perceivable from the structure of HTML documents to extract useful information for classifying the documents they refer to. We present the results of experiments with a preliminary implementation of the technique.
Citations
1632  The anatomy of a large-scale hypertextual web search engine – Brin, Page (1998)
881  Term-Weighting Approaches in Automatic Text Retrieval – Salton, Buckley (1988)
668  WordNet: a lexical database for English – Miller (1995)
329  A vector space model for automatic indexing – Salton, Wong, et al. (1975)
87  Hypursuit: A hierarchical network search engine that exploits content-link hypertext clustering – Weiss (1996)
86  Feature selection, perceptron learning, and a usability case study for text categorization – Ng, Goh, et al. (1997)
76  Automatic analysis, theme generation, and summarization of machine-readable texts – Salton, Allen, et al. (1994)
63  Automatic indexing and content-based retrieval of captioned images – Srihari (1995)
37  Automatically organizing bookmarks per contents – Maarek, Shaul (1996)
30  The order of things: Activity-centred information access – Chalmers, Rodden, et al. (1998)
28  A bookmarking service for organizing and sharing URLs – Keller, Wolfe, et al. (1997)
24  Optimizing convenient online access to bibliographic databases – Cleverdon (1984)
23  Supporting cooperative and personal surfing with a desktop assistant – Marais, Bharat (1997)
11  On the measurement of inter-linker consistency and retrieval effectiveness in hypertext databases – Ellis, Furner-Hines, et al. (1994)
6  Part-of-Speech Guessing Rules: Learning and Evaluation – Mikheev (1998)
5  Contextual models of clinical publications for enhancing retrieval from full-text databases – Purcell, Shortliffe (1995)
5  A comparison of classifiers and document representations for the routing problem – Schütze, Hull, et al. (1995)
4  A Probabilistic Approach for Document Indexing – Fuhr, Buckley (1991)