| J. Callan and M. Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97-130, 2001. |
....summaries, we resort to document sampling. A good quality content summary of a collection can be derived from a small, representative document sample from the collection [3]. Earlier research has shown that we can extract such a document sample with a relatively small number of query probes [3, 4, 12]. An approximate content summary can then be built from the documents that best match each query probe at the collection in question. Interestingly, the effectiveness of the best database selection algorithms does not suffer significantly from using approximate content summaries extracted in this ....
....probe at the collection in question. Interestingly, the effectiveness of the best database selection algorithms does not suffer significantly from using approximate content summaries extracted in this way, rather than the complete content summaries for which the algorithms were originally designed [6, 4]. To make the generation of content summaries in SDARTS completely automatic we adopted the method we described in [12], which uses a small number of topically focused query probes to produce highly accurate approximate content summaries. Our probing method relies on short query probes generated ....
Jamie Callan and Margaret Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97--130, 2001.
....web databases. The average number of queries sent to each database was 182, and no documents needed to be retrieved from the databases. Furthermore, the number of words per query ranged between just one and four words. Further details of our algorithm and evaluation are described in [8]. See [2, 3, 10, 6, 7] [Figure 2: Distribution of documents in the .... Axis: Specificity (0 to 1). Categories: Arts, Computers, Health, Science, Sports. Databases: CNN Sports Illustrated, Johns Hopkins AIDS Service, Tom's Hardware Guide, Office of Scientific and Technical Information, Duke University Rare Books.]
Jamie Callan and Margaret Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97--130, 2001.
....way to retrieve their documents is via querying. Second, applications often require only a small fraction of a database's contents, so retrieving relevant documents via querying is an attractive choice from an efficiency viewpoint, even for crawlable databases. Various query-based methods (e.g. [4, 1]) have been proposed in the past for retrieving and extracting the information stored in databases via querying. These algorithms share the general approach of starting with a small set of queries, retrieving some documents from the database, extracting some information from them, and potentially ....
....task that typically relies on statistical summaries of the database contents. Unfortunately, web accessible text databases do not generally export summaries of their contents. In the past, query based algorithms have been proposed to automatically build such summaries for web accessible databases [4, 8]. The goal is to construct an augmented dictionary of all words that appear in the database, and their frequency. One of the algorithms described in [4] automatically discovers the content of a text database by first querying the database with some seed words, and then extracting new words from ....
[Article contains additional citation context not shown here]
J. Callan and M. Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97--130, 2001.
[Figure 2: Distribution of documents in the top level categories for five searchable web databases. Axis: Specificity. Categories: Arts, Computers, Health, Science, Sports. Databases shown include Johns Hopkins AIDS Service, Tom's Hardware Guide, Office of Scientific and Technical Information, Duke University Rare Books.] .... in [8]. See [2, 3, 10, 6, 7] for other related work relevant to database classification. As we discussed in the introduction, our technique can also be applied to the classification of any database that offers a search interface for its contents, no matter if its contents are hidden or not. 3.2 Crawling based ....
Jamie Callan and Margaret Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97--130, 2001.
No context found.
J. Callan and M. Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97-130, 2001.
No context found.
J.P. Callan and M.E. Connell. Query-based sampling of text databases. ACM TOIS, 19(2):97--130, 2001.
No context found.
Callan, J., Connell, M.: Query-based sampling of text databases. ACM Transactions on Information Systems (2001) 97-130
No context found.
Callan, J., Connell, M.: Query-based sampling of text databases. ACM Transactions on Information Systems (2001) 97-130
No context found.
J. Callan and M. Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97-130, 2001.
No context found.
Callan, J., and Connell, M., "Query-based sampling of text databases." ACM Transactions on Information Systems, 19(2), pp. 97-130. 2001.
No context found.
Callan, J., Connell, M.: Query-based sampling of text databases. ACM Transactions on Information Systems (2001) 97-130
No context found.
J. Callan and M. Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97-130, 2001.
No context found.
J.P. Callan and M.E. Connell. Query-based sampling of text databases. ACM TOIS, 19(2):97--130, 2001.
No context found.
J. Callan and M. Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97--130, 2001.
No context found.
Jamie Callan and Margaret Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97--130, 2001.
No context found.
Callan, J. and Connell, M. 2001. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97-130.
No context found.
J. P. Callan and M. E. Connell. Query-based sampling of text databases. Information Systems, 19(2):97--130, 2001.
No context found.
Callan, J. & Connell, M. (2001), Query-based sampling of text databases, in `ACM Transactions on Information Systems', Vol. 19, pp. 97--130.
No context found.
J. P. Callan and M. Connell. Query-based sampling of text databases. ACM TOIS, 19(2), 2001.
No context found.
J. Callan and M. Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97--130, 2001.
No context found.
J. P. Callan and M. E. Connell. Query-based sampling of text databases. Information Systems, 19(2):97--130, 2001.
No context found.
J. P. Callan and M. Connell. Query-based sampling of text databases. ACM TOIS, 19(2), 2001.
No context found.
J. Callan and M. Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97--130, 2001.
No context found.
J. Callan and M. Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97--130, 2001.
No context found.
J. Callan and M. Connell. Query-based sampling of text databases. ACM Transactions on Information Systems, 19(2):97--130, 2001.
CiteSeer.IST - Copyright Penn State and NEC