Ian H. Witten, Craig G. Nevill-Manning, and Sally Jo Cunningham. Building a digital library for computer science research: Technical issues. In Australasian Computer Science Conference Proceedings, pages 534--542, Melbourne, Australia, 1996.
....will register their papers directly, making all of these techniques less necessary (thousands of papers have already been registered with CiteSeer). 3.2 Full Text Indexing CiteSeer includes full text indexing of the entire content of articles, similar to the New Zealand Digital Library [31, 32]. PostScript and PDF documents are converted to text using pstotext (http://www.research.digital.com/SRC/virtualpaper/pstotext.html) from the Digital Virtual Paper project (http://www.research.digital.com/SRC/virtualpaper/home.html). The full text indexing performed by CiteSeer is similar to the ....
I.H. Witten, C.G. Nevill-Manning, and S.J. Cunningham. Building a digital library for computer science research: technical issues. In Proceedings Australasian Computer Science Conference, Melbourne, Australia, January 1996.
....a hierarchy from such a sequence is accomplished in about two minutes. Word-based phrases. In a second experiment, SEQUITUR was invoked on a large body of technical reports, part of the 1.9 Gbyte corpus comprising the Computer Science Technical Report collection of the New Zealand Digital Library (Witten et al. 1996). The reports were presented as a sequence of words, and all words were mapped, somewhat arbitrarily, to lower case before processing. A 28 Mb sample was chosen, which included 500 technical reports from nine sites. Table 1 shows the size of the sample and the resulting grammar (which took 12 ....
Witten, I.H., Nevill-Manning, C.G., and Cunningham, S.J. (1996) "Building a digital library for computer science research: technical issues," Proc. Australasian Computer Science Conference, Melbourne, Australia, 534-542.
....corresponds to a rule in the grammar. The rules that SEQUITUR creates will be illustrated by a grammar constructed from a large body of 7000 computer science technical reports, part of the 1.9 Gb corpus comprising the Computer Science Technical Report collection of the New Zealand Digital Library (Witten et al. 1996). Pertinent details of the (a) In the beginning God created the heaven and the earth (b) Au commencement, Dieu créa les cieux et la terre (c) Im Anfang schuf Gott die Himmel ....
Witten, I.H., Nevill-Manning, C.G., and Cunningham, S.J. (1996) "Building a digital library for computer science research: technical issues," Proc. Australasian Computer Science Conference, Melbourne, Australia, 534-542.
No context found.
Ian H. Witten, Craig G. Nevill-Manning, and Sally Jo Cunningham. Building a digital library for computer science research: Technical issues. In Australasian Computer Science Conference Proceedings, pages 534--542, Melbourne, Australia, 1996.