| A. Moffat, N.B. Sharman, and J. Zobel. Static compression for dynamic texts. In J.A. Storer and M. Cohn, editors, Proc. IEEE Data Compression Conference, pages 126--135, Snowbird, Utah, March 1994. IEEE Computer Society Press, Los Alamitos, California. |
....and although well within the capacity of a modern workstation, would be difficult to support on a PC. Different techniques are needed for collections that are larger or more diverse than TREC, and mechanisms for trading a small loss in compression for reduced model space are part of our current work [10]. The index vocabulary, which also becomes large, can be stored on disk with little impact, since compared to the compression model it is accessed only lightly during query processing. To avoid the problems of adaptive modelling we used a semi-static model [1]. Hence, the compression process ....
....second to actually encode the text. A two-pass approach is perfectly acceptable for static collections such as TREC. In other work we have shown that the same methods can be extended with minimal degradation to dynamic collections and other situations in which it is not possible to make two passes [10]. For text data with a word-based model, even the most common symbol, the word "the", constitutes less than 5% of the symbols to be encoded, and so Huffman coding is near optimal [5]. Huffman coding also avoids the throughput problems of the only practical alternative, arithmetic coding. In our ....
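
The two-pass, word-based Huffman scheme described in the excerpt above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`build_code_lengths`, `assign_codes`, `compress`, `decompress`) are hypothetical, tokens are assumed to be whitespace-separated words, the original spacing is not preserved, and the output is a string of '0'/'1' characters rather than packed bits. The first pass counts word frequencies to build a fixed (semi-static) model; the second pass encodes the text with that model.

```python
import heapq
from collections import Counter
from itertools import count


def build_code_lengths(freqs):
    """Return {symbol: code length} using Huffman's algorithm."""
    if len(freqs) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freqs)): 1}
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), [sym]) for sym, f in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (f1 + f2, next(tiebreak), syms1 + syms2))
    return lengths


def assign_codes(lengths):
    """Assign canonical prefix codes (as bit strings) from code lengths."""
    codes = {}
    code = 0
    prev_len = 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len
        codes[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


def compress(text):
    """Two passes: count word frequencies, then encode with the fixed model."""
    words = text.split()
    freqs = Counter(words)                      # pass 1: build the model
    codes = assign_codes(build_code_lengths(freqs))
    bits = "".join(codes[w] for w in words)     # pass 2: encode the text
    return codes, bits


def decompress(codes, bits):
    """Decode a bit string using the same model that encoded it."""
    decode = {c: w for w, c in codes.items()}
    words, current = [], ""
    for bit in bits:
        current += bit
        if current in decode:   # prefix-free codes make this unambiguous
            words.append(decode[current])
            current = ""
    return " ".join(words)
```

For example, `codes, bits = compress("the cat sat on the mat the end")` followed by `decompress(codes, bits)` recovers the input text, and the most frequent word receives one of the shortest codes. Because the model is fixed before encoding starts, the decoder needs the same code table, which is why this style of model requires the extra first pass over the collection.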