Witten, I. H., Bell, T. C., Emberson H., Inglis S., Moffat, A. Textual Image Compression: Two-Stage Lossy/Lossless Encoding of Textual Images. Proc. IEEE, Vol. 82, No. 6, 1994. pp. 878--888.


This paper is cited in the following contexts:
JBIG image compression standard - Kyrki (1999)

....systems, where a large number of documents need to be stored in an efficient way. [Section 4.2: Present and future] It has been shown that JBIG performs worst with textual images. To increase performance, it is perceived that the most common symbols need to be coded into separate symbol libraries [6]. For example, single characters can be coded into a symbol library. After that we can simply compress the symbol library and the sequence separately. This approach is the basis for the new international standard currently being made by the joint JBIG committee. This new emerging standard is ....

Witten, I. H., Bell, T. C., Emberson H., Inglis S., Moffat, A. Textual Image Compression: Two-Stage Lossy/Lossless Encoding of Textual Images. Proc. IEEE, Vol. 82, No. 6, 1994. pp. 878--888.


Document Image Compression and Analysis - Kia (1997)   (3 citations)

....processing. While Howard proposes a methodology for scalable lossy compression, his approach lacks a hierarchical representation suitable for progressive transmission. An important approach to document compression, suggested by Ascher and Nagy [9] and later formalized by Witten et al. [43, 110, 111, 112], is based on the repetitive nature of text components. In a departure from pixel level to symbol level coding, marks found within a document are coded. The method is very similar to pattern matching and substitution algorithms [10, 47, 64, 69, 70, 85, 113]. It is based on the measurement of ....

....be exploited to provide a highly efficient representation. We developed a technique which extracted constituent components, clustered them, and represented them in a compressed form. While this technique was developed independently, it significantly extends the technique proposed by Witten et al. [110], and took its roots from work done by Ascher and Nagy [9]. Traditional coding, transmission, and lossy representation of images often made use of resolution reduction practices. This was ideal for textured images but document images do not fit easily into that class. Document images lose their ....

I. Witten, T. Bell, H. Emberson, S. Inglis, and A. Moffat. Textual image compression: Two-stage lossy/lossless encoding of textual images. Proceedings of the IEEE, 82:878--888, 1994.


The Indexing and Retrieval of Document Images: A Survey - Doermann (1998)   (12 citations)

....image retrieval. The general technique, first proposed by Ascher and Nagy [3], exploits redundancies by representing each pattern class by a single prototype and representing the image as a set of pointers to these prototypes. The concept of a dynamic library was formalized by Witten et al. [69], who take small regions that are believed to correspond to textual symbols, cluster them and create a prototype for each cluster. The image regions are then represented uniquely by a combination of this prototype, the prototype's location in the image, and an encoding of the error image. Kia ....

I. Witten, T. Bell, H. Emberson, S. Inglis, and A. Moffat. Textual image compression: Two-stage lossy/lossless encoding of textual images. Proceedings of the IEEE, 82:878--888, 1994.
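
The representation described in the Doermann excerpt above (a prototype per cluster, plus a location and an encoding of the error image) can be illustrated with a small sketch. This is not the method of Witten et al. as published: the per-pixel majority vote used to form the prototype, the XOR residual, and the names majority_prototype, error_image and reconstruct are all assumptions made for illustration, and every mark in a cluster is assumed to have the same size.

```python
from typing import List, Tuple

# A bitmap is a tuple of rows of 0/1 pixels. Assumption: all marks in a
# cluster have already been cropped to the same size.
Bitmap = Tuple[Tuple[int, ...], ...]


def majority_prototype(marks: List[Bitmap]) -> Bitmap:
    """Hypothetical prototype: per-pixel majority vote over the cluster."""
    rows, cols = len(marks[0]), len(marks[0][0])
    return tuple(
        tuple(int(2 * sum(m[r][c] for m in marks) > len(marks)) for c in range(cols))
        for r in range(rows)
    )


def error_image(mark: Bitmap, prototype: Bitmap) -> Bitmap:
    """XOR residual: 1 wherever the mark differs from the prototype."""
    return tuple(
        tuple(a ^ b for a, b in zip(mark_row, proto_row))
        for mark_row, proto_row in zip(mark, prototype)
    )


def reconstruct(prototype: Bitmap, residual: Bitmap) -> Bitmap:
    """Lossless case: XOR the residual back onto the prototype."""
    return error_image(prototype, residual)


clean = ((1, 1, 0), (1, 0, 0))
noisy = ((1, 1, 0), (1, 0, 1))          # one stray pixel
proto = majority_prototype([clean, noisy, clean])
residual = error_image(noisy, proto)
restored = reconstruct(proto, residual)  # equals noisy
```

Under these assumptions, keeping the residual gives back the original mark exactly, while dropping it leaves only the prototype, which corresponds to the lossy case mentioned in the excerpts.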


Structure-Preserving Document Image Compression - Omid Kia

....space is not necessarily an option. An important consideration is the preservation of structure so that symbols remain recognizable. In this paper we use a structural approach to compress textual images. The general approach, suggested in part by Ascher and Nagy [1] and later enhanced by Witten [7], first clusters text-like components and represents each cluster by a single template (presumably of the underlying symbol); the residual error is then encoded (Figure 1). Our contribution is the use of a probabilistic model of the errors. In Section 2 we briefly discuss the characteristics of ....

I. Witten, T. Bell, H. Emberson, S. Inglis, and A. Moffat. Textual image compression: Two-stage lossy/lossless encoding of textual images. Proceedings of the IEEE, 82:878--888, 1994.


Entropy-Based Pattern Matching For Document Image Compression - Qin Zhang (1996)   (2 citations)

....methods of facsimile data compression based on run length coding fail to provide good performance. On the other hand, document image compression based on pattern matching is a method particularly effective for text pages. It was originally proposed in [1] and further studied in [4], [3] and [6]. Document image compression based on pattern matching works in this way: patterns, connected blobs of ink, which are expected to be characters, are extracted from the image to be compressed. These patterns are compared to previously transmitted patterns. If a match is detected, only the position ....

Witten, I.H., Bell, T.C., Emberson, H., and S. Inglis, "Textual image compression: two-stage lossy/lossless encoding of textual images," Proceedings of the IEEE, v. 82, No. 6, pp. 878-888, 1994.
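
The matching loop described in the Zhang excerpt above (compare each extracted pattern to previously transmitted ones and reuse a match) might be sketched as follows. The excerpt is truncated before it says exactly what is sent on a match, so the token layout here is an assumption, as are the pixel-count mismatch measure, the max_mismatch threshold, and the names mismatch and encode. Patterns are assumed to be non-empty bitmaps that have already been extracted from the page.

```python
from typing import List, Tuple, Union

Bitmap = Tuple[Tuple[int, ...], ...]
Position = Tuple[int, int]
# Assumed token layout: ("new", bitmap, position) transmits a pattern,
# ("ref", library_index, position) reuses a previously transmitted one.
Token = Union[Tuple[str, Bitmap, Position], Tuple[str, int, Position]]


def mismatch(a: Bitmap, b: Bitmap) -> int:
    """Count differing pixels; bitmaps of different size never match."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return 10**9
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def encode(patterns: List[Tuple[Bitmap, Position]], max_mismatch: int = 1) -> List[Token]:
    """Replace each pattern by a library reference when a close match exists."""
    library: List[Bitmap] = []
    tokens: List[Token] = []
    for bitmap, pos in patterns:
        # Index of the closest library entry, or None while the library is empty.
        best = min(
            range(len(library)),
            key=lambda i: mismatch(bitmap, library[i]),
            default=None,
        )
        if best is not None and mismatch(bitmap, library[best]) <= max_mismatch:
            tokens.append(("ref", best, pos))
        else:
            library.append(bitmap)
            tokens.append(("new", bitmap, pos))
    return tokens


e = ((1, 1), (1, 0))
e_noisy = ((1, 1), (1, 1))   # differs from e in one pixel
t = ((0, 1), (0, 1))
tokens = encode([(e, (0, 0)), (t, (0, 3)), (e_noisy, (0, 6))])
```

In this toy run the third pattern is emitted as a reference to library entry 0 rather than as a new bitmap. A real matcher would use a more careful similarity test than a raw pixel count; the threshold here only makes the control flow visible.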


Structural Compression for Document Analysis - Omid Kia (1996)

....in that structural repetition is recognized and coded, rather than using texture coding and resolution reduction. Our approach is based on work suggested by Ascher [1] where a dynamic library was created to classify future observations and achieve high compression ratios, and later by Witten [16], where the error map was introduced to perfectly code the original image, or to achieve lossy compression by omitting it. Omission of the error map sometimes renders a document unreadable; an algorithm needs to allocate bits which are informative and contribute significantly toward readability. ....

....by the encoded representative and a difference. If the symbols are clustered accurately, the differences will typically represent noise on the symbol boundary. If this noise can be characterized, we can provide a lossy scheme which ignores these extraneous pixels. In our approach, motivated by [16], small regions that are believed to correspond to textual symbols are identified in the image. They are clustered and a template is created for each cluster. The regions are then represented uniquely by a combination of this template, its location in the original image, and an encoding of the ....

I. Witten, T. Bell, H. Emberson, S. Inglis, and A. Moffat. Textual image compression: Two-stage lossy/lossless encoding of textual images. Proceedings of the IEEE, 82:878--888, 1994.


A Pattern-Based Lossy Compression Scheme for Document Images - Qin Zhang

....pages. Document images contain patterns such as characters and lines, and a pattern may appear many times in the image. Pattern matching coding techniques exploit these macroscopic properties of document images. These techniques were originally proposed in [1] and further studied in [3], [8] and [10]. [10] and [8] discuss lossless compression based on pattern matching. Our Compression for Document Image System (CDIS) is also pattern matching based. Exploiting the knowledge of document images one step further, CDIS takes advantage of the structural layout of document images. This system effectively ....

[Article contains additional citation context not shown here]

Witten, I. H., Bell, T. C., Emberson, H., and S. Inglis, "Textual image compression: two-stage lossy/lossless encoding of textual images," Proceedings of the IEEE, v. 82, No. 6, pp. 878-888, 1994.


Randex, A Bridge Over The Internet - González, Márquez..

....some compression algorithms for textual (BW) and gray scale images. For the compression of textual images we have analyzed the performance of two different techniques: traditional algorithms of data compression such as RLE and LZW [3, 4, 5] and text compression techniques with and without OCR [6, 7]. Our conclusion was that, for document images, text compression algorithms perform clearly better than data compression methods. Therefore, a text compression technique, which does not make use of OCR, was selected for its .... [Figure 4: (a) Original image. (b) Image after skew correction.]

....main reasons: on the one hand, the compression rates reached by these algorithms do not justify the increase in computational complexity introduced by the character recognition task; besides, OCR operations modify the original fonts of the document. The basic algorithm selected is described in [7]. It was incorporated into RandeX after some improvements in its compression efficiency and speed. This method provides the possibility of lossless and lossy compression by means of processing in two stages, paving the way for progressive transmission. The lossy compression algorithm is very ....

I.H. Witten, T.C. Bell, H. Emberson, S. Inglis, and A. Moffat. Textual image compression: Two-stage lossy/lossless encoding of textual images. Proceedings of the IEEE, 82(6):878--888, June 1994.


A Codebook Generation Algorithm for Document Image Compression - Zhang, Danskin, Young (1997)   (2 citations)

....steps) The new algorithm generates a better codebook, resulting in an overall improvement in compression performance of almost 17%. [Section 1: Introduction] For scanned text, pattern matching based compression achieves the best compression ratios currently known: roughly four times that of 2D FAX coding [11]. Pattern matching based compression has been studied by several research groups [1, 6, 9, 11]. This kind of compression involves the following steps: Extract from the image of the scanned document a sequence of glyphs. Each glyph typically represents one connected blob of ink occurring somewhere in the document, represented as a positioned bitmap. Partition the set of glyphs into ....

[Article contains additional citation context not shown here]

I. H. Witten, T.C. Bell, H. Emberson, and S. Inglis, "Textual image compression: two-stage lossy/lossless encoding of textual images," Proceedings of the IEEE, v. 82, No. 6, pp. 878-888, 1994.


Document Image Compression via Pattern Matching - Zhang (1997)   (1 citation)

....text in scanned images. Document images contain patterns such as characters and lines, which may be repeated many times. Pattern matching coding techniques exploit these macroscopic properties of document images. These techniques were originally proposed in [1] and further studied in [5], [4] and [7]. Image compression based on pattern matching works in this way: patterns, connected blobs of ink, which are expected to be characters, are extracted from the image to be compressed. These patterns are compared to previously transmitted patterns. If a match is detected, only the position of the ....

....in the image. • The position of each pattern in the image. In a long document, pattern indices and positions make up the majority of the code space, so it is important to code them compactly. Previous document image compression systems do not compress pattern positions well. Witten et al.'s TIC [7], the best previous system, spends more bits on pattern positions than on pattern indices, roughly 10 bits per character. TIC codes the offset between one pattern and the next exactly. The offsets are coded using a first order predictive model based on the pattern index. Exploiting the knowledge ....

Witten, I.H., Bell, T.C., Emberson, H., and S. Inglis, "Textual image compression: two-stage lossy/lossless encoding of textual images," Proceedings of the IEEE, v. 82, No. 6, pp. 878-888, 1994.
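
The position coding mentioned in the Zhang (1997) excerpt above (exact offsets between consecutive patterns, modelled conditionally on a pattern index) can be made concrete with a rough sketch. This is one possible reading, not TIC's actual coder: the excerpt does not say which pattern's index conditions each offset or how probabilities are estimated, so the pairing of indices with offsets is left to the caller and an empirical conditional entropy stands in for the real model. The function names are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2
from typing import Dict, List, Tuple

Position = Tuple[int, int]


def to_offsets(positions: List[Position]) -> List[Position]:
    """Exact offset from each pattern to the next; the first is relative to (0, 0)."""
    prev = (0, 0)
    offsets = []
    for x, y in positions:
        offsets.append((x - prev[0], y - prev[1]))
        prev = (x, y)
    return offsets


def from_offsets(offsets: List[Position]) -> List[Position]:
    """Invert to_offsets by accumulating the offsets."""
    x = y = 0
    positions = []
    for dx, dy in offsets:
        x, y = x + dx, y + dy
        positions.append((x, y))
    return positions


def bits_per_offset(indices: List[int], offsets: List[Position]) -> float:
    """Empirical conditional entropy H(offset | pattern index), in bits."""
    by_index: Dict[int, Counter] = defaultdict(Counter)
    for idx, off in zip(indices, offsets):
        by_index[idx][off] += 1
    bits = 0.0
    for counts in by_index.values():
        n = sum(counts.values())
        bits -= sum(c * log2(c / n) for c in counts.values())
    return bits / len(offsets)


positions = [(10, 5), (18, 5), (26, 5)]
offsets = to_offsets(positions)          # [(10, 5), (8, 0), (8, 0)]
roundtrip = from_offsets(offsets)        # equals positions
```

The intuition behind conditioning on the index is that a given symbol tends to be followed by a similar horizontal step each time it occurs, so offsets grouped by index are more predictable than offsets taken as a whole.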


A Pattern-Based Lossy Compression Scheme for Document Images - Qin Zhang

....pages. Document images contain patterns such as characters and lines, which may be repeated many times in the image. Pattern matching coding techniques exploit these macroscopic properties of document images. These techniques were originally proposed in [3] and further studied in [4], [5] and [6]. [6] and [7] discuss lossless compression based on pattern matching. Our Compression for Document Image System (CDIS) is also pattern matching based. Exploiting the knowledge of document images one step further, CDIS takes advantage of the [page footer: Paper for submission to Electronic Publishing Origination, ....]

[Article contains additional citation context not shown here]

I. H. Witten, T. C. Bell, H. Emberson, and S. Inglis, 'Textual image compression: two-stage lossy/lossless encoding of textual images', Proc. IEEE, 82(6), 878--888, (1994).


Lossless Document Image Compression - Inglis (1999)   (4 citations)  Self-citation (Inglis)

No context found.

Ian H. Witten, Timothy C. Bell, Hugh Emberson, Stuart J. Inglis, and Alistair Moffat. Textual image compression: Two-stage lossy/lossless encoding of textual images. Proceedings of the IEEE, 82(6):878--888, June 1994.


Musical Image Compression - Bainbridge, Inglis (1998)   (2 citations)  Self-citation (Inglis)

No context found.

I. H. Witten, T. C. Bell, H. Emberson, S. Inglis, and A. Moffat, Textual image compression: Two-stage lossy/lossless encoding of textual images, Proceedings of the IEEE 82 (1994), no. 6, 878--888.


Compressed Domain Document Retrieval and Analysis - Kia, Doermann, Chellappa (1996)   (1 citation)

No context found.

I. Witten, T. Bell, H. Emberson, S. Inglis, and A. Moffat. Textual image compression: Two-stage lossy/lossless encoding of textual images. Proceedings of the IEEE, 82:878--888, 1994.


Better PostScript than PostScript - portable self-extracting.. - Zhang, Danskin

No context found.

I. H. Witten, T. C. Bell, H. Emberson, and S. Inglis, "Textual image compression: two-stage lossy/lossless encoding of textual images," Proc. IEEE 82(6), pp. 878--888, 1994.
