Hong, T., "Degraded Text Recognition Using Visual and Linguistic Context." Ph.D. thesis, Computer Science Department, SUNY Buffalo, 1995.
....[Figure 1: CLIR: Dependence of retrieval effectiveness on cumulative probability threshold, title queries.] 4 OCR BASED RETRIEVAL Previous approaches to retrieval of OCR degraded text have focused primarily on correcting OCR errors [7][15] or on fuzzy matching techniques that are less sensitive than exact string matching to OCR errors [1][5]. This section demonstrates the generality of the query time replacement techniques developed above, using them to combine TF and DF evidence for a novel technique which attempts to replace ....
Hong, T., "Degraded Text Recognition Using Visual and Linguistic Context." Ph.D. thesis, Computer Science Department, SUNY Buffalo, 1995.
....a character has been extracted, with the decision string divided accordingly. Then characters obtained from other words are shifted on top of the partial word images for further matching and segmentation. The process continues until no prototypes are available to match the remaining partial images [9].
Table 3: Results of stop word recognition.
Set    #pages  #stop words found  #accepted recognition  #recognized words per page  #distinct chars per page  #pages w/o accepted recognition
B      200     14377              2104                   10.5                        7.3                       19
J      99      19699              2827                   28.5                        10.3                      5
UW     99      14141              1475                   14.9                        8.3                       12
total  398     48217              6406                   ....
T. Hong, Degraded Text Recognition Using Visual and Linguistic Context, Doctoral Dissertation, Dept. of Computer Science, SUNY at Buffalo, 1995.
....reject criterion so that one knows when a word should not be recognized as one of those in the lexicon. We will detail how this is done through the use of a decision forest classifier [5, 6]. Our initial focus on a small set of common words differentiates our method from that proposed by Hong [8], who describes two uses of equivalence of word images: (1) character segmentation by matching equivalent partial words; and (2) OCR error correction by majority vote within word image clusters. In each use there is no direct recognition of word images before character segmentation and ....
....a character has been extracted, with the decision string divided accordingly. Then characters obtained from other words are shifted on top of the partial word images for further matching and segmentation. The process continues until no prototypes are available to match the remaining partial images [8]. 5 Using Character Prototypes in Text Recognition The character prototypes extracted from the stop words can be used in a variety of ways. They can be used to train or adapt a classifier on the fly, or matched to a font database to estimate the dominant font in the page. If sufficient ....
T. Hong, Degraded Text Recognition Using Visual and Linguistic Context, Doctoral Dissertation, Dept. of Computer Science, SUNY at Buffalo, 1995.
.... misspelled strings of a text has been considered by many authors (e.g. [Bla60, RE71, Ull77, AFW83, SHC83, Sri85, TIAY90, Kuk92, ZD95, DHH 97]). While some more recent work tries to use the sentence or document context for correcting errors and resolving ambiguities (e.g. [Hul92, KEW91, Hon95]), most contributions suggest methods for correcting isolated words of a text using statistical or lexical information. Since the correction power of statistical methods is very limited, recent work has concentrated on lexical correction techniques. If an electronic dictionary is available that ....
Tao Hong. Degraded Text Recognition Using Visual and Linguistic Context. PhD thesis, CEDAR, State University of New York at Buffalo, 1995.
....of character segmentation points and character recognition results [1, 2, 4, 3, 12, 13, 14]. In previous studies, each entry in a lattice is usually limited to a simple representation at the character level. Our recent work demonstrated that visual context can be useful for text recognition [8, 9, 10, 11]. We designed different algorithms to use visual inter word relations for character segmentation and postprocessing. In this paper, we extend our previous work and propose a unified approach for text recognition by exploiting visual inter word context. Under the approach, different stages of text ....
....generated as word recognition result for the word image. We also define $I(s^1_x; s^1_y; e^1_x; e^1_y) \approx I(s^2_x; s^2_y; e^2_x; e^2_y)$ if the two images can match each other very well. Details of visual similarity measurement and inter word relation calculation can be found in our previous papers [8, 9, 10, 11]. 3 Lattice Based Unification Given two images $I_1 = I(s^1_x; s^1_y; e^1_x; e^1_y)$ and $I_2 = I(s^2_x; s^2_y; e^2_x; e^2_y)$, if $I_1 \approx I_2$, the lattice $L(s^1_x; s^1_y; e^1_x; e^1_y)$ for $I_1$ can be upgraded to a new lattice, $L(s^1_x; s^1_y; e^1_x; e^1_y)$ ....
[Article contains additional citation context not shown here]
Tao Hong. Degraded Text Recognition Using Visual and Linguistic Context. PhD thesis, Computer Science Department of SUNY at Buffalo, 1995.
....information of the well trained Japanese classifier. A Unicode based OCR for Far East Languages is being developed based on the JOCR system. In the system, we also developed methods for visual context analysis, such as image based character image clustering and inter character similarity analysis [12, 9]. The results of visual context analysis can be used for OCR error correction [10]. The system is implemented on a SPARC Workstation under SunOS. Given a TIFF image which is a text block or a document page, the system will execute its modules and save the details of analysis and recognition into a ....
T. Hong. Degraded Text Recognition Using Visual and Linguistic Context. PhD thesis, Computer Science Department of SUNY at Buffalo, 1995.
No context found.
Hong, T., "Degraded Text Recognition Using Visual and Linguistic Context." Ph.D. thesis, Computer Science Department, SUNY Buffalo, 1995.