Results 1 - 10 of 164
TextFinder: An Automatic System To Detect And Recognize Text In Images
, 1997
Abstract - Cited by 146 (0 self)
There are many applications in which the automatic detection and recognition of text embedded in images is useful. These applications include digital libraries, multimedia systems, Information Retrieval Systems, and Geographical Information Systems. When machine-generated text is printed against clean backgrounds, it can be converted to a computer-readable form (ASCII) using current Optical Character Recognition (OCR) technology. However, text is often printed against shaded or textured backgrounds or is embedded in images. Examples include maps, advertisements, photographs, videos, and stock certificates. Current document segmentation and recognition technologies cannot handle these situations well. In this paper, a four-step system which automatically detects and extracts text in images is proposed. First, a texture segmentation scheme is used to focus attention on regions where text may occur. Second, strokes are extracted from the segmented text regions. Using reasonable heuristics...
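The first stage described above, texture segmentation to focus attention on regions where text may occur, can be sketched as a local-variance filter: text-bearing windows tend to have much higher intensity variance than smooth background. This is a minimal sketch; the window size and threshold are illustrative choices, not parameters from the paper.

```python
# Minimal sketch of texture-based candidate detection: flag windows whose
# intensity variance exceeds a threshold. win and var_threshold are
# illustrative, not values from the paper.

def candidate_text_windows(image, win=4, var_threshold=0.05):
    """Return (row, col) origins of windows whose intensity variance
    exceeds var_threshold."""
    h, w = len(image), len(image[0])
    hits = []
    for r in range(0, h - win + 1, win):
        for c in range(0, w - win + 1, win):
            pixels = [image[r + i][c + j] for i in range(win) for j in range(win)]
            mean = sum(pixels) / len(pixels)
            var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
            if var > var_threshold:
                hits.append((r, c))
    return hits

# Toy 8x8 image: left half flat background, right half alternating "strokes".
img = [[0.0] * 4 + [float((r + c) % 2) for c in range(4)] for r in range(8)]
print(candidate_text_windows(img))  # only the right-half windows are flagged
```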
Finding Text in Images
, 1997
Abstract - Cited by 87 (1 self)
There are many applications in which the automatic detection and recognition of text embedded in images is useful. These applications include multimedia systems, digital libraries, and Geographical Information Systems. When machine-generated text is printed against clean backgrounds, it can be converted to a computer-readable form (ASCII) using current Optical Character Recognition (OCR) technology. However, text is often printed against shaded or textured backgrounds or is embedded in images. Examples include maps, advertisements, photographs, videos, and stock certificates. Current OCR and other document segmentation and recognition technologies cannot handle these situations well.
Geometric layout analysis techniques for document image understanding: a review
, 1998
Abstract - Cited by 63 (0 self)
Document Image Understanding (DIU) is an interesting research area with a large variety of challenging applications. Researchers have worked for decades on this topic, as witnessed by the scientific literature. The main purpose of the present report is to describe the current status of DIU, with particular attention to two subprocesses: document skew angle estimation and page decomposition. Several algorithms proposed in the literature are concisely described and included in a novel classification scheme. Some methods proposed for the evaluation of page decomposition algorithms are described. Critical discussions are reported on the current status of the field and on the open problems. Some considerations about logical layout analysis are also reported.
Document Structure Analysis Algorithms: A Literature Survey
, 2003
Abstract - Cited by 56 (0 self)
Document structure analysis can be regarded as a syntactic analysis problem. The order and containment relations among the physical or logical components of a document page can be described by an ordered tree structure and can be modeled by a tree grammar which describes the page at the component level in terms of regions or blocks. This paper provides a detailed survey of past work on document structure analysis algorithms and summarizes the limitations of past approaches. In particular, we survey past work on document physical layout representations and algorithms, document logical structure representations and algorithms, and performance evaluation of document structure analysis algorithms. In the last section, we summarize this work and point out its limitations.
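The ordered-tree representation discussed above can be illustrated with a small sketch: each node is a physical component (page, block, line), children are kept in reading order, and a pre-order walk yields the logical sequence. The node labels are illustrative, not a scheme from the survey.

```python
# Sketch of an ordered layout tree: containment via children, order via
# the child list, reading order via pre-order traversal.

class LayoutNode:
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

    def reading_order(self):
        """Pre-order traversal: each container precedes its contents."""
        order = [self.label]
        for child in self.children:
            order.extend(child.reading_order())
        return order

tree = LayoutNode("page", [
    LayoutNode("title-block", [LayoutNode("line-1")]),
    LayoutNode("body-block", [LayoutNode("line-2"), LayoutNode("line-3")]),
])
print(tree.reading_order())
# ['page', 'title-block', 'line-1', 'body-block', 'line-2', 'line-3']
```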
A system for interpretation of line drawings
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1990
Abstract - Cited by 51 (1 self)
A system for the interpretation of images of paper-based line drawings is described. Since a typical drawing contains both text strings and graphics, an algorithm has been developed to locate and separate text strings of various font sizes, styles, and orientations. This is accomplished by applying the Hough transform to the centroids of connected components in the image. The graphics in the segmented image are processed to represent thin entities by their core-lines and thick objects by their boundaries. The core-lines and boundaries are segmented into straight line segments and curved lines. The line segments and their interconnections are analyzed to locate minimum-redundancy loops, which are adequate to generate a succinct description of the graphics. Such a description includes the location and attributes of simple polygonal shapes, circles, and interconnecting lines, and a description of the spatial relationships and occlusions among them. Hatching and filling patterns are also identified. The performance of the system is evaluated using several test images and the results are presented. The superiority of these algorithms in generating meaningful interpretations of graphics, compared to conventional data compression schemes, is clear from these results. Index Terms: Document image analysis, drawing conversion, feature extraction, graphics recognition, image understanding, knowledge-based systems, line-drawing interpretation, pattern recognition, text segmentation, vectorization.
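The core idea above, applying the Hough transform to component centroids so that collinear characters vote for a common line, can be sketched as follows. The accumulator resolution (1-degree angle bins, unit rho bins) is an illustrative choice, not the paper's parameterization.

```python
import math
from collections import Counter

# Sketch of a Hough transform over centroids: every point votes for all
# (theta, rho) lines passing through it; collinear points pile votes into
# one accumulator cell, revealing the text-line orientation.

def dominant_line(centroids, angle_step_deg=1, rho_step=1.0):
    votes = Counter()
    for (x, y) in centroids:
        for a in range(0, 180, angle_step_deg):
            theta = math.radians(a)
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(a, round(rho / rho_step))] += 1
    (angle_deg, rho_bin), _ = votes.most_common(1)[0]
    return angle_deg, rho_bin * rho_step

# Centroids of characters lying on the horizontal line y = 5.
pts = [(x, 5.0) for x in range(0, 50, 10)]
angle, rho = dominant_line(pts)
print(angle, rho)  # 90 degrees (line normal points along +y), rho = 5.0
```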
Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2001
Abstract - Cited by 38 (5 self)
this paper, we use the following five-step methodology to quantitatively compare the performance of page segmentation algorithms: 1) first, we create mutually exclusive training and test data sets with groundtruth; 2) we then select a meaningful and computable performance metric; 3) an optimization procedure is then used to search automatically for the optimal parameter values of the segmentation algorithms on the training data set; 4) the segmentation algorithms are then evaluated on the test data set; and, finally, 5) a statistical and error analysis is performed to give the statistical significance of the experimental results. In particular, instead of the ad hoc and manual approach typically used in the literature for training algorithms, we pose the automatic training of algorithms as an optimization problem and use the Simplex algorithm to search for the optimal parameter values. A paired-model statistical analysis and an error analysis are then conducted to provide confidence intervals for the experimental results of the algorithms. This methodology is applied to the evaluation of five page segmentation algorithms, of which three are representative research algorithms and the other two are well-known commercial products, on 978 images from the University of Washington III data set. It is found that the performance indices (average textline accuracy) of the Voronoi, Docstrum, and Caere segmentation algorithms are not significantly different from each other, but they are significantly better than that of ScanSoft's segmentation algorithm, which, in turn, is significantly better than that of the X-Y cut algorithm.
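Steps 1-4 of the methodology above can be sketched on a stand-in problem: split synthetic "pages" into disjoint train/test sets, define an accuracy metric, tune one algorithm parameter on the training split only, then score the held-out test split. A simple grid search stands in for the paper's Simplex optimizer, and the data is fabricated for illustration.

```python
import random

# Sketch of train/optimize/test evaluation. Each fake page is a
# (feature, label) pair; the "segmenter" is a single threshold on the
# feature, and its threshold is the parameter being tuned.

random.seed(0)
pages = [(x, int(x > 0.6)) for x in (random.random() for _ in range(200))]
train, test = pages[:100], pages[100:]           # step 1: disjoint splits

def accuracy(data, threshold):                    # step 2: the metric
    return sum((x > threshold) == bool(y) for x, y in data) / len(data)

# Step 3: search the parameter space using the training set only
# (grid search here, standing in for the Simplex algorithm).
best_t = max((t / 100 for t in range(101)), key=lambda t: accuracy(train, t))

# Step 4: report performance on unseen test data.
print(best_t, accuracy(test, best_t))
```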
Context-based Multiscale Classification of Document Images Using Wavelet Coefficient Distributions
, 2000
Abstract - Cited by 29 (1 self)
In this paper, an algorithm is developed for segmenting document images into four classes: background, photograph, text, and graph. Features used for classification are based on the distribution patterns of wavelet coefficients in high-frequency bands. Two important attributes of the algorithm are its multiscale nature (it classifies an image at different resolutions adaptively, enabling accurate classification at class boundaries as well as fast classification overall) and its use of accumulated context information for improving classification accuracy.
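The kind of high-frequency wavelet feature described above can be sketched with a one-level Haar transform: text blocks concentrate far more energy in the detail (high-frequency) subbands than smooth background does. The 4x4 toy blocks and the plain energy feature are illustrative; the paper's actual features and wavelet are not reproduced here.

```python
# Sketch of a high-frequency wavelet feature: one-level 2-D Haar
# transform, summing squared coefficients outside the low-low corner
# (i.e. the LH, HL, and HH detail subbands).

def haar_detail_energy(block):
    n = len(block)
    # Row pass: pairwise averages first, pairwise differences second.
    rows = [[(r[2*i] + r[2*i+1]) / 2 for i in range(n // 2)] +
            [(r[2*i] - r[2*i+1]) / 2 for i in range(n // 2)] for r in block]
    # Column pass on the row-transformed image.
    cols = [[(rows[2*i][c] + rows[2*i+1][c]) / 2 for c in range(n)]
            for i in range(n // 2)]
    cols += [[(rows[2*i][c] - rows[2*i+1][c]) / 2 for c in range(n)]
             for i in range(n // 2)]
    half = n // 2
    return sum(cols[i][j] ** 2
               for i in range(n) for j in range(n)
               if i >= half or j >= half)    # everything outside LL

flat = [[1.0] * 4 for _ in range(4)]                             # background
stripes = [[float(c % 2) for c in range(4)] for _ in range(4)]   # "text-like"
print(haar_detail_energy(flat), haar_detail_energy(stripes))  # 0.0 1.0
```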
Skew Detection and Text Line Position Determination in Digitized Documents
- Pattern Recognition
, 1997
Abstract - Cited by 29 (5 self)
This paper proposes a computationally efficient procedure for skew detection and text line position determination in digitized documents, which is based on the cross-correlation between the pixels of vertical lines in a document. The determination of the skew angle in documents is essential in optical character recognition systems. Due to the text skew, each horizontal text line intersects a predefined set of vertical lines at non-horizontal positions. Using only the pixels on these vertical lines, we construct a correlation matrix and evaluate the skew angle of the document with high accuracy. In addition, using the same matrix, we compute the positions of text lines in the document. The proposed method is tested on a variety of mixed-type documents and it provides good and accurate results while requiring only a short computational time. We illustrate the effectiveness of the algorithm by presenting four characteristic examples. © 1997 Pattern Recognition Society.
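The correlation idea above can be sketched with two vertical scan lines a fixed distance apart: find the vertical shift that best aligns their pixel profiles, then convert shift over distance into a skew angle. The synthetic page, the scan-line columns, and the single-pair simplification (the paper uses a full correlation matrix over many lines) are all illustrative.

```python
import math

# Sketch of skew estimation from two vertical scan lines: the vertical
# shift maximizing the cross-correlation of their profiles, divided by
# their horizontal separation, gives the tangent of the skew angle.

def skew_angle_deg(image, col_a, col_b):
    profile_a = [row[col_a] for row in image]
    profile_b = [row[col_b] for row in image]
    n = len(image)

    def corr(shift):
        return sum(profile_a[i] * profile_b[i + shift]
                   for i in range(n) if 0 <= i + shift < n)

    best_shift = max(range(-n // 2, n // 2 + 1), key=corr)
    return math.degrees(math.atan2(best_shift, col_b - col_a))

# Synthetic 20x20 page: one "text line" sloping 1 pixel per 4 columns.
page_img = [[0] * 20 for _ in range(20)]
for c in range(20):
    page_img[5 + c // 4][c] = 1
print(round(skew_angle_deg(page_img, 2, 14), 1))  # atan(1/4) ~ 14.0 degrees
```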
Efficient automatic text location method and content-based indexing and structuring of the video database
- J. Vis. Commun. Image Represent.
Abstract - Cited by 28 (0 self)
An efficient automatic text detection and location method for video documents is proposed, and its application to the content-based retrieval of video is presented and discussed. Target frames are selected at fixed time intervals from shots detected by a scene-change detection method. For each selected frame, segmentation by color clustering is performed around color peaks using a color histogram. For each color plane, text-lines are detected using heuristics, and the temporal and spatial position and the text-image of each text-line are stored in a database. Experimental results for text detection in video images and the performance of the method are reported for various video documents. A user interface for text-image based browsing is designed for direct content-based access to video documents, and other applications are discussed. © 1996 Academic Press
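The segmentation step above, clustering around color-histogram peaks, can be sketched in one dimension: quantize intensities into histogram bins, take the most populated bins as peaks, and label every pixel with its nearest peak so each resulting "color plane" can be scanned for text-lines. The grayscale toy frame and the bin width are illustrative; real frames would use a 3-D color histogram.

```python
from collections import Counter

# Sketch of color-peak segmentation: histogram the intensities, pick the
# dominant bins as peaks, assign each pixel to its nearest peak center.

def segment_by_color_peaks(pixels, num_peaks=2, bin_width=32):
    bins = Counter(value // bin_width for value in pixels)
    peaks = [b * bin_width + bin_width // 2
             for b, _ in bins.most_common(num_peaks)]
    return [min(peaks, key=lambda p: abs(p - value)) for value in pixels]

# Grayscale toy frame: dark background (~20) with bright text pixels (~200).
frame = [20, 22, 21, 200, 198, 20, 199, 23, 20, 201]
print(segment_by_color_peaks(frame))
# [16, 16, 16, 208, 208, 16, 208, 16, 16, 208]
```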