ICDAR 2011 robust reading competition challenge 2: Reading text in scene images
- ICDAR
, 2011
Cited by 46 (0 self)
Recognition of text in natural scene images is becoming a prominent research area due to the widespread availability of imaging devices in low-cost consumer products like mobile phones. To evaluate the performance of recent algorithms in detecting and recognizing text from complex images, the ICDAR 2011 Robust Reading Competition was organized. Challenge 2 of the competition dealt specifically with detecting and recognizing text in natural scene images. This paper presents an overview of the approaches that the participants used, the evaluation measure, and the dataset used in Challenge 2 of the contest. We also report the performance of all participating methods for the text localization and word recognition tasks and compare their results using standard methods of area precision/recall and edit distance. Keywords: scene text detection, natural images, text recognition.
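The two evaluation measures named in this abstract can be illustrated with a minimal sketch. It assumes unit-cost Levenshtein distance for word recognition and a single pair of axis-aligned boxes given as `(x0, y0, x1, y1)` for localization. The function names and the one-to-one box pairing are illustrative assumptions; the competition's actual matching protocol is not described in the abstract and may differ.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def box_area(box):
    """Area of an (x0, y0, x1, y1) box; degenerate boxes have area 0."""
    x0, y0, x1, y1 = box
    return max(0, x1 - x0) * max(0, y1 - y0)


def intersection_area(a, b):
    """Area shared by two axis-aligned boxes."""
    return box_area((max(a[0], b[0]), max(a[1], b[1]),
                     min(a[2], b[2]), min(a[3], b[3])))


def area_precision_recall(detected, truth):
    """Area precision and recall for ONE detected box against ONE truth box.

    Simplifying assumption: boxes are already paired one-to-one.
    """
    inter = intersection_area(detected, truth)
    precision = inter / box_area(detected) if box_area(detected) else 0.0
    recall = inter / box_area(truth) if box_area(truth) else 0.0
    return precision, recall
```

For example, `edit_distance("kitten", "sitting")` is 3, and a detection covering half of an equally sized ground-truth box scores 0.5 on both precision and recall.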
Recognition and Retrieval of Mathematical Expressions
- INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION
Cited by 31 (10 self)
Document recognition and retrieval technologies complement one another, providing improved access to increasingly large document collections. While recognition and retrieval of textual information is fairly mature, with wide-spread availability of Optical Character Recognition (OCR) and text-based search engines, recognition and retrieval of graphics such as images, figures, tables, diagrams, and mathematical expressions are in comparatively early stages of research. This paper surveys the state of the art in recognition and retrieval of mathematical expressions, organized around four key problems in math retrieval (query construction, normalization, indexing, and relevance feedback), and four key problems in math recognition (detecting expressions, detecting and classifying symbols, analyzing symbol layout, and constructing a representation of meaning). Of special interest is the machine learning problem of jointly optimizing the component algorithms in a math recognition system, and developing effective indexing, retrieval and relevance feedback algorithms for math retrieval. Another important open problem is developing user interfaces that seamlessly integrate recognition and retrieval. Activity in these important research areas is increasing, in part because math notation provides an excellent domain for studying problems common to many document and graphics recognition and retrieval applications, and also because mature applications will likely provide substantial benefits for education, research, and mathematical literacy.
T.M.: Script-independent handwritten textlines segmentation using active contours
- In: Proc. 10th Int. Conf. on Document Analysis and Recognition
, 2009
Cited by 10 (3 self)
Handwritten document images contain textlines with multiple orientations, touching and overlapping characters within consecutive textlines, and small inter-line spacing, making textline segmentation a difficult task. In this paper we propose a novel, script-independent textline segmentation approach for handwritten documents, which is robust against the above-mentioned problems. We model textline extraction as a general image segmentation task. We compute the central line of parts of textlines using ridges over the smoothed image. Then we adapt the state-of-the-art active contours (snakes) over ridges, which results in textline segmentation. Unlike the “Level Set” and “Mumford-Shah model” based handwritten textline segmentation methods, our method uses a matched filter bank approach for smoothing and does not require heuristic postprocessing steps for merging or splitting segmented textlines. Experimental results prove the effectiveness of the proposed algorithm. We evaluated our algorithm on the ICDAR 2007 handwritten segmentation contest dataset and obtained an accuracy of 96.3%.
T.M.: Dewarping of document images using coupled-snakes
- In: Proceedings of Third International Workshop on Camera-Based Document Analysis and Recognition
, 2009
Cited by 8 (3 self)
Traditional OCR systems are designed for planar (dewarped) images, and their accuracy is reduced when applied to warped images. Therefore, developing new OCR techniques for warped images or developing dewarping techniques are the possible solutions for improving OCR accuracy on camera-captured documents. Among the different types of dewarping techniques, those based on curled textline information are the most popular, but they are sensitive to high degrees of curl and variable line spacing. In this paper we build a novel dewarping approach based on curled textline information, which has been extracted using a ridges-based modified active contour model (coupled snakes). Our dewarping approach is less sensitive to different directions of curl and variable line spacing. Experimental results show that the OCR error rate, from warped to dewarped documents, has been reduced from 5.15% to 1.92% on the dataset of the CBDAR 2007 document image dewarping contest. We also report the performance of our method in comparison with other state-of-the-art methods.
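The reported OCR error rates can be read either as an absolute or as a relative improvement. The short sketch below works out both from the two figures given in the abstract; the helper name `relative_reduction` is an illustrative assumption, not something from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error that was removed."""
    if before <= 0:
        raise ValueError("the 'before' error rate must be positive")
    return (before - after) / before


warped_error = 5.15    # OCR error rate (%) on warped images, from the abstract
dewarped_error = 1.92  # OCR error rate (%) after dewarping, from the abstract

absolute = warped_error - dewarped_error
relative = relative_reduction(warped_error, dewarped_error)

print(f"absolute reduction: {absolute:.2f} percentage points")  # 3.23
print(f"relative reduction: {relative:.1%}")                    # 62.7%
```

In other words, dewarping removes a little under two thirds of the OCR errors measured on the warped images.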
Segmentation of curled textlines using active contours
- In 8th IAPR Workshop on Document Analysis Systems
, 2008
Cited by 7 (6 self)
Segmentation of curled textlines from warped document images is one of the major issues in document image dewarping. Most of the curled textline segmentation algorithms present in the literature today are sensitive to the degree of curl, the direction of curl, and the spacing between adjacent lines. We present a new algorithm for curled textline segmentation which is robust to the above-mentioned problems at the expense of high execution time. We demonstrate this insensitivity in a performance evaluation section. Our approach is based on the state-of-the-art image segmentation technique, the Active Contour Model (Snake), with the novel idea of several baby snakes and their convergence in a vertical direction only. An experiment on the publicly available CBDAR 2007 document image dewarping contest dataset shows that our textline segmentation algorithm achieves an accuracy of 97.96%.
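The idea of a snake whose points converge in the vertical direction only can be illustrated with a toy greedy update. This is a sketch under strong assumptions and not the authors' formulation: it uses a single snake with one control point per column, an `energy[y][x]` grid where lower values are more text-like, and a hypothetical smoothness weight `alpha` that pulls each point toward the mean row of its neighbours.

```python
def evolve_vertical_snake(energy, start_row, alpha=0.5, max_iters=50):
    """Greedy snake update in which each point may only move up or down.

    energy    -- 2D list, energy[y][x]; lower means a better position
    start_row -- initial row for every control point (one point per column)
    alpha     -- assumed weight of the smoothness (internal) term
    """
    height, width = len(energy), len(energy[0])
    ys = [start_row] * width
    for _ in range(max_iters):
        moved = False
        for x in range(width):
            neighbours = [ys[i] for i in (x - 1, x + 1) if 0 <= i < width]
            target = sum(neighbours) / len(neighbours) if neighbours else ys[x]
            best_y, best_cost = ys[x], None
            # The current row is tried first, so ties leave the point in place.
            for y in (ys[x], ys[x] - 1, ys[x] + 1):
                if not 0 <= y < height:
                    continue
                cost = energy[y][x] + alpha * (y - target) ** 2
                if best_cost is None or cost < best_cost:
                    best_y, best_cost = y, cost
            if best_y != ys[x]:
                ys[x] = best_y  # the column index x never changes
                moved = True
        if not moved:
            break
    return ys


# Toy energy map with a minimum along row 4 in every column.
toy_energy = [[abs(y - 4)] * 5 for y in range(7)]
print(evolve_vertical_snake(toy_energy, start_row=1))  # [4, 4, 4, 4, 4]
```

Restricting movement to the vertical axis keeps each control point in its own column, so points cannot bunch up or slide along the textline.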
T.M.: Coupled snakelet model for curled textline segmentation of camera-captured document images
- In: Proc. 10th Int. Conf. on Document Analysis and Recognition
, 2009
Cited by 6 (3 self)
Detection of curled textlines is important for dewarping of hand-held camera-captured document images. Baselines and the lines following the top of the x-height of characters (x-lines) are then estimated for dewarping. Existing curled textline segmentation approaches are sensitive to outlier points and perspective distortions. Furthermore, these approaches use regression over the top and bottom points of a segmented textline to estimate its x-line and baseline separately, which may result in inaccurate estimation. Here we propose a novel curled textline segmentation approach based on active contours (snakes) in which we perform segmentation by estimating the pairs of x-line and baseline, solving both problems together. Starting from a connected component, we jointly trace a pair of x-line and baseline using coupled snakes and external energies of neighboring top-bottom points. We grow the neighborhood region iteratively during tracing, which results in robustness to perspective distortions, and maintain a natural property of similar distance within the x-line and baseline pair, which results in robustness to outlier points. We achieved 90.76% one-to-one match-score recognition accuracy for curled textline segmentation on the CBDAR 2007 Document Image Dewarping Contest dataset, with good estimation of pairs of x-line and baseline.
Ridges based curled textline region detection from grayscale camera-captured document images
- in Proc. 13th Int. Conf. on Computer Analysis of Images and Patterns, Muenster
, 2009
Cited by 5 (4 self)
As compared to scanners, cameras offer fast, flexible and non-contact document imaging, but with distortions like uneven shading and warped shape. Therefore, camera-captured document images need preprocessing steps like binarization and textline detection for dewarping so that traditional document image processing steps can be applied to them. Previous approaches to binarization and curled textline detection are sensitive to distortions and lose some crucial image information during each step, which badly affects dewarping and further processing. Here we introduce a novel algorithm for curled textline region detection directly from a grayscale camera-captured document image, in which a matched filter bank approach is used for enhancing textline structure and then ridge detection is applied for finding the central line of curled textlines. The resulting ridges can potentially be used for binarization, dewarping or designing new techniques for camera-captured document image processing. Our approach is robust against bad shading and high degrees of curl. We have achieved around 91% detection accuracy on the dataset of the CBDAR 2007 document image dewarping contest.
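The two-stage idea in this abstract, enhancing textline structure and then finding ridges, can be sketched on a toy image. This is a rough illustration under stated assumptions, not the paper's method: a horizontal box filter stands in for the matched filter bank, ridge points are taken to be strict local maxima along each column, and pixel values are assumed to be ink density (higher means more ink). Both function names are hypothetical.

```python
def smooth_rows(image, half_width=2):
    """Horizontal box filter: a crude stand-in for the matched filter bank."""
    width = len(image[0])
    out = []
    for row in image:
        smoothed = []
        for x in range(width):
            lo, hi = max(0, x - half_width), min(width, x + half_width + 1)
            smoothed.append(sum(row[lo:hi]) / (hi - lo))
        out.append(smoothed)
    return out


def vertical_ridge_points(image, threshold=0.0):
    """Return (row, col) positions that are strict local maxima in their column."""
    height, width = len(image), len(image[0])
    points = []
    for x in range(width):
        for y in range(1, height - 1):
            value = image[y][x]
            if (value > threshold
                    and value > image[y - 1][x]
                    and value > image[y + 1][x]):
                points.append((y, x))
    return points


# One line of "ink" in row 1 with a word gap at column 2.
toy_image = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
print(vertical_ridge_points(toy_image))                  # gap at column 2 is missed
print(vertical_ridge_points(smooth_rows(toy_image, 1)))  # ridge spans all columns
```

Smoothing along the text direction bridges the gap between words, so the ridge follows the whole textline rather than individual words.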
Background Variability Modeling for Statistical Layout Analysis
Cited by 4 (1 self)
Geometric layout analysis plays an important role in document image understanding. Many algorithms known in the literature work well on standard document images, achieving high text line segmentation accuracy on the UW-III dataset. These algorithms rely on certain assumptions about document layouts, and fail when their underlying assumptions are not met. Also, they do not provide confidence scores for their output. These two problems limit the usefulness of general-purpose layout analysis methods in large-scale applications. In this contribution, we propose a statistically motivated, model-based, trainable layout analysis system that allows assumption-free adaptation to different layout types and produces likelihood estimates of the correctness of the computed page segmentation. The performance of our approach is tested on a subset of the Google 1000 books dataset, where it achieved a text line segmentation accuracy of 98.4% on layouts where other general-purpose algorithms failed to produce a correct segmentation.
Table detection in heterogeneous documents
- In Proceedings of the 9th IAPR International Workshop on Document Analysis Systems
, 2010
Cited by 4 (0 self)
Detecting tables in document images is important since not only do tables contain important information, but also most layout analysis methods fail in the presence of tables in the document image. Existing approaches for table detection mainly focus on detecting tables in single columns of text and do not work reliably on documents with varying layouts. This paper presents a practical algorithm for table detection that works with high accuracy on documents with varying layouts (company reports, newspaper articles, magazine pages, ...). An open source implementation of the algorithm is provided as part of the Tesseract OCR engine. Evaluation of the algorithm on document images from the publicly available UNLV dataset shows competitive performance in comparison to the table detection module of a commercial OCR system.
Decapod: a flexible, low cost digitization solution for small and medium archives
- Proc. CBDAR 2011
Cited by 3 (2 self)
Scholarly content needs to be online, and for much mass-produced content, that migration has already happened. Unfortunately, the online presence of scholarly content is much more sporadic for long-tail material such as small journals, original source materials in the humanities and social sciences, non-journal periodicals, and more. A large barrier to this content being available is the cost and complexity of setting up a digitization project for small and scattered collections, coupled with a lack of revenue opportunities to recoup those costs. Collections with limited audiences and hence limited revenue opportunities are nonetheless often of considerable scholarly importance within their domains. The expense and difficulty of digitization present a significant obstacle to making such paper archives available online. To address this problem, the Decapod project aims at providing a solution that is primarily suitable for small to medium paper archives with material that is rare or unique and is of sufficient interest that it warrants being made more widely available. This paper gives an overview of the project and presents its current status. Keywords: document capture, digitization, scanning, low-cost book scanning.