Word Spotting and Recognition with Embedded Attributes
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014
Cited by 6 (2 self)
This article addresses the problems of word spotting and word recognition on images. In word spotting, the goal is to find all instances of a query word in a dataset of images. In recognition, the goal is to recognize the content of the word image, usually aided by a dictionary or lexicon. We describe an approach in which both word images and text strings are embedded in a common vectorial subspace. This is achieved by a combination of label embedding and attribute learning, and a common subspace regression. In this subspace, images and strings that represent the same word are close together, allowing one to cast recognition and retrieval tasks as a nearest neighbor problem. Contrary to most other existing methods, our representation has a fixed length, is low dimensional, and is very fast to compute and, especially, to compare. We test our approach on four public datasets of both handwritten documents and natural images, showing results comparable to or better than the state-of-the-art on spotting and recognition tasks.
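The nearest-neighbor formulation in this abstract can be illustrated with a minimal sketch. It assumes that word images and lexicon strings have already been mapped into the common subspace by some learned embedding, which is not shown here. The function names `recognize` and `spot` and the use of cosine similarity as the comparison measure are assumptions made for illustration; the abstract does not state which measure the authors use.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recognize(image_vec, lexicon_vecs):
    """Recognition: return the lexicon word whose embedding is closest to the image embedding.

    lexicon_vecs is a hypothetical dict mapping each word to its embedded vector.
    """
    return max(lexicon_vecs, key=lambda w: cosine(image_vec, lexicon_vecs[w]))

def spot(query_vec, dataset_vecs, top_k=3):
    """Spotting: return indices of the top_k dataset embeddings most similar to the query."""
    ranked = sorted(range(len(dataset_vecs)),
                    key=lambda i: cosine(query_vec, dataset_vecs[i]),
                    reverse=True)
    return ranked[:top_k]

# Toy vectors standing in for embeddings produced by a trained model.
lexicon = {"cat": [1.0, 0.0, 0.0], "dog": [0.0, 1.0, 0.0]}
best_word = recognize([0.9, 0.1, 0.0], lexicon)
hits = spot([1.0, 0.0], [[0.0, 1.0], [1.0, 0.1], [1.0, 1.0]], top_k=2)
```

Because every embedding has the same fixed length, both tasks reduce to ranking by one vector comparison per candidate.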
A Simple and Fast Word Spotting Method
Cited by 4 (2 self)
A simple and efficient pipeline for word spotting in handwritten documents is proposed. The method allows for extremely rapid querying, while still maintaining high accuracy. The dataset images that are to be queried are preprocessed by a simple binarization operation, followed by the extraction of multiple overlapping candidate targets. Each binary target, as well as the binarized query, is resized to fit a fixed-size rectangle and represented by conventional image descriptors. Then, a cosine similarity operator, followed by maximum pooling over random groups, is used to represent each target or query as a concise 250D vector. Retrieval is performed in a fraction of a second by nearest-neighbor search within that space, followed by a simple suppression of extra overlapping candidates.
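One possible reading of the encoding step (cosine similarity followed by maximum pooling over random groups) is sketched below. The abstract does not say what the descriptor is compared against, so this sketch assumes a hypothetical set of reference descriptors called `anchors`; the names `make_groups` and `encode` and the group size are likewise illustrative assumptions, not the authors' implementation.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def make_groups(num_anchors, num_groups, group_size, seed=0):
    """Draw num_groups random index groups once; reuse them for every target and query."""
    rng = random.Random(seed)
    return [rng.sample(range(num_anchors), group_size) for _ in range(num_groups)]

def encode(descriptor, anchors, groups):
    """Compare the descriptor with every anchor, then keep the maximum per group.

    The output has one value per group, so 250 groups would give a 250D vector.
    """
    sims = [cosine(descriptor, a) for a in anchors]
    return [max(sims[i] for i in g) for g in groups]

# Tiny example with explicit groups so the result is easy to follow.
anchors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
code = encode([1.0, 0.0], anchors, [[0, 1], [1, 2]])

# Random grouping as it might be set up for a 250D code (sizes are assumptions).
groups_250 = make_groups(num_anchors=1000, num_groups=250, group_size=4)
```

The groups are drawn once with a fixed seed so that queries and targets are encoded consistently and remain comparable by nearest-neighbor search.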
Supervised mid-level features for word image representation
- In CVPR
Cited by 1 (1 self)
This paper addresses the problem of learning word image representations: given the cropped image of a word, we are interested in finding a descriptive, robust, and compact fixed-length representation. Machine learning techniques can then be supplied with these representations to produce models useful for word retrieval or recognition tasks. Although many works have focused on the machine learning aspect once a global representation has been produced, little work has been devoted to the construction of those base image representations: most works use standard coding and aggregation techniques directly on top of standard computer vision features such as SIFT or HOG. We propose to learn local mid-level features suitable for building word image representations. These features are learnt by leveraging character bounding box annotations on a small set of training images. However, contrary to other approaches that use character bounding box information, our approach does not rely on detecting the individual characters explicitly at testing time. Our local mid-level features can then be aggregated to produce a global word image signature. When pairing these features with the recent word attributes framework of [4], we obtain results comparable with or better than the state-of-the-art on matching and recognition tasks using global descriptors of only 96 dimensions.
Viral Transcript Alignment
Cited by 1 (1 self)
We present an end-to-end system for aligning transcript letters to their coordinates in a manuscript image. An intuitive GUI and an automatic line detection method enable the user to perform an exact alignment of parts of document pages. In order to bridge large regions between annotations, and augment the manual effort, the system employs an optical-flow engine for directly matching, at the pixel level, the image of a line of a historical text with a synthetic image created from the transcript’s matching line. Meanwhile, by accumulating aligned letters and performing letter spotting, the system is able to bootstrap a rapid semi-automatic transcription of the remaining text. Thus, the amount of manual work is greatly diminished and the transcript alignment task becomes practical regardless of the corpus size.
Word spotting in handwritten text using contour-based models
- In Proc. ICFHR, Sep 2014
Cited by 1 (1 self)
In this paper, we propose a method for spotting keywords in images of handwritten text. Relying on an object detection system in real images, local contour features are extracted from segmented word images in order to obtain a representative shape of a word-class. Thus, word spotting is cast following a query-by-word-class scenario where class models are generated using a random subset of the images belonging to that class. Cumbersome multi-writer conditions are tackled with a statistical model of intra-class deformations using principal component analysis (PCA). Novel word instances are detected through a combination of a Hough-style voting scheme with a non-rigid point matching algorithm. Finally, we evaluate the system’s performance for word spotting as a classification task, using a vocabulary of word models.
Keywords: word spotting; handwritten text; local contour features; word-class models
Improving OCR for an Under-Resourced Script Using Unsupervised Word-Spotting
Optical character recognition (OCR) quality, especially for under-resourced scripts like Bangla, as well as for documents printed in old typefaces, is a major concern. An efficient and effective pipeline for OCR betterment is proposed here. The method is unsupervised. It employs a baseline OCR engine as a black box plus a dataset of unlabeled document images. That engine is applied to the images, followed by a visual encoding designed to support efficient word spotting. Given a new document to be analyzed, the black-box recognition engine is first applied. Then, for each result, word spotting is carried out within the dataset. The unreliable OCR outputs of the retrieved word spotting results are then considered. The word that is the centroid of the set of OCR words, measured by edit distance, is deemed a candidate reading.
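The final step, choosing the centroid of a set of OCR words under edit distance, can be sketched as follows. This assumes that "centroid" means the member of the set with the smallest total Levenshtein distance to all the others (a medoid); the abstract does not spell out the exact definition, and the function names are illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]

def centroid_word(words):
    """Return the word with the smallest summed edit distance to all words in the list."""
    return min(words, key=lambda w: sum(edit_distance(w, other) for other in words))

# Hypothetical OCR outputs for word images retrieved by spotting the same word.
ocr_outputs = ["bangla", "bengla", "bangla", "banqla"]
candidate = centroid_word(ocr_outputs)
```

Individual OCR errors tend to differ from one retrieved image to the next, so the reading that sits closest to all the others is a plausible consensus.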
A Study of Bag-of-Visual-Words Representations for Handwritten Keyword Spotting
[The Bag-of-Visual-Words (BoVW) method] has gained popularity among the document image analysis community, specifically as a representation of handwritten words for recognition or spotting purposes. Although in the computer vision field the BoVW method has been greatly improved, most of the approaches in the document image analysis domain still rely on the basic implementation of the BoVW method, disregarding such latest refinements. In this paper we present a review of those improvements and their application to the keyword spotting task. We thoroughly evaluate their impact against a baseline system on the well-known George Washington dataset and compare the obtained results against nine state-of-the-art keyword spotting methods. In addition, we also compare both the baseline and improved systems with the methods presented at the Handwritten Keyword Spotting Competition 2014.
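For orientation, the basic BoVW representation that the abstract refers to can be sketched as a hard-assignment histogram. This is a generic textbook version and not the baseline system evaluated in the paper: the codebook is assumed to be given (in practice it is usually learned by clustering local descriptors), and the function names are illustrative.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry with the smallest squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])))

def bovw_histogram(descriptors, codebook):
    """Count how many local descriptors fall on each codeword, then L1-normalize."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)

# Toy 2D descriptors and a two-word codebook.
codebook = [[0.0, 0.0], [10.0, 10.0]]
hist = bovw_histogram([[1.0, 1.0], [9.0, 9.0], [0.0, 2.0]], codebook)
```

The histogram has one bin per codeword regardless of how many local descriptors the word image produces, which is what makes word images of different sizes directly comparable.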
Efficient Representation and Matching of Texts and Images in Scanned Book Collections
- 2014