Results 1 - 6 of 6
Recognition and Retrieval of Mathematical Expressions
- INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION
Cited by 31 (10 self)
Document recognition and retrieval technologies complement one another, providing improved access to increasingly large document collections. While recognition and retrieval of textual information is fairly mature, with wide-spread availability of Optical Character Recognition (OCR) and text-based search engines, recognition and retrieval of graphics such as images, figures, tables, diagrams, and mathematical expressions are in comparatively early stages of research. This paper surveys the state of the art in recognition and retrieval of mathematical expressions, organized around four key problems in math retrieval (query construction, normalization, indexing, and relevance feedback), and four key problems in math recognition (detecting expressions, detecting and classifying symbols, analyzing symbol layout, and constructing a representation of meaning). Of special interest is the machine learning problem of jointly optimizing the component algorithms in a math recognition system, and developing effective indexing, retrieval and relevance feedback algorithms for math retrieval. Another important open problem is developing user interfaces that seamlessly integrate recognition and retrieval. Activity in these important research areas is increasing, in part because math notation provides an excellent domain for studying problems common to many document and graphics recognition and retrieval applications, and also because mature applications will likely provide substantial benefits for education, research, and mathematical literacy.
A new approach for recognizing handwritten mathematics using relational grammars and fuzzy sets
Baseline Extraction-Driven Parsing of Handwritten Mathematical Expressions
Cited by 2 (2 self)
We generalize recursive baseline extraction algorithms for symbol layout analysis in math expressions so that handwritten strokes may be provided as input. Specifically, baseline extraction is used for lexical analysis in a modified LL(1) parser, returning a set of candidate symbols when the leftmost or next symbol along the current baseline (from left to right) is requested by the parser. Candidate symbols are used to produce a forest of parse trees, and the highest-ranked parse is returned. Hidden Markov Models (HMMs) are used for symbol classification, and horizontal adjacency between symbols is determined using two probabilistic quadratic classifiers, one for ascenders (e.g. ‘A’) and another for centered and descender symbols (e.g. ‘y’ and ‘x’). The system placed second in the CROHME 2011 handwritten math recognition competition.
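The horizontal-adjacency decision mentioned in this abstract can be illustrated with a one-dimensional Gaussian classifier, whose log-posterior is quadratic in the feature. This is a minimal sketch and not the authors' implementation: the feature (a normalized vertical offset between two symbols), the class labels, and the training values are all hypothetical, and a single classifier is shown where the abstract describes two, one per symbol class.

```python
import math

def fit(samples):
    """Estimate (mean, variance, prior) per class from 1-D training samples.

    `samples` maps a class label to a list of feature values. The feature
    is assumed to be a normalized vertical offset (hypothetical).
    """
    n_total = sum(len(xs) for xs in samples.values())
    params = {}
    for label, xs in samples.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        var = max(var, 1e-6)  # floor to avoid division by zero
        params[label] = (mean, var, len(xs) / n_total)
    return params

def log_density(x, mean, var):
    """Log of the Gaussian density; quadratic in x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def posterior(x, params):
    """Class posteriors for feature value x via Bayes' rule."""
    logs = {
        label: math.log(prior) + log_density(x, mean, var)
        for label, (mean, var, prior) in params.items()
    }
    top = max(logs.values())  # subtract the max for numerical stability
    weights = {label: math.exp(v - top) for label, v in logs.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

# Hypothetical training data: small offsets for adjacent symbol pairs,
# larger offsets for pairs that are not horizontally adjacent.
training = {
    "adjacent": [-0.1, 0.0, 0.1, 0.05],
    "not_adjacent": [0.5, 0.6, 0.7, 0.8],
}
model = fit(training)
probabilities = posterior(0.02, model)
```

Because each class has its own variance, the decision boundary is quadratic rather than linear, which is the sense in which such a classifier is called quadratic.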
IJDAR DOI 10.1007/s10032-012-0184-x
We present a new approach for parsing two-dimensional input using relational grammars and fuzzy sets. A fast, incremental parsing algorithm is developed, motivated by the two-dimensional structure of written mathematics. The approach reports all identifiable parses of the input. The parses are represented as a fuzzy set, in which the membership grade of a parse measures the similarity between it and the handwritten input. To identify and report parses efficiently, we adapt and apply existing techniques such as rectangular partitions and shared parse forests, and introduce new ideas such as relational classes and interchangeability. We also present a correction mechanism that allows users to navigate parse results and choose the correct interpretation in case of recognition errors or ambiguity. Such corrections are incorporated into subsequent incremental recognition results. Finally, we include two empirical evaluations of our recognizer. One uses a novel user-oriented correction count metric, while the other replicates the CROHME 2011 math recognition contest. Both evaluations demonstrate the effectiveness of our proposed approach.
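The idea of ranking parses by a fuzzy membership grade can be sketched as follows. This is an illustrative assumption rather than the method of the paper: the abstract does not say how grades are combined, so the sketch uses the minimum (Gödel t-norm) over the grades of a parse's components, and the parse names and grade values are hypothetical.

```python
def parse_grade(component_grades):
    """Membership grade of a parse in [0, 1].

    Assumption: the grade is the minimum of its component grades
    (symbol and relation grades), i.e. the Goedel t-norm.
    """
    return min(component_grades)

def rank_parses(parses):
    """Return (parse, grade) pairs sorted from best to worst grade."""
    graded = {name: parse_grade(grades) for name, grades in parses.items()}
    return sorted(graded.items(), key=lambda item: item[1], reverse=True)

# Hypothetical candidate parses of the same strokes, each with grades for
# its symbols and for the spatial relation between them.
candidates = {
    "x^2": [0.9, 0.8, 0.7],  # 'x', '2', superscript relation
    "x2": [0.9, 0.8, 0.4],   # 'x', '2', horizontal relation
}
ranking = rank_parses(candidates)
```

Reporting the whole ranked list, not only the top entry, is what lets a user pick an alternative interpretation when the best-graded parse is wrong.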
Utilizing Consistency Context for Handwritten Mathematical Expression Recognition
- 2009 10th International Conference on Document Analysis and Recognition
This paper presents a rule-based approach that uses several types of contextual information to improve the accuracy of handwritten mathematical expression (ME) recognition. Mining context from a corpus is not practical for ME recognition because of the complexity arising from the 2-D nature of MEs. For practicality, we identify typical consistencies that are often found in customary usage and general patterns in MEs. We aim to increase these consistencies in recognition results by correcting symbol labels and/or spatial relationships among symbols. Such consistencies are easily encoded as condition-action pairs. Preliminary interpretations generated by the base recognizer are reordered as the rules increase or decrease their scores. Although our approach is not complete, it easily implements even global context among distant symbols. Experimental results show that our approach is useful for increasing the accuracy of handwritten ME recognition.
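The reordering of preliminary interpretations by condition-action rules can be sketched as below. This is a minimal illustration under stated assumptions, not the rule set of the paper: the `Interpretation` and `Rule` types, the bracket-balance condition, and the score adjustments are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Interpretation:
    symbols: List[str]  # recognized symbol labels, left to right
    score: float        # score assigned by the base recognizer

@dataclass
class Rule:
    condition: Callable[[Interpretation], bool]  # when the rule fires
    delta: float                                 # action: score adjustment

def brackets_balanced(interp: Interpretation) -> bool:
    """Hypothetical consistency check: every '(' has a matching ')'."""
    depth = 0
    for symbol in interp.symbols:
        if symbol == "(":
            depth += 1
        elif symbol == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

def rerank(interps: List[Interpretation], rules: List[Rule]) -> List[Interpretation]:
    """Apply every matching rule to each interpretation, then sort by score."""
    rescored = []
    for interp in interps:
        adjustment = sum(r.delta for r in rules if r.condition(interp))
        rescored.append(Interpretation(interp.symbols, interp.score + adjustment))
    return sorted(rescored, key=lambda i: i.score, reverse=True)

rules = [
    Rule(brackets_balanced, +0.2),
    Rule(lambda i: not brackets_balanced(i), -0.2),
]
candidates = [
    Interpretation(["(", "x", "+", "1", "C"], 0.6),  # ')' misread as 'C'
    Interpretation(["(", "x", "+", "1", ")"], 0.5),
]
best = rerank(candidates, rules)[0]
```

In this example the consistent reading overtakes the one the base recognizer preferred, because the rule looks at symbols that are far apart in the expression.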
Symbol Knowledge Extraction from a Simple Graphical Language
In this paper, we study the problem of symbol knowledge extraction. We assume that some unknown symbols are used to compose a handwritten message, and from a dataset of handwritten samples, we would like to recover the symbol set used in the corresponding language. We apply our approach to online handwriting and select the domain of numerical expressions, mixing digits and operators, to test the ability to retrieve the corresponding symbol classes. The proposed method is based on three steps: a quantization of the stroke space, a description of the layout of strokes with a relational graph, and the extraction of an optimal lexicon using a minimum description length algorithm. At the symbol level, a recall rate of 74% is obtained on the test dataset produced by 100 writers.
Keywords: online handwriting; knowledge extraction; minimum description length; spatial relation
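The minimum description length criterion for choosing a lexicon can be illustrated with a toy example. This is a simplified sketch and not the method of the paper: it treats handwriting as one-dimensional sequences of quantized stroke labels instead of a relational graph, uses greedy longest-match segmentation, and the stroke labels `h` and `v` and the cost model are assumptions made for illustration.

```python
import math
from collections import Counter

def segment(seq, lexicon):
    """Greedy longest-match segmentation of a stroke sequence into lexicon entries."""
    max_len = max(len(entry) for entry in lexicon)
    out, i = [], 0
    while i < len(seq):
        for n in range(min(max_len, len(seq) - i), 0, -1):
            candidate = tuple(seq[i:i + n])
            if candidate in lexicon:
                out.append(candidate)
                i += n
                break
        else:
            raise ValueError(f"no lexicon entry matches position {i}")
    return out

def description_length(data, lexicon, alphabet_size):
    """Total cost in bits: cost of the lexicon plus cost of the data given it."""
    bits_per_stroke = math.log2(alphabet_size)
    lexicon_cost = sum(len(entry) for entry in lexicon) * bits_per_stroke
    counts = Counter(e for seq in data for e in segment(seq, lexicon))
    total = sum(counts.values())
    # Each occurrence of an entry costs -log2 of its empirical frequency.
    data_cost = -sum(c * math.log2(c / total) for c in counts.values())
    return lexicon_cost + data_cost

# Hypothetical data: "1+1" written as vertical, horizontal, vertical, vertical.
data = [("v", "h", "v", "v")] * 10
strokes_only = {("h",), ("v",)}
with_plus = strokes_only | {("h", "v")}  # adds a two-stroke '+' symbol

cost_strokes = description_length(data, strokes_only, alphabet_size=2)
cost_plus = description_length(data, with_plus, alphabet_size=2)
```

On this toy data the lexicon containing the two-stroke entry gives the shorter total description, so an MDL search would prefer it even though the lexicon itself costs more bits to state.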