Results 1 - 2 of 2
Recognition and Retrieval of Mathematical Expressions
 INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION
Cited by 31 (10 self)
Document recognition and retrieval technologies complement one another, providing improved access to increasingly large document collections. While recognition and retrieval of textual information are fairly mature, with widespread availability of Optical Character Recognition (OCR) and text-based search engines, recognition and retrieval of graphics such as images, figures, tables, diagrams, and mathematical expressions are in comparatively early stages of research. This paper surveys the state of the art in recognition and retrieval of mathematical expressions, organized around four key problems in math retrieval (query construction, normalization, indexing, and relevance feedback), and four key problems in math recognition (detecting expressions, detecting and classifying symbols, analyzing symbol layout, and constructing a representation of meaning). Of special interest are the machine learning problem of jointly optimizing the component algorithms in a math recognition system, and the development of effective indexing, retrieval, and relevance feedback algorithms for math retrieval. Another important open problem is developing user interfaces that seamlessly integrate recognition and retrieval. Activity in these important research areas is increasing, in part because math notation provides an excellent domain for studying problems common to many document and graphics recognition and retrieval applications, and also because mature applications will likely provide substantial benefits for education, research, and mathematical literacy.
Math spotting: Retrieving math in technical documents using handwritten query images
 In Proc. Int’l Conf. Document Analysis and Recognition, 2011
Cited by 3 (2 self)
A method for locating mathematical expressions in document images without the use of optical character recognition is presented. An index of document regions is built from the recursive XY trees computed for each page in the corpus. Queries are provided as images of handwritten expressions, for which an XY tree is computed. During retrieval, the query is looked up in the document region index using features of its XY tree, producing a set of candidate regions. Candidate regions are ranked by the similarity of vertical pixel projections in their upper and lower halves with those of the query image, as computed using Dynamic Time Warping of the image columns. In an experiment, ten participants each wrote twenty queries from a 200-page corpus. On average, the top-10 retrieval candidates included a candidate covering 43.3% of the test query image (σ = 14.0), with the correct page being returned between 30.0% and 85.0% of the time across participants (µ = 63.2%, σ = 14.9%). When testing using the original query images, 90.0% of the queries were retrieved correctly.
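The ranking step described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes binary images stored as lists of rows (1 = ink), an absolute-difference column cost, and no length normalization; the function names `column_projection`, `dtw_distance`, `half_profile_distance`, and `rank_candidates` are hypothetical.

```python
def column_projection(rows):
    """Vertical projection: number of ink pixels in each image column."""
    return [sum(col) for col in zip(*rows)]


def dtw_distance(a, b):
    """Classic Dynamic Time Warping distance between two sequences.

    Assumption: the local cost is the absolute difference of two values.
    """
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cumulative cost table
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            step = min(prev[j], curr[j - 1], prev[j - 1])
            curr.append(abs(x - y) + step)
        prev = curr
    return prev[-1]


def half_profile_distance(query, candidate):
    """Sum of DTW distances between upper-half and lower-half projections."""
    total = 0.0
    q_mid, c_mid = len(query) // 2, len(candidate) // 2
    halves = ((query[:q_mid], candidate[:c_mid]),
              (query[q_mid:], candidate[c_mid:]))
    for q_half, c_half in halves:
        total += dtw_distance(column_projection(q_half),
                              column_projection(c_half))
    return total


def rank_candidates(query, candidates):
    """Return candidate indices ordered from most to least similar."""
    return sorted(range(len(candidates)),
                  key=lambda i: half_profile_distance(query, candidates[i]))


# Toy example: a 4x3 query, a horizontally stretched copy, and a blank region.
query = [[1, 0, 1],
         [0, 1, 1],
         [1, 1, 0],
         [0, 0, 1]]
stretched = [[p for p in row for _ in range(2)] for row in query]
blank = [[0, 0, 0] for _ in range(4)]
ranking = rank_candidates(query, [blank, stretched])
```

In this toy example the stretched copy ranks ahead of the blank region, because warping absorbs the repeated columns. That tolerance to horizontal stretching is why DTW suits comparing handwritten queries against typeset regions of different widths.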