Layout and language: An efficient algorithm for detecting text blocks based on spatial and linguistic evidence (2001) [1 citation, 0 self]
by Matthew Hurst
In Document Recognition and Retrieval VIII
ftp://ftp.cogsci.ed.ac.uk/pub/matth/spie01.ps
Abstract:
The ability to accurately detect the areas of a plain text document that consist of contiguous text is an important preprocessing step for many applications. This paper introduces a novel method that uses both spatial and linguistic knowledge to provide an accurate initial analysis of the document. This initial analysis may then be extended to provide a complete analysis of the text areas in the document.
Citations
| 12 | Using white space for automated document structuring – Rus, Summers - 1994 |
| 10 | Automatic Discovery of Logical Document Structure – Summers - 1998 |
| 8 | A paper-to-HTML table converting system – Kieninger, Dengel - 1998 |

