Thick 2D Relations for Document Understanding

Aiello, Marco and Smeulders, Arnold M.W. (2002) Thick 2D Relations for Document Understanding. UNSPECIFIED. (Unpublished)


    We use a propositional language of qualitative rectangle relations to detect the reading order from document images. To this end, we define the notion of a document encoding rule and analyze possible formalisms for expressing such rules, such as LaTeX and SGML. Document encoding rules expressed in the propositional language of rectangles are used to build a reading order detector for document images. To make the system robust when applied to real-life document images, we introduce the notion of a thick boundary interpretation for a qualitative relation. The framework is tested on a collection of heterogeneous document images, showing recall rates of up to 89%.
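
    The thick boundary idea can be illustrated with a minimal sketch. It is not the report's implementation: it assumes that a rectangle relation is a pair of Allen interval relations (one per axis, as the keyword "bidimensional Allen relations" suggests) and that "thick" means two coordinates count as equal when they differ by at most a tolerance t. The names Rect, compare, allen, rect_relation and the parameter t are hypothetical, and intervals are assumed to be longer than t.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """Axis-aligned bounding box of a layout block (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float


def compare(a: float, b: float, t: float) -> int:
    """Three-way comparison; values within tolerance t count as equal."""
    if abs(a - b) <= t:
        return 0
    return -1 if a < b else 1


def allen(a1: float, a2: float, b1: float, b2: float, t: float = 0.0) -> str:
    """Allen relation of interval [a1, a2] to [b1, b2].

    With t == 0 this is the usual (thin) interpretation; with t > 0 the
    endpoints are compared with a thick boundary (an assumption of this sketch).
    """
    s_s = compare(a1, b1, t)  # start of a vs start of b
    e_e = compare(a2, b2, t)  # end of a vs end of b
    e_s = compare(a2, b1, t)  # end of a vs start of b
    s_e = compare(a1, b2, t)  # start of a vs end of b
    if e_s < 0:
        return "precedes"
    if e_s == 0:
        return "meets"
    if s_e > 0:
        return "preceded_by"
    if s_e == 0:
        return "met_by"
    # From here on the two intervals share more than a boundary.
    if s_s == 0 and e_e == 0:
        return "equals"
    if s_s == 0:
        return "starts" if e_e < 0 else "started_by"
    if e_e == 0:
        return "finishes" if s_s > 0 else "finished_by"
    if s_s < 0 and e_e < 0:
        return "overlaps"
    if s_s > 0 and e_e > 0:
        return "overlapped_by"
    if s_s > 0 and e_e < 0:
        return "during"
    return "contains"


def rect_relation(a: Rect, b: Rect, t: float = 0.0) -> Tuple[str, str]:
    """Pair of Allen relations (x-axis, y-axis) between two rectangles."""
    return (allen(a.x1, a.x2, b.x1, b.x2, t),
            allen(a.y1, a.y2, b.y1, b.y2, t))


# A title block sitting just above a body block, with a few pixels of
# segmentation noise in the detected bounding boxes (made-up coordinates).
title = Rect(10, 10, 200, 30)
body = Rect(12, 32, 198, 300)

thin = rect_relation(title, body, t=0.0)
thick = rect_relation(title, body, t=3.0)
```

    In this made-up example the thin interpretation gives ("contains", "precedes"), while a tolerance of 3 pixels gives ("equals", "meets"). A rule that expects a title to be aligned with, and directly above, its body would fire only under the thick reading, which is the kind of brittleness the abstract refers to.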

    Item Type: Departmental Technical Report
    Department or Research center: Information Engineering and Computer Science
    Subjects: Q Science > QA Mathematics > QA075 Electronic computers. Computer science
    Uncontrolled Keywords: document image analysis, document understanding, spatial reasoning, bidimensional Allen relations, constraint satisfaction: applications
    Additional Information: Submitted to: Information Sciences, Elsevier
    Report Number: DIT-02-063
    Repository staff approval on: 21 Jan 2003
