Lightweight Parsing of Natural Language Metadata

Autayeu, Aliaksandr and Giunchiglia, Fausto and Andrews, Pierre and Ju, Qi (2009) Lightweight Parsing of Natural Language Metadata.

    Understanding metadata written in natural language is a prerequisite for the successful automated integration of large-scale, language-rich datasets such as digital libraries. In this paper we describe an analysis of the part-of-speech structure of two different metadata datasets and show how this structure can be used to detect structural patterns that lightweight grammars can parse with an accuracy ranging from 95.3% to 99.8%. This allows a deeper understanding of metadata semantics, which is important for tasks such as translating classifications into lightweight ontologies for use in semantic matching.

    Item Type: Departmental Technical Report
    Department or Research center: Information Engineering and Computer Science
    Subjects: Q Science > QA Mathematics > QA076 Computer software
    Additional Information: In Natural Language Processing for Digital Libraries (NLP4DL) Workshop, Viareggio, Italy, June 15th 2009.
    Report Number: DISI-09-028
    Repository staff approval on: 11 Jun 2009
