Textual Article Clustering in Newspaper Pages

Aiello, Marco and Pegoretti, Andrea (2004) Textual Article Clustering in Newspaper Pages. UNSPECIFIED. (Unpublished)


    In the analysis of a newspaper page, an important step is clustering the various text blocks into logical units, i.e., into articles. We propose three algorithms based on text processing techniques to cluster articles in newspaper pages. Based on the complexity of the three algorithms and on experiments with actual pages from the Italian newspaper L’Adige, we select one of them as the preferred choice for solving the textual clustering problem.

    Item Type: Departmental Technical Report
    Department or Research center: Information Engineering and Computer Science
    Subjects: Q Science > QA Mathematics > QA076 Computer software > QA076.7 Programming Languages - Semantics
    Report Number: DIT-04-102
    Repository staff approval on: 23 Dec 2004
