SLU publication database (SLUpub)

Conference paper, 2011, peer reviewed, open access

Data Mining Medieval Documents by Word Spotting

Wahlberg, Fredrik; Dahllöf, Mats; Mårtensson, Lasse; Brun, Anders

Abstract

This paper presents novel results for word spotting based on dynamic time warping applied to medieval manuscripts in Latin and Old Swedish. A user marks a target word, and the method automatically finds similar word forms in the document by matching them against the target. The method automatically identifies pages and lines. We show that our method improves accuracy compared to earlier proposals for this kind of handwriting. An advantage of the new method is that it performs matching within a text line without presupposing that the difficult problem of segmenting the text line into individual words has been solved. We evaluate our word spotting implementation on two medieval manuscripts representing two script types. We also show that it can be useful by helping a user find words in a manuscript, and we present graphs of word statistics as a function of page number.
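
To illustrate the idea of matching within a text line without first segmenting it into words, here is a minimal sketch of subsequence dynamic time warping. It is not the authors' implementation: the function name `subsequence_dtw`, the use of one scalar feature per image column, and the absolute-difference local cost are all assumptions made for illustration. The abstract does not state which features or cost the paper uses.

```python
import math


def subsequence_dtw(query, line):
    """Find the best match of `query` anywhere inside `line`.

    Both arguments are sequences of numbers (hypothetical per-column
    features). Returns (cost, end_index), where end_index is the
    position in `line` at which the best alignment ends.
    """
    n, m = len(query), len(line)
    # Row 0 is all zeros, so an alignment may start at any column of
    # the line. This is what removes the need for word segmentation.
    prev = [0.0] * (m + 1)
    for i in range(1, n + 1):
        # Column 0 stays infinite: a non-empty query prefix cannot be
        # aligned with an empty stretch of the line.
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            d = abs(query[i - 1] - line[j - 1])  # assumed local cost
            curr[j] = d + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    # The alignment may also end at any column: take the cheapest one.
    end = min(range(1, m + 1), key=lambda j: prev[j])
    return prev[end], end - 1


cost, end = subsequence_dtw([1, 3, 2], [0, 0, 1, 3, 2, 0, 0])
```

In a word spotting setting, the query would be the feature sequence of the user-marked target word, and low-cost end positions along each text line would be reported as candidate hits.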

Published in

Title: Proceedings of the 2011 Workshop on Historical Document Imaging and Processing
ISBN: 978-1-4503-0916-5
Publisher: Association for Computing Machinery

Conference

2011 Workshop on Historical Document Imaging and Processing

      SLU authors

    • Brun, Anders

      • Centre for Image Analysis, Sveriges lantbruksuniversitet

    UKÄ research subject

    Computer Sciences
    Computer Vision and Robotics (Autonomous Systems)

    Publication identifier

    DOI: https://doi.org/10.1145/2037342.2037355

    Permanent link to this page (URI)

    https://res.slu.se/id/publ/36342