Learning to extract folktale keywords

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

    Abstract

    Manually assigned keywords provide a valuable means for accessing large document collections. They can serve as a shallow document summary and enable more efficient retrieval and aggregation of information. In this paper, we investigate keywords in the context of the Dutch Folktale Database, a large collection of stories including fairy tales, jokes and urban legends. We carry out a quantitative and qualitative analysis of the keywords in the collection. Up to 80% of the assigned keywords (or a minor variation of them) appear in the text itself. Human annotators show moderate to substantial agreement in their judgment of keywords. Finally, we evaluate a learning-to-rank approach to extract and rank keyword candidates. We conclude that this is a promising approach to automating this time-intensive task.
    Original language: Undefined
    Title of host publication: Proceedings of the 7th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2013)
    Editors: P. Lendvai, K. Zervanou
    Place of publication: Stroudsburg
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 65-73
    Number of pages: 9
    ISBN (Print): 978-1-937284-62-6
    Publication status: Published - Aug 2013
    Event: 7th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, LaTeCH 2013 - Sofia, Bulgaria
    Duration: 8 Aug 2013

    Publication series

    Publisher: The Association for Computational Linguistics

    Workshop

    Workshop: 7th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, LaTeCH 2013
    Period: 8 Aug 2013

    Keywords

    • EWI-23556
    • METIS-297760
    • IR-87094
