A Support Vector Machine Approach to Dutch Part-of-Speech Tagging

Mannes Poel, L. Stegeman, Hendrikus J.A. op den Akker

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    7 Citations (Scopus)

    Abstract

    Part-of-Speech tagging, the assignment of Parts-of-Speech to the words in a given context of use, is a basic technique in many systems that handle natural languages. This paper describes a method for supervised training of a Part-of-Speech tagger using a committee of Support Vector Machines on a large corpus of annotated transcriptions of spoken Dutch. Special attention is paid to the decomposition of the large data set into parts for common, uncommon and unknown words. This not only solves the space problems caused by the amount of data but also improves the tagging time. The resulting tagger reaches an accuracy of 97.54%, and its tagging speed is reasonably good.
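The decomposition into common, uncommon and unknown words mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the three groups are defined, so the frequency threshold `common_min_count`, the function names `partition_vocabulary` and `route`, and the toy Dutch sentence are all assumptions made for illustration, not the authors' procedure. The sketch only shows how a word might be routed to one of three groups, each of which could then be handled by its own classifier.

```python
from collections import Counter


def partition_vocabulary(tokens, common_min_count=3):
    """Split the training vocabulary into 'common' and 'uncommon' word sets.

    The threshold is a hypothetical choice; the abstract gives no value.
    """
    counts = Counter(tokens)
    common = {w for w, c in counts.items() if c >= common_min_count}
    uncommon = set(counts) - common
    return common, uncommon


def route(word, common, uncommon):
    """Return the group whose classifier would handle this word."""
    if word in common:
        return "common"
    if word in uncommon:
        return "uncommon"
    # Words never seen in the training data fall into the third group.
    return "unknown"


# Toy training text (illustrative only, not from the corpus in the paper).
train = "de kat zit op de mat en de hond zit op de bank".split()
common, uncommon = partition_vocabulary(train, common_min_count=2)

routes = [route(w, common, uncommon) for w in "de kat slaapt".split()]
print(routes)
```

Splitting the data this way means each group's model only needs to be trained on, and consulted for, its own subset, which is consistent with the space and tagging-time benefits the abstract reports.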
    Original language: Undefined
    Title of host publication: Advances in Intelligent Data Analysis VII. Proceedings of the 7th International Symposium on Intelligent Data Analysis, IDA 2007
    Editors: M.R. Berthold, J. Shawe-Taylor, N. Lavrac
    Place of Publication: London
    Publisher: Springer
    Pages: 274-283
    Number of pages: 10
    ISBN (Print): 978-3-540-74824-3
    DOIs: https://doi.org/10.1007/978-3-540-74825-0_25
    Publication status: Published - Sep 2007

    Publication series

    Name: Lecture Notes in Computer Science
    Publisher: Springer Verlag
    Number: LNCS4549
    Volume: 4723
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Keywords

    • EWI-11050
    • IR-61912
    • METIS-241907
    • HMI-CI: Computational Intelligence

    Cite this

    Poel, M., Stegeman, L., & op den Akker, H. J. A. (2007). A Support Vector Machine Approach to Dutch Part-of-Speech Tagging. In M. R. Berthold, J. Shawe-Taylor, & N. Lavrac (Eds.), Advances in Intelligent Data Analysis VII. Proceedings of the 7th International Symposium on Intelligent Data Analysis, IDA 2007 (pp. 274-283). [10.1007/978-3-540-74825-0_25] (Lecture Notes in Computer Science; Vol. 4723, No. LNCS4549). London: Springer. https://doi.org/10.1007/978-3-540-74825-0_25