
Automatic Detection of Intra-Word Code-Switching

  • Dong-Phuong Nguyen
  • Leonie Cornips

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    Abstract

    Many people are multilingual and they may draw from multiple language varieties when writing their messages. This paper is a first step towards analyzing and detecting code-switching within words. We first segment words into smaller units. Then, words are identified that are composed of sequences of subunits associated with different languages. We demonstrate our method on Twitter data in which both Dutch and dialect varieties labeled as Limburgish, a minority language, are used.
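
The two steps described in the abstract (segment words into subunits, then flag words whose subunits are associated with different languages) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the lexicons `LEXICONS`, the placeholder language labels `A` and `B`, the invented subunits, the greedy longest-match segmenter, and the function names. It is not the segmentation or language-identification method used in the paper.

```python
from typing import Dict, List, Optional, Set

# Hypothetical subunit lexicons for two placeholder languages "A" and "B".
# The strings are invented; "ing" is deliberately shared by both.
LEXICONS: Dict[str, Set[str]] = {
    "A": {"foo", "ing"},
    "B": {"bar", "ing"},
}


def segment(word: str, units: Set[str]) -> Optional[List[str]]:
    """Greedy longest-match segmentation of `word` into known units.

    Returns None when the word cannot be fully covered. Being greedy,
    it may miss segmentations that a backtracking search would find.
    """
    parts: List[str] = []
    i = 0
    while i < len(word):
        # Try the longest candidate substring first.
        for j in range(len(word), i, -1):
            if word[i:j] in units:
                parts.append(word[i:j])
                i = j
                break
        else:
            return None  # no known unit starts at position i
    return parts


def is_intra_word_switch(word: str) -> bool:
    """Flag a word when no single language accounts for all its subunits."""
    all_units = set().union(*LEXICONS.values())
    parts = segment(word, all_units)
    if not parts:
        return False  # empty or unsegmentable word: make no claim
    # For each subunit, collect the languages whose lexicon contains it.
    labels = [
        {lang for lang, units in LEXICONS.items() if part in units}
        for part in parts
    ]
    # If the label sets share no language, the word mixes languages.
    return not set.intersection(*labels)


if __name__ == "__main__":
    for w in ["foobar", "fooing", "xyz"]:
        print(w, is_intra_word_switch(w))
```

In this sketch a subunit shared by both lexicons does not trigger a flag on its own: a word is flagged only when no single language covers every subunit. That is one possible design choice among several.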
    Original language: English
    Title of host publication: Proceedings of the 14th Annual SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
    Place of Publication: Stroudsburg, PA, USA
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 82-86
    Number of pages: 5
    Publication status: Published - 11 Aug 2016
    Event: 14th Annual SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology 2016 - Berlin, Germany
    Duration: 11 Aug 2016 - 11 Aug 2016
    Conference number: 14

    Workshop

    Workshop: 14th Annual SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology 2016
    Country/Territory: Germany
    City: Berlin
    Period: 11/08/16 - 11/08/16

    Keywords

    • CR-I.2.7
    • Social Media
    • code-switching
    • Computational Linguistics
