Programming a multicore architecture without coherency and atomic operations

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review


    Abstract

    It is hard to reason about the state of a multicore system-on-chip, because memory operations need multiple cycles to complete when cores communicate via an interconnect such as a network-on-chip. To simplify programming, atomicity is required, provided by means of atomic read-modify-write (RMW) operations, a strong memory model, and hardware cache coherency. As a result, multicore architectures are very complex, but this complexity stems from the fact that they are designed with an imperative programming paradigm in mind, i.e., one based on threads that communicate via shared memory. In this paper, we show the impact on a multicore architecture when the programming paradigm is changed and a lambda-calculus-based (functional) language is used instead. Ordering requirements on memory operations are more relaxed and synchronization is simplified, because lambda calculus has no notion of state or memory and therefore does not impose ordering requirements on the platform. We implemented a functional language for multicores with a weak memory model, without the need for hardware cache coherency, any atomic RMW operation, or mutex; the execution is atomic-free. Experiments show that even on a system with (transparently applied) software cache coherency, execution scales properly up to 32 cores. This shows that the hardware complexity needed for concurrency can be reduced by making different choices in the software layers on top.
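
The claim that a state-free language imposes no ordering requirements can be illustrated with a small sketch. It is not the authors' implementation or language: the names `fib`, `eval_sequential`, `eval_reversed` and `eval_parallel` are hypothetical, and Python's thread pool uses locks internally, so the sketch only shows that pure functions give the same result under any evaluation order, not that the runtime itself is atomic-free.

```python
from concurrent.futures import ThreadPoolExecutor


def fib(n):
    """Pure function: the result depends only on n; no shared state is read or written."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def eval_sequential(args):
    """Evaluate the applications left to right."""
    return [fib(a) for a in args]


def eval_reversed(args):
    """Evaluate right to left, then put the results back in their positions."""
    return [fib(a) for a in reversed(args)][::-1]


def eval_parallel(args, workers=4):
    """Evaluate in whatever order the worker threads happen to run.

    The user code takes no lock and performs no read-modify-write on shared
    data; each call produces its own result. (Assumption: the pool's internal
    locking is ignored here, since it is a property of this Python runtime,
    not of the evaluation model being illustrated.)
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fib, args))


ARGS = [10, 15, 20, 5]
RESULTS = eval_sequential(ARGS)
```

Because `fib` has no side effects, `eval_reversed(ARGS)` and `eval_parallel(ARGS)` return the same list as `eval_sequential(ARGS)`, whatever order the individual calls complete in.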
    Original language: Undefined
    Title of host publication: Proceedings of the International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM 2014)
    Place of Publication: New York
    Publisher: Association for Computing Machinery (ACM)
    Pages: 29-38
    Number of pages: 10
    ISBN (Print): 978-1-4503-2655-1
    DOI: https://doi.org/10.1145/2560683.2560697
    Publication status: Published - 15 Feb 2014

    Publication series

    Name
    Publisher: ACM

    Keywords

    • EWI-24377
    • Embedded system
    • functional language
    • METIS-303999
    • memory model
    • distributed shared memory
    • IR-89490
    • cache coherency

    Cite this

    Rutgers, J. H., Bekooij, M. J. G., & Smit, G. J. M. (2014). Programming a multicore architecture without coherency and atomic operations. In Proceedings of the International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM 2014) (pp. 29-38). New York: Association for Computing Machinery (ACM). https://doi.org/10.1145/2560683.2560697
    @inproceedings{15e3fc1023f646899e477c7af913adfe,
    title = "Programming a multicore architecture without coherency and atomic operations",
    keywords = "EWI-24377, Embedded system, functional language, METIS-303999, memory model, distributed shared memory, IR-89490, cache coherency",
    author = "J.H. Rutgers and Bekooij, {Marco Jan Gerrit} and Smit, {Gerardus Johannes Maria}",
    note = "10.1145/2560683.2560697",
    year = "2014",
    month = "2",
    day = "15",
    doi = "10.1145/2560683.2560697",
    language = "Undefined",
    isbn = "978-1-4503-2655-1",
    publisher = "Association for Computing Machinery (ACM)",
    pages = "29--38",
    booktitle = "Proceedings of the International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM 2014)",
    address = "United States",

    }


