Leveraging Search-Based and Pre-Trained Code Language Models for Automated Program Repair

  • Oebele Lijzenga
  • Iman Hemati Moghadam*
  • Vadim Zaytsev

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

Abstract

Background. Automated Program Repair (APR) techniques face challenges in navigating the vast search space of possible patches and often rely on redundancy-based assumptions, which can restrict the diversity of generated patches. Recently, Code Language Models (CLMs) have emerged as a method for dynamically generating patch ingredients, potentially enhancing patch quality.

Aim. This study aims to enhance APR by integrating search-based methods with CLMs to improve both the quality of generated patch ingredients and the efficiency of the search process.

Method. We propose ARJACLM, a novel APR technique that uses a genetic algorithm for search space navigation and dynamically generates patch ingredients with the CodeLLaMA-13B model, combining redundancy-based and CLM-derived patch ingredients.

Results. Testing on 176 bugs across 9 Java projects from Defects4J shows that CLM-generated patch ingredients significantly boost ARJACLM's performance, though at the cost of increased computation time. ARJACLM outperforms ARJA and GenProg, and CLM-generated patch ingredients are of higher quality than their redundancy-based counterparts. Additionally, ARJACLM performs best when redundancy-based patch ingredients are ignored.

Original language: English
Title of host publication: SAC '25
Subtitle of host publication: Proceedings of the 40th ACM/SIGAPP Symposium on Applied Computing
Editors: Jiman Hong, Sebastiano Battiato, Christian Esposito
Place of Publication: New York, NY
Publisher: Association for Computing Machinery (ACM)
Pages: 1627-1636
Number of pages: 10
ISBN (Electronic): 979-8-4007-0629-5
Publication status: Published - 14 May 2025
Event: 40th Annual ACM Symposium on Applied Computing, SAC 2025 - Catania International Airport Hotel, Catania, Italy
Duration: 31 Mar 2025 to 4 Apr 2025
Conference number: 40
https://www.sigapp.org/sac/sac2025/

Conference

Conference: 40th Annual ACM Symposium on Applied Computing, SAC 2025
Abbreviated title: SAC 2025
Country/Territory: Italy
City: Catania
Period: 31/03/25 to 4/04/25

Keywords

  • Code language model
  • Program repair
  • Search-based algorithm
