Title:
A Causal Learning Framework for Enhancing Robustness of Source Code Models
Source:
Proceedings of the ACM on Software Engineering, Volume 2, Issue FSE, pp. 2641–2664; ISSN 2994-970X
Publisher Information:
Association for Computing Machinery (ACM)
Publication Year:
2025
Document Type:
*Academic Journal* article
Language:
English
DOI:
10.1145/3729387
Accession Number:
edsbas.247D000
Database:
BASE

*Further Information*

*Deep Learning (DL) models are useful for many software engineering tasks. However, these models are susceptible to adversarial attacks, partly because they learn spurious features that induce spurious correlations between those features and model predictions. In this paper, we tackle the problem with a novel causal learning framework, dubbed CausalCode, which leverages causal inference principles to mitigate spurious correlations. At a high level, CausalCode can be characterized as follows: (i) it uses causal data augmentation to generate intervention examples that disrupt spurious correlations; (ii) it leverages regularization to learn invariant representations that prefer causal features over spurious ones; (iii) because it is task-agnostic and model-agnostic, it can enhance the robustness of multiple DL models on source code-based software engineering tasks. To evaluate its effectiveness, we conduct comprehensive experiments on two models (i.e., CodeBERT and GraphCodeBERT) across four software engineering tasks (i.e., defect detection, functionality classification, code translation, and code repair). Experimental results show that CausalCode outperforms state-of-the-art approaches in enhancing the robustness of these models.*
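To make the abstract's two ingredients concrete, here is a minimal, hypothetical sketch (not the paper's actual implementation): an intervention example built by a semantics-preserving edit (identifier renaming), and an invariance penalty that is added to the ordinary task loss so the model is pushed to ignore the spurious surface feature. The function names, the squared-L2 penalty, and the weight `lam` are all illustrative assumptions.

```python
import re

def rename_identifier(code: str, old: str, new: str) -> str:
    """Intervention: rename one variable. Program semantics are unchanged,
    so a robust model's prediction should be unchanged too. (Illustrative;
    real tools rename via the AST, not regex.)"""
    return re.sub(r"\b%s\b" % re.escape(old), new, code)

def invariance_loss(repr_a, repr_b) -> float:
    """Squared L2 distance between the representations of the original
    and the intervened program."""
    return sum((a - b) ** 2 for a, b in zip(repr_a, repr_b))

def total_loss(task_loss: float, repr_orig, repr_intervened, lam: float = 0.1) -> float:
    """Task loss plus the invariance regularizer; lam trades off the two."""
    return task_loss + lam * invariance_loss(repr_orig, repr_intervened)

# Usage: the augmented pair shares semantics but differs in surface form.
original = "def add(x, y):\n    return x + y"
intervened = rename_identifier(original, "x", "a")
```

Training would minimize `total_loss` over such pairs, so that representations cannot rely on the renamed identifier, one example of a spurious feature.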