[Master's Thesis] The Enhancement of Software Change Recommendations using Context-Aware Representations of Commits
January 28, 2026
Ramadhanty, a second-year master's student (M2) in the Kobayashi Lab, has submitted a master's thesis.
Title: The Enhancement of Software Change Recommendations using Context-Aware Representations of Commits
Abstract:
During software development, developers frequently modify software elements to fix bugs or implement new functionalities. Because dependencies exist among software elements, it is necessary to modify multiple related components when an element is changed. However, as software evolves, these interdependencies become increasingly complex, often resulting in incomplete changes. Such incomplete changes can lead to new bugs and pose significant challenges for software maintenance.
To mitigate the risk of incomplete changes, researchers have proposed automated change recommendation systems. Traditional methods use evolutionary coupling, derived from historical co-change patterns, and conceptual coupling, based on the similarity of textual information within the codebase. The state-of-the-art (SoTA) method adopts a hybrid approach that combines information from the set of changed items with textual data, such as commit messages and code diffs. This system calculates a composite similarity score to recommend target files using collaborative filtering. Despite its effectiveness, this method relies on TF-IDF for textual representation, which captures only lexical overlap and fails to account for the deeper semantic context of the words.
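The hybrid, collaborative-filtering idea described above can be sketched roughly as follows. This is a minimal illustration, not the SoTA method itself: the function names, the `alpha` weight, and the use of token-set Jaccard overlap as a stand-in for TF-IDF cosine similarity are all assumptions made for the example.

```python
from collections import defaultdict


def jaccard(a, b):
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(query_files, query_tokens, history, alpha=0.5, top_n=10):
    """Rank files absent from the query by similarity-weighted votes.

    history: list of (files, tokens) pairs, one per past commit (sets of str).
    alpha:   hypothetical weight between changed-item and textual similarity.
    """
    scores = defaultdict(float)
    for files, tokens in history:
        # Composite similarity: changed-item overlap plus lexical overlap.
        sim = (alpha * jaccard(query_files, files)
               + (1 - alpha) * jaccard(query_tokens, tokens))
        # Each past commit votes for the files the query has not touched yet.
        for f in files - query_files:
            scores[f] += sim
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f for f, s in ranked[:top_n] if s > 0]


history = [
    ({"a.py", "b.py"}, {"fix", "parser"}),
    ({"a.py", "c.py"}, {"update", "docs"}),
]
result = recommend({"a.py"}, {"fix", "parser", "bug"}, history)
```

Because the textual part here only counts shared tokens, two commits that describe the same change with different words score zero on it, which is the lexical-overlap limitation the abstract attributes to TF-IDF.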
In this research, I aim to address this limitation by incorporating change semantics derived from textual information using pre-trained models. I evaluated a variety of architectures, ranging from general-purpose pre-trained models for code tasks to specialized variants. To adapt these models to the change recommendation task, I implemented a fine-tuning strategy based on triplet margin loss, which optimizes the semantic space to distinguish between related and unrelated commits. Furthermore, I investigated the impact of key hyperparameters, specifically the number of negative samples and the similarity threshold used for negative sampling.
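For reference, the triplet margin loss and a threshold-based choice of negatives can be written in a few lines. This sketch uses plain Python lists as embeddings; the Euclidean distance, the margin value, and the rule "keep candidates whose cosine similarity to the anchor is at least the threshold" are assumptions for illustration, and the thesis may configure these differently.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector has zero norm."""
    norm = math.hypot(*u) * math.hypot(*v)
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """max(d(a, p) - d(a, n) + margin, 0): zero once the negative is
    farther from the anchor than the positive by at least the margin."""
    return max(euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin, 0.0)


def select_hard_negatives(anchor, candidates, threshold, k):
    """Hypothetical sampler: up to k unrelated candidates whose similarity
    to the anchor is at least `threshold`, most similar first."""
    scored = [(cosine(anchor, c), c) for c in candidates]
    hard = [pair for pair in scored if pair[0] >= threshold]
    hard.sort(key=lambda pair: -pair[0])
    return [c for _, c in hard[:k]]


easy = triplet_margin_loss([0, 0], [0, 1], [3, 4])
violating = triplet_margin_loss([0, 0], [3, 4], [0, 1])
negatives = select_hard_negatives([1, 0], [[1, 0.1], [0, 1], [1, 1]], 0.5, 2)
```

In this sketch the two hyperparameters mentioned above correspond to `k` (the number of negative samples) and `threshold` (the similarity cutoff used when sampling them).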
Following the practice in SoTA, the proposed method was evaluated on actual incomplete changes from Open Source Software (OSS) projects. The evaluation dataset was constructed based on Bug-Inducing Commit (BIC) and Bug-Fixing Commit (BFC) relations. The files modified in a BFC that were not modified in the corresponding BIC served as the ground-truth target files.
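The ground-truth construction described above amounts to a set difference. The following is a minimal sketch with made-up file names; the function name and the input format (plain collections of file paths) are assumptions for the example.

```python
def ground_truth_targets(bic_files, bfc_files):
    """Files touched by the bug-fixing commit (BFC) but not by the
    bug-inducing commit (BIC), i.e. the part the BIC left incomplete."""
    return set(bfc_files) - set(bic_files)


# Hypothetical BIC/BFC pair: the fix had to touch a file the BIC missed.
bic = ["src/parser.py", "src/lexer.py"]
bfc = ["src/parser.py", "src/ast.py"]
targets = ground_truth_targets(bic, bfc)
```

Under this reading, the BIC's changed files and textual data act as the query, and a recommendation counts as correct when it appears in `targets`.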
The experimental results indicate that general models perform better on this task than specialized variants. I also found that using fewer “hard negatives”, i.e., negative samples with high similarity to the BIC, enhances the performance of the base models when combined with changed-item features. By using embeddings generated from general models, I successfully improved the practical performance of the recommendation system, as evidenced by higher Recall@N and Hit@N metrics for N in {10, 20}. This research provides a foundation for applying more recent and advanced pre-trained models to the software change recommendation task.
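The two metrics can be sketched using their commonly used definitions: Recall@N is the fraction of ground-truth files found in the top N recommendations, and Hit@N is 1 if at least one of them appears there. The thesis may define edge cases differently; returning 0.0 for an empty target set is an assumption of this sketch.

```python
def recall_at_n(ranked, targets, n):
    """Fraction of ground-truth files that appear in the top-n list."""
    targets = set(targets)
    if not targets:
        return 0.0  # assumption: queries without targets contribute zero
    return len(set(ranked[:n]) & targets) / len(targets)


def hit_at_n(ranked, targets, n):
    """1.0 if any ground-truth file appears in the top-n list, else 0.0."""
    return 1.0 if set(ranked[:n]) & set(targets) else 0.0


ranked = ["a.py", "b.py", "c.py", "d.py"]
targets = {"b.py", "d.py"}
recall_top2 = recall_at_n(ranked, targets, 2)
hit_top2 = hit_at_n(ranked, targets, 2)
```

Averaging these per-query values over all BIC/BFC pairs gives the dataset-level scores for a given N.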


A list of other theses and dissertations from the Kobayashi Lab is available here.

