TY - JOUR
T1 - A Code Clustering Technique for Unifying Method Full Path of Reusable Cloned Code Sets of a Product Family
AU - Taeyoung, Kim
AU - Jihyun, Lee
AU - Eunmi, Kim
JO - KIPS Transactions on Software and Data Engineering
PY - 2023
DA - 2023/1/30
DO - https://doi.org/10.3745/KTSDE.2023.12.1.1
KW - Clone-and-own Approach
KW - Software Product Line Migration
KW - Product Line Code Base
KW - Code Clustering
AB - Similar software is often developed with the Clone-and-Own (CAO) approach, which copies and modifies existing artifacts. The CAO approach is considered a bad practice because maintenance becomes more difficult as the number of cloned products increases. Software product line engineering is a methodology that can address this problem by developing a product family through systematic reuse. Migrating product families developed with the CAO approach to product line engineering begins with finding the clones, integrating them, and building them into reusable assets. However, cloning occurs at various levels, from directories down to lines of code, and the structure of the cloned artifacts can change. This makes it difficult to build a product line code base simply by finding clones. Successful migration thus requires unifying the source code's file paths, class names, and method signatures. This paper proposes a clustering method that identifies sets of similar code scattered across product variants; because some of their method full paths differ, path unification is necessary. To show the effectiveness of the proposed method, we conducted an experiment using the Apo Games product line, which has evolved with the CAO approach. Clustering performed without preprocessing achieved an average precision of 0.91 and identified 0 common clusters, whereas our method achieved 0.98 and 15, respectively.