{"id":556,"date":"2018-03-03T23:58:46","date_gmt":"2018-03-03T23:58:46","guid":{"rendered":"http:\/\/cppdepend.com\/blog\/?p=556"},"modified":"2019-03-13T23:00:50","modified_gmt":"2019-03-13T23:00:50","slug":"tracking-the-hidden-duplicate-code-in-a-c-code-base","status":"publish","type":"post","link":"https:\/\/cppdepend.com\/blog\/tracking-the-hidden-duplicate-code-in-a-c-code-base\/","title":{"rendered":"Tracking the hidden duplicate code in a C++ code base."},"content":{"rendered":"<p>It\u2019s well known that duplicate code has a negative impact on software development and maintenance. A major drawback is that when one instance of duplicate code is changed to fix a bug or add a feature, its counterparts have to be changed at the same time.<\/p>\n<p>The most common cause of duplicate code is copy\/paste, and in this case the source code is identical in two or more places. This practice is discouraged in many articles, books, and web sites. However, it\u2019s not always easy to follow those recommendations, and the developer chooses the easy solution: copy\/paste.<!--more--><\/p>\n<p>There are many tools to detect this kind of cloned code. <a href=\"http:\/\/www.ccfinder.net\/ccfinderxos.html\">CCFinderX<\/a> is one of the interesting open source tools available. 
CCFinderX is a code-clone detector that detects code clones (duplicated code fragments) in source files written in Java, C\/C++, COBOL, VB, and C#. It enables user-side customization of a preprocessor and provides an interactive analysis based on metrics.<\/p>\n<p>Using the appropriate tool makes it easy to detect duplicate code produced by copy\/paste. However, there are some cases where cloned code is not trivial to detect.<\/p>\n<p><strong>Hidden duplicate code<\/strong><\/p>\n<p><em>Case 1: Modified copy\/pasted code<\/em><\/p>\n<p>As described above, the major problem with copy\/pasted code is that when one instance is changed, its counterparts have to be changed at the same time. Unfortunately, that doesn\u2019t always happen, and the duplicate instances drift apart.<\/p>\n<p>To avoid this kind of hidden duplicate code, don\u2019t hesitate to use a tool like CCFinderX to discover the duplicate code instances, and at least tag them with comments if you don\u2019t have time to refactor your code. This tagging is very useful: when a developer changes a duplicate code instance, they are notified that other places contain the same code. If the developer is not informed, they will change only one place, and it will be very difficult to detect the modified duplicate code later.<\/p>\n<p><em>Case 2: Similar functionality<\/em><\/p>\n<p>Copy\/paste is not the only source of duplicate code. Another is when similar functionality is implemented independently.<\/p>\n<p>Here\u2019s a brief description of this second source of duplicate code, from <a href=\"http:\/\/en.wikipedia.org\/wiki\/Duplicate_code\">Wikipedia<\/a>:<\/p>\n<pre>Functionality that is very similar to that in another part of a program is required and a developer independently writes code that is very similar to what exists elsewhere. 
Studies suggest that such independently rewritten code is typically not syntactically similar.<\/pre>\n<p><strong>Tracking hidden duplicate code<\/strong>:<\/p>\n<p>When duplicate code is not exactly identical, no tool can give you fully reliable results. A tool can only report suspected duplicates, and it\u2019s the developers\u2019 responsibility to check whether each one is really cloned code or just a false positive.<\/p>\n<p>Each tool uses a specific algorithm to track this kind of duplicate code. We didn\u2019t test any of these tools, but I think most of them are worth trying at least once: they could give you interesting results that help you improve the design and implementation of your code, as we will discover later in this post.<\/p>\n<p>In our case we will use an algorithm which consists in <strong>defining sets of methods that use the same members, i.e. calling the same methods, reading the same fields, writing the same fields<\/strong>. We call these sets <em>suspect-sets<\/em>. Suspect-sets are sorted by the number of shared members used.<\/p>\n<p><a href=\"http:\/\/www.cppdepend.com\/\">CppDepend<\/a> implements this algorithm as a CppDepend Power-Tool. Power-Tools are a set of open-source tools based on CppDepend.API. The source code of Power-Tools can be found in <em>$CppDependInstallPath$\\ CppDepend.PowerTools.SourceCode\\ CppDepend.PowerTools.sln.<\/em><\/p>\n<p>Let\u2019s evaluate the effectiveness of this algorithm by searching for duplicate code in the Irrlicht 3D engine code base.<\/p>\n<p><strong>Case study: Irrlicht 3D engine<\/strong><\/p>\n<p>The <a href=\"http:\/\/irrlicht.sourceforge.net\/\">Irrlicht<\/a> Engine is an open-source high-performance realtime 3D engine written in C++. 
It is completely cross-platform.<\/p>\n<p>Here are two of the suspected duplicates detected:<\/p>\n<p><em>1- Exact code duplicate<\/em><\/p>\n<p>In this case, the 18 methods detected call the same 3 methods, read the same 2 fields and write the same 9 fields.<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/clone5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-216\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/clone5.png\" alt=\"clone5\" width=\"671\" height=\"267\" \/><\/a><\/p>\n<p>After checking the source code of these methods, it turns out to be exactly duplicated code. However, other tools are better suited to detecting this kind of duplicate, and the algorithm used here adds no value for exact clones.<\/p>\n<p><em>2- Similar functionality<\/em><\/p>\n<p>Here\u2019s a second suspected duplicate. It concerns four methods calling the same 11 methods, reading the same 6 fields and writing the same 2 fields.<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/clone6.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-217\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/clone6.png\" alt=\"clone6\" width=\"670\" height=\"265\" \/><\/a><\/p>\n<p>After checking the source code of these four methods, it\u2019s not exactly the same code. However, they all follow the same overall algorithm structure. 
So here I\u2019d vote for factoring out the common code.<\/p>\n<p>To better explain this case, here\u2019s the relationship between the classes concerned by the duplicate code:<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/clone7.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-218\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/clone7.png\" alt=\"clone7\" width=\"752\" height=\"262\" \/><\/a><\/p>\n<p>OnSetConstants is declared in the IShaderConstantSetCallBack interface and implemented by all the derived classes. All four implementations have the same overall algorithm structure, and in such cases the <a href=\"http:\/\/en.wikipedia.org\/wiki\/Template_method_pattern\">template method pattern<\/a> is a good way to refactor the existing implementation.<\/p>\n<p>When testing this algorithm on many C++ open source projects, we were very surprised that many duplicates are similar to this case, and that the template method pattern is rarely used.<\/p>\n<p><strong>Conclusion<\/strong><\/p>\n<p>Tracking duplicate code is very useful for improving both the implementation and the design of your projects. Fortunately, many tools exist to detect cloned code, and it\u2019s recommended to run one of them periodically and at least tag the duplicate instances.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It\u2019s well known that duplicate code has a negative impact on software development and maintenance. A major drawback is that when one instance of duplicate code is changed to fix a bug or add a feature, its counterparts have to be changed at the same time. 
The most common cause of duplicate code is copy\/paste, and in &hellip; <a href=\"https:\/\/cppdepend.com\/blog\/tracking-the-hidden-duplicate-code-in-a-c-code-base\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Tracking the hidden duplicate code in a C++ code base.&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[13,51],"class_list":["post-556","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-cpp","tag-duplication"],"_links":{"self":[{"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/posts\/556","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/comments?post=556"}],"version-history":[{"count":5,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/posts\/556\/revisions"}],"predecessor-version":[{"id":1249,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/posts\/556\/revisions\/1249"}],"wp:attachment":[{"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/media?parent=556"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/categories?post=556"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/tags?post=556"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}