{"id":898,"date":"2018-03-28T10:43:07","date_gmt":"2018-03-28T10:43:07","guid":{"rendered":"http:\/\/cppdepend.com\/blog\/?p=898"},"modified":"2023-05-31T15:39:05","modified_gmt":"2023-05-31T15:39:05","slug":"write-efficient-c-code-learn-from-linus-torvalds","status":"publish","type":"post","link":"https:\/\/cppdepend.com\/blog\/write-efficient-c-code-learn-from-linus-torvalds\/","title":{"rendered":"Write Efficient C Code: Learn from Linus Torvalds"},"content":{"rendered":"<p>Every project has its own style guide: a set of conventions about how to write code for that project.\u00a0Some managers choose basic coding rules, others prefer very advanced ones, and in many projects no coding rules are specified at all, so each developer uses their own style.<\/p>\n<p>It is much easier to understand a large codebase when all the code in it is in a consistent style.<!--more--><span id=\"more-163\"><\/span><\/p>\n<p>Many resources discuss which coding rules to adopt. We can learn good coding rules from:<\/p>\n<ul>\n<li>Books and magazines.<\/li>\n<li>Websites.<\/li>\n<li>Colleagues.<\/li>\n<li>Training courses.<\/li>\n<\/ul>\n<p>We can also work with an expert for a few months to raise the coding skills of the team. However, it&#8217;s not easy to find the right person, and it could cost the company a lot of money. But why search for an expert when we can be inspired by\u00a0genius developers like Linus Torvalds? Indeed, you only have to explore the source code developed or maintained by him\u00a0to get a good idea of how to write efficient C code.<\/p>\n<p>Linus Torvalds\u00a0is a genius: he is the creator and, for a long time, was the principal developer of the\u00a0Linux kernel, and he also created the most popular distributed revision control\u00a0system,\u00a0<a class=\"mw-redirect\" title=\"Git (software)\" href=\"https:\/\/en.wikipedia.org\/wiki\/Git_(software)\">Git<\/a>. 
That&#8217;s it, no need to look for other arguments \ud83d\ude42<\/p>\n<p><strong>Inside the Git source code<\/strong><\/p>\n<p>Let&#8217;s take a look at a code snippet from Git:<br \/>\n<img decoding=\"async\" src=\"http:\/\/www.cppdepend.com\/img\/git.png\" \/><\/p>\n<p>Here are some remarks about this code:<\/p>\n<ul>\n<li>The functions are declared as static.<\/li>\n<li>The functions\u00a0return an error code.<\/li>\n<li>The functions have few parameters.<\/li>\n<li>The\u00a0functions exit as early as possible.<\/li>\n<li>The variables are declared as static.<\/li>\n<li>The variable names are easy to understand.<\/li>\n<li>The functions are very short.<\/li>\n<li>There are no extra comments in the bodies; the code explains itself.<\/li>\n<li>The function bodies are well indented.<\/li>\n<li>The define guards are clear.<\/li>\n<\/ul>\n<p>If we\u00a0navigate through the whole Git source code, we notice how consistent the implementation is. The same best-practice rules are applied to each function. 
To be sure, let&#8217;s\u00a0search for the\u00a0static functions:<\/p>\n<pre>from m in Methods where m.IsStatic select m\r\n<\/pre>\n<p>The treemap gives a good idea of which code elements are matched by a CQLinq query; the blue rectangles represent the result.<\/p>\n<p><a href=\"http:\/\/cppdepend.files.wordpress.com\/2012\/10\/git2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-842\" title=\"git2\" src=\"http:\/\/cppdepend.files.wordpress.com\/2012\/10\/git2.png\" alt=\"\" width=\"595\" height=\"337\" \/><\/a><br \/>\nAlmost all functions are declared as static so that they are visible only in the translation unit where they are declared.<\/p>\n<p><strong>Inside the Linux kernel source code<\/strong><\/p>\n<p>Let\u2019s switch to the Linux source code and take this function implementation as an example:<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux11.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-176\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux11.png\" alt=\"linux11\" width=\"550\" height=\"344\" \/><\/a><\/p>\n<p>The code looks very clean. Indeed, the\u00a0function:<\/p>\n<ul>\n<li>Has\u00a0only a few lines of code.<\/li>\n<li>Has a well-defined signature.<\/li>\n<li>Is well commented.<\/li>\n<li>Is well indented.<\/li>\n<li>Uses very clear variable names.<\/li>\n<li>Satisfies\u00a0const correctness.<\/li>\n<li>Checks its input parameters and warns if they do not satisfy some conditions.<\/li>\n<\/ul>\n<p>The same\u00a0function could be implemented by another developer like this:<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux12.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-177\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux12.png\" alt=\"linux12\" width=\"711\" height=\"273\" \/><\/a><\/p>\n<p>The coding style has 
a big impact on\u00a0source code readability. Investing a few hours in training developers and doing periodic code reviews always helps keep the code easy to maintain and evolve.<\/p>\n<p>Let\u2019s go inside the Linux kernel source code using\u00a0<a href=\"http:\/\/www.cppdepend.com\/\">CppDepend\u00a0<\/a>and discover some basic coding rules adopted by its developers.<\/p>\n<p><strong>Modularity<\/strong><\/p>\n<p>Modularity is a software design technique that increases the extent to which software is composed of separate parts. Modular code is easier to manage and maintain.<\/p>\n<p>For a procedural language like C,\u00a0where logical artifacts like namespaces, components or classes do not exist, we can modularize by using directories and files.<\/p>\n<p>Here are some possible scenarios:<\/p>\n<ul>\n<li>Put all the source files in one directory.<\/li>\n<li>Isolate the files related to a module or a submodule\u00a0in a specific directory.<\/li>\n<\/ul>\n<p>In the case of the Linux kernel, directories and subdirectories are used to modularize the kernel source code.<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux15.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-191\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux15.png\" alt=\"linux15\" width=\"348\" height=\"322\" \/><\/a><\/p>\n<p><strong>Encapsulation<\/strong><\/p>\n<p>Encapsulation is the hiding of functions and data which are internal to an implementation.\u00a0In C, encapsulation is achieved with the keyword static, which gives functions and file-level variables internal linkage. 
These entities are commonly called file-scope functions and variables.<\/p>\n<p>Let\u2019s search for all static functions by executing the following CQLinq query:<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux17.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-192\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux17.png\" alt=\"linux17\" width=\"383\" height=\"38\" \/><\/a><\/p>\n<p>We can use the Metric View to\u00a0get a good idea of how many functions are concerned. In the Metric View, the code base is represented by a treemap. Treemapping is a method for displaying tree-structured data by using nested rectangles. The tree structure used in a CppDepend treemap is the usual code hierarchy:<\/p>\n<ul>\n<li>Projects contain directories.<\/li>\n<li>Directories\u00a0contain files.<\/li>\n<li>Files\u00a0contain structs, functions\u00a0and variables.<\/li>\n<\/ul>\n<p>The treemap view provides a useful way to represent the result of a CQLinq query, so we can visually see the code elements concerned by the query.<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-170\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux2.png\" alt=\"linux2\" width=\"771\" height=\"383\" \/><\/a><\/p>\n<p>As we can observe, many functions are declared as static.<\/p>\n<p>Let\u2019s now search for the static variables:<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-179\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux3.png\" alt=\"linux3\" width=\"370\" height=\"473\" \/><\/a><\/p>\n<p>The same remark applies as for functions: many variables are declared as static.<\/p>\n<p>In the Linux kernel source code, encapsulation is used whenever functions and variables 
must be private to the file scope.<\/p>\n<p><strong>Use structs to store your data model<\/strong><\/p>\n<p>In C programming, functions use variables to do their work. These variables can be:<\/p>\n<ul>\n<li>Static variables.<\/li>\n<li>Global variables.<\/li>\n<li>Local variables.<\/li>\n<li>Variables from structs.<\/li>\n<\/ul>\n<p>Each project has its data model, which may be used by many source files. Using global variables is one solution, but not a good one; grouping the data into structs is preferable.<\/p>\n<p>Let\u2019s search for global variables with a primitive type:<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-180\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux4.png\" alt=\"linux4\" width=\"357\" height=\"469\" \/><\/a><\/p>\n<p>Only very few variables are concerned, and maybe we can group some of them into structs, like (elfcorehdr_addr and\u00a0elfcorehdr_size) or (pm_freezing and pm_nosig_freezing).<\/p>\n<p><strong>Let functions be short and sweet<\/strong><\/p>\n<p>Here is a piece of advice about the length of functions, from the\u00a0<a href=\"https:\/\/www.kernel.org\/doc\/Documentation\/CodingStyle\">Linux coding style web page<\/a>:<\/p>\n<pre>Functions should be short and sweet, and do just one thing.  They should\r\nfit on one or two screenfuls of text (the ISO\/ANSI screen size is 80x24,\r\nas we all know), and do one thing and do that well.\r\n\r\nThe maximum length of a function is inversely proportional to the\r\ncomplexity and indentation level of that function.  
So, if you have a\r\nconceptually simple function that is just one long (but simple)\r\ncase-statement, where you have to do lots of small things for a lot of\r\ndifferent cases, it's OK to have a longer function.<\/pre>\n<p>Let\u2019s search for functions where the number of lines of code is more than 30:<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux14.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-188\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux14.png\" alt=\"linux14\" width=\"383\" height=\"466\" \/><\/a><\/p>\n<p>Only a few functions have more than 30 lines of code.<\/p>\n<p><strong>Number of function parameters<\/strong><\/p>\n<p>Functions\u00a0where NbParameters &gt; 8\u00a0might be painful to call and might degrade performance. An alternative is to provide a structure dedicated to passing the arguments.<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux7.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-182\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux7.png\" alt=\"linux7\" width=\"356\" height=\"352\" \/><\/a><\/p>\n<p>Only 2 functions have more than 8 parameters.<\/p>\n<p><strong>Number of\u00a0local variables<\/strong><\/p>\n<p>Functions where NbVariables is higher than 8 are hard to understand and maintain. 
Functions where NbVariables is higher than 15 are extremely complex and should be split into smaller functions (except if they are automatically generated by a tool).<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux9.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-183\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux9.png\" alt=\"linux9\" width=\"352\" height=\"413\" \/><\/a><\/p>\n<p>Only 5 functions have more than 15 local variables.<\/p>\n<p><strong>Avoid defining complex functions<\/strong><\/p>\n<p>Many metrics exist to detect complex functions; NBLinesOfCode, the number of parameters and the number of local variables are the basic ones.<\/p>\n<p>There are other interesting metrics to detect complex functions:<\/p>\n<ul>\n<li>Cyclomatic complexity is a popular procedural software metric based on the number of decision points in a procedure; it counts the linearly independent paths through the code.<\/li>\n<li>Nesting Depth\u00a0is a metric defined on functions that gives the maximum depth\u00a0of the most nested scope in a function body.<\/li>\n<li>Max Nested Loop equals the maximum level of loop nesting in a function.<\/li>\n<\/ul>\n<p>The maximum value tolerated for these metrics depends mostly on the team\u2019s choices; there are no standard values.<\/p>\n<p>Let\u2019s search for functions that are candidates for refactoring:<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux8.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-184\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux8.png\" alt=\"linux8\" width=\"350\" height=\"468\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>Only very few functions could be considered complex.<\/p>\n<p><strong>Naming convention<\/strong><\/p>\n<p>There\u2019s no standard naming convention; each project manager can choose whatever they think is best. However, what\u2019s very important is to respect the chosen 
convention to keep the naming homogeneous.<\/p>\n<p>In the case of Linux, struct names must begin with a lowercase letter. We can check whether this holds for the whole kernel source code by executing the following query:<\/p>\n<p><a href=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-185\" src=\"http:\/\/www.javadepend.com\/Blog\/wp-content\/uploads\/linux5.png\" alt=\"linux5\" width=\"355\" height=\"397\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>Only 4 structs begin with \u201c_\u201d instead of a lowercase letter.<\/p>\n<p><strong>Indentation<\/strong><\/p>\n<p>Indentation is very useful to make the code easy to read.\u00a0Here is the rationale behind indentation, from the\u00a0<a href=\"https:\/\/www.kernel.org\/doc\/Documentation\/CodingStyle\">Linux coding style web page<\/a>:<\/p>\n<pre>Rationale: The whole idea behind indentation is to clearly define where\r\na block of control starts and ends.  Especially when you've been looking\r\nat your screen for 20 straight hours, you'll find it a lot easier to see\r\nhow the indentation works if you have large indentations.\r\n\r\nNow, some people will claim that having 8-character indentations makes\r\nthe code move too far to the right, and makes it hard to read on a\r\n80-character terminal screen.  The answer to that is that if you need\r\nmore than 3 levels of indentation, you're screwed anyway, and should fix\r\nyour program.<\/pre>\n<p><strong>Conclusion<\/strong><\/p>\n<p>Exploring well-known open source projects is always a good way to improve your programming skills, especially if they are developed and maintained by experts. 
There is no need to download and build the project; you can simply browse the code on GitHub.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Every project has its own style guide: a set of conventions about how to write code for that project.\u00a0Some managers choose basic coding rules, others prefer very advanced ones and for many projects, no coding rules are specified, and each developer uses his style. It is much easier to understand a large codebase when all &hellip; <a href=\"https:\/\/cppdepend.com\/blog\/write-efficient-c-code-learn-from-linus-torvalds\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Write Efficient C Code: Learn from Linus Torvalds&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[7,56],"class_list":["post-898","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-c","tag-linus-torvalds"],"_links":{"self":[{"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/posts\/898","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/comments?post=898"}],"version-history":[{"count":11,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/posts\/898\/revisions"}],"predecessor-version":[{"id":1473,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/posts\/898\/revisions\/1473"}],"wp:attachment":[{"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/media?parent=898"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/catego
ries?post=898"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cppdepend.com\/blog\/wp-json\/wp\/v2\/tags?post=898"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}