Trying to understand Linus Torvalds’s opinion of C++

Linux, PHP, and Git are popular projects developed in C. On the other side, OpenOffice, Firefox, Clang, and Photoshop are developed in C++. So both languages have proven to be good candidates for developing complex applications. Trying to prove that one language is better than the other is probably not the right debate. However, we can discuss the motivations behind choosing one of them.

When I first discovered Linus Torvalds’s opinion about C++, as a C++ developer I totally disagreed with his point of view. But it is the point of view of the lead developer of the Linux kernel and Git, so he can’t be totally wrong.

Reading his opinion again, I agree with this assertion:

inefficient abstracted programming models where two years down the road you notice that some abstraction wasn't very efficient, but now all your code depends on all the nice object models around it, and you cannot fix it without rewriting your app.

It’s true that C++ offers better possibilities for writing beautiful, well-structured code, but this comes at a price: any change or refactoring can be difficult. That does not mean I have to choose another language. Every language or library comes at a price, but we have to know how to limit the impact of possible changes to our C++ code after a few years of development.

Let’s analyze the Git source code with CppDepend, discover some design facts, and compare C and C++ on these two points:

  • Easy to understand.
  • Managing changes.

Modularity: Physical vs Logical

Modularity is a software design technique that increases the extent to which software is composed of separate parts. Modular code is easier to manage and maintain.

We can modularize a project using two approaches:

  • Physically: by using directories and files. This modularity is provided by the operating system and can be applied to any language.
  • Logically: by using namespaces, components, classes and structs. This technique depends on the language capabilities.

When we develop in C, we essentially use physical modularity to package our code: the code is structured with directories that isolate the modules. Here is the dependency graph between some of Git’s directories.

In C++, however, we can use namespaces to modularize the codebase. These artifacts are provided by the language, and the shapes in the previous graph could be namespaces instead of directories.
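As a minimal sketch of logical modularity, the following uses two namespaces where a C project would use two directories. The module names `refs` and `transport` and both functions are hypothetical, chosen only for illustration; they are not taken from Git’s code.

```cpp
#include <string>

// Hypothetical module: the namespace plays the role a directory plays in C.
namespace refs {
    // Returns the fully qualified name of a branch reference.
    inline std::string branch_ref(const std::string& branch) {
        return "refs/heads/" + branch;
    }
}

// Second hypothetical module that depends on the first one.
namespace transport {
    // The qualified call makes the dependency on the refs module explicit.
    inline std::string push_spec(const std::string& branch) {
        return refs::branch_ref(branch) + ":" + refs::branch_ref(branch);
    }
}
```

Reading `refs::branch_ref` is enough to know which module the function belongs to. The flip side is that moving it to another namespace means touching every qualified call.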

Impact of choosing one of the two approaches:

Easy to understand: The logical approach is better because the modularity is defined by the language artifacts, and just by reading the code we know which module a code element belongs to.

Managing changes: A good design generally needs many iterations. With the physical approach, the impact of design changes can be much more limited than with the logical one: we only need to move a function or variable from one file to another, or move a file from one directory to another.

In C++, however, such a change can impact a lot of code, because the logical modularity is implemented with language artifacts and the code itself has to be modified.

Encapsulation: Class vs File

In C++, encapsulation is defined as the process of combining data and functions into a single unit called a class. With encapsulation, code outside the class cannot access the data directly. The data is accessible only through the functions of the class.
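A small illustration of that definition, using a hypothetical `Counter` class (the name and members are invented for this sketch):

```cpp
// Hypothetical class: data and the functions operating on it form one unit.
class Counter {
public:
    // The only ways to touch the data are these member functions.
    void increment() { ++value_; }
    void reset() { value_ = 0; }
    int value() const { return value_; }

private:
    int value_ = 0;  // Not reachable from outside the class.
};
```

Writing `counter.value_ = 5;` outside the class would be rejected by the compiler, so every modification has to go through `increment()` or `reset()`.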

In C we can also have encapsulation, but again through a physical approach: the equivalent of a class is a file containing the functions and the data they use, and we can limit the accessibility of functions and variables with the “static” keyword.

Git uses this technique to hide functions and variables. To see that, let’s search for static functions:
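The file-as-class idea could look roughly like the sketch below. It is written in the common subset of C and C++, and the file name and all identifiers are hypothetical, not taken from Git.

```cpp
// counter.c-style translation unit (hypothetical); also compiles as C++.

// 'static' gives internal linkage: the variable is invisible outside this file.
static int counter_value = 0;

// Private helper, also limited to this translation unit.
static int clamp_non_negative(int v) { return v < 0 ? 0 : v; }

// The "public interface" of the file: functions with external linkage.
void counter_add(int delta) {
    counter_value = clamp_non_negative(counter_value + delta);
}

int counter_get(void) { return counter_value; }
```

Other files can call `counter_add` and `counter_get`, but they cannot name `counter_value` or `clamp_non_negative`, which gives the same effect as private members.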

from m in Methods where m.IsStatic select m

The treemap gives a good idea of the code elements matched by a CQLinq query; the blue rectangles represent the result.


Almost all functions are declared static so that they are visible only in the translation unit where they are declared. The same applies to variables:

from f in Fields where f.IsStatic select f


Easy to understand: The C++ encapsulation mechanism improves the understanding and visibility of the code. C is lower level and relies on the physical approach rather than the logical one.

Managing changes: If we have to change where a variable or function is encapsulated, it can be very easy in C, but in C++ it can impact a lot of code.

Polymorphism vs Selection idiom

Polymorphism means that some code or operations or objects behave differently in different contexts.

This technique is widely used in C++ projects, but what about C?

Procedural languages use the selection technique instead, with the keywords “switch”, “if” or maybe “goto”, but this technique tends to increase the cyclomatic complexity of the code.

Let’s search for complex functions in the Git source code.


Even though Git is well developed, many of its functions could be considered complex. This is due to the heavy use of control flow instructions like “if”, “switch” or “goto”. With C++, however, we can use polymorphism to reduce the complexity of the code.

Easy to understand: Polymorphism isolates a specific behavior in a class, which improves the visibility and the cohesion of the code.

Managing changes: Adding another behavior with polymorphism can imply adding another class, whereas with the selection idiom you only add another case under the switch statement.
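To make the comparison concrete, here is a minimal sketch of the same behavior written both ways. The type names are loosely inspired by Git object kinds but are hypothetical; none of this is Git’s actual code.

```cpp
#include <string>

// Selection idiom: one function, one branch per behavior.
enum class ObjectKind { Blob, Tree, Commit };

inline std::string describe_with_switch(ObjectKind kind) {
    switch (kind) {
        case ObjectKind::Blob:   return "file content";
        case ObjectKind::Tree:   return "directory listing";
        case ObjectKind::Commit: return "snapshot with metadata";
    }
    return "unknown";  // Reached only for an out-of-range value.
}

// Polymorphism: one class per behavior, selected through a virtual call.
struct Object {
    virtual ~Object() = default;
    virtual std::string describe() const = 0;
};

struct Blob : Object {
    std::string describe() const override { return "file content"; }
};

struct Tree : Object {
    std::string describe() const override { return "directory listing"; }
};

struct Commit : Object {
    std::string describe() const override { return "snapshot with metadata"; }
};
```

Adding a new kind means one more `case` in the first version, which raises the cyclomatic complexity of `describe_with_switch`, and one more class in the second, which leaves every existing function untouched.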

Inheritance vs Composition

Git essentially uses structs to define the data manipulated by its functions. Let’s search for all the structs used:

from t in Types where t.IsStructure select t


What’s interesting is that almost all data is isolated inside structs. To verify that, we can search for all non-const public variables that are primitives and not inside a struct:

from f in Fields where f.IsPublic && f.IsPrimitiveType
&& !f.IsStatic && !f.IsConst
select f

Only a few variables are concerned, which is a good point for Git’s design.

So what about extending a struct? In C we can use composition, as in the case of the “remote” struct, which many other structs reference.

In C++, however, we can also use inheritance to extend structs. For example, the known_remote struct could inherit from the remote one.

Easy to understand: Inheritance can improve the understanding of the data, but we have to be careful with it: it should be used only for the “is a” relation.

Managing changes: Inheritance implies high coupling, so any change can impact a lot of code.
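The two options could be sketched as follows. The `remote` struct here is a simplified stand-in with assumed fields, and the `_composed` and `_derived` suffixes are added only to show both variants side by side; Git’s real definitions differ.

```cpp
#include <string>

// Simplified stand-in for Git's "remote" struct; the fields are assumptions.
struct remote {
    std::string name;
    std::string url;
};

// Composition (the C way): the extending struct refers to a remote.
struct known_remote_composed {
    const remote* base;  // "has a" remote
    bool reachable;
};

// Inheritance (the C++ option): the extending struct "is a" remote.
struct known_remote_derived : remote {
    bool reachable = false;
};

// Hypothetical helper that works on any remote.
inline std::string label(const remote& r) {
    return r.name + " <" + r.url + ">";
}
```

With composition the caller writes `label(*k.base)`, while the derived struct can be passed to `label` directly. That convenience is also the coupling: every change to `remote` is a change to `known_remote_derived`.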

Conclusion:

C++ offers better possibilities for writing beautiful, well-structured code, but this comes at a price: any change or refactoring can be difficult.

However, refactoring requires understanding the existing code before changing it. C programs are more difficult to understand but easy to change, whereas a C++ project can be more structured than a C one but needs more effort when making changes.

How can we limit the impact of changes in C++?

A good way to limit the impact of changes when choosing an OOP approach is to use design patterns, especially those built on low coupling and high cohesion, so that changes stay isolated in one specific place.

However, the best approach is to adopt generic programming and modern C++ practices. Generic programming is more flexible than OOP and does a better job of limiting the impact of C++ code changes.
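As a rough illustration of why generic code can be less coupled, the sketch below uses a function template that accepts any type providing a `describe()` member. The types `Branch` and `Tag` and the function `report` are hypothetical.

```cpp
#include <string>

// Generic function: works with any type that offers describe() const.
template <typename T>
std::string report(const T& item) {
    return "[" + item.describe() + "]";
}

// Two unrelated hypothetical types; no common base class is required.
struct Branch {
    std::string describe() const { return "branch"; }
};

struct Tag {
    std::string describe() const { return "tag"; }
};
```

Neither type inherits from anything, so adding, removing, or moving one of them does not ripple through a class hierarchy. The only contract is the `describe()` member that `report` expects.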