Tracking C++ Code Smells: Database Approach

Static analysis is not only about finding bugs directly; it is also about finding bug-prone situations that make code harder to understand and maintain. It can assess many other properties of the code:

  • Code metrics: for example, methods with too many loops, if, else, switch, case… end up being hard to understand, hence hard to maintain. Counting these through the Cyclomatic Complexity metric is a great way to assess when a method becomes too complex.
  • Dependencies: if the classes of your program are entangled, the effects of any change in the code become unpredictable. Static analysis can help assess when classes and components are entangled.
  • Immutability: types that are used concurrently by several threads should be immutable; otherwise you’ll have to protect state read/write access with complex lock strategies that end up being unmaintainable. Static analysis can make sure that some classes remain immutable.
  • Dead code: dead code is code that can be removed safely because it is no longer invoked at runtime. Not only can it be removed, it must be removed, because this extra code adds unnecessary complexity to the program. Static analysis can find most of the dead code in your program (though not all).
  • API breaking changes: if you expose an API to your clients, it is very easy to remove a public member without noticing, thus breaking your clients’ code. Static analysis can compare two states of a program and warn about this pitfall.
  • API usage: some APIs are intended to be used carefully. For example, a class that holds disposable fields should generally be disposable itself, except when the lifetime of the disposable field is not aligned with the lifetime of the class instances, which then sounds like a design problem.
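
To make the code metrics item concrete, here is a minimal C++ sketch of how a cyclomatic complexity figure might be approximated. It is a crude token scan and not how a real analyzer works (real tools walk the AST). The function names are hypothetical, and the sketch assumes the method body contains no comments or string literals that mention the counted tokens.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if `c` can be part of an identifier or keyword.
bool is_word_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Rough cyclomatic complexity of one method body: 1 + number of decision points.
// Assumption: no comments or string literals contain the counted tokens.
int approx_cyclomatic_complexity(const std::string& body) {
    static const std::vector<std::string> keywords = {"if", "for", "while", "case", "catch"};

    int complexity = 1;  // a method without branches has exactly one path
    std::size_t i = 0;
    while (i < body.size()) {
        if (is_word_char(body[i])) {
            // Read a whole word so that "iffy" is not mistaken for "if".
            const std::size_t start = i;
            while (i < body.size() && is_word_char(body[i])) ++i;
            const std::string word = body.substr(start, i - start);
            if (std::find(keywords.begin(), keywords.end(), word) != keywords.end()) {
                ++complexity;
            }
        } else if (body.compare(i, 2, "&&") == 0 || body.compare(i, 2, "||") == 0) {
            ++complexity;  // each short-circuit operator adds a path
            i += 2;
        } else {
            if (body[i] == '?') ++complexity;  // ternary operator
            ++i;
        }
    }
    assert(complexity >= 1);  // invariant: there is always at least one path
    return complexity;
}
```

Under these assumptions, a body such as "if (a && b) { return 1; } else if (c) { return 2; } return 0;" scores 4: one base path, two if keywords and one && operator.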

A code smell can also be considered a bug-prone situation. Here’s its definition from Wikipedia:

In computer programming, code smell, (or bad smell) is any symptom in the source code of a program that possibly indicates a deeper problem. According to Martin Fowler, "a code smell is a surface indication that usually corresponds to a deeper problem in the system". Another way to look at smells is with respect to principles and quality: "smells are certain structures in the code that indicate violation of fundamental design principles and negatively impact design quality". Code smells are usually not bugs—they are not technically incorrect and do not currently prevent the program from functioning. Instead, they indicate weaknesses in design that may be slowing down development or increasing the risk of bugs or failures in the future. Bad code smells can be an indicator of factors that contribute to technical debt. Robert C. Martin calls a list of code smells a "value system" for software craftsmanship.

Many interesting tools exist to detect bugs in a C++ code base, such as Cppcheck, clang-tidy and the Visual Studio analyzer. But what about detecting bug-prone situations?

The creators of static analysis tools can decide which situations count as bugs, but that is not the case for code smells, which depend on each development team’s choices. For example, one team might consider a method with more than 20 lines too complex, while another might set the limit at 30. If a tool detects code smells, it must also make that detection customizable.
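
The idea of a team-defined limit can be sketched in C++ as a rule whose threshold is a parameter rather than a constant baked into the tool. This is only an illustration: MethodInfo, MaxLinesRule and find_long_methods are hypothetical names, not part of any real analyzer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one analyzed method; the fields are illustrative.
struct MethodInfo {
    std::string name;
    int line_count;
};

// A rule whose limit is chosen by the team, not hard-coded by the tool.
struct MaxLinesRule {
    int max_lines;

    bool is_violated_by(const MethodInfo& m) const { return m.line_count > max_lines; }
};

// Returns the names of the methods that violate the given rule.
std::vector<std::string> find_long_methods(const std::vector<MethodInfo>& methods,
                                           const MaxLinesRule& rule) {
    assert(rule.max_lines > 0 && "a line limit must be positive");
    std::vector<std::string> offenders;
    for (const MethodInfo& m : methods) {
        if (rule.is_violated_by(m)) offenders.push_back(m.name);
    }
    return offenders;
}
```

With the same list of methods, a team using MaxLinesRule{20} and a team using MaxLinesRule{30} get different sets of offenders, which is exactly the kind of customization a code smell detector needs.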

Code as Data is a better way to detect code smells

Static analysis is the idea of analyzing source code for various properties and reporting on those properties, but it’s also, philosophically, the idea of treating code as data. This is deeply weird to us as application developers, since we’re very much used to thinking of source code as instructions, procedures, and algorithms. But it’s also deeply powerful.

After analyzing a source file, we can extract its AST and generate a model containing a lot of interesting data about the code. We can then query this model with a code query language similar to SQL.
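
As a rough illustration of treating code as data, the C++ sketch below stores one row per method and runs a filter-and-sort over those rows, much like a SQL SELECT with WHERE and ORDER BY. The MethodRow fields and the most_complex_methods function are assumptions made for this example; a real tool would fill such a model from the AST and expose a far richer schema.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical row of the code model; a real tool would fill it from the AST.
struct MethodRow {
    std::string full_name;
    int cyclomatic_complexity;
    int line_count;
};

// Roughly: SELECT * FROM methods
//          WHERE cyclomatic_complexity > min_complexity
//          ORDER BY cyclomatic_complexity DESC
std::vector<MethodRow> most_complex_methods(const std::vector<MethodRow>& methods,
                                            int min_complexity) {
    assert(min_complexity >= 0);
    std::vector<MethodRow> result;
    // WHERE clause: keep only the rows above the threshold.
    std::copy_if(methods.begin(), methods.end(), std::back_inserter(result),
                 [min_complexity](const MethodRow& m) {
                     return m.cyclomatic_complexity > min_complexity;
                 });
    // ORDER BY clause: most complex first.
    std::sort(result.begin(), result.end(), [](const MethodRow& a, const MethodRow& b) {
        return a.cyclomatic_complexity > b.cyclomatic_complexity;
    });
    return result;
}
```

Once the model exists, each new question about the code becomes another small query over the same rows instead of a new analysis pass.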

CppDepend provides a code query language named CQLinq to query the code base like a database. Developers, designers and architects can define their own custom queries to easily find bug-prone situations.

With CQLinq we can combine data from code metrics, dependencies, API usage and other parts of the model to define very advanced queries that match specific bug-prone situations.
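
To show what combining metrics with dependencies could look like, here is a small C++ sketch, not CQLinq itself. It joins a hypothetical complexity table with a hypothetical callers table to flag methods that are both complex and widely used. All names and both table layouts are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: one metric table plus one dependency table.
using ComplexityTable = std::map<std::string, int>;                    // method -> cyclomatic complexity
using CallersTable = std::map<std::string, std::vector<std::string>>;  // method -> methods calling it

// Methods that are both complex and widely used: changing them is risky.
std::vector<std::string> risky_methods(const ComplexityTable& complexity,
                                       const CallersTable& callers,
                                       int min_complexity,
                                       std::size_t min_callers) {
    assert(min_complexity >= 0);
    std::vector<std::string> result;
    for (const auto& entry : complexity) {
        if (entry.second <= min_complexity) continue;  // metric condition
        const auto it = callers.find(entry.first);
        const std::size_t caller_count = (it == callers.end()) ? 0 : it->second.size();
        if (caller_count >= min_callers) result.push_back(entry.first);  // dependency condition
    }
    return result;  // ordered by method name, because std::map iterates in key order
}
```

Neither table alone identifies these methods; the value comes from querying both at once, and the two thresholds stay under the team’s control.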

Here’s an example of a CQLinq query that matches the most complex methods:

[Figure: CQLinq query matching the most complex methods]

It’s better to combine several tools to detect problems in your C++ code base: some tools detect bugs, while others also detect bug-prone situations. With CppDepend we try to bring many tools together: we provide an easy way to define your own queries, and we can also import the results of other static analysis tools so you can query them with CQLinq.