Unlocking the Power of Code Metrics with CppDepend
- Introduction
- Technical Debt Metrics
- Code Metrics Visualization
- Metrics on Application
- Metrics on Projects
- Metrics on Namespaces
- Metrics on Types
- Metrics on Methods
- Metrics on fields
Introduction
To learn the metrics, you can print the Placemat Visualization Expert by Stuart Celarier, MVP (Corillian), or the Metrics Cheat Sheet by Frank-Leonardo Quednau.
- 8 metrics on Application: NbLinesOfCode, NbLinesOfComment, PercentageComment, NbProjects, NbNamespaces, NbTypes, NbMethods, NbFields
- 14 metrics on Projects: NbLinesOfCode, NbLinesOfComment, PercentageComment, NbNamespaces, NbTypes, NbMethods, NbFields, Project level, Afferent coupling (Ca), Efferent coupling (Ce), Relational Cohesion (H), Instability (I), Abstractness (A), Distance from main sequence (D)
- 9 metrics on Namespaces: NbLinesOfCode, NbLinesOfComment, PercentageComment, NbTypes, NbMethods, NbFields, Namespace level, Afferent coupling at namespace level (NamespaceCa), Efferent coupling at namespace level (NamespaceCe)
- 16 metrics on Types: NbLinesOfCode, NbLinesOfComment, PercentageComment, NbMethods, NbFields, Type level, Type rank, Afferent coupling at type level (TypeCa), Efferent coupling at type level (TypeCe), Lack of Cohesion Of Methods (LCOM), Lack of Cohesion Of Methods Henderson-Sellers (LCOM HS), Code Source Cyclomatic Complexity, Size of instance, Association Between Class (ABC), Number of Children (NOC), Depth of Inheritance Tree (DIT)
- 11 metrics on Methods: NbLinesOfCode, NbLinesOfComment, PercentageComment, Method level, Method rank, Afferent coupling at method level (MethodCa), Efferent coupling at method level (MethodCe), Code Source Cyclomatic Complexity, NbParameters, NbVariables, NbOverloads
- 2 metrics on Fields: Size of instance, Afferent coupling at field level (FieldCa)
- 11 Halstead Metrics: Total Operators, Total Operands, Distinct Operators, Distinct Operands, Halstead Program Length, Halstead Program Volume, Halstead Program Level, Halstead Program Difficulty, Halstead Programming Effort, Halstead Programming Time, Halstead Intelligent Content
Technical Debt Metrics
Since version 2017.1.0 CppDepend offers smart technical-debt estimation of a code base.
In short, each CppDepend code rule produces issues, and for each issue, customizable C# formulas estimate the cost to fix it in terms of person-time.
This cost-to-fix can be seen as a debt the team owes: as long as the issue is not fixed, the debt is not repaid, and it accrues interest in the form of development friction. The technical debt of the code base is the sum of all these debt estimations.
The technical debt can be seen as the mother of all code metrics.
- Every other code metric (lines of code, complexity, code coverage, coupling, ...) can be harnessed through code rules with thresholds. The rules create issues when code metric thresholds are violated.
- For each issue, we can estimate the cost to fix in terms of person-time.
- For each issue, we can also estimate the severity in terms of person-time per year consumed by the consequences of leaving the issue unfixed.
Below you'll find the technical details for each code metric supported by CppDepend. The technical debt estimation has its own documentation page.
Code Metrics Visualization
CppDepend comes with a dashboard to quickly visualize all application metrics. The dashboard is available both in the Visual Studio extension and in the report.
For each metric, the dashboard shows the diff since the baseline. It also shows whether the metric value got better (in green) or worse (in red).
Each value is clickable to drill down. For example, clicking the number of types lists all the types of the code base.
CppDepend also offers some special metric visualization through a colored treemap. Such visualization is especially useful to browse code coverage by tests of methods or classes of your code base.
Metrics on Application
NbLinesOfCode
Notice that the LOC for a type is the sum of its methods' LOC, the LOC for a namespace is the sum of its types' LOC, the LOC for a project is the sum of its namespaces' LOC, and the LOC for an application is the sum of its projects' LOC. Here are some observations:
- Abstract methods and enumerations have a LOC equal to 0. Only concrete code that is effectively executed is considered when computing LOC.
- Namespace, type, field and method declarations are not counted as lines of code because they don't have corresponding sequence points.
Recommendations:
Methods where NbLinesOfCode is higher than 20 are hard to understand and maintain.
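As a rough illustration of how LOC rolls up through the hierarchy, here is a minimal Python sketch. The nested dictionary layout and the function names are assumptions made for this example only; CppDepend computes these values itself and does not expose this API.

```python
from typing import Dict, List

# Hypothetical layout for this sketch: method name -> LOC, grouped by type,
# then by namespace, then by project.
TypeMethods = Dict[str, int]
NamespaceTypes = Dict[str, TypeMethods]
ProjectNamespaces = Dict[str, NamespaceTypes]


def type_loc(methods: TypeMethods) -> int:
    """LOC of a type: the sum of its methods' LOC."""
    return sum(methods.values())


def namespace_loc(types: NamespaceTypes) -> int:
    """LOC of a namespace: the sum of its types' LOC."""
    return sum(type_loc(methods) for methods in types.values())


def project_loc(namespaces: ProjectNamespaces) -> int:
    """LOC of a project: the sum of its namespaces' LOC."""
    return sum(namespace_loc(types) for types in namespaces.values())


def long_methods(methods: TypeMethods, threshold: int = 20) -> List[str]:
    """Names of methods whose LOC is higher than the recommended threshold."""
    return sorted(name for name, loc in methods.items() if loc > threshold)


# Made-up sample data: the abstract method Shape.area counts for 0 LOC.
sample_project: ProjectNamespaces = {
    "geometry": {
        "Shape": {"area": 0, "describe": 4},
        "Circle": {"area": 3, "scale": 25},
    }
}
```

With this sample data, `project_loc(sample_project)` adds up the four method values, and `long_methods(sample_project["geometry"]["Circle"])` flags only `scale`.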
Related Links:
Why is it useful to count the number of Lines Of Code (LOC) ?
How do you count your number of Lines Of Code (LOC) ?
NbLinesOfComment
Defined for application, projects, namespaces, types, methods
Recommendations:
This metric is not helpful for assessing the quality of source code. We prefer to use the metric PercentageComment.
PercentageComment
Defined for application, projects, namespaces, types, methods
PercentageComment = 100*NbLinesOfComment / ( NbLinesOfComment + NbLinesOfCode)
Recommendations:
Code where the percentage of comments is lower than 20% should be commented more. However, overly commented code (>40%) is not necessarily a blessing, as it can be considered an insult to the intelligence of the reader. Guidelines about code commenting can be found here.
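The PercentageComment formula and the 20% / 40% guideline can be sketched as follows. The function names and the choice to return 0 for an empty code element are assumptions of this example, not documented CppDepend behavior.

```python
def percentage_comment(nb_lines_of_comment: int, nb_lines_of_code: int) -> float:
    """PercentageComment = 100 * NbLinesOfComment / (NbLinesOfComment + NbLinesOfCode)."""
    total = nb_lines_of_comment + nb_lines_of_code
    if total == 0:
        # Assumption: an element with no code and no comments is reported as 0%.
        return 0.0
    return 100.0 * nb_lines_of_comment / total


def comment_level(percentage: float) -> str:
    """Classify a value against the 20% and 40% guideline thresholds."""
    if percentage < 20.0:
        return "under-commented"
    if percentage > 40.0:
        return "possibly over-commented"
    return "within guideline"
```

For instance, 20 comment lines next to 80 lines of code give exactly 20%, which sits at the lower edge of the guideline range.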
NbProjects
Defined for application - The number of projects.
NbNamespaces
Defined for application and projects. The number of namespaces. The anonymous namespace counts as one. If a namespace is defined over N projects, it will count as N. Namespaces declared in framework projects are not taken into account.
NbTypes
Defined for application, projects, namespaces. The number of types. A type can be an abstract or a concrete class, a structure, an enumeration.
NbMethods
Defined for application, projects, namespaces, types. The number of methods. A method can be an abstract, virtual or non-virtual method, a method declared in an interface, a constructor, a class constructor, a finalizer, a property/indexer getter or setter, an event adder or remover.
Recommendations:
Types where NbMethods > 20 might be hard to understand and maintain but there might be cases where it is relevant to have a high value for NbMethods.
NbFields
Defined for application, projects, namespaces, types. The number of fields.
Recommendations:
Types where NbFields is higher than 20 might be hard to understand and maintain but there might be cases where it is relevant to have a high value for NbFields.
Metrics on Projects
By measuring coupling between the types of your application, CppDepend assesses the stability of each project. A project is considered stable if its types are used by many types in other projects (i.e., stable = painful to modify). If a project contains many abstract types and few concrete types, it is considered abstract. Thus, CppDepend helps you detect which projects are potentially painful to maintain (i.e., concrete and stable) and which projects are potentially useless (i.e., abstract and unstable).
Note:
This theory and these metrics were first introduced in the excellent book Agile Software Development: Principles, Patterns, and Practices in C# by Robert C. Martin (Prentice Hall PTR, 2006).
Afferent coupling (Ca)
The number of types outside this project that depend on types within this project. High afferent coupling indicates that the concerned projects have many responsibilities.
Related Link:
Code metrics on Coupling, Dead Code, Design flaws and Re-engineering.
Efferent coupling (Ce)
The number of types inside this project that depend on types outside this project. High efferent coupling indicates that the concerned project is dependent.
Related Link:
Code metrics on Coupling, Dead Code, Design flaws and Re-engineering.
Relational Cohesion (H)
Average number of internal relationships per type. Let R be the number of type relationships that are internal to this project (i.e., that do not connect to types outside the project). Let N be the number of types within the project. H = (R + 1) / N. The extra 1 in the formula prevents H = 0 when N = 1. The relational cohesion represents the relationship that this project has to all its types.
Recommendations:
As classes inside a project should be strongly related, the cohesion should be high. On the other hand, values that are too high may indicate over-coupling. A good range for RelationalCohesion is 1.5 to 4.0. Projects where RelationalCohesion < 1.5 or RelationalCohesion > 4.0 might be problematic.
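A minimal sketch of the relational cohesion formula H = (R + 1) / N and of the recommended 1.5 to 4.0 range is given below. The function names are hypothetical and chosen for this example.

```python
def relational_cohesion(internal_relationships: int, nb_types: int) -> float:
    """H = (R + 1) / N, with R internal type relationships and N types."""
    if nb_types <= 0:
        raise ValueError("a project needs at least one type to compute H")
    return (internal_relationships + 1) / nb_types


def cohesion_in_recommended_range(h: float) -> bool:
    """True when H lies in the recommended range of 1.5 to 4.0."""
    return 1.5 <= h <= 4.0
```

A project with 4 types and 11 internal relationships gets H = 12 / 4 = 3.0, which falls inside the recommended range. A project with a single type and no relationships gets H = 1.0 rather than 0, thanks to the extra 1.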
Instability (I)
The ratio of efferent coupling (Ce) to total coupling. I = Ce / (Ce + Ca). This metric is an indicator of the project's resilience to change. The range for this metric is 0 to 1, with I=0 indicating a completely stable project and I=1 indicating a completely unstable project.
Abstractness (A)
The ratio of the number of internal abstract types to the number of types. The range for this metric is 0 to 1, with A=0 indicating a completely concrete project and A=1 indicating a completely abstract project.
Distance from main sequence (D)
The perpendicular normalized distance of a project from the idealized line A + I = 1 (called the main sequence). This metric is an indicator of the project's balance between abstractness and stability. A project squarely on the main sequence is optimally balanced with respect to its abstractness and stability. Ideal projects are either completely abstract and stable (I=0, A=1) or completely concrete and unstable (I=1, A=0). The range for this metric is 0 to 1, with D=0 indicating a project that is coincident with the main sequence and D=1 indicating a project that is as far from the main sequence as possible. The picture in the report reveals whether a project is in the zone of pain (I and A both close to 0) or in the zone of uselessness (I and A both close to 1).
Recommendations:
Projects where NormDistFromMainSeq is higher than 0.7 might be problematic. However, in the real world it is very hard to avoid such projects. Therefore, you should allow a small percentage of your projects to violate this CQLinq constraint: WARN IF Percentage > 15 IN SELECT PROJECTS WHERE NormDistFromMainSeq > 0.7
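The three project metrics above can be sketched together in Python. The normalized distance is taken here as D = |A + I - 1|, which is the usual normalized form from Robert C. Martin's work and matches the 0 to 1 range stated above. The function names, the handling of a project with no coupling at all, and the sample figures are assumptions of this example.

```python
from typing import Dict


def instability(ca: int, ce: int) -> float:
    """I = Ce / (Ce + Ca)."""
    total = ca + ce
    if total == 0:
        # Assumption: a project with no coupling at all is treated as stable.
        return 0.0
    return ce / total


def abstractness(nb_abstract_types: int, nb_types: int) -> float:
    """A = number of abstract types / number of types."""
    if nb_types <= 0:
        raise ValueError("a project needs at least one type to compute A")
    return nb_abstract_types / nb_types


def norm_dist_from_main_seq(a: float, i: float) -> float:
    """D = |A + I - 1|, the normalized distance from the line A + I = 1."""
    return abs(a + i - 1.0)


def percentage_far_from_main_seq(distances: Dict[str, float],
                                 threshold: float = 0.7) -> float:
    """Percentage of projects whose D is higher than the threshold."""
    if not distances:
        return 0.0
    far = sum(1 for d in distances.values() if d > threshold)
    return 100.0 * far / len(distances)
```

For a project with Ca = 3, Ce = 1 and one abstract type out of four, I = 0.25, A = 0.25 and D = 0.5. A result of `percentage_far_from_main_seq` above 15 corresponds to the warning condition in the constraint quoted above.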
Metrics on Namespaces
Afferent coupling at namespace level (NamespaceCa)
The Afferent Coupling for a particular namespace is the number of namespaces that depend directly on it.
Related Link:
Code metrics on Coupling, Dead Code, Design flaws and Re-engineering.
Efferent coupling at namespace level (NamespaceCe)
The Efferent Coupling for a particular namespace is the number of namespaces it directly depends on. Notice that namespaces declared in framework projects are taken into account.
Related Link:
Code metrics on Coupling, Dead Code, Design flaws and Re-engineering.
Level
Defined for projects, namespaces, types, methods. The Level value for a namespace is defined as follows:
- Level = 0 : if the namespace doesn't use any other namespace.
- Level = 1 : if the namespace directly uses only namespaces defined in third-party projects.
- Level = 1 + (Max Level over the namespaces it uses directly)
- Level = N/A : if the namespace is involved in a dependency cycle or uses directly or indirectly a namespace involved in a dependency cycle.
This metric was first defined by John Lakos in his book Large-Scale C++ Software Design.
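The Level rules above can be sketched as a depth-first traversal of the namespace dependency graph. This is an illustrative sketch, not CppDepend's implementation: the function name and the dictionary-based graph are assumptions, third-party namespaces are treated as level 0 so that a namespace using only them gets level 1, and `None` stands for N/A.

```python
from typing import Dict, List, Optional, Set


def compute_levels(uses: Dict[str, List[str]],
                   third_party: Set[str]) -> Dict[str, Optional[int]]:
    """Map each application namespace to its Level, or None for N/A."""
    levels: Dict[str, Optional[int]] = {}
    in_progress: Set[str] = set()

    def visit(ns: str) -> Optional[int]:
        if ns in third_party:
            return 0  # assumption: third-party namespaces sit at the bottom
        if ns in levels:
            return levels[ns]
        if ns in in_progress:
            return None  # back edge: ns is part of a dependency cycle
        in_progress.add(ns)
        dep_levels = [visit(dep) for dep in uses.get(ns, [])]
        in_progress.discard(ns)
        if any(level is None for level in dep_levels):
            result: Optional[int] = None  # in a cycle, or depends on one
        elif not dep_levels:
            result = 0
        else:
            result = 1 + max(dep_levels)
        levels[ns] = result
        return result

    for namespace in uses:
        visit(namespace)
    return levels


# Made-up dependency graph: "a" and "b" form a cycle, "c" depends on it.
sample_uses = {
    "app": ["core", "util"],
    "core": ["util"],
    "util": ["std"],
    "lonely": [],
    "a": ["b"],
    "b": ["a"],
    "c": ["a"],
}
sample_levels = compute_levels(sample_uses, third_party={"std"})
```

In this sample, `util` only uses a third-party namespace and gets level 1, `core` gets 2, `app` gets 3, `lonely` gets 0, and `a`, `b` and `c` all come out as `None`.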
Recommendations:
This metric helps objectively classify the projects, namespaces, types and methods as high level, mid level or low level. There is no particular recommendation for high or small values. This metric is also useful to discover dependency cycles in your application. For instance, if some namespaces are matched by the following CQLinq query, it means that there are some dependency cycles between the namespaces of your application: SELECT NAMESPACES WHERE !HasLevel AND !IsInTierProject.
Related Link:
Layering, the Level metric and the Discourse of Method
Metrics on Types
Type rank
TypeRank values are computed by applying the Google PageRank algorithm on the graph of types' dependencies. A homothety of center 0.15 is applied to make it so that the average of TypeRank is 1.
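To give a feel for the idea, here is a small PageRank-style sketch over a type dependency graph. It is not CppDepend's implementation. The damping factor of 0.85, the fixed iteration count, the treatment of types without dependencies, and the final rescaling (multiplying by the number of types so that the average is 1) are all assumptions of this example; the text above does not spell out how the homothety is applied.

```python
from typing import Dict, List


def type_rank(depends_on: Dict[str, List[str]],
              damping: float = 0.85,
              iterations: int = 100) -> Dict[str, float]:
    """PageRank over 'type -> types it depends on', rescaled to average 1."""
    nodes = sorted(set(depends_on) |
                   {dep for deps in depends_on.values() for dep in deps})
    n = len(nodes)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = depends_on.get(node, [])
            if targets:
                # A type passes its rank on to the types it depends on.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A type with no dependencies spreads its rank evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    # Ranks sum to 1, so multiplying by n makes the average equal to 1.
    return {node: value * n for node, value in rank.items()}


# Made-up graph: every other type depends on Util.
sample_ranks = type_rank({"A": ["Util"], "B": ["Util"],
                          "C": ["Util", "A"], "Util": []})
```

In this sample, `Util` is used by every other type and ends up with the highest rank, which is the kind of type the recommendation below suggests testing most carefully.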
Recommendations:
Types with high TypeRank should be more carefully tested because bugs in such types will likely be more catastrophic.
Related Link:
Code metrics on Coupling, Dead Code, Design flaws and Re-engineering.
Afferent Coupling at type level (TypeCa)
The Afferent Coupling for a particular type is the number of types that depend directly on it.
Related Link:
Code metrics on Coupling, Dead Code, Design flaws and Re-engineering.
Efferent Coupling at type level (TypeCe)
The Efferent Coupling for a particular type is the number of types it directly depends on. Notice that types declared in framework projects are taken into account.
Recommendations:
Types where TypeCe > 50 are types that depend on too many other types. They are complex and have more than one responsibility. They are good candidates for refactoring.
Related Link:
Code metrics on Coupling, Dead Code, Design flaws and Re-engineering.
Lack of Cohesion Of Methods (LCOM)
The single responsibility principle states that a class should not have more than one reason to change. Such a class is said to be cohesive. A high LCOM value generally pinpoints a poorly cohesive class. There are several LCOM metrics. The LCOM takes its values in the range [0-1]. The LCOM HS (HS stands for Henderson-Sellers) takes its values in the range [0-2]. An LCOM HS value higher than 1 should be considered alarming. Here are the algorithms used by CppDepend to compute LCOM metrics:
- LCOM = 1 - sum(MF) / (M*F)
- LCOM HS = (M - sum(MF)/F) / (M - 1)
Where:
- M is the number of methods in class (both static and instance methods are counted, it includes also constructors, properties getters/setters, events add/remove methods).
- F is the number of instance fields in the class.
- MF is the number of methods of the class accessing a particular instance field.
- Sum(MF) is the sum of MF over all instance fields of the class.
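The two formulas can be sketched in Python as follows, taking LCOM = 1 - sum(MF) / (M*F) and LCOM HS = (M - sum(MF)/F) / (M - 1). The function names, the input shape (each instance field mapped to the set of methods accessing it), and the choice to return 0 for degenerate classes are assumptions of this example.

```python
from typing import Dict, Set


def lcom(field_accessors: Dict[str, Set[str]], nb_methods: int) -> float:
    """LCOM = 1 - sum(MF) / (M * F)."""
    nb_fields = len(field_accessors)
    if nb_methods == 0 or nb_fields == 0:
        return 0.0  # assumption: no methods or no fields means nothing to measure
    sum_mf = sum(len(methods) for methods in field_accessors.values())
    return 1.0 - sum_mf / (nb_methods * nb_fields)


def lcom_hs(field_accessors: Dict[str, Set[str]], nb_methods: int) -> float:
    """LCOM HS = (M - sum(MF) / F) / (M - 1)."""
    nb_fields = len(field_accessors)
    if nb_methods <= 1 or nb_fields == 0:
        return 0.0  # assumption: avoids dividing by zero for tiny classes
    sum_mf = sum(len(methods) for methods in field_accessors.values())
    return (nb_methods - sum_mf / nb_fields) / (nb_methods - 1)


# Made-up class with 4 methods and 2 fields, each field used by 2 methods.
sample_accessors = {
    "x": {"get_x", "set_x"},
    "y": {"get_y", "set_y"},
}
```

For the sample class, sum(MF) = 4, so LCOM = 1 - 4/8 = 0.5 and LCOM HS = (4 - 2) / 3, roughly 0.67. A class where every method accesses every field gives 0 for both metrics.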
Recommendations:
Types where LCOM > 0.8 and NbFields > 10 and NbMethods > 10 might be problematic. However, it is very hard to avoid such non-cohesive types. Types where LCOMHS > 1.0 and NbFields > 10 and NbMethods > 10 should be avoided. Note that this constraint is stronger (and thus easier to satisfy) than the constraint on types where LCOM > 0.8 and NbFields > 10 and NbMethods > 10.
Cyclomatic Complexity (CC)
Defined for types, methods. Cyclomatic complexity is a popular procedural software metric equal to the number of decisions that can be taken in a procedure. Concretely, in C++ the CC of a method is 1 + {the number of following expressions found in the body of the method}:
if | while | for | case | default | continue | goto | && | || | catch | ternary operator ?: | ??
The following expressions are not counted for CC computation:
else | do | switch | try | using | throw | finally | return | object creation | method call | field access
Adapted to the OO world, this metric is defined both on methods and on classes/structures (as the sum of their methods' CC). Notice that the CC of an anonymous method is not counted when computing the CC of its outer method.
Recommendations:
Methods where CC is higher than 15 are hard to understand and maintain. Methods where CC is higher than 30 are extremely complex and should be split into smaller methods (except if they are automatically generated by a tool).
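The counting rule (1 plus the number of decision expressions) can be approximated with a naive token counter. This is only a rough sketch: CppDepend works on parsed code, whereas this example scans raw text with a regular expression and would miscount keywords inside comments or string literals. The function name and the sample body are assumptions, and the `??` token from the list above is left out because it is not a C++ operator.

```python
import re

# Keywords and operators that each add 1 to the cyclomatic complexity.
_DECISION_KEYWORDS = ("if", "while", "for", "case", "default",
                      "continue", "goto", "catch")
_DECISION_PATTERN = re.compile(
    r"\b(?:" + "|".join(_DECISION_KEYWORDS) + r")\b"  # whole words only
    r"|&&|\|\||\?"                                     # &&, ||, ternary ?
)


def naive_cyclomatic_complexity(method_body: str) -> int:
    """CC = 1 + number of decision tokens found in the method body."""
    return 1 + len(_DECISION_PATTERN.findall(method_body))


# Made-up C++ method used as input text.
SAMPLE_BODY = r"""
int classify(int x, bool strict) {
    if (x < 0 && strict) {
        return -1;
    }
    for (int i = 0; i < x; ++i) {
        if (i % 2 == 0 || i > 10) {
            continue;
        }
    }
    return x > 100 ? 2 : 1;
}
"""
```

The sample contains seven decision tokens (`if`, `&&`, `for`, `if`, `||`, `continue`, `?`), so the sketch reports a CC of 8. The word-boundary match keeps the `if` inside the name `classify` from being counted.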
Size of instance
Defined for instance fields and types. The size of instances of an instance field is defined as the size, in bytes, of instances of its type. The size of instance of a static field is equal to 0.
The size of instances of a class or a structure is defined as the sum of size of instances of its fields plus the size of instances of its base class.
Fields of reference types (class, interface, delegate…) always count for 4 bytes while the footprint of fields of value types (structure, int, byte, double…) might vary.
Size of instances of an enumeration is equal to the size of instances of the underlying numeric primitive type. It is computed from the value__ instance field (all enumerations have such a field when compiled in IL).
Size of instances of generic types might be erroneous because we can’t statically know the footprint of parameter types (except when they have the class constraint).
Recommendations:
Types where SizeOfInst is higher than 64 bytes might degrade performance (depending on the number of instances created at runtime) and might be hard to maintain. However, it is not a rule since sometimes there is no alternative (the size of instances of the System.Net.NetworkInformation.SystemIcmpV6Statistics framework class is 2064 bytes).
Non-static and non-generic types where SizeOfInst is equal to 0 indicate stateless types that might eventually be turned into static classes.
Association Between Class (ABC)
The Association Between Classes metric for a particular class or structure is the number of members of other types it directly uses in the body of its methods.
Number of Children (NOC)
The number of children for a class is the number of sub-classes (whatever their positions in the sub-branch of the inheritance tree). The number of children for an interface is the number of types that implement it. In both cases, the computation of this metric only counts types declared in the application code and thus doesn't take into account types declared in third-party projects.
Depth of Inheritance Tree (DIT)
The Depth of Inheritance Tree for a class or a structure is its number of base classes (including the System.Object class thus DIT >= 1).
Recommendations:
Types where DepthOfInheritance is higher than 6 might be hard to maintain. However, it is not a rule since sometimes your classes might inherit from third-party classes which have a high value for depth of inheritance. For example, the average depth of inheritance for framework classes which derive from System.Windows.Forms.Control is 5.3.
Metrics on Methods
Method rank
MethodRank values are computed by applying the Google PageRank algorithm on the graph of methods' dependencies. A homothety of center 0.15 is applied to make it so that the average of MethodRank is 1.
Recommendations:
Methods with high MethodRank should be more carefully tested because bugs in such methods will likely be more catastrophic.
Related Link:
Code metrics on Coupling, Dead Code, Design flaws and Re-engineering.
Afferent coupling at method level (MethodCa)
The Afferent Coupling for a particular method is the number of methods that depend directly on it.
Related Link:
Code metrics on Coupling, Dead Code, Design flaws and Re-engineering.
Efferent coupling at method level (MethodCe)
The Efferent Coupling for a particular method is the number of methods it directly depends on. Notice that methods declared in framework projects are taken into account.
Related Link:
Code metrics on Coupling, Dead Code, Design flaws and Re-engineering.
NbParameters
The number of parameters of a method. Ref and Out are also counted. The this reference passed to instance methods in IL is not counted as a parameter.
Recommendations:
Methods where NbParameters is higher than 5 might be painful to call and might degrade performance. You should prefer adding properties/fields to the declaring type to handle numerous states. Another alternative is to provide a class or structure dedicated to argument passing (for example, see the class System.Diagnostics.ProcessStartInfo and the method System.Diagnostics.Process.Start(ProcessStartInfo)).
NbVariables
The number of variables declared in the body of a method.
Recommendations:
Methods where NbVariables is higher than 8 are hard to understand and maintain. Methods where NbVariables is higher than 15 are extremely complex and should be split into smaller methods (except if they are automatically generated by a tool).
NbOverloads
The number of overloads of a method. If a method is not overloaded, its NbOverloads value is equal to 1. This metric is also applicable to constructors.
Recommendations:
Methods where NbOverloads is higher than 6 might be a problem to maintain and provoke higher coupling than necessary. This might also reveal a potential misuse of the C# and VB.NET languages, which have supported object initialization since C# 3 and VB 9. This feature helps reduce the number of constructors of a class.
Metrics on fields
Afferent coupling at field level (FieldCa)
The Afferent Coupling for a particular field is the number of methods that directly use it.
Related Link:
Code metrics on Coupling, Dead Code, Design flaws and Re-engineering.
Halstead Metrics
CppDepend computes various Halstead metrics, as defined by Maurice H. Halstead in Elements of Software Science (1977). Halstead metrics are based on definitions of operators and operands. These are language dependent; CppDepend uses the following definitions when computing Halstead metrics for C and C++.
- operators: arithmetic('+'), equality/inequality('<'), assignment('+='), shift ('>>'), logical ('&&'), and unary ('*') operators. Reserved words for specifying control points ('while') and control infrastructure ('else'), type ('double'), and storage ('extern'). Function calls, array references, etc.
- operands: identifiers, literals, labels, and function names. Each literal is treated as a distinct operand.
CppDepend ships with an assortment of metrics at function, file, and analysis granularity.
Underlying the various Halstead measures are counts of operators and operands.
Total Operators
Defined for types and methods.
N1: The total number of operators present.
Total Operands
Defined for types and methods.
N2: The total number of operands present.
Distinct Operators
Defined for types and methods.
n1: The number of distinct operators present.
Distinct Operands
Defined for types and methods.
n2: The number of distinct operands present. As noted above, every constant is treated as distinct.
Halstead Program Length
Defined for types and methods.
N = N1 + N2
Halstead Program Length describes the size of the abstracted program obtained by removing everything except operators and operands from the original program. It has some similarities with the Lines with Code metric, but also some important differences. A function's Lines with Code measure can be altered simply by adding or removing line breaks; Halstead Length is not sensitive to this manipulation. Similarly, Lines with Code does not tell us anything about how complex the lines are: a line containing an extremely complex expression is not treated any differently to one that is very simple. Halstead Length gives a better accounting of the overall statement complexity.
Halstead Program Volume
Defined for types and methods.
V = N * log2(n1 + n2)
This models the number of bits required to store the abstracted program of length N if the operators and operands are encoded as binary strings of uniform (and potentially nonintegral) length.
Halstead Program Level
Defined for types and methods.
L = (2/n1)*(n2/N2)
L describes the ratio between the volume V of the current program and the volume V* of the "most compact" implementation of the same algorithm (as defined by Halstead). A longer implementation of an algorithm will have a lower program level than a shorter implementation of the same algorithm.
Halstead Program Difficulty
Defined for types and methods.
D = (n1/2) * (N2/n2) = 1/L
Difficulty is the inverse of Level: a longer implementation of an algorithm will have a higher difficulty than a shorter implementation of the same algorithm. Difficulty increases as the number of unique operators increases, and as the average number of occurrences per operand increases.
Halstead Programming Effort
Defined for types and methods.
E = D * V
Halstead's formulation for the effort required to author (or understand) a program characterizes effort as proportional to both difficulty and volume.
Halstead Programming Time
Defined for types and methods.
T = E/18 seconds
Programming time is considered to be directly proportional to programming effort.
Halstead Intelligent Content
Defined for types and methods.
I = (V / D)
Halstead intended this metric to be a language-independent measure of algorithmic complexity.
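The derived Halstead measures follow directly from the four basic counts, as the sketch below shows. The class and function names are assumptions of this example, and the input counts are made up; how operators and operands are actually tokenized is left to CppDepend.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HalsteadMetrics:
    length: float               # N = N1 + N2
    volume: float               # V = N * log2(n1 + n2)
    level: float                # L = (2 / n1) * (n2 / N2)
    difficulty: float           # D = (n1 / 2) * (N2 / n2) = 1 / L
    effort: float               # E = D * V
    time_seconds: float         # T = E / 18
    intelligent_content: float  # I = V / D


def halstead(total_operators: int, total_operands: int,
             distinct_operators: int, distinct_operands: int) -> HalsteadMetrics:
    """Compute the derived measures from the counts N1, N2, n1 and n2."""
    if min(total_operands, distinct_operators, distinct_operands) <= 0:
        raise ValueError("operand and distinct counts must be positive")
    length = total_operators + total_operands
    volume = length * math.log2(distinct_operators + distinct_operands)
    level = (2.0 / distinct_operators) * (distinct_operands / total_operands)
    difficulty = (distinct_operators / 2.0) * (total_operands / distinct_operands)
    effort = difficulty * volume
    return HalsteadMetrics(
        length=length,
        volume=volume,
        level=level,
        difficulty=difficulty,
        effort=effort,
        time_seconds=effort / 18.0,
        intelligent_content=volume / difficulty,
    )


# Made-up counts: N1 = 10, N2 = 6, n1 = 4, n2 = 4.
sample_halstead = halstead(10, 6, 4, 4)
```

With these counts, N = 16 and V = 16 * log2(8) = 48. The difficulty is (4/2) * (6/4) = 3, so the effort is 144, the estimated time is 8 seconds, and the intelligent content is 16.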