OpenCV: The art of using the KISS and YAGNI principles.

As programmers, we’re often tempted to leverage design patterns, language idioms, advanced language features, and well-known libraries, which is certainly advisable. However, it’s essential to put on the KISS/YAGNI glasses before diving into these techniques 🙂

“KISS” stands for “Keep It Simple, Stupid”. It’s a design principle that suggests simplicity should be a key goal in design and unnecessary complexity should be avoided. The idea is that simple solutions are easier to understand, maintain, and troubleshoot. The KISS principle is widely applied in various fields, including engineering, software development, user interface design, and project management.

YAGNI stands for “You Ain’t Gonna Need It.” It’s a principle in software development and agile methodologies that suggests developers should not add functionality or features to their codebase until those features are actually needed to solve a specific problem or fulfill a requirement.

The YAGNI principle is based on the idea that adding features or functionality prematurely can cause real problems: wasted effort on code that may never be used, a larger codebase to maintain, and more places for bugs to hide.

OpenCV (Open Source Computer Vision) is a cross-platform library of programming functions mainly aimed at real-time computer vision and image processing. It was originally developed by Intel’s Russian research center in Nizhny Novgorod.

OpenCV is a vast project encompassing numerous intricate features. Nonetheless, the developers of OpenCV have embraced fundamental principles that have simplified its comprehension and maintenance significantly.

Let’s discover some OpenCV design choices:

Modularity

1- Library-based architecture

A library-based architecture makes functionality easier to reuse and to integrate into other projects. It also encourages clean APIs and separation of concerns, which makes the code easier to understand, since developers only have to grasp small pieces of the big picture.

OpenCV adopts this approach and defines many libraries. Each one has a specific responsibility, and all of them use the opencv_core library.

[Image: opencv1]

2- Modularize by namespaces

OpenCV employs namespaces extensively to organize its codebase effectively. Here, we present some examples of namespaces utilized within the opencv_core project:

[Image: opencv2]

OpenCV uses the “Namespace-by-feature” approach. Namespace-by-feature uses namespaces to reflect the feature set. It places all items related to a single feature (and only that feature) into a single namespace. This results in namespaces with high cohesion and high modularity, and with minimal coupling between namespaces. Items that work closely together are placed next to each other.

In OpenCV, namespaces are used for three main reasons:

  • Modularize the libraries.
  • Hide details, as with the “cv::detail” namespace. This signals to library users that the types inside this namespace are for internal use only and should not be used directly. In C#, the “internal” keyword does the job, but C++ has no equivalent keyword for hiding public types from the library user.
  • Anonymous namespaces: a namespace with no name. It avoids declaring global static variables, because everything declared inside an anonymous namespace is accessible only within the file that defines it.
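The three uses above can be sketched in a few lines. This is only an illustration: the “imglib” library, its “filter” and “detail” namespaces, and the brighten() function are hypothetical names, not taken from OpenCV.

```cpp
// Hypothetical library "imglib"; none of these names come from OpenCV.
namespace imglib {

// By convention, "detail" marks helpers that library users should not call directly.
namespace detail {
inline int clamp(int v, int lo, int hi) { return v < lo ? lo : (v > hi ? hi : v); }
} // namespace detail

// Feature namespace: everything related to filtering lives together.
namespace filter {
inline int clamp_to_byte(int v) { return detail::clamp(v, 0, 255); }
} // namespace filter

} // namespace imglib

// Anonymous namespace: call_count has internal linkage and is visible
// only in this file, like a global static variable.
namespace {
int call_count = 0;
}

// Adds delta to a pixel value and keeps the result in the range [0, 255].
int brighten(int pixel, int delta) {
    ++call_count;
    return imglib::filter::clamp_to_byte(pixel + delta);
}
```

Note that nothing prevents a caller from writing imglib::detail::clamp(...) directly; the namespace name is a convention, not an access control.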

Define the data model as POD types

Each project has its data model, and we can define this model using plain old data (POD) types. A POD type is a data structure represented only as a passive collection of field values (instance variables), without using object-oriented features. Using POD types has several advantages:

  1. Efficiency: POD types typically have a simple memory layout, which often leads to more efficient memory usage and faster performance. They avoid the overhead associated with complex data structures and member functions.
  2. Compatibility: POD types are compatible with low-level programming constructs and data interchange formats, making them suitable for interfacing with external systems and languages.
  3. Interoperability: POD types can be easily passed between different modules or components of a system, as well as between different systems or programming languages, facilitating interoperability.
  4. Ease of Use: POD types are straightforward to work with and understand, as they typically represent basic data types or aggregates of such types.
  5. Performance: POD types often lead to better performance in terms of both execution speed and memory usage compared to more complex data types.
  6. Predictability: Since POD types have a simple and well-defined structure, their behavior is generally more predictable, which can make debugging and optimization easier.

Overall, using POD types can contribute to simpler, more efficient, and more maintainable code, especially in performance-critical or resource-constrained environments.
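As a minimal sketch, here is what such a type can look like in C++ and how the compiler can check its properties. The Rect2i struct and the copy_bytes() helper are hypothetical names chosen for illustration; they are not OpenCV types.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical POD-style struct: public fields only, no methods, no virtuals.
struct Rect2i {
    int x;
    int y;
    int width;
    int height;
};

// Compile-time checks of the two properties that together make a type "POD".
static_assert(std::is_trivial<Rect2i>::value, "Rect2i should be trivial");
static_assert(std::is_standard_layout<Rect2i>::value, "Rect2i should have a C-compatible layout");

// Because the type is trivially copyable, a raw byte copy is well defined.
inline Rect2i copy_bytes(const Rect2i& src) {
    Rect2i dst;
    std::memcpy(&dst, &src, sizeof(Rect2i));
    return dst;
}
```

The static_assert lines fail at compile time if someone later adds a virtual function or a user-provided constructor, which keeps the type safe to pass across C APIs.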

Let’s search in the OpenCV code base for structs with no methods and having only fields.

[Image: opencv3]

The query matches 25% of the types defined in the OpenCV projects: OpenCV defines almost all of its data model as structs with only fields.

Avoid multiple inheritance

Multiple inheritance can complicate the design, and debuggers can have a hard time with it. For these reasons, many C++ experts recommend avoiding it.
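One classic complication is the “diamond”, sketched below with hypothetical Device, Scanner, Printer and Copier types. When two concrete bases share a common ancestor, the derived object silently carries two copies of the ancestor’s state.

```cpp
// Hypothetical types for illustration only.
struct Device  { int id = 0; };
struct Scanner : Device {};
struct Printer : Device {};

// Copier inherits from two concrete bases, so it contains two Device subobjects.
// Writing `c.id` would not compile: the name is ambiguous.
struct Copier : Scanner, Printer {};

// Sets the two inherited copies of `id` to different values and
// returns true when they indeed differ.
inline bool copies_diverge(Copier& c) {
    c.Scanner::id = 1;
    c.Printer::id = 2;
    return c.Scanner::id != c.Printer::id;
}
```

Virtual inheritance can merge the two subobjects, but it adds its own rules and costs, which is one reason the simpler choice is often to avoid the pattern altogether.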

Let’s search which classes inherit from more than one concrete base class in the OpenCV code base.

[Image: opencv4]

Only a few classes from the test projects use multiple inheritance; it is avoided throughout the rest of the OpenCV code base.

Avoid defining complex functions

Many metrics exist to detect complex functions. The number of lines of code (NBLinesOfCode), the number of parameters, and the number of local variables are the basic ones.

There are other interesting metrics to detect complex functions:

  • Cyclomatic complexity is a popular procedural software metric equal to the number of decisions that can be taken in a procedure.
  • Nesting depth is a metric defined on methods, equal to the depth of the most deeply nested scope in a method body.
  • Max Nested loop equals the maximum level of loop nesting in a function.

The maximum value tolerated for these metrics depends on the team’s choices; there are no standard values.
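To make nesting depth concrete, here is a small sketch with two hypothetical functions that compute the same result. The first nests a loop and two conditions; the second flattens the body with an early continue. How depth is counted exactly varies from tool to tool, so the numbers in the comments are indicative only.

```cpp
#include <vector>

// Three nested scopes inside the function body: loop -> if -> if.
int count_positive_even_nested(const std::vector<int>& values) {
    int count = 0;
    for (int v : values) {
        if (v > 0) {
            if (v % 2 == 0) {
                ++count;
            }
        }
    }
    return count;
}

// Same behaviour with one level less of nesting, using an early `continue`.
int count_positive_even_flat(const std::vector<int>& values) {
    int count = 0;
    for (int v : values) {
        if (v <= 0 || v % 2 != 0) continue;
        ++count;
    }
    return count;
}
```

The flattened version still makes the same decisions, so its cyclomatic complexity is not lower; only the nesting depth changes.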

Let’s search for methods that could be considered as complex in the OpenCV code base.

[Image: opencv5]

Only 1% of the methods are candidates for refactoring to reduce their complexity.

Coupling

Low coupling is desirable because a change in one area of an application will require fewer changes throughout the rest of the application. In the long run, this can save a lot of the time, effort, and cost associated with modifying an application and adding new features to it.

Low coupling could be achieved by using abstract classes. Here are three key benefits derived from using them:

  • An abstract class provides a way to define a contract that promotes reuse. If an object implements an abstract class, then that object must conform to a standard. An object that uses another object is called a consumer. An abstract class is a contract between an object and its consumer.
  • An abstract class also provides a level of abstraction that makes programs easier to understand. It allows developers to talk about the general way the code behaves without having to get into a lot of detailed specifics.
  • An abstract class enforces low coupling between components, which makes it easy to protect the consumer of the abstract class from implementation changes in the classes that implement it.

Let’s search for all abstract classes defined by OpenCV:

[Image: opencv6]

If our primary goal is low coupling, there is a common mistake that can defeat the purpose of abstract classes: referencing the concrete classes instead of the abstract ones. To explain this problem, let’s take the following example.

Class A implements the abstract class IA, which declares the calculate() method. The consumer class C is implemented like this:

class C
{
    // …
public:
    void calculate()
    {
        // …
        m_a->calculate();
        // …
    }
    A* m_a;  // concrete type A instead of the abstract IA
};

Here, class C references the concrete class A instead of the abstract class IA, so we lose the low-coupling benefit. This implementation has two major drawbacks:

  • If we decide to use another implementation of IA, we must change the code of class C.
  • If methods that do not exist in IA are added to A and C uses them, we also lose the contract benefit of using interfaces.
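For comparison, here is a minimal sketch of the decoupled version, in which C depends only on IA. The names IA, A and C follow the example above; the second implementation B, the int return type, and the constructor injection are assumptions added to keep the sketch small and testable.

```cpp
// Contract between implementations and their consumer.
class IA {
public:
    virtual ~IA() = default;
    virtual int calculate() const = 0;
};

class A : public IA {
public:
    int calculate() const override { return 1; }
};

// Hypothetical second implementation, to show that C needs no change.
class B : public IA {
public:
    int calculate() const override { return 2; }
};

// The consumer only knows the abstract class IA.
class C {
public:
    explicit C(const IA& a) : m_a(a) {}
    int calculate() const { return m_a.calculate() * 10; }
private:
    const IA& m_a;  // abstract type: any implementation of IA fits
};
```

Swapping A for B now happens where C is constructed, and C cannot accidentally call a method that exists only on A.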

C# offers explicit interface implementation, which ensures that a method from IA can never be called through a reference to a concrete class, only through a reference to the interface. This technique is very useful for protecting developers from losing the benefits of using interfaces.

Cohesion

The single responsibility principle states that a class should not have more than one reason to change. Such a class is said to be cohesive. Cohesion can be measured with the LCOM (Lack of Cohesion Of Methods) metrics: a high LCOM value generally pinpoints a poorly cohesive class. There are several LCOM metrics. LCOM takes its values in the range [0-1]. LCOM HS (HS stands for Henderson-Sellers) takes its values in the range [0-2]. Here are the formulas to compute the LCOM metrics:

LCOM = 1 - sum(MF) / (M*F)
LCOM HS = (M - sum(MF)/F) / (M - 1)

Where:

  • M is the number of methods in class (both static and instance methods are counted, it includes also constructors, properties getters/setters, events add/remove methods).
  • F is the number of instance fields in the class.
  • MF is the number of methods of the class accessing a particular instance field.
  • Sum(MF) is the sum of MF over all instance fields of the class.

The underlying idea behind these formulas can be stated as follows: a class is utterly cohesive if all its methods use all its instance fields, which means that sum(MF) = M*F, and then LCOM = 0 and LCOM HS = 0.

An LCOM HS value higher than 1 should be considered alarming.

[Image: opencv8]

Only a few types are not cohesive.

Conclusion

If you take a look at the OpenCV source code, you will be surprised by the simplicity of its implementation: no advanced design concepts, no over-engineering, just some basic principles applied.