Why should you consider using the POCO C++ library?

The POCO C++ Libraries (POCO stands for “Portable Components”) are a collection of open-source C++ class libraries that simplify and accelerate the development of network-centric, portable applications in C++. These libraries provide a wealth of features, ranging from HTTP and HTTPS clients and servers to XML parsing, data encryption, threading support, and much more.
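
To give a flavor of the API, fetching a page with the HTTP client classes looks roughly like this (a minimal sketch with error handling omitted; the host is just a placeholder, and the program must be linked against PocoNet and PocoFoundation):

    #include <iostream>
    #include "Poco/Net/HTTPClientSession.h"
    #include "Poco/Net/HTTPRequest.h"
    #include "Poco/Net/HTTPResponse.h"
    #include "Poco/StreamCopier.h"

    int main()
    {
        // Open a plain HTTP session to a placeholder host.
        Poco::Net::HTTPClientSession session("example.com");

        // Send a GET request for the root document.
        Poco::Net::HTTPRequest request(Poco::Net::HTTPRequest::HTTP_GET, "/",
                                       Poco::Net::HTTPMessage::HTTP_1_1);
        session.sendRequest(request);

        // Read the status line and stream the response body to stdout.
        Poco::Net::HTTPResponse response;
        std::istream& body = session.receiveResponse(response);
        std::cout << response.getStatus() << " " << response.getReason() << "\n";
        Poco::StreamCopier::copyStream(body, std::cout);
        return 0;
    }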

We’ve relied on the POCO library for over 15 years to verify whether CppDepend accurately evaluates well-implemented projects. Therefore, this assessment is not drawn from a fleeting encounter with the library but from a thorough analysis of its many versions over the past 15 years.

Let’s explore the kind of code you’ll find throughout the POCO source code:
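
The class below is a hypothetical sketch rather than code copied from the library, but it is written the way POCO’s Foundation classes typically look; its add() method shows the conventions in action:

    #include <algorithm>
    #include <cstddef>
    #include <string>
    #include <vector>
    #include "Poco/Bugcheck.h" // defines the poco_assert macro

    // Hypothetical class written in the POCO Foundation style; not taken from the library.
    class RecentFileList
    {
    public:
        explicit RecentFileList(std::size_t capacity):
            _capacity(capacity)
        {
            poco_assert (capacity > 0);
        }

        // Puts path at the front of the list and keeps at most _capacity entries.
        void add(const std::string& path)
        {
            poco_assert (!path.empty());

            _paths.erase(std::remove(_paths.begin(), _paths.end(), path), _paths.end());
            _paths.insert(_paths.begin(), path);
            if (_paths.size() > _capacity)
                _paths.resize(_capacity);
        }

    private:
        std::size_t              _capacity;
        std::vector<std::string> _paths;
    };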

This method implementation is characterized by:

  • It has few parameters.
  • Assertions are used to check that the inputs are valid.
  • The variable names are easy to understand.
  • The method is short.
  • There are no extra comments in the body; the code explains itself.
  • The function body is well indented.
  • The STL is used where needed.

When delving into the POCO source code, one can readily observe the consistency of its implementation, as the same best practice rules are applied to each function. Upon exploring the source code, it becomes apparent that POCO is approachable even for C++ beginners. Specifically, POCO has the following characteristics:

  1. It’s not over-engineered, and one doesn’t need to possess advanced C++ skills to comprehend its implementation.
  2. The public classes provided to end users are well-organized and straightforward to utilize.
  3. The design is modular, with the library divided into several projects, each addressing a specific need.
  4. For those seeking advanced utilization of the library, extending, customizing, or modifying the behavior of certain classes is remarkably straightforward.

In this post we will focus on some key aspects of its design:

ABSTRACTNESS VS INSTABILITY

Robert C. Martin wrote an interesting article about a set of metrics that can be used to measure the quality of an object-oriented design in terms of the interdependence between the subsystems of that design.

Here’s what he said in the article about the interdependence between modules:

What is it that makes a design rigid, fragile and difficult to reuse. It is the interdependence of the subsystems within that design. A design is rigid if it cannot be easily changed. Such rigidity is due to the fact that a single change to heavily interdependent software begins a cascade of changes in dependent modules. When the extent of that cascade of change cannot be predicted by the designers or maintainers the impact of the change cannot be estimated. This makes the cost of the change impossible to estimate. Managers, faced with such unpredictability, become reluctant to authorize changes. Thus the design becomes rigid.

To fight this rigidity, he introduced metrics such as afferent coupling, efferent coupling, abstractness, and instability, along with the “distance from the main sequence” and the “Abstractness vs Instability” graph.
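
These metrics are simple ratios. The sketch below shows how they are typically computed (the formulas follow Martin’s definitions; the struct and the sample numbers are illustrative, not values reported by CppDepend or taken from POCO):

    #include <cmath>
    #include <cstdio>

    // Per-project inputs for Martin's metrics.
    struct ProjectMetrics
    {
        int afferentCoupling; // Ca: types outside the project that depend on it
        int efferentCoupling; // Ce: types outside the project that it depends on
        int abstractTypes;    // abstract classes and interfaces in the project
        int totalTypes;       // all types in the project
    };

    // Instability I = Ce / (Ca + Ce), in [0, 1]; 1 means nothing depends on the project.
    double instability(const ProjectMetrics& p)
    {
        int total = p.afferentCoupling + p.efferentCoupling;
        return total == 0 ? 0.0 : static_cast<double>(p.efferentCoupling) / total;
    }

    // Abstractness A = abstract types / total types, in [0, 1].
    double abstractness(const ProjectMetrics& p)
    {
        return p.totalTypes == 0 ? 0.0 : static_cast<double>(p.abstractTypes) / p.totalTypes;
    }

    // Distance from the main sequence D = |A + I - 1|; 0 is on the line, values near 1
    // fall either in the zone of pain (concrete and stable) or the zone of uselessness
    // (abstract and unstable).
    double distanceFromMainSequence(const ProjectMetrics& p)
    {
        return std::fabs(abstractness(p) + instability(p) - 1.0);
    }

    int main()
    {
        ProjectMetrics popularConcreteProject{120, 5, 10, 400}; // made-up numbers
        std::printf("I = %.2f  A = %.2f  D = %.2f\n",
                    instability(popularConcreteProject),
                    abstractness(popularConcreteProject),
                    distanceFromMainSequence(popularConcreteProject));
        return 0;
    }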

The “Abstractness vs Instability” graph can be useful to identify the projects that are difficult to maintain and evolve. Here’s the “Abstractness vs Instability” graph of the POCO library:

The idea behind this graph is that the more popular a code element is, the more abstract it should be. In other words, avoid depending too much on implementations directly; depend on abstractions instead. By popular code element I mean a project (though the idea also works for packages and types) that is used massively by the other projects of the program.
It is not a good idea to have very popular concrete types in your code base. This creates zones of pain in your program, where changing the implementations can potentially affect a large portion of the program, and implementations are known to evolve more often than abstractions.

The main sequence line (dotted) in the above diagram shows how abstractness and instability should be balanced. A stable component would be positioned on the left. If you check the main sequence, you can see that such a component should be very abstract to be near the desirable line; on the other hand, if its degree of abstraction is low, it is positioned in an area called the “zone of pain”.

Only the Foundation project is inside the zone of pain, which is normal because it is heavily used by the other projects and contains mostly utility classes, which are not abstract.

TYPE COHESION

The single responsibility principle states that a class should not have more than one reason to change. Such a class is said to be cohesive. A high LCOM (Lack of Cohesion of Methods) value generally pinpoints a poorly cohesive class. There are several LCOM metrics: LCOM takes its values in the range [0-1], while LCOMHS (HS stands for Henderson-Sellers) takes its values in the range [0-2]. The LCOMHS metric is often considered more effective at detecting non-cohesive types; an LCOMHS value higher than 1 should be considered alarming.
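
As a reminder of how these values are obtained, here is a small sketch of the formulas as they are usually stated (the function names and the sample class shape are illustrative, not code from CppDepend):

    #include <cstdio>

    // m:     number of methods in the class
    // f:     number of instance fields in the class
    // sumMF: for each instance field, count the methods accessing it, then sum those counts
    double lcom(int m, int f, int sumMF)
    {
        return 1.0 - static_cast<double>(sumMF) / (m * f);     // range [0, 1]
    }

    double lcomHS(int m, int f, int sumMF)
    {
        return (m - static_cast<double>(sumMF) / f) / (m - 1); // range [0, 2]
    }

    int main()
    {
        // Made-up class shape: 10 methods and 4 fields, but only 2 field accesses in total.
        std::printf("LCOM = %.2f  LCOMHS = %.2f\n", lcom(10, 4, 2), lcomHS(10, 4, 2));
        // Prints LCOM = 0.95  LCOMHS = 1.06: the LCOMHS value is above the alarming threshold of 1.
        return 0;
    }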

Only 1% of the types are considered non-cohesive.

In this post, we’ve offered a brief glimpse into the design of POCO. In upcoming posts, we’ll explore its design and implementation in greater detail to understand why this library stands out as one of the best-implemented open-source C++ libraries.

C++ always comes to the rescue for challenging problems: the llamafile case study is a prime example.

C++ has been instrumental in resolving numerous challenging problems across various domains due to its efficiency, performance, and versatility. Some of the domains where C++ has solved challenging problems include:

  1. System Software Development: C++ has been extensively used in developing system software such as operating systems (e.g., Windows, Linux), device drivers, and embedded systems due to its low-level capabilities and ability to interact closely with hardware.
  2. Game Development: C++ is widely employed in the game development industry to create high-performance and resource-efficient games. Its ability to manage memory and provide low-level access to hardware makes it suitable for developing game engines and graphics-intensive applications.
  3. High-Performance Computing: C++ is a preferred choice for developing high-performance computing applications, including simulations, scientific computing, and numerical analysis. Its ability to optimize code for speed and efficiency allows for faster execution of complex algorithms.
  4. Financial Systems: C++ is commonly used in developing financial systems and trading platforms due to its speed and reliability. It is crucial in building algorithmic trading systems, risk management software, and market analysis tools.
  5. Networking and Telecommunications: C++ is utilized in networking and telecommunications for building efficient network protocols, routers, and communication software. Its ability to handle low-level network operations and optimize network performance makes it invaluable in this domain.

These are just a few examples of the challenging problems resolved by C++, showcasing its wide-ranging applicability and importance across various industries and domains.

C++ remains the preferred language for tackling contemporary challenges, as evidenced by projects like Mozilla’s llamafile.

Large language models are advanced artificial intelligence systems designed to understand and generate human-like text. These models are trained on vast amounts of text data and utilize sophisticated algorithms to process and generate responses. A llamafile is an executable LLM that you can run on your own computer. It contains the weights for a given open LLM, as well as everything needed to actually run that model on your computer. There’s nothing to install or configure. The goal is to make open LLMs much more accessible to both developers and end users. This is achieved by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a “llamafile”) that runs locally on most computers, with no installation.

llamafile is based on two major components:

Cosmopolitan Libc:
Cosmopolitan Libc makes C a build-once run-anywhere language, like Java, except it doesn’t need an interpreter or virtual machine. Instead, it reconfigures stock GCC and Clang to output a POSIX-approved polyglot format that runs natively on Linux + Mac + Windows + FreeBSD + OpenBSD + NetBSD + BIOS with the best possible performance and the tiniest footprint.
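
As an illustration of the build-once idea, an ordinary C++ program can be compiled with the cosmoc++ driver shipped with the Cosmopolitan toolchain into a single binary that targets all of the systems listed above (the file name and the exact invocation below are our assumptions; check the Cosmopolitan documentation for the current details):

    // hello.cpp -- nothing Cosmopolitan-specific appears in the source itself.
    #include <cstdio>

    int main()
    {
        std::puts("Hello from a build-once, run-anywhere binary!");
        return 0;
    }

    // Assumed build step (cosmoc++ is the C++ driver of the cosmocc toolchain):
    //   cosmoc++ -o hello hello.cpp
    // The resulting "hello" is an Actually Portable Executable intended to run
    // unchanged on the operating systems listed above.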

llama.cpp:

The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide variety of hardware – locally and in the cloud.

Advantages of using llamafile:

  1. llamafiles can run on multiple CPU microarchitectures. We added runtime dispatching to llama.cpp that lets new Intel systems use modern CPU features without trading away support for older computers.
  2. llamafiles can run on multiple CPU architectures. We do that by concatenating AMD64 and ARM64 builds with a shell script that launches the appropriate one. Our file format is compatible with WIN32 and most UNIX shells. It’s also able to be easily converted (by either you or your users) to the platform-native format, whenever required.
  3. llamafiles can run on six OSes (macOS, Windows, Linux, FreeBSD, OpenBSD, and NetBSD). If you make your own llama files, you’ll only need to build your code once, using a Linux-style toolchain. The GCC-based compiler we provide is itself an Actually Portable Executable, so you can build your software for all six OSes from the comfort of whichever one you prefer most for development.
  4. The weights for an LLM can be embedded within the llamafile. We added support for PKZIP to the GGML library. This lets uncompressed weights be mapped directly into memory, similar to a self-extracting archive. It enables quantized weights distributed online to be prefixed with a compatible version of the llama.cpp software, thereby ensuring its originally observed behaviors can be reproduced indefinitely.
  5. Finally, with the tools included in this project you can create your own llamafiles, using any compatible model weights you want. You can then distribute these llamafiles to other people, who can easily make use of them regardless of what kind of computer they have.

To sum up, C++ has always helped resolve challenging problems in an efficient way, and we can’t imagine the programming world without this amazing programming language 🙂