Unlocking the benefits of C++20 — improving the developer experience with zero downtime

Engineers at Macquarie
Macquarie Engineering Blog
6 min read · Mar 18, 2024


By Kip Hamiltons, Software Engineer at Macquarie Group

Overview

We rolled out C++20 to each of the C++ applications in our risk ecosystem with zero disruption to our Commodities and Global Markets (CGM) business and to the great excitement of our developers. This represented a large undertaking due to the binary incompatibility between existing systems and newly compiled programs.

C++20 adds many fantastic new features to the language, including ranges, modules, and type-safe text formatting. To get C++20, you need a compiler which supports it, usually obtained by upgrading your existing compiler. However, upgrading compilers often brings a host of challenges, including application binary interface (ABI) changes, which make newly compiled code incompatible with existing binaries. This article documents our journey navigating the GCC std::string short string optimisation ABI change, in order to upgrade our codebase from C++14 to C++20, unlocking a myriad of new features and improvements.

What did we have to do to ensure a successful upgrade to C++20?

When an ABI changes, it means that existing libraries using the changed interface become incompatible. The incompatibility is far from obvious though. Typically, you find out there’s a problem when you use the library and your program is served a segmentation fault.

Warning: May induce flashbacks to your ‘intro to C++’ class
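To make the incompatibility a little more concrete, here is a minimal sketch of how a translation unit can report which std::string ABI it was built against. The macro _GLIBCXX_USE_CXX11_ABI is libstdc++'s own switch for its dual ABI; the function names stringAbiInUse and stringObjectSize are ours, chosen purely for illustration.

```cpp
#include <cstddef>
#include <string>

// Reports which std::string ABI this translation unit was compiled against.
// _GLIBCXX_USE_CXX11_ABI is defined by libstdc++ headers (GCC 5 and later);
// the function name is illustrative only.
const char* stringAbiInUse() {
#if defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
    return "libstdc++ new ABI (std::__cxx11::basic_string, short string optimisation)";
#elif defined(_GLIBCXX_USE_CXX11_ABI)
    return "libstdc++ old ABI (copy-on-write std::string)";
#else
    return "not libstdc++, or a libstdc++ that predates the dual ABI";
#endif
}

// The object layout differs between the two ABIs, which is why passing a
// std::string across a boundary between old and new code corrupts memory
// instead of failing cleanly.
std::size_t stringObjectSize() {
    return sizeof(std::string);
}
```

On a typical 64-bit Linux build, we would expect stringObjectSize() to differ between the two ABIs, since the newer string carries an inline buffer for short strings. Code on either side of a library boundary that disagrees about that layout is exactly the kind of mismatch that ends in a segmentation fault.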

Without an automatically generated Software Bill Of Materials (SBOM) for all installed applications and libraries on all systems, it’s difficult to get assurance that you have covered all the bases. However, deriving a complete SBOM across the massive, shared file system that our applications primarily ran on at the time was near impossible. Taking a risk-based approach, we enumerated our use cases, tested each environment our applications ran in, and fixed issues and bugs as we found them.

Upgrading to GCC10

Toolchain upgrades often start with bumping the version, running a build, then seeing what breaks — but this time it was different. We didn’t have a practical containerised build environment at the time, so to get reproducible, hermetic builds that are consistent across environments, we added our compiler toolchain as a downloaded dependency in our build graph.

Our build system allows us to treat the compiler like any other dependency, fetching it before it is required. Once we had the toolchain as a dependency, we started the iterative problem-solving compile -> fix -> compile loop.

GCC maintains lists of breakages, incompatibilities, and changes between each version released. A few of these could have affected us, but by far the most impactful was the std::string short string optimisation ABI change. We were already compiling as C++14, so the fact that C++14 was now the compiler’s default standard didn’t affect us. The other deprecations and minor incompatibilities were not a concern for us either.

We compile with warnings-as-errors, which meant that any warnings issued by new warning flags would fail the build. Fixing the new warnings took a lot of grinding effort, but they exposed some very old bugs in the process. Some of the most interesting bugs were caught by -Wmisleading-indentation, usually caused by a sneaky semicolon.

An example would be something like this:

if (conditionIsTrue);   // stray semicolon: the 'if' controls an empty statement
{
    doTheThing();       // not guarded by the condition
}

At first glance, this looks like it will only do the thing if the condition is true. The semicolon after the ‘if’ condition changes this though: it ends the ‘if’ statement, so the block that follows always executes and the thing is always done. The bug is quite easy for humans to miss, sitting in a stack of other changes. Catching it automatically is one example of the safety and reliability advancements that upgraded compilers provide.

Fixing the new warnings involved changing thousands of lines of code, often including logic and behaviour changes, which can increase the risk. However, our custom risk engine regression testing platform gave us the confidence to make these changes and refactor as needed. The regression tests meticulously find each and every numerical difference in our risk calculations, allowing us to move ahead with assurance that our risk evaluations are correct.
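As a rough illustration of the kind of check described above, the sketch below compares two sets of risk results and reports every measure whose value changed, even in the last bit. It is not our regression platform: the RiskResults and Difference types, and the function names, are hypothetical and exist only for this example.

```cpp
#include <cstring>
#include <limits>
#include <map>
#include <string>
#include <vector>

// Hypothetical result set: risk measure name -> computed value.
using RiskResults = std::map<std::string, double>;

struct Difference {
    std::string measure;
    double baseline;   // NaN when the measure is missing from the baseline run
    double candidate;  // NaN when the measure is missing from the candidate run
};

// Bitwise comparison, so that even a last-digit change is reported.
bool sameBits(double a, double b) {
    return std::memcmp(&a, &b, sizeof(double)) == 0;
}

// Lists every measure that differs between the baseline build and the candidate build.
std::vector<Difference> findDifferences(const RiskResults& baseline,
                                        const RiskResults& candidate) {
    const double missing = std::numeric_limits<double>::quiet_NaN();
    std::vector<Difference> diffs;
    for (const auto& entry : baseline) {
        const auto match = candidate.find(entry.first);
        if (match == candidate.end()) {
            diffs.push_back({entry.first, entry.second, missing});
        } else if (!sameBits(entry.second, match->second)) {
            diffs.push_back({entry.first, entry.second, match->second});
        }
    }
    for (const auto& entry : candidate) {
        if (baseline.find(entry.first) == baseline.end()) {
            diffs.push_back({entry.first, missing, entry.second});
        }
    }
    return diffs;
}
```

Calling findDifferences(oldBuildResults, newBuildResults) on two runs over the same inputs yields an empty list when the change is numerically neutral. Comparing bit patterns rather than using a tolerance is a deliberate choice in this sketch: every difference is surfaced, and a human decides whether it is explained.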

Once we were regression-testing regularly, we found a particular set of numerical differences in some circumstances. Numerical differences between code versions represent a change in a risk measure, which if unintended, could impact market risk and PnL (profit and loss) reporting.

Diving deep into the code paths, we found that the differences arose from calls to Boost libraries, which we had been forced to upgrade earlier in the process. Boost is a popular set of libraries providing utilities to augment the C++ standard library. We isolated the Boost upgrade out of the overall upgrade and confirmed it was the source of the differences, allowing us to assess the risk of that upgrade independently and get that change in earlier.

By this stage, we were spending a lot of time on rebasing and deconflicting the changes. The scope of the project was quite large due to the size of our codebase. With changes constantly coming in through other business deliveries, we had to put more and more effort into keeping our changes current so they could be applied cleanly.

With all the warnings fixed, all numerical differences sufficiently explained and understood, and regression tests passed, we were able to move ahead with a merge. To avoid last-second version control conflicts, we briefly paused the merging of other code and had a special release with just the toolchain upgrade. We made sure our global technology teams were geared up to respond quickly to any issues and fix-forward where possible, and that business users were well aware of the release even though it carried zero functional change.

The issues that arose in the wake of the release were primarily to do with ABI incompatibility. We found that there were a few binaries and libraries with the old std::string ABI which had use cases we weren’t aware of. The issues manifested as heap corruptions (when your program edits something it shouldn’t in RAM), resulting in sudden crashes with inconsistent, unhelpful reporting. The primary difficulty was tracking down exactly which library required recompilation to give it a compatible ABI. Once we found it though, the fix was as simple as recompiling and redeploying it.
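One heuristic that can help with this kind of hunt, offered here as a sketch rather than a description of the exact process we followed, is to look at the mangled symbol names a library references (for example, as listed by a tool such as nm). Under GCC's dual ABI, symbols involving the new string type carry a __cxx11 namespace or a cxx11 ABI tag, while member functions of the old copy-on-write string are mangled with the _ZNSs prefix. The enum and function names below are illustrative assumptions.

```cpp
#include <string>
#include <vector>

enum class StringAbi { Unknown, OldCopyOnWrite, NewCxx11 };

// Heuristic classification of a single mangled symbol name.
// "__cxx11" and "B5cxx11" mark the new libstdc++ string ABI; the "_ZNSs"
// prefix marks members of the old copy-on-write std::string.
// This is a rough filter, not a complete ABI checker.
StringAbi classifySymbol(const std::string& mangled) {
    if (mangled.find("__cxx11") != std::string::npos ||
        mangled.find("B5cxx11") != std::string::npos) {
        return StringAbi::NewCxx11;
    }
    if (mangled.compare(0, 5, "_ZNSs") == 0) {
        return StringAbi::OldCopyOnWrite;
    }
    return StringAbi::Unknown;
}

// True if any referenced symbol points at the old std::string implementation,
// which suggests the library is a candidate for recompilation.
bool referencesOldStringAbi(const std::vector<std::string>& symbols) {
    for (const auto& symbol : symbols) {
        if (classifySymbol(symbol) == StringAbi::OldCopyOnWrite) {
            return true;
        }
    }
    return false;
}
```

Feeding each library's symbol list through referencesOldStringAbi gives a shortlist to investigate. It will miss libraries that only pass strings through their own interfaces without calling string member functions, so it narrows the search rather than replacing testing.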

Rolling out C++20

With GCC10 rolled out to production, we were able to move on to upgrading to C++20. We fixed non-C++20-compliant dependency headers and libraries, then our own codebase. Mostly, this meant renaming variables called ‘requires’, which is a keyword in C++20, and removing std::auto_ptr, which was removed from the standard in C++17, from our dependencies. Both were quite straightforward changes.

A typical std::auto_ptr fix
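For readers who haven't had to make this change, a fix of this kind usually looks something like the sketch below. The Trade type and the function names are hypothetical stand-ins, not code from our dependencies; the ‘before’ version is shown only in comments.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical type standing in for whatever the dependency allocates.
struct Trade {
    std::string id;
};

// Before (std::auto_ptr was deprecated in C++11 and removed from the standard in C++17):
//   std::auto_ptr<Trade> makeTrade(const std::string& id) {
//       return std::auto_ptr<Trade>(new Trade{id});
//   }
//   std::auto_ptr<Trade> b = a;  // "copy" that silently takes ownership away from 'a'

// After: std::unique_ptr cannot be copied, so every ownership transfer is explicit.
std::unique_ptr<Trade> makeTrade(const std::string& id) {
    return std::unique_ptr<Trade>(new Trade{id});
}

// Takes ownership from the caller, who has to write std::move to hand it over.
std::string consumeTrade(std::unique_ptr<Trade> trade) {
    return trade->id;
}

// 'requires' is a keyword in C++20, so a variable of that name needs renaming:
//   bool requires = true;        // no longer a valid identifier
bool approvalNeeded(bool requiresApproval) {
    return requiresApproval;
}
```

A caller then writes consumeTrade(std::move(trade)), which makes the hand-over visible at the call site where std::auto_ptr would have hidden it behind an ordinary-looking copy.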

Once each dependency was made compliant, we were able to add the magical -std=c++20 flag to our global compiler flags, completing the transition and enabling some of the best features of modern C++ for our codebase.

Lessons and takeaways

Our biggest takeaway from the project was to split up changes into the smallest pieces you can. Splitting out each dependency upgrade (particularly Boost) and each warning which was fixed into independent changes would have reduced the toil of rebasing and deconflicting the changes, as well as the burden on the final code reviewers.

Other things that became increasingly clear as we progressed were the potential benefits of SBOMs, as well as the related benefits of using a constrained containerised environment for managing and auditing dependencies. Tracking down the binaries and libraries still using the old ABI would have been easier if we simply had lists of them to run through and redeploy.

Future plans

Getting our GCC version up to the current major release, GCC13, is next on our roadmap. This will give us access to the most cutting-edge implementation available, including features from the newest C++23 standard. We’re particularly looking forward to using modules and std::format, which weren’t yet properly implemented in GCC10, as these will significantly improve compile times and performance for the platform.
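Until the toolchain provides it, code can be written to pick up std::format when it becomes available. The sketch below uses the standard feature-test macro __cpp_lib_format to choose between std::format and a stream-based fallback; the function name formatRiskLine and its output layout are assumptions made for this example.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// <version> (C++20) exposes the library feature-test macros; include it when present.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>
#  endif
#endif

#if defined(__cpp_lib_format)
#  include <format>
#endif

// Formats a label and a value to two decimal places.
// Uses std::format when the standard library provides it, otherwise falls
// back to iostreams so the same code builds on older toolchains.
std::string formatRiskLine(const std::string& book, double value) {
#if defined(__cpp_lib_format)
    return std::format("{}: {:.2f}", book, value);
#else
    std::ostringstream out;
    out << book << ": " << std::fixed << std::setprecision(2) << value;
    return out.str();
#endif
}
```

Both branches produce the same text, so formatRiskLine("Rates", 12.5) gives "Rates: 12.50" whichever path the compiler selects, and the fallback can simply be deleted once every build uses a standard library with std::format.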

Conclusion

By rolling out C++20, we have been able to integrate exciting new language features, improving the safety of our code and having fun with the latest that C++ has to offer. We have enjoyed working with the modern capabilities, delivering expressive, portable code to fulfil our business needs.
