Static code analysis and compiler warnings
Compiler-generated warnings are one form of static code analysis that codifies certain beneficial programming practices. Modern compilers for the C family of languages (C, C++, and Objective-C) provide hundreds of different warnings whose usefulness varies depending on the project and its aims.
In this article I will examine what kinds of issues compiler warnings can find and what enabling them costs, and analyze the compiler warning flag lists for both clang and GCC.
Levels of static code analysis
Compiling C family languages usually involves a preprocessor, a compiler, an assembler, and a linker. As a result, static code analysis can be done in various phases of program construction. These generally are:
- Analysis on plain source files.
- Analysis on preprocessed source files.
- Analysis on compilation unit level.
- Link-time analysis.
This multi-stage program construction causes difficulties for tools that are not called with the exact same arguments as the compiler, because those arguments carry the preprocessor definitions and the include and library directories. For example, tools like splint, Cppcheck, and many editor front-ends work outside the build system and can produce false warnings because they cannot see macro definitions that were not included in the simple static analysis setup. This becomes an issue with larger projects that do not necessarily have the most straightforward build setups or the most trivial header file inclusion policies. This does not mean that such tools are useless, but they will produce false positive warnings that can be really annoying unless they are silenced or ignored in some way.
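As a minimal sketch of the problem, consider a limit that the build system is assumed to override on the command line, for example with -DNAME_CAPACITY=64. The macro NAME_CAPACITY and the function name_fits are hypothetical names used only for illustration. A tool that analyzes the file without that definition reasons about the fallback value instead of the one the real build uses.

```c
#include <string.h>

/* NAME_CAPACITY is a hypothetical macro. The real build is assumed to
   pass -DNAME_CAPACITY=64; a tool that runs without the build flags
   only ever sees this fallback value. */
#ifndef NAME_CAPACITY
#define NAME_CAPACITY 8
#endif

/* Returns 1 when name, including its terminating NUL byte, fits into
   a buffer of NAME_CAPACITY bytes, and 0 otherwise. */
int name_fits(const char *name) {
    return strlen(name) < NAME_CAPACITY;
}
```

With the fallback value, a name such as "controller" does not fit, so a standalone analyzer could report a possible overflow that the real build, with its larger capacity, would never hit.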
Analysis on preprocessed source files already provides a pretty accurate picture of what kinds of issues there can be in the program, but it is not necessarily enough. In the compilation phase, compilers constantly transform the program into new, functionally equivalent forms during optimization, which can even result in unexpected code removal that is not trivial to notice. The compilation phase also gives more opportunities for target-platform-specific static code analysis. For example, pipeline stalls or value overflows due to incorrect assumptions about data type sizes can usually be noticed only after the target platform is known.
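The data type size issue can be sketched with the width of long, which is 64 bits on common 64-bit Linux targets (LP64) but 32 bits on 64-bit Windows (LLP64). The functions total_bytes and long_is_wide_enough below are hypothetical helpers, not taken from any real project.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: total size of count records of record_size
   bytes each. The multiplication is done in long. With a 64-bit long
   the product of two int32_t values always fits. With a 32-bit long it
   can overflow, which is undefined behavior for signed types. */
long total_bytes(int32_t count, int32_t record_size) {
    return (long)count * (long)record_size;
}

/* Reports whether long on the current target can hold the product of
   any two int32_t values. */
int long_is_wide_enough(void) {
    return sizeof(long) * CHAR_BIT >= 64;
}
```

Whether total_bytes is safe for large inputs cannot be decided from the source text alone. It depends on the target that the compiler is building for.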
The final phase in program construction that provides options for static analysis is the linking phase. In the linking phase, the linker makes sure that all the functions and global variables that the program uses come from somewhere and that there are no conflicting duplicate names defined. This should also enable some automatic detection of memory leaks and similar issues that come from calling functions defined in different compilation units. I’m not sure if any freely available static analyzer does this.
Compiler warning flags
Compiler warning flags are one way to do static code analysis that covers all phases of program construction. This assumes that the compiler is involved in all of those phases, and it usually is: from preprocessing to linking, the compiler front-end is used as a wrapper around the tools that do the actual hard work.
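As a small illustration of what such a flag catches, the sketch below uses an assignment where a comparison was intended. GCC reports this pattern under -Wparentheses, which -Wall enables, and clang has a similar diagnostic. The function names are hypothetical.

```c
/* Intended to return 1 when status is zero. */
int is_idle(int status) {
    if (status = 0) {   /* assigns 0, so the condition is always false */
        return 1;
    }
    return 0;
}

/* The comparison that was actually meant. */
int is_idle_fixed(int status) {
    return status == 0;
}
```

The first version compiles and runs, but it returns 0 for every input, which is exactly the kind of basic mistake a warning can point out before any test is run.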
Warning flags and compilation time
Using static code analysis in the form of compiler warnings incurs some penalty, as the compiler needs to execute extra code in addition to the normal compilation work. To measure the penalty and to contrast it with some more advanced static analysis tools, I did some benchmarks by compiling Cppcheck 1.73 and FFTW 3.3.4 with clang 3.8, GCC 6.1, and Infer 0.8.1 using the -O3 optimization level. Cppcheck is mainly written in C++ and FFTW is mainly written in C. Infer has some experimental checks for C++ that are enabled with the --cxx command line option, so I ran Infer twice for Cppcheck, with and without C++ checks. Clang had all warnings enabled with -Weverything, and GCC had all warning options enabled that did not require any special values. This resulted in the following minimum execution times of 3 runs:
Compiler | Program | No warnings | All warnings |
---|---|---|---|
clang | Cppcheck | 59.3 s | 1 min 1.1 s (+ 3.0 %) |
GCC | Cppcheck | 1 min 32.7 s | 1 min 38.8 s (+ 6.6 %) |
Infer | Cppcheck | - | 17 min 50 s (18x slower) |
Infer --cxx | Cppcheck | - | 1 h 36 min (97x slower) |
clang | FFTW | 40.5 s | 40.9 s (+ 1 %) |
GCC | FFTW | 42.7 s | 58.1 s (+ 36 %) |
Infer | FFTW | - | 4 min 43 s (10x slower) |
We can see that for clang and GCC the extra processing time added even by all warning flags is pretty small compared to all the other compilation and optimization steps for a C++ application (Cppcheck). For a mostly C-based application (FFTW), however, GCC gets surprisingly heavy, although build times still remain within the same order of magnitude.
If we then compare this with the time that a heavier static code analyzer takes, compiler warnings are an extremely cheap way to add static code analysis. They may not catch all the same bugs as the more advanced methods do, but they do offer a cheap way to avoid the basic mistakes.
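The percentages in the table follow from simple arithmetic on the measured times. A minimal sketch of that calculation, with hypothetical helper names and all times given in seconds:

```c
/* Smallest of three measured run times, in seconds. */
double min_of_three(double a, double b, double c) {
    double m = a < b ? a : b;
    return m < c ? m : c;
}

/* Relative build-time overhead in percent, given the time without
   warnings and the time with warnings enabled. */
double overhead_percent(double baseline_s, double with_warnings_s) {
    return (with_warnings_s - baseline_s) / baseline_s * 100.0;
}
```

With the GCC and FFTW times from the table, overhead_percent(42.7, 58.1) evaluates to roughly 36, and the clang and Cppcheck times, overhead_percent(59.3, 61.1), give roughly 3.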
Warning flag lists
I have created a project that can automatically parse compiler warning flags from the command line option definition files in clang and GCC. This came partially from necessity and partially from curiosity to examine, in an easy-to-digest format, what kinds of options clang and GCC provide. Although both compilers provide some kind of list of warning flags as part of their documentation, those lists are pretty cumbersome to go through when the main interest is first to figure out what is available and only then to look at the details.
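To give a rough idea of what such parsing involves, the sketch below classifies a single option record. It assumes a simplified layout modeled on GCC's option definition files, where the first line of a record is the option name and the second line is a space-separated property list. This is not the project's actual parser, the function names are hypothetical, and real property lists contain parenthesized arguments that this sketch does not handle.

```c
#include <string.h>

/* Returns 1 when properties contains word as a whole space-separated
   token, and 0 otherwise. */
static int has_property(const char *properties, const char *word) {
    size_t word_len = strlen(word);
    const char *p = properties;
    while ((p = strstr(p, word)) != NULL) {
        int starts_token = (p == properties) || (p[-1] == ' ');
        int ends_token = (p[word_len] == '\0') || (p[word_len] == ' ');
        if (starts_token && ends_token) {
            return 1;
        }
        p += 1;   /* keep searching after a partial match */
    }
    return 0;
}

/* name: assumed first line of a record, for example "Wall".
   properties: assumed second line, for example
   "C ObjC C++ ObjC++ Warning". */
int is_warning_option(const char *name, const char *properties) {
    return name[0] == 'W' && has_property(properties, "Warning");
}
```

A full tool would also have to follow which options enable which other options, which is what makes lists such as the contents of -Wall possible to generate.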
Warning options and deprecation
Different compilers have different policies about backwards compatibility and deprecation. Looking at how warning options have evolved, GCC has not removed a single switch between versions 3.4 and 6.1; it has just changed some of them to do nothing (the -Wimport, -Wunreachable-code, and -Wmudflap switches). Clang, on the other hand, has removed multiple switches between versions. For example, there are no references to -Wcxx98-cxx11-compat in the current codebase, even though clang 3.3 had such a switch.
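The practical effect of a switch that does nothing can be sketched with a statement that can never run. A GCC version that treats -Wunreachable-code as a no-op accepts the flag and stays silent about this code, whereas clang's -Wunreachable-code is expected to report the dead assignment. The function is a hypothetical example.

```c
/* Returns value, limited to be at least zero. */
int clamp_to_zero(int value) {
    if (value < 0) {
        return 0;
    }
    return value;
    value = -1;   /* never executed: both paths above have returned */
}
```

Keeping an accepted but inactive flag in a build script therefore gives no protection, which is easy to miss when the flag list is inherited from an older compiler version.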
Examining differences visually
Generating large, purely textual differences between files becomes cumbersome quite soon if you want to do anything more complicated than a simple difference of unique command line options between two subsequent versions. For example, figure 1 shows what other warnings the -Wall flag enables in GCC 6 when compared to GCC 5. We can see that quite many extra warnings have been added to the -Wall switch, so newer compiler versions provide extra analysis capabilities even without adding all the new options individually.
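One of the warnings that GCC 6 is documented to enable through -Wall is -Wmisleading-indentation. A minimal sketch of the pattern it targets, using a hypothetical function:

```c
/* Intended: members get both discounts, others get none. */
int apply_discounts(int price, int is_member) {
    if (is_member)
        price -= 10;
        price -= 5;   /* indented like the if body, but always runs */
    return price;
}
```

Because the if statement has no braces, only the first subtraction is conditional, so a non-member paying 100 still ends up at 95. A project that already built with -Wall gets this check simply by upgrading the compiler.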
From figure 2 we can also see that GCC 6 uses -Wc++11-compat as the default warning flag for indicating differences between ISO C++ 1998 and ISO C++ 2011 for constructs that have the same name, instead of -Wc++0x-compat, which refers to a draft standard. So GCC has basically deprecated the -Wc++0x-compat switch in favor of a switch that refers to the actual standard.
Suggestions for usable warning options
I won’t be giving any specific suggestions here for warning flags, as there seem to be new options in each subsequent compiler release. A good place to start is NASA’s JPL Institutional Coding Standard for the C Programming Language, which includes a very short list of rudimentary warning flags for GCC. It also includes a short list of coding standards, each of which would have prevented a mission failure for NASA. The SEI CERT coding standards for secure coding also provide various automatically generated lists of clang warning flags and GCC warning flags based on the issues that these standards take into account.
And finally, check out the warning flag lists for clang and GCC and make your own combinations that bring the most benefit for whatever you are working on. Not all of the flags are appropriate for your project, and some of them may even work against useful development patterns that you have.
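When a flag is useful for most of a project but works against one particular pattern, it can be silenced locally instead of being dropped from the whole build. The sketch below uses the diagnostic pragmas that GCC provides and that clang also understands. The callback on_event and its fixed signature are hypothetical.

```c
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter"
/* Hypothetical callback whose signature is dictated by a framework,
   so user_data has to stay in the parameter list even though this
   implementation does not use it. */
int on_event(int event_id, void *user_data) {
    return event_id + 1;
}
#pragma GCC diagnostic pop
```

The push and pop pair restores the previous warning state, so -Wunused-parameter stays active for the rest of the file.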
Cautionary tales about compiler warnings flags
Even though it might sound like a good idea to rush and fix all the issues that these new compiler warning flags uncover, doing so might actually cause new bugs to pop up. Specifically, the SQLite database engine has its own take on compiler warnings and fixing them: its developers have concluded that fixing compiler warnings has actually produced extra bugs that would not have appeared if nobody had tried to fix the warnings.
I have also had my own experiences with compiler warning fixes, and sometimes I have messed up perfectly working code while fixing a misleading warning. But in general, in my own experience these fixes have removed more bugs than they have introduced. And the coolest thing is that having these warnings enabled as part of the standard development process prevents some bugs from ever creeping into the application in the first place.
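A sketch of how a warning fix can go wrong: the first function below mixes signed and unsigned operands, which -Wsign-compare reports, yet it happens to reject negative indexes. Silencing the warning with a quick cast changes that behavior. The function names are hypothetical, and the example is an illustration rather than a case taken from SQLite.

```c
/* Original: triggers -Wsign-compare. A negative index is converted to
   a huge unsigned value, so it is rejected. */
int index_ok_original(int index, unsigned int size) {
    return index < size;
}

/* Careless fix: the warning is gone, but a negative index now passes
   the bounds check. */
int index_ok_careless(int index, unsigned int size) {
    return index < (int)size;
}

/* Safer fix: handle the negative case explicitly before comparing. */
int index_ok_fixed(int index, unsigned int size) {
    return index >= 0 && (unsigned int)index < size;
}
```

For an index of -1 and a size of 4, the original and the safer version return 0, while the careless version returns 1.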