My group is making an emulator to run the 1978 classic arcade game Space Invaders. We are writing our project in C. C is a compiled language, which means that part of our planning process included selecting a compiler. C is also old and widely used, which means there are many compilers to choose from (consider the list here). The choice of compiler is important in a project like this, so we took our time weighing our options. Some popular compilers were quickly eliminated because of their operating environment, like Microsoft Visual C++ (MSVC), which is proprietary and tied to Visual Studio. Others were eliminated because they have a niche use targeted toward experienced C developers (not us…yet), like the Tiny C Compiler. And still others were eliminated because they have been officially discontinued, like Open64. In the end, our decision came down to two titans of the C compiler universe – clang and gcc.
I’ll spare you the suspense – we picked clang. Here are the three main reasons why: (1) ubiquity, (2) error messages, and (3) static analysis features. Let’s take a closer look at each of these.
Ubiquity
A potential headache disappeared in our first group meeting when we realized we were not facing the worst case for developer operating environments: a four-person group with one Windows user, one macOS user, one Debian-based Linux user, and one non-Debian-based Linux user. Some of us run a flavor of Debian-based Linux with a Windows dual boot, whereas others use macOS. Pretty simple, and we all have access to something UNIX-y. Even if we were playing on hard mode, both gcc and clang would be suitable compilers for this project because they are ubiquitous and can be installed on most systems where they don’t come pre-installed. In fact, both compilers are also available on the OSU servers.
Compiler Messages
Compiled languages are notorious for generating obscure error and warning messages at compile time, and these can be particularly disorienting for programmers who are new to compiled languages. We have a varied range of experience with compiled languages in our group: I work with C++ on a daily basis and started the program when it was taught in C++, but one group member has really only worked with C in the context of CS344 Operating Systems. As such, we wanted a compiler that would be friendly to less experienced C developers. Clang has a reputation for generating messages that are easier to interpret and thus more helpful in diagnosing problems. Better messages help all programmers work more efficiently, and we chose clang in the hope that this feature would help everyone on the team get their work done.
Here’s a quick example. Consider the following program:
There is a common error in this code. I have created an integer array with five elements, but then I try to access the sixth element of that array, which doesn’t exist (remember, arrays are 0-indexed, so index 5 is one past the end). C does not automatically perform bounds checking on arrays as some languages do, so the compiler is free to accept this read. Strictly speaking it is undefined behavior; in practice, whatever happens to be stored in memory at that location usually ends up in the my_int variable, which could be some garbage value. With gcc, the program compiles and runs:
Where did 32765 come from? I have no idea – it’s a garbage value. Compare that with what happens when we try to compile the same program using clang:
Not only does clang generate a warning at compile time, but the warning is detailed, human-readable, and helpful to programmers.
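A compile-time warning like this typically helps only when the compiler can see the index. When the index is computed at run time, the bounds check that C leaves out has to be written by hand. Here is a minimal sketch of what that could look like. The helper names `checked_get` and `read_sixth_element`, the array name `my_array`, and the array contents are all assumptions made for illustration; only `my_int` comes from the example above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of the original program): copies arr[index]
 * into *out and returns true only when index is inside the array. */
static bool checked_get(const int *arr, size_t len, size_t index, int *out)
{
    if (arr == NULL || out == NULL || index >= len) {
        return false; /* out of range: *out is left untouched */
    }
    *out = arr[index];
    return true;
}

/* Mirrors the scenario above: a five-element array and a read at index 5.
 * The array name and its contents are assumptions made for illustration. */
static bool read_sixth_element(int *my_int)
{
    int my_array[5] = {10, 20, 30, 40, 50};
    size_t len = sizeof my_array / sizeof my_array[0];

    /* The unchecked version would be: *my_int = my_array[5]; */
    return checked_get(my_array, len, 5, my_int);
}
```

Calling `read_sixth_element(&my_int)` returns false and leaves `my_int` unchanged instead of reading past the end of the array. The trade-off is an extra length argument and a return value the caller has to check.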
Static Analysis
In the end, the extensive support for static analysis and standardization tools in clang was the deciding factor in our group’s decision to use clang. Reflecting on our positive experience using Python linters in our continuous integration (CI) workflow in CS362, we agreed early on that we wanted to use static analysis tools in a similar way to enforce good code quality for this project. In doing our research, we found that clang has support for many static analysis tools that we plan to use. First, we plan to use clang-tidy as our linter, complementing code review by catching things like common coding errors and style issues. Second, we plan to use clang-format to ensure that we are writing code compliant with our code style standards (tabs vs spaces, curly brace location, etc.). Clang-format allows users to define their own standard or use pre-defined standards from organizations like Google and Mozilla and then apply that standard to their source code. To be fair, gcc also includes support for static analysis, but we determined that the combination of support for static analysis with clearer compiler messages in clang was the better option for our team.
There are many things to take into consideration when choosing a compiler for a project written in C, and I’m sure we only scratched the surface in parsing out the differences between them. It was fun to nerd out and dig into this topic. Hopefully we made a good choice!