Higher-level C

This is my series on techniques for writing higher-level C programs.

There are any number of books about patterns in Java and object-oriented Perl. There are few C textbooks of any type. The most famous one is The C Programming Language. While it is unquestionably authoritative, it explains only the individual particles of C. It gives no guidance on how best to assemble them into a program of any complexity.

Perhaps the belief is that general programming knowledge, coupled with information on the language’s syntax and semantics, is enough to use it effectively. I disagree with this attitude in the case of C, because there are issues encountered in C that do not feature in most other programming languages. How do we arrange memory allocation and freeing calls to avoid memory leaks without introducing memory errors? What is a function pointer and what can we do with it? How do we make a generic data structure or algorithm using void *? Why are buffer overruns so dangerous, and how do we avoid them?
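
Two of those questions, function pointers and generic code through void *, can be hinted at with a small sketch. The names generic_max, compare_fn, compare_ints and compare_strings are my own illustrations, not taken from any particular project. The shape is borrowed from the standard library's qsort, which takes its element size and comparison callback in the same way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A comparison callback: negative, zero or positive, like strcmp. */
typedef int (*compare_fn)(const void *a, const void *b);

/*
 * Return a pointer to the largest element of an array of any type,
 * or NULL if the array is empty. The function knows nothing about
 * the elements except their size and how to compare two of them.
 */
const void *generic_max(const void *base, size_t count, size_t size,
                        compare_fn cmp)
{
    const unsigned char *bytes = base; /* byte pointer for arithmetic */
    const void *best = NULL;

    assert(cmp != NULL);
    for (size_t i = 0; i < count; i++) {
        const void *elem = bytes + i * size;
        if (best == NULL || cmp(elem, best) > 0)
            best = elem;
    }
    return best;
}

/* Callback for arrays of int. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y); /* avoids the overflow risk of x - y */
}

/* Callback for arrays of char *: each element is itself a pointer. */
int compare_strings(const void *a, const void *b)
{
    const char *const *x = a;
    const char *const *y = b;
    return strcmp(*x, *y);
}
```

Calling generic_max(nums, 3, sizeof nums[0], compare_ints) on an array of int returns a pointer to the largest int, and the same function works unchanged on an array of char * with compare_strings. The cost is that the compiler can no longer check that the callback matches the element type, so a mismatch shows up only at run time.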

There is little guidance on writing large programs in C, yet large programs are those that benefit most from abstraction and higher-level control over complexity. There is a niche in the literature for higher-level C. The best material out there is existing practice: we can look at successful C projects, and (if they are open source) read their code to see how they are using C at a higher level.

(An example of a language textbook that does accurately cover syntax and semantics, but also raises the difficulties large programs bring and provides advice on how to overcome them with the tools of the language, is The C++ Programming Language. Unfortunately the language in question is C++.)

Why C?

C is about 40 years old. It does not encourage object-oriented programming, abstraction, or genericity. Our standards for a high-level language have changed since C was designed, and C is now thought of as little more than a “portable assembly language”.

Despite that, C is still popular. For some of us there is an ineffable attraction to it. It’s not just the cleanness of its syntax (other languages have cleaner syntax). It might be the natural way it maps onto actual, physical computing hardware, yet allows us to implement our ideas without worrying about register spilling or calling conventions. (Perhaps it partly reflects the large investment we have made in learning it and our sense of privilege in being able to comprehend advanced pointer use.)

C is renowned for its efficiency. A modern compiler will produce code as good as or better than that of almost any assembly expert. Competing languages that reach even 50% of its speed for typical tasks tend to be considered “fast”. This kind of raw efficiency will always be important. But if a “slower” language makes it easier for the programmer to implement an algorithm of lower computational complexity, then that language is worth considering over C for efficiency reasons. At the end of the day, however, when we admire the performance of programs written in higher-level languages, we unwittingly rely on the fact that the VM they run on is usually written in C (or C++).

It’s risky to make long-term predictions about computing. But I can comfortably predict that in 10 years, operating systems like Linux, web servers like Apache, and even the interpreters and VMs for other languages will continue to be written in C, will continue to incorporate new inventions, and will require diligent maintenance by new programmers.

Why higher-level?

As long as C continues to be used for large projects, we should try to optimise our use of it. We probably cannot hope to make C high-level again. But we can resolve to use it at the high end of its natural range.

A good technique or idiom should:

Do what it claims
Help the programmer to manage complexity
Look nice in code
Save on typing in the long run
Not be shockingly expensive
Be decipherable by a competent C programmer
Be easily debugged in the case of programmer error

Open source projects are good sources of examples. I plan to collect them here and explain them if I can.

My main concern is the tricks that will help us write and understand a large C program. Efficient code is not a particular goal, but I will consider the efficiency implications of each technique. Readability is important; this is often a matter of taste and I will try to avoid prescribing my style over others. The general approach is to suggest a technique, discuss how it works and what tradeoffs it may have, and let the programmer decide when and how to use it, if they consider that the advantages outweigh the costs.

See the contents page for my posts on Higher-level C.

7 Responses to Higher-level C

Richard says:

February 20, 2011 at 12:56 am

I finally managed to read your article! Pretty interesting. I guess I learn something new every day. You should really show phrizer this. I’m sure he would be quite interested. Amazing how old C really is!

