Monday, November 8, 2021

Linux Fu: Automatic Header File Generation

I’ve tried a lot of the “newer” languages and, somehow, I’m always happiest when I go back to C++ or even C. However, there is one thing that gets a little on my nerves when I go back: the need to have header files with a declaration and then a separate file with almost the same information duplicated. I constantly make a change and forget to update the header, and many other languages take care of that for you. So I went looking for a way to automate things. Sure, some IDEs will automatically insert declarations but I’ve never been very happy with those for a variety of reasons. I wanted something lightweight that I could use in lots of different toolsets.

I found an older tool that does a pretty good job, although there are a few limitations. The tool seems to be a little obscure, so I thought I’d show you makeheaders, which is part of the Fossil software configuration management system. The program dates back to 1993 when [Dwayne Richard Hipp], the same guy that wrote SQLite, created it for his own use. It isn’t very complex (the whole thing lives in one fairly large C source file), but it can scan a directory and create header files for everything. In some cases, you won’t need to make big changes to your source code, but if you are willing to, there are several things you can do to get more out of it.

The Problem

Suppose you have two C files that cooperate. Let’s say you have A.c and B.c. Inside the A file, you have a simple function:


double ctof(double c)
{
  return (9.0*c)/5.0+32.0;
}

If you expect to use this inside file B, there needs to be a declaration so that when you compile B, the compiler can know that the function takes a single double argument and returns a double. With ANSI C (and C++) you need something like:

double ctof(double c);

There’s no actual programming, just a note to the compiler about what the function looks like. This is what you call a prototype. Normally, you’ll create a header file with the prototype. You can include that header in both A.c and B.c.
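As a minimal sketch of the idea, the snippet below puts the prototype, a caller, and the definition in one file so that it compiles on its own. In a real project the prototype line would live in a header that both A.c and B.c include. The caller boiling_point_f is a made-up name standing in for code in B.c.

```c
/* Normally in a shared header: tells the compiler the shape of ctof. */
double ctof(double c);

/* Stand-in for code in B.c (hypothetical caller). It can call ctof
   before the compiler has seen the definition, thanks to the prototype. */
double boiling_point_f(void)
{
  return ctof(100.0);
}

/* Normally in A.c: the actual definition. */
double ctof(double c)
{
  return (9.0 * c) / 5.0 + 32.0;
}
```

Without the first line, the compiler would have no way to check the call inside boiling_point_f against the real argument and return types.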

The problem is when you change the function in A.c:

double ctof(double c1, double c2)
{
  return (9.0*(c1+c2))/5.0+32.0;
}

If you don’t change the header to match, you’ll have problems. Not only that, but you need to make exactly the same change in both places. If you make a mistake and mark the arguments as floats in the header, that won’t work either.
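Here is a rough sketch of the matched pair after the change, again kept in a single file so it compiles on its own. The comment describes what happens when a compiler can see both a stale prototype and the new definition. The nastier case is when only B.c sees the stale header: in C, the build will typically succeed and the call simply misbehaves at run time.

```c
/* Prototype as it would appear in the hand-maintained header,
   updated to match the new two-argument definition below. */
double ctof(double c1, double c2);

/* Had the prototype above still read "double ctof(double c);" or used
   float parameters, a compiler seeing both it and this definition
   would reject the file with a conflicting-types diagnostic. */
double ctof(double c1, double c2)
{
  return (9.0 * (c1 + c2)) / 5.0 + 32.0;
}
```

That check only works if the file defining the function also includes the header, which is one reason to include it in both A.c and B.c.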

The Program

Assuming you’ve installed the software, you can simply run it passing all the C and H files you want it to scan. Usually, the glob *.[ch] will do the trick. You can also use it with .cpp files and even a mix. By default, this will pull all the global variable declarations and global functions you define into a series of header files.

Why a series? The program makes an odd assumption that makes sense once you think about it. Since the headers are automatically generated, it doesn’t make sense to reuse the headers. Instead, each source file gets its own customized header file. The program puts in what is necessary and in the right order. So A.c will use A.h and B.c will use B.h. There won’t be any cross-dependency between the two headers. If something changes, you simply run the program again to regenerate the header files.

What Gets Copied?

Here’s what the documentation says gets copied into header files:

  • When a function is defined in any .c file, a prototype of that function is placed in the generated .h file of every .c file that calls the function. If the “static” keyword of C appears at the beginning of the function definition, the prototype is suppressed. If you use the “LOCAL” keyword where you would normally say “static”, then a prototype is generated, but it will only appear in the single header file that corresponds to the source file containing the function. However, no other generated header files will contain a prototype for the static function since it has only file scope. If you invoke makeheaders with a “-local” command-line option, then it treats the “static” keyword like “LOCAL” and generates prototypes in the header file that corresponds to the source file containing the function definition.
  • When a global variable is defined in a .c file, an “extern” declaration of that variable is placed in the header of every .c file that uses the variable.
  • When a structure, union, or enumeration declaration or a function prototype or a C++ class declaration appears in a manually produced .h file, that declaration is copied into the automatically generated .h files of all .c files that use the structure, union, enumeration, function or class. But declarations that appear in a .c file are considered private to that .c file and are not copied into any automatically generated files.
  • All #defines and typedefs that appear in manually produced .h files are copied into automatically generated .h files as needed. Similar constructs that appear in .c files are considered private to those files and are not copied. When a structure, union, or enumeration declaration appears in a .h file, makeheaders will automatically generate a typedef that allows the declaration to be referenced without the “struct”, “union” or “enum” qualifier.
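To make the rules above a little more concrete, here is a small sketch of a hypothetical A.c, with comments marking what the quoted documentation says should happen to each definition. The names offset, scale, and add_offset are invented for illustration. LOCAL is defined by hand so the sketch compiles by itself; I am assuming that in real use the generated header supplies it, so check the documentation before relying on that.

```c
/* Hypothetical A.c, annotated with what the quoted rules imply. */

/* Stand-in so this sketch compiles alone. Assumption: in real use the
   generated A.h would provide this definition. */
#ifndef LOCAL
#define LOCAL static
#endif

/* Global variable: every file that uses it gets an extern declaration. */
double offset = 32.0;

/* static: no prototype is generated (unless -local is given). */
static double scale(double c)
{
  return (9.0 * c) / 5.0;
}

/* LOCAL: a prototype is generated, but only in this file's own header. */
LOCAL double add_offset(double v)
{
  return v + offset;
}

/* Global function: a prototype goes into the header of every file
   that calls it. */
double ctof(double c)
{
  return add_offset(scale(c));
}
```

The practical upshot is that ordinary C code with no special markup already gets most of the benefit; LOCAL only matters if you want prototypes for file-scope helpers.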

Note that the tool can tell when a header is one it produced, so you don’t have to exclude generated headers from the input files.

A C++ Example

For things like C++ classes — or anything, really — you can enclose a block of code inside a special preprocessor directive to make the tool process it. Here’s a very simple example I used to test things out:

A few things to notice. First, the include for test.hpp will grab the generated header file specific to this file. The INTERFACE directive wraps the code that should be in the header. At compile time, INTERFACE will equal zero, so this code won’t compile twice.

The member functions declared outside of the INTERFACE section have PUBLIC in front of them (and could, of course, have PRIVATE or PROTECTED, as well). This will cause the tool to pick them up. Finally, notice that there is a global variable and a global function at the bottom of the file.

Notice that when you use PUBLIC or the other keywords, you omit those functions from the class declaration. The only reason the example has some functions there is because they are inline. If you put all the functions outside the interface section of the file, the generated header will correctly assemble the class declaration. In this case, it will add these functions to the ones that are already there.

The Generated Header

The header seems pretty normal. You might be surprised that the header isn’t wrapped with the usual preprocessor statements that prevent the file from being included more than once. After all, since only one file will include the header, that code is unnecessary.

Here’s the file:

Notice that INTERFACE gets set to zero at the end, which means in the source file, the interface portion won’t get compiled again. For C source, the tool also generates typedefs for things like structures. For C++ this is unnecessary, of course. You can see the byproduct of having some declarations in the interface section and some in the implementation section: there is a redundant public tag. This is harmless and wouldn’t appear if I had put all the code outside the interface section.
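The same INTERFACE trick can be sketched in plain C with a struct instead of a class. The version below is a single-file approximation: the lines marked as a stand-in are written by hand and are only my guess at the relevant parts of what the tool would emit, not actual makeheaders output. The names Reading and reading_to_f are invented for illustration.

```c
/* ---- Stand-in for the generated header (hand-written approximation) ---- */
typedef struct Reading Reading;   /* typedef so "struct" can be omitted */
struct Reading {
  double celsius;
};
#define INTERFACE 0               /* set last, so the block below is skipped */
/* ---- End of stand-in; a real file would just #include its header ---- */

#if INTERFACE
/* Scanned by the tool as if it were in a header; not compiled here
   because INTERFACE is zero by the time the compiler reaches it. */
struct Reading {
  double celsius;
};
#endif

/* Uses the typedef name, which only exists because of the header. */
double reading_to_f(const Reading *r)
{
  return (9.0 * r->celsius) / 5.0 + 32.0;
}
```

Because INTERFACE is zero when the compiler reaches the #if block, the struct is defined only once even though its text appears twice.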

There’s More

This versatile tool can do more than I’ve shown here, and the documentation covers the rest. There’s a flag that dumps information about your code that you can use for documentation purposes. You can build hierarchies of interfaces. It also can help you mix C++ and C code. The tool is smart enough to handle conditional compilation. Note, though, that the C++ support doesn’t handle things like templates and namespaces. You have the source, though, so you could fix that if you like. There are some other limitations you should read about before you adopt this for a big project.

Will you try a tool like this or are you happy with manually handling headers? C++ can even target web pages. Or, use it for shell scripts, if you dare.

