Code generation with cog

I was always taught that if you find yourself repeating the same basic code over and over again, even with minor variations, you’re probably not taking full advantage of the tools made available to you by your programming language of choice.

In many cases this is true. In C++, for example, templates can eliminate a lot of copy-paste-and-tweak code. C# has generics, and though they’re not as powerful as C++’s templates, they also go a long way toward limiting the amount of code you have to write when dealing with objects that share similar interfaces, or when you want to use an object in the same way regardless of its type.

Examples of the power of C++’s templates can be found all over the web, and in the Boost libraries, where they are used extensively. But even templates don’t solve every problem that leads to copy and paste.

In C++, you can get a little further with the preprocessor, but compared to templates it is arcane and easy to misuse in ways that introduce subtle bugs into your code. There are even libraries that help tame the preprocessor and make it more useful (see Boost.Preprocessor).

I recently found, while working on network software with new messages being added every few days, that neither the preprocessor nor templates prevented the code repetition caused by handling new message types in similar but slightly different ways. An example from the project is listed below:

void messageutil::switch_dispatch(message_context_ptr ctx)
{
    switch (ctx->type)
    {
        case MT_ANSWER_CHALLENGE:
            messageutil::template_dispatch(ctx, std::make_shared<AnswerChallengeMessage>());
            break;

        case MT_AUTH_ACK:
            messageutil::template_dispatch(ctx, std::make_shared<AuthAckMessage>());
            break;

        case ...

Because the library I’m working with doesn’t support the visitor pattern, I have to write an ID-based switch statement that extracts the message type and creates the matching message object. The result is a large switch statement, plus boilerplate that I have to write for every new message that is added.

I could’ve used preprocessor macros and/or templates to make the amount of typing slightly less, but it still wouldn’t eliminate the core problem, which is that I have to write SOMETHING for every new message type that gets created, even though I’m executing what is pretty much the same factory code over and over.

This brings me to the point of the article: cog. Cog is a tool written by Ned Batchelder that generates code in a variety of languages. Its basic operation is described on the cog website.

The cog syntax is Python, and in fact you can use any Python library to assist you in code generation. Since I know that all the network messages I’ll need to handle are first written as *.proto files, I realized I could build the set of messages from the filenames.
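As a minimal sketch of how such a list of filenames could be collected, assuming the *.proto files all sit in one directory: the helper name find_proto_files and the "proto" directory are placeholders for illustration, not taken from the project.

```python
import glob
import os


def find_proto_files(proto_dir):
    """Return the *.proto files in proto_dir (hypothetical helper).

    Sorting keeps the generated code in a stable order between runs,
    since glob does not guarantee any particular ordering.
    """
    return sorted(glob.glob(os.path.join(proto_dir, "*.proto")))


# "proto" is an assumed directory name; adjust it to your layout.
fnames = find_proto_files("proto")
```

Inside a cog block, the resulting fnames list can then drive whatever loop generates the code.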

The first example on the cog website looks like this:

// This is my C++ file.
/*[[[cog
import cog
fnames = ['DoSomething', 'DoAnotherThing', 'DoLastThing']
for fn in fnames:
    cog.outl("void %s();" % fn)
]]]*/
// ------> your generated code will end up here <------
//[[[end]]]

cog.outl() will write lines of text directly to your source file between the ]]]*/ and //[[[end]]] tags. After “cogging” the file, the cpp source will end up looking like this:

// This is my C++ file.
...
/*[[[cog
import cog
fnames = ['DoSomething', 'DoAnotherThing', 'DoLastThing']
for fn in fnames:
    cog.outl("void %s();" % fn)
]]]*/
void DoSomething();
void DoAnotherThing();
void DoLastThing();
//[[[end]]]
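To build a rough mental model of what happens during “cogging”, here is a toy stand-in written from scratch. It is not cog’s actual implementation and handles only the simple layout shown above (real cog does much more, such as dedenting the embedded code and checking for malformed markers). The function name run_cog_like and the fake cog module are made up for this illustration.

```python
import sys
import types


def run_cog_like(source):
    """Toy illustration only: expand [[[cog ... ]]] ... [[[end]]] blocks."""
    lines = source.splitlines()
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        out.append(line)
        i += 1
        if "[[[cog" not in line:
            continue
        # Collect the embedded Python up to the closing ]]] marker,
        # keeping it in the output so the file can be regenerated later.
        code = []
        while "]]]" not in lines[i]:
            code.append(lines[i])
            out.append(lines[i])
            i += 1
        out.append(lines[i])  # the ]]]*/ line itself
        i += 1
        # Discard previously generated output up to the [[[end]]] marker.
        while "[[[end]]]" not in lines[i]:
            i += 1
        # Run the generator against a stand-in "cog" module whose
        # outl() simply records each line of output.
        generated = []
        fake = types.ModuleType("cog")
        fake.outl = lambda text="": generated.append(text)
        saved = sys.modules.get("cog")
        sys.modules["cog"] = fake
        try:
            exec("\n".join(code), {"cog": fake})
        finally:
            if saved is None:
                del sys.modules["cog"]
            else:
                sys.modules["cog"] = saved
        out.extend(generated)
        # The [[[end]]] line is copied by the next loop iteration.
    return "\n".join(out) + "\n"


SAMPLE = """// This is my C++ file.
/*[[[cog
import cog
fnames = ['DoSomething', 'DoAnotherThing', 'DoLastThing']
for fn in fnames:
    cog.outl("void %s();" % fn)
]]]*/
// stale output from an earlier run
//[[[end]]]
"""

print(run_cog_like(SAMPLE))
```

The detail worth noticing is that the generator code stays in the file as a comment, so running the tool again replaces the old output instead of appending to it.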

Let’s take a look at how we can apply this same technique to generate some message dispatching code for us! Below is an excerpt from my project:

/*[[[cog
def underscore(name):
   import re
   s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
   return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).upper()

# (fnames is the list of protobuf file names
# provided from other cog code above this)
for fn in fnames:
   rawname = os.path.splitext(os.path.basename(fn))[0]
   enumName = rawname
   if enumName.endswith('Message'):
       enumName = enumName[:-7]
   else:
       continue

   enumName = underscore(enumName)

   cog.outl("case MT_%s:" % enumName)
   cog.outl("    messageutil::template_dispatch(ctx, std::make_shared<%s>());" % rawname)
   cog.outl("    break;")
   cog.outl("")
]]]*/

What the heck are we doing here? Let’s step through this.

First we define a function called underscore that takes the camel-case file names and turns them into their upper-case, underscore-separated equivalents. E.g., MyGoodMessage becomes MY_GOOD_MESSAGE. We’ll use this later.
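The underscore function is easy to try on its own, outside of cog. The snippet below repeats it unchanged (with the import moved to the top) and runs it on a few names; HTTPRequest is an extra made-up input to show how a run of capitals is treated.

```python
import re


def underscore(name):
    # First pass: split before a capitalized word, e.g. "yGood" -> "y_Good".
    s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    # Second pass: split any remaining lower/digit-to-upper boundary,
    # then upper-case the whole result.
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).upper()


print(underscore("MyGoodMessage"))  # MY_GOOD_MESSAGE
print(underscore("AuthAck"))        # AUTH_ACK
print(underscore("HTTPRequest"))    # HTTP_REQUEST
```

Two passes are needed because the first substitution consumes the character before each capitalized word, so adjacent words such as Good and Message are not both split in a single sweep.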

Then we take fnames, a list of all the *.proto files in a specific directory, and for each entry get the basename, which is the filename without path information. We then apply splitext, which splits the filename into a (root, extension) pair, and take the [0]th element, which is the filename without the extension. Files whose names don’t end in “Message” are skipped; for the rest, we trim off the “Message” portion. E.g., AwesomeCoolMessage becomes AwesomeCool.
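The path handling can be checked step by step in a plain Python session. The path below is a made-up example, not one from the project.

```python
import os

path = "protos/AwesomeCoolMessage.proto"  # hypothetical input path

base = os.path.basename(path)        # strips the directory part
root, ext = os.path.splitext(base)   # splits off the extension

# Drop the trailing "Message"; names without that suffix are skipped.
suffix = "Message"
trimmed = root[:-len(suffix)] if root.endswith(suffix) else None

print(base)     # AwesomeCoolMessage.proto
print(root)     # AwesomeCoolMessage
print(ext)      # .proto
print(trimmed)  # AwesomeCool
```

Using len(suffix) instead of the literal 7 keeps the slice correct if the suffix ever changes.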

Then, before we start to output code, we apply our previously mentioned underscore function to the name. So enumName starts as AwesomeCoolMessage, becomes AwesomeCool, and finally AWESOME_COOL.

Now, we start outputting code directly into the cpp file:

cog.outl("case MT_%s:" % enumName)
# outputs: case MT_AWESOME_COOL: followed by a newline

cog.outl("    messageutil::template_dispatch(ctx, std::make_shared<%s>());" % rawname)
# outputs (indented four spaces): messageutil::template_dispatch(ctx, std::make_shared<AwesomeCoolMessage>()); then a newline

cog.outl("    break;")
cog.outl("")
# outputs (indented four spaces): break; then a newline, followed by a blank line

When we put this together in the actual project with the messages that are currently defined, we get the following output:

case MT_ANSWER_CHALLENGE:
    messageutil::template_dispatch(ctx, std::make_shared<AnswerChallengeMessage>());
    break;

case MT_AUTH_ACK:
    messageutil::template_dispatch(ctx, std::make_shared<AuthAckMessage>());
    break;

case MT_CHALLENGE_RESPONSE:
    messageutil::template_dispatch(ctx, std::make_shared<ChallengeResponseMessage>());
    break;

case MT_GET_CHALLENGE:
    messageutil::template_dispatch(ctx, std::make_shared<GetChallengeMessage>());
    break;
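One way to check generator logic like this before cogging a real file is to pull it into a function that takes the output callback as a parameter. The sketch below does that; dispatch_cases is a name invented for this example, and the file paths are made up.

```python
import os
import re


def underscore(name):
    s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).upper()


def dispatch_cases(fnames, outl):
    """Emit one switch case per *Message file via the outl callback."""
    for fn in fnames:
        rawname = os.path.splitext(os.path.basename(fn))[0]
        if not rawname.endswith('Message'):
            continue  # not a message definition, skip it
        enum_name = underscore(rawname[:-len('Message')])
        outl("case MT_%s:" % enum_name)
        outl("    messageutil::template_dispatch(ctx, std::make_shared<%s>());" % rawname)
        outl("    break;")
        outl("")


# Outside cog, collect the lines in a list instead of writing to a file.
lines = []
dispatch_cases(["proto/AuthAckMessage.proto", "proto/README.txt"], lines.append)
print("\n".join(lines))
```

Inside a cog block the same function could be called as dispatch_cases(fnames, cog.outl), so the logic being tested is the logic that generates the file.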

You can see the cog code and output generation both embedded in the source file here.

Cog is awesome, and has already saved me a ton of time. Thanks a TON to Ned Batchelder for developing it!
