The BigBoxoCo Disco Party: Why Segmentation is Good

As the freshly brewed coffee enters my mouth, I experience my first glimpse of consciousness for the day. “Where am I?” I mutter, in broken English. The gray walls around me slowly come into focus, lit by the flickering of a long-in-the-tooth fluorescent bulb. The top half of a man’s face appears over the top of my cubicle wall.

“How’s the wonderful world of iNetConjoinApp?”

The caffeine must have made it past my blood-brain barrier, as I recognize at once that I’m at EnergyModCo, where I am one of a handful of employees. The half-head belongs to Freyr, EnergyModCo’s COO, lead customer service rep, and deployment technician.

“Umm, it’s, well, I just started working on the –”

“Great, that sounds good. You remember BigBoxoCo?”

“You mean, as in our biggest cust–”

“There’s a problem at one of their warehouses. Something to do with our lighting controller.”

“Oh?”

“I just got off the phone with the warehouse manager. All the lights went out for a few minutes, but they’re back on now.”

“Uh, that’s bad. Thank goodness they have skylights.”

“Nope. This is their first two-story warehouse. The only light the first-floor customers had was from the emergency floodlights.”

My throat tightens. “Well, I’m on it. We can’t let that happen again.”

The weight of the situation slams into me like an over-packed pallet of giant mayonnaise jars. After being awake for only 25 seconds, I’m not ready to douse this kind of blaze. I don’t have a choice though, so I lean back in my chair and gaze at the craquelure on the ceiling tiles. How could this have happened? I recall that at one time we did have problems with the smart-breakers that switched the lights. They would sometimes mysteriously ignore the commands sent to them by EnergyModCo’s software. But I fixed that by adding a watchdog that would retry the switch commands if they did not take effect. After a brief palpitation subsides, I admit to myself that the lighting control watchdog must contain a nasty bug.

After some brief email digging, face palming, and silent cursing, I manage to get a VPN connection set up so that I can SSH into EnergyModCo’s on-site lighting controller. I bump up the logging verbosity, which requires that I restart the system, and start looking for clues. After a few minutes, I see the periscope that is Freyr’s forehead rise above my cubicle wall.

“The lights are off again! I’ve got the BigBoxoCo manager on hold, and he’s about to lose it!”

My lip quivers as I struggle to suppress my fight-or-flight instinct. It can’t be a coincidence that the lights went off right when I restarted the software. What have I done? In a panic, I force our software to turn the lights back on, and thankfully it works. At this point, I am paralyzed with fear. I want to disable our software entirely, but what if stopping it is what made the lights go off just now? I pull my hands away from the keyboard, fearing that anything I do might cause the BigBoxoCo manager to enter sudden cardiac arrest, or worse.

Without being able to touch the on-site software, I dive into the source code, hoping to track down the bug analytically. I pore through the entire stack, following the data flow and logic for the relatively simple lighting control subsystem. The scheduling code makes sense. So does the timer code. The trickiest code, for the watchdog system, looks entirely correct. I rack my brain; what am I missing? I am startled by the crack of thunder, but there doesn’t seem to be a storm outside. As my nerves resonate with the imagined sound, Freyr’s forehead crests my cubicle wall.

“The BigBoxoCo manager is flipping out. He says, and I quote, that ‘There’s a God damned disco party going on’ in his warehouse. They are going to have to stop accepting customers.”

Content with having surpassed even my worst expectations, Freyr jogs back to his office. I follow him briskly.

“Freyr, can’t the manager hit the manual lighting override? I think it might take me a while to figure out the problem.”

“What are you doing away from your desk? No! Only BigBoxoCo’s maintenance engineer has a key to the enclosure, and he’s AWOL. Go!”

I save three seconds by running back to my desk. At this point, I’m bouncing ideas off our other programmer, Nate. No good; he has never worked on this system, and can only offer limited advice. I go back to staring at the code. I may have been unconscious an hour ago, but now the fire of my mind is burning with the focused intensity of a TIG welder. The coffee is gone. I begin questioning all of my assumptions. Compiler bug? Memory corruption? Cosmic rays? Nate complains about the sound of my forehead slamming against the desk. In my heightened state of awareness, I perceive the ghostly sound of footsteps come to a stop outside my cube. Moments pass before the shrunken form of Freyr emerges from the hallway. I am calmed by his lack of speed as well as the fact that he is not using his periscope.

“I’m sorry,” he says.

“Uh, hey Freyr, what’s up…?”

“I fixed the lights. It was my fault.”

Until this point, it had not crossed my mind that the problems may have been caused by the lighting controller being configured incorrectly. “Wha — what the hell happened?”

“I had configured the lighting controller at a different site with the IP address of the smart-breaker at the disco warehouse. The other site was in a different timezone, and its schedule said that the lights should be off.”

The problem was too simple. Why didn’t I think of this? One controller thought the lights should be on, and the other thought they should be off. Thus, the lighting watchdogs at each site were fighting over control of the lights. Neither controller knew about the other one; they just thought that the smart-breakers were disobeying them and retried their commands. Over and over. I subdue my first instinct to tackle Freyr on the spot, and murmur, “Okay. Thanks for letting me know.”

I sit still for a few minutes, allowing the turbulence of my rage to subside. My initial response is to be angry at Freyr for wasting my time and terrifying me. However, as I calm down and regain clarity, I realize that he did nothing wrong. Who hasn’t mistyped an IP address before? I know I certainly have, many times. Freyr made a simple and understandable mistake. The problem was that the BigBoxoCo network was set up in such a way as to allow a simple mistake to wreak utter chaos.


As it turned out, BigBoxoCo had all of their hundreds of warehouses on the same virtual network. Not only could BigBoxoCo’s corporate headquarters reach machines at every single warehouse, but so could any individual warehouse. A PC at a BigBoxoCo in New York could ping a PC at a BigBoxoCo in Oregon with no problem. Even ignoring the security repercussions of such a setup, there are good reasons to avoid it. If the network had been set up with a star topology, with only the corporate headquarters having access to every single warehouse, the disco party fiasco could have been easily avoided. In other words, segmentation is good.
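
For the curious, the tug-of-war is easy to reproduce in a toy model. The sketch below is not EnergyModCo's code; the breaker and watchdog types and the count_flips function are hypothetical stand-ins, and the polling is simplified to two controllers taking turns at one breaker.

```cpp
#include <cassert>

// Hypothetical stand-in for a smart-breaker: it obeys the last command it got.
struct breaker
{
    bool on = true;
};

// Hypothetical stand-in for one site's lighting watchdog.
struct watchdog
{
    bool desired_on; // what this site's schedule says the lights should be

    // If the breaker disagrees with the schedule, resend the command.
    // Returns true if a retry was needed.
    bool check( breaker& b ) const
    {
        if ( b.on == desired_on )
            return false;
        b.on = desired_on;
        return true;
    }
};

// Let two watchdogs poll the same breaker in turn, and count how many
// times the lights change state.
int count_flips( const watchdog& a, const watchdog& b, breaker& target, const int rounds )
{
    int flips = 0;
    for ( int i = 0; i < rounds; ++i )
    {
        if ( a.check( target ) ) ++flips;
        if ( b.check( target ) ) ++flips;
    }
    return flips;
}
```

When the two schedules agree, count_flips stays at zero. When they disagree, nearly every check turns into a retry, and each watchdog sees only a breaker that keeps "disobeying" it. On a segmented network the second controller could never have reached the breaker in the first place.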


New Song — Spelunkatronis


I don’t have a whole lot to say about this track, other than that it is one of my first songs that actually feels somewhat finished. I had a lot of fun making this one. In particular, I fell in love with Ableton Live’s Collision instrument and the infinite variety of bell-like sounds that it can create. And, of course, the song features some pads courtesy of my Alesis Andromeda A6. I think I’m going to name it Padosaur.

Evan Mezeske — Spelunkatronis


C++ Streams & Typedefs: Be Charful

The C++ typedef keyword is indispensable in many situations, especially for writing portable low-level code. However, in some circumstances it can cause trouble, particularly when it comes to function overloading. Consider the following C++ template class:

template <typename T>
struct foobar
{
    foobar( const T foo ) : foo_( foo ) {}
    T foo_;
};

One might want to write a simple stream output operator to format the template class’ member values, e.g. for debugging purposes:

template <typename T>
ostream& operator<<( ostream& s, const foobar<T>& fb )
{
    return s << "foo: " << fb.foo_;
}

This seems reasonable. Now, let’s assume that this template is going to be used in a context where T will be one of several fixed-width integer types. These are usually typedefs from a header like stdint.h (for those that don’t mind including a C header) or boost/cstdint.hpp (to be a C++ purist). They are commonly named int64_t, int32_t, int16_t, and int8_t, where the X in intX_t specifies the number of bits used to represent the integer. There are also unsigned variants, but we’ll ignore those for this discussion.

Let’s now explore what happens when we initialize a foobar<intX_t> instance with its foo_ member set to a small integer and print it to standard output via our custom stream output operator:

cout << foobar<int64_t>( 42 ) << endl;
cout << foobar<int32_t>( 42 ) << endl;
cout << foobar<int16_t>( 42 ) << endl;

Each of these statements prints “foo: 42”, as expected. Great, everything works! But wait, there was one type that we didn’t test:

cout << foobar<int8_t>( 42 ) << endl;

This prints “foo: *” instead of “foo: 42”. This is probably not the expected result of printing the value of an int8_t. After all, it looks and feels just like all of the other intX_t types! What causes it to be printed differently from the other types? Let’s look at how the integer types might be defined for an x86 machine:

typedef long int int64_t;
typedef int int32_t;
typedef short int16_t;
typedef char int8_t;

The problem is that the only way to represent an integer with exactly 8 bits (and no more) is with a char (at least on the x86 architecture). While a char is an integer, it is also a… character. So, this trouble is caused by the fact that the char type is trying to be two things at once.

A simple (but incorrect) approach to work around this is to overload[1] the stream output operator for the int8_t type, and force it to be printed as a number:

// This is incorrect:
ostream& operator<<( ostream& s, const int8_t i )
{
    return s << static_cast<int>( i );
}

The problem with this approach is that the int8_t typedef does not represent a unique type. The typedef keyword is named poorly; it does not introduce new types. Rather, it creates aliases for existing types. By overloading the stream output operator for the int8_t type, the char type’s operator is being overloaded as well. Since the standard library already defines a stream output operator for the char type, the above definition would violate the One Definition Rule and result in a compiler error. Even if it did compile, the results of redefining the way characters are printed would probably not be desirable.
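
The aliasing is easy to check for yourself. The following is a minimal sketch, assuming the char-based typedef from the listing above; tiny_int and describe are hypothetical names invented for the illustration, not anything from the standard library.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the int8_t typedef shown earlier.
typedef char tiny_int;

// A typedef and the type it names are one and the same type...
static_assert( std::is_same<tiny_int, char>::value, "typedef only creates an alias" );

// ...so these two declarations name the same function, not two overloads.
int describe( tiny_int );
int describe( char ); // a harmless redeclaration of the line above

int describe( char ) { return 1; }

// Uncommenting the next line is a redefinition error, because the
// function describe( char ) already has a body:
// int describe( tiny_int ) { return 2; }
```

Calling describe with a tiny_int and calling it with a plain char both reach the same function body, which is exactly why an overload "for int8_t" is really an overload for the underlying character type.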

An alternative (working) solution to the problem is to overload the output stream operator for the foobar<int8_t> type:

ostream& operator<<( ostream& s, const foobar<int8_t>& fb )
{
    return s << "foo: " << static_cast<int>( fb.foo_ );
}

This definition does not clash with any existing overloads from the standard library, and it effectively causes the int8_t to be printed as an integer. The downside is that it will cause unexpected behavior when a foobar<char> is printed, if the programmer intends char to represent a character. The only way to avoid this would be to define int8_t as a class instead of making it a typedef, and providing a well-behaved stream output operator for that class. The class’ arithmetic operators could be overloaded to make it look almost exactly like a POD integer, and it wouldn’t necessarily take up any extra memory. However, this solution is still not ideal, because classes behave differently than POD types in subtle ways (e.g. POD types are not initialized by default, but classes are).
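
To make the class-based idea a bit more concrete, here is a rough sketch of what such a type could look like. The names small_int and to_text are hypothetical, and a real replacement would need more care (range checking, compound assignment operators, and so on); instead of overloading each arithmetic operator, this sketch leans on an implicit conversion to int.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical class-based replacement for an int8_t typedef.
class small_int
{
public:
    small_int( const int v = 0 ) : v_( static_cast<signed char>( v ) ) {}

    // Converting to int lets the built-in arithmetic operators apply.
    operator int() const { return v_; }

private:
    signed char v_; // the only data member
};

// Because small_int is a distinct type, this overload does not touch
// the way plain characters are printed.
std::ostream& operator<<( std::ostream& s, const small_int i )
{
    return s << static_cast<int>( i );
}

// Helper for the illustration: format any streamable value as a string.
template <typename T>
std::string to_text( const T& value )
{
    std::ostringstream s;
    s << value;
    return s.str();
}
```

With this in place, to_text( small_int( 42 ) ) produces the digits, while to_text( char( 42 ) ) still produces an asterisk, so characters and small integers no longer share one formatting rule. Note that small_int is value-initialized to zero by its constructor, which is one of the subtle differences from a POD integer mentioned above.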

If there’s anything to take away from this, it’s that the C++ char type is an odd beast to watch out for. Also, the name of the typedef keyword could use some improvement…

  1. If you are curious as to why I suggest overloading instead of template specialization, see this article.

New Song — Tecate Plus

An ethanol molecule

The chemical that makes beer fun happens to look a bit like a Corgi.

This song is a little bit newer; I made it sometime in early 2010. I can’t cite any specific inspiration in its making, but as the title might suggest, alcohol had something to do with it. Really, I think that this song was just the result of my recent acquisition of a legitimate copy of Ableton Live 8. It was a lot more stable than the pirate trial version I was using before, and knowing that it wasn’t going to randomly crash made it easier to spend more time on a track. I can’t really say enough good things about Live 8, but I’ll save that for a future post (maybe I should do some tutorials?). Aside from my Live 8 fever, I think that I had recently made a nice pad sound on my Alesis Andromeda A6, and I had to create a song in which to use it.

Evan Mezeske — Tecate Plus


New Song — Arabesque

64K is a lot of memory.

I made this song a couple years back, mostly due to inspiration from the demoscene. I was also listening to a lot of Commodore 64 music at the time (both classics and modern originals composed for the SID chip). The MOS Technology SID chips used for audio synthesis in the C64 were extremely advanced, and way ahead of their time, featuring things like ADSR envelopes which are normally found on standalone synthesisers. Speaking of which, the SID chips had such a following that Elektron put together a standalone sound module based on them called the SidStation. Of course the SID can be accurately emulated in software, but that won’t stop the hardware purists from pining for the original, flawed hardware. At any rate, I don’t have a SID chip or an emulator, but that didn’t stop me from trying to make a SID-like song!

Evan Mezeske — Arabesque


The Terror of the Long Comment

A code comment can be a wonderful thing. It can offer a gem of context around a quirky bit of code that will make the reader’s life easier for years to come:

// Fields c and d are intentionally switched in version 3 of the
// protocol; see RFC 2324 S 12.
fields.push_back( a );
fields.push_back( b );
fields.push_back( d );
fields.push_back( c );

If I were reading the above code snippet without the comment, I’d probably be very skeptical about the order in which things were pushed into the fields container. The comment cures this skepticism and justifies the odd-looking but entirely correct code.

I find that most useful comments are about the length of the one above, and they usually have the same purpose: to explain something that would otherwise give the reader some pause. The other necessary condition for a useful comment is that there isn’t any obvious way to rewrite the code so that no comment is required for it to be easily understood. In the case of the above code snippet, it could perhaps be rewritten as follows:

void reorder_fields_for_version_3( container& fields )
{
    assert( fields.size() >= 4 );
    std::swap( fields[2], fields[3] ); // See RFC 2324 S 12.
}

fields.push_back( a );
fields.push_back( b );
fields.push_back( c );
fields.push_back( d );
reorder_fields_for_version_3( fields );

I think that each of the above code snippets could be appropriate, depending on its context. If the field-switching only occurs in one place, the first approach would be the simplest. Otherwise, if the field-switching happens in several places, the second approach would probably be best. At any rate, if I ran across either of the solutions in some code I was reading, I would understand what was going on.

This brings me to the topic of this post, which is that while comments can be extremely useful, they can also be one of the most terrifying implements of the confused programmer — striking pain and suffering directly into the soul of the reader. Comments of this type can twist the reader’s contented mood into one of abject pessimism:

// When using this function, ensure that you obtain the
// base_class_lock_ before you obtain the send_queue_lock_,
// otherwise when the base class' munge() function is called,
// it will update the target map prematurely -- and if that
// happens, the next asynchronous call to pop_queue() will skip
// over some of the records in the queue.  Normally this would be
// acceptable, but other_class_x depends on the records coming
// out of the queue in the order they were put in (unless
// WEIRD_MACRO_Z is defined).
queue_result enqueue( const record& r )
{
    /* ... */
}

If you’re anything like me, you will anticipate the pain of that comment before you start reading it, simply because of its length. I almost never see six-line comments that do not hurt my brain. The only occasion on which such a lengthy comment is really justified is when it describes some complicating factor that is external to the code itself. For example, a comment in a piece of client software might need to be lengthy to describe some bizarre behavior in the server it is meant to talk to. That doesn’t seem to be avoidable. However, if the code that one is currently writing requires a six-line comment to explain itself, that should be seen as a huge red flag. Such a long comment should tell the author to step back and look at their design, to figure out how to simplify it so that no such comments are necessary. For instance, the author might read that comment and ask themselves the following questions:

// When using this function, ensure that you obtain the
// base_class_lock_ before you obtain the send_queue_lock_,

Do there need to be two locks? If so, could the send_queue_lock_ be integrated directly into the send queue? Could the locking be factored out into a separate function so that it always happens in the right order?
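
As one possible answer to that last question, the ordering rule could live in a single small guard type instead of in a comment. This is only a sketch under assumptions: the send_queue class, its int records, and the ordered_guard helper are all invented for the example, and only the two lock names come from the comment above.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical class; only the lock names are taken from the comment.
class send_queue
{
public:
    void enqueue( const int record )
    {
        const ordered_guard guard( *this ); // callers never pick the order
        records_.push_back( record );
    }

    std::size_t size()
    {
        const ordered_guard guard( *this );
        return records_.size();
    }

private:
    // The one place that knows the required locking order. Members are
    // constructed in declaration order and destroyed in reverse.
    struct ordered_guard
    {
        explicit ordered_guard( send_queue& q )
            : base_( q.base_class_lock_ ), queue_( q.send_queue_lock_ ) {}

        std::lock_guard<std::mutex> base_;  // acquired first
        std::lock_guard<std::mutex> queue_; // acquired second, released first
    };

    std::mutex base_class_lock_;
    std::mutex send_queue_lock_;
    std::vector<int> records_;
};
```

With something like this, the first two lines of the long comment have nothing left to say: a caller of enqueue() cannot take the locks in the wrong order, because the caller never touches them.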

// otherwise when the base class' munge() function is called,
// it will update the target map prematurely -- and if that
// happens, the next asynchronous call to pop_queue() will skip
// over some of the records in the queue.

Can the munge() function be changed so that it always updates the target map correctly? Maybe it should be responsible for obtaining its own locks. Why should pop_queue() ever be allowed to skip records? Can it be changed so that it doesn’t?

// Normally this would be
// acceptable, but other_class_x depends on the records coming
// out of the queue in the order they were put in (unless
// WEIRD_MACRO_Z is defined).

The user of this function shouldn’t have to know about what other_class_x does. And is WEIRD_MACRO_Z necessary? If it’s specific to other_class_x, why does the user of this function need to know about it?

By addressing a few of these questions, it might be possible for the author to shrink that big, complicated, painful, six-line comment down to a reasonable size. And by doing so, they might save some of the future maintainers of their code from developing severe drinking problems. More importantly, though, they will probably end up with less buggy code, because in my experience, the more complicated rules that I have to hold in my head while writing code, the more I screw up. The code that contains the most lengthy, most frequent comments is commonly the worst code.