C++ Streams & Typedefs: Be Charful

The C++ typedef keyword is indispensable in many situations, especially for writing portable low-level code. However, in some circumstances it can cause trouble, particularly when it comes to function overloading. Consider the following C++ class template:

template <typename T>
struct foobar
{
    foobar( const T foo ) : foo_( foo ) {}
    T foo_;
};

One might want to write a simple stream output operator to format the class template’s member values, e.g. for debugging purposes:

// Assumes <iostream> is included and that a using-directive for
// namespace std is in effect.
template <typename T>
ostream& operator<<( ostream& s, const foobar<T>& fb )
{
    return s << "foo: " << fb.foo_;
}

This seems reasonable. Now, let’s assume that this template is going to be used in a context where T will be one of several fixed-width integer types. These are usually typedefs from a header like stdint.h (for those that don’t mind including a C header), boost/cstdint.hpp (to be a C++ purist), or, since C++11, the standard <cstdint>. They are commonly named int64_t, int32_t, int16_t, and int8_t, where the X in intX_t specifies the number of bits used to represent the integer. There are also unsigned variants, but we’ll ignore those for this discussion.

Let’s now explore what happens when we initialize a foobar<intX_t> instance with its foo_ member set to a small integer and print it to standard output via our custom stream output operator:

cout << foobar<int64_t>( 42 ) << endl;
cout << foobar<int32_t>( 42 ) << endl;
cout << foobar<int16_t>( 42 ) << endl;

Each of these statements prints “foo: 42”, as expected. Great, everything works! But wait, there was one type that we didn’t test:

cout << foobar<int8_t>( 42 ) << endl;

This prints “foo: *” instead of “foo: 42”, because 42 is the ASCII code of the asterisk. This is probably not the expected result of printing the value of an int8_t. After all, it looks and feels just like all of the other intX_t types! What causes it to be printed differently from the other types? Let’s look at how the integer types might be defined for a 64-bit x86 machine:

typedef long int int64_t;
typedef int int32_t;
typedef short int16_t;
typedef char int8_t;

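The problem is that the only way to represent an integer with exactly 8 bits (and no more) is with one of the character types (at least on the x86 architecture). Many implementations use signed char rather than plain char for int8_t, but the stream output operators treat char, signed char, and unsigned char all as characters, so the effect is the same. While a char is an integer, it is also a… character. So, this trouble is caused by the fact that the char type is trying to be two things at once.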

A simple (but incorrect) approach to work around this is to overload [1] the stream output operator for the int8_t type, and force it to be printed as a number:

// This is incorrect:
ostream& operator<<( ostream& s, const int8_t i )
{
    return s << static_cast<int>( i );
}

The problem with this approach is that the int8_t typedef does not represent a unique type. The typedef keyword is named poorly; it does not introduce new types. Rather, it creates aliases for existing types. By overloading the stream output operator for the int8_t type, the operator for the underlying character type is being overloaded as well. The standard library already provides stream output operators for the character types, so this definition competes with them rather than adding behavior for int8_t alone. On typical implementations it still compiles, because the standard operators are function templates in namespace std, and a non-template overload declared elsewhere is a distinct function that overload resolution prefers for an argument of exactly that type. The result is that every value of that character type is printed as a number wherever the overload is visible, which is probably not desirable.

An alternative (working) solution to the problem is to overload the stream output operator for the foobar<int8_t> type:

ostream& operator<<( ostream& s, const foobar<int8_t>& fb )
{
    return s << "foo: " << static_cast<int>( fb.foo_ );
}

This definition does not clash with any existing overloads from the standard library, and it effectively causes the int8_t to be printed as an integer. The downside is that, on a platform where int8_t is an alias for char, it will cause unexpected behavior when a foobar<char> is printed and the programmer intends char to represent a character. The only way to avoid this would be to define int8_t as a class instead of making it a typedef, and to provide a well-behaved stream output operator for that class. The class’ arithmetic operators could be overloaded to make it look almost exactly like a POD integer, and it wouldn’t necessarily take up any extra memory. However, this solution is still not ideal, because classes behave differently than built-in types in subtle ways (e.g. a local variable of built-in type is left uninitialized by default, whereas a class with a user-provided default constructor always runs that constructor).

If there’s anything to take away from this, it’s that the C++ char type is an odd beast to watch out for. Also, the name of the typedef keyword could use some improvement…

  1. If you are curious as to why I suggest overloading instead of template specialization, see this article.
