Generic Programming: Mappings between Types and Values

类别:编程语言 点击:0 评论:0 推荐:

 

Generic<Programming>: Mappings between Types and Values Andrei Alexandrescu

Note: This article updated September 21, 2000

The term "conversion" in C++ describes the process of obtaining a value of one type from a value of another type. However, sometimes you need a different kind of conversion: you might need to obtain a value when you have a type, or the other way around. Such conversions are unnatural to C++ because the world of types and that of values are separated by a thick boundary. In certain circumstances, however, you need to cross the border between the two, and this column discusses how to do that.

Mapping Integers to Types

A surprisingly simple template can be very helpful to many generic programming idioms:

template <int v> struct Int2Type { enum { value = v }; };

Int2Type "generates" a distinct type for each distinct constant integral value passed. This is because different template instantiations are distinct types, so Int2Type<0> is different from Int2Type<1> and so on. In addition, the value that generated the type is "saved" in the enum member value.

You might use Int2Type whenever you need to quickly "typify" an integral constant. Say, for example, that you design a NiftyContainer class template.

template <typename T> class NiftyContainer { ... };

NiftyContainer stores pointers to T. In some member functions of NiftyContainer, you need to clone objects of type T. If T is a non-polymorphic type, you would say:

T* pSomeObj = ...; T* pNewObj = new T(*pSomeObj);

With polymorphic types, the story is a bit more complicated, so let's say you establish the convention that all polymorphic types to be used with NiftyContainer must define a Clone virtual function. You can then clone objects like so:

T* pNewObj = pSomeObj->Clone();

Because your container must be able to accommodate both types, you must implement both cloning algorithms and select the appropriate one at compile time. You then communicate whether or not a type is polymorphic through a boolean (non-type) template parameter of NiftyContainer and rely on the programmer to pass the correct flag.

template <typename T, bool isPolymorphicWithClone> class NiftyContainer { ... }; NiftyContainer<Widget, true> widgetBag; NiftyContainer<double, false> numberBag;

If the type you store in NiftyContainer is not polymorphic, you can optimize many member functions of NiftyContainer because you can count on constant object size and value semantics. In all these member functions, you need to select one algorithm or another depending on the isPolymorphic template argument.

At first glance, it seems like you can use a mere if statement.

template <typename T, bool isPolymorphic> class NiftyContainer { ... void DoSomething(T* pObj) { if (isPolymorphic) { ... polymorphic algorithm ... } else { ... non-polymorphic algorithm ... } } };

The problem is that the compiler won't let you get away with this code. For example, if the polymorphic algorithm uses pObj->Clone, then NiftyContainer::DoSomething does not compile for any type that doesn't define a member function Clone. True, it is obvious at compile time what branch of the if statement executes. However, that doesn't matter to the compiler, which diligently tries to compile both branches, even if the optimizer will later eliminate the dead code. If you try to call DoSomething for NiftyContainer<int, false>, the compiler stops at the pObj->Clone call and says, "huh?"

But wait, there's more. If T is a polymorphic type, again the code might not compile. Assume T disables its copy constructor by making it private or protected — as a well-behaved polymorphic class should do. Then, if the non-polymorphic code branch attempts new T(*pObj), the code fails to compile.

It would be nice if the compiler didn't bother with compiling dead code, but that's not the case, so what would be a satisfactory solution?

As it turns out, there are a number of solutions, and Int2Type provides a particularly clean one. It can transform ad-hoc the boolean value isPolymorphic into two distinct types corresponding to isPolymorphic's true and false values. Then you can use Int2Type<isPolymorphic> with simple overloading, and voilà!

The incarnation of the "integral typifying" idiom is shown below.

template <typename T, bool isPolymorphic> class NiftyContainer { private: void DoSomething(T* pObj, Int2Type<true>) { ... polymorphic algorithm ... } void DoSomething(T* pObj, Int2Type<false>) { ... non-polymorphic algorithm ... } public: void DoSomething(T* pObj) { DoSomething(pObj, Int2Type<isPolymorphic>()); } };

The code is simple and to the point. DoSomething calls a private overloaded function. Depending on the value of isPolymorphic, one of the two private overloads gets called, completing the dispatch. The dummy temporary variable of type Int2Type<isPolymorphic> isn't even used; it serves only for propagating type information.

Not So Fast, Skywalker!

Seeing the above, you might think there's a smarter way, by using some of those template specialization tricks. Why is that dummy temporary necessary? There's got to be a better way. Surprisingly, however, Int2Type is very hard to beat in simplicity, generality, and effectiveness.

One attempt could involve specializing NiftyContainer::DoSomething for any T and the two possible values of isPolymorphic. This is the charter of partial template specialization, right?

template <typename T> void NiftyContainer<T, true>::DoSomething(T* pObj) { ... polymorphic algorithm ... } template <typename T> void NiftyContainer<T, false>::DoSomething(T* pObj) { ... non-polymorphic algorithm ... }

As nice as the code above might look, alas, it's not legal. There is no such thing as partially specializing one member function of a class template. You can partially specialize all of NiftyContainer:

template <typename T> class NiftyContainer<T, false> { ... non-polymorphic NiftyContainer ... };

You also can totally specialize DoSomething:

template <> void NiftyContainer<int, false>::DoSomething(int* pObj) { ... non-polymorphic algorithm ... }

but, oddly, you cannot do anything in between [1].

Another solution might involve using traits [2] and implementing DoSomething outside NiftyContainer (in the traits class). This can be awkward as it segregates part of DoSomething's implementation.

A third solution that attempts to still use traits, but tries to keep everything together, would be to define the traits class inside NiftyContainer as a private inner class. To make a long story short, this works, but by the time you manage to implement it, you come to realize just how much better the Int2Type-based idiom is. And maybe the nicest part of it is that you can actually put the little Int2Type template in a library and document its intended use.

Type-to-Type Mappings

Consider the function below:

template <class T, class U> T* Create(const U& arg) { return new T(arg); }

Create makes a new object, passing an argument to its constructor.

Now say there is a rule in your application: objects of type Widget are legacy code and must take two arguments upon construction, the second being a fixed value like -1. You don't have this problem in all classes derived from Widget.

How can you specialize Create so that it treats Widget differently from all other types? You cannot partially specialize a function; that is, you can't say something like:

// Illegal code - don't try this at home template <class U> Widget* Create<Widget, U>(const U& arg) { return new Widget(arg, -1); }

In absence of partial specialization of functions, the only tool we have is, again, overloading. We might pass a dummy object of type T and rely on overloading.

// An implementation of Create relying on overloading template <class T, class U> T* Create(const U& arg, T) { return new T(arg); } template <class U> Widget* Create(const U& arg, Widget) { return new Widget(arg, -1); } // Use Create() String* pStr = Create("Hello", String()); Widget* pW = Create(100, Widget());

The second parameter of Create serves only for selecting the appropriate overload. It also is the trouble with this would-be idiom: you waste time constructing an arbitrarily complex object that you don't use. Even if the optimizer might help you with that, nothing helps if Widget disables its default constructor.

The proverbial extra level of indirection can help here, too. An idea is to pass T* instead of T as the dummy template parameter. At runtime, you can always pass the null pointer, which is reasonably cheap to construct.

template <class T, class U> T* Create(const U& arg, T*) { return new T(arg); } template <class U> Widget* Create(const U& arg, Widget*) { return new Widget(arg, -1); } // Use Create() String* pStr = Create("Hello", (String*)0); Widget* pW = Create(100, (Widget*)0);

This way is at best confusing for Create's users. To eternalize this solution as an idiom, we can use a little template similar to Int2Type.

template <typename T> struct Type2Type { typedef T OriginalType; };

Now you can write:

template <class T, class U> T* Create(const U& arg, Type2Type<T>) { return new T(arg); } template <class U> Widget* Create(const U& arg, Type2Type<Widget>) { return new Widget(arg, -1); } // Use Create() String* pStr = Create("Hello", Type2Type<String>()); Widget* pW = Create(100, Type2Type<Widget>());

This certainly is more explanatory than an alternative solution. Again, you can include Type2Type in a library and document its intended use.

Detecting Convertibility and Inheritance

In implementing template functions and classes, a question that arises every so often is: given two arbitrary types B and D, how can you detect whether D inherits from B or not?

Discovering such relationships at compile time is key to implementing advanced optimizations in generic libraries. In a generic function, you can rely on an optimized algorithm if a class implements a certain interface, without having to apply a dynamic_cast for that.

Detecting inheritance relies on a more general mechanism, that of detecting convertibility. The more general problem that we will solve together is how to detect whether an arbitrary type T supports automatic conversion to an arbitrary type U?

There is a solution to this problem, and it relies on sizeof. (You thought the only use for sizeof was as an argument to memset, huh?) There is a surprising amount of power in sizeof; this is because you can apply sizeof to any expression, no matter how complex, and sizeof returns its size, without actually evaluating that expression at runtime. This means that sizeof is aware of overloading, template instantiation, conversion rules — everything that can take part in a C++ expression. In fact, sizeof is a complete facility for deducing the type of an expression; eventually, sizeof throws away the expression and returns you only the size of its result [3].

The idea of conversion detection relies on using sizeof with overloaded functions. You cunningly provide two overloads of a function: one accepts the type to convert to (U), and the other accepts just about anything else. You call the overloaded function with T, whose convertibility to U you want to determine. If the function that accepts a U gets called, you know that T is convertible to U; if the "fallback" function gets called, then T is not convertible to U.

To detect which function gets called, you arrange the two overloads to return types of different sizes, and discriminate with sizeof. The types themselves do not matter, as long as they have different sizes.

Let's first create two types of different sizes. (Apparently, char and long double do have different sizes, but that's not guaranteed by the standard.) A foolproof scheme would be the following:

typedef char Small; struct Big { char dummy[2]; };

By definition, sizeof(Small) is one. Also, the size of Big is unknown, but it's for sure greater than one, which is the only guarantee we need.

Next, we need the two overloads. One accepts a U and returns, say, a Small.

Small Test(U);

How do you write a function that accepts "anything else"? A template is not a solution because the template would qualify as the best match, thus hiding the conversion. We need a worse match than an automatic conversion. A quick look through the conversion rules applied for a function call gives the ellipsis match, which is the worst of all — the bottom of the list. That's exactly what the doctor prescribed.

Big Test(...);

(Calling a function with ellipsis with a C++ object has undefined results, but who cares? Nobody actually calls the function. It's not even implemented.)

Now we need to apply sizeof to the call of Test, passing it a T:

const bool convExists = sizeof(Test(T())) == sizeof(Small);

That's it! The call of Test gets a default-constructed object, T, and then sizeof extracts that expression's result size. It can be either sizeof(Small) or sizeof(Big), depending on whether the compiler found a conversion or not.

There is one little problem. What if T makes its default constructor private? In this case, the expression T fails to compile and so does all of our scaffolding. Fortunately, there is a simple solution — just use a strawman function returning a T. In this case, the compiler is happy and so are we.

T MakeT(); const bool convExists = sizeof(Test(MakeT())) == sizeof(Small);

(By the way, isn抰 it nifty just how much you can do with functions, like MakeT and Test, which not only don抰 do anything, but which don抰 even really exist at all?)

Now that we have it working, let's package everything in a class template that hides all the details of type deduction and exposes the result only.

template <class T, class U> class Conversion { typedef char Small; struct Big { char dummy[2]; }; static Small Test(U); static Big Test(...); T MakeT(); public: enum { exists = sizeof(Test(MakeT())) == sizeof(Small) }; };

Now you can test the Conversion template class by writing:

int main() { using namespace std; cout << Conversion<double, int>::exists << ' ' << Conversion<char, char*>::exists << ' ' << Conversion<size_t, vector<int> >::exists << ' '; }

This little program prints "1 0 0". Note that, although std::vector does implement a constructor taking a size_t, the conversion test returns zero because that constructor is explicit.

We can implement two more constants inside Conversion:

exists2Way, which says whether there are conversions between T and U in both directions. This is the case with int and double, for example, but various user-defined types can implement such two-way conversions. sameType, which is true if T and U are the same type.

template <class T, class U> class Conversion { ... as above ... enum { exists2Way = exists && Conversion<U, T>::exists }; enum { sameType = false }; };

We implement sameType through a partial specialization of Conversion.

template <class T> class Conversion<T, T> { public: enum { exists = 1, exists2Way = 1, sameType = 1 }; };

Hey, what about that inheritance detection thing? The nicest part is, once you抳e got the conversion thing, detecting inheritance is easy.

#define SUPERSUBCLASS(B, D) (Conversion<const D*, const B*>::exists && !Conversion<const B*, const void*>::sameType)

Obvious? Maybe still blurred a bit. SUPERSUBCLASS(B, D) evaluates to true if D inherits from B publicly, or if B and D represent the same type. SUPERSUBCLASS does its job by evaluating the convertibility from a const D* to a const B*. There are only three cases when const D* converts implicitly to const B*:

B is the same type as D; B is an unambiguous public base of D; B is void.

The last case is eliminated by the second test. Accepting the first case (B same as D) is useful in practice, because for practical purposes you can often consider that a class is its own superclass. If you need a stricter test, you can write it like so:

#define SUPERSUBCLASS_STRICT(B, D) (SUPERSUBCLASS(B, D) && !Conversion<const B, const D>::sameType)

Why does the code add all those const modifiers? The reason is that you don't want the conversion test to fail due to const issues. Therefore, the code uses const all over the place; if template code applies const twice (to a type that's already const), the second const is ignored. In a nutshell, by using const in SUPERSUBCLASS, we're always on the safe side.

Why SUPERSUBCLASS and not the cuter BASE_OF or INHERITS? For a very practical reason: with INHERITS(B, D), I kept forgetting which way the test works — does it test whether B inherits D or vice versa? SUPERSUBCLASS(B, D) makes it clearer (at least for me) which one is first and which one is second.

Conclusion

The single most important thing about the three idioms presented here is that they are reusable. You can write them in a library and let programmers use them, as opposed to asking them to grasp their intricate inner workings.

Reusability of nontrivial techniques is important. Ask people to memorize a complicated technique that helps them do something that they can do in simpler, if a bit more tedious, ways; they won't use it. Give people a simple black box that does a little helpful wonder; they might like it and use it because it's in a way "free."

Int2Type, Type2Type, and especially Conversion belong to a generic utilities library. They extend the compile-time abilities of a programmer, by making important compile-time type inferences.

Acknowledgments

If Clint Eastwood asked me "Do you feel lucky?", I'd certainly say I do. This is my third article in a row that benefits from the direct attention, and the great advice, of Herb Sutter himself. Thanks to Tomio Hoshida in Japan for spotting a bug and making insightful suggestions.

Note: This article uses excerpts from Andrei Alexandrescu's upcoming book tentatively titled Modern C++ Design (Addison-Wesley, 2001).

Notes

[1] There is no conceptual hitch to C++ supporting partial specialization of functions, a very desirable (in my opinion) feature.

[2] Andrei Alexandrescu. "Traits: The else-if-then of Types", C++ Report (April 2000).

[3] There is a proposal for adding a typeof operator to C++; that is, an operator returning the type of an expression. Having a typeof would make much template code easier to write and understand. Gnu C++ already implements typeof as an extension. Obviously, typeof and sizeof share the same backend, because sizeof has to compute the type anyway. [Look for Bill Gibbons?article in an upcoming issue of CUJ, where he will present a nifty way to create an (almost) natural typeof operator in today抯 Standard C++. — mb]

About the Author

Andrei Alexandrescu is a Development Manager at RealNetworks Inc. (www.realnetworks.com), based in Seattle, WA, and author of the acclaimed book Modern C++ Design. He may be contacted at [email protected]. Andrei is also one of the featured instructors of The C++ Seminar (www.gotw.ca/cpp_seminar).

本文地址:http://com.8s8s.com/it/it29841.htm