C++ : Container classes and templates

PART 18


Q117: How can I insert/access/change elements from a linked list/hashtable/etc?
A: I'll use a `inserting into a linked list' as a prototypical example.  The
obvious approach is to allow insertion at the head and tail of the list, but
that would produce a library that is too weak (a weak library is almost worse
than no library).  Whenever encapsulation frustrates rather than helps a user,
it may be that the class' public interface needs enhancing.  If class List only
supports adding at the front and tail, it *definitely* needs more strength.

This answer will be a lot to swallow for novice C++'ers, so I'll give a couple
of options.  As usual, the first is easiest, while the second option is better.
I also give a thumbnail sketch of a third option, which has certain advantages
and disadvantages over the second option.

[1] Empower the List with a `viewport' or `cursor' that references an arbitrary
list element.  Implies adding member fns to List such as advance(), backup(),
atend(), atbegin(), rewind(), fastforward(), and current().  `current()'
returns the element of the List that is currently `under the cursor'.  Finally
you'll need to add a few more member fns to *mutate* the list, such as
changeto(X), insert(X), remove(), etc.

[2] Provide a separate class called ListIter.  ListIter has member fns named
similar to the above (though probably with operator overloading, but that's
just syntactic sugar for member fns).  The List itself would have none of the
above mentioned member fns, and the ListIter would be a `friend' of List
(ListIter would have to have access to the innards of List; this allows the
world to have safe access abilities to List without violating encapsulation).

The reason option [2] is better becomes apparent when you use classes that only
support [1].  In particular, if you pass a List off to a subcall, you'd better
hope the subcall didn't warp the `cursor' around, or your code may fail.  Also,
if you try to get all pairs of elements, it's very hard if you only have one
cursor.  The distinct-class iterator concept removes these restrictions.  My
own class library uses these extensively, as will most any other commercial
grade class library.

Note that the options are not mutually exclusive; it is possible to provide
both [2] *and* [1], giving rise to the notion of a `primary' list position,
with instances of ListIter being somewhat secondary.

[3] The third possibility considers the entire iteration as an atomic event.  A
class is created which embodies this event.  The nice thing about this third
alternative is that the public access methods (which may be virtual fns) can be
avoided during the inner loop, thus enhancing performance.  The down side is
that you get extra object code in the application, since templates gain speed
by duplicating code.  This third technique is due to Andrew Koenig in a paper
published recently in JOOP [`Templates as interfaces', JOOP, 4, 5 (Sept 91)].
You can also see a taste of it in Bjarne Stroustrup's book, The C++ Programming
Language Second Edition (look for `Comparator' in the index).



Q118: What's the idea behind `templates'?
A: A template is a cookie-cutter that specifies how to cut cookies that all
look pretty much the same (although the cookies can be made of various kinds of
dough, they'll all have the same basic shape).  In the same way, a class
template is a cookie cutter to description of how to build classes that all
look basically the same, and a function template describes how to build similar
looking functions.

The questions about templates are in the `Containers' section since templates
are often used to build type safe containers (although this only scratches the
surface for how they can be used).  We will see how to use templates to build
container classes below.



Q119: What's the syntax / semantics for a `function template'?
A: Consider this function that swaps its two integer arguments:

	void swap(int& x, int& y)
	{
	  int tmp = x;
	  x = y;
	  y = tmp;
	}

If we also had to swap floats, longs, Strings, Sets, and FileSystems, we'd get
pretty tired of coding lines that look almost identical except for the type.
Mindless repetition is an ideal job for a computer, hence a function template:

	template
	void swap(T& x, T& y)
	{
	  T tmp = x;
	  x = y;
	  y = tmp;
	}

Every time we used `swap()' with a given pair of types, the compiler will go to
the above definition and will create yet another `template function' as an
instantiation of the above.  Ex:

	main()
	{
	  int    i,j;  /*...*/  swap(i,j);  //instantiates a swap for `int'
	  float  a,b;  /*...*/  swap(a,b);  //instantiates a swap for `float'
	  char   c,d;  /*...*/  swap(c,d);  //instantiates a swap for `char'
	  String s,t;  /*...*/  swap(s,t);  //instantiates a swap for `String'
	}

(note: a `template function' is the instantiation of a `function template').



Q120: What's the syntax / semantics for a `class template'?
A: Consider this container class of that acts like an array of integers:

	//this would go into a header file such as `Vec.h':
	class Vec {
	  int  xlen;
	  int* xdata;
	  int  check(int i);  //return i if i>=0 && i= xlen) throw BoundsViol("Vec", i, xlen);
	  return i;
	}

Just as with `swap()' above, repeating the above over and over for Vec of
float, char, String, Vec, Vec-of-Vec-of-Vec, etc, will become tedious.  Hence
we create a single class template:

	//this would go into a header file such as `Vec.h':
	template
	class Vec {
	  int  xlen;
	  T*   xdata;
	  int  check(int i);  //return i if i>=0 && i
	int Vec::check(int i)
	{
	  if (i < 0 || i >= xlen) throw BoundsViol("Vec", i, xlen);
	  return i;
	}

Unlike template functions, template classes (instantiations of class templates)
need to be explicit about the parameters over which they are instantiating:

	main()
	{
	  Vec        vi;
	  Vec      vf;
	  Vec      vc;
	  Vec     vs;
	  Vec< Vec > vv;
	}          // ^^^-- note the space; do NOT use Vec> since the
	           //       `maximal munch' rule would grab a single `>>' token



Q121: What is a `parameterized type'?
A: A parameterized type is a type that is parameterized over another value or
type.  Ex: List is a type that is parameterized over another type, `int'.
Therefore the C++ rendition of parameterized types is provided by class
templates.



Q122: What is `genericity'?
A: Not to be confused with `generality' (which just means avoiding solutions
which are overly specific), `genericity' means parameterized types.  In C++,
this is provided by class templates.



Q123: How can I fake templates if I don't have a compiler that supports them?
A: The best answer is: buy a compiler that supports templates.  When this is
not feasible, the next best answer is to buy or build a template preprocessor
(ex: reads C++-with-templates, outputs C++-with-expanded-template-classes; such
a system needn't be perfect; it cost my company about three man-weeks to
develop such a preprocessor).  If neither of these is feasible, you can use the
macro preprocessor to fake templates.  But beware: it's tedious; templates are
a better solution wrt development and maintenance costs.

Here's how you'd declare my `Vec' example from above.  First we define `Vec(T)'
to concatenate the name `Vec' with that of the type T (ex: Vec(String) becomes
`VecString').  This would go into a header file such as Vec.h:

	#include 	//to get the `name2()' macro
	#define  Vec(T)  name2(Vec,T)

Next we declare the class Vec(T) using the name `Vecdeclare(T)' (in general you
would postfix the name of the class with `declare', such as `Listdeclare' etc):

	#define Vecdeclare(T)						\
	class Vec(T) {							\
	  int  xlen;							\
	  T*   xdata;							\
	  int  check(int i);  /*return i if in bounds else throw*/	\
	public:								\
	  int       len()             const { return xlen;     }	\
	  const T&  operator[](int i) const { xdata[check(i)]; }	\
	        T&  operator[](int i)       { xdata[check(i)]; }	\
	            Vec(T)(int L=10): xlen(L), xdata(new T[L]) {/*...*/}\
	           ~Vec(T)()                   { delete [] xdata; }	\
	};

Note how each occurrence of `Vec' has the `(T)' postfixed.  Finally we set up
another macro that `implements' the non-inline member function(s) of Vec:

	//strangely enough this can also go into Vec.h
	#define Vecimplement(T)						\
	int Vec(T)::check(int i)					\
	{								\
	  if (i < 0 || i >= xlen) throw BoundsViol("Vec", i, xlen);	\
	  return i;							\
	}

When you wish to use a Vec-of-String and Vec-of-int, you would say:

	#include "Vec.h"     //pulls in  too; see below...
	declare(Vec,String)  //`declare()' is a macro defined in 
	declare(Vec,int)
	Vec(String) vs;      //Vec(String) becomes the single token `VecString'
	Vec(int)    vi;

In exactly one source file in the system, you must provide implementations for
the non-inlined member functions:

	#include "Vec.h"
	declare  (Vec,String)   declare  (Vec,int)   declare  (Vec,float)
	implement(Vec,String)   implement(Vec,int)   implement(Vec,float)

Note that types whose names are other than a single identifier do not work
properly.  Ex: Vec(char*) creates a class whose name is `Vecchar*'.  The patch
is to create a typedef for the appropriate type:

	#include "Vec.h"
	typedef char* charP;
	declare(Vec,charP)

It is important that every declaration of what amounts to `Vec' must all
use exactly the same typedef, otherwise you will end up with several equivalent
classes, and you'll have unnecessary code duplication.  This is the sort of
tedium which a template mechanism can handle for you.