C++ : C++/Smalltalk differences and keys to learning C++

PART 14


Q88: Why does C++'s FAQ have a section on Smalltalk? Is this Smalltalk-bashing?
A: The two `major' OOPLs in the world are C++ and Smalltalk.  Because Smalltalk
is the OOPL with the second largest user pool, many new C++ programmers come
from a Smalltalk background.  This section answers the
questions:
 * what's different about the two languages
 * what must a Smalltalk-turned-C++ programmer know to master C++

This section does *!*NOT*!* attempt to answer the questions:
 * which language is `better'?
 * why is Smalltalk `bad'?

Nor is it an open invitation for some Smalltalk terrorist to slash my tires
while I sleep (on those rare occasions when I have time to rest these days :-).



Q89: What's the difference between C++ and Smalltalk?
A: There are many differences such as compiled vs perceived-as-interpreted,
hybrid vs pure, faster vs perceived-as-slower, etc.  Some of these aren't true
(ex: a large portion of a typical Smalltalk program can be compiled by current
implementations, and some Smalltalk implementations perform reasonably well).
But none of these affect the programmer as much as the following three issues:

	* strong typing vs weak typing (some say `static vs dynamic')
	* how you use inheritance
	* value vs reference semantics

The first two differences are illuminated in the remainder of this section; the
third point is the subject of the section that follows.

If you're a Smalltalk programmer who wants to learn C++, you'd be very wise to
study the next three questions carefully.  Historically there have been many
attempts to `make' C++ look/act like Smalltalk, even though the languages are
very Very different.  This hasn't always led to failures, but the differences
are significant enough that it has led to a lot of needless frustration and
expense.  The quotable quote of the year goes to Bjarne Stroustrup at the `C++
1995' panel discussion, 1990 C++-At-Work conference, discussing library design:
		`Smalltalk is the best Smalltalk around'.



Q90: What is `static typing', and how is it similar/dissimilar to Smalltalk?
A: Static (most say `strong') typing says the compiler checks the type-safety
of every operation *statically* (at compile-time), rather than generating code
that checks things at run-time.  For example, the signature matching of fn
arguments is checked, and an improper match is flagged as an error by the
*compiler*, not at run-time.

In OO code, the most common `typing mismatch' is sending a message to an object
that the recipient isn't prepared to handle.  Ex: if class `X' has member fn f()
but not g(), and `x' is an instance of class X, then x.f() is legal and x.g()
is illegal.  C++ (statically/strongly typed) catches the error at compile time,
and Smalltalk (dynamically/weakly typed) catches `type' errors at run-time.
(Technically speaking, C++ is like Pascal [*pseudo* statically typed], since
ptr casts and unions can be used to violate the typing system; you probably
shouldn't use these constructs very much).
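
The x.f() / x.g() example above can be sketched as follows.  This is only an
illustration: the body of f(), the value 42, and the helper names use_x() and
demo() are assumptions of mine, not part of any library.

```cpp
#include <cassert>

// Hypothetical class mirroring the `X' in the text: it has f() but no g().
class X {
public:
    int f() const { return 42; }   // the value 42 is arbitrary
};

int use_x()
{
    X x;
    // x.g();        // rejected at compile time: X has no member fn g()
    return x.f();    // accepted: the compiler has verified that X::f exists
}

// Usage: the only call that survives compilation is the legal one.
void demo()
{
    assert(use_x() == 42);
}
```

Uncommenting the x.g() line makes the whole file fail to compile; no run-time
check is generated for it.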



Q91: Which is a better fit for C++: `static typing' or `dynamic typing'?
A: The arguments over the relative goodness of static vs dynamic typing will
continue forever.  However one thing is clear: you should use a tool like it
was intended and designed to be used.  If you want to use C++ most effectively,
use it as a statically typed language.  C++ is flexible enough that you can
(via ptr casts, unions, and #defines) make it `look' like Smalltalk.

There are places where ptr casts and unions are necessary and even wholesome,
but they should be used carefully and sparingly.  A ptr cast tells the compiler
to believe you.  It effectively suspends the normal type checking facilities.
An incorrect ptr cast might corrupt your heap, scribble into memory owned by
other objects, call nonexistent methods, and cause general failures.  It's not
a pretty sight.  If you avoid these and related constructs, you can make your
C++ code both safer and faster -- anything that can be checked at compile time
is something that doesn't have to be done at run-time, one `pro' of strong
typing.

Even if you're in love with weak typing, please consider using C++ as a
strongly typed OOPL, or else please consider using another language that better
supports your desire to defer typing decisions to run-time.  Since C++ performs
100% type checking decisions at compile time, there is *no* built-in mechanism
to do *any* type checking at run-time; if you use C++ as a weakly typed OOPL,
you put your life in your own hands.



Q92: How can you tell if you have a dynamically typed C++ class library?
A: One hint that a C++ class library is weakly typed is when everything is
derived from a single root class, usually `Object'.  Even more telling is the
implementation of the container classes (List, Stack, Set, etc): if these
containers are non-templates, and if their elements are inserted/extracted as
ptrs to `Object', the container will promote weak typing.  You can put an Apple
into such a container, but when you get it out, the compiler only knows that it
is derived from Object, so you have to do a pointer cast (a `down cast') to
cast it `down' to an Apple (you also might hope a lot that you got it right,
cause your blood is on your own head).

You can make the down cast `safe' by putting a virtual fn into Object such as
`are_you_an_Apple()' or perhaps `give_me_the_name_of_your_class()', but this
dynamic testing is just that: dynamic.  This coding style is the essence of
weak typing in C++.  You call a function that says `convert this Object into an
Apple or kill yourself if it's not an Apple', and you've got weak typing: you
don't know if the call will succeed until run-time.
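
A minimal sketch of that coding style, assuming a hypothetical single-rooted
hierarchy.  The names Object, Apple and are_you_an_Apple() come from the text;
Orange, ObjectStack, seeds() and seeds_of_top() are made up for illustration,
and the fixed capacity is there only to keep the sketch short.

```cpp
#include <cassert>

class Apple;   // forward declaration so Object can mention it

// Hypothetical root class in the style the text describes.
class Object {
public:
    virtual ~Object() {}
    // Run-time test: returns a ptr to the Apple, or a null ptr otherwise.
    virtual Apple* are_you_an_Apple() { return 0; }
};

class Apple : public Object {
public:
    virtual Apple* are_you_an_Apple() { return this; }
    int seeds() const { return 5; }      // arbitrary Apple-only operation
};

class Orange : public Object { };

// Non-template container: elements go in and come out as ptrs to Object.
class ObjectStack {
public:
    ObjectStack() : size_(0) {}
    void    push(Object* p) { assert(size_ < Capacity); data_[size_++] = p; }
    Object* pop()           { assert(size_ > 0); return data_[--size_]; }
private:
    enum { Capacity = 16 };
    Object* data_[Capacity];
    int     size_;
};

// Returns the seed count, or -1 if the element turned out not to be an Apple.
// Whether the `down cast' succeeds is only known at run-time.
int seeds_of_top(ObjectStack& s)
{
    Apple* a = s.pop()->are_you_an_Apple();
    return a ? a->seeds() : -1;
}

void demo()
{
    Apple a;
    Orange o;
    ObjectStack s;
    s.push(&a);
    s.push(&o);                      // the compiler can't object to this
    assert(seeds_of_top(s) == -1);   // the Orange comes out first
    assert(seeds_of_top(s) == 5);    // then the Apple
}
```

Nothing stops a caller from pushing an Orange where an Apple was expected; the
mistake shows up only when the element is popped and tested.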

When used with templates, the C++ compiler can statically validate 99% of an
application's typing information (the figure `99%' is apocryphal; some claim
they always get 100%, others find the need to do persistence which cannot be
statically type checked).  The point is: C++ gets genericity from templates,
not from inheritance.
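
For contrast, here is a sketch of a container that gets its genericity from a
template.  The Stack template, its fixed capacity, and the Apple/Orange classes
are hypothetical and kept deliberately small; a real library would grow on
demand and handle overflow more gracefully.

```cpp
#include <cassert>

// A minimal fixed-capacity stack template (illustrative only).
template<class T>
class Stack {
public:
    Stack() : size_(0) {}
    void push(const T& x) { assert(size_ < Capacity); data_[size_++] = x; }
    T    pop()            { assert(size_ > 0); return data_[--size_]; }
    bool empty() const    { return size_ == 0; }
private:
    enum { Capacity = 16 };
    T   data_[Capacity];
    int size_;
};

class Apple {
public:
    Apple(int seeds = 0) : seeds_(seeds) {}
    int seeds() const { return seeds_; }
private:
    int seeds_;
};

class Orange { };

int seeds_of_top(Stack<Apple>& s)
{
    // s.push(Orange());   // rejected at compile time: an Orange isn't an Apple
    Apple a = s.pop();     // no cast needed: the compiler knows the element type
    return a.seeds();
}

void demo()
{
    Stack<Apple> s;
    s.push(Apple(3));
    s.push(Apple(7));
    assert(seeds_of_top(s) == 7);
    assert(seeds_of_top(s) == 3);
    assert(s.empty());
}
```

There is no common root class and no down cast: what goes in as an Apple comes
out as an Apple, and the compiler enforces it.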



Q93: Will `standard C++' include any dynamic typing primitives?
A: The ANSI/ISO C++ standardization committees are considering proposals to add
type-safe pointer casts and other run-time type mechanisms into the C++
standard.  When (if?) this happens, it will be easier to do run-time typing in
those cases where it truly is needed (ex: for persistence), but hopefully the
new syntax won't encourage abuses where if-then-else'ing the run-time type is
used to replace a virtual function call.

Note that the effect of a down-cast and a virtual fn call are similar: in the
member fn that results from the virtual fn call, the `this' ptr is a downcasted
version of what it used to be (it went from ptr-to-Base to ptr-to-Derived).
The difference is that the virtual fn call *always* works: it never makes the
wrong `down-cast' and it automatically extends itself whenever a new subclass
is created -- as if an extra `case' or `if/else' magically appeared in the
weak typing technique.  The other difference is that the client gives control
to the object rather than reasoning *about* the object.
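
The two techniques can be put side by side in a small sketch.  It assumes a
compiler that supports dynamic_cast (the type-safe pointer cast that did end up
in standard C++); the Shape hierarchy and both function names are hypothetical.

```cpp
#include <cassert>

class Shape {
public:
    virtual ~Shape() {}
    virtual int sides() const = 0;   // each subclass answers for itself
};

class Triangle : public Shape {
public:
    virtual int sides() const { return 3; }
};

class Square : public Shape {
public:
    virtual int sides() const { return 4; }
};

// Weak typing technique: the client reasons *about* the object.
// Every new subclass needs another branch here, or it falls through.
int sides_by_testing(const Shape& s)
{
    if (dynamic_cast<const Triangle*>(&s)) return 3;
    if (dynamic_cast<const Square*>(&s))   return 4;
    return -1;                       // unknown subclass: nobody added a branch
}

// Virtual fn technique: the client gives control to the object.
int sides_by_asking(const Shape& s)
{
    return s.sides();
}

// A subclass added later: handled by the virtual call, missed by the tests.
class Pentagon : public Shape {
public:
    virtual int sides() const { return 5; }
};

void demo()
{
    Triangle t;
    Pentagon p;
    assert(sides_by_testing(t) == 3);
    assert(sides_by_asking(t)  == 3);
    assert(sides_by_testing(p) == -1);   // the if/else chain wasn't updated
    assert(sides_by_asking(p)  == 5);    // the virtual call extended itself
}
```

Adding Pentagon required no change to sides_by_asking(), while
sides_by_testing() silently gives the wrong answer until someone edits it.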



Q94: How do you use inheritance in C++, and is that different from Smalltalk?
A: There are two reasons one might want to use inheritance: to share code, or
to express your interface compliance.  Ie: given a class `B' (`B' stands for
`base class', which is called `superclass' in Smalltalkese), a class `D' which
is derived from B is expressed this way:
	class B { /*...*/ };
	class D : public B { /*...*/ };

This says two distinct things: (1) the bits(data structure) + code(algorithms)
are inherited from B, and (2) `D's public interface is `conformal' to `B's
(anything you can do to a B, you can also do to a D, plus perhaps some other
things that only D's can do; ie: a D is-a-kind-of-a B).

In C++, one can use inheritance to mean:
	--> #2(is-a) alone (ex:you intend to override most/all inherited code)
	--> both #2(is-a) and #1(code-sharing)
but one should never Never use the above form of inheritance to mean
	--> #1(code-sharing) alone (ex: D really *isn't* a B, but...)

This is a major difference with Smalltalk, where there is only one form of
inheritance (C++ provides `private' inheritance to mean `share the code but
don't conform to the interface').  The Smalltalk language proper (as opposed to
coding practice) allows you to have the *effect* of `hiding' an inherited
method by providing an override that calls the `does not understand' method.
Furthermore Smalltalk allows a conceptual `is-a' relationship to exist *apart*
from the subclassing hierarchy (subtypes don't have to be subclasses; ex: you
can make something that `is-a Stack' yet doesn't inherit from `Stack').

In contrast, C++ is more restrictive about inheritance: there's no way to make
a `conceptual is-a' relationship without using inheritance (the C++ work-around
is to separate interface from implementation via ABCs).  The C++ compiler
exploits the added semantic information associated with public inheritance to
provide static typing.
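
As a sketch of `code-sharing alone', here is private inheritance at work.  The
classes IntList and IntStack are hypothetical, and the fixed capacity is only
there to keep the example short.

```cpp
#include <cassert>

// Hypothetical code donor: a fixed-capacity list of ints.
class IntList {
public:
    IntList() : size_(0) {}
    void append(int x)   { assert(size_ < Capacity); data_[size_++] = x; }
    int  remove_last()   { assert(size_ > 0); return data_[--size_]; }
    int  at(int i) const { assert(0 <= i && i < size_); return data_[i]; }
    int  size() const    { return size_; }
private:
    enum { Capacity = 16 };
    int data_[Capacity];
    int size_;
};

// Private inheritance: IntStack reuses IntList's code but is *not* a
// kind-of IntList, so clients can't call at() or treat it as an IntList.
class IntStack : private IntList {
public:
    void push(int x)   { append(x); }
    int  pop()         { return remove_last(); }
    bool empty() const { return size() == 0; }
};

void demo()
{
    IntStack s;
    s.push(1);
    s.push(2);
    // s.at(0);            // rejected at compile time: at() is not accessible
    // IntList& l = s;     // rejected at compile time: no is-a conversion
    assert(s.pop() == 2);
    assert(s.pop() == 1);
    assert(s.empty());
}
```

Had IntStack used `public' inheritance instead, both commented-out lines would
compile, and clients could poke into the middle of the stack.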



Q95: What are the practical consequences of diffs in Smalltalk/C++ inheritance?
A: Since Smalltalk lets you make a subtype without making a subclass, one can
be very carefree in putting data (bits, representation, data structure) into a
class (ex: you might put a linked list into a Stack class).  After all, if
someone wants something that is an array-based-Stack, they don't have to
inherit from Stack; they can go off and make effectively a stand-alone class
(they might even *inherit* from an Array class, even though they're
not-a-kind-of-Array!).

In C++, you can't be nearly as carefree.  Since only mechanism (method code),
but not representation (data bits) can be overridden in subclasses, you're
usually better off *not* putting the data structure in a class.  This leads to
the concept of Abstract Base Classes (ABCs), which are discussed in a separate
question.  You can change the algorithm but NOT the data structure.  Bits are
forever.
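
One way to keep the bits out of the base class is sketched below: an ABC that
holds no data, with two derived classes that each pick their own
representation.  All names here (Stack, ArrayStack, ListStack, sum_and_drain)
are hypothetical, and the stack holds plain ints to keep things short.

```cpp
#include <cassert>

// Abstract Base Class: interface only, no data bits.
class Stack {
public:
    virtual ~Stack() {}
    virtual void push(int x) = 0;
    virtual int  pop() = 0;
    virtual bool empty() const = 0;
};

// One representation: a fixed-size array.
class ArrayStack : public Stack {
public:
    ArrayStack() : size_(0) {}
    virtual void push(int x)   { assert(size_ < Capacity); data_[size_++] = x; }
    virtual int  pop()         { assert(size_ > 0); return data_[--size_]; }
    virtual bool empty() const { return size_ == 0; }
private:
    enum { Capacity = 16 };
    int data_[Capacity];
    int size_;
};

// Another representation: a singly linked list.
class ListStack : public Stack {
public:
    ListStack() : head_(0) {}
    virtual ~ListStack() { while (!empty()) pop(); }
    virtual void push(int x) { head_ = new Node(x, head_); }
    virtual int pop()
    {
        assert(head_ != 0);
        Node* n = head_;
        int x = n->value;
        head_ = n->next;
        delete n;
        return x;
    }
    virtual bool empty() const { return head_ == 0; }
private:
    struct Node {
        Node(int v, Node* n) : value(v), next(n) {}
        int   value;
        Node* next;
    };
    Node* head_;
    ListStack(const ListStack&);             // copying not supported here
    ListStack& operator=(const ListStack&);
};

// Client code depends only on the interface, so either representation works.
int sum_and_drain(Stack& s)
{
    int total = 0;
    while (!s.empty()) total += s.pop();
    return total;
}

void demo()
{
    ArrayStack a;
    ListStack  l;
    a.push(1);  a.push(2);  a.push(3);
    l.push(10); l.push(20);
    assert(sum_and_drain(a) == 6);
    assert(sum_and_drain(l) == 30);
}
```

Because Stack itself carries no bits, neither derived class is stuck with a
data structure it didn't choose.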

I like to think of the difference between an ATV and a Maserati.  An ATV [all
terrain vehicle] is more fun, since you can `play around' by driving through
fields, streams, sidewalks and the like.  A Maserati, on the other hand, gets
you there faster, but it forces you to stay on the road.  My advice to C++
programmers is simple: stay on the road.  Even if you're one of those people
who like the `expressive freedom' to drive through the bushes, don't do it in
C++; it's not a good `fit'.

Note that C++ compilers uphold the is-a semantic constraint only with `public'
inheritance.  Neither containment (has-a), nor private or protected inheritance
implies conformance.



Q96: Do you need to learn a `pure' OOPL before you learn C++?
A: The short answer is, No.

The medium length answer is: learning some `pure' OOPLs may *hurt*
rather than help.

The long answer is: read the previous questions on the difference between C++
and Smalltalk (the usual `pure' OOPL being discussed; `pure' means everything
is an object of some class; `hybrid' [like C++] means things like int, char,
and float are not instances of a class, hence aren't subclassable).

The `purity' of the OOPL doesn't make the transition to C++ any more or less
difficult; it is the weak typing and improper inheritance that is so hard to
get.  I've taught C++ to numerous people with a Smalltalk background, and they
usually have just as hard a time as those who've never seen inheritance before.
In fact, my personal observation is that those with extensive experience with a
weakly typed OOPL (usually but not always Smalltalk) have a *harder* time,
since it's harder to *unlearn* habits than it is to learn the statically typed
way from the beginning.



Q97: What is the NIHCL?  Where can I get it?
A: NIHCL stands for `national-institutes-of-health-class-library'.
It can be acquired via anonymous ftp from [128.231.128.251]
in the file pub/nihcl-3.0.tar.Z

NIHCL (some people pronounce it `N-I-H-C-L', others pronounce it like `nickel')
is a C++ translation of the Smalltalk class library.  There are some ways where
NIHCL's use of weak typing helps (ex: persistent objects).  There are also
places where the weak typing it introduces create tension with the underlying
statically typed language.

A draft version of the 250pp reference manual is included with version 3.0
(gnu emacs TeX-info format).  It is not available via uucp, or via regular mail
on tape, disk, paper, etc (at least not from Keith Gorlen).

See previous questions on Smalltalk for more.