Language notes

Some language-specific issues with C# and with C++

Implementing an Equals method

Let us implement an .Equals() method in the class StrList. Note how two StrLists are equal if they have the same length and if the corresponding values are equal up to that length.

    public override bool Equals(System.Object obj) {
        if (obj == null) return false;

        // If parameter cannot be cast to StrList return false.
        StrList s2 = obj as StrList;	// safe downcast
        if (s2 == null) return false;

        // now we check the actual data:
        if (currsize != s2.currsize) return false;
        for (int i = 0; i < currsize; i++) {
            //if (!(elements[i].equals(s2.elements[i]))) return false;
            if (!(elements[i] == s2.elements[i])) return false;
        }
        return true;
    }

For this to work properly in all contexts, we also have to implement GetHashCode(), below.

TList doesn't have a remove() operation, so all unused slots are always null. However, we could easily add remove(), and have two TLists with the same currsize and same data up to that point, but wildly different data beyond that point.

GetHashCode(): There is a tricky issue regarding .Equals(). Most built-in data structures in C# (and in Java, with .hashCode()) simply assume that if s1.GetHashCode() != s2.GetHashCode(), then !s1.Equals(s2) (i.e. s1 ≠ s2). This lets the language use differing hashcodes as a fast way of determining that two objects are not .Equals() to each other. But if GetHashCode() is implemented for StrList by adding up the GetHashCode() values of all the cell contents, even beyond currsize, this will fail: two lists that are .Equals() could then have different hashcodes.

C# operator overloading

C# supports redefining operators in some contexts. For example, if you define a class Bignum, you can define + to work for Bignums:

    Bignum f = fact(100);
    Bignum g = exp(2,100);

    Console.WriteLine( f+g );        // versus f.Add(g)

You do this with something like this, in the class Bignum:

    public Bignum Add( Bignum second) {
        ...
    }

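    public static Bignum operator + (Bignum first, Bignum second) {
        return first.Add(second);
    }

In essence, f+g is now treated as a call to the static method operator+(f, g), that is, like f.Add(g). (C# requires operator overloads to be public static methods that take both operands as parameters.)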

hashtable enumerator: demos/dictionary.cs

I want this to work:

	foreach (KeyValuePair<string,int> kvp in d) 
		Console.WriteLine("{0}: {1}", kvp.Key, kvp.Value);

The hashtable is an array of linked lists; the linked-list cell type is

    public class Cell {
        private K key_;
        private V val_;
        private Cell next_;
        public Cell(K k, V v, Cell n) {key_ = k; val_ = v; next_ = n;}
        public K getKey() {return key_;}
        public V getVal() {return val_;}
        public Cell next() {return next_;}
        public void setVal(V v) {val_ = v;}
        public void setNext(Cell c) {next_ = c;}
    }

To start, class dictionary<K,V> must implement the right interface from System.Collections.Generic. This works:

	class dictionary<K,V> : System.Collections.Generic.IEnumerable<KeyValuePair<K,V>> {

Then I must implement the IEnumerable method. The exact method signature is as follows; note the return type.

    IEnumerator<KeyValuePair<K,V>> IEnumerable<KeyValuePair<K,V>>.GetEnumerator() {
	return foonumerator();
    }

What is up with foonumerator()? That's here:

    IEnumerator<KeyValuePair<K,V>> foonumerator() {
	for (int i = 0; i<htablesize; i++) {
	    Cell p = htable[i];
	    while (p != null) {
		yield return new KeyValuePair<K,V>(p.getKey(), p.getVal());
		p = p.next();
	    }
	}
	yield break;
    }

Why didn't I just define this in GetEnumerator(), above? Because we must also implement the non-generic form of IEnumerable: the generic IEnumerable<T> interface inherits from the non-generic one. I did that this way:

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() { 
	return foonumerator(); 
    }

Otherwise I would have to type everything twice.

I figured this all out by reading the MSDN Dictionary.cs reference code.

Introduction to C++

Here are a few notes on this: Intro to C++

What about installing it?

Macs sometimes have Xcode. Or you can get it at https://developer.apple.com/xcode/ (or maybe the Apple App Store).

For Windows, you can install MS Visual Studio, or MinGW. The link to the MSDNAA site for Visual Studio is http://e5.onthehub.com/WebStore/ProductsByMajorVersionList.aspx?ws=afe1b6ef-7d9b-e011-969d-0030487d8897&vsro=8/.

Be sure to click register the first time you connect. Your account identifier is your Loyola email address, including the "@luc.edu".

The C++ Memory Problem

Remember List.cpp? I wanted you to write a linked-list destructor.

    ~LinkedList() {
	if (head == NULL) return;
	cout << "calling destructor on LinkedList:" << endl;
	printList();
	Cell<T> * q = head;		// we checked above head != null
	Cell<T> * p = q->next();
	
	while (q != NULL) { 
	    cout << "deleting " << q->data() << endl;
	    delete q;
	    q = p;
	    if (p != NULL) p = p->next();
	}
	cout << "done with destructor" << endl;
    }

(demo of lab3/pldlinkedlist)

How does this destructor get triggered? The lists go out of scope; that is, when we get to the } for the scope in which the list was declared, it gets destructed.

Alas, this is not really enough. Suppose we create a method to return pointers to new list objects:

    LinkedList<T> * p = mylistmaker();

Now when p goes out of scope, the destructor is not called: p is only a raw pointer, and we might have assigned the list to another variable:

    hashtable[i] = p;

But this chain is hard to keep track of. How can we, in C++, be sure that an object is deleted when it should be?

One method is iron discipline: we carefully document that callers of mylistmaker(), above, must be careful to delete the object pointed to.

Another mechanism is to allow memory leakage. I mean, how long will it take to use up 4 GB?

Smart Pointers

(Much of this material comes from Using C++11's Smart Pointers, by David Kieras of the University of Michigan.)

Yet another strategy, though, is so-called smart pointers: we create a Pointer class, and overload the (unary) * and -> operators so that its objects behave like pointers. When a smart pointer goes out of scope, its destructor runs. In addition, we add a reference-counting mechanism to the object pointed to (there are some other smart-pointer implementations, but this is the most common).

Here are the rules:

When a Pointer is initialized to point to an object NewObj, the reference count in NewObj is incremented
When a Pointer is assigned to point to an object NewObj, the reference count in NewObj is incremented; if the pointer previously pointed to OldObj, the reference count there is decremented
When a Pointer to Obj goes out of scope, the reference count in Obj is decremented

As long as we can control assignment to (and initialization of) the smart-pointer variables, the reference counts are easy to implement. When an object has its reference count reach zero, it is destroyed.

This is not perfect (because of the possibility of "looped" pointers, forming a cycle), but in practice it works quite well.

Here is some code involving raw pointers p, q and r:

LinkedList<T> * p = mylistmaker();
LinkedList<T> * q = p;
{
    LinkedList<T> * r = mylistmaker();
    r = p;                    // what happens to the object created by the line above?
}
q = NULL;                // what happens to the object q pointed to before?
p = mylistmaker();    // now what happens?

What could go wrong?

C++ actually contains three types of smart pointer:

shared_ptr
unique_ptr (which replaces the older auto_ptr)
weak_ptr

shared_ptr

The reference-counting smart pointer described above is shared_ptr.

Here is some code:

Thing * p = new Thing(); // raw pointer
shared_ptr<Thing> ps(new Thing()); // ps is constructed from a raw pointer

We can now do things like call (*ps).foo(), or ps->foo(). This amounts to overloading the * and -> operators.

We can even do this:

shared_ptr<Thing> ps1 (new Thing());
ps = ps1; // old Thing pointed to by ps gets deleted (or ref count gets decremented); new Thing gets its refcount incremented

Now consider this:

shared_ptr<Thing> ps2 = new Thing(); // looks equivalent to the creation above of ps, but does not compile

The shared_ptr constructor that takes a raw pointer is declared explicit, so the compiler rejects this implicit conversion. Why is the language so suspicious? Compare the following, which does compile (recall p is a raw pointer to a Thing):

shared_ptr<Thing> ps3(p); // this is dangerous!

With ps3, nothing prevents us from passing p to some other part of the program, or calling delete p. This messes up the sharing count achieved via ps3. (For the record, note that a declaration with = in it, like the one for ps2, represents a call to a constructor, and is not an assignment.)

What about a derived class ThingSpawn?

Thing* p = new ThingSpawn(); // legal
shared_ptr<Thing> ps(new ThingSpawn()); // also legal!

When we run

shared_ptr<Thing> ps (new Thing());

there are two memory allocations going on: first we create the new Thing(), and then the shared_ptr constructor allocates its bookkeeping (the manager object described below). Life would be faster if we could combine these, which we can do like this:

shared_ptr<Thing> ps (make_shared<Thing>());
shared_ptr<Thing> ps (make_shared<ThingSpawn>());

What's really going on with a shared_ptr is that it points literally to a manager object, which in turn points to the managed object, which is the Thing in the examples above:

ps ----> manager[count=1] -----------> Thing

When we add a couple of pointers

shared_ptr<Thing> ps4 = ps;
shared_ptr<Thing> ps5 = ps;

we then have a picture like this:

    ps---------------> manager[count=3] ---------> Thing
    ps4------------>----/     /
    ps5------------>---------/

weak_ptr

Recall that if we create a ring of shared_ptr, the object will not get deleted when the "external" references hit zero, because there are still "internal" references. There is another type of pointer, weak_ptr, that attempts to provide some flexibility here. In the managers diagrammed above, the count refers to the shared_ptr count; all shared_ptr managers also include a field to count weak_ptrs.

When the shared_count hits zero, the managed object is deleted, but if the weak_count is nonzero, the manager object is retained. This allows a later query via the weak_ptr to see if the managed object (the Thing) still exists.

Here's a list of ways to initialize a weak_ptr:

shared_ptr<Thing> sp(new Thing);         // create shared_ptr
weak_ptr<Thing>   wp1(sp);               // construct wp1 from a shared_ptr
weak_ptr<Thing>   wp2;                   // wp2 points to nothing (yet)
wp2 = sp;                                // legal; wp2=wp1 would also be legal
weak_ptr<Thing> wp3(wp2);                // construct wp3 from wp2
wp1.reset();                             // now wp1 is empty, though wp2 and wp3 still refer to sp's Thing

We can check if the managed object pointed to by a weak_ptr wp still exists by calling wp.expired(). (A weak_ptr cannot be tested directly with if (wp), and it cannot be dereferenced.) If wp is still valid, we can get a shared_ptr to the Thing it points to with

shared_ptr<Thing> sp = wp.lock();

If we have two objects C1 and C2 in which each object has a shared_ptr to the other, then when C1 and C2 go out of scope, the shared_ptr values keep them alive. If we'd used weak_ptrs, then C1 and C2 would be deallocated.

If we're trying to build a list structure CL that contains nodes forming a circular list, we can't use weak_ptrs, because nothing else points to the nodes. But what does make sense in this case is for the CL destructor to know about the circularity (given that the circularity is "private" to CL), and take explicit steps to delete every node.

unique_ptr

These are lightweight versions of shared_ptr, for the case when we don't actually want to do any sharing. The object pointed to by a unique_ptr gets destroyed when the unique_ptr goes out of scope. If we create a unique_ptr up:

unique_ptr<Thing> up (new Thing);
unique_ptr<Thing> up(make_unique<Thing>());

then we're not allowed to assign up to another unique_ptr, or use it to copy-construct another pointer.

However, ownership can be moved from one unique_ptr to another, with std::move(); the source unique_ptr is then left empty.