Search

9.12 — Shallow vs. deep copying

In the previous lesson on The copy constructor and overloading the assignment operator, you learned about the differences and similarities of the copy constructor and the assignment operator. This lesson is a follow-up to that one.

Shallow copying

Because C++ does not know much about your class, the default copy constructor and default assignment operators it provides use a copying method known as a shallow copy (also known as a memberwise copy). A shallow copy means that C++ copies each member of the class individually using the assignment operator. When classes are simple (eg. do not contain any dynamically allocated memory), this works very well.

For example, let’s take a look at our Cents class:

When C++ does a shallow copy of this class, it will copy m_nCents using the standard integer assignment operator. Since this is exactly what we’d be doing anyway if we wrote our own copy constructor or overloaded assignment operator, there’s really no reason to write our own version of these functions!

However, when designing classes that handle dynamically allocated memory, memberwise (shallow) copying can get us in a lot of trouble! This is because the standard pointer assignment operator just copies the address of the pointer -- it does not allocate any memory or copy the contents being pointed to!

Let’s take a look at an example of this:

The above is a simple string class that allocates memory to hold a string that we pass in. Note that we have not defined a copy constructor or overloaded assignment operator. Consequently, C++ will provide a default copy constructor and default assignment operator that do a shallow copy.

Now, consider the following snippet of code:

While this code looks harmless enough, it contains an insidious problem that will cause the program to crash! Can you spot it? Don’t worry if you can’t, it’s rather subtle.

Let’s break down this example line by line:

This line is harmless enough. This calls the MyString constructor, which allocates some memory, sets cHello.m_pchString to point to it, and then copies the string “Hello, world!” into it.

This line seems harmless enough as well, but it’s actually the source of our problem! When this line is evaluated, C++ will use the default copy constructor (because we haven’t provided our own), which does a shallow pointer copy on cHello.m_pchString. Because a shallow pointer copy just copies the address of the pointer, the address of cHello.m_pchString is copied into cCopy.m_pchString. As a result, cCopy.m_pchString and cHello.m_pchString are now both pointing to the same piece of memory!

When cCopy goes out of scope, the MyString destructor is called on cCopy. The destructor deletes the dynamically allocated memory that both cCopy.m_pchString and cHello.m_pchString are pointing to! Consequently, by deleting cCopy, we’ve also (inadvertently) affected cHello. Note that the destructor will set cCopy.m_pchString to 0, but cHello.m_pchString will be left pointing to the deleted (invalid) memory!

Now you can see why this crashes. We deleted the string that cHello was pointing to, and now we are trying to print the value of memory that is no longer allocated.

The root of this problem is the shallow copy done by the copy constructor -- doing a shallow copy on pointer values in a copy constructor or overloaded assignment operator is almost always asking for trouble.

Deep copying

The answer to this problem is to do a deep copy on any non-null pointers being copied. A deep copy duplicates the object or variable being pointed to so that the destination (the object being assigned to) receives it’s own local copy. This way, the destination can do whatever it wants to it’s local copy and the object that was copied from will not be affected. Doing deep copies requires that we write our own copy constructors and overloaded assignment operators.

Let’s go ahead and show how this is done for our MyString class:

As you can see, this is quite a bit more involved than a simple shallow copy! First, we have to check to make sure cSource even has a string (line 8). If it does, then we allocate enough memory to hold a copy of that string (line 11). Finally, we have to manually copy the string using strncpy() (line 14).

Now let’s do the overloaded assignment operator. The overloaded assignment operator is a tad bit trickier:

Note that our assignment operator is very similar to our copy constructor, but there are three major differences:

  • We added a self-assignment check (line 5).
  • We return *this so we can chain the assignment operator (line 26).
  • We need to explicitly deallocate any value that the string is already holding (line 9).

When the overloaded assignment operator is called, the item being assigned to may already contain a previous value, which we need to make sure we clean up before we assign memory for new values. For non-dynamically allocated variables (which are a fixed size), we don’t have to bother because the new value just overwrite the old one. However, for dynamically allocated variables, we need to explicitly deallocate any old memory before we allocate any new memory. If we don’t, the code will not crash, but we will have a memory leak that will eat away our free memory every time we do an assignment!

Checking for self-assignment

In our overloaded assignment operators, the first thing we do is check for self assignment. There are two reasons for this. One is simple efficiency: if we don’t need to make a copy, why make one? The second reason is because not checking for self-assignment when doing a deep copy will cause problems if the class uses dynamically allocated memory. Let’s take a look at an example of this.

Consider the following overloaded assignment operator that does not do a self-assignment check:

What happens when we do the following?

This statement will call our overloaded assignment operator. The this pointer will point to the address of cHello (because it’s the left operand), and cSource will be a reference to cHello (because it’s the right operand). Consequently, m_pchString is the same as cSource.m_pchString.

Now look at the first line of code that would be executed: delete[] m_pchString;.

This line is meant to deallocate any previously allocated memory in cHello so we can copy the new string from the source without a memory leak. However, in this case, when we delete m_pchString, we also delete cSource.m_pchString! We’ve now destroyed our source string, and have lost the information we wanted to copy in the first place. The rest of the code will allocate a new string, then copy the uninitialized garbage in that string to itself. As a final result, you will end up with a new string of the correct length that contain garbage characters.

The self-assignment check prevents this from happening.

Preventing copying

Sometimes we simply don’t want our classes to be copied at all. The best way to do this is to add the prototypes for the copy constructor and overloaded operator= to the private section of your class.

In this case, C++ will not automatically create a default copy constructor and default assignment operator, because we’ve told the compiler we’re defining our own functions. Furthermore, any code located outside the class will not be able to access these functions because they’re private.

Summary

  • The default copy constructor and default assignment operators do shallow copies, which is fine for classes that contain no dynamically allocated variables.
  • Classes with dynamically allocated variables need to have a copy constructor and assignment operator that do a deep copy.
  • The assignment operator is usually implemented using the same code as the copy constructor, but it checks for self-assignment, returns *this, and deallocates any previously allocated memory before deep copying.
  • If you don’t want a class to be copyable, use a private copy constructor and assignment operator prototype in the class header.
10.1 -- Constructor initialization lists
Index
9.11 -- The copy constructor and overloading the assignment operator

32 comments to 9.12 — Shallow vs. deep copying

  • enoquick

    An interesting technical for operator=() is :

    T& T::operator=(const T& x) {
    T t(x); // copy constructor
    swap(t); // exception safe
    return *this;
    }

    void T::swap(T&)throw(); // exchange the single components -- no throw exception

Leave a Reply

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">