Copying, Cloning, and Marshalling in .NET
Listing 2: Introducing the System.ICloneable interface
namespace System
{
    interface ICloneable
    {
        object Clone();
    }
}
But there are two problems with the ICloneable interface. First, it's
weakly typed: Clone is specified to return an object, which could be darn
well anything. The lack of generics (templates) in the current version of
.NET makes this unavoidable. It forces clients of ICloneable to downcast the
clone back to the type in question, which can result in cumbersome and
error-prone (or at least ugly) code.
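To make the downcast concrete, here is a minimal sketch. The Point class is hypothetical, invented purely for this illustration; only ICloneable and MemberwiseClone come from the framework.

```csharp
using System;

// Hypothetical class, used only for illustration.
class Point : ICloneable
{
    public int X;
    public int Y;

    // ICloneable forces the return type to be object
    public object Clone()
    {
        return this.MemberwiseClone();
    }
}

class DowncastDemo
{
    static void Main()
    {
        Point p = new Point();
        p.X = 1;
        p.Y = 2;

        // The caller has to cast the result back to Point
        Point q = (Point)p.Clone();

        // The next line would also compile, but it would fail at run
        // time with an InvalidCastException, because the compiler has
        // no way to check the cast:
        // string s = (string)p.Clone();

        Console.WriteLine(q.X + "," + q.Y);  // prints "1,2"
    }
}
```

The commented-out line is the "error-prone" part: nothing at compile time ties the result of Clone to the type that produced it.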
It seems to me that the best analog of a "copy constructor/assignment
operator" pair, in C#, would be an implementation of ICloneable that delegates
to a public, type-safe alternative Clone method:
Listing 3: A well-designed, cloneable class
class MyCloneableClass : System.ICloneable
{
    // Placeholder for any reference-type member that needs a deep
    // copy; its type here is an assumption made for illustration
    private MyCloneableClass somethingDeep;

    // Explicit interface method impl -- available for
    // clients of ICloneable, but invisible to casual
    // clients of MyCloneableClass
    object System.ICloneable.Clone()
    {
        // simply delegate to our type-safe cousin
        return this.Clone();
    }

    // Friendly, type-safe clone method
    public virtual MyCloneableClass Clone()
    {
        // Start with a flat, memberwise copy
        MyCloneableClass x =
            (MyCloneableClass)this.MemberwiseClone();

        // Then deep-copy everything that needs the
        // special attention (guarding against null members)
        if (this.somethingDeep != null)
        {
            x.somethingDeep = this.somethingDeep.Clone();
        }
        //...
        return x;
    }
}
ICloneable.Clone vs. Object.MemberwiseClone
In the previous section, we made use of an interesting member function, present
on all .NET objects: MemberwiseClone. This method is a source of great
confusion in the developer community. Don't be fooled by its name -- it's
certainly not any kind of alternative to ICloneable.Clone, because it's
a protected method. Furthermore, it's not even overridable by derived types,
because it's not a virtual method. Its only purpose in life seems to be to
assist us in our implementations of Clone methods, by performing the default
.NET shallow-copy in just one line of code.
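A small sketch shows what "shallow" means in practice. The Order class and its ShallowCopy method are hypothetical names, made up for this example; ArrayList is used because it is a convenient reference-type member.

```csharp
using System;
using System.Collections;

// Hypothetical class, for illustration only.
class Order
{
    public int Id;
    public ArrayList Items = new ArrayList();

    // MemberwiseClone is protected, so it can only be
    // called from inside the class (or a derived class)
    public Order ShallowCopy()
    {
        return (Order)this.MemberwiseClone();
    }
}

class ShallowCopyDemo
{
    static void Main()
    {
        Order a = new Order();
        a.Id = 1;
        a.Items.Add("widget");

        Order b = a.ShallowCopy();
        b.Id = 2;               // value-type field: b has its own copy
        b.Items.Add("gadget");  // reference-type field: same list as a

        Console.WriteLine(a.Id);           // prints 1
        Console.WriteLine(a.Items.Count);  // prints 2
    }
}
```

The copy gets its own Id, but both objects still point at the same ArrayList, which is exactly the part a hand-written Clone has to fix up.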
Now, this raises the following question: why is there no corresponding "DeepClone"
method on System.Object? Shouldn't it be possible for the framework to provide
a method that queries each member for the ICloneable interface, and either
calls ICloneable.Clone on that member or performs a bitwise (shallow) copy of
the member, as appropriate? This would allow a great many implementations of
Clone to be written with just one trivial line of code:
Listing 4: Wishful thinking
public virtual MyCloneableClass Clone()
{
    // let .NET do the heavy lifting
    return this.DeepClone();
}
The only exceptions, of course, would be types that contain one or more references to objects that neglect to implement ICloneable as expected, or object graphs that contain circular references (like the doubly-linked list example in Figure 1). These objects would have to be copied "by hand," in the current manner.
Anyway, this brings us to the second problem with ICloneable: although it's a
well-known interface, defined by the system, the .NET runtime doesn't seem to
make use of it (at least not in any context that I've yet encountered). This is
in contrast to most of the other system-defined "IXXXXable" interfaces (e.g.,
ISerializable, IComparable, IEnumerable, IDisposable), each of which is
either called by the .NET runtime in some situation, or else serves to support
some language feature (e.g., C#'s foreach and using constructs are supported
by IEnumerable and IDisposable, respectively).
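For comparison, here is roughly what those two constructs turn into by hand. This is a simplified sketch, not the compiler's exact output, and the Resource class is a hypothetical stand-in for any disposable type.

```csharp
using System;
using System.Collections;

// Hypothetical disposable type, for illustration only.
class Resource : IDisposable
{
    public bool Disposed = false;

    public void Dispose()
    {
        Disposed = true;
    }
}

class LanguageSupportDemo
{
    static void Main()
    {
        ArrayList list = new ArrayList();
        list.Add("a");
        list.Add("b");

        // foreach (object item in list) { ... } is roughly:
        IEnumerator e = list.GetEnumerator();
        while (e.MoveNext())
        {
            object item = e.Current;
            Console.WriteLine(item);
        }

        // using (Resource r = new Resource()) { ... } is roughly:
        Resource r = new Resource();
        try
        {
            Console.WriteLine(r.Disposed);
        }
        finally
        {
            if (r != null)
            {
                r.Dispose();
            }
        }
        Console.WriteLine(r.Disposed);
    }
}
```

There is no equivalent construct in C# that calls ICloneable.Clone on our behalf.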
It seems that ICloneable is purely a convention -- no better or worse than
recommending that we all implement a public Clone method to accomplish the
same thing. The fact that ICloneable is an interface does, however, make it
easy for callers to query an object of unknown origin for its copy-semantics,
without resorting to reflection (although in practice, the need to do this does
not come up very often).
Listing 5: Do our best to make a copy of object x, deep or shallow
public static object MakeCopyOf(object x)
{
    if (x is ICloneable)
    {
        // Return whatever copy the object chooses to make
        // (deep, we hope, but ICloneable doesn't promise it)
        return ((ICloneable)x).Clone();
    }
    else if (x is ValueType)
    {
        // x is a boxed value; the caller gets its own copy
        // of the value as soon as it unboxes the result
        return ((ValueType)x);
    }
    else
    {
        // Without resorting to reflection or serialization,
        // all we can do is fall back to default copy semantics,
        // which will return a ref to the same physical object
        // (not what we want!)
        throw new
            System.NotSupportedException("object not cloneable");
    }
}
What's So Special About System.String?
An interesting case study in the field of object-cloning is that of
System.String. We all know how easy it is to give pass-by-reference semantics
to a value-type: just box it, and copy the object around. But how do you give
pass-by-value semantics to a reference-type? Surely it must be possible,
because System.String gets away with it. Or is System.String special in some
way?
To be sure, strings are given a lot of special treatment in .NET. They even have
some of their very own Intermediate Language (IL) opcodes. However, there is no
magic at play here: System.String accomplishes its pass-by-value trick by
virtue of being immutable, by design. In other words, strings in .NET are
passed by reference just like instances of any other class. But you can't
easily test that hypothesis without modifying the string's contents, and
there simply aren't any methods that modify a string without returning a
newly-created string instance.
Every method that might appear, at first glance, to modify a string in fact
returns a modified copy of the string. Unlike C++ and some other languages,
.NET does not offer the concept of "const" methods -- methods that do not modify
any of the object's member variables. If it did, however, then every instance
method on the System.String class would surely be marked "const".
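A minimal sketch of this behavior, using only System.String and Object.ReferenceEquals:

```csharp
using System;

class StringImmutabilityDemo
{
    static void Main()
    {
        string a = "hello";
        string b = a;   // copies the reference, not the characters

        // Both variables refer to the same string object
        Console.WriteLine(Object.ReferenceEquals(a, b));  // True

        // ToUpper does not touch the original; it returns a new string
        string c = a.ToUpper();

        Console.WriteLine(a);  // hello
        Console.WriteLine(c);  // HELLO
        Console.WriteLine(Object.ReferenceEquals(a, c));  // False
    }
}
```

Because nothing can change the characters behind a, sharing the reference with b is indistinguishable from handing b its own copy.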
This clever design is similar to the "pass-by-reference-but-copy-on-write semantics" made popular by the C++ string classes found in STL and MFC. It is a very efficient design, because strings are expensive to copy, so copying should happen only when necessary.
Note that it is probably not worthwhile to imitate this technique in your own
classes. Only strings are used so heavily as to justify the excessive amount of
code involved (viz. a whole separate class, System.Text.StringBuilder, is
needed to avoid spurious copying in some other common usage scenarios). But
it's always good to understand how the magic works.
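As a rough illustration of the kind of scenario StringBuilder is there for, compare building a string piece by piece both ways. This is a sketch of the usage pattern only, not a benchmark.

```csharp
using System;
using System.Text;

class StringBuilderDemo
{
    static void Main()
    {
        // Each += creates a brand-new string and copies
        // the old contents into it
        string s = "";
        for (int i = 0; i < 5; i++)
        {
            s += i.ToString();
        }

        // StringBuilder appends into an internal, mutable buffer
        // and produces a string only when asked
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5; i++)
        {
            sb.Append(i);
        }

        Console.WriteLine(s);              // 01234
        Console.WriteLine(sb.ToString());  // 01234
    }
}
```

Both loops produce the same text; the difference is how many intermediate string instances get created along the way.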
Now let's leave the topic of object-cloning behind for a while, and expand our horizons by analyzing what happens when we pass parameters to objects that live in remote app domains!
Marshalling Arguments Across AppDomain Boundaries
The topic of remoting in .NET (whereby objects communicate across AppDomain boundaries) is long and complex -- far beyond the scope of this article. However, a central premise of all remoting architectures is marshalling, and marshalling is very closely related to the subject matter of this article (namely, the passing of objects as arguments to method calls), so it's worth taking a look at how marshalling works in .NET remoting. (See the References section for some interesting links to learn more about remoting in .NET, in general.)
For our purposes, marshalling can be defined as the mechanism by which arguments to/from a method call are transported, across some communication channel, to a remote recipient. Often the process of marshalling involves serialization -- persisting the object's state into a stream, and reconstituting the object "over there." Other times, it involves the creation of a proxy object, which will in turn marshal arguments to subsequent method calls back and forth across the wall.
The former case is known as "marshal-by-value" (or MBV), and it's very much like the pass-by-value semantics exhibited by value-types. The latter case is known as "marshal-by-reference" (or MBR), and, no surprise, it's very much like the pass-by-reference semantics that we've already seen. But the most important thing to understand about marshalling in .NET is that the default semantics are different: by default, all objects in .NET (both value- and reference-types) are marshalled by value when sent across the "wire" to a remote AppDomain.
But how can a reference-type be passed by value? Didn't we learn, back in Listing
5, that this was impossible (without resorting to reflection or
serialization)? We did. And indeed it's true: if you attempt to pass an MBV
object that is not serializable (that is, not marked with the [Serializable]
attribute), you will get a SerializationException at run time. Listing 6
demonstrates this phenomenon.

