Generics are an exciting feature in .NET 2.0. But what are generics? Are they for you? Should you use them in your applications? In this article, we'll answer these questions and take a closer look at how generics are used, along with their capabilities and limitations.
Many of the languages in .NET, like C#, C++, and VB.NET (with option strict on), are
strongly typed languages. As a programmer using these languages, you expect the
compiler to perform type-safety checks. For instance, if you try to treat
or cast a reference of the type Book as a reference of the type Vehicle, the compiler
will tell you that such a cast is invalid.
However, when it comes to collections in .NET 1.0 and 1.1, there is no help with
type safety. Consider an ArrayList, for example. It holds a collection of objects.
This allows you to place an object of just about any type into an ArrayList.
Let's take a look at the code in Example 1.
Example 1. Lack of type safety in ArrayList
using System;
using System.Collections;
namespace TestApp
{
class Test
{
[STAThread]
static void Main(string[] args)
{
ArrayList list = new ArrayList();
list.Add(3);
list.Add(4);
//list.Add(5.0);
int total = 0;
foreach(int val in list)
{
total = total + val;
}
Console.WriteLine(
"Total is {0}", total);
}
}
}
I am creating an instance of ArrayList and adding 3 and 4 to it. Then I loop
through the ArrayList, fetching the int values from it and adding them. This
program will produce the result "Total is 7." Now, if I uncomment the statement:
list.Add(5.0);
the program will produce a runtime exception:
Unhandled Exception: System.InvalidCastException: Specified cast is not valid.
at TestApp.Test.Main(String[] args) in c:\workarea\testapp\class1.cs:line 18
What went wrong? Remember that ArrayList holds a collection of objects.
When you add a 3 to the ArrayList, you are boxing the value 3.
When you loop through the list, you are unboxing the elements as int.
However, when you add the value 5.0, you are boxing a double.
On line 18, that double value is being unboxed as an int,
and that is the cause of failure.
(The above example would not fail, however, if it were written in
VB.NET. The reason is that VB.NET, instead of unboxing, invokes a method
that converts the values to Integer. The VB.NET code will still fail
if a value in the ArrayList is not convertible to Integer.
See Gotcha #9, "Typeless ArrayList Isn't Type-Safe," in my book
.NET Gotchas for further details.)
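The difference between unboxing and converting can be sketched in a few lines. This is a minimal illustration, not part of the original example; the class name UnboxingDemo and its two methods are hypothetical names chosen here.

```csharp
using System;

public static class UnboxingDemo
{
    // Unboxing succeeds only when the boxed value is exactly an int.
    // Returns false (and sets value to 0) when the cast is rejected.
    public static bool TryUnboxInt(object boxed, out int value)
    {
        try
        {
            value = (int)boxed;   // unbox; throws if boxed holds a double
            return true;
        }
        catch (InvalidCastException)
        {
            value = 0;
            return false;
        }
    }

    // Convert.ToInt32 performs a numeric conversion instead of an unbox,
    // so a boxed double such as 5.0 is accepted.
    public static int ConvertToInt(object boxed)
    {
        return Convert.ToInt32(boxed);
    }
}
```

Calling UnboxingDemo.TryUnboxInt(5.0, out v) returns false, while UnboxingDemo.ConvertToInt(5.0) returns 5. Either way, the mismatch is only discovered while the program runs.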
As a programmer who is used to the type safety provided by the language, you would rather have the problems pop up during compile time instead of runtime. This is where generics come in.
Generics allow you to realize type safety at compile time. They allow you to create a data structure without committing to a specific data type. When the data structure is used, however, the compiler makes sure that the types used with it are consistent for type safety. Generics provide type safety, but without any loss of performance or code bloat. While they are similar to templates in C++ in this regard, they are very different in their implementation.
The System.Collections.Generic namespace contains the generic collections in .NET 2.0.
Various collections/container classes have been "parameterized." To use them,
simply specify the type for the parameterized type and off you go. See Example 2:
Example 2. Type-safe generic List
List<int> aList = new List<int>();
aList.Add(3);
aList.Add(4);
// aList.Add(5.0);
int total = 0;
foreach(int val in aList)
{
total = total + val;
}
Console.WriteLine("Total is {0}", total);
In Example 2, I am creating an instance of the generic List with the type
int, given within the angle brackets (<>), as the parameterized type. This code, when
executed, will produce the result "Total is 7." Now, if I uncomment the statement
aList.Add(5.0);, I will get a compilation error. The compiler determines that it
can't send the value 5.0 to the method Add(), as it only accepts an int. Unlike
Example 1, this code has type safety built into it.
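List is only one of the parameterized collections in the namespace. As a further sketch, Dictionary<TKey, TValue> takes two type arguments, one for keys and one for values. The WordCounter class and its Count method below are illustrative names, not something from the article.

```csharp
using System;
using System.Collections.Generic;

public static class WordCounter
{
    // Counts occurrences of each word. Keys are string and values are int,
    // so reading a count back needs no cast and no unboxing.
    public static Dictionary<string, int> Count(IEnumerable<string> words)
    {
        Dictionary<string, int> counts = new Dictionary<string, int>();
        foreach (string word in words)
        {
            int current;
            counts.TryGetValue(word, out current); // current is 0 when the key is absent
            counts[word] = current + 1;
        }
        return counts;
    }
}
```

An attempt such as counts["a"] = 1.5 or counts[42] = 1 is rejected by the compiler, in the same way that aList.Add(5.0) is rejected for a List<int>.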
Generics are not merely a language-level feature. The .NET CLR recognizes generics, which
makes them a first-class feature in .NET. A separate class is not emitted into the
Microsoft Intermediate Language (MSIL) for each type argument used with a generic. In
other words, your assembly contains only one definition of your parameterized data
structure or class, irrespective of how many different types are used for that parameterized
type. For instance, if you define a generic type MyList<T>, only one definition of that
type is present in MSIL. When the program executes, different classes are dynamically created,
one for each type for the parameterized type. If you use MyList<int> and
MyList<double>, then two classes are created on the fly when your program executes.
Let's examine this further in Example 3.
Example 3. Writing a generic class
//MyList.cs
#region Using directives
using System;
using System.Collections.Generic;
using System.Text;
#endregion
namespace CLRSupportExample
{
public class MyList<T>
{
private static int objCount = 0;
public MyList()
{
objCount++;
}
public int Count
{
get
{
return objCount;
}
}
}
}
//Program.cs
#region Using directives
using System;
using System.Collections.Generic;
using System.Text;
#endregion
namespace CLRSupportExample
{
    class SampleClass {}
    class Program
    {
        static void Main(string[] args)
        {
            MyList<int> myIntList = new MyList<int>();
            MyList<int> myIntList2 = new MyList<int>();
            MyList<double> myDoubleList
                = new MyList<double>();
            MyList<SampleClass> mySampleList
                = new MyList<SampleClass>();
            Console.WriteLine(myIntList.Count);
            Console.WriteLine(myIntList2.Count);
            Console.WriteLine(myDoubleList.Count);
            Console.WriteLine(mySampleList.Count);
            Console.WriteLine(
                new MyList<SampleClass>().Count);
            Console.ReadLine();
        }
    }
}
I have created a generic class named MyList. To parameterize it, I simply added a type
parameter in angle brackets after the class name. The T within <> represents the actual type that will be specified when the
class is used. Within the MyList class, I have a static field, objCount. I am incrementing this
within the constructor so I can find out how many objects of that type are created by the
user of my class. The Count property returns the number of instances of the same type as
the instance on which it is called.
In the Main() method, I am creating two instances of MyList<int>, one instance of
MyList<double>, and two instances of MyList<SampleClass>, where
SampleClass is a
class I have defined. The question is: what will be the value of Count? That is, what is the
output from the above program? Go ahead and think on this and try to answer this
question before you read further.
Have you worked the above question? Did you get the following answer?
2
2
1
1
2
The first two values of 2 are for MyList<int>. The first value of 1 is for MyList<double>.
The second value of 1 is for MyList<SampleClass>; only one instance of this type had been
created at that point in the control flow. The last value of 2 is also for
MyList<SampleClass>, since another instance of this type has been created at this point in
the code. The above example illustrates that MyList<int> is a different class from
MyList<double>, which in turn is a different class from MyList<SampleClass>. So, in this
example, we have four classes of MyList: MyList<T>, MyList<int>, MyList<double>,
and MyList<SampleClass>. Again, while there are four classes of MyList, only one is stored
in MSIL. How can we prove this? Figure 1 shows the MSIL using the ildasm.exe tool.

Figure 1. A look at MSIL for Example 3
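The same point can also be checked in code rather than with ildasm.exe. The sketch below is a hedged illustration using hypothetical names (Counter<T>, ClosedTypeDemo); it compares the runtime Type objects and keeps one static counter per closed type.

```csharp
using System;

public class Counter<T>
{
    // One copy of this field exists per closed type, such as Counter<int>.
    private static int created = 0;

    public Counter()
    {
        created++;
    }

    public static int Created
    {
        get { return created; }
    }
}

public static class ClosedTypeDemo
{
    // Counter<int> and Counter<double> are represented by different Type objects.
    public static bool AreDistinctTypes()
    {
        return typeof(Counter<int>) != typeof(Counter<double>);
    }
}
```

After creating two Counter<long> objects and one Counter<string> object, Counter<long>.Created is 2 and Counter<string>.Created is 1, mirroring the behavior of objCount in Example 3.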
In addition to having generic classes, you may also have generic methods. Generic methods may be part of any class. Let's look at Example 4:
Example 4. A generic method
public class Program
{
public static void Copy<T>(List<T> source, List<T> destination)
{
foreach (T obj in source)
{
destination.Add(obj);
}
}
static void Main(string[] args)
{
List<int> lst1 = new List<int>();
lst1.Add(2);
lst1.Add(4);
List<int> lst2 = new List<int>();
Copy(lst1, lst2);
Console.WriteLine(lst2.Count);
}
}
The Copy() method is a generic method that works with the parameterized type T.
When Copy() is invoked in Main(), the compiler figures out the specific
type to use, based on the arguments presented to the Copy() method.
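As another small sketch of a generic method, here is a Swap() method; the GenericMethods class and Swap are names assumed for illustration. The type argument can be inferred, as with Copy(), or written out explicitly.

```csharp
using System;

public static class GenericMethods
{
    // Exchanges two variables of the same type T.
    public static void Swap<T>(ref T left, ref T right)
    {
        T temp = left;
        left = right;
        right = temp;
    }
}
```

With int a = 1, b = 2, the call GenericMethods.Swap(ref a, ref b) lets the compiler infer T as int. The call GenericMethods.Swap<string>(ref s1, ref s2) states the type argument explicitly; both forms invoke the same method.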
If you create generic data structures or classes, like MyList in Example 3, there are no
restrictions on what types may be used for the parameterized type. This
leads to some limitations, however. For example, you are not allowed to use
==, !=, or < on instances of the parameterized type:
if (obj1 == obj2) …
The implementations of operators such as == and != are different for value types and
reference types. The behavior of the code might not be easy to understand if these were
allowed arbitrarily. Another restriction is on the use of the default constructor. For instance, if
you write new T(), you will get a compilation error, because not all classes have a
no-parameter constructor. What if you do want to create an object using new T(), or you
want to use operators such as == and !=? You can, but first you have to constrain the types
that can be used for the parameterized type. Let's look at how to do that.
A generic class allows you to write your class without committing to any type, yet allows the user of your class, later on, to indicate the specific type to be used. While this gives greater flexibility, by placing some constraints on the types that may be used for the parameterized type, you gain some control in writing your class. Let's look at an example:
Example 5. The need for constraints: code that will not compile
public static T Max<T>(T op1, T op2)
{
    // Does not compile: T is not known to have CompareTo()
    if (op1.CompareTo(op2) < 0)
        return op2;
    return op1;
}
The code in Example 5 will produce a compilation error:
Error 1 'T' does not contain a definition for 'CompareTo'
Assume I need the type to support the CompareTo() method. I can specify
this by using the constraint that the type specified for the parameterized type must implement the
IComparable interface. Example 6 has the code:
Example 6. Specifying a constraint
public static T Max<T>(T op1, T op2) where T : IComparable
{
    // op1 is smaller than op2, so op2 is the maximum
    if (op1.CompareTo(op2) < 0)
        return op2;
    return op1;
}
In Example 6, I have specified the constraint that the type used for the
parameterized type must implement IComparable.
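As a variation, and not what the example above uses, the constraint can name the generic interface IComparable<T>, which also ships in .NET 2.0. CompareTo() then takes a T rather than an Object, so value types are not boxed for the comparison. The Comparisons class name below is an assumption for this sketch.

```csharp
using System;

public static class Comparisons
{
    // Returns the larger of the two arguments.
    // T must implement IComparable<T>, so CompareTo accepts a T directly.
    public static T Max<T>(T op1, T op2) where T : IComparable<T>
    {
        if (op1.CompareTo(op2) < 0)
            return op2;   // op1 is smaller, so op2 is the maximum
        return op1;
    }
}
```

Comparisons.Max(3, 7) returns 7, and Comparisons.Max(2.5, 1.5) returns 2.5. A call with a type that does not implement IComparable<T> fails at compile time.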
The following constraints may be used:
where T : struct          type must be a value type (a struct)
where T : class           type must be a reference type (a class)
where T : new()           type must have a no-parameter constructor
where T : class_name      type must be class_name or one of its subclasses
where T : interface_name  type must implement the specified interface
You may specify a combination of constraints, as in: where T : IComparable, new().
This says that the type for the parameterized type must implement the
IComparable interface and must have a no-parameter constructor.
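Here is a hedged sketch of what these constraints permit inside a method body. The class and method names (ConstrainedOps, SameReference, MaxOrNew) are invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ConstrainedOps
{
    // With the class constraint, == compiles and compares references.
    public static bool SameReference<T>(T first, T second) where T : class
    {
        return first == second;
    }

    // Combines two constraints: CompareTo() is available through IComparable,
    // and new T() is available through new().
    // Returns the largest item, or a newly constructed T for an empty list.
    public static T MaxOrNew<T>(IList<T> items) where T : IComparable, new()
    {
        if (items.Count == 0)
        {
            return new T();
        }
        T best = items[0];
        for (int i = 1; i < items.Count; i++)
        {
            if (items[i].CompareTo(best) > 0)
            {
                best = items[i];
            }
        }
        return best;
    }
}
```

ConstrainedOps.MaxOrNew<int>(new int[] { 3, 9, 4 }) returns 9, and the same call with an empty array returns 0, the value produced by new int(). Without the where clauses, neither method body would compile.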
A generic class that uses parameterized types, like MyClass1<T>, is called
an open-constructed generic. A generic class that uses no parameterized types,
like MyClass1<int>, is called a closed-constructed generic.
You may derive from a closed-constructed generic; that is, you may inherit a class
named MyClass2 from another class named MyClass1, as in:
public class MyClass2<T> : MyClass1<int>
You may derive from an open-constructed generic, provided the deriving class declares the type parameter. For example:
public class MyClass2<T> : MyClass1<T>
is valid, but
public class MyClass2<T> : MyClass1<Y>
is not valid, because Y is a type parameter that MyClass2 does not declare. Non-generic classes may derive from closed-constructed
generic classes, but not from open-constructed generic classes. That is,
public class MyClass : MyClass1<int>
is valid, but
public class MyClass : MyClass1<T>
is not.
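The rules above can be collected into one compilable sketch. The class names here (Repository, IntCache, TrackedRepository, CustomerIds) are hypothetical stand-ins for MyClass1 and MyClass2; the invalid forms are left as comments because they would stop the file from compiling.

```csharp
using System;

public class Repository<T> { }

// Generic class deriving from a closed-constructed generic.
public class IntCache<T> : Repository<int> { }

// Generic class deriving from an open-constructed generic;
// T is declared by the derived class and passed to the base.
public class TrackedRepository<T> : Repository<T> { }

// Non-generic class deriving from a closed-constructed generic.
public class CustomerIds : Repository<int> { }

// Would not compile: Y is not declared anywhere.
// public class Broken<T> : Repository<Y> { }

// Would not compile: a non-generic class has no T to pass along.
// public class AlsoBroken : Repository<T> { }
```

At runtime, typeof(TrackedRepository<string>).BaseType is Repository<string>, while typeof(IntCache<string>).BaseType is Repository<int> regardless of the type argument given to IntCache.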
When we deal with inheritance, we need to be careful about substitutability. If B inherits
from A, then anywhere an object of A is used, an object of B may also be used. Let's
assume we have a Basket of Fruits (Basket<Fruit>). Apple and Banana (kinds of Fruits)
inherit from Fruit. Should a Basket of Apples (Basket<Apple>) inherit from a Basket of Fruits
(Basket<Fruit>)? The answer is no, if we think about substitutability. Why? Consider a
method that works with a Basket of Fruits:
public void Package(Basket<Fruit> aBasket)
{
aBasket.Add(new Apple());
aBasket.Add(new Banana());
}
If an instance of Basket<Fruit> is sent to this method, the method would add an Apple and
a Banana. However, what would the effect be of sending an instance of a Basket<Apple>
to this method? You see, this gets tricky. That is why if you write:
Basket<Apple> anAppleBasket = new Basket<Apple>();
Package(anAppleBasket);
You will get an error:
Error 2 Argument '1':
cannot convert from 'TestApp.Basket<TestApp.Apple>'
to 'TestApp.Basket<TestApp.Fruit>'
The compiler protects us from shooting ourselves in the foot by making sure we don't arbitrarily pass a collection of derived where a collection of base is expected. That is pretty good, isn't it?
Wait a minute, though! That was great in the above example, but there are times when I do want to
pass a collection of derived where a collection of base is needed. For instance, consider an Animal
(such as Monkey), which has a method named Eat that takes a Basket<Fruit>, as shown below:
public void Eat(Basket<Fruit> fruits)
{
foreach (Fruit aFruit in fruits)
{
// code to eat fruit
}
}
Now, you may call:
Basket<Fruit> fruitsBasket = new Basket<Fruit>();
… // Fruits added to Basket
anAnimal.Eat(fruitsBasket);
What if you have a Basket<Banana> with you? Would it make sense to send a Basket<Banana>
to the Eat method? In this case, it would, no? But the compiler will give us an
error if we try:
Basket<Banana> bananaBasket = new Basket<Banana>();
//…
anAnimal.Eat(bananaBasket);
The compiler is protecting us here. How can we ask the compiler to let us through in this particular case? Again, constraints come in handy for this:
public void Eat<T>(Basket<T> fruits) where T : Fruit
{
    foreach (Fruit aFruit in fruits)
    {
        // code to eat fruit
    }
}
In writing the Eat() method, I am asking the compiler to allow a Basket of any type T,
where T is of the type Fruit or any class that inherits from Fruit.
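To try this out end to end, a Basket<T> class is needed, and the article does not show one. The version below is a minimal assumed implementation backed by a List<T>; Eat() returns a count here, also an assumption, only so that its effect can be observed.

```csharp
using System;
using System.Collections.Generic;

public class Fruit { }
public class Apple : Fruit { }
public class Banana : Fruit { }

// Hypothetical minimal Basket<T>; the real class is not shown in the article.
public class Basket<T> : IEnumerable<T>
{
    private readonly List<T> items = new List<T>();

    public void Add(T item)
    {
        items.Add(item);
    }

    public IEnumerator<T> GetEnumerator()
    {
        return items.GetEnumerator();
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public class Animal
{
    // Accepts Basket<Fruit>, Basket<Apple>, Basket<Banana>, and so on.
    // Returns how many fruits were eaten.
    public int Eat<T>(Basket<T> fruits) where T : Fruit
    {
        int eaten = 0;
        foreach (Fruit aFruit in fruits)
        {
            eaten++;   // stand-in for the code that eats the fruit
        }
        return eaten;
    }
}
```

A Basket<Banana> holding two bananas can now be passed to Eat(), which returns 2. Because Eat() only reads from the basket, accepting a basket of a derived type does not run into the problem that Package() had.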
Delegates can be generics as well. This provides quite a bit of flexibility.
Assume we are interested in writing a framework. We need to provide a mechanism
for an event source to talk to an object that is interested in the event.
Our framework may not be able to control what the events are. You may be
dealing with a stock price change (double price). I may be dealing with
temperature change in a boiler (temperature value), where Temperature may
be an object that has some information such as value, units,
threshold, and so on. How can I define an interface for these events?
Let's take a look at how we can realize this by using pre-generic delegates:
public delegate void NotifyDelegate(Object info);
public interface ISource
{
event NotifyDelegate NotifyActivity;
}
We have the NotifyDelegate accepting an Object. This is the best we
could do in the past, as Object can be used to represent different types
such as double, Temperature, and so on, though it involves boxing overhead for
value types. ISource is an interface that different sources will
support. The framework exposes the NotifyDelegate delegate and the
ISource interface.
Let's look at two different sources:
public class StockPriceSource : ISource
{
public event NotifyDelegate NotifyActivity;
//…
}
public class BoilerSource : ISource
{
public event NotifyDelegate NotifyActivity;
//…
}
If we have an object of each of the above classes, we would register a handler for events, as shown below:
StockPriceSource stockSource = new StockPriceSource();
stockSource.NotifyActivity
+= new NotifyDelegate(
stockSource_NotifyActivity);
// Not necessarily in the same program… we may have
BoilerSource boilerSource = new BoilerSource();
boilerSource.NotifyActivity
+= new NotifyDelegate(
boilerSource_NotifyActivity);
In the delegate handler methods, we would do something like the following:
For the handler for stock event, we would have:
void stockSource_NotifyActivity(object info)
{
double price = (double)info;
// downcast required before use
}
The handler for the temperature event may look like this:
void boilerSource_NotifyActivity(object info)
{
Temperature value = info as Temperature;
// downcast required before use
}
The above code is not intuitive, and is messy with the downcasts. With generics, the code is more readable and easier to work with. Let's take a look at the code with generics at work:
Here is the delegate and the interface:
public delegate void NotifyDelegate<T>(T info);
public interface ISource<T>
{
    event NotifyDelegate<T> NotifyActivity;
}
We have parameterized the delegate and the interface. The implementor of the interface can now say what the type should be.
The Stock source would look like this:
public class StockPriceSource : ISource<double>
{
public event NotifyDelegate<double> NotifyActivity;
//…
}
and the Boiler source would look like this:
public class BoilerSource : ISource<Temperature>
{
    public event NotifyDelegate<Temperature> NotifyActivity;
    //…
}
If we have an object of each of the above classes, we would register a handler for events, as shown below:
StockPriceSource stockSource = new StockPriceSource();
stockSource.NotifyActivity
+= new NotifyDelegate<double>(
stockSource_NotifyActivity);
// Not necessarily in the same program… we may have
BoilerSource boilerSource = new BoilerSource();
boilerSource.NotifyActivity
+= new NotifyDelegate<Temperature>(
boilerSource_NotifyActivity);
Now, the event handler for stock price would be:
void stockSource_NotifyActivity(double info)
{
//…
}
and the event handler for the temperature is:
void boilerSource_NotifyActivity(Temperature info)
{
//…
}
This code has no downcast and the types involved are very clear.
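The pieces can be assembled into one compilable sketch. The Publish() method and the PriceRecorder class are not in the article; they are assumed here to show one way a source might raise the event and a subscriber might receive it.

```csharp
using System;

public delegate void NotifyDelegate<T>(T info);

public interface ISource<T>
{
    event NotifyDelegate<T> NotifyActivity;
}

public class StockPriceSource : ISource<double>
{
    public event NotifyDelegate<double> NotifyActivity;

    // Hypothetical helper that raises the event with a new price.
    public void Publish(double price)
    {
        NotifyDelegate<double> handler = NotifyActivity;
        if (handler != null)
        {
            handler(price);
        }
    }
}

public class PriceRecorder
{
    public double LastPrice;

    // Matches NotifyDelegate<double>, so the price arrives as a double
    // with no cast and no unboxing.
    public void OnPrice(double info)
    {
        LastPrice = info;
    }
}
```

After subscribing with source.NotifyActivity += new NotifyDelegate<double>(recorder.OnPrice) and calling source.Publish(42.5), recorder.LastPrice holds 42.5. Registering a handler that takes a Temperature on this source is a compile-time error.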
Since generics are supported at the CLR level, you may use the reflection API to get information about generics. One thing may be a bit confusing when you are new to generics: there is the generic class you write, and then there are the types created from it at runtime. So, when using the reflection API, you have to make an extra effort to keep in mind which type you are dealing with. I illustrate this in Example 7:
Example 7. Reflection on generics
public class MyClass<T> { }
class Program
{
static void Main(string[] args)
{
MyClass<int> obj1 = new MyClass<int>();
MyClass<double> obj2 = new MyClass<double>();
Type type1 = obj1.GetType();
Type type2 = obj2.GetType();
Console.WriteLine("obj1's Type");
Console.WriteLine(type1.FullName);
Console.WriteLine(
type1.GetGenericTypeDefinition().FullName);
Console.WriteLine("obj2's Type");
Console.WriteLine(type2.FullName);
Console.WriteLine(
type2.GetGenericTypeDefinition().FullName);
}
}
I have an instance of MyClass<int>. I ask for the class name of this instance.
Then I call GetGenericTypeDefinition() on this type. GetGenericTypeDefinition()
will return the type metadata for MyClass<T> in this example. You may check the
IsGenericTypeDefinition property to ask if this is a generic type definition (like MyClass<T>)
or if its type parameters have been specified (like MyClass<int>). Similarly,
I query an instance of MyClass<double> for its metadata. The output from the above
program is shown below:
obj1's Type
TestApp.MyClass`1
[[System.Int32, mscorlib, Version=2.0.0.0, Culture=neutral,
PublicKeyToken=b77a5c561934e089]]
TestApp.MyClass`1
obj2's Type
TestApp.MyClass`1
[[System.Double, mscorlib, Version=2.0.0.0, Culture=neutral,
PublicKeyToken=b77a5c561934e089]]
TestApp.MyClass`1
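Note that the assembly-qualified name inside the double brackets (mscorlib) describes the
type argument, System.Int32 or System.Double, not the constructed type. MyClass<int> and
MyClass<double> are created at runtime from MyClass<T>, and like MyClass<T> they are
reported as belonging to my assembly.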
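A few more reflection members are useful when telling a generic type definition apart from a type constructed from it. The sketch below uses a hypothetical Box<T> class and helper names; the calls themselves (GetGenericArguments, GetGenericTypeDefinition, IsGenericTypeDefinition, Assembly) are members of System.Type.

```csharp
using System;

public class Box<T> { }

public static class GenericReflection
{
    // Returns the type arguments of a constructed generic type,
    // for example { typeof(int) } for a Box<int> instance.
    public static Type[] ArgumentsOf(object instance)
    {
        return instance.GetType().GetGenericArguments();
    }

    // True when the constructed type reports the same assembly
    // as the generic type definition it was created from.
    public static bool SharesAssemblyWithDefinition(Type constructed)
    {
        return constructed.Assembly.Equals(
            constructed.GetGenericTypeDefinition().Assembly);
    }
}
```

For a Box<int>, IsGenericTypeDefinition is false and GetGenericArguments() returns System.Int32, whereas typeof(Box<>) reports IsGenericTypeDefinition as true. The type argument System.Int32 lives in the core library, which is why that assembly name appears inside the brackets of FullName.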
We have seen the power of generics so far in this article. Are there any limitations? There is one significant limitation, which I hope Microsoft addresses. In expressing constraints, we can specify that the parameter type must inherit from a class. How about specifying that the parameter must be a base class of some class? Why do we need that?
In Example 4, I showed you a Copy() method that copied contents of a source
List to a destination list. I can use it as follows:
List<Apple> appleList1 = new List<Apple>();
List<Apple> appleList2 = new List<Apple>();
…
Copy(appleList1, appleList2);
However, what if I want to copy apples from one list into a list of Fruits (where Apple
inherits from Fruit). Most certainly, a list of Fruits can hold Apples. So I want to write:
List<Apple> appleList1 = new List<Apple>();
List<Fruit> fruitsList2 = new List<Fruit>();
…
Copy(appleList1, fruitsList2);
This will not compile. You will get an error:
Error 1 The type arguments for method
'TestApp.Program.Copy<T>(System.Collections.Generic.List<T>,
System.Collections.Generic.List<T>)' cannot be inferred from the usage.
The compiler, based on the call arguments, is not able to decide what T should be.
What I really want to say is that the Copy should accept a List of some type as
the first parameter, and a List of the same type or a List of its base type as
the second parameter.
Even though there is no way to say that a type must be a base type of another, you can get around this limitation by still using the constraints. Here is how:
public static void Copy<T, E>(List<T> source,
    List<E> destination) where T : E
Here I have specified that the type T must be the same type as, or a sub-type of, E.
We got lucky with this. Why? Both T and E are being defined here. We were
able to specify the constraint (though the C# specification discourages using
E to define the constraint of T when E is being defined as well).
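Filling in the method body gives the following sketch. The ListUtil class name is assumed; Fruit and Apple are minimal stand-ins for the classes discussed above.

```csharp
using System;
using System.Collections.Generic;

public class Fruit { }
public class Apple : Fruit { }

public static class ListUtil
{
    // T must be E or derive from E, so every T converts implicitly to E.
    public static void Copy<T, E>(List<T> source, List<E> destination)
        where T : E
    {
        foreach (T obj in source)
        {
            destination.Add(obj);
        }
    }
}
```

With a List<Apple> and a List<Fruit>, the call ListUtil.Copy(appleList1, fruitsList2) compiles, with T inferred as Apple and E as Fruit. Swapping the two arguments is rejected by the compiler, because Fruit does not derive from Apple.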
Consider the following example, however:
public class MyList<T>
{
    public void CopyTo(MyList<T> destination)
    {
        //…
    }
}
I should be able to call CopyTo:
MyList<Apple> appleList = new MyList<Apple>();
MyList<Apple> appleList2 = new MyList<Apple>();
//…
appleList.CopyTo(appleList2);
I must also be able to do this:
MyList<Apple> appleList = new MyList<Apple>();
MyList<Fruit> fruitList2 = new MyList<Fruit>();
//…
appleList.CopyTo(fruitList2);
This, of course, will not work. How can we fix this? We need to say that the argument
to CopyTo() can be either MyList of some type or MyList
of the base type of that type.
However, the constraints do not allow us to specify the base type. How about the
following?
public void CopyTo<E>(MyList<E> destination) where T : E
Sorry, this does not work. It gives a compilation error that:
Error 1 'TestApp.MyList<T>.CopyTo<E>()' does not define type
parameter 'T'
Of course, you may write the code to accept a MyList of any arbitrary type and then, within
your code, verify that the type is an acceptable one. However, this
pushes the checking to runtime, losing the benefit of compile-time type safety.
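One way the runtime check might look is sketched below. The backing List<T>, the Add() method, the Count property, and the choice of ArgumentException are all assumptions made for this illustration, not the article's implementation.

```csharp
using System;
using System.Collections.Generic;

public class Fruit { }
public class Apple : Fruit { }

// Hypothetical MyList<T> with just enough storage to demonstrate CopyTo.
public class MyList<T>
{
    private readonly List<T> items = new List<T>();

    public int Count
    {
        get { return items.Count; }
    }

    public void Add(T item)
    {
        items.Add(item);
    }

    // Accepts a MyList of any E, then verifies at runtime
    // that a T can be stored where an E is expected.
    public void CopyTo<E>(MyList<E> destination)
    {
        if (!typeof(E).IsAssignableFrom(typeof(T)))
        {
            throw new ArgumentException(
                "Cannot copy " + typeof(T).Name +
                " into a MyList of " + typeof(E).Name);
        }
        foreach (T item in items)
        {
            // The compiler cannot see the T-to-E relationship,
            // so the cast goes through object.
            destination.Add((E)(object)item);
        }
    }
}
```

Copying a MyList<Apple> into a MyList<Fruit> succeeds, while copying a MyList<Fruit> into a MyList<Apple> compiles but throws an ArgumentException when it runs. That is exactly the loss of compile-time safety described above.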
Generics in .NET 2.0 are very powerful. They allow you to write code without committing to a particular type, yet your code can enjoy type safety. Generics are implemented in such a way as to provide good performance and avoid code bloat. While there is the drawback of constraints' inability to specify that a type must be a base type of another type, the constraints mechanism gives you the flexibility to write code with a greater degree of freedom than sticking with the least-common-denominator capability of all types.
Venkat Subramaniam founder of Agile Developer, Inc., has trained and mentored thousands of software developers in the US, Canada, Europe, and Asia. Venkat helps his clients effectively apply and succeed with agile practices on their software projects. He is a frequently invited speaker at international software conferences and user groups.
Return to OnDotNet.com
Copyright © 2009 O'Reilly Media, Inc.