Improving Typed DataSets
by Shawn Wildermuth, 03/31/2003
The first time I used a typed DataSet, it was much like the beginning of a relationship.
After dealing with raw DataSets, typed DataSets seemed elegant and perfect.
Soon the cracks in the facade appeared. I knew that typed DataSets were much
easier to work with than raw DataSets, but I still longed to change some of
the ways the code was generated. Unlike with relationships, we do have some
limited control over how typed DataSets work. In this article, I will
show you typed DataSet annotations and how they can change the way that typed
DataSets are generated.
The Typed DataSet Rationale
For the uninitiated, typed DataSets are a way of creating classes that derive
from the standard ADO.NET classes of DataSet, DataTable, DataRow, etc. To see
why that is useful, suppose you want to access the CustomerID value from the
first row in the Customers table within an untyped DataSet. The code would
look something like this:
DataSet dataSet = new DataSet();
// Fill the DataSet (Code omitted for brevity)
string customerID = (string)dataSet.Tables["Customers"].Rows[0]["CustomerID"];
There are three problems with this code. First, everything is reached through
string and index lookups, so the syntax is muddled and not immediately clear
to the reader of the code. Second, any misspelling of "Customers" or "CustomerID"
only shows up as a run-time error, not as a compile error (where we would
like it to happen, to help us find the bug sooner). Lastly, we have to know
that the CustomerID field is in fact a string and not an int, Guid,
or other type. In a perfect world, it would be nice if the access were more
like a class hierarchy:
MyTypedDataSet dataSet = new MyTypedDataSet();
// Fill the DataSet (Code omitted for brevity)
string customerID = dataSet.Customers[0].CustomerID;
This syntax is much cleaner, don't you think? This is a leap forward in productivity
as well, since IntelliSense will now allow us to view the typed members more
easily. The problem comes in that we do not have much control over how the objects
are named. The Customers table is called that, but the individual row class
is called CustomersRow. This is clear, but not necessarily consistent with
the naming conventions used throughout your enterprise. Though
Microsoft has not given us full control over the code generation, they did add
Typed DataSet Annotations to help solve some of the more common issues.
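To make the second problem with untyped access concrete, here is a minimal, self-contained sketch. It builds a one-row Customers table in memory instead of filling it from a database; the class name UntypedLookupDemo, its helper methods, and the sample value "ALFKI" are my own illustrative assumptions, not part of any generated code. It shows that a misspelled column name compiles without complaint and only fails when the lookup actually runs.

```csharp
using System;
using System.Data;

public static class UntypedLookupDemo
{
    // Builds a one-row, in-memory stand-in for the Customers table
    // (hypothetical sample data; a real application would fill it from a database).
    public static DataSet BuildSample()
    {
        DataSet dataSet = new DataSet();
        DataTable customers = dataSet.Tables.Add("Customers");
        customers.Columns.Add("CustomerID", typeof(string));
        customers.Rows.Add(new object[] { "ALFKI" });
        return dataSet;
    }

    // Returns true when the string-based column lookup fails at run time.
    public static bool LookupFails(DataSet dataSet, string columnName)
    {
        try
        {
            object ignored = dataSet.Tables["Customers"].Rows[0][columnName];
            return false;
        }
        catch (ArgumentException)
        {
            // DataRow rejects column names that do not belong to the table.
            return true;
        }
    }

    public static void Main()
    {
        DataSet dataSet = BuildSample();
        Console.WriteLine(LookupFails(dataSet, "CustomerID")); // correct name
        Console.WriteLine(LookupFails(dataSet, "CustomrID"));  // typo, still compiles
    }
}
```

With a typed DataSet the equivalent typo would be a misspelled property name, which the compiler rejects before the code ever runs.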
Overview of Annotations
Annotations are simply a set of extensions to the raw XSD file that is used
by .NET to generate the typed DataSet. In general, I like the code the typed
DataSet generates, but by using annotations, I can solve some common problems:
- Renaming of classes and properties.
- Renaming relationship accessors.
- Dealing with database nulls.
In order to use annotations, you need to modify the raw XSD file to include a new namespace:
<xs:schema id="MyTypedDataSet"
targetNamespace="http://tempuri.org/Dataset1.xsd"
elementFormDefault="qualified"
attributeFormDefault="qualified"
xmlns="http://tempuri.org/Dataset1.xsd"
xmlns:mstns="http://tempuri.org/Dataset1.xsd"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"
xmlns:codegen="urn:schemas-microsoft-com:xml-msprop">
Once you have added the namespace, you're ready to start annotating the typed DataSet!
Renaming Classes and Properties
Probably the most common use of annotations is to rename classes and properties
in the typed DataSet to something more friendly to your development team. By
default, the generated classes and members are named after the table
element:
| DataSet Element | Default Naming | Annotation to Modify |
| DataTable | TableNameDataTable | typedPlural |
| DataTable methods | NewTableNameRow, AddTableNameRow, DeleteTableNameRow | typedName |
| DataRowCollection | TableName | typedPlural |
| DataRow | TableNameRow | typedName |
| DataSet Events | TableNameRowChangeEvent, TableNameRowChangeEventHandler | typedName |
For example, if your table is named Customers, the DataTable class
will be named CustomersDataTable; the DataRowCollection will be named
Customers, and the method to create a new DataRow is called
NewCustomersRow. To change these names, you will want to add codegen
annotations to change the typedPlural and typedName of the table element:
<xs:element name="Customers"
codegen:typedName="MyCustomer"
codegen:typedPlural="MyCustomers">
<xs:complexType>
<xs:sequence>
...
</xs:sequence>
</xs:complexType>
</xs:element>
Once this change is made, the DataTable is called MyCustomersDataTable,
the DataRowCollection is now called MyCustomers, and the new DataRow
method is now called NewMyCustomerRow. You can also apply the typedName
annotation to an individual column element to rename the property generated
for that DataColumn. Here, ContactTitle is exposed as TheTitle:
<xs:element name="Customers"
codegen:typedName="MyCustomer"
codegen:typedPlural="MyCustomers">
<xs:complexType>
<xs:sequence>
<xs:element name="CustomerID" type="xs:string" />
<xs:element name="CompanyName" type="xs:string" />
<xs:element name="ContactName" type="xs:string" minOccurs="0" />
<xs:element name="ContactTitle"
type="xs:string"
minOccurs="0"
codegen:typedName="TheTitle"/>
<xs:element name="Address" type="xs:string" minOccurs="0" />
<xs:element name="City" type="xs:string" minOccurs="0" />
<xs:element name="Region" type="xs:string" minOccurs="0" />
<xs:element name="PostalCode" type="xs:string" minOccurs="0" />
<xs:element name="Country" type="xs:string" minOccurs="0" />
<xs:element name="Phone" type="xs:string" minOccurs="0" />
<xs:element name="Fax" type="xs:string" minOccurs="0" />
<xs:element name="BirthDate" type="xs:dateTime" minOccurs="0" />
</xs:sequence>
</xs:complexType>
</xs:element>
Renaming Relationship Accessors
When you have set up a typed DataSet with relationships between tables, the
generated code allows you to navigate up and down each relationship using a
method that returns the matching rows in the child table and a property that
returns the parent row. By default, the method to get the child rows is called
GetTableNameRows (using the child table's name) and the property for getting
the parent row is named TableNameRow (using the parent table's name).
To rename these accessors, we annotate the relationship itself (the
keyref element in the XSD file):
<xs:keyref name="CustomersOrders"
refer="Dataset1Key1"
msdata:DeleteRule="Cascade"
codegen:typedChildren="TheOrders"
codegen:typedParent="TheCustomer">
<xs:selector xpath=".//mstns:Orders" />
<xs:field xpath="mstns:CustomerID" />
</xs:keyref>
In the CustomersRow class, we now have a method called TheOrders that returns
the orders for the particular customer. Conversely, in the OrdersRow class,
we now have a property called TheCustomer that returns the CustomersRow
that owns a particular order. Simple, huh?
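For comparison, here is a rough sketch of the same parent/child navigation written against an untyped DataSet, which is the kind of string-keyed code the typed accessors spare you from writing. The class name RelationDemo, the OrderID column, and the sample rows are assumptions made up for illustration; only the relation name CustomersOrders and the CustomerID column come from the schema above.

```csharp
using System;
using System.Data;

public static class RelationDemo
{
    // Builds Customers and Orders tables linked by a CustomersOrders relation
    // (hypothetical in-memory sample data).
    public static DataSet BuildSample()
    {
        DataSet dataSet = new DataSet();

        DataTable customers = dataSet.Tables.Add("Customers");
        customers.Columns.Add("CustomerID", typeof(string));
        customers.Rows.Add(new object[] { "ALFKI" });

        DataTable orders = dataSet.Tables.Add("Orders");
        orders.Columns.Add("OrderID", typeof(int));
        orders.Columns.Add("CustomerID", typeof(string));
        orders.Rows.Add(new object[] { 1, "ALFKI" });
        orders.Rows.Add(new object[] { 2, "ALFKI" });

        // Parent column first, then child column.
        dataSet.Relations.Add("CustomersOrders",
            customers.Columns["CustomerID"],
            orders.Columns["CustomerID"]);
        return dataSet;
    }

    public static void Main()
    {
        DataSet dataSet = BuildSample();
        DataRow customer = dataSet.Tables["Customers"].Rows[0];

        // Parent to children: the untyped counterpart of a TheOrders() call.
        DataRow[] orders = customer.GetChildRows("CustomersOrders");
        Console.WriteLine(orders.Length);

        // Child to parent: the untyped counterpart of the TheCustomer property.
        DataRow owner = orders[0].GetParentRow("CustomersOrders");
        Console.WriteLine(owner["CustomerID"]);
    }
}
```

The relation name is just a string here, so a typo in "CustomersOrders" would again surface only at run time, whereas the annotated accessors are checked by the compiler.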
Dealing with Database Nulls
By default, in DataSets (and typed DataSets, as well), when you try to access
a value in a row that is null in the database, an exception is thrown. Typed
DataSets make this easier by allowing you to call "IsFieldNameNull()"
methods to determine if a field is null before you try to access it. Sometimes
it would be nice to have a null behave differently, rather than throwing an
exception or forcing us to test for the null. Annotations come to the rescue
again. On each field in a typed DataSet, you can specify a nullValue annotation
to tell the typed DataSet how to react when an underlying field is DBNull. The
possible values are:
| Value | Behavior |
| _throw | Throws an exception. (This is what happens when you do not specify an annotation.) |
| _null | Returns a null reference if the field type is a reference type, or throws an exception if the field is a value type (e.g., strings return null, ints throw an exception). |
| _empty | Returns String.Empty for strings, and an object created with an empty constructor for all other reference types. Still throws an exception if the field is a value type. |
| Replacement Value | Specifies a default value to be returned when the field is null. The replacement must be compatible with the field's type (e.g., nullValue="0" for an int, but nullValue="Hi There" for a string). |
You can use these annotations like so:
<xs:element name="ShipName"
type="xs:string"
minOccurs="0"
codegen:nullValue=""/>
<xs:element name="ShipAddress"
type="xs:string"
minOccurs="0"
codegen:nullValue="_empty"/>
<xs:element name="ShipVia"
type="xs:int"
minOccurs="0"
codegen:nullValue="0"/>
<xs:element name="ShippedDate"
type="xs:dateTime"
minOccurs="0"
codegen:nullValue="1980-01-01T00:00:00"/>
By annotating these fields, we can control the way the nulls are handled. The
first two fields (ShipName and ShipAddress) return empty strings when a DBNull
is encountered. The third field (ShipVia) defaults to zero, and the
last field (ShippedDate) defaults to January 1st, 1980.
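To see what these annotations save you from writing, here is a small sketch of the same defaults coded by hand against an untyped DataRow. The class name NullHandlingDemo and its helper methods are hypothetical names of my own, and the code is not what the generator emits; it only mirrors the behavior described for nullValue="_empty" on a string and nullValue="0" on an int.

```csharp
using System;
using System.Data;

public static class NullHandlingDemo
{
    // Builds a one-row Orders table whose ShipName and ShipVia are both DBNull
    // (hypothetical in-memory sample data).
    public static DataRow BuildRow()
    {
        DataTable orders = new DataTable("Orders");
        orders.Columns.Add("ShipName", typeof(string));
        orders.Columns.Add("ShipVia", typeof(int));
        return orders.Rows.Add(new object[] { DBNull.Value, DBNull.Value });
    }

    // Hand-written counterpart of nullValue="_empty" for a string column.
    public static string ShipNameOrEmpty(DataRow row)
    {
        return row.IsNull("ShipName") ? String.Empty : (string)row["ShipName"];
    }

    // Hand-written counterpart of nullValue="0" for an int column.
    public static int ShipViaOrZero(DataRow row)
    {
        return row.IsNull("ShipVia") ? 0 : (int)row["ShipVia"];
    }

    public static void Main()
    {
        DataRow row = BuildRow();
        try
        {
            // Without a guard, casting the DBNull value to string fails.
            string name = (string)row["ShipName"];
            Console.WriteLine(name);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("Unguarded cast failed on DBNull");
        }

        Console.WriteLine(ShipNameOrEmpty(row).Length);
        Console.WriteLine(ShipViaOrZero(row));
    }
}
```

Every column that can be null needs a guard like this in untyped code, while the nullValue annotation moves that decision into the schema, once per field.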
Conclusion
While annotations will not fix every issue we have with typed DataSets, they do
give us some flexibility over how naming and null behaviors are handled.
Shawn Wildermuth is the founder of ADOGuy.com and is the author of "Pragmatic ADO.NET" for Addison-Wesley.

