Proper handling of calendar dates in computer programs is hard. Not only are there obvious internationalization requirements (English: January, French: Janvier, German: Januar, etc.), but also issues regarding different calendar systems (not every culture counts years starting with the birth of Jesus Christ). If very high precision or very long time scales have to be treated properly, additional concerns need to be addressed, such as the possibility of leap seconds or calendar system changes. (The Gregorian calendar commonly used in the West was adopted only in 1582, and not by all countries on the same day!)
Amid all of the issues concerning leap seconds, time zones, daylight savings time (DST), and lunar calendars, it is easy to forget that measuring time is a very simple concept: time progresses linearly. Once an origin of the time axis has been defined, any point in time is uniquely identified by the time elapsed since the origin. Note that this is independent of the geographical location or the local time zone: for a given point in time, the duration since the origin is the same for any location (ignoring relativistic corrections).
The difficulties arise when we try to interpret this point in time according to some calendar, i.e., representing it in terms of months, days, or years. Geographical information becomes relevant at this step: the same point in time corresponds to different times of day, depending on the location (i.e., time zone). Modifications based on interpreted dates are often required (which date corresponds to the day a month from today?) and pose additional difficulties: over- and underflows (a month from Dec. 15 is next year), as well as ambiguities (which day exactly corresponds to a month from Jan. 30?).
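To make the distinction between a point in time and its interpretation concrete, here is a minimal sketch. The class name InstantInZones and the helper render() are hypothetical names chosen for illustration; the sketch only assumes that the time zone IDs "GMT" and "Asia/Tokyo" are installed.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class InstantInZones {
    // Renders one point in time as wall-clock time in the given zone.
    static String render(Date instant, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(instant);
    }

    public static void main(String[] args) {
        Date origin = new Date(0L); // zero milliseconds since the epoch
        System.out.println(render(origin, "GMT"));        // 1970-01-01 00:00
        System.out.println(render(origin, "Asia/Tokyo")); // 1970-01-01 09:00
    }
}
```

The Date is the same in both calls; only the interpretation differs.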
In the original JDK 1.0, the representation for a point in time was lumped
together with the responsibility to interpret it into the class
java.util.Date. While relatively easy to handle, it was not
amenable to internationalization. This was recognized relatively early; since
JDK 1.1.4 or JDK 1.1.5, the various responsibilities for handling dates have
been distributed among the following classes:
Classes | Purpose
java.util.Date | Represents a point in time.
abstract java.util.Calendar; java.util.GregorianCalendar extends java.util.Calendar | Interpretation and manipulation of Dates.
abstract java.util.TimeZone; java.util.SimpleTimeZone extends java.util.TimeZone | Representation of an arbitrary offset from Greenwich Mean Time (GMT), including information about applicable daylight savings rules.
abstract java.text.DateFormat extends java.text.Format; java.text.SimpleDateFormat extends java.text.DateFormat | Transformation into well-formatted, printable String and vice versa.
java.text.DateFormatSymbols | Translation of the names of months, weekdays, etc., as an alternative to using the information from Locale.
java.sql.Date extends java.util.Date; java.sql.Time extends java.util.Date; java.sql.Timestamp extends java.util.Date | Represent points in time, and also include proper formatting for use in SQL statements.
Note that DateFormat and related classes are in the
java.text.* package. All date-handling classes in the
java.sql.* package extend java.util.Date. All other
classes are in the java.util.* package.
The "new" classes form three separate inheritance hierarchies, with the
top-level classes (Calendar, TimeZone, and
DateFormat) being abstract. For each abstract class,
the Java Standard Library provides one concrete implementation.
java.util.Date
The class java.util.Date represents a point in time. In many
applications, such an abstraction would be called a "TimeStamp." In the
standard Java library implementation, this point in time is represented by the
number of milliseconds since the start of the Unix epoch on January 1, 1970,
00:00:00 GMT. Conceptually, this class is therefore a very thin wrapper around
a long.
In concordance with this interpretation, observe that the only methods in this class that are not deprecated (besides those getting and setting the number of milliseconds) are those required to allow ordering.
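The "thin wrapper around a long" view can be illustrated with a short sketch. The class DateAsLong and its helper laterBy() are hypothetical; only getTime(), before(), and compareTo() come from java.util.Date.

```java
import java.util.Date;

class DateAsLong {
    // Shifts a point in time by plain arithmetic on the wrapped long.
    static Date laterBy(Date d, long millis) {
        return new Date(d.getTime() + millis);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        Date oneDayLater = laterBy(epoch, 24L * 60 * 60 * 1000);

        System.out.println(oneDayLater.getTime());            // 86400000
        System.out.println(epoch.before(oneDayLater));        // true
        System.out.println(epoch.compareTo(oneDayLater) < 0); // true
    }
}
```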
This class depends on System.currentTimeMillis() to obtain the
current point in time. Its accuracy and precision are therefore determined by
the implementation of System and the underlying layer (essentially
the OS) that it calls.
java.util.Calendar
The Calendar class represents a point in time (a
"Date"), interpreted appropriately for some locale and time zone.
Each Calendar instance wraps a long variable
containing the number of milliseconds since the epoch for the represented point
in time.
This means that Calendar is neither a (stateless) transformer
or interpreter, nor a factory for modified dates. It does not support
idioms such as:
Month Interpreter.getMonth( inputDate )
or
Date Factory.addMonth( inputDate )
Instead, a Calendar instance must be initialized to some
Date. This Calendar instance can then be modified or
queried for interpreted properties.
Bizarrely, instances of this class are always initialized to the
current time. It is not possible to obtain a Calendar instance
initialized to an arbitrary Date — the API forces the
programmer to set the date explicitly by a subsequent method call such as
setTime( date ) on an existing instance.
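A small helper can hide the two-step initialization. The following is only a sketch: CalendarFor.of() is a hypothetical convenience method, not part of the standard library, and Locale.US is passed explicitly on the assumption that a Gregorian calendar is wanted.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class CalendarFor {
    // Hypothetical stand-in for the factory method that Calendar lacks.
    static Calendar of(Date date, TimeZone zone) {
        Calendar cal = Calendar.getInstance(zone, Locale.US); // starts at "now"
        cal.setTime(date);                                    // move to the wanted date
        return cal;
    }

    public static void main(String[] args) {
        Calendar cal = of(new Date(0L), TimeZone.getTimeZone("GMT"));
        System.out.println(cal.get(Calendar.YEAR)); // 1970
    }
}
```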
The Calendar class follows an unusual idiom for allowing access
to the individual fields of the interpreted date instance. Rather than
offering a number of dedicated property getters and setters (such as
getMonth()), it offers only one, which takes an identifier for the
requested field as argument:
int get( Calendar.MONTH ) etc.
Notice that this function always returns an int!
The identifiers for the fields are defined in the Calendar
class as public static final variables. (These identifiers are raw
integers, not wrapped into an enumeration abstraction.)
Besides the identifiers (or keys) for the fields, the Calendar
class defines a number of additional public static final variables
holding the values for the fields. So, to test whether a certain date
(represented by the Calendar instance calendar) falls
into the first month of the year, one would write code like this:
if( calendar.get( Calendar.MONTH ) == Calendar.JANUARY ) {...}
Note that the months are called JANUARY,
FEBRUARY, etc., irrespective of location (as opposed to more
neutral names such as MONTH_1, MONTH_2, and so on).
There is also a field UNDECIMBER, representing the 13th month of
the year, which is required by some (non-Gregorian) calendars.
Unfortunately, keys and values are distinguished neither by name nor by grouping into separate nested interfaces.
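The sketch below shows keys and values side by side. The class CalendarFields and the helper isInJanuary() are hypothetical. Because both kinds of constants are plain ints, the compiler cannot catch a mix-up between them; note also that the month values are zero-based (Calendar.JANUARY is 0).

```java
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class CalendarFields {
    // Calendar.MONTH is a key; Calendar.JANUARY is a value for that key.
    static boolean isInJanuary(Calendar c) {
        return c.get(Calendar.MONTH) == Calendar.JANUARY;
    }

    public static void main(String[] args) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("GMT"), Locale.US);
        cal.setTime(new Date(0L)); // Thursday, January 1, 1970

        System.out.println(cal.get(Calendar.MONTH));  // 0
        System.out.println(isInJanuary(cal));         // true
        System.out.println(cal.get(Calendar.DAY_OF_WEEK) == Calendar.THURSDAY); // true
    }
}
```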
The Calendar offers three ways to modify the date represented
by the current instance: set(), add(), and
roll(). The set() method simply sets the specified
field to the desired value. The difference between add() and
roll() concerns the way they treat over- and underflows: while
add() propagates changes to "smaller" or "larger" fields,
roll() does not. For instance, when adding a month to a
Calendar instance representing Dec. 15, the year will be
incremented when using add(), but left untouched when using
roll(). The decision to provide a separate function for each
case was motivated by their possible uses in GUI situations.
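The Dec. 15 case can be tried out directly. This is a sketch only: the class AddVersusRoll and the helper afterOneMonth() are hypothetical, and the year 2002 is an arbitrary choice.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class AddVersusRoll {
    // Returns {year, month} one month after Dec. 15, 2002,
    // using add() if propagate is true and roll() otherwise.
    static int[] afterOneMonth(boolean propagate) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT"), Locale.US);
        cal.clear();
        cal.set(2002, Calendar.DECEMBER, 15);
        if (propagate) {
            cal.add(Calendar.MONTH, 1);  // overflow carries into the year
        } else {
            cal.roll(Calendar.MONTH, 1); // overflow wraps around, year unchanged
        }
        return new int[] { cal.get(Calendar.YEAR), cal.get(Calendar.MONTH) };
    }

    public static void main(String[] args) {
        int[] added = afterOneMonth(true);
        int[] rolled = afterOneMonth(false);
        System.out.println(added[0] + " / " + added[1]);   // 2003 / 0
        System.out.println(rolled[0] + " / " + rolled[1]); // 2002 / 0
    }
}
```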
The way Calendar is implemented, it contains redundant data:
all of the individual fields can be computed from the number of milliseconds since
the epoch given a time zone, and vice versa. The class declares the abstract
methods computeFields() and computeTime() for these
operations, respectively, as well as the complete() method, which
performs a complete round-trip. Because there are two sets of redundant data,
the two sets can get out of sync. According to the class's documentation,
dependent data is recomputed lazily when changes are made. Subclasses
must maintain a set of dirty flags to signal when recomputation is
required.
Implementation Leakage
It has to be said that implementation details have been oozing into the
APIs to an uncommon degree for the "new" date-handling classes. Up to a point,
this is a reflection of their intended use as base classes for customized
development, but it also seems to occasionally be a consequence of insufficient
clarity in the design of the public interfaces.
The additional functions offered by the Calendar base class
fall into three categories. There are several static factory methods to obtain
instances initialized for arbitrary time zones and locales. As mentioned above,
all instances obtained this way have already been initialized to the
current time. No factory methods are provided to obtain a
Calendar instance initialized to an arbitrary point in time.
The second group of methods consists of the methods before( Object
) and after( Object ). They take arguments of type
Object, thus allowing these methods to be overridden in subclasses
for arbitrary types of arguments.
Finally, there are a number of functions to get and set additional properties, such as the current time zone. Among them are several methods that query the possible and actual minimum and maximum values of certain fields for the current calendar implementation.
java.util.GregorianCalendar
The class GregorianCalendar is the only commonly available
subclass of Calendar. It provides an implementation of the basic
Calendar abstraction suitable for the interpretation of dates
according to the conventions used commonly in the West. It adds a number of
public constructors, as well as some functions specific to Gregorian Calendars,
such as isLeapYear().
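As a brief illustration, the following sketch uses isLeapYear() together with getActualMaximum(), one of the methods that query the actual maximum value of a field. The class LeapYears and its helpers are hypothetical names.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class LeapYears {
    static boolean isLeap(int year) {
        return new GregorianCalendar().isLeapYear(year);
    }

    // Asks the calendar for the last valid day of February in the given year.
    static int daysInFebruary(int year) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT"), Locale.US);
        cal.clear();
        cal.set(year, Calendar.FEBRUARY, 1);
        return cal.getActualMaximum(Calendar.DAY_OF_MONTH);
    }

    public static void main(String[] args) {
        System.out.println(isLeap(1900));         // false
        System.out.println(isLeap(2000));         // true
        System.out.println(daysInFebruary(2004)); // 29
        System.out.println(daysInFebruary(2003)); // 28
    }
}
```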
java.util.TimeZone and java.util.SimpleTimeZone
The TimeZone class and its subclasses are auxiliary classes,
required by Calendar to interpret dates according to the
selected time zone. Semantically, a time zone specifies a certain offset to be
added to GMT to reach the local time. Clearly, this offset changes when
daylight savings time (DST) is in effect. The TimeZone abstraction
therefore needs to keep track not only of the additional offset to be applied
if DST is in effect, but also of the rules that determine when DST is
in effect, in order to calculate the local time for any given date and
time.
The abstract base class TimeZone provides basic methods to
handle "raw" (without taking DST into account) and actual offsets (in
milliseconds!), but implementation of any functionality related to DST rules is
left to subclasses, such as SimpleTimeZone. The latter class
provides several ways to specify rules controlling the beginning and ending of
DST, such as giving an explicit day in a month or a certain weekday following
a given date. Each TimeZone also has a human-readable,
locale-dependent display name. Display names come in two styles:
LONG and SHORT.
Time zones are unambiguously determined by an identifier string. The base
class provides the static method String[] getAvailableIDs() to obtain
all installed "well-known" standard time zones. (There are 557 for my
installation, using JDK 1.4.1.) The JavaDoc defines the proper syntax to build
custom time zone identifiers, if the need arises. Also provided are static
factory methods, to obtain TimeZone instances — either for a
specific ID or the default for the current location.
SimpleTimeZone also provides some public constructors and,
surprisingly for an abstract class, so does TimeZone. (The
JavaDoc states: "For invocation by subclass constructors." Apparently, it
should have been declared protected.)
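The raw and actual offsets can be compared with a few calls. This is a sketch under stated assumptions: the class ZoneOffsets and its helper are hypothetical, the ID "America/New_York" is assumed to be installed, and July 1, 2002 is an arbitrary date on which DST was in effect there.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

class ZoneOffsets {
    // Milliseconds since the epoch for July 1, 2002, 12:00 GMT.
    static long julyNoon2002() {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT"), Locale.US);
        cal.clear();
        cal.set(2002, Calendar.JULY, 1, 12, 0, 0);
        return cal.getTime().getTime();
    }

    public static void main(String[] args) {
        TimeZone tz = TimeZone.getTimeZone("America/New_York");

        System.out.println(tz.getRawOffset());               // -18000000 (5 hours, no DST)
        System.out.println(tz.getOffset(julyNoon2002()));    // -14400000 (4 hours, DST applied)
        System.out.println(tz.inDaylightTime(new Date(0L))); // false (January)
        System.out.println(tz.getDisplayName(false, TimeZone.SHORT, Locale.US));
        System.out.println(TimeZone.getAvailableIDs().length + " zone IDs installed");
    }
}
```

Note that all offsets are reported in milliseconds, as mentioned above.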
java.text.DateFormat
While Calendar and related classes handle the locale-specific
interpretation of dates, the DateFormat classes assist
with the transformation of dates to and from human-readable strings. When
representing points in time, an additional localization issue arises: not only
the language, but also the date format is locale-dependent
(U.S.: Month/Day/Year, Germany: Day.Month.Year, etc.). The
DateFormat utility tries to manage these differences for the
application programmer.
The abstract base class DateFormat does not require (and does
not permit) the definition of arbitrary, programmer-defined date formats.
Instead, it defines four different format styles: SHORT,
MEDIUM, LONG, and FULL (in increasing
order of verbosity). Given a locale and a style, the programmer can rely on the
class to use an appropriate date format.
The abstract base class DateFormat does not define static
methods for formatting (date -> text) or parsing (text -> date). Instead, it
defines several static factory methods to obtain instances (of concrete
subclasses) initialized for a given locale and a chosen style. Since the
standard formats always include both date and time, additional factory
methods are available to obtain instances treating only the time or
date part. The String format( Date ) and Date parse( String
) methods then perform the transformation. Note that concrete subclasses
may choose to break this idiom.
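The factory-plus-style idiom might look like the following sketch. The class DateFormatStyles and its helpers are hypothetical; getDateInstance(), format(), and parse() are the library calls described above. The exact strings produced depend on the locale data shipped with the JDK, so no output is shown.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateFormatStyles {
    // Obtains a date-only formatter for the given style and locale.
    static DateFormat dateOnly(int style, Locale locale) {
        DateFormat df = DateFormat.getDateInstance(style, locale);
        df.setTimeZone(TimeZone.getTimeZone("GMT"));
        return df;
    }

    static String format(int style, Locale locale, Date d) {
        return dateOnly(style, locale).format(d);
    }

    // Formats and re-parses a date; the time of day is lost by design.
    static Date roundTrip(Date d) {
        DateFormat df = dateOnly(DateFormat.MEDIUM, Locale.US);
        try {
            return df.parse(df.format(d));
        } catch (ParseException e) {
            throw new IllegalStateException(e.getMessage());
        }
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        System.out.println(format(DateFormat.MEDIUM, Locale.US, epoch));
        System.out.println(format(DateFormat.MEDIUM, Locale.GERMANY, epoch));
        System.out.println(roundTrip(epoch).getTime()); // 0
    }
}
```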
The Calendar object used internally to interpret dates is
accessible and can be modified, as are the employed TimeZone and
NumberFormat objects. However, the locale and style can no longer
be changed once the DateFormat has been instantiated.
Also available are (abstract) methods for piece-wise parsing or formatting,
taking an additional ParsePosition or FieldPosition
argument, respectively. There are two versions for each of these methods. One
takes or returns a Date instance and the other takes or returns a
general Object, to allow handling of alternatives to
Date in subclasses. The class defines several public static
variables with names ending in _FIELD to identify the various
possible fields for use with FieldPosition (cf. the JavaDoc
for java.text.Format).
The only commonly available concrete subclass of DateFormat is
SimpleDateFormat. It provides all of the aforementioned
functionality, additionally allowing the definition of arbitrary date-formatting patterns. There is a rich syntax to specify formatting patterns; the
JavaDoc gives the full details. The pattern can be specified as an argument to
the constructors of this class or set explicitly.
Printing a Timestamp: A Cut-and-Paste Example
Imagine you want to print the current time in a user-defined format; for instance, to a log file. Here is how to do this:
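A minimal sketch of such a log timestamp follows. The pattern "yyyy-MM-dd HH:mm:ss.SSS", the class name LogStamp, and the helper stamp() are assumptions made for illustration and may differ from what the author had in mind.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class LogStamp {
    // Formats a point in time with an explicit pattern and time zone.
    static String stamp(Date when, TimeZone zone) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS", Locale.US);
        fmt.setTimeZone(zone);
        return fmt.format(when);
    }

    public static void main(String[] args) {
        // Current time, interpreted in the default time zone.
        System.out.println(stamp(new Date(), TimeZone.getDefault()));

        // A fixed instant, for comparison.
        System.out.println(stamp(new Date(0L), TimeZone.getTimeZone("GMT")));
        // 1970-01-01 00:00:00.000
    }
}
```

The helper creates a new SimpleDateFormat on every call because instances are not safe for concurrent use by several threads.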
java.sql.*
The date-and-time-handling classes in the java.sql.* package all
extend java.util.Date. The fact that there are three of them
reflects the need to model the three standard SQL92 types DATE,
TIME, and TIMESTAMP.
Like java.util.Date, all three classes in the SQL package are
thin wrappers around a numeric value representing a point in time. The
Date and Time classes ignore the information
regarding the time of day or the calendar date, respectively.
The Timestamp class, however, not only includes the usual time
and date information up to millisecond precision, but also allows storing
additional data to accurately represent a point in time with nanosecond
precision. (A nanosecond is a billionth of a second.)
Besides shadowing the corresponding SQL datatypes, these classes handle
transformations to and from SQL-conforming String representations.
To this end, each of the three classes overrides the toString()
method. Furthermore, each class provides a static factory method,
valueOf( String ), which returns an instance of the class that it
has been invoked on, initialized to the time value represented by the
String passed to it. The format of the String
representation for all of these methods is fixed by the SQL standard and cannot be
changed by the programmer.
The additional data required to store nanosecond information has not been
very well integrated with the rest of the data representing the usual time and
date information in the Timestamp class. For example, calling
getTime() on a Timestamp instance will return the
number of milliseconds since the start of the Unix epoch, ignoring the
nanosecond data. Similarly, according to the JavaDoc, the
hashCode() method has not been overridden in the subclass, and
therefore also ignores the nanosecond data.
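The split between milliseconds and nanoseconds can be observed with a short sketch. The class TimestampNanos and the sample value are hypothetical. On recent JDKs, getTime() includes the millisecond part of the fractional second and drops only the sub-millisecond remainder; older releases may behave differently, so treat the commented values as an assumption about a current runtime.

```java
import java.sql.Timestamp;

class TimestampNanos {
    // Parses the fixed SQL timestamp format; interpreted in the default time zone.
    static Timestamp parse(String sqlText) {
        return Timestamp.valueOf(sqlText);
    }

    public static void main(String[] args) {
        Timestamp ts = parse("2003-06-15 12:00:00.123456789");

        System.out.println(ts.getNanos());       // 123456789
        System.out.println(ts.getTime() % 1000); // 123: sub-millisecond digits are gone
        System.out.println(ts);                  // 2003-06-15 12:00:00.123456789
    }
}
```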
The JavaDoc for java.sql.Timestamp states that the
"inheritance relationship (...) really denotes implementation inheritance, and
not type inheritance," but even this statement is incorrect, since Java has no
notion of private (i.e. implementation) inheritance. Instead of inheriting
from java.util.Date, all of the classes in the
java.sql.* package should have been designed to encapsulate
a java.util.Date object, exposing only the methods required
— at the very least, methods such as hashCode() should have
been properly overridden.
A final comment concerns the handling of time zones by the database engine.
The classes in the java.sql.* package do not allow one to specify the
intended time zone explicitly. Database servers (or drivers) are free to
interpret this information as being valid in the server's local time zone, which
may be subject to change (for instance, due to daylight savings time).
From the foregoing discussion it should be clear that Java's date-handling
classes are not just complicated, but also poorly designed. Encapsulation is
leaky, the APIs are baroque and not well-organized, and uncommon idioms are
employed frequently for no good reason. The implementation holds additional
surprises. (I suggest a look at the actual type of the object returned from
Calendar.getInstance( Locale ) for all available locales!) On
the other hand, the classes manage to treat all of the difficulties inherent in
internationalized date handling and, in any case, are here to stay. I hope
that this article has made a small contribution toward clarifying their proper
usage.
Call Me By My True Names
As a last example of the wonderful consistency and orthogonality of Java's APIs, I would like to list three (maybe there are more!) different methods to obtain the number of milliseconds since the start of the Unix epoch:
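Three calls that fit this description are sketched below. Which three the author had in mind is an assumption on my part; the class EpochMillis and its method names are hypothetical wrappers around calls that do exist in the standard library.

```java
import java.util.Calendar;
import java.util.Date;

class EpochMillis {
    static long viaSystem() {
        return System.currentTimeMillis();
    }

    static long viaDate() {
        return new Date().getTime();
    }

    static long viaCalendar() {
        return Calendar.getInstance().getTimeInMillis();
    }

    public static void main(String[] args) {
        // Three differently named routes to the same number.
        System.out.println(viaSystem());
        System.out.println(viaDate());
        System.out.println(viaCalendar());
    }
}
```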
The author would like to thank Wilhelm Fitzpatrick (Seattle) for a careful reading of the manuscript and valuable comments.
Calendar implementations for
Buddhist, Hebrew, Muslim, and Japanese calendars used to be available at
IBM's alphaWorks. Unfortunately, they seem to be temporarily unavailable.
Philipp K. Janert is a software project consultant, server programmer, and architect.
Copyright © 2009 O'Reilly Media, Inc.