Monday, January 28, 2008

Object state, identity, immutability, and values

A recent discussion on the Scala mailing list got me thinking about the concepts of Object state, identity, and immutability.

Let's start by trying to define Object State. There seem to be two different ways of looking at state:
  • From an Object Oriented (OO) perspective: the state of an object is defined by the data it contains within its fields. This state helps to distinguish it from other instances of its class (with possibly different data in their fields).
  • From a Functional Programming (FP) perspective: an object has state if its behavior (as defined by its methods) may observably change over the course of time. In other words, the output of (the methods of) a stateful object is determined not only by the input received, but also by some modifiable internal state.
Both viewpoints are equally valid, but for the rest of this post, I am going to stick with the OO definition of state (just to keep things consistent and well defined).
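
To make the two definitions concrete, here is a small Java sketch. The class names (Point, Counter) are hypothetical and chosen only for illustration; they do not come from the mailing list discussion. Point has state in the OO sense (data in fields) but not in the FP sense, while Counter has state in both senses.

```java
// Hypothetical classes, for illustration only.

// State in the OO sense: data held in fields. Not stateful in the FP sense,
// because no method result can ever change for a given instance.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int sum() {
        return x + y;
    }
}

// Stateful in both senses: next() takes no input, yet its result
// depends on how many times it has been called before.
final class Counter {
    private int count = 0;

    int next() {
        count = count + 1;
        return count;
    }
}
```

Calling sum() on the same Point always gives the same answer, whereas two successive calls to next() on a fresh Counter return 1 and then 2.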

Let's move on to look at the concepts of identity and mutability. First of all, what is object identity? And why do we need it in software systems?

A good way to approach this question is to distinguish between two kinds of types: Value types and Reference types.
Note - Think of a type as a class/set of objects
Note - Value types are different from values, which are things that can be stored in variables, passed as arguments to methods, returned by methods, etc.

As a first approximation, Value types represent things that don't change (i.e. are immutable). Examples of Value types are things like dates, numbers, etc.

As opposed to this, Reference types represent things that can change (i.e. are mutable). And because they can change, we need to be able to share them. To see why this follows, think, for example, about two different cars that have the same (mutable) owner. If we're modeling this in an OO software system, both cars need to refer to the same owner object. If they don't, and each car holds its own copy of the owner, we run into a big problem: whenever some attribute of the owner changes (the owner is mutable, after all), we need to hunt down every copy of the owner and update it.
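
The car/owner scenario can be sketched in Java roughly as follows. The class names, the address attribute, and the accessor names are assumptions made for the sake of the example.

```java
// Hypothetical model of the car/owner example.
final class Owner {
    private String address; // a mutable attribute of the owner

    Owner(String address) {
        this.address = address;
    }

    String getAddress() {
        return address;
    }

    void setAddress(String address) {
        this.address = address;
    }
}

final class Car {
    private final Owner owner; // a reference to the owner, not a copy

    Car(Owner owner) {
        this.owner = owner;
    }

    Owner getOwner() {
        return owner;
    }
}
```

If two Car objects are constructed with the same Owner object, a single call to setAddress() on that owner is visible through both cars, and nothing else needs to be updated.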

So - mutability gives rise to the concept of a sharable reference to an object (of a Reference type). And the concept of a reference leads to the notion of identity. For something to be reference-able, that thing needs to have an identity. This identity is independent of the state of the object (as represented by the data in its fields).
Note - if you think of state from an FP perspective, identity is deeply tied to state, because state implies mutability, and mutability gives rise to the need for identity (to enable referencing/sharing).
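
A minimal Java sketch of identity being independent of state, assuming a hypothetical mutable Account class. In Java, == on object references compares identity, so it can be contrasted directly with a comparison of field data.

```java
// Hypothetical mutable class: a Reference type in the terms used above.
final class Account {
    private int balance;

    Account(int balance) {
        this.balance = balance;
    }

    int getBalance() {
        return balance;
    }

    void deposit(int amount) {
        balance += amount;
    }
}

final class IdentityDemo {
    // True when both variables refer to the very same object.
    static boolean sameIdentity(Account a, Account b) {
        return a == b;
    }

    // True when both objects currently hold the same data.
    static boolean sameState(Account a, Account b) {
        return a.getBalance() == b.getBalance();
    }
}
```

Two separately constructed accounts with a balance of 100 have the same state but different identities. After a deposit into one of them, their states differ, yet any other reference to the modified account still refers to that same object and sees the new balance.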


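So - we see that the concept of identity arises when we start dealing with Reference types. Conversely, identity has no meaning for Value types: two values with the same content are interchangeable, so there is nothing to gain by telling them apart.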

Within languages that provide first-class support for Value types, assignment of a value is implemented via field-copying, and equality is defined based on equivalence (i.e. field comparison).
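
Java's primitive types give a limited taste of this behavior. Treating them as an illustration of first-class Value types is an assumption on my part: they are single values rather than user-defined types with fields, but assignment copies the value and == compares values. The class and method names below are hypothetical.

```java
// Hypothetical helper class, for illustration only.
final class PrimitiveValues {
    // Returns {original, copy} after changing only the copy.
    static int[] copyThenChange(int original) {
        int copy = original; // assignment copies the value itself
        copy = copy + 1;     // changes the copy only
        return new int[] { original, copy };
    }

    // == on primitives compares the values; there is no identity to compare.
    static boolean sameValue(int a, int b) {
        return a == b;
    }
}
```

Here copyThenChange(5) yields 5 and 6: modifying the copy leaves the original untouched, which is what copying on assignment guarantees.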

Within languages that do not provide first-class support for Value types, immutability can be used to provide value semantics. Once an object's fields cannot change, reference copying gives you the same observable results as field/value copying, because no holder of the reference can alter what the other holders see. It might not be possible to redefine equality (==) so that it is based on equivalence (i.e. field comparison), but you can use equivalence by convention to compare value objects.
Note - in Java, equivalence comparison is provided by the equals() method, while == on object references compares identity.
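
As a sketch of value semantics through immutability, here is a hypothetical Money class in Java. The class, its fields, and the plus() method are assumptions for illustration. All fields are final, equals() compares fields, and hashCode() is kept consistent with equals().

```java
import java.util.Objects;

// Hypothetical value class: all fields final, equality by field comparison.
final class Money {
    private final long amount;
    private final String currency;

    Money(long amount, String currency) {
        this.amount = amount;
        this.currency = Objects.requireNonNull(currency);
    }

    long getAmount() {
        return amount;
    }

    // "Modification" returns a new value; the receiver never changes.
    Money plus(long extra) {
        return new Money(amount + extra, currency);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Money)) return false;
        Money that = (Money) other;
        return amount == that.amount && currency.equals(that.currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(amount, currency);
    }
}
```

Two separately constructed Money objects with the same amount and currency are distinct according to ==, but equal according to equals(). And because plus() returns a new object, sharing a reference to a Money is as safe as handing out a copy.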

Here are some relevant links (pointed out by folks on the Scala mailing list):

Raoul Duke provided a link to this excellent article about Value types:
http://www.riehle.org/computer-science/research/1998/ubilab-tr-1998-10-1.pdf

Matt Hellige provided some excellent links:
http://citeseer.ist.psu.edu/jacobs95objects.html
http://citeseer.ist.psu.edu/bawden93implementing.html
http://lambda-the-ultimate.org/node/2425