Immutability vs Encapsulation: Schrödinger's immutability
In OOP, both immutability and encapsulation are something to strive for. For a good introduction to immutability, I recommend Yegor Bugayenko's article Objects Should Be Immutable.
I've been trying to implement immutability in my projects and kept running into similar stumbling blocks. Looking online, it seems I'm not the only one. This post is essentially a documentation of the thought process I went through and my eventual conclusion.
This post is inspired by a discussion with Ashton Hogan, his article Objects Should Be Immutable, and the long discussion and many interesting comments over at Yegor's blog regarding Gradients of Immutability.
Immutability
The tl;dr description of immutability is that an object's properties should never change after the object has been instantiated.
Ashton states that for an object to be immutable it must meet these criteria:
- It must have no setter methods (methods that alter object fields).
- All fields must be final and private.
- The class must be declared as final.
- Instance fields should not reference mutable objects.
I recommend reading Yegor's and Ashton's articles as they are both very clear, easy to follow and give slightly different perspectives.
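To make the four criteria concrete, here's a minimal sketch of a class that meets all of them. The Isbn class and its fields are my own invention for illustration; they don't come from either article.

```java
// A hypothetical value class, used only to illustrate the four criteria.
final class Isbn {                     // the class is declared final
    private final String prefix;       // every field is private and final
    private final int number;          // String and int cannot be mutated

    Isbn(String prefix, int number) {
        this.prefix = prefix;
        this.number = number;
    }

    // No setters: a "change" returns a new object and leaves this one alone.
    Isbn withNumber(int newNumber) {
        return new Isbn(this.prefix, newNumber);
    }

    String prefix() {
        return this.prefix;
    }

    int number() {
        return this.number;
    }
}
```

Calling withNumber gives back a second Isbn, and the first one still holds its original number.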
Encapsulation
Encapsulation is broadly defined as an object being in control of its own state. A consequence of encapsulation is that a class can be completely rewritten (while still exposing the same methods and arguments) and continue to work with existing code. For example, a class could be rewritten to use a database instead of file storage, and any code using the class should be agnostic to the change.
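As a rough sketch of what "rewritten but still exposing the same methods" means, here are two interchangeable versions of a catalogue. All the names are hypothetical, and the interface only exists so that the "before" and "after" versions of what would really be one class can sit in the same listing.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical names throughout; not taken from any real library.
interface TitleCatalogue {
    boolean contains(String title);
}

// "Before": titles are kept in an array and searched linearly.
final class ArrayCatalogue implements TitleCatalogue {
    private final String[] titles;

    ArrayCatalogue(String... titles) {
        this.titles = titles.clone(); // copy so callers cannot alter our state
    }

    @Override
    public boolean contains(String title) {
        for (String t : this.titles) {
            if (t.equals(title)) {
                return true;
            }
        }
        return false;
    }
}

// "After": the same public method, backed by completely different storage.
final class SetCatalogue implements TitleCatalogue {
    private final Set<String> titles = new HashSet<>();

    SetCatalogue(String... titles) {
        for (String t : titles) {
            this.titles.add(t);
        }
    }

    @Override
    public boolean contains(String title) {
        return this.titles.contains(title);
    }
}
```

Code that only ever calls contains cannot tell which version it was handed, which is the property the rest of this post relies on.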
Sometimes, these two tools collide and we seemingly have to pick one. Here's a quick example of an immutable object.
I usually use PHP code on this blog, but I'm using Java here as it makes the demonstration easier and also raises concerns about thread safety. The concepts and the theory are the same in PHP.
import java.util.Arrays;
class Library {
    private final Book[] books;
    public Library(Book[] books) {
        this.books = books;
    }
    public Library deposit(Book book) {
        // terribly inefficient method of copying the array, demonstration only
        Book[] books = Arrays.copyOf(this.books, this.books.length + 1);
        books[books.length - 1] = book;
        return new Library(books);
    }
    public boolean has(Book book) {
        // and an inefficient search for demonstration purposes only
        for (int i = 0; i < this.books.length; i++) {
            if (this.books[i] == book) {
                return true;
            }
        }
        return false;
    }
}
It's immutable: when a book is added to the library, a new Library instance is created and the existing one is left untouched.
What if we replaced the books array with a database connection or read from a file? The comments on Yegor's Gradients of Immutability page all agree that this would still count as immutable, even though the data could be changed externally.
We could re-write the class and because of encapsulation, a user of the class would not know or care whether the books were stored in a database, file or in an array. The API would not change and the object would still be immutable.
But now what if we transparently cached the search results for the slow has method, so that if you search for the same book twice, the result is cached and the expensive loop over every book in the library is only done once?
import java.util.Arrays;
import java.util.HashMap;
class Library {
    private final Book[] books;
    private final HashMap<Object, Boolean> cache = new HashMap<Object, Boolean>();
    public Library(Book[] books) {
        this.books = books;
    }
    public Library deposit(Book book) {
        // terribly inefficient method of copying the array, demonstration only
        Book[] books = Arrays.copyOf(this.books, this.books.length + 1);
        books[books.length - 1] = book;
        return new Library(books);
    }
    public boolean has(Book book) {
        // First check if we've got a cached result for this book:
        if (this.cache.containsKey(book)) {
            // If this book has already been searched for and we know the result, return it.
            return this.cache.get(book);
        }
        // The result hasn't been cached in the past
        // Perform the expensive lookup
        for (int i = 0; i < this.books.length; i++) {
            if (this.books[i] == book) {
                // Store the search result
                // Next time this book is searched for, this loop doesn't need to happen
                this.cache.put(book, true);
                return true;
            }
        }
        this.cache.put(book, false);
        return false;
    }
}
Now, if this code is executed:
library.has(book);
library.has(book);
The expensive search through all the books will only happen once. But hasn't immutability been sacrificed here? The Library class now encapsulates a mutable HashMap object, which by Ashton's definition means that the library object is mutable, because the state of the library object changes after the has method is executed.
Proper encapsulation means that externally, the class looks no different. If I use the class, I don't care whether the result of the expensive lookup is cached or not, and there's nothing to indicate that caching has happened.
But the Library class is no longer immutable. It now contains a mutable HashMap.
If I wanted to keep immutability I'd need to add an extra method to the class. Because the has method both returns a value and changes the state, there is no way to handle this in an immutable manner.
To keep immutability I'd need something like this:
//Do the lookup and cache the result
library = library.cache(book);
library.has(book);
library.has(book);
The Library class would then look something like this (assuming an immutable version of the HashMap class):
class Library {
    private final Book[] books;
    private final ImmutableHashmap<Object, Boolean> cache;
    public Library(Book[] books, ImmutableHashmap<Object, Boolean> cache) {
        this.books = books;
        this.cache = cache;
    }
    // Performs the expensive lookup and returns a new Library that remembers the result
    public Library cache(Book book) {
        boolean found = false;
        for (int i = 0; i < this.books.length; i++) {
            if (this.books[i] == book) {
                found = true;
                break;
            }
        }
        ImmutableHashmap<Object, Boolean> cache = this.cache.put(book, found);
        return new Library(this.books, cache);
    }
    // Only works if cache(book) has already been called for this book
    public boolean has(Book book) {
        return this.cache.get(book);
    }
}
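The ImmutableHashmap used in that class is an assumption; nothing with that name ships with the JDK. For completeness, here's a deliberately naive sketch of what it could look like. It copies the whole map on every put, so treat it as a demonstration rather than something to use in real code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class: a copy-on-write map whose instances never change.
final class ImmutableHashmap<K, V> {
    private final Map<K, V> entries;

    ImmutableHashmap() {
        this.entries = Collections.emptyMap();
    }

    private ImmutableHashmap(Map<K, V> entries) {
        this.entries = entries;
    }

    // Returns a new map containing the extra entry; this instance is untouched.
    ImmutableHashmap<K, V> put(K key, V value) {
        Map<K, V> copy = new HashMap<>(this.entries); // O(n) copy, demonstration only
        copy.put(key, value);
        return new ImmutableHashmap<>(copy);
    }

    boolean containsKey(K key) {
        return this.entries.containsKey(key);
    }

    // Returns null when the key is absent, like java.util.Map#get.
    V get(K key) {
        return this.entries.get(key);
    }
}
```

Note that get returns null for a book that was never cached, which is exactly why the has method above only works after cache has been called for that book.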
This breaks encapsulation: the caching implementation is exposed externally, and I can't implement caching in this manner without rewriting any existing code using the Library class.
At this point there seems to be a trade-off: Encapsulation or Immutability. The class can be immutable but expose implementation details or hide implementation details but be mutable.
Let's go back to the mutable version of the Library class:
import java.util.Arrays;
import java.util.HashMap;
class Library {
    private final Book[] books;
    private final HashMap<Object, Boolean> cache = new HashMap<Object, Boolean>();
    public Library(Book[] books) {
        this.books = books;
    }
    public Library deposit(Book book) {
        // terribly inefficient method of copying the array, demonstration only
        Book[] books = Arrays.copyOf(this.books, this.books.length + 1);
        books[books.length - 1] = book;
        return new Library(books);
    }
    public boolean has(Book book) {
        // First check if we've got a cached result for this book:
        if (this.cache.containsKey(book)) {
            // If this book has already been searched for and we know the result, return it.
            return this.cache.get(book);
        }
        // The result hasn't been cached in the past
        // Perform the expensive lookup
        for (int i = 0; i < this.books.length; i++) {
            if (this.books[i] == book) {
                // Store the search result
                // Next time this book is searched for, this loop doesn't need to happen
                this.cache.put(book, true);
                return true;
            }
        }
        this.cache.put(book, false);
        return false;
    }
}
By Ashton's definition, an instance of this version of the Library class is mutable because it references a mutable HashMap object. However, with a pragmatic look at the side effects I'd argue it still qualifies as immutable.
- Does calling any of the methods produce side effects? None that a caller can see. The API is identical to the immutable version.
- Does it suffer from temporal coupling? No.
- Does the hashCode change after the has method has been called? No.
- Can the object be left in a broken state? I don't see how (though I'm not 100% sure in this case).
- Is it thread safe? Nearly. It is possible that two simultaneous calls to the has method will both write to the same key in the cache HashMap, but because the books array never changes, both threads will write the same value. Neither thread would benefit from caching, but both threads and subsequent calls to any method on the object still see the correct result. One caveat: java.util.HashMap itself is not safe for concurrent writes, so for this to hold in practice the cache would need to be a ConcurrentHashMap or similar.
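One note on the thread-safety point: java.util.HashMap is not designed for concurrent writes, so the argument only holds in practice if the cache is a concurrent map. Here's a minimal sketch of that variant, assuming a stand-in Book class (this post never defines one) and a hypothetical class name, ConcurrentLibrary.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Stand-in type for this sketch only.
final class Book {
    private final String title;

    Book(String title) {
        this.title = title;
    }

    String title() {
        return this.title;
    }
}

final class ConcurrentLibrary {
    private final Book[] books;
    // Internal, mutable, but safe to write from several threads at once.
    private final ConcurrentMap<Book, Boolean> cache = new ConcurrentHashMap<>();

    ConcurrentLibrary(Book[] books) {
        this.books = books.clone(); // copy so the caller cannot change our array
    }

    ConcurrentLibrary deposit(Book book) {
        Book[] copy = Arrays.copyOf(this.books, this.books.length + 1);
        copy[copy.length - 1] = book;
        return new ConcurrentLibrary(copy);
    }

    boolean has(Book book) {
        // computeIfAbsent runs the search at most once per key and stores the result.
        // ConcurrentHashMap rejects null keys, so has(null) would throw.
        return this.cache.computeIfAbsent(book, this::search);
    }

    // The expensive lookup, by identity, as in the array version above.
    private boolean search(Book book) {
        for (Book b : this.books) {
            if (b == book) {
                return true;
            }
        }
        return false;
    }
}
```

The public methods are unchanged, so swapping the map is exactly the kind of rewrite that encapsulation is supposed to allow.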
Schrödinger's cat
And that's what really matters here: to someone observing or using the object, and not looking at the implementation details in the class, the object looks immutable. This is Schrödinger's cat. There is no way to know whether the object is immutable until we look inside the class. To an observer looking at only the API, the mutable and immutable versions of the Library class are indistinguishable, and that's all that matters.
If mutable and immutable objects are indistinguishable to anyone using the object and no side effects are present, then is the object mutable or not?
The object looks immutable to anyone looking at the API but mutable to anyone looking at the code inside the class. Is the object both mutable and immutable at the same time depending on who you ask?
Probably not. I'd argue that it only matters to the person looking at the API. The reason we strive for immutable objects is that mutable ones have negative side effects. I'm not interested in meeting an arbitrary definition of immutability, I'm interested in removing the negative side-effects caused by mutability.
Immutability is skin deep. It only matters to an observer. If internal mutability is never exposed in any way outside the object, then the object is immutable.
Don't look at immutability from the perspective of "do any properties in the class ever change?" but instead from the perspective of "can an observer of an object see changes caused by mutability inside the class?"
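The observer's point of view can be sketched as a small experiment: run the same sequence of calls against a library with no cache and one with an internal cache, and compare everything a caller can see. The names here (Searchable, PlainLibrary, CachingLibrary, Observer) and the stand-in Book class are hypothetical, and the sketch is single-threaded, so a plain HashMap is fine.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in type for this sketch only.
final class Book {
    private final String title;

    Book(String title) {
        this.title = title;
    }
}

interface Searchable {
    boolean has(Book book);
}

// No internal state changes at all.
final class PlainLibrary implements Searchable {
    private final Book[] books;

    PlainLibrary(Book... books) {
        this.books = books.clone();
    }

    @Override
    public boolean has(Book book) {
        for (Book b : this.books) {
            if (b == book) {
                return true;
            }
        }
        return false;
    }
}

// Same answers, but remembers them in a private, mutable map.
final class CachingLibrary implements Searchable {
    private final Book[] books;
    private final Map<Book, Boolean> cache = new HashMap<>();

    CachingLibrary(Book... books) {
        this.books = books.clone();
    }

    @Override
    public boolean has(Book book) {
        Boolean cached = this.cache.get(book);
        if (cached != null) {
            return cached;
        }
        boolean found = false;
        for (Book b : this.books) {
            if (b == book) {
                found = true;
                break;
            }
        }
        this.cache.put(book, found);
        return found;
    }
}

final class Observer {
    // Records the only thing a caller can observe: the result of each call, in order.
    static List<Boolean> observe(Searchable library, Book... queries) {
        List<Boolean> seen = new ArrayList<>();
        for (Book query : queries) {
            seen.add(library.has(query));
        }
        return seen;
    }
}
```

If the two lists returned by observe are equal for every sequence of queries, the caller has no way of telling the two classes apart, which is the sense in which I'm calling the caching version immutable.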