Immutable variables are a topic that gives everyone the shudders when they first get into it. Let’s get the big question out of the way first: how can an application run if variables never change? This is a good question, so let’s look at the following rules about immutability:
Object variables, especially in Java, are references to the object itself. This means that changing the “reference” to which the variable points should be an atomic process. This is important because if we are going to update the variable, we will access it either pre- or post-update but never in an intermediate state. We’ll discuss this a little later, but right now, let’s look at mutability.
Remember from the preceding chapter that we’re going to be writing in Groovy from this point on.
When we think of variables, we normally think of mutable variables. After all, a variable is variable, which means that we should be able to store many different values in it and reuse it.
As we think of mutable variables, we realize that this is how we normally write code: with variables that inherently change over time. In Example 4-1, notice how f changes and is assigned two distinct values. This is how we normally deal with variables.
So what happens when we have a variable that is passed to a function and we try to mutate that? Let’s see in Example 4-2.
    def f = "Foo"
    def func(obj) {
        obj = "Bar"
    }
    println f
    func(f)
    println f
We can see from the output that we get two "Foo" printouts. This is correct because the reference that f contained, "Foo", was passed to func, and then we update the variable obj with a new reference to "Bar". But because there is no connection between obj and f, f remains unchanged and contains our original reference to "Foo".
This was probably not what the author intended, so he fixes it by using a mutable object containing the reference he wants to change. Let’s see this in action in Example 4-3.
    class Foo {
        String str
    }
    def f = new Foo(str: "Foo")
    def func(Foo obj) {
        obj.str = "Bar"
    }
    println f.str
    func(f)
    println f.str
We can see that, although f didn’t change, f.str did. This looks like it’s a fairly standard mutation of an object, but let’s think about this in another light. What if it were not clear that func was going to mutate f.str, and we now need to determine why f.str has changed over time? We’ll need to debug to find out that func is indeed changing our variable.
Using code comments or setting something in the name of the function to indicate that you are mutating the object is one way to help answer the question “Why did this change?” Immutability gives us the confidence that our variables will not be changing and that our objects will be the same no matter to which function we send them.
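As a rough Java sketch of that guarantee, consider an object whose only field is final. The names ImmutableFoo, withStr, and ImmutableFooDemo are hypothetical and do not appear in the chapter's examples. A function that receives such an object cannot alter it; the most it can do is hand back a new object:

```java
// Hypothetical immutable counterpart to the Foo class in Example 4-3.
final class ImmutableFoo {
    final String str; // final: assigned once in the constructor, never again

    ImmutableFoo(String str) {
        this.str = str;
    }

    // "Changing" str means building a new object; this one is left untouched.
    ImmutableFoo withStr(String newStr) {
        return new ImmutableFoo(newStr);
    }
}

class ImmutableFooDemo {
    // func cannot alter its argument; it can only return a new value.
    static ImmutableFoo func(ImmutableFoo obj) {
        return obj.withStr("Bar");
    }

    public static void main(String[] args) {
        ImmutableFoo f = new ImmutableFoo("Foo");
        ImmutableFoo g = func(f);
        System.out.println(f.str); // Foo
        System.out.println(g.str); // Bar
    }
}
```

With this shape, a caller who wants the changed value has to use the returned object, so the change is visible at the call site.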
Let’s head back over to XXY. Your boss has come back with another request, this time a little more sane. He needs to send emails to the customers if the following conditions are met:
Customer is enabled.
Contract is enabled.
Contract has not expired.
Contact is still enabled.
The boss has indicated that this really shouldn’t be a big deal because someone else already added a list of Contacts to the Customer class. The definition of a Contact is in the Contact.java file, shown in Example 4-4.
    public class Contact {
        public Integer contact_id = 0;
        public String firstName = "";
        public String lastName = "";
        public String email = "";
        public Boolean enabled = true;

        public Contact(Integer contact_id, String firstName, String lastName,
                       String email, Boolean enabled) {
            this.contact_id = contact_id;
            this.firstName = firstName;
            this.lastName = lastName;
            this.email = email;
            this.enabled = enabled;
        }
    }
The message template is as follows, where <firstName> and <lastName> are placeholders to be replaced by the user’s name:
Hello <firstName> <lastName>,
We would like to let you know that a new product is available for you to try. Please feel free to give us a call at 1-800-555-1983 if you would like to see this product in action.
Sincerely, Your Friends at XXY
We’re going to add the functionality into the Customer class. Let’s think about this functionally. First, we will findAll Customer.allCustomers records where both the customer is enabled and the customer’s contract is enabled. For each of those customers, we will then findAll contacts that are enabled. And finally, for each of those contacts, we will sendEmail. Let’s go ahead and write the code in Groovy, as seen in Example 4-5.
    public static void sendEnabledCustomersEmails(String msg) {
        Customer.allCustomers.findAll { customer ->
            customer.enabled && customer.contract.enabled
        }.each { customer ->
            customer.contacts.findAll { contact ->
                contact.enabled
            }.each { contact ->
                contact.sendEmail(msg)
            }
        }
    }
I don’t want to get too far into a battle about how best to handle sending emails, so let’s assume that we’ve already written Contact.sendEmail, which takes a string, performs a replace for member variables, and then sends out the email. Let’s get even more functional: we might need to do something else later for each enabled Contact. So, let’s use a closure, as shown in Example 4-6.
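    public static void eachEnabledContact(Closure cls) {
        Customer.allCustomers.findAll { customer ->
            customer.enabled && customer.contract.enabled
        }.each { customer ->
            // Keep the enabled-contact filter from Example 4-5 so the closure
            // only ever sees enabled contacts.
            customer.contacts.findAll { contact ->
                contact.enabled
            }.each(cls)
        }
    }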
Now, we can call Customer.eachEnabledContact({ contact -> contact.sendEmail(msg) }) and get our functionality. At this point, we have a nice set of functionality that we can call anytime we need to do something for all enabled contacts. For example, we might just want to create a list of all the enabled contacts.
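To make the "list of all the enabled contacts" idea concrete, here is a minimal Java sketch of the same pattern, with a Consumer standing in for the Groovy closure. DemoContact, DemoCustomer, and EnabledContacts are simplified, hypothetical stand-ins for the chapter's classes, and the contract is reduced to a single boolean flag:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Simplified stand-ins for the chapter's classes (assumed shapes only).
class DemoContact {
    final String firstName;
    final boolean enabled;

    DemoContact(String firstName, boolean enabled) {
        this.firstName = firstName;
        this.enabled = enabled;
    }
}

class DemoCustomer {
    final boolean enabled;
    final boolean contractEnabled;
    final List<DemoContact> contacts;

    DemoCustomer(boolean enabled, boolean contractEnabled, List<DemoContact> contacts) {
        this.enabled = enabled;
        this.contractEnabled = contractEnabled;
        this.contacts = contacts;
    }
}

class EnabledContacts {
    // Java analogue of eachEnabledContact: the caller supplies the action.
    static void eachEnabledContact(List<DemoCustomer> customers,
                                   Consumer<DemoContact> action) {
        customers.stream()
                .filter(c -> c.enabled && c.contractEnabled)
                .flatMap(c -> c.contacts.stream())
                .filter(contact -> contact.enabled)
                .forEach(action);
    }

    // Reuses the traversal to build a list instead of sending email.
    static List<DemoContact> enabledContacts(List<DemoCustomer> customers) {
        List<DemoContact> result = new ArrayList<DemoContact>();
        eachEnabledContact(customers, result::add);
        return result;
    }
}
```

The traversal is written once; sending email and collecting a list differ only in the action that is passed in.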
Your boss has asked you to add functionality to change a Contact’s name and email, because people get married or have other life events requiring name changes. Now let’s assume that our application is actually threaded (maybe it’s a web server). If you don’t see an issue, you’re about to.
You just sat down to work, happy that you got the “change name and email” functionality done and rolled out. You get an email from your boss asking you to take a look at a new blocker bug: “Send email sometimes sends to an old email address.” The support team includes the broken email in the bug as well.
from: XXY Product Trials <[email protected]>
to: Jane Doe <[email protected]>
subject: New Product Trial
Hello Jane Smith,
We would like to let you know that a new product is available for you to try. Please feel free to give us a call at 1-800-555-1983 if you would like to see this product in action.
Sincerely, Your Friends at XXY
In the bug, the support team says Jane just got married and her name changed from Jane Doe to Jane Smith. The thing they can’t figure out is why the email went to Jane Doe <[email protected]> but her name is referenced as Jane Smith in the body.
OK, before I break down the entire runtime, I’ll try to explain this. User A updates the user’s last name and email and clicks Save at the same time that another user clicks Send email. Because we have no synchronization, it’s possible for the name to be updated but not the email when the email is actually created. Let’s look at the simplified sequence of events in Table 4-1.
Step | User A | User B
1 | Saves user name change | Clicks “Send email”
2 | System updates last name | Unscheduled
3 | Unscheduled | Sends email with inconsistent data
4 | System updates email | Unscheduled
Concurrency means there is no guarantee that a shared variable will actually be in a specific state at any given time. How do you even reproduce concurrency bugs? How do you validate that you have actually fixed a concurrency bug?
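One way to study the failure without relying on thread timing is to replay the steps of Table 4-1 by hand on a single thread. The sketch below does that. MutableContact and TornReadDemo are hypothetical names, and the two addresses are made-up placeholders rather than the ones from the bug report:

```java
// Hypothetical mutable contact, similar in spirit to Example 4-4.
class MutableContact {
    String lastName;
    String email;

    MutableContact(String lastName, String email) {
        this.lastName = lastName;
        this.email = email;
    }
}

class TornReadDemo {
    // Replays the interleaving from Table 4-1 in a fixed order, so the result
    // is deterministic. Real threads would produce it only occasionally.
    static String replay() {
        MutableContact jane = new MutableContact("Doe", "old@example.invalid");

        // Step 2: User A's first write lands.
        jane.lastName = "Smith";

        // Step 3: User B reads both fields between the two writes.
        String sent = "to: " + jane.email + ", Hello Jane " + jane.lastName;

        // Step 4: User A's second write lands, too late for the email.
        jane.email = "new@example.invalid";

        return sent;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // to: old@example.invalid, Hello Jane Smith
    }
}
```

The output mixes the old address with the new name, which is the inconsistency the support team reported.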
We haven’t even looked at a more likely scenario: what happens if we have functionality to remove a Contact or a Customer? Now one thread might be iterating over our list while another removes an item from it. Let’s look at all of these issues in one fell swoop. There are two primary ways to fix our concurrency issue:
Synchronize every access to the Customer.allCustomers object.
Ensure that the Customer.allCustomers list and its members cannot be changed.
Our first option means that we must have a synchronized block for every possible access of the Customer.allCustomers object. Invariably someone will forget to do a synchronized access and break the entire paradigm.
Our second option is much better; anyone can write any accessor to the Customer.allCustomers variable without worrying about the list mutating. Of course, this means that we have to be able to generate new lists with updated members. This is the idea behind immutability.
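A minimal Java sketch of the second option follows, using a list of strings as a stand-in for the customer list. SnapshotDemo, allNames, and add are hypothetical names. The sketch also assumes that writers are serialized with synchronized and that the field is volatile so readers see the newest list; readers themselves need no lock:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SnapshotDemo {
    // The variable can be re-pointed, but each list it points at never changes.
    // volatile makes a newly assigned list visible to other threads.
    static volatile List<String> allNames =
            Collections.unmodifiableList(new ArrayList<String>());

    // Builds a new list (old entries plus one more) and swaps it in.
    // synchronized keeps two writers from overwriting each other's update.
    static synchronized void add(String name) {
        List<String> copy = new ArrayList<String>(allNames);
        copy.add(name);
        allNames = Collections.unmodifiableList(copy);
    }

    public static void main(String[] args) {
        List<String> snapshot = allNames; // a reader grabs the current list
        add("Jane");                      // a writer publishes a new list
        System.out.println(snapshot.size()); // 0: the reader's view is unchanged
        System.out.println(allNames.size()); // 1: new readers see the update
    }
}
```

A reader that started before the update keeps iterating over the list it already holds, so it never observes a half-applied change.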
As we get deeper into immutability, think about database transactions. Database transactions are atomic, which means that the system is either in a pre-transaction or post-transaction state, never in a mid-transaction state.
This means that when a database transaction is committed, the new records are made available to new queries. Older queries are still using older data, which is fine because the functionality they were doing was predicated on the previous data.
I’m going to show that, if we have two good states, it’s better to be in one or the other, but we cannot ever be in both. Let’s begin by defining our function f(x,y). We also define that our two states (without the tick mark and with the tick mark) are not equal:
Let’s create a set of our known two good states:
So, this means that mixing the sets of parameters still works and still gives us a value; however, these are not values that exist in our set of good states.
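Written out, the argument might look like the following sketch. The set name S is an assumption introduced here, and the last line additionally assumes that f yields different values for different argument pairs:

```latex
\[ x \neq x', \qquad y \neq y' \]

\[ S = \{\, f(x, y),\; f(x', y') \,\} \]

\[ f(x, y') \notin S, \qquad f(x', y) \notin S \]
```

In the email bug, x and y correspond to the contact's name and email address: the message was built from the mixed pair, which belongs to neither good state.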
So, we’re going to think about variables as placeholders within a specific scope. If we think back to our email issue, then, we know that we can operate only in a known good state on both the list and the Customer and Contact records themselves.
Let’s begin working on our fix by doing the simplest thing and making our Customer.allCustomers an immutable list. Remember, we’re not making the variable immutable, we’re making the thing the variable contains immutable. Let’s see this in Example 4-7.
    static public List<Customer> allCustomers = new ArrayList<Customer>();
That was simple enough, but now we have to deal with our eachEnabledContact, right? Actually, we don’t have to do anything, because it was read-only functionality.
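Note that a plain ArrayList, as declared in Example 4-7, still accepts add and remove calls; nothing in the type stops a caller from changing it. One way to get a list that rejects changes, sketched here in Java with the standard Collections.unmodifiableList wrapper, is to wrap a private copy. FrozenListDemo, freeze, and rejectsAdd are hypothetical names used only for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FrozenListDemo {
    // Copies the input, then wraps the copy so that no caller can modify it.
    // The copy matters: without it, changes to source would show through.
    static <T> List<T> freeze(List<T> source) {
        return Collections.unmodifiableList(new ArrayList<T>(source));
    }

    // Returns true if the list refuses an add.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> frozen = freeze(Arrays.asList("a", "b"));
        System.out.println(rejectsAdd(frozen)); // true
    }
}
```

The wrapper only protects the list itself. The elements must also be immutable for the whole structure to be safe to share.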
Let’s continue our momentum and make all fields of the Customer object immutable. Again, this is fairly straightforward, as we make all fields final with one caveat: we must have a constructor that sets every field, as shown in Example 4-8.
    // The final fields have no initializers: each one is assigned exactly
    // once, in the constructor.
    public final Integer customer_id;
    public final String name;
    public final String state;
    public final String domain;
    public final Boolean enabled;
    public final Contract contract;
    public final List<Contact> contacts;

    public Customer(Integer customer_id, String name, String state, String domain,
                    Boolean enabled, Contract contract, List<Contact> contacts) {
        this.customer_id = customer_id;
        this.name = name;
        this.state = state;
        this.domain = domain;
        this.enabled = enabled;
        this.contract = contract;
        this.contacts = contacts;
    }
Because we’re changing our fields to immutable, we must remove all setters. If you think about it, having setters for immutable fields is a fallacy in and of itself, because the fields can be set only when the object is created.
Next, let’s update our Contract class and make it immutable as well (Example 4-9). It is important to understand that as we do this, we will be unable to run and test the functionality until we’ve completed this refactor. Remember, our original code for updating a contract sets the field, which does not work with immutable variables.
    import java.util.List;
    import java.util.Calendar;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.LinkedBlockingQueue;

    public class Contract {
        public final Calendar begin_date;
        public final Calendar end_date;
        // No initializer: a final field is assigned once, in the constructor.
        public final Boolean enabled;

        public Contract(Calendar begin_date, Boolean enabled) {
            this.begin_date = begin_date;
            // The end date is two years after the begin date.
            this.end_date = Calendar.getInstance();
            this.end_date.setTimeInMillis(this.begin_date.getTimeInMillis());
            this.end_date.add(Calendar.YEAR, 2);
            this.enabled = enabled;
        }
    }
Even though we know we need to update setContractForCustomerList, we’re going to switch from a concurrent design for now. Instead, we’ll create a new constructor, as shown in Example 4-10, so that we can create a new object with all members set.
    public Contract(Calendar begin_date, Calendar end_date, Boolean enabled) {
        this.begin_date = begin_date;
        this.end_date = end_date;
        this.enabled = enabled;
    }
Now, let’s go ahead and update our setContractForCustomerList method so that we can get things working again. We’ll want to map over our allCustomers list, updating customers that have specific ids. All of this is shown in Example 4-11.
    public static List<Customer> setContractForCustomerList(List<Integer> ids, Boolean status) {
        Customer.allCustomers.collect { customer ->
            if (ids.indexOf(customer.customer_id) >= 0) {
                new Customer(
                    customer.customer_id,
                    customer.name,
                    customer.state,
                    customer.domain,
                    customer.enabled,
                    new Contract(
                        customer.contract.begin_date,
                        customer.contract.end_date,
                        status
                    ),
                    customer.contacts
                )
            } else {
                customer
            }
        }
    }
Some might think that this looks terrible, but it is a fantastic piece of code. We iterate over the list of objects, then check to see if the current customer_id is in our list of ids. If it is, we create a new customer, copying all the fields over except Contract. Instead, we create a new Contract with the specific status that was passed to us. This new customer is then used in place of the original customer record. If it is not in our list, we return the original customer.
Let’s try to refactor this so that if we want to, we can change the Contract in any manner. We’ll add a method to Customer.java called updateContractForCustomerList, which will do the same thing as Example 4-11, except now we execute a higher-order function on the contract itself. We will then expect that a contract will be returned. Let’s look at the code in Example 4-12.
    public static List<Customer> updateContractForCustomerList(List<Integer> ids, Closure cls) {
        Customer.allCustomers.collect { customer ->
            if (ids.indexOf(customer.customer_id) >= 0) {
                new Customer(
                    customer.customer_id,
                    customer.name,
                    customer.state,
                    customer.domain,
                    customer.enabled,
                    cls(customer.contract),
                    customer.contacts
                )
            } else {
                customer
            }
        }
    }
Now, we update our original setContractForCustomerList function in Contract.java to call into Customer.updateContractForCustomerList, as shown in Example 4-13. We are returning a List of Customers, so we are able to execute Customer.allCustomers = Contract.setContractForCustomerList(…), which provides us with a constant, pristine list.
    public static List<Customer> setContractForCustomerList(List<Integer> ids, Boolean status) {
        Customer.updateContractForCustomerList(ids, { contract ->
            new Contract(contract.begin_date, contract.end_date, status)
        })
    }
Remember how I mentioned an update contact method earlier? This was the entire reason for our bug; let’s go ahead and update that method so that we can fix the broken code, which is still trying to update objects.
In Example 4-14, we’ll see our new updateContact method, which will map or collect all the Customer records.
    public static List<Customer> updateContact(Integer customer_id, Integer contact_id, Closure cls) {
        Customer.allCustomers.collect { customer ->
            if (customer.customer_id == customer_id) {
                new Customer(
                    customer.customer_id,
                    customer.name,
                    customer.state,
                    customer.domain,
                    customer.enabled,
                    customer.contract,
                    customer.contacts.collect { contact ->
                        if (contact.contact_id == contact_id) {
                            cls(contact)
                        } else {
                            contact
                        }
                    }
                )
            } else {
                customer
            }
        }
    }
But wait: we’re starting to repeat ourselves, so let’s remember DRY and see what we can abstract. Take a few minutes to work on it yourself, and then check Example 4-15 to see what I did.
    public static List<Customer> updateCustomerByIdList(List<Integer> ids, Closure cls) {
        Customer.allCustomers.collect { customer ->
            if (ids.indexOf(customer.customer_id) >= 0) {
                cls(customer)
            } else {
                customer
            }
        }
    }

    public static List<Customer> updateContact(Integer customer_id, Integer contact_id, Closure cls) {
        updateCustomerByIdList([customer_id], { customer ->
            new Customer(
                customer.customer_id,
                customer.name,
                customer.state,
                customer.domain,
                customer.enabled,
                customer.contract,
                customer.contacts.collect { contact ->
                    if (contact.contact_id == contact_id) {
                        cls(contact)
                    } else {
                        contact
                    }
                }
            )
        })
    }

    public static List<Customer> updateContractForCustomerList(List<Integer> ids, Closure cls) {
        updateCustomerByIdList(ids, { customer ->
            new Customer(
                customer.customer_id,
                customer.name,
                customer.state,
                customer.domain,
                customer.enabled,
                cls(customer.contract),
                customer.contacts
            )
        })
    }
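To see how this style removes the original email bug, here is a small Java sketch of the same idea applied to a single contact list. FrozenContact and ContactUpdates are hypothetical, trimmed-down stand-ins (the chapter's Contact has more fields), and a UnaryOperator plays the role of the Groovy closure:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical, trimmed-down immutable contact.
final class FrozenContact {
    final int contactId;
    final String lastName;
    final String email;

    FrozenContact(int contactId, String lastName, String email) {
        this.contactId = contactId;
        this.lastName = lastName;
        this.email = email;
    }
}

class ContactUpdates {
    // Returns a new list in which the contact with the given id is replaced
    // by update.apply(contact); all other contacts are reused unchanged.
    static List<FrozenContact> updateContact(List<FrozenContact> contacts,
                                             int contactId,
                                             UnaryOperator<FrozenContact> update) {
        List<FrozenContact> result = new ArrayList<FrozenContact>();
        for (FrozenContact c : contacts) {
            result.add(c.contactId == contactId ? update.apply(c) : c);
        }
        return Collections.unmodifiableList(result);
    }
}
```

Because the name and the address are both set in one constructor call, a reader holds either the old contact or the new one, never a mixture. A caller might write something like `updateContact(contacts, 1, c -> new FrozenContact(c.contactId, "Smith", newAddress))`, where newAddress is whatever value the user submitted.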
Most people believe that moving to immutable variables will increase the complexity of their code; however, it actually helps in many different ways. Tracking down bugs—because we know certain variables cannot change—becomes easier; we can better understand what might have been passed into and out of functions.
Immutability is a difficult technique to implement because you will most likely need to do large refactorings in order to accomplish it. Just look back at our conversion of the Customer object; we actually had to make changes to other classes and methods to support this. The key to implementing immutability is to start on your new classes and work backward during downtime to refactor your old code. Start with smaller classes that don’t change much and then move on to your harder classes.