Let’s write our first specification! In this chapter we will take a simple problem and translate it into a specification. Then, we’ll model check the specification and see if it has a flaw. Spoiler alert, it will have a flaw. This will be a whirlwind tour: we will be using concepts we will gradually learn over the course of the book.
The Problem
Each wire must be between two different people in the bank and wire at least one dollar.
If a wire is successful, the value of the wire is deducted from the sender account and added to the receiver account.
If the wire fails, the two accounts are unchanged.
A wire may not cause an account to have a negative balance.
For scaling reasons, multiple wires may happen simultaneously.
Your implementation must guarantee all properties, even if wires can take an arbitrary amount of time. In other words, even if wire A starts before B, A may still finish after wire B. Your algorithm must guarantee that, even in those cases, we satisfy all of the requirements. Given that money is on the line, you should probably design this in advance.
Boilerplate
The left panel is the list of TLA+ projects you have, while the right panel is where you will write your spec. Everything above the dashes and below the equals are ignored. We will not write them in the code snippets, and you can assume that everything is happening inside those.
Warning
The name of the module must match the name of the file, or TLA+ will consider the spec invalid.
Single line comments are *, comment blocks are (**). The algorithm is inside a comment: we’ll cover why later. That’s all the boilerplate we need. Now we can get around to specifying what we want.
Specifying
Note
Whitespace is not significant here. There is exactly one place in TLA+ where it is significant, which we will cover in Chapter 3.
Let’s start with people. people is a set. It’s an unordered collection of things, same as most programming languages. In this case, it has two strings as elements: “alice” and “bob”. Nothing too surprising.
Okay, what about acc? acc is a function. These aren’t like normal programming functions: they’re closer to dictionaries or mappings. For each value in a given set, it maps to some output value. Here the set is people and the element is p. This means acc["alice"] = acc["bob"] = 5. This would be equivalent, in a language like Python, to writing {"alice": 5, "bob": 5}. We could also make the function depend on each element. For example, we could write double = [x in 1..10 |-> 2*x].
Note that we moved the semicolon, as acc is no longer the last variable we declare.
The == here does not mean comparison . Rather, it’s the definition of an operator, which is closer to what we normally think of programming functions. We’ll discuss them more in Chapter 4. For now, we’re using it as an invariant that we’ll want to check. NoOverdrafts, in English, is “for all p in the set of people, their account must be greater than or equal to 0”. This accurately represents our property, whether we have two people in the set or two hundred.
Implementing
If you’re assigning a value to a variable for the very first time, you use =. However, if the variable already exists and you assign a new value to it, you use :=. Withdraw and Deposit are labels. They signify that everything inside them happens in the same moment of time. If we put the two assignments in the same label, they’d happen simultaneously. As it is, we let some time pass between the withdrawal and the deposit. We only allowed for one wire to happen, so this really doesn’t change much. But if we wrote the specification to allow multiple wires (which we’ll do later), this becomes more important.
Writing implementations in TLA+ can be a barrier starting out, so we wrote part of our spec in PlusCal, a special language that compiles to pure TLA+. PlusCal adds additional syntax, such as the := assignment syntax and the labels, giving us a pseudocode-like structure on top of TLA+. This is why we have to put it in a comment, as it needs to be translated first. To compile the PlusCal, we can go to File > Translate PlusCal Algorithm.
Tip
The shortcut is Ctrl-T, or Cmd-T on macOS. If you right-click in the editor, there’s an option in the context menu for “Translate PlusCal Automatically,” which does just that on every save.
You don’t need to do anything else with it. You’re now ready to check your specification.
Verifying
To check that this spec works, we need to create a model to check it. The model is like the “simulation” we want to run. Different models may set up different initial conditions and check for different things. Since we aren’t doing anything too fancy here, all we need to do with our model is check that our one invariant holds.
Find the drop-down labeled Invariants and click the Add button. Type in the name of the invariant we want to check (NoOverdrafts), save it, and run the model (the green button just under “Model Overview”).
Note
The shortcut for running the model is F11. You can bind your own shortcut in Preferences.
Warning
If you don’t see any states found, go back to Model Overview and make sure Temporal Formula is selected as the behavior with the temporal formula Spec. This is what tells TLC to specifically check the behavior of your spec.
- 1.
Choose a possible initial state (there is only one possible initial state).
- 2.
Check that the starting state follows the NoOverdrafts invariant. It does, so begin evaluation. Execute the Withdraw step.
- 3.
Check the NoOverdrafts invariant. It’s true, so continue.
- 4.
Execute the Deposit step.
- 5.
Check the NoOverdrafts invariant . It’s true, so finish.
If our spec had any errors, it would have popped up an error bar. Since that didn’t happen, our spec is probably correct.
Or maybe it’s too simple. Let’s break it.
Initial Conditions
We’ve expanded the set of possible initial states for our spec. Our model checker will check every single one of them to make sure they all pass. Recompile the PlusCal and rerun the model.
Note
If you see deleteOutOfSyncFiles, don’t worry about it. It’s a cleanup thing.
- 1.
Choose amount = 6 as part of the initial state.
- 2.
Check that the starting state follows the NoOverdrafts invariant. It does, so begin evaluation.
- 3.
Execute the Withdraw step.
- 4.
Check the NoOverdrafts invariant. It’s false, so raise the error.
How can we fix it? We could restrict the spec to only consider amount in 1..acc[sender], which would make it pass. However, this might not reflect the bank’s expected use. People in the real world might attempt to transfer more than they have, and we should be ready for that case. To continue with the example, though, we’ll assume that this is an acceptable assumption. Make the change and confirm the model passes again (20 states).
Multiple Processes
<Initial Predicate> is the starting state we picked. All of the variables have an assignment here. pc is the current label each process is on, and amount is the corresponding wire amount for each process. Since we let the wire amount vary, for this state TLC picked <<1, 5>>.
- 1.
Choose amount[1] = 1 and amount[2] = 5.
- 2.
Execute CheckFunds[1]. Since 1 <= acc["alice"], proceed to Withdraw[1].
- 3.
Execute CheckFunds[2]. Since 5 <= acc["alice"], proceed to Withdraw[2].
- 4.
Execute Withdraw[1]. Now acc["alice"] = 4.
- 5.
Execute Withdraw[2]. Now acc["alice"] = -1.
That seems to work (332 states). We have fewer states because we took out the label, which removed some of the concurrency. Let’s move on to the last requirement and make sure it still works if the wire fails.
Temporal Properties
EventuallyConsistent looks like NoOverdrafts, but the equation starts with <>[]. <>[] is the “eventually-always” operator , and means that no matter what the algorithm does, in the end the given equation must eventually be true. It may be false for a while, and it may switch between true and false several times, but it will end true.
The error we see (Figure 1-7) is unusual. The first four steps in the trace look normal: we withdraw in the first wire, we deposit in the first wire, and we withdraw in the second wire. Then we see that the fifth state is labeled <Stuttering>. Stuttering is when a process simply stops. There’s nothing preventing the wire from making the deposit, it just doesn’t try. If this seems far-fetched, consider the case where the server crashes in between steps. Or the power goes out. Or the deposit eventually happens, but it takes so long Alice complains to the SEC. Those are all real and quite worrisome concerns, and they are all represented by the stuttering state.
We could make the check, withdraw, and deposit all happen in the same step. Then there’s no way for the server to crash between the withdraw and the deposit, because there is no timespan between the withdraw and the deposit. In practice, though, this means our process only has one label, which means it effectively takes zero time to run. This violates the core requirement that wires take an arbitrary amount of time.
We could specifically tell TLA+ that our process cannot stutter between the withdrawal and the deposit. Our spec would pass, but there’s no way we could implement it. Server faults are as certain as death and taxes.
We could convince the Project Manager to relax the EventuallyConsistent requirement.
We could try a different implementation entirely.
We could relax the NoOverdrafts requirement. In the end, this is how most banks do it: instead of guaranteeing that overdrafts never happen, they work to make overdrafts less likely and have procedures in place when they do happen.
Specifying safe transfers is surprisingly hard! What’s important, though, is that we can test our specification without having to write code first. This lets us explore solutions and confirm they are safe before building our system.
Summary
We wrote a specification of a bank transfer and saw how the model checker, TLC, was able to find race conditions and logic bugs in our design. In the process, we sampled all of the various features of both TLA+ and PlusCal. Over the next five chapters, we will cover all of these features in more detail. Invariants, operators, and functions are all covered in Chapter 3, using TLC better is part of Chapter 4, Chapter 5 is all about concurrency, and Chapter 6 is stuttering and temporal properties.
As for the basics of PlusCal, that will be the next chapter.