Introduction
In this chapter we’ll be introducing PlusCal. PlusCal is a language that compiles down to TLA+. Lamport developed it in 2009 to make TLA+ more accessible to programmers. Most of the things we’ll want to do will be significantly easier in PlusCal than in TLA+. This chapter will cover all of PlusCal with the exception of multiprocess algorithms, which is Chapter 5; and fair processes, which is Chapter 6.
Specifications
Layout of a Spec
- (1)
All TLA+ specs must start with at least four – on each side of the MODULE and four = at the end. This is for backwards compatibility reasons. Everything outside these two boundaries is ignored, and people often put metadata there.
- (2)
The module name must be the same as the filename.
- (3)
EXTENDS is the import keyword. We’re importing the Integers module.
- (4)
* starts a line comment in TLA+, (* ... *) is a block comment. PlusCal specs are placed in comments (so the parser ignores it), are started with --algorithm <name>, and closed with end algorithm. The name of the algorithm does not have to match the filename.
- (5)
Inside the algorithm , we initialize variables with variables. Variables are separated by commas or semicolons.
- (6)
This is where we write the algorithm itself.
Note
In the tutorial we had Labels too. They’re only necessary for concurrent algorithms, so for now we leave them out and will cover all of the uses for them in Chapter 5.
Expressions
Everything in an expression is either a value, like {TRUE}, or an operator, like +. In the next chapter we will be writing our own operators, but for this one we will only use the ones provided by the standard library.
Tip
Checking an expression will also evaluate your spec. If you don’t want to run it, you can switch it to “No Behavior Spec” under Model Overview > What is the behavior Spec?
Values
Operator | Meaning | Example |
---|---|---|
x = y | Equals | >> 1 = 2FALSE |
x /= yx # y | Not Equals | >> 1 /= 2TRUE |
x / y | And | >> TRUE / FALSEFALSE |
x / y | Or | >> TRUE / FALSEFALSE |
x := y | Assignment | N/A [PlusCal only] |
~x | Not | >> ~TRUEFALSE |
= VS :=
If = is equality and := is assignment, how come we write variables x = 1 and not variables x := 1? In raw TLA+, there is no assignment, only equality. If you want to initialize x to 1, you write x = 1. If x is initialized and you want to compare x to 1, you write x = 1. If x is initialized and you want to assign it to 1, you write x' = 1. In TLA+, these are all actually equality checks! While this might seem unintuitive, it’s all part of the underlying way TLA+ treats time.
Here’s a rule of thumb: if it’s the first time you’re using the variable, = is initialization. Every other time, = is equality and := is assignment. If you write variables x = 2, y = x, z = (x = y) you will get x = 2, y = 2, z = TRUE. By the time we reach z we’ve already initialized x and y, so (x = y) is an equality check. If this is a little confusing, that’s understandable, and you’ll build an intuition with some experience.
There are four kinds of constructed types in TLA+: sets, tuples/sequences, structures, and functions. Functions are covered in the next chapter. We can cover the basics of the rest here.
Operator | Meaning | Example |
---|---|---|
x in set | Is element of set | >> 1 in 1..2TRUE |
x otin set ~(x in set) | Is not element of set | >> 1 otin 1..2FALSE |
set1 subseteq set2 | Is subset of set | >> {1, 2} subseteq {1, 2, 3}TRUE |
set1 union set2 | Set Union | >> (1..2) union (2..3){1, 2, 3} |
set1 intersect set2 | Set Intersection | >> (1..2) intersect (2..3){2} |
set1 set2 | Set Difference | >> (1..2) (2..3){1} |
Cardinality(set) | Number of elements in set (requires EXTENDS FiniteSets) | >> Cardinality({"a", "b"}) 2 |
Note
When reading specs, you might come across union written as cup and intersect as cap. This is a visual match of the mathematical symbols, ∪ and ∩. We’re using union/intersect because it’s more explicit.
Note
It’s very rare to need to write something like {x in set1 : y in set2}, but it (again, rarely) does happen. In this edge case, it’s treated as a filter on set1.
Operator | Meaning | Example |
---|---|---|
Head(sequence) | Head | >> Head(<<1, 2>>)1 |
Tail(seq) | Tail | >> Tail(<<1, 2, 3>>) <<2, 3>> |
Append(seq, x) | Append | >> Append(<<1, 2>>, 3) <<1, 2, 3>> |
seq1 o seq2 | Combine | >> <<1>> o <<2, 3>> <<1, 2, 3>> |
Len(seq) | Length of sequence | >> <<1, 1, 1, 1>> 4 |
While the terms are interchangeable, by convention we use tuple when we don’t expect to use sequence operators on it or change its length, and we use sequence if we do.
Note
While the syntax for the two is different, structures and sequences are the same data type! We’ll learn more about this when we cover functions in the next chapter.
Here we modified a set inside a struct inside a tuple/sequence.
That covers the basics of data types. Now let’s quickly run through some basic syntax of the PlusCal algorithm body. Most of this should be familiar from programming languages, so we won’t go into too much detail.
PlusCal Algorithm Body
Assignment
Assign an existing variable to a different value. Done with :=.
assert
An assertion. assert TRUE does nothing. assert FALSE raises an error. Adding assertions is one common way we test invariants: the assertion checks that in that step a given expression holds, so if it fails our spec broke the invariant. In order to use assertions, you need to add EXTENDS TLC.
skip
A no-op. We can use this to fill represent parts of the spec that we haven’t filled out yet or conditionals that don’t update anything.
if
While if is the only conditional in PlusCal, it is not the only branching statement. Two others, either and with, will be introduced later in this chapter.
while
Note that we use do here, while for the if statement we use then.
Macros
While set_false doesn’t take x as a parameter, it’s still able to change the variable.
Example
If the item is labeled as “recycling” and it is under the remaining capacity for the recycling bin, the item goes into recycling.
If the item is labeled as “trash” OR the item is labeled as “recycling” and there is not enough recycling capacity AND there is sufficient capacity in the trash bin, the item goes into trash.
Otherwise, it’s dropped on the floor and somebody else gets to sweep it up.
Let’s start by thinking about our representations. The capacities of the two bins can be represented by numbers. The items can be represented by a structure with a key for size and a key for type: an item might look like [type |-> "trash", size |-> 2]. We can represent the numbers in each bin with sets. Finally, we can represent the order that items come in as a sequence, such as <<item1, item2, item3>>.
Next, we should figure out what our invariants are. We don’t have the tools yet to inspect properties on sets, so we can start by just checking that we don’t go over the capacity limit for either bin, and that each bin has the appropriate amount of items in it.
We replaced the bodies of the two conditions branches with calls to add_item. Confirm again that this works.
Complex Behaviors
We now know how to write a very simple spec, but what we have is barely more interesting than a deterministic unit test. If we want to make this useful, we need a way to check not just one setup, but an entire space of setups and runtime occurrences. There are three basic ways to do this.
Multiple Starting States
TLC would first try running the whole algorithm with x = 1, then x = 2, then finally x = 3, which fails. If we added a second variable y that also used in, TLC would check every single possible combination of x and y.
Tip
TLA+ defines a shorthand BOOLEAN for the set {TRUE, FALSE}. This can be useful if you have a flag variable, such as variable is_ready in BOOLEAN.
We can use this to choose some arbitrary number. What about arbitrary sets, structures, and tuples? We have some special operators to generalize them.
Tip
Sometimes you want a structure where one key is always a specific value, but another key is some value in a set. In that case you can wrap the value in a one-element set, as in [key1: set, key2: {value}].
As with everything else, all of these can be freely mixed and matched. We can write variable x in [key: (set1 X set2)] to mean “x is a structure where the value of key is some pair of elements, the first being in set1, the second being in set2.” We can use this to detail complex data structures in our specifications. In particular, we can use this to detail complex starting states that break our spec.
Example
- 1.
Checking the model takes longer. Before, we had one possible starting state. Now we have 10 × 10 × (2 × 6)4 = 2,073,600 starting states. TLC will be clever about checking them, but optimizing your models is an important skill you’ll develop.
- 2.
Our spec failed. The TLC error will look something like this:
This means that the spec failed because one of our asserts failed. The exact values of the failure you get will probably be different run to run, but it will be the same core problem. Sets are unique, and {x} union {x} = {x}, not {x, x}. If we handle two items with the exact same type and size, we end up storing it once but increasing count twice. Then the size of the set and the value of count don’t match up.
Ultimately, the problem is in the set: our count is correct and the set is wrong. We want duplicates, so we should preferably store the items in sequences instead of sets. set union {curr} looks ugly, anyway. Plus, we can get rid of the FiniteSets dependency, since we’d be using Append instead of Cardinality.
It should pass with 9,323,626 states .
Nondeterministic Behavior
Not all behavior is deterministic. A request may succeed or fail, a query might return a random result, there might be one of several choices to make. For single process algorithms, we have two PlusCal constructs to simulate nondeterminisim.
Either
When you model-check your spec, TLC will check all branches simultaneously. We can use this to represent one of several possibilities happening. There is no way to make one possibility more likely than the other. We generally assume that if some possible choice invalidates our spec, no matter how unlikely, it’s something we’ll want to fix.
We can place any assignment or PlusCal expression inside of an either branch. If all of the branches are “macro-valid,” we may place an either inside of a macro.
With
The second case, however, is nondeterministic. TLC will check what happens for all possible assignments of var to elements of set. If the set is empty, the spec halts until the set is not empty. For single-process apps, this is considered a spec failure.
with statements follow macro rules: no double-assignments and no while loops. You can place with statements inside macros.
Warning
with gives you values, not references. If x and y are variables, you could not reassign to them by writing with t in {x, y} do t := 1. You could, though, write with t in {x, y} do x := t.
Example
- 1.
On the sender’s turn, they put a message in transit.
- 2.
On the receiver’s turn, they either receive a message in transit or do nothing (they’re outside cell coverage).
- 1.
The sender places message 1 in transit.
- 2.
The receiver skips.
- 3.
The sender places message 2 in transit.
- 4.
The receiver pulls message 1.
- 5.
The sender places message 3 in transit.
- 6.
The receiver pulls message 3.
- 7.
The receiver pulls message 2.
This is called a concurrency bug. TLA+ is especially well-suited to identifying and debugging concurrency bugs, which is good, because a lot of the nastiest and most subtle bugs are concurrency bugs.
With this fix, the spec is valid (18 states). But there’s a subtle and very dangerous problem here: if you can only send if the other person receives , what if the message is never received? Our only invariant is that, at the very end, the messages have arrived in order. One way to satisfy this is if the messages never arrive at all! This is called a liveness bug and we will study them further in Chapter 6.
If you run this, you should see it fail, with the error “Deadlock reached.” This means TLC reached a state where it can’t make any more progress. In this case, the sender places message 1 in transit, and the receiver receives message 1 but does not reset can_send. The sender can’t do anything else because can_send is false, and the receiver can’t do anything because in_transit is empty. Deadlock.
Deadlocks are a particularly common problem in concurrent code, and we will discuss them more in Chapter 5.
Summary
In this chapter we covered the basics of using PlusCal to write specs. We learned the syntax for sets, tuples/sequences, and structures, as well as how to check multiple starting states and simulate nondeterministic behavior.
In the next chapter, we will learn how to use TLA+ proper to create complex data and invariants. We will also introduce the last data type: functions.