Specifications are more expressive than code. This comes at a cost: it’s not always clear how to go from a specification to reality. A technique for handling this is to write a very abstract spec and expand it into a more detailed, lower-level “implementation” that is closer to the program you’ll be writing. One very common pattern for doing this is to use a state machine. In this chapter, we will learn how to represent them and use them in writing our specifications.
State Machines
This passes with eight states. Only four of those states are distinct, though, which matches our expectations. Note the behavior is nondeterministic : when the door is unlocked and closed, for example, we can either lock it or open it. That’s two separate transitions that are possible. Most state machines (at least most interesting state machines) are concurrent.
Now the spec can deadlock: if we don’t have the key, then we can open the door, lock it, and then close the door again. One way around this is to make the door only closable if it’s unlocked, like how most deadbolts work. If you change the “close” guard to await open / ~locked, the spec passes again (14 states). Only 7 are distinct: we have twice as many initial states due to having a key, but if we don’t have a key then reaching the closed locked door state is impossible.
This passes with 12 states . This actually ends up being about 1.5x more lines, but this style is often more concise and clearer for larger state machines. It also makes the concurrency of the state machine more evident, showing us the framework we’d be using for complex examples. Finally, it lets us include non-atomic transitions via labels, which becomes important when we try to write distributed state machines.
We used await to shape the event flow in both the single-process and the multiprocess state machine. We can roughly describe all PlusCal “state machines” as looking like this: branches and processes controlled by await and either statements.
Scaffolding Implementations
Most real-life problems aren’t explicitly state machines. But many look like state machines and can be abstractly specified as state machines. Then we can implement our spec off that state machine, filling in the details on how the transitions are actually implemented in code and making sure it preserves the same invariants. As an example of this, we will spec a simple data pattern. Some clients can read from and write to a database. We first specify this as a state machine without a database, and then progress to a more-detailed one that more accurately models how, exactly, the clients “read” and “write.”
REFINEMENT
TLA+ formalizes the concept of “scaffolding” with refinement . You refine a spec by writing a second, more detailed spec and showing that every valid behavior of the detailed spec is also a valid behavior of the general spec. But this requires some tooling we don’t have access to in PlusCal, and so it is outside the scope of this book. For now, we’ll have to use a more informal version.
All of this should be fairly simple. The only used internal state so far is db_value, and the only transitions we have are reading db_value (trivially correct) and modifying db_value . The only thing we check (aside from deadlock) is assert result = db_value, which is our primary invariant.
This passes with 20 states. You may also want to check that it passes with two clients (110 states). We will return to the two-client case once we’ve added some implementation details to the spec.
This is the most abstract version of our spec. In reality, we can’t just have the client directly writing to the database; we’d have to have them communicate with whatever controls the database . The key here, though, is that any details we add to the state machine won’t change this overall spec structure. The more complex state machines we make will be more elaborate versions of this simple state machine and will still preserve the same high-level invariants.
This passes with 56 states.
In adding detail to our high-level state machine, we also added new behaviors. Making a request and getting a response is no longer an atomic operation. We want to make sure the implemented version preserves the same invariants: in this case, the assert result = db_value. The abstract state machine and the detailed state machine match for a single client. But what about two clients? Try rerunning both versions with Client <- [ model value ] {c1, c2}.
You should see they don’t agree: the abstract version passes (110 states) while the detailed version fails. Adding that communication layer means that c1 can request a read and get a response, but c2 can write to the database before c1 reads its response.
As always, there are two things we can do: we can change the implementation, or we can rethink what our requirements really are. Here it is worth asking what the invariant of the abstract machine actually means. Is it that the client always knows what’s in the database, or that the database is always honest with the client? Both of these interpretations are compatible with the original invariant.
The former case is more difficult, and we’d have to backtrack and start again. In the latter case, we can achieve it by further elaborating on our invariant.
Ghost Variables
One way to formalize “the database is honest with the client” is to say that whatever the client receives was correct data at the time of the request. Our implementation only tracks what the data is currently. We can alter it to also store history. This, though, would be irrelevant to our actual, physical system: the history only matters for checking the invariant, not for the actual implementation of the interactions we want.
The trick here is that our specification is doing two things. First, it shows how our state machine is intended to work. Second, it represents the wider context our state machine exists in. Even if the implementation doesn’t track prior values, that’s still part of our wider context. What we can do is add more detail to that context and see if that gets us to a correct system.
Contextual data that we track to verify invariants is called auxiliary, or ghost, data. We can also have ghost operators, ghost processes, etc. What’s important is that the ghost data is only used for checking invariants. While our spec can affect our ghost data, our ghosts cannot change the behavior of the spec. It may define which states are considered invariant breaking, but it cannot prevent the model checker from reaching those states.
The client isn’t reading ghost_db_history either. It only appears in an assertion and so is only used to check an invariant. With this, our two-client model passes (6,098 states). If we implemented this, the database would not be tracking history, since ghost_db_history isn’t part of the implementation. But it would still conform to the spec we have.
We now have a relationship between our initial, abstract state machine and our final spec. If we read the invariant as “the response is the value of the database at the time the request is processed,” then our final spec implements the initial state machine. If we read the invariant as “the client always reads the current value of the database,” then our final spec does not implement the initial state machine.
Summary
We learned how to write a state machine pattern, how to use it as a means of designing the implementations of specs, and the value of ghost values for checking properties of a spec. We designed a client-database system and showed that it behaves correctly.
TLA+ helped us create an implementation that matched a higher-level specification. In the next chapter, we will go one step higher and use TLA+ to turn a set of informal business requirements into a specification, formalizing the requirements in the process.