2. Decomposition

Software architecture is the high-level design and structure of the software system. While designing the system is quick and inexpensive compared with building the system, it is critical to get the architecture right. Once the system is built, if the architecture is defective, wrong, or just inadequate for your needs, it is extremely expensive to maintain or extend the system.

The essence of the architecture of any system is the breakdown of the concept of the system as a whole into the components that comprise it, be it a car, a house, a laptop, or a software system. A good architecture also prescribes how these components interact at run-time. The act of identifying the constituent components of a system is called system decomposition.

The correct decomposition is critical. A wrong decomposition means a wrong architecture, which in turn inflicts horrendous pain in the future, often leading to a complete rewrite of the system.

In years past, these building blocks were C++ objects and later COM, Java, or .NET components. In a modern system and in this book, services (as in service-orientation) are the most granular unit of the architecture. However, the technology used to implement the components and their details (such as interfaces, operations, and class hierarchies) are detailed design aspects, not system decomposition. In fact, such details can change without ever affecting the decomposition and therefore the architecture.

Unfortunately, the majority, if not the vast majority, of all software systems are not designed correctly and arguably are designed in the worst possible way. The design flaws are a direct result of the incorrect decomposition of the systems. This chapter therefore starts by explaining why the common ways of decomposition are flawed to the core and then discusses the rationale behind The Method’s decomposition approach. You will also see some powerful and helpful techniques to leverage when designing the system.

Avoid Functional Decomposition

Functional decomposition decomposes a system into its building blocks based on the functionality of the system. For example, if the system needs to perform a set of operations, such as invoicing, billing, and shipping, you end up with the Invoicing service, the Billing service, and the Shipping service.

Problems with Functional Decomposition

The problems with functional decomposition are many and acute. At the very least, functional decomposition couples services to the requirements because the services are a reflection of the requirements. Any change in the required functionality imposes a change on the functional services. Such changes are inevitable over time and impose a painful future change to your system by requiring a new decomposition after the fact to reflect the new requirements. In addition to costly changes to the system, functional decomposition precludes reuse and leads to overly complex systems and clients.

Precluding Reuse

Consider a simple functionally decomposed system that uses three services A, B, and C, which are called in the order of A then B then C. Because functional decomposition is also decomposition based on time (call A and then call B), it effectively precludes individual reuse of services. Suppose another system also needs a B service (such as Billing). Built into the fabric of B is the notion that it was called after an A and before a C service (such as first Invoicing, and only then Billing against an invoice, and finally Shipping). Any attempt to lift the B service from the first system and drop it in the second system will fail because, in the second system, no one is doing A before it and C after it. When you lift the B service, the A and the C services are hanging off it. B is not an independent reusable service at all—A, B, and C are a clique of tightly coupled services.
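
To make this coupling concrete, here is a minimal sketch (in TypeScript, using hypothetical service and type names that are not taken from any particular system) of what a functionally decomposed B service tends to look like:

    // Hypothetical types: the output of A (Invoicing) and the input of C (Shipping).
    interface Invoice { invoiceId: string; amount: number; }
    interface ShippingRequest { invoiceId: string; paymentId: string; }

    // The B service (Billing) is written against A's output and C's expectations.
    class BillingService {
      // Billing assumes an invoice already exists, that is, that A has already run...
      bill(invoice: Invoice): ShippingRequest {
        const paymentId = this.charge(invoice.amount);
        // ...and its output is shaped for whatever C needs next.
        return { invoiceId: invoice.invoiceId, paymentId };
      }

      private charge(amount: number): string {
        return `payment-for-${amount}`; // stand-in for real payment processing
      }
    }

Dropping this BillingService into a system that has no Invoicing before it and no Shipping after it makes no sense; the coupling travels with the service.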

Too Many or Too Big

One way of performing functional decomposition is to have as many services as there are variations of the functionalities. This decomposition leads to an explosion of services, since a decently sized system may have hundreds of functionalities. Not only do you have too many services, but these services often duplicate a lot of the common functionality, each customized to its case. The explosion of services inflicts a disproportionate cost in integration and testing and increases overall complexity.

Another functional decomposition approach is to lump all possible ways of performing the operations into mega services. This leads to bloating in the size of the services, making them overly complex and impossible to maintain. Such god monoliths become ugly dumping grounds for all related variations of the original functionality, with intricate relationships inside and between the services.

Functional decomposition, therefore, tends to make services either too big and too few or too small and too many. You often see both afflictions side by side in the same system.

Clients Bloat and Coupling

Functional decomposition often leads to flattening of the system hierarchy. Since each service or building block is devoted to a specific functionality, someone must combine these discrete functionalities into a required behavior. That someone is often the client. When the client is the one orchestrating the services, the system becomes a flat two-tier system: clients and services, and any notion of additional layering is gone. Suppose your system needs to perform three operations (or functionalities): A, B and C, in that order. As illustrated in Figure 2-1, the client must stitch the services together.

Figure 2-1 Bloated client orchestrating functionality

By bloating the client with the orchestration logic, you pollute the client code with the business logic of the system. The client is no longer just about invoking operations on the system or presenting information to users. The client is now intimately aware of all internal services, how to call them, how to handle their errors, how to compensate for the failure of B after the success of A, and so on. Calling the services is almost always synchronous because the client proceeds along the expected sequence of A then B then C, and it is difficult otherwise to ensure the order of the calls while remaining responsive to the outside world. Furthermore, the client is now coupled to the required functionality. Any change in the operations, such as calling B' instead of B, forces the client to reflect that change. The hallmark of a bad design is when any change to the system affects the client. Ideally, the client and services should be able to evolve independently. Decades ago, software engineers discovered that it was a bad idea to include business logic with the client. Yet, when designed as in Figure 2-1, you are forced to pollute the client with the business logic of sequencing, ordering, error compensation, and duration of the calls. Ultimately, the client is no longer the client—it has become the system.
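
As a rough sketch of that pollution (all names and signatures here are hypothetical, invented for illustration), the client ends up owning the sequencing and the compensation logic:

    interface Order { id: string; }
    interface Invoice { id: string; }
    interface Payment { id: string; }

    interface A { invoice(order: Order): Promise<Invoice>; cancelInvoice(invoice: Invoice): Promise<void>; }
    interface B { bill(invoice: Invoice): Promise<Payment>; refund(payment: Payment): Promise<void>; }
    interface C { ship(order: Order, payment: Payment): Promise<void>; }

    // The client must know the call order, handle each failure, and compensate
    // for earlier successes: business logic living in the client.
    async function submitOrder(a: A, b: B, c: C, order: Order): Promise<void> {
      const invoice = await a.invoice(order);
      try {
        const payment = await b.bill(invoice);
        try {
          await c.ship(order, payment);
        } catch (e) {
          await b.refund(payment);        // compensate for B's success
          throw e;
        }
      } catch (e) {
        await a.cancelInvoice(invoice);   // compensate for A's success
        throw e;
      }
    }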

What if there are multiple clients (e.g., rich clients, web pages, mobile devices), each trying to invoke the same sequence of functional services? You are destined to duplicate that logic across the clients, making maintenance of all those clients wasteful and expensive. As the functionality changes, you now are forced to keep up with that change across multiple clients, since all of them will be affected. Often, once that is the case, developers try to avoid any changes to the functionality of the services because of the cascading effect it will have on the clients. With the multiplicity of clients, each with its own version of the sequencing tailored to its needs, it becomes even more challenging to change or interchange services, thus precluding reuse of the same behavior across the clients. Effectively, you end up maintaining multiple complex systems, trying to keep them all in sync. Ultimately, this leads to both stifling of innovation and increased time to market when the changes are forced through development and production.

As an example of the problems with functional decomposition discussed thus far, consider Figure 2-2. It is the visualization of cyclomatic complexity analysis of a system I reviewed. The design methodology used was functional decomposition.

Figure 2-2 Complexity analysis of a functional design

Cyclomatic complexity measures the number of independent paths through the code of a class or service. The more convoluted and coupled the internals, the higher the cyclomatic complexity score. The tool used to generate Figure 2-2 measured and rated the various classes in the system. In the visualization, the more complex the class, the larger and darker it appears. At first glance, you see three very large and very complex classes. How easy would it be to maintain MainForm? Is this just a form, a UI element, a clean conduit from the user to the system, or is it the system? Observe, in the size and shade of FormSetup, the complexity required just to set up MainForm. Not to be outdone, Resources is very complex, since it is very complex to change the resources used in MainForm. Ideally, Resources should have been trivial, comprising simple lists of images and strings. The rest of the system is made up of dozens of small, simple classes, each devoted to a particular functionality. The smaller classes are literally in the shadow of the three massive ones. However, while each of the small classes may be trivial, their sheer number is a complexity issue all on its own, involving intricate integration across that many classes. The result is both too many components and components that are too big, as well as a bloated client.
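
As a reminder of what the metric itself counts, here is a minimal, hypothetical example; each decision point adds an independent path, so even this tiny function has a cyclomatic complexity of 3:

    // Cyclomatic complexity of a single function is its number of decision
    // points plus one. Two if-statements here yield a complexity of 3.
    function shippingCost(weightKg: number, express: boolean): number {
      let cost = 5;
      if (weightKg > 10) { cost += 7; } // decision point 1
      if (express) { cost *= 2; }       // decision point 2
      return cost;
    }

A class such as MainForm accumulates decision points like these across every responsibility it has absorbed, which is what its size and shade in Figure 2-2 represent.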

Multiple Points of Entry

Another problem with the decomposition of Figure 2-1 is that it requires multiple points of entry to the system. The client (or clients) needs to enter the system in three places: once for the A, then for the B, then for the C service. This means there are multiple places to worry about authentication, authorization, scalability, instance management, transaction propagation, identities, hosting, and so on. When you need to change the way you perform any one of these aspects, you will need to change it in multiple places across services and clients. Over time, these multiple changes make adding new and different clients very expensive.

Services Bloating and Coupling

As an alternative to sequencing the functional services as in Figure 2-1, you can opt for what, on the face of it, appears as a lesser evil by having the functional services call each other, as shown in Figure 2-3.

Figure 2-3 Chaining functional services

The advantage of doing so is that you get to keep the clients simple and even asynchronous: the clients issue the call to the A service. The A service then calls B, and B calls C.

The problem now is that the functional services are coupled to each other and to the order of the functional calls. For example, you can call the Billing service only after the Invoicing service but before the Shipping service. In the case of Figure 2-3, built into the A service is the knowledge that it needs to call the B service. The B service can be called only after the A service and before the C service. A change in the required ordering of the calls is likely to affect all services up and down the chain because their implementation will have to change to reflect the new required order.

But Figure 2-3 does not reveal the full picture. The B service of Figure 2-3 is drastically different from that of Figure 2-1. The original B service performed only the B functionality. The B service in Figure 2-3 must be aware of the C service, and the B contract must contain the parameters that will be required by the C service to perform its functionality. These details were the responsibility of the client in Figure 2-1. The problem is compounded by the A service, which must now accommodate in its service contract the parameters required for calling the B and the C services for them to perform their respective business functionality. Any change to the B and C functionality is reflected in a change to the implementation of the A service, which is now coupled to them. This kind of bloating and coupling is depicted in Figure 2-4.

Figure 2-4 Chaining functionality leads to bloated services.
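
A hypothetical sketch of that contract bloat (the names and parameter shapes are illustrative only) might look like this:

    // What B needs (billing) and what C needs (shipping) must both appear in
    // A's contract so that A can forward them down the chain.
    interface BillingDetails  { cardToken: string; }
    interface ShippingDetails { address: string; carrier: string; }

    interface ARequest {
      orderId: string;
      billing: BillingDetails;   // not used by A itself, only passed along to B
      shipping: ShippingDetails; // not used by A or B, only passed along to C
    }

    // A's contract now changes whenever the needs of B or C change.
    interface AService {
      process(request: ARequest): Promise<void>;
    }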

Sadly, even Figure 2-4 does not tell the whole truth. Suppose the A service performed the A functionality successfully and then proceeded to call the B service to perform the B functionality. The B service, however, encountered an error and failed to execute properly. If A called B synchronously, then A must be intimately aware of the internal logic and state of B in order to recover from B's error. This means the B functionality must also reside in the A service. If A called B asynchronously, then the B service must now somehow reach back to the A service and undo the A functionality or contain the rollback of A within itself. In other words, the A functionality also resides in the B service. This creates tight coupling between the B service and the A service and bloats the B service with the need to compensate for the success of the A service. This situation is shown in Figure 2-5.

Figure 2-5 Additional bloating and coupling due to compensation
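
Continuing the same hypothetical sketch, once B owns the compensation for A, the coupling is explicit in B's own code:

    interface Invoice { id: string; }
    interface AService { cancelInvoice(invoice: Invoice): Promise<void>; }
    interface CService { ship(invoiceId: string): Promise<void>; }

    // Illustrative only: B now holds a reference to A (to undo it) and to C
    // (to call it next), exactly the coupling Figure 2-5 depicts.
    class BService {
      constructor(private a: AService, private c: CService) {}

      async bill(invoice: Invoice): Promise<void> {
        try {
          await this.charge(invoice);
        } catch (e) {
          await this.a.cancelInvoice(invoice); // B failed, so B must undo A's work
          throw e;
        }
        await this.c.ship(invoice.id);         // B must also know that C comes next
      }

      private async charge(invoice: Invoice): Promise<void> {
        // stand-in for the actual billing work
      }
    }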

The issue is compounded in the C service. What if both the A and B functionalities succeeded and completed, but the C service failed to perform its business function? The C service must reach back to both the B and the A services to undo their operations. This creates far more bloating in the C service and couples it to the A and B services. Given the coupling and bloating in Figure 2-5, what will it take to replace the B service with a B' service that performs the functionality differently than B? What will be the adverse effects on the A and C services? Again, what degree of reuse exists in Figure 2-5 when the functionality in the services is asked for in other contexts, such as calling the B service after the D service and before the E service? Are A, B, and C three distinct services or just one fused mess?

Reflecting on Functional Decomposition

Functional decomposition holds an almost irresistible allure. It looks like a simple and clear way of designing the system, requiring you to simply list the required functionalities and then create a component in your architecture for each. Functional decomposition (and its kin, the domain decomposition discussed later) is how most systems are designed. Most people choose functional decomposition naturally, and it is likely what your computer science professor showed you in school. The prevalence of functional decomposition in poorly designed systems makes it a near-perfect indicator of something to avoid. At all costs, you must resist the temptations of functional decomposition.

Nature of the Universe (TANSTAAFL)

You can prove that functional decomposition is precluded from ever working without using a single software engineering argument. The proof has to do with the very nature of the universe, specifically, the first law of thermodynamics. Stripping away the math, the first law of thermodynamics simply states that you cannot add value without sweating. A colloquial way of saying the same is: “There ain’t no such thing as a free lunch.”

Design, by its very nature, is a high-added-value activity. You are reading this book instead of yet another programming book because you value design, or put differently, you think design adds value, or even a lot of value.

The problem with functional decomposition is that it endeavors to cheat the first law of thermodynamics. The outcome of a functional decomposition, namely, system design, should be a high-added-value activity. However, functional decomposition is easy and straightforward: given a set of requirements that call for performing the A, B, and C functionalities, you decompose into the A, B, and C services. “No sweat!” you say. “Functional decomposition is so easy that a tool could do it.” However, precisely because it is a fast, easy, mechanistic, and straightforward design, it also manifests a contradiction to the first law of thermodynamics. Since you cannot add value without effort, the very attributes that make functional decomposition so appealing are those that preclude functional decomposition from adding value.

The Anti-Design Effort

It will be an uphill struggle to convince colleagues and managers to do anything other than functional decomposition. “We have always done it that way,” they will say. There are two ways to counter that argument. The first is replying, “And how many times have we met the deadline or the budget to which we committed? What were our quality and complexity like? How easy was it to maintain the system?”

The second is to perform an anti-design effort. Inform the team that you are conducting a design contest for the next-generation system. Split the team into halves, each in a separate conference room. Ask the first half to produce the best design for the system. Ask the second half to produce the worst possible design: a design that will maximize your inability to extend and maintain the system, a design that will disallow reuse, and so on. Let them work on it for one afternoon and then bring them together. When you compare the results, you will usually see they have produced the same design. The labels on the components may differ, but the essence of the design will be the same. Only then do you confess that the two halves were not working on the same problem, and discuss the implications. Perhaps a different approach is called for this time.

Example: Functional House

The fact you should never design using functional decomposition is a universal observation that has nothing to do with software systems. Consider building a house functionally, as if it were a software system. You start by listing all the required functionalities of the house, such as cooking, playing, resting, sleeping, and so on. You then create an actual component in the architecture for each functionality, as shown in Figure 2-6.

Figure 2-6 Functional decomposition of a house

While Figure 2-6 is already preposterous, the true insanity becomes evident only when it is time to build this house. You start with a clean plot of land and build cooking. Just cooking. You take a microwave oven out of its box and put it aside. You pour a small concrete pad, build a wood frame on the pad, cover it with a countertop, and place the microwave on it. You build a small pantry for the microwave, hammer a tiny roof over it, and connect just the microwave to the power grid. “We have cooking!” you announce to the boss and customers.

But is cooking really done? Can cooking ever be done this way? Where are you serving the meal, storing the leftovers, or disposing of trash? What about cooking over the gas stove? What will it take to duplicate this feat for cooking over the stove? What degree of reuse can you have between the two separate ways of expressing the functionality of cooking? Can you extend any one of them easily? What about cooking with a microwave somewhere else? What does it take to relocate the microwave? All of this mess is not even the beginning, because it all depends on the type of cooking you perform. Perhaps you need to build separate cooking functionality if cooking involves multiple appliances and differs by context—for example, if you are cooking breakfast, lunch, dinner, dessert, or snacks. You end up either with an explosion of minute cooking services, each dedicated to a specific scenario that must be known in advance, or with a massive cooking service that has it all. Would you ever build a house like that? If not, why design and build a software system that way?

When to Use Functional Decomposition

The derision in these pages does not mean functional decomposition has no use at all. Functional decomposition has a place—it is a decent requirements discovery technique. It helps architects (or product managers) discover hidden or implied areas of functionality. Starting at the top, even with vague functional requirements, you can drive functional decomposition to a very fine level, uncovering requirements and their relationships, arranging the requirements in a tree-like manner, and identifying redundancies or mutually exclusive functionalities. Extending functional decomposition into a design, however, is deadly. There should never be direct mapping between the requirements and the design.

Avoid Domain Decomposition

The house design in Figure 2-6 is obviously absurd. In your house, you likely do the cooking in the kitchen, so an alternative decomposition of the house is shown in Figure 2-7. This form of decomposition is called domain decomposition: decomposing a system into building blocks based on the business domains, such as sales, engineering, accounting, and shipping. Sadly, domain decomposition such as that shown in Figure 2-7 is even worse than the functional decomposition of Figure 2-6. The reason domain decomposition does not work is that it is still functional decomposition in disguise: Kitchen is where you do the cooking, Bedroom is where you do the sleeping, Garage is where you do the parking, and so on.

Figure 2-7 Domain decomposition of a house

In fact, every one of the functional areas of Figure 2-6 can be mapped to domains in Figure 2-7, which presents severe problems. While each bedroom may be unique, you must duplicate the functionality of sleeping in all of them. Further duplication occurs when sleeping in front of the TV in the living room or when entertaining guests in the kitchen (as almost all house parties end up in the kitchen). Each domain often devolves into an ugly grab bag of functionality, increasing the internal complexity of the domain. That increased internal complexity causes you to avoid the pain of cross-domain connectivity, so communication across domains is typically reduced to simple state changes (CRUD-like) rather than actions that trigger the execution of required behaviors spanning the domains. Composing more complex behaviors across domains is very difficult. Some functionalities are simply impossible in such domain decompositions. For example, in the house in Figure 2-7, where would you perform cooking that cannot take place in the kitchen (e.g., a barbecue)?

Building a Domain House

As with the pure functional approach, the real problems with domain decomposition become evident during construction. Imagine building a house along the decomposition of Figure 2-7. You start with a clean plot of land. You dig a trench for the foundation for the kitchen, pour concrete for the foundation (just for the kitchen), and add bolts in the concrete. You then erect the kitchen walls (all have to be exterior walls); bolt them to the foundation; run electrical wires and plumbing in the walls; connect the kitchen to the water, power, and gas supplies; connect the kitchen to the sewer discharge; add heating and cooling ducts and vents; connect the kitchen to a furnace; add water, power, and gas meters; build a roof over the kitchen; screw drywall on the inside; hang cabinets; coat the outside walls (all walls) with stucco; and paint it. You announce to the customer that the Kitchen is done and that milestone 1.0 is met.

Then you move on to the bedroom. You first bust the stucco off the kitchen walls to expose the bolts connecting the walls to the foundation and unbolt the kitchen from the foundation. You disconnect the kitchen from the power supply, gas supply, water supply, and sewer discharge and then use expensive hydraulic jacks to lift the kitchen. While suspending the kitchen in midair, you shift it to the side so that you can demolish the foundation for the kitchen with jackhammers, hauling the debris away and paying expensive dump fees. Now you can dig a new trench that will contain a continuous foundation for the bedroom and the kitchen. You pour concrete into the trenches to cast the new foundation and add the bolts hopefully at exactly the same spots as before. Next, you very carefully lower the kitchen back on top of the new foundation, making sure all the bolt holes align (this is next to impossible). You erect new walls for the bedroom. You temporarily remove the cabinets from the kitchen walls; remove the drywall to expose the inner electrical wires, pipes, and ducts; and connect the ducts, plumbing, and wires to those of the bedroom. You add drywall in the kitchen and the bedroom, rehang the kitchen cabinets, and add closets in the bedroom. You knock down any remaining stucco from the walls of the kitchen so that you can apply continuous, crack-free stucco on the outside walls. You must convert several of the previous outside walls of the kitchen to internal walls now, with implications on stucco, insulation, paint, and so on. You remove the roof of the kitchen and build a new continuous roof over the bedroom and the kitchen. You announce to the customer that milestone 2.0 is met, and Bedroom 1 is done.

The fact that you had to rebuild the kitchen is not disclosed. The fact that building the kitchen the second time around was much more expensive and riskier than the first time is also undisclosed. What will it take to add another bedroom to this house? How many times will you end up building and demolishing the kitchen? How many times can you actually rebuild the kitchen before it crumbles into a shifting pile of useless debris? Was the kitchen really done when you announced it so? Rework penalties aside, what degree of reuse is there between the various parts of the house? How much more expensive is building a house this way? Why would it make sense to build a software system this way?

Faulty Motivation

The motivation for functional or domain decomposition is that the business or the customer wants its features as soon as possible. The problem is that you can never deploy a single feature in isolation. There is no business value in Billing independent from Invoicing and Shipping.

The situation is even worse when legacy systems are involved. Rarely do developers get the privilege of a completely new, green-field system. Most likely there is an existing, decaying system that was designed functionally whose inflexibility and maintenance costs justify the new system.

Suppose your business has three functionalities A, B, and C, running in a legacy system. When building a new system to replace the old, you decide to build and, more important, deploy the A functionality first to satisfy the customers and managers who wish to see value early and often. The problem is that the business has no use for just A on its own. The business needs B and C as well. Performing A in the new system and B and C in the old system will not work, because the old system does not know about the new system and cannot execute just B and C. Doing A in both the old system and the new system adds no value and even has negative value due to the repeated work, so users are likely to revolt. The only solution is to somehow reconcile the old and the new systems. The reconciliation typically far eclipses in complexity the challenge of the original underlying business problem, so developers end up solving a far more complex problem. To use the house analogy again, what would it be like to live in a cramped old house while building a new house on the other side of town according to Figure 2-6 or Figure 2-7? Suppose you are building just cooking or the kitchen in the new house while continuing to live in the old house. Every time you are hungry, you have to drive to the new house and come back. You would not accept it with your house, so you should not inflict this kind of abuse on your customers.

Testability and Design

A crucial flaw of both functional and domain decomposition has to do with testing. With such designs, the level of coupling and complexity is so high that the only kind of testing developers can do is unit testing. However, that does not make unit testing important, and it is merely another example of the streetlight effect1 (i.e., searching for something where it is easiest to look).

1. https://en.wikipedia.org/wiki/Streetlight_effect

The sad reality is that unit testing is borderline useless. While unit testing is an essential part of testing, it cannot really test a system. Consider a jumbo jet that has numerous internal components (pumps, actuators, servos, gears, turbines, etc.). Now suppose all components have independently passed unit testing perfectly, but that is the only testing that took place before the components were assembled into an aircraft. Would you dare board that airplane? The reason unit testing is so marginal is that in any complex system, the defects are not going to be in any of the units but rather are the result of the interactions between the units. This is why you instinctively know that, while each component in the jumbo jet example works, the aggregate could be horribly wrong. Worse, even if the complex system is at a perfect state of impeccable quality, changing a single, unit-tested component could break some other unit(s) relying on an old behavior. You must repeat testing of all units when changing a single unit. Even then it would be meaningless because the change to one of the components could affect some interaction between other components or a subsystem, which no unit testing could discover. The only way to verify change is full regression testing of the system, its subsystems, its components and interactions, and finally its units. If, as a result of your change, other units need to change, the effect on regression testing is nonlinear. The inefficacy of unit testing is not a new observation and has been demonstrated across thousands of well-measured systems.

In theory, you could perform regression testing even on a functionally decomposed system. In practice, the complexity of that task would set the bar very high. The sheer number of the functional components would make testing all the interactions impractical. The very large services would be internally so complex that no one could effectively devise a comprehensive strategy that tests all code paths through such services. With functional decomposition, most developers give up and perform just simple unit testing. Therefore, by precluding regression testing, functional decomposition makes the entire system untestable, and untestable systems are always rife with defects.

Physical Versus Software Systems

In this book, I resort to using examples from the physical world (such as houses) to demonstrate universal design principles. A common sentiment in the software industry is that you cannot extrapolate from the design of such physical entities to software, that software design and construction are somehow exempt from the design or process limitations of physical systems, or that software is too different from physical systems. After all, in software you can paint a house first and then build the walls to fit the paint. In software you do not have cost-of-goods such as beams and bricks.

I find that not only can the industry borrow experience and best practices from the physical world; it must do so. Contrary to intuition, software requires design even more than physical systems do. The reason is simple: complexity. The complexity of physical systems such as typical houses is capped by physical constraints. You cannot have a poorly designed house with hundreds of interconnecting corridors and rooms. The walls will either weigh too much, have too many openings, be too thin, have doors too small, or cost too much to assemble. You cannot use too much building material because the house will implode, or you will not have the cash flow to buy it or a place to store the excess material on-site.

Without such natural physical restraints, complexity in software systems can get quickly out of control. The only way to rein in that complexity is to apply good engineering methods, of which design and process are paramount. Well-designed software systems are very much like physical entities and are built very much the same way. They are like well-designed machines.

Functional or domain decomposition makes no sense when designing and building either a house or a software system. All complex entities (physical or not) share the same abstract attributes, from the design decision tree to the project critical path of execution. All composite systems should be designed to be safe, maintainable, reusable, extensible, and of high quality. This is true for a house, a machine part, or a software system. These are practical engineering attributes, and the only way to obtain and maintain them is to use universal engineering practices.

That said, there is a fundamental difference between a physical system and a software system: visibility. Anyone who tries to build a house as in Figure 2-6 or Figure 2-7 will be fired on the spot. Such a person is clearly insane, and the horrendous waste of building material, time, and money as well as the risk of injuries would be plain for everyone to see. The problem with software systems is that while there is enormous waste, that waste is hidden. In software, dust and debris are replaced by wasted career prospects, energy, and youth. Yet no one ever sees or cares about this hidden waste, and the insanity is not only permitted but encouraged, as if the inmates have taken over the asylum. Correct design allows you to break free and restore control by eliminating the concealed waste. This is even more the case with project design, as the second part of this book shows.

Example: Functional Trading System

Instead of a house, consider the following simplified requirements for a stock trading system for a financial company:

  • The system should enable in-house traders to:

    –  Buy and sell stocks

    –  Schedule trades

    –  Issue reports

    –  Analyze the trades

  • The users of the system use a browser to connect to it and manage connected sessions, completing a form and submitting the request.

  • After a trade, report, or analysis request, the system sends an email to the users confirming their request or containing the results.

  • The data should be stored in a local database.

A straightforward functional decomposition would yield the design of Figure 2-8.

Figure 2-8 Functional trading system

Each of the functional requirements is expressed in a respective component of the architecture. Figure 2-8 represents a common design to which many novice software developers would gravitate without hesitation.

Problems with the Functional Trading System

The flaws of such a system design are many. It is very likely that the client in this design is the one that orchestrates Buying Stocks, Selling Stocks, and Trade Scheduling; issues a report with Reporting; and so on. Suppose the user wants to fund the purchase of a certain number of stocks by selling other stocks. This means two orders: first sell and then buy. But what should the client do if, by the time these two transactions take place, the price of the stocks sold has dropped or the price of the bought stocks has risen so that the selling cannot fulfill the buying? Should the client buy just as many as possible? Should it perhaps sell more stocks than intended? Should it dip into the cash account behind the trading account to supplement the order? Should it abort the whole thing? Should it ask for user assistance? The exact resolution is immaterial for this discussion. Whatever the resolution, it requires business logic, which now resides in the client.
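
To see how quickly that business logic accumulates in the client, consider a hypothetical sketch of just the sell-then-buy scenario (the names, signatures, and the chosen fallback policy are all invented for illustration):

    interface SellResult { proceeds: number; }
    interface SellingStocks { sell(symbol: string, quantity: number): Promise<SellResult>; }
    interface BuyingStocks  { buy(symbol: string, quantity: number, budget: number): Promise<void>; }

    // The client funds a purchase by selling other stocks first.
    async function fundPurchaseBySelling(
      selling: SellingStocks, buying: BuyingStocks,
      sellSymbol: string, sellQuantity: number,
      buySymbol: string, buyQuantity: number, estimatedCost: number): Promise<void> {
      const { proceeds } = await selling.sell(sellSymbol, sellQuantity);
      if (proceeds >= estimatedCost) {
        await buying.buy(buySymbol, buyQuantity, proceeds);
      } else {
        // The resolution policy (buy fewer shares, sell more, dip into cash,
        // abort, ask the user...) is business logic, and it now lives in the client.
        const affordableQuantity = Math.floor(buyQuantity * (proceeds / estimatedCost));
        await buying.buy(buySymbol, affordableQuantity, proceeds);
      }
    }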

What will it take to change the client from a web portal to a mobile device? Would that not mean duplicating the business logic into the mobile device? It is likely that little of the business logic and the effort invested in developing it for the web client can be salvaged and reused in the mobile client because it is embedded in the web portal. Over time, the developers will end up maintaining several versions of the business logic in multiple clients.

Per the requirements, Buying Stocks, Selling Stocks, Trade Scheduling, Reporting, and Analyzing all respond to the user with an email listing their activities. What if the users prefer to receive a text message (or a paper letter) instead of an email? You will have to change the implementation of Buying Stocks, Selling Stocks, Trade Scheduling, Reporting, and Analyzing activities from an email to a text message.

Per the design decision, the data is stored in a database, and Buying Stocks, Selling Stocks, Trade Scheduling, Reporting, and Analyzing all access that database. Now suppose you decide to move the data storage from the local database to a cloud-based solution. At the very least, this will force you to change the data-access code in Buying Stocks, Selling Stocks, Trade Scheduling, Reporting, and Analyzing to go from a local database to a cloud offering. The way you structure, access, and consume the data has to change across all components.

What if the client wishes to interact with the system asynchronously, issuing a few trades and collecting the results later? You built the components with the notion of a connected, synchronous client that orchestrates the components. You will likely need to rewrite Buying Stocks, Selling Stocks, Trade Scheduling, Reporting, and Analyzing activities to orchestrate each other, along the lines of Figure 2-5.

Often, financial portfolios are composed of multiple financial instruments besides stocks, such as currencies, bonds, commodities, and even options and futures on those instruments. What if the users of the system wish to start trading currencies or commodities instead of stocks? What if the users demand a single application, rather than several applications, to manage all of their portfolios? Buying Stocks, Selling Stocks, and Trade Scheduling are all about stocks and cannot handle currencies or bonds, requiring you to add more components (as in Figure 2-6). Similarly, Reporting and Analyzing need a major rewrite to accommodate reporting and analysis of trades other than stocks. The client needs a rewrite to accommodate the new trade items.

Even without branching to commodities, what if you must localize the application to foreign markets? At the very least, the client will need a serious makeover to accommodate language localization, but the real effect is going to be the system components again. Foreign markets are going to have different trading rules, regulations, and compliance requirements, drastically affecting what the system is allowed to do and how it is to go about trading. This will mean much rework to Buying Stocks, Selling Stocks, Trade Scheduling, Reporting, and Analyzing whenever entering a new locale. You are going to end up with either bloated god services that can trade in any market or a version of the system for each deployment locale.

Finally, all components presently connect to some stock ticker feed that provides them with the latest stock values. What is required to switch to a new feed provider or to incorporate multiple feeds? At the very least, Buying Stocks, Selling Stocks, Trade Scheduling, Reporting, and Analyzing will require work to move to a new feed, connect to it, handle its errors, pay for its service, and so on. There are also no guarantees that the new feed uses the same data format as the old one. All components require some conversion and transformation work as well.

Volatility-Based Decomposition

The Method’s design directive is:

Decompose based on volatility.

Volatility-based decomposition identifies areas of potential change and encapsulates those into services or system building blocks. You then implement the required behavior as the interaction between the encapsulated areas of volatility.

The motivation for volatility-based decomposition is simplicity itself: any change is encapsulated, containing the effect on the system.
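
As a small, hypothetical illustration (not a complete or definitive decomposition of the trading system), two of the volatilities from the earlier example (how users are notified and where trades are stored) could each be encapsulated behind a component:

    // Encapsulates the notification volatility: email, text message, paper letter...
    interface Notification {
      notify(user: string, message: string): Promise<void>;
    }

    // Encapsulates the storage volatility: local database, cloud offering...
    interface TradeStorage {
      save(trade: { symbol: string; quantity: number }): Promise<void>;
    }

    class TradeWorkflow {
      constructor(private storage: TradeStorage, private notification: Notification) {}

      async execute(user: string, symbol: string, quantity: number): Promise<void> {
        await this.storage.save({ symbol, quantity });
        await this.notification.notify(user, `Trade recorded: ${quantity} x ${symbol}`);
      }
    }

Switching from email to text messages, or from a local database to a cloud store, now changes only the component behind the respective interface; the rest of the system never notices.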

When you use volatility-based decomposition, you start thinking of your system as a series of vaults, as in Figure 2-9.

Figure 2-9 Encapsulated areas of volatility (Images: media500/Shutterstock; pikepicture/Shutterstock)

Any change is potentially very dangerous, like a hand grenade with the pin pulled out. Yet, with volatility-based decomposition, you open the door of the appropriate vault, toss the grenade inside, and close the door. Whatever was inside the vault may be destroyed completely, but there is no shrapnel flying everywhere, destroying everything in its path. You have contained the change.

With functional decomposition, your building blocks represent areas of functionality, not volatility. As a result, when a change happens, by the very definition of the decomposition, it affects multiple (if not most) of the components in your architecture. Functional decomposition therefore tends to maximize the effect of the change. Since most software systems are designed functionally, change is often painful and expensive, and the system is likely to resonate with the change. Changes made in one area of functionality trigger other changes and so on. Accommodating change is the real reason you must avoid functional decomposition.

All the other problems with functional decomposition pale when compared with the poor ability and high cost of handling change. With functional decomposition, a change is like swallowing a live hand grenade.

What you choose to encapsulate can be functional in nature, but hardly ever is it domain-functional, meaning it has no meaning for the business. For example, the electricity that powers a house is indeed an area of functionality but is also an important area to encapsulate for two reasons. The first reason is that power in a house is highly volatile: power can be AC or DC; 110 volts or 220 volts; single phase or three phases; 50 hertz or 60 hertz; produced by solar panels on the roof, a generator in the backyard, or plain grid connectivity; delivered on wires with different gauges; and on and on. All that volatility is encapsulated behind a receptacle. When it is time to consume power, all the user sees is an opaque receptacle, encapsulating the power volatility. This decouples the power-consuming appliances from the power volatility, increasing reuse, safety, and extensibility while reducing overall complexity. It makes using power in one house indistinguishable from using it in another, highlighting the second reason it is valid to identify power as something to encapsulate in the house. While powering a house is an area of functionality, in general, the use of power is not specific to the domain of the house (the family living in the house, their relationships, their wellbeing, property, etc.).

What would it be like to live in a house where the power volatility was not encapsulated? Whenever you wanted to consume power, you would have to first expose the wires, measure the frequency with an oscilloscope, and certify the voltage with a voltmeter. While you could use power that way, it is far easier to rely on the encapsulation of that volatility behind the receptacle, allowing you instead to add value by integrating power into your tasks or routine.

Decomposition, Maintenance, and Development

As explained previously, functional decomposition drastically increases the system’s complexity. Functional decomposition also makes maintenance a nightmare. Not only is the code in such systems complex, but changes are also spread across multiple services. This makes maintaining the code labor intensive, error prone, and very time-consuming. Generally, the more complex the code, the lower its quality, and low quality makes maintenance even more challenging. You must contend with high complexity and avoid introducing new defects while resolving old ones. In a functionally decomposed system, it is common for new changes to result in new defects due to the confluence of low quality and complexity. Extending the functional system often requires effort that is disproportionately expensive with respect to the benefit to the customer.

Even before maintenance ever starts, when the system is under development, functional decomposition harbors danger. Requirements will change throughout development (as they invariably do), and the cost of each change is huge, affecting multiple areas, forcing considerable rework, and ultimately endangering the deadline.

Systems designed with volatility-based decomposition present a stark contrast in their ability to respond to change. Since changes are contained in each module, there is at least a hope for easy maintenance with no side effects outside the module boundary. With lower complexity and easier maintenance, quality is much improved. You have a chance at reuse if something is encapsulated the same way in another system. You can extend the system by adding more areas of encapsulated volatility or integrate existing areas of volatility in a different way. Encapsulating volatility means far better resiliency to feature creep during development and a chance of meeting the schedule, since changes will be contained.

Universal Principle

The merits of volatility-based decomposition are not specific to software systems. They are universal principles of good design, from commerce to business interactions to biology to physical systems and great software. Universal principles, by their very nature, apply to software too (else they would not be universal). For example, consider your own body. A functional decomposition of your own body would have components for every task you are required to do, from driving to programming to presenting, yet your body does not have any such components. You accomplish a task such as programming by integrating areas of volatility. For example, your heart provides an important service for your system: pumping blood. Pumping blood has enormous volatility to it: high blood pressure and low pressure, salinity, viscosity, pulse rate, activity level (sitting or running), with and without adrenaline, different blood types, healthy and sick, and so on. Yet all that volatility is encapsulated behind the service called the heart. Would you be able to program if you had to care about the volatility involved in pumping blood?

You can also integrate into your implementation external areas of encapsulated volatility. Consider your computer, which is different from literally any other computer in the world, yet all that volatility is encapsulated. As long as the computer can send a signal to the screen, you do not care what happens behind the graphic port. You perform the task of programming by integrating encapsulated areas of volatility, some internal, some external. You can reuse the same areas of volatility (such as the heart) while performing other functionalities such as driving a car or presenting your work to customers. There is simply no other way of designing and building a viable system.

Decomposing based on volatility is the essence of system design. All well-designed systems, software and physical systems alike, encapsulate their volatility inside the system’s building blocks.

Volatility-Based Decomposition and Testing

Volatility-based decomposition lends itself well to regression testing. The reduction in the number of components, the reduction in the size of components, and the simplification of the interactions between components all drastically reduce the complexity of the system. This makes it feasible to write regression testing that tests the system end to end, tests each subsystem individually, and eventually tests independent components. Since volatility-based decomposition contains the changes inside the building blocks of the system, once the inevitable changes do happen, they do not disrupt the regression testing already in place. You can test the effect of a change in a component in isolation from the rest of the system without interfering with the inter-component and inter-subsystem testing.

Shoulders of Giants: David Parnas

In 1972, David Parnas (an early pioneer of software engineering) published a seminal paper called “On the Criteria to Be Used in Decomposing Systems into Modules.”a This short, five-page paper contains most elements of modern software engineering, including encapsulation, information hiding, cohesion, modules, and loose coupling. Most notably, in that paper Parnas identified the need to look for change as the key criterion for decomposition, as opposed to functionality. While the specifics of that paper are quite archaic, it was the very first time anyone in the software industry asked the pertinent questions about what it takes to make software systems maintainable, reusable, and extensible. As such, this paper represents the genesis of modern software engineering. Parnas spent the next 40 years trying to introduce proven classic engineering practices to software development.

a. Communications of the ACM 15, no. 12 (1972): 1053–1058.

The Volatility Challenge

The ideas and motivations behind volatility-based decomposition are simple, practical, and consistent with reality and common sense. The main challenges in performing a volatility-based decomposition have to do with time, communication, and perception. You will find that volatility is often not self-evident. No customer or product manager at the onset of a project will ever present you the requirements for the system the following way: “This could change, we will change that one later, and we will never change those.” The outside world (be it customers, management, or marketing) always presents you with requirements in terms of functionality: “The system should do this and that.” Even you, reading these pages, are likely struggling to wrap your head around this concept as you try to identify the areas of volatility in your current system. Consequently, volatility-based decomposition takes longer compared with functional decomposition.

Note that volatility-based decomposition does not mean you should ignore the requirements. You must analyze the requirements to recognize the areas of volatility. Arguably, the whole purpose of requirements analysis is to identify the areas of volatility, and this analysis requires effort and sweat. This is actually great news because now you are given a chance to comply with the first law of thermodynamics. Sadly, merely sweating on the problem does not mean a thing. The first law of thermodynamics does not state that if you sweat on something, you will add value. Adding value is much more difficult. This book provides you with powerful mental tools for design and analysis, including structure, guidelines, and a sound engineering methodology. These tools give you a fighting chance in your quest to add value. You still must practice and fight.

The 2% Problem

With every knowledge-intensive subject, it takes time to become proficient and effective, and even more to excel at it. This is true in areas as varied as kitchen plumbing, internal medicine, and software architecture. In life, you often choose not to pursue certain areas of expertise because the time and cost required to master them would dwarf the time and cost required to utilize an expert. For example, barring any chronic health problem, a working-age person is sick for about a week a year. A week a year of downtime due to illness is roughly 2% of the working year. So, when you are sick, do you open up medicine books and start reading, or do you go and see a doctor? At only 2% of your time, the frequency is low enough (and the specialty bar high enough) that there is little sense in doing anything other than going to the doctor. It is not worth your while to become as good as a doctor. If, however, you were sick 80% of the time, you might spend a considerable portion of your time educating yourself about your condition, possible complications, treatments, and options, often to the point of sparring with your doctor. Your innate propensity for anatomy and medicine has not changed; only your degree of investment has (hopefully, you will never have to be really good at medicine).

Similarly, when your kitchen sink is clogged somewhere behind the garbage disposal and the dishwasher, do you go to the hardware store, purchase a P-trap, an S-trap, various adapters, three different types of wrenches, various O-rings and other accessories, or do you call a plumber? It is the 2% problem again: it is not worth your while learning how to fix that sink if it is clogged less than 2% of the time. The moral is that when you spend 2% of your time on any complex task, you will never be any good at it.

With software system architecture, architects get to decompose a complete system into modules only on major revolutions of the cycle. Such events happen, on average, every few years. All other designs in the interim between clean slates are at best incremental and at worst detrimental to the existing systems. How much time will the manager allow the architect to invest in architecture for the next project? One week? Two weeks? Three weeks?? Six weeks??? The exact answer is irrelevant. On one hand, you have cycles measured in years and, on the other, activities measured in weeks. The week-to-year ratio is roughly 1:50, or 2% again. Architects have learned the hard way that they need to hone their skills getting ready for that 2% window. Now consider the architect’s manager. If the architect spends 2% of the time architecting the system, what percentage of the time does that architect’s manager spend managing said architect? The answer is probably a small fraction of that time. Therefore, the manager is never going to be good at managing architects at that critical phase. The manager is constantly going to exclaim, “I don’t understand why this is taking so long! Why can’t we just do A, B, C?”

Gaining the time to do decomposition correctly will likely be as much of a challenge as doing the decomposition, if not more so. However, the difficulty of a task should not preclude it from being done. Precisely because it is difficult, it must be done. You will see later on in this book several techniques for gaining the time.

The Dunning-Kruger Effect

In 1999, David Dunning and Justin Kruger published their research2 demonstrating conclusively that people unskilled in a domain tend to look down on it, thinking it is less complex, risky, or demanding than it truly is. This cognitive bias has nothing to do with intelligence or expertise in other domains. If you are unskilled in something, you never assume it is more complex than it is; you assume it is less!

2. Justin Kruger and David Dunning, “Unskilled and Unaware of It: How Difficulties in Recognizing One’s Own Incompetence Lead to Inflated Self-Assessments,” Journal of Personality and Social Psychology 77, no. 6 (1999): 1121–1134.

When the manager is throwing their hands in the air saying, “I don't understand why this is taking so long,” the manager really does not understand why you cannot just do the A, then B, and then C. Do not be upset. You should expect this behavior and resolve it correctly by educating your manager and peers who, by their own admission, do not understand.

Fighting Insanity

Albert Einstein is attributed with saying that doing things the same way but expecting better results is the definition of insanity. Since the manager typically expects you to do better than last time, you must point out the insanity of pursuing functional decomposition yet again and explain the merits of volatility-based decomposition. In the end, even if you fail to convince a single person, you should not simply follow orders and dig the project into an early grave. You must still decompose based on volatility. Your professional integrity (and ultimately your sanity and long-term peace of mind) is at stake.

Identifying Volatility

The rest of this chapter provides you with a set of tools to use when you go searching for and identifying areas of volatility. While these techniques are valuable and effective in their own right, they are somewhat loose. The next chapter introduces structure and constraints that allow for quicker and repeatable identification of areas of volatility. However, that discussion merely fine-tunes and specializes the ideas in this section.

Volatile Versus Variable

A key question many novices struggle with is the difference between things that change and things that are volatile. Not everything that is variable is also volatile. You resort to encapsulating something at the system design level only when it is open-ended and, unless encapsulated in a component of the architecture, would be very expensive to contain. Variability, on the other hand, describes those aspects that you can easily handle in your code using conditional logic. When searching for volatility, you should be on the lookout for the kinds of changes or risks that would have ripple effects across the system: changes that, left unencapsulated, would invalidate the architecture.
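
As a rough, hypothetical illustration of the distinction:

    // Variable: a bounded set of options that plain conditional logic absorbs easily.
    function discount(customerType: "retail" | "wholesale", amount: number): number {
      return customerType === "wholesale" ? amount * 0.9 : amount;
    }

    // Volatile: an open-ended aspect (how trades are persisted could move to a
    // different database, a cloud service, an audit pipeline, and so on).
    // Handling it with conditionals in every component would ripple across the
    // system, so it is encapsulated in a component of the architecture instead.
    interface TradePersistence {
      persist(trade: { symbol: string; quantity: number }): Promise<void>;
    }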

Axes Of Volatility

Finding areas of volatility is a process of discovery that takes place during requirements analysis and interviews with the project stakeholders.

There is a simple technique I call axes of volatility. This technique examines the ways the system is used by customers. Customer in this context refers to a consumer of the system, which could be a single user or a whole other business entity.

In any business, there are only two ways your system could face change. The first axis is the same customer over time. Even if presently the system is perfectly aligned with a particular customer’s needs, over time, that customer’s business context will change. Even the use of the system by the customer will often change the requirements against which it was written in the first place.3 Over time, the customer’s requirements and expectations of the system will change.

3. The tendency of a solution to change the requirements against which it was developed was first observed by the 19th-century English economist William Jevons with regard to coal production, and it has since been referred to as the Jevons paradox. Other manifestations are the increase in paper consumption with the digital office and the worsening traffic congestion following an increase in road capacity.

The second way change could come is at the same time across customers. If you could freeze time and examine your customer base, are all your customers now using the system in exactly the same way? What are some of them doing that is different from the others? Do you have to accommodate such differences? All such changes define the second axis of volatility.

When searching for potential volatility in interviews, you will find it very helpful to phrase the questions in terms of the axes of volatility (same customer over time, all customers at the same point in time). Framing the questions in this way helps you identify the volatilities. If something does not map to the axes of volatility, you should not encapsulate it at all, and there should be no building block in your system to which it is mapped. Creating such a block would likely indicate functional decomposition.

Design Factoring

Often, the act of looking for areas of volatility using the axes of volatility is an iterative process interleaved with the factoring of the design itself. Consider, for example, the progression of design iterations in Figure 2-10.

Figure 2-10 Design iterations along axes of volatility

Your first take on the proposed architecture might look like diagram A—one big thing, one single component. Ask yourself, Could you use the same component, as is, with a particular customer, forever? If the answer is no, then why? Often, it is because you know that customer will, over time, want to change a specific thing. In that case, you must encapsulate that thing, yielding diagram B. Ask yourself now, Could you use diagram B across all customers now? If the answer is no, then identify the thing that the customers want to do differently, encapsulate it, and produce diagram C. You keep factoring the design that way until all possible points on the axes of volatility are encapsulated.

Independence of the Axes

Almost always, the axes should be independent. Something that changes for one customer over time should not change as much across all customers at the same point in time, and vice versa. If areas of change cannot be isolated to one of the axes, it often indicates a functional decomposition in disguise.

Example: Volatility-Based Decomposition of a House

You can use the axes of volatility to encapsulate the volatility of a house. Start by looking at your own house and observe how it changes over time. For example, consider furniture. Over time, you may rearrange the furniture in the living room and occasionally add new pieces or replace old ones. The conclusion is that furniture in a house is volatile. Next, consider appliances. Over time, you may switch to energy-efficient appliances. You likely have already replaced the old CRT with a flat plasma screen and replaced that with a large, wafer-thin OLED TV. This is a strong indication that at your house, appliances are volatile. How about the occupants of the house? Is that aspect static? Do you ever have guests come over? Can the house be empty of people? The occupants of the house are volatile. What about appearance? Do you ever paint the house, change the draperies or landscaping? The appearance of a house is volatile. The house is likely connected to some utilities, from Internet to power and security. Previously, I pointed out the power volatility in a house, but what about Internet? In years past, you may have used dial-up for Internet, then moved to DSL, then cable, and now fiber optics or a satellite connection. While these options are drastically different, you would not want to change the way you send emails based on the type of connectivity. You should encapsulate the volatilities of all utilities. Figure 2-11 shows this possible decomposition along the first axis of volatility (same customer over time).

Figure 2-11 Same house over time

Now, even at the same point in time, is your house the same as every other house? Other houses have a different structure, so the structure of the house is volatile. Even if you were to copy and paste your house to another city, would it be the same house?4 The answer is clearly negative. The house will have different neighbors and be subjected to different city regulations, building codes, and taxes. Figure 2-12 shows this possible decomposition along the second axis of volatility (different customers at the same point in time).

Figure 2-12 At the same time across houses

4. The ancient Greeks grappled with this question in Theseus’s paradox (https://en.wikipedia.org/wiki/Ship_of_Theseus).

Note the independence of the axes. The city where you live does change its regulations over time, but the changes come at a slow pace. Similarly, the likelihood of new neighbors is fairly low as long as you live in the same house but is a certainty if you compare your house to another at the same point in time. The assignment of a volatility to one of the axes is therefore not an absolute exclusion but more one of disproportional probability.

Note also that the Neighbors Volatility component can deal with the volatility of neighbors at the same house over time as easily as it can across different houses at the same point in time. Assigning the component to an axis helps discover the volatility in the first place; the volatility is just more apparent across different houses at the same point in time.

Finally, in sharp contrast to the decompositions of Figure 2-6 and Figure 2-7, in Figure 2-11 and Figure 2-12 there is no component in the decomposition for cooking or a kitchen. In a volatility-based decomposition, the required behavior is accomplished by an interaction between the various encapsulated areas of volatility. Cooking dinner may be the product of an interaction between the occupants, the appliances, the structure, and the utilities. Since something still needs to manage that interaction, the design is not complete. The axes of volatility are a great starting point, but they are not the only tool to bring to bear on the problem.

Solutions Masquerading As Requirements

Consider again the functional requirement for the house to support the cooking feature. Such requirements are quite common in requirements specs, and many developers will simply map that to a Cooking component in their architecture. Cooking, however, is not a requirement (even though it was in the requirement spec). Cooking is a possible solution for the requirement of feeding the people in the house. You can satisfy the feeding requirement by ordering pizza or taking the family out for dinner.

It is exceedingly common for customers to provide solutions masquerading as requirements. With functional decomposition, once you deploy the system with only Cooking, the customer will ask for the pizza option, resulting in either another component in your system or the bloating of another component. The “going out to dinner” requirement will soon follow, leading to a never-ending cycle of features going around and around the real requirement. With volatility-based decomposition, during requirements analysis, you should identify the volatility in feeding the occupants and provide for it. The volatility of feeding is encapsulated within the Feeding component, and as the feeding options change, your design does not.

However, while feeding is a better requirement than cooking, it is still a solution masquerading as a requirement. What if, in the interest of a diet, the people in the house should go to bed hungry tonight? A feeding requirement and a diet requirement might be mutually exclusive. You can do either one, but not both. Mutually exclusive requirements are also quite common.

The real requirement for any house is to take care of the well-being of the occupants, not just their caloric intake. The house should not be too cold or too warm or too humid or too dry. While the customers may only discuss cooking and never discuss temperature control, you should recognize the real volatility, well-being, and encapsulate that in the Wellbeing component of your architecture.

Since most requirements specifications are chock-full of solutions masquerading as requirements, functional decomposition absolutely maximizes your pain. You will forever be chasing the ever-evolving solutions, never recognizing the true underlying requirements.

The fact that requirements specifications have all those solutions masquerading as requirements is actually a blessing in disguise because you can generalize the example of cooking in the house into a bona fide analysis technique for discovering areas of volatility. Start by pointing out the solutions masquerading as requirements, and ask whether there are other possible solutions. If so, what were the real requirements and the underlying volatility? Once you identify the volatility, you must determine whether the need to address that volatility is a true requirement or is still a solution masquerading as a requirement. Once you have finished scrubbing away all the solutions, what you are left with are likely great candidates for volatility-based decomposition.

Volatilities List

Prior to decomposing a system and creating an architecture, you should simply compile a list of the candidate areas of volatility as a natural part of requirements gathering and analysis. You should approach the list with an open mind. Ask what could change along the axes of volatility. Identify solutions masquerading as requirements, and apply the additional techniques described later in this chapter. The list is a powerful instrument for keeping track of your observations and organizing your thoughts. Do not commit yet to the actual design. All you are doing is maintaining a list. Note that while the design of the system should not take more than a few days, identifying the correct areas of volatility may take considerably longer.

Example: Volatility-Based Trading System

Using the previous requirements for the stock trading system, you should start by preparing a list of possible areas of volatility, capturing also the rationale behind each:

  • User volatility. The traders serve end customers on whose portfolios they operate. The end customers are also likely interested in the current state of their funds. While they could write the trader a letter or call, a more appropriate means would be for the end customers to log into the system to see the current balance and the ongoing trades. Even though the requirements never stated anything about end customer access (the requirements were for professional traders), you should contemplate such access. While the end customers may not be able to trade, they should be able to see the status of their accounts. There could also be system administrators. There is volatility in the type of user.

  • Client application volatility. Volatility in users often manifests in volatility in the type of client application and technology. A simple web page may suffice for external end customers looking up their balance. However, professional traders will prefer a multi-monitor, rich desktop application with market trends, account details, market tickers, newsfeed, spreadsheet projection, and proprietary data. Other users may want to review the trades on mobile devices of various types.

  • Security volatility. Volatility in users implies volatility in how the users authenticate themselves against the system. The number of in-house traders could be small, from a few dozen to a few hundred. The system, however, could have millions of end customers. The in-house traders could rely on domain accounts for authentication, but this is a poor choice for the millions of customers accessing information through the Internet. For Internet users, perhaps a simple username and password will do, or maybe some sophisticated federated security single sign-on option is needed. Similar volatility exists with authorization options. Security is volatile.

  • Notification volatility. The requirements specify that the system is to send an email after every request. However, what if the email bounces? Should the system fall back to a paper letter? How about a text message or a fax instead of an email? The requirement to send an email is a solution masquerading as a requirement. The real requirement is to notify the users, but the notification transport is volatile. There is also volatility in who receives the notification: a single user or a broadcast to several users, each receiving the same notification over whichever transport suits them. Perhaps the end customer prefers an email while the end customer’s tax lawyer prefers a documented paper statement. There is also volatility in who publishes the notification in the first place.

  • Storage volatility. The requirements specify the use of a local database. However, over time, more and more systems migrate to the cloud. There is nothing inherent in stock trading that precludes benefiting from the cost and economy of scale of the cloud. The requirement to use a local database is actually another solution masquerading as a requirement. A better requirement is data persistence, which accommodates the volatility in the persistence options. However, the majority of users are end customers, and those users actually perform read-only requests. This implies the system will benefit greatly from the use of an in-memory cache. Furthermore, some cloud offerings utilize a distributed in-memory hash table that offers the same resiliency as traditional file-based durable storage. Requiring data persistence would exclude these last two options because data persistence is still a solution masquerading as a requirement. The real requirement is simply that the system must not lose the data, or that the system is required to store the data. How that is accomplished is an implementation detail, with a great deal of volatility, from a local database to a remote in-memory cache in the cloud.

  • Connection and synchronization volatility. The current requirements call for a connected, synchronous, lock-step manner of completing a web form and submitting it in order. This implies that the traders can do only one request at a time. However, the more trades the traders execute, the more money they make. If the requests are independent, why not issue them asynchronously? If the requests are deferred in time (trades in the future), why not queue up the calls to the system to reduce the load? When performing asynchronous calls (including queued calls), the requests can execute out of order. Connectivity and synchronicity are volatile.

  • Duration and device volatility. Some users will complete a trade in one short session. However, traders earn their keep and maximize their income when they perform complicated trades that distribute and hedge risk, involving multiple stocks and sectors, domestic or foreign markets, and so on. Constructing such a trade can be time-consuming, lasting anywhere from several hours to several days. Such a long-running interaction will likely span multiple system sessions and possibly multiple physical devices. There is volatility in the duration of the interaction, which in turn triggers volatility in the devices and connections involved.

  • Trade item volatility. As discussed previously, over time, the end customers may want to trade not just stocks but also commodities, bonds, currencies, and maybe even futures contracts. The trade item itself is volatile.

  • Workflow volatility. If the trade item is volatile, processing of the steps involved in the trade will be volatile too. Buying and selling stocks, scheduling their orders, and so on are very different from selling commodities, bonds, or currencies. The workflow of the trade is therefore volatile. Similarly, the workflow of trade analysis is volatile.

  • Locale and regulations volatility. Over time, the system may be deployed into different locales. Volatility in the locale has drastic implications on the trading rules, UI localization, the listing of trade items, taxation, and regulatory compliance. The locale and the regulations that apply therein are volatile.

  • Market feed volatility. The source of market data could change over time. Various feeds have different formats, costs, update rates, communication protocols, and so on. Different feeds may show slightly different values for the same stock at the same point in time. The feeds can be external (e.g., Bloomberg or Reuters) or internal (e.g., simulated market data for testing, diagnostics, or trading algorithms research). The market feed is volatile.

A Key Observation

The preceding list is by no means an exhaustive list of all the things that could change in a stock trading system. Its objective is to point out what could change and the mindset you need to adopt when searching for volatility. Some of the volatile areas may be out of scope for the project. They may be ruled out by domain experts as improbable or may relate too much to the nature of the business (such as branching out of stocks into currencies or foreign markets). My experience, however, is that it is vital to call out the areas of volatility and map them in your decomposition as early as possible. Designating a component in the architecture costs you next to nothing. Later, you must decide whether or not to allocate the effort to designing and constructing it. However, at least now you are aware of how to handle that eventuality.

System Decomposition

Once you have settled on the areas of volatility, you need to encapsulate them in components of the architecture. One such possible decomposition is depicted in Figure 2-13.

Figure 2-13 Volatility-based decomposition of a trading system

The transition from the list of volatile areas to components of the architecture is hardly ever one to one. Sometimes a single component can encapsulate more than one area of volatility. Some areas of volatility may not be mapped directly to a component but rather to an operational concept such as queuing or publishing an event. At other times, the volatility of an area may be encapsulated in a third-party service.

With design, always start with the simple and easy decisions. Those decisions constrain the system, making subsequent decisions easier. In this example, some mapping is easy to do. The volatility in the data storage is encapsulated behind data access components, which do not betray where the storage is and what technology is used to access it. Note in Figure 2-13 the key abstraction of referring to the storage as Storage and not as Database. While the implementation (according to the requirements) is a local database, there is nothing in the architecture that precludes other options, such as the raw file system, a cache, or the cloud. If a change to the storage takes place, it is encapsulated in the respective access component (such as the Trades Access) and does not affect the other components, including any other access component. This enables you to change the storage with minimal consequences.
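
As a rough sketch of what such an access component might look like (the Java types and names are my own illustration, not part of the design in Figure 2-13), callers program against a storage-neutral contract, and only the concrete implementation knows whether the trades live in a local database, a cache, or the cloud:

    import java.util.List;

    // Storage-neutral contract: nothing here betrays where or how trades are stored.
    interface TradesAccess {
        void store(Trade trade);
        List<Trade> findByAccount(String accountId);
    }

    record Trade(String accountId, String symbol, int quantity, double price) {}

    // One possible implementation: the local database called for by the requirements.
    class DatabaseTradesAccess implements TradesAccess {
        public void store(Trade trade) { /* SQL INSERT would go here */ }
        public List<Trade> findByAccount(String accountId) { /* SQL SELECT would go here */ return List.of(); }
    }

    // Another: a distributed in-memory cache in the cloud. Swapping implementations
    // does not affect any other component.
    class CloudCacheTradesAccess implements TradesAccess {
        public void store(Trade trade) { /* cache PUT would go here */ }
        public List<Trade> findByAccount(String accountId) { /* cache GET would go here */ return List.of(); }
    }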

The volatility in notifying the clients is encapsulated in the Notification component. This component knows how to notify each client and which clients subscribe to which event. For simple scenarios, you can manage sufficiently with a general-purpose event publishing and subscription service (Pub/Sub) instead of a custom Notification component. However, in this case, there are likely some business rules on the type of transport and the nature of the broadcast. The Notification component may still use some Pub/Sub service underneath it, but that is an internal implementation detail whose volatility is also encapsulated in the Notification component.
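
The following sketch (again, my own Java names, purely illustrative) captures the idea: the Notification component owns the subscriptions, transports, and broadcast rules, and whether it delegates to a Pub/Sub service underneath is invisible to its callers.

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.CopyOnWriteArrayList;

    // What the rest of the system sees: publish a business event, nothing more.
    interface Notification {
        void publish(String eventName, String message);
    }

    class NotificationComponent implements Notification {

        enum Transport { EMAIL, SMS, PAPER_LETTER }
        record Subscriber(String address, Transport transport) {}

        private final Map<String, List<Subscriber>> subscribers = new ConcurrentHashMap<>();

        // Which clients subscribe to which event is managed here, not by the publishers.
        void subscribe(String eventName, Subscriber subscriber) {
            subscribers.computeIfAbsent(eventName, e -> new CopyOnWriteArrayList<>()).add(subscriber);
        }

        public void publish(String eventName, String message) {
            for (Subscriber s : subscribers.getOrDefault(eventName, List.of())) {
                // Business rules on transport and broadcast live here; a third-party
                // Pub/Sub service could sit underneath as an implementation detail.
                deliver(s, message);
            }
        }

        private void deliver(Subscriber s, String message) {
            System.out.println(s.transport() + " to " + s.address() + ": " + message);
        }
    }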

The volatility in the trading workflow is encapsulated in the Trade Workflow component. That component encapsulates the volatility of what is being traded (stocks or currencies), the specific steps involved in buying or selling a trade item, the required customization for local markets, the details for the required reports, and so on. Note that even if the trade items are fixed (not volatile), the workflow of trading stocks can change, justifying the use of Trade Workflow to encapsulate the volatility. The design also relies on the operational concept of storing the workflows (this should be implemented using some third-party workflow tool). Trade Workflow retrieves the appropriate workflow instance for each session, operates on it, and stores it back in the Workflow Storage. This concept helps encapsulate several volatilities. First, different trade items can now have distinct trading workflows. Second, different locales can have different workflows. Third, this enables supporting long-running workflows spanning multiple devices and sessions. The system does not care if two calls are seconds apart or days apart. In each case, the system loads the workflow instance to process the next step. The design treats connected, single-session trades exactly the same as a long-running distributed trade. Symmetry and consistency are good qualities in system architecture. Note also that the workflow storage access is encapsulated in the same fashion as the trades storage access.
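
A minimal sketch of that interaction, assuming hypothetical contracts of my own over some third-party workflow engine, shows where the symmetry comes from: every call, whether it arrives seconds or days after the previous one, loads the instance, advances it one step, and stores it back.

    // The workflow instance would wrap a third-party workflow engine's representation.
    interface WorkflowInstance {
        void executeNextStep(String request);
        boolean isComplete();
    }

    // Encapsulated in the same fashion as the trades storage access.
    interface WorkflowStorage {
        WorkflowInstance loadOrCreate(String tradeId, String tradeItemType, String locale);
        void save(String tradeId, WorkflowInstance instance);
    }

    // The same code path serves a connected, single-session trade and a
    // long-running trade spanning days, devices, and sessions.
    class TradeWorkflow {
        private final WorkflowStorage storage;

        TradeWorkflow(WorkflowStorage storage) {
            this.storage = storage;
        }

        void process(String tradeId, String tradeItemType, String locale, String request) {
            WorkflowInstance instance = storage.loadOrCreate(tradeId, tradeItemType, locale);
            instance.executeNextStep(request); // stocks, bonds, or currencies: the steps differ per instance
            storage.save(tradeId, instance);   // persisted until the next step, seconds or days later
        }
    }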

You can use the same pattern for the stock trading workflow and the analysis workflows. The dedicated Analysis Workflow component encapsulates the volatility in the analysis workflows, and it can use the same Workflow Storage.

The volatility of accessing the market feed is encapsulated in the Feed Access. This component encapsulates how to access the feed and whether the feed itself is internal or external. The volatility in the format or even value of the various market data coming from the different feeds is encapsulated in the Feed Transformation component. Both of these components decouple the other components from the feeds by providing a uniform interface and format regardless of the origin of the data.
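
A hedged sketch of that decoupling (the MarketData shape and the names are assumptions of mine): Feed Access hides how and where a feed is reached, Feed Transformation hides its native format, and consumers see one uniform shape regardless of the origin of the data.

    import java.time.Instant;

    // The uniform shape every consumer sees, regardless of which feed produced it.
    record MarketData(String symbol, double price, Instant timestamp) {}

    // Encapsulates how to access the feed and whether it is internal or external.
    interface FeedAccess {
        String fetchRaw(String symbol); // raw payload in the feed's native format
    }

    // Encapsulates the feed-specific format and value differences.
    interface FeedTransformation {
        MarketData toMarketData(String rawPayload);
    }

    class MarketDataProvider {
        private final FeedAccess access;
        private final FeedTransformation transformation;

        MarketDataProvider(FeedAccess access, FeedTransformation transformation) {
            this.access = access;
            this.transformation = transformation;
        }

        MarketData quote(String symbol) {
            // Swapping Bloomberg for Reuters, or for an internal simulated feed,
            // changes only the injected implementations, never the consumers.
            return transformation.toMarketData(access.fetchRaw(symbol));
        }
    }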

The Security component encapsulates the volatility of the possible ways of authenticating and authorizing the users. Internally, it may look up credentials from a local storage or interact with some distributed provider.

The clients of the system can be the trading application (Trader App A) or a mobile app (Trader App B). The end customers can use their own website (Customer Portal). Each client application also encapsulates the details and the best way of rendering the information on the target device.

Resist The Siren Song

Note in Figure 2-13 the absence of a dedicated reporting component. For demonstration purposes, reporting was not listed as a volatile area (from the business perspective). Therefore, there is nothing to encapsulate with a component. Adding such a component manifests functional decomposition. However, if functional decomposition is all you have ever done, you will likely hear an irresistible siren song calling you to add a reporting block. Just because you always have had a reporting block, or even because you have an existing reporting block, does not mean you need a reporting block.

In Homer’s Odyssey, a story that is more than 2500 years old, Odysseus sails home via the Straits of the Sirens. The Sirens are beautiful winged fairy-like creatures who have the voices of angels. They sing a song that no man can resist. The sailors jump into their arms, and the Sirens drown the men under the waves and eat them. Before encountering the deadly allure of the Sirens’ songs, Odysseus (you, the architect) is advised to plug with beeswax the ears of his sailors (the rank-and-file software developers) and tie them to the oars. The sailors’ job is to row (write code), and they are not even at liberty to listen to the Sirens. Odysseus himself, on the other hand, as the leader, does not have the luxury of plugging his ears (e.g., maybe you do need that reporting block). Odysseus ties himself to the mast of the ship so that he cannot succumb to the Sirens even if he wanted to do so (see Figure 2-14, depicting the scene on a period vase). You are Odysseus, and volatility-based decomposition is your mast. Resist the siren song of your previous bad habits.

Figure 2-14 Tied to the mast (Image: Werner Forman Archive/Shutterstock)

Volatility And The Business

While you must encapsulate the volatile areas, not everything that could change should be encapsulated. Put differently, things that could change are not necessarily volatile. A classic example is the nature of the business, and you should not attempt to encapsulate the nature of the business. With almost all business applications, the applications exist to serve some need of the business or its customers. However, the nature of the business, and by extension, each application, tends to be fairly constant. A company that has been in a business for a long time will likely stay in that business. For example, Federal Express has been, is, and will be in the shipment and delivery business. While in theory it is possible for Federal Express to branch into healthcare, such a potential change is not something you should encapsulate.

During system decomposition, you must identify both the areas of volatility to encapsulate and those not to encapsulate (e.g., the nature of the business). Sometimes, you will have initial difficulty telling these apart. There are two simple indicators that something that could change is indeed part of the nature of the business. The first indicator is that the possible change is rare. Yes, it could happen, but the likelihood of it happening is very low. The second indicator is that any attempt to encapsulate the change can be done only poorly. No practical amount of investment in time or effort will properly encapsulate the aspect in a way of which you can be proud.

For example, consider designing a simple residential house on a plot of land. At some point in the future, the homeowner may decide to extend the home into a 50-story skyscraper. Encapsulating that possible change in your house design produces a very different design than that of a typical residential house. Instead of a shallow form-poured foundation, the house foundation must include dozens of friction pylons, driven down perhaps hundreds of feet to support the weight of the building. This will allow the foundation to support both a single-family residence and a skyscraper. Next, the power panel must be able to distribute thousands of amps and likely requires the house to have its own transformer. While the water company can bring water to the house, you must devote a room to a large water pump that can push the water up 50 floors. The sewer line must be able to handle 50 floors of inhabitants. You will have to make all that tremendous investment for a single-family home.

When you are finished, the foundation will encapsulate the change to the weight of the building, the power panel will encapsulate the demands of both a single home and 50 stories, and so on. However, the two indicators are now violated. First, how many homeowners in your city convert their home into a skyscraper each year? How common is that? In a large metropolitan area with a million homes, it may happen once every few years, making the change very rare, one in a million if that. Second, do you really have the funds (allocated initially for a single home) to properly execute all these encapsulations? A single pylon may cost more than the single-family building. Any attempt to encapsulate the future transition to a skyscraper will be done poorly and will be neither useful nor cost-effective.

Converting the single-family home to a 50-story building is a change to the nature of the business. No longer is the building in the business of housing a family. Now it is in the business of being a hotel or an office building. When a land developer purchases that plot of land for the purpose of such conversion, the developer usually chooses to raze the building, dig out the old foundation, and start afresh. A change to the nature of the business permits you to kill the old system and start from scratch. It is important to note that the context of the nature of the business is somewhat fractal. The context can be the business of the company, the business of a department or a division in a company, or even the business added value of a specific application. All these represent things that you should not encapsulate.

Speculative Design

Speculative design is a variation on trying to encapsulate the nature of the business. Once you subscribe to the principle of volatility-based decomposition, you will start seeing possible volatilities everywhere and can easily overdo it. When taken to the extreme, you run the risk of trying to encapsulate anything and everything. Your design will have numerous building blocks, a clear sign of a bad design.

Consider for example the item in Figure 2-15.

Figure 2-15 Speculative design (Image: Gercen/Shutterstock)

The item is a pair of SCUBA-ready lady’s high heels. While a lady adorned in a fine evening gown could entertain her guests at the party wearing these, how likely is it that she will excuse herself, proceed immediately to the porch, don SCUBA gear, and dive into the reef? Are these shoes as elegant as conventional high heels? Are they as effective as regular flippers when it comes to swimming or stepping on sharp coral? While the use of the items in Figure 2-15 is possible, it is extremely unlikely. In addition, everything they try to provide is done very poorly because of the attempt to encapsulate a change to the nature of the shoe, from a fashion accessory to a diving accessory, something you should never attempt. If you try this, you have fallen into the speculative design trap. Most such designs are simply frivolous speculation on a future change to your system (i.e., a change to the nature of the business).

Design For Your Competitors

Another useful technique for identifying volatilities is to try to design a system for your competitor (or another division in your company). For example, suppose you are the lead architect for Federal Express’s next-generation system. Your main competitor is UPS. Both Federal Express and UPS ship packages. Both collect funds, schedule pickup and delivery, track packages, insure content, and manage trucks and airplane fleets. Ask yourself the following question: Can Federal Express use the software system UPS is using? Can UPS use the system Federal Express wants to build? If the likely answer is no, start listing all the barriers for such a reuse or extensibility. While both companies perform in the abstract the same service, the way they conduct their business is different. For example, Federal Express may plan shipment routes one way, while UPS may plan them another. In that case, shipment planning is probably volatile because if there are two ways of doing something, there may be many more. You must encapsulate the shipment planning and designate a component in your architecture for that purpose. If Federal Express starts planning shipments the same as UPS at some future time, the change is now contained in a single component, making the change easy and affecting only the implementation of that component, not the decomposition. You have future-proofed your system.
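
To show what the outcome of that exercise could look like in code, here is a minimal, hypothetical sketch (the names are mine and describe no real Federal Express or UPS system): once shipment planning sits behind its own contract, switching to a different planning approach touches a single implementation and nothing else.

    import java.util.List;

    // Hypothetical contract encapsulating the volatile activity uncovered by the exercise.
    interface ShipmentPlanning {
        List<String> planRoute(String origin, String destination);
    }

    // One company's current way of planning routes.
    class HubAndSpokePlanning implements ShipmentPlanning {
        public List<String> planRoute(String origin, String destination) {
            return List.of(origin, "central-hub", destination);
        }
    }

    // A competitor's (or tomorrow's) way: just another implementation behind the same contract.
    class PointToPointPlanning implements ShipmentPlanning {
        public List<String> planRoute(String origin, String destination) {
            return List.of(origin, destination);
        }
    }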

The opposite case is also true. If you and your competitor (and even better, all competitors) do some activity or sequence the same way, and there is no chance of your system doing it any other way, then there is no need to allocate a component in the architecture for that activity. To do so would create a functional decomposition. When you encounter something your competitors do identically, more likely than not, it represents the nature of the business, and as discussed previously, you should not encapsulate it.

Volatility And Longevity

Volatility is intimately related to longevity. The longer the company or the application has been doing something the same way, the higher the likelihood the company will keep doing it the same way. Put differently, the longer things do not change, the longer they have until they do change or are replaced. You must put forward a design that accommodates such changes, even if at first glance such changes are independent of the current requirements.

You can even guesstimate how long it will be until such a change is likely to take place using a simple heuristic: the ability of the organization (or the customer or the market) to instigate or absorb a change is more or less constant because it is tied to the nature of the business. For example, a hospital IT department is more conservative and has less tolerance for change than a nascent blockchain startup. Thus, the more frequently things change, the more likely they will change in the future, but at the same rate. For example, if every 2 years the company changes its payroll system, it is likely the company will change the payroll system within the next 2 years. If the system you design needs to interface with the payroll system and the horizon for using your system is longer than 2 years, then you must encapsulate the volatility in the payroll system and plan to contain the expected change. You must take into account the effect of a payroll system change even if the change was never given to you as an explicit requirement. You should strive to encapsulate changes that occur within the life of the system. If that projected lifespan is 5 to 7 years, a good starting point is identifying all the things that have changed in the application domain over the past 7 years. It is likely similar changes will occur within a similar timespan.

You should examine this way the longevity of all involved systems and subsystems with which your design interacts. For example, if the enterprise resource planning (ERP) system changes every 10 years, the last ERP change was 8 years ago, and the horizon for your new system is 5 years, then it is a good bet the ERP will change during the life of your system.

The Importance Of Practicing

If you spend only 2% of your time on anything, you will never be any good at it, regardless of your innate intellect or the methodology used. An amazing level of hubris is required to believe that once every few years someone can approach a whiteboard, draw a few lines, and nail the architecture. The basic expectation of professionals, be they doctors, pilots, welders, or lawyers, is that they master their craft by training for it. You would not wish to be a passenger aboard a plane whose pilot has only a handful of flying hours. You would not wish to be the first patient of a doctor. Commercial airline pilots spend years (plural) in simulators and are trained through hundreds of flights by veteran pilots. Doctors dissect countless cadavers before they can touch their first patient, and even then, they are closely supervised.

Identifying areas of volatility is an acquired skill. Hardly any software architect is initially trained in volatility-based decomposition, and the vast majority of systems and projects use functional decomposition (with abysmal results). The best way of going about mastering volatility-based decomposition is to practice. This is the only way to address the 2% problem. Here are several ways you can start:

  • Practice on an everyday software system with which you are familiar, such as your typical insurance company, a mobile app, a bank, or an online store.

  • Examine your own past projects. In hindsight, you already know what the pain points were. Was that architecture of that past project done functionally? What things did change? What were the ripple effects of those changes? If you had encapsulated that volatility, would you have been able to deal with that change better?

  • Look at your current project. It may not be too late to save it: Is it designed functionally? Can you list the areas of volatility and propose a superior architecture?

  • Look at non-software systems such as a bicycle, a laptop, a house, and identify in those the areas of volatility.

Then do it again and do it some more. Practice and practice. After you have analyzed three to five systems, you should get the general technique. Sadly, learning to identify areas of volatility is not something you can master by watching others. You cannot learn to ride a bicycle from a book. You have to mount a bicycle (and fall) a few times. The same is true with volatility-based decomposition. It is, however, preferable to fall during practice rather than to experiment on live subjects.
