7. Project Design Overview

The following overview describes the basic methodology and techniques you apply when designing a project. A good project design includes your staffing plan, scope and effort estimations, the services’ construction and integration plan, the detailed schedule of activities, cost calculation, viability and validation of the plan, and setup for execution and tracking.

This chapter covers most of these concepts of project design, while leaving certain details and one or two crucial concepts for later chapters. However, even though it serves as a mere overview, this chapter contains all the essential elements of success in designing and delivering software projects. It also supplies the development-process motivation for the design activities, while the remaining chapters are more technical in nature.

Defining Success

Before you continue reading, you must understand that project design is about success and what it takes to succeed. The software industry at large has had such a poor track record that the industry has changed its very definition of success: Success today is defined as anything that does not bankrupt the company right now. With such a low bar, literally anything goes and nothing matters, from low quality to deceiving numbers and frustrated customers. My definition of success is different, though it is also a low bar in its own way. I define success as meeting your commitments.

If you commit to a year for the project and $1 million in cost, I expect the project to take one year, not two, and to cost $1 million, not $3 million. In the software industry, many people lack the skills and training required to meet even this low bar for success. The ideas presented in this chapter are all about accomplishing just that.

A higher bar is to deliver the project the fastest, least costly, and safest way. Such a higher bar requires the techniques described in the following chapters. You can raise the bar even further and call for having the system architecture remain good for decades and be maintainable, reusable, extensible, and secure across its entire long and prosperous life. That would inevitably require the design ideas of the first part of this book. Since, in general, you need to walk before you can run, it is best to start with the basic level of success and work your way up.

Reporting Success

Part 1 of this book stated a universal design rule: Features are always and everywhere aspects of integration, not implementation. As such, there are no features in any of the early services. At some point you will have integrated enough to start seeing features. I call that point the system. The system is unlikely to appear at the very end of the project since there may be some additional concluding activities such as system testing and deployment. The system typically appears toward the end because it requires most of the services as well as the clients. When using The Method, this means only once you have integrated inside the Managers, the Engines, the ResourceAccess, and the Utilities can you support the behaviors that the Clients require.

While the system is the product of the integration, not all the integration happens inside the Managers. Some integration happens before the Managers are complete (such as the Engines integrating ResourceAccess) and some integration happens after the Managers (such as between the Clients and the Managers). There might also be explicit integration activities, such as developing a client of a service against a simulator and then integrating the client with the real service.

The problem with the system appearing only toward the end of the project is pushback from management. Most people tasked with managing software development do not understand the design concepts in this book and simply want features. They would never stop to think that if a feature can appear early and quickly, then it does not add much value for the business or the customers because the company or the team did not spend much effort on the feature. Usually, management uses features as the metric to gauge progress and success, and tends to cancel sick projects that do not show progress. As such, the project faces a serious risk: It could be perfectly on schedule but because the system only appears at the end, if the project bases its progress report on features, it is asking to be canceled. The solution is simple:

Never base progress reports on features. Always base progress reports on integration.

A Method-based project performs a lot of integration along the project. These integrations are small and doable. As a result, there is the potential for a constant stream of good news coming out of the project, building trust and avoiding cancellation.

Project Initial Staffing

A good architecture does not happen on its own, is not a happenstance, and does not emerge organically in any reasonable amount of time or cost. Good architecture is the result of deliberate effort by the software architect. As such, the first act of wisdom in any software project is to assign a qualified and competent architect to the project. Nothing less will do, because the principal risk in any project is not having an architect accountable for the architecture. This risk far eclipses any other initial risk the project faces. It does not matter what the level of the developers’ technical acumen is, how mature the technology is, or how comfortable the development environment is. None of these will amount to anything if the system design is flawed. To use the house analogy, would you like to build a house from the best materials, with the best construction crew, at the best location, but without any architecture or with a flawed one?

Architect, Not Architects

The architect will need to spend time gathering and analyzing the requirements, identifying the core use cases and the areas of volatility, and producing the system and the project design. While the design itself is not time-consuming (the architect can usually design both the system and the project in a week or two), it may take several months to get to the point that the architect can design the system and the project.

Most managers will recoil both at spending some three or four months on design and at skipping design entirely. They may wish to accelerate the design effort by having more architects participate. However, requirements analysis and architecture are contemplative, time-consuming activities. Assigning more architects to these activities does not expedite them at all; instead, it makes matters worse. Architects are typically senior, self-confident personnel, used to working independently. Assigning multiple architects only results in them contending with each other, rather than in system and project design blueprints.

One way of resolving the multiple-architects conflict is to appoint a design committee. Unfortunately, the surest way of killing anything is to appoint a committee to oversee it. Another option is to carve up the system and assign each architect a specific area to design. With this option, the system is likely to end up as a Chimera—a mythological Greek beast that has the head of a lion, the wings of a dragon, the front legs of an ox, and the hind legs of a goat. While each part of the Chimera is well designed and even highly optimized, the Chimera is inferior at anything it attempts: It does not fly as well as a dragon, run as fast as a lion, pull as much as an ox, or climb as well as a goat. The Chimera lacks design integrity—and the same is true when multiple architects design the system, each responsible for their part.

A single architect is absolutely crucial for design integrity. You can extend this observation to the general rule that the only way to allow for design integrity is to have a single architect own the design. The opposite is also true: If no single person owns the design and can visualize it cover-to-cover, the system will not have design integrity.

Additionally, with multiple architects no one owns the in-betweens, the cross-subsystem or even cross-services design aspects. As a result, no one is accountable for the system design as a whole. When no one is accountable for something, it never gets done, or at best is done poorly.

With a single architect in charge, that architect is accountable for the system design. Ultimately, being accountable is the only way to earn the respect and trust of management. Respect always emerges out of accountability. When no one is accountable, as is the case with a group of architects, management intuitively senses it and often has nothing but scorn for the architects and their design effort.

Caution

A single architect in charge does not mean the architect’s work is exempt from review by other professional architects. Being accountable for the design does not imply working in isolation or avoiding constructive criticism. The architect should seek out such reviews to verify that the design is adequate.

Junior Architects

Most software projects need only a single architect. This is true regardless of the project size and is essential for success. However, large projects very easily saturate the architect with various responsibilities, preventing the architect from focusing on the key goal of designing the system and keeping the design from drifting away during development. Additionally, the role of architect involves technical leadership, requirements review, design review, code review for each service in the system, design document updates, discussion of feature requests from marketing, and so on.

Management can address this overload by assigning a junior architect (or more than one) to the project. The architect can offload many secondary tasks to the junior architect, allowing the architect to focus on the design of the system and the project at the beginning and on keeping the system true to its design throughout the project. The architect and the junior architect are unlikely to compete because there is no doubt who is in charge, and there are clearly delineated lines of responsibilities. Having junior architects is also a great way of grooming and mentoring the next generation of architects for the organization.

The Core Team

As vital as the architect is to the project, the architect cannot work in isolation. On day 1, the project must have a core team in place. The core team consists of three roles: project manager, product manager, and architect. These are logical roles and may or may not map to three individuals. When they do not, you may see the same person as both the architect and the project manager, or a project with several product managers.

Most organizations and teams have these roles, but the job titles they use may be different. I define these roles as follows:

  • The project manager. The job of the project manager is to shield the team from the organization. Most organizations, even small ones, create too much noise. If that noise makes its way into the development team, it can paralyze the team. A good project manager is like a firewall, blocking the noise, allowing only sanctioned communication through. The project manager tracks progress and reports status to management and other project managers, negotiates terms, and deals with cross-organization constraints. Internally, the project manager assigns work items to developers, schedules activities, and keeps the project on schedule, on budget, and on quality. No one in the organization other than the project manager should assign work activity or ask for status from developers.

  • The product manager. The product manager should encapsulate the customers. Customers are also a constant source of noise. The product manager acts as a proxy for the customers. For example, when the architect needs to clarify the required behaviors, the architect should not chase customers; instead, the product manager should provide the answers. The product manager also resolves conflicts between customers (often expressed as mutually exclusive requirements), negotiates requirements, defines priorities, and communicates expectations about what is feasible and on what terms.

  • The architect. The architect is the technical manager, acting as the design lead, the process lead, and the technical lead of the project. The architect not only designs the system, but also sees it through development. The architect needs to work with the product manager to produce the system design and with the project manager to produce the project design. While the collaboration with both the product manager and the project manager is essential, the architect is held responsible for both of these design efforts. As a process lead, the architect has to ensure the team builds the system incrementally, following the system and the project design with a relentless commitment to quality. As a technical lead, the architect often has to decide on the best way of accomplishing technical tasks (the what-to-do) while leaving the details (the how-to-do) to developers. This requires continuous hands-on mentoring, training, and reviews.

Perhaps the most glaring omission from this definition of the core team is developers. Developers (and testers) are transient resources that come and go across projects—a very important point that this chapter revisits as part of the discussion of scheduling activities and resource assignment.

Unlike developers, the core team stays throughout the project since the project needs all three roles from beginning to end. However, what these roles do in the project changes over time. For example, the project manager shifts from negotiating with stakeholders to providing status reports, and the product manager shifts from gathering requirements to performing demos. The architect shifts from designing the system and the project to providing ongoing technical and process leadership, such as conducting design and code reviews at the service level and resolving technical conflicts.

The Core Mission

The mission of the core team at the beginning is to design the project. This means reliably answering the questions of how long it will take and how much it will cost. It is impossible to know the answers to these key questions without project design, and to design the project you require the architecture. In this respect, the architecture is merely a means to an end: project design. Since the architect needs to work with the product manager on the architecture and with the project manager on the project design, the project requires the core team at the beginning of the project.

The Fuzzy Front End

The core team designs the project in the fuzzy front end leading to development. The fuzzy front end is a general term[1] in all technical projects referring to the very start of the project. The front end commences when someone has an idea about the project, and it concludes when developers start construction. The front end often lasts considerably longer than most people recognize: By the time they become involved in the project, the front end may have been in progress for several years. There is a large degree of variance across projects, which leads to the fuzziness about the exact duration of the front end. The duration of the front end is most heavily dependent on the constraints applied to the project. The more constrained the project is, the less time you need to spend in the front end. Conversely, the fewer the constraints, the more time you should invest in figuring out what lies ahead and how to go about it.

1. https://en.wikipedia.org/wiki/Front_end_innovation

Software projects are never constraint-free. All projects face some constraints on time, scope, effort, resources, technology, legacy, business context, and so on. These constraints can be explicit or implicit. It is vital to invest the time in both verifying the explicit constraints and discovering the implicit ones. Designing a system and project that violate a constraint is a recipe for failure. In my experience, a software project should spend roughly between 15% and 25% of its entire duration in the front end, depending on the constraints.

Educated Decisions

It is pointless to approve a project without knowing its true schedule, cost, and risk. After all, you would not buy a house without knowing how much it costs. You would not buy a house that you can afford up front but whose upkeep and taxes you cannot pay. In any walk of life, it is obvious that you commit time and capital only after the scope is known. Many software projects recklessly proceed with no idea of the real time and cost required.

It is just as pointless to staff a project with resources before the organization is committed to the project and certain to have the required time and money. In fact, staffing a project before that commitment is made tends to force the project ahead regardless of affordability. If the right decision is not to do the project at all, committing resources early only wastes good money. A rush to commit the resources will almost always be accompanied by a poor functional design and no plan at all—hardly the ingredients of success.

The key to success is to make educated decisions, based on sound design and scope calculations. Wishful thinking is not a strategy, and intuition is not knowledge, especially when dealing with complex software systems.

Plans, Not Plan

The result of project design is a set of plans, not a single plan. As described in the previous chapter, the project plan is not a single coordinate of time and cost. There are always multiple possible ways of building any system, and only one option will offer the right combination of time, cost, and risk. The architect may be tempted to simply ask management what the design parameters of the project are and just design that single option. The problem is that managers often do not say what they mean or mean what they say.

For example, consider a 10-man-year project—that is, a project where the sum of effort across all activities is 10 man-years. Suppose management asks for the least costly way of building the system. Such a project would have one person working for 10 years, but management is unlikely to be willing to wait 10 years. Now suppose that management asks for the quickest possible way to build the system. Imagine it is possible to build the same system by engaging 3650 people for 1 day (or even 365 people for 10 days). Management is unlikely to hire so many people for such short durations. Similarly, management will never ask for the safest way of building the system (because anything worth doing requires risk, and safe projects are not worth doing) or knowingly go for the riskiest way of doing the project.
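The arithmetic behind this hypothetical 10-man-year example is worth making concrete. The sketch below is a toy illustration only; the 365-day year and the assumption that effort divides perfectly across people are the same simplifications the text uses for the sake of argument:

```python
# Toy model of the 10-man-year example: total effort is fixed,
# while duration and head count trade off against each other.
TOTAL_EFFORT_MAN_DAYS = 10 * 365  # 10 man-years at 365 days, as in the text

def ideal_duration_days(team_size: int) -> float:
    """Duration if the effort divided perfectly across people (it never does)."""
    return TOTAL_EFFORT_MAN_DAYS / team_size

for team_size in (1, 365, 3650):
    print(f"{team_size:>4} people -> {ideal_duration_days(team_size):6.0f} days")
```

Management will choose neither extreme, which is precisely why the architect must design several intermediate options.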

Software Development Plan Review

The only way to resolve the ambiguity about what management really wants is to present a buffet of good options from which to choose, with each option being a viable combination of time, cost, and risk. You present these options to management in a dedicated meeting unofficially called the Feed Me/Kill Me meeting. As the name implies, the purpose of this meeting is for management to choose one of the project design options and commit the required resources (the “Feed Me” route). One of the options is always that of not doing the project (the “Kill Me” route). Officially, the name of the meeting should be the Software Development Plan Review, or SDP review. It makes no difference if your process does not have an SDP review point: Just call a meeting (no manager can refuse a meeting request whose subject line is “Software Development Plan Review”).

Once the desired option is identified, management must literally sign off on the SDP document. This document now becomes your project’s life insurance policy because, as long as you do not deviate from the plan’s parameters, there is no reason to cancel your project. This does require proper tracking (as described in Appendix A) and project management.

If no option is palatable, then you need to drive the right decision—in this case, killing the project. A doomed project, a project that from inception did not receive adequate time and resources, will do no one any good. The project will eventually run out of time or money or both, and the organization will have wasted not just the funds and time but the opportunity cost of devoting these resources to another doable project. It is also detrimental to the careers of the core team members to be on a project that never has a chance. Since you have only a few years to make your mark and move ahead, every project must count and be a feather in your cap. Spending a year or two on a sideways move that failed will limit your career prospects. Killing such a project before development starts is beneficial for all involved.

Services and Developers

With the project design in hand (but only after management has chosen a specific option), the team can start constructing the system. Typically this requires assigning services (or modules, components, classes, etc.) to developers. The exact assignment methodology deserves a section of its own later in this chapter. For now, recognize that you should always assign services to developers in a 1:1 ratio. The 1:1 ratio does not mean that a developer works on only one service, but rather that if you take a cross-section of the team at any moment in time, you will see each developer working on one and only one service. It is perfectly fine for a developer to finish one service and move to the next. However, you should never see a developer working on more than one service at a time or more than one developer working concurrently on the same service. Any other way of assigning services to developers will result in failure. Examples of poor assignment options include:

  • Multiple developers per service. The motivation for assigning two (or more) developers to one service is not a surplus of developers, but rather the desire to complete the work sooner. However, two people cannot really work on the same thing at the same time, so some subscheme must be used:

    –  Serialization. The developers could work serially so that only one of them is working on the service at a time. This takes longer due to the context switch overhead—that is, the need to figure out what happened with the service since the current developer looked at it last. This defeats the purpose of assigning the two developers in the first place.

    –  Parallelization. The developers could work in parallel and then integrate their work. This scheme will take much longer than just having a single developer working on the service. For example, suppose a service estimated as one month of effort is assigned to two developers who will work in parallel. One might be tempted to assume that the work will be complete after two weeks, but that is a false assumption. First, not all units of work can be split this way. Second, the developers would have to allocate at least another week to integrate their work. This integration is not at all guaranteed to succeed if the developers worked in parallel and did not collaborate during development. Even if the integration is possible, it would void all the testing effort that went into each part due to the integration changes. Testing the service as a whole also would require additional time. In all, the effort will take at least a month (and likely more). Meanwhile, other developers who are working on dependent services and expect the service to be ready after two weeks will be further delayed.

  • Multiple services per developer. The option of assigning two (or more) services to a single developer is just as bad. Suppose two services, A and B, each estimated as a month of work, are assigned to a single developer, with the developer expected to finish both after a single month. Since the sum of work is two months, not only will the services be incomplete after one month, but finishing them will take much longer. While the developer is working on the A service, the developer is not working on the B service, causing the developers dependent on the B service to demand that the developer work on the B service. The developer might switch to the B service, but then those dependent on the A service would demand some attention. All this switching back and forth drastically reduces the developer’s efficiency, prolonging the duration to much more than two months. In the end, perhaps after three or four months, the A and B services may be complete.

Either assigning more than one developer per service or assigning multiple services per developer causes a mushroom cloud of delays to propagate throughout the project, mostly due to delayed dependencies affecting other developers. This, in turn, makes accurate estimations very difficult. The only option that has any semblance of accountability and a chance of meeting the estimation is a 1:1 assignment of services to developers.
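A back-of-the-envelope model makes the cost of violating the 1:1 ratio tangible. The numbers below (20 work days per service, a fixed ramp-up penalty per context switch) are my assumptions for illustration, not figures from the text:

```python
# Hypothetical cost model for one developer juggling services A and B,
# each nominally one month (~20 work days) of effort.
NOMINAL_DAYS_PER_SERVICE = 20   # assumed effort per service
SWITCH_PENALTY_DAYS = 2         # assumed ramp-up cost per context switch

def elapsed_days(context_switches: int) -> int:
    """Total work days to finish both services, given the number of switches."""
    return 2 * NOMINAL_DAYS_PER_SERVICE + context_switches * SWITCH_PENALTY_DAYS

print(elapsed_days(0))    # back-to-back under 1:1 assignment: 40 days
print(elapsed_days(15))   # switching under pressure from dependent developers
```

However crude, the model captures the mechanism: every switch adds pure overhead, so the more the dependent developers pull on the shared developer, the later both services complete.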

Design and Team Efficiency

When using 1:1 assignments of services to developers, it follows that the interaction between the services is isomorphic to the interaction between the developers. Consider Figure 7-1.

Figure 7-1 The system’s design is the team’s design. (Images: Sapann Design/Shutterstock)

The relationship between the services, their interactions and communication, dictates the relationships and interactions between the developers. When using 1:1 assignment, the design of the system is the design of the team.

Next, consider Figure 7-2. While the number of services and their size has not changed from Figure 7-1, no one could claim it is a good design.

Figure 7-2 Tightly coupled system and team

A good system design strives to reduce the number of interactions between the modules to their bare minimum—the exact opposite of what happens in Figure 7-2. A loosely coupled system design such as that in Figure 7-1 has minimized the number of interactions to the point that removing one interaction makes the system inoperable.

The design in Figure 7-2 is clearly tightly coupled, and it also describes the way the team operates. Compare the teams from Figure 7-1 and Figure 7-2. Which team would you rather join? The team in Figure 7-2 is a high-stress, fragile team. The team members are likely territorial and resist change because every change has ripple effects that disrupt their work and the work of everybody else. They spend an inordinate amount of time in meetings to resolve their issues. In contrast, the team in Figure 7-1 can address issues locally and contain them. Each team member is almost independent from the others and does not need to spend much time coordinating work. Simply put, the team in Figure 7-1 is far more efficient than the team in Figure 7-2. As a result, the team with the better system design has far better prospects of meeting an aggressive deadline.
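With 1:1 assignment, counting service interactions also counts developer communication channels. The two edge lists below are hypothetical six-service systems in the spirit of Figures 7-1 and 7-2, not the actual figures:

```python
# Interactions (edges) as a crude coupling metric for two six-service systems.
services = "ABCDEF"

loose = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")}  # near-minimal chain
tight = {(a, b) for a in services for b in services if a < b}          # fully connected

print(len(loose))  # 5 interactions, so only 5 developer pairs must coordinate
print(len(tight))  # 15 interactions: n * (n - 1) / 2 for n = 6
```

The fully connected system triples the coordination channels without adding a single service, which is exactly the overhead the team in Figure 7-2 pays in meetings and ripple effects.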

This last observation is paramount: Most managers just pay lip service to system design because the benefits of architecture (maintainability, extensibility, and reusability) are down-the-road benefits. Future benefits do not help a manager who is facing the harsh reality of scant resources and a tight schedule. If anything, it behooves the manager to reduce the scope of work as much as possible to meet the deadline. Since system design is supposedly not helping with the current objectives, the manager will throw overboard any meaningful investment in design. Sadly, by doing so, the manager loses all chance of meeting the commitments, because the only way to meet an aggressive deadline is with a world-class design that yields the most efficient team. When striving to get management support for your design effort, show how design helps with the immediate objective. The long-term benefits will flow out of that.

Personal Relationships and Design

While the way the design affects the team’s efficiency may be self-evident, the team also affects the design. In Figure 7-1, if the developers of two interacting services do not talk with each other, then that area of the design will be weak. You should assign two coupled services to two developers who naturally work effectively with each other.

Task Continuity

When assigning services (or activities such as UI development), try to maintain task continuity, a logical continuation between tasks assigned to each person. Often, such task assignments follow the service dependency graph. If service A depends on service B, then assign A to the developer of B. One advantage is that the A developer who is already familiar with B needs less ramp-up time. An important, yet often overlooked advantage of maintaining task continuity is that the project and the developer’s win criteria are aligned. The developer is motivated to do an adequate job on B to avoid suffering when it is time to do A. Perfect task continuity is hardly ever possible, but it should be the goal.
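As a small sketch of task continuity, assignments can literally follow the dependency graph. The service names, developer names, and helper function here are all hypothetical:

```python
from typing import Optional

# Hypothetical dependency graph and completed assignments.
depends_on = {"A": "B", "C": "D"}        # A depends on B, C depends on D
built_by = {"B": "Dana", "D": "Sam"}     # who built each completed service

def continuity_assignment(service: str) -> Optional[str]:
    """Prefer the developer already familiar with the service's dependency."""
    dependency = depends_on.get(service)
    return built_by.get(dependency) if dependency else None

print(continuity_assignment("A"))  # Dana, who built B, needs no ramp-up on it
print(continuity_assignment("C"))  # Sam, for the same reason
```

When a service has no dependency (or its dependency was built by someone no longer available), the sketch returns no preference, which is where the developers' proclivities and availability decide instead.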

Finally, take the developers’ personal technical proclivities into account when making assignments. For example, it will likely not work well to have the security expert design the UI, the database expert implement the business logic, or junior developers implement the utilities, such as a message bus or diagnostics.

Effort Estimations

Effort estimation is how you try to answer the question of how long something will take. There are two types of estimations: individual activity estimation (estimating the effort for an activity assigned to a resource) and overall project estimation. The two are unrelated, because the overall duration of the project is not the sum of effort across all activities divided by the number of resources. This is due to the inherent inefficiency in utilizing people, the internal dependencies between activities, and any risk mitigation you may need to put in place.
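A minimal sketch shows why overall duration is not total effort divided by head count: the dependencies between activities impose a critical path. The activity network and durations below are hypothetical:

```python
from functools import lru_cache

# Hypothetical activity network: name -> (duration in days, prerequisites).
activities = {
    "Reqs":   (10, []),
    "Design": (15, ["Reqs"]),
    "SvcA":   (20, ["Design"]),
    "SvcB":   (30, ["Design"]),
    "Integ":  (10, ["SvcA", "SvcB"]),
}

@lru_cache(maxsize=None)
def earliest_finish(name: str) -> int:
    """Earliest finish time: own duration after all prerequisites finish."""
    duration, prerequisites = activities[name]
    return duration + max((earliest_finish(p) for p in prerequisites), default=0)

total_effort = sum(duration for duration, _ in activities.values())
print(total_effort)              # 85 man-days of effort
print(earliest_finish("Integ"))  # 65 days of duration, no matter how many people
```

Dividing 85 man-days between two developers suggests roughly 43 days, yet the dependency chain alone dictates 65; no amount of additional staffing compresses it further.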

In many software teams, engaging in estimations is at best a nice ritual and at worst an exercise in futility. The poor results of estimations in the software industry are due to several reasons:

  1. Uncertainty in how long activities take, and even uncertainty in the list of activities, is the primary reason for poor accuracy of estimations. Do not confuse cause and effect: The uncertainty is the cause, and poor estimation accuracy is the result. You must proactively reduce the uncertainty, as described later in this chapter.

  2. Few people in software development are trained in simple and effective estimation techniques. Most are left to rely on bias, guesswork, and intuition.

  3. Many people overestimate or underestimate in an attempt to compensate for the uncertainty, which results in far worse outcomes.

  4. Most people tend to look at just the tip of the iceberg when listing activities. Naturally, if you omit activities that are essential to success, your estimations will be off. This is true both when omitting activities across the project and when omitting internal phases inside activities. For example, estimators may list just the coding activities or, inside coding activities, account for coding but not design or testing.

Classic Mistakes

As just mentioned, people tend to overestimate and underestimate in an attempt to compensate for uncertainty. Both of these are deadly when it comes to project success.

Overestimation never works because of Parkinson’s law.[2] For example, if you give a developer three weeks to perform a two-week activity, the developer will simply not work on it for two weeks and then be idle for a week. Instead, the developer will work on the activity for three weeks. Since the actual work consumed only two of those three weeks, in the extra week the developer will engage in gold plating—adding bells and whistles, aspects, and capabilities that no one needs or wants, and that were not part of the design. This gold plating significantly increases the complexity of the task, and the increased complexity drastically reduces the probability of success. Consequently, the developer labors for four or six weeks to finish the original task. Other developers in the project, who expect to receive the code after three weeks, are now delayed, too. Furthermore, the team now owns, perhaps for years and across multiple versions, a code module that is needlessly more complex than what it should have been in the first place.

2. Cyril N. Parkinson, “Parkinson’s Law,” The Economist (November 19, 1955).

Underestimation guarantees failure just as well. Undoubtedly, giving a developer two days to perform a two-week coding activity will preclude any gold plating. The problem is that the developer will try to do the activity quick-and-dirty, cutting corners and disregarding all known best practices. This is as sensible as asking a surgeon to operate on you quick-and-dirty or a contractor to build a house quick-and-dirty.

Sadly, there is no quick-and-dirty with any intricate task. Instead, the two options are quick-and-clean and dirty-and-slow. Because the developer is missing all the best practices in software development, from testing to detailed design to documentation, the developer is now trying to perform the task in the worst possible way. Consequently, the developer will not work on the activity for the nominal two weeks it could have taken, assuming the work was performed correctly, but will work on it for four or six (or more) weeks due to the low quality and increased complexity. As with overestimation, other developers in the project who expected the code after the scheduled two days are much delayed. Furthermore, the team now has to own, perhaps for years and across multiple versions, a code module that is done the worst possible way.

Probability of Success

While these conclusions may make common sense, what many miss is the magnitude of these classic mistakes. Figure 7-3 plots in a qualitative manner the probability of success as a function of the estimation. For example, consider a 1-year project. With proper architecture and project design, the project’s normal estimation is 1 year, indicated by point N in Figure 7-3. What would be the probability of success if you give this project a day? A week? A month? Clearly, with sufficiently aggressive estimations, the probability of success is zero. How about 6 months? While the probability of a 1-year project completing in 6 months is extremely low, it is not zero because maybe a miracle will happen. The probability of success if you estimate at 11 months and 3 weeks is actually very high, and it is also fairly high for 11 months. However, it is unlikely the project can complete in 9 months. Therefore, to the left of the normal estimation is a tipping point where the probability of success drastically improves in a nonlinear way. Similarly, this 1-year project could last 13 months, and even 14 months is reasonable. But if you give this project 18 or 24 months, you will surely kill it because Parkinson’s law will kick in: Work will expand to fill the allotted time, and the project will fail due to the increased complexity. Therefore, another tipping point exists to the right of the normal estimation, where the probability of success again collapses in a nonlinear way.

Figure 7-3 Probability of success as a function of estimation [Adopted and modified from Steve McConnell, Rapid Development (Microsoft Press, 1996).]

Figure 7-3 illustrates the paramount importance of good nominal estimations because they maximize the probability of success, in a nonlinear way. You are highly likely to hurt yourself and others when you either underestimate or overestimate. These are not just common, classic mistakes—they are cardinal mistakes.

Estimation Techniques

The poor track record with estimations in the software industry persists even though a decent set of effective estimation techniques has been available for decades and has long been in use across multiple other industries. I have yet to see a team that practiced estimations correctly and was also off the mark with their project design and commitments. Instead of trying to review all of these techniques, this section highlights some of the ideas and techniques I have found over the years to be the simplest and most effective.

Accuracy, Not Precision

Good estimations are accurate, but not precise. For example, consider an activity that actually took 13 days and had two estimations: 10 days or 23.8 days. While the second estimation is far more precise, clearly the first estimation is better because it is more accurate. With estimations, accuracy counts more than precision. Since most software projects significantly veer off from their commitments at delivery (sometimes by multiples of the initial estimations), it is nonsensical when the people involved in those projects estimate the activities down to the hour or the day.

Estimations also must match the tracking resolution. If the project manager tracks the project on a weekly basis, any estimation less than a week is pointless because it is smaller than the measurement resolution. Doing so makes as much sense as estimating the size of your house down to the micron when using a measuring tape for the actual measurement.

Even when an activity is actually 13 days in duration, it is better to estimate it as 15 days rather than 12.5 days. Any decent-size project will likely have several dozens of activities; by opting for accuracy, you will probably overestimate (a little) on some activities and underestimate (a little) on others. On average, though, your estimations will be fairly accurate. If you are trying to be precise, you can accumulate errors because you do not allow for errors in the estimations to cancel each other out. In addition, if you ask people for precise estimations, they will endlessly agonize and deliberate on them. If you ask for accurate estimations, the estimations will be easy, simple, and quick to make.

Reduce Uncertainty

Uncertainty is the leading cause of missed estimations. It is important not to confuse the unknown with the uncertain. For example, while the exact day of my demise is unknown, it is far from uncertain, and a whole industry (life insurance) is based on the ability to estimate that date. While the estimation may not be precise when it comes to me specifically, the life insurance industry has sufficient customers to make it accurate enough.

When asking people to estimate, you should help them overcome their fear of estimations. Many may have had their poor estimations used against them in the past. You may even encounter refusal to estimate in the form of “I don’t know” or “Estimations never work.” Such attitudes may indicate fear of entrapment, or trying to avoid the effort of estimating, or being ignorant and inexperienced in estimation techniques, rather than a fundamental inability to estimate.

Confronted with the uncertain, take these steps:

  • Ask first for the order of magnitude: Is the activity more like a day, a week, a month, or a year? With the magnitude known, narrow it down using a factor of 2 to zoom in. For example, if the answer to the first question was a month, ask if it is more like two weeks, one month, two months, or four months. The first answer already rules out eight months (since that is closer to a year as an order of magnitude), and it cannot be one week because that magnitude was ruled out by the first answer.

  • Make an explicit effort to list the areas of uncertainty in the project and focus on estimating them. Always break down large activities into smaller, more manageable activities to greatly increase the accuracy of the estimations.

  • Invest in an exploratory discovery effort that will give insight into the nature of the problem and reduce the uncertainty. Review the history of the team or the organization, and learn from your own history how long things have taken in the past.

PERT Estimations

One estimation technique dealing specifically with high uncertainty is part of the Program Evaluation and Review Technique (PERT).3 For every activity, you provide three estimations: the most optimistic, the most pessimistic, and the most likely. The final estimation is given by this formula:

3. https://en.wikipedia.org/wiki/Program_evaluation_and_review_technique

E = (O + 4 * M + P) / 6

where:

  • E is the calculated estimation.

  • O is the optimistic estimation.

  • M is the most likely estimation.

  • P is the pessimistic estimation.

For example, if an activity has an optimistic estimation of 10 days, a pessimistic estimation of 90 days, and a most likely estimation of 25 days, the PERT estimation for it would be 33.3 days:

E = (10 + 4 * 25 + 90) / 6 = 33.3
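The formula reduces to a one-line function. The following is a minimal sketch; `pert_estimate` is a hypothetical helper name, shown here reproducing the example from the text:

```python
def pert_estimate(optimistic: float, most_likely: float, pessimistic: float) -> float:
    """PERT weighted estimate: E = (O + 4M + P) / 6."""
    return (optimistic + 4 * most_likely + pessimistic) / 6

# The example from the text: O = 10, M = 25, P = 90 days.
print(round(pert_estimate(10, 25, 90), 1))  # 33.3
```

Note that the most likely estimation carries four times the weight of either extreme, so wild pessimistic or optimistic values shift the result only moderately.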

Overall Project Estimation

Estimating the project as a whole is useful primarily for project design validation, but can also be beneficial when initiating project design. When you finish the detailed project design, compare it to the overall project estimation. The two need not match perfectly but should be congruent and validate each other. For example, if the detailed project design was 13 months and the overall project estimation was 11 months, then the detailed project design is valid. But if the overall estimation was 18 months, then at least one of these numbers is wrong, and you must investigate the source of the discrepancy. You can also utilize the overall project estimation when dealing with a project with very few up-front constraints. Such a clean canvas project has a great deal of unknowns, making it difficult to design. You can use the overall project estimation to work backward to box in certain activities as a way of initiating the project design process.

Historical Records

With overall project estimation, your track record and history matter the most. With even a modest degree of repeatability (see Figure 6-1), it is unlikely that you could deliver the project faster or slower than similar projects in the organization’s past. The dominant factor in throughput and efficiency is the organization’s nature, its own unique fingerprint of maturity, which is something that does not change overnight or between projects. If it took your company a year to deliver a similar project in the past, then it will take it a year in the future. Perhaps this project could be done in six months somewhere else, but with your company it will take a year. There is some good news here, though: Repeatability also means the company likely will not take two or three years to complete the project.

Estimation Tools

A great yet little-known technique for overall project estimation is leveraging project estimation tools. These tools typically assume some nonlinear relationship exists between size and cost, such as a power function, and use a large number of previously analyzed projects as their training data. Some tools even use Monte Carlo simulations to narrow down the range of the variables based on your project attributes or historical records. I have used such tools for decades, and they produce accurate results.
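The Monte Carlo idea behind such tools can be sketched with the standard library alone. Everything below is illustrative: the activity triples are hypothetical (optimistic, most likely, pessimistic) estimates in days, a triangular distribution stands in for whatever distribution a real tool fits, and summing the activities sequentially is a simplification (real tools model the full project network):

```python
import random

# Hypothetical activity estimates: (optimistic, most_likely, pessimistic) days.
activities = [(10, 25, 90), (5, 10, 20), (15, 20, 40)]

def simulate_total(activities, trials=100_000, seed=42):
    """Monte Carlo sketch: sample each activity from a triangular
    distribution and return the 10th/50th/90th percentiles of the total
    duration, assuming the activities run sequentially."""
    random.seed(seed)
    totals = sorted(
        sum(random.triangular(o, p, m) for o, m, p in activities)
        for _ in range(trials)
    )
    return tuple(totals[int(q * trials)] for q in (0.10, 0.50, 0.90))

low, median, high = simulate_total(activities)
```

The output is a range rather than a single number, which is precisely why such tools are useful for validating an overall estimation: they show how wide the plausible band around it really is.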

Broadband Estimation

The broadband estimation is my adaptation of the Wideband Delphi4 estimation technique. The broadband estimation uses multiple individual estimations to identify the average of the overall project estimation, then adds a band of estimations above and below that average. You use the estimations outside the band to gain insight into the nature of the project and refine the estimations, repeating this process until the band and the project estimations converge.

4. Barry Boehm, Software Engineering Economics (Prentice Hall, 1981).

To start any broadband estimation effort, first assemble a large group of project stakeholders, ranging from developers to testers, managers, and even support people—diversity of the group is key with the broadband technique. Strive for a mix of newcomers and veterans, devil’s advocates, experts and generalists, creative people, and worker bees. You want to tap into the group’s synergy of knowledge, intelligence, experience, intuition, and risk assessment. A good group size is between 12 and 30 people. Using fewer than 12 participants is possible, but the statistical element may not be strong enough to produce good results. With more than 30 participants, it is difficult to finish the estimation in a single meeting.

Begin the meeting by briefly describing the current state and phase of the project, what you have already accomplished (such as architecture), and additional contextual information (such as the system’s operational concepts) that may not be known to stakeholders who were not part of the core team. Each participant needs to estimate two numbers for the project: how long it will take in months and how many people it will require. Have the estimators write these numbers, along with their name, on a note. Collect the notes, enter them in a spreadsheet, and calculate both the average and the standard deviation for each value. Now, identify the estimations (both in time and people) that were at least one standard deviation removed from the average—that is, those values outside the broadband of consensus (hence the name of the technique). These are the outliers.

Instead of culling the outliers from the analysis (the common practice in most statistical methods), solicit input from those who produced them—because they may know something that the others do not. This is a great way of identifying the uncertainties. Once the outliers have voiced their reasoning for the estimation and all have heard it, you conduct another round of estimations. You repeat this process until all estimations fall within one standard deviation, or the deviation is less than your measurement resolution (such as one person or one month). Broadband estimation typically converges this way by the third round.
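The arithmetic of a single round is simple enough to sketch with the Python standard library. The names and estimates below are entirely hypothetical, and `broadband_round` is an illustrative helper, not part of any tool:

```python
from statistics import mean, stdev

def broadband_round(estimates: dict[str, float]) -> tuple[float, float, list[str]]:
    """One round of broadband estimation: return the average, the standard
    deviation, and the names of estimators outside the one-sigma band."""
    values = list(estimates.values())
    avg, sd = mean(values), stdev(values)
    outliers = [name for name, v in estimates.items() if abs(v - avg) > sd]
    return avg, sd, outliers

# Hypothetical first-round duration estimates (months) from six stakeholders.
round1 = {"Ana": 12, "Ben": 14, "Chi": 13, "Dee": 24, "Eli": 12, "Fay": 13}
avg, sd, outliers = broadband_round(round1)
print(outliers)  # ['Dee']
```

After Dee explains the reasoning behind the 24-month estimate, you run another round with `broadband_round` and repeat until no outliers remain or the deviation drops below your measurement resolution.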

Caution

During the broadband meeting, it is important to maintain a free, collegial atmosphere. Those who produce the outliers (both high and low) are known to all involved in the process, and their estimations must not be perceived as criticism of management and the organization.

A Word of Caution

Overall project estimation, whether done by using historical records, estimation tools, or the broadband method, tends to be accurate, if not highly accurate. You should compare the various overall estimations to ensure that you do, indeed, have a good estimation. Unfortunately, while these overall estimations are accurate, they merely augment and verify your detailed project design effort. They serve only as reinforcement and a sanity check because they are not actionable on their own. You may be fairly certain that the project requires 18 months and 6 people, but as yet you have no idea how to utilize those resources to finish the project on that schedule. You have to design the project to learn this information.

Activity Estimations

You start the project design with the estimated duration of the individual activities in the project. Before you estimate individual activities, you must prepare a meticulous list of all activities in the project, both coding and noncoding activities alike. In a way, even that list of activities is an estimation of the actual set of activities, so the same rationale about reducing uncertainties holds true here. Avoid the temptation to focus on the structural coding activities indicated by the system architecture, and actively look below the waterline at the full extent of the iceberg. Invest time in looking for activities, and ask other people to compile that list so you could compare it with your own list. Have colleagues review, critique, and challenge your list of activities. You may be surprised by what you actually missed.

Since accuracy is superior to precision, a best practice is to always use a quantum of 5 days in any activity estimation. Activities that take 1 or 2 days should not be part of the plan. Activities that are 3 or 4 days are always estimated at 5 days. Activities are either 5, 10, 15, 20, 25, 30, or 35 days long. Activities estimated at 40 or more days may be good candidates to break down into smaller activities to reduce the uncertainty. Using 5 days for each activity aligns the project nicely on week boundaries and reduces waste of parts of weeks before or after an activity. This practice also matches real life—no activity has ever started on a Friday.
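The 5-day quantum rule can be captured in a few lines. This is a sketch of the rounding policy described above; `quantize` is a hypothetical helper name:

```python
import math

def quantize(days: float, quantum: int = 5) -> int:
    """Round an activity estimate up to the next 5-day (one-week) quantum,
    with a floor of one quantum (1- or 2-day tasks are not planned activities)."""
    return max(quantum, math.ceil(days / quantum) * quantum)

print([quantize(d) for d in (2, 3, 8, 13, 35)])  # [5, 5, 10, 15, 35]
```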

The reduction in uncertainty benefits even regular-size activities. Force yourself and others to break down each activity into tasks in addition to coding, such as learning curves, test clients, installation, integration points, peer reviews, and documentation. Again, by avoiding focusing on coding and examining the full scope of the work ahead, you greatly reduce the uncertainty of individual activity estimations.

The Estimation Dialog

If you ask others to estimate an activity, you must maintain a correct estimation dialog with them. Never dictate duration by saying, “You have two weeks!” Not only is that based on nothing, but the owner of the activity also does not feel accountable to actually finish in two weeks. When people are unaccountable, progress and quality will be lacking. Avoid leading questions, such as “It is going to take two weeks, right?” While this is somewhat better than dictating the estimation, you now bias the other party toward your estimation. Even if the person agrees, he or she still will not feel accountable to your estimation. A far better question is the open question, “How long will it take?” Do not accept an immediate answer. Always force people to get back to you later with the answer because you want them to itemize what is really involved and to reflect and contemplate on the answer. You must have good estimations to maximize the probability of success and people’s accountability (see Figure 7-3).

Critical Path Analysis

To calculate the actual duration of a project, as well as several other key aspects of the project, you need to find the project’s critical path. Critical path analysis is the single most important project design technique. However, you cannot perform this analysis without the following prerequisites:

  • The system architecture. You must have the decomposition of the system into services and other building blocks such as Clients and Managers. While you could design a project with even a bad architecture, that is certainly less than ideal. A bad system design will keep changing, and with it, your project design will change. It is crucial that the system architecture be valid, so that it holds true over time.

  • A list of all project activities. Your list must contain both coding and noncoding activities. It is straightforward to derive the list of most coding activities by examining the architecture. The list of noncoding activities is obtained as discussed previously and is also a product of the nature of the business. For example, a banking software company will have compliance and regulatory activities.

  • Activity effort estimation. Have an accurate estimation of the effort for each activity in the list of activities. You should use multiple estimation techniques to drive accuracy.

  • Services dependency tree. Use the call chains to identify the dependencies between the various services in the architecture.

  • Activity dependencies. Beyond the dependencies between your services, you must compile a list of how all activities depend on other activities, coding and noncoding alike. Add explicit integration activities as needed.

  • Planning assumptions. You must know the resources available for the project or, more correctly, the staffing scenarios that your plan calls for. If you have several such scenarios, then you will have a different project design for each availability scenario. The planning assumptions will include which type of resource is required at which phase of the project.

Project Network

You can graphically arrange the activities in the project into a network diagram. The network diagram shows all activities in the project and their dependencies. You first derive the activity dependencies from the way the call chains propagate through the system. For each of the use cases you have validated, you should have a call chain or sequence diagram showing how some interaction between the system’s building blocks supports each use case. If one diagram has Client A calling Manager A and a second diagram has Client A calling Manager B, then Client A depends on both Manager A and Manager B. In this way, you systematically discover the dependencies between the components of the architecture. Figure 7-4 shows the dependency chart of the code modules in a sample Method-based architecture.

Figure 7-4 Services dependency chart

The dependency chart shown in Figure 7-4 has several problems. First, it is highly structural and is missing all the nonstructural coding and noncoding activities. Second, it is graphically bulky and with larger projects would become visually too crowded and unmanageable. Third, you should avoid grouping activities together, as is the case with the Utilities in the figure.

You should turn the diagram in Figure 7-4 into the detailed abstract chart shown in Figure 7-5. That chart now contains all activities, coding and noncoding alike, such as architecture and system testing. You may want to also add a side legend identifying the activities for easy review.

Figure 7-5 Project network

Activity Times

The effort estimation for an activity alone does not determine when that activity will complete: Dependencies on other activities also come into play. Therefore, the time to finish each activity is the sum of the effort estimation for that activity and the time it takes to get to that activity in the project network. The time to get to an activity, or the time it takes to be ready to start working on the activity, is the maximum time across all network paths leading to that activity. More formally, you calculate the time for completing activity i in the project with this recursive formula:

Ti = Ei + Max(Ti1, Ti2, ..., Tin)

where:

  • Ti is the time for completing activity i.

  • Ei is the effort estimation for activity i.

  • n is the number of activities leading directly to activity i.

The time for each of the preceding activities is resolved the same way. Applying the formula recursively, you can start with the last activity in the project and find the completion time for each activity in the network. For example, consider the activity network in Figure 7-6.

Figure 7-6 Project network used in the time calculation example

In the diagram in Figure 7-6, activity 5 is the last activity. Thus, the set of recursive expressions that define the time to finish activity 5 is:

T5 = E5 + Max(T3, T6)
T6 = E6
T3 = E3 + Max(T2, T4)
T4 = E4
T2 = E2 + T1
T1 = E1

Note that the time to finish activity 5 depends on the effort estimation of the previous activities as much as it depends on the network topology. For example, if all the activities in Figure 7-6 are of equal duration, then:

T5 = E1 + E2 + E3 + E5

However, if all activities except activity 6 are estimated at 5 days, and activity 6 is estimated at 20 days, then:

T5 = E6 + E5

While you could manually calculate the activity times for small networks such as Figure 7-6, this calculation quickly gets out of hand with large networks. Computers excel at such recursive calculations, so you should use tools (such as Microsoft Project or a spreadsheet) to calculate activity times.
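The recursive formula maps directly to a memoized function. The following sketch encodes the Figure 7-6 network, with the dependencies taken from the expressions above and the hypothetical efforts of the second example (all activities at 5 days except activity 6 at 20 days):

```python
from functools import cache

# Figure 7-6 network: effort estimates (days) and direct predecessors.
effort = {1: 5, 2: 5, 3: 5, 4: 5, 5: 5, 6: 20}
preds  = {1: [], 2: [1], 3: [2, 4], 4: [], 5: [3, 6], 6: []}

@cache
def completion_time(i: int) -> int:
    """Ti = Ei + Max of the completion times of all direct predecessors."""
    return effort[i] + max((completion_time(p) for p in preds[i]), default=0)

print(completion_time(5))  # 25: as in the text, T5 = E6 + E5
```

With all efforts equal to 5 days instead, the same function yields T5 = E1 + E2 + E3 + E5 = 20 days, confirming that the result depends on the estimates as much as on the topology.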

The Critical Path

By calculating the activity times, you can identify the longest possible path in the network of activities. In this context, the longest path means the path with greatest duration, not necessarily the one with the greatest number of activities. For example, the project network in Figure 7-7 has 17 activities, each of different estimated duration (the numbers in Figure 7-7 are just the activity IDs; durations are not shown).

Figure 7-7 Identifying the critical path

Based on the effort estimation for each activity and the dependencies, using the formula given earlier and starting from activity 17, the longest path in the network is shown in bold. That longest path in the network is called the critical path. You should highlight the critical path in your network diagrams using a different color or bold lines. Calculating the critical path is the only way to answer the question of how long it will take to build the system.

Because the critical path is the longest path in the network, it also represents the shortest possible project duration. Any delay on the critical path delays the entire project and jeopardizes your commitments.

No project can ever be accelerated beyond its critical path. Put another way, you must build the system along its critical path to build the system the quickest possible way. This is true in any project, regardless of technology, architecture, development methodology, development process, management style, and team size.

In any project with multiple activities on which multiple people are working, you will have a network of activities with a critical path. The critical path does not care if you acknowledge it or not; it is just there. Without critical path analysis, the likelihood of developers building the system along the critical path is nearly zero. Working this way is likely to be substantially slower.
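Once the completion times are known, extracting the critical path is a matter of walking backward from the last activity, always following the predecessor with the greatest completion time (the binding dependency). This sketch reuses the hypothetical Figure 7-6 data from before:

```python
from functools import cache

# Figure 7-6 network again (hypothetical efforts: activity 6 takes 20 days).
effort = {1: 5, 2: 5, 3: 5, 4: 5, 5: 5, 6: 20}
preds  = {1: [], 2: [1], 3: [2, 4], 4: [], 5: [3, 6], 6: []}

@cache
def completion_time(i: int) -> int:
    return effort[i] + max((completion_time(p) for p in preds[i]), default=0)

def critical_path(last: int) -> list[int]:
    """Walk back from the last activity, always taking the predecessor
    with the greatest completion time."""
    path = [last]
    while preds[path[-1]]:
        path.append(max(preds[path[-1]], key=completion_time))
    return path[::-1]

print(critical_path(5))  # [6, 5]
```

With these estimates, the 20-day activity 6 dominates, so the critical path is 6 → 5 and the shortest possible project duration is 25 days; any delay to activity 6 delays the project.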

Assigning Resources

During project design, the architect assigns abstract resources (such as Developer 1) to each of the project design options. Only after the decision makers have chosen a particular project design option can the project manager assign actual resources. Since any delay in the critical path will delay the project, the project manager should always assign resources to the critical path first. You should take matters a step further by always assigning your best resources to the critical path. By “best,” I mean the most reliable and trustworthy developers, the ones who will not fail to deliver. Avoid the classic mistake of first assigning developers to high-visibility but noncritical activities, or to activities that the customer or management care the most about. Assigning development resources first to noncritical activities does nothing to accelerate the project. Slowing down the critical path absolutely slows down the project.

Staffing Level

During project design, for each project design option the architect needs to find out how many resources (such as developers) the project will require overall. The architect discovers the required staffing level iteratively. Consider the network in Figure 7-7, where the critical path is already identified, and assume each node is a service. How many developers are required on the first day of the project? If you were given just a single developer, that developer is by definition your best developer, so the single developer goes to activity 1. If you are given two developers, then you can assign the second developer to activity 2, even though that activity is not required until much later. If you are given three developers, then the third developer is at best idle, and at worst disrupting the developer working on activity 1. Therefore, the answer to the question of how many developers are required on day 1 of the project is at most two developers.

Next, suppose activity 1 is complete. How many developers are required now? The answer is at most six (activities 3, 4, 5, 6, 7, and 2 are available). However, asking for six developers is less than ideal since by the time you have progressed up the critical path to the level of activities 8 or 12, you need only three or even two developers. Perhaps it is better to ask for just four developers instead of six developers once activity 1 is complete. Utilizing only four as opposed to six developers has two significant advantages. First, you will reduce the cost of the project. A project with four developers is 33% less expensive than a project with six developers. Second, a team of four developers is far more efficient than a team of six developers. The smaller team will have less communication overhead and less temptation for interference from the idle hands.

Based on this criterion alone, a team of three or even two developers would be better than a team of four developers. However, when examining the network of Figure 7-7 it is likely impossible to build the system with just three developers and keep the same duration. With so few developers, you will paint yourself into a corner in which a developer on the critical path needs a noncritical activity that is simply not ready yet (such as activity 15 needing activity 11). This promotes a noncritical activity to a critical activity, in effect creating a new and longer critical path. I call this situation subcritical staffing. When the project goes subcritical, it will miss its deadline because the old critical path no longer applies.

The real question is not how many resources are required. The question to ask at any point of the project is:

What is the lowest level of resources that allows the project to progress unimpeded along the critical path?

Finding this lowest level of resources keeps the project critically staffed at all points in time and delivers the project at the least cost and in the most efficient way. Note that the critical level of staffing can and should change throughout the life of the project.

Imagine a group of developers without project design. The likelihood of that group constituting the lowest level of resources required to progress unimpeded along the critical path is nearly zero. The only way to compensate for the unknown staffing needs of the project is by using horrendously wasteful and inefficient overcapacity staffing. As illustrated previously, working this way cannot be the fastest way of completing the project—and now you see it also cannot be the least costly way of building the system. My experience is that overcapacity can be more expensive than the lowest cost level by many multiples.

Float-Based Assignment

Returning to the network in Figure 7-7, once you have concluded that you could try to build the system with only four developers, you face a new challenge: Where and when will you deploy these four developers? For example, with activity 1 complete, you could assign the developers to activities 3, 4, 5, 6 or 3, 5, 6, 7, or 3, 4, 6, 2, and so on. Even with a simple network, the combinatorial spectrum of possibilities is staggering. Each of these options would have its own set of possible downstream assignments.

Fortunately, you do not have to try any of these combinations. Examine activity 2 in Figure 7-7. You can actually defer assigning resources to activity 2 until the day that activity 16 (which is on the critical path) must start, minus the estimated duration of activity 2. Activity 2 can “float” to the top (remain unassigned and not start) until it bumps against activity 16. All noncritical activities have float, which is the amount of time you could delay completing them without delaying the project. Critical activities have no float (or more precisely, their float is zero) since any delay in these activities would delay the project. When you assign resources to the project, follow this rule:

Always assign resources based on float.

To figure out how to assign developers in the previous example once activity 1 is complete, calculate the float of all activities that are possible once activity 1 is complete, and assign the four developers based on the float, from low to high. First, assign a developer to the critical path, not because it is special but because it has the lowest possible float. Now, suppose activity 2 has 60 days of float and activity 4 has 5 days of float. This means that if you defer getting to activity 4 by more than 5 days, you will derail the project. By contrast, you could defer getting to activity 2 by at most 60 days, so you assign the next developer to activity 4. During the intervening time while activity 2 remains unassigned, you are in effect consuming the activity’s float. Perhaps by the time the float of activity 2 has become 15 days, you will be finally able to assign a developer to this activity.
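Float itself can be computed as the gap between each activity’s latest and earliest finish. Earliest finishes come from the forward recursion shown earlier; latest finishes come from a symmetric backward pass over the successors. This sketch uses the same hypothetical Figure 7-6 data:

```python
from functools import cache

# Figure 7-6 network (hypothetical efforts: activity 6 takes 20 days).
effort = {1: 5, 2: 5, 3: 5, 4: 5, 5: 5, 6: 20}
preds  = {1: [], 2: [1], 3: [2, 4], 4: [], 5: [3, 6], 6: []}
succs  = {i: [j for j, ps in preds.items() if i in ps] for i in effort}

@cache
def earliest_finish(i: int) -> int:
    return effort[i] + max((earliest_finish(p) for p in preds[i]), default=0)

project_end = max(earliest_finish(i) for i in effort)

@cache
def latest_finish(i: int) -> int:
    """The latest an activity may finish without delaying any successor."""
    return min((latest_finish(s) - effort[s] for s in succs[i]),
               default=project_end)

# Float = latest finish minus earliest finish; critical activities have zero.
floats = {i: latest_finish(i) - earliest_finish(i) for i in effort}
print(floats)  # {1: 5, 2: 5, 3: 5, 4: 10, 5: 0, 6: 0}
```

Sorting the available activities by float from low to high produces exactly the assignment order described above: the critical activities (float of zero) get resources first, and an activity such as 4 (10 days of float) can safely wait.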

Classic Pitfall

As observed by Tom DeMarco,a most organizations incentivize their managers to do the wrong thing when it comes to project staffing, even when starting with the best of intentions. Managers can correctly assign developers to the project only after project design, which is possible only after the architecture is complete. These design activities, while short in nature, conclude the fuzzy front end of the project, which itself may take months of scoping the work, prototyping, evaluating technologies, interviewing customers, analyzing requirements, and more. There is no point in hiring developers until the project manager can assign them based on the plan, because otherwise they will have nothing to do.

However, empty offices and desks, for months on end, reflect poorly on the manager, making it seem as if the manager is just slacking. The manager fears that when (not if) the project is late (as software projects are known to be), the manager will get the blame because the manager did not hire the developers at the beginning of the project. To avoid this liability, as soon as the fuzzy front end starts, the manager will hire developers to avoid empty offices. The developers still have nothing to do, so they will play games, read blogs, and take long lunch breaks. Unfortunately, this behavior reflects even worse on the manager than the empty offices, because now the perception is that the manager does not know how to delegate and manage, and the organization has to pay for it, too.

Again, the manager fears that if the project is late, the manager will be the one left holding the bag. As soon as the front end starts, the manager will staff the project and assign feature A to the first developer, feature B to the second developer, and so on, even though the project lacks a sound architecture or critical path analysis. When several weeks or months later the architect produces the architecture and the project design, they will be irrelevant, since the developers have been working on a completely different system and project. The project will grossly miss the schedule and blow through any set budget, not just because of the lack of architecture and critical path analysis, but also because what took place instead was both functional decomposition of the system and functional decomposition of the team.

The arguments from Chapter 2 about system decomposition could easily be made about team decomposition, too. The project now has the worst possible combination of system design and team design. The manager will continually be pleading for more time and resources from top management. When the project is late (again as most projects are in the software industry), the manager looks no worse than any other manager in the organization.

Doing the right thing is a lot easier the second time around, when you have already proven that you know how to deliver on time and on budget. The organization may never understand how it worked (or why the way other managers always try it does not work), but cannot argue with results. The first time going this route, and without a track record of success, you will face a struggle. The best action is to confront this pitfall head on and make resolving it part of your project design, as described in Chapter 11.

a. Tom DeMarco, The Deadline (Dorset House, 1997).

The nature of this process is iterative both because initially the lowest level of staffing is unknown and because using float-based assignment changes the floats of the activities. Start by attempting to staff the project with some resource level, such as six resources, and then assign these resources based on float. Every time a resource is scheduled to finish an activity, you scan the network for the nearest available activities, choosing the activity with the lowest float as the next assignment for that resource. If you successfully staff the project, try again, this time with a reduced staffing level such as five or even four resources. At some point, you will have an excess of activities compared with the available resources. If those unassigned activities have high enough float, you could defer assigning resources to them until some resources become available. While these activities are unassigned, you will be consuming their float. If the activities become critical, then you cannot build the project with that staffing level, and you must settle for a higher level of resources.
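The iterative search for the lowest feasible staffing level might be sketched as follows. This is a simplified, hypothetical model, not the book's method in full: each activity is reduced to a duration, an earliest start, and a float, and a staffing level is deemed infeasible once any activity's float is fully consumed before a resource reaches it:

```python
def feasible(activities, resources):
    """Greedy sketch: can `resources` people complete all activities without
    any activity starting after its latest start (earliest start + float)?
    Each resource, when free, picks the possible activity with the lowest float.
    """
    free_at = [0] * resources          # when each resource next becomes free
    pending = list(activities)
    while pending:
        free_at.sort()
        now = free_at[0]
        ready = [a for a in pending if a["es"] <= now]
        if not ready:
            # No activity is possible yet: the resource idles until one is.
            free_at[0] = min(a["es"] for a in pending)
            continue
        act = min(ready, key=lambda a: a["float"])   # assign based on float
        if now > act["es"] + act["float"]:
            return False       # float fully consumed: staffing is subcritical
        pending.remove(act)
        free_at[0] = now + act["dur"]
    return True

# Illustrative network (durations, earliest starts, and floats are made up):
acts = [
    {"dur": 10, "es": 0, "float": 0},    # critical
    {"dur": 10, "es": 10, "float": 0},   # critical
    {"dur": 5, "es": 0, "float": 10},
    {"dur": 5, "es": 0, "float": 2},
]
for level in (3, 2, 1):
    print(level, feasible(acts, level))  # prints: 3 True / 2 True / 1 False
```

Here the project can be built with two resources, but goes subcritical at one: the lone developer cannot reach the 2-day-float activity in time.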

Another key advantage of float-based assignment relates to risk reduction. Activities with the least float are the riskiest, the ones most capable of delaying the project. Assigning resources to these activities first allows you to staff a project in the safest possible way and reduce the overall risk associated with any given staffing level. Again, without project design, the likelihood that a project manager or a group of developers will assign activities based on float is nearly zero. Working this way is not just slow and expensive, but also risky.

Network and Resources

The discussion so far has focused on the dependencies between the activities as the way to construct the network. However, the resources also affect the network. For example, if you were to assign the network depicted in Figure 7-7 to a single developer, the actual network diagram would be a long string, not Figure 7-7. The dependency on the single resource drastically changes the network diagram. Therefore, the network diagram is actually not just a network of activities, but first and foremost a network of dependencies. If you have unlimited resources and very elastic staffing, then you can rely only on the dependencies between the activities. Once you start consuming float, you must add the dependencies on the resources to the network. The key observation here is:

Resource dependencies are dependencies.

The actual way of assigning resources to the project network is a product of multiple variables. When you assign resources you must take the following into account:

  • Planning assumptions

  • Critical path

  • Floats

  • Available resources

  • Constraints

These will always result in several project design options, even for straightforward projects.

Scheduling Activities

Together, the project network, the critical path, and the float analysis allow you to calculate the duration of the project as well as when each activity should start with respect to the project beginning. However, the information in the network is based on workdays, not on calendar dates. You need to convert the information in the network to calendar dates by scheduling the activities. This is a task that you can easily perform by using a tool (such as Microsoft Project). Define all activities in the tool, then add dependencies as predecessors, and assign the resources according to your plan. Once you select a start date for the project, the tool will schedule all activities. The output may also include a Gantt chart, but that is incidental to the core piece of information you can now glean from the tool: the planned start and completion dates for each activity in the project.

Caution

Gantt charts in isolation are detrimental because they may give management the illusion of planning and control. A Gantt chart is merely one view of the project network, and it does not encompass the full project design.

Staffing Distribution

The required staffing for your project is not constant with time. At the beginning, you need only the core team. Once management selects a project design option and approves the project, you can add resources such as developers and testers.

Not all resources are needed all at once due to the dependencies and the critical path. Much the same way, not all resources are retired uniformly. The core team is required throughout, but developers should not be needed through the last day of the project. Ideally, you should phase in developers at the beginning of the project as more and more activities become possible, and phase out the developers toward the end of the project.

This approach of phasing resources in and out has two significant advantages. First, it avoids the feast-or-famine cycles experienced by many software projects. Even if you have the required average level of staffing for the project, you could be understaffed in one part of the project and overstaffed in another. These cycles of idleness or intense overtime are demoralizing and very inefficient. Second (and more importantly), phasing resources offers the possibility of realizing economy of scale. If you have several projects in the organization, then you could arrange them such that developers are always phasing out of one project while phasing into another. Working this way can yield productivity increases of hundreds of percent, the classic “doing much more with less.”

Staffing Distribution Chart

Figure 7-8 depicts the typical staffing distribution chart of a well-designed and properly staffed project. At the start of the project is the front end, during which the core team is working on the system and project design; this phase ends with the SDP review. If the project is terminated at that point, the staffing goes to zero and the core team is available for other projects. If the project is approved, an initial ramp-up in staffing occurs in which developers and other resources are working on the lowest-level activities in the project that enable other activities. When those activities become available, the project can absorb additional staff. At some point you have phased in all the resources the project ever needs, reaching peak staffing. For a while, the project is fully staffed. The system tends to appear at the end of this phase. Now the project can phase out resources, and those left are working on the most dependent activities. The project concludes with the level of staffing required for system testing and release.

Figure 7-8 Correct staffing distribution

Figure 7-9 shows a staffing distribution chart that demonstrates the behavior of Figure 7-8. You produce a chart such as Figure 7-9 by first staffing the project, then listing all the dates of interest (unique dates when activities start and end) in chronological order. You then count how many resources are required for each category of resources in each time period between dates of interest. Do not forget to include in the staffing distribution resources that do not have specific activities but are nonetheless required, such as the core team, quality control, and developers between coding activities. This sort of stacked bar chart is trivial to produce in a spreadsheet. The files accompanying this book contain several example projects and templates for these charts.
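As a sketch of the counting step (in Python rather than a spreadsheet), with a plan whose dates and headcounts are purely illustrative and not taken from any figure:

```python
def staffing_distribution(assignments):
    """assignments: list of (start_day, end_day, headcount) tuples.
    Returns [(period_start, period_end, total_staff), ...] for each time
    period between the chronologically ordered dates of interest."""
    dates = sorted({d for start, end, _ in assignments for d in (start, end)})
    periods = []
    for start, end in zip(dates, dates[1:]):
        # An assignment counts if it spans the entire period.
        staff = sum(n for s, e, n in assignments if s <= start and e >= end)
        periods.append((start, end, staff))
    return periods

plan = [(0, 30, 2),    # core team during the front end
        (30, 60, 4),   # ramp-up: developers join
        (40, 90, 6),   # additional staff toward the peak
        (90, 110, 3)]  # phase-out and system testing
for start, end, staff in staffing_distribution(plan):
    print(f"days {start:3} to {end:3}: {staff} people")
```

Plotting these periods as stacked bars per resource category yields a chart like Figure 7-9.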

Figure 7-9 Sample staffing distribution

Since the dates of interest may not be regularly spaced, the bars in the staffing distribution chart may vary in time resolution. However, in most decent-size projects with enough activities, the overall shape of the chart should follow that of Figure 7-8. By examining the staffing distribution chart, you get quick and valuable feedback on the quality of your project design.

Staffing Mistakes

Several common project staffing mistakes may be evident in the staffing distribution chart. If the chart looks rectangular, it implies constant staffing—a practice against which I have already cautioned.

A staffing distribution with a huge peak in the middle of the chart (as shown in Figure 7-10) is also a red flag: Such a peak always indicates waste.

Figure 7-10 Peak in staffing distribution

Consider the effort expended in hiring people and training them on the domain, architecture, and technology when you use them for only a short period of time. A peak is usually caused by not consuming enough float in the project, resulting in a spike in resource demand. If the project were to trade some float for resources, the curve would be smoother. Figure 7-11 depicts a sample project with a peak in staffing.

Figure 7-11 Sample peak in staffing distribution

A flat line in the staffing distribution chart (as shown in Figure 7-12) is yet another classic mistake. The flat line indicates the absence of the high plateau of Figure 7-8. The project is likely subcritical and is missing the resources to staff the noncritical activities of the original plan.

Figure 7-12 Flat subcritical staffing distribution

Figure 7-13 shows the staffing distribution for a sample subcritical project. This project goes subcritical at a level of 11 or 12 resources. It is not just missing the plateau, but has a valley instead.

Figure 7-13 Sample subcritical staffing distribution

Erratic staffing distributions (as in Figure 7-14) are yet another distress signal. Projects that are designed with this kind of elasticity in mind are due for a disappointment (see Figure 7-15) because staffing can never be that elastic. Most projects cannot conjure people out of thin air, have them be instantly productive, and then dispose of them a moment later. In addition, when people constantly come and go from a project, training (or retraining) them is very expensive. It is difficult to hold people accountable or retain their knowledge under such circumstances.

Figure 7-14 Erratic staffing distribution

Figure 7-15 Sample erratic staffing distribution

Figure 7-16 illustrates another staffing distribution to avoid, the high ramp-up coming into the project. While this figure does not include any numbers, the chart clearly indicates wishful thinking. No team can instantly go from zero to peak staffing and have everyone add value and deliver high-quality, production-worthy code. Even if the project initially has that much parallel work, and even if you have the resources, the network downstream throttles how many resources the project can actually absorb beyond that, and the required staffing fizzles out.

Figure 7-16 High ramp-up in staffing distribution

Figure 7-17 demonstrates such a project. This plan expects to reach 11 people instantaneously and, shortly afterward, to deflate to around six people until the end of the project. It is improbable that any team could ramp up this way, and the available resources would be used inefficiently due to the oversized team.

Figure 7-17 Sample initial high ramp in staffing distribution

Smoothing the Curve

A key observation from the visual indicators of mistakes in the charts is that good projects have smooth staffing distributions. Life is much better when you are cruising along through your project rather than negotiating sharp turns or experiencing screaming acceleration and emergency braking.

As just mentioned, the two root causes of incorrect staffing are assuming overly elastic staffing and not consuming float when assigning resources. When considering staffing elasticity, you have to know your team and have a good grasp of what is feasible as far as availability and efficiency. The degree of staffing elasticity also depends on the nature of the organization and the quality of the system and project design. The better the designs, the more quickly developers can come to terms with the new system and activities. Consuming float is easy to do in most projects and is likely to reduce both the volatility in the staffing and the absolute level of the required staffing. Being more realistic about staffing elasticity and consuming float often eliminate the peaks, the ups-and-downs, and the high ramp-ups.

Project Cost

Plotting the staffing distribution chart for each project design option is a great validation aid in reflecting on the option and seeing if it makes sense. With project design, if something does not feel right, more often than not, something is indeed wrong.

Drawing the staffing distribution chart offers another distinct benefit: It is how you figure out the cost of the project. Unlike physical construction projects, software projects do not have a cost of goods or raw materials. The cost of software is overwhelmingly in labor. This labor includes all team members, from the core team to the developers and testers. Labor cost is simply the staffing level multiplied by time:

Cost = Staffing × Time

Multiplying staffing by time is actually the area under the staffing distribution chart. To calculate the cost, you need to calculate that area.

The staffing distribution chart is a discrete model of the project that has vertical bars (the staffing level) in each time period between dates of interest. You calculate the area under the staffing distribution chart by multiplying the height of each vertical bar (the number of people) by the duration of the time period between its dates of interest (Figure 7-18). You then sum the results of these multiplications.

Figure 7-18 Calculating project cost

The formula for the calculation of the area under the staffing chart is:

Cost = Σ(i=1 to n) Si × (Ti − Ti−1)

where:

  • Si is the staffing level at date of interest i.

  • Ti is the date of interest i (T0 is the start date).

  • n is the number of dates of interest in the project.

Finding the area under the staffing distribution chart is the only way to answer the question of how much the project will cost.

If you use a spreadsheet to produce the staffing distribution chart, you just need to add another column with a running sum to calculate the area under the chart (in essence, a numerical integration). The support files accompanying this book contain several examples of this calculation.
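A minimal sketch of that running-sum calculation (the dates of interest and staffing levels below are illustrative, not from any example in the book):

```python
def project_cost(dates, staffing):
    """dates: [T0, T1, ..., Tn]; staffing: [S1, ..., Sn], where Si is the
    staffing level between T(i-1) and Ti. Returns the area under the
    staffing distribution chart: the sum of Si * (Ti - T(i-1))."""
    return sum(s * (t1 - t0) for s, t0, t1 in zip(staffing, dates, dates[1:]))

dates = [0, 2, 5, 10, 12]     # dates of interest T0..T4, in months
staffing = [2, 6, 8, 3]       # people in each period S1..S4
print(project_cost(dates, staffing))  # 2*2 + 6*3 + 8*5 + 3*2 = 68 man-months
```

Each term is one vertical bar of the chart: its height (people) times its width (months).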

Since cost is defined as staffing multiplied by time, the units of cost should be units of effort, such as man-months or man-years. It is better to use these units rather than currency to neutralize differences in salaries, local currencies, and budgets. It then becomes possible to objectively compare the cost of different project design options.

Given the architecture, the initial work breakdown, and the effort estimation, it is a matter of a few hours to a day at the most to answer the questions of how long it will take and how much it will cost to build the system. Sadly, most software projects are running blind. This is as sensible as playing poker without ever looking at the cards—except, instead of chips, you have your project, your career prospects, or even the company’s future on the line.

Project Efficiency

Once the project cost is known, you can calculate the project efficiency. The efficiency of a project is the ratio between the sum of effort across all activities (assuming perfect utilization of people) and the actual project cost. For example, if the sum of effort across all activities is 10 man-months (assuming 30 workdays in a month), and the project cost is 50 man-months (of regular workdays), then the project efficiency is 20%.

The project efficiency is a great indicator of the quality and sanity of the project’s design. The expected efficiency of a well-designed system, along with a properly designed and staffed project, ranges between 15% and 25%.

These efficiency rates may seem appallingly low, but higher efficiency is actually a strong indicator of an unrealistic project plan. No process in nature can ever even approach 100% efficiency. No project is free from constraints, and these constraints prevent you from leveraging your resources in the most efficient way. By the time you add the cost of the core team, the testers, the build and DevOps engineers, and all the other resources associated with your project, the portion of the effort devoted to just writing code is greatly diminished. Projects with high efficiency such as 40% are simply impossible to build.

Even 25% efficiency is on the high side and is predicated on having a correct system architecture that will provide the project with the most efficient team (see Figure 7-1) and a correct project design that uses the smallest level of resources and assigns them based on floats. Additional factors required for delivering on high efficiency expectations include a small, experienced team whose members are accustomed to working together, and a project manager who is committed to quality and can handle the complexity of the project.

Efficiency also correlates with staffing elasticity. If staffing were truly elastic (i.e., you could always get resources just when you need them and let them go at the precise moment when you no longer need them), the efficiency would be high. Of course, staffing is never that elastic, so sometimes resources will be idle while still assigned to the project, driving the efficiency down. This is especially the case when utilizing resources outside the critical path. If a single person is working on all critical activities, that person is actually at peak efficiency because the person works on activities back-to-back, and the cost of that effort approaches the sum of the cost of the critical activities. With noncritical activities, there is always float. Since staffing is never truly elastic, the resources outside the critical path can never be utilized at very high efficiency.

If the project design option has a high expected efficiency, you must investigate the root cause. Perhaps you assumed too liberal and elastic staffing or the project network is too critical. After all, if most network paths are either critical or near-critical (most activities have low float), then you would get a high efficiency ratio. However, such a project is obviously at high risk of not meeting its commitments.

Efficiency as Overall Estimation

The efficiency of software projects is tightly correlated with the nature of the organization. Inefficient organizations do not turn efficient overnight, and vice versa. Efficiency also relates to the nature of the business. The overhead required in a project that produces software for a medical device will differ from that of a small startup developing a social media plug-in.

You can use efficiency as yet another broad project estimation technique. Suppose you know that historically your projects were 20% efficient. Once you have your individual activity breakdown and their estimations, simply multiply the sum of effort (assuming perfect utilization) across all activities by 5 to produce a rough overall project cost.
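That rule of thumb reduces to one division; the 20% figure below is the historical efficiency assumed in the text, and the function name is mine:

```python
def rough_cost(total_activity_effort, historical_efficiency=0.20):
    """Broad estimate: divide the perfect-utilization sum of effort across
    all activities by the organization's historical efficiency.
    At 20% efficiency this is the same as multiplying by 5."""
    return total_activity_effort / historical_efficiency

# 10 man-months of activities at 20% efficiency: roughly 50 man-months overall
print(rough_cost(10))  # → 50.0
```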

Earned Value Planning

Another insightful project design technique is earned value planning. Earned value is a popular means of tracking a project, but you can also use it as a great project design tool. With earned value planning, you assign a value to each activity toward the completion of the project, and then combine it with the schedule of each activity to see how you plan to earn that value as a function of time.

The formula for the planned earned value is:

EV(t) = Σ(i=1 to m) Ei / Σ(i=1 to N) Ei

where:

  • Ei is the estimated duration for activity i.

  • m is the number of activities completed at time t.

  • N is the number of activities in the project.

  • t is a point in time.

The earned value at time t is the sum of the estimated durations of all activities completed by time t, divided by the sum of the estimated durations of all activities in the project.

Consider, for example, the very simple project in Table 7-1.

Table 7-1 Sample project earned value

Activity            Duration (days)   Value (%)
Front End                        40          20
Access Service                   30          15
UI                               40          20
Manager Service                  20          10
Utility Service                  40          20
System Testing                   30          15
Total                           200         100

The sum of estimated duration across all activities in Table 7-1 is 200 days. The UI activity, for example, is estimated at 40 days. Since 40 is 20% of 200, you could state that by completing the UI activity, you have earned 20% toward the completion of the project. From your scheduling of activities you also know when the UI activity is scheduled to complete, so you can actually calculate how you plan to earn value as a function of time (Table 7-2).

Table 7-2 Sample planned earned value as function of time

Activity            Completion Date   Value (%)   Earned Value (%)
Start               0                         0                  0
Front End           t1                       20                 20
Access Service      t2                       15                 35
UI                  t3                       20                 55
Manager Service     t4                       10                 65
Utility Service     t5                       20                 85
System Testing      t6                       15                100
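The cumulative column of Table 7-2 follows directly from the durations of Table 7-1; a short sketch reproducing it:

```python
activities = [                  # (name, estimated duration in days)
    ("Front End", 40), ("Access Service", 30), ("UI", 40),
    ("Manager Service", 20), ("Utility Service", 40), ("System Testing", 30),
]

total = sum(duration for _, duration in activities)   # 200 days
earned = 0.0
for name, duration in activities:
    value = duration / total * 100      # this activity's value (%)
    earned += value                     # cumulative planned earned value (%)
    print(f"{name:16} value {value:3.0f}%  earned value {earned:3.0f}%")
```

Joining each cumulative value with the activity's scheduled completion date (t1 through t6) produces the planned earned value curve of Figure 7-19.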

Such a chart of planned progress is shown in Figure 7-19. By the time the project reaches the planned completion date, it should have earned 100% of the value. The key observation in Figure 7-19 is that the pitch of the planned earned value curve represents the throughput of the team. If you were to assign exactly the same project to a better team, they would meet the same 100% of earned value sooner, so their line would be steeper.

Figure 7-19 Planned earned value chart

Classic Mistakes

The realization that you can gauge the expected throughput of the team from the earned value chart enables you to quickly discern mistakes in your project plan. For example, consider the planned earned value chart in Figure 7-20. No team in the world could ever deliver on such a plan. For much of the project, the expected throughput was shallow. What kind of miracle of productivity would deliver the rocket launch of earned value toward the end of the project?

Figure 7-20 Unrealistically optimistic plan

Such unrealistic, overly optimistic plans are usually the result of back-scheduling. The plan may even start with the best of intentions, progressing along the critical path. Unfortunately, you find that someone has already committed the project to a specific date with no regard for the project design or the actual capabilities of the team. You then take the remaining activities and cram them against the deadline, basically back-scheduling from that date. Only by plotting the planned earned value can you call attention to the impracticality of this plan and try to avert failure. Figure 7-21 depicts a project with such behavior.

Figure 7-21 Sample unrealistically optimistic plan

In much the same way, you can detect unrealistically pessimistic plans such as that shown in Figure 7-22. This project starts out well, but then productivity is expected to suddenly diminish—or more likely, the project was given much more time than was required. The project in Figure 7-22 will fail because it allows gold plating and complexity to rear their heads. You can even extrapolate from the healthy part of the curve to when the project should have finished (somewhere just above the knee in the curve).

Figure 7-22 Unrealistically pessimistic plan

The Shallow S Curve

A project utilizing a fixed-size team would always result in a straight line on the planned earned value chart. As mentioned already, you should not keep the team size fixed. A properly staffed and well-designed project always results in a shallow S curve for the earned value chart, as shown in Figure 7-23.

Figure 7-23 Shallow S curve

The shape of the planned earned value curve is related to the planned staffing distribution. At the beginning of the project, only the core team is available, so not much measurable value is added at the front end, and the pitch of the earned value curve is almost flat. After the SDP review, the project can start adding people. As you increase the size of the team, you will also increase its throughput, so the earned value curve gets steeper and steeper. At some point you reach peak staffing. For a while the team size is mostly fixed, so there is a straight line at maximum throughput in the center of the curve. Once you start phasing out resources, the earned value curve levels off until the project completes. Figure 7-24 shows a sample shallow S curve.

Figure 7-24 Sample shallow S curve

The Logistic Function

The shallow S curve of the planned earned value is a special case of the logistic function.a The generic form of the logistic function can take any S shape (S, mirror S, inverted S, ascending or descending), can span any range of values, and can even be asymmetric.

Every process that involves change can be modeled using a logistic function. For example, the temperature in a room rises and falls according to a logistic function, as does your body weight, the market share of a company, radioactive decay, the risk of burning your skin as a function of distance from a flame, statistical distributions, population growth, effectiveness of design, the intelligence of neural networks, and pretty much everything else. The logistic function is the single most important function known to mankind because it enables us to quantify and model the world—a world that is highly dynamic.

The standard logistic function is defined by this expression:

F(x) = 1 / (1 + e^(−x))

Figure 7-25 plots the standard logistic function. The standard logistic function approaches 0 and 1 asymptotically, crossing the y-axis at y = 0.5 when x = 0.

Figure 7-25 Standard logistic function
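A quick check of those properties, as a minimal sketch using only the standard library:

```python
import math

def logistic(x):
    """Standard logistic function: F(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

print(logistic(0))             # 0.5: the curve crosses the y-axis at its midpoint
print(round(logistic(6), 3))   # 0.998: approaching 1 asymptotically
print(round(logistic(-6), 3))  # 0.002: approaching 0 asymptotically
```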

Subsequent chapters will refer to the logistic function in a qualitative manner when putting it to good use for risk and complexity modeling.

a. https://en.wikipedia.org/wiki/Logistic_function

The earned value curve is a simple and easy way to answer the question: “Does the plan make sense?” If the planned earned value is a straight line, or it exhibits the issues of Figure 7-20 or Figure 7-22, the project is in danger. If it looks like a shallow S, then at least you have hope that the plan is sound and sensible.

Roles and Responsibilities

It is up to the architect to design both the system and the project to build that system. The architect is likely the only member of the team with the insight and perspective on the correct architecture, the limits of the technology, the dependencies between the activities, the design constraints of both the system and the project, and the relative resource skills. It is futile to expect management, project managers, product managers, or developers to design the project. All of them simply lack the insight, information, and training required to design a project. Furthermore, designing the project is not part of their job. However, the architect does need the input, insight, and perspective of the project manager on resource costs, availability scenarios, planning assumptions, priorities, feasibility, and even the politics involved, just as the product manager is essential in producing the architecture.

The architect designs the project as a continuous design effort following the system design. This process is identical to that used in every other engineering discipline: The design of the project is part of the engineering effort and is never left for the construction workers and foremen to figure out on-site or on the factory floor.

The architect is not responsible for managing and tracking the project. Instead, the project manager assigns the actual developers to the project and tracks their progress against the plan. When things change during execution, both the project manager and the architect need to close the loop together and redesign the project.

The realization that the architect should design the project is part of the maturity of the role of the architect. The demand for architects emerged in the late 1990s in response to the increased cost of ownership and complexity of software systems. Architects are now required to design systems that enable maintainability, reusability, extensibility, feasibility, scalability, throughput, availability, responsiveness, performance, and security. All of these are design attributes, and the way to address them is not via technology or keywords, but with correct design.

However, that list of design attributes is incomplete. This chapter started with the definition of success, and to succeed you must add to that list schedule, cost, and risk. These are design attributes as much as the others, and you provide them by designing the project.
