CHAPTER 4
Process and Politics of Business Forecasting

We all believe that business forecasting should be an objective, dispassionate, and scientific undertaking, and yet it is conducted within the often-politicized confines of an organization. Forecasters and decision makers can have biases and agendas that undermine forecast accuracy. With all the potential “game playing” in the process, you never know whom you can trust.

We begin this chapter with a look at forecast value added (FVA) analysis, an increasingly popular method for uncovering activities that fail to improve the forecasts. The unfortunate reality is that many common forecasting practices—even so-called best practices—may actually result in forecasts that are worse than doing nothing more than simply using the naïve no-change model. Steve Morlidge finds, for example, that half of the forecasts in his study failed to beat the naïve model forecasts.

Other articles in this chapter look at specific issues in forecast-process design: where to position the forecasting function, whether to hold face-to-face meetings, whether to include the sales force in the process, and how to set performance objectives. Various “worst practices” are identified, along with ways to provide a more trustworthy forecast that management will believe in and act on. We end this chapter with a look at the widely adopted process of sales and operations planning (S&OP)—how it can be applied in the retail industry, and its future direction.

4.1 FVA: A Reality Check on Forecasting Practices1

Michael Gilliland

Introduction

We all want our business practices to be effective, efficient, and certainly as waste-free as possible. No conscientious executive willingly squanders company resources on activities that have no benefit to customers or to the business’s own bottom line. So when it comes to the practice of business forecasting, how do we know whether we are performing up to these standards?

A traditional forecasting performance metric, such as the MAPE, tells us the magnitude of our forecast error but little else. Knowing the MAPE of our forecasts does not tell us how efficient we were at achieving this level of error or how low an error would be reasonable to achieve. Nor does it tell us how our methods and processes perform compared to simpler alternatives. This is where forecast value added (FVA) steps in.

FVA analysis turns attention away from the end result (forecast accuracy) to focus on the overall effectiveness of the forecasting process. Just as the FDA tests a new drug for safety and efficacy, FVA evaluates each step of the forecasting process to determine its net contribution. If the process step (such as a sophisticated statistical model or an analyst override) makes the forecast better, then it is “adding value” and FVA is positive. But if the effect of the step is inconclusive (we can’t discern whether it is improving the forecast) or if it is making the forecast worse, then we can rightly question whether this step should even exist.

This article presents the basic data requirements and calculations for FVA analysis, along with sample report formats. It also examines some implementations by industry practitioners.

Calculating Forecast Value Added

Suppose we have this simple forecasting process:

Sales History → Statistical Forecast → Review/Override → Final Forecast

In this common situation, historical sales information is read into forecasting software, where the history is modeled and the statistical forecast is generated. At that point, the forecast is reviewed and potentially adjusted, resulting in the “final forecast” that will be published and sent to downstream planning systems.

FVA is a measure of past performance. For each item being forecast, and for each time period in our history, we would need to gather:

  • The Statistical Forecast
  • The Final Forecast
  • The Actual Value (e.g., actual sales)

If we have 100 items and have been forecasting them for the past 52 weeks, we would have 5,200 records in our data file, with the variables:

Item | Week | Statistical Forecast | Final Forecast | Actual

FVA is defined as:

The change in a forecasting performance metric that can be attributed to a particular step or participant in the forecasting process.

In this simple example there are two process steps: the software’s generation of the statistical forecast and management’s override resulting in the final forecast. A more elaborate process may have additional steps, such as a consensus or collaboration step and an executive approval step.

In FVA analysis, there is also an implied initial step: generation of a naïve forecast. It is normal to use the random walk (no-change model) to generate the naïve forecast. This is easy to reconstruct from the historical data and can be added to our data file as a new variable:

NAÏVE_FCST(t) = Actual(t − 1)

We compute FVA by comparing the performance of sequential steps in the forecasting process. Here, we would compute performance of the naïve, statistical, and final forecasts, and determine whether there was “value added” by these successive steps.
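As an illustration, here is a minimal sketch of that calculation in Python (pandas). The column names (item, week, stat_fcst, final_fcst, actual) are assumptions for illustration; they are not taken from the original data file.

```python
import pandas as pd

def fva_stairstep(df: pd.DataFrame) -> pd.DataFrame:
    """MAPE of naive, statistical, and final forecasts, plus pairwise FVA."""
    df = df.sort_values(["item", "week"]).copy()
    # Naive (no-change) forecast: last period's actual, per item
    df["naive_fcst"] = df.groupby("item")["actual"].shift(1)
    df = df.dropna(subset=["naive_fcst"])    # first period has no naive forecast
    df = df[df["actual"] != 0]               # avoid dividing by zero in MAPE

    def mape(col):
        return ((df[col] - df["actual"]).abs() / df["actual"].abs()).mean() * 100

    m_naive, m_stat, m_final = mape("naive_fcst"), mape("stat_fcst"), mape("final_fcst")
    return pd.DataFrame({
        "Process Step": ["Naive Forecast", "Statistical Forecast", "Final Forecast"],
        "MAPE": [m_naive, m_stat, m_final],
        "FVA vs Naive": [None, m_naive - m_stat, m_naive - m_final],
        "FVA vs Statistical": [None, None, m_stat - m_final],
    })
```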

FVA doesn’t care which traditional metric you are using to evaluate performance (although some flavor of MAPE is most common in industry). Results can be reported in the “stairstep” format shown in Figure 4.1. Rows represent sequential process steps, the second column shows the MAPE (or whatever performance metric is being used) for each step, and the right columns show pairwise comparisons between steps.

PROCESS STEP MAPE FVA vs NAÏVE_FCST FVA vs STATFCST
NAÏVE FORECAST 50%
STATISTICAL FORECAST 40% 10%
FINAL FORECAST 42% 8% −2%

Note that a more elaborate process would have additional rows for the additional process steps and additional columns for the pairwise comparisons.

The stairstep report can be generated for each item being forecast, for item groupings, and for all items combined. Groupings are of interest when they have different demand patterns, use different forecasting processes, or are overseen by different forecast analysts. For example, a retailer might separate products with everyday low pricing from those with high–low (promotional) pricing to compare demand volatility, forecast accuracy, and FVA between the two groups.

Of course, over the thousands of items that may be forecast by a large organization, some of the observed FVA differences may be too small to be meaningful, or the observed difference may just be due to chance. One must be cautious in interpreting such a report and not jump to unwarranted conclusions or make rash process changes. Additional analysis can confirm that the observed difference is indeed “real” and not likely to be random.
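One way to perform such additional analysis, offered here as an illustrative sketch rather than a prescribed procedure, is a paired test on the per-record absolute percentage errors of two process steps:

```python
from scipy import stats

def fva_significance(df, step_a="stat_fcst", step_b="final_fcst"):
    """Paired Wilcoxon signed-rank test on absolute percentage errors.

    A small p-value suggests the difference between the two steps is
    unlikely to be due to chance alone.
    """
    ape_a = (df[step_a] - df["actual"]).abs() / df["actual"].abs()
    ape_b = (df[step_b] - df["actual"]).abs() / df["actual"].abs()
    return stats.wilcoxon(ape_a, ape_b).pvalue
```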

How Organizations Are Using FVA

Our objective is to generate forecasts that are as accurate and unbiased as we can reasonably expect (given the nature of what we are trying to forecast), and also to do this as efficiently as possible. We can’t completely control the level of accuracy achieved (since accuracy is ultimately limited by the forecastability of the behavior being forecast), but we can control the processes used and the resources we invest into forecasting.

S&OP thought leader Tom Wallace has called FVA “the lean-manufacturing approach applied to sales forecasting” (Wallace, 2011), and some organizations are using FVA in just this way: to identify process “waste.” Activities that are failing to improve the forecast can be considered wasteful and resources committed to performing them can be redirected to more productive activities.

Practitioners have extended the FVA concept with new ways of analysis and reporting or have otherwise used FVA results to modify the way they do forecasting.

Newell Rubbermaid

Schubert and Rickard (2011) reported an analysis that found a positive 5% FVA in going from the naïve to the statistical forecast but a negative 2% FVA for judgmental overrides of the statistical forecasts. Realizing a limitation of the basic stairstep report—that important information may be buried in the “average FVA” reported for a group of items—they utilized histograms as in Figure 4.2 to show the distribution of FVA values across a product group.

Figure 4.1 Forecast Value Added “Stairstep” Report

Figure 4.2 Statistical Forecast Value Added

Even though the statistical forecast was (on average) five percentage points more accurate than the naïve, for many items the statistical forecast did considerably worse, and these merited further attention. Likewise, the (not uncommon) finding that, on average, management overrides made the forecast worse provided opportunity for additional investigation and process tuning.

Tempur-Pedic

Eric Wilson (2008) oversaw a collaborative forecasting process wherein a baseline statistical forecast was manually updated with market intelligence, resulting in the final forecast. He used FVA analysis for visibility into the process and to identify areas for improvement.

With FVA, Wilson realized the best way to leverage the knowledge of salespeople was to appeal to their competitive nature. Instead of forcing them to adjust all statistical forecasts, he instead challenged them to “beat the nerd in the corner” by adding value to the nerd’s computer-generated forecasts. This reduced frivolous forecast adjustments that were being made simply because of the requirement to make changes.

Amway

Mark Hahn (2011) used FVA in conjunction with analysis of forecastability to better understand and communicate what “good” performance is and what is realistic to achieve. He utilized monthly product-level reporting to determine the value added by analyst inputs and also identified instances where the statistical forecast was underperforming the naïve model.

Cisco

Fisher and Sanver (2011) reported on Cisco’s use of FVA for accountability of the forecasting process and the people executing it. FVA was considered a simple and important metric for judging performance and appeared on the dashboards of Cisco’s senior management. It showed the team where to put resources and where a naïve forecast suffices.

Which Naïve Model to Use?

In forecasting literature, the classic naïve model is the random walk or no-change model—our forecast for next period (and all future periods) is what we observed last period. In FVA analysis, the random walk can be the point of comparison for our forecasting process, the placebo against which we measure our process effectiveness.

The spirit of a naïve model is that it be something easily computed, with minimal effort and data manipulation, thus generating a forecast at virtually no cost, without requiring expensive computers, software, or staffing. If our system and process cannot forecast any better than the naïve model on average, then why bother? Why not just stop doing what we are doing and use the naïve forecast?

The random walk may not be a suitable “default” model to use in situations where the existing forecasting process is not adding value. Suppose that forecasts are changing radically with each new period of actual values, producing instability in an organization’s planning processes. For example, if a year ends with a strong week of 1,000 units sold, the naïve model would forecast 1,000 units per week throughout the next year, and the supply chain would have to gear up. However, if we only sell 100 units in week one of the new year, we would change our forecast to 100 units per week for the rest of the year, and gear the supply chain back down. This up-and-down swing could recur with each new period of actuals.

Supply-chain planners could not operate in an environment of such volatile forecasts and would end up tempering their actions around some “average” value they expect for the year. So rather than defaulting to a random walk when the forecasting process is not adding value, it may be better to default to another simple model which mitigates such up-and-down volatility (such as a moving average, seasonal random walk, or simple exponential smoothing). Just make sure this default model is performing better than the existing process and, hopefully, better than the random walk!

As a practical consideration, the default model should be included in the FVA stairstep report so its performance can be monitored. Even in the unlikely event that it performs somewhat worse than a random walk, it still has the advantage of providing stability to the downstream planning processes, as long as the shortfall is not substantial.
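For concreteness, here is a minimal sketch of two such stabler defaults; the window length and season length are illustrative assumptions, not values prescribed in this article.

```python
def moving_average_forecast(history, window=4):
    """Forecast the next period as the mean of the last `window` actuals."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def seasonal_naive_forecast(history, season_length=52):
    """Forecast the next period as the actual from one season ago."""
    if len(history) >= season_length:
        return history[-season_length]
    return history[-1]   # fall back to the ordinary naive forecast
```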

A Reality Check on Forecasting Practices

“Forecasting is a huge waste of management time.”

We’ve heard this before—especially from management—but it doesn’t mean that forecasting is pointless and irrelevant. It doesn’t mean that forecasting isn’t useful or necessary to run our organizations. And it doesn’t mean that executives shouldn’t care about their forecasting issues or seek ways to improve them. It simply means that the amount of time, money, and human effort spent on forecasting is not commensurate with the amount of benefit achieved (the improvement in accuracy).

We spend far too much in organizational resources creating our forecasts, while almost invariably failing to achieve the level of accuracy desired. Instead of employing costly and heroic efforts to extract every last bit of accuracy possible, FVA analysis seeks to achieve a level of accuracy that is as good as we can reasonably expect, and to do so as efficiently as possible. FVA allows an organization to reduce the resources spent on forecasting and potentially achieve better forecasts—by eliminating process activities that are just making the forecast worse.

Remember: FVA analysis may not make you the best forecaster you can be—but it will help you to avoid becoming the worst forecaster you might be!

REFERENCES

  1. Fisher, A., and M. Sanver (2011). Large-scale statistical forecasting for Cisco’s high-tech demand patterns. INFORMS Conference on Business Analytics and Operations Research (April).
  2. Hahn, M. (2011). Lean forecasting: How to get more bang for your forecasting buck. Best of the Best S&OP Conference (June).
  3. Schubert, S., and R. Rickard (2011). Using forecast value added analysis for data-driven forecasting improvement. IBF Best Practices Conference (May).
  4. Wallace, T. (2011). Forecasting: It’s getting better. BONEZONE (June).
  5. Wilson, J. E. (2008). How to speak sales. IBF Supply Chain Forecasting Conference (February).

4.2 Where Should the Forecasting Function Reside?2

Larry Lapide

The Fall 2002 issue of the Journal of Business Forecasting was a special issue on benchmarking, covering a variety of benchmark data collected by the Institute of Business Forecasting (IBF) in a survey it conducted. One of the most interesting data sets dealt with where respondents said their forecasting function resides. Across all industries polled, the percentage of companies where the forecasting function resides in each department was:

Operations/Production 20%
Marketing 20%
Finance 14%
Sales 12%
Forecasting 10%
Logistics 9%
Strategic Planning 6%
Other 9%

While the data are certainly interesting, they are not insightful in helping a company determine where it should put its forecasting function. Essentially, they say: “Take your pick!” This type of inconclusive benchmarking result is the reason why I often talk about what is important in deciding in which department a forecasting group should reside.

On the face of it, my usual advice does not appear to be much help, because my view is that it depends on a variety of factors and that, generally speaking, there is no one right answer for a company. The right answer, to me, is to put the forecasting function inside a department that will diligently execute an effective forecasting process in the way it needs to be conducted to ensure the best output possible—namely, the most accurate consensus forecast that can be developed, and one that is used as the basis for all operational planning activities.

Executing an Effective Forecasting Process

Executing an effective operational forecasting process means setting up and adhering to a set of activities that enables the following to occur:

  • A finalized demand forecast that incorporates a balanced mix of quantitative and qualitative data. The process should start with the development of a baseline forecast that is based on objective information, often developed using statistical forecasting methods to blend historical information with known factors about the future. The baseline forecast should then be adjusted to incorporate market intelligence.
  • All stakeholder departments (such as Marketing, Sales, Operations, and Finance) provide the market intelligence that is used to adjust the baseline forecast to account for factors not incorporated into it.
  • A consensus forecast is developed that all stakeholder departments agree to and are accountable for in their respective ways. This further means that the consensus demand forecast is used as the basis for every department’s operational planning—enabling a single-number planning best practice.

As long as a department can successfully execute a forecasting process and enables the above to occur, it qualifies as a good place to put the forecasting function. Of course, if the forecasting function can accomplish this without being in any department, that is another option.

Evaluation Criteria

Not every department has all the qualifications for successfully conducting an effective demand planning process. There are evaluation criteria that can be used to assess whether a particular department fits the bill, as follows:

  • Objectivity: A department needs to be objective to develop a good operational forecast. This manifests itself in being able to produce a set of forecasts that are based on facts and sound judgment, even if the department is heavily impacted by the forecasts generated. In addition, objectivity also derives from a department being a stakeholder that is impacted by customer demand. It also means a department needs to be open to valuable input from all stakeholder departments.
  • Business Understanding: It is extremely important for a forecasting group to understand the nature of a business, especially in terms of what drives the dynamic nature of customer demand. Various drivers of demand need to be incorporated to develop a good forecast, so possession of this type of knowledge is essential to gathering inputs and synthesizing them to produce a good forecast.
  • Quantitative Skills: The reality of forecasting is that it is quantitative in nature. A demand forecast may have to be produced for tens of thousands or millions of items, so the sheer scale of it means that computer skills are necessary. This requires a department to be somewhat “left-brained” and have an appreciation for quantitative and computer skills, in order to best leverage the capabilities of quantitative analysts.
  • Organization Skills: One of the most critical criteria is the organization skills of the department. To run an effective forecasting process requires discipline and adherence to a process. This includes preparing for, leading, and doing follow-up for ongoing meetings, such as sales and operations planning (S&OP) meetings. It also includes publishing the demand forecast on time and ensuring that everyone has easy access to it for their own planning purposes.

Selecting a department in which to put the forecast function requires using the above criteria to assess which one is the best. Not all departments in a company meet the criteria, and which department is best often varies by industry.

The Pros and Cons of Departments

In distribution-intensive industries, such as Consumer Products and Pharmaceuticals, there is a tendency to place the forecasting function in the Marketing department, since it may best understand future customer demand. However, in some companies a variety of reasons may run counter to this placement of the forecasting function, including that their Marketing departments may not be objective enough or may lack the right quantitative skills to do the job.

Generally, here are some of the pro and con arguments for each department, as well as for creating a standalone forecasting department:

  • Standalone Forecasting Department: The biggest reason for having a standalone forecasting department is that it can have a dispassionate, objective view of the company, as well as the time and inclination to organizationally run a very effective forecasting process. In addition, it can be easily staffed with the appropriate quantitative skills needed. However, there is a significant downside to establishing a standalone group. As a non-stakeholder group, it is not responsible for any operational processes that impact demand and therefore does not have to take on any accountability for achieving the demand forecast. If not carefully managed, this type of standalone group might become very efficient at developing and publishing forecasts yet never develop a true understanding of the business and customers. This can lead to a tendency to develop forecasts in a vacuum, without sufficient input from stakeholder organizations.
  • Marketing Department: The biggest reason for putting the forecasting function in the Marketing department is that it has a very good understanding of future customer demands. It may or may not be objective enough depending on whether its performance goals are based on actual customer demand. There may be reluctance by the Marketing department to change to an operational forecast that is not aligned with its performance goals. In addition, this typically right-brained organization may lack the quantitative skills to do statistical forecasting and the computer skills to run the requisite software.
  • Production, Operations, or Logistics Department: The biggest negative in putting the forecasting function into the Operations or Production department is that these organizations often do not get enough contact with customers to truly understand future demand. The Logistics department does have more contact with customers, but not with regard to their future needs. On the positive side, all these three left-brained organizations possess the quantitative skills to do statistical forecasting and the computer skills needed to run the requisite software. They are also disciplined enough to execute an effective forecasting process and are objective, because their operations and costs are highly dependent on future demand.
  • Sales Department: Since the Sales department has the most contact with customers, this is a big plus in any argument assessing whether to put the forecasting function into it, since salespeople should understand future customer needs the best. However, on the other evaluation criteria the Sales department often falls short. Regarding objectivity, since sales reps are commissioned based on what they sell, they are usually reluctant to let operational forecasts differ from their sales goals—so they often can’t be objective. A Sales department may also lack the required quantitative skills and is often not interested in running a routine, structured forecasting process.
  • Finance Department: Similar to the standalone Forecasting department, the best reason for putting the forecasting function into Finance is that this left-brained organization has the required quantitative skills and organizational discipline. The biggest negatives are that Finance has no direct contact with customers and customer demand does not directly impact its operations. Therefore, Finance usually understands customers the least among all other departments and is not held accountable for achieving the demand forecast. A concern relative to objectivity is that Finance has an innate reluctance to change the operational forecasts that differ from the revenue forecasts incorporated into the financial budgets.
  • Strategic Planning Department: Since the Strategic Planning department deals with long-term planning issues, it does not make sense in most companies to have the forecasting function reside in it. Strategic Planning is more often about planning years in advance, in terms of developing strategic capital and resource plans, and is not really focused on forecasting to support tactical and operational activities. If the forecasting function is put into a Strategic Planning department, there will be a tendency for strategic revenue plans to become the operational forecasts, rather than the forecasts representing a true objective view of what will happen in the short and intermediate term. On the plus side, a Strategic Planning group usually has the quantitative and organizational skills needed to drive an effective forecasting process. However, on the negative side, it usually does not understand shorter-term customer needs.

Conclusion

Table 4.1 summarizes the above pros and cons of putting the operational demand forecasting function into each department. As can be noted from the arguments made, there is no one department that is a clear-cut choice. That is, where the forecasting function should reside is highly dependent on a company and the industry it is in.

Table 4.1 Summary of Pros and Cons of Putting the Forecasting Function in Each Type of Department

Department Objectivity Business Understanding Quantitative Skills Organizational Skills
Standalone Forecasting Objective, but not impacted by demand No direct contact with customers High level High level of discipline
Marketing Objective, but some bias from performance goals Very good understanding of future customer needs Low level Moderate level of discipline
Production, Operations, and Logistics Objective and impacted by demand Little direct contact with customers High level High level of discipline
Sales Bias from sales goals and commissions Highest level of contact with customers Low level Less interest in running structured, routine processes
Finance Objective, but some bias from budgeting and not impacted by demand No direct contact with customers High level High level of discipline
Strategic Planning Objective, but not impacted by demand and view is too long-term No direct contact with customers High level High level of discipline

So what is the bottom line on where the forecasting function should reside in your company? As the benchmarking data shows, it depends. A good way to determine where the forecasting function should reside is to evaluate departmental competencies using the pros and cons arguments I’ve provided. In the final analysis, however, wherever your company decides to put it, make sure that the department has most of the characteristics needed to run an effective forecasting process—and more importantly, really wants to do it right!

4.3 Setting Forecasting Performance Objectives3

Michael Gilliland

Setting forecasting performance objectives is one way for management to shine . . . or to demonstrate an abysmal lack of understanding of the forecasting problem. Inappropriate performance objectives can provide undue rewards (if they are too easy to achieve), or can serve to demoralize employees and encourage them to cheat (when they are too difficult or impossible). For example:

  • Suppose you have the peculiar job of forecasting Heads or Tails in the daily toss of a fair coin. While you sometimes get on a hot streak and forecast correctly for a few days in a row, you also hit cold streaks, where you are wrong on several consecutive days. But overall, over the course of a long career, you forecast correctly just about 50% of the time.
  • If your manager had been satisfied with 40% forecast accuracy, then you would have enjoyed many years of excellent bonuses for doing nothing. Because of the nature of the process—the tossing of a fair coin—it took no skill to achieve 50% accuracy. (By one definition, if doing something requires “skill,” then you can purposely do poorly at it. Since you could not purposely call the toss of a fair coin correctly only 40% of the time, performance is not due to skill but to luck. See Mauboussin (2012) for a more thorough discussion of skill vs. luck.)
  • If you get a new manager who sets your goal at 60% accuracy, then you either need to find a new job or figure out how to cheat. Because again, by the nature of the process of tossing a fair coin, your long term forecasting performance can be nothing other than 50%. Achieving 60% accuracy is impossible.

So how do you set objectives that are appropriate for forecasting performance?

Five Steps for Setting Forecasting Performance Objectives

  1. Ignore industry benchmarks, past performance, arbitrary objectives, and what management “needs” your accuracy to be.

    Published benchmarks of industry forecasting performance are not relevant. This is addressed in Gilliland (2005) and more extensively by Kolassa (2008).

    Previous forecasting performance may be interesting to know, but it is not relevant to setting next year’s objectives. We have no guarantee that next year’s data will be equally forecastable. For example, what if a retailer switches a product from everyday low pricing (which generated stable demand) to high–low pricing (where alternating on and off promotion will generate highly volatile demand)? You cannot expect to forecast the volatile demand as accurately as the stable demand.

    And of course, arbitrary objectives (like “All MAPEs < 20%”) or what management “feels it needs” to run a profitable business are inappropriate.

  2. Consider forecastability . . . but realize you don’t know what it will be next year.

    Forecast accuracy objectives should be set based on the “forecastability” of what you are trying to forecast. If something has smooth and stable behavior, then we ought to be able to forecast it quite accurately. If it has wild, volatile, erratic behavior, then we can’t have such lofty accuracy expectations.

    While it is easy to look back on history and see which patterns were more or less forecastable, we don’t have that knowledge of the future. We don’t know, in advance, whether product X or product Y will prove to be more forecastable, so we can’t set specific accuracy targets for them.

  3. Do no worse than the naïve model.

    Every forecaster should be required to take the oath, “First, do no harm.” Doing harm is doing something that makes the results worse than doing nothing. And in forecasting, doing nothing is utilizing the naïve model (i.e., random walk, aka no-change model) where your forecast of the future is your most recent “actual” value. (So if you sold 50 last week, your forecast for future weeks is 50. If you actually sell 60 this week, your forecast for future weeks becomes 60, etc.)

    You don’t need fancy systems or people or processes to generate a naïve forecast—it is essentially free. So the most basic (albeit pathetic) minimum performance requirement for any forecaster is to do no worse than the naïve forecast.

  4. Irritate management by not committing to specific numerical forecast accuracy objectives.

    It is generally agreed that a forecasting process should do no worse than the naïve model. Yet in real life, perhaps half of business forecasts fail to achieve this embarrassingly low threshold (Morlidge, 2014). Since we do not yet know how well the naïve model will forecast next year, we cannot set a specific numerical accuracy objective. So next year’s objective can only be “Do no worse than the naïve model.”

    If you are a forecaster, it can be reckless and career threatening to commit to a more specific objective.

  5. Track performance over time.

    Once we are into the new year and the “actuals” start rolling in each period, we can compare our forecasting performance to the performance of the naïve model. Of course, you cannot jump to any conclusions with just a few periods of data. But over time you may be able to discern whether you, or the naïve model, is performing better.

Always start your analysis with the null hypothesis:

H0: The accuracy of my forecasting process is equal to the accuracy of the naïve model.

Until there is sufficient data to reject H0, you cannot claim to be doing better (or worse) than the naïve model.
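As an illustration of one reasonable (though not the only) way to test H0, a sign test counts the periods in which your forecast beat the naïve forecast:

```python
from scipy import stats

def beats_naive_pvalue(process_abs_errors, naive_abs_errors):
    """Sign test: under H0, beating the naive model is no better than a coin flip."""
    wins = sum(p < n for p, n in zip(process_abs_errors, naive_abs_errors))
    trials = len(process_abs_errors)
    return stats.binomtest(wins, trials, p=0.5, alternative="greater").pvalue
```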

REFERENCES

  1. Gilliland, M. (2005) Danger, danger: The perils of operational performance benchmarks. APICS e-News 5(23) (December 6).
  2. Kolassa, S. (2008). Can we obtain valid benchmarks from published surveys of forecast accuracy? Foresight: International Journal of Applied Forecasting 20 (Fall).
  3. Mauboussin, M. J. (2012). The Success Equation: Untangling Skill and Luck in Business, Sports, and Investing. Boston: Harvard Business Review Press.
  4. Morlidge, S. (2014). Using relative error metrics to improve forecast quality in the supply chain. Foresight: International Journal of Applied Forecasting 34 (Summer), 39–46.

4.4 Using Relative Error Metrics to Improve Forecast Quality in the Supply Chain4

Steve Morlidge

Introduction

“This is too wishy-washy. You will have to do something about this.”

This was one among the many comments made by Foresight editors on receipt of my last article (Morlidge, 2014b). In it, I had detailed the results of a survey of nine sets of supply-chain forecasts drawn from eight businesses, comprising over 300,000 data points in total. I measured the performance of all these forecasts using a relative absolute error (RAE) metric, where actual forecast error is compared to the simple “same as last period” naïve forecast error.

My purpose was to assess forecast quality in the supply chain by determining practical upper and lower bounds of forecast error—the lower bound representing the best accuracy that can be expected, the upper bound the worst that should be tolerated. My results—printed in the Spring 2014 issue of Foresight—showed that there were very few forecasts that had forecast errors more than 50% better than the naïve forecasts. Thus, for practical purposes, the lower bound of forecast error for the granular supply-chain data is an RAE of 0.5.

But also, and somewhat shockingly, I found that approximately 50% of the forecast errors were worse than those from the naïve forecasts, with an RAE > 1.0, the logical upper bound of forecast error. This is not a healthy situation: In principle, it should be easy to beat the naïve forecast. Failure to do so means that the forecast process is adding no value to the business. It also raises a couple of key questions: “What is causing this?” and, “What can be done about it?”

This was the issue that frustrated Foresight editors, and quite rightly so. Improving the craft of forecast measurement is laudable, but if nothing can be done with the results, then we have won no more than a Pyrrhic victory. No approach to measuring the quality of forecasts can, in itself, improve accuracy; turning measurement into improvement is a challenge for any measurement scheme, not just RAE.

Therefore, in this current article, I will offer specifics on how to use the forecast-quality metric (RAE) in conjunction with product volumes to target efforts to improve forecast quality in the supply chain.

Before starting out on this quest, let me reprise some relevant points from my previous articles and explain their relevance to the task of forecasting in the supply chain.

Background

My motivation has been to discover the upper and lower bound—the worst and best levels—of forecast error and, in the process, produce a metric that can be used to make objective judgments about forecast quality.

The Upper Bound

The upper bound is easy to establish: There is no good reason why any set of forecasts should have larger errors on average than forecasts produced by the most primitive method conceivable—a naïve forecast that uses the prior period’s actual as the forecast. This upper bound provides a benchmark against which forecast performance can be compared. A relative absolute error (RAE) of below 1.0 means that the average level of absolute errors from a forecast is lower than that of the naïve forecast; above 1.0 means that it is worse. But for practitioners working in the supply chain, the naïve forecast is more than a convenient benchmark.
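For reference, a minimal sketch of the RAE calculation as defined here; whether errors are pooled per item or across a whole portfolio is an implementation choice:

```python
def rae(actuals, forecasts):
    """Mean absolute error of the forecasts relative to the naive (no-change) forecast."""
    # The naive forecast for period t is the actual from period t-1,
    # so the comparison starts at the second period.
    fcst_errors = [abs(a - f) for a, f in zip(actuals[1:], forecasts[1:])]
    naive_errors = [abs(a - prev) for a, prev in zip(actuals[1:], actuals[:-1])]
    return sum(fcst_errors) / sum(naive_errors)
```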

Forecasting demand, and replenishing stock based on the demand forecast, is only economically worthwhile if it is possible to improve on the simple strategy of holding a fixed buffer (safety stock) and replenishing it to make good any withdrawals in the period. This simplistic replenishment strategy is arithmetically equivalent to using a naïve forecast (assuming no stockouts), since the naïve forecast is one of no change from our current level.

Safety Stock and Forecasting Value

The safety stock needed to meet a given service level is determined by our forecast errors. If the RAE of our forecasts is 1.0, yielding the same error on average as a naïve forecast, the buffer set by the naïve errors is appropriate. If our forecast has an RAE below 1.0, however, it means that the business needs to hold less stock than that indicated by the naïve. This is how forecasting adds value to a supply chain: The greater the level of absolute errors below those of the naïve forecast, the less stock is needed and the more value is added. Put simply, forecasting is not an end in itself; it is a means to an end, the end being a more efficient way of managing inventory (Boylan and Syntetos, 2006).
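To make the link to inventory concrete, here is an illustrative sketch using the textbook safety-stock approximation (safety stock = z × standard deviation of forecast error × √lead time). This formulation is an assumption for illustration, not one given in the article; the point is simply that if forecast errors shrink relative to the naïve errors, the required buffer shrinks roughly in proportion.

```python
from math import sqrt
from statistics import NormalDist

def safety_stock(error_std, lead_time_periods, service_level=0.95):
    """Buffer stock needed for a target service level, given forecast-error dispersion."""
    z = NormalDist().inv_cdf(service_level)
    return z * error_std * sqrt(lead_time_periods)

# If the forecast roughly halves error dispersion versus the naive forecast
# (RAE around 0.5), the buffer needed is roughly half as large.
buffer_with_naive = safety_stock(error_std=100, lead_time_periods=4)
buffer_with_forecast = safety_stock(error_std=50, lead_time_periods=4)
```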

In order to assess the potential of a forecast to add more value (how much improvement it is possible to make), we need to be able to identify the lower bound of forecast error.

The Lower Bound

My first article in this series on forecastability included a demonstration of how the lower bound of error could be determined theoretically (Morlidge, 2013). It showed that the lower bound of forecast error is a product of

  1. The level of random noise in a data series compared to the change in the signal, and
  2. The volatility of the change in a signal. In the case of a signal with no trend, the theoretical lower bound of error was close to 30% below the naïve forecast, irrespective of the level of noise: i.e., an RAE of 0.7.

Trends, seasonal movements, and other systematic changes in the signal could theoretically lower (improve) the RAE further, but it was my speculation that the more changeable the signal is, the more difficult it is to forecast. In practical terms, I argued that it would be difficult for any forecast to better an RAE of 0.5, a hypothesis that was supported by my empirical work on supply-chain forecasts (Morlidge, 2014b).

The Practical Challenge

If 0.5 is accepted as a practical lower bound, then error in excess of an RAE of 0.5 is avoidable, while error below an RAE of 0.5 is unachievable and hence unavoidable. In principle, then, supply-chain forecasters should seek to drive RAE down as close to 0.5 as possible. However, they need to be mindful of the likelihood of increased difficulty of making incremental improvements the closer they get to the lower bound. Moreover, the value that forecasting generates for the business is related to the absolute amount of avoidable error, which is determined mainly by the product volume to be forecast. Hence analysts should be guided by the RAE weighted by volume, which is more meaningful as a measure of forecast performance than the unweighted average RAE.
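A volume-weighted RAE can be computed as a simple weighted average; the exact weighting scheme shown here is an assumption for illustration:

```python
def weighted_rae(item_raes, item_volumes):
    """Average RAE across items, weighted by each item's volume."""
    return sum(r * v for r, v in zip(item_raes, item_volumes)) / sum(item_volumes)
```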

With the requirement to forecast hundreds and often thousands of items by week or month, the practical challenges that supply-chain forecasters face are formidable. Some of these items can be volatile or intermittent, and may be affected by marketplace activity. In these situations, standard time-series methods cannot be used without adjustments and embellishments. Judgmental adjustments to statistical forecasts are therefore common (Goodwin and Fildes, 2007), and these are frequently based on input from people who are not forecasting experts. Worse, they may be motivated by “silo” concerns and pure self-interest (for example, submitting forecasts that are below target to ensure meeting a quota). Finally, forecasting software typically offers a bewildering array of methods and parameters and “black-box” automatic algorithm selection processes that (as demonstrated by other research) cannot always be relied on to produce acceptable results, even in controlled conditions (Morlidge, 2014a).

Given the nature of these challenges, any approach to improving the quality of supply-chain forecasts must help practitioners:

  1. Focus on those areas where the effort / reward ratio is most favorable;
  2. Devise approaches that help identify the likely cause of problems and tailor strategies to solve them; and
  3. Set realistic goals mindful of 1 and 2 above.

Focus the Efforts

Portfolio classification methods, such as “ABC,” have been used extensively in inventory management as a way of helping practitioners develop differentiated approaches to the management of a portfolio, and to focus their efforts in those areas where they will be best rewarded (Syntetos and colleagues, 2011).

One obvious way in which this approach could be applied to the challenge of forecast improvement is in helping practitioners target their efforts on those items with, at once, the poorest forecast performance (as measured by RAE weighted by volumes) and largest volumes.

This task will be easier if: (1) a large proportion of the opportunity (total amount of avoidable error in excess of 0.5 RAE) is concentrated in a small proportion of the product portfolio (true for our supply-chain data: approximately 20% of items contributed 80% of the avoidable error); and (2) forecast quality (RAE) is not strongly correlated with volume, as such a correlation might suggest that small-volume items are more difficult to forecast. In practice, we found this was not often the case, as large-volume products often did not have significantly lower RAE than low-volume products.

The first condition is the most important. A significant proportion of the opportunity (total amount of avoidable error) is typically concentrated in a small proportion of the product portfolio. For example, consider my previously used data comprising 11,000 items forecast in monthly buckets over a two-year period. Figure 4.3 plots these 11,000 items (each represented by a dot) on a chart where the y-axis shows the average volume and the x-axis marks forecast quality (RAE). (The volume axis uses a logarithmic scale so that the wide range of values can be displayed clearly, and so that any correlation between RAE and volume would be very obvious.) It is clear that no significant correlation exists in this case.

Figure 4.3 RAE vs. Volume

The histogram below the chart, Figure 4.4, shows a large number of items with RAE in excess of 1.0 (about 40%), all of which could be avoided by using the naïve forecast (although in practice this should be the last resort), and very few with RAE below 0.5.

Figure 4.4 Distribution of RAE

I have drawn separators in Figure 4.3 to distinguish four quadrants. This shows that 77% of the avoidable error (opportunity) comes from items associated with the high RAEs of 0.85 or above.

Accuracy improvement here should be relatively easy to achieve. Further, 80% of avoidable error is with the largest (by volume) 10% of the items. As a result, the “High Volume/High RAE” quadrant holds only 6% of the items but accounts for 62% of the opportunity, giving a very favorable effort/reward ratio. In this way, the focus of work to improve forecasting can be directed to those items where the greatest opportunities lie.
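A sketch of this quadrant classification follows. The volume cut-off (largest 10% of items) and the RAE threshold (0.85) echo the figures quoted above; approximating each item's avoidable error as its error in excess of 0.5 times its naïve error is an assumption based on the stated practical lower bound.

```python
import pandas as pd

def classify_quadrants(items: pd.DataFrame, rae_threshold=0.85, volume_pct=0.90) -> pd.DataFrame:
    """Expects one row per item with columns: volume, mae, naive_mae."""
    df = items.copy()
    df["rae"] = df["mae"] / df["naive_mae"]
    df["avoidable_error"] = (df["mae"] - 0.5 * df["naive_mae"]).clip(lower=0)
    high_volume = df["volume"] >= df["volume"].quantile(volume_pct)
    high_rae = df["rae"] >= rae_threshold
    df["quadrant"] = (high_volume.map({True: "High Volume", False: "Low Volume"})
                      + "/" + high_rae.map({True: "High RAE", False: "Low RAE"}))
    return df

# Share of the total avoidable error contributed by each quadrant:
# classified = classify_quadrants(items)
# classified.groupby("quadrant")["avoidable_error"].sum() / classified["avoidable_error"].sum()
```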

This leads to the next question: How do we identify the best approach for exploiting these opportunities?

Devise Improvement Strategies

There are two ways that forecast quality can be improved:

  1. Choosing better forecasting methods; and
  2. Making better judgmental adjustments.

This is a truism that applies to all items, but the trick is to match the improvement strategy with the right part of the portfolio.

The approach outlined here involves isolating those parts of the portfolio where, in principle, judgment can make a significant contribution to forecast quality, and then taking steps to ensure that such judgment is used judiciously. Outside this zone, the use of judgmental adjustments should be restricted; instead, effort must be focused on optimizing forecasting methods.

Figure 4.5 plots all the items in our sample portfolio on a second grid, which will help us select the most appropriate strategy to employ. This matrix is similar to the so-called ABC/XYZ approach used in the supply chain to help select the most appropriate replenishment and inventory policies.

Figure 4.5 Volume vs. Volatility (CoV) of Forecast Items (Color codes distinguish forecast quality (RAE))

As with the first classification grid, the y-axis represents volume and the horizontal line segregates the 20% of items that account for 80% of the avoidable error. However, here the x-axis records the coefficient of variation (CoV) of demand, which measures the volatility of the demand pattern. (I have calculated the CoV as the ratio of the mean absolute deviation—rather than standard deviation—to the arithmetic mean, a calculation that mitigates the impact of more extreme observations.)
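The CoV calculation described in the parenthetical above amounts to:

```python
def cov(demand):
    """Coefficient of variation: mean absolute deviation divided by the arithmetic mean."""
    mean = sum(demand) / len(demand)
    mad = sum(abs(d - mean) for d in demand) / len(demand)
    return mad / mean
```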

This approach is based on the reasonable assumption that, all things being equal, the less volatile the demand pattern (the lower the CoV), the easier it will be for forecasting methods to successfully pick up and forecast the signal in the data.

With lower CoVs, there is less chance that judgmental intervention will improve forecast quality. On the other hand, higher CoVs are more likely to be associated with data series heavily affected by sporadic events where relatively large judgmental interventions may be needed to improve statistical forecasts (Goodwin and Fildes, 2007).

The items are color-coded based on their RAE:

RAE > 1.0 = red
RAE 0.85 to 1.0 = amber
RAE 0.7 to 0.85 = green
RAE < 0.7 = blue
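These bands translate directly into a small helper, offered here as an illustrative sketch:

```python
def rae_band(rae):
    """Map an item's RAE to the color code used in Figure 4.5."""
    if rae > 1.0:
        return "red"
    if rae >= 0.85:
        return "amber"
    if rae >= 0.7:
        return "green"
    return "blue"
```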

A cursory visual inspection of the chart suggests that there is considerable scope for improvement, based on the widespread scattering of red items. To maximize the opportunities for meaningful improvement, we must proceed in a structured, stepwise manner. This is my approach:

Priority 1: High-Volume/High-RAE Items

This is the part of the portfolio where the effort/reward ratio is most favorable, in that 6% of the items contribute 62% of the avoidable error. In Figure 4.5, these are (a) above the line and (b) coded with amber or red dots.

Some of these items are in Zone 1, where the CoV is relatively low. For these the strategy should be to focus on refining the forecasting method (how data are cleansed, models selected, forecasts tracked), allowing judgmental adjustments to statistical forecasts only where the case for making a change is overwhelmingly favorable and the impact is likely to be significant (Goodwin and Fildes, 2007).

Zone 2 contains those items with a more volatile data pattern. Optimizing the forecasting method here is more difficult given the volatile nature of the data series and impact of one-off events. The focus in this zone should be on the effective use of judgment. The exception to this may be items with a well-defined seasonal pattern, which could be forecast statistically without manual intervention despite having a high CoV.

Zone 2 is the part of the portfolio where consensus forecasting techniques (statistical plus judgmental) are likely to add most value. That these items encompass a small proportion of the total number of items means that valuable management time can be focused very effectively. The success of these interventions can be quantified by measuring RAE before and after the consensus process, and using the forecast value added concept for the comparison (Gilliland, 2013). Since poor judgment is often manifest in consistent over- or underforecasting, managers should continuously monitor for bias.

Priority 2: High-Volume/Low-RAE Items

This second most interesting part of the portfolio comprised an additional 18% of the avoidable error. These items lie above the line and are color-coded green or blue. For the green items, I’d recommend the same approach followed for Priority 1; that is, improving the statistical forecasts while discouraging the application of judgment except for those items with a high CoV. Of course, it would not be worthwhile to work on the blue items, since they already have the very lowest RAE (lower than 0.7).

Priority 3: Low-Volume Items

In our sample, Zones 3 and 4 of the portfolio contain 90% of the items but only 20% of the avoidable error. Irrespective of the level of variation in the data series, they are unlikely to reward any efforts involved in a consensus forecasting process.

Instead, the focus should be using a very simple and conservative forecasting method, such as simple exponential smoothing (SES). The intermittent-demand items, which are most likely to be in Zone 3, should be forecast using SES or a variant of Croston’s method (Syntetos and colleagues, 2011). In some cases, where a data series approximates a random walk, the naïve model itself may be the best we can do. Perhaps these are not worth forecasting at all, using instead simple replenishment strategies or make-to-order (Boylan and Syntetos, 2006).

Setting Realistic Targets

Because the portfolio analysis is an exercise that will be carried out only periodically, it will be necessary to continuously track forecast quality (Hoover, 2009) to check that the hoped-for results are delivered and to identify when performance levels start to drop, necessitating another review. The key question, however, is, “What level of performance should we be aiming to achieve?” Clearly an RAE above 1.0 always flags a problem, and should be investigated (particularly if it is associated with a high-volume item), but what target should we be shooting for?

In a previous issue of Foresight, Sean Schubert suggests an approach based on the forecastability DNA of a product (Schubert, 2012), which takes account of factors other than the naïve forecast error. Here I propose adopting a similar approach by taking into account the volatility of the data series.

We have established that an RAE of 0.5 represents a practical lower limit of error in most cases. It would not be productive to adopt 0.5 as a target for small-volume items since the effort involved here probably could not be justified. For larger-volume items, Paul Goodwin has suggested a formula for setting sensible targets.

Goodwin’s formulation is based on the assumption that the lowest RAEs are associated with the items with the most volatile signals, which are likely to be items with the highest CoV. This is counterintuitive: CoV is often considered to be a measurement of forecastability, with higher CoVs indicating more volatility and thus greater difficulty in achieving any given level of forecast accuracy. But, as shown in Figure 4.6, as the CoV increases, the weighted RAE tends to decline. Hence our argument is that we should set more stringent RAE targets for the higher CoV items.

Figure 4.6 Average RAE Volatility (CoV)

The logic underpinning this argument is this: If the product is unforecastable—if the naïve forecast error is totally driven by noise—an RAE below 1.0 is unachievable. If there is a signal in the data (trend, seasonal, external factor), then the product is potentially forecastable, and the RAE should be expected to be better (lower) than 1.0. And we see here that lower CoV forecasts often perform very badly compared to the naïve, resulting in high RAEs.

Figure 4.6 plots the average and weighted-average RAE against CoV for our sample.

Figure 4.6 shows an increasing gap between the simple and weighted-average RAEs, reflecting the fact that high-volume/high-CoV items (i.e., those in Zone 2) have lower RAEs than items with lower volumes.

Targets for High-CoV Items

Figure 4.6 results suggest that the target for Zone 2 items (high volume, high volatility) should be a relatively low RAE, while the target for items in Zone 3 (low volume, high volatility) should be less ambitious on the grounds that we quickly reach diminishing returns.

Targets for Low-CoV Items

In Zones 1 and 4 of Figure 4.5, which comprise items with low CoV, our intuition is to expect lower levels of forecast error than in Zones 2 and 3—that is, better RAE scores. Figure 4.6, however, shows that the lower the CoV, the worse the RAE (in this case, the RAE is significantly higher than 1.0). Also, there is no consistent difference between the simple and weighted-average RAEs, meaning that high-volume items have been forecast no better than low-volume items on average. What is causing this pattern is not clear—it may be the result of poorly judged manual interventions or overfitting of forecasting models—but whatever the cause, it is clearly unacceptable, and reasonable to expect better RAE scores for items in Zones 1 and 4 (though perhaps not as high as in Zone 2).

In summary, items in Zone 2 should have the most stretching targets since this is where the greatest scope exists to add value by manual intervention, and Zone 3 the least stretching because the low volumes make it unrewarding to expend the effort required to achieve good RAE scores. The targets for Zones 1 and 4 lie in between these extremes, but should be achievable with less effort because judgmental intervention is less likely to be needed.

Based on this analysis, I have proposed targets for items in each of these four zones in our sample, compared to the historic performance (Table 4.2). The scale of potential improvements is very significant: Avoidable forecast error (as measured by the weighted RAE) might perhaps be halved, with 71% of the total potential being contributed by 16% of the product portfolio. For the remaining 84% of items, the biggest contribution of this approach probably lies with the scope it gives to significantly reduce the amount of time and effort applied to forecasting them.

Table 4.2 Performance Targets and the Scale of Potential Improvement

Percentage of Items Current RAE Target Range Improvement Potential
Zone 1 3% 1.01 0.7–0.85 14%
Zone 2 13% 0.92 0.5–0.7 25%
Zone 3 7% 1.05 > 1.0 0%
Zone 4 77% 0.99 0.7–0.85 17%
Average 100% 0.97 0.70 55%

Conclusion

While it is unwise to make big claims based on one example, using RAE in conjunction with a small number of other easily calculated measures does appear to provide an objective and rational platform for constructing a set of forecast-improvement strategies tailored to a product portfolio. The goal is to maximize the overall benefit for a given level of effort.

Compared to a similar classification but based on conventional error metrics, RAE brings a number of benefits:

  • It identifies where the greatest opportunities lie by quantifying the scope for improvement and where it is concentrated in the portfolio.
  • It provides a quick and simple approach for dealing with items that are forecast poorly, and where the scope for improvement does not warrant the effort (the naïve forecast).
  • It helps set meaningful goals, tailored to the nature of the product and the role it plays within a portfolio. These can be used to quantify the scope for improvement and track progress.

REFERENCES

  1. Boylan, J., and A. Syntetos (2006). Accuracy and accuracy implications for intermittent demand. Foresight: International Journal of Applied Forecasting 4, 39–42.
  2. Gilliland, M. (2013). FVA: A reality check on forecasting practices. Foresight: International Journal of Applied Forecasting 29 (Spring 2013), 14–19.
  3. Goodwin, P., and R. Fildes (2007). Good and bad judgment in forecasting: Lessons from four companies. Foresight: International Journal of Applied Forecasting 8 (Fall), 5–10.
  4. Hoover, J. (2009). How to track forecast accuracy to guide forecast process improvement. Foresight: International Journal of Applied Forecasting 14 (Summer), 17–23.
  5. Morlidge, S. (2014a). Do forecasting methods reduce avoidable error? Evidence from forecasting competitions. Foresight: International Journal of Applied Forecasting 32 (Winter), 34–39.
  6. Morlidge, S. (2014b). Forecastability and forecast quality in the supply chain. Foresight: International Journal of Applied Forecasting 33 (Spring), 26–31.
  7. Morlidge, S. (2013). How good is a “good” forecast? Forecast errors and their avoidability. Foresight: International Journal of Applied Forecasting 30 (Summer), 5–11.
  8. Schubert, S. (2012). Forecastability: A new method for benchmarking and driving improvement. Foresight: International Journal of Applied Forecasting 26 (Summer), 5–13.
  9. Syntetos, A., J. Boylan, and R. Teutner (2011). Classification of forecasting and inventory. Foresight: International Journal of Applied Forecasting 20 (Winter), 12–17.

4.5 Why Should I Trust Your Forecasts?5

M. Sinan Gönül, Dilek Önkal, and Paul Goodwin

Introduction

Let’s say you’re sitting comfortably at your desk, sipping your coffee and preparing to plan your company’s production levels for the following month. You begin first by examining the forecast report that’s just been e-mailed to you. This report exhibits the predicted demand levels for the coming month. Suddenly a question pops into your head that, once there, just doesn’t seem to want to go away: “Do I really trust these forecasts enough to base all my plans on these numbers?”

Trust and Forecasting

In everyday language, we use the word “trust” so frequently and casually that we sometimes forget what it actually means and entails. According to the Oxford English Dictionary, to “trust” something is to have a “firm belief in the reliability and truth” of that thing. This implies that when we trust a forecast, we strongly believe the prediction is reliable and accurate.

But a mere strong belief is not enough to embrace the word’s entire scope. Having that belief also means accepting certain consequences. For instance, when we use “trusted” forecasts and base our managerial decisions on them, we automatically shoulder the responsibility for those decisions, which includes admitting the possibility that these forecasts may be flawed. Of course, we would rarely expect any forecast—even one that we trust—to be totally accurate. We would, however, expect a trusted forecast to make the best use of available information, to be based on correctly applied methods and justifiable assumptions that are made explicit, and to be free of political or motivational biases (Gönül and colleagues, 2009). Overall, we would expect it to be a competent and honest expectation of future demand.

Trust, therefore, involves risk, because it makes us vulnerable to negative consequences if our trust is misplaced (Rousseau and colleagues, 1998).

The Determinants of Trust

What are the key factors that determine whether we should trust a forecast? There is general agreement among researchers that one factor is our perception of the goodwill of the forecast provider. If we, as decision makers, believe that the forecaster providing the predictions is striving to deliver reliable and accurate predictions, then we are more likely to trust that source. We will be less trusting if we perceive that the forecasts are influenced by a provider agenda that differs from ours.

For example, Adam Gordon (2008) discusses “future-influencing” forecasts that are used to try to achieve the future the forecast provider wants, rather than representing their genuine belief of what the future will hold. Forecasts by pressure groups that a new tax will drive companies out of business or that a new technology will treble cancer deaths may be of this type. Providers may also have other motivations. Within a company, forecasts provided by the marketing department may be perceived to be biased downwards so that the department looks good when sales regularly exceed forecasts (Goodwin, 1998).

If you are an intended recipient of a forecast, one indication that the forecast providers might share your agenda is their use of language which is familiar to you and free of jargon. In a study we recently concluded (Goodwin and colleagues, forthcoming), people trusted forecasts more when they were presented as “best case” and “worst case” values rather than as “bounds of a 90% prediction interval.” In some situations, managers who are not mathematically inclined may be suspicious of forecasts presented using technical terminology and obscure statistical notation (Taylor and Thomas, 1982). Such a manager may respect the forecast provider’s quantitative skills, but simultaneously perceive that the provider has no understanding of managers’ forecasting needs—hence the manager distrusts the provider’s forecasts.

Another critical factor is the perceived competence or ability of the forecast providers. In some cases, decision makers may prefer to entrust the job of forecast generation to professional forecasters, believing that they have more technical knowledge and insights. Sometimes this trust may be misplaced. People who confidently portray themselves as experts may be highly trusted—while an examination of their track record would reveal that, in fact, they may perform no better than chance (Tetlock, 2005).

In general, it appears that people just are not very good at assessing the competence of forecasters. A forecaster’s reputation may be destroyed by one isolated bad forecast that people readily recall, even though the forecaster’s overall accuracy is exemplary. In unfortunate contrast, one surprisingly accurate forecast of a major event that no one else foresaw will probably promote a poor forecaster to the status of a seer, thus eclipsing a record of wild inaccuracy (Denrell and Fang, 2010). If, for example, you correctly predicted the financial crisis of 2008, your forecasts are likely to be trusted without question, even if your past forecasting history suggests you generally have trouble foreseeing what day of the week follows Tuesday.

Of course, many forecasts originate from computers, not human beings. Do we trust computers more? It seems not. In a recent study (Önkal and colleagues, 2009), identical forecasts of stock market prices were presented to two groups of people, together with a graph depicting the stock price histories over time. One group was told that the forecasts emanated from a statistical algorithm, the other that they came from a financial expert (who, in fact, was the true source). When the groups were asked if they wanted to adjust the forecasts to make them more reliable, people made significantly larger changes to the forecasts that they thought came from the statistical algorithm—this despite the fact that the performance of experts in stock market forecasting is famously poor.

Future research is needed to see whether giving computer systems human qualities, or creating a digital “persona,” will improve trust perceptions. However, some research suggests that trust can be improved if the computer system provides an explanation of its forecast. Explanations have been a feature of expert systems since their inception (Önkal and colleagues, 2008). Through explanations, providers can convey the justification and rationale behind a given prediction, and through this information, users can build their perceptions about the competence, benevolence, and integrity of the forecasting source.

Researchers also observed (Gönül and colleagues, 2006) that the higher the perceived value of the explanations, the higher the level of acceptance of the forecast. Interviews with the users participating in these studies revealed that they enjoyed receiving explanations. The explanations provided “stories” that made the forecasts more “believable.”

Trust and Adjustments to Provided Forecasts

Is the level of trust that people say they have in a set of forecasts (be they statistical or managerial) reflected in the way they treat these forecasts? Not surprisingly, it appears that greater levels of trust are associated with a decreasing tendency to adjust the forecasts.

However, the correlation is not perfect (Goodwin, forthcoming). Sometimes people may indicate a high level of trust and still go on to make big adjustments to the forecasts they receive. It seems that trust is only one factor determining forecast-adjustment behavior. This may be because separate and distinct mental processes are associated with assessing trust and judging the extent to which forecasts need to be adjusted (Twyman and colleagues, 2008). Trust assessments may originate from conscious and reflective thought processes and involve explicit thinking about whether we should trust what we are offered or not. On the other hand, when we make judgmental adjustments to forecasts there is plenty of evidence (Kahneman, 2011) that we unconsciously use heuristics—that is, intuitive rules of thumb. These may lead to different levels of adjustment, depending on the nature of the data we are given and the way it is presented. Whatever their cause, these discrepancies mean that people may treat two forecasts differently, even when they have told you they have the same level of trust in them.

The Need for Open Communication Channels

All these points indicate that communication between forecast users and forecast providers is critical. It is through open communication channels that users can express their expectations and receive cues to evaluate the prediction source in order to decide whether to trust or not to trust. The forecast providers might have benevolent intentions, might uphold similar principles, might be very skilled and experienced about generating predictions, and might indeed offer very accurate forecasts. But if they cannot effectively convey this information to their users and learn what the users are actually expecting, then all of these good qualities will be in vain.

Being transparent about general accuracy over a long period will reduce the tendency for users to make judgments on the basis of a single forecasting triumph or disaster. If this accuracy can be demonstrated relative to a reasonable benchmark, then so much the better. In very unpredictable situations, this will help to show that relatively high forecast errors are unavoidable and not a result of the forecaster’s lack of competence. Being transparent about assumptions, and even presenting multiple forecasts based on different assumptions, will most likely reassure the user about the integrity of the provider.

Revealing previous assignments and giving information about groups or clients other than the current users might also help demonstrate goodwill. By investigating the forecaster’s client portfolio, the users of forecasts can find out what sort of people the provider is working with and has worked with in the past, which helps in formulating a picture of the values and principles that are important to the provider. However, more research is needed to find innovative ways through which communications between the two sides can be further enhanced, particularly where the forecasts are generated by statistical software.

Working to Earn Trust

So why should I trust your forecasts? The answer appears to lie in the quality of interaction and communication between the forecaster and the user. Getting this right is perhaps easier said than done, but remember these crucial points:

  • Work to increase the forecast user’s belief and confidence in the reliability and integrity of your forecasts, and you greatly increase the likelihood that the inevitable occasional forecast miscues will be seen as acceptable anomalies if viewed in the bigger picture.
  • Affirm the forecast user’s perception of your goodwill, not only by delivering the best, most accurate forecasts you can, but through reassuring the users that you share their motives and objectives and are not shoring up your own self-interest packaged as a forecast.
  • Consider your audience, and take care to share information in language the forecast user is comfortable with, avoiding technical jargon and forecaster-speak wherever possible.
  • Reassure the forecast user of your confidence in your systems and methods, while conveying the necessary degree of humility in your work by acknowledging that no forecaster ever gets it “right” every time.
  • Be transparent about methodologies and increase user comfort levels by providing clear, cogent explanations of your forecasts.
  • Let users review an honest history of your forecast accuracy levels that they can quickly assess and understand, preferably relative to reasonable benchmarks.
  • Be forthcoming about your other current and past forecast clients or customers, as these relationships, by association, can help to convey to the forecast user a comforting and heartening sense of your own principles and values.

A tall order, yes—but get these priorities straight, and all the effort that you put into your forecasts is far less likely to be wasted on distrustful users. After all, creating and disseminating accurate forecasts is a hard enough job; the good news is that there are practical steps you can take to further a more trusting and trustful working environment with the people who use and depend on those forecasts.

REFERENCES

  1. Denrell, J., and C. Fang (2010). Predicting the next big thing: Success as a signal of poor judgment. Management Science 56, 1653–1667.
  2. Gönül, M. S., D. Önkal, and P. Goodwin (2009). Expectations, use and judgmental adjustment of external financial and economic forecasts: An empirical investigation. Journal of Forecasting 28, 19–37.
  3. Gönül, M. S., D. Önkal, and M. Lawrence (2006). The effects of structural characteristics of explanations on use of a DSS. Decision Support Systems 42(3), 1481–1493.
  4. Goodwin, P., M. S. Gönül, and D. Önkal (2013). Antecedents and effects of trust in forecasting advice. International Journal of Forecasting 29, 354–366.
  5. Goodwin, P. (1998). Enhancing judgmental forecasting: The role of laboratory research. In Wright, G., and P. Goodwin (Eds.). Forecasting with Judgment. Chichester: John Wiley & Sons.
  6. Gordon, A. (2008). Future Savvy: Identifying Trends to Make Better Decisions, Manage Uncertainty, and Profit from Change. New York: AMACOM.
  7. Kahneman, D. (2011). Thinking, Fast and Slow. London: Allen Lane.
  8. Önkal, D., P. Goodwin, M. Thomson, M. S. Gönül, and A. Pollock (2009). The relative influence of advice from human experts and statistical methods on forecast adjustments. Journal of Behavioral Decision Making 22, 390–409.
  9. Önkal, D., M. S. Gönül, and M. Lawrence (2008). Judgmental adjustments of previously adjusted forecasts. Decision Sciences 39(2), 213–238.
  10. Rousseau, D. M., S. B. Sitkin, R. S. Burt, and C. Camerer (1998). Not so different after all: A cross-discipline view of trust. Academy of Management Review 23, 393–404.
  11. Taylor, P. F., and M. E. Thomas (1982). Short-term forecasting: Horses for courses. Journal of the Operational Research Society 33, 685–694.
  12. Tetlock, P. E. (2005). Expert Political Judgment. Princeton: Princeton University Press.
  13. Twyman, M., N. Harvey, and H. Harries (2008). Trust in motives, trust in competence: Separate factors determining the effectiveness of risk communication. Judgment and Decision Making 3, 111–120.

4.6 High on Complexity, Low on Evidence: Are Advanced Forecasting Methods Always as Good as They Seem?6

Paul Goodwin

The Complexity Love Affair

Some forecasting researchers love complexity. Read their papers, and you just might begin to feel guilty that you haven’t been integrating genetic fuzzy systems with data clustering to forecast sales in your company. If you are a long-term forecaster, you may be dreading the day when your boss finds out that you have not been using a neuro-fuzzy-stochastic frontier-analysis approach. Or perhaps you should take a whack at utility-based models, incorporating price forecasts based on an experience curve that has been fitted to your data using nonlinear least squares.

Of course, the world is a complex place, and intricate models may be needed to take into account the many factors that might have an impact on the demand for your product or your future costs. Hence, for reasons that seem obvious, forecasting researchers are justified in experimenting with difficult new methods, even if those methods would challenge most PhDs in math. However, there are two commonsense criteria that we should expect researchers to meet. First, they should provide reliable evidence that the methods they advocate can produce accurate forecasts under given sets of conditions. Second, they should compare the accuracy of their methods with appropriate benchmarks.

Typical benchmarks would be simpler methods—to see if the extra complexity is justified—and existing methods that are currently in widespread use. Many research studies fail on both criteria.

A Case in Point

I recently reviewed for a journal a paper that recommends a technique to produce sales forecasts when there are few past observations: the analytic network process. This technique, which is based on relatively complex mathematics, allows experts to systematically structure their knowledge of the key drivers of sales with the aim of making their judgmental forecasts consistent and accurate. The paper’s authors used this process to forecast the annual sales of printers in an Asian country. Their main finding was that the technique yielded forecasts with a percentage error of only 1.3%, which the researchers pointed out was minimal compared to the errors of six common statistical techniques that they had also applied to their data. This level of accuracy is highly impressive, of course, and it appeared to justify the considerable effort involved in applying the analytic network process.

However, a careful reading of the paper revealed a couple of problems. First, the statistical methods relied upon in the comparison were not designed to be used with the short time series studied. For example, their series exhibited a marked negative trend, which meant that moving averages and simple exponential smoothing could not be expected to give reliable forecasts. Instead, the obvious benchmark would have been experts’ forecasts made without the benefit of the analytic network process. This would have indicated whether the method’s complexity was worth the effort.

But that wasn’t the most serious problem with the paper. It turned out that the researchers had only tested the accuracy of the methods on one sales figure. Their paper contained 33 pages of discussion and nine tables of results—two of which were 13-by-13 matrices containing figures to five decimal places. And yet the only evidence they provided in favor of their method was based on one number.

Proper Testing of Accuracy

Foresight Editor Len Tashman pointed out over a decade ago (Tashman, 2000) that forecasting methods need to be tested on a sufficient number of out-of-sample observations (i.e., observations that are unavailable to the method when it is fitted to the past data) to meet the criteria of adequacy and diversity. Adequacy applies when a method is tested on a sufficient number of observations from similar time series to enable the forecaster to draw reliable inferences about the performance of the forecasting method on such series. Diversity is achieved when testing is applied to series that are heterogeneous in both time period and nature, so that the forecaster can make an assessment of how effective the method is under a range of different circumstances.

Testing a model on out-of-sample observations is necessary because a close fit to past data does not guarantee accurate forecasts. Indeed, an improved fit to past data may be associated with poorer forecasts because the method is falsely seeing systematic patterns in the random movements in past data, and assuming that these will continue into the future. Forecasting accuracy is also not guaranteed even when a model is based on an internally consistent and rigorously tested theory (Clements and Hendry, 2008), so extensive out-of-sample testing is still highly advisable.
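
As a concrete illustration of this principle (it is not taken from any of the papers discussed here), the sketch below evaluates a candidate method against the naïve no-change benchmark using rolling-origin, out-of-sample forecasts: each forecast is made using only the observations available before the period being forecast. The “drift” candidate method and the simulated demand series are hypothetical stand-ins.

```python
# Illustrative sketch: rolling-origin, out-of-sample comparison of a candidate
# method against the naive (no-change) benchmark. All data are simulated.
import numpy as np

def naive_forecast(history):
    return history[-1]                      # no-change forecast

def drift_forecast(history):
    # Hypothetical "candidate" method: last value plus the average past change.
    return history[-1] + np.mean(np.diff(history))

def rolling_origin_mae(series, method, min_train=12):
    errors = []
    for t in range(min_train, len(series)):
        forecast = method(series[:t])       # uses only data available before period t
        errors.append(abs(series[t] - forecast))
    return np.mean(errors)

rng = np.random.default_rng(0)
series = 100 + np.cumsum(rng.normal(0, 5, size=48))   # hypothetical demand history

mae_candidate = rolling_origin_mae(series, drift_forecast)
mae_naive = rolling_origin_mae(series, naive_forecast)
print(f"candidate MAE = {mae_candidate:.2f}, naive MAE = {mae_naive:.2f}, "
      f"ratio = {mae_candidate / mae_naive:.2f}")
```

Note that a single series gives adequacy at best for series of that kind; diversity requires repeating the exercise across many heterogeneous series and time periods.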

Inadequate Evidence

Despite these advisories, papers are still being published that draw big conclusions from small amounts of data. A recent article by Decker and Gnibba-Yakawa (2010) recommended that managers should use utility-based models to forecast the sales of high-technology consumer products (like CD and DVD players) in the years before the sales reach a peak. The models are elegant and well underpinned by economic theory. They allow forecasters to take into account factors like customers’ expectations of falling prices (a common occurrence in marketing high-technology products) and network effects, where a product becomes more attractive as its user base grows due to the development and subsequent availability of complementary or supporting products (apps for smartphones, for example). The researchers reported that their models gave forecasts with a mean absolute percentage error of only 4.08%. It all looks very promising until you realize that this result is based on a total of just six holdout sales figures—three different products contributed only one, two, and three sales figures, respectively.

Or take a study by Zhu, Wang, Zhao, and Wang (2011) that proposed the use of a complex hybrid of methods (including moving averages and an adaptive particle-swarm optimization algorithm) to forecast electricity demand in China. They fitted their method to 57 past monthly observations and concluded that “our proposed model is an effective forecasting technique for seasonal time series with nonlinear trend.” But their out-of-sample forecasts only covered the months from January to September of 2010. It is difficult to draw any conclusions about a method that is designed to handle seasonality when we have no forecasts for three months of the year.

It isn’t hard to find other recent examples where complex, often computer-intensive methods are described in great detail but the recommendation to use the method is based on thin evidence. Significantly, these examples tend to be found in journals that do not specialize in forecasting, which may indicate that good forecasting principles and practices are having a hard time making inroads to other fields. For example, in the journal Applied Soft Computing, Azadeh and Faiz (2011) claim that the use of their “flexible integrated meta-heuristic framework based on an artificial neural network multilayer perceptron” would provide “more reliable and precise forecasting for policy makers” concerned with electricity supply. But they tested their method on just eight annual household electricity figures from Iran. On the basis of a mere two out-of-sample observations—China’s natural gas consumption in 2007 and 2008—Xu and Wang (2010) concluded in the Journal of Natural Gas Chemistry that their “Polynomial Curve and Moving Average Combination Projection (PCMACP) model” can “reliably and accurately be used for forecasting natural gas consumption.”

Conclusions

We should not be against complexity per se. Modern computing muscle gives us an unprecedented opportunity to apply more powerful techniques in pursuit of greater forecasting accuracy. Moreover, many of the techniques described above are a tribute to the inventiveness and intellectual caliber of the researchers who describe them. But for forecasters in the field, complexity can come at the cost of greater effort and time in preparing forecasts and a loss of credibility with senior managers. It is therefore vital that recommendations to use complex methods be supported with strong evidence about their reliability. If the name of a method contains more words than the number of observations that were used to test it, then it’s wise to put any plans to adopt the method on hold.

REFERENCES

  1. Azadeh, A., and Z. S. Faiz (2011). A meta-heuristic framework for forecasting household electricity consumption. Applied Soft Computing 11, 614–620.
  2. Clements, M. P., and D. F. Hendry (2008). Economic forecasting in a changing world. Capitalism and Society 3: Issue 2, Article 1.
  3. Decker, R., and K. Gnibba-Yakawa (2010). Sales forecasting in high-technology markets: A utility-based approach. Journal of Product Innovation Management 27, 115–129.
  4. Tashman, L. J. (2000). Out-of-sample tests of forecasting accuracy: An analysis and review. International Journal of Forecasting 16, 437–450.
  5. Xu, G., and W. Wang (2010). Forecasting China’s natural gas consumption based on a combination model. Journal of Natural Gas Chemistry 19, 493–496.
  6. Zhu, S., J. Wang, W. Zhao, and J. Wang (2011). A seasonal hybrid procedure for electricity demand forecasting in China. Applied Energy 88, 3807–3815.

4.7 Should the Forecasting Process Eliminate Face-to-Face Meetings?7

J. Scott Armstrong

Introduction

Every week I hear people complain about meetings. What would happen to your organization if it became difficult to have face-to-face meetings? Some organizations already discourage them by holding meetings in rooms without chairs. Some impose limits on the length of the session or the number of people who attend.

But what if an organization went further and penalized people for spending time in meetings? Or required that the meeting have a clear-cut payoff? As part of assessing the results, management could provide a visible taxi-style meter that uses attendees’ billing rates to show the meeting’s costs. Or what if management abolished face-to-face meetings entirely?

The Wisdom of Crowds

I have been thinking about the need for face-to-face meetings for some time now. Recently, I have been spurred on by The Wisdom of Crowds (Surowiecki, 2004), a delightful yet exasperating book. It is delightful because the writing is so clever and contains descriptions of interesting research studies, many of which were new to me; it is exasperating because it is not well organized, but the writing is so clever that one may not notice the gaps in logic. Nevertheless, the book’s major conclusion is important:

Traditional meetings yield poor decisions and inaccurate forecasts.

Dave Barry summarized this conclusion in fewer words: “If you had to identify, in one word, the reason that the human race has not achieved, and never will achieve, its full potential, that word would be meetings.” Apparently Barry’s quote hit a nerve; a Google search for his conclusion turned up almost 600 relevant sites (out of 10,000 total sites) in July 2006.

The term crowds in the title of Surowiecki’s The Wisdom of Crowds is unfortunate. He claims that the collective thinking of many individuals, when acting alone, contains wisdom. Crowds act together, and they do not have wisdom. A more descriptive title would have been The Superiority of Combined Independent Anonymous Judgments.

The book has been widely reviewed on Amazon, with comments from over 200 readers who have provided a bimodal ratings distribution. The negative reviewers fell into two classes: those who were upset at the basic conclusions and those who were upset at the gaps in logic. The experts priced this book at $25, but the crowd’s price for a new copy in May 2006 was $10. If you enjoy books like Who Moved My Cheese? and Jack Welch’s Winning, you are unlikely to enjoy The Wisdom of Crowds. But it will make you reconsider your assumptions about meetings. At least it had that strong effect on me.

Face-to-Face Meetings Could Be Effective

We do have guidelines on how to run meetings effectively. This was well summarized over four decades ago by Norman R. F. Maier. His research showed how group leaders could make effective use of people’s information. His book (Maier, 1963) provides evidence-based principles for running meetings. Figure 4.7 provides a summary of guidelines that draws heavily on Maier’s research.

  • Use time budgets. Allocate time to discuss various topics and provide ample slack time.
  • Be problem centered. Keep your discussion focused on a problem. Avoid looking for excuses or seeking to blame others.
  • Record suggestions. Keep track of all suggestions for solving a problem or making sense of an issue so that each suggestion may be explored fully.
  • Explore. Explore a number of suggestions for addressing an issue. Probing and evaluative questions can then be asked. How would that strategy work out? Do I understand the issue, or do I need to search out more information? Am I mistaken in my assumptions about the issue? What are the advantages or disadvantages of each proposal? Is there a way to combine suggestions to generate a better solution?
  • Protect people. Protect individuals from personal attacks and criticism, especially if they present minority or divergent viewpoints. Avoid saying, “That’s a bad idea.”
  • Understand and resolve differences. Once ideas have been generated, encourage dissent. Understand differences of opinions within the group and attempt to resolve them.

Figure 4.7 Guidelines for Problem-Solving Meetings

Unfortunately, it is rare to find group leaders who use Maier’s advice. In my 46-year career, I can remember only a handful of business students, academic administrators, or business executives who have run meetings effectively. Productive meetings are possible but rare.

Why do people persist in holding face-to-face meetings? First, we are social animals; many of us enjoy the interaction with others in a face-to-face setting. Second, managers like the control that meetings give them over others; they can see that others are coming together at their commands. Third, people believe that meetings are effective; managers believe that they are doing something useful when they meet (although they often do not have the same opinion about meetings among their blue-collar workers).

A fourth reason is that people falsely believe that by merely aggregating opinions without a face-to-face meeting, one would get a decision or forecast that is only average. The scientist Sir Francis Galton dispelled such a belief in 1878. He showed that by averaging portraits of women, the resulting portrait was judged not average looking but rather more beautiful than all the component portraits. Larrick and Soll (2006), in a clever series of experiments, showed that among highly intelligent subjects (MBA students at INSEAD), most did not understand that the error of the group-average judgment is almost always smaller than the error of the average person in a group. More surprising to them was that the group-average judgment is sometimes better than the best judgment.
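
The arithmetic behind this result is easy to demonstrate. The simulation below is illustrative only and assumes unbiased judges with independent errors (the most favorable case); it shows that the error of the group-average judgment is typically much smaller than the error of the average judge, and in a minority of trials beats even the best judge.

```python
# Illustrative simulation: averaging independent judgments.
# Assumes unbiased judges with independent errors (hypothetical numbers).
import numpy as np

rng = np.random.default_rng(1)
truth = 100.0
n_trials, n_judges = 10_000, 8

judgments = truth + rng.normal(0, 15, size=(n_trials, n_judges))

error_of_group_average = np.abs(judgments.mean(axis=1) - truth)
error_of_average_judge = np.abs(judgments - truth).mean(axis=1)
error_of_best_judge = np.abs(judgments - truth).min(axis=1)   # identifiable only in hindsight

print(f"mean error of the group-average judgment: {error_of_group_average.mean():.2f}")
print(f"mean error of the average judge:          {error_of_average_judge.mean():.2f}")
print(f"trials where the group average beats the best judge: "
      f"{(error_of_group_average < error_of_best_judge).mean():.1%}")
```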

The Case Against Face-to-Face Meetings

Face-to-face meetings are expensive to schedule and run. They might involve travel costs or come at inconvenient times, when attendees are busy or tired. Time is wasted when people come late, talk about irrelevant topics, or leave early.

Meetings are also subject to many types of biases. How loudly do people talk? How deep are their voices? What do the people look like? How is the furniture arranged? How are people dressed? What is each person’s body posture? Who has the power? How does the group leader guide the meeting? Does the group nurture dissent? Do people have preconceived positions on the topic at hand?

Some attendees are so concerned about what they want to say that they do not listen to what others are saying. Some are so intent on listening that they have no time to think. Some feel the need to restate their positions. Few people take notes; therefore they soon forget what happened in the meeting and are unable to develop useful action plans.

Not surprising then is the prevalence of studies showing that, compared with other methods of aggregating opinions (such as using the average of a set of independent judgments), the simple act of meeting face-to-face harms forecasting and decision making, although the people involved in these experiments typically do not believe the results.

Interestingly, the findings for forecasting and decision making are similar to those studies that involve groups generating creative ideas. As shown in the research review by Gallupe et al. (1991), individuals produce more creative suggestions than groups do, even if the groups are well run.

There are two conditions under which independent judgments should be combined. First, the experts must have useful information about the topic of interest; combining ignorance does not lead to wisdom. Second, participants must represent diversity of knowledge. The key word is knowledge. For example, it makes little sense to include experts because of differences in looks, heights, weights, religions, races, genders, and so on. In fact, Stewart’s (2006) meta-analysis of 26 tests found a small negative relation between team members’ demographic heterogeneity and group performance.

Decision making and forecasting can be improved to the extent that

  • People state opinions independently, and
  • Opinions are aggregated objectively, using a predetermined mechanical scheme.

The implication of the above research is that managers need to be creative in finding ways to use the knowledge effectively in a group while preventing members from meeting face-to-face. This will improve forecasting and decision making. It will also save time and money. Fortunately, modern technology has provided useful alternatives.

Alternatives to Face-to-Face Meetings: Markets, Nominal Groups, and Virtual Teams

There are a number of ways to implement alternatives to face-to-face meetings. I will discuss three: markets, nominal groups, and virtual teams.

Markets (Prediction Markets, Information Markets, or Betting Markets)

Experts and nonexperts alike bet on outcomes. These markets are common in finance and sporting events. People receive feedback only through prices and volume of trading.

In The Wisdom of Crowds, Surowiecki describes the use of markets for prediction. Their superiority has been shown by studies in financial markets since the 1920s. Although investors do not meet, they observe the outcomes of actions by others and draw on related information to make their decisions.

Outside of finance and sports, there have been few comparative studies on the value of prediction markets. The future looks promising, however. Surowiecki reports that some companies are using prediction markets for new-product sales. Since predictions for such problems are typically made in traditional meetings, I would expect prediction markets to produce more accurate forecasts.

I hope that The Wisdom of Crowds will lead some companies to consider the use of prediction markets. Technology should not pose a barrier. Some organizations will delegate a person to set up a betting market for sporting events.

Nominal Groups (Including Delphi)

In nominal groups, judgments are collected from a group of experts and are summarized by a group facilitator.

Surowiecki relied on suggestive and interesting (but indirect) evidence on the value of nominal groups. He failed to effectively use the wisdom of the crowds of forecasting researchers in his search for evidence on the value of simply combining judgments. In Armstrong (2001), I summarized 11 comparative empirical studies on the value of combining judgmental forecasts, which, to my knowledge, was an exhaustive listing of such studies. The median error of the group average in these studies was 12.2% smaller than the error of the average expert.

The Delphi technique goes beyond nominal groups. It involves an anonymous collection of expert judgments by mail, Internet, or written responses. Feedback on the responses is provided to the experts, who then repeat this exercise for at least two rounds.

Rowe and Wright (2001) found that Delphi improved accuracy over traditional groups in five of eight studies, harmed accuracy in one, and was inconclusive in two. Using an alternative benchmark, they found that it was more accurate than one-round expert surveys for 12 of 16 studies, with two ties and two cases in which Delphi was less accurate. For these 24 comparisons, Delphi improved accuracy in 71% of the cases and harmed it in 12%. I was unable to find any published studies that compared prediction markets with Delphi.

Freeware for Delphi is provided at forecastingprinciples.com. Usage of this freeware has increased substantially in recent years.

Virtual Teams

Virtual teams, enabled largely by the Internet, have several advantages. They allow for a freer flow of information than do markets or nominal groups. Members of virtual teams can use mail, e-mail, and websites. Phone calls are used only in emergencies, and conference calls are not used. These procedures remove some biases (for example, body language and modes of speech). They allow time for people to think before responding, and they provide a record of what was accomplished.

Despite the growing popularity of virtual teams, I was unable to find comparative studies on the value of these groups. However, based on related research summarized by Surowiecki, I expect that virtual teams would be much more effective than face-to-face groups, but less effective than prediction markets and Delphi. Consistent with this, Ganesan, Malter, and Rindfleisch (2005), in a study on new-product development, found that e-mail was superior to face-to-face meetings with respect to new-product creativity and development speed.

A Prediction Case

Can you predict the results of the following experiment? To solicit useful feedback on research studies, a group of 160 experts was provided with research papers, one paper per expert. The experts were randomly divided into two treatment groups. In group A, ten sets of eight experts participated in 80-minute meetings, where authors of the ten studies presented their papers and addressed questions. (Each group heard only one study.) In group B, each subject in the nominal groups of eight experts worked alone and without interruption for 80 minutes on one of the ten papers. These experts wrote comments in the margins of the papers. In effect, the intent was to have equal amounts of time spent on each paper. Which treatment, A or B, produced more useful suggestions? In which treatment did the authors of the study use the suggestions more effectively?

Unfortunately, there is little research to establish which of the mechanical methods of combining expert judgments are most creative, most accurate, least expensive, and most acceptable. For example, I have found that no published empirical comparisons have been made among prediction markets, Delphi, and virtual teams. In fact, the above study has not been conducted. Based on related research, however, I assume that Treatment B (nominal groups) would be superior to Treatment A (traditional groups) in terms of producing useful, accurate, and creative ideas. I also assume that, in Treatment B, the acceptance rate by the authors of the papers would be much higher.

Are Face-to-Face Meetings Useful Under Some Conditions?

The evidence against face-to-face meetings is extensive, and I have made no attempt to provide a complete summary here. I did, however, attempt to contact all authors whose work I cite in order to ensure that I have referenced the information properly. My primary concern is to find evidence that favors face-to-face meetings.

Are there conditions under which meetings contribute to forecasting or decision making? I speculate on three possibilities. The first is when the experts cannot read. The second is with very small groups, perhaps only two people, which may be able to work together effectively. The third is when it is important to gain commitment to decisions. With respect to the third condition, one must be concerned not only with the quality of a decision but also with its acceptability. Is acceptance more likely when the group itself has made the forecast or decision and so feels involved in it?

Some papers have suggested that meetings are useful when the situation is complex and the solutions are not obvious. While this suggestion has some intuitive appeal, tests of this concept have failed, according to Dennis and Kinney (1998). I doubt that such meetings are effective, given the evidence that (1) people can understand complex material better and faster when it is written (Chaiken and Eagly, 1976); (2) people in groups are poor at generating creative approaches; (3) many participants have difficulty performing complex analyses in the presence of others; and (4) groups are not tolerant of creative solutions.

Although I have circulated my paper for comments from e-mail lists and from other researchers, I have been unable to obtain evidence to support the use of face-to-face groups under these or any other conditions. Some people have responded with their opinions that meetings are useful or that many managers like meetings. There was one paper that provided promising results for face-to-face meetings, but the findings were not clear. Some people responded that they could not think of evidence favoring face-to-face meetings.

Such sessions may meet people’s needs for socializing. Magne Jørgensen (personal communication) mentioned one company that did away with face-to-face meetings for their projects, replacing them with e-mail messages. To satisfy people’s needs for meeting and talking, they sponsored social events.

Action Steps

Perhaps the first step is damage control. Reduce the number of meetings, the length of meetings, and the number of people invited. Post a chart on the group’s homepage to track the people-hours (and their associated costs) that are consumed by the meetings. Ask the group leader to use Maier’s guidelines for meetings. In addition, ask attendees to summarize the actions they have taken after each meeting.

If people in your organization do not know how to respond without meetings, you can bring them together in a room and then use structured procedures that simulate nominal groups, as described by Aiken and Vanjani (2003). For example, you could ask for a short “time-out” during a meeting and ask everyone to write his or her ideas. Software is available for conducting structured meetings, and these products have proved useful (Valacich et al., 1994) and have been gaining acceptance in organizations. For example, Briggs et al. (1998) reported that electronic brainwriting (individual idea generation) has been used by several million people in over 1,500 organizations around the world.

Individuals can also take action. My approach is to ask the person who calls a meeting to describe the problem and to inquire whether it would be useful to ask participants to provide written suggestions rather than to attend the meeting. The leader nearly always says yes and takes my proposal in a positive way. This approach makes it easier for people to absorb my suggestions and my reasoning while it reduces their desire to argue against me (because I am not there).

Conclusions

We rely heavily on face-to-face meetings, which are more expensive than alternative approaches, even though it is difficult to find evidence that supports their use. Although evidence-based principles exist for running face-to-face meetings effectively, they are used so rarely that we must turn to more practical solutions. In fact, a pattern of evidence suggests that prediction markets, nominal groups, and virtual teams allow for a more effective use of a group’s collective wisdom. Technology has enhanced the value of these approaches.

REFERENCES

  1. Aiken, M., and M. B. Vanjani (2003). Comment distribution in electronic poolwriting and gallery writing meetings. Communications of the International Information Management Association 3(2), 17–36.
  2. Armstrong, J. S. (2001). Combining forecasts, in J. S. Armstrong (Ed.), Principles of Forecasting. Boston: Kluwer Academic Publishers, 417–439.
  3. Briggs, R. O., J. F. Nunamaker Jr., and R. H. Sprague Jr. (1998). 1001 Unanswered research questions in GSS. Journal of Management Information Systems 14(3), 3–21.
  4. Chaiken, S., and A. H. Eagly (1976). Communication modality as a determinant of message persuasiveness and message comprehensibility. Journal of Personality and Social Psychology 34, 605–614.
  5. Dennis, A. R., and S. T. Kinney (1998). Testing media richness theory in new media: The effects of cues, feedback, and task equivocality. Information Systems Research 9(3), 256–274.
  6. Gallupe, R. B., L. M. Bastianutti, and W. H. Cooper (1991). Unlocking brainstorms. Journal of Applied Psychology 76(1), 137–142.
  7. Ganesan, S., A. J. Malter, and A. Rindfleisch (2005). Does distance still matter? Geographic proximity and new product development. Journal of Marketing 69, 44–60.
  8. Larrick, R. P., and J. B. Soll (2006). Intuitions about combining opinions: Misappreciation of the averaging principle. Management Science 52, 111–127.
  9. Maier, N. R. F. (1963). Problem Solving Discussions and Conferences. New York: McGraw-Hill. (Out of print, but used copies are available.)
  10. Rowe, G., and G. Wright (2001). Expert opinions in forecasting: The role of the Delphi technique. In J. S. Armstrong (ed.), Principles of Forecasting. Boston: Kluwer Academic Publishers, 125–144.
  11. Stewart, G. L. (2006). A meta-analytic review of relationships between team design features and team performance. Journal of Management 32, 29–54.
  12. Surowiecki, J. (2004). The Wisdom of Crowds. New York: Doubleday.
  13. Valacich, J. S., A. R. Dennis, and T. Connolly (1994). Idea generation in computer-based groups: A new ending to an old story. Organizational Behavior and Human Decision Processes 57, 448–467.

Acknowledgments: Useful suggestions were provided by Monica Adya, Fred Collopy, Kesten Green, and Magne Jørgensen, as well as by Foresight editors.

4.8 The Impact of Sales Forecast Game Playing on Supply Chains8

John Mello

Introduction

A well-established fact about forecasting is that the introduction of biases leads to inaccurate forecasts (Mentzer and Moon, 2005). One way that biases enter the picture is through forecast game playing, the intentional manipulation of forecasting processes to gain personal, group, or corporate advantage.

Adroit game players can win power, money, and other prizes by bending or breaking the rules of good forecasting practice. In the context of sales forecasting, game playing not only reduces the accuracy of a company’s forecasts but can have deleterious impacts on company operations, and the operations of suppliers and customers as well.

My purpose here is to demonstrate the far-reaching effects of such sales-forecasting game playing on supply chains and to explore how companies can control the practice. I begin by describing various types of sales-forecasting games and their impact on the management of supply chains. I explore conditions that encourage game playing and conclude with recommendations to reduce or eliminate this practice in sales-forecasting processes.

The data for this study was obtained from audits conducted over a nine-year period by the Department of Marketing and Logistics at the University of Tennessee. I selected eleven audits that included firms in consumer packaged goods, major appliances, consumer electronics, home and garden supplies, beverages, food processing, and computer hard drives. Each audit typically involved interviews with 25–40 participants from forecasting, operations, production planning, marketing, sales, and senior-level positions, and addressed the company’s sales-forecasting processes, issues, and competencies.

The Nature of Supply Chains

A supply chain has been defined as “all the organizations involved in the upstream and downstream flows of products, services, finances, and information from the ultimate supplier to the ultimate customer” (Mentzer et al., 2001, p. 2). The concept thus centers on the cross-company activities facilitating delivery of products and services to consumers. As Figure 4.8 indicates, products and services primarily flow downstream from suppliers, finances flow upstream in payment of products and services, and information travels in both directions across the various tiers of the chain. Supply chains frequently involve multiple firms at each tier, functioning more as a network of interacting companies than as a linear chain. One company’s actions can affect many other companies along the chain. This highlights the importance of proper supply-chain management among interlinked companies.


Figure 4.8 A Manufacturing Supply Chain Example

Due to the interdependence of companies in supply chains, any bias that enters into sales forecasts not only affects the company creating the forecast but also affects the operations of other companies along the chain. When individuals, groups, or companies “play games” with the sales-forecasting process, they hamper their own ability to operate effectively and potentially create far-reaching effects throughout the supply chain. It is important for companies to eliminate game-playing behaviors wherever they occur. The key here is honing our ability to identify the various forms of game playing that take place in sales forecasting.

Games People Play

During the course of this research, I observed a number of game-playing behaviors. My names for them are:

  • Enforcing
  • Filtering
  • Hedging
  • Sandbagging
  • Second-guessing
  • Spinning
  • Withholding

In the following game definitions, the quotes are taken directly from the organizational participants who identified these behaviors.

Enforcing: Maintaining a higher forecast than actually anticipated sales, in order to keep forecasts in line with the organization’s sales or financial goals.

Enforcing is played when “taking the number down is not acceptable.” In some companies, setting “stretch goals” is a tactic to motivate employees to exceed historical sales levels.

If forecasts fail to meet sales and financial goals, the numbers are changed to match the goals rather than reflect actual projections. Enforcing often occurs in a climate where employees feel there is “no option to not achieving sales goals.” This precludes the reporting of forecasts that underachieve corporate goals and fosters “CYA” tactics: When forecasts simply reflect goals, no one needs to justify their forecast.

Filtering: Changing forecasts to reflect the amount of product actually available for sale in a given period.

This game is used to mask capacity or supply issues. The supply functions are then able to “attain demand requirements,” even though real demand is unattainable. Filtering occurs at a production planning level, where forecasts are changed within the master production schedule by manufacturing personnel. It happens in companies where healthy S&OP practices are absent or unenforced.

Hedging: Overestimating sales in order to secure additional product or production capacity.

When production capacity shortages exist, field salespeople may use this game so that “the factory will have my stuff when I want it.” Overestimating sales ensures that any potential “upside to the forecast” can be covered; it compensates for lack of manufacturing or supply flexibility and guarantees salespersons “will have plenty of inventory to meet their quotas at the end of the year.” Other reasons for hedging include justifying higher budgets, seeking approval for advertising and promotions, asking for additional head-count, and selling new product development to senior management. Hedging is also used by customers when products from a supplier are in short supply or on allocation.

Sandbagging: Underestimating sales in order to set expectations lower than actually anticipated demand.

Described by one participant as “gaming the quota” and by another as “skillful lying,” this tactic sets sales quotas low in order to ensure they are exceeded, resulting in payment of bonuses and other compensation. Other reasons for sandbagging are found in the nature of the organization.

One participant described his firm as a “good news company” that encouraged overselling the forecast. In this environment, “if sales guys exceed their forecast there are high fives all around,” but there are no penalties for forecasting inaccurately. Customer sandbagging occurs when customers associate forecasts with confirmed orders and are afraid of getting stuck with too much inventory.

Second-guessing: Changing forecasts based on instinct or intuition about future sales.

Used when an individual or group mistrusts the forecasting process or has an “I know best” attitude. Production schedulers noted this game as an oft-used tactic. For example, one master scheduler described how she routinely changes forecasts because she feels that salespeople are “lowballing,” yet she is the one who has to “live with the consequences” of excess inventories. Second-guessing is also used when individuals in power positions force changes to the forecast because they think they know their market better than anyone else in the company. An example discussed by several participants involved a vice president of sales who often “pulls numbers out of his hat” and orders changes to the forecast based on “gut feelings.”

Spinning: Manipulating forecasts to obtain the most favorable reaction from individuals or departments in the organization.

Spinning is used to “control the news cycle” in a company. One participant described his firm’s salespeople as “very tactically focused,” giving a “pie-in-the-sky forecast because that’s what they think people want to hear.” Another example is a sales director who asks her forecasters to “tell the truth and let her spin it” because she “knows how to hide certain numbers” that may not be well received by upper management. Spinning does not include legitimate management overrides of forecasts based on current data; it involves manipulation of the data in order to moderate the responses of higher-level managers who may react poorly to bad news.

Withholding: Refusing to share current sales information with other members of the organization.

This game is used when news is good or bad. An example of withholding good news occurs when the sales force feels that it will sell well above the forecast but “put that information in their pocket” until they are sure sales will materialize. Withholding bad news is used as a protective mechanism because “no one wants to be the one to record bad news in the forecasting system.” One participant commented, “For sales, it’s a game of chicken. They keep forecasting the ‘hockey stick’ (a spike in demand at the end of a sales period corresponding to company sales goals) because they prefer to have just one screaming session with the boss” when sales don’t materialize. Customer withholding often takes the form of refusal to share promotional plans or inventory policy changes with suppliers due to distrust of supply chain partners.

Unquestionably, these games bring bias into the forecasting process, degrading forecast accuracy. Most companies rely on accurate forecasting to help plan and execute procurement, manufacturing, and distribution operations. When individuals, groups, or companies intentionally play games with the forecast, they seriously hamper their own firm’s ability to effectively perform these operations, inevitably disrupting the operations of other firms throughout the chain.

Consequences for the Supply Chain

Whether companies are in a make-to-stock, make- or assemble-to-order, or lean/just-in-time manufacturing environment, they attempt to match supply with demand so that adequate amounts are produced and delivered to the right place at the right time. Too little inventory results in production and distribution delays at customer locations; too much results in wasted money. When games are played with sales forecasts, demand signals are distorted, giving false information within and between firms. Such erroneous information can manifest itself in uncertainty, higher costs, and inter-firm behavioral dysfunction. Figure 4.9 depicts these outcomes and their relationships to specific game playing, both within a company and between customers and suppliers.


Figure 4.9 Sales Forecasting Games and Outcomes

Greater Uncertainty in the Supply Chain

One serious result of game-playing bias is uncertainty within and between companies regarding how much supply is needed to meet demand.

Manufacturers and suppliers need to know what and how much to produce and how much manufacturing capacity is needed. In make-to-stock companies, businesses must decide where to store inventory, affecting warehousing and distribution space requirements. Companies also need volume information so they can select transportation modes and carriers with sufficient capacity to service customers.

These decisions must be made up and down the supply chain and are usually triggered by sales forecasts. Good decisions hinge on accurate and timely information. Bad decisions often result from biased information corrupted by game playing.

Hedging (overestimating sales in order to secure additional product or production capacity) sends a signal of higher-than-needed demand, which triggers surplus production of finished goods and excessive procurement of materials, components, and semi-finished items. The surplus travels up the supply chain, as suppliers react by passing unnecessary demand to their suppliers.

Sandbagging (underestimating sales in order to set expectations lower than anticipated demand) has the opposite effect: Manufacturing will schedule less than required volumes, and suppliers will do the same. When the higher volumes of sales materialize, there may not be sufficient time for the supply chain to react.

Withholding (refusing to share current sales information with other members of the organization) and other games that distort or delay the sharing of relevant information prevent the supply chain from properly planning for and executing supply operations. The same result occurs when retail customers play games with manufacturers supplying their distribution centers and stores.

Excess Costs and Lost Sales

Primarily in make-to-stock companies, game playing results in excess cost from surplus inventory, expenses of expediting, and labor inefficiency. Enforcing, which substitutes sales goals for calculated forecasts, creates excess inventory as manufacturers gear up for demands unlikely to materialize. Likewise, hedging by retail customers or sales functions and second-guessing by executives or production planning force the procurement or production of “just-in-case” inventory that may be unnecessary in the sales period. Withholding bad news allows unwanted finished goods and supporting materials from suppliers to build up in the supply chain.

Excess labor costs throughout the supply chain are another liability, arising when manufacturing capacity is outstripped, when overtime is needed to meet production requirements, or when last-minute scheduling changes induce quality or rework problems or force the use of less cost-effective resources. Moreover, when excess inventory must be worked off and production slows, layoff expenses may ensue.

Excess labor costs also result from games that underestimate sales forecasts. Sandbagging, second-guessing (changing forecasts based on instinct or intuition about future sales) that the forecast is too high, and withholding good news all delay needed production volumes. As sales materialize and inventories drop below required volumes, manufacturing enters expedite mode. Production is ramped up, orders go out to suppliers, and companies all along the chain increase their production. The result is higher costs of overtime, premium transportation, and other expenses associated with expediting. Perhaps most seriously, if expedited production is unsuccessful, sales and even customers may be lost.

Upstream and Downstream Problems

When intentional overforecasting (hedging), underforecasting (sandbagging), or withholding sales, promotions, and other information affect upstream suppliers, these games can undermine the inter-firm trust and commitment essential to good supply chain management. Mistrust leads to second-guessing demand, resulting in excess inventories, customer service issues, and the costs associated with these problems. Mistrust grievously undermines the commitment to working closely together that is essential for companies to operate effective and efficient supply chains.

So game playing in sales forecasting causes serious problems not only within companies but throughout entire supply chains. Companies should strive to control, curtail, or eliminate the practice. To accomplish this, it is necessary to understand the conditions within firms that encourage, sustain, or tolerate game playing.

Conditions Fostering Game Playing

The conditions that compel individuals, groups, or businesses to play games with sales forecasts are often created and nurtured by the company. As shown in Figure 4.10, the three conditions that stand out as fostering game playing are reward structures, supply structures, and the sales-forecasting process itself.


Figure 4.10 Conditions Fostering Game Playing

Reward Structure

A company’s priorities are established by how employees are rewarded. My research revealed certain aspects of reward structures tied to forecast game playing: the lack of accountability for forecast consequences, conflicting goals, and the lack of incentives for achieving forecasting accuracy.

Lack of accountability is partly a consequence of the separation of responsibility for forecasting from the responsibility for customer service and inventory management. As one participant told us, “We haven’t decided in this company who is responsible for excess inventory or being out of stock.” Lack of accountability particularly enables hedging and sandbagging because there are no repercussions for the forecasters whose games drove excessive or inadequate inventory levels.

Conflicting goals between departments are what one participant likened to being “pushed and pulled by the objectives of different units.” This climate fosters games such as hedging, sandbagging, second-guessing, and withholding, all of which work to the benefit of one department but against the objectives of other departments.

Then there are reward structures that lack incentives to forecast accurately, opening the door to sandbagging and hedging. “Underforecast during quota month, then overforecast the other months to make sure you get plenty of supply.” “We never get in trouble for overforecasting.”

Lack of incentives for forecasting accurately and lack of repercussions for poor forecasting enable employees to exploit circumstances that benefit themselves even if they harm the organization.

Supply Structure

One major contributor to game playing within a company is the lack of production flexibility/capacity. Production cannot always or immediately adapt to supply products in sufficient quantity and variety to meet demand. One participant explained, “We tend to overforecast because we aren’t flexible in manufacturing, and we want to make sure our products are available.” Another stated, “Salespersons get beat up when product is not available, so there is a push to have more inventory.” Hedging and withholding knowledge of slow sales are used to ensure that particular products are available “just in case” sales materialize. Customers play these same games, particularly when products are on allocation and they know that only a percentage of their forecasts will be produced for them.

Long lead-times for finished goods and corresponding materials also encourage game playing. Increased use of offshore suppliers, by both retailers and manufacturers, has lengthened lead times and reduced flexibility of supply chains.

One company stated that “80% of items need to be ordered at least three months prior to production,” with some materials having an even longer lead time. Knowing that response times are slow, people resort to hedging, filtering, and second-guessing the forecast to buffer spikes in demand and ensure adequate supplies. Compounding the lead time problem is that each retailer, manufacturer, and supplier in the supply chain has its own set of lead times.

The Forecasting Process Itself

The forecasting process itself can foster game playing. Forecasts tied to revenue or sales goals are unduly influenced by these goals, leading to games such as Enforcing and Spinning. Representative comments included “Corporate mandates forecasts,” “Salespeople forecast what they need to sell, not what they think will sell,” and “It would not be acceptable for me to forecast anything other than my targets.”

When the forecasting process allows forecasts to be tied into sales quotas, salespeople are pressured to “play the quota games” of sandbagging and withholding, which often amounts to “lowballing” the forecast to influence the sales quota.

Forecast overrides are a third aspect of the problem with the forecasting process. When employees can override forecasts, it tempts people with an “I know best” attitude to make changes based on “gut feelings,” which may not correspond to actual market conditions. This does not imply that numbers should never be changed, especially through legitimate processes such as S&OP. But when forecasts can be changed unilaterally without input from other stakeholders, principled forecasting methods are endangered. Paul Goodwin (2005) provides a prudent set of principles for when and how forecasts should be judgmentally adjusted.

How to Control Game Playing

In order for companies to reduce or eliminate game playing in their sales forecasting process, they need to address the conditions fostering these behaviors, those that are rooted in reward structures, supply structures, and the sales forecasting process itself.

Change the Reward Structure

Tie compensation to forecasting accuracy. When people are accountable for the results of their actions and rewarded accordingly, they have incentives to behave in desired ways. Therefore, those responsible for inputs to the forecast should have their compensation (base and bonus) tied in part to forecast accuracy. This discourages games such as hedging, sandbagging, and withholding that distort or prevent the flow of accurate information in forecasting processes.

Unify goals between supply and demand functions. Silo mentalities, turf battles, and suboptimization of functions occur in climates where functional areas compete rather than cooperate. When all functions are jointly responsible for customer service, inventory costs, and operational costs, the importance of accurate forecasting is heightened.

While this type of goal unification among functions is rare, it is a critical component of good supply-chain management. With unified goals in place, game playing that exclusively benefits one’s own functional area is drastically reduced.

Provide customers better incentives for forecasting accuracy. Offer preferred pricing or higher service levels to companies that forecast well. Stop conditioning customers to expect quarterly price breaks from low forecasts by varying timing of promotions and discounts.

Change the Supply Structure

Build more flexibility into production capabilities. When customers or salespersons think they cannot satisfy product demand due to manufacturing capacity or changeover issues, they naturally want to hedge by ordering or forecasting more than they need. When companies add capacity or capability to production operations, they help ensure adequate supply of products or materials, reducing or eliminating the temptation for hedging, filtering, withholding, and second-guessing in manufacturing. Some remedies include:

  • Qualifying co-packers to make certain high-volume products
  • Adding machine flexibility through purchasing general-purpose machines and equipment run by cross-trained employees (Coyle et al., 2009)
  • Adding more production lines and facilities if long-range forecasts warrant
  • Reducing changeover times and applying lean manufacturing techniques to get more flexibility out of existing capacity
  • Reducing supplier lead times by selecting suppliers that are physically closer, have manufacturing flexibility, and work with reliable transportation carriers

Make a transition from an anticipation-based to a response-based supply chain. Anticipation-based supply chains make products to stock and wait for orders from customers; response-based supply chains take orders from customers, then make and ship the product. Response-based supply chains often use postponement strategies in which products are stored close to customers in a semi-completed state and finished to order. The highly reactive nature of this type of chain can help reduce or eliminate games like hedging, second-guessing, and withholding, since there is no need to build extra product to meet customer demands. Such a supply chain utilizes a partnership approach with suppliers and can include vendor-managed inventory systems (VMI); collaborative planning, forecasting, and replenishment (CPFR) programs; and other methods where forecasts and production schedules are shared and continually updated among supply-chain partners.

Change the Forecasting Process Itself

Delineate responsibilities and monitor performance. All of these activities fall within an S&OP process: clarifying forecast-accuracy targets, formalizing forecast-accuracy metrics, publishing accuracy achievements, performing root-cause analyses, analyzing the service-level and inventory costs of forecast errors (see Catt, 2007), and having senior management monitor the results.

Initiate a sales and operations planning (S&OP) process. S&OP is a cooperative, cross-functional effort that uses available market intelligence and key metrics to guide and synchronize demand and supply plans. The goal is a consensus forecast that provides a basis for action by all functions in a company, thus eliminating second-guessing and information withholding.

Build a collaborative planning, forecasting, and replenishment (CPFR) program with customers and suppliers. CPFR programs work toward developing a single forecast or production plan for an item, and sharing that forecast/plan with upstream suppliers. In order for CPFR to work, cooperation and honest information flow between companies are imperative (Coyle et al., 2009). Accurate data flow helps eliminate customer games such as hedging and withholding; there is less need for extra inventory to cover for uncertainty in the supply chain. The resulting saving in costs can be substantial (Boone and Ganeshan, 2008).

Keep sales forecasting separate from goal setting and quotas. Input from salespeople to the forecasting process is desirable, but discourage executives from pressuring forecasters to meet budget or sales targets. The forecasts should still be periodically compared to goals and quotas so that gaps can be dealt with through marketing programs, sales efforts, pricing, and other means.

Conclusion

Forecasting games take many forms. When individuals, groups, or companies play these games, the consequences reach well beyond the boundaries of a firm into the numerous tiers of a supply chain. An understanding of the conditions that foster game playing helps determine the actions companies can take to end these conditions. Controlling sales forecast game playing will almost certainly deliver a bigger payoff toward improving supply chain performance than will any new forecasting methodology.

REFERENCES

  1. Boone, T., and R. Ganeshan (2008). The value of information sharing in the retail supply chain: Two case studies. Foresight: International Journal of Applied Forecasting 9, 12–17.
  2. Catt, P. (2007). Assessing the cost of forecast error. Foresight: International Journal of Applied Forecasting 7, 5–10.
  3. Coyle, J. J., C. J. Langley Jr., B. J. Gibson, R. A. Novack, and E. J. Bardi (2009). Supply Chain Management: A Logistics Perspective. Mason, OH: South-Western.
  4. Goodwin, P. (2005). How to integrate managerial judgment with statistical forecasts. Foresight: International Journal of Applied Forecasting, 8–12.
  5. Mentzer, J. T., and M. A. Moon (2005). Sales Forecasting Management: A Demand Management Approach (2nd ed.). Thousand Oaks, CA: Sage Publications.
  6. Mentzer, J. T., W. DeWitt, J. S. Keebler, S. Min, N. W. Nix, C. D. Smith, and Z. G. Zacharia (2001). What is supply chain management? In J. T. Mentzer (ed.), Supply Chain Management. Thousand Oaks, CA: Sage Publications, 1–26.

4.9 Role of the Sales Force in Forecasting9

Michael Gilliland

Three Assumptions About Salespeople

A recurring question among business forecasters is how to incorporate input from the sales force. For example, from a recent LinkedIn discussion group:

My company is using Excel to do sales forecasting on a monthly basis, I am looking for a solution to automate the front part where salespeople will input their numbers directly in the system (instead of compiling different Excel spreadsheets currently). Please recommend a software that could automate this function.

Involving salespeople in forecasting does indeed sound like a good idea. But the value of sales force engagement rests on three assumptions:

  1. Salespeople have the ability to accurately predict their customers’ future buying behavior.
  2. Salespeople will provide an honest forecast to their management.
  3. Improving customer-level forecasts improves company performance.

While sales force involvement is sometimes advocated as a best practice in the forecasting process, closer inspection reveals that such engagement may not be necessary or even advisable in many situations. I learned this from a humbling experience early in my forecasting career.

While this one failure doesn’t damn sales force engagement to the heap of “worst practices” in forecasting, the circumstances of the experiment are not unique. This article examines common methods for obtaining sales force input to the forecasting process, the use of incentives for motivating sales rep forecasting performance, and whether improved customer/item forecasts provide any benefit.

Gathering Sales Force Input

Two main ways of soliciting sales force input are to ask sales for their forecasts, or to have them adjust forecasts that have been provided. One study argues for the latter method:

. . . we have found that salespeople generally do a poor job of taking their previous experience and turning that into an initial sales forecast. However, these same people are generally quite good at taking an initial quantitative sales forecast and qualitatively adjusting it to improve overall accuracy (Mentzer and Moon, 2005, p. 321).

Other studies question the value of many types of judgmental adjustments to the statistical forecasts and find them often overused and ineffective (Fildes and Goodwin, 2007, and Fildes and colleagues, 2009). For example, at the manufacturer cited above, we found that 60% of adjustments to the initial statistical forecast actually made it worse! Mercifully, only about half of the statistical forecasts were adjusted.

Eric Wilson of Tempur Sealy dealt with this issue by appealing to the competitive nature of salespeople, urging them to “beat the nerd in the corner” and make adjustments only when certain they will improve the nerd’s statistical forecast (Wilson, 2008). This procedure not only limited the adjustments being made to those forecasts the reps were confident they could improve upon but also provided the data to determine the effectiveness of their adjustments through use of the forecast value added (FVA) concept (Gilliland, 2013).
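
To see how FVA can be used to grade overrides, here is a minimal sketch in Python, with made-up numbers rather than data from the manufacturer described above: for each overridden forecast, compare the absolute error of the adjusted forecast with that of the statistical forecast it replaced, and report the share of overrides that added value.

```python
import pandas as pd

# Hypothetical history of forecasts: the statistical forecast, the adjusted
# (final) forecast, and the actual demand for each item/period.
history = pd.DataFrame({
    "stat_fcst":  [100, 250, 80, 400, 150],
    "final_fcst": [120, 230, 80, 450, 140],
    "actual":     [105, 260, 90, 410, 150],
})

# An override "adds value" when the adjusted forecast has a smaller absolute
# error than the statistical forecast it replaced.
adjusted = history[history["final_fcst"] != history["stat_fcst"]].copy()
adjusted["stat_err"]  = (adjusted["stat_fcst"]  - adjusted["actual"]).abs()
adjusted["final_err"] = (adjusted["final_fcst"] - adjusted["actual"]).abs()
adjusted["value_added"] = adjusted["stat_err"] - adjusted["final_err"]  # > 0 means the override helped

share_improved = (adjusted["value_added"] > 0).mean()
print(f"{len(adjusted)} overrides; {share_improved:.0%} improved on the statistical forecast")
```

Run against a real override history, this kind of tally is what reveals whether the sales force is beating, or losing to, the statistical forecast.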

Recognizing that time spent forecasting is time taken away from selling, Stefan de Kok of ToolsGroup has suggested another way of gathering sales force input:

. . . there is huge value in getting input from humans, sales reps included. That input however should be market intelligence, not adjustments to quantities. For example, let the sales rep input that their account is running a promotion and then let the system determine what the quantity impact is. Not only will the uplift become more accurate quickly, but also the baseline will improve. Ultimately it becomes a lower effort (but not zero) for the sales people and their forecasts become much more reliable.

Under this approach, salespeople would no longer provide (or adjust) specific numerical forecasts. Instead, they provide information (promotional plans, new store openings (or store closings), more (or less) shelf space, etc.) that can be put into the statistical forecasting models. Implementation, of course, requires software that can incorporate these inputs and assess their effectiveness in improving forecast accuracy. On the negative side, there is the risk of disenfranchising salespeople by removing their direct control of a specific numerical forecast.

Can Salespeople Forecast Their Customers’ Behavior?

Don’t salespeople have the closest contact with customers, and know their customers’ future behavior better than anyone else in the organization? That is the common understanding:

  • Firms rely on their salespeople to stay in touch with customers. Good salespeople know what customers need and want and the sales prospects of the market they serve (Chen, 2005, p. 60).
  • . . . experienced salespeople are likely to have more precise information about sales prospects in their own territories than central planners who are not close to the market (Mantrala and Raman, 1990, p. 189).

But does this superior knowledge of their customers allow salespeople to accurately predict their customers’ future buying behavior?

The value of sales force involvement is predicated upon their ability to provide better forecasts, or to provide information that can be used for better forecasts. But salespeople are employed because of their ability to sell—to execute the marching orders of sales management, to find and nurture prospective customers, and to achieve a revenue target. Knowledge of time-series modeling techniques, or demonstrated talent for predicting the future, is at best a secondary or tertiary job requirement.

While it is implausible that all salespeople have an exceptional ability to predict the future buying behavior of their customers, let’s assume they do. Is this reason enough to engage them in the forecasting process?

Can You Trust the Forecast from a Salesperson?

It has long been recognized that biases and personal agendas influence the input from participants in the forecasting process. For example, from a 1976 article in the Journal of Marketing,

. . . asking a sales force for future sales estimates and using these inputs in any fashion requires a degree of caution and a concern for just how to interpret that information (Wotruba and Thurlow, 1976, p. 11).

Even if we grant that salespeople can predict their customers’ future buying behavior, is there reason to believe they will share this information? What is their motivation to provide an honest forecast?

Forecasts are used to make business decisions. Forecasts of future demand drive all sorts of important supply chain decisions regarding procurement, production, inventory, and distribution. Demand forecasts also assist decisions in finance (projecting cash flow and profit), marketing (how much promotional support is required?), and sales (quotas and compensation plans).

These decisions are not made in a vacuum, but in an intense political environment. Any participant in the forecasting process can have a personal agenda—some interest or incentive that outweighs the incentive for an unbiased forecast. Any time there is asymmetry in the personal cost of missing a forecast too high versus too low, there is the opportunity for intentional bias. Two obvious scenarios potentially biasing the sales force are:

  1. Quota Setting: There may be a natural (and well-justified) inclination for salespeople to lower expectations and argue that future demand is tenuous, to help keep their quotas as low as possible. We often refer to this gaming as sandbagging.
  2. Maintaining Service: A necessary condition for avoiding service issues is to have available inventory. While inventory planners might not be keen on keeping excess inventory (something contrary to their own incentives), the sales force can help ensure an excess by purposely overforecasting future demand. (Note that this is the opposite of the behavior exhibited during quota-setting time.)

John Mello’s earlier article in Foresight offers a comprehensive description of sales force game playing and its impact on the firm and its supply chain (Mello, 2009).

So even if salespeople can accurately forecast their customers’ future demand, there may be more personal benefit to bias the forecast than to give an honest answer.

Compensation as an Incentive for Honesty

In a basic compensation system, commissions are paid for achievement of a quota. It is perfectly reasonable to set the quota based on the potential of a sales territory, so as to reward each salesperson for their selling effort. (Note that selling effort is not necessarily proportional to sales volume, since some territories will have more potential than others.) However, “Inducing salespeople to disclose what they know about the market and providing incentives for them to work hard can sometimes be conflicting goals” (Chen, 2005, p. 60). In other words, there is every reason for the salespeople to turn in the lowest possible forecast.

The Gonik System

One way to address intentional bias is to make forecast accuracy itself a performance objective (along with meeting the quota and other objectives). To have any impact on behavior, the incentives for good forecasting would have to be large enough to overcome incentives for biasing the forecast. Gonik (1978) published a frequently-cited compensation program developed for IBM Brazil that purports to do just this.

In the Gonik system, the company provides an initial objective or quota (Q) for each salesperson. The salesperson then provides his or her forecast (F), and the F/Q ratio provides the horizontal axis on the bonus payout grid (a small section of which is shown in Table 4.3). The ratio of actual sales (A) to quota provides the vertical axis of the grid.

Table 4.3 Gonik Bonus Payout Grid


At period end, when actual sales are known, bonus payout can be determined from the grid. In Gonik’s example, if John is given a quota of Q = 500, forecasts 500, and then sells 500, his bonus payout is 120% (since F/Q = 1.0 and A/Q = 1.0); that is, he received a 20% premium for his accurate forecasting.

If John had sold 750 (150% of his quota), while still having forecasted 500, his bonus payout would be 150%. So he was properly rewarded for his hard work (exceeding his quota), even though his forecast was off.

However, if John had upped his forecast to 750 and then sold 750, this would have earned him a 180% bonus payout (an additional 30% premium for his hard work and accurate forecasting).

In the other direction, if John sold only 250 (on a quota of 500 and forecast of 500), payout would have been just 30%. But if he had forecast 250, payout would be 60%.
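
The incentive structure is easier to see when the payout cells quoted above are captured in a small lookup. The sketch below (Python) encodes only the grid cells stated in the text; the full Gonik grid covers many more forecast/actual combinations, which are not reproduced here.

```python
# Fragment of the Gonik payout grid, using only the cells quoted in the text.
# Keys are (forecast/quota, actual/quota) ratios; values are bonus payout rates.
PAYOUT_GRID = {
    (1.0, 1.0): 1.20,   # quota 500, forecast 500, sell 500 -> 120%
    (1.0, 1.5): 1.50,   # forecast 500, sell 750 -> 150%
    (1.5, 1.5): 1.80,   # forecast 750, sell 750 -> 180%
    (1.0, 0.5): 0.30,   # forecast 500, sell 250 -> 30%
    (0.5, 0.5): 0.60,   # forecast 250, sell 250 -> 60%
}

def gonik_payout(quota: float, forecast: float, actual: float) -> float:
    """Look up the bonus payout rate for the grid cells quoted in the text."""
    key = (round(forecast / quota, 2), round(actual / quota, 2))
    try:
        return PAYOUT_GRID[key]
    except KeyError:
        raise ValueError("ratio combination not in this grid fragment")

print(gonik_payout(quota=500, forecast=500, actual=750))  # 1.5 (150% payout)
```

Note how honesty pays: for any level of actual sales, the payout is highest when the forecast matched what was actually sold.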

Note that Gonik’s system does not appear to protect against bias in the quota setting by management. (This was pointed out by Len Tashman.) Since Q is in the denominator, management has an incentive to set higher quotas than they would have otherwise, as a higher Q reduces payouts at all levels of Forecast and Actual.

Menu of Contracts

A somewhat more complicated alternative to the Gonik scheme is to offer a menu of contracts (alternative bonus plans) to each salesperson.

A menu of contracts system assumes that each salesperson has special knowledge about the market (for example, potential sales volume) that is unknown to central planners. The salesperson does not want to reveal this information because it could be used to set a higher quota. However, the very process of choosing a particular contract (presumably the one likely to maximize his or her bonus payout) reveals market information to the central planners.

We want the sales force to work hard, sell as much as possible, and to accurately forecast future sales. Both the menu of contracts and the Gonik approach show that it may be possible to motivate this desired behavior. But then the effort spent evaluating contracts or generating forecasts is effort that could be spent building relationships with customers and making sales. This is an unspoken opportunity cost.

Does Improving Customer Level Forecasts Always Matter?

Let’s continue to assume that our sales force can accurately forecast customer buying behavior and also that our sales force is honest, willingly providing accurate, unbiased forecasts for each customer. We must now ask: Is engaging the sales force worth the effort? Will better customer-level forecasts improve overall company performance?

Many (perhaps most) companies do not need customer-level forecasts for planning purposes. To maintain appropriate inventory and customer service (order fill) levels, they find it sufficient to have a good forecast for each item at each point of distribution (which we’ll refer to as the DC). Therefore, it doesn’t matter whether individual customer/item forecasts are good or bad as long as they aggregate to a reasonably good forecast at DC/item (or other intermediate level).

When customer-level forecasts are unbiased, positive and negative errors in the customer/item forecasts will tend to cancel out when aggregated to intermediate levels. In this situation, improving customer-level forecasts will likely have little effect on DC/item forecast accuracy, so the effort would be wasted.

However, when customer-level forecasts are biased (as we suspect they sometimes may be when coming from the sales force), then improving their forecasts (reducing bias) would translate directly to more accurate DC/item forecasts. In such circumstances, the focus should be on efficiently gathering forecasts for the biggest customers who dominate demand for the item at the DC. (It is probably not worth the effort to solicit forecasts for the many small customers that have little impact on total DC demand.)
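
A small simulation illustrates why the aggregate matters more than the individual customer forecasts. The numbers below are purely illustrative: fifty customers with true demand of about 100 units each, unbiased forecast errors in one case and a systematic 15-unit overforecast (hedging) in the other.

```python
import numpy as np

rng = np.random.default_rng(42)
n_customers, true_demand = 50, 100          # assumed: 50 customers, ~100 units each

# Unbiased customer-level forecasts: random errors around the true demand.
unbiased_fcst = true_demand + rng.normal(0, 20, n_customers)

# Biased customer-level forecasts: the same noise plus a systematic
# 15-unit overforecast per customer (e.g., sales-force hedging).
biased_fcst = unbiased_fcst + 15

actual_dc_demand = true_demand * n_customers
print("DC-level error, unbiased inputs:", round(unbiased_fcst.sum() - actual_dc_demand))
print("DC-level error, biased inputs:  ", round(biased_fcst.sum() - actual_dc_demand))
# Unbiased errors largely cancel in the aggregate; the bias does not
# (it contributes roughly 15 x 50 = 750 extra units at the DC).
```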

It is true that some organizations utilize customer level forecasts for account planning, setting sales quotas, etc. So there can be other direct applications of customer level forecasting. But improving supply chain operations and demand fulfillment, in many situations, is not one of them.

A related issue, not covered in this article, is the appropriate level of aggregation from which to reconcile the hierarchy of forecasts. It may be that a top-down or middle-out approach is more effective than a bottom-up approach (which starts with customer/item forecasts).

Commitments Are Not Forecasts

In response to a blog post in which I questioned the value of sales force input to forecasting (Gilliland 2014), John Hughes of software vendor Silvon presented a counter-argument in favor of engaging the sales force.

Sales people have a responsibility to themselves and their company to try and predict sales for many good reasons, mostly to help balance company assets (inventory) that drive customer service. Engaging sales people directly with an on line tool ensures their commitment to the numbers and publicly displays the results for all to see and grade them. . . . [T]hey have the same responsibility as the rest of us to commit to a task and then complete it.1

While engaging the sales force for their commitment to a number is a legitimate objective, this is not the same as providing a forecast. Presumably, there is a reward for beating the commitment (or at least a penalty for failing to achieve it), naturally biasing the commitment toward the low side. Empirical evidence of this bias is easy to find: simply check whether actuals historically come in below the commitment, above it (as we would expect if commitments are biased low), or about the same.
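
Assuming a short history of commitments and actuals is available, a minimal check might look like the following sketch (the numbers are hypothetical):

```python
import numpy as np

# Hypothetical history of quarterly commitments vs. actual sales for one rep.
commitments = np.array([950, 1000, 980, 1020, 990, 1010])
actuals     = np.array([1080, 1120, 1015, 1150, 1060, 1095])

mean_pct_deviation = np.mean((actuals - commitments) / commitments)
print(f"Actuals run {mean_pct_deviation:+.1%} versus commitment on average")
# A persistently positive figure suggests the "commitment" is biased low
# and should not be treated as an unbiased forecast.
```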

There is nothing wrong with having quotas, stretch targets, budgets, commitments, and any other numbers that float around the organization for informational and motivational use—as long as we recognize their different purposes. These are not the same as an “unbiased best guess at what is really going to happen” which is what the forecast represents. The fatal flaw of “one number forecasting” (aka “a single version of the truth”) is that it reduces quotas, targets, budgets, commitments, and forecasts to a single number—when they are meant to be different!

Conclusions

It is good to be wary of any inputs into the forecasting process, and this naturally includes inputs from sales. Forecasting process participants have personal agendas, and when we ask someone for a forecast, we shouldn’t expect an honest answer.

There can be good reasons to engage the sales force in forecasting; we just can’t assume this is always the case. In the spirit of “economy of process,” unless there is solid evidence that input from the sales force has improved the forecast (to a degree commensurate with the cost of engaging them), we are wasting their time—and squandering company resources that could better be spent generating revenue.

On a personal note, if you find that your salespeople are not improving the forecast, then you’ll make them very happy—and give them more time to sell—by no longer requiring their forecasting input. Going back to the original question that started this discussion, rather than implementing new software to gather sales input, it may be simpler, cheaper, and ultimately more beneficial not to pursue their input at all.

REFERENCES

  1. Chen, F. (2005). Salesforce incentives, market information, and production / inventory planning. Management Science 51 (1) (January), 60–75.
  2. Fildes, R., and P. Goodwin (2007). Good and bad judgment in forecasting. Foresight: International Journal of Applied Forecasting 8 (Fall), 5–10.
  3. Fildes, R., P. Goodwin, M. Lawrence, and K. Nikolopoulos (2009). Effective forecasting and judgmental adjustments: An empirical evaluation and strategies for improvement in supply-chain planning. International Journal of Forecasting 25 (1), 3–23.
  4. Gilliland, M. (2013). Forecast value added: A reality check on forecasting practices. Foresight: International Journal of Applied Forecasting 29 (Spring), 14–18.
  5. Gilliland, M. (2014). To gather forecasting input from the sales force—or not? The Business Forecasting Deal March 14. http://blogs.sas.com/content/forecasting/.
  6. Gonik, J. (1978). Tie salesmen’s bonuses to their forecasts. Harvard Business Review (May–June), 116–122.
  7. Mantrala, M. K., and K. Raman (1990). Analysis of sales force incentive plan for accurate sales forecasting and performance. International Journal of Research in Marketing 7, 189–202.
  8. Mello, J. (2009). The impact of sales forecast game playing on supply chains. Foresight: International Journal of Applied Forecasting 13 (Spring 2009), 13–22.
  9. Mentzer, J. T., and M. Moon (2005). Sales Forecasting Management (2nd ed.). Thousand Oaks, CA: Sage Publications.
  10. Wilson, J. E. (2008). How to speak sales. IBF Supply Chain Forecasting Conference. Phoenix, AZ, February 2008.
  11. Wotruba, T. R., and M. L. Thurlow (1976). Sales force participation in quota setting and sales forecasting. Journal of Marketing 40(2) (April), 11–16.

4.10 Good and Bad Judgment in Forecasting: Lessons from Four Companies10

Robert Fildes and Paul Goodwin

Introduction

If you are a forecaster in a supply chain company, you probably spend a lot of your working life adjusting the statistical demand forecasts that roll down your computer screen. Like most forecasters, your aim is to improve accuracy. Perhaps your gut feeling is that a statistical forecast just doesn’t look right. Or maybe you have good reason to make an adjustment because a product is being promoted next month and you know that the statistical forecast has taken no account of this.

But if you are spending hours trying to explain the latest twist in every sales graph or agonizing over the possible impact of Wal-Mart’s forthcoming price cut, is this time well spent? Would it make any difference to forecast accuracy if you halved the number of adjustments you made and spent your newly found free time chatting with colleagues at the water cooler about the Broncos’ latest signing, Wayne Rooney’s soccer injury, or the best beaches in the Caribbean?

To answer this question, we have carried out an in-depth study of four British-based supply chain companies:

  1. A nationwide retailer
  2. A leading international food company
  3. A subsidiary of a U.S. pharmaceutical company
  4. A manufacturer of own-label domestic cleaning products

We collected data on over 60,000 forecasts, interviewed the companies’ forecasters, and observed forecast review meetings where managers discussed and approved any adjustments that they thought were necessary. The results allowed us to identify which types of adjustments tend to improve accuracy substantially, which make the forecasts worse, and which make little difference, but simply waste management time. We supplemented this data with survey evidence of 149 (mostly U.S.) forecasters.

Adjustments Galore

Adjusting forecasts is certainly a popular activity in all our companies, as shown in Figure 4.11. In fact, the forecasters spend so much time making adjustments that they are probably making a significant contribution to world demand for headache tablets.


Figure 4.11 Percentage of Company Forecasts That Are Adjusted

Those working for the food manufacturer adjusted 91% of the forecasts that had been generated by their expensive and sophisticated forecasting software. The four forecasters employed by the retailer adjusted only about 8% of their forecasts, but then they had over 26,000 forecasts to make each week, so there probably wasn’t enough time to put their mark on each forecast. The pharmaceutical company held 17 forecast review meetings every month, tying up about 80 person hours of valuable management time. On average 75% of the statistical forecasts in our companies were adjusted. Our survey of forecasters (Fildes and Goodwin, 2007) tells much the same story, with just 25% of the forecasts based only on a statistical method. Judgment, either used exclusively (25%) or combined with a statistical forecast (50%), was regarded as important or very important by most of the respondents.

What sort of adjustments did the forecasters make? Many of the adjustments were small, and in some cases very small. It was as if forecasters sometimes simply wanted to put their calling card on forecasts by tweaking them slightly to show that they were still doing their job. Indeed, we received anecdotal evidence from a consultant that people at review meetings tend to adjust more of the forecasts that are presented earlier in the meetings, rather than later on. As the meeting progresses they tire and feel that they have already done enough to justify the meeting, so later forecasts are simply waved through.

Of course, showing that they were still alive was not the only reason the forecasters made adjustments. They usually felt that they had good justifications for making them and we found that often this was the case. The problem is that people have a tendency to find a ready explanation for every movement in the sales graphs, including those swings which are really random. And this makes them overconfident that their adjustments will increase accuracy.

Our customers were stocking up two months ago because they were anticipating a price increase so our sales swung upwards.

Okay, they didn’t stock up in the previous year when they knew there was going to be a price increase because interest rates were high and there was a lot of uncertainty about.

We are brilliant at inventing theories for everything we observe. Scott Armstrong (1985, p. 54) discusses a case where a Nobel laureate published a hypothesis to explain an oddity in the graph of a macroeconomic variable. Later, it was shown that the anomaly was the result of an arithmetic error. At 13:01 on a December day in 2003 after Saddam Hussein had been captured, the price of U.S. Treasuries rose. Half an hour later, the price fell. Taleb (2007, p. 74) reports that the Bloomberg News channel used the capture of Saddam to explain both price movements. The unfortunate, dull statistical forecast can offer no competition to these colorful, but often groundless, tales and so it gets adjusted.

The Illusion of Control

All this adjustment behavior can have some odd consequences, according to psychologists. When we engage in activities that involve skill and effort, we normally believe that we have more control over what we are doing. For example, if you develop your skills and invest effort in learning to play a musical instrument, you will make fewer mistakes. The same applies to controlling the ball in a sport like football. But many of the swings in a sales graph are beyond the forecaster’s control. They are the result of random, unpredictable events. Yet, because forecasters see making adjustments as a skillful activity, they can develop the false belief that they have control over the demand that they are trying to forecast and hence that they can predict the movements in the sales graph. This phenomenon is called the illusion of control. It’s likely to motivate you to make even more adjustments. After all, the more you adjust, the more control you think you have.

When Do Adjustments Improve Accuracy and When Do They Not?

Despite these concerns, judgmental adjustments to statistical forecasts can still play a useful role in improving accuracy. Our study found that on average they lowered the mean absolute percentage error (MAPE) by 3.6 percentage points for all companies except the retailer. But this modest improvement masks considerable variation in the effectiveness of the adjustments. Is it possible to filter out the type of adjustments that are likely to be useless or even damaging to accuracy?

We first examined how the direction of adjustments affected forecast accuracy. We contrasted adjustments that increased the forecast (positive adjustments) with those that lowered it (negative adjustments). For one of our four companies, Figure 4.12 shows the extent to which these adjustments led to improvements. The results are typical of our three non-retail companies.


Figure 4.12 Effect of Adjustments by Size and Direction (% improvement measures the reduction in Median APE, so higher is better)

The graph breaks the size of the adjustments into quartiles: Quartile 25% represents the smallest 25% of the adjustments while quartile 100% represents the largest quarter of the adjustments. Two results are immediately apparent: (1) larger adjustments tend to improve accuracy and (2) negative adjustments tend to be much more beneficial than positive.
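
For readers who want to reproduce this kind of breakdown on their own data, the sketch below shows one way it could be computed. It assumes a table of adjusted forecasts with columns stat_fcst, final_fcst, and actual; it is an illustration, not the code used in our study.

```python
import numpy as np
import pandas as pd

def ape(forecast, actual):
    """Absolute percentage error."""
    return np.abs(forecast - actual) / np.abs(actual)

def improvement_by_size_and_direction(df: pd.DataFrame) -> pd.Series:
    """df holds adjusted records only, with columns stat_fcst, final_fcst, actual."""
    df = df.copy()
    df["direction"] = np.where(df["final_fcst"] > df["stat_fcst"], "positive", "negative")
    df["size_quartile"] = pd.qcut((df["final_fcst"] - df["stat_fcst"]).abs(),
                                  4, labels=["25%", "50%", "75%", "100%"])
    df["stat_ape"] = ape(df["stat_fcst"], df["actual"])
    df["final_ape"] = ape(df["final_fcst"], df["actual"])
    # % improvement = reduction in Median APE relative to the statistical forecast
    grouped = df.groupby(["direction", "size_quartile"], observed=True)
    return (grouped["stat_ape"].median() - grouped["final_ape"].median()) * 100
```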

Why are larger adjustments more likely to improve accuracy? To make a large adjustment takes some nerve. Senior managers may be on to you if you make a big adjustment and then things go badly wrong. This means that the larger adjustments are likely to be made for very good reasons. You are likely to have reliable information about some important future events that will cause the statistical forecast to have a large error. In contrast, the smaller adjustments are the tweaks that we mentioned earlier or the result of a forecaster hedging his or her bets because information about a future event is unreliable. The lesson is clear: While small adjustments, by definition, can do relatively little harm to accuracy, they are generally a waste of time. Doodling in your notepad is likely to be more productive and certainly more therapeutic.

Why do the positive adjustments fare so much worse than the negative? Psychologists tell us that people have an innate bias toward optimism. For example, most construction projects usually take longer to complete and cost far more than was originally predicted. Some of this may be a result of deliberate misrepresentation (see Flyvbjerg et al., 2005) to gain contracts, but there is evidence that optimism bias still plays a significant role in these poor estimates. It seems, therefore, that when our company forecasters are asked to estimate the effects of a sales promotion campaign or a price reduction, they cannot resist being overly optimistic. And of course this reflects the enthusiasm of their colleagues in sales or marketing. In contrast, when they make a negative adjustment they are much more realistic in their expectations.

A particularly damaging intervention is called a wrong-sided adjustment. For example, this occurs when you adjust the forecast upward but should have made a negative adjustment. Suppose that the statistical forecast was for 600 units and you adjusted upward to make a forecast of 650 units. When actual sales turn out to be 580, you’ll realize that your adjustment was in the wrong direction. Any wrong-sided adjustment is bound to reduce accuracy. Yet surprisingly, our companies made a large number of these adjustments, particularly when the direction of adjustment was positive. More than a third of the positive adjustments made by the nonretailers were in the wrong direction. If we could remove even 50% of the wrong-sided positive adjustments, accuracy would be improved by 7 percentage points. For negative adjustments the effects were much more limited.
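
One way to flag these cases in a forecast database is to compare the direction of the adjustment with the direction the forecast actually needed to move. The following sketch is a hypothetical helper, not code from our study, applying the definition above:

```python
import numpy as np

def wrong_sided(stat_fcst, final_fcst, actual):
    """True when the adjustment moved the forecast in the wrong direction.

    Example from the text: statistical forecast 600 adjusted up to 650,
    actual 580; the adjustment went up when it should have gone down."""
    adj_direction = np.sign(final_fcst - stat_fcst)
    needed_direction = np.sign(actual - stat_fcst)
    return (adj_direction != 0) & (adj_direction != needed_direction)

print(wrong_sided(600, 650, 580))  # True
```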

We investigated whether the wrong-sided adjustments might be a result of misjudging the timing of promotion effects (e.g., expecting an immediate uplift in sales when the actual increase is delayed) but found no evidence of this. Once again, misplaced optimism seems to be the most likely explanation.

But how can forecasters make fewer wrong-direction mistakes? We’ve explored some possible solutions. We believe that the first stage in eliminating wrong-sided adjustments is to catalogue the reasons behind each and every adjustment. In our survey 69% of companies claimed to do this. But of the companies we observed, none collected this information effectively. Second, when large errors have occurred, a post-mortem on the reasons has the potential to help the next time similar incidents threaten. And this should be done as part of a forecast quality improvement program rather than in an atmosphere of blame. An effective forecasting support system can help by encouraging the compilation of the historical record to make life easy for the forecaster to look back at past events (such as promotions) and to reflect on how today’s circumstances match with the past record. In our research we showed just how this can be done through the design of effective software that lets the forecaster examine the past record of similar analogous promotions (Lee et al., 2007).

The Importance of Definitions

So far, we have not mentioned the retailer. When we analyzed the accuracy of the retailer’s adjustments, they looked awful. The positive adjustments its forecasters made more than doubled the MAPE from 32% to 65%. Moreover, 83% of these adjustments were either too large or in the wrong direction. Something odd was going on. Why would the forecasters of a major national company be spending so much time and effort making mediocre statistical forecasts so much worse?

Most people would probably consider a forecast to be an estimate of the most likely level of future demand. It turned out that the retail forecasters were estimating a different quantity. Often they were trying to determine the levels of demand that only had a small chance of being exceeded—that is, the level that would limit stockouts. Determining this level would tell them how much inventory they needed to hold. For example, their statistical forecasting system might provide a demand forecast of 500 units but they would adjust this upwards to 550 units, reasoning that this level of inventory would be sufficient to cover anything but the most extreme level of demand. In an informal way they were forecasting fractiles, as discussed by Goodwin in the Hot New Research Column in Foresight, Summer 2007. So our MAPEs were unfairly measuring the effectiveness of the forecasters’ adjustment because they were not trying to predict the actual demand.
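
A more explicit version of what the retail forecasters were doing informally is to choose a target service level and compute the corresponding demand fractile. The sketch below assumes normally distributed demand and uses illustrative numbers; it is not the retailer's actual method.

```python
from scipy.stats import norm

demand_forecast = 500      # most-likely (mean) demand from the statistical system
demand_std = 80            # assumed standard deviation of demand over the period
service_level = 0.95       # target probability of not stocking out

# The stocking fractile: the demand level exceeded only (1 - service_level) of the time.
stock_level = norm.ppf(service_level, loc=demand_forecast, scale=demand_std)
print(round(stock_level))  # about 632 units to cover demand 95% of the time
```

Making the service-level target explicit, as here, is exactly what the retailer had not done.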

However, there were still serious problems with the retail forecasters’ approach. First, they had never clearly defined what they were forecasting. They simply referred to their adjusted figures as “forecasts,” posing the obvious danger that other managers would wrongly interpret these as estimates of the most likely level of demand and then make decisions based on this assumption. Second, their approach was informal. They had never determined what probability of a stockout was appropriate in order to balance inventory-holding costs against the costs of disappointing customers (see Catt, 2007). Nor had they done any analysis to see whether their adjustments were leading to over- or understocking for the many products they sold.

Finally, the forecasters were trying to do two jobs at once. They were adjusting the statistical forecasts for special events like promotions and, at the same time, adjusting them to estimate inventory requirements. They may have been taking on too much. The evidence from psychologists is that humans have limited information-processing capacity and that better judgments can be obtained by breaking judgmental estimation down into simpler and easier tasks—a process called decomposition.

History Is Not Bunk

Henry Ford is alleged to have said that history is more or less bunk. Many of the forecasters in our companies had the same philosophy. In review meetings they examined the most recent movements in sales graphs with forensic intensity while they often ignored earlier data. In one company, the forecasters told us that they never fit their statistical methods to demand data that are more than three years old because “back then, the trends were different.” Sometimes the software they had bought seemed to share the same attitude—the active database only went back three years!

There was no evidence that they had tested this claim. So great was the bias toward recency that sometimes statistical methods were only fitted to the last six months’ data. This did not give these methods much of a chance. As Rob Hyndman and Andrey Kostenko wrote in the Spring 2007 issue of Foresight, statistical methods can require quite lengthy periods of data to detect underlying patterns, even when the demand data is well behaved and the level of randomness in the series is relatively low. Moreover, the methods commonly found in business forecasting software are designed so they can adapt to changes in trends or seasonal patterns if these occur. If you restrict the data available to your statistical methods, then you are unlikely to be making judgmental adjustments from a reliable baseline.

Conclusions

Judgmental adjustment of statistical forecasts is a crucial part of the forecasting process in most companies. It is often not practical to use statistical methods to model the effect of forthcoming events that you know are likely to have a big impact on demand. Management judgment then has to step in to bridge this gap and, if applied correctly, it can bring great benefits to forecasts. However, our study has shown that these potential benefits are largely negated by excessive intervention and overoptimism. Indeed, had our nonretail forecasters been banned from making positive adjustments to their forecasts, but still been allowed to make negative adjustments, their judgmental adjustments would have improved the MAPE by over 20 percentage points, rather than the mediocre 3.6 points that we reported earlier.

In most companies, however, banning all positive adjustments would not be a realistic strategy. The answer is to make these adjustments with more care and only on the basis of better market information. In the long run, software enhancements might be helpful here.

Our study also emphasizes the importance of having a clear definition of what you are forecasting. It’s not good for morale when a colleague complains you’ve overforecasted demand by 40% when that’s not what you were trying to predict.

Finally, we leave you with these recommendations on your adjustment policy.

  • Accept that many of the movements in your sales graph are random. You have no control over them and they cannot be predicted.
  • Small adjustments are likely to waste time and effort and may damage accuracy.
  • Positive adjustments (moving the statistical forecast upwards) should only be made with care. Be especially cautious about being too optimistic.
  • Give statistical forecasting methods a chance; they need plenty of data to detect underlying patterns in demand.
  • Define clearly what you are forecasting.

Acknowledgments: This research was supported by Engineering and Physical Sciences Research Council (EPSRC) grants GR/60198/01 and GR/60181/01. Michael Lawrence, Kostantinos Nikolopoulos, and Alastair Robertson contributed substantially to the data analysis.

REFERENCES

  1. Armstrong, J. S. (1985). Long-Range Forecasting (2nd ed.). New York: John Wiley & Sons.
  2. Catt, P. M. (2007). Assessing the cost of forecast error: A practical example. Foresight: International Journal of Applied Forecasting 7, 5–10.
  3. Fildes, R., and P. Goodwin (2007). Against your better judgment? How organizations can improve their use of management judgment in forecasting. Interfaces 37 (6), 570–576.
  4. Flyvbjerg, B., M. K. Skamris Holm, and S. L. Buhl (2005). How (in)accurate are demand forecasts in public works projects? The case of transportation. Journal of the American Planning Association 71 (2), 131–146.
  5. Goodwin, P. (2007). Supermarket forecasting: Check out three new approaches. Foresight: International Journal of Applied Forecasting 7, 53–55.
  6. Hyndman, R. J., and A. V. Kostenko (2007). Minimum sample size requirements for seasonal forecasting models. Foresight: International Journal of Applied Forecasting 6, 12–15.
  7. Lee, W. Y., P. Goodwin, R. Fildes, K. Nikolopoulos, and M. Lawrence (2007). Providing support for the use of analogies in demand forecasting tasks. International Journal of Forecasting 23, 377–390.
  8. Taleb, N. N. (2007). The Black Swan. London: Allen Lane.

4.11 Worst Practices in New Product Forecasting11

Michael Gilliland

New product forecasting (NPF) is perhaps the most difficult and thankless of all forecasting endeavors. Organizations commit significant resources to new product development and release, and sometimes even “bet the company” on promising new ideas. Yet the foundation on which such decisions are made—the forecast of unit sales and revenue—may be very shaky, ill-conceived, and implausible. This article identifies several of the “worst practices” that can plague new product forecasting.

Unrealistic Accuracy Expectations

Perhaps the most fundamental worst practice of new product forecasting, like any kind of forecasting, is to have unrealistic expectations for the accuracy of the forecasts.

Forecasting is about knowing the future, something humans (and organizations) are not necessarily very good at accomplishing. Consistently accurate forecasting is possible when three conditions are met:

  1. The behavior we are forecasting (e.g., demand for a product) is guided by a structure or rule (known as the data generating process [DGP]).
  2. There is not too much randomness in the behavior (i.e., demand follows the DGP quite closely).
  3. The DGP does not change within our forecasting horizon (i.e., demand follows the DGP in all future time periods we are forecasting).

When these conditions are met, along with the assumption that we understand the DGP and have it correctly expressed in our forecasting model, then we can generate accurate forecasts. If any of these conditions don’t hold, or if we fail to correctly express the DGP in our model, then accurate forecasting is much less likely.
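
As a toy illustration of these conditions, consider a simple data generating process with a stable level and trend plus modest noise. The numbers and the model below are invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(36)

# Condition 1: a stable data generating process (level 100, trend +2 per period).
dgp = 100 + 2 * t
# Condition 2: demand follows the DGP closely (modest noise) -- the history a model would be fit to.
demand = dgp + rng.normal(0, 5, t.size)

# Condition 3: the DGP holds over the horizon, so a model that captures it forecasts well.
horizon = np.arange(36, 42)
forecast = 100 + 2 * horizon
future_demand = 100 + 2 * horizon + rng.normal(0, 5, horizon.size)

mape = np.mean(np.abs(forecast - future_demand) / future_demand)
print(f"MAPE over the horizon: {mape:.1%}")   # small, because all three conditions hold
```

Increase the noise, or let the trend shift inside the horizon, and the same model's errors grow quickly; a brand-new product starts from an even weaker position, since the DGP itself is unknown.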

In new product forecasting, we of course have no prior demand behavior to analyze. Even using the method of forecasting by analogy (where we look to the prior introduction of similar products), determination of the DGP is still highly speculative. It would seem reasonable, therefore, to temper our faith in the accuracy of new product forecasts.

What we want to avoid is making huge, dangerous bets based on the assumption of accurate new product forecasts. If the above argument isn’t convincing, then take a look at your history of past new product introductions. Look at the forecasts prior to the product releases, and look at the actuals as they rolled in after the release. How accurate were your forecasts? Some perhaps were reasonably good, but most were poor to awful. Each new product is released with an expectation for success in the marketplace, even though we know most new products fail. Looking at your organization’s history of forecasting new products gives you the hard evidence of your new product forecasting capability.

Reverse Engineering the Forecast

Well before a new product is released, there has usually been a management review of the new product idea so that it can be “approved” and funded for development. As part of the review process, product management (or whoever is pitching the new product idea) is expected to provide estimates of product cost and demand, so that the financial implications of the new product can be assessed.

The review board may set a minimum units or revenue hurdle that must be met to obtain approval for the new product idea. Not surprisingly, those proposing the new product idea will provide a forecast high enough to meet the hurdle. Generating such a forecast is not an objective and dispassionate process, seeking an “unbiased best guess” at what demand is really going to be. Rather, generating that forecast was an application of reverse engineering, starting with the forecast that was needed for management approval, and then figuring out some justification for why that forecast is correct.

Reverse engineering the new-product forecast is just a special case of the worst practice of evangelical forecasting, where the “forecast” (which should be an unbiased best guess of what is really going to happen) gets contaminated by the “target” (what management wants to see happen). Such an environment can be quite disheartening to the professional forecaster, whose role becomes not to objectively guess the future but to generate numbers that management will approve.

Cherry-Picking Analogies

Forecasting by analogy is a commonly used method for generating new product forecasts. In this approach, we start by identifying similar (analogous) products that have been introduced in the past. Analogs are selected by what we believe are relevant attributes, such as type of product or its function (e.g., smart phone or men’s dress shirt), features, size, color, intended market, etc. As Ken Kahn (2006) stated in his book, New Product Forecasting: An Applied Approach, “Looks-like analysis (i.e., forecasting by analogy) is a popular technique applied to line extensions by using sales of previous product line introductions to profile sales of the new product.”

The use of analogies is not, in itself, an unreasonable approach to NPF. What better way to gain perspective on what might happen with a new product than to look at the history of prior introductions of similar products? The worst practice comes into play, however, when the analogs are “cherry picked” to favor the desired outcome.

Figure 4.13 shows the first 20 weeks of sales for a group of analogous DVDs (all sharing the two attributes Rating [= R] and Genre [= Horror]). While they all share a similar profile (biggest sales in the first week of release, and rapidly falling thereafter), the actual units sold in the release week range from around one thousand to nearly one million. (A three-orders-of-magnitude range is not very helpful for supply planning!) It might be possible to narrow this range by using additional attributes (e.g., movie director or starring actors) to reduce the set of analogous DVDs; but is there any justification for this? It is easy for the unscrupulous forecaster to limit the set of analogs to those that were successful in the marketplace, thereby supporting the desired level of sales, while ignoring other perfectly analogous products that failed.

Figure 4.13 First 20 Weeks of Sales for Analogous DVDs

Source: Gilliland and Guseman (2009)

Ignoring the Uncertainty

Forecasts are usually expressed as a single point (e.g., sales forecast = 500 units), but a better approach is to also provide a range (prediction interval) about that single point where the actual is likely to fall. This range is an important consideration in decision-making, as we may select an entirely different course of action when the forecast is 500 ± 50 units compared to when the forecast is 500 ± 500 units.

For ongoing products with lots of available history, you can get a reasonable sense of the prediction interval. (Note: The prediction interval may be much wider than you wish it would be!) However, for NPF when there is no history, how would we determine this? Here, forecasting by analogy shows its main value. The history of analogous products (as we saw in Figure 4.13) can give you a sense of the range of likely outcomes.
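
The history of the analogs can be turned directly into such a range. Here is a minimal sketch in Python (with made-up release-week figures and an arbitrary 10th-to-90th-percentile choice) of one way to derive a point forecast and a rough empirical range from analogous products:

```python
import numpy as np

# Hypothetical release-week unit sales for a set of analogous products,
# spanning roughly three orders of magnitude (as in Figure 4.13).
analog_week1_sales = np.array([1_200, 3_500, 8_000, 15_000, 42_000,
                               110_000, 260_000, 540_000, 910_000])

# Point forecast: a central tendency of the analogs (the median is robust to outliers).
point_forecast = np.median(analog_week1_sales)

# Rough empirical range: the 10th to 90th percentile of the analog outcomes.
lower, upper = np.percentile(analog_week1_sales, [10, 90])

print(f"Point forecast: {point_forecast:,.0f} units")
print(f"Approximate range: {lower:,.0f} to {upper:,.0f} units")
```

The wide range that results is not a flaw in the method; it is an honest statement of the uncertainty the decision maker should plan around.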

Insisting on a Wiggle

A time series of historical sales is almost invariably “wiggly.” There are not only the major ups and downs of trend, seasonality, and long-term cycles, but there is randomness or noise. When we release a new product to the marketplace, we would expect its sales to be wiggly as well. So, is our forecasting model wrong if it doesn’t wiggle?

A good statistical model removes the noise and expresses the underlying data generating process. This model will generate forecasts with the major ups and downs of trend, seasonality, long-term cycles, and event-related impacts, but will not reintroduce the randomness and noise that caused all the little wiggles in history. Forecasts are sometimes criticized because they are not wiggly enough (they do not look enough like history), but this criticism is misguided. It is actually a bad practice (overfitting) to try to model the noise.
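
The contrast can be illustrated with a minimal sketch (simulated data, made-up numbers): simple exponential smoothing applied to a noisy series produces a flat forecast that tracks the underlying level without reproducing the wiggles.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated history: a stable level of 500 units per period plus random noise
# (the noise is what makes the history "wiggly").
history = 500 + rng.normal(0, 50, size=36)

# Simple exponential smoothing, implemented directly: the smoothed level
# tracks the underlying signal rather than the noise.
alpha = 0.2
level = history[0]
for y in history[1:]:
    level = alpha * y + (1 - alpha) * level

# Every future period gets the same forecast: a flat line with no wiggle,
# which is what a noise-free view of this particular series should look like.
forecast = np.full(12, level)
print(f"Smoothed level: {level:.1f}")
```

A forecast that reproduced the historical wiggles would simply be restating past noise, not predicting anything.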

The Hold-and-Roll

It is not unreasonable to expect forecasts to become more accurate as we get nearer the period being forecast. However, if you find your forecasts get worse the closer you get to the period being forecast, this may be a very bad sign. But how can such a thing happen?

Organizations are fond of making all sorts of plans, such as the annual operating plan. A technique favored by some management teams is the hold-and-roll, where plan misses are rolled into future periods to maintain (“hold”) the annual plan. Thus, if we miss January sales by 10%, that miss would be added to the sales forecasts for February (or spread across February through December), so the total for the year would be unchanged. Several months of downside misses, added to the forecasts of remaining future months, could lead to even greater forecast errors in those months.
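
A minimal worked example (made-up numbers) shows the mechanics: a January miss is spread across the remaining months so the annual total is preserved, which inflates every later month's forecast.

```python
# Hypothetical annual plan: 100 units per month, 1,200 for the year.
plan = {month: 100.0 for month in range(1, 13)}

actual_january = 90.0                  # January comes in 10% below plan
miss = plan[1] - actual_january        # 10 units short of the annual plan

# Hold-and-roll: spread the miss over the remaining 11 months so the annual
# total is "held" -- each later month's forecast is now higher than before.
for month in range(2, 13):
    plan[month] += miss / 11

print(round(plan[2], 1))   # 100.9 -- February must now beat the original plan
# The paper total is unchanged: 90 actual + revised Feb-Dec plans = 1,200.
print(round(actual_january + sum(plan[m] for m in range(2, 13)), 1))
```

If demand is genuinely running below plan, each hold-and-roll revision pushes the remaining months further from what is likely to happen.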

In his book Demand-Driven Forecasting, Charlie Chase (2009) points out a comical example of dealing with the Sales Department:

A common response to the question “What makes you think you can make up last month’s missed demand next month?” is that “The sales organization will be more focused.” “But weren’t they focused last month?” “Yes, but they will be more focused next month.”

Would such a response give anyone more confidence in Sales’ ability to hit the numbers?

Ignoring the Product Portfolio

New products are not released in isolation; rather, they are part of a portfolio of all the company’s offerings. Sometimes the new product is completely new to the world (an invention), or part of a product category that the company previously didn’t offer. In such cases, it may be reasonable to assume that all new sales are incremental to the sales of existing company offerings. Also, an existing product may be released into a new market (e.g., into additional countries), and again the new volume would likely be incremental to existing volume.

In the case of new products that are simply improvements or extensions of existing product offerings, it is unlikely that sales of the new product will be entirely incremental. It would be wrong to model the new product’s demand as a distinct, independent entity and not part of an integrated portfolio. Cannibalization and halo effects are likely, as are phase-in/phase-out effects when the new product is intended to replace its predecessor.
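
As a minimal sketch (all figures hypothetical), a portfolio view nets out cannibalization and adds any halo effect, rather than treating the new product's forecast as entirely incremental:

```python
# Hypothetical figures for a line-extension launch.
new_product_forecast = 10_000   # units expected for the new item
cannibalization_rate = 0.60     # share of those units taken from the existing item
halo_units = 500                # extra sales of complementary items driven by the launch

cannibalized_units = new_product_forecast * cannibalization_rate
net_incremental = new_product_forecast - cannibalized_units + halo_units

print(f"Cannibalized from the existing item: {cannibalized_units:,.0f} units")
print(f"Net incremental volume for the portfolio: {net_incremental:,.0f} units")
```

The cannibalization and halo rates are themselves judgment calls, but making them explicit keeps the portfolio math honest.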

Using Inappropriate Methods

As with any other forecasting exercise, it is important to fit the method used to the problem being solved. NPF problems include new-to-the-world inventions, product improvements or line extensions, and new categories or markets. Also, new products can be intended to have short (e.g., fashion items or magazine issues) or long life cycles. Forecasting in these different situations will require different approaches, including both judgment and statistical modeling, although not necessarily traditional time-series modeling. A worst practice is to expect one method to work well across all of these types of NPF situations, and not be open to creative new approaches to this vexing business problem.

(Thanks to Andrew Christian, Snurre Jensen, Michael Leonard, Charlie Chase, Peter Dillman, Priya Sarathy, Udo Sglavo, Tammy Jackson, Diana McHenry, David Hardoon, and Gerhard Svolba, who contributed suggestions for this article.)

REFERENCES

  1. Chase, C. (2009). Demand-Driven Forecasting: A Structured Approach to Forecasting. Hoboken, NJ: John Wiley & Sons.
  2. Gilliland, M., and S. Guseman (2009). Forecasting new products by structured analogy. Journal of Business Forecasting (Winter), 12–15.
  3. Kahn, K. (2006). New Product Forecasting: An Applied Approach. Armonk, NY: M.E. Sharpe.

4.12 Sales and Operations Planning in the Retail Industry12

Jack Harwell

As the postwar industrial revolution in Japan reshaped its economy, Japanese manufacturers adopted practices that enabled them to threaten the existence of American manufacturers. The threat materialized in the last few decades of the 20th century through shifting customer preferences: American customers recognized the superior quality and better value of Japanese goods and increasingly bought them.

To avoid extinction, manufacturers in the United States responded to this threat by aggressively improving the quality of their products, their production processes, and their supply chains. Retailers are now facing pressures similar to those that American manufacturers felt in the last century.

Challenges facing retailers in the United States have increased dramatically. The proliferation of stock-keeping units (SKUs), growth in retail space, and increased pricing competition are forcing retailers to recognize that successful retailing requires more than just exceptional merchandising. Improving the supply chain is emerging as a key competitive weapon in the retail industry.

Historically, a small assortment of basic products in most categories was sufficient to meet the needs of the consumer. There were relatively few items, and the dominant store format was the inner-city department store. In the last half of the 20th century, fashion merchandise became the primary driver of revenues and profits. Specialty stores became more popular as shopping centers moved to the suburbs and the product choices within virtually every category exploded. As retailers began to source more products from overseas, consumer expectations for lower prices resulted in extended supply chains. Demand for a particular SKU became less predictable as the abundance of choice tempted consumers to vary their buying habits.

As these market changes occurred, retailers struggled to keep up with outdated supply chain capabilities. Historically, fashion products were bought in bulk and kept in a warehouse or the backroom of each store until it was time to display them for sale. This often resulted in overstocks or lost sales, and costly product markdowns became a strategy not only to increase sales and profits but also to maintain cash and inventory flow.

Successful retailers have looked to manufacturing to learn how to improve their operations. They are learning how to apply the techniques developed and proven in the manufacturing industry to upgrade their retail planning and distribution processes. Quick response and lean distribution are based on putting lean manufacturing concepts into operation along the retail supply chain. More and more retailers are applying lean and Six Sigma methodologies to keep up with consumer demands and with those competitors who have become early adopters of these techniques.

Sales and Operations Planning

One discipline that has been very successful in improving the planning capabilities in manufacturing is sales and operations planning (S&OP). S&OP was developed in manufacturing in the late 1980s as an extension of MRP II. The concept of S&OP is simple: It is a structure of internal collaboration to resolve the natural tensions between supply and demand, to reach consensus on a single sales plan that every department in the company will support, and to coordinate and communicate operational plans required to achieve the sales plan.

Planning in retail also involves natural tensions that must be resolved. The availability of products, both in larger assortments and quantities, supports more sales. However, this comes with the cost of carrying inventory, reduced cash, obsolete merchandise, and severe markdowns. In retail, the balance of supply and demand can be viewed as optimizing two constraints—cash and space—while capitalizing on demand. Productivity of cash and space is essential to success in retail.

Three Plans

There are three plans that are critical in any retail organization. These are the assortment/life cycle plan, the promotion plan, and the sales and inventory plan. These plans address the opportunities in the marketplace while optimizing results. If a retail organization can organize an S&OP process around these plans, there is a high probability this process will add value to the organization.

Assortment/Life Cycle Planning

Planning assortments, a key activity of the retail merchant, is typically performed in two stages: long-term assortment planning and detailed assortment planning. Long-term assortment planning involves determining which categories of products best represent the brand image of the company, serve the needs of its customers, and attract new customers in support of the corporate strategy. Once it is determined that a category fits the overall assortment strategy, the merchant must perform detailed assortment planning.

Detailed assortment planning consists of choosing the items that will be sold by the retailer in the assortment. Before selecting the items, certain constraints must be considered. These constraints include:

  • Cash available for purchasing the merchandise and the related inventory budget
  • Display space in the store
  • Marketing funds
  • Required profits

Within these constraints, a selection of items must be chosen to meet customer expectations, which may include color, size, features, and quality. Typically, the customer wants to have multiple options when they make a purchase. However, the ideal assortment rarely involves providing every available option; to do so would probably violate one or more constraints. In fact, the essence of assortment planning is to select the number and variety of items within a category that best balances demand against available space and cash. This also includes pricing items to achieve certain profit goals and to guide the consumer toward the items that maximize profit.
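
This selection can be framed as a small constrained-optimization problem. The following is a minimal sketch of a greedy heuristic in Python; the items, costs, space requirements, and budgets are made-up figures, and a real assortment plan would use richer data and a more rigorous optimization.

```python
# Hypothetical candidate items: (name, unit buy cost, display space, expected gross profit).
candidates = [
    ("basic tee",    6.0, 1.0,  4.0),
    ("premium tee", 12.0, 1.0,  7.0),
    ("hoodie",      20.0, 2.5, 11.0),
    ("jacket",      45.0, 4.0, 18.0),
    ("cap",          4.0, 0.5,  2.5),
]
cash_budget, space_budget = 60.0, 6.0

# Greedy heuristic: rank items by expected gross profit per unit of display
# space, then add each item only if the cash and space budgets still allow it.
assortment, cash_used, space_used = [], 0.0, 0.0
for name, cost, space, profit in sorted(candidates, key=lambda c: c[3] / c[2], reverse=True):
    if cash_used + cost <= cash_budget and space_used + space <= space_budget:
        assortment.append(name)
        cash_used += cost
        space_used += space

print(assortment)              # items selected within the constraints
print(cash_used, space_used)   # cash and space consumed by the selection
```

A greedy pass like this will not always find the best mix, but it makes the trade-off explicit: every item added must earn its display space and its share of the inventory budget.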

Many activities of the retail organization depend on the assortment decisions. Where to put merchandise, how to display it, how much to purchase of each item, and other questions are answered by the assortment.

A related activity that is critical to assortment planning is life cycle planning. Because of changing technology, the customers’ demand for variety, and the need to extend the brand to remain relevant to an ever-changing marketplace, many products offered by today’s retailer have a limited life cycle. The typical consumer product goes through five major life cycle phases: (1) product introduction, (2) sales growth, (3) maturity, (4) decline and markdown, and (5) liquidation. Within a category, there may be many items at each life cycle phase.

Shepherding products through their product life cycles requires considerable planning and coordination. Decisions made through each stage include pricing, quantities forecasted and purchased, marketing strategies, and discount strategies to minimize product costs and maximize profits over the entire life cycle.

Promotion Planning

Since promotional events are designed to increase sales volumes, draw traffic into stores to purchase other products, and increase brand awareness, they are a staple for most retailers. These events typically take the form of product advertising in various media and in-store. Promotional planning addresses media, store operations, product selection and pricing, merchandise deliveries, and other activities required to successfully maximize profits and customer satisfaction during a promotional event.

Planning media that supports a promotion starts with choosing the geographic and socioeconomic markets that will yield the best results. The media type (newspaper insert, direct mail, broadcast, etc.) is then identified.

Product planning is also critical to the success of a promotional event. Products are selected because they are expected to attract customer interest. Typically, this is based on seasonal preferences, technological innovations, and current fads. These time-based attributes imply that the retailer must be able to make quick decisions, respond to changes in market conditions, and change direction in a coordinated manner.

Product availability, inventory budgets, and advertising budgets are all constraints that must be managed in a promotion. S&OP is an ideal structure to facilitate decisions to optimize profits from promotional activities.

Sales and Inventory Planning

Sales planning is fundamental to establishing company goals, and it serves as the basis for measuring success in reaching those goals. Sales and the associated gross profits must be planned on both a financial and a unit basis.

Financial sales planning is often referred to as top-down planning: corporate goals are established first and then drive supporting activities such as staffing, financing, and inventory budgeting. Typically, financial sales plans are managed at levels of the product hierarchy above the item level and in monthly or quarterly periods.

Unit sales plans, on the other hand, are required to execute the financial sales plans and to identify which items must be purchased, displayed, and promoted. These unit sales plans are defined by item and by weeks or days.

Additionally, inventory planning is required at the financial level as well as at the unit level. The financial inventory plans help formulate the inventory budget and purchasing plan—typically referred to as open to buy. Unit inventory plans are used to identify which items to purchase, in what quantities, and when. Inventory flows from suppliers through distribution centers to the points of sales are determined from the unit inventory plans.
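
One common formulation of the open-to-buy calculation can be sketched as follows; the figures are hypothetical, and individual retailers define the terms and valuation basis (retail versus cost) differently.

```python
# Hypothetical monthly open-to-buy calculation (all figures at retail value).
planned_sales         = 400_000
planned_markdowns     =  30_000
planned_end_inventory = 650_000
beginning_inventory   = 600_000
on_order              = 120_000   # merchandise already committed for delivery this month

# Open-to-buy: what the merchant can still commit this month without
# overshooting the planned end-of-month inventory position.
open_to_buy = (planned_end_inventory + planned_sales + planned_markdowns
               - beginning_inventory - on_order)

print(f"Open to buy: {open_to_buy:,.0f}")   # 360,000 in this example
```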

Sales and inventory planning is the process of reconciling the financial and unit plans for both sales and inventory. Without this reconciliation, the company goals are at considerable risk: sales may not materialize, and inventory may run short or in excess.

Three Levels

When considering S&OP, one should view the organization as having three basic levels of structure: the execution level, the executive level, and the C-level. The execution level participants include the individuals and managers who develop and execute the three plans. The executive level consists of those mid-level managers, directors, and vice presidents who are charged with setting policy, approving plans, and resolving issues among departments. The C-level is made up of senior executives who are accountable for establishing and meeting the expectations of the company’s stakeholders.

Execution Level

The amount of coordination and communication required in a retail company to successfully execute the three plans is enormous. The S&OP process is a structure that can greatly enhance the company’s ability to formulate and execute these plans across multiple functions.

Organizations vary from one retail company to another. However, there are five basic functions that are involved in developing and executing the three plans of retailing. These are merchandising, marketing, finance, supply chain, and store operations. Though companies combine these functions into roles in different ways, descriptions of the five functions follow.

  • Merchandising determines the assortment, approves the product, establishes pricing policies, and is accountable for the sales, gross profit, and inventory plans. The merchant owns the product plans and is responsible for obtaining financial results for their assigned product categories. This is a key function in the S&OP process.
  • Marketing sets the advertising and brand strategy, manages the advertising budget, determines how to display products, defines store formats, and plans media and other events.
  • Finance reconciles the sales, gross profit, and inventory plans with the corporate financial strategy.
  • Supply chain activities include planning and executing the distribution of merchandise through the supply chain.
  • Store operations manages the execution of displays, formats, and inventory control at the store level, interfaces with customers, and makes the sale.

The individuals working at the execution level determine assortments; plan promotions; project sales, margin, and inventory; place purchase orders with suppliers; plan and execute distribution to stores; and evaluate the financial impact of these plans. They work together on a daily basis to review and analyze results and take action to respond to the unfolding dynamics of the market, making minor corrections as required to achieve their goals.

In the S&OP structure, participants working at the execution level manage both item plans (SKUs and quantities) and financial plans (sales, gross profits, and inventory dollars aggregated at various levels of product hierarchy). Teams are formed to align with product categories or subcategories, enabling everyone to focus on the same strategies and results.

Once detailed plans are formulated, actual results are evaluated and corrective actions are developed. These must then be translated into an overall picture of where the company is going and what is required to get there. This picture must be painted at the summary level, with only the amount of detail required to clarify significant issues. Decisions that can be made at the execution level are filtered out, leaving only those requiring support from senior management. The purpose of this translation is to effectively communicate with and request assistance from the executive staff involved. Typically, the discussion at the executive level is kept at the category or department level.

Executive Level

The responsibilities of the executive level in S&OP include resolving conflicts between organizations, approving major decisions made at the execution level, and ensuring that the plans and results coming out of the execution level meet the corporate objectives.

Representatives of the execution level meet with those of the executive level to review the plans, results, and issues for their assigned products. Decisions that can be made at this level are documented and communicated to the rest of the execution team.

C-Level

As leaders of the organizations involved in sales and operations planning, the C-level executives must be kept informed of progress with regard to plans and goals, major actions required to remain on track, and opportunities that have developed in the market. This is done by combining the information from the various categories into one consolidated view of the business. This high-level view is used to communicate results and future expectations to the company owners.

Sales and Operations Planning Escalation Process

The interaction among the execution, executive, and C-level staff is a sequence of escalation and filtering, as described above. Plans, issues, and decisions required are filtered up through the organization, while decisions are cascaded down. This process of escalation and filtering is an efficient way to keep senior executives involved in the S&OP process.

Figure 4.14 is an illustration of how the various levels interact, significant issues and information are escalated, and decisions are cascaded.

Figure 4.14 S&OP Escalation Process

Other Keys to S&OP Success

One important aspect of an S&OP structure is the frequency of meetings. In most cases, it is sufficient to review and discuss the three plans mentioned above monthly. However, when designing the meeting structures at the execution level, it is advised that a separate meeting be held for each plan. This allows sufficient time on the agenda to adequately cover all topics.

The timing of sales and operations planning meetings should consider when the financial plans must be developed. In a public company, this is usually dictated by the schedule set by a board of directors to support announcements and disclosure to stockholders and the Securities and Exchange Commission. Even though not accountable to the public, private companies also require periodic review and resetting of financial plans. Most companies—public and private—have adopted a monthly financial review.

It is also important to structure the timing of the meeting levels so that the executive level and C-level meetings follow the execution level. This allows the execution level participants to identify issues, develop actions to resolve them, and reset plans.

Figure 4.15 Meeting Cadence: over a four-week month, the financial planning cadence covers the sales plan and inventory plan; the weekly S&OP cadence covers the sales/inventory plan, the assortment/life cycle plan, and the promotional plan; and the monthly S&OP cadence covers the executive review and the C-level operations review.

Adequate time must be available to articulate a comprehensive view of the business to senior management. In addition, those larger issues and changes to the plan that the execution team cannot resolve or approve must be communicated to the next level for approval. Figure 4.15 is an example of a meeting cadence that can be used for S&OP.

Goals and Key Performance Indicators

It is critical that metrics, or key performance indicators (KPIs), are defined to establish goals and measure success. These KPIs should take into account the objectives of the company and measure the ability to manage the constraints.

In a retail environment, there are two main constraints to consider: display space and inventory levels. Gross profit return on space (GPROS) and gross profit return on inventory (GPROI) are effective in measuring the ability to optimize profits within these constraints. Because these KPIs reflect the efforts of all participants and the cross-functional nature of the S&OP process, putting an emphasis on them will keep everyone involved focused on success. GPROS and GPROI are measured as follows:

GPROS = Gross Profit ÷ Average Display Space

GPROI = Gross Profit ÷ Average Inventory Value
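
As a minimal worked example (hypothetical figures, and assuming display space is measured in square feet and inventory at its average value, which individual retailers may define differently):

```python
# Hypothetical category results for one quarter.
gross_profit        = 250_000.0
avg_display_space   =   5_000.0   # square feet allocated to the category
avg_inventory_value = 400_000.0   # average inventory carried during the quarter

gpros = gross_profit / avg_display_space     # gross profit return on space
gproi = gross_profit / avg_inventory_value   # gross profit return on inventory

print(f"GPROS: {gpros:,.2f} gross profit dollars per square foot")
print(f"GPROI: {gproi:.2f} gross profit dollars per dollar of inventory")
```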

Top-Level Support

S&OP is a highly cross-functional activity. Because of this, top level support of the process is critical to success. However, it is not sufficient for senior leadership to simply express their support in words. It is great to have a powerful cheerleader on your side, but more is required. Senior managers must define KPIs, agree on reporting format, establish expectations for information flow and decision making, require participation from all, and expect actions to be completed on time.

As mentioned before, KPIs must be designed to measure the performance of everyone involved in S&OP. When top management identifies KPIs and sets goals using these metrics, a definition of success emerges. People throughout the organization are more likely to make the right decisions if they focus on these KPIs, and they will share in both the successes and the failures.

It is typical for different organizations within a company to use various report formats to support their unique activities. The cross-functional nature of S&OP requires that a standard format for reports be used to communicate plans and record results. To keep everyone on the same page—literally—the report formats must be the same. Because top management ultimately needs to understand the information communicated to their level, it is a good idea for senior managers to define how the information will appear to them. Of course, the actual work in defining the reports will probably be performed by one or more subordinates; however, top-level agreement is critical.

Information flow defines how and when information about plans, results, issues, and decisions is communicated up and down the organization. Top management must define when issues are to be communicated, the degree of granularity, and the rules regarding who must be informed of an issue or make a decision. This ensures consistency among organizations and provides the necessary structure for delegating planning authority at each level.

S&OP demands involvement of all the functions to build and execute effective plans. Senior leaders of the organization must make it clear that there has to be participation at all levels, including attendance at all meetings and the timely completion of assigned tasks. A standard format for action logs, including a record of attendance, will assist management in supporting the process.

If effectively communicated, formats, expectations, and discipline will cascade from the top. It is more likely that everyone will fall in line with top management guidance if this communication is clear and consistent.

Conclusion

U.S. manufacturers have responded to competitive pressures by aggressively reducing costs and improving quality. Among the many processes developed to achieve these goals, sales and operations planning is used to gain consensus on a sales plan, to coordinate activities required to achieve these plans, and to manage constraints.

Like other manufacturing methods, this process can be adapted successfully to the retail environment. Addressing the three plans of retailing—assortment/life cycle, sales and inventory, and promotion—the key functions of retail reach a consensus on common goals and the means to achieve them. Once established at the execution level, these plans should be reviewed with the executive level and C-level leaders. Significant issues and required decisions are filtered upward; direction and support are cascaded downward. The process of working together both horizontally and vertically ensures that the entire organization is working toward common goals and is successfully meeting the challenges that retailers face today.

4.13 Sales and Operations Planning: Where Is It Going?13

Tom Wallace

Sales and operations planning (S&OP) or executive S&OP has emerged as a highly effective process for managing a business. As a result, it’s quite popular: just about everyone is either “doing S&OP” or implementing it. Contained in this thought are both good news and bad news. The good news is that S&OP is robust.

When done properly, S&OP can generate enormous benefits for a company, which we divide into two categories: hard benefits and soft benefits. Hard benefits are those that can be quantified, including among others:

  • Higher levels of on-time shipments
  • Lower inventories
  • Shorter customer lead times
  • Reduced time to launch new products
  • Improved plant productivity

Soft benefits can’t be quantified, but in many successful S&OP-using companies, they’re considered of equal importance or nearly so. Soft benefits include:

  • Enhanced teamwork
  • Improved communications—institutionalized
  • Better decisions with less effort and time
  • Better $$$ plans with less effort and time
  • Greater accountability
  • Greater control
  • Window into the future

Today some companies are receiving all of these benefits, and more. But now for the bad news: Not all companies are getting all or most of the benefits; some are receiving less than half, some are near zero, and some haven’t yet started. The reason for this sub-par performance is that, in almost all cases, S&OP was not implemented properly. However, when people see that more than a few companies are not being successful with the process, the tendency is to blame the process rather than the people who implemented it. This, of course, is too bad because it may lead them to think that, since the process is not working, it should be discontinued. I get a fair number of questions along these lines, and here’s a sample of the more common ones:

  1. Q. How long will S&OP stay popular?

    A. If by popular we mean red hot, as it’s been for the past roughly half dozen years, my guess is another six years or so. On the other hand, if by popular we mean widely used in businesses around the world, I believe strongly that it’s here to stay. As long as there are businesses, manufacturing and otherwise, it’ll be here even long after we’re all gone. (Double-entry bookkeeping, one of the important foundations of Generally Accepted Accounting Principles, was codified in the 15th century in Italy, and it’s still widely used.)

    The Adoption Curve, as shown in Figure 4.16, suggests that there is roughly a 20-year lag between the initial creation of a process and its widespread adoption (Inflection Point A). After a period of intense implementation of the process, its growth levels off and the growth rate of the process becomes that of industry in general (Inflection Point B). We can see that S&OP is following the same curve as Six Sigma, Lean, and ERP.

    Figure 4.16 The Adoption Curve

  2. Q. Will S&OP change and evolve?

    A. This is virtually certain. What’s equally certain is that it will change less than it stays the same. The changes will be primarily peripheral rather than being in the heart of the process . . . in extensions and enhancements, but not in fundamentals. A good example is MRP. Back in its heyday, a frequently asked question was: Will it change a lot, and might it go away? Today, we see it stronger than ever: evolving into ERP, more widely used beyond manufacturing, but with the same basic capabilities that were present in the 1970s.

  3. Q. Will the S&OP software affect its growth?

    A. Yes it will, and for the better. The world is becoming a far more complex place, and the level of complexity that is prevalent today didn’t exist a dozen years ago. Back in those days, we were saying, “You don’t need software to be successful with S&OP; you can get by with Excel.” Today we have a proliferation of products and channels of distribution, as well as shorter life cycles and longer lead times. One can say, “You don’t need software to be successful if your business is relatively small and simple.” As shown in Figure 4.17, if you’re at the lower end of the diagonal—a simple business with few products and materials, short lead times, highly cooperative customers, and so on—you may need S&OP but won’t need S&OP software.

    Figure 4.17 Complexity, Change, and Coordination

    Figure 4.18 shows that as you move up the diagonal toward more and more complexity and change, your need for first-rate S&OP software increases right along with it. Complexity increases as a result of doing business globally; lengthy, fragile, and non-agile supply chains; the need to respond quickly to changes in market dynamics; demand for ever-higher customer service levels; and the availability of large amounts of data. For most companies now, the days of “Blood, Sweat, and Excel” are over.

    Figure 4.18 Complexity, Change, and Coordination

  4. Q. Will S&OP become a real-time process?

    A. In one sense, it already is. For example, in an Executive Meeting, when data is needed to support a given decision, it often can be gathered, synthesized, and presented in the meeting on the fly. However, I don’t believe that real time in the true sense of the word will ever play much of a role. If S&OP becomes truly real time, it will no longer be S&OP and the job that S&OP does today will almost certainly not be done. Why? Because S&OP is not a short-term scheduling and optimizing tool; rather it’s a medium-to-long-term planning process, directed by executive management, and it sets the conditions for success when the medium and long terms move into the near term. Using short-term, real-time tools to address medium and long-term issues is the managerial equivalent of using a saw to hammer nails. Does it make sense to try to use in real time the tool that has monthly time buckets and operates on a monthly basis? I think not.

  5. Q. But what about companies that do weekly S&OP?

    A. I’ve seen some of these and they do not qualify as S&OP. Rather they look more like a type of Master Scheduling. S&OP operates with aggregated data such as product families; Master Scheduling focuses on individual products and stock keeping units (SKUs). I have nothing against a weekly meeting, in addition to S&OP. That way, you’ll be using a saw to cut wood and a hammer to drive nails. That works. A weekly meeting to address next week’s schedules and other issues can be very helpful (as it was for me back in the old days).

  6. Q. One of the factors that adds complexity is globalization. Since most companies today do business globally, don’t most companies need to do global S&OP?

    A. Not necessarily. Let’s take the case of the Superior Widgets Company, a manufacturer of widgets for home and industry. Its only plant is located in Nebraska, serving its four sales regions around the world. Guess what? Superior does not need Global S&OP; it can make S&OP work very well with the standard Five-Step Process. (See Figure 4.19.) The plant has to get the demand planning picture from all of the regions before it can complete its supply planning. How could it do otherwise? If Superior at some point adds a plant outside of North America making the same products, then it would need Global S&OP because it would then have two separate demand/supply streams.

    Figure 4.19 The Executive S&OP Process for Superior Widgets, Inc.

  7. Q. New product launch (NPL) is, for many companies, the most difficult thing to do. Can S&OP help in launching new products?

    A. Yes, it can help and has been helping for many years, for decades even. Some of the early adopters of S&OP saw that it was a natural to support the introduction of new products. Why is it a natural? Because NPL is replete with opportunities for things to go wrong: quality and performance issues, production problems, supplier shortfalls, and so forth. As mentioned earlier, S&OP is fundamentally a coordination tool; when things go wrong, S&OP can be very helpful in realigning priorities and creating a new plan. This helps keep the NPL on schedule and makes the entire launch more “sure-footed.”

    A disclaimer: It’s likely that most or all of your existing NPL processes will still be needed, e.g., stage-gate decision making. S&OP’s job is to blend the specific new product plans into the company’s sales and supply plans.

  8. Q. Is S&OP strategic or tactical?

    A. The answer is yes; it’s actually both. It was developed as a tactical planning tool and has filled that role extremely well. However, a first-rate S&OP process can greatly enhance a company’s ability to execute its strategic plans. Here, coordination is the key. A somewhat tongue-in-cheek statement about the CEO’s mission is: Keep the Herd Moving Roughly West. We call it energy alignment. S&OP will work well if all are moving in the same direction. (See Figure 4.20.)

    Figure 4.20 Energy Alignment

    Here are three examples of companies that use S&OP to enhance and execute their strategic plans (Wallace, 2011).

    1. BASF: Around the year 2000, the company set a strategic goal to become No. 1 in the chemical business worldwide. It achieved that goal before the end of the decade, and it credits S&OP with providing substantial help in that regard. How? By optimizing its production plans globally based on gross profit and using the additional profit to fund the necessary growth.
    2. Cisco Systems: It created an entirely new business from scratch, with products quite different from its traditional ones. Cisco people state that S&OP played a major role in enabling them to launch on schedule, ship 98% on time from day one, and keep order fulfillment lead times to less than five days. It is now No. 1 globally in this business.
    3. Dow Chemical: The acquisition of Rohm and Haas was a key element in Dow’s long-term strategy of moving more heavily into specialty chemicals. S&OP is credited with playing a significant role in helping to integrate the two companies—on time and ahead of budget—and was a driving force for generating cost and growth synergies.

    There is a relationship between the strategic/tactical issue and the time frame. In the short term, S&OP is almost totally tactical; medium term it becomes somewhat more strategic; long term, it’s focused heavily on strategic issues.

Summary

Here’s a recap of the issues we focused on:

  1. How long will S&OP be popular? Longer than any of us will be around.
  2. Will it change and evolve? Yes, but the fundamentals will stay the same.
  3. Will S&OP software affect its growth? Yes, and on balance for the better.
  4. Will S&OP become a real-time process? Perhaps partially, but not fully.
  5. What about the companies that do weekly S&OP? Weekly S&OP is not S&OP except in periods when there is extreme seasonality.
  6. Don’t most companies need to do global S&OP? No. Only those with global sources of supply in addition to global demand.
  7. Can S&OP help with launching new products? Absolutely. It has been helping for years, decades.
  8. Is S&OP strategic or tactical? It’s both.

My concluding comment: S&OP is here to stay; it’ll get better and better; and will, at some point in the future, be widely accepted as the standard set of processes with which to run a business. Neglect it at your peril.

REFERENCE

  1. Wallace, T. (2011). Sales and Operations Planning: Beyond the Basics. Montgomery, OH: T. F. Wallace & Company.
