Chapter 6

Managerial Considerations

This chapter covers topics where an astute managerial choice can minimize uncertainties and improve overall performance for businesses affected by some form of waiting line behavior. The areas of consideration include the following:

  • Variances in waiting line performance, which includes looking at variances from the customer and the business points of view plus estimating the risks of worst-case scenarios
  • Waiting line cost factors
  • Cost trade-off analysis
  • Service improvement approaches
  • Use of customer participation to reduce waiting and service time
  • Waiting line configuration considerations
  • Psychological factors (While difficult to quantify, these factors are significant considerations that are interlinked with many other areas in this list.)
  • Arrival rate management methods
  • Options for dealing with priorities, such as more urgent need, rush orders, and preferred customers

Some of you may have jumped ahead to this chapter because you feel that you already know all you have to know about waiting line theory. That may be true for some; but for most, it would be wise to review at least some of the topics in chapters 2 through 5 if parts of the following discussion are confusing.

Variance and Risk Considerations

When I ask business students to select the better business solution after giving them the average performance for each choice, most of them quickly choose the one with the highest average. When more than one choice has the same average performance, the same group often comments that any choice will do.

A few students will look beyond the average performance to considerations of risk or the accuracy of the average performance value before making their choice. To evaluate risk or accuracy, some knowledge of the variances in a process is necessary. Unfortunately, waiting line equations provide only steady-state average performance. Start-up variances are not included in steady-state results; examples are a line of customers already waiting when a business opens its doors, or a set of items left over from the previous day of production that must be completed before new work begins. Larger variances, such as no customers or a large group of customers that arrive all at once during the day, occur less often; but when they do, they can cause significant business disruptions.

Risks such as the probability that the line will be longer than the capacity of the business can be calculated, but exactly when that condition will occur within a given period cannot be predicted. Managers need to remember that waiting line arrivals and service times are memoryless. That is, what happens next is not influenced by what has just happened in the past. For example, when flipping a coin, the next coin toss result is unaffected by whether there were two heads, two tails, a head and a tail, or a tail and a head for the previous two tosses.
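As a rough illustration of such a risk calculation, the sketch below assumes the simplest single-server model (M/M/1 with unlimited waiting room), where the steady-state probability of finding more than K customers in the system is the utilization raised to the power K + 1. The function name and the numbers are hypothetical and are not taken from the chapter examples.

```python
def prob_system_exceeds(arrival_rate, service_rate, capacity):
    """Steady-state P(more than `capacity` customers in an M/M/1 system).

    Assumes Poisson arrivals, exponential service, one server, and no
    limit on line length (an assumption made for this sketch).
    """
    rho = arrival_rate / service_rate  # utilization factor
    if not 0 <= rho < 1:
        raise ValueError("steady state requires arrival_rate < service_rate")
    return rho ** (capacity + 1)


# Hypothetical numbers: 27 arrivals/hour, 30 services/hour, room for 10 people.
risk = prob_system_exceeds(27, 30, 10)
print(f"P(more than 10 in system) = {risk:.3f}")  # prints 0.314
```

The result is a long-run fraction of time. It says nothing about when during the day the overflow will happen, which is the point made above.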

One variance often not considered is the potential variance in normal service and arrival times when waiting lines are much longer or shorter than usual. This is often referred to as a state-dependent variance or rate.1

When waiting lines get long, some newly arriving customers may be discouraged from entering the line (balking), and some current customers may choose to give up and leave (reneging). At the same time, many servers feel the pressure to speed up their work to help move the line faster. As a result, this creates three general managerial concerns:

  1. Arrival rate variances caused by balking or reneging result in lost business opportunities, which were discussed in earlier chapters.
  2. When the line is short, a server may unnecessarily chat with a current customer or otherwise work more slowly. This reduces the opportunity for the server to do other work useful for the business.
  3. When the line is long, servers may work faster by taking shortcuts and perhaps not being as careful to prevent mistakes, which, when they occur, often offset any time gained because of the time required to correct them. This affects the customer’s perception of the service quality received and can be quite serious if the mistake is preparing or assembling the wrong order, omitting something from the order, or making an error in recording the payment and giving the customer change.

Clear managerial communication and good server training can minimize server variances. Servers should know that they will not be penalized for remaining careful and avoiding shortcuts when lines are long, and they should know what they are expected to do during times when no customers are present.

In manufacturing lines, some of the arrival rate variances can be minimized through careful production scheduling; but when dealing with customers, we have to work with arrival rate variances unless the business is conducive to using an appointment or reservation system. In customer service lines, reneging and balking can be reduced by opening up a temporary express window for customers with standard, easy-to-do needs. An advantage of this approach is that you can use a staff person who may not be trained in the full range of services offered. An example of this is opening up a window strictly for customers only picking up mail and/or packages at post offices when the line is long.

So, how do we assess potential variances and risks in waiting line performance? We can use three basic steps:

  1. Be aware that variances and risks exist and often can be much larger than one would normally expect. In addition, we need to accept that average values predicted by equations are the exception rather than the rule for individual customers or items in a queue. Figure 2.3 illustrates this situation by showing some of the performance data for the first 100 customers entering a typical coffee shop.
  2. Collect data about how the business operates. Observing the number of arrivals per period and the number of customers or items in the system at the end of each period allows a determination of waiting times using Little’s Law and an estimation of service times. Processing point-of-sale (POS) data can provide estimates of the percentage of different customer classes and service time distributions, so one can develop more accurate discrete distributions for arrival and service rates in simulation models. Some of these data collection methods are discussed in more detail in chapter 7.
  3. Simulation runs using an accurate model of the given business process can be used to provide more accurate performance values than the queuing models can predict. This is particularly true when the arrival and service distributions for a business are not well described by the basic probability distributions. The simulation runs also can provide a picture over time as to how the performance values can vary and identify the possible worst-case and best-case scenarios. The caveat here is that the simulation results are useful only if the simulation model accurately represents how the particular business actually operates. Some of the chapter 7 simulation examples illustrate how to calculate data regarding possible variances.
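The three steps above can be sketched in a few lines. The first function applies Little’s Law (average time in system = average number in system divided by the average arrival rate) to observed period counts. The second is a minimal single-server simulation that shows how far individual waits can stray from the average. All names and numbers are hypothetical, and the simulation assumes exponential interarrival and service times with first-come, first-served order; a real model should use distributions fitted to the business’s own data.

```python
import random


def littles_law_time(avg_in_system, arrivals_per_period):
    """Little's Law: average time in system W = L / lambda (in periods)."""
    return avg_in_system / arrivals_per_period


def simulate_waits(arrival_rate, service_rate, n_customers, seed=0):
    """Waiting time in line for each customer of a single-server queue.

    Uses the recursion: next wait = max(0, wait + service - gap to next arrival).
    """
    rng = random.Random(seed)
    waits, wait = [], 0.0
    for _ in range(n_customers):
        waits.append(wait)
        service = rng.expovariate(service_rate)  # this customer's service time
        gap = rng.expovariate(arrival_rate)      # time until the next arrival
        wait = max(0.0, wait + service - gap)
    return waits


# Hypothetical observations over five 15-minute periods.
arrivals = [6, 9, 7, 10, 8]
in_system_at_period_end = [3, 5, 4, 6, 2]
lam = sum(arrivals) / len(arrivals)                                   # 8 per period
avg_in_system = sum(in_system_at_period_end) / len(in_system_at_period_end)  # 4
time_in_system_min = 15 * littles_law_time(avg_in_system, lam)
print(f"Estimated average time in system: {time_in_system_min} minutes")  # 7.5

# Hypothetical rates per minute: 0.45 arrivals, 0.5 services.
waits = simulate_waits(arrival_rate=0.45, service_rate=0.5, n_customers=100)
print(f"average wait {sum(waits) / len(waits):.1f} min, longest {max(waits):.1f} min")
```

Rerunning the simulation with different seeds gives a feel for the best-case and worst-case spread that a single average hides.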

Variances that are very difficult to quantify are differences in customer attitudes. Their tolerance of waiting times and long lines can vary greatly depending on the time of day, the weather, whether they are with a friend or not, whether their service need is urgent or not, any personal time limitations, and so forth. Although we cannot usually address the direct causes of these variations, we can reduce their intensity by considering the psychological factors associated with waiting. Some of these factors and methods for reducing their effects are discussed later in this chapter. The most important consideration to remember when accounting for variability in waiting line business decisions is that the variances experienced by the business are NOT directly related to the variances encountered by customers.

Waiting Line Costs

Many of us have been in the situation where the cost information is scanty, but our boss requires a quick decision. So, what do we do? We make the best decision we can with what data we have and fill in the gaps with our intuition and past experiences. This is a pretty good solution in many business situations, but when waiting line characteristics are involved, we need to recognize that there is a much higher level of uncertainty because we no longer can count heavily on our intuition and experience to guide us to a reasonable answer.

When evaluating cost trade-offs, you should recognize that such analysis is only as good as the understanding you have regarding your business. Estimated values result in only ballpark guesses as to the best choice. Even worse is when you are unaware of a background activity that is necessary for you to be able to provide your service. Many of the textbook examples for waiting lines focus only on “front-office” activities when evaluating cost. However, some process changes when directly dealing with customers can have a significant effect on “back office” processes. Therefore, it is strongly recommended that you and your staff develop a service blueprint2 of your business to help take into account all the costs and performance factors affecting its outcome.

The simple graph in Figure 1.4 shows the basic idea that the total operational cost of a waiting line has some minimal value per customer served or item produced when the proper balance of resources and customer satisfaction is achieved. Adding more resources and/or improving service performance reduce the waiting time and the length of the queue.

  • For manufacturing applications, this correlates to faster throughput time and reduced work-in-process (WIP) inventories.
  • For repair and maintenance activities, this means that equipment is not out of service for as long, which reduces the size of the equipment fleet required for adequate service capacity.
  • For customer applications, this correlates to retaining more customers, reducing the possibility of balking, and increasing the business volume that can be accommodated.

The basic trade-off that must be considered, therefore, is as follows: “If I add more resources to improve service, is there enough reduction in waiting and other costs to more than pay for those resources?”
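A minimal sketch of this trade-off follows, assuming a multiple-server M/M/s model and just two cost components: a staffing cost per server-hour (CS) and a waiting cost per customer-hour spent in the system (CW). The function names and all numbers are hypothetical. A real analysis would include the other cost components that apply to the business.

```python
from math import factorial


def mms_avg_in_system(lam, mu, s):
    """Average number of customers in an M/M/s system (infinite if overloaded)."""
    a = lam / mu   # offered load
    rho = a / s    # utilization per server
    if rho >= 1:
        return float("inf")
    p0 = 1.0 / (sum(a**n / factorial(n) for n in range(s))
                + a**s / (factorial(s) * (1 - rho)))
    lq = p0 * a**s * rho / (factorial(s) * (1 - rho) ** 2)  # average in line
    return lq + a  # in line plus in service


def total_cost_per_hour(lam, mu, s, c_server, c_wait):
    """Resource cost plus waiting cost per hour for s servers."""
    return c_server * s + c_wait * mms_avg_in_system(lam, mu, s)


# Hypothetical inputs: 20 arrivals/hour, 12 services/hour per server,
# $20 per server-hour, $15 per customer-hour in the system.
costs = {s: total_cost_per_hour(20, 12, s, 20.0, 15.0) for s in range(1, 7)}
best = min(costs, key=costs.get)
for s, cost in costs.items():
    print(f"{s} server(s): ${cost:,.2f} per hour")
print(f"Lowest total cost with {best} servers")
```

With these inputs, adding a third server pays for itself through reduced waiting, while a fourth does not.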

Table 6.1. Some Cost Components to Be Considered When Seeking a Minimal Cost Balance Between Waiting Costs and Resource Costs
Cost | Symbols | Components | Data Sources
Balking | CB | Lost customers, bad PR | Marketing
Blocking | CB | Limited capacity K | M/M/s/K model
Reneging | CB | Line not moving fast enough | Marketing
Waiting time | CW | Waiting time in line, total time in service process | All models and simulation
Throughput time | CW | Waiting time in line, total time in service process, capacity | Simulation
Financing | C$ | Credit interest, insurance | Accounting
Spoilage | C$ | Production loss, step yields, perishable inventory, theft | Accounting and manufacturing
Warehouse space | CH, CF | Inventory holding costs, rent | All models and accounting
Lost business capacity | C$ | Repair time, line length, population size N, late penalties | M/M/s//N model
Appointment delays | C$ | Appointment schedule, service time variation | All models and simulation
Staffing | CS | Salaries, benefits | Accounting and human resources
Equipment | CE | Price, maintenance, spare parts, depreciation | Vendor and accounting
Facility | CF | Construction, permits, rent, maintenance, capacity, security, insurance, utilities, janitorial services | Vendors, accounting, service process

Table 6.1 lists several cost values that are needed to make useful waiting line business decisions. The list is not intended to be all-inclusive, but it provides enough items to help you identify the types of costs that you should consider for your business decisions. Table 6.1 also gives some symbols that we will use here to simplify the equations for analyzing cost trade-offs. Many textbook authors use C to represent cost, with a subscript or parenthetical notation to represent the particular type of cost. For example,

C(Total) = C(Waiting) + C(Service)

or

$C_{\text{Total}} = C_{\text{Waiting}} + C_{\text{Service}}$.

Here we will use a larger set of cost symbols, which are defined in greater detail in appendix B. This helps us remember that there are several waiting-cost and service-cost components, as indicated in Tables 6.1 and 6.2.

It is rare that you can trade off one waiting cost factor against one resource-cost factor without having some effect on other cost factors. Therefore, do not be surprised if the result of your decision is not exactly what your analysis told you.

Table 6.2. Some Waiting and Resource Costs to Be Considered When Working With Different Categories of Waiting Line Applications
Typical Application | Waiting Costs: CB, CL, CW, and CH | Resource Costs: CS, CE, and CF
Coffee shop, bank | Balking, blocking, waiting time, reneging | Staffing, equipment, facility
Single source: passport office, department of motor vehicles, licensing, other government agencies | Waiting time | Staffing, equipment, facility
Manufacturing line | Throughput time, financing, spoilage, warehouse space | Equipment, staffing, facility
Self-service | Reneging, waiting time | Equipment
Repair and maintenance | Lost business (capacity reduction), spare equipment | Staffing, equipment, facility
Call centers | Blocking, reneging, waiting time | Staffing, equipment, hold capacity
Parking lots | Blocking, reneging, waiting time | Facility, maintenance
Health clinics, professional services | Appointment delays | Staffing, facility

Volume-related costs, such as utilities3 and the materials associated with what each customer or item requires for service, are usually ignored in cost trade-off analyses unless the supplies are perishable (such as food) and the cost decision has a direct effect on the amount of spoilage. Where things get complicated is how we handle floor space costs. If these costs are solely due to business volume (more total customers require more space), then they are usually not a factor in determining the minimal total value. But if more floor space is required to avoid turning away potential customers, then the profits gained from the increased number of customers per period must exceed the increased facility cost per period.

Lost business capacity and lost profit costs are similar in that both lose a potential profit. Lost business capacity is when you lose some of your normal capacity for a short time, such as when a resource is delayed in the repair shop when you had planned to use that resource to satisfy an existing customer order. Lost profit is when your normal capacity is insufficient to accommodate a customer who would have entered your business to place an order if not discouraged by a long line or insufficient capacity for waiting customers. Lost business capacity often has additional losses because you may have to pay a late-delivery penalty to the customer, or the customer may cancel the order after you have already put some work into its completion.

A tricky cost component is the percentage of server idle time. Most businesses include it as part of the service costs associated with a server. However, when P0 is more than 10% to 15% of an individual server’s time, there is an opportunity to reduce total costs by asking the server to do some routine maintenance or other support tasks instead of assigning them to back-office staff. In businesses with a staff of several servers, this practice could reduce the overall staffing needs by one or more people.

You may notice that I did not specifically list line length in the waiting costs column for Table 6.2. Line length plays a part when considering a limited capacity situation (M/M/1/K and M/M/s/K models), where the length of the line affects blocking costs. However, it is the time spent waiting in line that is the primary cost driver in many situations. The old adage “time is money” applies very well here. To help illustrate why line length is not as important as waiting time, consider Example 6.1. You may feel the background information is more than needed, but the intent is to illustrate the importance of considering factors that, at first glance, may not seem relevant to the goal of reducing total costs.

A special category included in Table 6.2 is when there is only one possible source for the service, such as a government agency (passport office, driver’s license bureau, business registration, the courts, airport security, etc.). In such situations, balking, lost customers, and customer satisfaction are not likely costs, and the only issue of primary interest is providing enough capacity in the form of facilities and staff to meet the average demand within some maximum time limit at the lowest cost possible. I have not included post offices in this category because many countries have alternatives to official government agencies for mail and package shipments.

Several examples follow to illustrate approaches for determining cost trade-offs for waiting line situations. Example 6.1 describes some basic cost calculations, and Examples 6.2 and 6.3 illustrate the nature of things to consider for more complex cost trade-off analyses. For a good discussion regarding total cost analysis for queuing situations, readers are referred to the online chapter 26 supplement by Hillier and Lieberman (2010).4

To illustrate situations where reasonable service improvements could pay off in lower total cost, we will discuss three more examples. The first is a classic repair service problem, where long turnaround times could cost a company significant money. The second is a manufacturing system where WIP inventory is expensive. The third is evaluating the benefit of adding more space to accommodate longer lines. Working through the solutions is left to the reader as an exercise; each example discusses the situation, indicates what costs should be considered, and suggests how one might go about determining the best solution.

Improving Service Performance

There are several basic goals to consider when improving service performance. Some of these are strictly from the business perspective—reducing total cost, completing services faster, and accommodating higher customer volume. Others are from the customer perspective—not having to wait as long for service, a lower service charge, perceived fairness, and a more pleasant waiting experience. Because some of the possible approaches apply to both perspectives, they are listed first.

  • Reducing the variation in service times reduces the average waiting time, even if the average service time remains the same (see chapter 4).
  • Reduce the average service time.
  • Reduce the average waiting time.
  • Add capacity—more servers, more waiting space.

Variation reduction is normally the least expensive approach because you are working on using existing resources more effectively. Standardizing all or part of the services provided reduces the variation in service time and reduces the average waiting time, even when the average service time remains the same. In Example 6.1, Samantha standardized part of her service by having sandwiches ready-to-go rather than assembling them to order. This reduced both variability and the duration of the average service time, allowing a single server, Samantha, to handle more customers. She could provide more selections without adding to the service time by setting up a condiment station so that customers can add their own condiments rather than asking her to do it, which would increase her average service time and also add more variability to that time.

Of course, Samantha will have to spend some time to maintain the condiment station, but this can be done during any idle time when there are no customers. For her utilization factor of 0.9, P0 = 10%, which gives her an average of 6 minutes per hour to maintain the station. The hidden cost factors to be considered are Samantha’s choice of method for obtaining ready-to-go sandwiches—typically a classic make versus buy decision.
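The idle-time arithmetic above is easy to check for other utilization levels. The short sketch below assumes a single server whose probability of an empty system is P0 = 1 minus the utilization factor; the function name is hypothetical.

```python
def idle_minutes_per_hour(utilization):
    """Average idle minutes per hour for a single server with the given utilization."""
    p0 = 1.0 - utilization  # probability that no customers are present
    return 60.0 * p0


idle = idle_minutes_per_hour(0.9)
print(f"Idle time at 90% utilization: {idle:.1f} minutes per hour")  # 6.0
```

Keep in mind that these minutes arrive in scattered fragments rather than as one block, so only tasks that can be interrupted are good candidates for idle-time work.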

Another approach for reducing service time variability is to replace the manual parts of a service with more automated methods. This in effect is another form of standardization most often applied to manufacturing situations, but it can also be used effectively in service situations. For example, consider the coffee machines at many gasoline station convenience stores. The cost trade-off considered here is whether it is less expensive to have a person keep a hot pot of coffee ready to pour when a customer wants a cup or to have a machine prepare the coffee for the customer on demand. Doing the cost comparison involves not only the difference and variability in average service times between the two options but also the consideration of spoilage with the manual method, variation in the coffee preparation time, equipment and maintenance costs for either option, and response times for a sudden jump in customer volume (probability the coffee pot is empty, the time to brew a full pot versus the fixed response time for a machine).
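To see how much service time variability alone can matter, the sketch below uses the standard Pollaczek-Khinchine result for a single server with Poisson arrivals and a general service time distribution (the M/G/1 model). The choice of this formula is an assumption made for illustration, and the function name and numbers are hypothetical.

```python
def mg1_wait_in_queue(arrival_rate, mean_service, sd_service):
    """Average wait in line for an M/G/1 queue (Pollaczek-Khinchine formula)."""
    rho = arrival_rate * mean_service  # utilization factor
    if rho >= 1:
        raise ValueError("steady state requires utilization below 1")
    second_moment = sd_service**2 + mean_service**2  # E[S^2]
    return arrival_rate * second_moment / (2 * (1 - rho))


# Hypothetical: 0.45 arrivals per minute, 2-minute average service time.
wait_exponential = mg1_wait_in_queue(0.45, 2.0, 2.0)  # sd equals the mean
wait_constant = mg1_wait_in_queue(0.45, 2.0, 0.0)     # fully standardized service
print(f"Exponential service: {wait_exponential:.1f} min in line")  # 18.0
print(f"Constant service:    {wait_constant:.1f} min in line")     # 9.0
```

With the same average service time, removing all service time variation cuts the average wait in half in this model. Real standardization efforts fall somewhere between the two cases.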

Reducing the average service time reduces the average waiting time while increasing the capacity of the service system. Adding more servers is often the first strategy employed. Strictly speaking, it leaves each server’s average service time unchanged but raises the overall service rate of the system, which reduces waiting. But this also increases the probability of no customers in the system (idle time). Normally, the cost of adding servers is offset in the cost analysis by reductions in waiting-time costs and expected increases in customer volume. A hidden cost factor that is not often considered is the possibility of using idle server time to do other tasks required by the business that are not directly related to serving customers. Some examples are managing inventory and cleaning and restocking equipment or service areas, such as a convenience store coffee machine or Samantha’s condiment station. This, of course, requires servers who have more flexible job skills and thus may require a higher salary. The primary focus here should be that the added staff is hired primarily to be a server.

The opposite approach is when a staff member hired primarily for a back-office activity is asked to serve customers during periods of peak demand. The limitation of course is that such staff is often not very skilled in performing the full range of service activities, but they can be used for providing simpler, more standardized services during peak demand periods. This often works well for situations such as banks, post offices, and grocery stores where the customer arrival rates are highly variable during the day.

Reducing the average service time by eliminating unnecessary (non–value-added) steps is a common approach discussed in process management texts, but it can be difficult to do when there is little standardization in the service process. What can be done is to eliminate the execution of some steps by the server and have the customer do them instead. Some examples are having customers fill out forms (health insurance, food orders, address labels, etc.) while they wait instead of the server asking questions and filling out the form for the customer. There is also a psychological benefit in doing this, which will be discussed in more detail later in this chapter.

Another way to reduce the average service time without adding servers is to inform customers about what is expected from them when they reach the server. In fast-food restaurants and coffee shops, displaying the menu with prices in clear view of the line helps reduce the time customers spend at the server window deciding what they want to order. Promptly displaying the amount of payment due after receiving the order reduces more service time by allowing a customer to decide on the payment method or collect the appropriate change while the server either processes the order or forwards it on to the kitchen or the barista. This leads to one of my big rules regarding customer-server interactions: Uninformed customers require more service time. Use the time spent waiting in line to educate them.

The current best example of this is the airport security check-in process. Clear and properly located information signs5 can prepare the customer as to what the customer must do at each step instead of servers having to take time to explain individually what is expected of each customer at each stage of the check-in process. This saves the server’s voice and lowers frustration on both the part of the server and the customer.

Reducing the average waiting time is very difficult to do unilaterally because it is so strongly related to the arrival rate and service rates. But it can be reduced for some customers in the arrival population by either separating them into different service classes with different average service times or by assigning them a higher priority. Some common examples of this are as follows:

  • Express lines at grocery stores for customers with only a few items
  • Windows dedicated for mail pickup at the post office
  • Bank tellers dedicated to commercial business customers
  • Frequent-traveler express lines at airports, car rental agencies, and hotel check-ins
  • Separate coffee shop line for customers just wanting a standard cup of house coffee
  • Self-service stations or checkout lines for standardized services
  • Moving airline travelers who would otherwise miss their flight to the head of the airline counter check-in lines
  • Taking care of more seriously ill patients in an emergency room first

The disadvantage with methods that reduce the average service time and/or the average waiting time for some of your customers by serving them according to priority or class is that if you retain the same resource level with the same total average population, your average service rate will remain the same, but the variability in the waiting time experienced by your customers will increase: Some will have a shorter average waiting time, and others will have a longer average waiting time. The old adage “there is no such thing as a free lunch” holds true in waiting line operations.

The “classes” approach requires a multiple-server situation with at least two servers—more if there are several classes to be served separately. Thus, if you do not already have multiple servers, you are in effect adding resources to reduce waiting time. The “priority” approach can be used in both single-channel and multiple-channel waiting line models.

In the classes and priority approaches, the overall average throughput rate remains unchanged provided that in multiple-server applications a server who is normally assigned to higher priority or a specific class of customers is expected to take care of lower priority or other classes of customers when that server is idle (jockeying is allowed in that case). Therefore, from a business viewpoint, capacity and average throughput rate remain unchanged. From the customer viewpoint, some of them are served faster at the expense of other customers now having to wait longer.
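The "no free lunch" effect can be illustrated with a small calculation. The sketch assumes a single server, Poisson arrivals, the same exponential service rate for every class, and nonpreemptive priority (a customer in service is never interrupted). It uses the standard priority-queue waiting time formula for that case. The function name and numbers are hypothetical.

```python
def priority_waits(class_rates, service_rate):
    """Average wait in line per class, highest priority first.

    Assumes one server, Poisson arrivals, nonpreemptive priority, and the
    same exponential service rate for all classes.
    """
    total_rate = sum(class_rates)
    if total_rate >= service_rate:
        raise ValueError("steady state requires total arrivals < service rate")
    w0 = total_rate / service_rate**2  # average remaining service seen on arrival
    waits, cumulative = [], 0.0
    for rate in class_rates:
        higher = 1 - cumulative / service_rate  # load from higher classes only
        cumulative += rate
        with_own = 1 - cumulative / service_rate  # including this class
        waits.append(w0 / (higher * with_own))
    return waits


# Hypothetical: 6 priority and 18 regular customers per hour, 30 services per hour.
rates = [6, 18]
mu = 30
wq_priority, wq_regular = (60 * w for w in priority_waits(rates, mu))  # minutes
wq_fcfs = 60 * sum(rates) / (mu * (mu - sum(rates)))  # first-come, first-served
wq_average = (rates[0] * wq_priority + rates[1] * wq_regular) / sum(rates)
print(f"priority {wq_priority:.1f}, regular {wq_regular:.1f}, "
      f"overall {wq_average:.1f}, no priorities {wq_fcfs:.1f} minutes")
```

In this example the priority class waits about 2 minutes and everyone else about 10, while the overall average stays at the 8 minutes that all customers would share without priorities.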

At this point, some of you are likely thinking, “Why bother doing this? It sounds like a no-gain situation that rewards only some customers.” The answer is that sometimes there is little choice but to do it. For a hospital emergency room, it should be clear that having a priority approach is necessary. Rewarding customers who individually contribute a higher percentage of your profits is a standard business practice to encourage them to continue being your customers when competing with other businesses that also provide such incentives.

Less obvious are the potential cost advantages. To keep the mathematics manageable, the basic closed-form waiting line equations assume that all servers are equally capable of providing the same service rate. In chapter 5, we discussed the situation of how long it would take for a new bank teller to achieve that proficiency. Now consider the advantage of a classes approach that would allow new tellers to handle simple transactions, such as deposits and withdrawals, at first and move to more complicated banking transactions later. Their learning curve would then be broken into two or more learning curves with faster learning percentages.

This leads to an important conclusion: Classes allow the use of a range of server skills, which, in turn, allows the possibility of a range of server costs. For example, consider a three-server application where all the servers are expected to be able to handle the full range of expected customer transactions. But in many situations, part of the normal customer population wants only services that could be done using a self-service machine. Automatic teller machines, online banking, Web check-in, flat-rate package mailing, and convenience store coffee machines are some examples of this.

This is where managers earn their salaries. We can do the cost comparisons using the three-server example versus two servers and a machine or even one server versus one or two machines. We can account for the differences in capabilities (flexibility of the server or server substitute) and relative costs, assuming we have a good estimate of how many customers in each class we plan to serve. This is the easier part. What is more difficult is considering the risks involved.

In this situation, if one server becomes ill, it is likely that the service can still be provided but with commensurate longer waiting times. This is also the case when we replace one server with a self-service machine: If the machine breaks down, the two remaining full-service servers can temporarily take up the slack. (In both cases, λ/µ must be less than two for this to be possible.) But if one of the two remaining servers becomes ill, the machine cannot take up the entire sick server’s load because of its more limited capability. That is, a server can take all the work of a broken machine, but a machine can take over for only part of a server’s work. The risk imposed by this lack of interchangeability increases when cost comparisons indicate to a manager that the most economical situation is two machines and one server.
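The interchangeability argument can be checked with a crude capacity test. The sketch below only asks whether the staff on duty can keep up on average; it ignores queuing at the machine and says nothing about how long the waits become. The function name, the fraction of customers a machine can serve, and all rates are hypothetical.

```python
def staff_can_cope(arrival_rate, service_rate, servers,
                   simple_fraction=0.0, machine_rate=0.0):
    """Rough check that the servers on duty can keep up with demand.

    A working machine absorbs simple requests up to its own rate; all
    remaining work (complex plus overflow) falls on the servers.
    """
    simple_load = arrival_rate * simple_fraction
    machine_share = min(simple_load, machine_rate)
    staff_load = arrival_rate - machine_share
    return staff_load < servers * service_rate


# Hypothetical: 20 customers/hour, each server handles 12/hour, 30% of
# customers have needs a machine can serve, machine handles 15/hour.
lam, mu, p, machine = 20, 12, 0.3, 15
scenarios = {
    "three servers, one ill": staff_can_cope(lam, mu, 2),
    "two servers + machine, machine down": staff_can_cope(lam, mu, 2),
    "two servers + machine, one server ill": staff_can_cope(lam, mu, 1, p, machine),
}
for name, ok in scenarios.items():
    print(f"{name}: {'copes' if ok else 'overloaded'}")
```

With these numbers λ/µ is about 1.67, so two servers can absorb either a sick colleague or a broken machine, but one server plus the machine cannot cover a sick server because the complex requests alone exceed one server’s capacity.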

Considering that more complex transactions are usually more critical, I would want to be sure that they are covered. Therefore, without even looking at the cost comparisons beforehand, I would restrict my cost comparisons for this situation to the three-server or the two-server-plus-machine options.

If some of the possible classes turn out to be a very low percentage of the total population, it is usually better that they be included as part of a larger class. But, if a class is a significant percentage of a business’s clientele, it is usually worthwhile to consider how you can provide a separate service or priority for them.

One great example of how to serve one class of customer without adding either servers or machines was at a local coffee shop in Oregon where the owner set up a self-serve station for his customers who just wanted a cup of house coffee. He also eliminated the need for a server to handle payments for the house coffees by posting the price for each size of cup and trusting his customers to put the appropriate amount into a locked cashbox with a slit in it. I observed that almost no customers skipped the payment, and some put in more than what was asked because they did not have adequate change. In this case, he reduced not only the average waiting time for his house coffee customers but also the need for server time to support that part of his business.

So how does a business owner identify such classes of customers? In sales operations, POS data can really help. Some examples are as follows:

  • You can screen your POS receipts for the percentage of customers who buy fewer than a given number of items to determine whether or not an express checkout line would be useful.
  • You can screen the POS data for how many customers order products or services that require only a standard service time.
  • You can analyze your sales data for your highest volume buyers or service requests.
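The first screening in the list above can be sketched in a few lines of Python. This is only an illustration, not the interface of any POS system: the function name `express_share`, the sample item counts, and the 10-item limit are all assumptions made for the example.

```python
def express_share(items_per_receipt, item_limit):
    """Return the fraction of receipts with fewer than `item_limit` items."""
    if not items_per_receipt:
        raise ValueError("no receipts to analyze")
    eligible = sum(1 for n in items_per_receipt if n < item_limit)
    return eligible / len(items_per_receipt)


# Hypothetical item counts taken from one day's receipts.
items_per_receipt = [3, 12, 7, 25, 1, 9, 40, 5, 10, 18]

share = express_share(items_per_receipt, item_limit=10)
print(f"{share:.0%} of receipts would qualify for an express line")
```

Running the same screening for several item limits, and for different times of day, gives a manager a quick feel for whether an express class would be large enough to justify its own line.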

You can also ask your customers to help by allowing them to select the class of service they want. This is particularly useful for call center and repair service operations. When calling in for support, we all have encountered choice menus that let us identify what type of question we want answered. This not only directs us to the right person to answer that question but also provides data that the call center can use to adjust staffing levels, allows the computer to look up records for some requests (such as a current account balance), and makes those records available to the operator in advance. The last two reduce the average service time and the required staffing levels.

The useful thing to recognize is that when the arrivals for each class have exponentially distributed times between arrivals (with the appropriate average arrival rate for that class), the arrival streams for the classes can easily be combined or separated. The resulting streams are also described by exponential distributions with the same general characteristics, so the same waiting line equations still apply.

For example, consider a typical banking call center operation where there are three choices (effectively classes where the customer tells you what class they are in) on the telephone menu, such as the following:

  1. Account balance check
  2. Billing and payment questions
  3. All other questions

Assume that the bank knows the average arrival rate for each choice by analyzing its previous call history. We will designate these rates as λ1, λ2, and λ3. Hence the total arrival rate λ to be handled by the call center if it is operated as a multiple-channel, single-phase system where any operator can answer the next call is merely the sum of λ1, λ2, and λ3. Of course, in a large call center with several operators with different skill levels, each individual arrival rate is likely to be assigned to a specific operator or even an automated response for some choice.

Consider a similar situation with the same choices, where the business knows the overall average arrival rate λ for its customer base but has not yet collected enough data to break down that rate into the individual rates for each choice. Wanting to improve its service by possibly designating specific operator(s) or automated responses to each choice, the business checks with a call center consulting firm for information about industry average percentages for each class of questions. The consulting firm provides the values of P1, P2, and P3, respectively, for the business’s choices. Because exponential distributions can also be easily separated into components that are also exponential distributions, the bank can then obtain λi by multiplying λ by the respective probability Pi.
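The two calculations just described, adding class rates to get λ and multiplying λ by Pi to get λi, can be sketched as follows. The function names and the numerical values (calls per hour and class percentages) are assumptions chosen for illustration; they are not data from an actual call center.

```python
def combine_rates(class_rates):
    """Total arrival rate for a pooled operation: the sum of the class rates."""
    return sum(class_rates)


def split_rate(total_rate, class_shares):
    """Per-class arrival rates: the total rate times each class's share."""
    if abs(sum(class_shares) - 1.0) > 1e-9:
        raise ValueError("class shares must sum to 1")
    return [total_rate * p for p in class_shares]


# Case 1: the bank knows each class rate (hypothetical calls per hour).
lam1, lam2, lam3 = 30.0, 45.0, 75.0
lam = combine_rates([lam1, lam2, lam3])

# Case 2: only the overall rate is known; the shares P1, P2, P3 are
# assumed industry percentages supplied by a consulting firm.
rates = split_rate(lam, [0.2, 0.3, 0.5])
print(lam, rates)
```

The check that the shares sum to 1 is a small safeguard: if the menu has an "all other questions" choice, every call should fall into exactly one class.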

You can get useful information even when a consulting firm does not have all the usual percentages. This is often the case because your customer base does not match up well with any of the databases that consulting firms might have. Software can monitor the frequency of choices selected in the telephone menu: Given that information, a manager can then decide whether or not to address the choices separately or as a combined pool.
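As a rough sketch of what such monitoring software does, the following Python fragment estimates the percentage for each menu choice from a log of selections and then converts the percentages into class arrival rates. The log contents, the overall rate of 120 calls per hour, and the function name `choice_shares` are hypothetical.

```python
from collections import Counter


def choice_shares(menu_log):
    """Estimate each menu choice's share of calls from a log of selections."""
    counts = Counter(menu_log)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty menu log")
    return {choice: n / total for choice, n in counts.items()}


# Hypothetical log: the menu choice (1, 2, or 3) made on each call.
menu_log = [1, 3, 2, 3, 1, 3, 3, 2, 1, 3]
shares = choice_shares(menu_log)

overall_rate = 120.0  # assumed overall arrival rate, calls per hour
class_rates = {choice: overall_rate * p for choice, p in shares.items()}
print(shares, class_rates)
```

A log this short is used only to keep the example readable; in practice, the shares should be based on enough calls, and on the right time periods, to be representative.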

Finally, when improving service performance by adding more servers, the training time required for a new server to achieve an acceptable proficiency level becomes a business priority. Those businesses that can do it faster than their competitors have a competitive edge when the demand for their type of services is increasing.

One common method to increase the learning rate is using a mentor alongside a new employee for the first week or so. It is important that the mentor have the new employee do all the repetitions for the best result. This means that the mentor needs to resist the temptation to step in and do some of the steps for the employee to serve a long line of waiting customers more quickly. In such situations, it is better, if possible, to open up another line to take some of the pressure off the new employee’s learning experience.

Another method for businesses that handle different classes of customers separately is to start the new employee with the simpler service operations and then include more complex services as that employee’s proficiency increases.

A common example illustrated in Figure 6.1 is the practice of many coffee shops of having one or two servers take customer orders and collect payments and a separate group of baristas prepare the different types of coffees requested. In most instances, the order takers also handle nonpreparation services, such as pouring house coffees and taking care of pastry orders. This not only provides a training path but also allows the more effective use of the more highly trained barista staff and the ability to use part-time workers for the order taking. In effect, two lines and two classes have been created here: one for all customers and a second line (phase) for customers with more complex orders.

The most important element is clear communication between the manager and the service employees about the training process and what is expected. This should include how to deal with long-line situations to avoid the state-dependent rate variances discussed earlier.

Waiting Line Configurations, Psychological Factors

Many times, it is not practical or even possible to do much to reduce the average waiting time for a service. First, it is important to not do things that would increase a customer’s perception of how long the wait is likely to be. Second, there are things we can do to reduce anxiety in line or even to allow a customer to make more effective use of that waiting time. Some suggestions paraphrased from the available online literature are given here; for more detailed information, consult the classic article by Maister (1985) and the following expanded discussion based on Norman (2008).

Figure 6.1. Coffee shop waiting line configuration for customers with simple orders (white symbols) and more complex orders (gray and black symbols).

Some general axioms regarding the waiting experience are as follows:

  • If a customer has time limitations, anxiety increases, and the wait appears to be longer. This is especially appropriate for lunchtime customers.
  • The more desirable the service, the longer people are willing to wait. Popular restaurants, popular movies and shows, and limited attendance venues are good examples. A corollary here is that excellent service can compensate for some poor waiting conditions.
  • The more necessary the service, the more customers will tolerate a long wait.
  • Waiting on hold seems longer than waiting in line.
  • When there is more than one line, your line always appears to move more slowly. You notice it when the other line moves with respect to you but not when your line moves with respect to the other line.
  • Time moves more slowly when you have nothing to do but wait. The corollary is that the wait appears to end more quickly when you have something to do.

Some negative psychological factors to be avoided are as follows:

  • A clock on the wall reminds customers of the passage of time.
  • Not informing customers about what is to be expected (priority rules, what is required of the customer to obtain service, how long the current average wait is, etc.). One common failure in this regard is the practice of many restaurants to assign specific tables to each server on duty. Customers who are unaware of this can be quite upset when they have been waiting and a nearby server who is not assigned to their table seems to be ignoring them. Some customer dissatisfaction can be avoided by the host or hostess seating customers at their table and telling them who their assigned server will be and perhaps even taking their drink orders to get things started.
  • An uncomfortable waiting environment: Expecting people to wait outside in a predominantly rainy climate like the Pacific Northwest or asking medical patients to stand instead of sit while waiting are just cruel practices. When people have other more comfortable waiting options for the service they want, such practices will cost a business many of its customers.

Some positive things a business can do are as follows:

  • Give customers something to do while they wait. The best items are things that contribute to completing the service process, such as filling out forms, informing customers about what they will need to do when they reach the server, or letting them know about any personal data or items that may be required for service rather than finding out when they reach the server window after a long wait. In some cases, it is possible to allow a customer to leave the service facility to do other things during the waiting time and then return shortly before they are next in line for service.6,7
  • Use a single-line configuration for multiple-server applications wherever possible. The single-line and parallel-line configurations in Figure 3.1 provide the same average waiting time from a business perspective, but the variability in individual customer experience and the perception of how fast the line moves are different. The single-line configuration moves every time a single customer is processed; parallel lines move only when a customer in that line is processed. The more servers, the faster the single line moves with respect to the individual lines. In addition, the variability in individual customer waiting times is reduced because the effect of individual service times that are very long or very short is averaged out over the number of servers rather than affecting just the line in front of one server.
  • Make the waiting area more comfortable and even productive for customers while they wait. Seating, wireless access to the Internet, and self-serve coffee stations have become more common in many waiting areas as businesses become aware that customers who have something to do are more tolerant of uncertain or longer waits.

Arrival Rate Management Methods

Another approach for reducing variability in service processes is to try to control the arrival rate to some degree. The most familiar method is using appointments or reservations. Appointments are used where there is more predictability in the nature of the services requested and the facility capacity and choice of servers are relatively constant. When the service time has a greater range of variability, such as serving meals in a popular restaurant, and a wider choice of servers can serve a given customer, reservations are more appropriate. Other significant factors here are the cost and availability of servers. The higher the skill requirements, the higher the cost per server and the lower the part-time availability is for such skilled individuals—hence the prevalent use of appointments for most professional services.

Consider a typical medical clinic appointment schedule. The facility has a fixed number of medical staff and examination rooms. The capacity of the facility is dependent on the effectiveness of the appointment schedule. Some medical services, such as vaccinations and routine tests, have a relatively constant service time, making their appointment schedule easy if the demand requires it (some daily peaks greater than the daily capacity). The appointment time then becomes whatever the service time is plus some easily calculated small addition to accommodate cleanup and preparation between patients.

But when we consider the appointment schedule for doctors on the staff, life becomes more complicated. The factors to be considered are that not all doctors require the same average examination time, examination times can vary considerably from one patient to the next, patients sometimes arrive a bit late, some room must be left in the appointment schedule to accommodate unscheduled urgent care patients,8 and, most important of all, when a patient examination lasts longer than the allotted appointment time, it affects the timing of some if not all of the subsequent appointments that day. Taking all this into account, the goal of an appointment scheduling time analysis is to select an appointment duration that reduces the possibility of an appointment running late without trading off too much patient capacity. The trade-off is between the cost of having people stay later to accommodate the patient who made the last appointment and the cost of accommodating fewer patients during the day.

Having a probability distribution that accurately represents the distribution of examination times is essential. Unfortunately, the standard use of an exponential service distribution to represent service time in general queuing analysis models does not work well here because it allows a high percentage of service times lower than the average time. A normal distribution is more representative of medical examinations, and Erlang or beta distributions are better, but the best is when the clinic has enough data to determine its own discrete distribution for use in simulation programs for scheduling. Example 6.5 illustrates how a normal service time probability distribution can be used to estimate the probability of an appointment running over.
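To show the kind of calculation involved, here is a minimal Python sketch of the overrun probability under a normal service time model. It does not reproduce Example 6.5; the mean of 15 minutes, the standard deviation of 4 minutes, and the candidate appointment lengths are assumptions chosen only for illustration.

```python
from statistics import NormalDist


def overrun_probability(mean_minutes, sd_minutes, slot_minutes):
    """P(examination time > appointment length) under a normal model."""
    exam_time = NormalDist(mu=mean_minutes, sigma=sd_minutes)
    return 1.0 - exam_time.cdf(slot_minutes)


# Hypothetical examination times: mean 15 minutes, standard deviation 4.
for slot in (15, 20, 25):
    p = overrun_probability(15, 4, slot)
    print(f"{slot}-minute appointment: overrun probability {p:.3f}")
```

An appointment length equal to the mean examination time overruns half the time under this model, which is why the trade-off between longer appointments and lost patient capacity needs to be evaluated rather than guessed.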

Example 6.5 is a simplified problem that illustrates the need for using simulation methods, where the effect of various appointment times and different overrun times can be evaluated to determine the best result that provides the highest capacity without undue costs and customer waits. A normal distribution is rarely the best distribution for scheduling purposes, but it is often used because it is easier to use in queuing analysis models and more familiar to many users.

Some other observations are that if the examination room preparation and patient record entry could be handled independently of the doctor and the nurse, it would free up another 2 minutes per appointment. For example, you could have a dedicated record clerk do the entries and a dedicated attendant prepare examination rooms.

Staggering the appointment times so that each doctor in a practice does not start appointments at the same times would simplify the scheduling of these back-office support activities. Staggering appointment times would also space out appointment check-ins for a smoother process flow at the check-in desk and provide a positive psychological effect in the waiting room because patients would be arriving and departing on a more regular basis, not intermittently in groups.

Actual operational data are necessary to provide the best representations of the service time and arrival time distributions in simulations. It cannot be emphasized enough that the result of a simulation is only as good as the knowledge of the real-life process and the actual operational data available for it.

For this medical clinic problem, the following data are needed for an accurate simulation solution:

  • A history of actual examination times is needed to build an accurate discrete probability distribution for those times.
  • A history of appointment overruns, how long each was, and what time of day it occurred gives a picture of the typical overrun.
  • Separate this information into values associated with regular patients and values associated with unscheduled urgent-care patients. (It is likely that longer examination times are more closely associated with urgent-care patients, but this assumption should be verified.)
  • Identify the number of urgent-care patients per day per class of care. This helps in planning the number of open appointments needed per day and how they might be shared among the doctors who are most capable of handling the particular class of care required.10
  • Identify the history of times that patients arrived for an appointment: how early or late were they and what time of day was the appointment. Such arrival timing variations can contribute to overruns and hinder the ability to catch up after an overrun or even get ahead of schedule.

Reservation policies have similar constraints and data collection needs, but they allow some simplifications because the number of reservations required per day is more limited. For many popular restaurants, the number of successive reservations (turns) for a given location is limited to how many groups they expect to seat at that location during a normal dining period (breakfast, lunch, or dinner). Obviously, the restaurant will require an estimate of how long customers will typically take to order and eat their meals and also must recognize that their service rate contributes to this period. One rule of thumb to help judge the variability of dining time is that the larger the group, the longer one might expect customers will take for dinner because of the increased amount of socializing (and beverages) that are likely to occur.

In many European restaurants, a group has the table for the evening, so the likelihood of a late-night reservation is not considered. In that respect, such restaurants are no different from one-time events, such as airplane flights, hotel rooms, Broadway shows, and sporting events. The challenge here is to fill the available seats or rooms for maximum profit. There is always some risk that someone with a reservation will cancel at the last minute. Knowing the average number of no-shows based on their past experience, some businesses will accept that number of reservations over their available capacity to ensure full usage of that capacity. The risk they take on is possibly having to deal with a highly dissatisfied customer who has a reservation and no accommodation because the usual number of no-shows did not occur. This practice, called overbooking, is used by many airlines and hotels. While overbooking analysis also involves working with arrival distributions and probabilities, it is not included in the scope of this monograph.11

Priority Management

To conclude this chapter, some discussion regarding suggested guidelines for using priorities in waiting line applications is in order. We will cover this in two parts: people and items. When dealing with people, one basic rule to remember is that when you selectively improve the waiting experience of one group in a queue, you have also chosen to make that experience worse for the other groups in the queue. A second basic rule is that in systems where all servers are equally capable of handling all customers, priority management does not affect the overall average performance.

Simply put, moving some people to the head of the line makes others wait longer. If you do this too often in a single line, you run the risk of delaying some customers long enough that they give up and leave the line without being served. If the reason you give some customers priority is not explained or appears to be unfair to the remaining customers, you risk losing even more customers who will take their business elsewhere.

Dealing with these issues is easy in some situations. Giving preference to more critically ill patients in an emergency room is generally understood by the other patients and even expected. Another commonly understood preference is when passengers who would otherwise miss their flights are called to the head of the line at the airline ticket counter. The announcement not only helps identify who needs preference but also tells the other customers why those passengers are being given preference.

Giving preference to more valuable customers on hold in a call center is easy because the process is invisible to other customers on hold and the selection can even be automated. For face-to-face professional services, some businesses even use separate entrances for preferred customers to achieve a form of call center anonymity.

When the number of customers requiring some preference is relatively large, it is better to treat them as a class and set up either a separate line (frequent travelers at an airport) or a business process like an express line at a grocery store to handle them. This also helps communicate to other customers the reason for the different treatment to reduce their impressions of being treated unfairly.

Nonpreemptive and Preemptive Priorities

There are two ways to handle higher-priority customers: nonpreemptively and preemptively.

A nonpreemptive approach moves a higher-priority customer to the head of the waiting line, allowing a lower-priority customer currently being served to continue being served until that service is completed. A preemptive approach allows a higher-priority customer (A) to replace a lower-priority customer (B) currently being served, delaying the completion of the service for B until later when there are no more higher-priority customers to preempt B. If there are a large number of higher-priority customers, B is likely to wait a long time before his or her service can be completed.
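The nonpreemptive rule can be sketched as a small Python class. The class name, the customer labels, and the convention that a lower number means a higher priority are assumptions made for this illustration.

```python
import heapq
from itertools import count


class NonpreemptiveLine:
    """Waiting line: higher priority moves to the head; ties are served FCFS."""

    def __init__(self):
        self._heap = []
        self._order = count()  # arrival sequence number breaks ties

    def arrive(self, name, priority):
        # Assumption of this sketch: lower number means higher priority.
        heapq.heappush(self._heap, (priority, next(self._order), name))

    def next_customer(self):
        # Called only when a server becomes free, so a customer already
        # being served is never interrupted.
        return heapq.heappop(self._heap)[2]


line = NonpreemptiveLine()
line.arrive("B1", priority=2)
line.arrive("B2", priority=2)
line.arrive("A1", priority=1)  # higher priority, but arrives last

served = [line.next_customer() for _ in range(3)]
print(served)  # ['A1', 'B1', 'B2']
```

A preemptive version would need one more step: when a higher-priority customer arrives, the lower-priority customer currently being served is put back into the line with his or her remaining service time, and the server turns to the new arrival immediately.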

The nonpreemptive approach is preferred for most call center priority applications because it best preserves the invisibility aspect of the prioritization process. (Imagine being asked to be put back on hold while the operator takes care of another caller.) It is also preferred in most applications where immediate service is not the sole reason for the priority and where the line behavior is observable by all customers because it is perceived as being fairer than the preemptive approach.

While this should be intuitively obvious, using priorities increases the variability of waiting times: the higher the percentage of customers getting preferential treatment, the higher the variability. Because variability adds uncertainty to business outcomes, using priority rules in processing waiting line customers should be carefully considered; if used, it should be limited to only a small percentage of the arrival population.

Some models12,13 have been developed to determine the increased variability in average waiting time when using both nonpreemptive and preemptive priorities. These models also aid in determining the degree of reduction in the average waiting time for higher-priority customers and the concomitant increase in waiting time for lower-priority customers. If you are interested in possibly applying these models to your situation, consult Hillier and Lieberman (2010), chapter 17, and Haussmann (1970). Be sure to review the assumptions and conditions for using such models to make sure they are appropriate for your situation.
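As an indication of what such a model looks like, the sketch below uses a standard textbook result for the simplest case: one server, nonpreemptive priorities, exponential interarrival times, and the same exponential service rate for every class. It is offered as an assumption-laden illustration rather than as the specific models in the references cited above, and the arrival and service rates are hypothetical.

```python
def nonpreemptive_waits(class_rates, service_rate):
    """Average time waiting in line for each class, highest priority first.

    Assumes a single server, nonpreemptive priorities, exponential
    interarrival times, and one exponential service rate for all classes.
    """
    total_load = sum(class_rates) / service_rate
    if total_load >= 1:
        raise ValueError("total arrival rate must be below the service rate")
    w0 = total_load / service_rate  # average work still in service at arrival
    waits, load_above = [], 0.0
    for lam in class_rates:
        load_through = load_above + lam / service_rate
        waits.append(w0 / ((1 - load_above) * (1 - load_through)))
        load_above = load_through
    return waits


rates = [1.0, 2.0]  # hypothetical: preferred customers, then everyone else
mu = 4.0            # hypothetical service rate
waits = nonpreemptive_waits(rates, mu)

# Arrival-weighted average wait compared with the no-priority (FCFS) value.
overall = sum(lam * w for lam, w in zip(rates, waits)) / sum(rates)
fcfs = (sum(rates) / mu) / (mu - sum(rates))
print(waits, overall, fcfs)
```

With these numbers the preferred class waits far less than the remaining customers, while the arrival-weighted average wait equals the first-come, first-served value. That is consistent with the rule stated earlier that priorities redistribute waiting time among groups without changing the overall average when all customers need the same service.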

An effective use of intermittent preemptive prioritization is when a customer being served needs to fill out a form or otherwise do something that only the customer can do to complete the service. Instead of asking the customers waiting in line to wait longer while this is done, the server asks the customer to step out of the line to fill out the form or do the other activity and then reenter at the head of the line when done, in effect giving that customer a higher priority. Such a line is occasionally referred to as a double-ended queue or deque. This has several benefits with no additional cost to the business to implement it: It reduces the average service and waiting times, improves customer satisfaction, and is generally viewed by customers as a fair use of prioritization.14

Manufacturing applications can use a wider range of priority rules when dealing with items or jobs with varying service times. These rules are normally applied as part of the production scheduling process. At the beginning of each day, the production scheduler reviews the list of jobs or items to be produced and then sequences them according to the priority rule used. Some of the more commonly used rules15,16 are as follows: shortest process time first (SPT), first-come, first-served (FCFS) or first-in, first-out (FIFO), earliest due date (EDD), slack per remaining operation (S/O or SRO), critical ratio (CR), and Johnson’s Rule (for the special case of two sequential steps with varying service times at each).

Performance measures for such schedules include the average number of jobs in the system, or work-in-process (WIP) inventory, which is the same as L in queuing analysis; the average job lateness (how far jobs miss their due dates); the job flow time (the same as W in queuing analysis); and makespan (the total time to complete a group of jobs).17
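A short Python sketch shows how three of these rules can be compared on the measures just listed. The five jobs, their process times, and their due dates are hypothetical, all jobs are assumed to be available at the start of the day on a single work center, and lateness is counted as zero for jobs that finish early.

```python
def schedule_measures(jobs, key):
    """Sequence jobs by `key` on one work center and compute the measures.

    jobs: list of (name, process_time, due_date), all available at time 0.
    """
    sequence = sorted(jobs, key=key)  # stable sort: ties keep arrival order
    clock, flow_times, lateness = 0.0, [], []
    for _name, process_time, due_date in sequence:
        clock += process_time
        flow_times.append(clock)                      # completion = flow time
        lateness.append(max(0.0, clock - due_date))   # early jobs count as 0
    return {
        "sequence": [job[0] for job in sequence],
        "avg_flow_time": sum(flow_times) / len(jobs),
        "avg_lateness": sum(lateness) / len(jobs),
        "avg_jobs_in_system": sum(flow_times) / clock,
        "makespan": clock,
    }


# Hypothetical jobs: (name, process time in days, due date in days).
jobs = [("A", 6, 8), ("B", 2, 6), ("C", 8, 18), ("D", 3, 15), ("E", 9, 23)]

fcfs = schedule_measures(jobs, key=lambda job: 0)      # keep arrival order
spt = schedule_measures(jobs, key=lambda job: job[1])  # shortest time first
edd = schedule_measures(jobs, key=lambda job: job[2])  # earliest due date

for rule, result in (("FCFS", fcfs), ("SPT", spt), ("EDD", edd)):
    print(rule, result)
```

For this set of jobs, SPT gives the lowest average flow time and EDD the lowest average lateness, while the makespan is the same for every rule because a single work center processes the same total work regardless of the sequence.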

It should be obvious that the scheduling goal is to minimize the value for each measure. Because priority rules do not all affect performance measures to the same degree or in the same manner, a manager should select a rule that best addresses the performance measure that is most important for that manager’s business. If all jobs must go through the same sequence of steps or operations, the queuing analysis models discussed in this monograph can be used to determine which scheduling priority rule provides the lowest performance measure values for a particular business.

However, when the job flow is not the same for all items, which occurs in many job shops, most queuing models are inadequate for the task. There is considerable work in progress to develop good mathematical models for job shop situations, but the mathematics involved is challenging, and the current results available are usually difficult to understand by someone who does not have strong mathematical or statistical skills.

One special priority rule used in manufacturing is the practice of using “hot lots” for preferentially moving selected items or jobs through a process faster. While in practice it is best to avoid the need for expediting items through a manufacturing process, sometimes it is necessary to satisfy the need of a critical customer or cope with output constraints imposed by a manufacturing process. This need was prevalent in many early integrated circuit fab facilities because of the large variance in final product yields and long manufacturing times of several weeks or more. A problem frequently encountered then with hot lots was that there were often so many rush requests from different customer groups that the reduced processing time advantage for hot lots was so small that it was not worthwhile. This forced many fab managers to put restrictions on the number of hot lots they would accept at any one time. This limit was typically between 5% and 10% based on the experience of a given fab manager. Analysis using one of the priority models available today would have allowed managers to set more appropriate hot-lot limits for a desired expedited throughput time.

In summary, my experience has indicated that using priority systems in queuing applications is less effective when dealing with people and should be generally avoided unless there is a clear justification for them (emergency rooms, 911 calls, etc.). Separating customers into different service classes is a better way to offer improved services to customers.
