Managerial Considerations
This chapter covers topics where an astute managerial choice can minimize uncertainties and improve overall performance for businesses affected by some form of waiting line behavior. The areas of consideration include the following:
Some of you may have jumped ahead to this chapter because you feel that you already know all you have to know about waiting line theory. That may be true for some; but for most, it would be wise to review at least some of the topics in chapters 2 through 5 if parts of the following discussion are confusing.
Variance and Risk Considerations
When I ask business students to select the better business solution after giving them the average performance for each choice, most of them quickly choose the one with the highest average. When more than one choice has the same average performance, the same group often comments that any choice will do.
A few students will look beyond the average performance to considerations of risk or the accuracy of the average performance value before making their choice. To evaluate risk or accuracy, some knowledge of the variances in a process is necessary. Unfortunately waiting line equations provide only steady-state average performance. Start-up variances, such as whether there is a line of customers waiting when a business opens its doors or there is a set of items left over from the previous day of production that need to be completed before beginning new work, are not included in steady-state results. Larger variances, such as no customers or a large group of customers that arrive all at once during the day, occur less often; but when they do, they can cause significant business disruptions.
Risks such as the probability that the line will be longer than the capacity of the business can be calculated, but when that condition will occur within a given period cannot be predicted. Managers need to remember that waiting line arrivals and service times are memoryless; that is, what happens next is not influenced by what has just happened. For example, when flipping a coin, the result of the next toss is unaffected by whether the previous two tosses produced two heads, two tails, a head and a tail, or a tail and a head.
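As a rough illustration of the kind of risk that can be calculated, the sketch below assumes the simplest case: a single-server M/M/1 line in steady state with unlimited waiting room, where "capacity" means the number of customers the business can hold, including the one being served. The function name and the example rates are hypothetical, not taken from the text.

```python
def prob_more_than(capacity: int, arrival_rate: float, service_rate: float) -> float:
    """Steady-state P(customers in system > capacity) for an M/M/1 line.

    For M/M/1, P(N = n) = (1 - rho) * rho**n, so the tail sum is rho**(capacity + 1).
    """
    rho = arrival_rate / service_rate  # utilization factor
    if not 0 <= rho < 1:
        raise ValueError("steady state requires arrival_rate < service_rate")
    return rho ** (capacity + 1)


# Hypothetical example: 18 arrivals per hour, 20 services per hour, room for 10 customers.
risk = prob_more_than(10, arrival_rate=18, service_rate=20)
print(f"Probability of more than 10 customers in the system: {risk:.3f}")
```

The result is a long-run fraction of time. Because arrivals and service times are memoryless, it says nothing about when during the day the overflow will happen.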
One variance often not considered is the potential variance in normal service and arrival times when waiting lines are much longer or shorter than usual. This is often referred to as a state-dependent variance or rate.1
When waiting lines get long, some newly arriving customers may be discouraged from entering the line (balking), and some current customers may choose to give up and leave (reneging). At the same time, many servers feel the pressure to speed up their work to help move the line faster. As a result, this creates three general managerial concerns:
Clear managerial communication and good server training can minimize server variances in two ways: by letting servers know they will not be penalized for remaining careful and avoiding shortcuts when lines are long, and by making clear what servers are expected to do when no customers are present.
In manufacturing lines, some of the arrival rate variances can be minimized through careful production scheduling; but when dealing with customers, we have to work with arrival rate variances unless the business is conducive to using an appointment or reservation system. In customer service lines, reneging and balking can be reduced by opening up a temporary express window for customers with standard, easy-to-do needs. An advantage of this approach is that you can use a staff person who may not be trained in the full range of services offered. An example is a post office opening a window strictly for customers who are only picking up mail or packages when the line is long. So, how do we assess potential variances and risks in waiting line performance? We can use three basic steps:
Variances that are very difficult to quantify are differences in customer attitudes. Their tolerance of waiting times and long lines can vary greatly depending on the time of day, the weather, whether they are with a friend or not, whether their service need is urgent or not, any personal time limitations, and so forth. Although we cannot usually address the direct causes of these variations, we can reduce their intensity by considering the psychological factors associated with waiting. Some of these factors and methods for reducing their effects are discussed later in this chapter. The most important consideration to remember when accounting for variability in waiting line business decisions is that the variances experienced by the business are NOT directly related to the variances encountered by customers.
Waiting Line Costs
Many of us have been in the situation where the cost information is scanty, but our boss requires a quick decision. So, what do we do? We make the best decision we can with what data we have and fill in the gaps with our intuition and past experiences. This is a pretty good solution in many business situations, but when waiting line characteristics are involved, we need to recognize that there is a much higher level of uncertainty because we no longer can count heavily on our intuition and experience to guide us to a reasonable answer.
When evaluating cost trade-offs, you should recognize that such analysis is only as good as the understanding you have regarding your business. Estimated values result in only ballpark guesses as to the best choice. Even worse is when you are unaware of a background activity that is necessary for you to be able to provide your service. Many of the textbook examples for waiting lines focus only on “front-office” activities when evaluating cost. However, some process changes when directly dealing with customers can have a significant effect on “back office” processes. Therefore, it is strongly recommended that you and your staff develop a service blueprint2 of your business to help take into account all the costs and performance factors affecting its outcome.
The simple graph in Figure 1.4 shows the basic idea that the total operational cost of a waiting line has some minimal value per customer served or item produced when the proper balance of resources and customer satisfaction is achieved. Adding more resources and/or improving service performance reduce the waiting time and the length of the queue.
The basic trade-off that must be considered, therefore, is as follows: “If I add more resources to improve service, is there enough reduction in waiting and other costs to more than pay for those resources?”
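The trade-off question above can be sketched numerically. The example below is a minimal sketch, assuming an M/M/s line in steady state, one hourly cost per server, and one hourly waiting cost charged for every customer in the system (waiting or being served). The function names, cost figures, and rates are hypothetical; a real analysis would include more of the cost components discussed in this chapter.

```python
from math import factorial


def mms_customers_in_system(arrival_rate, service_rate, servers):
    """Average number of customers in a steady-state M/M/s system (waiting + in service)."""
    offered_load = arrival_rate / service_rate   # lambda / mu
    utilization = offered_load / servers         # rho
    if utilization >= 1:
        raise ValueError("unstable: need arrival_rate < servers * service_rate")
    tail = offered_load**servers / (factorial(servers) * (1 - utilization))
    p0 = 1.0 / (sum(offered_load**n / factorial(n) for n in range(servers)) + tail)
    in_queue = p0 * offered_load**servers * utilization / (
        factorial(servers) * (1 - utilization) ** 2
    )
    return in_queue + offered_load


def cheapest_staffing(arrival_rate, service_rate, server_cost, waiting_cost, max_servers=20):
    """Return (servers, hourly_cost) minimizing server_cost * s + waiting_cost * L."""
    first = int(arrival_rate / service_rate) + 1  # smallest stable number of servers
    options = []
    for s in range(first, max_servers + 1):
        in_system = mms_customers_in_system(arrival_rate, service_rate, s)
        options.append((server_cost * s + waiting_cost * in_system, s))
    total, s = min(options)
    return s, total


# Hypothetical figures: 18 arrivals/hour, 20 services/hour per server,
# $15/hour per server, $10/hour for each customer in the system.
servers, hourly_cost = cheapest_staffing(18, 20, server_cost=15.0, waiting_cost=10.0)
print(servers, round(hourly_cost, 2))
```

Adding a server pays off only while the drop in the waiting-cost term exceeds the added server cost; past that point, total cost rises again.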
Table 6.1. Some Cost Components to Be Considered When Seeking a Minimal Cost Balance Between Waiting Costs and Resource Costs

Cost | Symbols | Components | Data Sources |
---|---|---|---|
Balking | CB | Lost customers, bad PR | Marketing |
Blocking | CB | Limited capacity K | M/M/s/K model |
Reneging | CB | Line not moving fast enough | Marketing |
Waiting time | CW | Waiting time in line, total time in service process | All models and simulation |
Throughput time | CW | Waiting time in line, total time in service process, capacity | Simulation |
Financing | C$ | Credit interest, insurance | Accounting |
Spoilage | C$ | Production loss, step yields, perishable inventory, theft | Accounting and manufacturing |
Warehouse space | CH, CF | Inventory holding costs, rent | All models and Accounting |
Lost business capacity | C$ | Repair time, line length, population size N, late penalties | M/M/s/∞/N model |
Appointment delays | C$ | Appointment schedule, service time variation | All models and simulation |
Staffing | CS | Salaries, benefits | Accounting and human resources |
Equipment | CE | Price, maintenance, spare parts, depreciation | Vendor and accounting |
Facility | CF | Construction, permits, rent, maintenance, capacity, security, insurance, utilities, janitorial services | Vendors, accounting, service process |
Table 6.1 lists several cost values that are needed to make useful waiting line business decisions. The list is not intended to be all-inclusive, but it provides enough items to help you identify the types of costs that you should consider for your business decisions. Table 6.1 also gives some symbols that we will use here to simplify the equations for analyzing cost trade-offs. Many textbook authors use C to represent cost, with a subscript or parenthetical notation to represent the particular type of cost. For example,
C(Total) = C(Waiting) + C(Service)
or
CTotal = CWaiting + CService.
Here we will use a larger set of cost symbols, which are defined in greater detail in appendix B. This helps us remember that there are several waiting-cost and service-cost components, as indicated in Tables 6.1 and 6.2.
It is rare that you can trade off one waiting cost factor against one resource-cost factor without having some effect on other cost factors. Therefore, do not be surprised if the result of your decision is not exactly what your analysis told you.
Table 6.2. Some Waiting and Resource Costs to Be Considered When Working With Different Categories of Waiting Line Applications

Typical Application | Waiting Costs: CB, CL, CW, and CH | Resource Costs: CS, CE, and CF |
---|---|---|
Coffee shop, bank | Balking, blocking, waiting time, reneging | Staffing, equipment, facility |
Single source: passport office, department of motor vehicles, licensing, other government agencies | Waiting time | Staffing, equipment, facility |
Manufacturing line | Throughput time, financing, spoilage, warehouse space | Equipment, staffing, facility |
Self-service | Reneging, waiting time | Equipment |
Repair and maintenance | Lost business (capacity reduction), spare equipment | Staffing, equipment, facility |
Call centers | Blocking, reneging, waiting time | Staffing, equipment, hold capacity |
Parking lots | Blocking, reneging, waiting time | Facility, maintenance |
Health clinics, professional services | Appointment delays | Staffing, facility |
Volume-related costs, such as utilities3 and the materials associated with what each customer or item requires for service, are usually ignored in cost trade-off analyses unless the supplies are perishable (such as food) and the cost decision has a direct effect on the amount of spoilage. Where things get complicated is how we handle floor space costs. If these costs are solely due to business volume (more total customers → more space), then they are usually not a factor in determining the minimal total value. But if more floor space is required to not turn away potential customers, then the profits gained from the increased number of customers per period must exceed the increased facility cost per period.
Lost business capacity and lost profit costs are similar in that both lose a potential profit. Lost business capacity is when you lose some of your normal capacity for a short time, such as when a resource is delayed in the repair shop when you had planned to use that resource to satisfy an existing customer order. Lost profit is when your normal capacity is insufficient to accommodate a customer who would have entered your business to place an order if not discouraged by a long line or insufficient capacity for waiting customers. Lost business capacity often has additional losses because you may have to pay a late-delivery penalty to the customer, or the customer may cancel the order after you have already put some work into its completion.
A tricky cost component is the percentage of server idle time. Most businesses include it as part of the service costs associated with a server. However, when P0 is more than 10% to 15% of an individual server’s time, there is an opportunity to reduce total costs by asking the server to do some routine maintenance or other support tasks instead of assigning them to back-office staff. In businesses with a staff of several servers, this practice could reduce the overall staffing needs by one or more people.
You may notice that I did not specifically list line length in the waiting costs column for Table 6.2. Line length plays a part when considering a limited capacity situation (M/M/1/K and M/M/s/K models), where the length of the line affects blocking costs. However, it is the time spent waiting in line that is the primary cost driver in many situations. The old adage “time is money” applies very well here. To help illustrate why line length is not as important as waiting time, consider Example 6.1. You may feel the background information is more than needed, but the intent is to illustrate the importance of considering factors that, at first glance, may not seem relevant to the goal of reducing total costs.
A special category included in Table 6.2 is when there is only one possible source for the service, such as a government agency (passport office, driver’s license bureau, business registration, the courts, airport security, etc.). In such situations, balking, lost customers, and customer satisfaction are not likely costs, and the only issue of primary interest is providing enough capacity in the form of facilities and staff to meet the average demand within some maximum time limit at the lowest cost possible. I have not included post offices in this category because many countries have alternatives to official government agencies for mail and package shipments.
Several examples follow to illustrate approaches for determining cost trade-offs for waiting line situations. Example 6.1 describes some basic cost calculations, and Examples 6.2 and 6.3 illustrate the nature of things to consider for more complex cost trade-off analyses. For a good discussion regarding total cost analysis for queuing situations, readers are referred to the online chapter 26 supplement by Hillier and Lieberman (2010).4
To illustrate situations where reasonable service improvements could pay off in lower total cost, we will discuss three more examples. The first is a classic repair service problem, where long turnaround times could cost a company significant money. The second is a manufacturing system where WIP inventory is expensive. The third is evaluating the benefit of adding more space to accommodate longer lines. Working through the solutions is left to the reader as an exercise; each example discusses the situation, indicates what costs should be considered, and how one might go about determining the best solution.
Improving Service Performance
There are several basic goals to consider when improving service performance. Some of these are strictly from the business perspective—reducing total cost, completing services faster, and accommodating higher customer volume. Others are from the customer perspective—not having to wait as long for service, a lower service charge, perceived fairness, and a more pleasant waiting experience. Because some of the possible approaches apply to both perspectives, they are listed first.
Variation reduction is normally the least expensive approach because you are working on using existing resources more effectively. Standardizing all or part of the services provided reduces the variation in service time and reduces the average waiting time, even when the average service time remains the same. In Example 6.1, Samantha standardized part of her service by having sandwiches ready-to-go rather than assembling them to order. This reduced both variability and the duration of the average service time, allowing a single server, Samantha, to handle more customers. She could provide more selections without adding to the service time by setting up a condiment station so that customers can add their own condiments rather than asking her to do it, which would increase her average service time and also add more variability to that time.
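The claim that lower service-time variation shortens waits even when the average service time is unchanged can be checked with the Pollaczek-Khinchine formula for a single-server line with Poisson arrivals and a general service-time distribution (M/G/1). This is a minimal sketch under that assumption; the function name and the rates are hypothetical and are not taken from Example 6.1.

```python
def mg1_wait_in_line(arrival_rate, mean_service, sd_service):
    """Average time waiting in line for an M/G/1 queue (Pollaczek-Khinchine).

    Wq = lambda * (sigma**2 + mean**2) / (2 * (1 - rho)), with rho = lambda * mean.
    """
    rho = arrival_rate * mean_service
    if rho >= 1:
        raise ValueError("unstable: need arrival_rate * mean_service < 1")
    return arrival_rate * (sd_service**2 + mean_service**2) / (2 * (1 - rho))


# Hypothetical case: 18 customers/hour, 3-minute (0.05-hour) average service time.
variable = mg1_wait_in_line(18, 0.05, sd_service=0.05)  # exponential-like variation
standard = mg1_wait_in_line(18, 0.05, sd_service=0.0)   # fully standardized service
print(f"{variable * 60:.1f} min versus {standard * 60:.1f} min")
```

With the same average service time, removing all service-time variation halves the average wait in this model. Partial standardization falls somewhere in between.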
Of course, Samantha will have to spend some time to maintain the condiment station, but this can be done during any idle time when there are no customers. For her utilization factor of 0.9, P0 = 10%, which gives her an average of 6 minutes per hour to maintain the station. The hidden cost factors to be considered are Samantha’s choice of method for obtaining ready-to-go sandwiches—typically a classic make versus buy decision.
Another approach for reducing service time variability is to replace the manual parts of a service with more automated methods. This is in effect another form of standardization, most often applied in manufacturing but also effective in service situations. For example, consider the coffee machines at many gasoline station convenience stores. The cost trade-off here is which option is less expensive: having a person keep a hot pot of coffee ready to pour when a customer wants a cup, or having a machine prepare the coffee on demand. The cost comparison involves not only the difference in average service times and their variability between the two options but also spoilage with the manual method, variation in the coffee preparation time, equipment and maintenance costs for either option, and response times for a sudden jump in customer volume (the probability that the coffee pot is empty and the time to brew a full pot versus the fixed response time for a machine).
Reducing the average service time reduces the average waiting time while increasing the capacity of the service system. Adding more servers is often the first strategy employed; it leaves each server's average service time unchanged but raises the overall service rate of the system. But this also increases the probability of no customers in the system (idle time). Normally, the cost of adding servers is offset in the cost analysis by reductions in waiting-time costs and expected increases in customer volume. A hidden cost factor that is not often considered is the possibility of using idle server time to do other tasks required by the business that are not directly related to serving customers. Some examples are managing inventory and cleaning and restocking equipment or service areas, such as a convenience store coffee machine or Samantha’s condiment station. This, of course, requires servers who have more flexible job skills and thus may require a higher salary. The primary focus here should be that the added staff is hired primarily to be a server.
The opposite approach is when a staff member hired primarily for a back-office activity is asked to serve customers during periods of peak demand. The limitation of course is that such staff is often not very skilled in performing the full range of service activities, but they can be used for providing simpler, more standardized services during peak demand periods. This often works well for situations such as banks, post offices, and grocery stores where the customer arrival rates are highly variable during the day.
Reducing the average service time by eliminating unnecessary (non–value-added) steps is a common approach discussed in process management texts, but it can be difficult to do when there is little standardization in the service process. What can be done is to eliminate the execution of some steps by the server and have the customer do them instead. Some examples are having customers fill out forms (health insurance, food orders, address labels, etc.) while they wait instead of the server asking questions and filling out the form for the customer. There is also a psychological benefit in doing this, which will be discussed in more detail later in this chapter.
Another way to reduce the average service time without adding servers is to inform customers about what is expected from them when they reach the server. In fast-food restaurants and coffee shops, displaying the menu with prices in clear view of the line helps reduce the time customers spend at the server window deciding what they want to order. Promptly displaying the amount of payment due after receiving the order reduces service time further by allowing a customer to decide on the payment method or collect the appropriate change while the server either processes the order or forwards it on to the kitchen or the barista. This leads to one of my big rules regarding customer-server interactions: Uninformed customers require more service time. Use the time spent waiting in line to educate them.
The current best example of this is the airport security check-in process. Clear and properly located information signs5 can prepare the customer as to what the customer must do at each step instead of servers having to take time to explain individually what is expected of each customer at each stage of the check-in process. This saves the server’s voice and lowers frustration on both the part of the server and the customer.
Reducing the average waiting time is very difficult to do unilaterally because it is so strongly related to the arrival and service rates. But it can be reduced for some customers in the arrival population by either separating them into different service classes with different average service times or by assigning them a higher priority. Some common examples of this are as follows:
The disadvantage with methods that reduce the average service time and/or the average waiting time for some of your customers by serving them according to priority or class is that if you retain the same resource level with the same total average population, your average service rate will remain the same, but the variability in the waiting time experienced by your customers will increase: Some will have a shorter average waiting time, and others will have a longer average waiting time. The old adage “there is no such thing as a free lunch” holds true in waiting line operations.
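The "no free lunch" point can be illustrated with the standard result for a single-server line with nonpreemptive priorities. The sketch below assumes Poisson arrivals for each class and the same exponential service rate for every class; the function name and rates are hypothetical. It shows one class waiting less and the other more, while the arrival-weighted average wait equals the first-come, first-served value.

```python
def priority_waits(arrival_rates, service_rate):
    """Average wait in line per class for a nonpreemptive-priority M/M/1 queue.

    arrival_rates lists the classes from highest to lowest priority; all classes
    are assumed to share the same exponential service rate.
    """
    total_rate = sum(arrival_rates)
    if total_rate / service_rate >= 1:
        raise ValueError("unstable: need total arrival rate < service_rate")
    # Mean residual service seen by an arrival: lambda * E[S^2] / 2, with E[S^2] = 2 / mu^2.
    residual = total_rate / service_rate**2
    waits, load_so_far = [], 0.0
    for rate in arrival_rates:
        load_before = load_so_far
        load_so_far += rate / service_rate
        waits.append(residual / ((1 - load_before) * (1 - load_so_far)))
    return waits


# Hypothetical case: 6 priority and 12 regular customers per hour, 20 services per hour.
rates = [6, 12]
waits = priority_waits(rates, service_rate=20)
average = sum(r * w for r, w in zip(rates, waits)) / sum(rates)
print([round(w * 60, 1) for w in waits], round(average * 60, 1))
```

In this sketch the overall average matches what a plain M/M/1 line would give, so the priority rule only redistributes the waiting among customers.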
The “classes” approach requires a multiple-server situation with at least two servers—more if there are several classes to be served separately. Thus, if you do not already have multiple servers, you are in effect adding resources to reduce waiting time. The “priority” approach can be used in both single-channel and multiple-channel waiting line models.
In the classes and priority approaches, the overall average throughput rate remains unchanged provided that in multiple-server applications a server who is normally assigned to higher priority or a specific class of customers is expected to take care of lower priority or other classes of customers when that server is idle (jockeying is allowed in that case). Therefore, from a business viewpoint, capacity and average throughput rate remain unchanged. From the customer viewpoint, some of them are served faster at the expense of other customers now having to wait longer.
At this point, some of you are likely thinking, “Why bother doing this? It sounds like a no-gain situation that rewards only some customers.” The answer is that sometimes there is little choice but to do it. For a hospital emergency room, it should be clear that having a priority approach is necessary. Rewarding customers who individually contribute a higher percentage of your profits is a standard business practice to encourage them to continue being your customers when competing with other businesses that also provide such incentives.
What are not so obvious are the potential cost advantages. To keep the mathematics manageable, the basic closed-form, waiting line equations assume that all servers are equally capable of providing the same service rate. In chapter 5, we discussed the situation of how long it would take for a new bank teller to achieve that proficiency. Now consider the advantage of a classes approach that would allow new tellers to handle simple transactions, such as deposits and withdrawals, at first and move to more complicated banking transactions later. Their learning curve would then be broken into two or more learning curves with faster learning percentages.
This leads to an important conclusion: Classes allow the use of a range of server skills, which, in turn, allows the possibility of a range of server costs. For example, consider a three-server application where all the servers are expected to be able to handle the full range of expected customer transactions. But in many situations, part of the normal customer population wants only services that could be done using a self-service machine. Automatic teller machines, online banking, Web check-in, flat-rate package mailing, and convenience store coffee machines are some examples.
This is where managers earn their salaries. We can do the cost comparisons using the three-server example versus two servers and a machine or even one server versus one or two machines. We can account for the differences in capabilities (flexibility of the server or server substitute) and relative costs, assuming we have a good estimate of how many customers in each class we plan to serve. This is the easier part. What is more difficult is considering the risks involved.
In this situation, if one server becomes ill, it is likely that the service can still be provided but with commensurately longer waiting times. This is also the case when we replace one server with a self-service machine: If the machine breaks down, the two remaining full-service servers can temporarily take up the slack. (In both cases, λ/µ must be less than two for this to be possible.) But if one of the two remaining servers becomes ill, the machine cannot take up the sick server’s entire load because of its more limited capability. That is, a server can take all the work of a broken machine, but a machine can take over for only part of a server’s work. The risk imposed by this lack of interchangeability increases when cost comparisons indicate to a manager that the most economical situation is two machines and one server.
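The interchangeability risk can be sketched as a simple capacity check. The example below is an illustration under stated assumptions, not a full queueing model: customers fall into a "simple" class that either a person or a machine can serve and a "complex" class that only a person can serve, and a machine is assumed to work at the same rate as a person. The function name and all rates are hypothetical.

```python
def can_cover(simple_rate, complex_rate, service_rate, servers, machines):
    """Capacity check for a failure scenario.

    Complex work can go only to human servers; simple work can go to either.
    Returns True when both loads fit within the resources still working.
    """
    complex_load = complex_rate / service_rate
    total_load = (simple_rate + complex_rate) / service_rate
    return complex_load < servers and total_load < servers + machines


# Hypothetical demand: 12 simple and 24 complex customers per hour, 20 services per hour.
# The total load is 1.8, so any two working full-service servers can cover it.
print(can_cover(12, 24, 20, servers=2, machines=0))  # three servers, one ill
print(can_cover(12, 24, 20, servers=2, machines=0))  # two servers + machine, machine down
print(can_cover(12, 24, 20, servers=1, machines=1))  # two servers + machine, one server ill
```

Only the last scenario fails: the complex load of 1.2 exceeds what the one remaining person can handle, even though total capacity still looks sufficient.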
Considering that more complex transactions are usually more critical, I would want to be sure that they are covered. Therefore, without even looking at the cost comparisons beforehand, I would restrict my cost comparisons for this situation to the three-server or the two-server-plus-machine options.
If some of the possible classes turn out to be a very low percentage of the total population, it is usually better that they be included as part of a larger class. But, if a class is a significant percentage of a business’s clientele, it is usually worthwhile to consider how you can provide a separate service or priority for them.
One great example of how to serve one class of customer without adding either servers or machines was at a local coffee shop in Oregon where the owner set up a self-serve station for his customers who just wanted a cup of house coffee. He also eliminated the need for a server to handle payments for the house coffees by posting the price for each size of cup and trusting his customers to put the appropriate amount into a locked cashbox with a slit in it. I observed that almost no customers skipped the payment, and some put in more than what was asked because they did not have adequate change. In this case, he reduced not only the average waiting time for his house coffee customers but also the need for server time to support that part of his business.
So how does a business owner identify such classes of customers? In sales operations, POS data can really help. Some examples are as follows:
You can also ask your customers to help by allowing them to select the class of service they want. This is particularly useful for call center and repair service operations. When calling in for support, we have all encountered choice menus that let us identify what type of question we want answered. This not only directs us to the right person to answer that question but also provides data that the call center can use to adjust staffing levels, allows the computer to look up records for some requests (such as a current account balance), and makes the records available to an operator in advance, which reduces the average service time and the required staffing levels.
The useful thing to recognize is that when the times between arrivals follow exponential distributions (that is, arrivals are Poisson) with appropriate average rates, one can easily combine or separate the arrival streams for the classes. The resulting streams will also have exponentially distributed times between arrivals, with the same general characteristics.
For example, consider a typical banking call center operation where there are three choices (effectively classes where the customer tells you what class they are in) on the telephone menu, such as the following:
Assume that the bank knows the average arrival rate for each choice by analyzing its previous call history. We will designate these rates as λ1, λ2, and λ3. Hence the total arrival rate λ to be handled by the call center if it is operated as a multiple-channel, single-phase system where any operator can answer the next call is merely the sum of λ1, λ2, and λ3. Of course, in a large call center with several operators with different skill levels, each individual arrival rate is likely to be assigned to a specific operator or even an automated response for some choice.
Consider a similar situation with the same choices, where the business knows the overall average arrival rate λ for its customer base but has not yet collected enough data to break down that rate into the individual rates for each choice. Wanting to improve its service by possibly designating specific operator(s) or automated responses to each choice, the business checks with a call center consulting firm for information about industry average percentages for each class of questions. The consulting firm provides the values of P1, P2, and P3, respectively, for the business’s choices. Because exponential distributions can also be easily separated into components that are also exponential distributions, the bank can then obtain λi by multiplying λ by the respective probability Pi.
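The separation step described above amounts to one multiplication per class. The sketch below assumes the class percentages sum to 1; the function name, the overall rate of 120 calls per hour, and the shares are hypothetical stand-ins for λ and P1, P2, and P3.

```python
def split_arrival_rate(total_rate, shares):
    """Split an overall Poisson arrival rate into per-class rates: lambda_i = lambda * P_i."""
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("class shares must sum to 1")
    return [total_rate * share for share in shares]


# Hypothetical figures: 120 calls per hour, split 50% / 30% / 20% across three menu choices.
class_rates = split_arrival_rate(120, [0.5, 0.3, 0.2])
print([round(rate, 1) for rate in class_rates])

# Combining works the other way: the pooled rate is the sum of the class rates.
pooled_rate = sum(class_rates)
```

Each per-class rate can then be used in its own waiting line model, for example when deciding whether one menu choice justifies a dedicated operator or an automated response.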
You can get useful information even when a consulting firm does not have all the usual percentages. This is often the case because your customer base does not match up well with any of the databases that consulting firms might have. Software can monitor the frequency of choices selected in the telephone menu: Given that information, a manager can then decide whether or not to address the choices separately or as a combined pool.
Finally, when improving service performance by adding more servers, the training time required for a new server to achieve an acceptable proficiency level becomes a business priority. Those businesses that can do it faster than their competitors have a competitive edge when the demand for their type of services is increasing.
One common method to increase the learning rate is using a mentor alongside a new employee for the first week or so. It is important that the mentor have the new employee do all the repetitions for the best result. This means that the mentor needs to resist the temptation to step in and do some of the steps for the employee to serve a long line of waiting customers more quickly. In such situations, it is better, if possible, to open up another line to take some of the pressure off the new employee’s learning experience.
Another method for businesses that handle different classes of customers separately is to start the new employee with the simpler service operations and then include more complex services as that employee’s proficiency increases.
A common example illustrated in Figure 6.1 is the practice of many coffee shops of having one or two servers take customer orders and collect payments and a separate group of baristas prepare the different types of coffees requested. In most instances, the order takers also handle nonpreparation services, such as pouring house coffees and taking care of pastry orders. This not only provides a training path but also allows the more effective use of the more highly trained barista staff and the ability to use part-time workers for the order taking. In effect, two lines and two classes have been created here: one for all customers and a second line (phase) for customers with more complex orders.
The most important element is clear communication between the manager and the service employees about the training process and what is expected. This should include how to deal with long-line situations to avoid the state-dependent rate variances discussed earlier.
Waiting Line Configurations, Psychological Factors
Many times, it is not practical or even possible to do much to reduce the average waiting time for a service. First, it is important to not do things that would increase a customer’s perception of how long the wait is likely to be. Second, there are things we can do to reduce anxiety in line or even to allow a customer to make more effective use of that waiting time. Some suggestions paraphrased from the available online literature are given here; for more detailed information, consult the classic article by Maister (1985) and the following expanded discussion based on Norman (2008).
Figure 6.1. Coffee shop waiting line configuration for customers with simple orders (white symbols) and more complex orders (gray and black symbols).
Some general axioms regarding the waiting experience are as follows:
Some negative psychological factors to be avoided are as follows:
Some positive things a business can do are as follows:
Arrival Rate Management Methods
Another approach for reducing variability in service processes is to try to control the arrival rate to some degree. The most familiar method is using appointments or reservations. Appointments are used where there is more predictability in the nature of the services requested and the facility capacity and choice of servers are relatively constant. When the service time has a greater range of variability, such as serving meals in a popular restaurant, and a wider choice of servers can serve a given customer, reservations are more appropriate. Other significant factors here are the cost and availability of servers. The higher the skill requirements, the higher the cost per server and the lower the part-time availability of such skilled individuals, which explains the prevalent use of appointments for most professional services.
Consider a typical medical clinic appointment schedule. The facility has a fixed number of medical staff and examination rooms. The capacity of the facility is dependent on the effectiveness of the appointment schedule. Some medical services, such as vaccinations and routine tests, have a relatively constant service time, making their appointment schedule easy if the demand requires it (some daily peaks greater than the daily capacity). The appointment time then becomes whatever the service time is plus some easily calculated small addition to accommodate cleanup and preparation between patients.
But when we consider the appointment schedule for doctors on the staff, life becomes more complicated. The factors to be considered are that not all doctors require the same average examination time, examination times can vary considerably from one patient to the next, patients sometimes arrive a bit late, some room must be left in the appointment schedule to accommodate unscheduled urgent care patients,8 and, most important of all, when a patient examination lasts longer than the allotted appointment time, it affects the timing of some if not all the subsequent appointments that day. Taking all this into account, the goal of an appointment scheduling time analysis is to select an appointment duration that reduces the possibility of an appointment running late without trading off too much patient capacity. The trade-off is between the cost of having people stay later to accommodate the patient who made the last appointment and the cost of accommodating fewer patients during the day.
Having a probability distribution that accurately represents the distribution of examination times is essential. Unfortunately, the standard use of an exponential service distribution to represent service time in general queuing analysis models does not work well here because it places a high percentage of service times (about 63%) below the average time. A normal distribution is more representative of medical examinations, and Erlang or beta distributions are better still, but the best case is when the clinic has enough data to determine its own discrete distribution for use in simulation programs for scheduling. Example 6.5 illustrates how a normal service time probability distribution can be used to estimate the probability of an appointment running over.
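The overrun estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 15-minute average examination time and 4-minute standard deviation are hypothetical figures, not the values used in Example 6.5, and the function names are invented for this sketch.

```python
import math
from statistics import NormalDist


def overrun_probability(mean_min, sd_min, slot_min):
    """Probability that an exam lasts longer than its slot (normal model)."""
    return 1.0 - NormalDist(mu=mean_min, sigma=sd_min).cdf(slot_min)


def slot_for_overrun_risk(mean_min, sd_min, risk):
    """Shortest slot length whose overrun probability is at most `risk`."""
    return NormalDist(mu=mean_min, sigma=sd_min).inv_cdf(1.0 - risk)


def exponential_share_below_mean():
    """Share of exponential service times shorter than their own mean.

    This equals 1 - e^(-1), whatever the mean is, which is why the
    exponential model suits examination times poorly.
    """
    return 1.0 - math.exp(-1.0)


if __name__ == "__main__":
    # Hypothetical clinic figures: 15-minute average exam, 4-minute std dev.
    for slot in (15, 20, 25):
        p = overrun_probability(15, 4, slot)
        print(f"{slot}-minute slot: overrun probability {p:.3f}")
    print(f"slot for 5% overrun risk: {slot_for_overrun_risk(15, 4, 0.05):.1f} min")
```

With these assumed figures, a slot equal to the average runs over half the time, while a 20-minute slot runs over roughly 11% of the time. The longer slot buys that reliability at the price of fewer appointments per day, which is the trade-off the scheduling analysis has to weigh.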
Example 6.5 is a simplified problem that illustrates the need for using simulation methods, where the effect of various appointment times and different overrun times can be evaluated to determine the best result that provides the highest capacity without undue costs and customer waits. A normal distribution is rarely the best distribution for scheduling purposes, but it is often used because it is easier to use in queuing analysis models and more familiar to many users.
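A minimal simulation sketch of the kind described above follows. It makes several simplifying assumptions that a real study would need to relax: one doctor, patients who arrive exactly on time, no urgent care walk-ins, and normally distributed examination times truncated at 1 minute. The 480-minute day and the 15-minute and 4-minute parameters are hypothetical, and the function names are invented for this sketch.

```python
import random


def simulate_day(slot_min, n_appointments, draw_exam_time, rng):
    """Simulate one doctor's day.

    Returns (average patient wait, minutes worked past the planned end).
    Patients are assumed to arrive exactly at their scheduled time.
    """
    free_at = 0.0      # time at which the doctor becomes available
    total_wait = 0.0
    for k in range(n_appointments):
        scheduled = k * slot_min
        start = max(scheduled, free_at)   # a late-running exam delays the next
        total_wait += start - scheduled
        free_at = start + draw_exam_time(rng)
    planned_end = n_appointments * slot_min
    overtime = max(0.0, free_at - planned_end)
    return total_wait / n_appointments, overtime


def evaluate_slot(slot_min, day_length_min=480, mean=15.0, sd=4.0,
                  days=2000, seed=1):
    """Average the daily results over many simulated days for one slot length."""
    rng = random.Random(seed)
    n = int(day_length_min // slot_min)   # appointments that fit in the day

    def draw(r):
        # Assumption: normal exam times, truncated so no exam is under 1 minute.
        return max(1.0, r.gauss(mean, sd))

    results = [simulate_day(slot_min, n, draw, rng) for _ in range(days)]
    avg_wait = sum(w for w, _ in results) / days
    avg_overtime = sum(o for _, o in results) / days
    return n, avg_wait, avg_overtime


if __name__ == "__main__":
    for slot in (15, 18, 20, 25):
        n, wait, overtime = evaluate_slot(slot)
        print(f"slot {slot:>2} min: {n} patients/day, "
              f"avg wait {wait:.1f} min, avg overtime {overtime:.1f} min")
```

Running the loop for several slot lengths shows the capacity given up against the waiting and overtime avoided. Replacing `draw` with a sampler built from the clinic's own recorded examination times would bring the sketch closer to the discrete-distribution approach recommended earlier.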
Another observation is that if the examination room preparation and patient record entry could be handled independently of the doctor and the nurse, it would free up another 2 minutes per appointment. For example, you could have a dedicated record clerk do the entries and a dedicated attendant prepare examination rooms.
Staggering the appointment times so that each doctor in a practice does not start appointments at the same times would simplify the scheduling of these back-office support activities. Staggering appointment times would also space out appointment check-ins for a smoother process flow at the check-in desk and provide a positive psychological effect in the waiting room because patients would be arriving and departing on a more regular basis, not intermittently in groups.
Actual operational data are necessary to provide the best representations of the service time and arrival time distributions in simulations. It cannot be emphasized enough that the result of a simulation is only as good as the knowledge of the real-life process and the actual operational data available for it.
For this medical clinic problem, the following data are needed for an accurate simulation solution:
Reservation policies have similar constraints and data collection needs, but they allow some simplifications because the number of reservations required per day is more limited. For many popular restaurants, the number of successive reservations (turns) for a given location is limited to how many groups they expect to seat at that location during a normal dining period (breakfast, lunch, or dinner). Obviously, the restaurant will require an estimate of how long customers will typically take to order and eat their meals and also must recognize that its own service rate contributes to this period. One rule of thumb to help judge the variability of dining time is that the larger the group, the longer customers can be expected to take for dinner because of the increased socializing (and beverages) likely to occur.
In many European restaurants, a group has the table for the evening, so the likelihood of a late-night reservation is not considered. In that respect, such restaurants are no different from one-time events, such as airplane flights, hotel rooms, Broadway shows, and sporting events. The challenge here is to fill the available seats or rooms for maximum profit. There is always some risk that someone with a reservation will cancel at the last minute. Knowing the average number of no-shows based on their past experience, some businesses will accept that number of reservations over their available capacity to ensure full usage of that capacity. The risk they take on is possibly having to deal with a highly dissatisfied customer who has a reservation and no accommodation because the usual number of no-shows did not occur. This practice, called overbooking, is used by many airlines and hotels. While overbooking analysis also involves working with arrival distributions and probabilities, it is not included in the scope of this monograph.11
Priority Management
To conclude this chapter, some discussion regarding suggested guidelines for using priorities in waiting line applications is in order. We will cover this in two parts: people and items. When dealing with people, one basic rule to remember is that when you selectively improve the waiting experience of one group in a queue, you have also chosen to make that experience worse for the other groups in the queue. Another basic rule to also consider is that when dealing with systems where all servers are equally capable of handling all customers, priority management does not affect the overall average performance.
Simply put, moving some people to the head of the line makes others wait longer. If you do this too often in a single line, you run the risk of delaying some customers long enough that they give up and leave the line without being served. If the reason you give some customers priority is not explained or appears to be unfair to the remaining customers, you risk losing even more customers who will take their business elsewhere.
Dealing with these issues is easy in some situations. Giving preference to more critically ill patients in an emergency room is generally understood by the other patients and even expected. Another commonly understood preference is when passengers who would otherwise miss their flights are called to the head of the line at the airline ticket counter. The announcement not only helps identify who needs preference but also tells the other passengers why that preference is being given.
Giving preference to more valuable customers on hold in a call center is easy because the process is invisible to other customers on hold and the selection can even be automated. For face-to-face professional services, some businesses even use separate entrances for preferred customers to achieve a form of call center anonymity.
When the number of customers requiring some preference is relatively large, it is better to treat them as a class and set up either a separate line (frequent travelers at an airport) or a business process like an express line at a grocery store to handle them. This also helps communicate to other customers the reason for the different treatment to reduce their impressions of being treated unfairly.
Nonpreemptive and Preemptive Priorities
There are two ways to handle higher-priority customers: nonpreemptively and preemptively.
A nonpreemptive approach moves a higher-priority customer to the head of the waiting line, allowing a lower-priority customer currently being served to continue being served until that service is completed. A preemptive approach allows a higher-priority customer (A) to replace a lower-priority customer (B) currently being served, delaying the completion of the service for B until later when there are no more higher-priority customers to preempt B. If there are a large number of higher-priority customers, B is likely to wait a long time before his or her service can be completed.
The nonpreemptive approach is preferred for most call center priority applications because it best preserves the invisibility aspect of the prioritization process. (Imagine being asked to be put back on hold while the operator takes care of another caller.) It is also preferred in most applications where immediate service is not the sole reason for the priority and where the line behavior is observable by all customers because it is perceived as being fairer than the preemptive approach.
While this should be intuitively obvious, using priorities increases the variability of waiting times: the higher the percentage of customers getting preferential treatment, the higher the variability. Because variability adds uncertainty to business outcomes, using priority rules in processing waiting line customers should be carefully considered; if used, it should be limited to only a small percentage of the arrival population.
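The effect of a nonpreemptive priority rule can be illustrated with a small simulation sketch. It is not one of the analytical models cited in this chapter. It assumes a single server, exponential interarrival and service times, and service times that do not depend on who is being served. The preferred class "A", its 10% share, the rates, and all function names are hypothetical.

```python
import heapq
import random
from statistics import mean, pvariance


def simulate(arrivals, services, use_priority):
    """Single server, nonpreemptive: a service in progress is never interrupted.

    arrivals: list of (arrival_time, cls) sorted by time; "A" is the
              preferred class and "B" is everyone else.
    services: services[k] is the length of the k-th service started, so
              service times are independent of the customer served.
    Returns a list of (cls, wait) in order of service.
    """
    waiting = []            # heap of (rank, arrival_time, cls)
    served = []
    clock, i = 0.0, 0
    while len(served) < len(arrivals):
        # Admit everyone who has arrived by now.
        while i < len(arrivals) and arrivals[i][0] <= clock:
            t, cls = arrivals[i]
            rank = 0 if (use_priority and cls == "A") else 1
            heapq.heappush(waiting, (rank, t, cls))
            i += 1
        if not waiting:
            clock = arrivals[i][0]      # server idles until the next arrival
            continue
        _, t, cls = heapq.heappop(waiting)
        served.append((cls, clock - t))
        clock += services[len(served) - 1]
    return served


def make_inputs(n=5000, arrival_rate=0.8, service_rate=1.0, share_a=0.1, seed=7):
    """Hypothetical workload: Poisson arrivals, exponential service times."""
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    for _ in range(n):
        t += rng.expovariate(arrival_rate)
        arrivals.append((t, "A" if rng.random() < share_a else "B"))
    services = [rng.expovariate(service_rate) for _ in range(n)]
    return arrivals, services


def summarize(result):
    """Overall mean wait, variance of waits, and mean wait per class."""
    waits = [w for _, w in result]
    by_class = {c: mean(w for k, w in result if k == c) for c in ("A", "B")}
    return mean(waits), pvariance(waits), by_class


if __name__ == "__main__":
    arrivals, services = make_inputs()
    for label, flag in (("first-come, first-served", False), ("priority", True)):
        avg, var, by_class = summarize(simulate(arrivals, services, flag))
        print(f"{label}: mean wait {avg:.2f}, variance {var:.2f}, by class {by_class}")
```

Because both runs use the same arrivals and the same sequence of service times, the overall average wait comes out the same under either rule. What changes is who does the waiting and how widely the waits are spread. Raising `share_a` shows how that spread changes as more customers receive preference.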
Some models12,13 have been developed to determine the increased variability in average waiting time when using both nonpreemptive and preemptive priorities. These models also aid in determining the degree of reduction in the average waiting time for higher-priority customers and the concomitant increase in waiting time for lower-priority customers. If you are interested in possibly applying these models to your situation, consult Hillier and Lieberman (2010), chapter 17, and Haussmann (1970). Be sure to review the assumptions and conditions for using such models to make sure they are appropriate for your situation.
An effective use of intermittent preemptive prioritization is when a customer being served needs to fill out a form or otherwise do something that only the customer can do to complete the service. Instead of asking the customers waiting in line to wait longer while this is done, the server asks the customer to leave the line to fill out the form or do the other activity and then reenter at the head of the line when done, in effect giving that customer a higher priority. Such a line is occasionally referred to as a double-ended queue or deque. This has several benefits with no additional cost to the business to implement it: it reduces the average service and waiting times, improves customer satisfaction, and is generally viewed by customers as a fair use of prioritization.14
Manufacturing applications can use a wider range of priority rules when dealing with items or jobs with varying service times. These rules are normally applied as part of the production scheduling process. At the beginning of each day, the production scheduler reviews the list of jobs or items to be produced and then sequences them according to the priority rule used. Some of the more commonly used rules15,16 are as follows: shortest process time first (SPT), first-come, first-served (FCFS) or first-in, first-out (FIFO), earliest due date (EDD), slack per remaining operation (S/O or SRO), critical ratio (CR), and Johnson’s Rule (for the special case of two sequential steps with varying service times at each).
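Of the rules listed, Johnson's Rule is the least self-explanatory, so a short sketch may help. In its usual textbook form, jobs whose first-step time is no longer than their second-step time go first, in order of increasing first-step time, and the remaining jobs go last, in order of decreasing second-step time. The five jobs and their times below are hypothetical, and the function names are invented for this sketch.

```python
from itertools import permutations


def johnson_sequence(jobs):
    """Order jobs for two sequential steps using Johnson's Rule.

    jobs: dict mapping job name -> (time on step 1, time on step 2).
    """
    front = sorted((n for n, (a, b) in jobs.items() if a <= b),
                   key=lambda n: jobs[n][0])                 # shortest step 1 first
    back = sorted((n for n, (a, b) in jobs.items() if a > b),
                  key=lambda n: jobs[n][1], reverse=True)    # shortest step 2 last
    return front + back


def makespan(order, jobs):
    """Total time to finish every job when processed in the given order."""
    end1 = end2 = 0
    for name in order:
        a, b = jobs[name]
        end1 += a                       # step 1 runs without gaps
        end2 = max(end1, end2) + b      # step 2 waits for step 1 and for itself
    return end2


def best_makespan_by_enumeration(jobs):
    """Brute-force check; practical only for a handful of jobs."""
    return min(makespan(p, jobs) for p in permutations(jobs))


# Hypothetical jobs: (hours on step 1, hours on step 2)
JOBS = {"J1": (3, 6), "J2": (5, 2), "J3": (1, 2), "J4": (6, 6), "J5": (7, 5)}

if __name__ == "__main__":
    order = johnson_sequence(JOBS)
    print(order, makespan(order, JOBS), best_makespan_by_enumeration(JOBS))
```

For these hypothetical jobs the rule gives the sequence J3, J1, J4, J5, J2 with a makespan of 24 hours, which matches the best result found by trying every possible order.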
Performance measures for such schedules include the average number of jobs or work in process (WIP inventory, the same as L in queuing analysis), the average job lateness (how far jobs miss their due dates), the job flow time (the same as W in queuing analysis), and makespan (the total time to complete a group of jobs).17
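The sketch below shows how these measures can be computed for a single work center under four of the rules mentioned above. It rests on several assumptions: all jobs are available at time zero, jobs are listed in order of arrival, lateness is counted only for jobs finished past their due dates, and the critical ratio is computed once at time zero rather than updated as jobs finish. The five jobs, their processing times, and their due dates (in days) are hypothetical.

```python
def sequence(jobs, rule, now=0):
    """Return the jobs in processing order under the given priority rule.

    jobs: list of dicts with 'name', 'time' (processing days), and 'due'
          (due date in days from now), listed in order of arrival.
    """
    if rule == "FCFS":
        return list(jobs)
    if rule == "SPT":
        return sorted(jobs, key=lambda j: j["time"])
    if rule == "EDD":
        return sorted(jobs, key=lambda j: j["due"])
    if rule == "CR":
        # Critical ratio = time remaining until due / processing time;
        # the lowest ratio is the most urgent.
        return sorted(jobs, key=lambda j: (j["due"] - now) / j["time"])
    raise ValueError(f"unknown rule: {rule}")


def measures(ordered):
    """Flow time, lateness, WIP, and makespan for one processing order."""
    clock = 0
    flow_times, lateness = [], []
    for job in ordered:
        clock += job["time"]
        flow_times.append(clock)                       # jobs all available at time 0
        lateness.append(max(0, clock - job["due"]))    # early jobs count as 0
    n = len(ordered)
    return {
        "avg_flow_time": sum(flow_times) / n,          # W
        "avg_lateness": sum(lateness) / n,
        "avg_jobs_in_system": sum(flow_times) / clock, # L (WIP)
        "makespan": clock,
    }


# Hypothetical jobs, in order of arrival.
JOBS = [
    {"name": "A", "time": 6, "due": 8},
    {"name": "B", "time": 2, "due": 6},
    {"name": "C", "time": 8, "due": 18},
    {"name": "D", "time": 3, "due": 15},
    {"name": "E", "time": 9, "due": 23},
]

if __name__ == "__main__":
    for rule in ("FCFS", "SPT", "EDD", "CR"):
        order = sequence(JOBS, rule)
        print(rule, [j["name"] for j in order], measures(order))
```

On these hypothetical jobs, SPT gives the lowest average flow time (13 days), EDD gives the lowest average lateness (1.2 days), and the makespan is 28 days under every rule because a single work center processes the same total work regardless of order. No one rule is best on every measure.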
It should be obvious that the scheduling goal is to minimize the value for each measure. Because priority rules do not all affect performance measures to the same degree or in the same manner, a manager should select a rule that best addresses the performance measure that is most important for that manager’s business. If all jobs must go through the same sequence of steps or operations, the queuing analysis models discussed in this monograph can be used to determine which scheduling priority rule provides the lowest performance measure values for a particular business.
However, when the job flow is not the same for all items, which occurs in many job shops, most queuing models are inadequate for the task. There is considerable work in progress to develop good mathematical models for job shop situations, but the mathematics involved is challenging, and the results currently available are usually difficult for someone without strong mathematical or statistical skills to understand.
One special priority rule used in manufacturing is the practice of using “hot lots” for preferentially moving selected items or jobs through a process faster. While in practice it is best to avoid the need for expediting items through a manufacturing process, sometimes it is necessary to satisfy the need of a critical customer or cope with output constraints imposed by a manufacturing process. This need was prevalent in many early integrated circuit fab facilities because of the large variance in final product yields and long manufacturing times of several weeks or more. A problem frequently encountered then with hot lots was that there were often so many rush requests from different customer groups that the reduced processing time advantage for hot lots was so small that it was not worthwhile. This forced many fab managers to put restrictions on the number of hot lots they would accept at any one time. This limit was typically between 5% and 10% based on the experience of a given fab manager. Analysis using one of the priority models available today would have allowed managers to set more appropriate hot-lot limits for a desired expedited throughput time.
In summary, my experience has indicated that using priority systems in queuing applications is less effective when dealing with people and should generally be avoided unless there is a clear justification for them (emergency rooms, 911 calls, etc.). Separating customers into different service classes is a better way to offer improved services to customers.