Chapter 17
Tracking the Quality of Services

Harry P. Hatry

Public organizations track the quality of public services for two overarching purposes: (1) to enable public administration officials to identify where progress is, and is not, being made, helping them continually improve their programs and services and thereby the lives of their citizens, and (2) to enable higher-level officials, legislative bodies, and the public to hold service providers accountable for getting value for their money.

Some public agencies use the word quality to refer exclusively to how a service is delivered, such as its timeliness, convenience, and courteousness of delivery. The word outcome is then used to indicate what happens after delivery (e.g., whether customers of health programs improved). This chapter primarily uses the word quality to encompass both how well the service is delivered and the results achieved.

This chapter first briefly describes the history and importance of performance measurement (performance measurement is the term commonly used for the process that tracks service quality). Then it describes the performance information needed and how that information can be obtained. The chapter also discusses a number of ways performance measurement is changing and can be vastly improved.

The Brief History and Limitations of Service Quality Measurement

In 1938, the International City Management Association (ICMA) (as the International City/County Management Association was then called) published Measuring Municipal Activities: A Survey of Suggested Criteria for Appraising Administration (Ridley & Simon, 1938). That report, by Clarence E. Ridley, executive director of ICMA, and Herbert A. Simon, an assistant professor of political science at the Illinois Institute of Technology, was a pioneering work, discussing potential ways to measure the outputs (the amount of work completed) for a number of municipal services and not only service costs. The report introduced the need to track outputs but did not address the lack of information on service outcomes. Much has transpired in the more than seventy-five years since its publication.

Starting in the late 1970s, US business was forced by international competition to pay explicit attention to product quality. This led to such innovations as quality circles and the Total Quality Management movement. Customer satisfaction became a central focus for private business, and this orientation spread into the public sector as well. Highly visible books such as Peters and Waterman's In Search of Excellence (1982), Deming's Out of the Crisis (1986), and Osborne and Gaebler's Reinventing Government (1992) focused on such interrelated themes as service quality, customer satisfaction, and managing by results.

ICMA and the Urban Institute in the 1970s examined ways to measure the quality of basic municipal services, leading to their joint publication of the 1977 report How Effective Are Your Community Services? (Hatry et al., 1977) and its two subsequent editions, work that ICMA later drew on in developing its Center for Performance Measurement.

The private sector Financial Accounting Standards Board and its governmental counterpart, the Governmental Accounting Standards Board (GASB), recommend financial reporting standards for businesses and for governments, respectively. Beginning in the late 1980s, GASB encouraged state and local governments to report annually and publicly on what it called “service efforts and accomplishments” (SEA).1 The major component of SEA reporting is information on service accomplishments, especially service outcomes.2

In the early 1990s, a growing number of state legislatures (including those of Oregon, Texas, Minnesota, and Virginia) passed legislation requiring performance measurement of executive branch services.

The passage by Congress in 1993 of the Government Performance and Results Act (GPRA) was a major event. The act required all federal agencies to undertake performance measurement, set targets for each performance indicator, and annually report the results. This opened up a major new emphasis in the federal government on performance measurement focused on service outcomes. Because the federal government provides major funding to state and local governments, the need for performance measurement has trickled down to them.

The GPRA Modernization Act of 2010 recognized that federal agencies had not been using the information to help them manage. It introduced a number of requirements to encourage the use of performance measurement data to improve agency services, such as holding quarterly data-driven reviews of each agency's priority goals.

Performance measurement has become commonplace in state and local governments, as well as the federal government. Most performance measurement efforts have been top down, driven by requirements from the legislature or from a central administrative office, and the major purpose has been viewed as accountability rather than service improvement. As a result, program personnel have typically not used the results for management or program improvement purposes. Some important exceptions include the wide use of crime data by police departments; the New York City sanitation department's use since the 1970s of systematic street cleanliness measurement; many state (and some local) transportation agencies' use of regular road condition measurements; state and local agencies' tracking of employment (required by federal jobs legislation); and school districts' increasing reporting on various education achievement indicators.

Need for Multiple Types of Performance Indicators

Seldom, if ever, will a single indicator of service quality be sufficient. Inevitably, individual programs have multiple elements that should be tracked.

A comprehensive performance measurement system has these components:

  • Output indicators (the amount of work completed)
  • Outcome indicators, both intermediate and end-outcome indicators
  • Efficiency indicators, both output-efficiency and outcome-efficiency indicators

Output counts tell little about how successful a program or service is. Nevertheless, they are an initial product, are important to managers, and should not be neglected by an agency in its tracking systems.

More difficult to track are outcomes. Agencies should track both the end outcomes of their programs and what are often called intermediate (or interim) outcomes. A program's intermediate outcomes reflect the extent to which persons or organizations external to the agency have taken actions or exhibited behaviors that are expected to lead to the program's end outcomes.

Characteristics of how a service is delivered that are important to customers can also be considered intermediate outcomes. Such characteristics include service timeliness, accessibility, courteousness, and helpfulness.

Managers usually have considerably more control over intermediate outcomes than over end outcomes. Intermediate outcomes generally occur earlier in time than end outcomes, and data on them are generally considerably easier to obtain. Thus, managers are usually more comfortable with intermediate outcomes than with end outcomes.

Agencies need to be careful, however, that managers do not neglect end outcomes due to the readier availability of intermediate outcome data. In the initial versions of their performance measurement systems, programs have often relied on response-time indicators as their primary performance indicators. However, although response times are important to customers and should be tracked, they are only intermediate outcomes. They seldom tell anything about the results of the service other than whether it was delivered in a timely way.

Figure 17.1 illustrates the relationship between intermediate and end outcomes, using what is often called a logic model or outcome sequence chart. Such diagrams can be useful to public administrators in helping them and their staffs think through their mission and the sequence of outcomes that are expected to occur and in indicating what indicators are needed.3


Figure 17.1 Outcome Sequence Diagram

Note: IO = intermediate outcome; EO = end outcome.

The logic model in Figure 17.1 applies to community policing programs such as Neighborhood Watch. The initial intermediate outcome (IO-1) calls for an indicator such as “number of residents participating in the program.” If few or no residents participate, outcomes from the program cannot be expected. The second and third intermediate outcomes are changes in resident behavior: residents protecting their homes (IO-2) and giving police leads relating to crimes (IO-3). The results sought are reductions in crime in the neighborhood (EO-1), increases in arrests due to leads from residents (EO-2), and improved feelings of security among residents in the neighborhood (EO-3). Each indicator provides information that should be of importance to the manager in making decisions about the program.

Such indicators as the number of police officers assigned to the program and the number of neighborhood meetings held by the police are not considered outcomes. The number of officers assigned would be better labeled an input count and the number of meetings held an output count. Whatever the indicators are called, each is important for managing. If any of these numbers are poor, the program is less likely to be successful in producing desired outcomes.

Efficiency indicators are usually defined as the ratio of amount of cost to amount of product and have typically been reported as cost per unit of product. Most existing efficiency indicators relate cost to the number of output units: cost per unit of output. These numbers are calculated as the total cost of a service or activity divided by the total number of units of output delivered. This produces, for example, indicators such as average cost per meal served (in a particular government-supported institution).

Seldom used in performance measurement systems are outcome-efficiency indicators. These take the form of cost per unit of outcome, such as the cost per person who was served and whose condition was found to be improved after the service was provided. This lack of tracking outcome efficiency is due in part to the lack of reliable outcome data.
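To make the distinction concrete, the following is a minimal sketch, using hypothetical numbers for a government-supported meal program, of how an output-efficiency indicator (cost per unit of output) and an outcome-efficiency indicator (cost per unit of outcome) would be calculated:

```python
# Minimal sketch with hypothetical numbers for a government-supported meal program.

total_cost = 250_000.00      # annual program cost (hypothetical)
meals_served = 100_000       # output: units of work completed
clients_improved = 2_500     # outcome: clients whose condition improved after service

# Output-efficiency indicator: cost per unit of output
cost_per_meal = total_cost / meals_served

# Outcome-efficiency indicator: cost per unit of outcome
cost_per_client_improved = total_cost / clients_improved

print(f"Average cost per meal served:     ${cost_per_meal:,.2f}")
print(f"Average cost per client improved: ${cost_per_client_improved:,.2f}")
```

The same total cost yields very different figures depending on whether outputs or outcomes are used as the denominator, which is one reason reliable outcome data are a prerequisite for outcome-efficiency indicators.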

Sources of Data and Data Collection Procedures

Public agencies can use a number of sources and procedures for measuring performance. The principal sources are: (1) agency records, (2) citizen and client surveys, (3) trained observer procedures, (4) various physical measurement devices, and (5) focus groups.4

Agency Records (Administrative Data)

Existing agency records have been by far the greatest source of performance data that most government agencies use. These records are the primary source of output data such as data on program costs and the number of units of physical output produced by a program (e.g., number of repairs, number of records processed, gallons of water treated, tons of garbage collected). They also sometimes provide data on service outcomes, for example, the number of reported crimes (as an indicator of the success of crime prevention) and school attendance and graduation rates.

Governments have increasingly measured response times to citizens' requests for services, such as response times to calls reporting fires and crimes. Public agencies have also begun to measure response times for many other public services, such as the time to answer complaints, issue a driver's license, or complete housing and building inspections. Response-time data can be obtained from agency records, though agencies may need to modify their procedures in order to record the times of initial requests and the times of completion of the needed work.

Recent technology developments, such as in big data, have led to considerable attention to the potential for data sharing among agencies. This can enable an agency to obtain outcome information on a regular, timely basis from other agencies that have outcome information on the first agency's customers. An example is a social service agency providing prevention services to adults or juveniles at risk of criminal behavior: the agency seeks regular information from police departments or courts on clients' subsequent criminal behavior.

A major hurdle to obtaining agency record data from other agencies is the need to ensure the confidentiality of customer information. This will likely require negotiations and memos of understanding among sharing agencies.

Customer Surveys

Surveys enable public administrators to obtain systematic feedback on a number of outcome indicators for almost all public services. Customers are, by definition, the recipients of government services, and their ratings of these services should be of critical importance to public officials.5

Other forms of citizen feedback are also important to public officials, such as complaint counts (available through agency administrative records) and comments made at public hearings and similar sessions. However, these sources are not likely to be as representative of the views of the full set of customers as properly done surveys. (Not everyone knows how to complain or is willing to complain.)

The use of surveys by federal, state, and local governments has increased considerably in recent years. Some states sponsor annual surveys (perhaps conducted by a state university). Since the late 1990s, ICMA's Center for Performance Measurement has been reporting annual comparison data on basic municipal service performance indicators for a number of city and county governments. As part of this effort, each participating local government uses a standardized questionnaire.

Surveys can seek information on all households in a community or only citizens or businesses that are customers of a particular service. They can obtain customer ratings of specific characteristics of service quality, such as timeliness, accessibility, adequacy, and the dignity with which specific services are provided. Surveys can provide other intermediate outcome data, such as changes in customer behaviors and practices (e.g., environmental protection behavior and improved personal health care practices). And they can obtain information on end outcomes, such as current employment status, current health, and satisfaction with recreation and library services.

Household surveys can provide information on the percentage of households that have not used particular public services or facilities (such as parks or public transit) and can also ask for reasons for nonuse. Agency records can sometimes provide counts of the number of uses of controlled-access facilities. However, such counts do not indicate how many different individuals or families used a facility or service, only how often it was used. Nor do agency records generally provide demographic information on nonusers or on reasons for nonuse.

Surveys can seek explanatory information. Questionnaires can ask respondents who gave poor ratings to a particular characteristic to identify why they gave that rating. Such information can provide public administrators valuable leads to needed improvements.

A major concern in the survey community has been the effect of technology advances in communication. Telephone surveys are finding it increasingly difficult to reach people and get them to respond, as people screen calls and use cell phones or smartphones rather than landlines.

The major constraint on wider regular use of surveys is their potential cost. This concern is driven by the belief that surveys need to be conducted by survey organizations, which can be quite expensive. For use in regularly tracking service quality, however, elaborate surveys are seldom likely to be needed. The cost in time and money can be alleviated by these steps:

  • Limit the size of the questionnaire, perhaps to two pages. Those surveyed are more likely to respond to a short survey, especially if the agency asks the respondents for their help in improving the service.
  • Use mail surveys, providing stamped, addressed return envelopes. For surveys of an agency's customers, agencies will often have, or can readily obtain, contact information on their customers.
  • Consider web-based surveys. These are likely to become increasingly practical options as more and more persons have ready Internet access. Surveys of businesses are likely to be readily administered by electronic means.
  • For agencies with large numbers of customers, use random sampling. Samples of even one hundred customers or households, randomly selected, can be informative for a program if the issues being investigated do not require a high degree of precision. For example, a survey that receives a response from one hundred clients out of two hundred questionnaires mailed to a random sample of clients (a 50 percent return rate) is likely to be more accurate than one that gets one thousand responses from a mailing list of five thousand clients (a 20 percent return rate).
  • Consider combined surveys of citizens that include questions on multiple services, to spread survey costs among programs or agencies. However, surveys of customers of a single agency or program can provide more detailed information.

Inexpensive software programs are available to help with questionnaire construction and for tabulating and reporting the results of surveys. This means that information can be processed quickly and efficiently once the responses have been received. However, follow-ups to nonrespondents are likely to be needed to obtain an adequate response rate, say, 50 percent.

Trained Observer Ratings

Trained observer procedures use observers taught to use standardized rating procedures to rate physically observable characteristics of a service.6 A number of cities have used such ratings to track street cleanliness, led by New York City, which has been tracking street cleanliness since the 1970s. Such ratings have also been used to assess the physical conditions of roads and bridges, traffic signs and signals, public buildings, and parks and playgrounds.

The purpose of the procedure is to provide reliable objective ratings so that different observers making ratings at different times would give approximately the same rating to observed conditions. The ratings might be made by employees (such as in New York City), volunteers, or student interns trained in the use of rating scales.

An attractive feature of such ratings of physical conditions is that they can identify in real time where physical problems exist and can even be linked to work orders for correcting those problems. Another feature is that the agency can map conditions showing the relative extent of deficiencies in various parts of its jurisdiction (e.g., by using various degrees of shading or coloring), as did the maps that New York City included for many years in the annual Mayor's Management Report. Agencies with geographic information system capability should be readily able to produce such exhibits.

Trained observer procedures have been used in some human services agencies to estimate the extent of improvement in the condition of specific clients (e.g., clients in developmental disability programs). Rating scales are established for various conditions that indicate a client's functioning levels relevant to the particular conditions of the customers being evaluated. The observers might rate a customer on each condition at intake and again at one or more points after the client has received services. The ratings for all clients are tabulated to determine the percentage of clients whose condition improved by various degrees.
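The following is a minimal sketch of the tabulation described above, assuming a hypothetical rating scale of 1 (lowest functioning) to 5 (highest functioning) and illustrative client data:

```python
from collections import Counter

# Hypothetical trained observer ratings at intake and at follow-up,
# on an assumed scale of 1 (lowest functioning) to 5 (highest functioning).
clients = [
    {"id": 1, "intake": 2, "followup": 4},
    {"id": 2, "intake": 3, "followup": 3},
    {"id": 3, "intake": 1, "followup": 2},
    {"id": 4, "intake": 4, "followup": 3},
]

# Percentage of clients whose condition improved
improved = sum(1 for c in clients if c["followup"] > c["intake"])
print(f"{100 * improved / len(clients):.0f}% of clients improved")

# Distribution of the degree of change, for reporting improvement by various degrees
change_counts = Counter(c["followup"] - c["intake"] for c in clients)
for change in sorted(change_counts):
    print(f"Change of {change:+d} rating point(s): {change_counts[change]} client(s)")
```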

Physical Measurements

Some governments use physical measurements to measure selected physical outcomes of some services, such as the quality of water after treatment by a water supply or wastewater treatment facility, air and water pollution levels, and road conditions. For example, state and local transportation agencies use a variety of devices attached to or carried in cars that record vertical displacement accurately, providing reliable measurements of road conditions.

Focus Groups

Focus groups do not provide statistical data for tracking service quality. However, they can be useful in providing information on what service quality characteristics are of concern to customers and, thus, what should be measured. Sessions are held with perhaps eight to twelve persons (such as customers) to obtain feedback about a service. Focus group participants might be asked two primary questions: What do you like about the service? What do you not like about it?

Focus groups can also be useful after data have been obtained to help the agency interpret why unexpected results occurred. This particular use of focus groups, however, appears to be rare.

Improving the Usefulness of Performance Measurement Systems

Most performance measurement systems in the United States (and probably the rest of the world) need a major upgrade.7 The potential of performance measurement systems in providing useful information to public managers has hardly been tapped. Major factors in the ability to close this gap are technological advances in handling data.

Public agencies, especially at the federal level, are beginning to address some of these needs. The features discussed in the following sections can provide public administrators a considerably more complete picture of what is happening and what is likely to be needed.

These three primary concerns are beginning to be addressed in some performance measurement systems:8

  • Outcome indicators are often limited to those for which data are already collected, thus excluding key outcome indicators.
  • Basic analysis of the performance information too often is not done, considerably reducing the value of the information.
  • Public administrators seldom, if ever, use the performance information to improve services.

The following sections discuss ways that are beginning to emerge to better enable performance measurement systems to meet today's needs.

Providing Timely Data

The timeliness of performance measurement information is critical to the usefulness of information. Performance measurement systems should be able to provide information on performance progress throughout the year. This will allow public administrators to identify problems, take timely action, and subsequently assess whether their actions have led to improvement.

In the early days of performance measurement, formal reporting from performance measurement systems tended to be infrequent, typically annually. Quarterly reporting occurred in some states such as Texas, where the quarterly reporting was required by the legislature. Starting about 2010, Congress and the Office of Management and Budget (OMB) began requiring federal agencies to report quarterly on “agency priority goals” but only annually on other goals.9 Federal agencies can, of course, report performance more frequently if only for internal use.

Managers make many program decisions throughout the year and need performance information that is as current as possible. They should be able to quickly pull the latest performance indicator data from their computers, smartphones, or other devices.

The frequency of data collection and reporting depends on the specific performance indicator. The values of some indicators might not change appreciably within a year, might be needed only annually, or might not be feasible to collect more than once a year. Other indicators are most useful if measured weekly; others need less frequent measurement, such as monthly, quarterly, or annually. Data on street cleanliness, for example, might be provided weekly or monthly, as in New York City. Quarterly reporting appears to be becoming a common interval for reporting major performance information.

Some governments have recently begun conducting surveys of households within their jurisdictions on a regular basis to obtain feedback on the quality of their services. Most often these surveys are conducted annually or less frequently, not a timely schedule for program managers. To reduce this problem, a program can split its annual survey effort into parts. For example, a quarter of the total number of sample households might be surveyed every three months. Although the data obtained each quarter will yield less precise findings for the quarter, the increased frequency of feedback is likely to be considerably more useful to administrators and will identify seasonal differences.

For some services, a survey questionnaire can be sent to all of a program's customers, especially if their number is not large, rather than having to use random samples. This would enable managers to obtain up-to-date information more frequently.

Delivery of data is also sometimes delayed by the time needed to clean, process, and tabulate the data after all returns have been received. Technology advances should enable these delays to be significantly reduced.

Disaggregating Outcome Data

Disaggregation enables public administrators to examine outcomes for different categories of customers and service situations. Performance measurement systems suffer greatly from “aggregationitis”: they typically provide only aggregate information on their performance indicators, especially their outcome indicators. Likely to be considerably more useful to managers is disaggregated performance information, that is, information broken out into relevant groupings. Breakouts, especially of outcome data, can provide public administrators and other officials with considerably more specific information on where their programs are working, where they are not, and under what conditions. This information can provide important clues on where problems exist and the corrective action needed. It can also help identify inequities.

Public administrators are likely to find particularly useful the following breakouts of outcome indicator values:

  • Various customer demographic groups, such as by age, income, race/ethnicity, educational level, and household composition. Such breakouts can be important for assessing equity concerns.
  • Geographical categories, such as outcome data broken out by region, state, county, city, service district, neighborhood, census tract, and so on.
  • Service providers. It is likely to be very helpful if outcome information on their own customers (as well as aggregate data across units) is available to each unit's manager and supervisor. This applies to such individual service units as each facility, each program office, and so on. Outcome indicators can also be broken out by individual case workers. Another application is to provide outcome information on each contractor.
  • Type and amount of service provided to individual customers. Program administrators should find it highly useful to break out some outcome indicators by mode of service delivery. This should be particularly helpful to administrators who want to be innovative and try different approaches to service delivery. A program might be applying different amounts or different types of assistance to different parts of its workload. If the program records which approach is used for each element of the workload, a computer can then readily tabulate the performance data for each approach. Even better, to provide more convincing evidence that the approach with the best outcomes caused those outcomes, the program could randomly assign customers among the different service delivery approaches.
  • Client difficulty. Another breakout that is often important is by level of difficulty of the incoming workload (whether the workload consists of human customers, roads, bodies of water, or something else). Differences in the difficulty of a program's workload can have a major impact on the effectiveness, and costs, of individual programs and services. If level of difficulty is not explicitly considered, the performance data can be misleading.

Differences in the difficulty of the incoming workload can directly affect outcomes in most, if not all, services and programs. For example, in comparing the percentage of successful outcomes from one year to the next, the percentages may actually have improved for both difficult-to-help and easy-to-help patients, while the aggregate data indicate a worsening of the combined outcome. This can occur because a larger proportion of difficult-to-help clients came in for help during the later period, causing the aggregate data to look bad. The breakout by difficulty thus provides public administrators with a much different story than the aggregate outcome alone; a public manager would likely come to the wrong conclusion if he or she considered only the aggregate success rate. A similar problem occurs when comparisons are made across service units: some units will likely have larger proportions of difficult-to-help clients than others.
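The following minimal sketch, using hypothetical numbers, shows how this can happen: success rates improve for both groups from one year to the next, yet the aggregate rate falls because the client mix shifted toward the difficult-to-help group.

```python
# Hypothetical counts: (number of clients served, number with successful outcomes)
year1 = {"easy-to-help": (80, 48), "difficult-to-help": (20, 4)}
year2 = {"easy-to-help": (40, 26), "difficult-to-help": (60, 15)}

def success_rates(year):
    by_group = {group: 100 * s / n for group, (n, s) in year.items()}
    total_n = sum(n for n, _ in year.values())
    total_s = sum(s for _, s in year.values())
    return by_group, 100 * total_s / total_n

for label, year in (("Year 1", year1), ("Year 2", year2)):
    by_group, aggregate = success_rates(year)
    groups = ", ".join(f"{g} {rate:.0f}%" for g, rate in by_group.items())
    print(f"{label}: {groups}; aggregate {aggregate:.0f}%")

# Year 1: easy-to-help 60%, difficult-to-help 20%; aggregate 52%
# Year 2: easy-to-help 65%, difficult-to-help 25%; aggregate 41%
# Both groups improved, yet the aggregate rate fell because the later year's
# workload contained a much larger share of difficult-to-help clients.
```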

A classic example is the percentage of children available for adoption for whom adoptions occurred. Adoption agencies have generally found it considerably easier to place healthy white babies than older minority children or children with physical or mental problems. Adoption data can be broken out for each relevant demographic group in order to provide a more accurate picture of placement performance. In this example, customer demographic characteristics are used as proxies for difficulty.

Public agencies might categorize their incoming workloads by a small number of different levels of difficulty (perhaps three or four levels). At, or shortly after, intake, supervisors might examine the information on each incoming customer, assign a difficulty category to the customer, and enter it into a computer for subsequent calculations as to the outcomes for each difficulty category.

The use of risk-adjusted indicator values, such as mortality rates in hospitals, is a version of difficulty-related measurement. Here, rather than grouping customers into qualitatively determined difficulty categories, difficulty levels are calculated using a more sophisticated statistical approach.

These breakout categories apply both to services that directly serve humans and to those that do not. Road maintenance programs, for example, might break out outcomes by road composition, location, average daily traffic category, amount of truck traffic, urban versus rural roads, and regions with different weather conditions.

To obtain breakout information on outcomes, the program needs to enter the appropriate characteristics into the database for each workload unit (such as each customer) and link that information to the source that contains the outcome information on each of those customers. The computer then makes such calculations as the number and percentage of customers in each breakout category with each outcome level.
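The following is a minimal sketch of such a tabulation, using hypothetical column names and data and the pandas library; the breakout here is by neighborhood, but any recorded customer or service characteristic could be used:

```python
import pandas as pd

# Hypothetical linked records: one row per customer, with breakout
# characteristics and the outcome level achieved.
records = pd.DataFrame({
    "customer_id":   [1, 2, 3, 4, 5, 6],
    "neighborhood":  ["North", "North", "South", "South", "South", "North"],
    "difficulty":    ["low", "high", "low", "high", "high", "low"],
    "outcome_level": ["improved", "no change", "improved",
                      "improved", "no change", "improved"],
})

# Number and percentage of customers in each breakout category at each outcome level
counts = (records.groupby(["neighborhood", "outcome_level"])
                 .size()
                 .rename("customers")
                 .reset_index())
counts["percent"] = (100 * counts["customers"]
                     / counts.groupby("neighborhood")["customers"].transform("sum"))
print(counts)
```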

Deeper drilling down can also be done. A manager might want more targeted information, such as the outcomes for women of a specific ethnicity and age group living in a specific neighborhood, compared with those for men.

Formal performance reporting will likely be limited to a considerably smaller set of outcome information, such as for an agency's dashboard. However, in the near future, many managers are likely to be able to query their own electronic devices, through drop-down menus and hyperlinks, to pull up a wide variety of disaggregated outcome information.

Regularly Seeking Explanations for Unexpectedly Poor or Unexpectedly Good Outcomes

Many outside factors, in addition to a program's own efforts, can affect the outcomes sought by a program, factors sometimes called contextual factors.

Few public agencies have required substantive explanatory information as a regular part of performance reporting, especially for unexpectedly poor and unexpectedly good outcomes. This information can be useful to program managers in providing clues as to why the measured outcomes occurred and what might be done. Such an examination should be considered part of a modern performance measurement system, even if resources permit only a small effort.

While asking for explanations is common, it is usually done on an informal basis. Requiring explanations for unusual performance levels can prompt programs and their staffs to focus routinely on a key purpose of performance measurement: raising questions when unusual results occur.

Texas, for example, requires agencies to provide in their quarterly performance reports “explanations when actual performance of key measures varies five percent or more from targeted performance.”10 Few other governments at any level have such formal requirements.
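A minimal sketch of how such a variance rule can be applied automatically, using hypothetical indicator names and values, follows. Because whether a variance represents unexpectedly good or poor performance depends on the direction of the indicator (for response times, lower is better), the flag simply prompts a request for an explanation.

```python
# Hypothetical quarterly indicator values and targets.
indicators = [
    {"name": "Percent of clients employed at program exit", "target": 60.0, "actual": 52.0},
    {"name": "Average days to complete an inspection",      "target": 10.0, "actual": 10.3},
    {"name": "Percent of streets rated clean",              "target": 85.0, "actual": 90.5},
]

THRESHOLD = 5.0  # percent variance from target that triggers a request for explanation

for ind in indicators:
    variance = 100 * (ind["actual"] - ind["target"]) / ind["target"]
    if abs(variance) >= THRESHOLD:
        print(f"Explanation needed: {ind['name']} ({variance:+.1f}% from target)")
```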

Explanatory information might be obtained from examining:

  • Contextual factors such as special economic or weather data
  • Information from customer (or field employee) focus groups
  • The mix of customers to assess whether a larger proportion of more difficult-to-help customers had entered the service
  • Findings from open-ended questions contained in customer surveys that ask respondents to indicate why they gave poor ratings to any of the rating questions
  • More in-depth studies such as program evaluations (though these are likely to require additional resources and are likely to be appropriate only for programs with large budgets and particularly important consequences)

Only the last of the above sources is likely to involve significant cost or specialized help.

Providing Basic Analysis of the Information

The data in regular performance reports too often have been little used other than as an accountability tool (such as for the organization's website and public consumption). The pressure for evidence-based programs and policies, however, has been growing, in part because of pressure from OMB.11

Performance measurement systems should provide at least some basic analysis of the data, even if only by asking someone to summarize each report and identify highlights. Such summaries enable managers to focus on problem areas and potential needed actions.

In the future, most public agencies of almost any size are likely to be able to identify one or more persons to act as analysts. Increasingly, young professionals coming from schools of public administration and public policy have gained knowledge of basic analytical (and computer) tools. In addition to preparing the summaries and highlights, they can do a search for explanations for unexpected levels of good or poor outcomes.

Key questions for public administrators reviewing performance data are whether the current level of performance is good or bad, whether performance is improving or worsening and to what extent, and why unexpected outcomes have occurred. Administrators typically compare the current level of performance to benchmarks. Following are some comparisons that managers are likely to find useful when considering the latest values for each indicator:

  • Values for one or more past reporting periods. Examining multiple past reporting periods can identify trends that may help managers obtain a better perspective than only comparing the data for one previous period. When values are reported frequently during the year, this can identify seasonal effects.
  • Targets. The federal Government Performance and Results Act of 1993 requires agencies to set annual targets (“goals”) for each of their performance indicators for the coming year. Many states and local governments also set targets, if only as part of their annual budget process. For internal management purposes, to enable regular midyear reviews, targets should also be set for segments of the year, such as quarters or months.
  • Data disaggregated for different demographic and service groups. Such comparisons are typically done only as special studies or on an as-needed basis. However, disaggregations examined on a regular basis as part of the performance measurement system can be highly useful in helping pinpoint for whom and where poorer outcomes are occurring and for identifying equity issues.
  • Data disaggregated by work groups. Service outcomes and efficiency might be compared among different police, fire, sanitation, and road maintenance districts. Comparisons can also be made among facilities of the same type where the agency has more than one, such as mental health clinics, parks and recreation facilities, post offices, or correctional facilities.
  • Similar jurisdictions. This approach is useful if reasonably comparable performance indicator data are available for reasonably similar communities and customers. Examples of outcome data for which comparisons are generally available are crime rates and traffic accident rates. Reasonable comparisons are most often appropriate for programs for which the federal or state government has supported development of common definitions and common data collection procedures for the performance indicators.

Tracking Postservice Outcomes

This need applies to any service where a major desired outcome for the person is expected to be sustained after completion of the program's services. The needed outcome indicator would look something like: “Number, or percent, of customers who X months after completing service had the desired improved condition.” For example, social service and health programs often provide treatment but do not follow up afterward to see whether the customer is still maintaining the benefits. If the benefits are not present, say, twelve months later, the service could be a waste of resources and should be reviewed. If customer surveys are used to obtain this information, respondents could also be asked for their ratings as to the extent to which the service had contributed to the improvement and be asked for suggestions for improving the service.
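A minimal sketch of computing such an indicator, using hypothetical twelve-month follow-up data for former customers, follows:

```python
# Hypothetical twelve-month follow-up results for former customers.
followups = [
    {"customer_id": 101, "responded": True,  "condition_maintained": True},
    {"customer_id": 102, "responded": True,  "condition_maintained": False},
    {"customer_id": 103, "responded": False, "condition_maintained": None},
    {"customer_id": 104, "responded": True,  "condition_maintained": True},
]

respondents = [f for f in followups if f["responded"]]
maintained = sum(1 for f in respondents if f["condition_maintained"])

print(f"Follow-up response rate: {100 * len(respondents) / len(followups):.0f}%")
print(f"Percent of respondents maintaining improvement at 12 months: "
      f"{100 * maintained / len(respondents):.0f}%")
```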

In some program areas, such follow-up has been required by federal agencies, such as in some employment and substance abuse treatment programs.

The primary barrier to following up former customers is the expected data collection cost and effort. This legitimate concern needs to be balanced against the potential value of the information for avoiding spending future resources on unsuccessful services and the ability to help identify improvements to the service. Following are some ways to reduce after-exit survey costs:

  • Link the follow-up to an after-care checkup to find out not only how the former customer is doing but also whether the customer needs further help.
  • Obtain contact information from the customer before exit.
  • Use electronic surveys for customers with such access. Otherwise, use postal mail.
  • Obtain outcome data from other agencies rather than a survey. This is becoming more feasible because of the ability of computer technology to handle big data. A common example is seeking state unemployment insurance data to obtain postservice employment information for employment programs.

Reporting Performance Information Well

Modern data visualization technology has ramped up the ability to easily report performance data in visually attractive ways. The use of color, bar charts, and so on can attract readers to key data information. Dangers include the all-too-present temptation to overdo it, such as providing overly crowded visuals, cryptic labels, overly small font sizes, and coloring that makes key information hard to read.12

Uses for Service Quality Information

Major uses for reliable information on the quality of public services include these:

  • Holding public officials accountable for results, in addition to compliance with laws and adherence to budgets. This purpose has become increasingly important as finances have become more constrained.
  • Motivating public employees to focus on results, including meeting customer needs. With reliable data on outcomes, incentives will become more objective rather than primarily based on managers' subjective judgments or outputs.
  • Motivating contractors (and grantees) to focus on results for the public and helping agencies monitor their performance. Agencies can include outcome-oriented performance requirements in contracts rather than specifying contractors' procedures and outputs. Contracts can include financial incentives for work that meets or exceeds performance targets and penalties for work that is done poorly or not on schedule.
  • As a basis for regularly holding data-driven performance reviews (also known as PerformanceStat). These are regularly scheduled meetings, led by a manager, that use performance data as the starting point. The sessions (1) address issues and problems identified by the data, (2) explore what can be done to address them, (3) identify actions to be taken, and (4) at future meetings assess progress. The information provided by analytical staff to summarize and highlight the performance report might be used as a starting point for the review. The approach appears to have considerable promise for lower management levels, using less elaborate meeting arrangements.13
  • Helping develop and justify budgets. Data on service quality can be used to aid in developing budgets and helping justify funding requests to a legislative body. For example, information on the current condition of streets and bridges and counts of water main breaks has been used by public works agencies to develop and support capital replacement programs. Performance measurement provides past data, and budgets are about future years. To the extent that budget-year conditions are expected to stay about the same, past performance can be used reliably to estimate future costs and outcomes. When budget-year conditions are expected to change substantially, past data will be less useful.
  • Helping in strategic planning. Information from the performance measurement system will identify levels of needs for use as a starting point for strategic planning. If the strategic plan includes target values for key performance indicators, this will provide a basis for monitoring progress toward strategic plan objectives.
  • Encouraging improvements in public service. This is a fundamental reason for measuring the quality of government services.

Problems in Performance Measurement

Some important problems need to be considered:

  • Some important outcome data, especially data that need to be obtained through new data collection procedures (such as customer surveys and trained-observer rating procedures), are unfamiliar and appear costly to many public agencies.
  • Smaller governments and some large agencies may not have sufficiently automated their data processing work. This can make it difficult to obtain important information, such as disaggregating outcome data by customer and service characteristics. Manual data procedures also have the important drawback that they can lead to lower data quality than is likely with more automated procedures.
  • Really user-friendly software that enables managers and their personnel to easily access the latest performance indicator values still needs to be developed.
  • Performance measurement is far from perfect. Some service characteristics can be difficult to measure. As has often been said, what gets reported gets attention. The danger in this is that agencies may focus on what they can measure and neglect important considerations. It is important for agencies to be as comprehensive as possible in their performance measurements and also to report service elements that their measurements do not cover. Qualitative assessments of such elements might be provided as part of the performance measurement process.
  • Performance information has almost universally been presented in aggregated form. This sacrifices a great deal of its potential usefulness.
  • Because public administrators are often not familiar with or not trained in the nature and use of performance measurements, they do not protest or seek improvements when they are given poor or poorly presented data.
  • Probably most troublesome to public administrators is that while performance data provide outcome information, the data by themselves do not tell why the measured values occurred. For most service outcomes, public agencies have only partial control over performance. Inevitably external factors (such as the economy and the weather) affect performance. For some outcome indicators, especially intermediate outcomes, the logic of the relation between a program's activities and an intermediate outcome can be very strong and additional evaluation is not needed. For example, if extra personnel are assigned to respond to calls for service and the data thereafter show a substantial reduction in customer waiting time for service, this is pretty good evidence that adding personnel was a major reason (assuming no other significant relevant event occurred, such as automation of the service process).

Public managers worry, with good reason, that they (and their programs) will be blamed for less-than-expected service outcomes, even though factors beyond their control have played major roles in producing the shortfalls. Governments need to make a major effort to minimize the likelihood that regularly collected performance data are used to make snap judgments about who or what is responsible for the indicated outcomes. To alleviate the blame problem, governments should ask program managers to routinely provide explanatory information along with their performance data when unexpectedly low or high performance levels occur. This step can help but by no means avoids this problem.

Role of Ad Hoc Program Evaluations

Ad hoc studies, often called program evaluations, are in-depth studies that attempt to assess the effectiveness and impact of particular services or programs. These studies are designed to determine the extent to which the outcomes can be attributed to the program rather than other factors, which performance measurement systems are not designed to do. Evaluations can also be structured to identify ways to improve the service or program. Program evaluations range from complex randomized controlled trials to less rigorous variations.14

Program evaluations generally require specialized personnel or the assistance of outside contractors. Thus, agencies typically can support only a few evaluations in any given year. Any individual service can probably be evaluated only once every several years, if at all. Thus, such studies seldom provide information that managers can use to help them address operational issues throughout the year.

When evaluation information becomes available, however, that information should be examined carefully by public administrators to assess what it tells about the program and what changes might be made. Because evaluations provide more in-depth information about a program, the information obtained is likely to be of higher quality and be considerably more detailed than that obtained from performance measurement systems.

Summary

Attention by elected officials, the public, and the media to accountability and performance has grown significantly and seems likely to continue for many years. Public administrators are becoming more exposed to and more familiar with performance measurement. Greatly improved technology and the incorporation of a number of program evaluation concepts into performance measurement systems should considerably increase public administrators' interest in performance data and their use.

The availability of high-speed, economical data processing hardware and software permits ready calculation of performance data. This capability enables more comprehensive, more frequent, and timelier reporting of performance data.

The tremendous growth of social media and more open government, including pressure to post as much performance data as possible on websites, will put more pressure on governments to provide performance information. Whether this will lead to better-quality data or to agencies becoming considerably more conservative, selecting performance indicators over which they have more control, is uncertain.

The federal government remains a major motivator of state and local governments to track performance. Although it has been reducing its role in service delivery, it is likely to continue to apply pressure on state and local governments to demonstrate greater accountability when federal funds are involved.

Knowing how well one is doing is a basic need of management in all sectors of the economy, in the government no less than in the private sector. If one does not know what the score is, how can one tell whether changes or improvements are needed? If one does not know the score, how can one play the game?

Notes
