Chapter 5
Predictive Statistics Examples

A journey of a thousand miles begins with the right map.

Measurement is a journey. As seen in the example in Chapter 4, the journey takes time, and the value of the information improves with feedback and iterations. The needs of an organization vary by industry, by size, by purpose, and by many other factors. Yet many measures cut across these factors and effectively describe the valuable contributions human resources (HR) makes every day. Not every journey begins with a destination in mind, but expedient, purpose-driven journeys do.

BEGIN WITH THE END IN MIND

One recommended destination comes from the Center for Talent Reporting (www.centerfortalentreporting.org), and it is based on the Talent Development Reporting Principles (TDRP). TDRP is very similar in purpose to the two models that guide the accounting industry: generally accepted accounting principles and generally accepted auditing standards. TDRP is designed for reporting human capital information to executives. The framework builds on Boudreau and Ramstad’s model of performance optimization.1 An organization that can monitor its efficiency, effectiveness, and outcomes can begin to adjust the inputs and optimize performance.

Reflecting on the example of the various reports shared in Chapter 4, a comprehensive report for the vice president (VP) of HR should include measures of efficiency, effectiveness, and outcomes. The purpose of the TDRP is to develop a tool to communicate the impact of HR interventions to executive leadership in a way that they can grasp quickly. Leaders need information for decision making, but they need it in a way that they can understand quickly. Moreover, the information has to be aligned to the business needs.

The Center for Talent Reporting advocates creating statements, similar to financial statements, for executives. The statements use a very simple structure: one statement for each of the three types of measures (efficiency, effectiveness, and outcomes). Within each statement, say, efficiency, the critical measures are listed in the left-hand column; these might include open requisitions, positions filled per month, and so on. Columns to the right then show last year's actual values for the metric, this year's goal or plan, year-to-date actual values, and the percentage of the goal achieved. It is not a complex set of information, but it is comprehensive and easy to understand, and therefore useful for executives. Exhibits 5.1 through 5.3 show TDRP statements for efficiency, effectiveness, and outcomes.

Exhibit 5.1 TDRP Efficiency Statement

Measures | Data Type | Last Year Actual | This Year Plan | This Year June YTD | This Year % of Plan
Open requisitions | N count | 480 | 500 | 250 | 50%
Positions filled/month | N count & percentage | 40 or 8.3% | 42 or 8.3% | 250 or 50% | 50%
Time to fill open positions | Number of days | 27 | 27 | 27 | 100%
Salary associated with positions | Average salary (SD) | 64K (6.4K) | 65K (6.5K) | 64K (6.4K) | 98.5%+
Cost to hire new resource | Average cost (SD) | 122K (12K) | 124K (13K) | 122K (12K) | 98.5%+
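The "% of Plan" column in these statements is simple arithmetic: the year-to-date actual divided by the annual plan. A minimal sketch of how such a statement might be assembled, using illustrative values taken from Exhibit 5.1 (the function and variable names are my own, not part of the TDRP framework):

```python
# Each row of the efficiency statement: measure name, last year's
# actual, this year's plan, and this year's June YTD actual.
rows = [
    ("Open requisitions",                 480, 500, 250),
    ("Time to fill open positions (days)", 27,  27,  27),
]

def pct_of_plan(plan, ytd):
    """Year-to-date actual expressed as a percentage of the annual plan."""
    return round(100 * ytd / plan, 1)

for measure, last_year, plan, ytd in rows:
    print(f"{measure:38s} {last_year:>8} {plan:>8} {ytd:>8} "
          f"{pct_of_plan(plan, ytd):>6}%")
```

Running this prints one statement line per measure, mirroring the layout of the exhibit.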

Exhibit 5.2 TDRP Effectiveness Statement

Measures | Data Type | Last Year Actual | This Year Plan | This Year June YTD | This Year % of Plan
Performance ratings at 90 days | Average Likert rating: 1–9 | 5 | 5 | 5 or 100% | 100%
Performance ratings at 365 days | Average Likert rating: 1–9 | 5 | 5 | 4 or 80% | 80%
ID high potentials | Yes/No indicator | 10% Yes | 10% Yes | 12% Yes | 120%+
Assessment results | Avg. assessment score | 90% | 90% | 80% | 88%
Speed to competency | Time in months | 1.5 months | 1.25 months | 1.25 months | 100%
Sponsor satisfaction | Aggregate average for all survey responses | 4.5 | 4.5 | 4.5 | 100%
Exit survey results | Aggregate average for all survey responses | 4 | 4 | 3.75 | 93.75%

Exhibit 5.3 TDRP Outcomes Statement

Measures | Data Type | Last Year Actual | This Year Plan | This Year June YTD | This Year % of Plan
Engagement survey results | Aggregate average for all survey responses | 4.5 | 4.5 | 4.25 | 94%
Productivity | Average number of chargeable hours/week | 30 | 32 | 30 | 93.75%
Turnover at 90 days | Percentage | 3% | 3% | 3% | 100%
Turnover at 365 days | Percentage | 5% | 4% | 7% | 175%-
Profitability | Time in months | 1.5 months | 1.25 months | 1.25 months | 100%

When sharing the consolidated information with the VP of HR and other executives, there are two recommended best practices. First, share the results in a financial statement format, as prescribed by the Center for Talent Reporting. This approach presents results in a format executives already know: key metrics in the left-hand column, with last year's actual performance, the goal for the current year, and year-to-date progress to the right. In these exhibits, some values carry + or - superscripts to show how performance compares to goals. The Center for Talent Reporting also provides benchmarks for selected metrics, so comparisons can be made to benchmarks as well as goals.

A second valuable approach is sharing results through dashboards. The dashboard platform should accommodate various displays of data, such as bar charts, histograms, pie charts, trend lines, and dials, and it should allow users to dig deeper into the results. When building dashboards, cut the data by key organizational demographics, such as business unit or region. Both art and science contribute to the final product: an effective dashboard provides critical information in a visually appealing way without overwhelming the screen or page with too much data. The final product is often developed through trial and error, with results displayed and end users providing feedback.
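Cutting results by a demographic before display is a plain group-and-aggregate step. A sketch in Python, with hypothetical field names and sample records (any real dashboard tool would do this internally):

```python
from collections import defaultdict

# Hypothetical engagement-survey records: one dict per respondent.
responses = [
    {"business_unit": "Sales",   "engagement": 4.5},
    {"business_unit": "Sales",   "engagement": 4.0},
    {"business_unit": "Finance", "engagement": 3.5},
]

def cut_by(records, demographic, metric):
    """Average a metric within each level of a demographic."""
    groups = defaultdict(list)
    for r in records:
        groups[r[demographic]].append(r[metric])
    return {level: sum(vals) / len(vals) for level, vals in groups.items()}

by_unit = cut_by(responses, "business_unit", "engagement")
# One bar per business unit on a dashboard: {"Sales": 4.25, "Finance": 3.5}
```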

In the end, whether you use statements or dashboards, the goal is not the display. The goal is to use the information to begin a conversation, discuss options, and make decisions.

GO BACK TO THE BEGINNING

Once you have a vision of what you need to produce for executive leaders, the onerous task of executing the vision begins. Often the hardest part of building executive reports is the data collection process. Thankfully, HR systems are advancing quickly, and information technology (IT) departments can provide data extracts on request. The next step in the journey is requesting the data from the right people in the right way.

WHO OWNS DATA, AND WILL THEY SHARE IT?

In order to populate a report for executives, it is essential to determine where the critical data resides. In most organizations, IT controls the data because IT implements and maintains tools, such as learning management systems, talent management systems, HR information systems, and so on.

Consolidation across the technology industry has improved reporting: through acquisition, disparate systems have been integrated under single service providers, so various types of data are now resident in one system. For example, in 2004 PricewaterhouseCoopers started the ball rolling with the acquisition of Saratoga Institute, the first HR quantitative benchmarking company. In 2010 SuccessFactors, a talent management company, purchased InfoHRM, a human capital benchmarking and dashboarding company. Next, SuccessFactors bought Plateau, a learning management system. In late 2011, just as customers had come to see SuccessFactors as one of the biggest talent players in the industry, the company was itself bought by the enterprise resource planning giant SAP. A single integrated system of this kind can report hiring information, training histories, compliance, performance appraisal scores, high-potential status, promotion history, compensation and benefits information, and so forth. Yet for many organizations, the valuable HR data needed for executive reports is still housed in many different systems. Exhibit 5.4 shows a variety of data sources and aligns the sources with the data types from the TDRP framework.

Exhibit 5.4 Data Sources for Executive Reports

Efficiency
  • Human Resources Information System (HRIS): number of open requisitions; time to fill open positions; salary associated with positions
  • Finance system: cost to hire the new resource; cost to train new hires (onboarding)

Effectiveness
  • Evaluation system: satisfaction with learning; assessment results
  • Performance appraisal system: performance ratings; identification of high potentials
  • HRIS: turnover within 90 or 365 days; lost productivity (salary × time unfilled)

Outcomes
  • Performance appraisal system: speed to productivity; productivity measures (chargeable hours or widgets produced/hour, etc.)
  • Quality system: error rate/1M units
  • Customer service/relationship management (CRM) system: customer loyalty; sales
  • CRM/finance system: revenue/trainee

The measurement maturity of organizations varies substantially. Some organizations collect and use their data daily. Others struggle to identify key performance indicators and gather data. For organizations at the low end of the maturity curve, the journey is longer. The organization must find a way to collect the information before it can be reported.

Identifying the source of the data is a critical step in the analytics process, but more work is still required. Often the owners of the data are unable or unwilling to share it, and there is usually a good reason why. The system may not be able to export data to a file. More often, certain data is classified as “sensitive” due to government regulations or organizational policy, and private data is protected. This data often includes personal information, such as gender, age (date of birth), ethnicity, and medical history. Performance appraisals, test scores, and responses to engagement surveys are also classified as sensitive by some organizations.

While some data is locked down and unavailable to most analysts, there are ways to gain access. One way is to request the data without unique identifying information (e.g., a personnel ID number or an e-mail address). This can cause problems when joining data sets for a linkage analysis. Without an ID or an e-mail address, it is impossible to link engagement survey responses with turnover information. A workaround is to enlist the help of the HR analyst or IT specialist who protects the data. Allow this person to join the two files for you and then delete the unique ID. In this way, the data is linked and identities are protected.
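The workaround described above, letting the person who protects the data perform the join and then delete the unique ID, can be sketched as follows. The field names here are hypothetical:

```python
# Performed by the HR analyst or IT specialist who is allowed to see IDs.
survey = [
    {"emp_id": 101, "engagement": 4.5},
    {"emp_id": 102, "engagement": 2.0},
]
turnover = [
    {"emp_id": 101, "left_within_365": False},
    {"emp_id": 102, "left_within_365": True},
]

def link_and_deidentify(left, right, key="emp_id"):
    """Join two extracts on the unique ID, then drop the ID so the
    linked file can be shared without exposing identities."""
    right_by_key = {r[key]: r for r in right}
    linked = []
    for row in left:
        if row[key] in right_by_key:
            merged = {**row, **right_by_key[row[key]]}
            merged.pop(key)  # remove the identifier before sharing
            linked.append(merged)
    return linked

linked = link_and_deidentify(survey, turnover)
# Each record now pairs engagement with turnover, with no emp_id.
```

The analyst receiving `linked` can run the linkage analysis without ever seeing who responded.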

Some data sets are unavailable simply because owners are unwilling to share. While this does not happen often, it does occur and can cause substantial frustration when you are trying to serve the greater good of the organization. Direct requests for data are often sufficient to get what is needed. In some organizations, an e-mail request is enough; in others, a sponsor is needed and the request has to go through official channels with standard protocols. Be prepared: The effort involved can be substantial, but it is usually not overly burdensome or time consuming.

Sometimes a gatekeeper digs in and refuses to share, and the direct approach does not work. Do not get caught up in determining why; it can be an exercise in futility. Think of it this way: We can do this the easy way or the hard way. It is the data keeper’s choice. Look for other organizational levers that can provide access. Two avenues often work: gain permission/approval from the gatekeeper’s supervisor, or get an executive sponsor to make the request.

When data comes from multiple sources, the request process can be lengthy. Be prepared for a simple request to take several days or even weeks. Sometimes technology teams that are extracting the data ask if the request is a one-time event or if you will be seeking data regularly. Either answer is fine with IT; the department simply wants to know if it should save the code and the process used to extract the data so it can be more efficient if you make future requests.

WHAT WILL YOU DO WITH THE DATA?

There are generally four reasons for gathering data for business purposes: to describe, explain, predict, and optimize performance. Keep these reasons in mind as you report and analyze the data.

  1. Describe. Using simple statistical terms, such as frequency counts, means, and standard deviations, performance is quantified and described to provide insight about an organization’s current state. Performance appraisal results describe the annual performance of individuals with simple numbers. Using a nine-box model, an employee is rated 1 to 9. That single digit summarizes individual performance. It can also be aggregated to describe the performance of a sample or a population.
  2. Explain. After describing performance, it is often useful to explain it. This is usually achieved by digging deeper into the data, giving it context, and examining differences or relationships. For example, if we classified all of the professionals into three groups—novice, experienced, and advanced—we might see an underlying relationship that explains the ratings. The small group of top performers in the nine-box example may be the most experienced professionals. Novice professionals receive the lowest ratings, and experienced (but not advanced) professionals fall in the middle. In this way, experience helps explain the pattern in the data.
  3. Predict. Inferential statistics, such as correlation, regression, analysis of variance (ANOVA) and other techniques, can be used to predict future performance. ANOVA can uncover meaningful differences between groups (e.g., performance among experienced and inexperienced employees). Correlation and regression analysis can uncover relationships among variables—that is, as experience increases, so does performance. It seems reasonable to expect that experience predicts performance, but what if the organization cannot wait for employees to gain experience? Could development programs like coaching or training also improve performance? Of course they can. Moreover, there is likely a dose–response curve for development. As more development is provided, employees improve their performance faster. Given enough cases, it becomes possible to predict performance based on the amount of coaching and training received. Such a model is invaluable because the business can estimate how much to invest in development in order to improve performance.
  4. Optimize. Once a prediction model is developed, the business can implement programs to improve performance—for example, by providing the optimal amount of coaching and learning. By monitoring the inputs and actual performance, a feedback loop is created so the organization can optimize its investment in performance improvement. In line with the Boudreau and Ramstad approach, the entire data set (e.g., efficiency, effectiveness, and outcome measures) should be used to optimize organizational performance.
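The dose–response idea in the "predict" step can be illustrated with an ordinary least-squares line relating development hours to performance ratings. This is a toy sketch with made-up data, not a validated model; a real analysis would use far more cases and a proper statistical package:

```python
# Hypothetical data: hours of coaching/training received and the
# subsequent performance rating on a 1-9 scale.
hours   = [5, 10, 15, 20, 25, 30]
ratings = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5]

def least_squares(x, y):
    """Slope and intercept of the ordinary least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return slope, my - slope * mx

slope, intercept = least_squares(hours, ratings)

def predict(h):
    """Predicted rating for a given number of development hours."""
    return intercept + slope * h

# With these made-up data the fit is exact: each additional 10 hours
# of development adds one rating point.
```

Given such a fitted line (and confidence in it), the business can estimate how much development to invest in to reach a target rating, which is exactly the bridge from prediction to optimization.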

What does the optimization process look like? It can have many variations, but consider for a moment a situation where a training budget has been cut substantially, yet the goals for development remain the same, such as educate X number of people each year and ensure they are proficient within a month. In this case, efficiency will be impacted. The chief learning officer and learning and development (L&D) managers must change the curriculum in line with the available budget. This could mean increasing the ratio of e-learning to instructor-led classes. Or it could mean eliminating certain high-cost courses from the curriculum. Both approaches have consequences. The shift to more e-learning can save money by meeting training volume while eliminating the costs associated with instructor-led training. The consequence might be less employee engagement or less cross-functional networking as a result of less face-to-face interaction during training. Or if key courses are cut from the curriculum, the business may not be able to meet quality standards or produce a product, because new knowledge and skills are not learned.
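The budget scenario above reduces to simple arithmetic: given per-learner costs for instructor-led and e-learning delivery, how far must the mix shift to train the same head count under a smaller budget? The cost figures below are hypothetical:

```python
# Hypothetical per-learner delivery costs.
ILT_COST = 1200     # instructor-led training, per learner
ELEARN_COST = 300   # e-learning, per learner

def max_ilt_share(headcount, budget):
    """Largest fraction of learners who can take instructor-led
    classes, with the rest taking e-learning, within the budget.
    Solves: budget = h*s*ILT_COST + h*(1-s)*ELEARN_COST for s."""
    s = (budget / headcount - ELEARN_COST) / (ILT_COST - ELEARN_COST)
    return max(0.0, min(1.0, s))

# Training 1,000 people on a $600K budget: only a third of them
# can be taught in instructor-led classes.
share = max_ilt_share(1000, 600_000)
```

The consequences the text describes (less engagement, less networking) are not in the arithmetic, which is precisely why the decision needs both the numbers and the conversation.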

The executive suite thrives on making data-based decisions. Making decisions without data is a hit-or-miss proposition. By using a framework like TDRP, HR can provide useful information to executives so they can make data-driven decisions.

WHAT FORM IS THE DATA IN?

When requesting data, be careful what you ask for. There are many forms of data today: HTML, XML, HRXML, text, comma delimited, SQL, SPSS, MS Excel, and MS Access, just to name a few. The variety is overwhelming, but that variety also increases the chances that the data you seek will be in a format that you need. The prevalence of Microsoft products helps matters as well. Most HR professionals use some form of Windows and have access to MS Excel and MS Access. These tools accept a wide variety of file types, such as those just listed. The ability to accept multiple file types is essential, because the business systems that contain the desired data often run on proprietary code, SQL, or other unique languages. Fortunately, those systems often have the capability to export data into standard file types. When working with the IT group to extract data, be sure to specify the file type that you desire. Ask for files to be exported to common formats, such as .txt or .csv, that most programs can open.
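Once an extract arrives as a .txt or .csv file, most environments can read it with standard tooling. A minimal Python sketch (the column names are hypothetical; in practice you would open the exported file rather than an in-memory string):

```python
import csv
import io

# Simulated .csv extract from an HR system.
extract = io.StringIO(
    "emp_id,hire_date,salary\n"
    "101,2023-04-01,64000\n"
    "102,2023-06-15,65000\n"
)

# DictReader yields one dict per row, keyed by the header line.
records = list(csv.DictReader(extract))

# All values arrive as strings; convert before doing arithmetic.
salaries = [int(r["salary"]) for r in records]
```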

Another issue to consider is the data structure. Data systems are set up to store data as efficiently as possible, using relational tables that are extremely long but not wide. These “vertical files” have a few columns of data but millions of rows. While they are efficient and improve processing speed on servers, they can cause problems for analysts: a file with more than about a million rows exceeds the limits of MS Excel (a worksheet holds at most 1,048,576 rows). MS Access can handle larger files, but it is not as user friendly as MS Excel and requires more advanced analytic skills.

The alternative is to request a cross-tab export. This format displays one person per row, and every column contains a unique piece of information about that person, such as a demographic or a response to a survey question. Compared to a vertical file, this structure is wide, with many columns. It is the format most often used when analyzing data in Excel or a statistical package like SPSS or SAS.
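Converting a vertical extract into the cross-tab (one person per row) layout is a pivot. A sketch with hypothetical fields:

```python
from collections import defaultdict

# Vertical file: one (person, field, value) triple per row, the way
# a relational system typically stores survey responses.
vertical = [
    (101, "gender", "F"), (101, "q1", 4), (101, "q2", 5),
    (102, "gender", "M"), (102, "q1", 3), (102, "q2", 4),
]

def to_cross_tab(rows):
    """Pivot (person, field, value) rows into one record per person."""
    people = defaultdict(dict)
    for person, field, value in rows:
        people[person][field] = value
    return dict(people)

wide = to_cross_tab(vertical)
# wide[101] == {"gender": "F", "q1": 4, "q2": 5}
```

Statistical packages and spreadsheet tools have built-in pivot features that do the same thing at scale; the sketch just makes the transformation explicit.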

IS THE DATA QUALITY SUFFICIENT?

Having quality data is essential to any analysis. Before analyzing a data set, take the time to examine the data to make sure it contains the measures that it should and that those measures are gathered and stored consistently. Here are a few things that you should inspect when reviewing a data set.

  • Missing data. Not every data set is complete. Even a simple demographic form for new hires is prone to missing data. People frequently forget to enter information, such as zip codes, or intentionally skip a question. This is normal. When working with hundreds or thousands of cases, a few missing values will not impact the overall analysis. However, when large amounts of data are missing, say 50% or more, you should consider that variable suspect in the analysis. If it is a key metric, investigating why so much data is missing is worthwhile. Better yet, correct the data collection problem when you find it.
  • Errors in the data. Errors come from many sources.
    • Keypunching or data entry errors are the most common. Occasionally all of us mistype even the simplest information, such as our names. Such errors are expected and acceptable as long as they are infrequent; if possible, clean them up before doing the analysis. Systematic errors, however, are not acceptable. These happen when someone consistently types the right data into the wrong field, such as the answer to question 5 into the field for question 6. Repeated across many cases, this can substantially degrade the accuracy of the data.
    • Database errors happen often and are difficult to find. For example, a data extract for instructor-led course evaluations might incorrectly contain evaluation data from Web-based courses. This error is due to an incorrect query request.
    • Misaligned data is easier to detect. Consider data from instructor-led courses versus Web-based courses. Occasionally such data aligns in rows and columns perfectly, which hides the error. More often, the data will not align across the data set because some of the questions differ between the evaluation forms. If the Web-based course data contains an extra question or one fewer demographic (e.g., training location), the columns of data will not line up, and that misalignment is a good indicator of errors in the data set.

The best way to ensure the quality of the data before analysis begins is simply to inspect it. Use your basic understanding of the data and explore. If 100 people were hired, why are there 300 people in the data set? If the rating scale is 1 to 5, why is there a 6 in the data set? Put your professional skepticism to work and see what you find.
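The inspections just described are easy to automate as checks on the extract. A sketch, with hypothetical field names and a 50% missing-data threshold taken from the discussion above:

```python
def quality_report(records, field, valid_range=None, missing_threshold=0.5):
    """Report a field's missing-data rate and any values outside the
    expected range (e.g., a 6 on a 1-to-5 rating scale)."""
    values = [r.get(field) for r in records]
    missing = sum(v is None for v in values) / len(values)
    out_of_range = []
    if valid_range is not None:
        lo, hi = valid_range
        out_of_range = [v for v in values
                        if v is not None and not lo <= v <= hi]
    return {
        "missing_rate": missing,
        "suspect": missing >= missing_threshold,  # 50%+ missing
        "out_of_range": out_of_range,
    }

# A 6 on a 1-to-5 scale and one missing value:
ratings = [{"rating": 4}, {"rating": 6}, {"rating": None}, {"rating": 3}]
report = quality_report(ratings, "rating", valid_range=(1, 5))
```

Running checks like this before analysis catches exactly the problems professional skepticism would: impossible values and variables too sparse to trust.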

In Chapter 6, we roll up our sleeves and examine the data collection and analysis process.

NOTE
