In this chapter
With Excel 2019, you can manage huge data collections, but storing more than 1 million rows of data doesn’t help you make business decisions unless you have the ability to focus on the most important data in a worksheet.
Excel includes many powerful and flexible tools with which you can limit the data displayed in your worksheet. When your worksheet displays the subset of data you need, you can perform calculations on that data. You can discover what percentage of monthly revenue was earned in the 10 best days in the month, find your total revenue for particular days of the week, or locate the slowest business day of the month.
Just as you can limit the data displayed by your work-sheets, you can also create validation rules that limit the data entered into them. When you set rules for data entered into cells, you can catch many of the most common data-entry errors, such as entering values that are too small or too large or attempting to enter a word in a cell that requires a number. If you add a validation rule after data has been entered, Excel can circle any invalid data so that you know what to correct.
This chapter guides you through procedures related to limiting the data that appears on your screen, manipulating list data, and creating validation rules that limit data entry to appropriate values.
Excel worksheets can hold as much data as you need them to, but you might not want to work with all the data in a worksheet at the same time. For example, you might want to look at the revenue figures for your company during the first third, second third, and final third of a month. You can limit the data shown on a worksheet by creating a filter, which is a rule that selects rows to be shown in a worksheet.
IMPORTANT
When you turn on filtering, Excel treats the cells in the active cell’s column as a range. To ensure that the filtering works properly, you should always have a label at the top of the column you want to filter. If you don’t, Excel treats the first value in the list as the label and doesn’t include it in the list of values by which you can filter the data.
When you turn on filtering, a filter arrow appears to the right of each column label in the list of data. Selecting the filter arrow displays a menu of filtering options and a list of the unique values in the column. Each item has a check box next to it, which you can use to create a selection filter. Some of the commands vary depending on the type of data in the column. For example, if the column contains a set of dates, you will get a list of commands specific to that data type.
Tip
In Excel tables, filter arrows are turned on by default.
Tip
When a column contains several types of data, the filter command Number Filters appears.
When you select a filtering option, Excel displays a dialog box in which you can define the filter’s criteria. As an example, you could create a filter that displays only dates after 3/31/2019.
If you want to display the highest or lowest values in a data column, you can create a Top 10 filter. You can choose whether to show values from the top or bottom of the list, define the number of items you want to display, and choose whether that number indicates the actual number of items or the percentage of items to be shown when the filter is applied.
Tip
Top 10 filters can be applied only to columns that contain number values.
Excel 2019 includes a capability called the search filter, which you can use to enter a search string that Excel uses to identify which items to display in an Excel table or a data list. Enter the character string you want to search for, and Excel limits your data to values that contain that string.
When you create a custom filter, you can define a rule that Excel uses to decide which rows to show after the filter is applied. For instance, you can create a rule that determines that only days with package volumes of less than 100,000 should be shown in your worksheet. You might then be able to determine whether the weather or another factor resulted in slower business on those days.
Excel indicates that a column has a filter applied by changing the appearance of the column’s filter arrow to include an icon that looks like a funnel. After you finish examining your data by using a filter, you can clear the filter or turn off filtering entirely and hide the filter arrows.
To turn on filter arrows
Select any cell in the list of data you want to filter.
On the Home tab of the ribbon, in the Editing group, select Sort & Filter, and then select Filter.
To create a selection filter
Select Sort & Filter, and then select Filter.
Select the filter arrow for the column by which you want to filter your data.
Clear the check boxes next to the items you want to hide.
Or
Clear the Select All check box and select the check boxes next to the items you want to display.
Select OK.
To create a filter rule
Display the filter arrows for your list of data.
Select the filter arrow for the field by which you want to filter your data.
Point to the Type Filters item to display the available filters for the column’s data type.
Select the filter you want to create.
Enter the arguments required to define the rule.
Select OK.
To create a Top 10 filter
Display the filter arrows for your list of data.
Select the filter arrow for a column that contains number values, point to Number Filters, and then select Top 10.
In the Top 10 AutoFilter dialog box, select the arrow for the first list box and select whether to display the top or bottom values.
Select the arrow for the last list box and select whether to base the rule on the number of items or the percentage of items.
Click in the middle box and enter the number or percentage of items to display.
Select OK.
To create a search filter
Display the filter arrows for your list of data.
Select the filter arrow for the field by which you want to filter your data.
Enter the character string that should appear in the values you want to display in the filter list.
Select OK.
To clear a filter
Select the filter arrow for the field that has the filter you want to clear.
Select Clear Filter from Field.
To turn off the filter arrows
Select any cell in the list of data.
Select Sort & Filter, and then select Filter.
Excel includes a wide range of tools you can use to summarize worksheet data. This section describes how to select rows randomly by using the RAND and RANDBETWEEN functions, how to summarize worksheet data by using the SUBTOTAL and AGGREGATE functions, and how to display a list of unique values within a data set.
In addition to filtering the data stored in your Excel worksheets, you can choose rows at random from a list. Selecting rows randomly is useful for choosing which customers will receive a special offer, deciding which days of the month to audit, or picking prize winners at an employee party.
To choose rows randomly, you can use the RAND function—which generates a random decimal value between 0 and 1—and compare the value it returns with a test value included in a formula. If you recalculate the RAND function 10 times and check each time to find out whether the value is below 0.3, it’s very unlikely that you would get exactly three instances where the value is below 0.3. Just as flipping a coin can result in the same result 10 times in a row by chance, so can the RAND function’s results appear to be off if you only recalculate it a few times. However, if you were to recalculate the function 10,000 times, it is extremely likely that the number of values less than 0.3 would be very close to 30 percent.
Tip
Because the RAND function is a volatile function (that is, it recalculates its results every time you update the worksheet), you should copy the cells that contain the RAND function in a formula and paste the formulas’ values back into their original cells. To do so, select the cells that contain the RAND formulas and paste them back into the same cells as values.
The RANDBETWEEN function generates a random whole number within a defined range. For example, the formula =RANDBETWEEN(1,100) would generate a random integer value from 1 to 100 (inclusive). The RANDBETWEEN function is very useful for creating sample data collections for presentations. Before the RANDBETWEEN function was introduced, you had to create formulas that added, subtracted, multiplied, and divided the results of the RAND function, which are always decimal values between 0 and 1, to create data.
To use RAND or RANDBETWEEN to select a row, create an IF formula that tests the random values. If you want to check 30 percent of the rows, a formula such as =IF(cell_address<0.3, “TRUE”, “FALSE”) would display TRUE in the formula cells for any value of 0.3 or less and FALSE otherwise.
The ability to analyze the data that’s most vital to your current needs is important, but there are some limitations to how you can summarize your filtered data by using functions such as SUM and AVERAGE. One limitation is that any formulas you create that include the SUM and AVERAGE functions don’t change their calculations if some of the rows used in the formula are hidden by the filter.
Excel provides two ways to summarize just the visible cells in a filtered data list. The first method is to use AutoCalculate. To use AutoCalculate, you select the cells you want to summarize. When you do, Excel displays the average of the values in the cells, the sum of the values in the cells, and the number of visible cells (the count) in the selection. You’ll find the display on the status bar at the lower edge of the Excel window.
When you use AutoCalculate, you aren’t limited to finding the sum, average, and count of the selected cells. You can add or remove calculations to suit your needs. A check mark appears next to a function’s name if that function’s result appears on the status bar.
AutoCalculate is great for finding a quick total or average for filtered cells, but it doesn’t make the result available in the worksheet. Formulas such as =SUM(C3:C26) always consider every cell in the range, regardless of whether you hide a cell’s row manually, so you need to create a formula by using either the SUBTOTAL function or the AGGREGATE function to summarize just those values that are visible in your worksheet. The SUBTOTAL function lets you choose whether to summarize every value in a range or summarize only those values in rows you haven’t manually hidden. The SUBTOTAL function has this syntax: =SUBTOTAL(function_num, ref1, ref2,...). The function_num argument holds the number of the operation you want to use to summarize your data. (The operation numbers are summarized in a table later in this section.) The ref1, ref2, and further arguments represent up to 29 ranges to include in the calculation.
As an example, suppose you have a worksheet in which you manually hid rows 20–26. In this case, the formula =SUBTOTAL(9, C3:C26, E3:E26, G3:G26) would find the sum of all values in the ranges C3:C26, E3:E26, and G3:G26, regardless of whether that range contained any hidden rows. The formula =SUBTOTAL(109, C3:C26, E3:E26, G3:G26) would find the sum of all values in cells C3:C19, E3:E19, and G3:G19, ignoring the values in the manually hidden rows.
IMPORTANT
Be sure to place your SUBTOTAL formula in a row that is even with or above the headers in the range you’re filtering. If you don’t, your filter might hide the formula’s result!
The following table lists the summary operations available for the SUBTOTAL formula. Excel displays the available summary operations as part of the Formula AutoComplete functionality, so you don’t need to remember the operation numbers or look them up in the Help system.
Operation number (includes hidden values) |
Operation number (ignores values in manually hidden rows) |
Function |
Description |
1 |
101 |
AVERAGE |
Returns the average of the values in the range |
2 |
102 |
COUNT |
Counts the cells in the range that contain a number |
3 |
103 |
COUNTA |
Counts the nonblank cells in the range |
4 |
104 |
MAX |
Returns the largest (maximum) value in the range |
5 |
105 |
MIN |
Returns the smallest (minimum) value in the range |
6 |
106 |
PRODUCT |
Returns the result of multiplying all numbers in the range |
7 |
107 |
STDEV.S |
Calculates the standard deviation of values in the range by examining a sample of the values |
8 |
108 |
STDEV.P |
Calculates the standard deviation of the values in the range by using all the values |
9 |
109 |
SUM |
Returns the result of adding all numbers in the range together |
10 |
110 |
VAR.S |
Calculates the variance of values in the range by examining a sample of the values |
11 |
111 |
VAR.P |
Calculates the variance of the values in the range by using all of the values |
As the preceding table shows, the SUBTOTAL function has two sets of operations. The first set (operations 1–11) represents operations that include hidden values in their summary, and the second set (operations 101–111) represents operations that summarize only values visible in the worksheet. Operations 1–11 summarize all cells in a range, regardless of whether the range contains any manually hidden rows. By contrast, operations 101–111 ignore any values in manually hidden rows. What the SUBTOTAL function doesn’t do, however, is change its result to reflect rows hidden by using a filter.
IMPORTANT
Excel treats the first cell in the data range as a header cell, so it doesn’t consider the cell as it builds the list of unique values. Be sure to include the header cell in your data range!
The AGGREGATE function extends the capabilities of the SUBTOTAL function. With it, you can select from a broader range of functions and use another argument to determine which, if any, values to ignore in the calculation. AGGREGATE has two possible syntaxes, depending on the summary operation you select. The first syntax is =AGGREGATE(function_num, options, ref1...), which is similar to the syntax of the SUBTOTAL function. The other possible syntax, =AGGREGATE(function_num, options, array, [k]), is used to create AGGREGATE functions that use the LARGE, SMALL, PERCENTILE.INC, QUARTILE.INC, PERCENTILE.EXC, and QUARTILE.EXC operations.
The following table describes the summary operations available for use in the AGGREGATE function.
Number |
Function |
Description |
1 |
AVERAGE |
Returns the average of the values in the range. |
2 |
COUNT |
Counts the cells in the range that contain a number. |
3 |
COUNTA |
Counts the nonblank cells in the range. |
4 |
MAX |
Returns the largest (maximum) value in the range. |
5 |
MIN |
Returns the smallest (minimum) value in the range. |
6 |
PRODUCT |
Returns the result of multiplying all numbers in the range. |
7 |
STDEV.S |
Calculates the standard deviation of values in the range by examining a sample of the values. |
8 |
STDEV.P |
Calculates the standard deviation of the values in the range by using all the values. |
9 |
SUM |
Returns the result of adding together all numbers in the range. |
10 |
VAR.S |
Calculates the variance of values in the range by examining a sample of the values. |
11 |
VAR.P |
Calculates the variance of the values in the range by using all of the values. |
12 |
MEDIAN |
Returns the value in the middle of a group of values. |
13 |
MODE.SNGL |
Returns the most frequently occurring number from a group of numbers. |
14 |
LARGE |
Returns the k-th largest value in a data set. k is specified by using the last function argument. If k is left blank, Excel returns the largest value. |
15 |
SMALL |
Returns the k-th smallest value in a data set. k is specified by using the last function argument. If k is left blank, Excel returns the smallest value. |
16 |
PERCENTILE.INC |
Returns the k-th percentile of values in a range, where k is a value from 0 to 1 (inclusive). |
17 |
QUARTILE.INC |
Returns the quartile value of a data set, based on a percentage from 0 to 1 (inclusive). |
18 |
PERCENTILE.EXC |
Returns the k-th percentile of values in a range, where k is a value from 0 to 1 (exclusive). |
19 |
QUARTILE.EXC |
Returns the quartile value of a data set, based on a percentage from 0 to 1 (exclusive). |
You use the second argument, options, to select which items the AGGREGATE function should ignore. These items can include hidden rows, errors, and SUBTOTAL and AGGREGATE functions. The following table summarizes the values available for the options argument and the effect they have on the function’s results.
Number |
Description |
0 |
Ignore nested SUBTOTAL and AGGREGATE functions. |
1 |
Ignore hidden rows and nested SUBTOTAL and AGGREGATE functions. |
2 |
Ignore error values and nested SUBTOTAL and AGGREGATE functions. |
3 |
Ignore hidden rows, error values, and nested SUBTOTAL and AGGREGATE functions. |
4 |
Ignore nothing. |
5 |
Ignore hidden rows. |
6 |
Ignore error values. |
7 |
Ignore hidden rows and error values. |
To summarize values by using AutoCalculate
Select the cells in your worksheet.
View the summaries on the status bar.
To change the AutoCalculate summaries displayed on the status bar
Right-click the status bar.
Click a summary operation without a check mark to display it.
Or
Click a summary operation with a check mark to hide it.
To create a SUBTOTAL formula
In a cell, enter a formula that uses the syntax =SUBTOTAL(function_num, ref1, ref2,...). The arguments in the syntax are as follows
The function_num argument is the reference number of the function you want to use.
The ref1, ref2, and subsequent ref arguments refer to cell ranges.
To create an AGGREGATE formula
Do either of the following:
Create a formula of the syntax =AGGREGATE(function_num, options, ref1...). The arguments in the syntax are as follows:
The function_num argument is the reference number of the function you want to use.
The options argument is the reference number for the options you want.
The ref1, ref2, and subsequent ref arguments refer to cell ranges.
Or
Create a formula with the syntax =AGGREGATE(function_num, options, array, [k]). The arguments in the syntax are as follows:
The function_num argument is the reference number of the function you want to use.
The options argument is the reference number for the options you want to use.
The array argument represents the cell range (array) that provides data for the formula.
The optional k argument, used with LARGE, SMALL, PERCENTILE.INC, QUARTILE.INC, PERCENTILE.EXC, or QUARTILE.EXC, indicates which value, percentile, or quartile to return.
Summarizing numerical values can provide valuable information to help you run your business. It can also be helpful to know how many different values appear within a column. For example, you might want to display all the countries and regions in which your company has customers. If you want to display a list of the unique values in a column, you can do so by creating an advanced filter.
All you need to do is identify the rows that contain the values you want to filter and indicate that you want to display unique records so that you get only the information you want.
To find unique values within a data set
Select any cell in the range for which you want to find unique values.
On the Data tab of the ribbon, in the Sort & Filter group, select Advanced.
Select Filter the list, in place.
Or
Select Copy to another location.
Verify that the address of your data range appears in the List range box.
If necessary, click in the Copy to box and select the cells where you want the filtered list to appear.
Select the Unique records only check box.
Select OK.
Part of creating efficient and easy-to-use worksheets is to do what you can to ensure that the data entered into your worksheets is as accurate as possible. Although it isn’t possible to catch every typographical or transcription error, you can set up a validation rule to make sure that the data entered into a cell meets certain standards. For example, you can specify the type of data you want, the range of acceptable values, and whether blank values are allowed. Setting accurate validation rules can, for example, help you and your colleagues avoid entering a customer’s name in the cell designated to hold the phone number or setting a credit limit above a certain level.
You can select the cells where you want to add a validation rule, even if those cells already contain data. Excel doesn’t tell you whether existing data in any of those cells violates your rule at the moment you create the rule, but you can find out by having Excel circle any worksheet cells that contain data that violates the cell’s validation rule. When you’re done, you can have Excel clear the validation circles or have Excel turn off data validation for those cells entirely.
To add a validation rule to a cell
On the Data tab, in the Data Tools group, select Data Validation.
In the Data Validation dialog box, on the Settings tab, select the Allow arrow, and then select the type of values to allow.
Use the controls to define the rule.
Select the Ignore blank check box to allow blank values.
Or
Clear the Ignore blank check box to require that a value be entered.
On the Input Message tab, enter an input message for the cell.
On the Error Alert tab, create an error alert message for values that violate the rule.
Select OK.
To edit a validation rule
Select one or more cells that contain the validation rule.
Select Data Validation.
On the Settings tab, select the Apply these changes to all other cells with the same settings check box to affect other cells with the same rule.
Or
Leave the Apply these changes to all other cells with the same settings check box cleared to affect only the selected cells.
Use the controls in the dialog box to edit the rule, input message, and error alert.
Select OK.
To circle invalid data in a worksheet
Select the Data Validation arrow.
Select Circle Invalid Data.
To remove validation circles
Select the Data Validation arrow.
Select Clear Validation Circles.
In this chapter, you learned how to:
Before you can complete these tasks, you need to copy the book’s practice files to your computer. The practice files for these tasks are located in the Office2019SBSCh07 folder. You can save the results of the tasks in the same folder.
The introduction includes a complete list of practice files and download instructions.
Open the LimitData workbook in Excel, and then complete the following tasks:
Create a filter that displays only those package exceptions that happened on RT189.
Clear the previous filter, and then create a filter that shows exceptions for the Northeast and Northwest centers.
With the previous filter still in place, create a filter that displays only those exceptions that occurred before April 1, 2019.
Clear the filter that shows values related to the Northeast and Northwest centers.
Turn off filtering for the list of data.
Open the SummarizeValues workbook in Excel, and then complete the following tasks:
Combine the IF and RAND functions into formulas in cells H3:H27 that display TRUE if the value is less than 0.3 and FALSE otherwise.
Use AutoCalculate to find the SUM, AVERAGE, and COUNT of cells G12:G16.
Remove the COUNT summary from the status bar and add the MINIMUM summary.
Create a SUBTOTAL formula that finds the average of the values in cells G3:G27.
Create an AGGREGATE formula that finds the maximum of values in cells G3:G27.
Create an advanced filter that finds the unique values in cells F3:F27.
Open the ValidateData workbook in Excel, and then complete the following tasks:
Create a data-validation rule in cells J4:J7 that requires values entered into those cells be no greater than $25,000.
Attempt to type the value 30000 in cell J7, observe the message that appears, and then cancel data entry.
Edit the rule you created so it includes an input message and an error alert.
Display validation circles to highlight data that violates the rule you created, and then hide the circles.