CHAPTER 14
Using Formulas for Conditional Analysis

Excel provides several worksheet functions for performing conditional analysis. You'll use some of those functions in this chapter. Conditional analysis means performing different actions depending on whether a condition is met.

Understanding Conditional Analysis

A condition is a value or expression that returns TRUE or FALSE. Based on the value of the condition, a formula can branch into two separate calculations. That is, when the condition returns TRUE, one value or expression is evaluated while the other is ignored. A FALSE condition reverses the flow of the formula, and the first value or expression is ignored while the other is evaluated.

In the following sections, you'll explore some of the logical functions available in Excel.

Checking if a simple condition is met

Figure 14.1 shows a list of states with monthly gas prices. Say you want to determine whether a state's price in a given month is above or below the average of all the states for the same month. For higher-than-average prices, you will report “High” and for lower than average, “Low.” A grid below the data will be used to report the results.

=IF(C3>AVERAGE(C$3:C$11),"High","Low")  
Snapshot of monthly gas prices by state

FIGURE 14.1 Monthly gas prices by state

The IF function is the most basic conditional analysis function in Excel. It has three arguments: the condition, what to do if the condition is true, and what to do if the condition is false.

The condition argument in this example is C3>AVERAGE(C$3:C$11). Condition arguments must be structured to return TRUE or FALSE, and this usually means that there is a comparison operation (like an equal sign or greater-than sign) or another worksheet function that returns TRUE or FALSE (like ISERR or ISBLANK). The example condition has a greater-than sign and compares the value in C3 to the average of all the values in C3:C11.

If our condition argument returns TRUE, the second argument of the IF function is returned to the cell. The second argument is High, and since the value in C3 is indeed larger than the average, cell C14 shows the word High.

Cell C15 compares the value in C4 to the average. Because it is lower, the condition argument returns FALSE, and the third argument is returned. Cell C15 shows Low, the third argument of the IF function.

Checking for multiple conditions

Simple conditions like the one shown in Figure 14.1 can be strung together. These are known as nesting functions. The value_if_true and value_if_false arguments can contain simple conditions of their own. This allows you to test more than one condition where subsequent conditions are dependent on the first one.

Figure 14.2 shows a spreadsheet with two user input fields for the type of automobile and a property of that automobile type. The properties are listed in two ranges below the user input fields. When the user selects the type and property, we want a formula to report whether the user has identified a coupe, a sedan, a pickup, or an SUV.

=IF(E2="Car",IF(E3="2-door","Coupe","Sedan"),IF(E3="Has Bed","Pickup","SUV")) 
Snapshot of a model for selecting an automobile

FIGURE 14.2 A model for selecting an automobile

With some conditional analysis, the result of the first condition causes the second condition to change. In this case, if the first condition is Car, the second condition is 2-door or 4-door. But if the first condition is Truck, the second condition changes to either Has Bed or No Bed.

You've seen that Excel provides the IF function to perform conditional analysis. You can also nest IF functions; that is, you can use another IF function as an argument to the first IF function when you need to check more than one condition. In this example, the first IF checks the value of E2. Rather than returning a value if TRUE, the second argument is another IF formula that checks the value of cell E3. Similarly, the third argument doesn't simply return a value of FALSE, but it contains a third IF function that also evaluates cell E3.

In Figure 14.2, the user has selected Truck. The first IF returns FALSE because E2 doesn't equal Car and the FALSE argument is evaluated. In that argument, E3 is seen to be equal to Has Bed and the TRUE condition (Pickup) is returned. If the user had selected No Bed, the FALSE condition (SUV) would have been the result.

Validating conditional data

The user input fields in Figure 14.2 are data validation lists. The user can make selections from a drop-down box rather than typing in the values. The data validation in cell E3 uses an interesting technique with an INDIRECT function to change its list depending on the value in E2.

There are two named ranges in the worksheet. The range named Car points to E6:E7, and the range named Truck points to E10:E11. The names are identical to choices in the E2 data validation list. Figure 14.3 shows the Data Validation dialog box for cell E3. The source is an INDIRECT function with E2 as the argument.

Snapshot of data validation using INDIRECT

FIGURE 14.3 Data validation using INDIRECT

The INDIRECT function takes a text argument that it resolves into a cell reference. In this case, since E2 is Truck, the formula becomes =INDIRECT("Truck"). Because Truck is a named range, INDIRECT returns a reference to E10:E11, and the values in those cells become the choices. If E2 contained Car, INDIRECT would return E6:E7, and those values become the choices.

One problem with this type of conditional data validation is that when the value in E2 is changed, the value in E3 does not change. The choices in E3 change, but the user still has to select from the available choices or your formulas may return inaccurate results.

Looking up values

When you have too many nested IF functions, your formulas can become long and hard to manage. Figure 14.4 shows a slightly different setup to the auto-selector model. Instead of hard-coding the results in nested IF functions, we enter the results into the cells next to their properties (for example, Sedan is entered in the cell next to 4-door).

The new formula is as follows:

=IF(E2="Car",VLOOKUP(E3,E6:F7,2,FALSE),VLOOKUP(E3,E10:F11,2,FALSE))

This formula can now be used to return the automobile. The IF condition is the same, but now a TRUE result looks up the proper value in E6:F7, and a FALSE result looks it up in E10:F11. You can learn more about VLOOKUP in Chapter 15.

Snapshot of a different auto-selector model

FIGURE 14.4 A different auto-selector model

Checking if Condition1 AND Condition2 are met

In addition to nesting conditional functions, we can evaluate together inside an AND function. This is useful when two or more conditions need to be evaluated at the same time to determine where the formula should branch.

Figure 14.5 shows a listing of inventory items, their quantities, and the discount that applies when they are sold. The inventory items are structured with three sections divided by hyphens. The first section is the department; the second section determines whether the item is a part, a subassembly, or a final assembly; and the third condition is a unique four-digit number. We want to assign a discount of 10 percent only to those items that are in Department 202 and are final assemblies. All other items have no discount.

=IF(AND(LEFT(B3,3)="202",MID(B3,5,3)="FIN"),10%,0%) 

The IF function returns 10 percent if TRUE and 0 percent if FALSE. For the condition argument (the first argument), you need an expression that returns TRUE if both the first section of the item number is 202 and the second section is FIN. Excel provides the AND function to accomplish this. The AND function takes up to 255 logical arguments separated by commas. Logical arguments are expressions that return either TRUE or FALSE. For this example, we're using only two logical arguments.

The first logical argument, LEFT(B3,3)="202", returns TRUE if the first three characters of B3 are equal to 202. The second logical argument, MID(B3,5,3)="FIN", returns TRUE if the three digits starting at the fifth position are equal to FIN. Text manipulation functions are discussed in Chapter 12.

With the AND function, all logical arguments must return TRUE for the entire function to return TRUE. If even one of the logical arguments returns FALSE, then the AND function returns FALSE. Table 14.1 shows the results of the AND function with two logical arguments.

Snapshot of an inventory listing

FIGURE 14.5 An inventory listing

TABLE 14.1 A Truth Table for the AND Function

First Logical ArgumentSecond Logical ArgumentResult of AND Function
TRUETRUETRUE
TRUEFALSEFALSE
FALSETRUEFALSE
FALSEFALSEFALSE

In cell D3, the first logical condition returns TRUE because the first three characters of the item number are 202. The second logical condition returns FALSE because the middle section of the item number is PRT, not FIN. According to Table 14.1, a TRUE condition and a FALSE condition return FALSE, and 0 percent is the result. Cell D5, on the other hand, returns TRUE because both logical conditions return TRUE.

Referring to logical conditions in cells

The AND function in Figure 14.5 includes two logical conditions that evaluate to TRUE or FALSE. The arguments to AND can also reference cells as long as those cells evaluate to TRUE or FALSE. When building a formula with the AND function, it can be useful to break out the logical conditions into their own cells. In Figure 14.6, the inventory listing is modified to show two extra columns. These columns can be inspected to understand why a particular item does or does not get the discount.

With these modifications, the result doesn't change, but the formula becomes

=IF(AND(D3,E3),10%,0%)

Checking if Condition1 OR Condition2 are met

In Figure 14.6, we applied a discount to certain products based on their item number. In this example, we want to expand the number of products eligible for the discount. As before, only final assembly products will get the discount, but the departments will be expanded to include both department 202 and department 203. Figure 14.7 shows the inventory list and the new discount schedule.

=IF(AND(OR(LEFT(B3,3)="202",LEFT(B3,3)="203"),MID(B3,5,3)="FIN"),10%,0%)   
Snapshot of a modified inventory listing

FIGURE 14.6 A modified inventory listing

Snapshot of a revised discount scheme

FIGURE 14.7 A revised discount scheme

We've expanded the conditional argument to the IF function to account for the changes in the discount scheme. The AND function is restrictive because all of the arguments must be TRUE for AND to return TRUE. Conversely, the OR function is inclusive. With OR, if any one of the arguments is TRUE, the entire function returns TRUE. In this example, we've nested an OR function inside the AND function—we've made it one of the arguments. Table 14.2 shows a truth table for how our nested functions work.

Cell D9 in Figure 14.7 shows a previously undiscounted product that receives a discount under the new scheme. The OR section, OR(LEFT(B9,3)="202",LEFT(B9,3)="203"), returns TRUE because one of its arguments returns TRUE.

TABLE 14.2 A Truth Table for an OR Function Nested in an AND Function

OR Logical 1OR Logical 2OR ResultAND Logical 2Final Result
TRUETRUETRUETRUETRUE
TRUEFALSETRUETRUETRUE
FALSETRUETRUETRUETRUE
FALSEFALSEFALSETRUEFALSE
TRUETRUETRUEFALSEFALSE
TRUEFALSETRUEFALSEFALSE
FALSETRUETRUEFALSEFALSE
FALSEFALSEFALSEFALSEFALSE

Performing Conditional Calculations

Simple conditional functions like IF generally work on only one value or cell at a time. Excel provides some different conditional functions for aggregating data, such as summing or averaging.

In the following sections, you'll dive into some of the techniques for applying calculations based on a given set of conditions.

Summing all values that meet a certain condition

Figure 14.8 shows a listing of accounts with positive and negative values. We want to sum all of the negative balances, which we will later compare to the sum of all the positive balances to ensure that they are equal. Excel provides the SUMIF function to sum values based on a condition.

=SUMIF(C3:C12,"<0") 

SUMIF takes each value in C3:C12 and compares it to the condition (the second argument in our function). If the value is less than zero, it meets the condition and is included in the sum. If it is zero or greater, the value is ignored. Text values and blank cells are also ignored. For the example in Figure 14.8, cell C3 is evaluated first. Because it is greater than zero, it is ignored. Next, cell C4 is evaluated. It meets our condition of being less than zero, so it is added to the total. This continues for each cell. When it's complete, cells C4, C7, C8, C9, and C11 are included in the sum, and the others are not.

The second argument of SUMIF, the condition to be met, has quotes around it. Because we're using a less-than sign for this example, we have to create a string that represents the expression.

Snapshot of summing values less than zero

FIGURE 14.8 Summing values less than zero

The SUMIF function has an optional third argument called the sum:range. So far, we've been applying the condition to the very numbers we are summing. By using the third argument, we can sum a range of numbers but apply our conditions to a different range. Figure 14.9 shows a listing of regions and their associated sales. To sum the sales for the East region, use the formula =SUMIF(B2:B11,"East",C2:C11).

Snapshot of list of regions and sales values

FIGURE 14.9 List of regions and sales values

Summing greater than zero

Figure 14.8 also shows the total of all the positive balances. The formula for that calculation is =SUMIF(C3:C12,">0"). Note that the only difference between this formula and our example formula is the expression string. Instead of "<0" as the second argument, this formula has ">0".

We don't have to include zero in our calculation because we're summing and zero never changes a sum. If, however, we were interested in summing numbers greater or less than 1,000, we couldn't simply use "<1000" and ">1000" as our second arguments because we would never include anything that was exactly 1,000.

When you use greater than or less than a nonzero number in a SUMIF, make either the greater than a greater than or equal to, such as ">=1000", or the less than a less than or equal to, such as "<=1000". Don't use the equal sign for both; just use one. This will ensure that you include any numbers that are exactly 1,000 in one or the other calculation but not both.

The syntax needed to use your comparison operators can be tricky. Table 14.3 lists a set of simple rules that can help you get it right every time.

TABLE 14.3 Simple Rules for Using Comparison Operators

To Set a Condition……Follow These RulesFor Example
Equal to a number or cell referenceDon't use an equal sign or any double quotes. =SUMIF(A1:A10,3)
Equal to a stringDon't use an equal sign, but put the string in quotes. =SUMIF(A1:A10,"book")
Nonequal comparison to a numberPut both the operator and the number in double quotes. =SUMIF(A1:A10,">=50")
Nonequal comparison to a stringPut both the operator and the string in double quotes. =SUMIF(A1:A10,"<>Payroll")
Nonequal comparison to a cell reference or formulaPut the operator in double quotes and concatenate the cell reference or formula with the ampersand (&). =SUMIF(A1:A10,"<"&C1)

You can use the TODAY function (to get the current date), or most other functions, in the second argument. Figure 14.10 shows a listing of dates and values. To sum a range of numbers that correspond to today, use the formula =SUMIF(B3:B11,TODAY(),C3:C11). To sum only those values that are today or before, concatenate the less-than-or-equal-to sign to the function, such as =SUMIF(B3:B11,"<="&TODAY(),C3:C11).

Snapshot of SUMIF using the TODAY function

FIGURE 14.10 SUMIF using the TODAY function

There are two wildcard characters that you can use in the condition argument to SUMIF. The question mark (?) represents any single character, and the asterisk (*) represents zero, one, or any number of characters. The formula =SUMIF(B2:B11,"?o*",C2:C11) will sum all the values in C2:C11 that correspond to the values in B2:B11 where the second character is a lowercase o. If we apply that formula to the data in Figure 14.9, we will get the sum for sales in both the North and South regions because both have a lowercase o as the second letter and East does not.

Summing all values that meet two or more conditions

The limitation of SUMIF shown in Figure 14.9 is that it works with only one condition. The SUMIFS function can be used when more than one condition is needed.

Figure 14.11 shows a partial listing of countries and their gross domestic product (GDP) from 2000 to 2009. We want to total Brazil's GDP from 2003 to 2006. Excel's SUMIFS worksheet function is used to sum values where two or more conditions must be met, such as Country and Year in this example.

=SUMIFS(D3:D212,B3:B212,G3,C3:C212,">="&G4,C3:C212,"<="&G5) 

SUMIFS arguments start with the range that contains the value that you want to sum. The remaining arguments are in pairs that follow the pattern criteria_range, criteria. Because of the way that the arguments are laid out, SUMIFS will always have an odd number of arguments. The first criteria pair is required—without at least one condition, SUMIFS would be no different than SUM. The remaining pairs of conditions, up to 126 of them, are optional.

=SUMIFS(D3:D212,B3:B212,G3,C3:C212,">="&G4,C3:C212,"<="&G5)
Snapshot of a partial listing of countries and their gross domestic product

FIGURE 14.11 A partial listing of countries and their gross domestic product

In this example, each cell in D3:D212 is added to the total only if the corresponding values in B3:B212 and C3:C212 meet their respective conditions. The condition for B3:B212 is that it matches whatever is in cell G3. There are two year conditions because we need to define the lower bound and upper bound of our year range. The lower bound is in cell G4, and the upper bound is in cell G5. Those two cells are concatenated with greater than or equal to and less than or equal to, respectively, to create the year conditions. Only if all three conditions are true is the value included in the total.

Summing if values fall between a given date range

One way that you can use SUMIF with two or more conditions is to add or subtract multiple SUMIF calculations. If the two conditions operate on the same range, this is an effective way to use multiple conditions. When you want to test different ranges, the formulas get tricky because you have to make sure you don't double count values.

Figure 14.12 shows a list of dates and amounts. We want to find the sum of the values that are between June 23 and June 29, inclusive. The starting and ending dates will be put in cells F4 and F5, respectively. Cell F7 contains the following formula:

=SUMIF(B3:B20,"<="&F5,C3:C20)-SUMIF(B3:B20,"<"&F4,C3:C20) 
Snapshot of summing values that are between two dates

FIGURE 14.12 Summing values that are between two dates

This technique subtracts one SUMIF from another to get the desired result. The first SUMIF, SUMIF(B3:B20,"<="&F5,C3:C20), returns the sum of the values less than or equal to the date in F5, June 29 in this example. The conditional argument is the less-than-or-equal-to operator concatenated to the cell reference F5. If that was the whole formula, the result would be 5,962.33. But we want only values that are also greater than or equal to June 23. That means we want to exclude values that are less than June 23. The second SUMIF achieves that. Sum everything less than or equal to the later date, and subtract everything less than the earlier date to get the sum of values between the two dates.

Using SUMIFS

You may even find SUMIFS to be more intuitive than the subtraction technique. The formula =SUMIFS(C3:C20,B3:B20,"<="&F5,B3:B20,">="&F4) sums the values in C3:C20 that correspond to the values in B3:B20 that meet the criteria pairs. The first criteria pair is identical to the first SUMIF criteria, "<="&F5. The second criteria pair limits the dates to greater than or equal to the start date.

Getting a count of values that meet a certain condition

Summing values isn't the only aggregation you can do in Excel. Like SUMIF and SUMIFS, Excel provides functions for conditionally counting values in a range.

In Figure 14.13, there is a partial listing of countries and their gross domestic product from 2000 to 2009. We want to know how many times the GDP was greater than or equal to 1 million. The criterion to be applied will be in cell G3.

=COUNTIF(D3:D212,G3)  
Snapshot of a partial listing of countries and their gross domestic product

FIGURE 14.13 A partial listing of countries and their gross domestic product

The COUNTIF function works in a similar manner as the SUMIF function from Figure 14.9. The obvious difference, as the name suggests, is that it counts entries that meet the criteria rather than sum them. Another difference is that there is no optional third argument as there is in SUMIF. With SUMIF, you can sum a range that's different from the range to which the criterion is applied. But with COUNTIF, that wouldn't make sense because counting a different range would get the same result.

The formula in this example uses a slightly different technique to construct the criteria argument. The string concatenation occurs all in cell G3 rather than in the function's second argument. We could have also entered the second argument as ">=1000000" or ">="&G3 rather than just pointing to G3.

Getting a count of values that meet two or more conditions

The SUMIF function has its COUNTIF cousin. Of course, Microsoft couldn't introduce SUMIFS for summing multiple conditions without also introducing COUNTIFS to count them.

In Figure 14.14, there is a list of Alpine Skiing medalists from the 1972 Winter Olympics. We would like to know how many silver medalists have an ö in their name. The letter we're looking for is typed in cell I3 and the type of medal in cell I4.

=COUNTIFS(C3:C20,"*"&I3&"*",F3:F20,I4)  
Snapshot of 1972 Alpine Skiing Olympic medalists

FIGURE 14.14 1972 Alpine Skiing Olympic medalists

The criteria_range and criteria arguments come in pairs just like in SUMIFS. Whereas SUMIFS will always have an odd number of arguments, COUNTIFS will always have an even number.

The first criteria_range argument is the list of athlete names in C3:C20. The matching criteria argument, "*"&I3&"*", surrounds whatever is in I3 with asterisks. Asterisks are wildcard characters in COUNTIFS that stand for zero, one, or more characters of any kind. By including an asterisk both before and after the character, we're asking Excel to count all of the names that include that character anywhere within the name. That is, we don't care if there are zero, one, or more characters before ö, and we don't care if there are zero, one, or more characters after ö as long as that character is in there somewhere.

The second criteria_range, criteria argument pair counts those entries in F3:F20 that are SILVER (the value typed into I4). Only those rows where both the first argument pair and the second argument pair match (only rows where the athlete's name contains ö and the medal won was silver) are counted. In this example, Gustav Thöni won the silver in the Men's Slalom, and Annemarie Moser-Pröll placed in both the Women's Downhill and the Women's Giant Slalom for a count of three.

Finding nonstandard characters

The ö was typed into cell I3 by holding down the Alt key and typing 0246 on the numeric keyboard. Don't try to type those numbers on the number keys across the top of your keyboard because it won't work. The number 0246 is the ASCII code that represents ö. You can get the same character by entering the following formula:

=CHAR(246)

The CHAR Function will give you the character associated with an ASCII code. Every character has an associated ASCII code. So how do you know which ASCII code to use? You can find a character's ASCII code by using the CODE function.

Another look at Figure 14.14 shows some characters in cells H8:I12. You can see a small table of characters and their codes. In cell I8 is the formula =CODE(H8). The CODE worksheet function returns the ASCII code for the letter that's passed in. In this example, we can see that a capital T is ASCII code 84, a lowercase i is ASCII code 105, and ö is ASCII code 246. Armed with that knowledge, we can use the CHAR function to get any character you need.

Getting the average of all numbers that meet a certain condition

After summing and counting, taking an average of a range of numbers is the next most common aggregator. The average, also known as the arithmetic mean, is the sum of the numbers divided by the count of the numbers.

Figure 14.15 once again shows the medalists' results from the 1972 Winter Olympics. We want to determine the average result but only for those skiers from Switzerland. The country code is entered in cell I3 so that it can be easily changed to a different country.

=AVERAGEIF(D3:D20,I3,E3:E20) 
Snapshot of averaging results based on a country

FIGURE 14.15 Averaging results based on a country

Excel provides the AVERAGEIF function to accomplish just what we want. Like its cousin, the SUMIF function, AVERAGEIF has a criteria_range and a criteria argument. The final argument is the range to average. In this example, each cell in E3:E20 is either included in or excluded from the average depending on whether the corresponding cell in D3:D20 meets the criteria.

If no rows meet the criteria in AVERAGEIF, the function returns the #DIV/0! error.

Getting the average of all numbers that meet two or more conditions

Microsoft introduced AVERAGEIFS along with SUMIFS and COUNTIFS to allow you to average a range of numbers based on more than one condition.

Continuing our analysis of skiing times, Figure 14.16 shows some results of the 1972 Winter Olympics. In this case, we want to determine the average time based on more than one condition. The country, gender, and medal are entered into cells I3:I5. We want to average only those results that meet all three criteria.

=AVERAGEIFS(E3:E20,D3:D20,I3,B3:B20,"*"&I4,F3:F20,I5) 
Snapshot of averaging on three conditions

FIGURE 14.16 Averaging on three conditions

The AVERAGEIFS function is structured similarly to the SUMIFS function. The first argument is the range to average, and it's followed by up to 127 pairs of criteria_range/criteria arguments. The three criteria pairs are as follows:

  • D3:D20,I3 includes only those rows where the country code is SUI.
  • B3:B20,"*"&I4 includes only those rows where the event name ends with the word Women.
  • F3:F20,I5 includes only those rows where the medal is GOLD.

When all three conditions are met, the time in the Result column is averaged.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset