Chapter 5

Introduction to Probability

Abstract

This chapter explains the differences between probabilistic statistics and descriptive statistics and shows in which situations probability theory must be used. It presents the concepts and terminology related to probabilistic statistics, as well as their practical application. By using probability theory, it is possible to predict the occurrence of one or more events. The chapter also shows how combinatorial analysis can be used to calculate probabilities.

Keywords

Probability theory; Events; Conditional probability; Bayes' theorem; Combinatorial analysis

Do you want to sell sugar water for the rest of your life, or do you want to come with me and change the world?

Steve Jobs

5.1 Introduction

In the previous part of this book, we studied descriptive statistics, which describes and summarizes the main characteristics observed in a dataset through frequency distribution tables, charts, graphs, and summary measures, allowing the researcher to have a better understanding of the data.

Probabilistic statistics, on the other hand, uses probability theory to describe how often uncertain events happen, in order to estimate or predict the occurrence of future events. For example, when rolling a die, we do not know for sure which value will appear, so probability is used to quantify how likely a given outcome is.

According to Bruni (2011), the history of probability presumably started with cave dwellers, who needed to understand nature's uncertain phenomena better. In the 17th century, probability theory emerged to explain uncertain events, and its study evolved to help plan moves and develop strategies for gambling. Currently, it is also applied to the study of statistical inference, in order to generalize results from a sample to the population.

The main objective of this chapter is to present the concepts and terminology related to probability theory, as well as their practical application.

5.2 Terminology and Concepts

5.2.1 Random Experiment

An experiment is any process of observation or measurement. A random experiment is one whose results are unpredictable: even if the process is repeated several times, the result of a given repetition cannot be predicted with certainty. Flipping a coin and rolling a die are examples of random experiments.

5.2.2 Sample Space

Sample space S consists of all the possible results of an experiment.

For example, when flipping a coin, we can get heads (H) or tails (T). Therefore, S = {H, T}. On the other hand, when rolling a die, the sample space is represented by S = {1, 2, 3, 4, 5, 6}.

5.2.3 Events

An event is any subset of a sample space.

For example, let event A contain only the even outcomes of rolling a die. Then A = {2, 4, 6}.

5.2.4 Unions, Intersections, and Complements

Two or more events can form unions, intersections, and complements.

The union of two events A and B, represented by A ∪ B, is a new event containing all the elements that are in A, in B, or in both, as illustrated in Fig. 5.1.

Fig. 5.1 Union of two events (A ∪ B).

The intersection of two events A and B, represented by A ∩ B, is a new event containing all the elements that are simultaneously in A and B, as illustrated in Fig. 5.2.

Fig. 5.2 Intersection of two events (A ∩ B).

The complement of an event A, represented by A^c, is the event that contains all the points of S that are not in A, as shown in Fig. 5.3.

Fig. 5.3 Complement of event A.
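
Unions, intersections, and complements map directly onto Python's built-in set type. The following is a minimal sketch for one roll of a die; event B (numbers greater than 3) is a hypothetical event introduced only for this illustration and does not appear in the text.

```python
# Sample space and events for one roll of a die.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even numbers (the event used in Section 5.2.3)
B = {4, 5, 6}   # hypothetical event: numbers greater than 3

union = A | B          # A ∪ B: elements in A, in B, or in both
intersection = A & B   # A ∩ B: elements in both A and B
complement_A = S - A   # A^c: elements of S that are not in A

print(sorted(union))         # [2, 4, 5, 6]
print(sorted(intersection))  # [4, 6]
print(sorted(complement_A))  # [1, 3, 5]
```

Note that the complement is always taken relative to the sample space S, which is why it is written as a set difference here.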

5.2.5 Independent Events

Two events A and B are independent when the probability of B happening does not depend on whether event A has happened. The concept of conditional probability, which formalizes this idea, will be discussed in Section 5.5.

5.2.6 Mutually Exclusive Events

Mutually exclusive events are those that do not have any elements in common, so they cannot happen simultaneously. Fig. 5.4 illustrates two events A and B that are mutually exclusive.

Fig. 5.4 Events A and B that are mutually exclusive.

5.3 Definition of Probability

When all the outcomes in sample space S are equally likely, the probability of a certain event A happening is given by the ratio between the number of cases favorable to the event (n_A) and the total number of possible cases (n):

$$P(A) = \frac{n_A}{n} = \frac{\text{number of cases favorable to event } A}{\text{total number of possible cases}} \tag{5.1}$$

Example 5.1

When rolling a die, what is the probability of getting an even number?

Solution

The sample space is given by S = {1, 2, 3, 4, 5, 6}. The event we are interested in is A = {even numbers on a die}, so, A = {2, 4, 6}. Therefore, the probability of A happening is:

$$P(A) = \frac{3}{6} = \frac{1}{2}$$

Example 5.2

A gravity-pick machine contains three white balls, two red balls, four yellow balls, and two black balls. What is the probability of a red ball being drawn?

Solution

Given a total of 11 balls and considering A = {the ball is red}, the probability is:

$$P(A) = \frac{\text{number of red balls}}{\text{total number of balls}} = \frac{2}{11}$$
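
The ratio in Expression (5.1) can be checked by counting outcomes directly. The sketch below assumes equally likely outcomes; the helper name `classical_probability` and the idea of tagging each ball with an index (so that balls of the same color stay distinct outcomes) are assumptions of this illustration, not part of the original examples.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Return P(A) = n_A / n, assuming all outcomes are equally likely."""
    outcomes = set(sample_space)
    favorable = set(event) & outcomes  # ignore anything outside S
    return Fraction(len(favorable), len(outcomes))

# Example 5.1: an even number on one roll of a die.
die = {1, 2, 3, 4, 5, 6}
p_even = classical_probability({2, 4, 6}, die)

# Example 5.2: tag every ball with an index so that balls of the
# same color remain distinct outcomes (an assumption of this sketch).
color_counts = {"white": 3, "red": 2, "yellow": 4, "black": 2}
balls = {(color, i) for color, n in color_counts.items() for i in range(n)}
red_balls = {ball for ball in balls if ball[0] == "red"}
p_red = classical_probability(red_balls, balls)

print(p_even)  # 1/2
print(p_red)   # 2/11
```

Using `Fraction` rather than floating-point division keeps the results in the same exact form as the worked solutions.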

5.4 Basic Probability Rules

5.4.1 Probability Variation Field

The probability of an event A happening is a number between 0 and 1:

$$0 \le P(A) \le 1 \tag{5.2}$$

5.4.2 Probability of the Sample Space

Sample space S has probability equal to 1:

$$P(S) = 1 \tag{5.3}$$

5.4.3 Probability of an Empty Set

The probability of the empty set (∅) occurring is zero:

$$P(\emptyset) = 0 \tag{5.4}$$

5.4.4 Probability Addition Rule

The probability of event A, event B or both happening can be calculated as follows:

$$P(A \cup B) = P(A) + P(B) - P(A \cap B) \tag{5.5}$$

If events A and B are mutually exclusive, that is, A ∩ B = ∅, then P(A ∩ B) = 0 and the probability of one of them happening is equal to the sum of the individual probabilities:

$$P(A \cup B) = P(A) + P(B) \tag{5.6}$$

Expression (5.6) can be extended to n events (A1, A2, …, An) that are mutually exclusive:

$$P(A_1 \cup A_2 \cup \dots \cup A_n) = P(A_1) + P(A_2) + \dots + P(A_n) \tag{5.7}$$

5.4.5 Probability of a Complementary Event

If A^c is the complementary event of A, then:

$$P(A^c) = 1 - P(A) \tag{5.8}$$

5.4.6 Probability Multiplication Rule for Independent Events

If A and B are two independent events, the probability of them happening together is equal to the product of their individual probabilities:

$$P(A \cap B) = P(A) \cdot P(B) \tag{5.9}$$

Expression (5.9) can be extended to n independent events (A1, A2, …, An):

$$P(A_1 \cap A_2 \cap \dots \cap A_n) = P(A_1) \cdot P(A_2) \cdots P(A_n) \tag{5.10}$$

Example 5.3

A gravity-pick machine contains balls with numbers 1 through 60 that have the same probability of being drawn. We would like you to:

  a) Define the sample space.
  b) Calculate the probability of a ball with an odd number on it being drawn.
  c) Calculate the probability of a ball with a multiple of 5 on it being drawn.
  d) Calculate the probability of a ball with an odd number or with a multiple of 5 on it being drawn.
  e) Calculate the probability of a ball with a multiple of 7 or a multiple of 10 on it being drawn.
  f) Calculate the probability of a ball that does not have a multiple of 5 on it being drawn.
  g) One ball is drawn randomly and put back into the gravity-pick machine. A new ball will be drawn. Calculate the probability of the first ball having an even number on it and the second one a number greater than 40.

Solution

  a) S = {1, 2, 3, …, 60}.
  b) A = {1, 3, 5, …, 59}, so $P(A) = \frac{30}{60} = \frac{1}{2}$.
  c) A = {5, 10, 15, …, 60}, so $P(A) = \frac{12}{60} = \frac{1}{5}$.
  d) Let A = {1, 3, 5, …, 59} and B = {5, 10, 15, …, 60}. Since A and B are not mutually exclusive events, because they have common elements (5, 15, 25, 35, 45, 55), we apply Expression (5.5):

    $$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{1}{2} + \frac{1}{5} - \frac{6}{60} = \frac{3}{5}$$

  e) In this case, A = {7, 14, 21, 28, 35, 42, 49, 56} and B = {10, 20, 30, 40, 50, 60}. Since the events are mutually exclusive (A ∩ B = ∅), we apply Expression (5.6):

    $$P(A \cup B) = P(A) + P(B) = \frac{8}{60} + \frac{6}{60} = \frac{7}{30}$$

  f) In this case, A = {multiples of 5} and A^c = {numbers that are not multiples of 5}. Therefore, the probability of complementary event A^c happening is:

    $$P(A^c) = 1 - P(A) = 1 - \frac{1}{5} = \frac{4}{5}$$

  g) Let A = {the first ball has an even number} and B = {the second ball has a number greater than 40}. Since the first ball is put back, the events are independent, and we apply Expression (5.9):

    $$P(A \cap B) = P(A) \cdot P(B) = \frac{1}{2} \times \frac{20}{60} = \frac{1}{6}$$
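
The answers to Example 5.3 can be cross-checked by enumerating the 60 balls and counting. This is a minimal sketch, assuming equally likely outcomes; the helper `prob` and the variable names are illustrative only.

```python
from fractions import Fraction

S = set(range(1, 61))  # a) sample space: balls numbered 1 to 60

def prob(event):
    """P(event) by counting, assuming equally likely outcomes."""
    return Fraction(len(event & S), len(S))

odd = {n for n in S if n % 2 == 1}
even = S - odd
mult5 = {n for n in S if n % 5 == 0}
mult7 = {n for n in S if n % 7 == 0}
mult10 = {n for n in S if n % 10 == 0}
gt40 = {n for n in S if n > 40}

p_b = prob(odd)                # b) 1/2
p_c = prob(mult5)              # c) 1/5
p_d = prob(odd | mult5)        # d) 3/5, counted directly on the union
p_e = prob(mult7 | mult10)     # e) 7/30
p_f = prob(S - mult5)          # f) 4/5
p_g = prob(even) * prob(gt40)  # g) 1/6, independent draws with replacement

# The direct count in d) agrees with the addition rule (5.5).
assert p_d == prob(odd) + prob(mult5) - prob(odd & mult5)
# In e) the two events share no elements, so rule (5.6) applies.
assert mult7 & mult10 == set()

print(p_b, p_c, p_d, p_e, p_f, p_g)  # 1/2 1/5 3/5 7/30 4/5 1/6
```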

5.5 Conditional Probability

When events are not independent, we must use the concept of conditional probability. Considering two events A and B, the probability of A happening, given that B has already happened, is called conditional probability of A given B, and is represented by P(A | B):

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \tag{5.11}$$

An event A is considered independent of B if:

$$P(A \mid B) = P(A) \tag{5.12}$$

Example 5.4

A die is rolled. What is the probability of getting the number 4, given that the number rolled was even?

Solution

In this case, A = {number 4} and B = {an even number}. Applying Expression (5.11), we have:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{1/6}{1/2} = \frac{1}{3}$$
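
Expression (5.11) can also be evaluated by counting outcomes. The sketch below assumes equally likely outcomes and a non-empty conditioning event; the function name `conditional_probability` is a hypothetical helper introduced for this illustration.

```python
from fractions import Fraction

def conditional_probability(a, b, sample_space):
    """P(A | B) = P(A ∩ B) / P(B), assuming equally likely outcomes."""
    s = set(sample_space)
    a, b = set(a) & s, set(b) & s
    if not b:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")
    p_b = Fraction(len(b), len(s))
    p_a_and_b = Fraction(len(a & b), len(s))
    return p_a_and_b / p_b

# Example 5.4: A = {number 4}, B = {an even number}.
S = {1, 2, 3, 4, 5, 6}
p_four_given_even = conditional_probability({4}, {2, 4, 6}, S)
print(p_four_given_even)  # 1/3
```

Conditioning on B amounts to shrinking the sample space to B: among the three even outcomes, only one is the number 4.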

5.5.1 Probability Multiplication Rule

From the definition of conditional probability, the multiplication rule allows the researcher to calculate the probability of the simultaneous occurrence of two events A and B as the probability of one of them multiplied by the conditional probability of the other, given that the first event has occurred:

$$P(A \cap B) = P(A) \cdot P(B \mid A) = P(B) \cdot P(A \mid B) \tag{5.13}$$

The multiplication rule can be extended to three events A, B, and C:

$$P(A \cap B \cap C) = P(A) \cdot P(B \mid A) \cdot P(C \mid A \cap B) \tag{5.14}$$

This is only one of the six ways in which Expression (5.14) can be written, since the three events can be taken in any of 3! = 6 orders.

Example 5.5

A gravity-pick machine contains eight white balls, six red balls, and four black balls. Initially, we draw a ball that is not put back into the gravity-pick machine. A new ball will be drawn. What is the probability of both balls being red?

Solution

Unlike the previous example, which calculated the conditional probability of a single event, the objective in this case is to calculate the probability of two events occurring jointly. The events are also not independent, since the first ball is not put back into the gravity-pick machine.

If event A = {the first ball is red} and B = {the second ball is red}, to calculate P(A ∩ B), we must apply Expression (5.13):

$$P(A \cap B) = P(A) \cdot P(B \mid A) = \frac{6}{18} \cdot \frac{5}{17} = \frac{5}{51}$$
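
The result of Example 5.5 can be cross-checked by listing every ordered pair of draws without replacement and counting the pairs in which both balls are red. This is a minimal sketch; identifying each ball by its position in a list is an assumption of the illustration.

```python
from fractions import Fraction
from itertools import permutations

# 18 balls; each ball is identified by its index in this list.
balls = ["white"] * 8 + ["red"] * 6 + ["black"] * 4

# Every ordered pair (first draw, second draw) without replacement.
draws = list(permutations(range(len(balls)), 2))
both_red = [d for d in draws if balls[d[0]] == "red" and balls[d[1]] == "red"]
p_enumerated = Fraction(len(both_red), len(draws))

# Multiplication rule (5.13): P(A) * P(B | A).
p_rule = Fraction(6, 18) * Fraction(5, 17)

print(len(draws), len(both_red))  # 306 30
print(p_enumerated, p_rule)       # 5/51 5/51
```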

Example 5.6

A company will give a car to one of its customers (who are located in different regions of Brazil). Table 5.E.1 shows the data regarding these customers, in terms of gender and city. Determine:

  a) What is the probability of a male customer being drawn?
  b) What is the probability of a female customer being drawn?
  c) What is the probability of a customer from Curitiba being drawn?
  d) What is the probability of a customer from Sao Paulo being drawn, given that it is a male customer?
  e) What is the probability of a female customer being drawn, given that it is a customer from Aracaju?
  f) What is the probability of a female customer from Salvador being drawn?

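Table 5.E.1 Absolute Frequency Distribution According to Gender and City

| City | Male | Female | Total |
|---|---|---|---|
| Goiania | 12 | 14 | 26 |
| Aracaju | 8 | 12 | 20 |
| Salvador | 16 | 15 | 31 |
| Curitiba | 24 | 22 | 46 |
| Sao Paulo | 35 | 25 | 60 |
| Belo Horizonte | 10 | 12 | 22 |
| Total | 105 | 100 | 205 |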

Solution

  a) The probability of the customer being a man is 105/205 = 21/41.
  b) The probability of the customer being a woman is 100/205 = 20/41.
  c) The probability of the customer being from Curitiba is 46/205.
  d) Considering that A = {Sao Paulo} and B = {male}, P(A | B) is calculated according to Expression (5.11):

    $$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{35/205}{105/205} = \frac{1}{3}$$

  e) Considering that A = {female} and B = {Aracaju}, P(A | B) is:

    $$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{12/205}{20/205} = \frac{3}{5}$$

  f) If A = {Salvador} and B = {female}, P(A ∩ B) is calculated according to Expression (5.13):

    $$P(A \cap B) = P(A) \cdot P(B \mid A) = \frac{31}{205} \cdot \frac{15}{31} = \frac{3}{41}$$
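
The calculations in Example 5.6 can be reproduced from the frequency table. The following is a minimal sketch: the dictionary layout and variable names are assumptions of this illustration, while the counts are copied from Table 5.E.1.

```python
from fractions import Fraction

# (male, female) counts per city, copied from Table 5.E.1.
counts = {
    "Goiania": (12, 14),
    "Aracaju": (8, 12),
    "Salvador": (16, 15),
    "Curitiba": (24, 22),
    "Sao Paulo": (35, 25),
    "Belo Horizonte": (10, 12),
}

n_male = sum(male for male, _ in counts.values())
n_female = sum(female for _, female in counts.values())
n_total = n_male + n_female

p_male = Fraction(n_male, n_total)                       # a)
p_female = Fraction(n_female, n_total)                   # b)
p_curitiba = Fraction(sum(counts["Curitiba"]), n_total)  # c)

# d) P(Sao Paulo | male) = P(Sao Paulo ∩ male) / P(male)
p_sp_given_male = Fraction(counts["Sao Paulo"][0], n_total) / p_male

# e) P(female | Aracaju) = P(female ∩ Aracaju) / P(Aracaju)
p_aracaju = Fraction(sum(counts["Aracaju"]), n_total)
p_female_given_aracaju = Fraction(counts["Aracaju"][1], n_total) / p_aracaju

# f) P(Salvador ∩ female) = P(Salvador) * P(female | Salvador)
n_salvador = sum(counts["Salvador"])
p_salvador_and_female = (
    Fraction(n_salvador, n_total) * Fraction(counts["Salvador"][1], n_salvador)
)

print(p_male, p_female, p_curitiba)  # 21/41 20/41 46/205
print(p_sp_given_male, p_female_given_aracaju, p_salvador_and_female)  # 1/3 3/5 3/41
```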

5.6 Bayes' Theorem

Imagine that the probability of a certain event was calculated. However, new information was added to the process, so, the probability must be recalculated. The probability calculated initially is called a priori probability; the probability with the recently added information is called a posteriori probability. The calculation of the a posteriori probability is based on Bayes' Theorem and is described here.

Consider mutually exclusive events B1, B2, …, Bn such that P(B1) + P(B2) + … + P(Bn) = 1. Let A be any given event that happens jointly with, or as a consequence of, one of the events Bi (i = 1, 2, …, n). The probability of event Bi happening, given that event A has already happened, is calculated as follows:

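$$P(B_i \mid A) = \frac{P(B_i \cap A)}{P(A)} = \frac{P(B_i) \cdot P(A \mid B_i)}{P(B_1) \cdot P(A \mid B_1) + P(B_2) \cdot P(A \mid B_2) + \dots + P(B_n) \cdot P(A \mid B_n)} \tag{5.15}$$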

where:

  • P(Bi) is the a priori probability;
  • P(Bi | A) is the a posteriori probability (probability of Bi after A has happened).

Example 5.7

Consider three identical gravity-pick machines U1, U2, and U3. Gravity-pick machine U1 contains two balls, one is yellow and the other is red. Gravity-pick machine U2, on the other hand, contains three blue balls, while machine U3 contains two red balls and a yellow one. We select one of the gravity-pick machines at random and draw one ball. We can see that the ball chosen is yellow. What is the probability of gravity-pick machine U1 having been chosen?

Solution

Let's define the following events:

B1 = choosing gravity-pick machine U1;

B2 = choosing gravity-pick machine U2;

B3 = choosing gravity-pick machine U3;

A = choosing the yellow ball.

The objective is to calculate P(B1 | A), knowing that:

P(B1) = 1/3, P(A | B1) = 1/2

P(B2) = 1/3, P(A | B2) = 0

P(B3) = 1/3, P(A | B3) = 1/3

Therefore, we have:

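$$P(B_1 \mid A) = \frac{P(B_1 \cap A)}{P(A)} = \frac{P(B_1) \cdot P(A \mid B_1)}{P(B_1) \cdot P(A \mid B_1) + P(B_2) \cdot P(A \mid B_2) + P(B_3) \cdot P(A \mid B_3)}$$

$$P(B_1 \mid A) = \frac{\frac{1}{3} \cdot \frac{1}{2}}{\frac{1}{3} \cdot \frac{1}{2} + \frac{1}{3} \cdot 0 + \frac{1}{3} \cdot \frac{1}{3}} = \frac{3}{5}$$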
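
Expression (5.15) translates into a few lines of code. The sketch below is one possible way to compute all a posteriori probabilities at once; the function name `bayes_posterior` is hypothetical, and the inputs are the a priori probabilities and conditional probabilities listed in Example 5.7.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return the list of P(B_i | A), given P(B_i) and P(A | B_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(B_i) * P(A | B_i)
    p_a = sum(joint)                                      # denominator of (5.15)
    if p_a == 0:
        raise ValueError("P(A) is zero, so the posterior is undefined")
    return [j / p_a for j in joint]

# Example 5.7: machines U1, U2, U3 are equally likely to be chosen.
priors = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
# P(yellow | U1), P(yellow | U2), P(yellow | U3)
likelihoods = [Fraction(1, 2), Fraction(0), Fraction(1, 3)]

posterior = bayes_posterior(priors, likelihoods)
print(posterior[0])  # 3/5
```

The remaining entries of `posterior` give P(B2 | A) = 0 and P(B3 | A) = 2/5, so the three a posteriori probabilities still add up to 1.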

5.7 Combinatorial Analysis

Combinatorial analysis is a set of procedures for counting the number of different groups that can be formed by selecting a finite number of elements from a set. Arrangements, combinations, and permutations are the three main types of configurations, and all of them can be applied to the calculation of probabilities. The probability of an event is, therefore, the ratio between the number of results of the event we are interested in and the total number of results in the sample space (total number of arrangements, combinations, or permutations).

5.7.1 Arrangements

Arrangements count the number of possible ordered configurations of distinct elements taken from a certain set. Bruni (2011) defines arrangement as the study of the number of ways in which a researcher can organize a sample of objects drawn from a larger population, when changing the order of the organized objects is relevant.

Given n different objects, if the objective is to select p of these objects (n and p are integers, n ≥ p), the number of arrangements or possible ways of doing this is represented by $A_{n,p}$ and calculated as follows:

$$A_{n,p} = \frac{n!}{(n-p)!} \tag{5.16}$$

Example 5.8

Consider a set with three elements, A = {1, 2, 3}. If these elements were taken two at a time, how many arrangements would be possible? What is the probability of element 3 being in the second position?

Solution

From Expression (5.16), we have:

$$A_{3,2} = \frac{3!}{(3-2)!} = \frac{3 \times 2 \times 1}{1} = 6$$

These arrangements are (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), and (3, 2). In an arrangement, the order in which the elements are organized is relevant. For example, (1, 2) ≠ (2, 1).

After defining all the arrangements, it is easy to calculate the probability. Since we have two arrangements in which element 3 is in the second position, given that the total number of arrangements is 6, the probability is 2/6 = 1/3.

Example 5.9

Calculate the number of ways in which it is possible to park six vehicles in three parking spaces. What is the probability of vehicle 1 being in the first parking space?

Solution

Through Expression (5.16), we have:

$$A_{6,3} = \frac{6!}{(6-3)!} = \frac{6 \times 5 \times 4 \times 3!}{3!} = 120$$

From the 120 possible arrangements, in 20 of them vehicle 1 is in the first position: (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 2, 6), (1, 3, 2), (1, 3, 4), (1, 3, 5), (1, 3, 6), (1, 4, 2), (1, 4, 3), (1, 4, 5), (1, 4, 6), (1, 5, 2), (1, 5, 3), (1, 5, 4), (1, 5, 6), (1, 6, 2), (1, 6, 3), (1, 6, 4), (1, 6, 5). Therefore, the probability is 20/120 = 1/6.
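
Arrangements can be enumerated with `itertools.permutations(iterable, r)`, and their number computed with `math.perm(n, p)` (available from Python 3.8). The following is a minimal sketch that rechecks Examples 5.8 and 5.9; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import perm

# Example 5.8: arrangements of {1, 2, 3} taken two at a time.
arrangements = list(permutations([1, 2, 3], 2))
assert len(arrangements) == perm(3, 2)  # A_{3,2} = 3! / (3 - 2)!
favorable = [a for a in arrangements if a[1] == 3]  # 3 in second position
p_three_second = Fraction(len(favorable), len(arrangements))

# Example 5.9: six vehicles, three parking spaces.
parkings = list(permutations(range(1, 7), 3))
assert len(parkings) == perm(6, 3)      # A_{6,3} = 6! / (6 - 3)!
vehicle1_first = [a for a in parkings if a[0] == 1]
p_vehicle1_first = Fraction(len(vehicle1_first), len(parkings))

print(len(arrangements), p_three_second)                         # 6 1/3
print(len(parkings), len(vehicle1_first), p_vehicle1_first)     # 120 20 1/6
```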

5.7.2 Combinations

Combinations are a special case of arrangements in which the order of the elements does not matter.

Given n different objects, the number of ways or combinations in which to organize p of these objects is represented by $C_{n,p}$ (a combination of n elements arranged p by p), and calculated as follows:

$$C_{n,p} = \binom{n}{p} = \frac{n!}{p!\,(n-p)!} \tag{5.17}$$

Example 5.10

In how many different ways can groups of four students be formed in a class of 20 students?

Solution

Since the order of the elements in the group is not relevant, we must apply Expression (5.17):

$$C_{20,4} = \binom{20}{4} = \frac{20!}{4!\,(20-4)!} = \frac{20 \times 19 \times 18 \times 17 \times 16!}{24 \times 16!} = 4{,}845$$

Thus, 4,845 different groups can be formed.

Example 5.11

Marcelo, Felipe, Luiz Paulo, Rodrigo, and Ricardo went to an amusement park to have fun. The ride they chose to go on next only has three seats, so, only three of them will be chosen randomly. What is the probability of Felipe and Luiz Paulo being on that ride?

Solution

The total number of combinations is:

$$C_{5,3} = \binom{5}{3} = \frac{5!}{3!\,2!} = \frac{5 \times 4 \times 3!}{3! \times 2} = 10$$

The 10 possibilities are:

Group 1: Marcelo, Felipe, and Luiz Paulo

Group 2: Marcelo, Felipe, and Rodrigo

Group 3: Marcelo, Felipe, and Ricardo

Group 4: Marcelo, Luiz Paulo, and Rodrigo

Group 5: Marcelo, Luiz Paulo, and Ricardo

Group 6: Marcelo, Rodrigo, and Ricardo

Group 7: Felipe, Luiz Paulo, and Rodrigo

Group 8: Felipe, Luiz Paulo, and Ricardo

Group 9: Felipe, Rodrigo, and Ricardo

Group 10: Luiz Paulo, Rodrigo, and Ricardo

Felipe and Luiz Paulo appear together in three of these groups (Groups 1, 7, and 8). Therefore, the probability is 3/10.
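
Combinations can be counted with `math.comb(n, p)` (Python 3.8 or later) and listed with `itertools.combinations`. The sketch below, with illustrative variable names, rechecks Examples 5.10 and 5.11.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Example 5.10: groups of 4 students out of 20.
n_groups = comb(20, 4)

# Example 5.11: three of the five friends are chosen at random.
friends = ["Marcelo", "Felipe", "Luiz Paulo", "Rodrigo", "Ricardo"]
rides = list(combinations(friends, 3))
both_on_ride = [r for r in rides if "Felipe" in r and "Luiz Paulo" in r]
p_both = Fraction(len(both_on_ride), len(rides))

print(n_groups)                             # 4845
print(len(rides), len(both_on_ride), p_both)  # 10 3 3/10
```

Because `combinations` treats each selection as unordered, (Felipe, Luiz Paulo, Rodrigo) is generated only once, which matches the definition in Section 5.7.2.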

5.7.3 Permutations

A permutation is an arrangement in which all the elements in the set are selected. The number of permutations is, therefore, the number of ways in which n elements can be ordered. It is represented by $P_n$ and can be calculated as follows:

$$P_n = n! \tag{5.18}$$

Example 5.12

Consider a set with three elements, A = {1, 2, 3}. What is the total number of permutations possible?

Solution

P3 = 3! = 3 × 2 × 1 = 6. They are (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), and (3, 2, 1).

Example 5.13

A certain factory manufactures six different products. In how many different ways can the production sequence be ordered?

Solution

To determine the number of possible production sequences, we just need to apply Expression (5.18):

$$P_6 = 6! = 6 \times 5 \times 4 \times 3 \times 2 \times 1 = 720$$
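
Since a permutation is an arrangement of all n elements, Expression (5.18) agrees with Expression (5.16) when p = n. The short sketch below, under that reading, rechecks Examples 5.12 and 5.13 with `math.factorial` and `itertools.permutations`; the variable names are illustrative.

```python
from itertools import permutations
from math import factorial, perm

# Example 5.12: all permutations of {1, 2, 3}.
perms_3 = list(permutations([1, 2, 3]))
print(perms_3)
# [(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]

# Example 5.13: production sequences for six different products.
n_sequences = factorial(6)
print(n_sequences)  # 720

# P_n equals A_{n,n}, because (n - n)! = 0! = 1.
assert perm(6, 6) == factorial(6)
```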

5.8 Final Remarks

This chapter discussed the concepts and terminology related to probability theory, as well as their practical application. Probability theory is used to assess the likelihood of uncertain events happening. It originated in attempts to understand uncertain natural phenomena, evolved to support the planning of gambling strategies, and is currently applied to the study of statistical inference.

5.9 Exercises

  1) Two soccer teams will play overtime until the Golden Goal is scored. Define the sample space.
  2) What is the difference between mutually exclusive events and independent events?
  3) In a deck of cards with 52 cards, determine:
    a. The probability of a card of hearts being drawn;
    b. The probability of a queen being drawn;
    c. The probability of a face card (jack, queen, or king) being drawn;
    d. The probability of any card, but not a face card, being drawn.
  4) A production batch contains 240 parts and 12 of them are defective. One part is drawn randomly. What is the probability of this part being defective?
  5) A number between 1 and 30 is chosen randomly. We would like you to:
    a. Define the sample space.
    b. What is the probability of this number being divisible by 3?
    c. What is the probability of this number being a multiple of 5?
    d. What is the probability of this number being divisible by 3 or a multiple of 5?
    e. What is the probability of this number being even, given that it is a multiple of 5?
    f. What is the probability of this number being a multiple of 5, given that it is divisible by 3?
    g. What is the probability of this number not being divisible by 3?
    h. Assuming that two numbers are chosen randomly, what is the probability of the first number being a multiple of 5 and the second one an odd number?
  6) Two dice are rolled simultaneously. Determine:
    a. The sample space.
    b. What is the probability of both numbers being even?
    c. What is the probability of the sum of the numbers being 10?
    d. What is the probability of the multiplication of the numbers being 6?
    e. What is the probability of the sum of the numbers being 10 or 6?
    f. What is the probability of the number drawn in the first die being an odd number or of the number drawn in the second die being a multiple of 3?
    g. What is the probability of the number drawn in the first die being an even number or of the number drawn in the second die being a multiple of 4?
  7) What is the difference between arrangements, combinations, and permutations?

References

Bruni A.L. Estatística aplicada à gestão empresarial. third ed. São Paulo: Atlas; 2011.

