# Geometric Probability – Explanation & Examples

Many experiments of practical interest have only two possible outcomes: success or failure. If we repeat such an experiment many times, and the probability of success and failure remains the same for each trial, then these trials are called Bernoulli trials.

Geometric probability, or the geometric distribution, describes the probability that the first success in a sequence of Bernoulli trials occurs on a particular trial.

Before reading this article, you should be familiar with:

1. Basic probability theory
2. Independent events
3. Mean and variance

After reading this article, you should understand:

1. What is meant by geometric probability and the geometric distribution.
2. What Bernoulli trials are.
3. How to calculate geometric probability.
4. How to find the mean and variance of the geometric distribution.

## What is Geometric Probability

To understand geometric probability, we first need to understand what constitutes a Bernoulli trial.

### Bernoulli Trials:

Bernoulli trials deal with experiments that have two possible outcomes: “success” and “failure”. For example, when we toss a coin, we get either heads or tails. We can arbitrarily label one of them as a success and the other as a failure. Another example is rolling a fair six-sided die. Although there are six possible outcomes in this case, we can turn the roll into a Bernoulli trial by focusing on one particular outcome, say 2. If we roll the die and get a 2, it is a success; if we get any other number, it is a failure.

### Conditions for Bernoulli Trials:

For an experiment to qualify as a Bernoulli trial, it must satisfy two conditions:

1. There are only two outcomes.
2. The probability of success (and failure) remains the same for each trial, regardless of the outcomes of earlier trials.

For example, let’s say we draw a card from a deck of 52 cards and call it a success if we get a king; otherwise, it is a failure. If we put the card back in the deck after each draw, the probability of drawing a king remains the same on every draw. However, if we do not replace the card before drawing the next one, the probability on each subsequent draw depends on the previous draws, so it changes from draw to draw. In that case, the experiment does not qualify as a sequence of Bernoulli trials.
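To make the replacement condition concrete, here is a minimal sketch that compares the chance of a king on the second draw with and without replacement. The variable names are illustrative only.

```python
from fractions import Fraction

KINGS, DECK = 4, 52

# With replacement: the deck is restored before every draw,
# so the chance of a king is the same on each draw.
p_with_replacement = Fraction(KINGS, DECK)

# Without replacement: the second draw depends on the first.
p_second_after_king = Fraction(KINGS - 1, DECK - 1)   # first card was a king
p_second_after_other = Fraction(KINGS, DECK - 1)      # first card was not a king

print(p_with_replacement)     # 1/13
print(p_second_after_king)    # 1/17
print(p_second_after_other)   # 4/51
```

Only the first case keeps the success probability fixed from draw to draw, which is what a Bernoulli trial requires.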

Now that we have understood what a Bernoulli trial means, the concept of geometric probability becomes simple.

### Geometric probability definition

The probability that the first success in a sequence of Bernoulli trials occurs on a given trial is referred to as geometric probability.

## How to find Geometric Probability:

It is best to consider an example to understand this concept.

Example 1: Let us suppose we are rolling a six-sided fair die many times. What is the probability of the following events:

1. Getting a 2 on the first attempt.
2. The first 2 appears on the third attempt.
3. The first 2 appears on the 10th attempt.

Solution:

Let us define “getting a 2” as a success and “not getting a 2” as a failure. So,

$P(\textrm{Getting a 2 in a single roll}) = \frac16$

$P(\textrm{Not getting a 2 in a single roll}) = 1 - \frac16 = \frac56$

The probability of a success or a failure remains the same each time we roll the die, so each roll is a Bernoulli trial. Since we are interested in the probability that the first 2 appears on the first, third, or 10th attempt, this is a case of geometric probability.

Now, the probability of getting the first $2$ in the first attempt is, of course, $\frac16$. If we get the first $2$ in the third attempt, it means we did not get a $2$ in the first AND second attempt, so

$P(\textrm{Getting the first 2 in 3rd attempt})$

$= P(\textrm{Not getting 2 in 1st attempt} \;\textrm{AND}\; \textrm{Not getting 2 in 2nd attempt} \;\textrm{AND}\; \textrm{Getting a 2 in 3rd attempt})$.

Note that the probability of any given attempt is independent of what has happened in the previous attempts or what might happen in future attempts. Recall from basic probability theory that if two events, say $E1$ and $E2$ are independent, then the probability of the event $E1 \;\textrm{AND}\; E2 = P(E1) \times P(E2)$. Hence,

$P(\textrm{Getting the first 2 in 3rd attempt})$

$= P(\textrm{Not getting 2 in 1st attempt}) \times P(\textrm{Not getting 2 in 2nd attempt})$

$\times \;P(\textrm{Getting a 2 in 3rd attempt})$.

$P(\textrm{Getting the first 2 in 3rd attempt}) = (1 - \frac16) \times (1 - \frac16) \times \frac16 = \frac{25}{216} \approx 0.116$

Similarly,

$P(\textrm{Getting the first 2 in 10th attempt})$

$= P(\textrm{Not getting 2 in 1st attempt}) \times \cdots \times P(\textrm{Not getting 2 in 9th attempt})$

$\times\; P(\textrm{Getting 2 in 10th attempt})$.

$P(\textrm{Getting the first 2 in 10th attempt}) = (1 - \frac16)^{9} \times \frac16 \approx 0.0323$.
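The product rule used above can be checked numerically. The following sketch multiplies one failure probability for each earlier roll and then the success probability for the final roll. The function name `first_two_on_attempt` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction

p_success = Fraction(1, 6)   # rolling a 2 on a fair die
p_failure = 1 - p_success    # rolling anything else

def first_two_on_attempt(n):
    """Probability that the first 2 appears exactly on roll n."""
    prob = Fraction(1)
    for _ in range(n - 1):
        prob *= p_failure     # each earlier roll is not a 2
    return prob * p_success   # roll n is a 2

for n in (1, 3, 10):
    exact = first_two_on_attempt(n)
    print(n, exact, round(float(exact), 4))
```

Exact fractions are used so that rounding happens only when the result is printed.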

### Geometric distribution formula

Consider a sequence of Bernoulli trials with probability of success $p$ and probability of failure $1-p$. The probability of getting the first success on the $k$th attempt is given by

$P(\textrm{First success in kth attempt}) = (1-p)^{k-1}p$.

Example 2: A card is drawn randomly from a deck of $52$ cards and replaced. The process is continued until a king is drawn. Find the probability that the first king is drawn in the 5th attempt.

Solution:

Let us call drawing a king a success and drawing any other card a failure. Since the card is replaced after each draw, we are dealing with Bernoulli trials. We are interested in the first success, so we are dealing with geometric probability. There are 4 kings in a deck of 52 cards, so the probability of success is $4/52 = 1/13$.

Using the formula described above,

$P(\textrm{First king in 5th draw}) = (1 - \frac{1}{13})^{5-1} \times \frac{1}{13} \approx 0.056$.
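As a quick check of this calculation, the formula can be written as a small function. This is a minimal sketch; `geometric_pmf` is a name chosen here for illustration and is not taken from any library.

```python
def geometric_pmf(k, p):
    """P(first success on attempt k) for success probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if k < 1:
        raise ValueError("k must be a positive integer")
    return (1 - p) ** (k - 1) * p

p_king = 4 / 52  # four kings in a 52-card deck, drawn with replacement
print(round(geometric_pmf(5, p_king), 3))  # 0.056
```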

### Geometric distribution:

Let $X$ be a random variable. Informally, a random variable is a quantity that takes numerical values, each with some probability. If $P(X=k) = (1-p)^{k-1}p$ for $k = 1, 2, 3, \ldots$, then $X$ is called a geometric random variable, and the function $f(k) = (1-p)^{k-1}p$ is called the geometric distribution. Note that $f(k)$ is defined only for positive integer values of $k$ and hence is a discrete distribution. Below, we plot the geometric distribution for various values of the probability of success.

### Mean of the geometric distribution:

When we perform Bernoulli trials, we might get a success on the first attempt, or on the 10th attempt, or perhaps not until the 100,000th attempt (although the probability of such a case would be very low). We might be interested in the question, “How many attempts, on average, are required to get the first success?” Such a number is called the mean or the expected value of the distribution. For the geometric distribution, the expected value can be calculated using the formula

$E(X) = \sum^{\infty}_{k=1}(1 - p)^{k-1} \times p \times k$.

We omit the proof, but it can be shown that $E(X) = \frac1p$ if $X$ is a geometric random variable and $p$ is the probability of success.

### The variance of the geometric distribution:

Variance is a measure of the spread of the distribution. It tells us how far the values of $X$ typically fall from the mean/expected value. In some applications, we might be interested in both the expected value and the variance of the geometric distribution. Again, we omit the proof and state the formula for the variance:

$VAR(X) = \frac{1 - p}{p^2}$.
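Although the proofs are omitted, both formulas can be checked numerically by cutting the infinite sums off after many terms. The sketch below does this for $p = \frac16$. The function name and the cutoff of 10,000 terms are assumptions made for this illustration.

```python
def geometric_moments(p, terms=10_000):
    """Approximate the mean and variance by truncating the infinite sums."""
    mean = 0.0
    second_moment = 0.0
    for k in range(1, terms + 1):
        prob = (1 - p) ** (k - 1) * p   # P(X = k)
        mean += k * prob                # contributes to E(X)
        second_moment += k * k * prob   # contributes to E(X^2)
    return mean, second_moment - mean ** 2

p = 1 / 6
mean, variance = geometric_moments(p)
print(mean, 1 / p)                  # truncated sum vs. 1/p
print(variance, (1 - p) / p ** 2)   # truncated sum vs. (1-p)/p^2
```

The terms shrink geometrically, so the truncated sums should agree with the closed-form values to many decimal places.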

Let us see a few examples.

Example 3: A six-sided fair die is rolled repeatedly until we get a 3. How many rolls, on average, are required to get the first 3?

Solution:
We are dealing with a geometric distribution here with $p=\frac16$. Since we want the average number of attempts needed for the first success, we can use the formula for the expected value of a geometric random variable. Hence,

$\textrm{Number of rolls, on average, to get first 3} = E(X) = \frac1p = \frac{1}{1/6} = 6$.

So, on average, it will take us six attempts to get the first 3.
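A simulation offers an informal sanity check of this result. The sketch below repeats the experiment many times with a seeded random number generator and averages the number of rolls. The helper name, the seed, and the trial count are arbitrary choices for illustration.

```python
import random

def rolls_until_three(rng):
    """Roll a fair die until a 3 appears; return the number of rolls used."""
    rolls = 0
    while True:
        rolls += 1
        if rng.randint(1, 6) == 3:
            return rolls

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 100_000
average = sum(rolls_until_three(rng) for _ in range(trials)) / trials
print(average)  # should land close to the theoretical mean of 6
```

The simulated average will not equal 6 exactly, but it should move closer to 6 as the number of trials grows.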

Example 4: In a community, for every 100 persons, 2 are infected with COVID-19. A doctor performs ideal PCR tests (with 100% detection accuracy) to detect COVID patients each day by randomly selecting persons from the community. Each test is assumed to be independent of the other. How many tests, on average, would the doctor perform to get the first positive result?

Solution:

Let us define a positive test as a success (ironically). The probability of success is $2/100 = 1/50$. Since the tests are independent, each one is a Bernoulli trial. Since we are interested in the first success, the number of tests follows a geometric distribution. Using the formula for the expected value of a geometric distribution,

$\textrm{Number of tests, on average, to get first positive} = \frac{1}{p} = 50$.

So, on average, the doctor will need to perform $50$ tests to get the first positive result.

### Cumulative geometric probability

Sometimes, we are not interested in whether the first success appears exactly on the 4th or 5th attempt, but in whether it appears WITHIN the first 4 or 5 attempts. In other words, instead of asking for $P(X=k)$, we are asking for $P(X \leq k)$. One possible method is to note that

$P(X \leq k) = P(X=1) + P(X=2) + \cdots + P(X=k)$.

However, there is another simpler method to find $P(X \leq k)$. We note from basic probability that

$P(X \leq k) = 1 - P(X>k)$.

For example, asking for the probability that the first success appears after the 4th attempt, i.e., $P(X>4)$, is the same as asking for the probability that the first four attempts are all failures. In general,

$P(X>k) = P(\textrm{1st attempt is failure}) \times P(\textrm{2nd attempt is failure}) \times \cdots \times P(\textrm{kth attempt is failure})$.

$P(X>k) = (1-p)^k$.

Accordingly,

$P(X \leq k) = 1 - (1-p)^k$.
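The two routes to $P(X \leq k)$ described above should give the same number. Here is a minimal sketch that compares them. The function names and the value $p = 0.25$ are illustrative assumptions.

```python
def cdf_by_summing(k, p):
    """P(X <= k) as P(X=1) + P(X=2) + ... + P(X=k)."""
    return sum((1 - p) ** (j - 1) * p for j in range(1, k + 1))

def cdf_closed_form(k, p):
    """P(X <= k) as 1 - P(X > k) = 1 - (1 - p)^k."""
    return 1 - (1 - p) ** k

p = 0.25  # illustrative success probability
for k in (1, 4, 5):
    print(k, cdf_by_summing(k, p), cdf_closed_form(k, p))
```

The closed form needs a single power rather than a sum of $k$ terms, which is why it is the simpler method.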

Example 5: A card is drawn randomly from a deck of 52 cards and replaced. The process is continued until a king is drawn. Find the probability that the first king is drawn WITHIN the first 5 attempts.

Solution:

Here $p = \frac{1}{13}$, as in Example 2.

$P(\textrm{First king not in first 5 attempts}) = (1 - \frac{1}{13})^{5} \approx 0.67$

$P(\textrm{First king within first 5 attempts}) = 1 - P(\textrm{First king not in first 5 attempts}) \approx 0.33$

### Practice Questions:

Question No.1: A factory produces PlayStation-4 consoles. The probability of producing a defective console is $p=0.001$. A tester tests randomly selected items from the production line. Find the probability of the following events:

1. The first defective console is detected in the 4th test.
2. The first defective console is detected in the 100th test.
3. No defective console is detected in the first 100 tests.
4. The first defective console is detected within the first 100 tests.

Answers:

1) 0.000997
2) 0.000906
3) 0.905
4) 0.0952

Question No.2: A traveling salesperson is selling an item. The probability of successfully selling the item to a random customer is $0.1$. On average, how many customers will the salesperson need to approach to make the first sale?

Question No.3: Plot the geometric distribution for $p=0.4$. What is the mean of the distribution? What is the variance? 