Negative Binomial Distribution: A Complete and Detailed Guide


Introduction

The negative binomial distribution is a fundamental probability distribution used in statistics to model the number of independent Bernoulli trials needed to achieve a fixed number of successes. It generalizes the geometric distribution, which models the number of trials until the first success.

This distribution is especially useful in situations where you are interested in counting the number of attempts needed to observe a specific number of successful outcomes, with each trial having the same probability of success.

Definition

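Consider a sequence of independent Bernoulli trials, each with probability of success p, continued until the r-th success occurs. The outcome can be described by the total number of trials, but the more common convention, and the one used throughout this guide, is to let X be the number of failures observed before the r-th success. The total number of trials is then X + r. Under this convention, X follows a Negative Binomial Distribution with parameters r and p.

The probability mass function (PMF) is: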

P(X = x) = C(x + r - 1, r - 1) * p^r * (1 - p)^x

for x = 0, 1, 2, ...

Here, X counts the number of failures before the r-th success, and C(n, k) denotes the binomial coefficient:

C(n, k) = n! / (k! * (n - k)!)
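
As a minimal sketch, the PMF above can be written directly with Python's standard library. The function name neg_binom_pmf and the parameter values r = 3, p = 0.4 are illustrative choices, not part of any library. The sum at the end is a sanity check: the probabilities over x = 0, 1, 2, ... should add up to 1 (the series is truncated at 500 terms, which is more than enough for these values).

```python
from math import comb

def neg_binom_pmf(x: int, r: int, p: float) -> float:
    """P(X = x): probability of exactly x failures before the r-th success."""
    if x < 0:
        return 0.0
    # C(x + r - 1, r - 1) * p^r * (1 - p)^x, as in the formula above
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

# Sanity check: the PMF should sum to (approximately) 1.
total = sum(neg_binom_pmf(x, 3, 0.4) for x in range(500))
print(total)
```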

Parameters

  • r: Number of desired successes (a positive integer)
  • p: Probability of success on each trial (0 < p < 1)
  • X: Number of failures before achieving r successes (this is the random variable being modeled, not a parameter)
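
To make the roles of r, p, and X concrete, here is a small simulation sketch using only the standard library. It runs Bernoulli trials until the r-th success and returns the number of failures. The helper name, the seed, the sample size, and the values r = 3, p = 0.5 are all assumptions made for illustration. From the PMF, the chance of zero failures is p^r, so the observed share of zeros should come out close to 0.5^3 = 0.125.

```python
import random

def count_failures_before_r_successes(r: int, p: float, rng: random.Random) -> int:
    """Run Bernoulli(p) trials until the r-th success; return the number of failures."""
    successes = 0
    failures = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [count_failures_before_r_successes(3, 0.5, rng) for _ in range(20000)]

# Share of runs with no failures at all; the PMF gives p^r = 0.125 for x = 0.
share_zero = samples.count(0) / len(samples)
print(share_zero)
```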

Mean and Variance

If X follows a Negative Binomial Distribution with parameters r and p, then:

  • Mean (Expected value): E[X] = r * (1 - p) / p
  • Variance: Var(X) = r * (1 - p) / p²
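
The two formulas can be checked numerically by summing over the PMF. The sketch below assumes the failures-before-the-r-th-success convention used in this guide; the helper name and the values r = 3, p = 0.4 are illustrative. The infinite sums are truncated at 400 terms, where the remaining tail is negligible for these values.

```python
from math import comb

def neg_binom_pmf(x: int, r: int, p: float) -> float:
    """P(X = x) for x failures before the r-th success."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

r, p = 3, 0.4
xs = range(400)  # truncated support; the tail beyond this is negligible here

# Mean and variance computed directly from the definition.
mean_numeric = sum(x * neg_binom_pmf(x, r, p) for x in xs)
var_numeric = sum((x - mean_numeric) ** 2 * neg_binom_pmf(x, r, p) for x in xs)

# Closed-form expressions from the text.
mean_formula = r * (1 - p) / p
var_formula = r * (1 - p) / p**2

print(mean_numeric, mean_formula)
print(var_numeric, var_formula)
```

Since p is less than 1, the variance is always larger than the mean, which is one reason this distribution is often chosen for count data that is more spread out than a Poisson model allows.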

Special Case: Geometric Distribution

The geometric distribution is a special case of the negative binomial distribution when r = 1. In that case, the negative binomial distribution simplifies to counting the number of failures before the first success.
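
A short check of this reduction: with r = 1 the binomial coefficient C(x, 0) equals 1, so the PMF collapses to p * (1 - p)^x. The sketch below compares the two expressions for a range of x; the function names and the value p = 0.25 are illustrative assumptions.

```python
from math import comb

def neg_binom_pmf(x: int, r: int, p: float) -> float:
    """P(X = x) for x failures before the r-th success."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

def geometric_pmf(x: int, p: float) -> float:
    """Probability of x failures before the first success."""
    return p * (1 - p) ** x

p = 0.25
# Largest difference between the two PMFs over x = 0..49; expected to be ~0.
max_gap = max(abs(neg_binom_pmf(x, 1, p) - geometric_pmf(x, p)) for x in range(50))
print(max_gap)
```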

Example

Suppose you are rolling a fair die, and you define success as rolling a 6 (p = 1/6). What is the probability that the 3rd success occurs exactly on the 10th roll?

First, you must have had x = 7 failures before the 3rd success (since 10 – 3 = 7), and r = 3. Plug into the formula:

P(X = 7) = C(7 + 3 - 1, 3 - 1) * (1/6)^3 * (5/6)^7
     = C(9, 2) * (1/216) * (78125 / 279936)

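Since C(9, 2) = 36 and 216 * 279936 = 60466176, this gives:

P(X = 7) = 36 * 78125 / 60466176 = 78125 / 1679616 ≈ 0.0465

So there is roughly a 4.65% chance that the 3rd six appears exactly on the 10th roll.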
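
The arithmetic in this example can be verified with exact rational numbers, which avoids any floating-point rounding. This is only a sketch using Python's standard fractions module; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

r = 3               # successes required
x = 7               # failures before the 3rd success (10 rolls - 3 successes)
p = Fraction(1, 6)  # probability of rolling a 6 on a fair die

# C(x + r - 1, r - 1) * p^r * (1 - p)^x, evaluated exactly
prob = comb(x + r - 1, r - 1) * p**r * (1 - p) ** x

print(prob)         # exact fraction in lowest terms
print(float(prob))  # decimal approximation
```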

Applications

  • Modeling the number of accidents before a fixed number of safe days
  • Predicting the number of failed transactions before reaching a success quota
  • Call center analytics (e.g., number of calls before getting 5 successful sales)
  • Quality assurance and manufacturing defects tracking

Python Code Example

from scipy.stats import nbinom

r = 3        # number of successes
p = 1/6      # probability of success
x = 7        # number of failures

# Probability of exactly 7 failures before the 3rd success
prob = nbinom.pmf(x, r, p)
print(f"P(X = {x}) = {prob:.6f}")

# Mean and variance
mean = nbinom.mean(r, p)
var = nbinom.var(r, p)
print(f"Mean: {mean}, Variance: {var}")

Conclusion

The negative binomial distribution is a versatile and powerful tool in probability, especially useful for modeling events where multiple successes are required over a sequence of trials. It generalizes the geometric distribution and has wide applications in quality control, economics, public health, and more.

Understanding its structure, formula, and behavior allows analysts and statisticians to model uncertainty in a wide range of real-world processes where success isn’t guaranteed on the first few tries.

