Hyper Geometric Distribution
Hypergeometric Distribution
The Hypergeometric distribution models the number of successes in a sample drawn without replacement from a finite population.
Definition
Let $X \sim \text{Hypergeometric}(N, K, n)$, where:
- $N$ = population size
- $K$ = number of success states in the population
- $n$ = number of draws
Then:
\[P(X = k) = \frac{\binom{K}{k} \binom{N - K}{n - k}}{\binom{N}{n}}\]
Key Properties
Property | Value |
---|---|
Parameters | $N \in \mathbb{N},\ K \in \{0, 1, \dots, N\},\ n \in \{0, 1, \dots, N\}$ |
Support | $k \in \{\max(0, n - N + K), \dots, \min(n, K)\}$ |
Mean | $\mathbb{E}[X] = n \cdot \frac{K}{N}$ |
Variance | $\text{Var}(X) = n \cdot \frac{K}{N} \cdot \left(1 - \frac{K}{N}\right) \cdot \frac{N - n}{N - 1}$ |
Mode | $\left\lfloor \frac{(n + 1)(K + 1)}{N + 2} \right\rfloor$ |
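The formulas in the table can be checked numerically against the definition of the PMF. The following is a minimal sketch using only the standard library; the helper name `hypergeom_pmf` and the parameter values are illustrative assumptions, not part of any library.

```python
from math import comb, floor

def hypergeom_pmf(k, N, K, n):
    """P(X = k) for X ~ Hypergeometric(N, K, n), straight from the definition."""
    if k < max(0, n - N + K) or k > min(n, K):
        return 0.0  # k lies outside the support
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Illustrative parameters (an assumption for this example)
N, K, n = 50, 10, 5
support = range(max(0, n - N + K), min(n, K) + 1)
pmf = {k: hypergeom_pmf(k, N, K, n) for k in support}

# Moments and mode computed directly from the PMF
mean = sum(k * p for k, p in pmf.items())
var = sum((k - mean) ** 2 * p for k, p in pmf.items())
mode = max(pmf, key=pmf.get)

print("Sum of PMF:", sum(pmf.values()))
print("Mean:", mean, "vs formula:", n * K / N)
print("Variance:", var, "vs formula:", n * (K / N) * (1 - K / N) * (N - n) / (N - 1))
print("Mode:", mode, "vs formula:", floor((n + 1) * (K + 1) / (N + 2)))
```

Up to floating-point rounding, the values computed from the PMF should agree with the closed-form expressions.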
Real-World Examples
Quality Testing
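You draw 10 items from a shipment of 100, of which 20 are defective. The number of defective items in the sample follows $\text{Hypergeometric}(N = 100, K = 20, n = 10)$.
Card Games
The probability of getting exactly 2 hearts in a 5-card hand from a standard 52-card deck. Here $N = 52$, $K = 13$, $n = 5$, and $k = 2$.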
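Both examples can be worked out directly from the PMF and the mean formula. This is a small sketch with the standard library only; the variable names are illustrative, and the "no defective items" probability is an extra quantity added here for illustration.

```python
from math import comb

# Quality testing: N = 100 items, K = 20 defective, n = 10 drawn
expected_defective = 10 * 20 / 100                 # E[X] = n * K / N
p_no_defective = comb(80, 10) / comb(100, 10)      # P(X = 0)

# Card games: N = 52 cards, K = 13 hearts, n = 5 cards, k = 2 hearts
p_two_hearts = comb(13, 2) * comb(39, 3) / comb(52, 5)

print("Expected defective items:", expected_defective)
print("P(no defective items):", round(p_no_defective, 4))
print("P(exactly 2 hearts):", round(p_two_hearts, 4))
```

On average the sample contains 2 defective items, and a 5-card hand holds exactly 2 hearts a little over a quarter of the time.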
Notes
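- Sampling is without replacement, so the draws are dependent: each draw changes the composition of what remains.
- The binomial distribution does not apply exactly because the trials are not independent; it is a good approximation only when $n$ is small relative to $N$.
- Useful in finite-population sampling and acceptance testing.
- The variance is smaller than the binomial variance $n p (1 - p)$ with $p = K/N$, by the finite population correction factor $\frac{N - n}{N - 1}$.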
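To make the relationship with the binomial concrete, the sketch below compares the two PMFs at a single point while the population grows and the success fraction $K/N$ stays fixed. The helper names and the parameter values are assumptions chosen for illustration.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k) when drawing n items without replacement."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    """P(X = k) when drawing n items with replacement (independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def variance_ratio(N, n):
    """Hypergeometric variance divided by binomial variance (with p = K/N)."""
    return (N - n) / (N - 1)

n, k = 5, 2
for N, K in [(50, 10), (500, 100), (5000, 1000)]:
    p = K / N  # the same success fraction in every case
    print(
        f"N={N}: hypergeom={hypergeom_pmf(k, N, K, n):.5f}, "
        f"binom={binom_pmf(k, n, p):.5f}, "
        f"variance ratio={variance_ratio(N, n):.4f}"
    )
```

As $N$ grows with $n$ fixed, the variance ratio approaches 1 and the two probabilities move closer together, which is why the binomial is a reasonable stand-in for small samples from large populations.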
Python Example
from scipy.stats import hypergeom
# N = population size, K = successes in the population, n = number of draws
# (scipy names these arguments M, n, N, in the same positional order)
N, K, n = 50, 10, 5
rv = hypergeom(N, K, n)
# PMF at a specific value
print("P(X=2):", rv.pmf(2))
# Sample 10 values
print("Samples:", rv.rvs(size=10))
# Mean and variance
print("Mean:", rv.mean())
print("Variance:", rv.var())
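For intuition about what a single sample represents, the draw can also be simulated by hand. The sketch below uses only the standard library; `sample_hypergeom` is a hypothetical helper written for this example and says nothing about how scipy generates its samples internally.

```python
import random

def sample_hypergeom(N, K, n, rng):
    """Draw n items without replacement from N items, K of which are
    successes, and return the number of successes drawn."""
    population = [1] * K + [0] * (N - K)  # 1 = success, 0 = failure
    return sum(rng.sample(population, n))

rng = random.Random(0)  # fixed seed so the run is reproducible
N, K, n = 50, 10, 5
draws = [sample_hypergeom(N, K, n, rng) for _ in range(20_000)]

empirical_mean = sum(draws) / len(draws)
print("Empirical mean:", empirical_mean, "theoretical mean:", n * K / N)
```

With enough repetitions the empirical mean should settle close to $n \cdot \frac{K}{N}$.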
This post is licensed under CC BY 4.0 by the author.