Hypergeometric Distribution

Calculates the probability of exactly k successes in n draws, without replacement, from a finite population of size N that contains exactly K objects with the feature of interest (the "successes").

Practical Examples

Explore real-world scenarios to understand how the hypergeometric distribution is applied.

Drawing Aces in Poker

Poker Hand

What is the probability of drawing exactly 2 aces in a 5-card hand from a standard 52-card deck?

N: 52, K: 4

n: 5, k: 2

Defective Parts Inspection

Quality Control

A batch of 100 computer chips contains 10 defective ones. If you randomly select 8 chips for inspection, what's the chance of finding exactly 1 defective chip?

N: 100, K: 10

n: 8, k: 1

Fish Population Study

Ecology

In a pond with 200 fish, 50 are tagged. If a researcher catches 20 fish, what is the probability that exactly 5 of them are tagged?

N: 200, K: 50

n: 20, k: 5

Lottery Ticket

Lottery

In a lottery, 6 numbers are drawn from 49, and each ticket lists 6 numbers. To win a prize, you must match at least 3 of the drawn numbers. If you buy one ticket, what is the probability of matching exactly 3 numbers?

N: 49, K: 6

n: 6, k: 3

Understanding the Hypergeometric Distribution: A Comprehensive Guide
Dive deep into the principles, applications, and calculations of the hypergeometric distribution, a key concept in statistics for sampling without replacement.

What is the Hypergeometric Distribution?

  • Core Concept: Sampling Without Replacement
  • Distinguishing from Binomial Distribution
  • Key Parameters of the Distribution
The hypergeometric distribution is a discrete probability distribution that describes the probability of k successes (random draws for which the object drawn has a specified feature) in n draws, without replacement, from a finite population of size N that contains exactly K objects with that feature. This is in contrast to the binomial distribution, which describes the probability of k successes in n draws with replacement.
Why 'Without Replacement' Matters
The key distinction is that the probability of success changes with each draw. For example, if you draw a card from a deck and don't put it back, the probability of drawing an ace on the second draw is different from the first. The hypergeometric distribution accounts for these changing probabilities.
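The card example can be checked with exact fractions. This is a minimal sketch that assumes a standard 52-card deck with 4 aces; the variable names are illustrative only.

```python
from fractions import Fraction

# Standard 52-card deck with 4 aces (the card example from the text).
deck_size, aces = 52, 4

# First draw: 4 aces among 52 cards.
p_first = Fraction(aces, deck_size)

# Second draw, given the first card was an ace and was not put back.
p_second_after_ace = Fraction(aces - 1, deck_size - 1)

# Second draw, given the first card was not an ace.
p_second_after_other = Fraction(aces, deck_size - 1)

print(p_first)               # 1/13
print(p_second_after_ace)    # 1/17
print(p_second_after_other)  # 4/51
```

With replacement, all three values would equal 1/13, which is the constant-probability situation the binomial distribution assumes.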

Key Differences

  • Hypergeometric: Population is finite, and sampling is done without replacement.
  • Binomial: Trials are independent, and the probability of success remains constant (sampling with replacement).

Step-by-Step Guide to Using the Hypergeometric Distribution Calculator

  • Inputting Your Data Correctly
  • Interpreting the Probability Results
  • Understanding the Statistical Metrics
Using the calculator is straightforward. You need four key pieces of information:

  • Population Size (N): The total number of items you are drawing from.
  • Successes in Population (K): The total number of items with the desired characteristic.
  • Sample Size (n): How many items you draw.
  • Successes in Sample (k): The specific number of successful items you are interested in.

Decoding the Output
The calculator provides several outputs. 'P(X=k)' is the exact probability for your specified number of successes. The cumulative probabilities (e.g., P(X≤k)) tell you the chance of getting 'at most' k successes. The mean is the expected number of successes in the sample, while the variance and standard deviation measure how much that count varies from one sample to the next.
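The outputs described above can be reproduced with a few lines of Python. This is a sketch under stated assumptions, not the calculator's actual implementation: the function names are hypothetical, and the mean and variance use the standard closed forms for the hypergeometric distribution, mean = nK/N and variance = n(K/N)(1 - K/N)(N - n)/(N - 1).

```python
from math import comb, sqrt

def hypergeom_pmf(k, N, K, n):
    """P(X = k): exactly k successes in n draws without replacement."""
    # Outside the feasible range the probability is zero.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def hypergeom_cdf(k, N, K, n):
    """P(X <= k): at most k successes."""
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k + 1))

def hypergeom_moments(N, K, n):
    """Return (mean, variance, standard deviation)."""
    mean = n * K / N
    variance = n * (K / N) * (1 - K / N) * (N - n) / (N - 1)
    return mean, variance, sqrt(variance)

# Chip inspection example: N=100, K=10, n=8, k=1.
print(hypergeom_pmf(1, 100, 10, 8))   # P(X = 1)
print(hypergeom_cdf(1, 100, 10, 8))   # P(X <= 1)
print(hypergeom_moments(100, 10, 8))  # mean, variance, standard deviation
```

The factor (N - n)/(N - 1) in the variance is the finite population correction; it shrinks the spread relative to the binomial case as the sample becomes a larger share of the population.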

Real-World Applications of the Hypergeometric Distribution

  • Quality Control in Manufacturing
  • Ecological and Population Studies
  • Games of Chance and Card Games
The hypergeometric distribution is not just an academic concept; it has numerous practical applications.
Manufacturing and Quality Control
Imagine a factory produces a batch of 1,000 light bulbs, and 50 are defective. An inspector randomly selects 100 bulbs. The hypergeometric distribution can calculate the probability that the sample contains exactly 5 defective bulbs, helping the company decide if the entire batch should be rejected.
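Ecology and Population Studies
Biologists use it for capture-recapture methods to estimate animal population sizes. Suppose they capture, tag, and release 100 deer in a forest, then later recapture 50 and find that 10 of them are tagged. If the tagged share of the recapture (10 of 50, or 20%) mirrors the tagged share of the whole herd, the 100 tagged deer make up about 20% of the population, which puts the total at roughly 500 deer.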
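The deer example can be tied back to the hypergeometric model by asking how likely the observed recapture (10 tagged out of 50) would be under different candidate population sizes. The sketch below is illustrative: the source does not name an estimator, so the proportional estimate (often called the Lincoln-Petersen estimate) and the candidate sizes 300, 500, and 800 are assumptions made for the demonstration.

```python
from math import comb

def prob_tagged(N, tagged=100, recaptured=50, tagged_in_recapture=10):
    """Hypergeometric probability of the observed recapture if the
    population holds N animals, `tagged` of which carry a tag."""
    untagged = N - tagged
    return (comb(tagged, tagged_in_recapture)
            * comb(untagged, recaptured - tagged_in_recapture)
            / comb(N, recaptured))

# Proportional estimate: tagged * recaptured / tagged_in_recapture.
estimate = 100 * 50 // 10
print(estimate)  # 500

# Compare how well a few candidate population sizes explain the data.
for N in (300, 500, 800):
    print(N, prob_tagged(N))
```

Among these candidates, the observed recapture is most probable when N equals the proportional estimate, which is why the simple ratio gives a sensible population figure.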

Mathematical Derivation and Formula

  • The Role of Combinations
  • Breaking Down the Formula
  • Calculating the Mean and Variance
The power of the hypergeometric distribution comes from its foundation in combinatorics, the mathematics of counting.
The Formula Explained
P(X=k) = [ C(K, k) * C(N-K, n-k) ] / C(N, n)

  • C(K, k): The number of ways to choose k successes from the K available successes in the population.
  • C(N-K, n-k): The number of ways to choose the remaining n-k items (failures) from the N-K failures in the population.
  • C(N, n): The total number of ways to choose a sample of size n from the entire population of size N.

Essentially, the formula calculates the ratio of the number of ways to get the desired outcome to the total number of possible outcomes.
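As a worked instance of the formula, the poker example (N=52, K=4, n=5, k=2) can be evaluated term by term. This is a minimal sketch using Python's built-in math.comb; the function name hypergeom_pmf is illustrative.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X=k) = C(K, k) * C(N-K, n-k) / C(N, n)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Exactly 2 aces in a 5-card hand.
favorable = comb(4, 2) * comb(48, 3)  # 6 ways to pick the aces * 17296 ways to pick the other cards
total = comb(52, 5)                   # 2598960 possible 5-card hands

print(favorable)                  # 103776
print(total)                      # 2598960
print(round(favorable / total, 4))  # 0.0399
```

So roughly 4% of all 5-card hands contain exactly two aces.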

Common Misconceptions and Correct Methods

  • Confusing Hypergeometric with Binomial
  • The '10% Rule' Guideline
  • Avoiding Common Input Errors
A primary point of confusion is when to use the hypergeometric versus the binomial distribution. The choice hinges on whether the sampling is done with or without replacement.
When Can You Approximate with Binomial?
The two distributions are technically different, but if the sample size (n) is less than about 10% of the population size (N), the probability of success changes very little from one draw to the next. In such cases, the binomial distribution with success probability p = K/N can serve as a reasonable and simpler approximation. However, for accuracy, especially with smaller populations, the hypergeometric model is the correct choice.
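The quality of the binomial approximation can be inspected numerically. The sketch below compares the two probabilities for the chip inspection example (N=100, K=10, n=8, k=1) and for a hypothetical batch that is 100 times larger with the same defect rate; the larger batch and the helper names are assumptions introduced for illustration.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """Exact probability when sampling without replacement."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    """Probability when each draw succeeds independently with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def approximation_gap(k, N, K, n):
    """Absolute difference between the exact and approximate probability."""
    return abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, K / N))

# Sample is 8% of the population.
small_population_gap = approximation_gap(1, 100, 10, 8)

# Same defect rate and sample size, but the sample is only 0.08% of the population.
large_population_gap = approximation_gap(1, 10_000, 1_000, 8)

print(small_population_gap)
print(large_population_gap)
```

The gap shrinks as the sample becomes a smaller share of the population, which is the intuition behind the 10% guideline.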
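Ensure your inputs are logical. For instance, the number of successes in the sample (k) cannot be larger than the sample size (n) or the total number of successes in the population (K). Likewise, neither K nor n can exceed the population size (N), and k cannot be smaller than n - (N - K), because the sample cannot contain more failures than the population holds.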
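These consistency checks are easy to automate. The function below is a hypothetical helper, not part of the calculator; it returns a list of problems so that several input errors can be reported at once.

```python
def validate_inputs(N, K, n, k):
    """Return a list of problems with the inputs; an empty list means they are consistent."""
    problems = []
    if min(N, K, n, k) < 0:
        problems.append("all inputs must be non-negative")
    if K > N:
        problems.append("K cannot exceed N")
    if n > N:
        problems.append("n cannot exceed N")
    if k > min(n, K):
        problems.append("k cannot exceed n or K")
    if k < n - (N - K):
        # Drawing n items with only k successes needs n - k failures,
        # but the population contains just N - K of them.
        problems.append("k is too small for this sample size")
    return problems

print(validate_inputs(52, 4, 5, 2))  # []
print(validate_inputs(52, 4, 5, 6))  # k is larger than both n and K
print(validate_inputs(10, 8, 5, 1))  # only 2 failures exist, so at least 3 successes are forced
```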