Summary of "Probability | Binomial Normal & Poisson distribution | Null Hypothesis | Type 1 error & Type 2 error"
Overview
This lecture introduces basic probability, three core probability distributions (Bernoulli/binomial, Poisson, normal), sampling concepts, hypothesis testing, Type I/Type II errors, and the standard error of the mean. Simple examples (coin toss, dice, balls in a bag, defective products) illustrate the concepts.
Probability — definition & basic rule
- Probability is the chance that an event will occur. Notation: P(event).
- Basic formula: P(event) = (number of favorable outcomes) / (total number of outcomes).
- Examples:
- Coin toss: P(head) = 1/2
- Fair die: P(even) = 3/6 = 1/2
- Bag with 5 black and 10 white balls: P(black) = 5/15 = 1/3
- Range: 0 ≤ P ≤ 1
- Two broad types: discrete and continuous probability distributions
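The basic rule above can be checked with a short sketch; the coin, die, and ball counts are the lecture's own examples, and `Fraction` keeps the answers exact:

```python
from fractions import Fraction

def probability(favorable: int, total: int) -> Fraction:
    """P(event) = favorable outcomes / total outcomes."""
    return Fraction(favorable, total)

print(probability(1, 2))    # coin toss: P(head) = 1/2
print(probability(3, 6))    # fair die: P(even) = 1/2
print(probability(5, 15))   # 5 black + 10 white balls: P(black) = 1/3
```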
Discrete vs continuous probability distributions
- Discrete distributions: outcomes are whole counts (0, 1, 2, …). Examples: Bernoulli/binomial, Poisson.
- Continuous distributions: outcomes vary continuously. Example: normal distribution.
Binomial (Bernoulli) distribution
When to use:
- Fixed number of independent trials (n)
- Each trial has exactly two outcomes (success/failure)
- Probability of success p is constant across trials
Conditions:
- Fixed n (finite)
- Mutually exclusive outcomes per trial
- Independent trials
- Constant probability p across trials
PMF:
P(X = r) = C(n, r) p^r (1 − p)^(n − r), where C(n, r) = n! / (r!(n − r)!)
Parameters and summary measures:
- Mean: E[X] = n p
- Variance: Var(X) = n p (1 − p)
- Standard deviation: SD = sqrt(n p (1 − p))
Graphical shape:
- p = 0.5 → symmetric
- p < 0.5 → positively skewed
- p > 0.5 → negatively skewed
Typical use: count of successes out of a fixed number of trials (e.g., number of heads in 10 coin tosses).
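A minimal sketch of the PMF and summary measures above, using the lecture's example of heads in 10 fair coin tosses (`math.comb` supplies C(n, r)):

```python
from math import comb, sqrt

def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) = C(n, r) p^r (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.5                      # 10 tosses of a fair coin
print(binomial_pmf(5, n, p))        # P(exactly 5 heads) = 252/1024 ≈ 0.246
print(n * p)                        # mean = np = 5.0
print(sqrt(n * p * (1 - p)))        # SD = sqrt(np(1 - p)) ≈ 1.58
```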
Poisson distribution
When to use:
- Modeling counts of rare events over large numbers of trials or continuous space/time
- Typical situation: n large, p small, mean m = n p moderate
PMF:
P(X = x) = e^(−m) m^x / x!, for x = 0, 1, 2, ...
Parameter and summary measures:
- Mean: E[X] = m (often m = n p)
- Variance: Var(X) = m (mean equals variance)
- SD: sqrt(m)
Shape:
- Positively skewed when m is small; approaches a normal shape as m increases
Typical use: rare defects in manufacturing, rare failures, rare occurrences per unit time/area.
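A sketch of the Poisson PMF for the manufacturing-defect setting above; the specific numbers (n = 1000 items, defect rate p = 0.002, so m = np = 2) are illustrative, not from the lecture:

```python
from math import exp, factorial

def poisson_pmf(x: int, m: float) -> float:
    """P(X = x) = e^(-m) m^x / x!"""
    return exp(-m) * m**x / factorial(x)

m = 1000 * 0.002                 # n large, p small -> m = np = 2
print(poisson_pmf(0, m))         # P(no defects) = e^-2 ≈ 0.135
print(poisson_pmf(2, m))         # P(exactly 2 defects) ≈ 0.271
```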
Normal distribution
Type: continuous distribution, widely used.
Key properties:
- Bell-shaped, symmetric about the mean μ
- Unimodal: mean = median = mode = μ
- Spread determined by standard deviation σ (parameters μ and σ)
- Domain: (−∞, +∞); total area under curve = 1 (50% on each side of mean)
- Tails are asymptotic; inflection points at μ ± σ
Typical applications: modeling continuous variables (height, weight, measurement errors)
Historical contributors: de Moivre, Laplace, Gauss
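The symmetry and peak-at-the-mean properties above can be illustrated with the normal density; the height figures (μ = 170 cm, σ = 10 cm) are made up for illustration:

```python
from math import exp, pi, sqrt

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Bell-shaped density with parameters mu (mean) and sigma (SD)."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

mu, sigma = 170.0, 10.0  # illustrative height data, cm
# Symmetric about mu: equal density one SD below and one SD above the mean
print(normal_pdf(mu - sigma, mu, sigma) == normal_pdf(mu + sigma, mu, sigma))  # True
# Unimodal: the density peaks at x = mu
print(normal_pdf(mu, mu, sigma) > normal_pdf(mu + 1, mu, sigma))               # True
```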
Sampling: population vs sample, sample size & types
Definitions:
- Population: the complete set of items/individuals of interest
- Sample: a subset of the population selected for measurement/analysis
Purpose of sampling:
- Estimate population parameters, test hypotheses, run pilot studies, conserve resources
Sample size:
- Large sample: commonly > 30 observations
- Advantages: more accurate estimates, reduced sampling error, greater precision
- Small sample: < 30 observations
- Disadvantages: larger sampling error, less precision, more sensitive to bias/outliers
- Choice depends on available resources and required precision; pilot studies often use small samples
Probability (random) sampling methods:
- Simple random sampling
- Stratified sampling (divide population into subgroups and sample from each)
- Systematic sampling (select every k-th individual)
- Cluster sampling (select clusters, then sample within)
- Multi-stage sampling (combine methods across stages)
Non-probability sampling methods:
- Convenience sampling (easy-to-access individuals)
- Snowball sampling (participants refer others)
- Purposive / judgmental sampling (select based on specific criteria)
- Quota sampling (fixed quotas for subgroups)
- Accidental sampling (chance encounters)
Good sample characteristics: representative, reliable, free from bias, sufficiently large
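Two of the probability-sampling methods above, sketched on a made-up population of 100 IDs (the population and sample size are assumptions for illustration):

```python
import random

random.seed(42)                       # for reproducibility
population = list(range(1, 101))      # illustrative population of 100 IDs
n = 10

# Simple random sampling: every subset of size n is equally likely
srs = random.sample(population, n)

# Systematic sampling: every k-th individual after a random start
k = len(population) // n              # sampling interval k = 10
start = random.randrange(k)
systematic = population[start::k]

print(sorted(srs))
print(systematic)                     # evenly spaced IDs, k apart
```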
Hypothesis testing — null and alternative hypotheses
- Hypothesis: tentative statement about a population parameter or relationship
- Null hypothesis (H0): statement of no effect/no relationship (status quo); the hypothesis being tested
- Alternative hypothesis (H1 or Ha): statement that there is an effect/relationship (complement of H0)
- Test outcomes: reject H0 or fail to reject H0 (if H0 is rejected, H1 is supported)
Type I and Type II errors
- Type I error (α): rejecting a true H0 (false positive). Also called alpha error or error of the first kind.
- Type II error (β): failing to reject a false H0 (false negative). Also called beta error or error of the second kind.
- Trade-offs:
- Lowering α makes the test more conservative (fewer Type I errors) but typically increases β (more Type II errors).
- Increasing sample size and test power reduces Type II errors.
- Power = 1 − β (probability of correctly rejecting a false H0).
- Practical examples: incorrectly rejecting an effective drug (Type I) vs incorrectly accepting an ineffective drug (Type II).
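A Monte Carlo sketch of the two error rates, under assumptions I chose for illustration: H0 is "the coin is fair", and the test rejects when 100 tosses give 39 or fewer, or 61 or more, heads (a two-sided cutoff with α ≈ 0.035):

```python
import random

random.seed(0)

def reject_h0(n_heads: int) -> bool:
    """Reject H0 'p = 0.5' on an extreme head count out of 100 tosses."""
    return n_heads <= 39 or n_heads >= 61

def rejection_rate(p: float, trials: int = 10_000) -> float:
    """Fraction of experiments (100 tosses each) in which H0 is rejected."""
    return sum(
        reject_h0(sum(random.random() < p for _ in range(100)))
        for _ in range(trials)
    ) / trials

type1 = rejection_rate(0.5)   # H0 true: rejection = Type I error, ≈ alpha
power = rejection_rate(0.6)   # H0 false: rejection rate = power = 1 - beta
print(type1, power)
```

Widening the rejection region lowers α but raises β; a larger number of tosses per experiment raises power, matching the trade-offs listed above.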
Standard error of the mean (SE)
- Definition: the standard deviation of the sampling distribution of the sample mean; measures variability of the sample mean around the population mean
- Notation: SE (or S.E. of mean)
- Formulas:
  - If population SD σ is known: SE = σ / sqrt(n)
  - If σ unknown (estimate using sample SD s): SE ≈ s / sqrt(n)
- Interpretation:
  - Smaller SE → sample mean likely closer to population mean
  - SE decreases as sample size n increases
  - If sample size is very large relative to the population, SE approaches zero
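A minimal sketch of the σ-unknown case, SE ≈ s / sqrt(n); the measurement values are invented for illustration:

```python
from math import sqrt
from statistics import stdev

def standard_error(sample: list[float]) -> float:
    """SE ~= s / sqrt(n), using the sample SD s when sigma is unknown."""
    return stdev(sample) / sqrt(len(sample))

sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0]  # illustrative measurements
print(standard_error(sample))            # small SE: mean is a tight estimate
```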
Formulas & quick reference
- Basic probability: P(A) = favorable outcomes / total outcomes
- Binomial: P(X = r) = C(n, r) p^r (1 − p)^(n − r)
  - Mean = n p
  - Variance = n p (1 − p)
  - SD = sqrt(n p (1 − p))
- Poisson: P(X = x) = e^(−m) m^x / x!
  - Mean = m
  - Variance = m
  - SD = sqrt(m)
- Normal: parameters μ (mean) and σ (SD); bell-shaped PDF centered at μ; area under curve = 1
- Standard error of mean: SE = σ / sqrt(n), or SE ≈ s / sqrt(n)
Practical examples emphasized
- Coin toss, die roll, drawing colored balls
- Manufacturing defects (Poisson example), battery problems (rare failures)
- Exam/course contexts: B.Pharm, BSc, CSIR NET
- Use of pilot studies
Takeaway learning points
- Know when to use binomial vs Poisson vs normal distributions
- Remember the binomial conditions: fixed n, independent trials, two outcomes, constant p
- Poisson models rare events; mean = variance is characteristic
- Normal distribution is central for continuous variables and many approximations (e.g., CLT)
- Sampling design matters: choose an appropriate method; larger samples give more accurate estimates but require more resources
- Hypothesis testing always evaluates H0; understand Type I/II errors and their trade-offs; increase sample size and power to reduce Type II errors
- Compute and interpret the standard error to assess how well a sample mean estimates the population mean
Speakers / sources (as identified in subtitles)
- Primary speaker: the (unnamed) lecturer (referred to as “Sir”)
- Historical mathematicians mentioned:
- James (Jacob) Bernoulli — Bernoulli / binomial distribution
- Abraham de Moivre — early developer of the normal distribution
- Pierre‑Simon Laplace — contributor to normal distribution theory
- Carl Friedrich Gauss — contributor/rediscoverer of the normal distribution
- Other referenced resources:
- “Depth of Biology” application/playlist/channel (lecturer’s resource)
- Course/exam contexts: CSIR NET, BBA, MBA, BCA, B.Pharm, biostatistics courses