Summary of "Correlation | Karl Pearson’s coefficient of correlation | Multiple correlation | Biostatistics"

Topic

Correlation — definition and types — and methods to calculate Karl Pearson’s coefficient of correlation; brief note on multiple correlation.

Correlation measures the degree (strength and direction) of association between two (or more) variables.
Types of correlation: positive, negative, linear, curvilinear, and multiple correlation (among three or more variables).
Two practical methods to compute Pearson’s r: the Actual‑Mean Method and the Assumed‑Mean Method.
Use the Assumed‑Mean Method when the sample means are inconvenient decimals.
Example worked in the lecture yields r ≈ 0.961–0.962 (very strong positive linear correlation).

Correlation (co‑relation): a measure of how closely two variables are related; it quantifies the degree of relationship/association.
Positive correlation: both variables move in the same direction (one increases → the other tends to increase).
Negative correlation: variables move in opposite directions (one increases → the other tends to decrease).
Linear correlation: the ratio of change between two variables remains approximately constant (points lie roughly on a straight line).
Curvilinear correlation: the ratio of change does not remain constant (relationship is non‑linear).
Multiple correlation: association among three or more variables; typically one variable is treated as dependent and the others as independent.
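
The types above can be illustrated with a small sketch. The data are hypothetical (not from the lecture) and the helper name pearson_r is an assumption made for illustration. It shows that Pearson's r captures only linear association: a perfectly curvilinear relationship (a symmetric parabola here) can still give r = 0.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data chosen only to illustrate each type
xs = [-3, -2, -1, 0, 1, 2, 3]
positive = [2 * x + 1 for x in xs]    # rises with x: perfect positive linear
negative = [-2 * x + 1 for x in xs]   # falls as x rises: perfect negative linear
curved = [x * x for x in xs]          # symmetric parabola: curvilinear

print("positive:", pearson_r(xs, positive))   # 1.0
print("negative:", pearson_r(xs, negative))   # -1.0
print("curved:  ", pearson_r(xs, curved))     # 0.0
```

A value of r near zero therefore does not by itself rule out a relationship; it only rules out a linear one.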

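When to use (Actual‑Mean Method): when x̄ and ȳ come out as convenient values (for example whole numbers), so that deviations from the means are easy to compute.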

Formula (Actual‑Mean Method, using deviations from the actual means):

r = Σ[(xi − x̄)(yi − ȳ)] / sqrt[ Σ(xi − x̄)² × Σ(yi − ȳ)² ]
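
The formula above can be sketched in Python as follows. This is a minimal illustration, not the lecture's worked example: the function name pearson_r_actual_mean and the data are assumptions, and the intermediate sums mirror the columns one would tabulate by hand.

```python
from math import sqrt

def pearson_r_actual_mean(xs, ys):
    """Pearson's r using deviations from the actual means (illustrative helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    dx = [x - x_bar for x in xs]              # xi - x̄
    dy = [y - y_bar for y in ys]              # yi - ȳ
    sum_dxdy = sum(u * v for u, v in zip(dx, dy))
    sum_dx2 = sum(u * u for u in dx)
    sum_dy2 = sum(v * v for v in dy)
    denom = sqrt(sum_dx2 * sum_dy2)
    if denom == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sum_dxdy / denom

# Hypothetical data (not the lecture's example); means are x̄ = 3 and ȳ = 4
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r_actual_mean(xs, ys)
print(round(r, 4))   # 0.7746, i.e. 6 / sqrt(10 * 6)
```

Here Σ(dx·dy) = 6, Σdx² = 10 and Σdy² = 6, so r = 6 / sqrt(60) ≈ 0.7746.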

Steps:

Worked example (from lecture):

When to use (Assumed‑Mean Method):

Use when x̄ or ȳ would be awkward decimals or to simplify arithmetic by centering on convenient values.

Basic idea:

Choose convenient constants a (for x) and b (for y) near central values; compute coded deviations dx = xi − a and dy = yi − b and work with their sums.

Formula (using coded deviations):

r = [ n·Σ(dx·dy) − (Σdx)(Σdy) ] / sqrt{ [ n·Σ(dx²) − (Σdx)² ] × [ n·Σ(dy²) − (Σdy)² ] }
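
A minimal Python sketch of the coded-deviation formula above. The function name pearson_r_assumed_mean, the data, and the choices a = 20 and b = 40 are assumptions for illustration only (not the lecture's example). The data are picked so that the actual means are awkward (x̄ = 20.5, ȳ ≈ 40.83).

```python
from math import sqrt

def pearson_r_assumed_mean(xs, ys, a, b):
    """Pearson's r from deviations about assumed means a and b (illustrative helper)."""
    n = len(xs)
    dx = [x - a for x in xs]                  # dx = xi - a
    dy = [y - b for y in ys]                  # dy = yi - b
    s_dx, s_dy = sum(dx), sum(dy)
    s_dxdy = sum(u * v for u, v in zip(dx, dy))
    s_dx2 = sum(u * u for u in dx)
    s_dy2 = sum(v * v for v in dy)
    num = n * s_dxdy - s_dx * s_dy
    den = sqrt((n * s_dx2 - s_dx ** 2) * (n * s_dy2 - s_dy ** 2))
    return num / den

# Hypothetical data with inconvenient means
xs = [12, 15, 19, 22, 25, 30]
ys = [31, 34, 40, 41, 47, 52]

r_coded = pearson_r_assumed_mean(xs, ys, a=20, b=40)
r_other = pearson_r_assumed_mean(xs, ys, a=22, b=41)  # a different assumed pair
print(r_coded, r_other)
```

The two printed values agree: r does not depend on which constants a and b are chosen, so they can be picked purely for arithmetic convenience.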

Steps:

Tips:

Multiple correlation refers to the association among three or more variables.
In a three‑variable case, typically one variable (e.g., z) is treated as dependent and the others (x, y) as independent predictors.
The multiple correlation coefficient can be computed with standard formulas or with matrix/regression approaches; the exact formula depends on which variable is treated as dependent.
Practical approach: treat one variable as dependent and compute its relationship with the set of independents using regression or matrix methods.
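
For the three-variable case, one standard textbook formula builds the multiple correlation coefficient of z on x and y from the three pairwise Pearson coefficients: R² = (r_zx² + r_zy² − 2·r_zx·r_zy·r_xy) / (1 − r_xy²). The sketch below uses that formula; it is not necessarily the one presented in the lecture, and the function names and data are hypothetical.

```python
from math import sqrt

def pearson_r(us, vs):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    suu = sum((u - mu) ** 2 for u in us)
    svv = sum((v - mv) ** 2 for v in vs)
    return suv / sqrt(suu * svv)

def multiple_r(zs, xs, ys):
    """Multiple correlation of dependent z on independents x and y."""
    r_zx = pearson_r(zs, xs)
    r_zy = pearson_r(zs, ys)
    r_xy = pearson_r(xs, ys)
    if abs(r_xy) >= 1:
        raise ValueError("x and y are perfectly collinear; R is not defined by this formula")
    r_squared = (r_zx ** 2 + r_zy ** 2 - 2 * r_zx * r_zy * r_xy) / (1 - r_xy ** 2)
    # Clamp to [0, 1] to guard against floating-point rounding
    return sqrt(min(1.0, max(0.0, r_squared)))

# Hypothetical data: z is the dependent variable; x and y are the independents
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 6]
zs = [3, 4, 6, 8, 10]
print(multiple_r(zs, xs, ys))
```

R lies between 0 and 1 and is never smaller than the absolute value of either pairwise coefficient r_zx or r_zy.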

Always transcribe given data carefully; transcription errors cause wrong results.
If one method (Actual‑Mean or Assumed‑Mean) is inconvenient for the given data, switch to the other; both give the same value of r.
Memorize the formula and practice example problems for fluency.
This topic commonly appears in courses such as BBA, BCA, B.Tech, MBA, and B.Pharmacy biostatistics (unit 1).

Resources recommended by the lecturer: the “Depth of Biology” application (Play Store) and its associated website.
Lecturer encouraged watching unit‑wise videos, downloading notes from the app, and asking questions in comments.