Applied Statistics — STAT 2 · CvSU · BSIT

The whole course,
in one sitting.

Nine units, stripped to what matters. Every idea gets a plain definition, the formula, a memory hook, and an example that everyone could follow. Obsessed with what's on the exam. Allergic to filler. The course funnels toward hypothesis testing (the engine) and ANOVA (the final boss) — pace yourself there.

Units

Nine

Contact hours

Major exams

Reference

Walpole

01 / Foundations

Introduction to Statistics

Statistics is the science of collecting, organizing, analyzing, and interpreting data to make decisions under uncertainty. Two halves: descriptive (summarize what you have) and inferential (use a sample to make claims about a population).

The four words you must never mix up

Population — the entire group you care about. Sample — the slice you actually measure.
Parameter — a number describing the population (the truth, usually unknown). Statistic — a number from your sample (your estimate).

🧠

Mnemonic

Population = the whole Pizza 🍕; Sample = one Slice. Parameter ↔ Population, Statistic ↔ Sample. Greek letters (μ, σ) = the truth; Roman letters (x̄, s) = your guess.

Types of variables

Qualitative (categorical): names/labels. Quantitative (numeric): amounts.
Discrete: countable jumps (kids in a class). Continuous: any value on a scale (weight, time).
Levels of measurement: Nominal → Ordinal → Interval → Ratio (low to high info).

🧠

Mnemonic

Levels = NOIR (Nominal, Ordinal, Interval, Ratio). Discrete = Dots you count; Continuous = a Curve you measure.

Summation notation

Sigma just means "add them all up"

Σx = x₁ + x₂ + x₃ + … + xₙ

If x = {2, 4, 6}, then Σx = 12 and Σx² = 4 + 16 + 36 = 56. (Square first, then add.)
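
The same sums can be checked in a few lines of Python. This is only a sketch for self-checking (the variable names are my own), and it also shows the classic trap: Σx² is not the same as (Σx)².

```python
# Values from the example above
x = [2, 4, 6]

sum_x = sum(x)                      # Σx: add the values
sum_x_sq = sum(v ** 2 for v in x)   # Σx²: square each value first, then add
square_of_sum = sum_x ** 2          # (Σx)²: add first, then square the total

print(sum_x, sum_x_sq, square_of_sum)
```

Both Σx² and (Σx)² show up together in the variance and correlation formulas later, so it pays to keep them apart now.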

📐 Picture this

You want the average height of all Grade-6 kids in the Philippines — the population, too many to measure. So you measure 100 kids (your sample). The true average of every kid = parameter. Your 100-kid average = statistic. Statistics is the art of trusting the 100 to speak for the millions.

02 / Handling Data

Collection & Presentation of Data

Garbage in, garbage out — how you collect data decides whether any of the later math means anything.

Ways to collect

Direct (interview), Indirect (questionnaire), Registration (records), Observation, Experiment.

Ways to present

Textual = words in a paragraph. Tabular = rows & columns. Graphical = charts (bar, pie, histogram).

Frequency Distribution Table (FDT)

Raw scores are noise. An FDT groups them into classes so a pattern appears.

Class width

Width = Range ÷ Number of classes, where Range = Highest − Lowest

Always round the width up. Rule of thumb: 5–15 classes.

🧠

Mnemonic

Build an FDT in order: R-C-W-T — find the Range, decide Classes, get the Width, then Tally.

📐 Picture this

Thirty kids took a test. Listing all 30 scores is a wall of numbers. Instead group them: 0–10 → 2, 11–20 → 5, 21–30 → 9… Now you can see the class bunches up in the middle. Draw bars over those groups and you've got a histogram — same data, readable at a glance.
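
The R-C-W-T routine can be sketched in Python. The 30 scores below are made up purely for illustration, and choosing 6 classes is an assumption within the 5–15 rule of thumb.

```python
import math
from collections import Counter

# Hypothetical scores for 30 students (illustrative only)
scores = [3, 8, 12, 14, 15, 17, 19, 21, 22, 23, 24, 25, 26, 27, 28,
          31, 32, 33, 34, 35, 36, 38, 41, 42, 44, 45, 47, 52, 55, 58]

low = min(scores)
data_range = max(scores) - low                # R: highest minus lowest
num_classes = 6                               # C: assumed, within the 5-15 rule of thumb
width = math.ceil(data_range / num_classes)   # W: always round up

# T: tally each score into its class, counting classes from the lowest score.
# Caveat: if the range divides evenly by the class count, the top score would
# fall just past the last class, so widen by 1 or add a class in that case.
tally = Counter((s - low) // width for s in scores)

for i in range(num_classes):
    lower = low + i * width
    upper = lower + width - 1
    print(f"{lower}-{upper}: {tally.get(i, 0)}")
```

The printed counts are exactly the bar heights of the histogram.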

03 / The Center

Central Tendency & Location

One number to represent the "center." Three flavors — they agree when data is symmetric, and disagree when it's skewed (the interesting case).

The big three

Mean: x̄ = Σx ÷ n

Median: the middle value once data is sorted

Mode: the value that appears most often

Median position = (n + 1) ÷ 2 → the seat number, not the score sitting in it.

🧠

Mnemonic

All start with M: Mean = add & divide · Median = the Middle (sort first!) · Mode = the Most. And: the Mean is a people-pleaser — one billionaire drags it up. The Median doesn't care.

Measures of location ("non-central")

Quartiles (Q1, Q2, Q3) cut sorted data into 4 equal parts — Q2 is the median.
Deciles = 10 parts, Percentiles = 100 parts. Same idea, finer cuts.

📐 Picture this

Five kids' scores: 3, 5, 5, 7, 10.

Mean = (3+5+5+7+10) ÷ 5 = 30 ÷ 5 = 6
Median = middle of sorted list = 5
Mode = appears most often = 5

The "average" student is around 5 to 6. If a sixth kid scored 100, the mean jumps to about 21.7 (130 ÷ 6, misleading!) but the median barely moves, from 5 to 6. That's when you report the median.
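
A short Python check of the same numbers, using the standard-library `statistics` module. It is a sketch for verifying hand work, not part of the course method.

```python
from statistics import mean, median, mode

scores = [3, 5, 5, 7, 10]             # the five scores from the example
print(mean(scores), median(scores), mode(scores))

with_outlier = scores + [100]         # a sixth kid scores 100
print(round(mean(with_outlier), 2), median(with_outlier))
```

The outlier drags the mean far more than the median, which is the whole argument for reporting the median on skewed data.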

04 / The Spread

Dispersion & Skewness

The center tells you where; dispersion tells you how spread out. Two datasets can share a mean and tell completely different stories.

Spread, from lazy to powerful

Range = Highest − Lowest

Sample variance: s² = [ Σx² − (Σx)² ÷ n ] ÷ (n − 1)

Standard deviation: s = √s²

Coefficient of variation: CV = ( s ÷ x̄ ) × 100%

Use the computational form (Σx², (Σx)²) — far fewer keystrokes. Carry decimals; round only at the end.
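
Here is the computational form worked in Python on the five scores from Unit 03 (3, 5, 5, 7, 10). The cross-check against `statistics.variance` is my own addition, included only to confirm that the shortcut matches the definition.

```python
import math
from statistics import variance

x = [3, 5, 5, 7, 10]
n = len(x)

sum_x = sum(x)                       # Σx  = 30
sum_x_sq = sum(v * v for v in x)     # Σx² = 208

s2 = (sum_x_sq - sum_x ** 2 / n) / (n - 1)   # computational form of s²
s = math.sqrt(s2)                            # back to real units
cv = s / (sum_x / n) * 100                   # CV as a percentage of the mean

print(s2, round(s, 4), round(cv, 1))
print(variance(x))                   # library value, for comparison
```

By hand this is (208 − 900 ÷ 5) ÷ 4 = 28 ÷ 4 = 7, so s = √7.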

🧠

Mnemonic

Variance Vexes (weird squared units), SD Saves it (square-root → real units). Divide by n−1 because Samples are Shy by one. CV lets you Compare Variability across different things (₱ vs kg).

Skewness — the lopsidedness

Long tail on the right → positive / right-skew (few high outliers).
Long tail on the left → negative / left-skew (few low outliers).
Symmetric → skew ≈ 0; mean ≈ median.

🧠

Mnemonic

The tail tells the tale. Skew is named after the long tail's direction — and the mean chases the tail (pulled toward the outliers).

📐 Picture this

Two classes both average 80. Class A: everyone 78–82 (tight, tiny SD — average is trustworthy). Class B: 50 to 100 (huge SD — half lost, half bored). Same mean, opposite reality. That's why SD is never optional.

05 / The Keystone

Sampling Distribution & CLT

The keystone unit. Understand why sample averages behave predictably and every test in Units VI–IX stops being magic.

Why & how we sample

Why: cheaper, faster, and sometimes the test destroys the item (you can't crash-test every car).
How: Random (pure lottery), Systematic (every kth one), Stratified (split into layers, sample each), Cluster (grab whole groups).

🧠

Mnemonic

Really Smart Stats Cluster → Random, Systematic, Stratified, Cluster. Stratified = slice into LAYERS then sample each. Cluster = pick whole GROUPS (entire classrooms).
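
The four methods can be sketched in Python. The population of 100 student IDs, the two "sections" and the ten "classrooms" are all hypothetical, chosen only to make the differences visible.

```python
import random

random.seed(1)                       # fixed seed so the sketch is repeatable
population = list(range(1, 101))     # hypothetical student IDs 1..100
n = 10

# Random: pure lottery
simple = random.sample(population, n)

# Systematic: random starting point, then every k-th ID
k = len(population) // n
start = random.randrange(k)
systematic = population[start::k]

# Stratified: split into layers (two hypothetical sections), sample from each
strata = {"section_A": population[:50], "section_B": population[50:]}
stratified = [pid for layer in strata.values()
              for pid in random.sample(layer, n // 2)]

# Cluster: split into whole groups (ten "classrooms" of 10), take an entire group
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
cluster = [pid for group in random.sample(clusters, 1) for pid in group]

print(simple, systematic, stratified, cluster, sep="\n")
```

Note that stratified sampling touches every layer, while cluster sampling skips most groups entirely.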

Standard error & the Central Limit Theorem

The two ideas that run everything

Standard error: SE = σ ÷ √n

CLT: for large n (rule of thumb: n ≥ 30), the distribution of x̄ is ≈ Normal, whatever shape the population has (as long as its variance is finite)

SE is the SD of the sample mean. Bigger n → smaller SE → a steadier, more reliable average.

🧠

Mnemonic

The average of averages goes bell-shaped. n ≥ 30 is the magic number. SE shrinks as n grows.

📐 Picture this

Roll one die: any number 1–6, totally flat, no bell. Now roll five dice and write the average, hundreds of times. Those averages pile up around 3.5 in a bell shape — all-1s or all-6s are rare. The original was flat, yet the averages went bell-curve. That's the CLT, and it's why a sample mean is trustworthy.
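
The dice experiment is easy to simulate. The sketch below assumes 10,000 repetitions and a fixed random seed (both arbitrary choices) and compares the spread of the averages with SE = σ ÷ √n.

```python
import random
from statistics import mean, pstdev

random.seed(42)                  # fixed seed so the run is repeatable
n_dice, n_trials = 5, 10_000     # assumed sizes for the simulation

# Roll five dice, record the average, and repeat many times
averages = [sum(random.randint(1, 6) for _ in range(n_dice)) / n_dice
            for _ in range(n_trials)]

sigma = pstdev(range(1, 7))              # SD of a single fair die
se_theory = sigma / n_dice ** 0.5        # SE = σ ÷ √n

print("mean of averages:", round(mean(averages), 2))       # should land near 3.5
print("SD of averages:  ", round(pstdev(averages), 2))
print("theoretical SE:  ", round(se_theory, 2))
```

Raising `n_dice` shrinks the spread of the averages, which is the "bigger n, smaller SE" rule in action.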

06 / The Engine

Test of Hypothesis

A courtroom for data. Assume "nothing's going on," then check whether the evidence is strong enough to overturn that assumption.

The two hypotheses

H₀ (null) — the boring status quo: "no difference / no effect." Always wears the = sign.
H₁ (alternative) — the new claim you're trying to prove: "there IS a difference."

The 5 steps (every test, same dance)

Procedure

1. State H₀ & H₁
2. Pick α (usually 0.05)
3. Compute the test statistic
4. Find the critical value from the table
5. Decide & conclude in plain words

The two workhorse statistics

z (σ known / large n): z = (x̄ − μ) ÷ (σ ÷ √n)

t (σ unknown / small n): t = (x̄ − μ) ÷ (s ÷ √n), df = n − 1

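By hand you compare against a critical value, not an exact p-value. Two-tailed: reject H₀ when |statistic| > critical. One-tailed: reject only when the statistic lands beyond the critical value in the direction H₁ points (it landed in the tail).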

🧠

Mnemonic

Steps = H-A-T-C-D (Hypotheses, Alpha, Test-stat, Compare, Decide). p Low → null must Go; p High → null gets by. Tails: ≠ → two-tailed, < or > → one-tailed.

Two ways to be wrong

Type I (α): reject a true H₀ — a false alarm.
Type II (β): fail to reject a false H₀ — you missed a real effect.

🧠

Mnemonic

Type I = cry wolf when there's none (false alarm). Type II = miss the wolf that's really there. And: statistically significant ≠ important — a tiny effect looks "significant" if n is huge.

📐 Picture this

A candy company swears each bag holds 50 pieces → H₀: μ = 50. You suspect shorting → H₁: μ < 50. You count 30 bags; average is 47. The question: is 47 "far enough" below 50 to prove cheating, or just random bag luck? Deep in the tail → reject H₀ → guilty. Type I = accuse an honest company. Type II = let real cheaters walk.
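
The candy example can be run through the five steps in Python. The example gives n = 30 and x̄ = 47 but no standard deviation, so the sample SD of 6 below is purely an assumption to make the arithmetic possible; the critical value is the standard t-table entry for a one-tailed test at α = 0.05 with df = 29.

```python
import math

# Steps 1-2: H0: mu = 50, H1: mu < 50 (left-tailed), alpha = 0.05
mu0 = 50          # claimed mean under H0 (from the example)
n = 30            # bags counted (from the example)
x_bar = 47        # sample mean (from the example)
s = 6.0           # ASSUMPTION: sample SD, not given in the example

# Step 3: test statistic (sigma unknown, so use t)
t_stat = (x_bar - mu0) / (s / math.sqrt(n))
df = n - 1

# Step 4: critical value from the t-table (one-tailed, alpha = 0.05, df = 29)
t_crit = -1.699

# Step 5: decide
reject = t_stat < t_crit
print(round(t_stat, 3), df, "reject H0" if reject else "fail to reject H0")
```

With a larger assumed SD the same 3-piece shortfall could fail to reach the tail, which is why the spread matters as much as the gap.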

07 / The Line

Correlation & Regression

Correlation measures how tightly two things move together. Regression draws the best straight line so you can predict.

Correlation coefficient r (lives between −1 and +1)

r = [ nΣxy − ΣxΣy ] ÷ √( [ nΣx² − (Σx)² ] [ nΣy² − (Σy)² ] )

+1 = perfect upward line · 0 = random cloud · −1 = perfect downward line.

Regression line ŷ = a + bx

slope: b = [ nΣxy − ΣxΣy ] ÷ [ nΣx² − (Σx)² ]

intercept: a = ȳ − b·x̄

coefficient of determination: r² = (r)²

Your Casio's regression mode gives a, b and r directly — use the formulas as a sanity check.
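
As one way to run that sanity check, the formulas can be typed straight into Python. The five (x, y) pairs are hypothetical, picked so the sums are easy to verify by hand.

```python
import math

# Hypothetical paired data (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

sx, sy = sum(x), sum(y)                        # Σx = 15, Σy = 20
sxy = sum(xi * yi for xi, yi in zip(x, y))     # Σxy = 66
sxx = sum(xi * xi for xi in x)                 # Σx² = 55
syy = sum(yi * yi for yi in y)                 # Σy² = 86

numerator = n * sxy - sx * sy
r = numerator / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
b = numerator / (n * sxx - sx ** 2)            # slope
a = sy / n - b * (sx / n)                      # intercept: y-bar minus b times x-bar

print(round(r, 4), round(r ** 2, 2), round(a, 2), round(b, 2))
```

For this data the line is ŷ = 2.2 + 0.6x, and r² = 0.6 says the line explains 60% of the variation in y.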

🧠

Mnemonic

ŷ = a + bx is just y = mx + b in a lab coat (careful: the stats b is the slope, the old m, and a is the intercept). b is the boost: how much y jumps per +1 of x. Square r to get the share of variation explained (r = 0.9 → r² = 0.81 → 81%).

🧠

Mnemonic — the trap

Correlation ≠ causation. Ice-cream sales and drownings rise together — but the SUN causes both, not each other. Always hunt for the hidden third factor.

📐 Picture this

Plot hours studied (x) against test score (y). The dots drift up-right — more study, higher score: positive correlation. Draw the single best straight line (regression) and predict: "study 5 hours → expect ~85." The causation trap: kids with bigger feet read better — feet don't cause reading, age does (older kids have both).

08 / The Setup

Elements of Experimentation

Before you can analyze an experiment (Unit IX), you have to design it so the results actually mean something.

The vocabulary

Treatment — the thing you're testing (the "what-if").
Experimental unit — what receives the treatment (the "who-gets-it").
Response variable — what you measure afterward (the "what-happens").
Factor — a variable you deliberately change; its settings are levels.

Fisher's three principles (the heart of design)

Randomization — assign treatments by chance, so no group gets an unfair edge.
Replication — repeat each treatment many times, so one fluke can't fool you.
Local control / Blocking — group similar units together before comparing, to cancel nuisance differences.

🧠

Mnemonic

Randomize to be fair, Replicate to be sure, Block to be smart. Roles: Treatment = the WHAT-IF · Unit = WHO-GETS-IT · Response = WHAT-HAPPENS.

📐 Picture this

Which fertilizer grows the tallest plants? Treatments = fertilizers A, B, C. Units = the pots. Response = height. Replication: many pots per fertilizer (one could be lucky). Randomization: don't put all of A on the sunny sill — assign spots by chance. Blocking: group pots by sunlight first, then compare fertilizers within each sun-group. Now it's fair.
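
The fertilizer layout can be sketched as a randomized block assignment in Python. The three sunlight groups are hypothetical blocks; the idea is that every fertilizer appears once per block, in a chance order.

```python
import random

random.seed(7)                                   # fixed seed so the layout is repeatable
treatments = ["A", "B", "C"]                     # fertilizers from the example
blocks = ["full sun", "partial sun", "shade"]    # hypothetical sunlight groups

# Blocking: handle each sunlight group separately.
# Randomization: within a block, the order of treatments is left to chance.
# Replication: each treatment ends up in every block, so it is tried three times.
layout = {block: random.sample(treatments, k=len(treatments)) for block in blocks}

for block, order in layout.items():
    print(f"{block}: pot 1 = {order[0]}, pot 2 = {order[1]}, pot 3 = {order[2]}")
```

Comparisons are then made within each sunlight group, so sunlight cannot masquerade as a fertilizer effect.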

09 / The Final Boss

Analysis of Variance (ANOVA)

The biggest block in your syllabus and almost certainly your entire Final. Good news: every design (one-way, two-way, RCBD, factorial, split-plot) is the same template with the Sum of Squares sliced differently.

What ANOVA does

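Comparing 3+ group means with many t-tests is messy and inflates the false-alarm rate: three pairwise tests at α = 0.05 can push the overall Type I risk as high as about 14%. ANOVA compares them all at once by asking: is the difference BETWEEN the groups bigger than the random wobble WITHIN each group?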

The ANOVA pipeline — memorize this skeleton, not five procedures

SS_total = SS_between (treatment) + SS_within (error)

MS = SS ÷ df

F = MS_between ÷ MS_within

df: treatment = k − 1 · error = N − k · total = N − 1 (k = #groups, N = total obs). The F-table needs two df — numerator AND denominator — plus α.

🧠

Mnemonic

The pipeline is SS → df → MS → F. Logic: Between bigger than Within → big F → groups really differ (reject H₀). One factor = one-way; two factors (and do they team up?) = two-way.

Interaction (the two-way twist)

🧠

Mnemonic

Interaction = "it depends." Coffee helps you focus — but coffee + no sleep = jitters. The combo matters, not just each factor alone.

📐 Picture this

Three brands of plant food, several pots each. A averages 30 cm, B 32, C 31. Are they really different, or normal plant-to-plant variation? ANOVA stacks the between-brand gap against the within-brand wobble:

Brands differ a lot, plants inside each barely vary → between ≫ within → BIG F → brands genuinely differ ✓
Brands differ a little, plants inside each vary wildly → between ≈ within → small F → it's just noise ✗

One F-value, one verdict, no messy pile of t-tests.
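
The SS → df → MS → F pipeline can be walked through in Python. The pot heights below are hypothetical, invented so that the brand means come out to 30, 32 and 31 cm as in the example; the critical value is the standard F-table entry for α = 0.05 with df = (2, 6).

```python
# Hypothetical heights in cm (illustrative only); brand means are 30, 32, 31
groups = {
    "A": [29, 30, 31],
    "B": [31, 32, 33],
    "C": [30, 31, 32],
}

all_obs = [v for g in groups.values() for v in g]
N, k = len(all_obs), len(groups)
grand_mean = sum(all_obs) / N

# SS: between = group means vs grand mean, within = observations vs own group mean
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)

# df -> MS -> F
ms_between = ss_between / (k - 1)
ms_within = ss_within / (N - k)
F = ms_between / ms_within

F_crit = 5.14     # F-table, alpha = 0.05, numerator df = 2, denominator df = 6
print(ss_between, ss_within, F, "reject H0" if F > F_crit else "fail to reject H0")
```

With these made-up pots, F = 3 falls short of 5.14, so a 1–2 cm gap between brand means is not enough evidence when plants within a brand wobble by about the same amount.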

★ / Exam Day

Exam Cheat Sheet

1 · Which test do I use?

Your goal	Data type	Test	Table
Compare 1 mean to a target number	numeric	z (σ known / large n) or t (σ unknown / small n)	`z / t`
Compare 2 group means	numeric, 2 groups	2-sample t	`t`
Compare 3+ group means	numeric, 3+ groups	ANOVA (F)	`F`
Test one proportion	yes / no	z for proportion	`z`
Compare two proportions	yes / no, 2 groups	z for two proportions	`z`
Are two categories related?	categorical	Chi-square (independence)	`χ²`
Relationship between two numerics	numeric pairs	correlation / regression	`t`

2 · Master formula list

Quantity	Formula	Note
Mean	`x̄ = Σx / n`	—
Median position	`(n + 1) / 2`	seat, not score
Sample variance	`s² = [Σx² − (Σx)²/n] / (n−1)`	computational form
Standard deviation	`s = √s²`	real units
Coeff. of variation	`CV = (s / x̄) × 100%`	compare across things
Standard error	`SE = σ / √n`	SD of the mean
z-score	`z = (x − μ) / σ`	standardize a value
One-sample z	`z = (x̄ − μ) / (σ/√n)`	σ known / large n
One-sample t	`t = (x̄ − μ) / (s/√n)`	df = n − 1
Correlation r	`[nΣxy − ΣxΣy] / √([nΣx²−(Σx)²][nΣy²−(Σy)²])`	−1 to +1
Regression slope	`b = [nΣxy − ΣxΣy] / [nΣx²−(Σx)²]`	a = ȳ − bx̄
Chi-square	`χ² = Σ[(O − E)² / E]`	O = observed, E = expected
ANOVA F	`F = MS_between / MS_within`	MS = SS / df
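
The chi-square row is the only formula in this list without a worked case elsewhere in the guide, so here is a minimal Python sketch for a test of independence. The 2×2 counts are hypothetical, and the expected counts use the usual rule E = (row total × column total) ÷ grand total, which is not spelled out in the table above.

```python
# Hypothetical 2x2 table of observed counts (rows: two groups, columns: yes / no)
observed = [[20, 30],
            [30, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total   # expected count E
        chi_sq += (o - e) ** 2 / e                        # add (O - E)² / E

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (rows - 1) x (columns - 1)
chi_crit = 3.841                                    # chi-square table, alpha = 0.05, df = 1

print(chi_sq, df, "reject H0" if chi_sq > chi_crit else "fail to reject H0")
```

Every expected count here is 25, so χ² = 4 × (25 ÷ 25) = 4, just past the critical value.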

3 · Five mistakes that cost points (no-PC edition)

Wrong degrees of freedom → wrong critical value → wrong verdict. Memorize the df rule per test.
One-tailed vs two-tailed changes which column you read in the table.
Rounding mid-calculation — carry 4+ decimals, round only the final answer.
Mixing Greek & Roman — μ, σ describe populations; x̄, s describe samples.
Reading the F-table with one df — it always needs two (numerator + denominator).