Statistical inference is about drawing conclusions from data under uncertainty. In practice, it often means using hypothesis tests to decide whether the data provide enough evidence against a default assumption (the null hypothesis).
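A common tool is the t-test (also called Student’s t-test), which can be used to test hypotheses about means.
A t-test compares two competing hypotheses: the null hypothesis \(H_0\), which states the default assumption (for example, that a mean equals a reference value), and the alternative hypothesis \(H_1\), which states that the default assumption does not hold.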
In R, t-tests are run with t.test(), which comes from the stats package shipped with every standard R installation.
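As a minimal sketch of what a call looks like, the block below runs t.test() on two simulated vectors with the main arguments written out at their default values. The data and the names x, y, and res are hypothetical and exist only for illustration.

```r
# Hypothetical data, simulated only to have something to test
set.seed(1)
x <- rnorm(20, mean = 5, sd = 2)
y <- rnorm(20, mean = 6, sd = 2)

# Two-sample call with the main arguments spelled out at their defaults
res <- t.test(x, y,
              alternative = "two.sided",  # or "less" / "greater"
              mu = 0,                     # reference value for the mean difference
              paired = FALSE,             # treat x and y as independent samples
              var.equal = FALSE,          # FALSE gives the Welch test
              conf.level = 0.95)

# The result is a list of class "htest"; its components can be read by name
class(res)
names(res)
```

Leaving all of these arguments out, as in t.test(x, y), gives the same result, because they are the defaults.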
Key arguments:
- x: numeric vector.
- y: second numeric vector (for two-sample tests).
- mu: reference mean (or reference mean difference for paired tests).
- paired: set to TRUE for paired data.
- var.equal: relevant for two-sample unpaired tests; it uses a pooled variance estimate when TRUE.

A one-sample t-test compares the mean of a sample to a theoretical mean \(\mu\).
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]
where \(\bar{x}\) is the sample mean, \(s\) the sample standard deviation, and \(n\) the sample size.
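To make the formula concrete, the following sketch computes the statistic by hand and compares it with the value reported by t.test(). The sample x and the reference value mu0 are hypothetical. The p-value line assumes a two-sided test, where under the null hypothesis the statistic follows a t distribution with \(n - 1\) degrees of freedom.

```r
# Hypothetical sample and reference mean, for illustration only
set.seed(42)
x <- rnorm(15, mean = 10.5, sd = 2)
mu0 <- 10

# t = (x_bar - mu) / (s / sqrt(n))
n <- length(x)
t_manual <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

# Same test with t.test()
res <- t.test(x, mu = mu0)

# Both comparisons should return TRUE
all.equal(unname(res$statistic), t_manual)
all.equal(res$p.value, p_manual)
```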
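The result includes a p-value. With a significance level of 0.05, a p-value below 0.05 leads to rejecting the null hypothesis; otherwise, the data do not provide enough evidence to reject it.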
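The usual convention is to reject the null hypothesis when the p-value falls below the chosen significance level. The sketch below applies that rule in code by reading the p-value from the object returned by t.test(). The data, the reference mean of 50, and the names alpha and decision are assumptions made for the example.

```r
# Hypothetical sample, for illustration only
set.seed(7)
x <- rnorm(30, mean = 52, sd = 5)

alpha <- 0.05               # significance level
res <- t.test(x, mu = 50)   # H0: the true mean equals 50

# Compare the p-value with alpha to reach a decision
decision <- if (res$p.value < alpha) "reject H0" else "fail to reject H0"
cat(sprintf("p-value = %.4f -> %s at alpha = %.2f\n",
            res$p.value, decision, alpha))
```

Failing to reject the null hypothesis is not evidence that it is true; it only means the sample does not contradict it at the chosen level.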
A paired t-test is used when the two samples are dependent (same units measured twice, or matched pairs). The test is equivalent to a one-sample t-test on the differences \(d = x - y\).
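The equivalence with a one-sample test on the differences can be checked directly. This is a small sketch on simulated paired data; the vectors x and y are hypothetical.

```r
# Hypothetical paired measurements: y is x plus a noisy shift
set.seed(2024)
x <- rnorm(10, mean = 100, sd = 10)
y <- x + rnorm(10, mean = 3, sd = 4)

# Paired t-test on the two vectors
paired_res <- t.test(x, y, paired = TRUE)

# One-sample t-test on the differences d = x - y
d <- x - y
diff_res <- t.test(d, mu = 0)

# Statistic, p-value, and confidence interval should all agree (TRUE)
all.equal(unname(paired_res$statistic), unname(diff_res$statistic))
all.equal(paired_res$p.value, diff_res$p.value)
all.equal(as.numeric(paired_res$conf.int), as.numeric(diff_res$conf.int))
```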
A company tracks daily sales for the same shop for 7 days before and 7 days after a discount program.
```r
# Simulate 7 days of sales before and after the discount program
set.seed(123)
sales_before <- rnorm(7, mean = 50000, sd = 50)
sales_after <- rnorm(7, mean = 50075, sd = 50)

# Paired test: the same shop is measured in both periods
t.test(sales_before, sales_after, paired = TRUE)
```

```
##
## Paired t-test
##
## data: sales_before and sales_after
## t = -2.6102, df = 6, p-value = 0.04011
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -97.618586 -3.152003
## sample estimates:
## mean difference
## -50.38529
```
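When reading this output, note that the reported mean difference is computed as the first argument minus the second, so a negative value here means sales were higher after the program. The sketch below reuses the same simulated data to check this and to show that swapping the arguments only flips the sign. The names res and res_swapped are introduced for illustration.

```r
# Same simulated data as in the example above
set.seed(123)
sales_before <- rnorm(7, mean = 50000, sd = 50)
sales_after <- rnorm(7, mean = 50075, sd = 50)

res <- t.test(sales_before, sales_after, paired = TRUE)

# The estimate is the mean of (first argument - second argument)
all.equal(unname(res$estimate), mean(sales_before - sales_after))

# Swapping the arguments flips the sign of the estimate and of t,
# while the two-sided p-value stays the same
res_swapped <- t.test(sales_after, sales_before, paired = TRUE)
all.equal(unname(res_swapped$estimate), -unname(res$estimate))
all.equal(res_swapped$p.value, res$p.value)
```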
Notes:
- Use paired = TRUE for paired data.
- var.equal is not meaningful in the same way as in an unpaired two-sample test, because the test works on a single vector of differences.

The t-test is a fundamental tool in inferential statistics for testing hypotheses about means.
| Test | Hypothesis | Code |
|---|---|---|
| One-sample t-test | Mean of a sample differs from a reference value | t.test(x, mu = mu0) |
| Paired t-test | Mean difference between paired measurements differs from 0 | t.test(x, y, paired = TRUE) |
A work by Gianluca Sottile
gianluca.sottile@unipa.it