T-test - one sample

Used when sample size, n < 30.

Used when population sd is unknown.

Assumes population is normal.

Tests whether this sample comes from a population with a given mean.

Formula

To calculate your sample's T statistic value:

$$t = \frac{\overline{x} - \mu_0}{\sigma / \sqrt{n}}$$

where

x bar is the sample mean.

μ0 is the hypothesised population mean (the value stated in H0).

σ is the sample standard deviation.

n is the sample size.
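A minimal sketch of the formula above, using only the Python standard library. The function name `one_sample_t` and the data are made up for illustration; `statistics.stdev` gives the sample standard deviation (n - 1 denominator), which is what σ stands for here.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of `sample` against mean `mu0`."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample sd, n - 1 in the denominator
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical data: does this sample come from a population with mean 2?
t, df = one_sample_t([1, 2, 3, 4, 5], mu0=2)
print(f"t = {t:.3f}, df = {df}")
```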

Distribution

Set degrees of freedom, df = n - 1

T-distribution is symmetric around its mean. It is similar to a normal distribution, with slightly fatter tails.

$$\mu = 0$$

$$\sigma^2 = \frac{df}{df-2} \quad \text{for } df > 2$$

Compare your t-value to the distribution, get the p-value and reject or fail to reject H0.
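One way to get the p-value without a statistics library is to integrate the t density numerically. This is a sketch only: the function names, the choice of Simpson's rule and the step count are assumptions, and `math.gamma` overflows for very large df (a few hundred), so in practice a library routine is the safer choice.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value P(|T| >= |t|), via Simpson's rule on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # P(-|t| < T < |t|), by symmetry
    return 1 - central_area

p = two_sided_p(2.262, df=9)
print(f"p = {p:.3f}")
```

With a significance level chosen in advance (0.05 is a common convention, not something these notes fix), H0 is rejected when p falls below it.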

T-test - two samples

Used when samples are independent.

Used when both samples are normally distributed. If they are not, you can transform the data first.

Used when variances are equal.

Tests whether means are the same.

Formula for test statistic

To calculate your sample's T statistic value, determine degrees of freedom:

$$df_1 = n_1 - 1$$

$$df_2 = n_2 - 1$$

Then calculate the weighted pooled variance and the test statistic:

$$\sigma_p^2 = \frac{\sigma_1^2 df_1 + \sigma_2^2 df_2}{df_1 + df_2}$$

$$t = \frac{(\bar{x_1} - \bar{x_2}) - (\mu_1 - \mu_2)}{\sigma_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

μ1 - μ2 usually cancels out if we are just comparing means, because H0 sets it to 0.
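The pooled calculation can be sketched as follows, taking μ1 - μ2 = 0 under H0. The function name `pooled_t` and the data are hypothetical; `statistics.variance` is the sample variance with an n - 1 denominator.

```python
import math
import statistics

def pooled_t(x1, x2):
    """Return (t, df) for a two-sample t-test assuming equal variances."""
    n1, n2 = len(x1), len(x2)
    df1, df2 = n1 - 1, n2 - 1
    # Pooled variance: each sample variance weighted by its degrees of freedom
    var_p = (statistics.variance(x1) * df1 + statistics.variance(x2) * df2) / (df1 + df2)
    se = math.sqrt(var_p) * math.sqrt(1 / n1 + 1 / n2)
    t = (statistics.mean(x1) - statistics.mean(x2)) / se  # mu1 - mu2 = 0 under H0
    return t, df1 + df2

t, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"t = {t:.3f}, df = {df}")
```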

Distribution comparison

Set degrees of freedom, df = n1 + n2 - 2 (the same as df1 + df2).

Randomisation (permutation) test for 2 sets

Non-parametric test.

Tests whether means are the same.

Use T-test to compare set1 and set2, remember the p-value.

Loop 1000 times:

combine set1 and set2, shuffle, create set3 and set4 so that set3 has the same size as set1.

run T-test for set3 and set4, remember p-value

end loop

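Review the p-values from the loop and see how extreme the original p-value is. The fraction of shuffles with a p-value at or below the original is the permutation p-value.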
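A rough sketch of the loop above in standard-library Python. It compares |t| rather than p-values: df is the same for every shuffle, so a larger |t| always means a smaller p-value and the ranking is identical. The function names, the fixed seed and the data are assumptions for illustration, and the helper would divide by zero if a shuffle produced two groups with no variation at all.

```python
import math
import random
import statistics

def pooled_t(x1, x2):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    n1, n2 = len(x1), len(x2)
    df1, df2 = n1 - 1, n2 - 1
    var_p = (statistics.variance(x1) * df1 + statistics.variance(x2) * df2) / (df1 + df2)
    return (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(var_p * (1 / n1 + 1 / n2))

def permutation_test(set1, set2, n_iter=1000, seed=0):
    """Fraction of shuffles whose |t| is at least as extreme as the original."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    observed = abs(pooled_t(set1, set2))
    combined = list(set1) + list(set2)
    n1 = len(set1)
    extreme = 0
    for _ in range(n_iter):
        rng.shuffle(combined)
        set3, set4 = combined[:n1], combined[n1:]  # set3 has the size of set1
        if abs(pooled_t(set3, set4)) >= observed:
            extreme += 1
    return extreme / n_iter

p = permutation_test([1, 2, 3, 4, 5], [101, 102, 103, 104, 105])
print(f"permutation p-value = {p}")
```

A small result means few random relabellings separate the groups as strongly as the real labels do, which is evidence against equal means.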