# T-test - one sample

Used when the sample size is small (rule of thumb: n < 30).

Used when population sd is unknown.

Assumes population is normal.

Tests whether this sample comes from a population with a given mean.

## Formula

To calculate your sample's T statistic value:

$$t = \frac{\overline{x} - \mu_0}{\sigma / \sqrt{n}}$$

where

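x̄ (x bar) is the sample mean.

μ0 is the hypothesised population mean, i.e. the value stated in H0.

σ is the sample standard deviation (often written s), used as the estimate of the unknown population standard deviation.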

n is the sample size.
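As a minimal sketch of the formula above, the statistic can be computed with Python's standard library. The function name `one_sample_t` and the data values are hypothetical, chosen only for illustration.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return the one-sample t statistic of `sample` against the hypothesised mean `mu0`."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    return (x_bar - mu0) / (s / math.sqrt(n))

# Hypothetical measurements, for illustration only
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4]
print(one_sample_t(sample, mu0=5.0))
```

Note that `statistics.stdev` divides by n - 1, which matches the sample standard deviation used in the formula; `statistics.pstdev` would divide by n instead.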

## Distribution

Set degrees of freedom, df = n - 1

The t-distribution is symmetric around its mean. It is similar to a normal distribution but has heavier tails, and it approaches the normal distribution as df grows.

$$\mu = 0$$

$$\sigma^2 = \frac{df}{df-2}$$

The variance formula holds for df > 2.

Compare your t-value to this distribution to get the p-value, then reject or fail to reject H0.
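The comparison step can be sketched without external libraries by integrating the t density numerically. This is only an illustration: the helper names `t_pdf` and `two_sided_p_value`, the use of Simpson's rule, and the 0.05 threshold are assumptions, and in practice a statistics library would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_value, df, steps=2000):
    """Two-sided p-value P(|T| >= |t_value|), using Simpson's rule on [0, |t_value|]."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    h = upper / steps  # `steps` must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (total * h / 3)  # P(-|t| <= T <= |t|), by symmetry
    return max(0.0, 1 - central_area)

# Hypothetical t-value and sample size, for illustration only
n = 10
p = two_sided_p_value(2.5, df=n - 1)
alpha = 0.05  # assumed significance level
print(p, "reject H0" if p < alpha else "fail to reject H0")
```

The two-sided form is used here because the test asks whether the mean differs from μ0 in either direction; a one-sided test would use half of this value when the t-value lies in the hypothesised direction.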

# T-test - two samples

Used when samples are independent.

Used when both samples are approximately normally distributed. If they are not, a transformation of the data (for example, a log transform) may help.

Used when variances are equal.

Tests whether means are the same.

## Formula for test statistic

To calculate the t statistic, first determine the degrees of freedom of each sample:

$$df_1 = n_1 - 1$$

$$df_2 = n_2 - 1$$

Then calculate the pooled variance, weighting each sample variance by its degrees of freedom:

$$\sigma_p^2 = \frac{\sigma_1^2 df_1 + \sigma_2^2 df_2}{df_1 + df_2}$$

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

The μ1 - μ2 term is usually zero, because H0 states that the two population means are equal.

## Distribution comparison

Set degrees of freedom, df = df1 + df2 = n1 + n2 - 2, where n1 and n2 are the two sample sizes.
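The pooled statistic and its degrees of freedom can be sketched as follows, assuming H0 states that the two means are equal so that the μ1 - μ2 term is zero. The function name `pooled_two_sample_t` and the data are hypothetical.

```python
import math
import statistics

def pooled_two_sample_t(x1, x2):
    """Return (t, df) for the equal-variance two-sample t-test, with mu1 - mu2 = 0 under H0."""
    n1, n2 = len(x1), len(x2)
    df1, df2 = n1 - 1, n2 - 1
    var1 = statistics.variance(x1)  # sample variance of the first sample
    var2 = statistics.variance(x2)  # sample variance of the second sample
    pooled_var = (var1 * df1 + var2 * df2) / (df1 + df2)
    std_err = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    t = (statistics.mean(x1) - statistics.mean(x2)) / std_err
    return t, df1 + df2

# Hypothetical samples, for illustration only
group_a = [12.1, 11.8, 12.6, 12.9, 12.3]
group_b = [11.2, 11.9, 11.5, 12.0, 11.4, 11.7]
t_value, df = pooled_two_sample_t(group_a, group_b)
print(t_value, df)
```

The samples may have different sizes; the weighting by df1 and df2 is what lets the larger sample contribute more to the pooled variance.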

# Randomisation (permutation) test for 2 sets

Non-parametric test: it makes no assumption about the shape of the population distribution.

Tests whether means are the same.

Use a t-test to compare set1 and set2, and record the p-value.

Loop 1000 times:

Combine set1 and set2, shuffle, and split the result into set3 and set4 so that set3 has the same size as set1.

Run a t-test on set3 and set4, and record the p-value.

End loop.

Compare the original p-value with the shuffled ones: the proportion of shuffled p-values that are at least as small as the original is the permutation p-value.
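A minimal sketch of this loop, using only the standard library, is given below. One substitution is made and should be read as an assumption: instead of comparing p-values, the sketch compares |t| directly. Because set3 and set4 always have the same sizes as set1 and set2, the degrees of freedom never change, so a larger |t| always means a smaller p-value and the two comparisons give the same answer. The function names, the fixed seed, and the data are hypothetical.

```python
import math
import random
import statistics

def pooled_t(a, b):
    """Equal-variance two-sample t statistic (mu1 - mu2 = 0 under H0)."""
    df_a, df_b = len(a) - 1, len(b) - 1
    pooled_var = (statistics.variance(a) * df_a + statistics.variance(b) * df_b) / (df_a + df_b)
    std_err = math.sqrt(pooled_var) * math.sqrt(1 / len(a) + 1 / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / std_err

def permutation_test(set1, set2, n_iter=1000, seed=0):
    """Return the share of shuffles whose |t| is at least as extreme as the original."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pooled_t(set1, set2))
    combined = list(set1) + list(set2)
    n1 = len(set1)
    extreme = 0
    for _ in range(n_iter):
        rng.shuffle(combined)
        set3, set4 = combined[:n1], combined[n1:]  # set3 has the same size as set1
        if abs(pooled_t(set3, set4)) >= observed:
            extreme += 1
    return extreme / n_iter

# Hypothetical samples, for illustration only
set1 = [12.1, 11.8, 12.6, 12.9, 12.3]
set2 = [11.2, 11.9, 11.5, 12.0, 11.4, 11.7]
print(permutation_test(set1, set2))
```

The sketch would fail with a division by zero if a shuffle produced two groups that each contain identical values, so it assumes the data have enough distinct values for that not to happen.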