# Rank Tests

Non-parametric.

Can handle outliers, since ranks depend on the order of the values and not on their size.

# Shapiro-Wilk test

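Order the data. The statistic W uses a weighted sum of differences between matched extremes (largest with smallest, second largest with second smallest, and so on), meeting in the middle. The weights are coefficients taken from a table.

Small values of W are significant.

H0 is that the distribution is normal.

Not good with tied (identical) values.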

# Kolmogorov-Smirnov test

Distribution-free, non-parametric.

Given an ordered empirical sample distribution, and an expected distribution, compare and see how closely they match.

Plot the empirical CDF of the sample together with the CDF of the expected distribution. The greatest vertical difference between the two curves is the test statistic.

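The expected distribution could be, for example, a normal distribution.

It is a general test for a difference between distributions, not specific to normality.

The data can be transformed (the same increasing transformation applied to the sample and to the expected distribution) without changing the result, since only the CDF values are compared.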

Calculate the largest vertical difference D and compare it with the KS table for the sample size. With more samples the critical value is lower, so the maximum vertical difference has to be smaller.

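$$D = \sup_x |F_n(x) - F(x)|$$

where $F_n$ is the empirical CDF of the sample of size n and $F$ is the CDF of the expected distribution.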
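A minimal sketch of how D could be computed by hand, assuming the expected distribution is supplied as a CDF function. The names `normal_cdf` and `ks_statistic` are illustrative, not from any library. Because the empirical CDF is a step function, the gap is checked just before and just after each step.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(sample, cdf):
    """Largest vertical gap between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # At x the empirical CDF jumps from i/n up to (i+1)/n,
        # so the gap has to be checked on both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d
```

For example, `ks_statistic([0.25, 0.75], lambda x: x)` compares two points with a uniform distribution on [0, 1] and gives 0.25. The resulting D would still need to be compared with the KS table for the sample size.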

Not as powerful as Shapiro-Wilk or Anderson-Darling (A-D) when testing for normality.

# Anderson-Darling test

Non-parametric.

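Order the data.

The statistic is a weighted sum over matched extreme pairs (smallest with largest, second smallest with second largest, and so on), which puts more weight on the tails than Kolmogorov-Smirnov does.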
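The matched-pairs idea can be sketched with the usual form of the statistic, $A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\ln F(x_{(i)}) + \ln(1 - F(x_{(n+1-i)}))\right]$. This is a rough illustration that assumes a fully specified expected distribution passed in as a CDF function; `normal_cdf` and `anderson_darling` are made-up names, and critical values differ when the parameters are estimated from the data.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def anderson_darling(sample, cdf):
    """A^2 statistic for `sample` against a fully specified `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for i in range(1, n + 1):
        lower = cdf(xs[i - 1])  # CDF at the i-th smallest value
        upper = cdf(xs[n - i])  # CDF at the i-th largest value
        # Each term pairs one value from the low end with one from the high end.
        # math.log raises ValueError if the CDF is exactly 0 or 1 at a data point.
        total += (2 * i - 1) * (math.log(lower) + math.log(1.0 - upper))
    return -n - total / n
```

Larger values of the statistic indicate a worse fit to the expected distribution.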

# Wilcoxon signed-rank test - two matched samples

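Dependent (paired) samples, e.g. each person's heart rate before taking a drug and after taking it.

Compares the two samples through the median of the paired differences (H0: the median difference is zero).

For large N, less power than the paired t-test when the differences are roughly normal. Good for small N.

Non-parametric: does not need a normal distribution.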

## Statistic

Each person's before and after values form a pair. Take the signed difference.

For each person you will get either a positive or a negative difference (zero differences are usually dropped).

Rank all people by abs(diff), smallest first.

Sum all ranks where people have a positive difference.

Sum all ranks where people have a negative difference.

The statistic is the smaller of the two sums.

Look it up in the table using N. A low value is significant.
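The steps above can be sketched as follows. This is an illustration under two assumptions that the notes do not state: zero differences are dropped, and tied absolute differences share the average of their ranks. The function names are hypothetical.

```python
def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def wilcoxon_signed_rank(before, after):
    """Return (statistic, N) where N counts the non-zero differences."""
    diffs = [a - b for b, a in zip(before, after)]
    diffs = [d for d in diffs if d != 0]  # assumption: drop zero differences
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(w_plus, w_minus), len(diffs)
```

With made-up heart rates `before = [70, 72, 68, 75, 80]` and `after = [74, 71, 73, 75, 86]`, the differences are 4, -1, 5, 0, 6. After dropping the zero, the ranks are 2, 1, 3, 4, the positive sum is 9, the negative sum is 1, and the call returns `(1.0, 4)`.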

# Mann-Whitney-Wilcoxon rank sum test

Independent samples.

Compares median of 2 sample groups, group A and group B.

Needs total sample size > 10

Total sample size n = na + nb

## Procedure

Concatenate both samples.

Sort samples.

Assign ranks 1..n to the sorted values. Tied values share the average of the ranks they would occupy, e.g. 1, 2, 3.5, 3.5, 5, 6.

$$R_a = \text{Sum of ranks in group A}$$

$$R_b = \text{Sum of ranks in group B}$$

$$U_a = n_a n_b + \frac{n_a(n_a+1)}{2} - R_a$$

$$U_b = n_a n_b + \frac{n_b(n_b+1)}{2} - R_b$$

$$U = \min(U_a, U_b)$$

$$E(U) = \frac{n_a n_b}{2}$$

$$\sigma_{U} = \sqrt{\frac{n_a n_b (n_a + n_b + 1)}{12}}$$

$$Z = \frac{U - E(U)}{\sigma_{U}}$$

Use a z-test: compare Z with the standard normal distribution.
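A minimal sketch of the whole procedure, following the formulas above term by term. The names `average_ranks` and `mann_whitney_u` are illustrative. As an assumption, the p-value is two-sided and taken from the normal approximation, and $\sigma_U$ carries no correction for ties, matching the formula in these notes.

```python
import math

def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U, Z, two-sided p) using the normal approximation."""
    na, nb = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))  # concatenate, then rank
    ra = sum(ranks[:na])                      # rank sum of group A
    rb = sum(ranks[na:])                      # rank sum of group B
    ua = na * nb + na * (na + 1) / 2 - ra
    ub = na * nb + nb * (nb + 1) / 2 - rb
    u = min(ua, ub)
    mean_u = na * nb / 2
    sigma_u = math.sqrt(na * nb * (na + nb + 1) / 12)
    z = (u - mean_u) / sigma_u
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail area
    return u, z, p
```

For two completely separated groups such as `[1, 2, 3, 4, 5, 6]` and `[7, 8, 9, 10, 11, 12]`, the rank sums are 21 and 57, so U is 0 and Z is -18 / sqrt(39), roughly -2.88.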