Perform the following steps:
- Use a one-sample Kolmogorov-Smirnov test to check whether the dataset x (generated with the rnorm function) follows a standard normal distribution:
> x = rnorm(50)
> ks.test(x,"pnorm")
Output:
One-sample Kolmogorov-Smirnov test
data: x
D = 0.1698, p-value = 0.0994
alternative hypothesis: two-sided
- Next, generate two uniformly distributed samples:
> set.seed(3)
> x = runif(n=20, min=0, max=20)
> y = runif(n=20, min=0, max=20)
- Then, plot the ECDFs of the two generated samples:
> plot(ecdf(x), do.points=FALSE, verticals=TRUE, xlim=c(0, 20))
> lines(ecdf(y), lty=3, do.points=FALSE, verticals=TRUE)
The ECDF plot of the two generated data samples
- Finally, apply a two-sample Kolmogorov-Smirnov test to the two samples:
> ks.test(x,y)
Output:
Two-sample Kolmogorov-Smirnov test
data: x and y
D = 0.3, p-value = 0.3356
alternative hypothesis: two-sided
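The D value reported in both outputs is the largest vertical gap between two cumulative distribution curves: the sample ECDF against the normal CDF in the one-sample case, and one ECDF against the other in the two-sample case. A minimal sketch of that calculation follows; the seed 123, the variable z, and the names d_one and d_two are assumptions made for illustration and are not part of the recipe.

```r
# One-sample case: compare the ECDF of z with the standard normal CDF.
set.seed(123)                 # assumed seed, only to make the sketch repeatable
z <- rnorm(50)
n <- length(z)
Fz <- pnorm(sort(z))          # theoretical CDF at the ordered observations
# The ECDF jumps at each observation, so check the gap on both sides of the jump.
d_one <- max(seq_len(n) / n - Fz, Fz - (seq_len(n) - 1) / n)

# Two-sample case: the same uniform samples as in the recipe.
set.seed(3)
x <- runif(n = 20, min = 0, max = 20)
y <- runif(n = 20, min = 0, max = 20)
grid <- sort(c(x, y))         # the gap can only change at observed values
d_two <- max(abs(ecdf(x)(grid) - ecdf(y)(grid)))

# Both values should agree with the statistics returned by ks.test().
all.equal(d_one, unname(ks.test(z, "pnorm")$statistic))
all.equal(d_two, unname(ks.test(x, y)$statistic))
```

In the ECDF plot, d_two corresponds to the widest vertical distance between the solid and the dotted step curves.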
This recipe creates two samples and performs a two-sample Kolmogorov-Smirnov test on them. Perform the following in R:
> g1 = sample(50:100, 30, replace=TRUE)
> g2 = sample(50:100, 30, replace=TRUE)
> plot.ecdf(g1, verticals=TRUE, do.points=FALSE, col="red")
> lines(ecdf(g2), verticals=TRUE, do.points=FALSE, col="blue")
ECDF plot of g1 and g2
> ks.test(g1,g2)
Output:
Two-sample Kolmogorov-Smirnov test
data: g1 and g2
D = 0.2, p-value = 0.586
alternative hypothesis: two-sided
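With the p-value of 0.586 shown above, the null hypothesis that g1 and g2 come from the same distribution would not be rejected at the usual 0.05 level. The sketch below shows one way to read the result programmatically. The seed 42, the 0.05 threshold, and the names n_ties, alpha, and decision are assumptions for illustration. Because the values are integers drawn with replacement, ties are likely, and depending on the R version ks.test may warn that it cannot compute an exact p-value; the sketch suppresses that warning.

```r
# Assumed seed so the sketch is repeatable; the recipe itself sets none.
set.seed(42)
g1 <- sample(50:100, 30, replace = TRUE)
g2 <- sample(50:100, 30, replace = TRUE)

# Number of pooled values that repeat an earlier value (ties).
n_ties <- sum(duplicated(c(g1, g2)))

# suppressWarnings() hides a possible tie-related warning from ks.test().
res <- suppressWarnings(ks.test(g1, g2))

alpha <- 0.05  # assumed significance level
decision <- if (res$p.value < alpha) {
  "reject H0: the samples appear to come from different distributions"
} else {
  "fail to reject H0: no evidence that the distributions differ"
}

cat("ties:", n_ties, "\n")
cat("D =", unname(res$statistic), " p-value =", res$p.value, "\n")
cat(decision, "\n")
```

The test object is a list of class htest, so res$statistic and res$p.value can be reused in later code instead of being copied from the printed output.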