Normal distribution

In this section, we look at the normal distribution. The formula for the normal distribution is as follows:
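The formula itself is not reproduced in this copy of the text. For reference, the standard probability density function of the normal distribution, written with the x, µ, and σ used in this section, is:

```latex
f(x \mid \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)
```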

I realize this formula is intense if you've never seen it before, but focus on the parameters rather than on the formula itself. There are only three parameters: x, µ, and σ. x is the domain, the set of values over which we evaluate the distribution; µ is the mean, where we want the center of our data to be; and σ is the standard deviation, or how thin or wide we want our data to be. Now, because this is a hairy formula, I've already implemented it, and I'm going to paste it into our window. The following example shows our quick function for the normal distribution, where you can see the three parameters:
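The original listing is not reproduced here. As a minimal sketch, assuming Python (the text does not name its language) and a hypothetical function name `normal`, the density could be written with the same three parameters like this:

```python
import math

def normal(x, mu, sd):
    """Normal density with mean mu and standard deviation sd, evaluated at x.

    The name and argument order are illustrative assumptions, not the
    original listing.
    """
    coefficient = 1.0 / (sd * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sd ** 2)
    return coefficient * math.exp(exponent)

# Height of the curve at the mean for mu = 0 and sd = 1.
print(round(normal(0.0, 0.0, 1.0), 4))  # 0.3989
```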

We have mu, which represents the mean; sd, which represents the standard deviation; and x, which is the domain over which we are mapping. Now, here's what you need to remember about the two key parameters, mu and sd. mu represents the centerline location: a positive value puts the center to the right of 0, and a negative value puts it to the left of 0. The standard deviation, sd, represents the width. High values of the standard deviation produce wide, low plots, and low values produce tall, narrow plots. So, let's go ahead and plot our normal distribution. We need to plot over a domain, and I prefer to plot on the domain from -5 to 5 in 1/10 increments, as shown in the following example:
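The original listing is not reproduced here. A minimal Python sketch, assuming the domain runs from -5 to 5 inclusive (the variable name `domain` is illustrative), might be:

```python
# Step through integers and divide by 10, rather than repeatedly adding
# 0.1, so that floating-point error does not accumulate across the domain.
domain = [i / 10 for i in range(-50, 51)]

print(len(domain))            # 101
print(domain[0], domain[-1])  # -5.0 5.0
```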

Now, let's plot our values, as can be seen in the following example:
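The original listing is not reproduced here. As a hedged sketch in Python, with the hypothetical `normal` function and a `domain` assumed to run from -5 to 5, the (x, y) pairs for the plot could be built by zipping the domain with the densities mapped over it:

```python
import math

def normal(x, mu, sd):
    """Normal density with mean mu and standard deviation sd, evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sd ** 2)) / (sd * math.sqrt(2.0 * math.pi))

domain = [i / 10 for i in range(-50, 51)]

# Pair every x with its density for a mean of 0 and a standard deviation of 1.
points = list(zip(domain, map(lambda x: normal(x, 0.0, 1.0), domain)))

# Locate the highest point of the curve.
peak_x, peak_y = max(points, key=lambda point: point[1])
print(peak_x, round(peak_y, 2))  # 0.0 0.4
```

The resulting list of pairs can be handed to whichever plotting library you use.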

We have zipped our domain with the values we get by mapping that domain over the normal distribution, using a mean of 0 and a standard deviation of 1. The result is shown in the following graph:

So, what you see in the previous graph is the standard normal distribution. You can see that the peak of the distribution is at about 0.4, and that the curve extends just past -3 on the left and 3 on the right. Take a mental note of those values: our peak, over 0, is at 0.4, and our edges are at -3 and 3. We're going to come back to those values.

Now let's continue to demonstrate how the standard deviation affects these parameters. Look at the following example:
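The original listing is not reproduced here. A minimal Python sketch of the same change, reusing the hypothetical `normal` function and the assumed -5 to 5 domain, could look like this:

```python
import math

def normal(x, mu, sd):
    """Normal density with mean mu and standard deviation sd, evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sd ** 2)) / (sd * math.sqrt(2.0 * math.pi))

domain = [i / 10 for i in range(-50, 51)]

# Same mean as before, but a standard deviation of 5.
wide = [(x, normal(x, 0.0, 5.0)) for x in domain]

print(round(normal(0.0, 0.0, 5.0), 2))  # 0.08  (peak height)
print(round(wide[0][1], 3))             # 0.048 (still well above 0 at x = -5)
```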

We have changed the standard deviation to 5. Large standard deviations produce wide plots that are low, as shown in the following graph:

The peak of our distribution is still over 0 because our mean is at 0, but now it only reaches about 0.08. Our edges extend well beyond -5 and 5. So, this is a much wider, lower plot. Let's do this again, changing the standard deviation to 0.5, as shown in the following example:
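The original listing is not reproduced here. As a sketch under the same assumptions (Python, a hypothetical `normal` function, a domain from -5 to 5):

```python
import math

def normal(x, mu, sd):
    """Normal density with mean mu and standard deviation sd, evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sd ** 2)) / (sd * math.sqrt(2.0 * math.pi))

domain = [i / 10 for i in range(-50, 51)]

# Same mean as before, but a standard deviation of 0.5.
narrow = [(x, normal(x, 0.0, 0.5)) for x in domain]

print(round(normal(0.0, 0.0, 0.5), 2))  # 0.8  (peak height)
print(normal(2.0, 0.0, 0.5) < 0.001)    # True (the curve is nearly flat by x = 2)
```

The peak height is 1 / (sd * sqrt(2π)), so halving the standard deviation doubles the peak.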

We should get the following output:

Once again we're centered over 0, and we have a much narrower plot whose left edge falls between -2 and -1 and whose right edge falls between 1 and 2. The peak of our plot is now at about 0.8, so it's much higher than our original. Now that we've seen how the standard deviation affects the plot, let's see how the mean affects it. To do this, we are going to go back to our original statement, with a standard deviation of 1 and a mean of 0, and change the mean to 2, as shown in the following example:
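The original listing is not reproduced here. A minimal Python sketch of the shifted curve, again with the hypothetical `normal` function and the assumed -5 to 5 domain, might be:

```python
import math

def normal(x, mu, sd):
    """Normal density with mean mu and standard deviation sd, evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sd ** 2)) / (sd * math.sqrt(2.0 * math.pi))

domain = [i / 10 for i in range(-50, 51)]

# Standard deviation back at 1, mean moved from 0 to 2.
shifted = list(zip(domain, map(lambda x: normal(x, 2.0, 1.0), domain)))

peak_x, peak_y = max(shifted, key=lambda point: point[1])
print(peak_x, round(peak_y, 2))  # 2.0 0.4

# A point 0.5 to the right of the mean has the same height on both curves.
print(math.isclose(normal(2.5, 2.0, 1.0), normal(0.5, 0.0, 1.0)))  # True
```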

Let's look at the output now:

So, whenever two normal distributions have the same standard deviation, the shape of the curve is identical; only the location of the center differs. Before, we had a mean of 0 and the center of the plot was over 0; now we have a mean of 2 and the center of the plot is over 2. Everything else about the plot is the same: the width is the same, and the height is the same. Now that you have a little background on the central limit theorem, the normal distribution, and its two key parameters, the mean and the standard deviation, we're ready to talk about kernel density estimation. Our next section covers how to implement the kernel density estimator.
