Normal Distribution in Sampling

Ashley Denies
Feb 26, 2020

Introduction

What Is Sampling?

Sampling is a process used in statistical analysis in which a predetermined number of observations are taken from a larger population. The methodology used to sample from a larger population depends on the type of analysis being performed, but it may include simple random sampling or systematic sampling.
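To make the two methods mentioned above concrete, here is a minimal Python sketch. The function names, the population of 100 numbered items, and the sample size of 10 are all assumptions chosen for illustration, not part of any particular library.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Simple random sampling: draw n distinct items, every subset equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Systematic sampling: random start, then every k-th item (k = len // n)."""
    if not 0 < n <= len(population):
        raise ValueError("n must be between 1 and the population size")
    rng = random.Random(seed)
    step = len(population) // n
    start = rng.randrange(step)  # random offset inside the first interval
    return [population[start + i * step] for i in range(n)]

# Hypothetical population: items numbered 1 to 100.
population = list(range(1, 101))
srs = simple_random_sample(population, 10, seed=0)
sys_sample = systematic_sample(population, 10, seed=0)
print(srs)
print(sys_sample)
```

With simple random sampling the selected items are scattered unpredictably, while systematic sampling returns items that are evenly spaced through the population.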

What is Normal Distribution?

The normal distribution is a continuous probability distribution that describes how the values of a variable are distributed. It is symmetric: most of the observations cluster around the central peak (the mean), and the probabilities for values further away from the mean taper off equally in both directions.

What Does Probability Distribution Mean?

· To predict a variable accurately, we first need to understand the underlying behaviour of that target variable.

· The first step is to determine the possible outcomes of the target variable and whether those outcomes are discrete (a countable set of distinct values) or continuous (any value within a range). For the sake of simplicity, if we are estimating the behaviour of a die, the first step is to know that it can take only the whole-number values 1 to 6 (discrete).

· The next step is to assign a probability to each event (value). A value that cannot occur is assigned a probability of 0%, and the probabilities of all possible values add up to 100%.
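The steps above can be sketched for the die example. This sketch assumes a fair six-sided die (the text does not say the die is fair), and the names `die_pmf` and `probability` are made up for illustration.

```python
from fractions import Fraction

# Step 1: list the possible outcomes (discrete values 1 to 6).
outcomes = range(1, 7)

# Step 2: assign a probability to each outcome.
# Assumption: the die is fair, so each face gets 1/6.
die_pmf = {face: Fraction(1, 6) for face in outcomes}

def probability(value):
    """Return the probability of a value; impossible values get 0."""
    return die_pmf.get(value, Fraction(0))

total = sum(die_pmf.values())

print(probability(3))  # a possible face
print(probability(7))  # an impossible value
print(total)           # all probabilities together
```

Using `Fraction` keeps the arithmetic exact, so the probabilities add up to exactly 1 rather than to a floating-point approximation.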

What Defines a Normal Distribution?

A normal distribution is fully determined by two parameters: its mean and its standard deviation. In practice, both are usually estimated from the sample.

· Mean: This is the average value of all the points in the sample.

· Standard Deviation: This indicates how far the points in the sample typically spread out from the mean.

· Because only these two parameters are needed, the distribution is simple for statisticians to work with, and a variable that is approximately normally distributed can usually be modelled and forecast with more confidence.
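As a rough illustration of how the two parameters describe the whole distribution, the sketch below estimates them from a small sample and builds a normal model with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). The sample values are invented for the example, and treating them as normally distributed is an assumption.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample of measurements (made-up values).
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]

mu = mean(sample)      # average of all points in the sample
sigma = stdev(sample)  # sample standard deviation: spread around the mean

# Assumption: the underlying variable is roughly normal.
model = NormalDist(mu, sigma)

density_at_mean = model.pdf(mu)   # height of the curve at its center
prob_below_5_2 = model.cdf(5.2)   # estimated probability of a value below 5.2

print(f"mean={mu:.3f}, sd={sigma:.3f}")
print(f"P(X < 5.2) is about {prob_below_5_2:.3f}")
```

Once `mu` and `sigma` are known, every probability under this model follows from them; no other information about the sample is used.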

The normal distribution is simple to describe, for two reasons:

1. The mean, mode and median of the distribution are equal.

2. We only need to use the mean and standard deviation to explain the entire distribution.

· The mean is the center of the curve. It is also the highest point of the curve, because values near the mean are the most likely.

· There is an equal number of points on each side of the mean. The center of the curve has the highest concentration of points.

· The total area under the curve is the total probability of all of the values that the variable can take.

· The total area under the curve is therefore 100%.

· Approximately 68.3% of all of the points are within one standard deviation of the mean (-1 to 1 standard deviations).

· About 95.4% of all of the points are within two standard deviations of the mean (-2 to 2 standard deviations).

· About 99.7% of all of the points are within three standard deviations of the mean (-3 to 3 standard deviations).
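The percentages above can be checked numerically. The sketch below uses the standard normal distribution (mean 0, standard deviation 1) from Python's `statistics` module; the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Probability that a value lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")

# The whole curve: the area from far below to far above the mean approaches 100%.
print(f"within 10 standard deviations: {share_within(10):.1%}")
```

Because the curve is symmetric, the same shares apply to any normal distribution once distances are measured in standard deviations from its mean.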
