What nearly normal is
Nearly normal is a term used to describe a data set whose distribution is close to a normal distribution. A normal distribution is a symmetric, bell-shaped curve in which the mean, median, and mode are equal. Nearly normal data is not perfectly symmetric, but it still follows the general shape of the normal distribution: a single peak, roughly symmetric sides, and tails without extreme outliers.
The following steps can be used to assess whether a set of data is nearly normal:
- Examine the data graphically via histograms or box plots.
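- Calculate the mean, median, and mode of the data; for nearly normal data they should be close to one another.
- Calculate the skewness and kurtosis of the data; a normal distribution has a skewness of 0 and a kurtosis of 3 (an excess kurtosis of 0).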
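- Calculate the coefficient of variation (CV) of the data to gauge its spread relative to the mean; the CV describes variability and does not by itself indicate normality.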
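- Calculate the correlation between the sorted data and the corresponding quantiles of a normal distribution (the correlation behind a normal Q-Q plot); a value close to 1 supports near normality.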
- Examine the tails of the data, looking for outliers.
- Examine the data for any signs of bimodality.
If all of the above steps indicate that the data follows the general shape of a normal distribution, then it can be said to be nearly normal.
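The numeric steps above can be sketched in Python using only the standard library. This is a minimal illustration, not a definitive procedure: the function names `near_normal_summary` and `looks_nearly_normal` and the cutoff values `skew_tol`, `kurt_tol`, and `min_qq` are assumptions chosen for the example, and the graphical, mode, and bimodality checks are left to visual inspection.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def near_normal_summary(data):
    """Numeric diagnostics for how close `data` is to a normal distribution."""
    n = len(data)
    mean = statistics.fmean(data)
    median = statistics.median(data)
    sd = statistics.pstdev(data)  # population SD, used to standardize

    # Standardized third and fourth moments (0 and 3 for a normal distribution).
    z = [(x - mean) / sd for x in data]
    skewness = sum(v ** 3 for v in z) / n
    excess_kurtosis = sum(v ** 4 for v in z) / n - 3.0

    # Correlation between the sorted data and standard normal quantiles,
    # i.e. the correlation behind a normal Q-Q plot.
    std_normal = statistics.NormalDist()
    quantiles = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    qq_correlation = pearson(quantiles, sorted(data))

    return {
        "mean_median_gap": abs(mean - median) / sd,  # in units of SD
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
        "cv": sd / mean if mean != 0 else float("nan"),
        "qq_correlation": qq_correlation,
    }


def looks_nearly_normal(summary, skew_tol=0.5, kurt_tol=1.0, min_qq=0.98):
    """Apply illustrative cutoffs (assumptions, not standard thresholds)."""
    return (
        abs(summary["skewness"]) <= skew_tol
        and abs(summary["excess_kurtosis"]) <= kurt_tol
        and summary["qq_correlation"] >= min_qq
    )


# Usage: a simulated bell-shaped sample and a strongly right-skewed one.
random.seed(0)
bell = [random.gauss(50, 10) for _ in range(2000)]
skewed = [random.expovariate(1.0) for _ in range(2000)]
print(near_normal_summary(bell))
print(looks_nearly_normal(near_normal_summary(bell)))
print(looks_nearly_normal(near_normal_summary(skewed)))
```

The cutoffs are deliberately loose. In practice the numbers are read together with a histogram or box plot rather than used as a pass/fail rule on their own.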
Examples
- Testing hypotheses: Many tests in statistical inference, such as t-tests, assume that the data are approximately normal, so they can be applied to nearly normal data.
- Estimating population parameters: When data are nearly normal, the sample mean, median, and standard deviation serve as estimates of the corresponding population parameters.
- Modeling data: The nearly normal assumption is used to model data in regression analysis and other data analysis techniques.
- Assessing normality of data: "Nearly normal" is the conclusion of a normality check, meaning the data are close enough to the normal distribution for methods based on it to be used.
- Calculating confidence intervals: When data are nearly normal, formulas based on the normal distribution are used to calculate confidence intervals for population parameters such as the mean, standard deviation, and variance.
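As an illustration of the confidence-interval use, the sketch below computes an interval for the mean with the normal (z) approximation. The function name `mean_confidence_interval` is hypothetical, and the choice of a z critical value rather than a t critical value is an assumption that is reasonable for nearly normal data with a fairly large sample; for small samples a t-based interval is usually preferred.

```python
import math
import statistics


def mean_confidence_interval(data, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(data)
    mean = statistics.fmean(data)
    # Standard error of the mean, using the sample standard deviation.
    se = statistics.stdev(data) / math.sqrt(n)
    # Two-sided critical value of the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * se, mean + z * se


# Usage with a small made-up data set.
low, high = mean_confidence_interval([2, 4, 4, 4, 5, 5, 7, 9])
print(f"95% interval for the mean: ({low:.2f}, {high:.2f})")
```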