Now that you understand how to identify a normal distribution, we can utilize R to perform calculations that are specific to this distribution type.
Once a normal distribution has been identified, we can calculate the probability that a randomly selected value falls between two given values.
I am assuming that you have some understanding of normal distributions in addition to what was discussed in the prior entry.
Example:
You are currently employed as a statistician in a factory that produces flashlights. The senior statistician informs you that the premium brand of flashlights that the factory produces has a battery life which is normally distributed. The mean battery life for this particular brand is 20 hours, with a standard deviation of 5 hours.
What is the probability that a randomly selected flashlight from the production line will last between 20 and 25 hours?
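# P(X <= 25): area under the curve to the left of 25 hours #
pnorm(q=25, mean=20, sd=5, lower.tail=TRUE)
# Output = 0.8413447 #
# Subtract the area below the mean (0.50) to keep only the 20-25 hour band #
0.8413447 - .50
# Output = 0.3413447, or a 34.134% probability #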
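Subtracting .50 works here only because the lower bound happens to be the mean. A minimal sketch of a more general approach, assuming a hypothetical helper named prob_between (it is not part of base R), subtracts one cumulative probability from another:

```r
# Hypothetical helper: P(lower < X < upper) for X ~ Normal(mean, sd)
prob_between <- function(lower, upper, mean, sd) {
  pnorm(upper, mean = mean, sd = sd) - pnorm(lower, mean = mean, sd = sd)
}

# Flashlight example: battery life between 20 and 25 hours
p_20_25 <- prob_between(20, 25, mean = 20, sd = 5)
p_20_25
# Agrees with the 0.3413447 found by hand
```

The same function should work for any pair of bounds, whether or not either one is the mean.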
If a flashlight's battery dies after exactly 8 hours of use, how many standard deviations away from the mean is this value?
# (x - mean) / standard deviation #
(8 - 20) / 5
# Output = -2.4, or 2.4 standard deviations below the mean #
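The z-score can also be fed straight back into pnorm(). A short sketch, assuming a hypothetical helper named z_score, showing that the standardized value and the raw value describe the same tail area:

```r
# Hypothetical helper: number of standard deviations x lies from the mean
z_score <- function(x, mean, sd) (x - mean) / sd

z <- z_score(8, mean = 20, sd = 5)
z  # -2.4

# Both calls give P(X <= 8): one on the raw scale, one on the standard scale
p_raw <- pnorm(8, mean = 20, sd = 5)
p_std <- pnorm(z)
all.equal(p_raw, p_std)  # TRUE
```

Either way, a little under 1% of flashlights would be expected to fail this early.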
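What is the probability that a randomly selected flashlight from the production line will last between 18 and 24 hours?
# P(X > 18): area to the right of 18 hours #
pnorm(q=18, mean=20, sd=5, lower.tail=FALSE)
# Output = 0.6554217 #
# Remove the upper half of the curve to leave the 18-20 hour band #
0.6554217 - .50
# Output = 0.1554217 #
# P(X <= 24): area to the left of 24 hours #
pnorm(q=24, mean=20, sd=5, lower.tail=TRUE)
# Output = 0.7881446 #
# Remove the lower half of the curve to leave the 20-24 hour band #
0.7881446 - .50
# Output = 0.2881446 #
# Add the two bands together #
0.2881446 + 0.1554217
# Output = 0.4435663, or a 44.357% probability #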
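As a cross-check on the two-band arithmetic, the sketch below gets the same area in one step. It relies on pnorm() being vectorized over q and on diff() returning the difference between consecutive elements; the variable names are only illustrative.

```r
# Cumulative probabilities at both bounds in one vectorized call
cdf <- pnorm(c(18, 24), mean = 20, sd = 5)

# diff() returns P(X <= 24) - P(X <= 18)
p_18_24 <- diff(cdf)
p_18_24
# Agrees with the 0.4435663 found by adding the two bands
```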
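What is the probability that a randomly selected flashlight from the production line will last between 22 and 26 hours?
# P(X > 22): area to the right of 22 hours #
pnorm(q=22, mean=20, sd=5, lower.tail=FALSE)
# Output = 0.3445783 #
# Subtract from the upper half of the curve to get the 20-22 hour band #
.50 - 0.3445783
# Output = 0.1554217 #
# P(X <= 26): area to the left of 26 hours #
pnorm(q=26, mean=20, sd=5, lower.tail=TRUE)
# Output = 0.8849303 #
# Remove the lower half (0.50) and the 20-22 hour band to leave 22-26 hours #
0.8849303 - 0.1554217 - .50
# Output = 0.2295086, or a 22.951% probability #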
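Since a probability between two values is just the area under the density curve, one way to sanity-check the result is to integrate the density numerically. This is a sketch using base R's integrate() and dnorm(), not a replacement for pnorm():

```r
# Numerically integrate the normal density between 22 and 26 hours.
# The extra arguments (mean, sd) are passed through to dnorm().
area <- integrate(dnorm, lower = 22, upper = 26, mean = 20, sd = 5)

area$value
# Should agree with 0.2295086 to within the integration error
```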
Later, while you are eating lunch at the same factory, the senior statistician appears again. This time, he decides to test your statistical abilities by asking you a series of questions.
These questions are:
Given a normal distribution with a mean of 55, what is the standard deviation if 45% of the values are above 70?
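# z-score of the cutoff that leaves 45% of the values above it #
qnorm(.45, lower.tail=FALSE)
# Output = 0.1256613 #
# Distance between the cutoff and the mean #
70 - 55
# Output = 15 #
# Standard deviation = distance / z-score #
15 / .1256613
# Standard Deviation = 119.3685 #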
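The steps above can be wrapped up and verified by plugging the answer back into pnorm(). A minimal sketch, assuming a hypothetical helper named sd_from_upper_tail:

```r
# Hypothetical helper: standard deviation given the mean, a cutoff x,
# and the share of values that lie above x
sd_from_upper_tail <- function(x, mean, p_above) {
  z <- qnorm(p_above, lower.tail = FALSE)  # z-score of the cutoff
  (x - mean) / z
}

sd_est <- sd_from_upper_tail(70, mean = 55, p_above = 0.45)
sd_est

# Check: with this standard deviation, 45% of values should exceed 70
check <- pnorm(70, mean = 55, sd = sd_est, lower.tail = FALSE)
check
```

Note that the helper breaks down when p_above is exactly 0.5, because the z-score is then zero.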
Given a normal distribution with a standard deviation of 15, what is the mean if 25% of the values are below 45?
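# lower.tail = FALSE returns the positive z-score; the cutoff of 45 sits that many standard deviations below the mean #
qnorm(.25, lower.tail = FALSE)
# Output = 0.6744898 #
# Mean = cutoff + (z-score * standard deviation) #
45 + (0.6744898 * 15)
# Mean = 55.11735 #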
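An equivalent route, sketched below, keeps the z-score's natural sign. Because 25% of the values fall below 45, qnorm(0.25) is negative and the mean is the cutoff minus z times the standard deviation. The variable names are only illustrative.

```r
sd_known <- 15
z <- qnorm(0.25)              # negative: the cutoff is below the mean
mean_est <- 45 - z * sd_known
mean_est

# Check: with this mean, 25% of values should fall below 45
check <- pnorm(45, mean = mean_est, sd = sd_known)
check
```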
Given a normal distribution with 60% of the values above 100, and 90% of the values above 80, what are the mean and the standard deviation?
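# More than half of the values lie above both 100 and 80, so both cutoffs sit below the mean and their z-scores are negative #
qnorm(.60, lower.tail=FALSE)
# Output = -0.2533471 #
qnorm(.90, lower.tail=FALSE)
# Output = -1.281552 #
# (100 - Mean)/Standard Deviation = -0.2533471 #
# (80 - Mean)/Standard Deviation = -1.281552 #
# 100 - Mean = -0.2533471 * Standard Deviation #
# 80 - Mean = -1.281552 * Standard Deviation #
These two equations can be solved algebraically for both the mean and the standard deviation. Subtracting the second from the first gives 20 = 1.028205 * Standard Deviation, so the standard deviation is approximately 19.45 and the mean is approximately 104.93.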
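For readers who would rather let R do the algebra, here is one possible sketch. It treats the two conditions as a linear system in the mean and standard deviation and hands it to base R's solve(). Both cutoffs have more than half of the values above them, so they lie below the mean and their z-scores (taken with lower.tail = FALSE) are negative.

```r
# z-scores of the two cutoffs (both negative)
z100 <- qnorm(0.60, lower.tail = FALSE)
z80  <- qnorm(0.90, lower.tail = FALSE)

# Linear system:  mean + z100 * sd = 100
#                 mean + z80  * sd = 80
A <- rbind(c(1, z100),
           c(1, z80))
b <- c(100, 80)

est <- solve(A, b)
names(est) <- c("mean", "sd")
est

# Check: shares of values above 100 and above 80 should be 0.60 and 0.90
check <- pnorm(c(100, 80), mean = est[["mean"]], sd = est[["sd"]],
               lower.tail = FALSE)
check
```

The result should be a mean of roughly 104.93 and a standard deviation of roughly 19.45.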
That is all for this entry, which is the 50th article that I have written for this blog. Two things to remember about normal distributions: there is no perfect test for normality, and the probability of any single exact value occurring within a continuous normal distribution is zero. All that we can calculate is the probability that a value falls within a given range.
Stay tuned until next time, Data Heads.