1.3 Density Curves and Normal Distributions
Ulrich Hoensch
Tuesday, January 22, 2013

Fitting Density Curves to Histograms

Advanced statistical software (NOT Microsoft Excel) can produce “smoothed versions” of histograms.

Example: The following are histograms and corresponding density curves for data representing (a) the acidity of rainwater; (b) the survival time of guinea pigs.

When fitting a density curve to a histogram, we require that for any interval on the horizontal axis that spans the width of a collection of rectangles, the following holds:

    area of rectangles ≈ area under density curve.

This requirement follows from the more general fact that for both histograms and density curves, area = proportion.

Definition of Density Curve

A density curve is a curve that
- is always on or above the horizontal axis and
- has area exactly 1 underneath it.

In addition, for any two values a and b on the horizontal axis,

    area below the density curve between a and b ≈ proportion of observations that fall between a and b.

Median of a Density Curve

The median of a density curve is the point M on the horizontal axis such that the area below the density curve and to the left of M is 50% (and consequently the area to the right is also 50%).

[Figure: density curve with 50% of the area on each side of the median.]

Percentiles of a Density Curve

The pth percentile of a density curve is the point P on the horizontal axis such that p percent of the area below the density curve lies to the left of P. The interquartile range is consequently the extent of the middle 50% of the area, from the first quartile Q1 to the third quartile Q3.

Mean of a Density Curve

The mean of a density curve is the “balance point” of the curve: if the area below the curve were made of a solid material, the mean would correspond to the position of the fulcrum when balancing it.

Mean and Median of a Density Curve

Unless a density curve is symmetric, the mean is generally not equal to the median.
- For right-skewed distributions the mean is larger than the median;
- For left-skewed distributions the mean is smaller than the median.

Normal Distributions

Normal curves are the density functions of normal distributions. They have the following general shape.
- They are symmetric, unimodal (have only one peak), and bell-shaped.
- The mean is denoted by the symbol µ (small Greek letter “mu”), and the standard deviation is denoted by the symbol σ (small Greek letter “sigma”).
- On each side of the mean there is a point, called an inflection point, where the curve changes from bending downward to bending upward (or vice versa).
- The standard deviation σ is the horizontal distance from the mean µ to these inflection points.

[Figure: two normal curves.]

The 68-95-99.7 Rule

For any normal distribution, approximately 68% of the observations fall within σ of the mean µ, approximately 95% fall within 2σ of µ, and approximately 99.7% fall within 3σ of µ.

Example: Height of Young Women

The height of young women aged 18 to 24 is approximately normally distributed with mean µ = 64.5 inches and standard deviation σ = 2.5 inches.

We write X ∼ N(µ, σ) if a variable X has a normal distribution with mean µ and standard deviation σ. Consequently, for the height X of young women, X ∼ N(64.5, 2.5).

Find the following, using a TI-83/TI-83 Plus/TI-84 Plus calculator: the percentage of women who are between 60 and 65 inches tall.

1. Press [2ND] VARS (DISTR) and select 2: normalcdf(.
2. Type normalcdf(60,65,64.5,2.5) and press ENTER.
3. The proportion is 0.5433..., so the percentage of women who are between 60 and 65 inches tall is about 54.3%. This means that the area under the normal curve between 60 and 65 is 54.3%.

Note: The general syntax for finding the proportion between a and b is normalcdf(a,b,µ,σ).

Find the percentage of women who are taller than 62 inches.

1. Press [2ND] VARS (DISTR) and select 2: normalcdf(.
2. Type normalcdf(62,1000000,64.5,2.5).
   (The number 1000000 can be replaced by any very large positive number.)
3. Press ENTER. The proportion is 0.8413..., so the percentage of women who are taller than 62 inches is about 84.1%.

What is the cutoff height for the top 10% (i.e. the 90th percentile)?

1. Press [2ND] VARS (DISTR) and select 3: invNorm(.
2. Type invNorm(0.9,64.5,2.5) and press ENTER.
3. The percentile is 67.70..., so 90% of women are shorter than 67.7 inches (and 10% are taller than 67.7 inches).

The general syntax for finding the cutoff such that the proportion p of observations falls below this cutoff is invNorm(p,µ,σ).

Find the range of the middle 80% of the distribution. This means we need to find the 10th and the 90th percentiles.

1. The 90th percentile was computed above; it is 67.7.
2. Type invNorm(0.1,64.5,2.5) to find the 10th percentile. It is 61.29... ≈ 61.3.

So the middle 80% of the heights ranges from 61.3 inches to 67.7 inches.
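For readers without a TI calculator, the four computations above can be reproduced as a rough sketch in Python. This is not part of the original slides: it assumes Python 3.8 or later and uses the standard library class statistics.NormalDist, and the helper name normalcdf is a hypothetical stand-in chosen to mirror the calculator command. Unlike the calculator, no "very large number" is needed for the upper tail, because the area to the right of 62 is 1 minus the area to its left.

```python
from statistics import NormalDist

# Heights of young women from the example: X ~ N(mu = 64.5, sigma = 2.5).
heights = NormalDist(mu=64.5, sigma=2.5)


def normalcdf(a, b, dist):
    """Hypothetical helper mirroring the calculator's normalcdf(a,b,mu,sigma):
    the proportion of the distribution that lies between a and b."""
    return dist.cdf(b) - dist.cdf(a)


# Proportion of women between 60 and 65 inches tall.
between_60_65 = normalcdf(60, 65, heights)

# Proportion taller than 62 inches: total area 1 minus the area to the left of 62.
taller_than_62 = 1 - heights.cdf(62)

# 90th and 10th percentiles (the calculator's invNorm).
p90 = heights.inv_cdf(0.9)
p10 = heights.inv_cdf(0.1)

print(f"P(60 < X < 65) = {between_60_65:.4f}")   # about 0.5433
print(f"P(X > 62)      = {taller_than_62:.4f}")  # about 0.8413
print(f"90th percentile = {p90:.1f} inches")     # about 67.7
print(f"10th percentile = {p10:.1f} inches")     # about 61.3

# Proportion within 1, 2 and 3 standard deviations of the mean
# (compare with the 68-95-99.7 rule).
within = {
    k: normalcdf(heights.mean - k * heights.stdev,
                 heights.mean + k * heights.stdev,
                 heights)
    for k in (1, 2, 3)
}
for k, proportion in within.items():
    print(f"within {k} sigma: {proportion:.4f}")
```

The printed values should agree with the calculator results to the number of digits shown in the examples.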