My Life, In Here: January 2013

2. The Normal Distribution

The normal distribution holds an honored role in probability and statistics, mostly because of the central limit theorem, one of the fundamental theorems that forms a bridge between the two subjects. In addition, as we will see, the normal distribution has many nice mathematical properties. The normal distribution is also called the Gaussian distribution, in honor of Carl Friedrich Gauss, who was among the first to use the distribution.

The Standard Normal Distribution

A random variable $Z$ has the standard normal distribution if it has the probability density function $\phi$ given by

$$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} z^2}, \quad z \in \mathbb{R}$$

ϕ is a probability density function.

Proof:

Let $c = \int_{-\infty}^{\infty} e^{-\frac{1}{2} z^2} \, dz$. We need to show that $c = \sqrt{2\pi}$. That is, $\sqrt{2\pi}$ is the normalizing constant for the function $z \mapsto e^{-\frac{1}{2} z^2}$. The proof uses a nice trick:

$$c^2 = \int_{-\infty}^{\infty} e^{-\frac{1}{2} x^2} \, dx \int_{-\infty}^{\infty} e^{-\frac{1}{2} y^2} \, dy = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\frac{1}{2}(x^2 + y^2)} \, dx \, dy$$

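We now convert the double integral to polar coordinates: $x = r \cos\theta$, $y = r \sin\theta$ where $r \in [0, \infty)$ and $\theta \in [0, 2\pi)$. So $x^2 + y^2 = r^2$ and $dx \, dy = r \, dr \, d\theta$. Thus

$$c^2 = \int_0^{2\pi} \int_0^{\infty} r e^{-\frac{1}{2} r^2} \, dr \, d\theta$$

Substituting $u = r^2 / 2$ in the inner integral gives $\int_0^{\infty} e^{-u} \, du = 1$ and then the outer integral is $\int_0^{2\pi} 1 \, d\theta = 2\pi$. Thus $c^2 = 2\pi$ and so $c = \sqrt{2\pi}$.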
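As a sanity check on the proof, the normalizing constant can also be approximated numerically. The sketch below is only an illustration: the helper name `integrate`, the truncation of the real line to [-10, 10], and the number of subintervals are all assumptions chosen for convenience, not part of the original text.

```python
import math

def integrate(f, a, b, n=100_000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Integrand of the normalizing constant c. The tails beyond |z| = 10
# contribute a negligible amount, so the real line is truncated there.
c = integrate(lambda z: math.exp(-0.5 * z * z), -10.0, 10.0)

print(c, math.sqrt(2.0 * math.pi))
```

The two printed values should agree to many decimal places, which is consistent with $c = \sqrt{2\pi}$.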

The standard normal density function

ϕ satisfies the following properties:

ϕ is symmetric about z=0.
ϕ is increasing on (−∞,0) and decreasing on (0,∞).
The mode occurs at z=0.
ϕ is concave upward on (−∞,−1) and on (1,∞) and is concave downward on (−1,1).
The inflection points of ϕ occur at z=±1.
ϕ(z)→0 as z→∞ and as z→−∞.

Proof:

These results follow from standard calculus. Note that

ϕ′(z)=−zϕ(z). This differential equation helps simplify the computations.

In the Special Distribution Simulator, select the normal distribution and keep the default settings. Note the shape and location of the standard normal density function. Run the simulation 1000 times, and note the agreement between the empirical density function and the true density function.

The standard normal distribution function $\Phi$, given by

$$\Phi(z) = \int_{-\infty}^{z} \phi(t) \, dt = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2} t^2} \, dt$$

and its inverse, the quantile function $\Phi^{-1}$, cannot be expressed in closed form in terms of elementary functions. However, approximate values of these functions can be obtained from the special distribution calculator and from most mathematics and statistics software. Indeed, these functions are so important that they are considered special functions of mathematics.
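As one example of such software, Python's standard library can evaluate both functions. The sketch below uses `statistics.NormalDist` for $\Phi$ and $\Phi^{-1}$, and also writes $\Phi$ in terms of the error function through the identity $\Phi(z) = \frac{1}{2}\left[1 + \operatorname{erf}(z / \sqrt{2})\right]$. The function name `Phi` and the sample points are illustrative choices, not part of the original text.

```python
import math
from statistics import NormalDist

def Phi(z):
    """Standard normal CDF written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

std = NormalDist()  # mean 0, standard deviation 1

# Compare the erf-based formula with the library CDF at a few points.
for z in (-1.0, 0.0, 1.96):
    print(z, Phi(z), std.cdf(z))

# The quantile function inverts the distribution function.
p = 0.975
z_p = std.inv_cdf(p)
print(z_p, std.cdf(z_p))
```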

The standard normal distribution function

Φ satisfies the following properties:

(a) $\Phi(-z) = 1 - \Phi(z)$ for $z \in \mathbb{R}$
(b) $\Phi^{-1}(p) = -\Phi^{-1}(1 - p)$ for $p \in (0, 1)$
(c) $\Phi(0) = \frac{1}{2}$, so the median is 0.

Proof:

Part (a) follows from the symmetry of

ϕ. Part (b) follows from part (a). Part (c) follows from part (a) with

z=0.

In the special distribution calculator, select the standard normal distribution.

Note the shape of the density function and the distribution function.
Find the first and third quartiles.
Compute the interquartile range.

Use the special distribution calculator to find the quantiles of the following orders for the standard normal distribution:

p=0.001, p=0.999
p=0.05, p=0.95
p=0.1, p=0.9
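For readers without access to the special distribution calculator, the two exercises above can be approximated with Python's standard library. This is a sketch of one possible substitute, not the tool the text refers to; the variable names are illustrative.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal distribution

# First and third quartiles and the interquartile range.
q1 = std.inv_cdf(0.25)
q3 = std.inv_cdf(0.75)
iqr = q3 - q1
print(f"Q1 = {q1:.4f}, Q3 = {q3:.4f}, IQR = {iqr:.4f}")

# Quantiles of the orders listed in the exercise.
quantiles = {p: std.inv_cdf(p) for p in (0.001, 0.05, 0.1, 0.9, 0.95, 0.999)}
for p, z in quantiles.items():
    print(f"p = {p}: {z:.4f}")
```

By the symmetry of the quantile function, the quantiles of orders $p$ and $1 - p$ differ only in sign, so each pair in the exercise needs just one computation.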

Moments

The mean and variance of the standard normal distribution are

(a) $E(Z) = 0$
(b) $\operatorname{var}(Z) = 1$

Proof:

Of course, by symmetry, if $Z$ has a mean, the mean must be 0, but we have to argue that the mean exists. Actually it's not hard to compute the mean directly. Note that

$$E(Z) = \int_{-\infty}^{\infty} z \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \, dz = \int_{-\infty}^{0} z \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \, dz + \int_{0}^{\infty} z \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \, dz$$

The integrals on the right can be evaluated explicitly using the simple substitution $u = z^2 / 2$. The result is $E(Z) = -1/\sqrt{2\pi} + 1/\sqrt{2\pi} = 0$. For part (b), note that

$$\operatorname{var}(Z) = E(Z^2) = \int_{-\infty}^{\infty} z^2 \phi(z) \, dz$$

Integrate by parts, using the parts $u = z$ and $dv = z \phi(z) \, dz$. Thus $du = dz$ and $v = -\phi(z)$. Note that $z \phi(z) \to 0$ as $z \to \infty$ and as $z \to -\infty$. Thus, the integration by parts formula gives $\operatorname{var}(Z) = \int_{-\infty}^{\infty} \phi(z) \, dz = 1$.

Many important properties of the normal distribution are most easily obtained using the moment generating function.

If $Z$ has the standard normal distribution then $Z$ has moment generating function

$$m(t) = E\left(e^{tZ}\right) = e^{\frac{1}{2} t^2}, \quad t \in \mathbb{R}$$

Proof:

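Note that

$$E\left(e^{tZ}\right) = \int_{-\infty}^{\infty} e^{tz} \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \, dz = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} z^2 + tz\right) dz$$

We complete the square in $z$ to get $-\frac{1}{2} z^2 + tz = -\frac{1}{2}(z - t)^2 + \frac{1}{2} t^2$. Thus we have

$$E\left(e^{tZ}\right) = e^{\frac{1}{2} t^2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left[-\frac{1}{2}(z - t)^2\right] dz$$

In the integral, if we use the simple substitution $u = z - t$ then the integral becomes $\int_{-\infty}^{\infty} \phi(u) \, du = 1$. Hence $E\left(e^{tZ}\right) = e^{\frac{1}{2} t^2}$.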
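The closed form of the moment generating function can be checked numerically. The sketch below approximates the defining integral with the trapezoidal rule; the function name `mgf_numeric`, the truncation to [-12, 12], and the test values of $t$ are assumptions made only for this illustration, and the truncation is adequate only for moderate values of $t$.

```python
import math

def mgf_numeric(t, lo=-12.0, hi=12.0, n=20_000):
    """Trapezoidal approximation of the integral of e^{tz} phi(z) over [lo, hi]."""
    def integrand(z):
        # e^{tz} * phi(z), with the two exponentials combined for stability.
        return math.exp(t * z - 0.5 * z * z) / math.sqrt(2.0 * math.pi)

    h = (hi - lo) / n
    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, n):
        total += integrand(lo + i * h)
    return total * h

# Compare the numerical integral with the closed form exp(t^2 / 2).
for t in (-1.5, 0.0, 0.5, 2.0):
    print(t, mgf_numeric(t), math.exp(0.5 * t * t))
```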

The characteristic function of $Z$ is $\chi(t) = m(it) = E\left(e^{itZ}\right) = e^{-t^2/2}$. Thus, the standard normal distribution has the curious property that the characteristic function is a multiple of the PDF.

The moment generating function can be used to give another proof that $Z$ has mean 0 and variance 1. More generally, we can compute all of the moments of $Z$, which we know must exist since the moment generating function is finite for all $t \in \mathbb{R}$.

For $n \in \mathbb{N}$,

$E\left(Z^{2n}\right) = (2n)! / (n! \, 2^n)$
$E\left(Z^{2n+1}\right) = 0$

Proof:

The result follows from repeated differentiation of the MGF. Recall that $E\left(Z^k\right) = m^{(k)}(0)$. Of course, the odd order moments must be 0 by symmetry.
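The moment formula is easy to tabulate and to compare against direct numerical integration. In the sketch below, the names `even_moment` and `moment_numeric`, the integration range, and the step count are illustrative assumptions rather than anything from the original text.

```python
import math

def even_moment(n):
    """E(Z^(2n)) = (2n)! / (n! 2^n); the quotient is always an integer."""
    return math.factorial(2 * n) // (math.factorial(n) * 2 ** n)

def moment_numeric(k, lo=-15.0, hi=15.0, steps=30_000):
    """Trapezoidal approximation of E(Z^k), the integral of z^k phi(z)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        z = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # endpoint weights
        total += weight * z ** k * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

# Even moments from the formula next to their numerical approximations.
for n in range(5):
    print(2 * n, even_moment(n), round(moment_numeric(2 * n), 6))

# An odd moment, which should be approximately 0.
print(3, moment_numeric(3))
```

For $n = 0, 1, 2, 3, 4$ the formula gives 1, 1, 3, 15, 105, so the second and fourth moments match the variance 1 and the value $E(Z^4) = 3$.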

The following exercise gives the skewness and kurtosis of the normal distribution.

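If $Z$ has the standard normal distribution then

$\operatorname{skew}(Z) = 0$
$\operatorname{kurt}(Z) = 3$

Proof:

Since $Z$ has mean 0 and variance 1, $\operatorname{skew}(Z) = E\left(Z^3\right) = 0$ and $\operatorname{kurt}(Z) = E\left(Z^4\right) = 4! / (2! \, 2^2) = 3$.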

Because of the last result (and the use of the normal distribution as a standard), the excess kurtosis of a random variable is defined to be the ordinary kurtosis minus 3. Thus, the excess kurtosis of the normal distribution is 0.

The General Normal Distribution

The general normal distribution is the location-scale family associated with the standard normal distribution. Specifically, suppose that $\mu \in \mathbb{R}$ and $\sigma \in (0, \infty)$ and that $Z$ has the standard normal distribution. Then $X = \mu + \sigma Z$ has the normal distribution with location parameter $\mu$ and scale parameter $\sigma$. The basic properties of the density function and distribution function follow easily from general results for location-scale families.

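The normal distribution with location parameter $\mu$ and scale parameter $\sigma$ has probability density function $f$ given by

$$f(x) = \frac{1}{\sigma} \phi\left(\frac{x - \mu}{\sigma}\right) = \frac{1}{\sqrt{2\pi} \, \sigma} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right], \quad x \in \mathbb{R}$$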

Proof:

This follows from the change of variables formula corresponding to the transformation

x=μ+σz.
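The location-scale form of the density translates directly into code. The following sketch builds $f$ from $\phi$ exactly as in the formula and compares it with `statistics.NormalDist.pdf` from Python's standard library; the function names and the parameter values are illustrative assumptions.

```python
import math
from statistics import NormalDist

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_pdf(x, mu, sigma):
    """f(x) = (1 / sigma) * phi((x - mu) / sigma), assuming sigma > 0."""
    return phi((x - mu) / sigma) / sigma

mu, sigma = 2.0, 0.5  # illustrative parameter values
for x in (1.0, 2.0, 2.5):
    print(x, normal_pdf(x, mu, sigma), NormalDist(mu, sigma).pdf(x))
```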

The normal density function

f satisfies the following properties:

f is symmetric about x=μ.
f is increasing on (−∞,μ) and is decreasing on (μ,∞).
The mode occurs at x=μ.
f is concave upward on (−∞,μ−σ) and on (μ+σ,∞) and is concave downward on (μ−σ,μ+σ).
The inflection points of f occur at x=μ±σ.
f(x)→0 as x→∞ and as x→−∞.

Proof:

These properties follow from the corresponding properties of

ϕ.

In the special distribution simulator, select the normal distribution. Vary the parameters and note the shape and location of the density function. With your choice of parameter settings, run the simulation 1000 times and note the apparent convergence of the empirical density function to the true probability density function.

Let

F denote the distribution function for the normal distribution with location parameter

μ and scale parameter

σ, and as above, let

Φ denote the standard normal distribution function.

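The normal distribution function $F$ satisfies the following properties:

(a) $F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$ for $x \in \mathbb{R}$.
(b) $F^{-1}(p) = \mu + \sigma \Phi^{-1}(p)$ for $p \in (0, 1)$.
(c) $F(\mu) = \frac{1}{2}$, so the median occurs at $x = \mu$.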

Proof:

Part (a) follows since

X=μ+σZ. Parts (b) and (c) follow from (a).
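These relations mean that only the standard normal functions need to be implemented; the general case is a change of variable. A minimal sketch, assuming Python's `statistics.NormalDist` as the source of $\Phi$ and $\Phi^{-1}$ (the helper names and parameter values are hypothetical):

```python
from statistics import NormalDist

std = NormalDist()  # supplies Phi (cdf) and its inverse (inv_cdf)

def normal_cdf(x, mu, sigma):
    """F(x) = Phi((x - mu) / sigma)."""
    return std.cdf((x - mu) / sigma)

def normal_quantile(p, mu, sigma):
    """F^{-1}(p) = mu + sigma * Phi^{-1}(p)."""
    return mu + sigma * std.inv_cdf(p)

mu, sigma = 10.0, 3.0  # illustrative parameter values

print(normal_cdf(mu, mu, sigma))      # the median is mu, so this is 0.5
x90 = normal_quantile(0.9, mu, sigma)
print(x90, normal_cdf(x90, mu, sigma))
```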

In the special distribution calculator, select the normal distribution. Vary the parameters and note the shape of the density function and the distribution function.

Moments

As the notation suggests, the location and scale parameters are also the mean and standard deviation, respectively.

If $X$ has the normal distribution with location parameter $\mu$ and scale parameter $\sigma$ then

$E(X) = \mu$
$\operatorname{var}(X) = \sigma^2$

Proof:

This follows from the representation

X=μ+σZ and basic properties of expected value and variance.

If $X$ has the normal distribution with location parameter $\mu$ and scale parameter $\sigma$ then $X$ has moment generating function

$$E\left(e^{tX}\right) = \exp\left(\mu t + \frac{1}{2} \sigma^2 t^2\right), \quad t \in \mathbb{R}$$

Proof:

This follows from the representation $X = \mu + \sigma Z$ and basic properties of expected value:

$$E\left(e^{tX}\right) = E\left(e^{t\mu + t\sigma Z}\right) = e^{t\mu} E\left(e^{t\sigma Z}\right) = e^{t\mu} e^{\frac{1}{2} t^2 \sigma^2} = e^{t\mu + \frac{1}{2} \sigma^2 t^2}$$

The central moments of

X can be computed easily from the moments of the standard normal distribution. The ordinary (raw) moments of

X can be computed from the central moments, but the formulas are a bit messy.

If $X$ has the normal distribution with mean $\mu$ and standard deviation $\sigma$, then for $n \in \mathbb{N}$,

$E\left[(X - \mu)^{2n}\right] = (2n)! \, \sigma^{2n} / (n! \, 2^n)$
$E\left[(X - \mu)^{2n+1}\right] = 0$

All of the odd central moments of

X are 0, a fact that also follows from the symmetry of the probability density function.

In the special distribution simulator select the normal distribution. Vary the mean and standard deviation and note the size and location of the mean/standard deviation bar. With your choice of parameter settings, run the simulation 1000 times and note the apparent convergence of the empirical moments to the true moments.

The following exercise gives the skewness and kurtosis of the normal distribution.

If $X$ has the normal distribution with mean $\mu$ and standard deviation $\sigma$ then

$\operatorname{skew}(X) = 0$
$\operatorname{kurt}(X) = 3$

Proof:

The skewness and kurtosis of a variable are defined in terms of the standard score, so these results follow from the corresponding results for $Z$ in Theorem 10.

Transformations

The normal family of distributions satisfies two very important properties: invariance under linear transformations and invariance with respect to sums of independent variables. The first property is essentially a restatement of the fact that the normal distribution is a location-scale family.

Suppose that $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$. If $a \in \mathbb{R}$ and $b \in \mathbb{R} \setminus \{0\}$, then $a + bX$ is normally distributed with mean $a + b\mu$ and variance $b^2 \sigma^2$.

Proof:

The MGF of $a + bX$ is

$$E\left(e^{t(a + bX)}\right) = e^{ta} E\left(e^{(tb)X}\right) = e^{ta} e^{\mu(tb) + \sigma^2 (tb)^2 / 2} = e^{(a + b\mu)t + b^2 \sigma^2 t^2 / 2}$$

which we recognize as the MGF of the normal distribution with mean $a + b\mu$ and variance $b^2 \sigma^2$.

In particular:

If $X$ has the normal distribution with mean $\mu$ and standard deviation $\sigma$, then $Z = (X - \mu)/\sigma$ has the standard normal distribution.
If $Z$ has the standard normal distribution and if $\mu \in \mathbb{R}$ and $\sigma \in (0, \infty)$ are constants, then $X = \mu + \sigma Z$ has the normal distribution with mean $\mu$ and standard deviation $\sigma$.

Recall that in general, if

X is a random variable with mean

μ and standard deviation

σ>0, then

Z=(X−μ)/σ is the standard score of

X. Thus, if

X has a normal distribution then the standard score

Z has a standard normal distribution.
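Invariance under linear transformations can be illustrated with a small simulation plus one exact check. Everything specific in the sketch below (the seed, the sample size, and the values of `mu`, `sigma`, `a`, `b`, and `y`) is an arbitrary choice for illustration; note that the standard deviation of $a + bX$ is $|b| \sigma$, which matters when $b$ is negative.

```python
import random
from statistics import NormalDist, fmean, pstdev

random.seed(0)  # fixed seed so the sketch is reproducible

mu, sigma = 3.0, 2.0  # illustrative parameters of X
a, b = 1.0, -0.5      # illustrative coefficients; b is negative on purpose

# Simulate Y = a + bX and compare with the claimed mean and standard deviation.
ys = [a + b * random.gauss(mu, sigma) for _ in range(100_000)]
print(fmean(ys), a + b * mu)
print(pstdev(ys), abs(b) * sigma)

# Exact check of one CDF value: since b < 0, P(a + bX <= y) = P(X >= (y - a) / b).
y = 0.2
lhs = 1.0 - NormalDist(mu, sigma).cdf((y - a) / b)
rhs = NormalDist(a + b * mu, abs(b) * sigma).cdf(y)
print(lhs, rhs)

# Standard scores of the simulated values should have mean near 0 and
# standard deviation near 1.
zs = [(v - (a + b * mu)) / (abs(b) * sigma) for v in ys]
print(fmean(zs), pstdev(zs))
```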

Suppose that $X_1$ and $X_2$ are independent random variables, and that $X_i$ is normally distributed with mean $\mu_i$ and variance $\sigma_i^2$ for $i \in \{1, 2\}$. Then $X_1 + X_2$ is normally distributed with

$E(X_1 + X_2) = \mu_1 + \mu_2$
$\operatorname{var}(X_1 + X_2) = \sigma_1^2 + \sigma_2^2$

Proof:

The MGF of $X_1 + X_2$ is the product of the MGFs, so

$$E\left\{\exp\left[t(X_1 + X_2)\right]\right\} = \exp\left(\mu_1 t + \sigma_1^2 t^2 / 2\right) \exp\left(\mu_2 t + \sigma_2^2 t^2 / 2\right) = \exp\left[(\mu_1 + \mu_2) t + (\sigma_1^2 + \sigma_2^2) t^2 / 2\right]$$

which we recognize as the MGF of the normal distribution with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$.

The result of the previous exercise generalizes to a sum of

n independent, normal variables. The important part is that the sum is still normal; the expressions for the mean and variance are standard results that hold for the sum of independent variables generally.
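A simulation gives a rough empirical view of the sum property. The sketch below draws independent pairs, adds them, and compares the sample mean, variance, and one empirical CDF value with the normal distribution the theorem predicts; the seed, sample size, and parameter values are arbitrary assumptions for the illustration.

```python
import random
from statistics import NormalDist, fmean, pvariance

random.seed(1)  # fixed seed so the sketch is reproducible

n = 100_000
mu1, sigma1 = 1.0, 2.0    # illustrative parameters of X1
mu2, sigma2 = -3.0, 1.5   # illustrative parameters of X2

# Independent draws of X1 and X2, added pairwise.
sums = [random.gauss(mu1, sigma1) + random.gauss(mu2, sigma2) for _ in range(n)]

# Distribution predicted for X1 + X2: means add and variances add.
target = NormalDist(mu1 + mu2, (sigma1 ** 2 + sigma2 ** 2) ** 0.5)

print(fmean(sums), target.mean)
print(pvariance(sums), target.variance)

# Empirical CDF at one point versus the predicted normal CDF.
x = 0.0
empirical = sum(s <= x for s in sums) / n
print(empirical, target.cdf(x))
```

Agreement of the mean and variance alone would hold for any independent variables; it is the match of the empirical distribution function with the normal one that reflects the theorem.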

Suppose that $X$ has the normal distribution with mean $\mu$ and variance $\sigma^2$. The distribution is a two-parameter exponential family with natural parameters $\left(\frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2}\right)$, and natural statistics $\left(X, X^2\right)$.
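The exponential family form can be read off by expanding the square in the normal density. The following worked step is a sketch of that expansion, using only the density $f$ of the normal distribution with mean $\mu$ and standard deviation $\sigma$:

```latex
\begin{align*}
f(x) &= \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right] \\
     &= \exp\left[\frac{\mu}{\sigma^2}\, x - \frac{1}{2\sigma^2}\, x^2 - \frac{\mu^2}{2\sigma^2} - \frac{1}{2} \ln\left(2\pi\sigma^2\right)\right], \quad x \in \mathbb{R}
\end{align*}
```

The coefficients of $x$ and $x^2$ are the natural parameters, and the last two terms in the exponent do not depend on $x$.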

Computational Exercises

Suppose that the volume of beer in a bottle of a certain brand is normally distributed with mean 0.5 liter and standard deviation 0.01 liter.

Find the probability that a bottle will contain at least 0.48 liter.
Find the volume that corresponds to the 95th percentile.

Answer:

Let $X$ denote the volume of beer in liters.

$P(X > 0.48) = 0.9772$
$x_{0.95} = 0.51645$
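The stated answers can be reproduced with Python's `statistics.NormalDist`; this is one possible way to do the computation, and the variable names are illustrative.

```python
from statistics import NormalDist

# Volume in liters: mean 0.5, standard deviation 0.01.
volume = NormalDist(mu=0.5, sigma=0.01)

p_at_least = 1.0 - volume.cdf(0.48)  # P(X > 0.48)
x_95 = volume.inv_cdf(0.95)          # 95th percentile

print(round(p_at_least, 4), round(x_95, 5))  # 0.9772 0.51645
```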

A metal rod is designed to fit into a circular hole on a certain assembly. The radius of the rod is normally distributed with mean 1 cm and standard deviation 0.002 cm. The radius of the hole is normally distributed with mean 1.01 cm and standard deviation 0.003 cm. The machining processes that produce the rod and the hole are independent. Find the probability that the rod is too big for the hole.

Answer:

Let

X denote the radius of the rod and

Y the radius of the hole.

P(Y−X<0)=0.0028
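This answer relies on the fact that the difference of independent normal variables is normal, with the means subtracted and the variances added. As a sketch, Python's `statistics.NormalDist` supports subtraction of two instances under exactly that independence assumption; the variable names are illustrative.

```python
from statistics import NormalDist

rod = NormalDist(1.0, 0.002)    # rod radius in cm
hole = NormalDist(1.01, 0.003)  # hole radius in cm

# Y - X for independent X and Y: mean 0.01, variance 0.002^2 + 0.003^2.
clearance = hole - rod

p_too_big = clearance.cdf(0.0)  # P(Y - X < 0)
print(round(p_too_big, 4))      # 0.0028
```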

The weight of a peach from a certain orchard is normally distributed with mean 8 ounces and standard deviation 1 ounce. Find the probability that the combined weight of 5 peaches exceeds 45 ounces.

Answer:

Let

X denote the combined weight of the 5 peaches, in ounces.

P(X>45)=0.0127
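The computation assumes, as the exercise implicitly does, that the 5 weights are independent, so that the total is normal with mean $5 \cdot 8 = 40$ and variance $5 \cdot 1^2 = 5$. A minimal sketch with `statistics.NormalDist` (variable names are illustrative):

```python
import math
from statistics import NormalDist

n, mu, sigma = 5, 8.0, 1.0  # number of peaches, mean and sd of one weight (ounces)

# Sum of n independent normal weights: mean n * mu, sd sqrt(n) * sigma.
total = NormalDist(n * mu, math.sqrt(n) * sigma)

p_exceeds = 1.0 - total.cdf(45.0)  # P(X > 45)
print(round(p_exceeds, 4))         # 0.0127
```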

A Further Generalization

In some settings, it's convenient to consider a constant as having a normal distribution (with mean being the constant and variance 0, of course). This convention simplifies the statements of theorems and definitions in these settings. Of course, the formulas for probability density and distribution functions do not hold for a constant, but the other results involving moments, the moment generating function, and transformations in Theorems 21 and 23 are still valid, and of course Theorem 23 would hold for all

a and


Wednesday, 30 January 2013

The Normal Distribution By Ongki
