Poisson Distribution

Column

Poisson Distribution

Poisson Distribution

Probability Mass Function for the Poisson Distribution

A discrete random variable \(X\) is said to follow a Poisson distribution with parameter \(m\), written \(X \sim \mbox{Poisson}(m)\).

  • Here the key parameter, the Poisson mean, is the expected number of occurrences per unit period.
  • It is usually denoted as either \(m\) or \(\lambda\).

Given the mean number of occurrences (\(m\)) in a specified region or interval, we can compute the Poisson probability using the following formula.

The probability that there will be \(k\) occurrences in a given interval is denoted \(P(X=k)\), and is computed as: \[ P(X = k)=\frac{m^k e^{-m}}{k!} \]

where

  • \(P(X=k)\) is the probability of \(k\) occurrences in an interval,
  • \(m\) (or \(\lambda\)) is the mean number of occurrences in any interval (i.e. the Poisson mean),
  • \(e \approx 2.718\),
  • \(k = 0, 1, 2, \ldots\),
  • \(m > 0\).
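The formula above can be transcribed directly. A minimal sketch in Python (the function name is ours):

```python
from math import exp, factorial

def poisson_pmf(k: int, m: float) -> float:
    """P(X = k) for X ~ Poisson(m): m^k * e^(-m) / k!"""
    return m ** k * exp(-m) / factorial(k)

# Example: with m = 2 occurrences expected per interval,
# the probability of exactly 3 occurrences:
print(round(poisson_pmf(3, 2.0), 4))   # 0.1804
```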

Poisson Expected Value and Variance

If the random variable X has a Poisson distribution with parameter \(m\), we write \[ X \sim Poisson(m) \] Note the expected number of occurrences per unit time is often denoted \(\lambda\) (lambda) rather than \(m\).


  • Expected Value of X : \(\mbox{E}(X)= m \mbox{ or } \lambda\)
  • Variance of X : \(\mbox{Var}(X) = m \mbox{ or } \lambda\)
  • Standard Deviation of X : \(SD(X) = \sqrt{m} \mbox{ or } \sqrt{\lambda}\)

Important: \[ \mbox{E}(X) = \mbox{Var}(X)\]

See also:

  • Poisson Distribution - Probability Mass Function
  • Confidence Intervals for Poisson Variables

Worked Examples

Worked Examples 1

On average, 8 calls per hour are received at a telephone switchboard. Assuming that the number of calls received by the switchboard in a given length of time is a Poisson process, find the probability that:

Exercises

  1. exactly 6 calls are received in 2 hours;
  2. at least 2 calls are received in the next 20 minutes.
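For a Poisson process the mean scales with the length of the interval, so 2 hours gives \(m = 8 \times 2 = 16\) and 20 minutes gives \(m = 8/3\). A numerical sketch of both parts (our own check, not necessarily the linked solution):

```python
from math import exp, factorial

def poisson_pmf(k, m):
    return m ** k * exp(-m) / factorial(k)

# (1) exactly 6 calls in 2 hours: m = 8 * 2 = 16
p1 = poisson_pmf(6, 16)

# (2) at least 2 calls in the next 20 minutes: m = 8 / 3
m = 8 / 3
p2 = 1 - poisson_pmf(0, m) - poisson_pmf(1, m)

print(round(p1, 4), round(p2, 4))   # 0.0026 0.7452
```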

Solution

Click here for demonstrated solution

Confidence Intervals for Poisson Random Variables

Worked Example 1

It is assumed that claims on a certain type of policy arise as a Poisson process with claim rate \(\lambda\) per year.

For a group of 150 independent policies of this type, the total number of claims during the last calendar year was recorded as 123. Determine an approximate 95% confidence interval for the true underlying annual claim rate for such a policy.
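One standard approach (a sketch; the linked solution may differ in detail): the observed total of 123 claims is approximately \(N(150\lambda, 150\lambda)\), so the estimate is \(\hat{\lambda} = 123/150\) with standard error \(\sqrt{\hat{\lambda}/150}\).

```python
from math import sqrt

n, total_claims = 150, 123
lam_hat = total_claims / n            # 0.82 claims per policy per year
se = sqrt(lam_hat / n)                # estimated standard error of lam_hat
lower, upper = lam_hat - 1.96 * se, lam_hat + 1.96 * se
print(round(lower, 3), round(upper, 3))   # 0.675 0.965
```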

Solution

Click here for demonstrated solution


Worked Example 2

It is assumed that the numbers of claims arising in one year from motor insurance policies for young male drivers and young female drivers are distributed as Poisson random variables with parameters \(\lambda_m\) and \(\lambda_f\) respectively.

Independent random samples of 120 policies for young male drivers and 80 policies for young female drivers were examined and yielded the following mean numbers of claims per policy in the last calendar year: \(\bar{x}_m = 0.24\) and \(\bar{x}_f = 0.15\).

Calculate an approximate 95% confidence interval for \(\lambda_{m} - \lambda_{f}\), the difference between the respective Poisson parameters.

Solution

Click here for demonstrated solution


Videos

Goodness of Fit Tests

Goodness of Fit Tests

Binomial Distribution

Column

Binomial Distribution

Binomial Distribution : Probability Mass Function

A binomial experiment is one that possesses the following properties:

  • The experiment consists of \(n\) repeated trials;

  • Each trial results in an outcome that may be classified as a success or a failure (hence the name, binomial);

  • The probability of a success, denoted by \(p\), remains constant from trial to trial, and repeated trials are independent.

The number of successes X in n trials of a binomial experiment is called a binomial random variable.

The probability of exactly \(k\) successes in a binomial experiment \(Bin(n, p)\) is given by \[ P(X=k) = P(k \mbox{ successes in } n \mbox{ trials}) = \;^nC_k \times p^{k} \times (1-p)^{n-k}\]

  • X: Discrete random variable for the number of successes (variable name)
  • \(k\) : Number of successes (numeric value)

  • \(k = 0, 1, 2, \ldots, n\)
  • \(P(X=k)\) : the probability that the number of successes is \(k\).

  • \(n\) : number of independent trials
  • \(p\) : probability of a success in any of the \(n\) trials.
  • \(1-p\) : probability of a failure in any of the \(n\) trials.
  • \({^nC_k}\) is a combination value, found using the Choose operator.
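The pmf transcribes directly; Python's `math.comb` supplies the Choose operator (the function name is ours):

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p): nCk * p^k * (1-p)^(n-k)"""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Example: exactly 2 successes in 10 trials with p = 0.25
print(round(binomial_pmf(2, 10, 0.25), 4))   # 0.2816
```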

Binomial Distribution : Expected Value and Variance

Expectation and Variance

If the random variable X has a binomial distribution with parameters \(n\) and \(p\), we write \[ X \sim Bin(n,p) \] Only these two parameters are needed to determine the probability of an event.

  • The expected value of \(X\) is: \[\operatorname{E}(X) = n \times p \]
  • The variance of \(X\) is: \[\operatorname{Var}(X) = n \times p \times (1-p) = n\times p \times q \]

  • \(p\) is the probability of success in a binomial trial
  • \(q = 1-p\) is the probability of failure
  • \(n\) is the number of independent trials

Interpretation: If \(n=100\), and \(p=0.25\), then the average number of successes will be 25.
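The interpretation above can be checked numerically by summing over the pmf (a quick sanity check using the same \(Bin(100, 0.25)\) example):

```python
from math import comb

n, p = 100, 0.25
pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
mean = sum(k * f for k, f in enumerate(pmf))
var = sum((k - mean) ** 2 * f for k, f in enumerate(pmf))
print(round(mean, 6), round(var, 6))   # 25.0 18.75  (= np and np(1-p))
```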

Videos

Binomial Distribution

Formula

Formula Sheet

\[ P(X = k) = \;^nC_k \times p^{k} \times (1-p)^{n-k} \]

Geometric Distribution

Column

  • Geometric distributions model (some) discrete random variables. Typically, a Geometric random variable is the number of trials required to obtain the first failure, for example, the number of tosses of a coin until the first ‘tail’ is obtained, or a process where components from a production line are tested, in turn, until the first defective item is found.

  • A discrete random variable X is said to follow a Geometric distribution with parameter p, written \(X \sim Ge(p)\), if it has probability distribution \[P(X=x) = p^{x-1}(1-p)\] where

  • \(x = 1, 2, 3, \ldots\)
  • p = success probability; \(0 < p < 1\)

  • The trials must meet the following requirements:

  • the total number of trials is potentially infinite;
  • there are just two outcomes of each trial: success and failure;
  • the outcomes of all the trials are statistically independent;
  • all the trials have the same probability of success.

  • The Geometric distribution has expected value and variance \[E(X)= \frac{1}{1-p}, \qquad V(X)=\frac{p}{(1-p)^2}.\]

  • The Geometric distribution is related to the Binomial distribution in that both are based on independent trials in which the probability of success is constant and equal to \(p\).

  • However, a Geometric random variable is the number of trials until the first failure, whereas a Binomial random variable is the number of successes in n trials.
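A sketch of this section's convention (X counts trials up to and including the first failure, with success probability p on each trial), checking the expected value against \(1/(1-p)\):

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x): x - 1 successes followed by the first failure."""
    return p ** (x - 1) * (1 - p)

p = 0.75
# truncate the infinite sum; the tail beyond x = 2000 is negligible
mean = sum(x * geometric_pmf(x, p) for x in range(1, 2000))
print(round(mean, 4))   # 4.0, matching E(X) = 1/(1-p)
```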

Hypergeometric Distribution

Column

Hypergeometric Distribution

Probability mass function

The following conditions characterize the hypergeometric distribution:

  • The result of each draw (the elements of the population being sampled) can be classified into one of two mutually exclusive categories (e.g. Pass/Fail or Employed/Unemployed).
  • The probability of a success changes on each draw, as each draw decreases the population (sampling without replacement from a finite population).

A random variable \({\displaystyle X}\) follows the hypergeometric distribution if its probability mass function (pmf) is given by

\[ {\displaystyle p_{X}(k)=\Pr(X=k)={\frac {{\binom {K}{k}}{\binom {N-K}{n-k}}}{\binom {N}{n}}},}\] where

  • \({\displaystyle N}\) is the population size,
  • \({\displaystyle K}\) is the number of success states in the population,
  • \({\displaystyle n}\) is the number of draws (i.e. quantity drawn in each trial),
  • \({\displaystyle k}\) is the number of observed successes,
  • \({\textstyle \textstyle {a \choose b}}\) is a binomial coefficient.
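The pmf transcribes directly (binomial coefficients via `math.comb`; the example numbers are ours):

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k): k successes in n draws, without replacement,
    from a population of N items of which K are successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Example: 5 cards drawn from 52, K = 13 hearts, exactly 2 hearts
print(round(hypergeom_pmf(2, 52, 13, 5), 4))   # 0.2743
```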

Videos

Hypergeometric Distribution


Negative Binomial Distributions

Column

Introduction

Suppose there is a sequence of independent Bernoulli trials. Thus, each trial has two potential outcomes called success and failure. In each trial the probability of success is \({p}\) and of failure is (\({1 - p}\)).

Type 1 Specification

This approach models the number of failures (denoted \({k}\)) in a sequence of independent and identically distributed Bernoulli trials before a specified (non-random) number of successes (denoted \({r}\)) occurs. Then the random number of failures we have seen, \({X}\), will have the negative binomial (or Pascal) distribution: \[ {\displaystyle X\sim \operatorname {NB} (r,p)}\]

Probability Mass Function

The probability mass function of the negative binomial distribution is

\[ {\displaystyle f(k;r,p)\equiv \Pr(X=k)={\binom {k+r-1}{k}}\,p^{r}(1-p)^{k}}\] where \(k\) is the number of failures, \(r\) is the number of successes, and \(p\) is the probability of success.

Parameters
  • Mean \({\displaystyle {\frac {r(1-p)}{p}}}\)
  • Variance \({\displaystyle {\frac {r(1-p)}{p^{2}}}}\)
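A sketch under this convention (k failures before the r-th success, success probability p), checking the mean numerically:

```python
from math import comb

def nbinom_pmf(k: int, r: int, p: float) -> float:
    """P(X = k): k failures before the r-th success,
    with success probability p on each trial."""
    return comb(k + r - 1, k) * p ** r * (1 - p) ** k

r, p = 3, 0.4
# truncate the infinite sum; the tail beyond k = 500 is negligible
mean = sum(k * nbinom_pmf(k, r, p) for k in range(500))
print(round(mean, 4))   # 4.5, matching r(1-p)/p
```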

Type 2 Specification

This approach models the number of trials (denoted \(n\)) in a sequence of independent and identically distributed Bernoulli trials until a specified (non-random) number of successes (denoted \(r\)) occurs. (N.B. \(n = k + r\).)

Probability Mass Function

Let \({X}\) denote the number of trials until the \({r}\)-th success. The probability mass function of the negative binomial distribution is

\[ {\textstyle f(n;r,p)\equiv \Pr(X=n)=} { {\binom {n-1}{r-1}}(1-p)^{n-r}p^{r}} \] where \(n\) is the number of trials, \(r\) is the number of successes, and \(p\) is the probability of success.

Parameters
  • Mean \({ {\frac {r}{p}}}\)
  • Variance \({ {\frac {r(1-p)}{p^{2}}}}\)

Worked Examples

Negative Binomial Distribution: Worked Example

Suppose that in a group of insurance policies (which are independent as regards occurrence of claims), 20% of the policies have incurred claims during the last year. An auditor is examining the policies in the group one by one in random order until two policies with claims are found.

  1. Determine the probability that exactly five policies have to be examined until two policies with claims are found.

  2. Find the expected number of policies that have to be examined until two policies with claims are found.
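A numerical sketch of both parts using the Type 2 form with \(p = 0.2\) and \(r = 2\) (our own check, not a posted solution):

```python
from math import comb

p, r = 0.2, 2      # claim probability per examined policy; claims needed

# (1) exactly 5 policies examined: the 2nd claim occurs on trial 5,
#     so 1 claim appears among the first 4 policies
n = 5
p5 = comb(n - 1, r - 1) * (1 - p) ** (n - r) * p ** r

# (2) expected number of policies examined: r / p
print(round(p5, 5), round(r / p, 1))   # 0.08192 10.0
```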

Normal Distribution

Column

Worked Example

Customer waiting times (in minutes) are modelled as a normal random variable \(X\) with mean \(\mu = 10\) and standard deviation \(\sigma = 3\). What is \(P(X \geq 15)\)?

First, we find the z-value that corresponds to \(x = 15\): \[ z_o = { x_o - \mu \over \sigma } = { 15 - 10 \over 3 } = 1.666 \]

  • We will use \(z_o =1.67\)
  • Therefore we can say \(P(X \geq 15 ) = P(Z \geq 1.67)\)
  • The Murdoch Barnes tables are tabulated to give \(P(Z \geq z_o)\) for a given value \(z_o\).
  • We can evaluate \(P(Z \geq 1.67)\) as 0.0475.
  • Necessarily \(P(X \geq 15) = 0.0475\).


  • Claim: “\(90\%\) of customers will be dealt with in at most 12 minutes.”
  • To answer this question, we need to know \(P(X\leq 12)\)
  • First , we find the z-value that corresponds to x = 12 (remember \(\mu=10\) and \(\sigma=3\) )

\[ z_o = { x_o - \mu \over \sigma } = { 12 - 10 \over 3 } = 0.666 \]

  • We will use \(z_o =0.67\)
  • Therefore we can say \(P(X \geq 12 ) = P(Z \geq 0.67) = 0.2514\)
  • Necessarily \(P(X \leq 12 ) = P(Z \leq 0.67) = 0.7486\)
  • \(74.86\%\) of customers will be dealt with in at most 12 minutes.
  • The statement that \(90\%\) will be dealt with in at most 12 minutes is false.
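The table look-ups above can be reproduced with the standard normal CDF, \(\Phi(z) = \tfrac{1}{2}\left(1+\operatorname{erf}(z/\sqrt{2})\right)\). A sketch (the small differences from the tabulated values come from the tables rounding \(z\) to 1.67 and 0.67):

```python
from math import erf, sqrt

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for X ~ N(mu, sigma^2)."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma = 10, 3
print(round(1 - normal_cdf(15, mu, sigma), 4))   # P(X >= 15) ≈ 0.0478
print(round(normal_cdf(12, mu, sigma), 4))       # P(X <= 12) ≈ 0.7475
```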


What percentage will wait between 7 and 13 minutes?

\(P(7 \leq X \leq 13) = ?\)

Compute the probability of being below the interval and the probability of being above the interval; the probability of being inside the interval is the complement of the combination of these two events.


Uniform Distributions

Column

Continuous Uniform distribution

  • [L] lower bound of an interval
  • [U] upper bound of an interval

Probability of an outcome being between lower bound L and upper bound U: \[P( L \leq X \leq U) = { U - L \over b - a }\] where \(a\) and \(b\) are the minimum and maximum values the distribution can take.

Reminder
  • " \(\leq\)" is less than or equal to

  • " \(\geq\)" is greater than or equal to

  • \(L \leq X \leq U\) simply states that X is between L and U inclusively.

(“Inclusively” means that X could be exactly L or U, although for a continuous distribution the probability of taking any exact value is zero.)


Distributional Formulas

The probability density function is given as

\[f(x) = {1 \over b-a} \quad \mbox{for } a \leq x \leq b\]

For any value “c” between the minimum value a and the maximum value b

\[P(X \geq c) = {b-c \over b-a}\]

Here \(b\) is the upper bound and \(c\) is the lower bound.

\[P(X \leq c) = {c-a \over b-a}\]

Here \(c\) is the upper bound and \(a\) is the lower bound.
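These cases combine into the general interval probability; a minimal sketch (the function name is ours):

```python
def uniform_prob(lo: float, hi: float, a: float, b: float) -> float:
    """P(lo <= X <= hi) for X ~ Uniform(a, b), clipping [lo, hi] to [a, b]."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)

# Example: X uniform on (0, 10)
print(uniform_prob(2, 5, 0, 10))   # 0.3
```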

Exponential Distribution

Column

Important Formulas

The Exponential Distribution

Probability density function of the Exponential Distribution

The probability density function (pdf) of an exponential distribution is \[ {\displaystyle f(x;\lambda )={\begin{cases}\lambda e^{-\lambda x}&x\geq 0,\\0&x<0.\end{cases}}}\]

Here \(\lambda > 0\) is the parameter of the distribution, often called the rate parameter.

Worked Examples

Worked Examples

Claim amounts are modelled as an exponential random variable with mean $1,000.

Exercises

  1. Calculate the probability that one such claim amount is greater than $5,000.

  2. Calculate the probability that a claim amount is greater than $5,000 given that it is greater than $1,000.
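A numerical sketch (our check, not necessarily the linked solution): with mean $1,000 the rate is \(\lambda = 1/1000\), and part 2 is the memoryless property, \(P(X>5000 \mid X>1000) = P(X>4000)\).

```python
from math import exp

lam = 1 / 1000                      # rate = 1 / mean

p_gt_5000 = exp(-lam * 5000)        # survival function e^(-λx)
# memoryless: P(X > 5000 | X > 1000) = P(X > 5000) / P(X > 1000)
p_cond = exp(-lam * 5000) / exp(-lam * 1000)
print(round(p_gt_5000, 5), round(p_cond, 5))   # 0.00674 0.01832
```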

Solution

Click here for demonstrated solution

Review Questions

Review Question 1

An average of five calls per hour are received by a machine repair department. Beginning the observation at any point in time, what is the probability that the first call for service will arrive within five minutes?

Jobs are sent to a printer at an average of 5 jobs per hour.

  1. What is the expected time between jobs?
  2. What is the probability that the next job is sent within 6 minutes after the previous job?

Review Question 2

Assume that the time, denominated in minutes, between arrivals of customers at a particular bank is exponentially distributed with a rate parameter of 0.25.

  1. What is the mean duration between arrivals?
  2. Find the probability that the time between arrivals is greater than 5 minutes.
  3. Find the probability that the time between arrivals will be less than 2 minute.
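A numerical sketch of all three parts (rate 0.25 arrivals per minute):

```python
from math import exp

rate = 0.25                        # arrivals per minute
mean_gap = 1 / rate                # (1) mean time between arrivals
p_gt_5 = exp(-rate * 5)            # (2) P(gap > 5 minutes)
p_lt_2 = 1 - exp(-rate * 2)        # (3) P(gap < 2 minutes)
print(mean_gap, round(p_gt_5, 4), round(p_lt_2, 4))   # 4.0 0.2865 0.3935
```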

Review Question 3

Suppose that customers arrive at a filling station at the rate of 3 per hour. Given that a customer has just arrived, the time it takes for the next customer to arrive is called a waiting time. Let \(T\) be the symbol for the “waiting time” variable.

Suppose that a customer has just arrived.

  1. What is the expected waiting time between customer arrivals?
  2. Compute \(E(T)\)
  3. What is the variance of waiting times? Compute \(\mbox{Var}(T)\).
  4. What is the probability that the next customer will arrive within the next fifteen minutes?
  5. What is the probability that no customers arrive in the next half hour?
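A numerical sketch, working in hours so that \(\lambda = 3\) (our check, not a posted solution):

```python
from math import exp

rate = 3.0                                # customers per hour
mean_T = 1 / rate                         # E(T) = 1/3 hour = 20 minutes
var_T = 1 / rate ** 2                     # Var(T) = 1/λ²
p_within_15 = 1 - exp(-rate * 0.25)       # arrival within 15 min = 0.25 h
p_none_30 = exp(-rate * 0.5)              # no arrival in the next 30 min
print(round(mean_T, 4), round(var_T, 4),
      round(p_within_15, 4), round(p_none_30, 4))   # 0.3333 0.1111 0.5276 0.2231
```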

Gamma Distribution

Column

Worked Examples

Worked Example
In a certain metropolitan city, the daily consumption of electric power (in million kilowatt-hours (MKH)) may be regarded as a random variable having a Gamma distribution with parameters (3, 2).

If the power plant has a daily capacity of 12 MKH, what is the probability that this power supply will be inadequate on any given day?
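A numerical sketch, assuming \((3, 2)\) means shape 3 and scale 2 (the reading consistent with the mean-\(\alpha\lambda\) convention used later in this section); for an integer shape the gamma tail has a closed Erlang form:

```python
from math import exp, factorial

shape, scale = 3, 2          # assumed reading of "parameters (3, 2)"
capacity = 12

# Erlang tail: P(X > x) = e^(-x/scale) * sum_{k=0}^{shape-1} (x/scale)^k / k!
u = capacity / scale
p_inadequate = exp(-u) * sum(u ** k / factorial(k) for k in range(shape))
print(round(p_inadequate, 4))   # 0.062
```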

Solution

Skewness

Mode and Skewness: Worked Example

Consider the following measure of skewness for a unimodal distribution:

\[ S_{KP} \;=\; \frac{ \mbox{mean} \;-\; \mbox{mode}}{ \mbox{ standard deviation} } \]

  1. Determine the value of \({S_{KP}}\) for a gamma distribution with parameters \(\alpha = 1.6\) and \(\lambda = 0.2\).
  2. Comment on why \({S_{KP}}\) is a suitable measure of skewness for distributions with one mode.

Solution

Moment Generating Functions

Cumulant Generating Functions
Let \(X\) be a random variable having a gamma distribution with mean \(\alpha\lambda\) and variance \({\alpha\lambda^2}\), and let the \(i\)-th cumulant of the distribution of \(X\) be denoted \({\kappa_i}\).

Using the moment generating function of \(X\), determine the values of \({\kappa_2}\), \({\kappa_3}\) and \({\kappa_4}\).

Solution

Lognormal Distribution

Column

The Lognormal Distribution

A lognormal distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed. Thus, if the random variable \(X\) is log-normally distributed, then \(Y = \ln(X)\) has a normal distribution.

Equivalently, if Y has a normal distribution, then the exponential function of Y, \(X = \exp(Y)\), has a log-normal distribution.

Probability Density Function

\[ f(x) = {\displaystyle {\frac {1}{x\sigma {\sqrt {2\pi }}}}\ \exp \left(-{\frac {\left(\ln x-\mu \right)^{2}}{2\sigma ^{2}}}\right)} \mbox{ for } x>0\]

Mean
\[ { E(X) = \exp \left(\mu +{\frac {\sigma ^{2}}{2}}\right)} \]
Variance:

\[ { \operatorname{Var}(X) = \left[\exp(\sigma ^{2})-1\right]\exp(2\mu +\sigma ^{2})} \]

Skewness:

\[ { \operatorname{Skewness}(X) = (\exp(\sigma ^{2})+2){\sqrt {\exp(\sigma ^{2})-1}}} \]
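These three formulas transcribe directly (a sketch; \(\mu\) and \(\sigma\) are the parameters of the underlying normal, and the example values are ours):

```python
from math import exp, sqrt

def lognormal_moments(mu: float, sigma: float):
    """Mean, variance and skewness of a lognormal(mu, sigma) variable."""
    mean = exp(mu + sigma ** 2 / 2)
    var = (exp(sigma ** 2) - 1) * exp(2 * mu + sigma ** 2)
    skew = (exp(sigma ** 2) + 2) * sqrt(exp(sigma ** 2) - 1)
    return mean, var, skew

m, v, s = lognormal_moments(0.0, 1.0)
print(round(m, 4), round(v, 4), round(s, 4))   # 1.6487 4.6708 6.1849
```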

Worked Examples

Worked Example 1

In a certain chemical plant, the concentration of pollutants (\(X\), in parts per million) has a lognormal distribution with parameters \(\mu = 3.2\) and \(\sigma = 1\).

Exercises
  1. Write the pdf of \(X\)
  2. Compute the mean and variance of \(X\)
  3. Compute the probability that the concentration exceeds 8 parts per million
Solution

Worked Example 2

The random variable \(Y = \ln X\) has a \(N(10, 4)\) distribution. Find

  1. The pdf of \(X\)
  2. The mean and variance of \(X\)
  3. P($ X 1000$)
Solution

Weibull Distributions

Column

Weibull Distribution

The Weibull Distribution

Parameter Estimates

The probability density function of a Weibull random variable is:[1]

\[ f(x;k,\lambda) = \begin{cases} \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}e^{-(x/\lambda)^{k}} & x\geq0 ,\\ 0 & x<0, \end{cases}\]

where \(k > 0\) is the shape parameter and \(\lambda > 0\) is the scale parameter of the distribution.

The cumulative distribution function for the Weibull distribution is \[F(x;k,\lambda) = 1- e^{-(x/\lambda)^k} \mbox{ for } x \geq 0, \qquad F(x;k,\lambda) = 0 \mbox{ for } x < 0.\] The quantile (inverse cumulative distribution) function for the Weibull distribution is \[Q(p;k,\lambda) = \lambda {(-\ln(1-p))}^{1/k} \mbox{ for } 0 \leq p < 1.\] The failure rate \(h\) (or hazard rate) is given by \[ h(x;k,\lambda) = {k \over \lambda} \left({x \over \lambda}\right)^{k-1}.\]
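The cdf and quantile function are inverses of one another, which gives a quick check (a sketch; the example parameter values are ours):

```python
from math import exp, log

def weibull_cdf(x: float, k: float, lam: float) -> float:
    """F(x; k, lam) = 1 - e^(-(x/lam)^k) for x >= 0."""
    return 1 - exp(-((x / lam) ** k)) if x >= 0 else 0.0

def weibull_quantile(p: float, k: float, lam: float) -> float:
    """Q(p; k, lam) = lam * (-ln(1-p))^(1/k) for 0 <= p < 1."""
    return lam * (-log(1 - p)) ** (1 / k)

k, lam = 1.5, 2.0
median = weibull_quantile(0.5, k, lam)
print(round(weibull_cdf(median, k, lam), 6))   # 0.5
```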

Failure rate is the frequency with which an engineered system or component fails, expressed, for example, in failures per hour. It is often denoted by the Greek letter \(\lambda\) (lambda) and is important in reliability engineering.

Weibull distributions

\[ \lambda(t) = \alpha \beta t^{\beta -1} \] \((\lambda , \alpha, \beta > 0)\)

Gompertz Distribution

\[ \lambda(t) = \alpha e^{\beta t} \qquad (\lambda , \alpha > 0, \; \beta \mbox{ real})\]

GTDL \[ \lambda(t) = \frac{\gamma e^{\alpha+\beta t}}{1+e^{\alpha+\beta t}} \qquad (\lambda > 0; \; \alpha , \beta \mbox{ real})\]

When given the hazard function, the properties of the corresponding survival distribution can be deduced

\[ f(t)=\lambda(t)S(t) = \lambda(t) \exp\left(-\int_0^t \lambda(u)\,du\right) \]

These models are called parametric models: \(\lambda(t)\) involves parameters, i.e. unknown constants, and these parameters control the shape and scale variation. The functional forms themselves are wholly specified in parametric models; it is only the constants, not the functions, that are unknown.

Applications

The Weibull distribution is used

  • In survival analysis[8]
  • In reliability engineering and failure analysis
  • In industrial engineering to represent manufacturing and delivery times
  • In extreme value theory
  • In weather forecasting
  • To describe wind speed distributions, as the natural distribution often matches the Weibull shape[9]
  • In communications systems engineering
  • In radar systems to model the dispersion of the received signals level produced by some types of clutters
  • To model fading channels in wireless communications, as the Weibull fading model seems to exhibit good fit to experimental fading channel measurements

  • In general insurance to model the size of reinsurance claims, and the cumulative development of asbestosis losses
  • In forecasting technological change (also known as the Sharif-Islam model)[10]
  • In hydrology, where the Weibull distribution is applied to extreme events such as annual maximum one-day rainfalls and river discharges

(Figure: a cumulative Weibull distribution fitted to ranked annual maximum one-day rainfalls using CumFreq, showing a 90% confidence belt based on the binomial distribution; the rainfall data are represented by plotting positions as part of the cumulative frequency analysis. See also distribution fitting.)

Time to Failure

If the quantity X is a “time-to-failure”, the Weibull distribution gives a distribution for which the failure rate is proportional to a power of time. The shape parameter, k, is that power plus one, and so this parameter can be interpreted directly as follows:

  • A value of k < 1 indicates that the failure rate decreases over time. This happens if there is significant “infant mortality”, or defective items failing early and the failure rate decreasing over time as the defective items are weeded out of the population.

  • A value of k = 1 indicates that the failure rate is constant over time. This might suggest random external events are causing mortality, or failure.

  • A value of k > 1 indicates that the failure rate increases with time. This happens if there is an “aging” process, or parts that are more likely to fail as time goes on.

Pareto Distribution

Column

The Pareto Type I distribution

The Pareto Type I distribution

The Pareto Type I distribution is a continuous distribution, parameterized with the shape parameter \(\alpha > 0\), and location parameter \(x_\mathrm{m} > 0\).

Cumulative distribution function

The cumulative distribution function of a Pareto random variable with parameters \(\alpha\) and \(x_m\) is \[F_X(x) = \begin{cases} 1-\left(\frac{x_\mathrm{m}}{x}\right)^\alpha & \mbox{for } x \ge x_\mathrm{m}, \\ 0 & \mbox{for }x < x_\mathrm{m}. \end{cases} \]

Probability density function

It follows (by differentiation) that the probability density function is \[ f_X(x)= \begin{cases} \alpha\,\dfrac{x_\mathrm{m}^\alpha}{x^{\alpha+1}} & \mbox{for }x \ge x_\mathrm{m}, \\[12pt] 0 & \mbox{for } x < x_\mathrm{m}. \end{cases} \]

Moments

The expected value of a random variable following a Pareto distribution is \[ E(X)= \begin{cases} \infty & \mbox{if }\alpha\le 1, \\ \frac{\alpha x_\mathrm{m}}{\alpha-1} & \mbox{if }\alpha>1. \end{cases} \]

The variance of a random variable following a Pareto distribution is \[ \mathrm{Var}(X)= \begin{cases} \infty & \mbox{if }\alpha\in(1,2], \\ \left(\frac{x_\mathrm{m}}{\alpha-1}\right)^2 \frac{\alpha}{\alpha-2} & \mbox{if }\alpha>2. \end{cases} \] (If \(\alpha\le 1\), the variance does not exist.)
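The moment formulas transcribe directly; note how both moments blow up for small \(\alpha\) (the example values are ours):

```python
from math import inf

def pareto_mean(alpha: float, xm: float) -> float:
    """E(X) for Pareto(alpha, xm); infinite when alpha <= 1."""
    return inf if alpha <= 1 else alpha * xm / (alpha - 1)

def pareto_var(alpha: float, xm: float) -> float:
    """Var(X) for Pareto(alpha, xm); infinite when alpha <= 2."""
    return inf if alpha <= 2 else (xm / (alpha - 1)) ** 2 * alpha / (alpha - 2)

print(pareto_mean(3, 2), pareto_var(3, 2))   # 3.0 3.0
print(pareto_mean(0.5, 2))                   # inf
```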

Approximating Distribution

Column

Poisson Approximation

Poisson Approximation Of The Binomial Distribution
Poisson Approximation of the Binomial
  • The Poisson distribution can sometimes be used to approximate the binomial distribution
  • When the number of observations n is large, and the success probability p is small, the \(\mbox{Bin}(n,p)\) distribution approaches the Poisson distribution with the parameter given by \(m = np\).
  • This is useful since the computations involved in calculating binomial probabilities are greatly reduced.
  • As a rule of thumb, \(n\) should be greater than 50 and \(p\) very small, such that \(np\) is less than 5.
  • If the value of \(p\) is very high, the definitions of what constitutes a “success” or a “failure” can be switched.
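The quality of the approximation can be seen numerically, e.g. comparing \(\mbox{Bin}(100, 0.02)\) with a Poisson with \(m = np = 2\) (the example values are ours):

```python
from math import comb, exp, factorial

n, p = 100, 0.02
m = n * p                                      # Poisson parameter m = np

for k in range(5):
    exact = comb(n, k) * p ** k * (1 - p) ** (n - k)
    approx = m ** k * exp(-m) / factorial(k)
    print(k, round(exact, 4), round(approx, 4))  # the columns agree to within ≈ 0.003
```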

Normal Approximations of the Binomial Distribution

Worked Example 1
In a certain large population 45% of people have blood group A. A random sample of 300 individuals is chosen from this population.

Calculate an approximate value for the probability that more than 115 of the sample have blood group A.
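One way to do this (a sketch with a continuity correction; not necessarily the linked solution): \(X \sim \mbox{Bin}(300, 0.45)\) is approximately \(N(135, 74.25)\), and \(P(X > 115) = P(X \geq 116) \approx P(\mbox{Normal} > 115.5)\).

```python
from math import erf, sqrt

n, p = 300, 0.45
mu = n * p                          # 135.0
sigma = sqrt(n * p * (1 - p))       # ≈ 8.617

z = (115.5 - mu) / sigma            # continuity-corrected boundary
prob = 0.5 * (1 - erf(z / sqrt(2))) # P(Z > z)
print(round(prob, 4))   # 0.9882
```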

Solution

Normal Approximation of Continuous Distributions

Combinations of Normal Random Variables

Column

Worked Examples

Worked Examples 1

My cycle journey to work is 3 km, and my cycling time (in minutes) if there are no delays is distributed N(15, 1), i.e. Normally with mean 15 and variance 1.

Exercises

  1. Find the probability that, if there are no delays, I get to work in at most 17 minutes.

  2. On my route there are three sets of traffic lights. Each time I meet a red traffic light, I am delayed by a random time that is distributed N(0.7, 0.09). These lights operate independently. Find the probability of my getting to work in at most 17 minutes

    1. if just one light is set at red when I reach it,
    2. if just two lights are set at red when I reach them
    3. if all three lights are set at red when I reach them.

  3. Suppose that, for each set of lights, the chance of delay is 0.5. Deduce that the mean value of T, my total journey time, is 16.05 minutes.

  4. Given that \(\operatorname{Var}(T) = 1.5025\), use a suitable approximation to calculate the probability that, over 10 journeys, my average journey time to work is at most 17 minutes.

Solution

Click here for demonstrated solution

Compound Distributions

Column

Compound Poisson Distributions

Compound Poisson Distributions

A compound Poisson distribution is the probability distribution of the sum of a number of independent identically-distributed random variables, where the number of terms to be added is itself a Poisson-distributed variable. In the simplest cases, the result can be either a continuous or a discrete distribution.

When \(N\) is Poisson, the expected value and the variance of the compound distribution are

\[ \operatorname {E} (Y)=\operatorname {E} (N)\operatorname {E} (X) \] \[ \operatorname {Var} (Y)=\operatorname{E}(N)\left(\operatorname {Var} (X)+{\operatorname{E}(X)}^{2}\right)=\operatorname{E}(N)\,{\operatorname{E}(X^{2})}. \]

Worked Examples

Worked Example 1

The number of claims arising in one year from a group of policies follows a Poisson distribution with mean 12. The claim sizes independently follow an exponential distribution with mean $80 and they are independent of the number of claims. The current financial year has six months remaining.

Exercises

  • Calculate the mean and the standard deviation of the total claim amount which arises during the remaining six months of the year.
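A numerical sketch (our check, not the posted solution): over six months \(N\) is Poisson with mean 6, and for an exponential claim size with mean 80, \(\operatorname{E}(X^2) = 2 \times 80^2\).

```python
from math import sqrt

lam = 12 / 2                 # Poisson mean for the remaining six months
ex = 80.0                    # E(X), exponential claim size mean
ex2 = 2 * ex ** 2            # E(X^2) = 2 * mean^2 for an exponential

mean_total = lam * ex        # E(Y) = E(N) E(X)
var_total = lam * ex2        # Var(Y) = E(N) E(X^2) when N is Poisson
print(mean_total, round(sqrt(var_total), 2))   # 480.0 277.13
```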

Solution


Mixed Probability Distributions

Column

Mixed Distributions

Worked Examples

Mixed Probability Distributions

There are two continuous probability distributions:

  • A is an exponential distribution with mean \(\mu = 7\).
  • B is a distribution that is uniform on the interval from 0 to 8, and thereafter proportional to A.

Show that the probability that a random variable following distribution B lies between 0 and 8 is 8/15.

Solution

Click here for demonstrated solution