This is an archived course. A more recent version may be available at ocw.mit.edu.

3.2 Review of Probability and Statistics

3.2.3 Distributions

Measurable Outcome 3.3, Measurable Outcome 3.4

In the first unit, we were introduced to the concept of a random variable, where we distinguished it from a deterministic variable. The first step in an engineering analysis is to identify the random variables present in the system. Then, we need to characterize these random variables, i.e. specify appropriate models of their uncertain behavior.

Random variables can be characterized by assigning them a probability distribution. A discrete random variable is characterized by its probability mass function, which assigns probabilities to the event that the random variable takes certain values. A continuous random variable is characterized by its probability density function, which assigns probabilities to the event that the random variable lies in a certain interval. We first discuss probability mass functions, and then move on to probability density functions.

Probability Mass Functions

Consider a discrete random variable \(X\), and the event \(A\) that \(X\) is equal to a specific value \(x\). The probability of \(A\) can be written as,

\[P\{ A\} =P\{ X=x\} =p_ X(x)\] (3.3)

where \(p_ X(x)\) is called the probability mass function of \(X\). Note that since \(p_ X(x)\) is a probability value, it always lies between zero and one, and it is zero at every value that the random variable \(X\) can never take.

Example 3: Consider the rotor blades in example 1 of unit 1. Suppose the total number of rotor blades was 4. The random variable \(N_ R\) is a discrete random variable, which can take 5 different values. A possible probability mass function for \(N_ R\) is as follows,

\[p_{N_ R}(x) = \left\{ \begin{array}{ll} 0.6 & \mbox{for } x = 0, \\[0.1in] 0.2 & \mbox{for } x = 1, \\[0.1in] 0.1 & \mbox{for } x = 2, \\[0.1in] 0.05 & \mbox{for } x = 3, \\[0.1in] 0.05 & \mbox{for } x = 4, \\[0.1in] 0 & \mbox{for } x \neq \{ 0,1,2,3,4\} \end{array}\right.\] (3.4)

Note that since the \(p_{N_ R}(x)\) are probability values, they must sum to 1 over all values of \(x\).
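The two requirements on a probability mass function, non-negativity and summing to one, can be checked directly for the PMF in Equation (3.4). A minimal sketch in Python; the dictionary name `pmf` is purely illustrative:

```python
# PMF of N_R from Equation (3.4): number of rotor blades to replace
pmf = {0: 0.6, 1: 0.2, 2: 0.1, 3: 0.05, 4: 0.05}

# A valid PMF is non-negative everywhere...
assert all(p >= 0 for p in pmf.values())

# ...and its values sum to one over all x
assert abs(sum(pmf.values()) - 1.0) < 1e-12

# The PMF gives probabilities of single values directly, e.g. P{N_R = 2}
print(pmf[2])  # 0.1
```

Any value of \(x\) not listed in the dictionary has zero probability, matching the last case of Equation (3.4).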

Probability Density Functions

A probability density function (PDF) is used to describe the probability of a continuous random variable being in some range. In particular, consider a random variable \(X\), and the event \(A\) that it lies between the numbers \(a\) and \(b\), i.e., \(a \leq X \leq b\). The probability of the event \(A\) can be written as,

\[P\{ A\} = P\{ a \leq X \leq b\} = \int \limits _{a}^{b} f_ X(x) dx,\] (3.5)

where \(f_ X(x)\) is called the probability density of \(X\). Note that unlike the probability mass function, the probability density function \(f_ X(x)\), by itself, does not give the probability of an event occurring. Indeed, \(f_ X(x)\) can be greater than one for a given value of \(x\); however, it can never be less than zero for any \(x\).

A common (and probably the simplest) distribution is the uniform distribution. In this case, the probability density is constant within some range and zero outside of this range,

\[f_ X(x) = \left\{ \begin{array}{ll} \frac{1}{b-a} & \mbox{for } a \leq x \leq b, \\[0.1in] 0, & \mbox{otherwise}. \end{array}\right.\] (3.6)

We would say that the random variable \(X\) is uniformly distributed, and denote it as \(X \sim U(a,b)\). Other distribution types are described later in the unit.
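The uniform density in Equation (3.6) is simple enough to code directly. The sketch below, with illustrative endpoints \(a = 2\) and \(b = 5\), evaluates the density and draws samples from \(U(a,b)\) using the standard library:

```python
import random

a, b = 2.0, 5.0  # illustrative interval endpoints

def uniform_pdf(x, a, b):
    """Density of U(a, b) from Equation (3.6)."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

# The density is constant, 1/(b-a), inside [a, b] and zero outside
assert uniform_pdf(3.0, a, b) == 1.0 / (b - a)
assert uniform_pdf(6.0, a, b) == 0.0

# Draw samples from X ~ U(a, b); every sample falls inside [a, b]
samples = [random.uniform(a, b) for _ in range(1000)]
assert all(a <= x <= b for x in samples)
```

Note that for an interval narrower than one unit, \(1/(b-a)\) exceeds one, illustrating the earlier remark that a density value is not itself a probability.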

Cumulative Distribution Functions

The cumulative distribution function (CDF) of a random variable \(X\) is defined as the probability of the event that \(X \leq x\). Specifically,

\[F(x) \equiv P\{ X \leq x\}\] (3.7)

The CDF and PDF of \(X\) are related as follows,

\[F(a) = \int ^{a}_{-\infty } f_ X(x)\, dx\] (3.8)

Thus, we can show,

\[F(b) - F(a) = \int ^{b}_{a} f_ X(x)\, dx.\] (3.9)

Furthermore, this implies that

\[f_ X = \frac{dF}{dx}\] (3.10)
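Equations (3.9) and (3.10) can both be verified numerically for the uniform distribution, whose CDF has the closed form \(F(x) = (x-a)/(b-a)\) on \([a,b]\). A sketch with illustrative endpoints \(a = 0\), \(b = 4\):

```python
def uniform_cdf(x, a, b):
    """CDF of U(a, b): the integral of Equation (3.6) up to x."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

a, b = 0.0, 4.0  # illustrative endpoints

# Equation (3.9): F(b) - F(a) equals the integral of f_X over [a, b],
# which is 1 since [a, b] is the uniform density's entire support
assert uniform_cdf(b, a, b) - uniform_cdf(a, a, b) == 1.0

# Equation (3.10): f_X = dF/dx, approximated by a central finite difference
h = 1e-6
x0 = 2.0
deriv = (uniform_cdf(x0 + h, a, b) - uniform_cdf(x0 - h, a, b)) / (2 * h)
assert abs(deriv - 1.0 / (b - a)) < 1e-6  # density is 1/(b-a) = 0.25
```

The finite-difference slope recovers the constant density \(1/(b-a)\) at any interior point of \([a,b]\).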

Some Common Types of Distributions

The normal (or Gaussian) distribution is,

\[f_ X(x) = \frac{1}{\sigma _ x\sqrt {2\pi }}e^{-(x-\mu _ x)^2/2\sigma _ x^2}\] (3.11)

We will use the common notation \(X \sim \mathcal{N}(\mu ,\sigma ^2)\) to indicate that \(X\) is a normally-distributed random variable with mean \(\mu\) and variance \(\sigma ^2\).
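The normal density in Equation (3.11) can be evaluated directly. The sketch below uses illustrative parameters \(\mu = 0\), \(\sigma = 0.2\); with such a small \(\sigma\) the peak density exceeds one, which again shows that a density value is not a probability, even though the density still integrates to one:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) from Equation (3.11)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 0.0, 0.2  # illustrative parameters

# The peak value 1/(sigma*sqrt(2*pi)) is about 1.99 here, greater than 1
assert normal_pdf(mu, mu, sigma) > 1.0

# Yet the density integrates to 1 (midpoint rule over +/- 10 sigma,
# beyond which the tail probability is negligible)
n, lo, hi = 100000, mu - 10 * sigma, mu + 10 * sigma
dx = (hi - lo) / n
total = sum(normal_pdf(lo + (i + 0.5) * dx, mu, sigma) * dx for i in range(n))
assert abs(total - 1.0) < 1e-6
```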

Exercise 1

Let \(N_ R\) be distributed as in example 3. For example 1, let \(B\) be the event that the number of blades to be replaced is more than 2. Then \(P\{ B\}\) is

Answer:

\(P\{ B\}\) = \(P\{ N_ R > 2\}\) = \(P\{ N_ R = 3 \text { or } N_ R = 4\}\) = \(p_{N_ R}(3) + p_{N_ R}(4)\) = 0.05 + 0.05 = 0.1
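The calculation in Exercise 1 amounts to summing the PMF over the values in the event. A minimal sketch:

```python
# PMF of N_R from Example 3 (Equation 3.4)
pmf = {0: 0.6, 1: 0.2, 2: 0.1, 3: 0.05, 4: 0.05}

# P{B} = P{N_R > 2} = p(3) + p(4)
p_B = sum(p for k, p in pmf.items() if k > 2)
assert abs(p_B - 0.1) < 1e-12
```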

Exercise 2

Let \(X\) be a continuous random variable, such that \(a \leq X \leq b\). Then the probability that \(X=\frac{a+b}{2}\) is:

Answer:

For a continuous random variable, the probability of the random variable taking any single value is 0; only events where the random variable lies in an interval of non-zero length have non-zero probability. Another way to see this is that

\[P\{ \frac{a+b}{2} \leq X \leq \frac{a+b}{2}\} = \int _{\frac{a+b}{2}}^{\frac{a+b}{2}} f_ X(x) \, dx = 0\] (3.12)

by the definition of a definite integral.

Exercise 3

Let \(X\) be the same random variable as in Exercise 2. Then \(F(b)\) is:

Answer:

The density of \(X\), \(f_ X(x)\), is non-zero only in the interval \(\{ a \leq X \leq b \}\), and 0 everywhere else. By definition, \(F(b)=\int _{-\infty }^{b} f_ X(x) \, dx\). Since \(f_ X(x)\) is zero outside \(\{ a \leq X \leq b\}\), this equals \(\int _{a}^{b} f_ X(x) \, dx = 1\), so \(F(b) = 1\).