The Memoryless Property of the Exponential Distribution

Author

Patrick Talbot

The Exponential Distribution

The exponential distribution is one of the best-known continuous distributions in statistics. Like the normal and chi-squared distributions, it has been studied extensively and has several interesting properties. One feature that makes it useful for modeling in certain applications is the memoryless property, sometimes called the forgetfulness property.

Before we investigate the memoryless property, let us give the probability density function of the exponential distribution (it will come in handy when we prove that the exponential distribution has the forgetfulness property). If \(X\) is a continuous random variable with an exponential distribution that has parameter \(\beta\) (the scale parameter, which is also the mean of the distribution), then we write \(X\) ~ exponential(\(\,\beta\,\)) and the probability density function of \(X\) is given by [4]

\[ f(x) = \frac{1}{\beta}e^{-\frac{x}{\beta}}\;,\:x\geq0,\:\beta>0. \]

Sometimes, the probability density function of the exponential distribution is given with \(\frac{1}{\beta}=\lambda\), where \(\lambda\) is now called the rate parameter. We then write \(X\) ~ exponential(\(\lambda\)) and the probability density function is [1]

\[ f(x)=\lambda{e}^{-\lambda{x}}\,,\,x\geq0,\,\lambda>0. \]

Below is a graph depicting the exponential distribution for various values of \(\lambda\).
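The shape of the density for different rates can also be checked numerically. The following is a minimal sketch, assuming the rate parameterization \(f(x)=\lambda e^{-\lambda x}\) given above; the function name `exponential_pdf` and the particular values of \(\lambda\) and \(x\) are illustrative choices, not values taken from the original graph.

```python
import math

def exponential_pdf(x, lam):
    """Density f(x) = lam * exp(-lam * x) for x >= 0, and 0 otherwise."""
    if lam <= 0:
        raise ValueError("rate parameter must be positive")
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

# Illustrative rates and evaluation points (assumptions, not from the source).
rates = [0.5, 1.0, 2.0]
xs = [0.0, 0.5, 1.0, 2.0, 4.0]

for lam in rates:
    row = ", ".join(f"{exponential_pdf(x, lam):.3f}" for x in xs)
    print(f"lambda = {lam}: {row}")
```

Each curve starts at height \(\lambda\) when \(x=0\) and decays toward zero, and a larger \(\lambda\) gives a faster decay.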

The Memoryless Property

With the density in hand, we can state the memoryless property and then prove it. The forgetfulness property is an equation that relates a conditional probability to an unconditional probability. It states that if \(X\) ~ exponential(\(\,\beta\,\)), then for all \(a\geq0\) and \(b\geq0\) [4]

\[ P[\:X>a+b\:|\:X>a\:]=P[\:X>b\:]. \tag{1}\]

Note that this relationship is not a feature of all distributions. However, it is also not unique to the exponential distribution. For example, the discrete geometric distribution has the memoryless property as well [2], with \(a\) and \(b\) restricted to non-negative integers (in fact, the exponential distribution is the only continuous distribution with this feature, and the geometric distribution is the only discrete one [5]). We now prove this feature of the exponential distribution.

Proof: (Based on proof given in [3]) To prove the forgetfulness property, we will show that both sides of equation (1) give the same result. To begin, we consider the left-hand side of the equation:

\[\begin{align*} P[\:X>a+b\:|\:X>a\:]&=\frac{P[ X > a + b \cap X > a ]}{P[ X > a ]}\\\\ &=\frac{P[ X > a + b ]}{P[ X > a ]}&&{(\text{since}\;X>a+b\;\cap\,X>a\;\text{is}\;X>a+b)}\\\\ &=\frac{\int_{a+b}^{\infty}\frac{1}{\beta}e^{-\frac{x}{\beta}}dx}{\int_{a}^{\infty}\frac{1}{\beta}e^{-\frac{x}{\beta}}dx}\\\\ &=\frac{\lim_{t\to\infty}\left[-e^{-\frac{x}{\beta}}\right]_{a+b}^t}{\lim_{k\to\infty}\left[-e^{-\frac{x}{\beta}}\right]_{a}^k}\\\\ &=\frac{\lim_{t\to\infty}\left(-e^{-\frac{t}{\beta}}-\left(-e^{-\frac{a+b}{\beta}}\right)\right)}{\lim_{k\to\infty}\left(-e^{-\frac{k}{\beta}}-\left(-e^{-\frac{a}{\beta}}\right)\right)}\\\\ &=\frac{0\:+\:e^{-\frac{a+b}{\beta}}}{0\:+\:e^{-\frac{a}{\beta}}}\\\\ &=\frac{e^{-\frac{a}{\beta}}\:\cdot\:e^{-\frac{b}{\beta}}}{e^{-\frac{a}{\beta}}}\\\\ &=e^{-\frac{b}{\beta}}\:. \end{align*}\]

Now, we compute the right-hand side of equation (1):

\[\begin{align*} P[\:X>b\:]&=\int_{b}^{\infty}\frac{1}{\beta}e^{-\frac{x}{\beta}}dx\\\\ &=\lim_{t\to\infty}\left[-e^{-\frac{x}{\beta}}\right]_b^t\\\\ &=\lim_{t\to\infty}\left(-e^{-\frac{t}{\beta}}-\left(-e^{-\frac{b}{\beta}}\right)\right)\\\\ &=0\:+\:e^{-\frac{b}{\beta}}\\\\ &=e^{-\frac{b}{\beta}}\\\\ &=P[\:X>a+b\:|\:X>a\:]\,. \end{align*}\]

Thus, the exponential distribution does indeed have the memoryless property.

\(\square\)
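The result of the proof can be sanity-checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: the function names and the values of \(a\), \(b\), and \(\beta\) are made up for the example, and the simulation relies on Python's `random.expovariate`, which takes the rate \(\lambda=\frac{1}{\beta}\) rather than \(\beta\) itself.

```python
import math
import random

def survival(x, beta):
    """P[X > x] for X ~ exponential(beta), where beta is the mean."""
    return math.exp(-x / beta)

def conditional_survival(a, b, beta):
    """P[X > a + b | X > a], computed as the ratio used in the proof."""
    return survival(a + b, beta) / survival(a, beta)

def simulate_conditional(a, b, beta, n=200_000, seed=0):
    """Monte Carlo estimate of P[X > a + b | X > a]."""
    rng = random.Random(seed)
    # expovariate expects the rate lambda = 1 / beta
    draws = [rng.expovariate(1.0 / beta) for _ in range(n)]
    survivors = [x for x in draws if x > a]
    return sum(x > a + b for x in survivors) / len(survivors)

# Illustrative values (assumptions for this example only).
a, b, beta = 1.5, 2.0, 3.0
print("exact conditional  :", conditional_survival(a, b, beta))
print("exact unconditional:", survival(b, beta))
print("simulated estimate :", simulate_conditional(a, b, beta))
```

The two exact values agree up to floating-point rounding, and the simulated estimate should fall close to both.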

Interpretation

We can now understand why this feature is called the memoryless, or forgetfulness, property. Equation (1) tells us that the probability of our random variable \(X\) being greater than \(a+b\), conditional on the fact that \(X\) is greater than \(a\), equals the probability that \(X\) is greater than \(b\). This shows that the conditional aspect of the probability can be removed, provided that we subtract \(a\) from \(a+b\) in the inequality. If we think of \(X\) as a waiting time, then knowing that we have already waited \(a\) units without the event occurring has no effect on the probability that we must wait more than \(b\) additional units. Thus, the portion from \(0\) to \(a\) is irrelevant, and so the exponential distribution has essentially “forgotten” about what happened there; it has no memory of what has already occurred. We can then calculate \(P[\:X>a+b\:|\:X>a\:]\) without using the fact that \(X>a\).

If we think about this using an example, the idea may be easier to grasp. Say that we have a lock with a one-digit numerical code, and that someone is guessing the code at random. Now imagine that this person has no memory, so that they do not remember what codes they have already attempted. Then each time that they guess the code (at random), they have the same probability of unlocking the lock (namely, \(\frac{1}{10}\)). Our random variable \(X\) in this case is the number of attempts that it takes this person to guess the code, which follows a geometric distribution, the discrete counterpart of the exponential distribution. In this situation, knowing that the person's first \(a\) attempts have failed does not change the probability that they need more than \(b\) further guesses to get the correct code. The probability of needing more than three guesses from the start is the same as the probability of needing more than three further guesses after eight failed attempts, since the person does not remember their first eight guesses. Thus, the probability \(P[\:X>a+b\:|\:X>a\:]\) is the same as the probability \(P[\:X>b\:]\), and so our random variable \(X\) has the memoryless property (this example is based on one given in [5]).
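The lock example can be checked with exact arithmetic. This is a minimal sketch, assuming that \(X\) counts the guesses up to and including the first correct one, so that \(P[\:X>n\:]=\left(\frac{9}{10}\right)^n\) (the first \(n\) guesses all fail); the function names are illustrative.

```python
from fractions import Fraction

P_SUCCESS = Fraction(1, 10)  # chance that any single random guess is correct

def prob_more_than(n, p=P_SUCCESS):
    """P[X > n]: the first n guesses all fail."""
    return (1 - p) ** n

def conditional_prob(a, b, p=P_SUCCESS):
    """P[X > a + b | X > a] as a ratio of exact fractions."""
    return prob_more_than(a + b, p) / prob_more_than(a, p)

# More than three further guesses after eight failures, versus from scratch.
print(conditional_prob(8, 3))   # 729/1000
print(prob_more_than(3))        # 729/1000
```

Because the fractions are exact, the two probabilities are identical rather than merely close.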

We have seen an example in which the memoryless property is appropriate. Let us now present an example where it is not, to give some contrast. Consider a human being, and think of our random variable \(X\) as being the person's total lifetime in years. The probability that this person will live more than fifteen years from birth is quite different from the probability that this person will live more than fifteen additional years given that they have already lived for more than eighty years. Using mathematical expressions, we have that \(P[\:X>15+80\:|\:X>80\:]\neq{P[\:X>15\:]}\). Thus, our random variable \(X\) in this situation does not have the memoryless property.
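To see such a failure of memorylessness in numbers, one can use any distribution whose failure risk grows with age. The sketch below uses a Weibull distribution with shape parameter greater than 1 purely as a stand-in; the distribution and its parameters are assumptions chosen for illustration and are not fitted to real mortality data.

```python
import math

def weibull_survival(x, scale, shape):
    """P[X > x] for a Weibull distribution; shape > 1 means risk grows with age."""
    return math.exp(-((x / scale) ** shape))

# Hypothetical parameters, chosen only to illustrate aging (not real data).
SCALE, SHAPE = 80.0, 5.0

# P[X > 15]: surviving more than fifteen years from birth.
p_new = weibull_survival(15, SCALE, SHAPE)
# P[X > 95 | X > 80]: surviving more than fifteen further years at age eighty.
p_old = weibull_survival(95, SCALE, SHAPE) / weibull_survival(80, SCALE, SHAPE)

print(f"P[X > 15]          = {p_new:.4f}")
print(f"P[X > 95 | X > 80] = {p_old:.4f}")
```

With these assumed parameters the conditional probability is much smaller than the unconditional one. Setting the shape to 1 turns the Weibull distribution into an exponential distribution, and the two probabilities coincide again.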

Applications

Now that we have shown why the exponential distribution has the forgetfulness property, let us investigate how it is used in applications.

One use for the memoryless property involves certain components of electrical circuits. For example, the remaining lifetime of an electrical fuse can be modeled using the exponential distribution and its forgetfulness property. This is because the probability of an electrical fuse lasting at least a certain additional amount of time does not depend on how long it has already been in use. That is, the probability that a fuse will last a total of at least \(x\) units of time, given that it has already been in use for at least \(y\) units of time (with \(x\geq{y}\)), is equal to the probability that a new fuse will last at least \(x-y\) units of time [4]. This suggests that an electrical fuse doesn’t experience much wear over time. This is in contrast to many objects in our daily lives: car parts, appliances, and clothing are just a few examples of things that wear out and have a higher and higher likelihood of breaking down as they get used over time.
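As a worked illustration, suppose (hypothetically) that a fuse's lifetime \(X\), measured in hours, is exponential with mean \(\beta=1000\). Taking \(a=500\) and \(b=1000\) in equation (1), the probability that a fuse which has already survived 500 hours lasts beyond 1500 hours is

\[ P[\:X>1500\:|\:X>500\:]=P[\:X>1000\:]=e^{-\frac{1000}{1000}}=e^{-1}\approx0.368, \]

which is the same probability that a brand-new fuse lasts beyond 1000 hours. The value of \(\beta\) here is an assumption made only for the example.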

Another application relates to radioactive decay. If we consider the nucleus of a single atom, then the probability that the nucleus decays during any time interval of a fixed length, given that it has not yet decayed, remains constant through time. Thus, the fact that the nucleus hasn’t undergone radioactive decay in a certain length of time does not change the probability of it taking more than an additional \(x\) amount of time to decay [6].
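This connects directly to the familiar idea of a half-life. As a sketch, assume the time \(X\) until a nucleus decays is exponential with rate \(\lambda\) (the decay constant), and write \(t_{1/2}\) for the half-life and \(s\geq0\) for the time already elapsed; these symbols are introduced here for illustration. Then

\[ P[\:X>t\:]=e^{-\lambda{t}},\qquad t_{1/2}=\frac{\ln2}{\lambda},\qquad P[\:X>s+t_{1/2}\:|\:X>s\:]=P[\:X>t_{1/2}\:]=e^{-\ln2}=\frac{1}{2}. \]

In words, no matter how long a nucleus has already survived, the chance that it survives one more half-life is always one half.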

Overall, the memoryless property of the exponential distribution is an interesting feature that doesn’t occur with most other distributions. Although the memoryless property only applies in certain cases, it is still useful for modeling objects like electrical fuses and phenomena such as radioactive decay of particles.

References

[1] Spicker, D. (2024). STAT 4243: Mathematical Statistics Review [Lecture slides]. D2L Brightspace. https://lms.unb.ca/d2l/le/content/240524/viewContent/2708390/View?ou=240524

[2] Stewart, C. (2023). STAT 3793 Notes6_Geometric Distribution [Lecture notes]. D2L Brightspace. (URL no longer available)

[3] Stewart, C. (2023). Notes 13 Examples [Lecture notes]. D2L Brightspace. (URL no longer available)

[4] Stewart, C. (2023). STAT 3793 Notes13_Exponential and Chi-squared Distributions [Lecture notes]. D2L Brightspace. (URL no longer available)

[5] Wikipedia. (2024, January 23). Memorylessness. https://en.wikipedia.org/wiki/Memorylessness

[6] Wikipedia. (2024, January 28). Radioactive decay. https://en.wikipedia.org/wiki/Radioactive_decay#Universal_law