## Control Charts for Exponential Distributions

Control charts, as defined by Walter A. Shewhart and popularized by W. Edwards Deming, were originally designed for quality characteristics that follow a normal distribution.

In some contexts, the assumption that the quality characteristic is normally distributed is incorrect. This is particularly true for a characteristic such as the resolution time of tickets, which is better modeled by an exponential distribution.

# Normal Distribution vs. Exponential Distribution

The following charts demonstrate the differences between normal and exponential distributions.

• Charts at left show the evolution of a given quality characteristic. The X-Axis can be either a sample ID or some time-related value.
• Upper left chart simulates a normal distribution where Mean = 10 and Standard Deviation = 5.
• Bottom left chart simulates an exponential distribution where Lambda = 1/10.
• Charts at right evaluate the frequency of given ranges of the quality characteristic. For example, nearly 30 samples of the normal distribution have a quality value between 8 and 10.
• Upper right chart shows the characteristic bell curve of a normal distribution.
• Bottom right chart is typical of an exponential distribution, with a rapidly decreasing frequency as the quality characteristic increases.

Illustration of differences between normal and exponential distributions based on simulated data.
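Data like that shown in the charts can be simulated with a few lines of Python. The sketch below uses the parameters quoted above (Mean = 10, Standard Deviation = 5, Lambda = 1/10); the sample size of 200, the seed, and the bin width of 2 are assumptions made for illustration, not values taken from the original charts.

```python
import random
import statistics

def simulate(n=200, seed=42):
    """Draw n samples from each of the two distributions (n and seed are assumptions)."""
    rng = random.Random(seed)
    normal = [rng.gauss(10, 5) for _ in range(n)]               # Mean = 10, Standard Deviation = 5
    exponential = [rng.expovariate(1 / 10) for _ in range(n)]   # Lambda = 1/10
    return normal, exponential

def histogram(values, bin_width=2):
    """Count how many values fall in each [lower, lower + bin_width) range."""
    counts = {}
    for v in values:
        lower = int(v // bin_width) * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))

normal, exponential = simulate()
print("normal mean:", round(statistics.mean(normal), 2))
print("exponential mean:", round(statistics.mean(exponential), 2))
print("exponential histogram:", histogram(exponential))
```

Both samples have a mean close to 10, yet the exponential histogram is heavily skewed toward small values, which is the difference the right-hand charts illustrate.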

Key Properties

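| | Normal Distribution | Exponential Distribution |
|---|---|---|
| Probability Density Function | $\frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ | $\lambda e^{-\lambda x}$ for $x \ge 0$ |
| Cumulative Distribution Function | $\frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]$ | $1 - e^{-\lambda x}$ for $x \ge 0$ |
| Mean | $\mu$ | $1/\lambda$ |
| Median | $\mu$ | $\ln(2)/\lambda$ |
| Variance | $\sigma^2$ | $1/\lambda^2$ |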

# Control Limits for Exponential Distributions

For normal distributions, control limits used to detect special-cause variations are often based on the well-known six-sigma spread around the mean, from μ – 3σ to μ + 3σ.

This is because about 99.7% of values lie within this spread, as shown below:

Probability Density Function of a normal distribution.

What 99.7% Coverage Means for Exponential Distribution

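We have just seen that, for a normal distribution, 99.7% of the values of x (the quality characteristic) lie between μ – 3σ and μ + 3σ.

How does this translate to an exponential distribution? We simply have to solve the following equation for x, using the unrounded three-sigma coverage of 99.73%:

$$P(X \le x) = 1 - e^{-\lambda x} = 0.9973 \quad\Longrightarrow\quad x = -\frac{\ln(1 - 0.9973)}{\lambda} \approx \frac{5.9}{\lambda}$$

About 99.7% of values for an exponential distribution lie between x = 0 and x = 5.9/λ.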

This result is easier to see by plotting the Cumulative Distribution Function of an exponential distribution:

Cumulative Distribution Function of an exponential distribution. P(X <= x) = 99.7% for x = 5.9/λ.
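The 5.9/λ figure can be checked numerically by inverting the exponential CDF. The following is a minimal sketch; the function names are illustrative, and the rate λ = 1/10 is simply reused from the simulated example.

```python
import math

def exponential_cdf(x, lam):
    """P(X <= x) for an exponential distribution with rate lam."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def exponential_quantile(p, lam):
    """Value x such that P(X <= x) = p, i.e. the inverse of the CDF."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return -math.log(1.0 - p) / lam

lam = 1 / 10                               # rate used in the simulated example
ucl = exponential_quantile(0.9973, lam)    # 99.73% is the unrounded three-sigma coverage
print("UCL as a multiple of 1/lambda:", round(ucl * lam, 2))
print("coverage at UCL:", round(exponential_cdf(ucl, lam), 4))
```

The multiplier does not depend on λ, which is why a single constant (about 5.9) can be used for any exponential process.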

Summary of Control Limits Based On The 99.7% Rule

| | Normal Distribution | Exponential Distribution |
|---|---|---|
| LCL (Lower Control Limit) | μ – 3σ | 0 |
| UCL (Upper Control Limit) | μ + 3σ | 5.9/λ |

Generalization of the Three-Sigma Rule

| Range (Normal Distribution) | Population in Range | Range (Exponential Distribution) |
|---|---|---|
| μ ± 1σ | 68% | [0; 1.1/λ] |
| μ ± 2σ | 95% | [0; 3/λ] |
| μ ± 3σ | 99.7% | [0; 5.9/λ] |

Three-Sigma Rule applied to normal distribution.

Three-Sigma Rule adapted to exponential distribution.
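The mapping from a k-sigma range to an exponential range can be recomputed as a rough sketch: take the normal coverage of μ ± kσ from the error function, then invert the exponential CDF at that coverage. The helper names are illustrative. With unrounded coverages the multipliers come out at about 1.15, 3.09 and 5.91, close to the rounded values in the table (the table's 3/λ corresponds to a coverage rounded to 95%).

```python
import math

def normal_coverage(k):
    """Fraction of a normal population lying within mu +/- k*sigma."""
    return math.erf(k / math.sqrt(2))

def exponential_multiplier(coverage):
    """c such that P(0 <= X <= c/lambda) = coverage for an exponential X."""
    return -math.log(1.0 - coverage)

for k in (1, 2, 3):
    cov = normal_coverage(k)
    print(f"mu ± {k}σ  {cov:.2%}  [0; {exponential_multiplier(cov):.2f}/λ]")
```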

# Tutorial: Statistical Process Control Based on Resolution Time of Tickets

Context: Customers submit requests through JIRA tickets to the IT team.

Goal #1: The team wants to compute several key characteristics of its ticket resolution process, based on historical data.

Goal #2: The team wants to analyze variations of resolution time, separate common-cause variations from special-cause variations, and try to reduce variations in order to improve predictability and ultimately gain trust from customers.

## Step 1 – Get Data from JIRA

Browse JIRA and get the latest 50 tickets your team has resolved.

Example of JQL:
`project = GTS AND (resolution = Fixed OR resolution = Done) ORDER BY resolved DESC`

Export your selection with all fields to Microsoft Excel. Format the worksheet to obtain something like this:

Fields of interest are mainly "Created" and "Resolved". "Key" is informative only.

Add a new column “Resolution Time”, which is simply the difference between “Resolved” and “Created”.

When subtracting two dates, Excel automatically returns the duration expressed in days.
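Outside Excel, the same subtraction can be sketched in Python. The rows, ticket keys, and timestamp format below are hypothetical; the format of an actual JIRA export depends on the instance's settings.

```python
from datetime import datetime

# Hypothetical rows standing in for the exported "Key", "Created" and "Resolved" columns.
tickets = [
    ("EXAMPLE-1", "2015-03-02 09:15", "2015-03-03 17:45"),
    ("EXAMPLE-2", "2015-03-04 14:00", "2015-03-04 16:00"),
]

def resolution_time_days(created, resolved, fmt="%Y-%m-%d %H:%M"):
    """Elapsed time between two timestamps as a fractional number of days."""
    delta = datetime.strptime(resolved, fmt) - datetime.strptime(created, fmt)
    return delta.total_seconds() / 86400  # 86,400 seconds in a day

for key, created, resolved in tickets:
    print(key, round(resolution_time_days(created, resolved), 2))
```

As in Excel, the result is a fractional number of days, so a ticket resolved in two hours yields roughly 0.08.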

## Step 2 – Check if Hypothesis of Exponential Distribution is Correct

On a new worksheet, create a table computing the probability that the resolution time is less than or equal to x, for about a hundred values of x (so that you capture frequencies for short resolution times).

You may use a linear scale, such as 1, 2, 3, 4 days, and so on, or an exponential one. The advantage of an exponential scale is that it captures the numerous small resolution times and therefore increases the precision of the chart below.

To compute an exponential scale, you can use the following formula:

"Index" is the index of the point on the chart, for example 0, 1, 2, ... 100.

In our example, we have:

• max(resolution time): 55 days
• max(index): 100 (i.e. how many points we want to have on the chart)

Therefore, the formula simplifies to:

This formula produces a simple exponential scale in Excel. Note that f(0) = 0 and f(100) = 55.
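The formula itself is not reproduced in the text here. As an assumption, one scale that satisfies the two stated constraints, f(0) = 0 and f(100) = 55, is f(index) = (max + 1)^(index / max(index)) − 1. The sketch below implements that assumed formula; it may differ from the one originally used.

```python
def exponential_scale(index, max_time=55.0, max_index=100):
    """Assumed scale: f(0) = 0 and f(max_index) = max_time, with geometric spacing in between."""
    return (max_time + 1.0) ** (index / max_index) - 1.0

scale = [exponential_scale(i) for i in range(101)]
print(round(scale[0], 3), round(scale[50], 3), round(scale[100], 3))
```

With this choice, half of the points fall below roughly 6.5 days, so short resolution times are sampled much more densely than long ones. In Excel, the equivalent under the same assumption would be along the lines of `=POWER(56, A2/100) - 1`, with the index in column A.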

Using the formula above, compute an exponential scale in the spreadsheet.

An exponential scale helps to capture the lowest resolution times.

Add a column that computes the probability.

This table approximates an empirical Cumulative Distribution Function computed directly from the real data.

Plot the computed probability on a chart and check that the curve can be approximated by the Cumulative Distribution Function of an exponential distribution.
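The comparison between the empirical curve and the exponential CDF can also be sketched in code. The resolution times below are hypothetical and serve only to illustrate the computation; the rate is estimated as the inverse of the sample mean, and the maximum gap is a crude indicator rather than a formal goodness-of-fit test.

```python
import math

def empirical_cdf(samples, x):
    """Fraction of samples that are less than or equal to x."""
    return sum(1 for s in samples if s <= x) / len(samples)

def exponential_cdf(x, lam):
    """Theoretical P(X <= x) for an exponential distribution with rate lam."""
    return 1.0 - math.exp(-lam * x)

# Hypothetical resolution times in days, for illustration only.
times = [0.1, 0.3, 0.4, 0.8, 1.0, 1.5, 2.2, 3.0, 4.5, 9.0]
lam = len(times) / sum(times)   # rate estimated as the inverse of the sample mean

for x in (0.5, 1, 3, 10):
    print(x, round(empirical_cdf(times, x), 2), round(exponential_cdf(x, lam), 2))

# Largest gap between the two curves at the sample points.
max_gap = max(abs(empirical_cdf(times, t) - exponential_cdf(t, lam)) for t in times)
print("largest gap:", round(max_gap, 2))
```

If the two columns stay close to each other over the whole range, the exponential hypothesis is a reasonable working approximation.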

This chart provides a lot of information. For example, we can see that the probability a ticket is resolved within three days is 80%.

## Step 3 – Compute Control Limits

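Return to the previous sheet containing the original data. Compute the average resolution time:

$$\bar{t} = \frac{1}{n} \sum_{i=1}^{n} t_i$$

where $t_i$ is the resolution time of ticket $i$ and $n$ is the number of tickets. In this example, the average is 2.62 days.

You may also use the standard AVERAGE() function of Excel.

We can estimate λ by computing:

$$\hat{\lambda} = \frac{1}{\bar{t}} = \frac{1}{2.62} \approx 0.38 \text{ per day}$$

This uses a property of exponential distributions: the maximum likelihood estimate of the rate parameter λ is the inverse of the sample average.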

Now that we have found the average resolution time, we can define the control limits as:

• LCL: 0 (resolution time cannot be negative)
• UCL: 5.9/λ = 5.9 × 2.62 = 15.5 days
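The computation in this step can be sketched as a small function. The sample data is hypothetical and was chosen only so that its mean matches the 2.62 days of the tutorial; the function name and the default multiplier argument are illustrative.

```python
def control_limits(resolution_times, multiplier=5.9):
    """Return (mean, lambda estimate, LCL, UCL) for exponentially distributed durations."""
    mean = sum(resolution_times) / len(resolution_times)
    lam = 1.0 / mean            # maximum likelihood estimate of the rate
    lcl = 0.0                   # a resolution time cannot be negative
    ucl = multiplier / lam      # equivalent to multiplier * mean
    return mean, lam, lcl, ucl

# Hypothetical data whose mean is 2.62 days.
times = [0.5, 1.2, 2.0, 3.4, 6.0]
mean, lam, lcl, ucl = control_limits(times)
print(round(mean, 2), round(lam, 2), lcl, round(ucl, 1))
```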

## Step 4 – Draw Control Chart

Create a chart that includes the UCL computed previously. In this tutorial, tickets are ordered by resolution date.

Annotate your chart to show which tickets are above UCL.
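Finding the tickets to annotate amounts to filtering the points that exceed the UCL. A minimal sketch, using hypothetical ticket keys and resolution times:

```python
def out_of_control(points, ucl):
    """Return the (key, resolution_time) pairs whose resolution time exceeds the UCL."""
    return [(key, t) for key, t in points if t > ucl]

# Hypothetical (ticket key, resolution time in days) pairs, ordered by resolution date.
points = [("T-1", 0.4), ("T-2", 2.1), ("T-3", 21.0), ("T-4", 1.3)]
flagged = out_of_control(points, ucl=15.5)
print(flagged)  # [('T-3', 21.0)]
```

Because the LCL is 0, only the upper limit needs to be tested.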

## Step 5 – Reflect and Adjust

Assuming that we can approximate the distribution of resolution time by an exponential distribution with λ = 1/2.62 = 0.38 per day, ticket GTS-1104 had only a 0.3% probability of exceeding UCL = 15.5 days. We can therefore consider the resolution time of this ticket to be “out-of-process”.
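As a quick check of the 0.3% figure, the tail probability of an exponential distribution is $P(X > x) = e^{-\lambda x}$, which gives, with the values used in this tutorial:

$$P(X > \mathrm{UCL}) = e^{-\lambda \cdot \mathrm{UCL}} = e^{-15.5/2.62} \approx e^{-5.9} \approx 0.0027$$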

Perform root-cause analysis to find the special causes for these tickets. Can you prevent these special causes? How likely are they to recur? Can you devise countermeasures? If yes, use a PDCA process to implement them properly.


### 3 Responses to Control Charts for Exponential Distributions

1. azheglov says:

I believe you overlooked the data-mining adjustment. The p-value of 0.003 for the ticket #1104 would be statistically significant if you only had that one item. But you “mined” it from a set of 50 items, so the probability that one of them would take that long is: