In some contexts, the assumption that the quality characteristic is normally distributed is incorrect. This is particularly true for a characteristic such as the resolution time of tickets, whose distribution is better represented by an exponential distribution.
This article is an attempt to adapt Control Charts to exponential distributions.
Normal Distribution vs. Exponential Distribution
The following charts demonstrate the differences between normal and exponential distributions.
- Charts at left show the evolution of a given quality characteristic. The X-Axis can be either a sample ID or some time-related value.
- Upper left chart simulates a normal distribution where Mean = 10 and Standard Deviation = 5.
- Bottom left chart simulates an exponential distribution where Lambda = 1/10.
- Charts at right evaluate the frequency of given ranges of the quality characteristic. For example, nearly 30 samples of the normal distribution have a quality value between 8 and 10.
- Upper right chart shows the characteristic bell curve of a normal distribution.
- Bottom right chart is typical of an exponential distribution, with a rapidly decreasing frequency as the quality characteristic increases.
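The two simulated series described above can be reproduced with a short sketch. It uses only Python's standard library; the sample count `N`, the seed, and the bin width of 2 are assumptions chosen for illustration, not values taken from the charts.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

N = 200  # hypothetical sample count; the article does not state one

# Normal distribution with Mean = 10 and Standard Deviation = 5
normal_samples = [random.gauss(10, 5) for _ in range(N)]

# Exponential distribution with Lambda = 1/10 (so the mean is 1/Lambda = 10)
exponential_samples = [random.expovariate(1 / 10) for _ in range(N)]


def histogram(samples, bin_width=2):
    """Count samples per bin of the given width, keyed by the bin's lower bound."""
    counts = {}
    for value in samples:
        lower = int(value // bin_width) * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


print("normal mean:", round(statistics.mean(normal_samples), 2))
print("exponential mean:", round(statistics.mean(exponential_samples), 2))
print("normal histogram:", histogram(normal_samples))
print("exponential histogram:", histogram(exponential_samples))
```

Both series have roughly the same mean, yet the exponential histogram should be heaviest in its first bins and decay from there, while the normal histogram should peak around 10.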
[Figures: Probability Density Function and Cumulative Distribution Function, for the normal distribution (left) and the exponential distribution (right)]
Control Limits for Exponential Distributions
For normal distributions, control limits used to detect special-cause variations are often based on the famous six-sigma spread around the mean.
This is because nearly 99.7% of values lie within this spread, as shown hereafter:
What 99.7% Coverage Means for Exponential Distribution
We have just seen that, for a normal distribution, 99.7% of the values of x (the quality characteristic) lie between μ – 3σ and μ + 3σ.
How does this translate to an exponential distribution? We simply have to solve the following equation:
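The equation can be reconstructed from the cumulative distribution function of an exponential distribution, F(x) = 1 − e^(−λx). The sketch below assumes that "99.7%" stands for the more precise three-sigma coverage of 99.73%, which is the value that yields the 5.9/λ limit used in the rest of this article (rounding the coverage to 0.997 would give about 5.8/λ instead).

```latex
\begin{align*}
P(X \le x) = 1 - e^{-\lambda x} &= 0.9973 \\
e^{-\lambda x} &= 0.0027 \\
x &= \frac{-\ln(0.0027)}{\lambda} \approx \frac{5.9}{\lambda}
\end{align*}
```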
This result is easier to see when plotting the Cumulative Distribution Function of an exponential distribution:
Summary of Control Limits Based On The 99.7% Rule
| ||Normal Distribution||Exponential Distribution|
|LCL (Lower Control Limit)||μ – 3σ||0|
|UCL (Upper Control Limit)||μ + 3σ||5.9/λ|
Generalization of the Three-Sigma Rule
|Range (Normal Distribution)||Population in Range||Range (Exponential Distribution)|
|μ ± 1σ||68%||[0; 1.1/λ]|
|μ ± 2σ||95%||[0; 3/λ]|
|μ ± 3σ||99.7%||[0; 5.9/λ]|
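The factors in the right-hand column follow from inverting the exponential CDF: the range [0; k/λ] covers a fraction p of the population when k = −ln(1 − p). The sketch below recomputes k; the function name is hypothetical, and the coverage values carry more digits than the table (an assumption), so the results may differ from the table in the last digit.

```python
import math


def exponential_upper_bound_factor(coverage):
    """Return k such that P(X <= k / lambda) = coverage for an exponential X."""
    return -math.log(1.0 - coverage)


# Coverage of mu +/- 1, 2 and 3 sigma for a normal distribution
for label, coverage in [("1 sigma", 0.6827), ("2 sigma", 0.9545), ("3 sigma", 0.9973)]:
    k = exponential_upper_bound_factor(coverage)
    print(f"{label}: coverage {coverage} -> [0; {k:.2f}/lambda]")
```

The exact factor depends on how many digits of the coverage are kept: a coverage of exactly 95% gives a factor very close to 3, whereas 95.45% gives a slightly larger one.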
Tutorial: Statistical Process Control Based on Resolution Time of Tickets
Context: Customers submit requests through JIRA tickets to the IT team.
Goal #1: The team wants to compute several key characteristics of its ticket-resolution process, based on historical data.
Goal #2: The team wants to analyze variations of resolution time, separate common-cause variations from special-cause variations, and try to reduce variations in order to improve predictability and ultimately gain trust from customers.
Step 1 – Get Data from JIRA
Browse JIRA and get the latest 50 tickets your team has resolved.
Example of JQL:
project = GTS AND (resolution = Fixed OR resolution = Done) ORDER BY resolved DESC
Export your selection with all fields to Microsoft Excel. Format the worksheet to obtain something like this:
Add a new column “Resolution Time”, which is simply the difference between “Resolved” and “Created”.
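Outside of Excel, the same column can be computed with a few lines of Python. This is a minimal sketch: the ticket keys, the timestamps, and the timestamp format are made up for illustration, and a real JIRA export may use a different date format.

```python
from datetime import datetime

# Made-up rows for illustration only; real values come from the JIRA export.
# The timestamp format is an assumption: adjust it to match your export.
FORMAT = "%Y-%m-%d %H:%M"
tickets = [
    {"key": "EXAMPLE-1", "created": "2015-03-02 09:15", "resolved": "2015-03-03 17:45"},
    {"key": "EXAMPLE-2", "created": "2015-03-04 10:00", "resolved": "2015-03-04 16:00"},
]


def resolution_time_days(created, resolved, fmt=FORMAT):
    """Return the elapsed time between two timestamps, in fractional days."""
    delta = datetime.strptime(resolved, fmt) - datetime.strptime(created, fmt)
    return delta.total_seconds() / 86400


for ticket in tickets:
    ticket["resolution_time"] = resolution_time_days(ticket["created"], ticket["resolved"])
    print(ticket["key"], round(ticket["resolution_time"], 2))
```

Expressing the result in fractional days keeps short resolution times (a few hours) distinguishable, which matters for the distribution check in the next step.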
Step 2 – Check if Hypothesis of Exponential Distribution is Correct
On a new worksheet, create a table computing the probability that the resolution time is less than or equal to x, for about a hundred values of x (so that you capture frequencies for short resolution times).
You may use a linear scale, such as 1, 2, 3, 4 days, and so on, or an exponential one. The advantage of using an exponential scale is to capture the numerous small resolution times and therefore to increase the precision of the chart hereafter.
To compute an exponential scale, you can use the following formula:
In our example, we have:
- max(resolution time): 55 days
- max(index): 100 (i.e. how many points we want to have on the chart)
Therefore, the formula simplifies to:
Using the formula above, compute an exponential scale in the spreadsheet.
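The formula itself is not reproduced above, so the following sketch assumes one plausible form of an exponential scale, x_i = max(resolution time)^(i / max(index)). It is not necessarily the formula the spreadsheet used; the function name is hypothetical.

```python
def exponential_scale(max_value, max_index):
    """Return max_index points growing geometrically up to max_value.

    Assumed form: x_i = max_value ** (i / max_index), for i = 1..max_index.
    """
    return [max_value ** (i / max_index) for i in range(1, max_index + 1)]


# Values from the example: longest resolution time 55 days, 100 points
scale = exponential_scale(55, 100)
print([round(x, 2) for x in scale[:5]], "...", round(scale[-1], 2))
```

With this form, consecutive points differ by a constant ratio, so the points are densest at the low end of the range. Note that the scale starts just above 1 day; if many tickets are resolved in less than a day, a scale with a smaller starting value would be needed.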
Add a column that computes the probability.
Plot the computed probability on a chart and check that the curve can be approximated by an exponential distribution Cumulative Distribution Function.
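The check can also be done numerically rather than visually. The sketch below compares the empirical probability with the exponential CDF 1 − e^(−λx), with λ estimated as 1 / mean. The data and the function names are made up for illustration, and the largest gap is only an informal indicator, not a formal goodness-of-fit test.

```python
import math


def empirical_cdf(samples, x):
    """Fraction of samples less than or equal to x."""
    return sum(1 for s in samples if s <= x) / len(samples)


def exponential_cdf(x, lam):
    """CDF of an exponential distribution with rate lam."""
    return 1.0 - math.exp(-lam * x)


def max_cdf_gap(samples, scale):
    """Largest absolute gap between empirical and exponential CDF over the scale."""
    lam = len(samples) / sum(samples)  # lambda estimated as 1 / mean
    return max(abs(empirical_cdf(samples, x) - exponential_cdf(x, lam)) for x in scale)


# Made-up resolution times in days, for illustration only
resolution_times = [0.1, 0.3, 0.4, 0.8, 1.0, 1.5, 2.2, 3.0, 4.5, 7.5]
scale = [0.25 * i for i in range(1, 41)]  # linear scale from 0.25 to 10 days

print("largest gap:", round(max_cdf_gap(resolution_times, scale), 3))
```

A small gap across the whole scale suggests that the exponential hypothesis is reasonable; a large or systematic gap suggests that another distribution should be considered.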
Step 3 – Compute Control Limits
Return to the previous sheet containing the original data. Compute the average resolution time.
We can estimate λ as the inverse of the average resolution time. In our example, the average is 2.62 days, so λ = 1/2.62 ≈ 0.38 per day.
Now that we have found the average resolution time, we can define the control limits as:
- LCL: 0 (resolution time cannot be negative)
- UCL: 5.9 × 2.62 ≈ 15.5 days
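The computation of this step can be sketched as follows. The function name and the dictionary layout are hypothetical; the factor 5.9 and the average of 2.62 days come from the text above.

```python
def control_limits(mean_resolution_time, factor=5.9):
    """Return lambda and the control limits for an exponential distribution.

    factor=5.9 corresponds to the 99.7% rule described earlier.
    """
    lam = 1.0 / mean_resolution_time  # lambda estimated as 1 / mean
    return {"lambda": lam, "LCL": 0.0, "UCL": factor / lam}


limits = control_limits(2.62)  # average resolution time, in days
print(f"lambda = {limits['lambda']:.2f} per day")
print(f"LCL = {limits['LCL']:.1f} days, UCL = {limits['UCL']:.1f} days")
```

Since UCL = 5.9/λ and λ = 1/mean, the upper limit is simply 5.9 times the average resolution time.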
Step 4 – Draw Control Chart
Create a chart that includes the UCL computed previously. In this tutorial, tickets are ordered by resolution date.
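If a charting tool is not at hand, a rough text version of the control chart is enough to spot points above the UCL. The ticket keys and resolution times below are made up for illustration; only the UCL of 15.5 days comes from Step 3.

```python
def out_of_control(resolution_times, ucl):
    """Return (key, days) pairs whose resolution time exceeds the UCL."""
    return [(key, days) for key, days in resolution_times if days > ucl]


# Made-up tickets ordered by resolution date; keys and values are illustrative only
series = [("EXAMPLE-1", 1.2), ("EXAMPLE-2", 0.4), ("EXAMPLE-3", 18.0), ("EXAMPLE-4", 3.1)]
UCL = 15.5  # days, from Step 3

# One row per ticket: a bar of '#' (one per day, rounded) and a marker above the UCL
for key, days in series:
    marker = "  <-- above UCL" if days > UCL else ""
    print(f"{key:<10} {'#' * round(days)} {days}{marker}")

print("out of control:", out_of_control(series, UCL))
```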
Step 5 – Reflect and Adjust
Assuming that we can approximate the distribution of resolution time by an exponential distribution with λ = 1/2.62 ≈ 0.38 per day, ticket GTS-1104 had only a 0.3% probability of exceeding UCL = 15.5 days. Therefore, we can consider the resolution time of this ticket to be “out-of-process”.
Perform root-cause analysis to find the special causes for such tickets. Can you prevent these special causes? How likely are they to recur in the future? Can you devise countermeasures? If yes, use a PDCA process to implement them properly.
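The 0.3% figure can be checked directly from the exponential tail probability, P(X > x) = e^(−λx). This is a minimal sketch using the values quoted in the text; the function name is hypothetical.

```python
import math


def probability_above(x, lam):
    """P(X > x) for an exponential distribution with rate lam."""
    return math.exp(-lam * x)


lam = 1 / 2.62  # per day, from the average resolution time
p = probability_above(15.5, lam)
print(f"P(resolution time > 15.5 days) = {p:.2%}")
```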