PERT Completion Times Revisited
Fred E. Williams
University of Michigan-Flint
July 2005
Abstract
Two sources of PERT
completion time bias are well documented in the literature: near critical paths
turning critical during execution and misspecified activity time probability
models. Although simulation is clearly the most appropriate method for
assessing project duration, most introductory discussions touch on these issues
and move quickly to standard approximations, implying that PERT offers useful,
if only approximate, project duration estimates. This paper uses simulation to
illustrate the nature and extent of PERT approximation errors in simple
examples from two excellent texts. The examples raise serious questions about
the utility of PERT project duration estimates and suggest opportunities for
improvement in introductory PERT instruction.
Project management is a
staple in introductory operations management or management science courses. Not
only are project management concepts and methods important in practice, but the
topics readily lend themselves to an array of useful models built on simple
concepts that can be easily understood and mastered at the introductory level.
For at least these reasons, network planning models such as PERT and CPM are
very natural entrees to the world of model building and analysis.
Introductory discussions
usually begin by explicating the basic concepts of activities, durations, and precedence
relationships, followed by the development of network representations of a
project; earliest and latest start and completion times; slack; and critical
path(s). Attention soon moves to modeling projects with significant randomness
– where activity durations are not deterministic, but are random. The classic
PERT model then unfolds, with independent activities; optimistic, pessimistic,
and most likely time estimates related to beta distributions; and the PERT
approximations.
Attention focuses on finding
the expected duration and variance of the critical path, and with an appeal to
the central limit theorem, using the properties of the critical path duration
to make probability statements about project completion. Many authors warn
readers of the limitations of these probability statements – the danger of near
critical paths becoming critical, departures from the normal distribution
assumed, etc. Some also mention that simulation would be a more apt analytical
method of estimating accurate probabilities. However, despite the usual
disclaimers and admonitions, the uninitiated reader is likely to leave a
typical introductory PERT discussion with a clear impression that PERT yields useful, if only approximate, results.
This paper illustrates some risks inherent in such an impression by contrasting
PERT estimates and spreadsheet simulation results for some relatively simple
examples. We hope these examples motivate increased pedagogical emphasis on (a)
the tenuousness of PERT estimates and (b) the merits of spreadsheet simulation
in project management.
To facilitate exposition, we
begin with a brief summary of the PERT approach, using the example in Figure 1
(from chapter 3 of Heizer and Render (2005)),
concerning the installation of a pollution control system.
Figure 1 shows the project
data, and calculation of the critical path and its mean and variance, using the
methods of Ragsdale (2003) and the usual PERT
approach. The critical path is ACEGH with expected duration E(D)
= 15 and variance(D) = 3.11. Appealing to the central
limit theorem, the PERT method approximates the probability distribution of D
with the normal distribution, N(E(D),stdev(D)) = N(15,1.76), and
uses this approximation to make probability statements about project duration.
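As a minimal sketch of the kind of probability statement this approximation supports, the following evaluates the normal CDF at a deadline. The mean and variance are the values quoted above; the deadline of 16 time units is a hypothetical value chosen only for illustration.

```python
import math

def normal_cdf(x, mean, stdev):
    """CDF of a normal distribution N(mean, stdev) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (stdev * math.sqrt(2.0))))

# Critical path ACEGH from the text: E(D) = 15, variance(D) = 3.11.
mean_d = 15.0
stdev_d = math.sqrt(3.11)  # about 1.76

# Hypothetical deadline (an assumption, not taken from the source).
deadline = 16.0
p_on_time = normal_cdf(deadline, mean_d, stdev_d)
print(f"PERT estimate of P(D <= {deadline}) = {p_on_time:.3f}")
```

The estimate depends only on the critical path's mean and standard deviation, which is precisely the limitation discussed next.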

Example 1:
Figure 1
Figure 2 shows the
precedence diagram and typical computational results.

Figure 2
At this point, most
discussions note a major limitation of the approximation – its myopic focus on
the critical path ignores other near
critical paths that might turn out to be the actual critical path(s) upon
execution of the project. Two likely candidates in our example are ADGH or
BDGH. Ceteris paribus, with a slack of one (1), an unfortunate time for
activity D could push ADGH or BDGH to criticality. Section 3 provides examples
of this and other situations in which near critical paths become critical.
Most authors mention that
the complexities posed by multiple critical paths (along with some other
departures from the basic PERT assumptions) are most appropriately and readily
addressed by simulation models, although relatively few introductory
expositions – especially in operations management texts – ever return to address
the specifics of simulating projects with important random characteristics.
Some introductory OR/MS texts do explicitly address the simulation approach
(see, for example, Hillier and Hillier (2003) or
Ragsdale (1998)).
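A simulation of this kind can be sketched in a few lines. The network below is hypothetical (two parallel two-activity paths, not one of the textbook examples), and the beta shape rule, α+β = 6 with the mode at m, is an assumption made only so that each activity's mean equals (a+4m+b)/6. The sketch records how often each path turns out to be the longest.

```python
import random

def sample_activity(a, m, b, rng):
    """Draw one duration from a beta distribution on (a, b) with mode m.

    Assumption: alpha + beta = 6, so the beta mean is (a + 4m + b)/6.
    """
    m_std = (m - a) / (b - a)
    alpha = 1.0 + 4.0 * m_std
    beta = 1.0 + 4.0 * (1.0 - m_std)
    return a + (b - a) * rng.betavariate(alpha, beta)

def simulate_project(activities, paths, trials=20000, seed=1):
    """Return simulated project durations and how often each path is longest."""
    rng = random.Random(seed)
    durations = []
    times_critical = [0] * len(paths)
    for _ in range(trials):
        drawn = {name: sample_activity(a, m, b, rng)
                 for name, (a, m, b) in activities.items()}
        lengths = [sum(drawn[name] for name in path) for path in paths]
        longest = max(lengths)
        durations.append(longest)
        times_critical[lengths.index(longest)] += 1
    return durations, times_critical

# Hypothetical (a, m, b) estimates. PERT path means: A-C = 6.0, B-D = 5.5.
activities = {"A": (1, 2, 3), "B": (2, 3, 4), "C": (2, 4, 6), "D": (1, 2.5, 4)}
paths = [["A", "C"], ["B", "D"]]

durations, times_critical = simulate_project(activities, paths)
mean_duration = sum(durations) / len(durations)
share_second_path = times_critical[1] / len(durations)
print(f"simulated mean duration: {mean_duration:.2f} (PERT critical path mean: 6.00)")
print(f"share of trials in which B-D is critical: {share_second_path:.2f}")
```

Because the project duration is the maximum over paths, its simulated mean cannot fall below the PERT critical path mean under this shape rule, and a near critical path pushes it higher.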
Other possible limitations
of the PERT approximation include:
· Activity times are not stochastically independent.
· The critical path comprises fewer activities than reasonable application of the central limit theorem requires. Our example has a five-activity critical path, far short of the typical n ≈ 30 rule of thumb for the central limit theorem.
· The PERT approximations of activity duration mean and variance can deviate significantly from the mean and variance of the underlying activity duration probability distribution.
To our knowledge, the first
is relatively unexplored, perhaps because dependent activities would
significantly compound model complexity. The second is rarely mentioned,
probably because (a) it is so obvious and (b) many, if not most, actual
projects will entail enough activities to justify using the central limit
theorem. As for the third, while MacCrimmon and Ryavec cited misspecified
activity probability models in their early analytical study of the PERT
assumptions (1964), introductory discussions
rarely mention this error source. This is unfortunate since, as the examples in
sections 3 and 4 will show, the PERT beta approximations can also compound
inaccuracies.
We begin with two variants
of the Milwaukee General Hospital (MGH) project:
The changes in 2.a-c do not
affect the critical path, so the same
PERT approximations of the expected project duration and variance apply to both
variants. However, the changes simultaneously reduce the slacks of B, D, and
F and increase the variance of each,
thus increasing the likelihood that activities B, D, or F will become critical.
To assess the adequacy of
PERT approximations in this example, we compare three CDFs of project duration:
Figure 3 contains these
three project duration CDFs:

Project Duration CDFs for Two Milwaukee General
Hospital Variants
Figure 3
We regard the simulated
results as baselines for their respective variants. Note that the PERT
approximation does not estimate either baseline very accurately. For the
initial MGH formulation, PERT overestimates F(d) for shorter durations (below
13.6) and underestimates F(d) for longer durations (above 13.6). PERT also
consistently and significantly overestimates F(d) for the modified version,
MGHB. While an eyeball test is probably a reasonable standard of the quality of
fit here, Table 1 contains two common quantitative measures of fit, the mean
absolute deviation, MAD, and the mean absolute percent error, MAPE.

PERT Approximation
Errors: MGH Simulations
Table 1
In other words, the average
absolute deviation between the PERT approximation and the MGH simulation CDF is
.059. In a similar fashion, PERT overestimates F(d) by an average of .117 for
MGHB (since all deviations are overestimates). The MAPE values in Table 1 are
perhaps slightly overstated, since small errors in the left tails of the
distributions generate inflated absolute percent errors. Nonetheless, suffice
it to say the PERT approximations do not accurately estimate the project
duration CDFs.
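For concreteness, the two measures can be computed as in the following sketch. It assumes both CDFs are evaluated on a common grid of durations and that percent errors are taken relative to the simulated baseline; the source does not state either choice, and the CDF values below are hypothetical.

```python
def mad_mape(approx_cdf, baseline_cdf):
    """Mean absolute deviation and mean absolute percent error of approx vs. baseline.

    Grid points where the baseline CDF is zero are skipped in the MAPE,
    since the percent error is undefined there.
    """
    errors = [abs(p - s) for p, s in zip(approx_cdf, baseline_cdf)]
    mad = sum(errors) / len(errors)
    pct = [100.0 * e / s for e, s in zip(errors, baseline_cdf) if s > 0]
    mape = sum(pct) / len(pct)
    return mad, mape

# Hypothetical CDF values on a three-point grid (not from Table 1).
pert = [0.2, 0.5, 0.9]
simulated = [0.1, 0.5, 0.8]
mad, mape = mad_mape(pert, simulated)
print(f"MAD = {mad:.3f}, MAPE = {mape:.1f}%")
```

In this toy case the same absolute error of 0.1 contributes 100% at the left tail point but only 12.5% at the right, which illustrates why left tail errors inflate the MAPE.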
As expected for the reasons
discussed in Section 2, multiple critical paths emerged in the simulation
trials. Two main paths surfaced in the initial MGH and four emerged in MGHB,
with the approximate frequencies in Table 2.

Critical Paths in MGH
Simulations
Table 2
One might wonder if these
results are unusual. A second example suggests (but of course does not prove)
otherwise. Figures 4-6 contain the second example (from chapter 8 of Krajewski
and Ritzman (2005)), concerning the relocation
of a hospital.

Example 2: St. Adolf’s Hospital (SAH)
Precedence Diagram
Figure 4

Example 2: St. Adolf’s Hospital
Figure 5
Figure 4 shows the precedence diagram
and standard computational results for St. Adolf’s Hospital (SAH). Bold borders
identify the critical path, BDHJK. Double lined borders identify a second path,
ACGJK, which is near critical.
Figure 5 shows a Crystal
Ball spreadsheet simulation model for SAH. Standard PERT estimates yield an
expected length of 69 weeks (cell R20) and a variance of 11.889 (S21). Note that
the actual variance of the duration of BDHJK, calculated from the underlying
beta distributions of the activity durations, is also 11.889 (X21). The actual
expected duration of BDHJK, calculated from the underlying beta distributions
of the activity durations, is 69.3 (I20 or W20), slightly above the PERT
estimate of 69.
Figure 6 shows three SAH
project duration CDFs – the standard PERT approximation, N(69,3.45); the empirical CDF from a 50,000 trial simulation; and a
PERT adjusted approximation, N(69.3,3.45).
As was true for the first example, MGH, the PERT approximation is not a very
close fit to the baseline simulated project duration CDF. Moreover, the PERT adjusted approximation is not much
better.

Project Duration CDFs for St. Adolf’s Hospital
Figure 6
Table 3 contains the MAD and MAPE values for this example.

PERT Approximation
Errors: SAH Simulations
Table 3

Critical Paths in St.
Adolf’s Hospital Simulations
Table 4
As the three foregoing
examples clearly demonstrate, PERT approximations can offer less than adequate
estimates of project duration. It is worth noting that we borrowed these two
simple examples,
It is also probably worth
mentioning that most introductory discussions suggest that PERT estimates are optimistically biased, tending to
uniformly overestimate the CDF of project duration. This plausible property
seems like a natural consequence of near critical paths turning critical. This
impression is not limited to introductory discussions. In the abstract of his
early paper on this topic, A. R. Klingel (1966) says, “Among network techniques recently widely
employed in program management, Pert is addressed to the problem of assessing
the manager’s chances of completing a project on time. Theory and monte carlo
simulation have shown that the Pert method yields results that are biased
high,…” While Klingel’s assertion might be true of larger projects, the
results for our simple examples suggest the bias can cut both ways – positive
or negative.
It is useful to clarify the
activity time probability models implicit in PERT approximations. We start with a brief overview and then turn
to a slightly more extended (and technical) discussion. On one hand, the second
discussion seems like overkill, yet it also seems necessary in order to
explicate clearly the various issues involved.
Introductory PERT
discussions deal with random activity durations by first introducing the
concepts of three time estimates: optimistic,
most likely, and pessimistic. There inevitably follows an intuitive discussion
relating these three estimates to the form of the assumed underlying
probability distribution of the activity duration, posited to be a beta
distribution. After briefly exploring the flexibility and suitability of the
beta distribution, discussion quickly moves to the standard approximations of
activity mean and variance:
· D = duration of the activity
· E(D) = expected duration = (a+4m+b)/6
· Variance(D) = ((b-a)/6)^2
where a, b, and m are,
respectively, the optimistic, pessimistic, and most likely times for the
activity. The activity means and variances are used to compute various
intermediate variables (EST, EFT, LST, LFT, slack), which in turn help identify
the critical path. The expected duration and variance of the critical path are
then calculated.
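The calculation can be sketched as follows. The (a,m,b) triples are assumptions: the project data of Figure 1 are not reproduced in this text, so the values below were chosen to be consistent with the critical path ACEGH and the totals E(D) = 15 and variance(D) = 3.11 quoted earlier.

```python
def pert_mean(a, m, b):
    """PERT approximation of the expected activity duration."""
    return (a + 4 * m + b) / 6

def pert_variance(a, m, b):
    """PERT approximation of the activity duration variance."""
    return ((b - a) / 6) ** 2

# Assumed (a, m, b) estimates for the critical activities (illustrative only).
estimates = {
    "A": (1, 2, 3),
    "C": (1, 2, 3),
    "E": (1, 4, 7),
    "G": (3, 4, 11),
    "H": (1, 2, 3),
}
critical_path = "ACEGH"

# Independence of activities lets means and variances add along the path.
path_mean = sum(pert_mean(*estimates[name]) for name in critical_path)
path_variance = sum(pert_variance(*estimates[name]) for name in critical_path)
print(f"E(D) = {path_mean:.2f}, variance(D) = {path_variance:.2f}")
```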
In sum, PERT offers the beta
as a plausible probability model of activity times, relates the three usual
estimates, (a,m,b), to properties of the beta distribution, and translates
(a,m,b) into approximations for the mean and variance of each activity.
Interestingly, PERT pays little or no explicit attention to precisely which
beta distribution is being proffered, presumably because only the mean and
variance are used. While that shorthand serves PERT’s purposes well, it falls
short of providing the level of detail required, say, to conduct a simulation
of the project. That would require a specific beta distribution for each
activity.
In fairness, in their seminal
PERT paper, Malcolm, Clark, Roseboom, and Fazar (1959)
addressed a comprehensive system in which the individual activity times and
their probability models played an important, but secondary role. Arguably,
they adopted approximations entirely appropriate and adequate to the context in
which they were developed. In fact,
Although
The beta distribution is a
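two parameter continuous distribution on the open real interval (A,B). (Some sources use the closed
interval, [A,B].) The density
function is
f(x) = (x-A)^(α-1) (B-x)^(β-1) / [Beta(α,β) (B-A)^(α+β-1)], A < x < B,
where Beta(α,β) denotes the beta function.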

See, for example, NIST/SEMATECH
e-Handbook of Statistical Methods
(2005).
α and
β are shape parameters, and A and B are the minimum and maximum values,
respectively. The case A = 0 and B =1 is called the standard beta distribution.
If X follows a general beta distribution with parameters α and β on (A,B), A is a location parameter and B-A is a scale parameter. This is most easily seen by noting that X on (A,B) is related to the standard beta distribution Y (with the same parameters α and β) by the transformation X = (B-A)Y + A.
Table 5 lists some common
statistics for the standard and general beta distributions.
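| Statistic | Standard beta | General beta |
|---|---|---|
| Range | (0,1) | (A,B) |
| Mean | α/(α+β) | (αB+βA)/(α+β) |
| Mode (α,β>1) | (α-1)/(α+β-2) | [B(α-1)+A(β-1)]/(α+β-2) |
| Variance | αβ/[(α+β)^2(α+β+1)] | αβ(B-A)^2/[(α+β)^2(α+β+1)] |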
Some Summary Statistics for the Beta Distribution
Table 5
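The standard formulas for these statistics translate directly into code. The sketch below is an illustration only; the function name and the example parameter values are hypothetical.

```python
def general_beta_stats(alpha, beta, A=0.0, B=1.0):
    """Mean, mode, and variance of a beta distribution on (A, B).

    The mode is returned as None unless alpha > 1 and beta > 1,
    the case in which the density has a single interior mode.
    """
    s = alpha + beta
    mean = (alpha * B + beta * A) / s
    variance = alpha * beta * (B - A) ** 2 / (s ** 2 * (s + 1.0))
    if alpha > 1.0 and beta > 1.0:
        mode = (B * (alpha - 1.0) + A * (beta - 1.0)) / (s - 2.0)
    else:
        mode = None
    return mean, mode, variance

# Standard beta, then the same shape rescaled to a hypothetical interval (10, 20).
print(general_beta_stats(2.0, 4.0))
print(general_beta_stats(2.0, 4.0, A=10.0, B=20.0))
```

The second call shows the location and scale relationship: the mean and mode shift and stretch with the interval, and the variance scales by (B-A)^2.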
With this terminology, we
can briefly sketch details of five estimation procedures for explicitly
translating the (a,m,b) estimates into a specific beta distribution.
The procedure implicit in Malcolm
et al (1959) and made explicit by
M1. Obtain the estimates (a,m,b).
M2. Fix the mode by imposing equation 1.
M3. Fix the variance by imposing equation 2.
M4. Solve equations 1 and 2 for α and β, which fixes the specific beta distribution.
1. [b(α-1)+a(β-1)]/(α+β-2) = m (Mode)
2. αβ(b-a)^2/[(α+β)^2(α+β+1)] = [(b-a)/6]^2 (Variance)
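Steps M1-M4 can also be carried out numerically. The sketch below is not the closed-form route; it substitutes a simple bisection, using the fact that equation 1 ties α and β to a single unknown s = α+β-2 through α = 1 + s(m-a)/(b-a) and β = 1 + s(b-m)/(b-a). The function names and the search interval are assumptions.

```python
def beta_variance_std(alpha, beta):
    """Variance of the standard beta distribution on (0, 1)."""
    s = alpha + beta
    return alpha * beta / (s * s * (s + 1.0))

def solve_mode_variance(a, m, b, target=1.0 / 36.0):
    """Find (alpha, beta) with mode m (equation 1) and variance ((b-a)/6)^2 (equation 2)."""
    m_std = (m - a) / (b - a)

    def shapes(s):
        # Equation 1 in standardized form: (alpha - 1)/(alpha + beta - 2) = m_std.
        return 1.0 + m_std * s, 1.0 + (1.0 - m_std) * s

    # At s = 0 the distribution is uniform (variance 1/12 > 1/36);
    # for large s the variance tends to zero, so a root lies in between.
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if beta_variance_std(*shapes(mid)) > target:
            lo = mid
        else:
            hi = mid
    return shapes(0.5 * (lo + hi))

print(solve_mode_variance(0.0, 0.5, 1.0))   # symmetric estimates
print(solve_mode_variance(3.0, 4.0, 11.0))  # hypothetical skewed estimates
```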
Equations 1 and 2 yield the
cubic equation that
Grubbs (1962),
roundly criticizing the Malcolm et al PERT assumptions, argued that the
traditional estimates for the mean and variance, (a+4m+b)/6 and ((b-a)/6)^2,
respectively, overly constrain the parameters of the beta distribution and
restrict it to “one of three fat, flat
Beta distributions” (1962b). While
internally consistent, Grubbs’ method is curious in that he approached the
estimation process in a manner subtly, but distinctly different from that
adopted by Malcolm et al. Grubbs offered the following:
G1. Obtain the estimates (a,m,b).
G2. Fix the mean by imposing equation 3.
G3. Fix the variance by imposing equation 2.
G4. Solve equations 2 and 3 for α and β, which fixes the specific beta distribution.
3. (αb+βa)/(α+β) = (a+4m+b)/6 (Mean)
Equations 2 and 3 solve
readily, yielding the following unique solutions:
4. α = β = 4 (Grubbs symmetric)
5. α = 3 – SQRT(2), β = 3 + SQRT(2) (Grubbs positively skewed)
6. α = 3 + SQRT(2), β = 3 – SQRT(2) (Grubbs negatively skewed)
These results led Grubbs to conclude that the PERT assumptions “limit us to one of three fat, flat, Beta distributions”. Moreover, substituting these parameter values into the mode expression of equation 1 yields one of three modes:
7. (a+b)/2 (Grubbs symmetric mode)
8. (a+b)/2 – (b-a)SQRT(2)/4 (Grubbs positively skewed mode)
9. (a+b)/2 + (b-a)SQRT(2)/4 (Grubbs negatively skewed mode)
In the symmetric case, the
Grubbs mode is precisely the original subjective estimate m. However, the
asymmetric cases 8 and 9 exhibit a curious property – the calculated modes
are not equal to the original subjective estimates m. This underscores
the subtle, but important distinction between the approaches of Grubbs and
Malcolm et al in their seminal paper. Both valid approaches address related,
but subtly different, problems.
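A quick numerical check, on the standardized interval (0,1), confirms that each of the three Grubbs parameter pairs has variance 1/36 and a mean equal to (1+4·mode)/6. The mode is computed from the left side of equation 1 with a = 0 and b = 1. This is a verification sketch only, and the helper names are hypothetical.

```python
import math

def std_mean(alpha, beta):
    return alpha / (alpha + beta)

def std_mode(alpha, beta):
    # Left side of equation 1 with a = 0 and b = 1.
    return (alpha - 1.0) / (alpha + beta - 2.0)

def std_variance(alpha, beta):
    s = alpha + beta
    return alpha * beta / (s * s * (s + 1.0))

root2 = math.sqrt(2.0)
grubbs = {
    "symmetric": (4.0, 4.0),
    "positively skewed": (3.0 - root2, 3.0 + root2),
    "negatively skewed": (3.0 + root2, 3.0 - root2),
}

for label, (alpha, beta) in grubbs.items():
    mode = std_mode(alpha, beta)
    print(f"{label}: mean={std_mean(alpha, beta):.4f}, mode={mode:.4f}, "
          f"variance={std_variance(alpha, beta):.4f}, (1+4*mode)/6={(1 + 4 * mode) / 6:.4f}")
```

Since the three modes are fixed numbers, they can agree with a subjective estimate m only by coincidence, which is the property noted above for the asymmetric cases.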
Donaldson and Coon (1964) take still another tack in closely related papers.
Donaldson bases his estimates on subjective estimates of the optimistic,
pessimistic, and mean times, and the assumption that the β-density fX is tangential to
the horizontal axis at the extremes, a and b. Coon extended Donaldson’s method
to handle PERT estimates, (a,m,b). A brief sketch of Coon’s method follows:
CD1. Obtain the estimates (a,m,b).
CD2. Fix the mode by imposing equation 1.
CD3. Impose the tangency assumption, which is equivalent
to α>2, β>2.
CD4. Equation 1 and condition CD3 do not uniquely fix α and β, but define a family of β-distributions (see Coon’s comments below).
CD5. Fix the distribution by finding the smallest sum, α+β, satisfying equation 1 and CD3. Since Variance = αβ(b-a)^2/[(α+β)^2(α+β+1)], the resulting distribution has the largest variance among those that satisfy equation 1 and condition CD3. The resulting α and β values are given in equations 10-12.
10. If b-m > m-a: α = 2, β = 2(b-m)/(m-a)+1 (CD positively skewed)
11. If m-a > b-m: β = 2, α = 2(m-a)/(b-m)+1 (CD negatively skewed)
12. If b-m = m-a: α = β > 2 (CD symmetric)
Neither
Donaldson nor Coon explicitly addressed the symmetric case, but applying their logic
would lead to the results in 12 as estimates for the symmetric case. (We have
used α=β=4 in the simulations MGH, MGHB, and SAH.)
The Coon-Donaldson estimates
have an interesting property:
13. If α and β satisfy Coon’s conditions, so do α' and β', where
α' = α + δm/(1-m), β' = β + δ, δ>0
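Property 13 is easy to confirm numerically. The sketch below assumes that m in property 13 denotes the mode on the standardized interval (0,1); the starting pair (α,β) = (3,7) with m = 0.25 is a hypothetical example that satisfies the mode condition and has both parameters above 2.

```python
def standardized_mode(alpha, beta):
    """Mode of the standard beta distribution (requires alpha, beta > 1)."""
    return (alpha - 1.0) / (alpha + beta - 2.0)

def shift_shapes(alpha, beta, m, delta):
    """Apply the shift of property 13 for a given delta > 0."""
    return alpha + delta * m / (1.0 - m), beta + delta

m = 0.25
alpha, beta = 3.0, 7.0  # hypothetical pair with mode 0.25

for delta in (0.5, 1.5, 10.0):
    a_new, b_new = shift_shapes(alpha, beta, m, delta)
    print(f"delta={delta}: alpha'={a_new:.3f}, beta'={b_new:.3f}, "
          f"mode={standardized_mode(a_new, b_new):.3f}")
```

Every shifted pair keeps the same mode while α+β grows, so the variance shrinks along the family.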
A similar property (with a
slightly different form) exists for Donaldson’s estimates. In other words, the
Coon-Donaldson estimates generate families of beta distributions, rather than
unique distributions. In fact, Coon concludes her paper with the following
comments (here x1=a, x2=subjective estimate of the mean,
and x3=b):
It should be
clearly pointed out that estimates of the end points and the mean (or mode),
even when coupled with the assumption that the β-distribution is tangent
to the x-axis at both ends, does not lead to complete generalization of PERT
activity time distributions. In effect, what Donaldson’s method does is to set
β=2 for all curves where the mean is less than one-half the range of the
curve and then to estimate α from the ratio
β/α = (x3-x2)/(x2-x1)
On the other hand, if the mean is
greater than one-half of the range, then α=2 and β is determined from
the β/α ratio. Hence we are still left with a restricted set of
β-distributions, although the current restrictions have been greatly
relaxed by allowing the distributions to take on varying degrees of skewness as
compared to the severe restrictions pointed out by Grubbs.
Farnum and Stanton (1987) explore conditions that justify the PERT
estimates, and as a byproduct, develop improved estimators for those cases in
which the standard estimates are poor. In the course of their analysis, they
suggested the following estimation procedure to translate the usual estimates
(a,m,b) into a unique beta distribution:
FS1. Obtain the estimates (a,m,b).
FS2. Fix the mode by imposing equation 1.
FS3. Fix the variance by imposing equation 14.
FS4. Fix the distribution by solving equations 1 and 14
for α and β.
14. (α-1)(β-1)(b-a)^2/[(α+β-2)^2(α+β-1)] = [(b-a)/6]^2 (Variance(α-1,β-1))
Note that the LHS of equation 14 is not Variance(α,β), but Variance(α-1,β-1), an approximation Farnum and Stanton justify only briefly, without detailed substantiation. At any rate, equations 1 and 14 yield the following closed form expressions for α and β in 15 and 16:
15. α = [36((m-a)/(b-a))^2 + 1](b-m)/(b-a) (Farnum-Stanton)
16. β = [36((b-m)/(b-a))^2 + 1](m-a)/(b-a) (Farnum-Stanton)
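The closed forms can be checked against the two conditions they are meant to satisfy. The sketch below assumes the bracketing shown in the code, which is what solving equations 1 and 14 gives; the estimates (3,4,11) are a hypothetical skewed example.

```python
def farnum_stanton_shapes(a, m, b):
    """Shape parameters from the Farnum-Stanton closed forms (expressions 15 and 16)."""
    left = (m - a) / (b - a)
    right = (b - m) / (b - a)
    alpha = (36.0 * left ** 2 + 1.0) * right
    beta = (36.0 * right ** 2 + 1.0) * left
    return alpha, beta

def std_variance(alpha, beta):
    """Variance of the standard beta distribution with the given shapes."""
    s = alpha + beta
    return alpha * beta / (s * s * (s + 1.0))

a, m, b = 3.0, 4.0, 11.0  # hypothetical estimates
alpha, beta = farnum_stanton_shapes(a, m, b)

mode_std = (alpha - 1.0) / (alpha + beta - 2.0)    # equation 1, standardized
shifted_var = std_variance(alpha - 1.0, beta - 1.0)  # equation 14, standardized
actual_var = std_variance(alpha, beta)
print(f"alpha={alpha:.4f}, beta={beta:.4f}")
print(f"mode={mode_std:.4f} (target {(m - a) / (b - a):.4f})")
print(f"Variance(alpha-1,beta-1)={shifted_var:.5f}, Variance(alpha,beta)={actual_var:.5f}")
```

The mode is reproduced exactly and Variance(α-1,β-1) equals 1/36, while the actual standardized variance of the fitted distribution differs from 1/36.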
Golenko-Ginzburg (1988)
proposed the following refinement, based on yet another set of relaxed
assumptions:
GG1. Obtain the estimates (a,m,b).
GG2. Fix the mode by imposing equation 1.
GG3. Assume p+q = z = constant, where p = α-1 and q = β-1, so that α+β = z+2. Golenko-Ginzburg justifies this condition as an extension of earlier assumptions, saying, “On the basis of statistical analysis and some other intuitive arguments, the creators of PERT assumed that p+q ≈ 4.”
GG4. Using the standardized completion time, Golenko-Ginzburg calculated the variance of the completion time, using the subjective estimate, m, to be σ^2(m) = (1+z+z^2 m-z^2 m^2)/((z+2)^2(z+3)).
GG5. Fix z by requiring that the average of σ^2(m) over all values, 0<m<1, is 1/36, or equivalently, ∫σ^2(m)dm = 1/36.
GG6. Golenko-Ginzburg did not find α and β, but
directly calculated the mean and variance of the general completion time, X, in expressions 17 and 18.
GG7. Fix the mean by imposing equation 19.
GG8. Fix the distribution by solving equations 1 and 19
for α and β.
17. mean(X) = (2a+9m+2b)/13
18. variance(X) = (b-a)^2 [22 + 81(m-a)/(b-a) – 81{(m-a)/(b-a)}^2]/1268
19. (αb+βa)/(α+β) = (2a+9m+2b)/13 (Mean, Golenko-Ginzburg)
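Step GG5 and expressions 17 and 18 can be checked numerically. The sketch takes z as it enters the GG4 variance formula and uses the fact that the integral of m - m^2 over (0,1) is 1/6. Bisection is a substitute technique here, and the function names and search interval are assumptions.

```python
def average_variance(z):
    """Average over 0 < m < 1 of (1 + z + z^2 m - z^2 m^2) / ((z+2)^2 (z+3))."""
    return (1.0 + z + z * z / 6.0) / ((z + 2.0) ** 2 * (z + 3.0))

def solve_z(target=1.0 / 36.0, lo=1.0, hi=20.0):
    """Bisection for the z at which the average variance equals the target."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if average_variance(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gg_mean(a, m, b):
    return (2 * a + 9 * m + 2 * b) / 13            # expression 17

def gg_variance(a, m, b):
    r = (m - a) / (b - a)
    return (b - a) ** 2 * (22 + 81 * r - 81 * r * r) / 1268  # expression 18

def moments_from_z(a, m, b, z):
    """Mean and variance implied by the GG4 formula for a given z."""
    r = (m - a) / (b - a)
    mean = a + (b - a) * (z * r + 1.0) / (z + 2.0)
    var = (b - a) ** 2 * (1.0 + z + z * z * r * (1.0 - r)) / ((z + 2.0) ** 2 * (z + 3.0))
    return mean, var

z_root = solve_z()
print(f"z solving GG5: {z_root:.3f}")
print(gg_mean(3, 4, 11), gg_variance(3, 4, 11))  # hypothetical estimates
print(moments_from_z(3, 4, 11, 4.5))
```

The root lies close to 4.5, and substituting z = 4.5 reproduces the coefficients of expressions 17 and 18 up to rounding of the denominator.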
Equations 20-22 show the
Golenko-Ginzburg expressions for the asymmetric case.
20. α = ρβ, where ρ = [9(m-a)+2(b-a)]/[9(b-m)+2(b-a)] (GG asymmetric)
21. α = ρ[(b-m)-(m-a)]/[ρ(b-m)-(m-a)] (GG asymmetric)
22. β = [(b-m)-(m-a)]/[ρ(b-m)-(m-a)] (GG asymmetric)
For the symmetric case, m =
(a+b)/2, so equation 1 yields α=β. Solving equations 1 and 19
simultaneously yields 23:
23. α + β = 6.5
Substituting α=β
into equation 23 and solving yields, for the symmetric case:
24. α = β = 3.25 (GG symmetric)
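Expressions 20-24 can be combined into a single routine, sketched below with a hypothetical function name and hypothetical estimates. The symmetric case is handled separately because the denominators in 21 and 22 vanish when m = (a+b)/2.

```python
import math

def golenko_ginzburg_shapes(a, m, b):
    """Shape parameters from expressions 20-22 (asymmetric) or 24 (symmetric)."""
    left, right = m - a, b - m
    if math.isclose(left, right):
        return 3.25, 3.25
    rho = (9.0 * left + 2.0 * (b - a)) / (9.0 * right + 2.0 * (b - a))
    beta = (right - left) / (rho * right - left)
    alpha = rho * beta
    return alpha, beta

alpha, beta = golenko_ginzburg_shapes(3.0, 4.0, 11.0)  # hypothetical estimates
mode_std = (alpha - 1.0) / (alpha + beta - 2.0)
print(f"alpha={alpha:.4f}, beta={beta:.4f}, alpha+beta={alpha + beta:.4f}, mode={mode_std:.4f}")
print(golenko_ginzburg_shapes(1.0, 2.0, 3.0))  # symmetric case
```

For this example the parameters sum to 6.5 and reproduce the standardized mode (m-a)/(b-a), matching the symmetric result in expression 23.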
Table 6 contains the resulting numerical values for nine representative distributions
arising in the previous examples. Included are the estimates (a,m,b), the PERT
variance ((b-a)/6)^2, and the values of α and
β resulting from each method. The calculated mode is included for Grubbs’
method since it does not faithfully reproduce the subjectively estimated mode.
We include calculated variances for Coon-Donaldson, Farnum-Stanton, and
Golenko-Ginzburg, since Coon-Donaldson does not assume a variance and neither
Farnum-Stanton nor Golenko-Ginzburg faithfully reproduces the assumed PERT
variance, ((b-a)/6)^2.