Computers & Industrial Engineering 48 (2005) 153–161
www.elsevier.com/locate/dsw
Single machine stochastic scheduling to minimize the expected
number of tardy jobs using mathematical programming models*
Dong K. Seo, Cerry M. Klein, Wooseung Jang*
Department of Industrial and Manufacturing Systems Engineering, University of Missouri—Columbia,
E3437 Engineering Building East, Columbia, MO 65211, USA
Abstract
This paper studies the single machine scheduling problem for the objective of minimizing the expected number
of tardy jobs. Jobs have normally distributed processing times and a common deterministic due date. We develop
new approaches for this problem that generate near optimal solutions. The original stochastic problem is
transformed into a non-linear integer programming model and its relaxations. Computational study validates their
effectiveness by comparison with optimal solutions.
© 2005 Published by Elsevier Ltd.
Keywords: Stochastic scheduling; Single machine; Tardy jobs; Non-linear Programming
1. Introduction
In this paper, we address the problem of sequencing jobs on a single machine so as to minimize the
expected number of tardy jobs. We consider the stochastic version of the problem with normally
distributed job processing times and a deterministic and common due date. This problem exists in most
manufacturing and production systems, where it is desirable to finish jobs on or before their due dates,
and the choice of schedule usually has a significant impact on system performance. For example, it is
common to measure the number of tardy jobs or, equivalently, the percentage of on-time shipments, and
to use it to rate managers' performance in a semiconductor manufacturing facility or an automobile
assembly line. The mathematical programming approach developed in this paper generates near optimal
solutions for this problem in a computationally efficient manner.
* This manuscript was processed by Area Editor Subhash C. Sarin.
* Corresponding author. Tel.: +1 573 882 2692; fax: +1 573 882 2693.
E-mail address: jangw@missouri.edu (W. Jang).
0360-8352/$ - see front matter © 2005 Published by Elsevier Ltd.
doi:10.1016/j.cie.2005.01.002
The vast majority of past research on the single machine scheduling problem for the tardiness criterion
has been devoted to the deterministic case. One of the main reasons for this lies in the difficulty
of analyzing the stochastic version of the problem.
A stochastic problem is tractable only when strong assumptions are imposed on processing time
distributions and due dates. Under stochastic order relations and other limited conditions, Chang and
Yao (1993) provided a general and unified approach in solving stochastic scheduling problems, and
Righter (1994) presented optimal policy results for functions of completion times. The optimal policy
for the single machine problem with exponential processing time and a common due date, which is a
random variable with an arbitrary distribution, for the criterion of minimizing expected weighted
number of tardy jobs was given in Pinedo (1983). Boxma and Forst (1986) explored the case when due
dates are independent and exponentially distributed. De, Ghosh, and Wells (1991) minimized the
expected weighted number of tardy jobs when the processing times follow general random variables but
the common job due date is exponentially distributed. Lin and Lee (1995) discussed a dual criteria single
machine problem under the stochastic order assumption.
The case of normally distributed processing times has also been addressed in the literature. Sarin,
Erel, and Steiner (1991) considered jobs with a common due date to optimize the expected tardy cost or
the sum of the weighted tardy probabilities. Cai and Zhou (1997) minimized the expectation of a
weighted combination of the earliness penalty, the tardiness penalty, and the flow time penalty. Both of
these papers established the V- or W-shaped structures of the optimal sequence under the assumption
that known variances are proportional to the means. Jang (2002) considered the same objective function
as ours and developed a dynamic scheduling policy based on a myopic heuristic. Balut (1973) studied the
scheduling of normally distributed jobs with different due dates for the objective of minimizing the
number of tardy jobs under chance constraints, and Kise and Ibaraki (1983) showed that this problem is
NP-complete. Soroush and Fredendall (1994) provided heuristics that identified an optimal sequence for
the objective of minimizing the total expected earliness plus tardiness cost.
As mentioned, the well-known previous studies by Sarin et al. (1991) and Cai and Zhou (1997) focus
only on the special case of the problem in which the variances of processing times are proportional to
their means. In contrast, our study considers the general problem, for which no efficient exact algorithm
is known. Hence, we seek approximate solutions by transforming the original stochastic problem into a
non-linear integer programming model and its relaxations. Computational results indicate that this
approach allows us to find near optimal solutions quickly. In addition, our mathematical modeling
approach can be adapted easily to various stochastic scheduling problems. We formulate an exact model
and approximate models in Section 2, and report computational experience with these models in
Section 3. Concluding remarks are made in Section 4.
2. Formulation and solution approach
The problem that we investigate in this paper is to sequence a set of $n$ jobs, $J = \{1, 2, \ldots, n\}$, on a single
machine. We assume that all jobs are available at the outset, and that once processing begins no job is
preempted. Each job $i$ requires a random processing time $p_i$. The processing times are independent of
each other. The performance measure to be minimized is the expected number of jobs finished after a
given common due date, which is denoted as $d$.
Let $\Pi$ contain all possible sequences of the $n$ jobs. In a sequence $\pi \in \Pi$ represented by
$([1], [2], \ldots, [k], \ldots, [n])$, let $[k]$ indicate the job occupying the $k$th position in that sequence. The
completion time of the $k$th job, $t_{[k]}$, satisfies
$$t_{[k]} = \sum_{i=1}^{k} p_{[i]}.$$
The function $C(\pi)$, representing the expected number of tardy jobs of a sequence $\pi$, is expressed by
$$C(\pi) = C([1], [2], \ldots, [k], \ldots, [n]) = \sum_{k=1}^{n} \Pr\{t_{[k]} > d\} = \sum_{k=1}^{n} \Pr\left\{\sum_{i=1}^{k} p_{[i]} > d\right\} \tag{1}$$
Our goal is to find the sequence $\pi \in \Pi$ that minimizes $C(\pi)$. This problem is difficult to solve if
processing times are random variables. It is generally impossible to obtain an optimal sequence in
polynomial time. The information given in Pinedo (1995) shows that the problem is tractable only when
exponentially distributed processing times are used. Consequently, it is necessary to develop more
efficient solution methodologies for this problem.
To that end, we propose a mathematical programming approach. This approach transforms the
stochastic optimization problem given in (1) to equivalent non-linear deterministic problems, which are
then solved using appropriate deterministic optimization techniques.
Let $x_{ij} = 1$ if job $i$ is scheduled at the $j$th position in the sequence, and 0 otherwise. Since each job is
scheduled exactly once and each position holds exactly one job, we have the assignment constraints
$$\sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, \ldots, n \tag{2}$$
$$\sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, \ldots, n \tag{3}$$
Under these constraints a term $\Pr\left\{\sum_{i=1}^{k} p_{[i]} > d\right\}$ is equivalent to
$$\Pr\left\{\sum_{j=1}^{k} \sum_{i=1}^{n} p_i x_{ij} > d\right\},$$
and hence, we can develop and solve a non-linear integer programming model.
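The equivalence between the positional sum and the assignment-variable sum can be checked numerically. The following is a minimal sketch, not part of the original study; the processing-time values, the sequence, and the helper names `assignment_matrix` and `prefix_sums_from_x` are hypothetical and chosen only for illustration. Jobs and positions are indexed from 0 in the code.

```python
from itertools import accumulate

def assignment_matrix(sequence, n):
    """x[i][j] = 1 if job i occupies position j (0-based), else 0."""
    x = [[0] * n for _ in range(n)]
    for j, i in enumerate(sequence):
        x[i][j] = 1
    return x

def prefix_sums_from_x(p, x):
    """Sum over j <= k and over all i of p_i * x_ij, for k = 1..n."""
    n = len(p)
    per_position = [sum(p[i] * x[i][j] for i in range(n)) for j in range(n)]
    return list(accumulate(per_position))

# Hypothetical realised processing times and a sequence of job indices.
p = [4.0, 7.0, 2.0, 5.0]
sequence = [2, 0, 3, 1]

x = assignment_matrix(sequence, len(p))
from_matrix = prefix_sums_from_x(p, x)
from_sequence = list(accumulate(p[i] for i in sequence))
print(from_matrix)
print(from_sequence)
```

Both lists hold the completion times of positions 1 to n, so they coincide for any permutation that satisfies constraints (2) and (3).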
Consider a case where $p_i$ follows $N(\mu_i, \sigma_i^2)$. Normal processing times are justified in practice when
each job consists of many elementary tasks, which have random processing times. Unlike other studies,
we do not impose any relationship between the mean and the variance of a processing time. The original
scheduling problem in (1) can be rewritten as
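(Model A1)
$$\min \sum_{k=1}^{n} \left[ 1 - \Phi\left( \frac{d - \sum_{j=1}^{k} \sum_{i=1}^{n} \mu_i x_{ij}}{\left( \sum_{j=1}^{k} \sum_{i=1}^{n} \sigma_i^2 x_{ij} \right)^{1/2}} \right) \right]$$
$$
\begin{aligned}
\text{s.t.} \quad & \sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, \ldots, n \\
& \sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, \ldots, n \\
& x_{ij} \in \{0, 1\} \quad \forall i, j
\end{aligned}
$$
where $\Phi(\cdot)$ is the cumulative distribution function for the standard normal random variable.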
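The step from Eq. (1) to Model A1 uses the fact that a sum of independent normal variables is normal, with mean and variance equal to the sums of the individual means and variances. The sketch below evaluates the resulting objective for a given sequence and, for very small instances, finds a minimizer by enumeration. It is an illustration under stated assumptions rather than the authors' LINGO implementation: the function names and the 4-job data are hypothetical, and the handling of a zero-variance prefix is a convention added here.

```python
import math
from itertools import permutations

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_tardy(sequence, mu, sigma, d):
    """Eq. (1) under normality: sum over positions k of 1 - Phi(a_k)."""
    total, mean, var = 0.0, 0.0, 0.0
    for i in sequence:
        mean += mu[i]
        var += sigma[i] ** 2
        if var > 0.0:
            total += 1.0 - phi((d - mean) / math.sqrt(var))
        else:
            # Added convention: a deterministic prefix is tardy iff it ends after d.
            total += 1.0 if mean > d else 0.0
    return total

def best_by_enumeration(mu, sigma, d):
    """Exhaustive search over all n! sequences; only sensible for small n."""
    jobs = range(len(mu))
    return min(permutations(jobs), key=lambda s: expected_tardy(s, mu, sigma, d))

# Hypothetical 4-job instance (illustrative values, not taken from the paper).
mu = [12.0, 15.0, 18.0, 11.0]
sigma = [1.0, 5.0, 2.0, 4.0]
d = 0.6 * sum(mu)

seq = best_by_enumeration(mu, sigma, d)
print(seq, round(expected_tardy(seq, mu, sigma, d), 3))
```

The same `expected_tardy` function can score a sequence produced by any of the approximate models, which is how solution quality is compared on a common scale.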
Our first approximation to the problem is based on the assumption that in order to minimize $1 - \Phi(a)$,
we want $a$ to be as large as possible. Hence, we want to maximize $a$. This yields the following variation
of Model A1.
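(Model A2)
$$\max \sum_{k=1}^{n} \frac{d - \sum_{j=1}^{k} \sum_{i=1}^{n} \mu_i x_{ij}}{\left( \sum_{j=1}^{k} \sum_{i=1}^{n} \sigma_i^2 x_{ij} \right)^{1/2}}$$
$$
\begin{aligned}
\text{s.t.} \quad & \sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, \ldots, n \\
& \sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, \ldots, n \\
& x_{ij} \in \{0, 1\} \quad \forall i, j
\end{aligned}
$$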
Models A1 and A2 can be solved using non-linear programming software such as LINGO. Since the
constraints are equalities and totally unimodular, these models belong to a relatively easy class of
non-linear integer programming problems. Hence, we believe the results should be close to optimal solutions,
but the computation may be slow, especially for large-size problems.
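Because $\Phi$ is monotone, maximizing a single standardized slack is the same as minimizing the corresponding tardiness probability, but maximizing the sum of slacks is only a surrogate for minimizing the sum of probabilities. The sketch below compares the two objectives by enumeration on a small instance. It is a hedged illustration, not the solution procedure used in the paper: the data are hypothetical, all standard deviations are assumed strictly positive, and enumeration stands in for a non-linear solver.

```python
import math
from itertools import permutations

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def standardized_slacks(sequence, mu, sigma, d):
    """a_k = (d - prefix mean) / sqrt(prefix variance) for each position k."""
    slacks, mean, var = [], 0.0, 0.0
    for i in sequence:
        mean += mu[i]
        var += sigma[i] ** 2  # assumes sigma[i] > 0, so var stays positive
        slacks.append((d - mean) / math.sqrt(var))
    return slacks

def objective_a1(sequence, mu, sigma, d):
    """Expected number of tardy jobs (to be minimized)."""
    return sum(1.0 - phi(a) for a in standardized_slacks(sequence, mu, sigma, d))

def objective_a2(sequence, mu, sigma, d):
    """Sum of standardized slacks (to be maximized)."""
    return sum(standardized_slacks(sequence, mu, sigma, d))

# Hypothetical 5-job instance (illustrative values, not taken from the paper).
mu = [12.0, 15.0, 18.0, 11.0, 16.0]
sigma = [1.0, 5.0, 2.0, 4.0, 6.0]
d = 0.6 * sum(mu)

all_sequences = list(permutations(range(len(mu))))
best_a1 = min(all_sequences, key=lambda s: objective_a1(s, mu, sigma, d))
best_a2 = max(all_sequences, key=lambda s: objective_a2(s, mu, sigma, d))

exact_of_a1 = objective_a1(best_a1, mu, sigma, d)
exact_of_a2 = objective_a1(best_a2, mu, sigma, d)  # A2's sequence scored by Eq. (1)
gap_percent = 100.0 * (exact_of_a2 - exact_of_a1) / exact_of_a1
print(best_a1, best_a2, round(gap_percent, 2))
```

The gap is never negative, since the A2 sequence is scored against the true minimum; how large it gets depends on the instance.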
The objective function for A2 can be simplified further by removing the square root in the
denominator of the objective function. The resulting problem in its relaxed form is a linear fractional LP
and easier to solve. This model is given by
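(Model A3)
$$\max \sum_{k=1}^{n} \frac{d - \sum_{j=1}^{k} \sum_{i=1}^{n} \mu_i x_{ij}}{\sum_{j=1}^{k} \sum_{i=1}^{n} \sigma_i^2 x_{ij}}$$
$$
\begin{aligned}
\text{s.t.} \quad & \sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, \ldots, n \\
& \sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, \ldots, n \\
& x_{ij} \in \{0, 1\} \quad \forall i, j
\end{aligned}
$$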
Finally, we make a drastic approximation by linearizing the objective function. Since the constraints
are totally unimodular, every basic solution of this linear program will be an integer vector (see Murty,
1983). This linear model is significantly easier to solve and will yield feasible solutions quickly.
However, due to the linear approximation of the objective function, the solution quality may deteriorate
significantly. This linear model is given by
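(Model A4)
$$\max \sum_{k=1}^{n} \left[ d - \sum_{j=1}^{k} \sum_{i=1}^{n} \frac{\mu_i}{\sigma_i} x_{ij} \right]$$
$$
\begin{aligned}
\text{s.t.} \quad & \sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, \ldots, n \\
& \sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, \ldots, n \\
& x_{ij} \ge 0 \quad \forall i, j
\end{aligned}
$$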
Note, however, that even if this model results in a ‘poor’ solution, it is still feasible and can be used as
a bound and starting point for the other models to increase their performance.
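One way to see why the linear model is cheap to solve is to collapse the double sum. Reading the Model A4 objective as $\sum_{k=1}^{n} [d - \sum_{j=1}^{k} \sum_{i=1}^{n} (\mu_i/\sigma_i) x_{ij}]$ (an assumption about the printed formula), position $j$ appears in the prefixes $k = j, \ldots, n$, so the objective equals $nd - \sum_{j=1}^{n} (n - j + 1)\, \mu_{[j]}/\sigma_{[j]}$. Under that reading, and assuming every $\sigma_i > 0$, a rearrangement argument suggests that ordering jobs by non-decreasing $\mu_i/\sigma_i$ maximizes it. The sketch below checks this on a small hypothetical instance by enumeration; the function names are illustrative.

```python
from itertools import permutations

def objective_a4(sequence, mu, sigma, d):
    """Sum over k of (d - sum of mu_i / sigma_i over the first k positions)."""
    total, running = 0.0, 0.0
    for i in sequence:
        running += mu[i] / sigma[i]  # assumes sigma[i] > 0
        total += d - running
    return total

def a4_sequence(mu, sigma):
    """Ascending mu_i / sigma_i: small ratios take the heavily weighted early slots."""
    return tuple(sorted(range(len(mu)), key=lambda i: mu[i] / sigma[i]))

# Hypothetical 4-job instance (illustrative values, not taken from the paper).
mu = [12.0, 15.0, 18.0, 11.0]
sigma = [1.0, 5.0, 2.0, 4.0]
d = 0.6 * sum(mu)

sorted_seq = a4_sequence(mu, sigma)
brute_seq = max(permutations(range(len(mu))),
                key=lambda s: objective_a4(s, mu, sigma, d))
print(sorted_seq, brute_seq)
```

If this reading holds, a sequence for Model A4 is available by sorting alone, which is consistent with its use as a fast starting point for the non-linear models.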
3. Computational study
To test the performance of these models, numerical examples with the following settings are solved.
The means of the processing times $\mu_i$ are real numbers randomly and independently drawn from the
uniform distribution $U[10, 20]$. The standard deviations are also uniformly distributed, with the restriction that
each processing time is positive with probability of at least 99%. That is, for a given $\mu_i$, $\sigma_i$ is randomly
selected from the uniform distribution $U[0, \mu_i/2.33]$. This guarantees that
$$\Pr\{p_i \le 0\} = \Phi\left(-\frac{\mu_i}{\sigma_i}\right) \le \Phi(-2.33) \approx 0.01$$
where $p_i$ follows $N(\mu_i, \sigma_i^2)$. The common due date $d$ is between $0.4 \sum \mu_i$ and $\sum \mu_i$. For our experiment,
10 examples are generated for each combination of problem size (5, 10, 20, 30 and 50 jobs) and due date, and
solved using LINGO.
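The instance settings described above can be sketched as a small generator. This is an illustrative reconstruction, not the authors' code: the function name `generate_instance`, the seed, and the use of Python's `random` module are assumptions made here.

```python
import math
import random

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def generate_instance(n, due_fraction, rng):
    """Draw mu_i ~ U[10, 20], sigma_i ~ U[0, mu_i / 2.33], d = fraction * sum(mu)."""
    mu = [rng.uniform(10.0, 20.0) for _ in range(n)]
    sigma = [rng.uniform(0.0, m / 2.33) for m in mu]
    d = due_fraction * sum(mu)
    return mu, sigma, d

rng = random.Random(2005)  # arbitrary seed, chosen only for reproducibility
mu, sigma, d = generate_instance(10, 0.6, rng)

# Largest probability of a non-positive processing time across the jobs.
worst = max((phi(-m / s) for m, s in zip(mu, sigma) if s > 0.0), default=0.0)
print(round(worst, 4))
```

By construction the printed probability cannot exceed $\Phi(-2.33)$, which is the 1% bound stated in the text.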
Table 1
Optimality of Model A1 (due date given as a multiple of $\sum \mu_i$)

n    Due date   Optimal # tardy jobs   Model A1 # tardy jobs   Model A1 % error
5    0.4        2.997                  2.997                   0.0
5    0.6        2.117                  2.117                   0.0
5    0.8        1.295                  1.295                   0.0
5    1.0        0.504                  0.504                   0.0
10   0.4        5.656                  5.656                   0.0
10   0.6        3.776                  3.776                   0.0
10   0.8        2.043                  2.043                   0.0
10   1.0        0.533                  0.533                   0.0
Table 2
Comparison of models (due date given as a multiple of $\sum \mu_i$)

                Model A1     Model A2               Model A3               Model A4
n    Due date   Tardy jobs   Tardy jobs   % Error   Tardy jobs   % Error   Tardy jobs   % Error
5    0.4        2.997        3.014        0.6       3.064        2.3       3.518        17.4
5    0.6        2.117        2.145        1.3       2.164        2.3       2.547        20.3
5    0.8        1.295        1.317        1.7       1.317        1.7       1.526        17.8
5    1.0        0.504        0.504        0.0       0.505        0.2       0.542        7.4
10   0.4        5.656        5.740        1.5       5.927        4.8       6.536        15.5
10   0.6        3.776        3.888        3.0       3.935        4.2       4.636        22.8
10   0.8        2.043        2.095        2.5       2.099        2.7       2.573        25.9
10   1.0        0.533        0.533        0.0       0.533        0.0       0.627        17.7
20   0.4        10.741       11.089       3.2       11.468       6.8       12.540       16.8
20   0.6        6.976        7.244        3.8       7.383        5.8       8.429        20.8
20   0.8        3.596        3.704        3.0       3.740        4.0       4.445        23.6
20   1.0        0.601        0.601        0.0       0.612        1.8       0.734        22.1
First, the performance of Model A1 is compared with the optimal values obtained using total
enumeration. These are shown in Table 1. As shown, solutions for Model A1 are always equal to optimal
solutions. This is anticipated since the model is a direct transformation of the original problem. When the
number of jobs becomes more than 10, it is not possible to compute optimal values by complete
enumeration in a reasonable amount of time. Because Model A1 generates optimal solutions, we
compare the performance of other models with that of Model A1 for problems with the number of jobs
greater than 10.
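The limit on total enumeration follows from the number of sequences, which is $n!$. A quick count, given here only as an illustration:

```python
import math

# Number of distinct sequences for the problem sizes discussed in the text.
counts = {n: math.factorial(n) for n in (5, 10, 20)}
for n, count in counts.items():
    print(n, count)
```

There are 3,628,800 sequences for 10 jobs but about $2.4 \times 10^{18}$ for 20 jobs, so evaluating Eq. (1) for every sequence quickly becomes impractical.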
Next, the performance of the other three models is compared with that of Model A1. Table 2 shows
the expected number of tardy jobs for each model and the percentage deviation from values generated by
Model A1. Note that the sequences obtained using Models A2, A3, and A4 need to be evaluated by Eq.
(1) for proper comparison of solution quality. When n is equal to 5 or 10, both non-linear models, Models
A2 and A3, perform very well, generating satisfactory solutions with less than 5% deviation from the
optimal. Models A2 and A3 generate solutions more quickly than A1 without losing much in
performance. The linear Model A4 does not provide solutions as good as those of the other models, but it is
significantly faster as the problem size increases. When n is 20, Model A2 still provides very good
answers within fairly short computation times, while the solution quality of Model A3 begins to
deteriorate even though its computational time remains good. We can also
observe that the solution quality of Model A2 is worst when the due date is around $0.6 \sum \mu_i$. As the
due date diverges from this value, there are more partial sequences that do not affect the performance of
schedules, and hence the overall performance improves. For example, if the due date is either very
small or very large, the sequence does not matter because all or none of the jobs are tardy.
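The percentage deviation reported in Table 2 can be sketched as a relative difference against the Model A1 value. The helper name `percent_deviation` is hypothetical, and the inputs below are the rounded n = 5, $d = 0.4 \sum \mu_i$ entries of Table 2. The tabulated errors appear to be averages over the 10 instances (an inference from the experimental setup), so recomputing them from rounded averages need not reproduce every entry exactly.

```python
def percent_deviation(model_value, reference_value):
    """Relative excess of a model's expected tardy count over the reference, in %."""
    return 100.0 * (model_value - reference_value) / reference_value

# Rounded Table 2 entries for n = 5 and d = 0.4 * sum(mu).
a1_value = 2.997
model_values = {"A2": 3.014, "A3": 3.064, "A4": 3.518}

deviations = {name: percent_deviation(value, a1_value)
              for name, value in model_values.items()}
for name, dev in deviations.items():
    print(name, round(dev, 2))
```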
Fig. 1 shows the comparison of computation times of the three models with change in due date values
when n is 20. A Pentium II 300 MHz personal computer was used for this study. Note that Model A4
always takes less than a couple of seconds and is not included in the graph. Models A2 and A3 consume
considerably less time than Model A1, and to our surprise Model A2 is faster than A3. The
computational time of Model A1 increases exponentially as the due date increases. On the other hand,
the computational times for the other models are fairly constant regardless of due dates.
Fig. 1. Comparison of computation time (n = 20).
Considering both performance and computational time, we believe that Model A2 provides very
satisfactory answers. Consequently, we conducted further evaluation of Model A2 on large-size
problems. The results are summarized in Table 3. It seems that the number of jobs does not affect relative
performance of Model A2. The average errors range between 2 and 3% while the worst case errors are
below 10%. Therefore, we recommend its usage as a solution procedure for this scheduling problem.
Finally, we take a closer look at the case when the variance of processing time changes. The
performance of the heuristic is likely to depend on the processing time variance used. Table 4 presents
comparative results when $n = 10$ and $d = 0.6 \sum \mu_i$. We consider three levels of variance:
50% smaller than our original setting, the original setting defined at the beginning of this section, and
50% larger than the original setting. These three cases are designated as Low, Med, and High variances
in Table 4, respectively. The performance of Model A2 deteriorates as the variance
increases: both average errors and worst case errors increase. However, the magnitude of the increase is
rather small, which supports the validity of Model A2.
Table 3
Deviation of Model A2 from Model A1 (due date given as a multiple of $\sum \mu_i$)

n    Due date   Minimum (%)   Average (%)   Maximum (%)
20   0.4        0.4           3.2           7.8
20   0.6        0.0           3.8           9.2
20   0.8        0.0           3.0           8.8
30   0.4        0.3           3.6           7.5
30   0.6        0.6           3.6           7.8
30   0.8        0.0           2.5           7.2
50   0.4        0.4           3.2           8.5
50   0.6        1.1           3.6           9.3
50   0.8        0.0           2.2           8.2
Table 4
Performance of Model A2 under variance change

Variance   Minimum (%)   Average (%)   Maximum (%)
Low        0.0           2.4           8.7
Med        0.0           3.0           9.9
High       0.0           3.2           10.5
4. Conclusion
This paper considered the single machine stochastic scheduling problem in which the objective is to
minimize the expected number of tardy jobs. This problem is very difficult to solve, especially when the
job processing times are stochastic. The problem was mathematically modeled, and four mathematical
programming models were proposed: an exact non-linear integer program, two non-linear approximations,
and a linear approximation. The approximate models generate near optimal solutions with
significant computational time savings. As the number of jobs increases, the second proposed model
becomes the model of choice in terms of computation time and solution quality. Our mathematical
programming approach provides a general framework to solve other similar stochastic scheduling
problems. The computation time and accuracy can potentially be improved by using sophisticated
algorithms such as genetic algorithms to solve the transformed models that are determined by this
approach. Situations involving jobs with different due dates and weights, as well as various
probability distributions, can be modeled by appropriately modifying the models proposed in this
paper. We are in the process of developing the multiple machine version of this problem.
Acknowledgements
The helpful comments and suggestions of anonymous referees and the area editor are very much
appreciated. This research is partially supported by a grant from the University of Missouri Research
Board.
References
Balut, S. J. (1973). Scheduling to minimize the number of late jobs when set-up and processing times are uncertain.
Management Science, 19, 1283–1288.
Boxma, O. J., & Forst, F. G. (1986). Minimizing the expected weighted number of tardy jobs in stochastic flow shops.
Operations Research Letters, 5, 119–126.
Cai, X., & Zhou, S. (1997). Scheduling stochastic jobs with asymmetric earliness and tardiness penalties. Naval Research
Logistics, 44, 531–557.
Chang, C., & Yao, D. D. (1993). Rearrangement, majorization and stochastic scheduling. Mathematics of Operations Research,
18, 658–684.
De, P., Ghosh, J. B., & Wells, C. E. (1991). On the minimization of the weighted number of tardy jobs with random processing
times and deadline. Computers and Operations Research, 18, 457–463.
Jang, W. (2002). Dynamic scheduling of stochastic jobs on a single machine. European Journal of Operational Research, 138,
518–530.
Kise, H., & Ibaraki, T. (1983). On Balut’s algorithm and NP-completeness for a chance-constrained scheduling problem.
Management Science, 29, 384–388.
Lin, C., & Lee, C. (1995). Single-machine stochastic scheduling with dual criteria. IIE Transactions, 27, 244–249.
Murty, K. G. (1983). Linear programming. New York: Wiley.
Pinedo, M. (1983). Stochastic scheduling with release dates and due dates. Operations Research, 31, 559–572.
Pinedo, M. (1995). Scheduling: Theory, algorithms, and systems. New Jersey: Prentice-Hall.
Righter, R. (1994). Scheduling. In M. Shaked, & J. G. Shanthikumar (Eds.), Stochastic orders and their applications. New
York: Academic Press.
Sarin, S. C., Erel, E., & Steiner, G. (1991). Sequencing jobs on a single machine with a common due date and stochastic
processing times. European Journal of Operational Research, 51, 188–198.
Soroush, H. M., & Fredendall, L. D. (1994). The stochastic single machine scheduling problem with earliness and tardiness
costs. European Journal of Operational Research, 54, 287–302.