Stopping Rules, Permutation Invariance
and Sufficiency Principle

by

Nitis Mukhopadhyay
Pranab Kumar Sen
Bikas Kumar Sinha

Institute of Statistics Mimeo Series No. 1820

March 1987

STOPPING RULES, PERMUTATION INVARIANCE AND SUFFICIENCY PRINCIPLE(1)

BY NITIS MUKHOPADHYAY, PRANAB KUMAR SEN AND BIKAS KUMAR SINHA
University of Connecticut, Storrs, University of North Carolina,
Chapel Hill, and Indian Statistical Institute, Calcutta
SUMMARY
In the context of sequential (point as well as interval) estimation, a general formulation of permutation-invariant stopping rules is considered. These stopping rules lead to savings in the ASN at the cost of some elevation of the associated risk--a phenomenon which may be attributed to the violation of the sufficiency principle. For the (point and interval) sequential estimation of the mean of a normal distribution, it is shown that such permutation-invariant stopping rules may lead to a substantial saving in the ASN with only a small increase in the associated risk.
(1) Work partially supported by (i) Office of Naval Research, Contract Number N00014-85-K-0548, and (ii) Office of Naval Research, Contract Number N00014-83-K-0387.

AMS 1980 Subject Classifications. Primary 62L12; secondary 62B99.
Key words and phrases. Permutation invariant stopping rules, average sample numbers, percentage savings, sequential point estimation, fixed-width confidence interval, normal mean, unknown variance.
1. Introduction. Let {X_1, X_2, ...} be a sequence of independent and identically distributed random variables (i.i.d.r.v.) with a distribution function (d.f.) F. For every n (≥ 1), let T_n = T(n; X_1,...,X_n) be a nonnegative statistic, and let {b_ν : ν = 1,2,...} be a nondecreasing sequence of real, positive numbers. Consider a general stopping variable

(1.1)    τ_ν = inf{n ≥ m : n ≥ b_ν T_n},

where m is a preassigned positive integer (which may even depend on ν), ν = 1,2,.... It may be mentioned that the sequential estimation rules studied by Ray (1957), Robbins (1959), Chow and Robbins (1965), Ghosh and Mukhopadhyay (1975, 1976), Woodroofe (1982) and others may be unified in the form of (1.1).
We may notice that in (1.1), at the nth stage, b_ν T(n; X_1,...,X_n) is compared with n, where X_1,...,X_n have been observed in that order. In a general setup, T_n is a symmetric function of X_1,...,X_n, and, in a nonsequential setup, X_1,...,X_n are permutationally invariant (PI) in the sense that their joint distribution remains invariant under any permutation of the n arguments. However, in a general sequential setup, for an arbitrary stopping rule τ_ν, X_1,...,X_{τ_ν} need not be permutationally invariant. Thus, a natural question may arise: could we have stopped earlier if the same set of X_1,...,X_{τ_ν} had arrived in a possibly different order? This motivates us towards the formulation of permutation-invariant stopping rules (PISR), and we shall consider this concept in detail in Section 2.
There is an intricate relationship between optimal stopping rules and (transitive) sufficiency [see Bahadur (1954)]. As we shall see in Section 2, by construction of the PISR, this (transitive) sufficiency principle is violated. Hence, a PISR may not share the optimality properties. Nevertheless, it will be shown that such PISR may lead to substantial savings in the ASN and there may not be any significant increase in the associated risk of the (sequential) estimators. To emphasize this vital aspect of PISR, in Sections 3 and 4, we will incorporate them in the case of sequential point and interval estimation problems for a normal mean (when the variance is also unknown), and show that the PISR compare very favorably with their classical (noninvariant) counterparts. In passing, we may remark that the above picture is largely asymptotic in nature, and there is good scope for in-depth numerical studies in the nonasymptotic case, which will be explored elsewhere.
2. PISR: General Formulation. Note that we have taken (for n ≥ m) T_n = T(n; X_1,...,X_n). This is done merely to include the more general situation where

(2.1)    T_n = T_n* + h_n;   T_n* = T*(X_1,...,X_n),   n ≥ m,

and {h_n : n ≥ m} is a (nonincreasing) sequence of real nonnegative numbers with lim_{n→∞} h_n = 0. We shall see in later sections that (2.1) covers a more general setup than the simple case where h_n = 0 for all n ≥ m.
For every n (≥ 1), let P_n be the set of n! permutations (i_1,...,i_n) of the first n natural integers (1,...,n). For every k (m ≤ k ≤ n) and (i_1,...,i_k) ∈ {1,...,n}, we define

(2.2)    T_n(i_1,...,i_k) = T(k; X_{i_1},...,X_{i_k}).
Then, looking at (1.1), we may consider the following PISR: at the nth stage (n ≥ m), for each k (m ≤ k ≤ n) and every (i_1,...,i_k) ∈ {1,...,n}, we compute T_n(i_1,...,i_k). If, for some k (m ≤ k ≤ n) and some (i_1,...,i_k),

(2.3)    k ≥ b_ν T_n(i_1,...,i_k),

then we stop sampling at the nth stage; otherwise, we proceed to the next stage by taking one more observation. The associated stopping variable is denoted by τ_ν*, ν = 1,2,....
If we write, for each k (m ≤ k ≤ n),

(2.4)    T_{nk}^0 = min{T_n(i_1,...,i_k) : (i_1,...,i_k) ⊂ {1,...,n}},

then we may also write τ_ν* equivalently as

(2.5)    τ_ν* = inf{n ≥ m : k ≥ b_ν T_{nk}^0, for some k : m ≤ k ≤ n},   ν = 1,2,....
Note that by (2.1), (2.2) and (2.4), for every k (m ≤ k ≤ n),

(2.6)    T_{nk}^0 ≤ T_k = T(k; X_1,...,X_k)   w.p. 1,

and hence, by (1.1) and (2.5), we have

(2.7)    τ_ν* ≤ τ_ν   w.p. 1, for every ν = 1,2,....
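The construction in (2.3)-(2.5) and the inequality (2.7) can be sketched in code. The statistic T below is only an illustrative nonnegative symmetric choice (our assumption, not one prescribed here), and the brute-force minimum over k-subsets plays the role of T_{nk}^0:

```python
from itertools import combinations

def t_stat(xs):
    # illustrative nonnegative statistic T(k; x_1,...,x_k): mean absolute value
    return sum(abs(x) for x in xs) / len(xs)

def stops_classical(x, n, b, m=2):
    # rule (1.1): stop at stage n iff n >= b * T(n; x_1,...,x_n)
    return n >= m and n >= b * t_stat(x[:n])

def stops_pisr(x, n, b, m=2):
    # rule (2.3)/(2.5): stop at stage n iff k >= b * T0_{nk} for some k,
    # where T0_{nk} is the minimum of T over all k-subsets of x_1,...,x_n
    if n < m:
        return False
    for k in range(m, n + 1):
        t0 = min(t_stat(sub) for sub in combinations(x[:n], k))
        if k >= b * t0:
            return True
    return False
```

Since the k = n subset recovers T_n itself, a classical stop always implies a PISR stop, which is exactly the pathwise ordering (2.7); the converse can fail, so the PISR may stop strictly earlier.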
It is also clear from (2.3)-(2.5) that τ_ν* remains invariant under any permutation of the indices i_1,...,i_{τ_ν*} (i.e., the order in which the X_i's enter into the (stopped) sample), while for τ_ν this invariance may not generally hold. Thus, τ_ν* is a PISR, while τ_ν may not be PI.

For each n (≥ 1), let X^{(n)} be the sample space of (X_1,...,X_n), B^{(n)} the Borel sigma-field on X^{(n)}, and let P^{(n)} be the family of probability measures on (X^{(n)}, B^{(n)}) which are assumed to be dominated by some sigma-finite measure. Let B_0^{(n)} be the sigma-subfield generated by T_n, n ≥ m.
Then, {T_n : n ≥ m} is a transitive sequence for the sequential model {(X^{(n)}, B^{(n)}, P^{(n)}); n ≥ m} if, for every n ≥ m, any version of the conditional distribution of T_n given (X_1,...,X_{n-1}) depends only on T_{n-1}. Actually, the Wijsman (1959) theorem on transitive sufficiency asserts that, for B_0^{(n)} ⊂ B^{(n)} for all n ≥ m, the sequence {B_0^{(n)} ; n ≥ m} is transitive for {B^{(n)} ; n ≥ m} if and only if B^{(n)} and B_0^{(n+1)} are conditionally independent given B_0^{(n)}. Bahadur (1954) has shown that in sequential decision problems, attention can be confined to procedures based on transitively sufficient sequences of statistics.
Thus, whenever {T_n ; n ≥ m} is such a transitive sequence, τ_ν may have some optimality properties, and it is counter-intuitive to consider the triangular scheme {T_{nk}^0, k ≤ n ; n ≥ m} and the associated τ_ν*. On the other hand, by (2.7), whenever the ASN (i.e., E(τ_ν)) exists, we have

(2.8)    E(τ_ν*) ≤ E(τ_ν),

so that τ_ν* is more desirable than τ_ν. This apparent anomaly can be easily rectified by considering the optimality properties of the sequential estimation procedures based on the stopping rules in (1.1) and (2.5), respectively. In this context, we shall see that the violation of the sufficiency principle by τ_ν* generally results in an elevated risk for the corresponding (sequential) estimator, and this is quite in line with Bahadur's (1954) basic result. Granted this explanation from the theoretical point of view, the natural question may arise: what would be the cost for the excess risk in compensation for the gain through reduction in the ASN? In other words, can some "near optimality" properties be achieved by such PISR? In the next two sections, we shall attempt to provide satisfactory answers to these questions with special reference to the sequential estimation of the normal mean problems. In this setup, we will mostly confine ourselves to the asymptotic case where "b_ν" is taken to be large [as has been done in Robbins (1959), Chow and Robbins (1965) and other places], and we believe that more favorable results can be obtained in the "non-asymptotic" case through extensive numerical work.
We do report on some limited small sample studies;
however, these are not very conclusive even though the overall picture
appears to be extremely encouraging.
3. Minimum Risk Point Estimation of the Normal Mean and PISR. Having recorded X_1,...,X_n, suppose that the loss function in estimating the (unknown) mean μ by X̄_n = n^{-1} Σ_{i=1}^n X_i is given by

(3.1)    L_n = (X̄_n − μ)^2 + cn,

where c (> 0) is the known cost per unit sample. If the variance σ^2 were known, the risk E(L_n) = σ^2 n^{-1} + cn is minimized when n = n_c^0 ∼ σ c^{-1/2}, and ρ_c^0, the associated minimum risk, is ∼ 2σ c^{1/2} as c ↓ 0; here a ∼ b means a/b → 1. Since σ is, in fact, unknown, no fixed sample size procedure would minimize E(L_n) uniformly in σ.
Let S_n^2 = (n−1)^{-1} Σ_{i=1}^n (X_i − X̄_n)^2 for n ≥ 2. Following Robbins (1959) and others, we may then consider the stopping rule:

(3.2)    N_c = inf{n ≥ m (≥ 2) : n ≥ c^{-1/2} S_n},

and estimate μ by X̄_{N_c}. Note that the risk associated with X̄_{N_c} is given by

(3.3)    ρ_c = E{(X̄_{N_c} − μ)^2 + c N_c}, say.
Up to various orders of approximations [see Starr (1966b), Woodroofe (1982)], it has been shown that, as c ↓ 0,

(3.4)    E(N_c) ∼ n_c^0   and   ρ_c ∼ ρ_c^0,

so that the sequential estimator X̄_{N_c} has asymptotically the minimum risk ρ_c^0 under a variety of conditions on m. Note that the stopping rule (3.2) is of the form (1.1) with ν = [c^{-1}]* + 1, b_ν = c^{-1/2} and T_n = T_n* = S_n for n ≥ 2, where [x]* stands for the largest integer < x. Also, we may note that for every n ≥ 2,

(3.5)    S_n^2 = (n choose 2)^{-1} Σ_{1≤i<j≤n} φ(X_i, X_j),   φ(a,b) = ½(a − b)^2.
Further, in dealing with possibly nonnormal d.f.'s [see Ghosh and Mukhopadhyay (1979)], one may modify (3.2) and consider N_c = inf{n ≥ m (≥ 2) : n ≥ c^{-1/2}(S_n + n^{-a})} for some 0 < a < 1; this would correspond to T_n = T_n* + h_n with T_n* = S_n and h_n = n^{-a}.
To introduce the PISR, we define S_n^2(i_1,...,i_k) as in (2.2) and let

(3.6)    Z_{n,k} = min{S_n^2(i_1,...,i_k) : 1 ≤ i_1 < ... < i_k ≤ n}.

Then, the stopping variable N_c* of the PISR is of the form

(3.7)    N_c* = inf{n ≥ m (≥ 2) : k ≥ c^{-1/2}(Z_{n,k}^{1/2} + k^{-a}), for some k : m ≤ k ≤ n}.
We estimate μ by X̄_{N_c*} and denote the risk of this sequential estimator by ρ_c*. Our main contention is to compare E(N_c) and ρ_c with E(N_c*) and ρ_c*, respectively.
Let C_n = C(S_ℓ^2 ; ℓ ≥ n) be the sigma-field generated by the tail sequence {S_ℓ^2 ; ℓ ≥ n}, for n ≥ 2, so that C_n is nonincreasing in n. Then, {S_n^2, C_n ; n ≥ 2} is a reverse martingale, so that

(3.8)    S_n^2 → σ^2   w.p. 1, as n → ∞,

(3.9)    E|S_n^2 − σ^2| → 0   (convergence in the first mean), as n → ∞,

where for both these results, E(X^2) < ∞ suffices and the normality of the d.f. of X is not all that crucial.
Also, by definition,

(3.10)    (n choose k)^{-1} Σ_{1≤i_1<...<i_k≤n} S_n^2(i_1,...,i_k) = S_n^2,

so that by (3.6) and (3.10), we obtain

(3.11)    Z_{n,k} ≤ S_n^2,   for every k : 2 ≤ k ≤ n; n ≥ 2.

Suppose now that J_{k+1,n}^0 = {j_1^0,...,j_{k+1}^0} ⊂ {1,...,n} is such that

(3.12)    Z_{n,k+1} = S_n^2(j_1^0,...,j_{k+1}^0),   k = m−1,...,n−1.

Then, for every k-element subset J_k = (j_1,...,j_k) ⊂ J_{k+1,n}^0, letting j_{k+1} = J_{k+1,n}^0 \ J_k, we have

(3.13)    S_n^2(j_1,...,j_k,j_{k+1}) − S_n^2(j_1,...,j_k)
              = (k+1)^{-1} {(X_{j_{k+1}} − X̄(J_k))^2 − ((k+1)/k) S_n^2(j_1,...,j_k)},

where X̄(J_k) = k^{-1} Σ_{i=1}^k X_{j_i}, so that for X_{j_{k+1}} being one of the two extreme values (within the set {X_j : j ∈ J_{k+1,n}^0}), the term within the parentheses {·} is nonnegative, and, hence,

(3.14)    S_n^2(j_1,...,j_k) ≤ Z_{n,k+1}.
On the other hand, by construction,

(3.15)    Z_{n,k} ≤ S_n^2(j_1,...,j_k),

so that by (3.14) and (3.15), we obtain for every n ≥ m,

(3.16)    Z_{n,k} ≤ Z_{n,k+1}   w.p. 1, for every k : m ≤ k ≤ n−1.

We may again recall that φ(a,b) = ½(a − b)^2, so that using the order statistics X_{n:1} ≤ ... ≤ X_{n:n} corresponding to X_1,...,X_n and following some routine steps based on the usual inclusion-exclusion principle, we obtain, for every n ≥ k ≥ m (≥ 2),

(3.17)    Z_{n,k} = min{(k choose 2)^{-1} Σ_{q≤i<j≤q+k−1} φ(X_{n:i}, X_{n:j}) : 1 ≤ q ≤ n−k+1}
                 = min{Z_{n,k}^{(q)} : 1 ≤ q ≤ n−k+1}, say,

where Z_{n,k}^{(q)} is the sample variance of the (ordered) sub-sample {X_{n:q},...,X_{n:q+k−1}} of size k, for 1 ≤ q ≤ n−k+1, k ≥ 2.
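The reduction in (3.17) -- from a combinatorial minimum over all k-subsets to a scan over blocks of k consecutive order statistics -- is easy to check numerically. The helper names below are ours, and the brute-force version is only for verification:

```python
from itertools import combinations

def var_k(xs):
    # sample variance with divisor (k - 1), matching S_n^2
    k = len(xs)
    m = sum(xs) / k
    return sum((x - m) ** 2 for x in xs) / (k - 1)

def z_brute(x, k):
    # Z_{n,k} per (3.6): minimum sub-sample variance over all k-subsets
    return min(var_k(sub) for sub in combinations(x, k))

def z_windows(x, k):
    # Z_{n,k} per (3.17): the minimizing subset is a block of k consecutive
    # order statistics, so only n - k + 1 windows need to be examined
    xs = sorted(x)
    n = len(xs)
    return min(var_k(xs[q:q + k]) for q in range(n - k + 1))
```

The window form drops the cost from O(n choose k) subsets to n − k + 1 windows, and the monotonicity (3.16), Z_{n,k} ≤ Z_{n,k+1}, can be observed directly on any sample.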
Next, we note that

(3.18)    Z_{n,k}^{(q)} = Z_{n+1,k}^{(q)}   if X_{n+1} > X_{n:q+k−1};
          Z_{n,k}^{(q)} = Z_{n+1,k}^{(q+1)}   if X_{n+1} < X_{n:q},

while for X_{n:q} ≤ X_{n+1} ≤ X_{n:q+k−1}, we may use (3.13) to compute Z_{n+1,k}^{(q)} from Z_{n,k}^{(q)}. In fact, in this case, it follows that

(3.19)    Z_{n+1,k}^{(q)} is smaller than at least one of Z_{n,k}^{(q)} and Z_{n,k}^{(q+1)}.

Thus, from the viewpoint of computation, given the picture at the nth stage, we do not have to exhaust the full computation of the Z_{n+1,k}^{(q)} at the (n+1)th stage, and this observation would be of considerable help, particularly if n is large.
THEOREM 3.1. If the X_i's have a normal d.f. with a finite variance σ^2, then

(3.20)    lim_{c↓0} {E(N_c*)/E(N_c)} = (π/6)^{1/2},

(3.21)    lim_{c↓0} {ρ_c*/ρ_c} = (6+π)(24π)^{-1/2},

so that as c ↓ 0, for the PISR, there is about a 27.6% reduction in the ASN at the expense of only about a 5% increase in the risk.
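The limiting constants in (3.20) and (3.21) can be checked by direct arithmetic:

```python
import math

asn_ratio = math.sqrt(math.pi / 6)                    # limit in (3.20)
risk_ratio = (6 + math.pi) / math.sqrt(24 * math.pi)  # limit in (3.21)

print(round(100 * (1 - asn_ratio), 1))   # percentage ASN reduction -> 27.6
print(round(100 * (risk_ratio - 1), 1))  # percentage risk increase -> 5.3
```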
PROOF. We start with some identities on the variance of truncated normal distributions. Let g and G be, respectively, the standard normal density function and d.f., and for every α, β such that 0 < α, β < 1 and α + β < 1, let a = G^{-1}(β), b = G^{-1}(α+β). Then, we have

(3.22)    ∫_a^b x dG(x) = g(a) − g(b),

(3.23)    ∫_a^b x^2 dG(x) = α − {b g(b) − a g(a)},

so that

(3.24)    α^{-1} ∫_a^b x^2 dG(x) − {α^{-1} ∫_a^b x dG(x)}^2
              = 1 − α^{-1}{b g(b) − a g(a)} − α^{-2}{g(b) − g(a)}^2.

Note that g′(x) = −x g(x) and g″(x) = (x^2 − 1) g(x), so that for any α ∈ (0,1), the right-hand side (rhs) of (3.24) is minimized for β = ½(1−α).
Thus, if we define d_α by G(d_α) = ½(1+α), then for β = ½(1−α) the rhs of (3.24) reduces to

(3.25)    1 + 2α^{-1} g′(d_α) = 1 − 2α^{-1} d_α g(d_α)
              = 2{G(d_α) − G(0) − d_α g(d_α)}/α
              = 2{⅓ d_α^3 g(0) + O(d_α^5)}/α
              = {⅓ d_α^3 g(0) + O(d_α^5)}/{G(d_α) − G(0)}
              = ⅓ d_α^2 + O(d_α^4)
              = α^2/(12 g^2(0)) + O(α^4) = (π/6) α^2 + O(α^4),   as α ↓ 0.
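The conclusion of (3.25) -- that the variance of the central α-fraction of a standard normal is (π/6)α^2 + O(α^4) -- can be verified numerically. The bisection for d_α below is only an implementation convenience of ours:

```python
import math

def G(x):
    # standard normal d.f.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def g(x):
    # standard normal density
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def central_truncated_var(alpha, tol=1e-12):
    # variance of the standard normal truncated to (-d_alpha, d_alpha),
    # i.e. the rhs of (3.24) with beta = (1 - alpha)/2, evaluated via (3.25)
    # as 1 - 2 d_alpha g(d_alpha)/alpha, with G(d_alpha) = (1 + alpha)/2
    lo, hi = 0.0, 10.0
    target = (1.0 + alpha) / 2.0
    while hi - lo > tol:           # bisection for d_alpha
        mid = (lo + hi) / 2.0
        if G(mid) < target:
            lo = mid
        else:
            hi = mid
    d = (lo + hi) / 2.0
    return 1.0 - 2.0 * d * g(d) / alpha

for alpha in (0.2, 0.1, 0.05):
    print(alpha, central_truncated_var(alpha), (math.pi / 6) * alpha ** 2)
```

As α shrinks, the exact value and (π/6)α^2 agree to the relative order α^2, in line with the O(α^4) remainder.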
Returning now to the proof of the theorem, we note that in (3.7), an admissible k must satisfy the condition k^{1+a} ≥ c^{-1/2}, so that as c ↓ 0, k → ∞. We rewrite (3.7) as

(3.26)    N_c* = inf{n ≥ m (≥ 2) : n^2 c ≥ (k/n)^{-2}(Z_{n,k}^{1/2} + k^{-a})^2, for some k : m ≤ k ≤ n}.
Keeping this in mind, we first study the asymptotic behavior of (n/k)^2 Z_{n,k} when k and n are both large. Let F_n(x) = n^{-1} Σ_{i=1}^n I(X_i ≤ x) be the empirical d.f. of the sample of size n. Then, we have

(3.27)    Z_{n,k}^{(q)} ≅ (n/k) ∫_{I_{nkq}} x^2 dF_n(x) − {(n/k) ∫_{I_{nkq}} x dF_n(x)}^2,

where

(3.28)    I_{nkq} = {x : X_{n:q} ≤ x ≤ X_{n:q+k−1}},   1 ≤ q ≤ n−k+1,   m ≤ k ≤ n.

Since the Z_{n,k}^{(q)} are translation invariant, without any loss of generality, we may set μ = 0, so that the d.f. F of X_1 is taken as

(3.29)    F(x) = G(x/σ),   x ∈ R.
Then,

(3.30)    ∫_{I_{nkq}} x^2 dF(x) ≅ σ^2 (k/n) − σ{X_{n:q+k−1} g(σ^{-1} X_{n:q+k−1}) − X_{n:q} g(σ^{-1} X_{n:q})},

and a parallel expression holds for ∫_{I_{nkq}} x dF(x); these are both smooth (i.e., bounded and differentiable) functions of X_{n:q} and X_{n:q+k−1}.
Let us now write

(3.31)    Z_{n,k}^{(q)*} = (n/k) ∫_{I_{nkq}} x^2 dF(x) − {(n/k) ∫_{I_{nkq}} x dF(x)}^2,

(3.32)    I_{nkq}^0 = {x : σ G^{-1}(q/n) ≤ x ≤ σ G^{-1}((q+k−1)/n)},

(3.33)    Ẑ_{n,k}^{(q)} = (n/k) ∫_{I_{nkq}^0} x^2 dF(x) − {(n/k) ∫_{I_{nkq}^0} x dF(x)}^2.

Note that for r = 1, 2, we have

(3.34)    ∫_{I_{nkq}} x^r dF_n(x) − ∫_{I_{nkq}} x^r dF(x) = ∫_{I_{nkq}} x^r d(F_n(x) − F(x))
              = n^{-1/2} { n^{1/2}[F_n(x) − F(x)] x^r |_{X_{n:q}}^{X_{n:q+k−1}}
                − r ∫_{I_{nkq}} n^{1/2}[F_n(x) − F(x)] x^{r−1} dx }.

Also, for a normal d.f. F, for every finite r (> 0),

(3.35)    ∫_{−∞}^{∞} |x|^r {F(x)[1 − F(x)]}^γ dx < ∞   for every γ > 0,

while [see Ghosh (1972)], as n → ∞,

(3.36)    sup_x |F_n(x) − F(x)| = O(n^{-1/2} (log log n)^{1/2}),

w.p. 1 as well as in the rth mean. Thus, the rhs of (3.34) is O(n^{-1/2}) w.p. 1, as well as in the sth mean for all s (> 0).
On the other hand, proceeding as in (3.24)-(3.25), we can verify that for any given k/n, (3.33) is minimized when q ≈ ½(n − k), and this minimum value is given by the rhs of (3.25) with α = k/n. Thus, using (3.29), (3.30), (3.33) and the lower bound in (3.25), it follows that whenever k → ∞ and n → ∞ with n^{1/2}(k/n)^3 → ∞ at a rate faster than log n, then for every ε > 0, there exists c(ε) ∈ (0, ∞) such that

(3.37)    P{ |Z_{n,k}^{(q)*} / Ẑ_{n,k}^{(q)} − 1| > ε } ≤ c(ε) n^{-5},   ∀ n ≥ n_0,

(3.38)    Ẑ_{n,k}^{(q)} = (π/6)(k/n)^2 σ^2 {1 + O((n^{-1/2} log n)^{2/3})}.

Actually, in (3.37), we could have used an exponential rate in n, but O(n^{-5}) suffices. Similarly, using (3.34)-(3.36), we have, for the same set of k and n,

(3.39)    P{ |Z_{n,k}^{(q)} / Z_{n,k}^{(q)*} − 1| > ε } ≤ c(ε) n^{-5},   ∀ n ≥ n_0.

Thus, for k → ∞, n → ∞ with n^{1/2}(k/n)^3 (log n)^{-1} → ∞, we have

(3.40)    P{ |Z_{n,k}^{(q)} / Ẑ_{n,k}^{(q)} − 1| > ε′ } ≤ c′(ε) n^{-5},   ∀ n ≥ n_0,

where ε′ (> ε) can be made to converge to 0 when ε → 0. Also, refer to Kiefer (1961, 1967). Now, combining (3.38) and (3.40) over the n − k + 1 values of q,

(3.41)    P{ |(n/k)^2 Z_{n,k} − (π/6)σ^2| > ε′ σ^2 } ≤ c′(ε) n^{-4},   ∀ n ≥ n_0.
On the other hand, whenever k → ∞ and n → ∞ but k/n → 0, we write

(3.42)    (n/k) ∫_{I_{nkq}} x^r dF_n(x)   (r = 1, 2),

as a linear combination of functions of order statistics (i.e., h_1(x) = x, h_2(x) = x^2). For normal distributions, it is known that the sample order statistics have finite moments of any finite order, and hence, using the classical moment convergence results on order statistics [see Sen (1959), Van Zwet (1964)], it follows that the 2pth central moment of (3.42) is

(3.43)    O(k^{-p}).
Now, as has been pointed out immediately before (3.26), k^{1+a} ≥ c^{-1/2}, so that whenever n^2 c is bounded, k ≥ O(n^{1/(1+a)}), and hence, choosing p such that p/(1+a) > 5, we again arrive at (3.40)-(3.41). A similar technique may be used when n^2 c is large, where we need to adjust p (in (3.43)) accordingly. Thus, we conclude that whenever k → ∞, n → ∞ with k/n > n^{-η} for some η > 0, we have, for every ε > 0,

(3.44)    P{ |(n/k)^2 Z_{n,k} − (π/6)σ^2| > ε σ^2 } ≤ c(ε) n^{-4},   ∀ n ≥ n_0.
Next, note that

(3.45)    (n_c^0)^2 c → σ^2   as c ↓ 0,

so that using (3.26), (3.44) and (3.45), we readily obtain

(3.46)    N_c* / {(π/6)^{1/2} n_c^0} → 1   w.p. 1, as c ↓ 0.
Also, it follows from Robbins (1959) and others that

(3.47)    E(N_c)/n_c^0 → 1   as c ↓ 0,

while by construction, N_c*/N_c ≤ 1 w.p. 1 for all c > 0. Hence, by the Dominated Convergence Theorem, we get

(3.48)    E(N_c*) / {(π/6)^{1/2} n_c^0} → 1   as c ↓ 0,

and this completes the proof of (3.20).
Next, we note that by (3.7),

(3.49)    P(N_c* > n) = P{ Z_{n,k} > (c^{1/2} k − k^{-a})^2, ∀ k ∈ [m, n] }
                      ≤ P{ Z_{n,k_n} > (c^{1/2} k_n − k_n^{-a})^2 },

for any choice of k_n ∈ [m, n]. Thus, if we define n_c* = σ c^{-1/2} (π/6)^{1/2} as c ↓ 0, then for every n ≥ n_c*(1 + ε), ε > 0, letting k_n = αn (α > 0, small) and using (3.41), we obtain

(3.50)    P(N_c* > n) = O(n^{-4}).
Similarly, for every n ≤ n_c*(1 − ε), ε > 0, by (3.40), we get

(3.51)    P(N_c* ≤ n) = P{ Z_{n,k} ≤ (c^{1/2} k − k^{-a})^2, for some k : m ≤ k ≤ n }
                      ≤ P{ Z_{n,k} ≤ (c^{1/2} k − k^{-a})^2, for some k : n_0* ≤ k ≤ n }
                      = O(n^{-4}(n − n_0*)),

where n_0* ∼ c^{-1/(2+2a)} is the smallest admissible k. Thus, using the Hölder inequality with r^{-1} + s^{-1} = 1, we get, for n ≤ n_c*(1 − ε),

(3.52)    E{ (X̄_{N_c*} − μ)^2 I(N_c* ≤ n) } ≤ { E|X̄_{N_c*} − μ|^{2s} }^{1/s} { P(N_c* ≤ n) }^{1/r}.
Since 0 < a < 1, we have 3 − a > 2 > 1 + a. Also, since E(|X̄_n − μ|^{2s}) = O(n^{-s}) for every s = 2, 3, ..., choosing s so large that (3−a)(s−1)/(s+sa) > 1, we obtain from (3.52), as c ↓ 0,

(3.53)    E{ (X̄_{N_c*} − μ)^2 I(N_c* ≤ n_c*(1 − ε)) } = o(c^{1/2}),

while using (3.50) and a similar inequality, we have, as c ↓ 0,

(3.54)    E{ (X̄_{N_c*} − μ)^2 I(N_c* ≥ n_c*(1 + ε)) } = o(c^{1/2}).

On the central domain, that is, n_c*(1 − ε) ≤ N_c* ≤ n_c*(1 + ε), we may virtually repeat the steps in Sen and Ghosh (1981) and conclude that, as c ↓ 0,

(3.55)    E{ (X̄_{N_c*} − μ)^2 I(|N_c* − n_c*| ≤ ε n_c*) } = σ^2/n_c* + o(c^{1/2})
              = σ(6/π)^{1/2} c^{1/2} + o(c^{1/2}).

Therefore, as c ↓ 0, by (3.48) and (3.53)-(3.55), we get

(3.56)    ρ_c* = σ c^{1/2} {(6/π)^{1/2} + (π/6)^{1/2}} + o(c^{1/2}),

so that ρ_c*/ρ_c → (6+π)/(24π)^{1/2}, and this completes the proof of (3.21).   Q.E.D.
Suppose now that in (3.7) we replace c^{-1/2} by d c^{-1/2}, for some d > 1, and denote the corresponding stopping variable by N_c*(d), that is,

(3.57)    N_c*(d) = inf{n ≥ m (≥ 2) : k ≥ d c^{-1/2}(Z_{n,k}^{1/2} + k^{-a}), for some k : m ≤ k ≤ n}.

Let us also denote by ρ_c*(d) the risk of the PISR based on the estimator X̄_{N_c*(d)}. Then, virtually repeating the proof of Theorem 3.1, we obtain

(3.58)    lim_{c↓0} {E(N_c*(d))/E(N_c)} = d (π/6)^{1/2},

(3.59)    lim_{c↓0} {ρ_c*(d)/ρ_c} = ½ {d (π/6)^{1/2} + d^{-1} (6/π)^{1/2}}.
Then, if we let

(3.60)    d = d_{0,η} = (1 − η)(6/π)^{1/2},

for some arbitrarily small η (> 0), from (3.58) and (3.59), we obtain

(3.61)    lim_{c↓0} {E(N_c*(d_{0,η}))/E(N_c)} = 1 − η,

(3.62)    lim_{c↓0} {ρ_c*(d_{0,η})/ρ_c} = 1 + ½ η^2/(1 − η).

Thus, for η = .1 (or .05), we have a 10% (or 5%) reduction in the ASN at the expense of only a .5% (or .13%) increase in the relative risk.
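The trade-off in (3.61)-(3.62) is easy to tabulate directly from the limits (a sketch with our helper names, not a simulation):

```python
import math

def asn_limit(d):
    # limiting E(N_c*(d))/E(N_c), from (3.58)
    return d * math.sqrt(math.pi / 6)

def risk_limit(d):
    # limiting rho_c*(d)/rho_c, from (3.59)
    return 0.5 * (d * math.sqrt(math.pi / 6) + math.sqrt(6 / math.pi) / d)

for eta in (0.10, 0.05):
    d0 = (1 - eta) * math.sqrt(6 / math.pi)    # the choice (3.60)
    print(eta, round(1 - asn_limit(d0), 4), round(risk_limit(d0) - 1, 4))
```

With d = d_{0,η} the ASN reduction is exactly η, while the risk penalty η^2/(2(1−η)) is of second order in η -- the source of the favorable exchange rate noted above.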
Thus, allowing η → 0 and noting that η^2/(2 − 2η) ≈ ½η^2, we arrive at the following.
THEOREM 3.2. Let {ε_ν : ν = 1,2,...} be a sequence of positive numbers such that lim_{ν→∞} ε_ν = 0. Then, there exists a sequence {η_ν : ν = 1,2,...} of positive numbers with 0 < η_ν ≤ (2ε_ν)^{1/2} for all ν, such that, defining d_ν = (1 − η_ν)(6/π)^{1/2} and N_{c,ν}* = N_c*(d_ν), we have, for every ν = 1,2,...,

(3.63)    lim_{c↓0} {E(N_{c,ν}*)/E(N_c)} = 1 − η_ν ≥ 1 − (2ε_ν)^{1/2},

(3.64)    lim_{c↓0} {ρ_c*(d_ν)/ρ_c} ≤ 1 + ε_ν.

Thus, the modified PISR N_c*(d_ν) is ε-risk efficient and has a smaller ASN than the usual stopping rule N_c.

It appears that though for the PISR the sufficiency principle is violated, yet in the asymptotic setup, the PISR or its modified version compares very favorably with the original stopping rule.
In order to get some idea regarding the comparative behaviors of N_c and N_c* for small values of n_c^0, we ran small-scale simulations with 200 replications. We fixed σ = 1, μ = 20, m = 5, 10, n_c^0 = 25, 50, 75, a = .5, .9, 1.3 and d = 1.05, 1.2, 1.35. Note that d should lie in the interval (1, (6/π)^{1/2}) so that we may expect some saving in the ASN. We observed negative savings, that is, E(N_c*) exceeded E(N_c) all the time, when a = .5. When a = .9, we noted that E(N_c*) and E(N_c) came out to be nearly the same. The following table summarizes our findings when a = 1.3 and m = 10:

(TABLE 1)
When d gets closer to (6/π)^{1/2} ≈ 1.382, naturally the percentage savings are expected to go down. For a = 1.3 and small values of n_c^0 like 50 and 75, we do observe significant savings in the ASN while using the stopping rule (3.26); however, the associated "risk-efficiency" seems to be nearly the same as it would be for the stopping rule (3.2). This is definitely encouraging, and we foresee the need for future in-depth numerical studies along these lines.
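A minimal replication of this kind of study can be sketched as follows. The function names, the pairing of the plain rule (3.2) with the modified PISR (3.57), and the run counts are our assumptions; a serious study would need many more replications:

```python
import math
import random
import statistics

def run_once(c, a=1.3, d=1.2, m=10, mu=20.0, sigma=1.0, rng=random):
    # One replication: classical rule (3.2) versus the modified PISR (3.57).
    # Z_{n,k} is computed via (3.17) as a minimum over windows of k
    # consecutive order statistics.
    root = 1.0 / math.sqrt(c)              # c^{-1/2}
    x, n_c, n_star = [], None, None
    while n_c is None or n_star is None:
        x.append(rng.gauss(mu, sigma))
        n = len(x)
        if n < m:
            continue
        if n_c is None and n >= root * statistics.stdev(x):
            n_c = n                        # classical stopping time N_c
        if n_star is None:
            xs = sorted(x)
            for k in range(m, n + 1):
                z = min(statistics.variance(xs[q:q + k])
                        for q in range(n - k + 1))
                if k >= d * root * (math.sqrt(z) + k ** (-a)):
                    n_star = n             # PISR stopping time N_c*(d)
                    break
    return n_c, n_star

# tiny demo: n_c^0 = sigma * c^{-1/2} = 25
rng = random.Random(1987)
reps = [run_once(1.0 / 625.0, rng=rng) for _ in range(20)]
print(sum(r[0] for r in reps) / 20.0,   # average N_c
      sum(r[1] for r in reps) / 20.0)   # average N_c*(d); typically smaller
```

The incremental relations (3.18)-(3.19) could be used to avoid recomputing every window at each stage; the brute-force scan above is kept for clarity.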
4. Fixed-Width Confidence Interval for the Normal Mean and PISR. Given two numbers d (> 0) and α ∈ (0,1), we wish to construct a confidence interval I_n for μ such that

(4.1)    the width of I_n = 2d   and   P(μ ∈ I_n) ≥ 1 − α.
The sequential procedure of Ray (1957), later studied in Starr (1966a), can be defined by introducing a stopping variable

(4.2)    N_d = inf{n ≥ m (≥ 2) : n ≥ ρ^2 S_n^2 d^{-2}},

where S_n^2 is defined as in (3.2) and ρ is the upper 50α% point of the standard normal d.f., that is, G(ρ) = 1 − ½α. Recall from Section 3 that g and G stand for the standard normal density function and d.f. The confidence interval I_{N_d} for μ is then taken as

(4.3)    I_{N_d} = [X̄_{N_d} − d, X̄_{N_d} + d].

From Starr (1966b), it follows that, as d ↓ 0,

(4.4)    E(N_d) ∼ ρ^2 σ^2 d^{-2} = n_d* (say)   and   P(μ ∈ I_{N_d}) → 1 − α.

One may again note that (4.2) is of the form (1.1) with ν = [d^{-2}]* + 1, b_ν = ρ^2 d^{-2} and T_n = S_n^2.
Following Chow and Robbins (1965), we may as well introduce the fudge factor n^{-a*} in (4.2), for some 0 < a* < 1. To introduce the PISR, we conceive of a positive number ρ* and, parallel to (3.7) and (3.57), we define

(4.5)    N_d*(ρ*) = inf{n ≥ m (≥ 2) : k ≥ (ρ*/d)^2 (Z_{n,k} + k^{-a*}), for some k : m ≤ k ≤ n},

and then propose the confidence interval I_{N_d*(ρ*)} = [X̄_{N_d*(ρ*)} ± d] for μ. From the detailed analysis given in Section 3, we can show that, as d ↓ 0,

(4.6)    E(N_d*(ρ*))/n_d* → (π/6)^{1/2} (ρ*/ρ),

and

(4.7)    P(μ ∈ I_{N_d*(ρ*)}) → 2G{(π/6)^{1/4} (ρ ρ*)^{1/2}} − 1.

Thus, if we let

(4.8)    (π/6)^{1/2} ρ* = ρ,

then from (4.6) and (4.7), we note that for such a particular choice of ρ*, (4.1) holds for the PISR introduced in (4.5). Thus, for the PISR in (4.5), we achieve both the "asymptotic consistency" and "asymptotic efficiency" results by simply adjusting ρ in (4.2) and replacing it by ρ* given by (4.8).
More generally, if we conceive of a sequence {ε_ν : ν = 1,2,...} of positive numbers such that ε_ν → 0 as ν → ∞, and define

(4.9)    ρ_ν* = (6/π)^{1/2} ρ (1 + ε_ν)^2,

then for the corresponding N_d*(ρ_ν*) we have, as d ↓ 0,

(4.10)    E(N_d*(ρ_ν*))/n_d* → (1 + ε_ν)^2.
Next, we obtain

(4.11)    2G{ρ(1 + ε_ν)} − 1 = 2G(ρ) − 1 + 2ε_ν ρ g(ρ) + ε_ν^2 ρ^2 g′(ρ) + o(ε_ν^2)
              = 1 − α + ε_ν ρ g(ρ) {2 − ρ^2 ε_ν} + o(ε_ν^2)
              ≅ 1 − α + ε_ν ρ^2 {1 − G(ρ)} {2 − ρ^2 ε_ν} + o(ε_ν^2)
              = 1 − α + α ε_ν ρ^2 {1 − ½ ρ^2 ε_ν + o(ε_ν)}.

Thus, in order to be able to claim that the asymptotic coverage probability in this case is at least 1 − α, we may select ε_ν so small that ½ ρ^2 ε_ν < 1, while in (4.10), the increase 2ε_ν + ε_ν^2 is also small.
Note that 2ρ^2{1 − G(ρ)} is bounded by (2/π)^{1/2} ρ exp(−½ρ^2) for all ρ, and it converges to zero as ρ → ∞. Thus, compared to the increase in the relative ASN (i.e., ε_ν(2 + ε_ν)), the gain in the coverage probability (i.e., 2ρ^2{1 − G(ρ)} ε_ν {1 − ½ρ^2 ε_ν}) is relatively small. Thus, there may not be any practical gain in choosing ρ_ν* as in (4.9). On the other hand, if we replace (1 + ε_ν) by (1 − ε_ν) in (4.9), then instead of (4.10) we would have (1 − ε_ν)^2 < 1, and (4.11) would be changed to 1 − α − α ε_ν ρ^2 {1 + ½ρ^2 ε_ν + o(ε_ν)}. Thus, sacrificing only a small fraction of the coverage probability, we may achieve a somewhat larger fraction of the reduction of the ASN.
For example, if we let ε_ν = .05 (or .01), we have a 9.75% (or 2%) reduction in the ASN along with a reduction of .05αρ^2(1 + .025ρ^2) (or .01αρ^2(1 + .005ρ^2)) in the target coverage probability. For α = .05, we thus observe the possibility of a 5% reduction of the ASN for our PISR by lowering the coverage probability from .95 to .945.
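These figures follow from (4.10)-(4.11) with (1 − ε_ν) in place of (1 + ε_ν); a numeric sketch (our code, with the exact normal d.f. alongside the approximate drop):

```python
import math

def G(x):
    # standard normal d.f.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

alpha, rho = 0.05, 1.96
for eps in (0.05, 0.01):
    asn_reduction = 1 - (1 - eps) ** 2                       # cf. (4.10)
    exact_drop = (1 - alpha) - (2 * G(rho * (1 - eps)) - 1)  # cf. (4.11)
    approx_drop = alpha * eps * rho ** 2 * (1 + 0.5 * rho ** 2 * eps)
    print(eps, asn_reduction, exact_drop, approx_drop)
```

For ε_ν = .05 this gives the 9.75% ASN reduction quoted above; the approximate coverage drop somewhat understates the exact one at ρ = 1.96 because the Mills-ratio step g(ρ) ≅ ρ{1 − G(ρ)} used in (4.11) is crude at moderate ρ.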
In order to get some feeling regarding the comparative behaviors of N_d and N_d* for small values of n_d*, we ran small-scale simulations with 200 replications. We fixed σ = 1, μ = 20, ρ = 1.96, m = 5, 10, n_d* = 25, 50, 70, 90, and ρ*/ρ = 1.05, 1.1, 1.3, 1.37. Notice that E(N_d*) is expected to be smaller than E(N_d) when 1 < ρ*/ρ < (6/π)^{1/2} ≈ 1.382. We estimate the coverage probability (C.P.) merely by the relative frequency of the constructed intervals covering the μ-value. The following table summarizes our findings partially.

(TABLE 2)
Again, we think that Table 2 speaks for itself. It seems that by properly choosing a* and ρ*, the rule (4.5) would provide substantial saving compared to E(N_d) and, at the same time, the achieved coverage probability may not be unattractive at all in comparison with the target 1 − α. Let us point out another aspect. Suppose we implement the original stopping rule (4.2), where we plug in 1 − α = estimated C.P. from Table 2. The ASN for that adaptive rule (which is not permutationally invariant) always came out within a small fraction of E(N_d*) in our simulations, and, of course, asymptotically they are the same. The picture is definitely encouraging and worth pursuing in the future.
Acknowledgment.
Eight students enrolled in the sequential analysis
course at The University of Connecticut during the fall semester, 1986,
took part in the related simulation exercises.
We are grateful to them
for sharing with us their insights and understanding.
REFERENCES

BAHADUR, R.R. (1954). Sufficiency and statistical decision functions. Ann. Math. Statist. 25 423-462.

CHOW, Y.S. and ROBBINS, H. (1965). On the asymptotic theory of fixed-width sequential confidence intervals for the mean. Ann. Math. Statist. 36 457-462.

GHOSH, M. (1972). On the representation of linear functions of order statistics. Sankhyā Ser. A 34 349-356.

GHOSH, M. and MUKHOPADHYAY, N. (1975). Asymptotic normality of stopping times in sequential analysis. Unpublished manuscript.

GHOSH, M. and MUKHOPADHYAY, N. (1979). Sequential point estimation of the mean when the distribution is unspecified. Commun. Statist. Ser. A 8 637-652.

GHOSH, M. and MUKHOPADHYAY, N. (1981). Consistency and asymptotic efficiency of two-stage and sequential procedures. Sankhyā Ser. A 43 220-227.

KIEFER, J. (1961). On large deviations of the empiric D.F. of vector chance variables and a law of iterated logarithm. Pacific J. Math. 11 649-660.

KIEFER, J. (1967). On Bahadur's representation of sample quantiles. Ann. Math. Statist. 38 1323-1342.

RAY, W.D. (1957). Sequential confidence intervals for the mean of a normal population with unknown variance. J. Roy. Statist. Soc. Ser. B 19 133-143.

ROBBINS, H. (1959). Sequential estimation of the mean of a normal population. Probability and Statistics (H. Cramér Vol.), Uppsala: Almqvist & Wiksell, 235-245.

SEN, P.K. (1959). On the moments of the sample quantiles. Cal. Statist. Assoc. Bull. 9 1-20.

SEN, P.K. and GHOSH, M. (1981). Sequential point estimation of estimable parameters based on U-statistics. Sankhyā Ser. A 43 331-344.

STARR, N. (1966a). The performance of a sequential procedure for fixed-width interval estimation of the mean. Ann. Math. Statist. 37 36-50.

STARR, N. (1966b). On the asymptotic efficiency of a sequential procedure for estimating the mean. Ann. Math. Statist. 37 1173-1185.

VAN ZWET, W.R. (1964). Convex transformations of random variables. Math. Centre Tract 7, Amsterdam.

WIJSMAN, R. (1959). On the theory of BAN estimators. Ann. Math. Statist. 30 185-191.

WOODROOFE, M. (1982). Nonlinear Renewal Theory in Sequential Analysis. Philadelphia: SIAM.
TABLE 1

Savings and Associated Risk

  n_c^0     d      Estimated Savings (%)    Estimated ρ_c*/ρ_c
  ------------------------------------------------------------
   25     1.05            21.6                   1.051
          1.20            12.2                   1.025
          1.35             2.6                   1.018
   50     1.05            28.8                   1.087
          1.20            20.4                   1.047
          1.35            10.9                   1.025
   75     1.05            31.6                   1.096
          1.20            21.4                   1.050
          1.35            12.6                   1.027
TABLE 2

Savings and Coverage Probability: α = .05, m = 10

  n_d*     ρ*      a*    Estimated Savings (%)    Estimated C.P.
  --------------------------------------------------------------
   25    1.05ρ    .6            18.5                  .914
                  .7            25.7                  .899
   50    1.05ρ    .6            34.9                  .889
                  .7            42.0                  .867
   25    1.3ρ     .6            20.2                  .971
                  .7            28.3                  .960
   50    1.3ρ     .6            34.2                  .955
                  .7            41.3                  .941
   70    1.1ρ     .5            24.7                  .910
                  .7            41.4                  .885
   90    1.1ρ     .5            27.1                  .930
                  .7            45.0                  .840
   70    1.37ρ    .5            -7.8                  .960
                  .7            18.4                  .915
   90    1.37ρ    .5            -3.5                  .940
                  .7            22.1                  .930