Confidence Intervals for an Autoregressive Coefficient Near One
Based on Optimal Selection of Sequences of Point Optimal Tests

Muhammad Saqib†, David Harris‡

† Department of Economics, The University of Melbourne, Melbourne, Australia
‡ Department of Econometrics and Business Statistics, Monash University, Clayton, Australia

July 2, 2012
Abstract
In this paper, we reconsider Elliott and Stock's (2001) method of constructing confidence intervals for an autoregressive coefficient by inverting a sequence of invariant point optimal tests. We show that the power properties of the point optimal tests can vary greatly with the choice of the point under the alternative at which power is maximised. Some choices are shown to lead to inconsistent tests, including, in some cases, the rule of thumb proposed by Elliott, Rothenberg and Stock (1996). We propose the optimisation of a weighted power criterion as an alternative method of specifying the point optimal tests and demonstrate that this provides tests with desirable asymptotic local power properties.

Keywords: Point optimal test, GLS detrending, asymptotic power, weighted power maximization, invariant Jeffreys prior.
1 Introduction
In the last three decades an enormous amount of work has been done in the area of unit root testing and the construction of confidence intervals for the autoregressive root, including the local to unity framework proposed by Bobkovski (1983), Cavanagh (1985), Phillips (1987), and Chan and Wei (1987). As mentioned in Stock (1991), Cavanagh first constructed confidence intervals for the autoregressive root ρ in an AR(1) model without deterministics using the local to unity setting ρ = 1 + c/T and the t-test statistic. Stock (1991) extended Cavanagh's work by allowing a deterministic component in an AR(p) model and constructed confidence intervals for the largest autoregressive root by inverting the Dickey-Fuller t-test and Modified Sargan-Bhargava test statistics. Elliott and Stock (2001) (henceforth ES) proposed new asymptotic methods of constructing confidence intervals for the AR parameter using a sequence-of-tests approach, coupling Stock's method with ideas given in Elliott, Rothenberg, and Stock (1996) (henceforth ERS). ERS applied local to unity asymptotic approximations to the point optimal invariant test of King (1987): based on the Neyman-Pearson Lemma and assuming Gaussian errors, the best test of ρ = 1 against any given alternative ρ = ρ1 is the likelihood ratio test. With ρ = 1 + c/T this testing problem is the same as testing H0 : c = 0 against H1 : c = c1 < 0, where the likelihood ratio test is the point optimal test. In contrast to Stock's OLS detrending, ERS suggested a GLS transformation of the data to deal with deterministics, which utilises the value c1 under the local alternative. This constant is chosen so that the power curve of the point optimal test is tangent to the power envelope at power one-half, which yields the well-known choices c1 = −7 when xt = 1 and c1 = −13.5 when xt = (1, t)'. ERS also provide simulation evidence that the tests based on GLS demeaning/detrending of the data in the presence of deterministics are more powerful, with power curves indistinguishable from the Gaussian power envelope, whereas the same tests based on the OLS transformation have poor power properties.
By combining the ideas of Stock and ERS, ES use a sequence of tests to obtain a confidence set for the largest autoregressive root as the set of those values that are not rejected by the sequence of tests. Under this approach each test in the sequence is a point optimal test of a particular null against a particular alternative. The key idea behind their approach is that powerful tests of particular parameter values against different alternatives will produce accurate confidence intervals.
The main objective of this work is to revisit ES, generalising their testing problem to H0 : c = c0 for any c0, rather than only c0 ≤ 0, for the purpose of constructing a confidence interval for the autoregressive root in the local to unity set-up. This requires modification of the point optimal test and its asymptotic distribution. We do not ignore the possibility of a mildly explosive root in a time series, and so take into account the testing of positive values of c0. Our analysis shows that, when testing any positive c0, the point optimal test can become inconsistent against lower sided alternatives, and can also have inferior power close to the null for the upper tail test, when based on the alternative values proposed by ERS and ES; this indicates that the choice of c1 is more influential when c0 > 0 than when c0 ≤ 0. Moreover, in the two-sided testing case we reconsider ES's choice of αL = 0.03 and αU = 0.02, where they justify assigning a higher probability of type-I error to the left tail by the argument that the power curves are steeper at the right tails. This choice is pragmatic and without formal justification, and we strongly conjecture that the appropriate choice of αL and αU is sensitive to variations in c0. These observations (also pointed out by ES in their conclusion) motivate us to choose the model parameters ((c1L, c1U) for one-sided tests and (c1L, c1U, αL, αU) for two-sided tests) by a formal method based on an optimality criterion, e.g. maximization of weighted power. Simulation analysis shows the superiority of the power curves of the tests based on our proposed parameters over those based on ERS and ES, implying greater accuracy of the resulting confidence intervals.
The paper is organized as follows. The model and the method of constructing confidence intervals as discussed in ES are summarized in the next section. Section 3 presents the modified point optimal test and its limiting distribution. In Section 4 we discuss the impact of different values of c1 on one-tailed tests and the inconsistency of the test caused by some c1 values. Section 5 provides formal criteria for choosing c1 optimally and extends the analysis to two-sided tests. Results are provided in Section 6, and Section 7 concludes. All proofs are provided in Appendix A.
2 The Model and Hypothesis Testing Problem
Consider a time series yt with the data generating process given by

$$y_t = x_t'\beta + u_t, \qquad t = 1,\ldots,T, \qquad (2.1)$$
$$u_t = \rho u_{t-1} + \varepsilon_t, \qquad u_0 = 0, \qquad (2.2)$$
$$\rho = 1 + c/T, \qquad \varepsilon_t \sim i.i.d.(0, \sigma^2),$$

where xt in the DGP is a deterministic component, with the usual two cases of a constant mean (xt = 1) and a constant and linear time trend xt = (1, t)', although the analysis can be extended to higher order time polynomials. The assumption that the initial condition is u0 = 0 is important; it follows ERS and is maintained throughout this paper. Since we are interested only in the asymptotic properties of the point optimal test, our analysis is restricted to i.i.d. disturbances εt, although the approach is flexible enough to allow for other error structures in practice. We also assume that εt obeys a functional central limit theorem, so that

$$\sigma^{-1} T^{-1/2}\sum_{j=1}^{[Ts]} \varepsilon_j \xrightarrow{d} B(s), \qquad 0 \le s \le 1,$$

where [·] denotes the integer part, B(s) is a standard Brownian motion defined on C[0,1], and →d denotes weak convergence in distribution.
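The simulations in this paper are carried out in GAUSS; as a self-contained illustration of the DGP (2.1)-(2.2), the following Python sketch (not part of the original code) generates a local-to-unity sample. Function and variable names are ours, chosen for exposition only.

```python
import numpy as np

def simulate_local_to_unity(T, c, beta, x, sigma=1.0, seed=None):
    """Generate y_t = x_t' beta + u_t with u_t = (1 + c/T) u_{t-1} + eps_t and u_0 = 0."""
    rng = np.random.default_rng(seed)
    rho = 1.0 + c / T
    eps = rng.normal(0.0, sigma, size=T)
    u = np.empty(T)
    u[0] = eps[0]                      # u_0 = 0 implies u_1 = eps_1
    for t in range(1, T):
        u[t] = rho * u[t - 1] + eps[t]
    return x @ beta + u

# Example: constant-mean case x_t = 1 with c = -5 and T = 1000
T = 1000
x = np.ones((T, 1))
y = simulate_local_to_unity(T, c=-5.0, beta=np.array([0.0]), x=x, seed=0)
```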
Sequence test approach. Suppose we want to construct a 100(1−α)% confidence set CSα(y) for the autoregressive parameter ρ by inverting a sequence of tests based on a statistic T(y) of asymptotic size α that rejects for small values. For the hypothesis H0 : ρ = ρ0 against H1 : ρ ≠ ρ0, the test is performed over a range of values of ρ0, and the confidence set is the set of those values of ρ0 that are not rejected by the tests,

$$CS_\alpha(y) = \{\rho_0 : T(y) > cv_\alpha(\rho_0)\},$$

where cvα is the critical value. This set has asymptotic coverage probability of at least (1−α) for any true value of ρ, i.e. lim_{T→∞} Pr[ρ ∈ CSα(y)] ≥ 1 − α. We define the rejection probability of the test as π(ρ) = Pr_ρ(T(y) < cvα(ρ0)). If the test fails to reject the false null H0 : ρ = ρ0 when ρ is true, then the probability of including the false value ρ0 in CSα(y) is 1 − π = Pr_ρ(ρ0 ∈ CSα(y)). Accurate confidence intervals therefore require the test to have small 1 − π, that is, high power. This paper accordingly considers the construction of tests with good power properties.
Under the local to unity setting the autoregressive root is ρ = 1 + c/T, and for any fixed value ρ0 (= 1 + c0/T) we can construct a test of H0 : ρ = ρ0 against a one-sided alternative, where the side depends on whether ρ1 < ρ0 or ρ1 > ρ0, i.e. H1L : ρ = ρ1 < ρ0 or H1U : ρ = ρ1 > ρ0. This problem is analogous to testing H0 : c = c0 vs. H1 : c = c1; the lower sided test has c1 < c0 and the upper sided test requires c1 > c0. A one-sided confidence set can be constructed as the set of those values of c0 that are not rejected by the test, i.e.

$$CS_{\alpha,i}(y) = \{c_0 : T_i(y) \text{ does not reject}\}, \qquad i = L, U,$$

where for the upper sided test ĉU = sup(CS_{α,U}(y)) and the resulting confidence interval is (−∞, ĉU); similarly ĉL = inf(CS_{α,L}(y)) provides (ĉL, ∞) as the lower sided confidence interval for c. Construction of a two-sided confidence interval for H0 : c = c0 in this framework requires inverting two one-sided tests corresponding to H1L : c = c1L < c0 and H1U : c = c1U > c0. Given the constraint that the probability of rejecting the true null is some fixed level α, the two-sided confidence interval consists of those values of c0 that are rejected by neither test; we explain this in more detail shortly. Once we obtain the confidence interval for c, the confidence interval for ρ can be obtained simply by applying the transformation ρ̂ = 1 + ĉ/T.
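To make the test-inversion construction concrete, the following Python sketch builds a confidence interval for c (and hence ρ) over a grid of null values; `test_stat` and `crit_val` are hypothetical user-supplied functions (e.g. the point optimal statistic defined in the next display and its simulated critical values), not routines defined in the paper.

```python
import numpy as np

def invert_tests_for_c(y, x, test_stat, crit_val, c_grid, alpha=0.05):
    """Collect the null values c0 not rejected by a test that rejects for small values."""
    accepted = [c0 for c0 in c_grid
                if test_stat(y, x, c0) >= crit_val(c0, alpha)]
    if not accepted:
        return None                       # empty confidence set
    T = len(y)
    c_lo, c_hi = min(accepted), max(accepted)
    # Map back to the autoregressive parameter rho = 1 + c/T
    return (1.0 + c_lo / T, 1.0 + c_hi / T)
```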
For the above hypotheses with ρj = 1 + cj/T, j = 0, 1, ES propose the point optimal invariant test that accounts for the values of ρ under both the null and the alternative,

$$P_T(c_0, c_1) = \hat\sigma^{-2}\left(\sum_{t=1}^{T}\hat\varepsilon_{1,t}^2 - \frac{1 + c_1/T}{1 + c_0/T}\sum_{t=1}^{T}\hat\varepsilon_{0,t}^2\right), \qquad (2.3)$$

where

$$\hat\varepsilon_{j,t} = \hat u_{j,t} - \rho_j \hat u_{j,t-1}, \qquad \hat u_{j,t} = y_t - x_t'\hat\beta_j, \qquad j = 0, 1,$$
$$\hat\beta_j = \arg\min_{\beta}\sum_{t=1}^{T}(y_{j,t} - x_{j,t}'\beta)'(y_{j,t} - x_{j,t}'\beta),$$

and y_{j,t}, x_{j,t} are the quasi-differenced series obtained as

$$z_{j,t} = \begin{cases} z_t, & t = 1, \\ z_t - (1 + c_j/T)z_{t-1}, & t = 2,\ldots,T. \end{cases}$$

From the above equations we can also write ε̂_{j,t} as

$$\hat\varepsilon_{j,t} = y_{j,t} - x_{j,t}'\hat\beta_j, \qquad j = 0, 1.$$
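A minimal Python sketch of the statistic (2.3) follows, assuming i.i.d. errors. The particular variance estimator σ̂² used here is our own choice (the paper only requires a consistent estimator), and all names are illustrative.

```python
import numpy as np

def quasi_difference(z, c, T):
    """z_1 unchanged in the first row; z_t - (1 + c/T) z_{t-1} for t = 2,...,T."""
    zd = z.copy()
    zd[1:] = z[1:] - (1.0 + c / T) * z[:-1]
    return zd

def point_optimal_stat(y, x, c0, c1):
    """Sketch of P_T(c0, c1) in (2.3) with GLS detrending under both c0 and c1."""
    T = len(y)
    ssr = {}
    for j, cj in enumerate((c0, c1)):
        yd, xd = quasi_difference(y, cj, T), quasi_difference(x, cj, T)
        beta_j = np.linalg.lstsq(xd, yd, rcond=None)[0]   # regression on quasi-differenced data
        ssr[j] = np.sum((yd - xd @ beta_j) ** 2)
    # Simple consistent variance estimate from first differences of OLS residuals (our choice).
    u_ols = y - x @ np.linalg.lstsq(x, y, rcond=None)[0]
    sigma2 = np.mean(np.diff(u_ols) ** 2)
    return (ssr[1] - (1.0 + c1 / T) / (1.0 + c0 / T) * ssr[0]) / sigma2
```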
Since we are interested in a test with high power at a specific value of the parameter against alternatives in either direction, the asymptotic rejection probability of the point optimal test is

$$\pi(c; c_0, c_1) := \lim_{T\to\infty}\Pr\big(P_T(c_0, c_1) < k_\alpha \mid \rho = 1 + c/T\big),$$

where kα = kα(c0, c1) is the asymptotic critical value. This rejection probability π(c; c0, c1) represents the asymptotic power of the test, where PT(c0, c1) rejects the null for small values and, for the true c0, has asymptotic size equal to π(c0; c0, c1) = α. Since the critical value and the power both depend on c0 and c1 (the values under the null and the local alternative respectively), there is no single uniformly most powerful invariant (UMPI) test, but rather a family of Neyman-Pearson tests indexed by c1. Each test in the family is most powerful at the point c1 = c, and these points trace out the power envelope defined by π(c; c0, c) ≥ π(c; c0, c1).

For one-sided confidence intervals for ρ, the test of H0 : c = 0 with lower tail (stationary) alternative H1 : c = c1 < 0 is the usual unit root test, and the point optimal test in this case reduces to the one introduced and analyzed in detail by ERS. For the choice of c1, ERS propose inverting the envelope power function such that π(c1; 0, c1) = 0.50; the resulting tests have power functions not too far from the power envelope over a substantial range. This gives c1 = −7 for xt = 1 and c1 = −13.5 for xt = (1, t)'. For a two-sided confidence interval two different values of c1 in ρ1 = 1 + c1/T, namely c1L and c1U, are required to test the null H0 : c = c0 against the alternatives H1L : c1 = c1L < c0 and H1U : c1 = c1U > c0. For the purpose of confidence interval construction, ES extend the testing problem to different values of c0 (= 0, 5, 10) and recommend using alternatives that are a fixed distance from the null, H1i : c1 = c0 + c1i, i = L, U. They use (c1L, c1U) = (−7, 2) and (−13.5, 5) for xt = 1 and xt = (1, t)' respectively.
To simplify notation we define the events

$$R_{\alpha_i}(c_0, c_{1i}) = \{P_T(c_0, c_{1i}) < k_{\alpha_i}(c_0, c_{1i})\}, \qquad i = L, U,$$

where, for instance, R_{αL}(c0, c1L) denotes the event that the lower tail test rejects the null at the αL significance level. ES set the overall size of the test as α = αL + αU to obtain confidence intervals with the desired coverage probability of 100(1−α)%, where αi = Pr(R_{αi}(c0, c1i) | ρ0 = 1 + c0/T), i = L, U, is the size of each individual test. We now define the rejection probability of the two-sided test as

$$\pi(c; c_0, c_{1L}, c_{1U}, \alpha_L, \alpha_U) = \Pr\big(R_{\alpha_L}(c_0, c_{1L}) \cup R_{\alpha_U}(c_0, c_{1U}) \mid \rho = 1 + c/T\big).$$
The size of the joint test is given by

$$\pi(c_0; c_0, c_{1L}, c_{1U}, \alpha_L, \alpha_U) = \Pr\big(R_{\alpha_L}(c_0, c_{1L}) \cup R_{\alpha_U}(c_0, c_{1U}) \mid \rho_0 = 1 + c_0/T\big)$$
$$= \Pr\big(R_{\alpha_L}(c_0, c_{1L}) \mid \rho_0\big) + \Pr\big(R_{\alpha_U}(c_0, c_{1U}) \mid \rho_0\big) - \Pr\big(R_{\alpha_L}(c_0, c_{1L}) \cap R_{\alpha_U}(c_0, c_{1U}) \mid \rho_0\big)$$
$$= \alpha_L + \alpha_U - \Pr\big(R_{\alpha_L}(c_0, c_{1L}) \cap R_{\alpha_U}(c_0, c_{1U}) \mid \rho_0 = 1 + c_0/T\big).$$

This implies that the size of the joint test is bounded above by αL + αU, and if simultaneous rejections occur then choosing αL + αU = α results in a conservative test that yields unnecessarily long confidence intervals. Therefore in our simulation experiments we also keep an eye on the size of the joint tests. Using αU = α − αL, the rejection probability of the two-sided test reduces to

$$\pi(c; c_0, c_{1L}, c_{1U}, \alpha_L) = \Pr\big(R_{\alpha_L}(c_0, c_{1L}) \cup R_{\alpha - \alpha_L}(c_0, c_{1U}) \mid \rho = 1 + c/T\big).$$

The confidence set is the set of those values of ρ0 that are rejected by neither of the two tests, which requires that neither of the events R_{αL}(c0, c1L) and R_{αU}(c0, c1U) occurs. Due to the possibility of simultaneous rejection by the lower and upper tail tests, the overall size of the test satisfies the constraint that αL + αU acts as an upper bound for the size of the test. The confidence set and the resulting confidence interval, given αL and αU, are

$$CS_T = \{c_0 : R_{\alpha_L}(c_0, c_{1L})^c \cap R_{\alpha_U}(c_0, c_{1U})^c\}, \qquad CI_T = (\hat c_L, \hat c_U),$$

where ĉL = inf_{c0} CS_T and ĉU = sup_{c0} CS_T. The confidence interval for ρ is then (ρ̂L, ρ̂U), where ρ̂L = 1 + ĉL/T and ρ̂U = 1 + ĉU/T.
3 Modified Point Optimal Test

Ng and Perron (2001) consider modified point optimal tests and provide limiting distributions that coincide with those of the feasible point optimal test of ERS when testing c0 = 0. Since our objective is to extend the analysis to the general testing problem c = c0 in ρ = 1 + c/T, we modify the point optimal test of Ng and Perron (2001) and denote it by MPT.

The test statistic and its limiting distribution for testing H0 : c = c0 against H1 : c = c1 are given for the two cases as follows.

1. Constant case:

$$MP_T(c_0, c_1) = \hat\sigma^{-2}\left[(c_1^2 - c_0^2)T^{-2}\sum_{t=2}^{T}\hat u_{1,t-1}^2 - (c_1 - c_0)T^{-1}\hat u_{1,T}^2\right]$$
$$\xrightarrow{d} (c_1^2 - c_0^2)\int_0^1 B_c(s)^2\,ds - (c_1 - c_0)B_c(1)^2. \qquad (3.1)$$

2. Constant and linear trend case:

$$MP_T(c_0, c_1) = \hat\sigma^{-2}\left[c_1^2 T^{-2}\sum_{t=2}^{T}\hat u_{1,t-1}^2 + (1 - c_1)T^{-1}\hat u_{1,T}^2 - c_0^2 T^{-2}\sum_{t=2}^{T}\hat u_{0,t-1}^2 - (1 - c_0)T^{-1}\hat u_{0,T}^2\right]$$
$$\xrightarrow{d} c_1^2\int_0^1 V_{c,c_1}(s)^2\,ds + (1 - c_1)V_{c,c_1}(1)^2 - c_0^2\int_0^1 V_{c,c_0}(s)^2\,ds - (1 - c_0)V_{c,c_0}(1)^2, \qquad (3.2)$$

where

$$V_{c,c_i}(s) = B_c(s) - s\left[\lambda_i B_c(1) + 3(1 - \lambda_i)\int_0^1 s B_c(s)\,ds\right], \qquad \lambda_i = \frac{1 - c_i}{1 - c_i + c_i^2/3}, \qquad i = 0, 1,$$

and Bc(s) is the Ornstein-Uhlenbeck process

$$B_c(s) = \int_0^s \exp(c(s - r))\,dB(r),$$

with B(r) a standard Brownian motion.
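Asymptotic critical values for these tests can be obtained by simulating the limit distributions above. The following Python sketch (ours, not the GAUSS code used for the results reported later) does this for the constant case (3.1) by discretising the Ornstein-Uhlenbeck process; the grid sizes are illustrative.

```python
import numpy as np

def mpt_limit_draws_constant(c, c0, c1, n_steps=1000, n_rep=10000, seed=0):
    """Draws from the limit in (3.1): (c1^2 - c0^2)*int B_c^2 - (c1 - c0)*B_c(1)^2."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    dB = rng.normal(0.0, np.sqrt(dt), size=(n_rep, n_steps))
    B = np.zeros((n_rep, n_steps))
    B[:, 0] = dB[:, 0]
    for t in range(1, n_steps):                       # Euler step: dB_c = c*B_c ds + dB
        B[:, t] = (1.0 + c * dt) * B[:, t - 1] + dB[:, t]
    int_B2 = (B ** 2).sum(axis=1) * dt
    return (c1 ** 2 - c0 ** 2) * int_B2 - (c1 - c0) * B[:, -1] ** 2

# 5% asymptotic critical value under the null c = c0; a negative k_alpha is the
# situation behind the inconsistency discussed in Section 4.1.1.
k_alpha = np.quantile(mpt_limit_draws_constant(c=4.0, c0=4.0, c1=2.5), 0.05)
```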
4 One-Sided Testing

In this section we investigate the influence of different choices of c1 under the alternative on the asymptotic local power properties of the point optimal test for different c0. We consider c0 = −4 as a representative case for negative choices of c0, c0 = 4 for positive values, and the familiar unit root case c0 = 0. All simulations in this paper are performed in GAUSS using T = 1000 and 50,000 replications for the cases xt = 1 and xt = (1, t)'.
4.1 The effect of c1 on lower tail tests

The first two panels of Figures 4.1 and 4.2 correspond to c0 = 0 and c0 = −4 respectively, and there we observe that the ERS and ES choices of c1 result in power curves that are indistinguishable from the power envelope across all c. This property is also fairly robust to variations in c1, as is evident from the power curves for some arbitrary choices of c1 that correspond to power higher than 50% on the power envelope.

The most interesting behaviour emerges in the final case, shown in the third panel of these figures, where we have quite different findings for c0 = 4. Here the choice of c1 can matter a great deal for the power properties of the point optimal test. Based on the 50% rule of ERS, c1 = 3.75 for the demeaned case and c1 = 2.25 for the constant and linear trend case produce tests with non-monotonic power curves that are close to the envelope near the null but then decline towards zero as c moves away from the null. This behaviour is explored further below. The c1 values based on ES's fixed-distance-from-the-null rule (equal to −3 for xt = 1 and −9.5 for xt = (1, t)') behave reasonably well, but unlike the cases c0 = −4, 0 there does not exist any single choice of c1 that produces a test with power uniformly close to the power envelope. Therefore the choice of c1 when testing a positive value of c0 is important and is considered in more detail below.
Figure 4.1: Power Curves: Constant Mean
Figure 4.2: Power Curves: Constant and linear trend

4.1.1 Why is the test inconsistent for some c1 when c0 > 0?

From the above discussion we know that the point optimal test, when applied to test c0 = 4 against the lower tail local alternative using the c1 of ERS or the arbitrary value 2.5, has non-monotonic asymptotic power curves that approach zero as c diverges from the null. This observation suggests that the test may be inconsistent. A possible explanation for this inconsistency is as follows.
The point optimal test without deterministics (as shown in Appendix A) is approximately

$$P_T(c_0, c_1) = \hat\sigma^{-2}\left[(c_1^2 - c_0^2)T^{-2}\sum_{t=2}^{T}y_{t-1}^2 + (c_0 - c_1)T^{-1}y_T^2\right], \qquad (4.1)$$

and we reject the null for small values, i.e. the test rejects when PT(c0, c1) < kα. Since the asymptotic distribution of the point optimal test is independent of σ̂², we can replace it with a known σ² to keep things simple.
First consider the case of testing a unit root against stationary local alternatives, which requires c0 = 0 and c1 < 0. The test given above becomes

$$P_T(0, c_1) = \sigma^{-2}\left[c_1^2 T^{-2}\sum_{t=2}^{T}y_{t-1}^2 - c_1 T^{-1}y_T^2\right],$$

which consists entirely of positive terms and leads to kα > 0. The test is therefore consistent against fixed alternatives, and its consistency can be shown as follows. With |ρ| < 1 we have

$$T^{-1}\sum_{t=2}^{T}y_{t-1}^2 \xrightarrow{p} E(y_t^2) = \frac{\sigma^2}{1 - \rho^2} < \infty, \qquad y_T^2 \xrightarrow{d} y_\infty^2,$$

where y∞² denotes a random variable with the stationary distribution of y_t², i.e. y∞ ~ N(0, σ²/(1 − ρ²)). Thus

$$T\,P_T(0, c_1) \xrightarrow{d} \frac{c_1^2}{1 - \rho^2} - c_1\frac{y_\infty^2}{\sigma^2}.$$

This convergence in distribution shows that under fixed alternatives T·PT(0, c1) = Op(1), or PT(0, c1) = Op(T^{-1}). This implies that PT(0, c1) →p 0, and hence Pr(PT(0, c1) < kα | |ρ| < 1) → 1; that is, the test statistic will be less than any positive critical value with probability approaching one. Therefore the test is consistent.
Now consider the case in which c0 > 0 and we test c0 against the lower tail alternative c1 < c0. The signs of the terms (c1² − c0²)T^{-2}Σ_{t=2}^T y²_{t-1} and (c0 − c1)T^{-1}y_T² in the point optimal test (4.1) depend on the magnitudes of c0 and c1, in contrast to the unit root test case with c0 = 0 and c1 < 0. In this new situation with c0 > 0 and c1 < c0 we have c0 − c1 > 0, and so the second term (c0 − c1)T^{-1}y_T² > 0. There are then two cases to consider for the sign of the first term:

1. If |c1| > c0, then c1² − c0² > 0, so that (c1² − c0²)T^{-2}Σ_{t=2}^T y²_{t-1} > 0 and we are back to the situation with a consistent test.

2. For |c1| < c0 we have c1² − c0² < 0, implying (c1² − c0²)T^{-2}Σ_{t=2}^T y²_{t-1} < 0. For different values of c1 we may have (c1² − c0²)T^{-2}Σ_{t=2}^T y²_{t-1} ⋚ (c0 − c1)T^{-1}y_T², so the point optimal test PT(c0, c1) is not guaranteed to be positive or negative.

Note that in the second case the test is still the likelihood ratio test, but it may be inconsistent. Under a fixed alternative with |ρ| < 1 we have

$$T\,P_T(c_0, c_1) \xrightarrow{d} (c_1^2 - c_0^2)\frac{1}{1 - \rho^2} + (c_0 - c_1)\frac{y_\infty^2}{\sigma^2}.$$

This implies that PT(c0, c1) = Op(T^{-1}), or PT(c0, c1) →p 0, under the fixed alternative, as before. The inconsistency arises when c1² − c0² < 0 and the critical value is negative (i.e. kα < 0): if PT(c0, c1) →p 0 and kα < 0 then Pr(PT(c0, c1) < kα) converges to zero instead of one. For instance, in the demeaned case with c0 = 4, if we choose c1 such that 1.053 < c1 < 4 the resulting test is inconsistent. This is the underlying reason for the non-monotonic power curves when testing a positive value of c0. As c → −∞ the local to unity model behaves like a stationary model, and the point optimal test does not have power against stationary models. This logic does not apply to the local to unity model itself, i.e. the test with |c1| < 4 (e.g. c1 = 2.5, as evident from Figure 4.1) may have good local power properties near the null but has poor properties further from the null, which suggests that such a test should not be chosen. Thus, based on our simulation experiments, we find that a test with constant requires c1 ≤ 1.053 and a test with constant and linear trend requires c1 < 1.752 in order for the PT test to be consistent against stationary alternatives.
4.2 The effect of c1 on upper tail tests

To see whether c1 matters in the case of upper tail testing, power curves are presented in Figures 4.3 and 4.4 for the demeaned and detrended cases respectively. The first two panels in these figures correspond to c0 = 0 and c0 = −4. In both specifications of xt the power curves produced by the different values of c1 are fairly close to the power envelope, and we are indifferent between these values, with the exception of the subjective choice of c1 = 4 in the case c0 = 0. The test with c1 = 4 (a value somewhat far from the null) has low power for all c close to the null, although as c moves away from the null the power gets closer to the power bound. This power loss near the null makes c1 = 4 an inferior choice compared to ERS and ES.

As is evident from the final panels of these figures, different behaviour emerges for c0 = 4. We have c1^ERS = 4.225 and c1^ES = 6 for the constant mean case, and c1^ERS = 4.5 and c1^ES = 9 for the constant and linear trend case, together with some arbitrary choices of c1 either between c1^ERS and c1^ES or greater than c1^ES. In both cases the power curves of the test based on c1^ERS are reasonably close to the power envelope up to the point of tangency at c1 itself, but afterwards remain below it and, contrary to the other two cases c0 = −4, 0, never get quite close to the envelope. On the other hand, the power of the test based on c1^ES remains substantially low for c near the null, as manifested by a nearly horizontal segment of the power curve; as c increases further away from the null the power curve rises quickly towards the power bound in the demeaned case but behaves rather poorly in the linear trend case. In summary, the choice of c1 has a substantial effect on the power properties of the tests for c0 = 4. It is therefore necessary to consider ways of choosing c1.

Figure 4.3: Power Curves: Constant mean
Figure 4.4: Power Curves: Constant and linear trend
4.2.1 Is the test inconsistent for upper tail testing?

To answer this question for the upper tail test we again focus on the two terms in the point optimal test. Since c1 > c0, the second term (c0 − c1)T^{-1}y_T² in the point optimal test is always negative. The first term (c1² − c0²)T^{-2}Σ_{t=2}^T y²_{t-1} may be positive or negative depending on whether c1² ⋛ c0². However, for the upper tail test none of this matters in the same way that it does for the consistency of lower tail tests against stationary alternatives: in the latter case the statistic converges to zero, so the sign of the critical value determines the consistency or otherwise of the test.

In considering the consistency of a test against explosive alternatives, we have both T^{-2}Σ_{t=2}^T y²_{t-1} →p +∞ and T^{-1}y_T² →p +∞. More specifically, for the DGP yt = ρy_{t-1} + εt with ρ > 1, we find from Theorem 2 of Lai and Wei (1983) (see their equations (2.1) and (2.3)) that ρ^{-2T}y_T² = Op(1) and ρ^{-2T}Σ_{t=2}^T y²_{t-1} = Op(1), so the two terms in the test statistic are actually of the same order when ρ > 1. But notice that Σ_{t=2}^T y²_{t-1} is divided by T² while y_T² is divided only by T in the statistic, so the asymptotic behaviour of the statistic when ρ > 1 is determined entirely by (c0 − c1)T^{-1}y_T². For any upper tailed test (i.e. any c0, c1 such that c0 < c1) we have (c0 − c1)T^{-1}y_T² →p −∞, and the test rejects for small values of PT, so it rejects with probability converging to one against explosive alternatives. Thus inconsistency is not an issue here.
5 Alternative criteria to choose c1

Based on the previous discussion we find that the alternatives proposed in the literature work fairly well when testing values of c0 that are zero or negative. But for positive values of c0, the test can become inconsistent for lower sided testing, or have low power near the null for upper tail testing. This motivates us to find alternative choices of the model parameters. As discussed in Patterson (2011), Cox and Hinkley (1974) provide the following three possible approaches when no uniformly most powerful test exists.

i. Use the most powerful test for a representative ρ1 value under the alternative.

ii. Maximize power for an alternative very local to the null.

iii. Maximize the weighted power over a range of local alternatives.

To achieve our objective we will use (i) and (iii), together with some other criteria. To this end, let π(c; c0, c1i) and π(c; c0, c) respectively denote the asymptotic local power function and the asymptotic power envelope of a test of size α, with

$$\pi(c; c_0, c_{1i}) := \lim_{T\to\infty}\Pr\big(P_T(c_0, c_{1i}) < k_\alpha \mid \rho = 1 + c/T\big), \qquad \pi(c_0; c_0, c_{1i}) = \alpha.$$

Criterion 1: Power curve tangency

Based on (i) above, we can apply a power curve tangency rule to obtain c1i as the value at which the asymptotic power curve is tangent to the power envelope at some pre-specified power π̄, i.e. π(c1i; c0, c1i) = π̄, using π̄ = 50% or 85%. Note that ERS's choice of c1i is based on this criterion with π̄ = 0.50, with emphasis on the treatment of deterministics under alternatives local to the null.

Criterion 2: Minimax criterion

In addition to the criteria above, we could choose the optimal value of c1 as the value that minimizes the largest power loss relative to the power envelope, i.e.

$$c_{1i} = \arg\min_{c_{1i}}\max_{c}\big(\pi(c; c_0, c) - \pi(c; c_0, c_{1i})\big).$$

Criterion 3: Optimal weighted average power

Based on rule (iii) above, following Cox and Hinkley (1974), we define the optimal weighted average power criterion as

$$c_{1i} = \arg\max_{c_{1i}}\int_{\tilde C_j}\pi(c; c_0, c_{1i})\,w(c)\,dc, \qquad j = l, u,$$

where π(c0; c0, c1i) = α and w(c) ≥ 0 represents weights as a function of c, with C̃l = (−∞, c0] for the lower tailed test and C̃u = [c0, ∞) for the upper tailed test.
One problem with the set C̃j is that the integral ∫_{C̃j} π(c; c0, c1i) w(c) dc need not exist if C̃j is unbounded. We can either choose w(c) such that ∫_{C̃j} w(c) dc < ∞, or restrict w(c) to a truncated domain Cj ⊂ C̃j so that the integral ∫ π(c; c0, c1i) w(c) dc exists when defined on Cj, with Cl = [cl, c0] and Cu = [c0, cu]. For instance, for the lower tailed test the integral becomes ∫_b^{c0} π(c; c0, c1L) w(c) dc. The question then arises of how to choose b. As a practical matter we can choose b such that, for some small ε > 0, the power envelope is within ε of the maximum attainable power of one, i.e. 1 − π(b; c0, b) ≤ ε.
For two-sided tests we restrict attention to the criterion of weighted average power maximization. As given above, the rejection probability of the two-sided test using αU = α − αL is

$$\pi(c; c_0, c_{1L}, c_{1U}, \alpha_L) = \Pr\big(R_{\alpha_L}(c_0, c_{1L}) \cup R_{\alpha - \alpha_L}(c_0, c_{1U}) \mid \rho = 1 + c/T\big).$$

So in general we can choose c1L, c1U and αL to maximize weighted power, i.e.

$$\{c_{1L}, c_{1U}, \alpha_L\} = \arg\max_{\{c_{1L}, c_{1U}, \alpha_L\}}\int_{\tilde C}\pi(c; c_0, c_{1L}, c_{1U}, \alpha_L)\,w(c)\,dc.$$

Using the optimal value of αL we can then find the optimal value of αU from αU = α − αL. Alternatively, for fixed αL (and hence αU), as in ES, we can obtain the optimal values of c1L and c1U separately as

$$c_{1i} = \arg\max_{c_{1i}}\int_{\tilde C_i}\pi(c; c_0, c_{1L}, c_{1U})\,w(c)\,dc, \qquad i = L, U.$$

Rather than setting an arbitrary value for αL, however, we will investigate choosing it optimally along with the other parameters.
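In practice these maximizations can be carried out by grid search over candidate c1 values once the power function has been computed (for example from the simulated limit distributions above). The following Python sketch illustrates the one-sided version of Criterion 3; `power_fn` is a hypothetical user-supplied function returning π(c; c0, c1), and the default weights are uniform (the weight functions actually considered are listed next).

```python
import numpy as np

def optimal_c1_weighted_power(power_fn, c0, c1_grid, c_grid, weights=None):
    """Grid search for arg max over c1 of the integral of pi(c; c0, c1)*w(c) dc on c_grid."""
    if weights is None:
        weights = np.ones_like(c_grid, dtype=float)    # uniform weights (choice i below)
    weights = weights / np.trapz(weights, c_grid)      # normalise the weight function
    best_c1, best_val = None, -np.inf
    for c1 in c1_grid:
        power = np.array([power_fn(c, c0, c1) for c in c_grid])
        val = np.trapz(power * weights, c_grid)        # weighted average power
        if val > best_val:
            best_c1, best_val = c1, val
    return best_c1, best_val
```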
Some possible choices of the weight function w(c) are as follows.

i. Uniform weights: w(c) = 1 if c ∈ Cj, and w(c) = 0 if c ∉ Cj.

This choice of weight function results in a simple average of power, so the optimal value of c1 for the local alternatives is the value that maximizes the simple average of powers. For one-sided tests

$$c_{1i} = \arg\max_{c_1}\int_{C_j}\pi(c; c_0, c_1)\,dc, \qquad j = l, u,$$

and for two-sided tests

$$\{c_{1L}, c_{1U}, \alpha_L\} = \arg\max_{\{c_{1L}, c_{1U}, \alpha_L\}}\int_{\tilde C}\pi(c; c_0, c_{1L}, c_{1U}, \alpha_L)\,dc.$$
ii. w(c) = I(c)^{1/2} (Jeffreys prior)

ii.a. Jeffreys prior based on the full likelihood

Phillips (1991) suggests using the Jeffreys prior instead of flat priors, in response to the criticism by Sims and Uhlig (1991) of the frequentist approach to unit roots. Phillips considers fixed ρ in the autoregressive model with and without a deterministic trend, in both the conditional and unconditional cases. For our purpose we assume the conditional case (u0 = 0), known σ², and no deterministics, and translate his results using ρ = 1 + c/T to take asymptotic approximations. The corresponding Fisher information (see the appendix for the derivation) is

$$I(c) = \begin{cases} \dfrac{1}{2c}\left(\dfrac{e^{2c} - 1}{2c} - 1\right), & c \neq 0, \\[2mm] \dfrac{1}{2}, & c = 0. \end{cases}$$
The prior is plotted in Figure 5.1 below. From this diagram we observe that the prior increases slowly towards the value 1/√2 at c = 0, as the information content increases with T → ∞, but then grows exponentially for all c > 0. This higher density for c > 0 reflects the prior knowledge that when the true value of c0 is positive in ρ = 1 + c/T, the data carry more information about c0.

Figure 5.1: Jeffreys prior and invariant Jeffreys prior
ii.b. Jeffreys prior based on the invariant likelihood

Since in this analysis we apply the point optimal test of King (1987), which is invariant to transformations of the form y ↦ y + Xb, where y = (y1, ..., yT)' and X = (x1, ..., xT)', we can, following King (1980) and King and Hillier (1985), derive a Jeffreys prior based on the invariant likelihood. Although King and Hillier discuss invariance both to the regressors and to scaling, we follow ERS and use invariance only to the regressors and not to scaling (which is handled by dividing the point optimal statistic by an estimate of the variance). Assuming Gaussian errors, the information matrix based on the invariant maximum likelihood when xt = 1 is the same as the one obtained from the full maximum likelihood. However, for xt = (1, t)' the invariant-likelihood information has a lengthier closed form (for c ≠ 0), involving e^{2c} and the detrending factor 1 − c + c²/3; its key property is that I(c) → 0 as c → 0, as is evident from the plot of this prior in Figure 5.1. This observation is also pointed out by Marsh (2007).
Since the Jeffreys prior is the square root of the information matrix, we can use the weight function w(c) = I(c)^{1/2} (a computational sketch of the weight functions is given after this list), and the optimal choices are

$$c_{1i} = \arg\max_{c_{1i}}\int_{C_j}\pi(c; c_0, c_1)\,I(c)^{1/2}\,dc, \qquad j = l, u,$$
$$\{c_{1L}, c_{1U}, \alpha_L\} = \arg\max_{\{c_{1L}, c_{1U}, \alpha_L\}}\int_{\tilde C}\pi(c; c_0, c_{1L}, c_{1U}, \alpha_L)\,I(c)^{1/2}\,dc.$$
iii. w(c) based on the Symmetrized Asymptotic Reference Prior (SARP)

Berger and Yang (1994) provide the symmetrized asymptotic reference prior, which maximizes the Kullback-Leibler divergence between prior and posterior and results in the prior

$$\pi(\rho) \propto E_\rho\big(\log \pi(\rho \mid y)\big).$$

For large T this is approximately

$$\pi(\rho) \propto \exp\left(\frac{1}{2}E_\rho\left(\log\sum_{t=2}^{T}y_{t-1}^2\right)\right),$$

but it has different orders in T for different ρ. Berger and Yang (1994) suggest using the normalized π(ρ) for |ρ| < 1 and then applying the transformation ρ ↦ 1/ρ for |ρ| > 1. This leads to the following asymptotic symmetric reference prior:

$$\pi_{SR}(\rho) = \begin{cases} \dfrac{1}{2\pi\sqrt{1 - \rho^2}}, & |\rho| < 1, \\[2mm] \dfrac{1}{2\pi|\rho|\sqrt{\rho^2 - 1}}, & |\rho| > 1. \end{cases}$$

Berger and Yang discuss some important features of this prior:

1. it is a proper prior, since it integrates to one;
2. it assigns equal probability of one-half to |ρ| < 1 and |ρ| > 1;
3. in contrast to the Jeffreys prior, which as a weight function assigns finite weight to the stationary case |ρ| < 1 but unreasonable (infinite) weight to the explosive case |ρ| > 1, this prior assigns more weight to values close to c = 0, with the weight shrinking as c moves away from zero; in particular, for c > 0 the weights converge to zero;
4. improper priors usually cannot be used for testing, but the above properties make the SR prior suitable for use in testing.

This asymptotic reference prior carries over to the local to unity setting ρ = 1 + c/T with T → ∞ and becomes

$$\pi_R(c) \propto \exp\left(\frac{1}{2}E_c\log\int_0^1 B_c(s)^2\,ds\right),$$

where Bc(s) is the Ornstein-Uhlenbeck process. This prior is similar to the Jeffreys prior but is flexible enough to allow symmetry about c = 0 to be imposed. For computational purposes we approximate E_c log ∫₀¹ Bc(s)² ds by simulation as

$$E_c\log\int_0^1 B_c(s)^2\,ds \approx \frac{1}{R}\sum_{r=1}^{R}\log\left(\frac{1}{n^2}\sum_{t=1}^{n}z_{c,r,t}^2\right),$$

where z_{c,r,t} is generated from

$$z_{c,r,t} = \left(1 + \frac{c}{n}\right)z_{c,r,t-1} + \eta_{r,t},$$

with z_{c,r,0} = 0 and {η_{r,t}}_{t=1}^{n} pseudo-random draws from the i.i.d. standard normal distribution. The sample size n and the number of repeated samples R are chosen to be large in the usual way to reduce simulation error. The resulting reference prior is

$$\pi_R(c) = \exp\left(\frac{1}{2R}\sum_{r=1}^{R}\log\left(\frac{1}{n^2}\sum_{t=1}^{n}z_{c,r,t}^2\right)\right).$$

The following diagram presents the asymmetric and symmetric versions of the reference prior along with the Jeffreys prior.

Figure 5.2: Asymmetric and symmetric reference prior

We now take the weight function w(c) = π_R(c) (see the computational sketch after this list) and obtain optimal values of c1 for both one- and two-sided tests as

$$c_{1i} = \arg\max_{c_1}\int_{C_i}\pi(c; c_0, c_1)\,\pi_R(c)\,dc, \qquad i = l, u,$$
$$\{c_{1L}, c_{1U}, \alpha_L\} = \arg\max_{\{c_{1L}, c_{1U}, \alpha_L\}}\int_{\tilde C}\pi(c; c_0, c_{1L}, c_{1U}, \alpha_L)\,\pi_R(c)\,dc.$$
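As referenced in items ii and iii above, the following Python sketch evaluates two of the weight functions: the full-likelihood Jeffreys weight w(c) = I(c)^{1/2} for the constant case, and the simulated (asymmetric) reference prior π_R(c). The symmetrisation about c = 0 shown in Figure 5.2 is not reproduced here, and all grid sizes are illustrative.

```python
import numpy as np

def jeffreys_weight(c):
    """w(c) = I(c)^(1/2) with I(c) from item ii.a (full likelihood, constant case)."""
    if abs(c) < 1e-8:
        return np.sqrt(0.5)                            # I(0) = 1/2
    info = (1.0 / (2.0 * c)) * ((np.exp(2.0 * c) - 1.0) / (2.0 * c) - 1.0)
    return np.sqrt(info)

def reference_prior_weight(c, n=1000, R=2000, seed=0):
    """Simulation approximation to pi_R(c) from item iii (up to a normalising constant)."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(size=(R, n))
    z = np.zeros((R, n))
    z[:, 0] = eta[:, 0]
    for t in range(1, n):
        z[:, t] = (1.0 + c / n) * z[:, t - 1] + eta[:, t]
    log_int = np.log((z ** 2).sum(axis=1) / n ** 2)    # log( n^{-2} * sum_t z_t^2 )
    return np.exp(0.5 * np.mean(log_int))
```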
6 Results

6.1 One-sided tests

6.1.1 Constant mean case

Table 6.1 below reports the optimal values of c1L and c1U obtained using the different criteria discussed above for the individual one-sided point optimal tests when xt = 1. Power curves corresponding to these optimal values for c0 = 0 and c0 = −4 are presented in Figure 6.1. We observe that the power curves for lower tailed alternatives, given in panels 1 and 2, are indistinguishable from the power envelope. Similarly, for upper tailed alternatives we find that some of the optimal c1U values produce power curves slightly below the envelope, but only near the null. These differences are negligible, except for the invariant Jeffreys prior case when c0 = 0, and hence we can happily use any of these values for confidence interval construction. However, looking at Figure 6.2 for c0 = 4, we observe that some of the power curves given in the first panel for the lower tailed test indicate inconsistency of the test for certain c1L values. In particular, the c1L values obtained using the ERS, minimax and π(c1; c0, c1) = 0.85 rules, and the Jeffreys prior as weight function, are all greater than the threshold value of c1 = 1.053 for the constant case. As a consequence of using these values we obtain inconsistent tests, with power converging to zero rather than one as c moves away from the null. We also find that none of these criteria produces power uniformly close to the power envelope; in particular, the power curves based on ES behave poorly for alternatives not far from the null compared to the other choices.
Table 6.1: Optimal values of model parameters for one-sided tests: Constant only

                                       c0 = -4           c0 = 0            c0 = 4
Criterion                              c1L     c1U       c1L     c1U       c1L     c1U
ERS / π(c1; c0, c1) = 0.50             -13     -0.5      -7      2         3.50    4.3
π(c1; c0, c1) = 0.85                   -19      2        -12     3         1.75    4.7
Elliott and Stock                      -11     -2        -7      2         -3      6
Mini-Max                               -23      0.5      -10     1.75      1.75    4.3
Simple Average                         -24      0        -10     2         0.65    4.5
Jeffreys Prior (Invariant JP)          -24      2.5      -10     2.50      1.25    4.7
Symmetric Reference Prior              -24      0        -10     1.75      0.65    4.5
Power curves for the upper tail test of c0 = 4 using the same criteria are presented in the second panel of Figure 6.2, where inconsistency is not an issue. The best choice of c1U depends on the position of the power curve relative to the power envelope; that is, the value of c1U producing a power curve closer to the power envelope is preferred. We observe that none of the curves is uniformly close to the power envelope, so no single such value can be selected. The value of c1U proposed by ES has very low power for all alternatives close to the null (4 < c < 4.85), but as c moves away from the null there is a sharp rise in power and the curve approaches the power bound. All other criteria provide c1U values with power curves that are fairly close to the envelope, so any of these values can be used.

Figure 6.1: Lower-sided power curves with different criteria (xt = 1)
Figure 6.2: One-sided power curves with different criteria (xt = 1)
6.1.2 Constant and linear trend case

Table 6.2 reports the optimal values of the parameters c1L and c1U for the constant and linear trend case, and the corresponding power curves are presented in Figures 6.3 and 6.4 below. As in the demeaned case above, we find that when testing c0 = −4 and c0 = 0, the c1L and c1U values for the lower and upper sided tests respectively produce power curves close to the power envelope, so that any of the c1 values from these criteria can be used.
Table 6.2: Optimal values of model parameters for one-sided tests: Drift and Linear Trend

                                       c0 = -4           c0 = 0            c0 = 4
Criterion                              c1L     c1U       c1L     c1U       c1L     c1U
ERS / π(c1; c0, c1) = 0.50             -16     2         -13.5   2.50      2       4.6
π(c1; c0, c1) = 0.85                   -23.5   3         -19.5   3.50      -2.75   5.2
Elliott and Stock                      -17.5   1         -13.5   5         -9.50   9
Mini-Max                               -23.5   1.5       -18     2.75      -0.20   4.6
Simple Average                         -29.5   1.5       -18     2.75      -2      4.8
Jeffreys Prior                         -29.5   3         -13.5   3.50      -2.5    5.1
Invariant Jeffreys Prior               -29.5   3         -13.5   3.50      -3.25   5.1
Symmetric Reference Prior              -29.5   1.5       -13.5   2.75      -3.25   4.8
From Figure 6.4 for the c0 = 4 case we observe that only the ERS rule fails to provide a monotonic power curve and hence yields an inconsistent test for lower sided alternatives. Simulation evidence shows that any c1L > 1.752 produces negative critical values and hence makes the test inconsistent. We also observe that the value of c1L based on the ES rule produces a test with lower power for alternatives near the null, and that the c1L based on the minimax criterion produces a power curve that is not close to the envelope for values far from the null, making it an inferior choice compared to the other c1L values. Similarly, for upper tailed testing the c1U based on the ES rule works poorly, since the corresponding power curve has a flat portion with very low power for 4 < c < 6.5, followed by a large jump towards the power envelope near c = 6.5. The power curve based on the c1U from the reference prior has slightly low power for alternatives close to the null but is thereafter very close to the envelope, so we should not rule this value out of the list of optimal values of the model parameters.

Figure 6.3: One-sided power curves with different criteria (constant and linear trend)
Figure 6.4: One-sided power curves with different criteria (constant and linear trend)
6.2 Two-Sided Tests

Following the discussion in Section 2 on two-sided testing, and using only the criterion of optimal weighted average power, we have computed optimal values of the parameters αL, αU, c1L and c1U for both the demeaned and detrended cases for confidence interval construction. The results are reported in Table 6.4 and the corresponding power curves for different c0 in Figure 6.5 below. Table 6.3 reports the values based on Elliott and Stock's rule discussed above, with fixed αL = 0.03 and αU = 0.02.
Table 6.3: Model parameters based on the ES rule

          Demeaned case         Detrended case
  c0      c1L       c1U         c1L       c1U
  -4      -11       -2          -17.5     1
   0      -7         2          -13.5     5
   4      -3         6          -9.5      9
From Table 6.4 below we obtain optimal values of the model parameters that are quite different from those proposed by ES. We find values of αL and αU that vary across the different choices of c0, instead of the fixed values proposed by ES. We observe that for positive c0, all of the different choices of weight function assign a higher probability of type-I error to the left tail (i.e. a higher αL), one that is even higher than the ES value of αL = 0.03 for a test of overall size α = 0.05.
Table 6.4: Optimal values of model parameters for two-sided tests

                                  Demeaned case                 Detrended case
Weight function                   c1L      c1U      αL          c1L      c1U      αL
H0: c0 = -4
  Simple Average                  -13.5    0.25     0.038       -18      2        0.039
  Jeffreys Prior                  -28      3        0.034       -13      4        0.027
  Symmetric Reference Prior       -22      -0.75    0.046       -18      2        0.047
  Invariant Jeffreys Prior        same as Jeffreys prior        -14      4        0.034
H0: c0 = 0
  Simple Average                  -9.5     2        0.037       -25      1.5      0.042
  Jeffreys Prior                  -8       3.25     0.032       -23      3.75     0.029
  Symmetric Reference Prior       -13      2.5      0.047       -27      3.5      0.048
  Invariant Jeffreys Prior        same as Jeffreys prior        -23      3.75     0.036
H0: c0 = 4
  Simple Average                  0.80     4.5      0.049       -1.5     4.9      0.048
  Jeffreys Prior                  0.20     4.8      0.041       -5.5     5        0.033
  Symmetric Reference Prior       0.4      4.6      0.049       -3.5     4.6      0.049
  Invariant Jeffreys Prior        same as Jeffreys prior        -7.5     5.5      0.033
In Figure 6.5 below, the left and right panels correspond respectively to the power curves for the demeaned case and the detrended case. We find well behaved power curves using the optimal values of (c1L, c1U, αL, αU) obtained from the various choices of weights. On comparison we observe varying slopes of the curves, particularly at the right tails, for different c0. We also find that our proposed values produce power curves that outperform those based on ES in almost all cases, especially at the lower tails; the power gains over ES are significant particularly when testing a positive value of c0. These differences support our hypothesis that the slopes of the power curves vary with c0 and hence that different values of the model parameters are required.

Figure 6.5: Power curves for two-tailed tests with different criteria

On comparison we observe that the power curves corresponding to the parameter values from the weighted powers using the uniform and symmetric reference prior weights are superior to those from the Jeffreys prior weights at the left tails for each c0, while the test achieves higher power in the detrended case with invariant Jeffreys prior weights than with Jeffreys prior weights when c0 is either −4 or 0. Looking at the right tails of the curves, we find that for all alternative c values the test has almost identical power across the optimal values. The most interesting case is that of testing c0 = 4, where ES is a poor choice since it has low power close to the null. Both types of Jeffreys prior weights have identical power at the right tail but slight differences at the left tail. However, for the detrended case, comparing the powers at the left tails, we find significantly lower power from the Jeffreys prior weights than from the uniform and symmetric reference prior weights.

Given this simulation analysis, there does not exist any unique set of parameters with uniformly better power across the grid of c values. Despite this, our proposed sets of values produce power that is generally higher than that obtained with the ES proposal. In particular, the point optimal test has good power properties when computed from the parameters generated by the uniform and symmetric reference prior weight functions, for all c0 and for both the demeaned and detrended cases.
7 Conclusions

In this paper we have analyzed and extended Elliott and Stock's method of constructing confidence intervals for the autoregressive root local to one by inverting a sequence of invariant point optimal tests. Our analysis of one-sided testing problems shows that, in the presence of deterministics, the choices of model parameters based on ERS and ES work fairly well when testing zero or negative values of the parameter c0. However, for some positive c0 the test based on the ERS rule becomes inconsistent against lower sided stationary alternatives, and the test based on the ES rule has low power for alternative values close to the null. For the upper tail test this behaviour is the same whether the deterministic component involves a constant mean or a constant and linear trend. We provide the reason for the inconsistency of the test against lower sided alternatives for some choices of the parameter c1L.

Similarly, for the two-sided test, instead of using the fixed values αL = 0.03 and αU = 0.02 and values of c1L, c1U at a fixed distance from the null, we search for optimal values of these parameters that are jointly determined by the maximization of weighted power using different weight functions. Our results indicate that the optimal values of these parameters are very sensitive to changes in c0. In particular, all of the weight functions considered here, except the Jeffreys prior, provide optimal αL > 0.03 for the different c0, and weights based on the symmetric reference prior result in the largest αL, close to the nominal level of 0.05. We acknowledge that none of these weight functions produces uniformly higher power, so that we cannot prefer one weight function over the others. Nevertheless, the simulation results show that our proposed values of the parameters (particularly those from the uniform and symmetric reference prior weights) result in more powerful tests than those provided by ES. Based on these findings, our proposed values may yield more accurate confidence intervals.

Possible extensions to this work include the derivation of an invariant symmetric reference prior following the same approach we used for the invariant Jeffreys prior, allowing for serially correlated errors in the usual way, and consideration of the effect of the initial condition on these tests.
8 References

Berger, J. O., and R. Y. Yang (1994) "Noninformative Priors and Bayesian Testing for the AR(1) Model," Econometric Theory 10, 461-482.

Bobkovski, M. J. (1983) "Hypothesis Testing in Nonstationary Time Series," Unpublished Ph.D. Thesis, Department of Statistics, University of Wisconsin.

Cavanagh, C. (1985) "Roots Local to Unity," Unpublished Manuscript, Department of Economics, Harvard University.

Chan, N. H., and C. Z. Wei (1987) "Asymptotic Inference for Nearly Nonstationary AR(1) Processes," Annals of Statistics 15, 1050-1063.

Cox, D. R., and D. V. Hinkley (1974) Theoretical Statistics. Chapman & Hall, London.

Elliott, G., T. J. Rothenberg, and J. H. Stock (1996) "Efficient Tests for an Autoregressive Unit Root," Econometrica 64, 813-836.

Elliott, G., and J. H. Stock (2001) "Confidence Intervals for Autoregressive Coefficients Near One," Journal of Econometrics 103, 155-181.

King, M. L. (1980) "Robust Tests for Spherical Symmetry and their Applications to Least Squares Regression," The Annals of Statistics 8, 1265-1271.

King, M. L. (1987) "Towards a Theory of Point Optimal Testing," Econometric Reviews 6, 169-218.

King, M. L., and G. H. Hillier (1985) "Best Invariant Tests of the Error Covariance Matrix of the Linear Regression Model," Journal of the Royal Statistical Society, Series B 47, 98-102.

Lai, T. L., and C. Z. Wei (1983) "Asymptotic Properties of General Autoregressive Models and Strong Consistency of Least-Squares Estimates of their Parameters," Journal of Multivariate Analysis 13, 1-23.

Marsh, P. (2007) "The Available Information for Invariant Tests of a Unit Root," Econometric Theory 23, 686-710.

Ng, S., and P. Perron (2001) "Lag Length Selection and the Construction of Unit Root Tests with Good Size and Power," Econometrica 69, 1519-1554.

Patterson, K. (2011) Unit Root Tests in Time Series, Volume 1. Palgrave Macmillan, UK.

Phillips, P. C. B. (1987) "Towards a Unified Asymptotic Theory for Autoregression," Biometrika 74, 535-547.

Phillips, P. C. B. (1991) "To Criticize the Critics: An Objective Bayesian Analysis of Stochastic Trends," Journal of Applied Econometrics 6, 333-364.

Sims, C. A., and H. Uhlig (1991) "Understanding Unit Rooters: A Helicopter Tour," Econometrica 59, 1591-1599.

Stock, J. H. (1991) "Confidence Intervals for the Largest Autoregressive Root in U.S. Macroeconomic Time Series," Journal of Monetary Economics 28, 435-459.
Appendix A

1 Derivation of the Jeffreys Prior

We derive here the expression for the Jeffreys prior in the autoregressive model under the local to unity setting; the prior is the positive square root of the Fisher information. The log-likelihood function based on εt being i.i.d. Gaussian in equation (2.2) is

$$\log L(c \mid \sigma^2) = -\frac{T}{2}\log(2\pi) - \frac{T}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=1}^{T}\left(u_t - \Big(1 + \frac{c}{T}\Big)u_{t-1}\right)^2.$$

The score with respect to c is

$$\frac{\partial\log L(c \mid \sigma^2)}{\partial c} = \frac{1}{\sigma^2 T}\sum_{t=1}^{T}\left(u_t - \Big(1 + \frac{c}{T}\Big)u_{t-1}\right)u_{t-1},$$

and the derivative of the score is

$$\frac{\partial^2\log L(c \mid \sigma^2)}{\partial c^2} = -\frac{1}{\sigma^2 T^2}\sum_{t=1}^{T}u_{t-1}^2 = -\frac{1}{\sigma^2 T^2}\sum_{t=0}^{T-1}u_t^2.$$

By recursively iterating u_t = (1 + c/T)u_{t-1} + εt we obtain u_t = Σ_{j=0}^{t-1}(1 + c/T)^j ε_{t-j}, so that, by summing the geometric series,

$$E(u_t^2) = \sigma^2\sum_{j=0}^{t-1}\Big(1 + \frac{c}{T}\Big)^{2j} = \sigma^2\,\frac{\big(1 + \frac{c}{T}\big)^{2t} - 1}{\big(1 + \frac{c}{T}\big)^2 - 1} = \sigma^2\,\frac{\big(1 + \frac{c}{T}\big)^{2t} - 1}{\frac{1}{T}\big(2c + \frac{c^2}{T}\big)}.$$

Thus

$$E\left(\sum_{t=0}^{T-1}u_t^2\right) = \frac{\sigma^2}{\frac{1}{T}\big(2c + \frac{c^2}{T}\big)}\left[\frac{\big(1 + \frac{c}{T}\big)^{2T} - 1}{\frac{1}{T}\big(2c + \frac{c^2}{T}\big)} - T\right].$$

The information is then obtained as

$$I = E\left(-\frac{\partial^2\log L(c \mid \sigma^2)}{\partial c^2}\right) = \frac{1}{T^2}\cdot\frac{1}{\frac{1}{T}\big(2c + \frac{c^2}{T}\big)}\left[\frac{\big(1 + \frac{c}{T}\big)^{2T} - 1}{\frac{1}{T}\big(2c + \frac{c^2}{T}\big)} - T\right] = \frac{1}{2c + \frac{c^2}{T}}\left[\frac{\big(1 + \frac{c}{T}\big)^{2T} - 1}{2c + \frac{c^2}{T}} - 1\right], \qquad c \neq 0.$$

Letting T → ∞, we have (1 + c/T)^{2T} → e^{2c} and 2c + c²/T → 2c, so that I → (1/(2c))[(e^{2c} − 1)/(2c) − 1]. If c → 0, then by L'Hôpital's rule I → 1/2. Thus in the limit we have

$$I(c) = \begin{cases} \dfrac{1}{2c}\left(\dfrac{e^{2c} - 1}{2c} - 1\right), & c \neq 0, \\[2mm] \dfrac{1}{2}, & c = 0. \end{cases}$$
Point Optimal and Modi…ed Point Optimal Tests and Their Asymptotic
Distributions
Point optimal test in the absence of determinsitcs
First of all we derive the point optimal test without deterministics as this will help explaining
why the test statistic becomes inconsistent for some choices of c0 and secondly we can also use
this expression to derive modi…ed point optimal test. The point optimal test is given by
PT ( 0 ;
1)
= ^
T
X
2
^"21;t
t=1
^"j;t = u
^j;t
u
^j;t = yt
1
T
X
0 t=1
^j;t 1 ;
ju
^"20;t
!
;
x0t ^ j
Suppose we have no deterministics i.e. x0t = 0, so that the original model given in (2.1) and
(2.2) becomes
yt = ut ; and u
^j;t = yt
ut =
ut
1
u1 = "1 :
31
+ "t ;
The point optimal test of H0 :
PT ( 0 ;
=
T
X
2
1) = ^
t=1
T
X
2
= ^
against H1 :
0
T
X
1
"21;t
= 1 is then
!
"20;t
0 t=1
2
(yt
T
X
1
1 yt 1 )
(yt
0 yt 1 )
2
0 t=1
t=1
!
:
Consider the term in the parenthesis
T
X
(yt
1 yt
2
1)
1
1
1
yT2 +
0
=
1
(
(yt
1
+
2
1) =
0 yt
0
T
X
yt2
t=2
1)
yT2
+ (1
!
0
0 1)
0
2
1
T
X
0
T
X
1 0
yt2 1
t=2
This statistic is applied with
T
X
1
1
0 t=1
t=1
=
T
X
!
1
=
2
1
1 0
t=1
T
X
yt2
1
t=2
1
t=2
:
= 1 + c0 =T and
0
yt2
yt2 +
1
c0
= 1 + c1 =T , so
c1
T
and
1
0 1
=1
1+
c0
T
1+
c1
T
=
c0 + c1
T
c0 c1
;
T2
so for large T (which is what concerns us for choice of c1 )
PT (c0 ; c1 )
c0
c1
T
= (c0
yT2
T
c0 + c1 X 2
yt
T
1
t=2
1 2
yT
c1 ) T
(c0 + c1 ) T
2
!
T
X
yt2 1
t=2
PT (c0 ; c1 ) = c21
c20 T
2
T
X
yt2
1
+ (c0
c1 ) T
!
1 2
yT
(0.1)
t=2
and we reject for small values. (it’s worth not omitting the c0
c1 at the front of second last term
above so as not to mess with the sign of the statistic, and hence the rejection tail.)
2.2 Asymptotic distributions in the presence of deterministics with no autocorrelation in εt

We now incorporate the deterministic component and, assuming no serial correlation in εt, derive the asymptotic distribution of the test statistic under the local to unity framework. The model is

$$y_t = x_t'\beta + u_t, \qquad u_t = \rho u_{t-1} + \varepsilon_t, \qquad u_1 = \varepsilon_1, \qquad \varepsilon_t \sim i.i.d.(0, \sigma^2), \qquad \rho = 1 + \frac{c}{T}.$$

The quasi-differenced series are

$$y_{j,t} = \begin{cases} y_1, & t = 1, \\ y_t - \rho_j y_{t-1}, & t = 2,\ldots,T, \end{cases} \qquad x_{j,t} = \begin{cases} x_1, & t = 1, \\ x_t - \rho_j x_{t-1}, & t = 2,\ldots,T, \end{cases}$$

with GLS-detrended series û_{j,t} = y_t − x_t'β̂_j and residuals ε̂_{j,t} = û_{j,t} − ρ_j û_{j,t-1}, where

$$\hat\beta_j = \left(\sum_{t=1}^{T}x_{j,t}x_{j,t}'\right)^{-1}\sum_{t=1}^{T}x_{j,t}y_{j,t}, \qquad P_T(\rho_0, \rho_1) = \hat\sigma^{-2}\left(\sum_{t=1}^{T}\hat\varepsilon_{1,t}^2 - \frac{\rho_1}{\rho_0}\sum_{t=1}^{T}\hat\varepsilon_{0,t}^2\right).$$

Applying the Cochrane-Orcutt transformation to the first two equations gives, in terms of the quasi-differenced series,

$$y_{j,t} = x_{j,t}'\beta + \varepsilon_{j,t}, \qquad \varepsilon_{j,t} = u_t - \rho_j u_{t-1}.$$

Substituting y_{j,t} into the expression for β̂_j and rearranging yields

$$\hat\varepsilon_{j,t} = \varepsilon_{j,t} - x_{j,t}'\left(\sum_{t=1}^{T}x_{j,t}x_{j,t}'\right)^{-1}\sum_{t=1}^{T}x_{j,t}\varepsilon_{j,t}, \qquad (0.2)$$

and, using y_t = x_t'β + u_t in û_{j,t},

$$\hat u_{j,t} = u_t - x_t'\left(\sum_{t=1}^{T}x_{j,t}x_{j,t}'\right)^{-1}\sum_{t=1}^{T}x_{j,t}\varepsilon_{j,t}. \qquad (0.3)$$

The FCLT gives σ^{-1}T^{-1/2}u_{[Ts]} →d Bc(s).

Constant case. Consider xt = 1, so that the quasi-differenced regressor is x_{j,1} = 1 and x_{j,t} = −cj/T for t = 2, ..., T. Then

$$\sum_{t=1}^{T}x_{j,t}^2 = 1 + c_j^2\,\frac{T-1}{T^2}, \qquad \sum_{t=1}^{T}x_{j,t}\varepsilon_{j,t} = u_1 - c_j T^{-1}\sum_{t=2}^{T}\varepsilon_{j,t} = u_1 + r_{1,T},$$

where r_{1,T} = Op(T^{-1/2}), using ε_{j,1} = u_1 and the telescoping of Σ_{t=2}^T ε_{j,t} = Σ_{t=2}^T (u_t − ρ_j u_{t-1}). Hence

$$\left(\sum_{t=1}^{T}x_{j,t}^2\right)^{-1}\sum_{t=1}^{T}x_{j,t}\varepsilon_{j,t} = u_1 + r_{2,T}, \qquad r_{2,T} = O_p(T^{-1/2}),$$

so that ε̂_{j,1} = ε_{j,1} − u_1 − r_{2,T} = −r_{2,T} and ε̂_{j,t} = ε_{j,t} + c_j T^{-1}(u_1 + r_{2,T}) for t = 2, ..., T. It follows that

$$\sum_{t=1}^{T}\hat\varepsilon_{j,t}^2 = \sum_{t=2}^{T}\varepsilon_{j,t}^2 + O_p(T^{-1/2}).$$

Using the convention u0 = 0 and the same algebra as in Section 2.1 (with u_t in place of y_t),

$$P_T(\rho_0, \rho_1) = \hat\sigma^{-2}\left(\sum_{t=2}^{T}\varepsilon_{1,t}^2 - \frac{\rho_1}{\rho_0}\sum_{t=2}^{T}\varepsilon_{0,t}^2\right) + O_p(T^{-1/2}) = \hat\sigma^{-2}\left[(c_1^2 - c_0^2)T^{-2}\sum_{t=2}^{T}u_{t-1}^2 - (c_1 - c_0)T^{-1}u_T^2\right] + O_p(T^{-1/2}).$$

Using the functional central limit theorem, the asymptotic distribution of the point optimal test with GLS demeaning is therefore

$$P_T(c_0, c_1) \Rightarrow (c_1^2 - c_0^2)\int_0^1 B_c(s)^2\,ds - (c_1 - c_0)B_c(1)^2.$$
Linear time trend. For the linear time trend case the deterministic component is d_t = x_t'β with xt = (1, t)', and the quasi-differenced regressor x_{j,t} = xt − (1 + cj/T)x_{t-1} is

$$x_{j,t} = \begin{cases} (1, 1)', & t = 1, \\ \big(-c_j/T,\; 1 - c_j(t-1)/T\big)', & t = 2,\ldots,T. \end{cases}$$

Let D_T = diag(1, T^{-1/2}), so that after normalization (0.2) becomes

$$\hat\varepsilon_{j,t} = \varepsilon_{j,t} - x_{j,t}'D_T\left(D_T\sum_{t=1}^{T}x_{j,t}x_{j,t}'D_T\right)^{-1}D_T\sum_{t=1}^{T}x_{j,t}\varepsilon_{j,t}. \qquad (0.4)$$

Direct calculation gives

$$D_T\sum_{t=1}^{T}x_{j,t}x_{j,t}'D_T = \begin{pmatrix} 1 & 0 \\ 0 & 1 - c_j + \tfrac{1}{3}c_j^2 \end{pmatrix} + R_{1,T}, \qquad D_T\sum_{t=1}^{T}x_{j,t}\varepsilon_{j,t} = \begin{pmatrix} u_1 \\ (1 - c_j)T^{-1/2}u_T + c_j^2 T^{-5/2}\sum_{t=2}^{T-1}t u_t \end{pmatrix} + R_{2,T},$$

where R_{1,T} and R_{2,T} are asymptotically negligible remainders (the off-diagonal elements of the first matrix are O(T^{-1/2}) and its (1,1) element is 1 + c_j²(T − 1)/T²). Squaring and summing both sides of (0.4) and using the two expressions above, the cross-product and quadratic-form terms combine to give

$$\sum_{t=1}^{T}\hat\varepsilon_{j,t}^2 = \sum_{t=1}^{T}\varepsilon_{j,t}^2 - \left(1 - c_j + \tfrac{1}{3}c_j^2\right)^{-1}\left((1 - c_j)T^{-1/2}u_T + c_j^2 T^{-5/2}\sum_{t=2}^{T-1}t u_t\right)^2 + o_p(1). \qquad (0.5)$$

The point optimal test using (0.5) is

$$P_T = \hat\sigma^{-2}\left(\sum_{t=1}^{T}\hat\varepsilon_{1,t}^2 - \frac{\rho_1}{\rho_0}\sum_{t=1}^{T}\hat\varepsilon_{0,t}^2\right). \qquad (0.6)$$

Focusing on the numerator N_T of (0.6) and using the expression from equation (0.1) for the difference of the Σ ε²_{j,t} terms,

$$N_T = (c_1^2 - c_0^2)T^{-2}\sum_{t=2}^{T}u_{t-1}^2 - (c_1 - c_0)T^{-1}u_T^2 - \left(1 - c_1 + \tfrac{1}{3}c_1^2\right)^{-1}\left((1 - c_1)T^{-1/2}u_T + c_1^2 T^{-5/2}\sum_{t=2}^{T-1}t u_t\right)^2 + \left(1 - c_0 + \tfrac{1}{3}c_0^2\right)^{-1}\left((1 - c_0)T^{-1/2}u_T + c_0^2 T^{-5/2}\sum_{t=2}^{T-1}t u_t\right)^2.$$

Let

$$\lambda_j = \frac{1 - c_j}{1 - c_j + \tfrac{1}{3}c_j^2}, \qquad \text{so that} \qquad 3(1 - \lambda_j) = \frac{c_j^2}{1 - c_j + \tfrac{1}{3}c_j^2},$$

and hence

$$N_T = (c_1^2 - c_0^2)T^{-2}\sum_{t=2}^{T}u_{t-1}^2 - (c_1 - c_0)T^{-1}u_T^2 - \left(1 - c_1 + \tfrac{1}{3}c_1^2\right)\left(\lambda_1 T^{-1/2}u_T + 3(1 - \lambda_1)T^{-5/2}\sum_{t=2}^{T-1}t u_t\right)^2 + \left(1 - c_0 + \tfrac{1}{3}c_0^2\right)\left(\lambda_0 T^{-1/2}u_T + 3(1 - \lambda_0)T^{-5/2}\sum_{t=2}^{T-1}t u_t\right)^2. \qquad (0.7)$$

Following the functional central limit theorem we obtain the asymptotic distribution of this generalized point optimal test as

$$P_T(c_0, c_1) \xrightarrow{d} (c_1^2 - c_0^2)\int_0^1 B_c(s)^2\,ds - (c_1 - c_0)B_c(1)^2 - \left(1 - c_1 + \tfrac{1}{3}c_1^2\right)\bar B_{c,c_1}^2 + \left(1 - c_0 + \tfrac{1}{3}c_0^2\right)\bar B_{c,c_0}^2,$$

where

$$\bar B_{c,c_j} = \lambda_j B_c(1) + 3(1 - \lambda_j)\int_0^1 s B_c(s)\,ds, \qquad j = 0, 1.$$

This coincides with the limit in (3.2).
2.3 Modified Point Optimal Test

To obtain the modified point optimal test, consider equation (0.3) after normalizing xt with the diagonal matrix D_T = diag(1, T^{-1/2}):

$$\hat u_{j,t} = u_t - x_t'D_T\left(D_T\sum_{t=1}^{T}x_{j,t}x_{j,t}'D_T\right)^{-1}D_T\sum_{t=1}^{T}x_{j,t}\varepsilon_{j,t}.$$

Using the expressions from the previous subsection, for t = 1 we have T^{-1/2}û_{j,1} = T^{-1/2}u_1 + R_T = R_T, while for t = 2, ..., T,

$$T^{-1/2}\hat u_{j,t} = T^{-1/2}u_t - \frac{t}{T}\left(\lambda_j T^{-1/2}u_T + 3(1 - \lambda_j)T^{-5/2}\sum_{t=2}^{T-1}t u_t\right) + R_T,$$

where R_T denotes an asymptotically negligible remainder. The modified point optimal test, denoted MPT, is

$$MP_T = \hat\sigma^{-2}\left[c_1^2 T^{-2}\sum_{t=2}^{T}\hat u_{1,t-1}^2 + (1 - c_1)T^{-1}\hat u_{1,T}^2 - c_0^2 T^{-2}\sum_{t=2}^{T}\hat u_{0,t-1}^2 - (1 - c_0)T^{-1}\hat u_{0,T}^2\right]. \qquad (1.1)$$

We now show that MPT is asymptotically equivalent to PT. Write b_{j,T} = λ_j T^{-1/2}u_T + 3(1 − λ_j)T^{-5/2}Σ_{t=2}^{T-1} t u_t. Using the approximation above together with T^{-3}Σ_{t=2}^{T}(t − 1)² → 1/3,

$$c_j^2 T^{-2}\sum_{t=2}^{T}\hat u_{j,t-1}^2 + (1 - c_j)T^{-1}\hat u_{j,T}^2$$
$$= c_j^2 T^{-2}\sum_{t=2}^{T}u_{t-1}^2 + (1 - c_j)T^{-1}u_T^2 - 2 b_{j,T}\left(c_j^2 T^{-5/2}\sum_{t=2}^{T-1}t u_t + (1 - c_j)T^{-1/2}u_T\right) + \left(\tfrac{1}{3}c_j^2 + 1 - c_j\right)b_{j,T}^2 + R_T$$
$$= c_j^2 T^{-2}\sum_{t=2}^{T}u_{t-1}^2 + (1 - c_j)T^{-1}u_T^2 - \left(1 - c_j + \tfrac{1}{3}c_j^2\right)b_{j,T}^2 + R_T,$$

where the last line uses c_j²T^{-5/2}Σ_{t=2}^{T-1} t u_t + (1 − c_j)T^{-1/2}u_T = (1 − c_j + c_j²/3) b_{j,T}. Using this expression with j = 0, 1 in the numerator of (1.1), and writing N*_T for that numerator, we obtain

$$N_T^* = (c_1^2 - c_0^2)T^{-2}\sum_{t=2}^{T}u_{t-1}^2 - (c_1 - c_0)T^{-1}u_T^2 - \left(1 - c_1 + \tfrac{1}{3}c_1^2\right)b_{1,T}^2 + \left(1 - c_0 + \tfrac{1}{3}c_0^2\right)b_{0,T}^2 + R_T.$$

Comparing with (0.7), we conclude that

$$N_T^* = N_T + o_p(1),$$

so that MPT and PT have the same limiting distribution.