Confidence Intervals for an Autoregressive Coefficient Near One Based on Optimal Selection of Sequences of Point Optimal Tests

Muhammad Saqib†, David Harris‡
† Department of Economics, The University of Melbourne, Melbourne, Australia
‡ Department of Econometrics and Business Statistics, Monash University, Clayton, Australia

July 2, 2012

Abstract

In this paper we reconsider Elliott and Stock's (2001) method of constructing confidence intervals for an autoregressive coefficient by inverting a sequence of invariant point optimal tests. We show that the power properties of the point optimal tests can vary greatly with the choice of the point under the alternative at which power is maximised. Some choices are shown to lead to inconsistent tests, including, in some cases, the rule of thumb proposed by Elliott, Rothenberg and Stock (1996). We propose the optimisation of a weighted power criterion as an alternative method of specifying the point optimal tests and demonstrate that this provides tests with desirable asymptotic local power properties.

Keywords: Point optimal test, GLS detrending, asymptotic power, weighted power maximisation, invariant Jeffreys prior.

1 Introduction

In the last three decades a large body of work has addressed unit root testing and the construction of confidence intervals for the autoregressive root, including the local-to-unity framework proposed by Bobkoski (1983), Cavanagh (1985), Phillips (1987), and Chan and Wei (1987). As noted in Stock (1991), Cavanagh first constructed a confidence interval for the autoregressive root ρ in an AR(1) model without deterministics, using the local-to-unity setting ρ = 1 + c/T and a t-test statistic. Stock (1991) extended Cavanagh's work by allowing a deterministic component in an AR(p) model and constructed confidence intervals for the largest autoregressive root by inverting the Dickey-Fuller t-test and the Modified Sargan-Bhargava statistic. Elliott and Stock (2001) (henceforth ES) proposed new asymptotic methods of constructing confidence intervals for the AR parameter using a sequence-of-tests approach, coupling Stock's method with ideas from Elliott, Rothenberg and Stock (1996) (henceforth ERS). ERS applied local-to-unity asymptotic approximations to the point optimal invariant test of King (1987); based on the Neyman-Pearson Lemma, and assuming a Gaussian distribution for the errors, they showed that the best test of ρ = 1 against any given alternative ρ̄ = 1 + c̄/T is the likelihood ratio test. This testing problem is the same as testing H0: c = 0 against H1: c = c1 < 0, where the likelihood ratio test is the point optimal test. In contrast to Stock's OLS detrending, ERS proposed a GLS transformation of the data to deal with deterministics, which uses a value c1 under the local alternative. This constant is chosen so that the power curve of the point optimal test is tangent to the power envelope at power one-half, leading to the familiar choices c1 = -7 when x_t = 1 and c1 = -13.5 when x_t = (1, t)'. ERS also provide simulation evidence that, in the presence of deterministics, tests based on GLS demeaning/detrending have power curves indistinguishable from the Gaussian power envelope, whereas the same tests based on the OLS transformation have poor power properties. Combining the ideas of Stock and ERS, ES use a sequence of tests to obtain a confidence set for the largest autoregressive root as the set of those values that are not rejected by the sequence of tests.
Under this approach each test in the sequence is a point optimal test of a particular null against a particular alternative. The key insight behind the approach is that powerful tests of a particular parameter value against different alternatives will produce accurate confidence intervals.

The main objective of this paper is to revisit the ES approach, generalising their testing problem to H0: c = c0 for any c0, rather than only c0 ≤ 0, for the purpose of constructing confidence intervals for the autoregressive root under the local-to-unity setup. This requires modification of the point optimal test and of its asymptotic distribution. We do not rule out the possibility of a mildly explosive root and therefore also consider testing positive values of c0. Our analysis shows that, when testing any positive c0, the point optimal test can become inconsistent against lower-sided alternatives and has inferior power close to the null for upper-tail tests when the alternative values proposed by ERS and ES are used; the choice of c1 is therefore more influential when c0 > 0 than when c0 ≤ 0. Moreover, for the two-sided case we reconsider ES's choice of αL = 0.03 and αU = 0.02, which they justify by assigning a higher probability of type-I error to the left tail on the grounds that the power curves are steeper in the right tail. This choice is pragmatic and without formal justification, and we conjecture that the appropriate choice of αL and αU is sensitive to c0. These observations (also pointed out by ES in their conclusion) motivate us to choose the model parameters ((c1L, c1U) for one-sided tests and (c1L, c1U, αL, αU) for two-sided tests) by a formal method based on an optimality criterion, namely the maximisation of weighted power. Simulation analysis demonstrates the superiority of the power curves of the tests based on our proposed parameters over those based on ERS and ES, implying more accurate confidence intervals.

The paper is organised as follows. Section 2 summarises the model and the ES method of constructing confidence intervals. Section 3 presents the modified point optimal test. Section 4 discusses the impact of different values of c1 on one-sided tests and the inconsistency of the test caused by some c1 values. Section 5 proposes alternative criteria for choosing the model parameters optimally, for both one- and two-sided tests. Results are provided in section 6 and section 7 concludes. All proofs are given in Appendix A.

2 The Model and Hypothesis Testing Problem

Consider a time series y_t with data generating process

y_t = x_t'β + u_t,  t = 1,...,T,  (2.1)
u_t = ρ u_{t-1} + ε_t,  u_0 = 0,  ρ = 1 + c/T,  (2.2)

where ε_t ~ i.i.d.(0, σ²). We treat x_t in the DGP as a deterministic component, with the usual two cases of a constant mean (x_t = 1) and a constant and linear time trend (x_t = (1, t)'); the analysis extends to higher-order time polynomials. The assumption that the initial condition is u_0 = 0 follows ERS and is maintained throughout the paper. Since we are interested only in the asymptotic properties of the point optimal test, the analysis is restricted to i.i.d. disturbances ε_t, although the approach is flexible enough to allow other error structures in practice. We also assume that ε_t obeys a functional central limit theorem, so that T^{-1/2} Σ_{j=1}^{[Ts]} ε_j →d σ B(s), 0 ≤ s ≤ 1, where [·] denotes the integer part, B(s) is a standard Brownian motion defined on C[0,1], and →d
denotes weak convergence in distribution.

Sequence of tests approach

Suppose we wish to construct a 100(1-α)% confidence set CS_α(y) of asymptotic size α for the autoregressive parameter ρ by inverting a sequence of tests. For the hypothesis H0: ρ = ρ0 against H1: ρ ≠ ρ0, let T(y) be a test statistic that rejects for small values. The test is performed over a range of values of ρ0 and the confidence set is the set of those values of ρ0 that are not rejected:

CS_α(y) = { ρ0 : T(y) > cv_α(ρ0) },

where cv_α is the critical value. This set has asymptotic coverage probability of at least (1-α) for any true value of ρ, i.e. lim_{T→∞} Pr[ρ ∈ CS_α(y)] ≥ 1-α. Define the rejection probability of the test as π(ρ) = Pr_ρ( T(y) < cv_α(ρ0) ). If the test fails to reject a false null H0: ρ = ρ0 when ρ is true, then the false value ρ0 is included in CS_α(y); the probability of this inclusion is Pr_ρ(ρ0 ∈ CS_α(y)) = 1 - π(ρ). Accurate confidence intervals therefore require the test to have a small probability of including false values, i.e. high power, and this paper accordingly considers the construction of tests with good power properties.

Under the local-to-unity setting ρ = 1 + c/T, for any fixed value ρ0 (= 1 + c0/T) we can construct a test of H0: ρ = ρ0 against a one-sided alternative, where the side depends on whether ρ1 < ρ0 or ρ1 > ρ0, i.e. H1L: ρ = ρ1 < ρ0 or H1U: ρ = ρ1 > ρ0. This problem is equivalent to testing H0: c = c0 against H1: c = c1, where a lower-sided test has c1 < c0 and an upper-sided test requires c1 > c0. A one-sided confidence set is the set of those values of c0 that are not rejected by the test,

CS_{α,i}(y) = { c0 : T_i(y) does not reject },  i = L, U,

where for the upper-sided test ĉ_U = sup(CS_{α,U}(y)) and the resulting confidence interval is (-∞, ĉ_U); similarly ĉ_L = inf(CS_{α,L}(y)) provides (ĉ_L, ∞) as the lower-sided confidence interval for c. Construction of a two-sided confidence interval in this framework, for H0: c = c0, requires inverting two one-sided tests corresponding to H1L: c = c1L < c0 and H1U: c = c1U > c0. Given the constraint that the probability of rejecting the true null is some fixed level α, the two-sided confidence interval consists of those values of c0 that are rejected by neither test; we return to this shortly. Once a confidence interval for c is obtained, a confidence interval for ρ follows from the transformation ρ̂ = 1 + ĉ/T.

For the above hypotheses with ρ_j = 1 + c_j/T, j = 0, 1, ES proposed the point optimal invariant test, which accounts for the value of ρ under both the null and the alternative:

P_T(c0, c1) = σ̂^{-2} [ Σ_{t=1}^T ε̂²_{1,t} - ((1 + c1/T)/(1 + c0/T)) Σ_{t=1}^T ε̂²_{0,t} ],  (2.3)

where ε̂_{j,t} = û_{j,t} - ρ_j û_{j,t-1}, û_{j,t} = y_t - x_t'β̂_j, j = 0, 1,

β̂_j = arg min_β Σ_{t=1}^T (y_{j,t} - x_{j,t}'β)'(y_{j,t} - x_{j,t}'β),

and y_{j,t}, x_{j,t} are the quasi-differenced series obtained as z_{j,t} = z_t for t = 1 and z_{j,t} = z_t - (1 + c_j/T) z_{t-1} for t = 2,...,T. From the above we can also write ε̂_{j,t} = y_{j,t} - x_{j,t}'β̂_j, j = 0, 1. Since we are interested in a test with high power at a specific value of the parameter against alternatives in either direction, the asymptotic rejection probability of the point optimal test is

π(c; c0, c1) := lim_{T→∞} Pr( P_T(c0, c1) < k_α | ρ = 1 + c/T ),

where k_α = k_α(c0, c1) is the asymptotic critical value.
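The construction in (2.3) can be made concrete with a short sketch. The following Python code (the paper's own simulations are in GAUSS; the function names here are ours, and the simple null-based variance estimate is only a placeholder assumption, not the estimator used in the paper) computes the feasible statistic for a given series and deterministic specification.

```python
import numpy as np

def quasi_difference(z, c, T):
    """Quasi-difference a series (or regressor matrix) at rho = 1 + c/T:
    first observation kept as is, later ones are z_t - rho * z_{t-1}."""
    rho = 1.0 + c / T
    zd = np.copy(z).astype(float)
    zd[1:] = z[1:] - rho * z[:-1]
    return zd

def point_optimal_stat(y, X, c0, c1, sigma2_hat=None):
    """Feasible point optimal statistic P_T(c0, c1) of equation (2.3).
    y: (T,) series; X: (T, k) deterministics (ones, or ones and trend)."""
    T = len(y)
    S = {}
    for j, cj in enumerate((c0, c1)):
        yd = quasi_difference(y, cj, T)
        Xd = quasi_difference(X, cj, T)
        beta_hat, *_ = np.linalg.lstsq(Xd, yd, rcond=None)  # GLS-detrending regression
        resid = yd - Xd @ beta_hat                          # eps_hat_{j,t} = y_{j,t} - x_{j,t}'beta_hat_j
        S[j] = np.sum(resid ** 2)
    if sigma2_hat is None:
        sigma2_hat = S[0] / T   # simple variance estimate under the null (an assumption of this sketch)
    rho0, rho1 = 1.0 + c0 / T, 1.0 + c1 / T
    return (S[1] - (rho1 / rho0) * S[0]) / sigma2_hat

# Example: test H0: c0 = 0 against H1: c1 = -7 with a constant mean
T = 500
y = np.cumsum(np.random.standard_normal(T))   # a unit-root series
X = np.ones((T, 1))
stat = point_optimal_stat(y, X, c0=0.0, c1=-7.0)
```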
This rejection probability π(c; c0, c1) is the asymptotic power of the test: P_T(c0, c1) rejects the null for small values and, at the true c0, has asymptotic size π(c0; c0, c1) = α. Since both the critical value and the power depend on c0 and c1 (the values under the null and the local alternative, respectively), there is no single uniformly most powerful invariant (UMPI) test, but rather a family of Neyman-Pearson tests indexed by c1. Each test in the family is most powerful at the point c1 = c, which defines the power envelope, π(c; c0, c) ≥ π(c; c0, c1).

For one-sided confidence intervals for ρ, the test of H0: c = 0 against the lower-tail (stationary) alternative H1: c = c1 < 0 is the usual unit root test, and the point optimal test in this case reduces to the one introduced and analysed in detail by ERS. For the choice of c1, ERS proposed inverting the envelope power function so that π(c1; 0, c1) = 0.50; the resulting tests have power functions not far from the power envelope over a substantial range. This gives c1 = -7 for x_t = 1 and c1 = -13.5 for x_t = (1, t)'.

For a two-sided confidence interval, two values c1L and c1U (with ρ1 = 1 + c1/T) are required to test the null H0: c = c0 against the alternatives H1L: c = c1L < c0 and H1U: c = c1U > c0. For the purpose of confidence interval construction, ES extend the testing problem to values of c0 other than zero and recommend using alternatives at a fixed distance from the null, H1i: c1 = c0 + c̄1i, i = L, U. They use (c̄1L, c̄1U) = (-7, 2) and (-13.5, 5) for x_t = 1 and x_t = (1, t)' respectively. To simplify notation, define the events

R_{αi}(c0, c1i) = { P_T(c0, c1i) < k_{αi}(c0, c1i) },  i = L, U,

where, for instance, R_{αL}(c0, c1L) is the event that the lower-tail test rejects the null at significance level αL. ES set the overall size of the test as αL + αU = α to obtain confidence intervals with the desired coverage probability of 100(1-α)%, where αi = Pr( R_{αi}(c0, c1i) | ρ0 = 1 + c0/T ), i = L, U, is the size of the individual test. We define the rejection probability of the two-sided test as

π(c; c0, c1L, c1U, αL, αU) = Pr( R_{αL}(c0, c1L) ∪ R_{αU}(c0, c1U) | ρ = 1 + c/T ),

while the size of the joint test is

π(c0; c0, c1L, c1U, αL, αU) = Pr( R_{αL}(c0, c1L) ∪ R_{αU}(c0, c1U) | ρ0 = 1 + c0/T )
  = Pr( R_{αL}(c0, c1L) | ρ0 ) + Pr( R_{αU}(c0, c1U) | ρ0 ) - Pr( R_{αL}(c0, c1L) ∩ R_{αU}(c0, c1U) | ρ0 )
  = αL + αU - Pr( R_{αL}(c0, c1L) ∩ R_{αU}(c0, c1U) | ρ0 = 1 + c0/T ).

The size of the joint test is therefore bounded by αL + αU, and if simultaneous rejections occur then choosing αL + αU = α results in a conservative test that yields unnecessarily long confidence intervals; in our simulation experiments we therefore also monitor the size of the joint tests. Using αU = α - αL, the rejection probability of the two-sided test reduces to

π(c; c0, c1L, c1U, αL) = Pr( R_{αL}(c0, c1L) ∪ R_{α-αL}(c0, c1U) | ρ = 1 + c/T ).

The confidence set consists of those values of ρ0 that are rejected by neither of the two tests, i.e. neither R_{αL}(c0, c1L) nor R_{αU}(c0, c1U) occurs. Because of the possibility of simultaneous rejection of the lower- and upper-tail tests, the overall size of the test is no larger than αL + αU, so that αL + αU acts as an upper bound on the size of the test. The confidence set and the resulting confidence interval, given αL and αU, are

CS_T = { c0 : R_{αL}(c0, c1L)^c ∩ R_{αU}(c0, c1U)^c },  CI_T = (ĉ_L, ĉ_U),

where ĉ_L = inf_{c0} CS_T and ĉ_U = sup_{c0} CS_T. The confidence interval for ρ is then (ρ̂_L, ρ̂_U), where ρ̂_L = 1 + ĉ_L/T and ρ̂_U = 1 + ĉ_U/T.
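A minimal sketch of the test-inversion step is given below, under the same assumptions as the earlier sketch: it reuses point_optimal_stat, takes the asymptotic critical values as given through a user-supplied crit_val function (a hypothetical placeholder here; in the paper they come from simulating the limiting distributions), and scans a grid of c0 values.

```python
import numpy as np

def confidence_interval_for_c(y, X, c_grid, c1L_rule, c1U_rule, crit_val):
    """Invert the sequence of two one-sided point optimal tests over a grid of c0.
    c1L_rule / c1U_rule: functions mapping c0 to the alternatives c1L < c0 < c1U.
    crit_val: hypothetical function (c0, c1, alpha, tail) -> asymptotic critical value.
    Returns (c_hat_L, c_hat_U) as in CI_T = (c_hat_L, c_hat_U), or None if empty."""
    kept = []
    for c0 in c_grid:
        c1L, c1U = c1L_rule(c0), c1U_rule(c0)
        stat_L = point_optimal_stat(y, X, c0, c1L)
        stat_U = point_optimal_stat(y, X, c0, c1U)
        reject_L = stat_L < crit_val(c0, c1L, alpha=0.03, tail="L")
        reject_U = stat_U < crit_val(c0, c1U, alpha=0.02, tail="U")
        if not (reject_L or reject_U):     # c0 survives both one-sided tests
            kept.append(c0)
    if not kept:
        return None
    return min(kept), max(kept)

# ES-style rules: alternatives at a fixed distance from the null (constant-mean case)
c1L_rule = lambda c0: c0 - 7.0
c1U_rule = lambda c0: c0 + 2.0
```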
3 Modified Point Optimal Test

Ng and Perron (2001) consider modified point optimal tests and provide their limiting distributions, which coincide with those of the feasible point optimal test of ERS when testing c0 = 0. Since our objective is to extend the analysis to the general testing problem c = c0 in ρ = 1 + c/T, we modify the point optimal test of Ng and Perron (2001) and denote it by MP_T. The test statistic and its limiting distribution for testing H0: c = c0 against H1: c = c1 are as follows.

1. Constant case

MP_T(c0, c1) = σ̂^{-2} [ (c1² - c0²) T^{-2} Σ_{t=2}^T û²_{1,t-1} - (c1 - c0) T^{-1} û²_{1,T} ]
  →d (c1² - c0²) ∫_0^1 B_c(s)² ds - (c1 - c0) B_c(1)².  (3.1)

2. Constant and linear trend case

MP_T(c0, c1) = σ̂^{-2} [ c1² T^{-2} Σ_{t=2}^T û²_{1,t-1} + (1 - c1) T^{-1} û²_{1,T} - c0² T^{-2} Σ_{t=2}^T û²_{0,t-1} - (1 - c0) T^{-1} û²_{0,T} ]
  →d c1² ∫_0^1 V_{c,c1}(s)² ds + (1 - c1) V_{c,c1}(1)² - c0² ∫_0^1 V_{c,c0}(s)² ds - (1 - c0) V_{c,c0}(1)²,  (3.2)

where V_{c,ci}(s) = B_c(s) - s [ λ_i B_c(1) + 3(1 - λ_i) ∫_0^1 r B_c(r) dr ], λ_i = (1 - c_i)/(1 - c_i + c_i²/3), i = 0, 1, and B_c(s) is the Ornstein-Uhlenbeck process B_c(s) = ∫_0^s exp(c(s - r)) dB(r), with B(r) a standard Brownian motion.

4 One-Sided Testing

In this section we investigate the influence of different choices of c1 under the alternative on the asymptotic local power of the point optimal test for different c0. We consider c0 = -4 as a representative negative value of c0, c0 = 4 for positive values, and the familiar unit root case c0 = 0. All simulations in this paper are performed in GAUSS using T = 1000 and 50,000 replications, for the cases x_t = 1 and x_t = (1, t)'.

4.1 The effect of c1 on lower-tail tests

The first two panels of figures 4.1 and 4.2 correspond to c0 = 0 and c0 = -4 respectively. There we observe that the ERS and ES choices of c1 produce power curves essentially indistinguishable from the power envelope across all c. This property is fairly robust to variations in c1, as evident from the power curves for some arbitrary choices of c1 corresponding to power higher than 50% on the power envelope. The most interesting behaviour emerges in the final case, shown in the third panels of these figures, where the findings for c0 = 4 are quite different: the choice of c1 can matter a great deal for the power of the point optimal test. Based on the 50% rule of ERS, c1 = 3.75 in the demeaned case and c1 = 2.25 in the constant and linear trend case produce tests with non-monotonic power curves that are close to the envelope near the null but then decline towards zero as c moves away from the null; this behaviour is explored further below. In contrast, the c1 based on ES's fixed-distance-from-the-null rule (equal to -3 for x_t = 1 and -9.5 for x_t = (1, t)') behaves reasonably well, but unlike the cases c0 = -4, 0 there is no single choice of c1 that produces a test with power uniformly close to the power envelope. The choice of c1 when testing a positive value of c0 is therefore important and is considered in more detail below.

Figure 4.1: Power Curves: Constant Mean
Figure 4.2: Power Curves: Constant and linear trend
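The asymptotic critical values k_α(c0, c1) and power curves used throughout this section can be approximated by Monte Carlo simulation of the limiting functionals in (3.1). The sketch below does this for the constant case by discretising the Ornstein-Uhlenbeck process; it is illustrative only, with our own function names and deliberately small step and replication counts (larger values reduce simulation error).

```python
import numpy as np

def simulate_limit_constant(c, c0, c1, n=500, reps=5000, rng=None):
    """Draws from the limiting distribution in (3.1):
    (c1^2 - c0^2) * int_0^1 Bc(s)^2 ds - (c1 - c0) * Bc(1)^2,
    with Bc approximated by a discretised OU process of n steps."""
    rng = rng or np.random.default_rng(0)
    draws = np.empty(reps)
    for r in range(reps):
        eps = rng.standard_normal(n) / np.sqrt(n)
        Bc = np.empty(n)
        Bc[0] = eps[0]
        for t in range(1, n):
            Bc[t] = (1.0 + c / n) * Bc[t - 1] + eps[t]   # local-to-unity recursion
        integral = np.mean(Bc ** 2)                      # approximates int_0^1 Bc(s)^2 ds
        draws[r] = (c1 ** 2 - c0 ** 2) * integral - (c1 - c0) * Bc[-1] ** 2
    return draws

# Asymptotic critical value under H0 (c = c0) and power at one alternative c
c0, c1, alpha = 4.0, -3.0, 0.05
null_draws = simulate_limit_constant(c=c0, c0=c0, c1=c1)
k_alpha = np.quantile(null_draws, alpha)                 # test rejects for small values
power = np.mean(simulate_limit_constant(c=-2.0, c0=c0, c1=c1) < k_alpha)
```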
4.1.1 Why is the test inconsistent for some c1 when c0 > 0?

From the discussion above, the point optimal test of c0 = 4 against lower-tail local alternatives, based on the ERS choice of c1 or on an arbitrary value such as c1 = 2.5, has a non-monotonic asymptotic power curve that approaches zero as c diverges from the null. This suggests that the test may be inconsistent. A possible explanation is the following. The point optimal test without deterministics (as shown in Appendix A) is approximately

P_T(c0, c1) = σ̂^{-2} [ (c1² - c0²) T^{-2} Σ_{t=2}^T y²_{t-1} + (c0 - c1) T^{-1} y_T² ],  (4.1)

and we reject the null for small values, i.e. when P_T(c0, c1) < k_α. Since the asymptotic distribution of the point optimal test does not depend on σ̂², we replace it with the known σ² to keep things simple.

First consider testing a unit root against stationary local alternatives, which requires c0 = 0 and c1 < 0. The test above becomes

P_T(0, c1) = σ^{-2} [ c1² T^{-2} Σ_{t=2}^T y²_{t-1} - c1 T^{-1} y_T² ],

which consists entirely of positive terms, so that k_α > 0. The test is then consistent against fixed alternatives, as can be shown as follows. With T^{-1} Σ_{t=2}^T y²_{t-1} →p E(y²_t) = σ²/(1 - ρ²) < ∞ and y_T² →d y²_∞, where y_∞ is a random variable with the stationary distribution of y_t, having mean 0 and variance σ²/(1 - ρ²), we have

T P_T(0, c1) →d c1²/(1 - ρ²) - c1 y²_∞/σ².

This convergence in distribution shows that under fixed alternatives T P_T(0, c1) = Op(1), i.e. P_T(0, c1) = Op(T^{-1}). Hence P_T(0, c1) →p 0 and Pr( P_T(0, c1) < k_α | ρ < 1 ) → 1: the statistic falls below any positive critical value with probability approaching one, and the test is consistent.

Now consider the case c0 > 0 with a lower-tail alternative c1 < c0. The signs of the terms (c1² - c0²) T^{-2} Σ_{t=2}^T y²_{t-1} and (c0 - c1) T^{-1} y_T² in (4.1) depend on the magnitudes of c0 and c1, in contrast to the unit root case with c0 = 0 and c1 < 0. With c0 > 0 and c1 < c0 we have c0 - c1 > 0, so the second term (c0 - c1) T^{-1} y_T² is positive. Two cases arise for the sign of the first term:

1. If |c1| > c0, then c1² - c0² > 0 and (c1² - c0²) T^{-2} Σ_{t=2}^T y²_{t-1} > 0, and we are back in the situation of a consistent test.

2. If |c1| < c0, then c1² - c0² < 0 and (c1² - c0²) T^{-2} Σ_{t=2}^T y²_{t-1} < 0. For different values of c1 either term may dominate, so the point optimal statistic P_T(c0, c1) is not guaranteed to be positive or negative.

Note that in the second case the test is still the likelihood ratio test, but it may be inconsistent. Under a fixed alternative,

T P_T(c0, c1) →d (c1² - c0²)/(1 - ρ²) + (c0 - c1) y²_∞/σ²,

so P_T(c0, c1) = Op(T^{-1}) and P_T(c0, c1) →p 0 as before. The inconsistency arises because, with c1² - c0² < 0, the critical value may be negative (k_α < 0). A negative critical value causes inconsistency because, if P_T(c0, c1) →p 0 and k_α < 0, then Pr( P_T(c0, c1) < k_α ) converges to zero instead of one. For instance, in the demeaned case with c0 = 4, choosing c1 such that 1.053 < c1 < 4 yields an inconsistent test. This is the underlying reason for the non-monotonic power curves when testing a positive value of c0.
As c → -∞, the local-to-unity model behaves like a stationary model, and the point optimal test does not have power against stationary models. This logic does not apply within the local-to-unity model itself: a test with |c1| < 4 (e.g. c1 = 2.5, as evident from figure 4.1) may have good local power near the null but has poor power further from the null, which argues against choosing such a test. Based on our simulation experiments we find that, for a test with a constant we need c1 ≤ 1.053, and with a constant and linear trend we need c1 < 1.752, in order for the P_T test to be consistent against stationary alternatives.

4.2 The effect of c1 on upper-tail tests

To see whether c1 matters for upper-tail testing, power curves are presented in figures 4.3 and 4.4 for the demeaned and detrended cases respectively. The first two panels of these figures correspond to c0 = 0 and c0 = -4. In both specifications of x_t the power curves produced by different values of c1 are fairly close to the power envelope, and we are indifferent between these values, with the exception of the subjective choice c1 = 4 when c0 = 0. The test with c1 = 4 (a value rather far from the null) has low power for all c close to the null, although as c moves away from the null its power approaches the power bound. This power loss near the null makes c1 = 4 an inferior choice compared with ERS and ES.

As evident from the final panels of these figures, different behaviour emerges for c0 = 4. We have c1^ERS = 4.225 and c1^ES = 6 for the constant mean case, and c1^ERS = 4.5 and c1^ES = 9 for the constant and linear trend case, together with some arbitrary choices of c1 lying between c1^ERS and c1^ES or exceeding c1^ES. In both cases the power curves of the test based on c1^ERS are reasonably close to the power envelope up to the point of tangency at c1 itself, but thereafter remain below the envelope and, in contrast to the cases c0 = -4, 0, never get particularly close to it. The power of the test based on c1^ES, on the other hand, remains substantially low for c near the null, as manifested by a nearly horizontal segment of the power curve; as c increases further from the null the power curve rises quickly towards the power bound in the demeaned case but behaves rather poorly in the linear trend case. In summary, the choice of c1 has a substantial effect on the power properties of the tests for c0 = 4, and it is therefore necessary to consider ways of choosing c1.

Figure 4.3: Power Curves: Constant mean
Figure 4.4: Power Curves: Constant and linear trend

4.2.1 Is the test inconsistent for upper-tail testing?

To answer this question we again focus on the two terms of the point optimal statistic. Since c1 > c0, the second term (c0 - c1) T^{-1} y_T² is always negative. The first term (c1² - c0²) T^{-2} Σ_{t=2}^T y²_{t-1} may be positive or negative depending on whether c1² ≷ c0². For the upper-tail test, however, none of this matters in the way it does for the consistency of lower-tail tests against stationary alternatives. In the latter case the statistic converges to zero, so the sign of the critical value determines the consistency or otherwise of the test. In considering consistency against explosive alternatives, both T^{-2} Σ_{t=2}^T y²_{t-1} and T^{-1} y_T² diverge to +∞ in probability.
More specifically, for the DGP y_t = ρ y_{t-1} + ε_t with ρ > 1, Theorem 2 of Lai and Wei (1983) (see their equations (2.1) and (2.3)) implies that ρ^{-2T} y_T² = Op(1) and ρ^{-2T} Σ_{t=2}^T y²_{t-1} = Op(1), so the two terms in the test statistic are of the same order when ρ > 1. But note that Σ_{t=2}^T y²_{t-1} is divided by T² while y_T² is divided only by T, so the asymptotic behaviour of the statistic when ρ > 1 is determined entirely by (c0 - c1) T^{-1} y_T². For any upper-tail test (i.e. any c0, c1 with c0 < c1) we have (c0 - c1) T^{-1} y_T² →p -∞, and since the test rejects for small values of P_T it rejects with probability converging to one against explosive alternatives. Inconsistency is therefore not an issue here.

5 Alternative criteria to choose c1

The preceding discussion shows that the alternatives proposed in the literature work fairly well when testing zero or negative values of c0. When testing positive values of c0, however, the test can become inconsistent for lower-sided testing or have low power near the null for upper-tail testing. This motivates the search for alternative choices of the model parameters. As discussed in Patterson (2011), Cox and Hinkley (1974) give three possible approaches when no uniformly most powerful test exists:

i. use the most powerful test at a representative value ρ1 under the alternative;
ii. maximise power for an alternative very local to the null;
iii. maximise a weighted power over a range of local alternatives.

We use (i) and (iii), together with some other criteria. To this end, let π(c; c0, c1i) and π(c; c0, c) denote, respectively, the asymptotic local power function and the asymptotic power envelope of a test of size α, with

π(c; c0, c1i) := lim_{T→∞} Pr( P_T(c0, c1i) < k_α | ρ = 1 + c/T )  and  π(c0; c0, c1i) = α.

Criterion 1: Power curve tangency. Following (i) above, choose c1i so that the asymptotic power curve is tangent to the power envelope at some pre-specified power π̄, i.e. π(c1i; c0, c1i) = π̄, using π̄ = 50% or 80%. ERS's choice of c1i is based on this criterion with π̄ = 0.50, with emphasis on the treatment of deterministics under alternatives local to the null.

Criterion 2: Minimax. Alternatively, choose c1 as the value that minimises the largest power loss relative to the power envelope, i.e.

c1i = arg min_{c1i} max_c ( π(c; c0, c) - π(c; c0, c1i) ).

Criterion 3: Optimal weighted average power. Following (iii) above (Cox and Hinkley, 1974), define the optimal weighted average power choice as

c1i = arg max_{c1i} ∫_{C̃_j} π(c; c0, c1i) w(c) dc,  j = l, u,

subject to π(c0; c0, c1i) = α, where w(c) ≥ 0 is a weight function of c, C̃_l = (-∞, c0] for the lower-tail test and C̃_u = [c0, ∞) for the upper-tail test. One problem with the set C̃_j is that the integral ∫_{C̃_j} π(c; c0, c1i) w(c) dc need not exist, since C̃_j is unbounded. We can either choose w(c) such that ∫_{C̃_j} w(c) dc < ∞, or restrict w(c) to a truncated interval C_j ⊂ C̃_j so that the integral exists, with C_l = [c_l, c0] and C_u = [c0, c_u]. For instance, for the lower-tail test the integral can become ∫_b^{c0} π(c; c0, c1L) w(c) dc. How should b be chosen? As a practical matter we choose b such that, for a small ε > 0, the power envelope is within ε of the maximum attainable power of one, i.e. 1 - π(b; c0, b) ≤ ε.
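Criterion 3 can be implemented by a direct grid search. The sketch below reuses simulate_limit_constant from section 4 to evaluate the asymptotic power and then maximises the weighted average over a truncated grid C_l = [b, c0]; grid sizes and replication counts are purely illustrative, and the uniform-weight example simply anticipates weight choice (i) below.

```python
import numpy as np

def asymptotic_power(c, c0, c1, alpha=0.05, reps=2000):
    """pi(c; c0, c1): asymptotic power of the lower-tail point optimal test,
    from draws of the limiting distribution (constant-mean case)."""
    null_draws = simulate_limit_constant(c=c0, c0=c0, c1=c1, reps=reps)
    k_alpha = np.quantile(null_draws, alpha)
    alt_draws = simulate_limit_constant(c=c, c0=c0, c1=c1, reps=reps)
    return np.mean(alt_draws < k_alpha)

def weighted_power_choice(c0, c1_grid, c_grid, weights):
    """Criterion 3: pick c1 maximising sum_c pi(c; c0, c1) w(c) on the grid."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                  # normalise the weights
    best_c1, best_value = None, -np.inf
    for c1 in c1_grid:
        value = sum(w * asymptotic_power(c, c0, c1) for c, w in zip(c_grid, weights))
        if value > best_value:
            best_c1, best_value = c1, value
    return best_c1

# Example: lower-tail test of c0 = 4 with uniform weights on C_l = [-25, 4]
c_grid = np.linspace(-25.0, 4.0, 15)
c1_grid = np.linspace(-15.0, 1.0, 9)
c1_star = weighted_power_choice(4.0, c1_grid, c_grid, weights=np.ones_like(c_grid))
```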
For two-sided tests we confine attention to weighted average power maximisation. As given above, the rejection probability of the two-sided test using αU = α - αL is

π(c; c0, c1L, c1U, αL) = Pr( R_{αL}(c0, c1L) ∪ R_{α-αL}(c0, c1U) | ρ = 1 + c/T ),

so in general we can choose c1L, c1U and αL jointly to maximise weighted power:

{ c1L, c1U, αL } = arg max_{c1L, c1U, αL} ∫_{C̃} π(c; c0, c1L, c1U, αL) w(c) dc.

Given the optimal αL (and hence αU = α - αL), or for a fixed αL as in ES, the optimal values of c1L and c1U can instead be obtained separately from

c1i = arg max_{c1i} ∫_{C̃_i} π(c; c0, c1L, c1U) w(c) dc,  i = L, U.

Rather than setting an arbitrary value for αL, however, we investigate choosing it optimally together with the other parameters. Some possible choices of the weight function w(c) are as follows.

i. Uniform weights: w(c) = 1 if c ∈ C_j and w(c) = 0 otherwise. This choice results in a simple average of power, so the optimal value of c1 for the local alternatives is the value that maximises the simple average of powers. For one-sided tests

c1i = arg max_{c1} ∫_{C_j} π(c; c0, c1) dc,  j = l, u,

and for two-sided tests

{ c1L, c1U, αL } = arg max_{c1L, c1U, αL} ∫_{C̃} π(c; c0, c1L, c1U, αL) dc.

ii. w(c) = I(c)^{1/2} (Jeffreys prior)

ii.a. Jeffreys prior based on the full likelihood. Phillips (1991) suggests using the Jeffreys prior instead of flat priors, in response to the criticism of the frequentist approach to unit roots by Sims and Uhlig (1991). Phillips considers fixed ρ in the autoregressive model with and without deterministic trend, in both conditional and unconditional cases. For our purposes we assume the conditional case (u_0 = 0), known σ² and no deterministics, and translate his results using ρ = 1 + c/T to take asymptotic approximations. The corresponding Fisher information (see the appendix for the derivation) is

I(c) = (1/(2c)) [ (e^{2c} - 1)/(2c) - 1 ]  if c ≠ 0,  and  I(c) = 1/2  if c = 0.

The prior is plotted in figure 5.1. The prior increases slowly to the value 1/√2 at c = 0 as the information content increases with T → ∞, and then grows exponentially for c > 0. The higher density for c > 0 reflects the fact that when the true c0 > 0 in ρ = 1 + c/T, the data carry more information about c0.

Figure 5.1: Jeffreys prior and invariant Jeffreys prior

ii.b. Jeffreys prior based on the invariant likelihood. Since we apply the point optimal test of King (1987), which is invariant to transformations of the form y ↦ y + Xb with y = (y_1,...,y_T)' and X = (x_1,...,x_T)', we can, following King (1980) and King and Hillier (1985), derive a Jeffreys prior based on the invariant likelihood. Although King and Hillier discuss invariance both to the regressors and to scaling, we follow ERS and use invariance only to the regressors and not to scaling (which is handled by dividing the point optimal statistic by an estimate of the variance). Assuming Gaussian errors, the information matrix from the invariant maximum likelihood when x_t = 1 is the same as the one obtained from the full maximum likelihood. For x_t = (1, t)', however, the invariant information is a considerably more complicated expression involving e^{2c}, low-order polynomials in c and the GLS-detrending factor 1 - c + c²/3; its key property is that I(c) → 0 as c → 0, as is evident from the plot of this prior in figure 5.1, an observation also pointed out by Marsh (2007).
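A small sketch of the full-likelihood information I(c) above and of the corresponding weight w(c) = I(c)^{1/2}, including the c = 0 limit of 1/2, is given below; the function names are ours.

```python
import numpy as np

def fisher_information(c):
    """Full-likelihood Fisher information I(c) for the conditional AR(1) model
    under local-to-unity asymptotics: (1/(2c)) * ((exp(2c) - 1)/(2c) - 1),
    with the limit I(0) = 1/2 obtained by L'Hopital's rule."""
    c = np.atleast_1d(np.asarray(c, dtype=float))
    out = np.full_like(c, 0.5)                 # limiting value at c = 0
    nz = np.abs(c) > 1e-8
    cn = c[nz]
    out[nz] = (1.0 / (2.0 * cn)) * (np.expm1(2.0 * cn) / (2.0 * cn) - 1.0)
    return out

def jeffreys_weight(c):
    """Weight function w(c) = I(c)^(1/2) used in the weighted power criterion."""
    return np.sqrt(fisher_information(c))

# w(0) = 1/sqrt(2); the weight grows rapidly for c > 0, as in figure 5.1
c_grid = np.linspace(-4.0, 4.0, 9)
weights = jeffreys_weight(c_grid)
```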
Since the Jeffreys prior is the square root of the information matrix, we can use the weight function w(c) = I(c)^{1/2}, and the optimal choices are then

c1i = arg max_{c1i} ∫_{C_j} π(c; c0, c1) I(c)^{1/2} dc,  j = l, u,
{ c1L, c1U, αL } = arg max_{c1L, c1U, αL} ∫_{C̃} π(c; c0, c1L, c1U, αL) I(c)^{1/2} dc.

iii. w(c) based on the Symmetrized Asymptotic Reference Prior (SARP)

Berger and Yang (1994) provide the symmetrized asymptotic reference prior, which maximises the Kullback-Leibler divergence between prior and posterior and results in the prior

π(ρ) ∝ exp( E_ρ [ log π(ρ | y) ] ).

For large T this is approximately

π(ρ) ∝ exp( (1/2) E_ρ [ log Σ_{t=2}^T y²_{t-1} ] ),

but it has different orders in T for different ρ. Berger and Yang (1994) suggest using the normalised π(ρ) for |ρ| < 1 and then the transformation ρ ↦ 1/ρ for |ρ| > 1. This leads to the following symmetric asymptotic reference prior:

π_SR(ρ) = 1/(2π √(1 - ρ²)) for |ρ| < 1,  and  π_SR(ρ) = 1/(2π |ρ| √(ρ² - 1)) for |ρ| > 1.

Berger and Yang discuss some important features of this prior:

1. it is a proper prior, since it integrates to one;
2. it assigns equal probability of one-half to |ρ| < 1 and to |ρ| > 1;
3. in contrast to the Jeffreys prior, which as a weight function assigns finite weight to the stationary region |ρ| < 1 but unreasonable (infinite) weight to the explosive region |ρ| > 1, this prior assigns more weight to values close to c = 0, with the weights shrinking as c moves away from zero; in particular, for c > 0 the weights converge to zero;
4. improper priors usually cannot be used for testing, but the above properties make the SR prior suitable for use in testing.

Under the local-to-unity setting ρ = 1 + c/T with T → ∞, the asymptotic reference prior becomes

π_R(c) ∝ exp( (1/2) E_c [ log ∫_0^1 B_c(s)² ds ] ),

where B_c(s) is the Ornstein-Uhlenbeck process. This prior is similar to the Jeffreys prior but is flexible enough to allow symmetry about c = 0 to be imposed. For computational purposes we approximate E_c[ log ∫_0^1 B_c(s)² ds ] by simulation as

E_c [ log ∫_0^1 B_c(s)² ds ] ≈ (1/R) Σ_{r=1}^R log( n^{-2} Σ_{t=1}^n z²_{c,r,t} ),

where z_{c,r,t} is generated from z_{c,r,t} = (1 + c/n) z_{c,r,t-1} + ν_{r,t}, with z_{c,r,0} = 0 and ν_{r,t} pseudo-random draws from the i.i.d. standard normal distribution. The sample size n and the number of repeated samples R are chosen large in the usual way to reduce simulation error. The resulting reference prior is

π_R(c) = exp( (1/(2R)) Σ_{r=1}^R log( n^{-2} Σ_{t=1}^n z²_{c,r,t} ) ).

The following figure presents the asymmetric and symmetric versions of the reference prior along with the Jeffreys prior.

Figure 5.2: Asymmetric and Symmetric Reference Prior

Taking the weight function w(c) = π_R(c), the optimal values of c1 for one- and two-sided tests are

c1i = arg max_{c1} ∫_{C_i} π(c; c0, c1) π_R(c) dc,  i = l, u,
{ c1L, c1U, αL } = arg max_{c1L, c1U, αL} ∫_{C̃} π(c; c0, c1L, c1U, αL) π_R(c) dc.
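The simulation approximation of π_R(c) described above translates directly into code. In the sketch below n and R are kept small for illustration, and the final symmetrisation step (reflecting the prior about c = 0) is our own reading of one way symmetry might be imposed, not a construction given in the paper.

```python
import numpy as np

def reference_prior(c_grid, n=500, R=200, rng=None):
    """Approximate pi_R(c) = exp( (1/(2R)) * sum_r log( n^-2 * sum_t z_{c,r,t}^2 ) ),
    where z_{c,r,t} = (1 + c/n) z_{c,r,t-1} + nu_{r,t}, z_{c,r,0} = 0."""
    rng = rng or np.random.default_rng(0)
    values = np.empty(len(c_grid))
    for i, c in enumerate(c_grid):
        log_terms = np.empty(R)
        for r in range(R):
            nu = rng.standard_normal(n)
            z = np.zeros(n + 1)
            for t in range(1, n + 1):
                z[t] = (1.0 + c / n) * z[t - 1] + nu[t - 1]   # local-to-unity recursion
            log_terms[r] = np.log(np.sum(z[1:] ** 2) / n ** 2)
        values[i] = np.exp(0.5 * np.mean(log_terms))
    return values

# Symmetrised version on a grid that is symmetric about c = 0 (an assumption of this sketch)
c_grid = np.linspace(-4.0, 4.0, 41)
prior = reference_prior(c_grid)
sym_prior = 0.5 * (prior + prior[::-1])
```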
6 Results

6.1 One-sided tests

6.1.1 Constant mean case

Table 6.1 reports the optimal values of c1L and c1U obtained from the different criteria discussed above for the individual one-sided point optimal tests when x_t = 1. Power curves corresponding to these optimal values for c0 = 0, -4 are presented in figure 6.1. The power curves for lower-tail alternatives, given in the first two panels, are indistinguishable from the power envelope. Similarly, for upper-tail alternatives some of the optimal c1U values produce power curves slightly below the envelope, but only near the null. These differences are negligible, except for the invariant Jeffreys prior case when c0 = 0, and hence any of these values can happily be used for confidence interval construction. However, figure 6.2 for c0 = 4 shows that some of the power curves in the first panel, for the lower-tail test, indicate inconsistency of the test for some c1L values. In particular, the c1L values obtained from the ERS rule, the minimax rule, the π(c1; c0, c1) = 0.85 rule and the Jeffreys prior weight function are all greater than the threshold value of c1 = 1.053 for the constant case. Using these values produces inconsistent tests whose power converges to zero rather than one as c moves away from the null. Moreover, none of these criteria produces power uniformly close to the power envelope; in particular, the power curves based on ES behave poorly for alternatives not far from the null compared with the other three choices.

Table 6.1: Optimal values of the model parameters for one-sided tests: constant only

                                        c0 = -4           c0 = 0            c0 = 4
Criterion                              c1L     c1U      c1L     c1U       c1L     c1U
ERS / pi(c1;c0,c1) = 0.50              -13     -0.5      -7      2        3.50    4.3
pi(c1;c0,c1) = 0.85                    -19      2       -12      3        1.75    4.7
Elliott and Stock                      -11     -2        -7      2        -3      6
Minimax                                -23      0.5     -10      1.75     1.75    4.3
Simple average                         -24      0       -10      2        0.65    4.5
Jeffreys prior (invariant JP)          -24      2.5     -10      2.50     1.25    4.7
Symmetric reference prior              -24      0       -10      1.75     0.65    4.5

Power curves for the upper-tail test of c0 = 4 using the same criteria are presented in the second panel of figure 6.2, where inconsistency is not an issue. The best choice of c1U depends on the position of the power curve relative to the power envelope: the value of c1U whose power curve is closer to the envelope is preferred. None of the curves is uniformly close to the power envelope, however, so no single value can be selected on that basis. The value of c1U proposed by ES has very low power for all alternatives close to the null (4 < c < 4.85), but as c moves away from the null the power rises sharply and the curve approaches the power bound. All other criteria provide c1U values with power curves fairly close to the envelope, so any of these values can be used.

Figure 6.1: Lower-sided power curves with different criteria (x_t = 1)
Figure 6.2: Lower-sided power curves with different criteria (x_t = 1)

6.1.2 Constant and linear trend case

Table 6.2 reports the optimal values of the parameters c1L and c1U for the constant and linear trend case, and the corresponding power curves are presented in figures 6.3 and 6.4. As in the demeaned case, when testing c0 = -4 and c0 = 0 the values of c1L and c1U for the lower- and upper-sided tests produce power curves close to the power envelope, so the c1 values from any of these criteria are feasible.

Table 6.2: Optimal values of the model parameters for one-sided tests: constant and linear trend

                                        c0 = -4           c0 = 0            c0 = 4
Criterion                              c1L     c1U      c1L     c1U       c1L     c1U
ERS / pi(c1;c0,c1) = 0.50              -16      2       -13.5    2.50      2      4.6
pi(c1;c0,c1) = 0.85                    -23.5    3       -19.5    3.50     -2.75   5.2
Elliott and Stock                      -17.5    1       -13.5    5        -9.50   9
Minimax                                -23.5    1.5     -18      2.75     -0.20   4.6
Simple average                         -29.5    1.5     -18      2.75     -2      4.8
Jeffreys prior                         -29.5    3       -13.5    3.50     -2.5    5.1
Invariant Jeffreys prior               -29.5    3       -13.5    3.50     -3.25   5.1
Symmetric reference prior              -29.5    1.5     -13.5    2.75     -3.25   4.8

From figure 6.4 for the c0 = 4 case, only the ERS rule fails to deliver a monotonic power curve and hence yields an inconsistent test against lower-sided alternatives.
Simulation evidence shows that any c1L > 1.752 produces negative critical values and hence an inconsistent test. We also find that the value of c1L based on the ES rule produces a test with low power for alternatives near the null, while the c1L based on the minimax criterion produces a power curve that is not close to the envelope for values far from the null, making it an inferior choice compared with the other c1L values. Similarly, for upper-tail testing the value of c1U based on the ES rule works poorly: the corresponding power curve has a flat portion with very low power for 4 < c < 6.5, followed by a large jump towards the power envelope near c = 6.5. The power curve based on the c1U from the reference prior has slightly low power for alternatives close to the null but is very close to the envelope thereafter, so this value should not be ruled out of the list of optimal parameter values.

Figure 6.3: One-sided power curves with different criteria (constant and linear trend)
Figure 6.4: One-sided power curves with different criteria (constant and linear trend)

6.2 Two-sided tests

Following the earlier discussion of two-sided testing, and using only the criterion of optimal weighted average power, we have computed the optimal values of the parameters αL, αU, c1L and c1U for both the demeaned and detrended cases for confidence interval construction. The results are reported in table 6.4, and the corresponding power curves for different c0 are presented in figure 6.5. Table 6.3 reports the values based on the Elliott and Stock rule discussed above, with fixed αL = 0.03 and αU = 0.02.

Table 6.3: Model parameters based on the ES rule

            Demeaned case          Detrended case
c0          c1L      c1U           c1L       c1U
-4          -11      -2            -17.5     1
 0           -7       2            -13.5     5
 4           -3       6             -9.5     9

The optimal values of the model parameters reported in table 6.4 are quite different from those proposed by ES. The values of αL and αU vary across the different choices of c0, instead of being fixed as proposed by ES. For positive c0, all of the weight functions assign a higher probability of type-I error to the left tail (i.e. a larger αL), even higher than the ES value of αL = 0.03 for a test of overall size α = 0.05.

Table 6.4: Optimal values of the model parameters for two-sided tests

                                       Demeaned case                 Detrended case
Weight function                       c1L      c1U      αL          c1L      c1U      αL
H0: c0 = -4
Simple average                        -13.5    0.25     0.038       -18      2        0.039
Jeffreys prior                        -28      3        0.034       -13      4        0.027
Symmetric reference prior             -22      -0.75    0.046       -18      2        0.047
Invariant Jeffreys prior              same as Jeffreys prior        -14      4        0.034
H0: c0 = 0
Simple average                        -9.5     2        0.037       -25      1.5      0.042
Jeffreys prior                        -8       3.25     0.032       -23      3.75     0.029
Symmetric reference prior             -13      2.5      0.047       -27      3.5      0.048
Invariant Jeffreys prior              same as Jeffreys prior        -23      3.75     0.036
H0: c0 = 4
Simple average                        0.80     4.5      0.049       -1.5     4.9      0.048
Jeffreys prior                        0.20     4.8      0.041       -5.5     5        0.033
Symmetric reference prior             0.4      4.6      0.049       -3.5     4.6      0.049
Invariant Jeffreys prior              same as Jeffreys prior        -7.5     5.5      0.033

In figure 6.5, the left and right panels correspond to the power curves for the demeaned and detrended cases respectively. The power curves based on the optimal values of (c1L, c1U, αL, αU) obtained from the various weight functions are all well behaved. Comparing them, the slopes of the curves, particularly in the right tails, vary with c0. We also find that our proposed values produce power curves that outperform those based on ES in almost all cases, especially in the lower tails.
Significant power gains from our proposed values over those of ES are observed particularly when testing a positive value of c0. These differences support our hypothesis that the slopes of the power curves vary with c0 and hence that the model parameters should vary with c0 as well.

Figure 6.5: Power curves for two-tailed tests with different criteria

On comparison, the power curves corresponding to parameter values from the weighted power criterion with uniform and symmetric reference prior weights are superior in the left tails to those from Jeffreys prior weights for each c0, while in the detrended case the test achieves higher power with invariant Jeffreys prior weights than with Jeffreys prior weights when c0 is either -4 or 0. Looking at the right tails of the curves, the test has almost identical power across the optimal values for all alternative c. The most interesting case is that of testing c0 = 4, where ES is a poor choice because of its low power close to the null. Both types of Jeffreys prior weights have identical power in the right tail but differ slightly in the left tail. In the detrended case, however, the left-tail powers from Jeffreys prior weights are significantly lower than those from uniform and symmetric reference prior weights.

This simulation analysis shows that there is no unique set of parameters with uniformly better power across the grid of c values. Nevertheless, our proposed sets of values produce powers that are generally higher than those based on ES. In particular, the point optimal test has good power properties when computed from the parameters generated by the uniform and symmetric reference prior weight functions, for all c0 and for both the demeaned and detrended cases.

7 Conclusions

In this paper we have analysed and extended Elliott and Stock's method of constructing confidence intervals for an autoregressive root local to one by inverting a sequence of invariant point optimal tests. Our analysis of one-sided testing problems shows that, in the presence of deterministics, the choices of model parameters based on ERS and ES work fairly well when testing zero or negative values of the parameter c0. For some positive c0, however, the test based on the ERS rule becomes inconsistent against lower-sided stationary alternatives, and the test based on the ES rule has low power for alternatives close to the null. For the upper-tail test this behaviour is the same whether the deterministic component is a constant mean or a constant and linear trend. We provide the reason for the inconsistency of the test against lower-sided alternatives for some choices of the parameter c1L. For the two-sided test, instead of using the fixed values αL = 0.03 and αU = 0.02 with c1L, c1U at a fixed distance from the null, we search for optimal values of these parameters determined jointly by maximisation of weighted power under different weight functions. Our results indicate that the optimal values of these parameters are very sensitive to changes in c0. In particular, all of the weight functions considered here except the Jeffreys prior yield optimal αL > 0.03 for the various c0, with the symmetric reference prior weights producing the largest αL, close to the nominal level of 0.05. We acknowledge that none of these weight functions produces uniformly higher power that would lead us to prefer one weight function over the others.
But the simulation results show that our proposed parameter values (particularly those from the uniform and symmetric reference prior weights) result in more powerful tests than those provided by ES. Based on these findings, our proposed values may be expected to yield more accurate confidence intervals. Possible extensions of this work include the derivation of an invariant symmetric reference prior following the same approach used for the invariant Jeffreys prior, allowing for serially correlated errors in the usual way, and consideration of the effect of the initial condition on these tests.

8 References

Berger, J. O., and R. Y. Yang (1994) "Noninformative Priors and Bayesian Testing for the AR(1) Model," Econometric Theory 10, 461-482.
Bobkoski, M. J. (1983) "Hypothesis Testing in Nonstationary Time Series," Unpublished Ph.D. Thesis, Department of Statistics, University of Wisconsin.
Cavanagh, C. (1985) "Roots Local to Unity," Unpublished Manuscript, Department of Economics, Harvard University.
Chan, N. H., and C. Z. Wei (1987) "Asymptotic Inference for Nearly Nonstationary AR(1) Processes," Annals of Statistics 15, 1050-1063.
Cox, D. R., and D. V. Hinkley (1974) Theoretical Statistics. Chapman & Hall, London.
Elliott, G., T. J. Rothenberg, and J. H. Stock (1996) "Efficient Tests for an Autoregressive Unit Root," Econometrica 64, 813-836.
Elliott, G., and J. H. Stock (2001) "Confidence Intervals for Autoregressive Coefficients Near One," Journal of Econometrics 103, 155-181.
King, M. L. (1980) "Robust Tests for Spherical Symmetry and their Application to Least Squares Regression," The Annals of Statistics 8, 1265-1271.
King, M. L. (1987) "Towards a Theory of Point Optimal Testing," Econometric Reviews 6, 169-218.
King, M. L., and G. H. Hillier (1985) "Locally Best Invariant Tests of the Error Covariance Matrix of the Linear Regression Model," Journal of the Royal Statistical Society, Series B 47, 98-102.
Lai, T. L., and C. Z. Wei (1983) "Asymptotic Properties of General Autoregressive Models and Strong Consistency of Least-Squares Estimates of their Parameters," Journal of Multivariate Analysis 13, 1-23.
Marsh, P. (2007) "The Available Information for Invariant Tests of a Unit Root," Econometric Theory 23, 686-710.
Ng, S., and P. Perron (2001) "Lag Length Selection and the Construction of Unit Root Tests with Good Size and Power," Econometrica 69, 1519-1554.
Patterson, K. (2011) Unit Root Tests in Time Series, Volume 1. Palgrave Macmillan, UK.
Phillips, P. C. B. (1987) "Towards a Unified Asymptotic Theory for Autoregression," Biometrika 74, 535-547.
Phillips, P. C. B. (1991) "To Criticize the Critics: An Objective Bayesian Analysis of Stochastic Trends," Journal of Applied Econometrics 6, 333-364.
Sims, C. A., and H. Uhlig (1991) "Understanding Unit Rooters: A Helicopter Tour," Econometrica 59, 1591-1599.
Stock, J. H. (1991) "Confidence Intervals for the Largest Autoregressive Root in U.S. Macroeconomic Time Series," Journal of Monetary Economics 28, 435-459.

Appendix A

1 Derivation of the Jeffreys prior

We derive here the expression for the Jeffreys prior in the autoregressive model under the local-to-unity setting; the prior is the positive square root of the Fisher information. The log-likelihood, based on ε_t being i.i.d. normal
in equation (2.2), is

log L(c | σ²) = -(T/2) log(2π) - (T/2) log σ² - (1/(2σ²)) Σ_{t=1}^T ( u_t - (1 + c/T) u_{t-1} )².

The score with respect to c is

∂ log L(c | σ²)/∂c = (1/(σ² T)) Σ_{t=1}^T u_{t-1} ( u_t - (1 + c/T) u_{t-1} ),

and the derivative of the score is

∂² log L(c | σ²)/∂c² = -(1/(σ² T²)) Σ_{t=1}^T u²_{t-1} = -(1/(σ² T²)) Σ_{t=0}^{T-1} u²_t.

By recursively iterating u_t = (1 + c/T) u_{t-1} + ε_t we obtain u_t = Σ_{j=0}^{t-1} (1 + c/T)^j ε_{t-j}, so that (summing the geometric series)

E(u²_t) = σ² Σ_{j=0}^{t-1} (1 + c/T)^{2j} = σ² [ (1 + c/T)^{2t} - 1 ] / [ (1 + c/T)² - 1 ] = σ² [ (1 + c/T)^{2t} - 1 ] / [ T^{-1}(2c + c²/T) ].

Thus

E( Σ_{t=0}^{T-1} u²_t ) = σ² [ T/(2c + c²/T) ] [ T^{-1} Σ_{t=0}^{T-1} (1 + c/T)^{2t} - 1 ].

The information is then

I(c) = E( -∂² log L(c | σ²)/∂c² ) = (1/(σ² T²)) E( Σ_{t=0}^{T-1} u²_t )
  = [ 1/(T(2c + c²/T)) ] [ ( (1 + c/T)^{2T} - 1 ) / ( T( (1 + c/T)² - 1 ) ) - 1 ]
  → (1/(2c)) [ (e^{2c} - 1)/(2c) - 1 ]  as T → ∞, c ≠ 0,

since (1 + c/T)^{2T} → e^{2c} and T( (1 + c/T)² - 1 ) → 2c. If c → 0, L'Hôpital's rule gives I = 1/2. Thus in the limit

I(c) = (1/(2c)) [ (e^{2c} - 1)/(2c) - 1 ]  if c ≠ 0,  and  I(c) = 1/2  if c = 0.

2 Point optimal and modified point optimal tests and their asymptotic distributions

2.1 Point optimal test in the absence of deterministics

We first derive the point optimal test without deterministics, both because this helps to explain why the statistic becomes inconsistent for some choices of c0 and c1 and because the expression is used to derive the modified point optimal test. The point optimal test is

P_T(ρ0, ρ1) = σ̂^{-2} [ Σ_{t=1}^T ε̂²_{1,t} - (ρ1/ρ0) Σ_{t=1}^T ε̂²_{0,t} ],  ε̂_{j,t} = û_{j,t} - ρ_j û_{j,t-1},  û_{j,t} = y_t - x_t'β̂_j.

With no deterministics (x_t = 0), the model in (2.1) and (2.2) becomes y_t = u_t, u_t = ρ u_{t-1} + ε_t, u_1 = ε_1, and û_{j,t} = y_t. The point optimal test of H0: ρ = ρ0 against H1: ρ = ρ1 is then

P_T(ρ0, ρ1) = σ̂^{-2} [ Σ_{t=1}^T ε²_{1,t} - (ρ1/ρ0) Σ_{t=1}^T ε²_{0,t} ]
  = σ̂^{-2} [ Σ_{t=1}^T (y_t - ρ1 y_{t-1})² - (ρ1/ρ0) Σ_{t=1}^T (y_t - ρ0 y_{t-1})² ].

Consider the term in parentheses:

Σ_{t=1}^T (y_t - ρ1 y_{t-1})² - (ρ1/ρ0) Σ_{t=1}^T (y_t - ρ0 y_{t-1})² = (1 - ρ1/ρ0) y_T² + [ (1 - ρ1/ρ0) + ρ1(ρ1 - ρ0) ] Σ_{t=2}^T y²_{t-1}.

The statistic is applied with ρ0 = 1 + c0/T and ρ1 = 1 + c1/T, so that ρ1 - ρ0 = (c1 - c0)/T and

1 - ρ1/ρ0 = 1 - (1 + c1/T)/(1 + c0/T) = (c0 - c1)/(T + c0),

and hence, for large T (which is what concerns us for the choice of c1),

P_T(c0, c1) ≈ σ̂^{-2} [ (c1² - c0²) T^{-2} Σ_{t=2}^T y²_{t-1} + (c0 - c1) T^{-1} y_T² ],  (0.1)

and we reject for small values. (It is worth keeping the factor (c0 - c1) in front of the last term so as not to confuse the sign of the statistic, and hence the rejection tail.)

2.2 Asymptotic distributions in the presence of deterministics with no autocorrelation in the error term ε_t

We now incorporate the deterministic component and, assuming no serial correlation in ε_t, derive the asymptotic distribution of the test statistic under the local-to-unity framework. The model is

y_t = x_t'β + u_t,  u_t = ρ u_{t-1} + ε_t,  u_1 = ε_1,  ε_t ~ i.i.d.(0, σ²),  ρ = 1 + c/T.

The quasi-differenced series are

y_{j,t} = y_1 (t = 1),  y_{j,t} = y_t - ρ_j y_{t-1} (t = 2,...,T),  x_{j,t} = x_1 (t = 1),  x_{j,t} = x_t - ρ_j x_{t-1} (t = 2,...,T),

the GLS-detrended series is û_{j,t} = y_t - x_t'β̂_j and, based on û_{j,t}, ε̂_{j,t} = û_{j,t} - ρ_j û_{j,t-1}, where

β̂_j = arg min_β Σ_{t=1}^T ( y_{j,t} - x_{j,t}'β )² = ( Σ_{t=1}^T x_{j,t} x_{j,t}' )^{-1} Σ_{t=1}^T x_{j,t} y_{j,t},

and

P_T(ρ0, ρ1) = σ̂^{-2} [ Σ_{t=1}^T ε̂²_{1,t} - (ρ1/ρ0) Σ_{t=1}^T ε̂²_{0,t} ].
Applying the Cochrane-Orcutt transformation to the first two equations gives, in terms of the quasi-differenced series,

y_{j,t} = x_{j,t}'β + ε_{j,t},  ε_{j,t} = u_t - ρ_j u_{t-1}.

Substituting y_{j,t} into the expression for β̂_j and into y_{j,t} = ŷ_{j,t} + ε̂_{j,t}, and rearranging,

β̂_j = β + ( Σ_{t=1}^T x_{j,t} x_{j,t}' )^{-1} Σ_{t=1}^T x_{j,t} ε_{j,t},

so that

ε̂_{j,t} = ε_{j,t} - x_{j,t}' ( Σ_{t=1}^T x_{j,t} x_{j,t}' )^{-1} Σ_{t=1}^T x_{j,t} ε_{j,t}.  (0.2)

Similarly, using y_t = x_t'β + u_t in û_{j,t} together with the expression for β̂_j,

û_{j,t} = u_t - x_t' ( Σ_{t=1}^T x_{j,t} x_{j,t}' )^{-1} Σ_{t=1}^T x_{j,t} ε_{j,t}.  (0.3)

The FCLT states that T^{-1/2} u_{[Ts]} →d σ B_c(s).

Constant case. For x_t = 1 the quasi-differenced regressor is x_{j,1} = 1 and x_{j,t} = -c_j/T for t = 2,...,T. Hence

Σ_{t=1}^T x²_{j,t} = 1 + c_j² (T-1)/T²,

and, since ε_{j,1} = u_1,

Σ_{t=1}^T x_{j,t} ε_{j,t} = u_1 - (c_j/T) Σ_{t=2}^T ( u_t - ρ_j u_{t-1} ) = u_1 + r_{1,T},  r_{1,T} = Op(T^{-1/2}),

so that

( Σ_{t=1}^T x²_{j,t} )^{-1} Σ_{t=1}^T x_{j,t} ε_{j,t} = u_1 + r_{2,T},  r_{2,T} = Op(T^{-1/2}).

From (0.2), ε̂_{j,1} = ε_{j,1} - (u_1 + r_{2,T}) = -r_{2,T} and ε̂_{j,t} = ε_{j,t} + (c_j/T)(u_1 + r_{2,T}) for t = 2,...,T. Therefore

Σ_{t=1}^T ε̂²_{j,t} = r²_{2,T} + Σ_{t=2}^T ( ε_{j,t} + (c_j/T)(u_1 + r_{2,T}) )² = Σ_{t=2}^T ε²_{j,t} + Op(T^{-1/2}),

since the cross-product and squared correction terms are all Op(T^{-1/2}) or smaller. Using the convention u_0 = 0 and ε_{j,t} = u_t - ρ_j u_{t-1}, the same expansion as in (0.1) (with u_t in place of y_t) gives

P_T(ρ0, ρ1) = σ̂^{-2} [ Σ_{t=2}^T (u_t - ρ1 u_{t-1})² - (ρ1/ρ0) Σ_{t=2}^T (u_t - ρ0 u_{t-1})² ] + Op(T^{-1/2})
  = σ̂^{-2} [ (c1² - c0²) T^{-2} Σ_{t=2}^T u²_{t-1} - (c1 - c0) T^{-1} u_T² ] + Op(T^{-1/2}).

Using the functional central limit theorem, the asymptotic distribution of the point optimal test with GLS demeaning is therefore

P_T(c0, c1) ⇒ (c1² - c0²) ∫_0^1 B_c(s)² ds - (c1 - c0) B_c(1)².

Linear time trend. For x_t = (1, t)' the quasi-differenced series x_{j,t} = x_t - (1 + c_j/T) x_{t-1} is

x_{j,1} = (1, 1)',  x_{j,t} = ( -c_j/T, 1 - c_j(t-1)/T )',  t = 2,...,T.

Let D_T = diag(1, T^{-1/2}). After this normalisation, (0.2) becomes

ε̂_{j,t} = ε_{j,t} - x_{j,t}'D_T ( D_T Σ_{t=1}^T x_{j,t} x_{j,t}' D_T )^{-1} D_T Σ_{t=1}^T x_{j,t} ε_{j,t}.  (0.4)

Direct calculation of the elements of D_T Σ_{t=1}^T x_{j,t} x_{j,t}' D_T gives

(1,1): 1 + c_j²(T-1)/T² = 1 + r_{1,T},
(1,2): T^{-1/2} [ 1 - (c_j/T) Σ_{t=2}^T (1 - c_j(t-1)/T) ] = T^{-1/2} ( 1 - c_j + c_j²/2 ) + o(T^{-1/2}) = r_{2,T},
(2,2): T^{-1} [ 1 + Σ_{t=2}^T (1 - c_j(t-1)/T)² ] = 1 - c_j + c_j²/3 + r_{3,T},

with r_{1,T}, r_{2,T}, r_{3,T} all o(1), so that

D_T Σ_{t=1}^T x_{j,t} x_{j,t}' D_T = [ 1, 0 ; 0, 1 - c_j + c_j²/3 ] + R_{1,T}.

Similarly, with ε_{j,1} = u_1,

D_T Σ_{t=1}^T x_{j,t} ε_{j,t} = ( u_1 - (c_j/T) Σ_{t=2}^T (u_t - ρ_j u_{t-1}),  T^{-1/2} [ u_1 + Σ_{t=2}^T (1 - c_j(t-1)/T)(u_t - ρ_j u_{t-1}) ] )'
  = ( u_1 + r_{4,T},  (1 - c_j) T^{-1/2} u_T + c_j² T^{-5/2} Σ_{t=2}^{T-1} t u_t + r_{5,T} )'
  = ( u_1,  (1 - c_j) T^{-1/2} u_T + c_j² T^{-5/2} Σ_{t=2}^{T-1} t u_t )' + R_{2,T}.

Combining the two expressions,

( D_T Σ x_{j,t} x_{j,t}' D_T )^{-1} D_T Σ x_{j,t} ε_{j,t} = ( u_1,  λ_j T^{-1/2} u_T + 3(1 - λ_j) T^{-5/2} Σ_{t=2}^{T-1} t u_t )' + R_{3,T},

where λ_j = (1 - c_j)/(1 - c_j + c_j²/3), so that (1 - c_j + c_j²/3)^{-1}(1 - c_j) = λ_j and (1 - c_j + c_j²/3)^{-1} c_j² = 3(1 - λ_j). Squaring and summing both sides of (0.4),

Σ_{t=1}^T ε̂²_{j,t} = Σ_{t=1}^T ε²_{j,t} - ( D_T Σ x_{j,t} ε_{j,t} )' ( D_T Σ x_{j,t} x_{j,t}' D_T )^{-1} ( D_T Σ x_{j,t} ε_{j,t} ),

and the quadratic form equals u_1² + (1 - c_j + c_j²/3)^{-1} [ (1 - c_j) T^{-1/2} u_T + c_j² T^{-5/2} Σ_{t=2}^{T-1} t u_t ]² + R_{5,T}. Hence

Σ_{t=1}^T ε̂²_{j,t} = Σ_{t=1}^T ε²_{j,t} - u_1² - (1 - c_j + c_j²/3)^{-1} [ (1 - c_j) T^{-1/2} u_T + c_j² T^{-5/2} Σ_{t=2}^{T-1} t u_t ]² + R_{5,T}.  (0.5)

The point optimal test, using (0.5), is

P_T = σ̂^{-2} [ Σ_{t=1}^T ε̂²_{1,t} - (ρ1/ρ0) Σ_{t=1}^T ε̂²_{0,t} ].  (0.6)

Focus on the numerator N_T of (0.6). The u_1² terms enter with coefficients 1 and ρ1/ρ0 = 1 + O(T^{-1}) and so cancel asymptotically, and the expansion in (0.1) applies to the leading term, giving

N_T = (c1² - c0²) T^{-2} Σ_{t=2}^T u²_{t-1} - (c1 - c0) T^{-1} u_T²
  - (1 - c1 + c1²/3)^{-1} [ (1 - c1) T^{-1/2} u_T + c1² T^{-5/2} Σ_{t=2}^{T-1} t u_t ]²
  + (1 - c0 + c0²/3)^{-1} [ (1 - c0) T^{-1/2} u_T + c0² T^{-5/2} Σ_{t=2}^{T-1} t u_t ]² + R_T,

or, in terms of λ_j,

N_T = (c1² - c0²) T^{-2} Σ_{t=2}^T u²_{t-1} - (c1 - c0) T^{-1} u_T²
  - (1 - c1 + c1²/3) [ λ_1 T^{-1/2} u_T + 3(1 - λ_1) T^{-5/2} Σ_{t=2}^{T-1} t u_t ]²
  + (1 - c0 + c0²/3) [ λ_0 T^{-1/2} u_T + 3(1 - λ_0) T^{-5/2} Σ_{t=2}^{T-1} t u_t ]² + R_T.  (0.7)

By the functional central limit theorem, the asymptotic distribution of this generalised point optimal test is

P_T(c0, c1) →d (c1² - c0²) ∫_0^1 B_c(s)² ds - (c1 - c0) B_c(1)² - (1 - c1 + c1²/3) B̄²_{c,c1} + (1 - c0 + c0²/3) B̄²_{c,c0},

where B̄_{c,cj} = λ_j B_c(1) + 3(1 - λ_j) ∫_0^1 s B_c(s) ds, j = 0, 1.

Modified point optimal test. To obtain the modified point optimal test, consider (0.3) after normalising x_t by D_T = diag(1, T^{-1/2}):

û_{j,t} = u_t - x_t'D_T ( D_T Σ x_{j,t} x_{j,t}' D_T )^{-1} D_T Σ x_{j,t} ε_{j,t} = u_t - x_t'D_T [ ( u_1, A_j )' + R_T ],

where A_j = λ_j T^{-1/2} u_T + 3(1 - λ_j) T^{-5/2} Σ_{t=2}^{T-1} t u_t. Since x_t'D_T = (1, t T^{-1/2}), this gives

T^{-1/2} û_{j,1} = T^{-1/2} u_1 - T^{-1/2} u_1 - T^{-1} A_j + R_T = R_T,
T^{-1/2} û_{j,t} = T^{-1/2} u_t - (t/T) A_j + R_T,  t = 2,...,T.

The modified point optimal test, denoted MP_T, is

MP_T = σ̂^{-2} [ c1² T^{-2} Σ_{t=2}^T û²_{1,t-1} + (1 - c1) T^{-1} û²_{1,T} - c0² T^{-2} Σ_{t=2}^T û²_{0,t-1} - (1 - c0) T^{-1} û²_{0,T} ].  (1.1)

We now show that MP_T is asymptotically equivalent to P_T. For j = 0, 1,

c_j² T^{-2} Σ_{t=2}^T û²_{j,t-1} + (1 - c_j) T^{-1} û²_{j,T}
  = c_j² T^{-1} Σ_{t=2}^T ( T^{-1/2} u_{t-1} - ((t-1)/T) A_j )² + (1 - c_j) ( T^{-1/2} u_T - A_j )² + R_T.

Expanding the squares and using T^{-3} Σ_{t=2}^T (t-1)² → 1/3 together with T^{-5/2} Σ_{t=2}^T (t-1) u_{t-1} = T^{-5/2} Σ_{t=2}^{T-1} t u_t + R_T, the cross and square terms in A_j collect as

-2 A_j [ c_j² T^{-5/2} Σ_{t=2}^{T-1} t u_t + (1 - c_j) T^{-1/2} u_T ] + A_j² [ c_j²/3 + (1 - c_j) ] = -(1 - c_j + c_j²/3) A_j²,

since c_j² T^{-5/2} Σ_{t=2}^{T-1} t u_t + (1 - c_j) T^{-1/2} u_T = (1 - c_j + c_j²/3) A_j. Hence

c_j² T^{-2} Σ_{t=2}^T û²_{j,t-1} + (1 - c_j) T^{-1} û²_{j,T} = c_j² T^{-2} Σ_{t=2}^T u²_{t-1} + (1 - c_j) T^{-1} u_T² - (1 - c_j + c_j²/3) A_j² + R_T.

Using this expression with j = 0, 1 in the numerator of (1.1) gives

N*_T = (c1² - c0²) T^{-2} Σ_{t=2}^T u²_{t-1} - (c1 - c0) T^{-1} u_T² - (1 - c1 + c1²/3) A_1² + (1 - c0 + c0²/3) A_0² + R_T = N_T + R_T.

Thus we conclude that N*_T = N_T + op(1).