ST 524 Homework 4 NCSU - Fall 2007 Due: 10/17/07

Question 1.

Data set “uniftrialdata.xls” presents yields of a uniformity trial on winter wheat (simulated data). Unit-size plots (1.5 m wide × 4.5 m long) are arranged in 6 columns × 48 rows, for a total of 288 plots (size X = 1). Interest is in exploring the relationship between plot size and the variance among plots (on a unit basis). Four variables identify plots according to their size: plot2, plot4, plot8 and plot16, corresponding to plot sizes of 2, 4, 8 and 16 units.

1. Following the approach presented in Swallow and H, a nested analysis of variance on yield is presented. It is used to estimate $V_X$, the variance among plots of size X, expressed on a unit basis.

    proc glm data=b3(where=(plot1<49 and col<7));
      class plot1 plot2 plot4 plot8 plot16;
      /* each plot-size factor is nested within the next larger one */
      model newyield = plot16 plot8(plot16) plot4(plot8*plot16)
                       plot2(plot4*plot8*plot16);
      /* all effects random; /test requests F tests based on the expected mean squares */
      random plot16 plot8(plot16) plot4(plot8*plot16)
             plot2(plot4*plot8*plot16) / test;
      output out=outglm r=resid student=sres p=pred;
    run;

In the output, SAS abbreviates the effect plot2(plot4*plot8*plot16) as plot(plot*plot*plot).

    The GLM Procedure
    Dependent Variable: newyield

    Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model             143       209919.5978      1467.9692       2.40    <.0001
    Error             144        87907.8423       610.4711
    Corrected Total   287       297827.4402

    R-Square    Coeff Var    Root MSE    newyield Mean
    0.704836     6.033877    24.70771         409.4832

    Source                  DF      Type I SS    Mean Square    F Value    Pr > F
    plot16                  17    50233.17637     2954.89273       4.84    <.0001
    plot8(plot16)           18    34266.59997     1903.70000       3.12    <.0001
    plot4(plot8*plot16)     36    62756.55429     1743.23762       2.86    <.0001
    plot(plot*plot*plot)    72    62663.26720      870.32316       1.43    0.0370

    Source                  DF    Type III SS    Mean Square    F Value    Pr > F
    plot16                  17    50233.17637     2954.89273       4.84    <.0001
    plot8(plot16)           18    34266.59997     1903.70000       3.12    <.0001
    plot4(plot8*plot16)     36    62756.55429     1743.23762       2.86    <.0001
    plot(plot*plot*plot)    72    62663.26720      870.32316       1.43    0.0370

    Expected Mean Squares

    Source                  Type III Expected Mean Square
    plot16                  Var(Error) + 2 Var(plot(plot*plot*plot)) + 4 Var(plot4(plot8*plot16))
                              + 8 Var(plot8(plot16)) + 16 Var(plot16)
    plot8(plot16)           Var(Error) + 2 Var(plot(plot*plot*plot)) + 4 Var(plot4(plot8*plot16))
                              + 8 Var(plot8(plot16))
    plot4(plot8*plot16)     Var(Error) + 2 Var(plot(plot*plot*plot)) + 4 Var(plot4(plot8*plot16))
    plot(plot*plot*plot)    Var(Error) + 2 Var(plot(plot*plot*plot))

Tuesday October 9, 2007 Homework 5 1

Equating each mean square to its expectation and solving for the last term gives the following estimates.

    Plot size    MS            $V_X$
    1             610.4711     $V_1 = 610.4711$
    2             870.32316    $V_2 = (870.32316 - 610.4711)/2 = 129.9260$
    4            1743.23762    $V_4 = (1743.23762 - 870.32316)/4 = 218.2286$
    8            1903.70000    $V_8 = (1903.70000 - 1743.23762)/8 = 20.0578$
    16           2954.89273    $V_{16} = (2954.89273 - 1903.70000)/16 = 65.6995$

Variance components may be obtained directly with PROC MIXED:

    proc mixed data=b3(where=(plot1<49 and col<7));
      class plot1 plot2 plot4 plot8 plot16;
      model newyield = / outp=predds;   /* intercept-only fixed part */
      random plot16 plot8(plot16) plot4(plot8*plot16)
             plot2(plot4*plot8*plot16);
    run;

Variance components estimates from PROC MIXED

    Plot Size    Estimate
    1            610.47
    2            129.93
    4            218.23
    8             20.0578
    16            65.6995

    $\log V_x = 6.0354 - 0.9127 \log X$

    The Mixed Procedure
    Covariance Parameter Estimates

    Cov Parm                Estimate    Size
    plot16                   65.6995      16
    plot8(plot16)            20.0578       8
    plot4(plot8*plot16)     218.23         4
    plot(plot*plot*plot)    129.93         2
    Residual                610.47         1

2. Next, a regression of $V_x$ on X, on a log scale, is used to get a raw estimate of the coefficient of soil heterogeneity b (Smith’s b).
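As a minimal sketch of how this regression could be set up, the data step below enters the five variance estimates by hand and regresses log $V_x$ on log X. The data set name smithb is a hypothetical choice, not taken from the original program. The variable names log_x and log_vx follow the REG output, and the values are the PROC MIXED estimates listed above. Natural logarithms are assumed.

```sas
/* Sketch only: smithb is a hypothetical data set name.            */
/* Values are the PROC MIXED variance estimates by plot size.      */
data smithb;
  input x vx;
  log_x  = log(x);    /* natural log of plot size                  */
  log_vx = log(vx);   /* natural log of the variance estimate      */
  datalines;
1 610.47
2 129.93
4 218.23
8 20.0578
16 65.6995
;
run;

/* Simple linear regression on the log scale; the slope on log_x   */
/* is the raw estimate of -b in log Vx = log V1 - b log X.          */
proc reg data=smithb;
  model log_vx = log_x;
run;
```

With only five points the fit has 3 error degrees of freedom, so the estimate of b should be read as a rough value rather than a precise one.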
    The REG Procedure
    Model: MODEL1
    Dependent Variable: log_vx

    Analysis of Variance
    Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model               1           4.00265        4.00265       4.67    0.1194
    Error               3           2.56906        0.85635
    Corrected Total     4           6.57171

    Root MSE           0.92539    R-Square    0.6091
    Dependent Mean     4.77010    Adj R-Sq    0.4788
    Coeff Var         19.39988

    Parameter Estimates
    Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
    Intercept     1               6.03543           0.71681       8.42      0.0035
    log_x         1              -0.91274           0.42218      -2.16      0.1194

Regression equation: $\log V_x = 6.0354 - 0.9127 \log X$

Smith’s b = 0.9127. Under Smith’s law, $V_x = V_1 / X^b$, so b is the negative of the slope. Values of b close to 1 indicate that adjacent units are nearly uncorrelated (heterogeneous soil), while values close to 0 indicate highly correlated adjacent units (homogeneous soil). A plot size between 2 and 8 seems adequate, since the estimate for X = 16 (65.70) is greater than the estimate for X = 8 (20.06).

3. Additionally, we can analyze the residuals for plot sizes X = 1 and X = 8 and see whether the use of a larger plot reduces the residual variation.

Check the residual distribution on the field by fitting just an intercept in the model

    $y_{ij} = \mu + \varepsilon_{ij}$,

where i and j are the row and column coordinates of each plot, i = 1, 2, ..., 48, j = 1, 2, 3, 4, 5, 6, $y_{ij}$ is the yield in plot (i, j), $\mu$ is the overall mean, and $\varepsilon_{ij}$ is the residual value in plot (i, j).
    proc glm data = newtrial;
      model newyield = ;   /* intercept-only model */
      output out = outglm r = resid student = sres p = pred;
    run;

    The GLM Procedure
    Dependent Variable: newyield

    Source               DF    Sum of Squares    Mean Square    F Value    Pr > F
    Model                 1       48290839.62    48290839.62    46535.2    <.0001
    Error               287         297827.44        1037.73
    Uncorrected Total   288       48588667.06

    R-Square    Coeff Var    Root MSE    newyield Mean
    0.000000     7.866930    32.21376         409.4832

    Source       DF      Type I SS    Mean Square    F Value    Pr > F
    Intercept     1    48290839.62    48290839.62    46535.2    <.0001

    Parameter       Estimate    Standard Error    t Value    Pr > |t|
    Intercept    409.4832432        1.89821396     215.72      <.0001

    *** residual plot on the field ***;

[Figure: Residual plot]
[Figure: Standardized Residual plot]

    *** graph a contour plot for residuals on the field ***;
    proc g3grid data=outglm out=out2;
      grid row*col = sres;
    run;
    proc gcontour data=out2;
      plot row*col = sres / levels = -4 -3 -2 -1 0 1 2 3 4;
      * pattern join;
    run;

Question 2

Dylan B. Keon and Patricia S. Muir. Growth of Usnea longissima Across a Variety of Habitats in the Oregon Coast Range. The Bryologist. Vol 105, No. 2, pp 233-242

Abstract. The sensitive lichen Usnea longissima Ach. has a limited, patchy distribution across forested landscapes in the U.S. Pacific Northwest. To gain insight into whether the current distribution within the Oregon Coast Range has resulted from a lack of suitable habitat or from dispersal limitations, we measured growth of U. longissima transplants placed in four habitats. Transplant study site locations and habitats were determined through an accompanying study that identified significant U. longissima habitat characteristics, based on the present distribution of the species, and used predictive modeling to identify areas of apparently suitable habitat within the study area.
Transplants were placed in 12 sites, comprised of three replicates of the four habitats. Ninety transplants were placed in each habitat (n = 360). Growth was measured as changes in biomass and length after one year. Transplants grew in all habitats, particularly in sites where habitat was predicted to be least suitable for U. longissima. Although transplants in those sites had mean biomass increases that were 2.7 to 4.6 times greater than those of transplants placed in the other three habitats, their overall rate of attrition was 1.5 to 1.8 times higher than transplants in the other three habitats. Increases in length were also greatest in sites where habitat was predicted to be least suitable. The fact that the transplants grew well in all habitats and actually thrived in sites where habitat was predicted to be least suitable indicates that dispersal limitations may play a more significant role than the availability of suitable habitat in determining the distribution of U. longissima in the Oregon Coast Range. These findings underscore the importance of green tree retention during timber harvests. Trees containing U. longissima should be retained so that they may inoculate the regenerating stand with U. longissima fragments. It is also recommended that stands harboring significant populations of U. longissima (typically old stands) be preserved as source locations of this dispersal-limited species.

Habitat is a fixed-effects factor.

Additive Linear Model

    $Y_{ijk} = \mu + \alpha_i + \beta_{ij} + \varepsilon_{ijk}$

    $\alpha_i$: Habitat, fixed effect, i = 1, 2, 3, 4.
    $\beta_{ij}$: Site, random effect, j = 1, 2, 3.
    $\varepsilon_{ijk}$: Individuals, random effect, k = 1, 2, …, 30.

1. “This is an Observational Study”. Argue in favor of or against this statement.

2. Analyze the following statement. What do you consider the reason(s) that we use a nested ANOVA to analyze the data? Were the individual transplants independent?

1. Write down the null hypotheses to be tested.

2. Write down the analysis of variance table with Sources of Variation, corresponding degrees of freedom, and an Expected Mean Squares column.

3. Initially there were thirty (30) individual transplants within each site, but some were discarded as the experiment progressed. Compare the degrees of freedom for the ANOVA table in question 2 with Table 3, below. Note that a separate Error term is specified for testing the Habitat effect, as a result of missing observations. What should the degrees of freedom of this F test denominator be if there were no missing observations?

4. Use Table 3 to write down conclusions. Refer to the hypotheses being tested. Do we have an estimate of the variation among sites? What about the variation among individual transplants subject to similar conditions (within the same site and habitat)?

5. Read the following paragraph. Would you consider any changes necessary?

6. To analyze the quality of the data, we may pay attention to the following two paragraphs. They explain how the discarding and retention of transplants was carried out and may give information about any bias in the data. Comment on the quality of the data.

[Excerpt: Surviving Transplants I]
[Excerpt: Surviving Transplants II]

Missing observations were either completely missing or did not qualify as “surviving transplants”.