University of Copenhagen, Department of Biostatistics
Faculty of Health Sciences

Regression models
Two covariates, quantitative outcome (2-5-2015)
Per Kragh Andersen and Lene Theil Skovgaard
Dept. of Biostatistics

Multiple regression, no interaction

PKA & LTS, Sect. 5.1, 5.1.1, 5.1.2, 5.1.3
Confounding

- The need for more than one covariate
- Confounding
- Adjusted vs. unadjusted estimates
- Two-way anova
- Ancova
- Model check

Home pages: http://biostat.ku.dk/~pka/regrmodels15

So far: a single covariate

When is it reasonable to use models with only a single covariate?

- Randomized clinical trials, comparing
  - two or more treatments
  - dose groups
- Construction of reference curves, such as
  - "weight for height"
  - hormone level vs. gestational age
  - blood pressure vs. age

Now: more than one covariate

When do we need more than a single covariate?

- In observational studies:
  - Potential risk factors may be associated with the covariate of interest
- Randomized clinical trials:
  - If the randomization is poorly conducted
  - If important risk factors exist
  - (If the trial is small)

Hypothetical example

Gender difference in lung function?

Model: [Figure: model diagram]
Analysis: Simple t-test
Problem:
- The age distributions for men and women may differ
- The height distributions surely differ

Confounding

Age is a confounder for the gender comparison if
- the age distributions differ for men and women
- age has an effect on lung function (which it surely has)

Such a confounder may affect the original comparison.

Is height a confounder?

Height is
- an intermediate variable
- a mediator of the gender effect

Interpretation of gender difference, I

Estimate the typical difference in FEV1 for men and women.

Interpretation of gender difference, II

Estimate the difference in FEV1 for a man and a woman of equal heights.

If the situation is [Figure: diagram], this latter effect will be zero.

Example: Vitamin D

Two binary covariates:

$x_{i,1}$: body mass index (BMI) for the $i$th woman
- normal weight (18.5 < BMI < 25)
- overweight (BMI ≥ 25)
$x_{i,2}$: country for the $i$th woman (Ireland or Poland)
$y_i$: log10 of vitamin D status for the $i$th woman

Two-way anova, additive model

$E(y_i) = LP_i = a + b_1 I(x_{i,1} \geq 25) + b_2 I(x_{i,2} = \text{Ireland})$

                 Poland       Ireland            Difference
Normal weight    $a$          $a + b_2$          $b_2$
Overweight       $a + b_1$    $a + b_1 + b_2$    $b_2$
Difference       $b_1$        $b_1$              0

- The effect of overweight is $b_1$, for both countries
- The difference between countries is $b_2$, no matter body stature

The effects are mutually adjusted, and there is no interaction.

Vitamin D averages on log10 scale, with numbers of women in brackets

                 Poland        Ireland       Difference
Normal weight    1.598 (12)    1.720 (16)    0.121 (28)
Overweight       1.443 (53)    1.593 (25)    0.150 (78)
Difference       -0.155        -0.127        0.028

- Are the differences between countries the same no matter body stature?
- And vice versa?

Additivity looks quite reasonable, since the difference of differences (0.028) is small.

Graphical display of averages

[Figure: average log10(25OHD) values in the four (country by BMI) groups. The size of a bar reflects the sample size.]
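The additivity check in the table of averages amounts to a difference of differences. A minimal Python sketch using the four cell means quoted above; the variable names are illustrative and not part of the slides:

```python
# Cell means of log10 vitamin D, copied from the table of averages.
means = {
    ("Poland", "normal"): 1.598,
    ("Ireland", "normal"): 1.720,
    ("Poland", "overweight"): 1.443,
    ("Ireland", "overweight"): 1.593,
}

# Overweight vs. normal weight, within each country.
bmi_diff = {
    country: means[(country, "overweight")] - means[(country, "normal")]
    for country in ("Poland", "Ireland")
}

# Ireland vs. Poland, within each body stature group.
country_diff = {
    group: means[("Ireland", group)] - means[("Poland", group)]
    for group in ("normal", "overweight")
}

# Difference of differences: exactly zero under a perfectly additive pattern.
interaction = bmi_diff["Ireland"] - bmi_diff["Poland"]

print({k: round(v, 3) for k, v in bmi_diff.items()})
print({k: round(v, 3) for k, v in country_diff.items()})
print(round(interaction, 3))
```

Because the inputs are rounded to three decimals, individual differences may deviate from the tabulated ones in the last digit.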
Estimates from additive model

- $\hat{a} = 1.587$ (0.043)
- $\hat{b}_1 = -0.141$ (0.044), overweight vs. normal weight
- $\hat{b}_2 = 0.142$ (0.039), Ireland vs. Poland

$\hat{b}_1$ is an average of the two country-specific differences, weighted according to country size.
$\hat{b}_2$ is an average of the two stature-specific differences, weighted according to (stature) group size.

These are mutually adjusted estimates.

Estimated (predicted) mean values in the four groups

$\hat{E}(y_i) = \hat{a} + \hat{b}_1 I(x_{i,1} \geq 25) + \hat{b}_2 I(x_{i,2} = \text{Ireland})$

                 Poland    Ireland    Difference
Normal weight    1.587     1.729      0.142
Overweight       1.446     1.588      0.142
Difference       -0.141    -0.141     0

Compare to the crude averages from the previous table.

Adjusted vs. unadjusted estimates

              Overweight vs. normal weight    Ireland vs. Poland
Unadjusted    -0.177 (0.045)                  0.171 (0.040)
Adjusted      -0.141 (0.044)                  0.142 (0.039)

Unadjusted (marginal) estimates:
- Overweight vs. normal weight: too large, because the group of overweight women is dominated by Polish women (with low values)
- Ireland vs. Poland: too large in favour of Ireland, because Poland has so many overweight women

Confounding

BMI and COUNTRY are confounders for each other:
- Overweight gives small vitamin D levels
- Polish women have low vitamin D levels, and
- More Polish women are overweight

In this situation, marginal (unadjusted) estimates exaggerate the effects (they steal effect from the covariate not included in the model).

It may also go the other way....
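The gap between marginal and mutually adjusted estimates can be reproduced from the cell means and group sizes in the table of vitamin D averages. The sketch below is an illustration, not the authors' code. It relies on the fact that, for covariates that are constant within cells, least-squares coefficients from the individual data equal those from a count-weighted fit to the cell means. The helper `solve` is a hypothetical hand-written routine introduced only to keep the example free of external libraries.

```python
# (overweight indicator, Ireland indicator, number of women, mean log10 vitamin D)
cells = [
    (0, 0, 12, 1.598),  # Poland, normal weight
    (0, 1, 16, 1.720),  # Ireland, normal weight
    (1, 0, 53, 1.443),  # Poland, overweight
    (1, 1, 25, 1.593),  # Ireland, overweight
]

def weighted_mean(rows):
    """Count-weighted average of the cell means in rows."""
    total_n = sum(n for _, _, n, _ in rows)
    return sum(n * mean for _, _, n, mean in rows) / total_n

# Unadjusted (marginal) estimates: compare pooled groups, ignoring the other covariate.
unadj_bmi = (weighted_mean([c for c in cells if c[0] == 1])
             - weighted_mean([c for c in cells if c[0] == 0]))
unadj_country = (weighted_mean([c for c in cells if c[1] == 1])
                 - weighted_mean([c for c in cells if c[1] == 0]))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

# Adjusted estimates: weighted normal equations for a + b1*overweight + b2*Ireland.
design = [(1.0, float(ow), float(irl)) for ow, irl, _, _ in cells]
XtWX = [[sum(c[2] * x[i] * x[j] for x, c in zip(design, cells)) for j in range(3)]
        for i in range(3)]
XtWy = [sum(c[2] * x[i] * c[3] for x, c in zip(design, cells)) for i in range(3)]
a_hat, b1_hat, b2_hat = solve(XtWX, XtWy)

print(f"unadjusted: BMI {unadj_bmi:.3f}, country {unadj_country:.3f}")
print(f"adjusted:   a {a_hat:.3f}, BMI {b1_hat:.3f}, country {b2_hat:.3f}")
```

The cell means alone do not give the standard errors, which need the within-cell variation.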
Test of no effect in model with both covariates

- Effect of overweight:
  - Wald test: $(-0.141/0.044)^2 = 10.46 \sim \chi^2(1)$, P = 0.0012
  - t-test: $-0.141/0.044 = -3.20 \sim t(75)$, P = 0.0020
- Ireland vs. Poland:
  - Wald test: $(0.142/0.039)^2 = 12.88 \sim \chi^2(1)$, P = 0.0003
  - t-test: $0.142/0.039 = 3.62 \sim t(75)$, P = 0.0005

Clear evidence of effect of both covariates.

Back-transformation

Since we have analyzed logarithmic values, we have to make a back-transformation:

$\log_{10}(y^*) = y = a + b_1 x_1 + b_2 x_2 \;\Rightarrow\; y^* = a^* \, (b_1^*)^{x_1} \, (b_2^*)^{x_2}$

where $a^* = 10^{a}$, $b_1^* = 10^{b_1}$, $b_2^* = 10^{b_2}$.

The relation is not linear on the original scale: it is multiplicative.

Interpretation of estimates

$10^{\hat{b}_1} = 0.72$, $10^{\hat{b}_2} = 1.39$

- For a given country, median 25OHD values among overweight women are about 72% of those for normal weight women.
- For a given BMI group, Irish women have median 25OHD values about 39% larger than Polish women.

Confidence intervals: from $10^{\hat{b} - 1.96 \cdot SD(\hat{b})}$ to $10^{\hat{b} + 1.96 \cdot SD(\hat{b})}$; for the effect of BMI we get the interval (0.59, 0.88).

Model assumptions

- Additivity (no interaction)
  - later ....
- Equal SDs (variance homogeneity)
  - Plot residuals vs. predicted
  - Levene's test for variance homogeneity
- Normality (or at least symmetry)
  - Histogram of residuals
  - Quantile plot of residuals

Variance homogeneity

[Figure: residuals plotted against fitted values; x: Ireland, o: Poland.]
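The test statistics and back-transformed estimates for the additive vitamin D model can be recomputed from the quoted estimates and standard errors. This is a minimal sketch with illustrative function names. It starts from the rounded values on the slides, so the Wald and t statistics can differ slightly from the printed ones, which were presumably based on unrounded estimates.

```python
import math

# Estimates and standard errors on the log10 scale, as quoted on the slides.
estimates = {
    "overweight vs. normal weight": (-0.141, 0.044),
    "Ireland vs. Poland": (0.142, 0.039),
}

def chi2_1df_pvalue(x):
    """Upper tail probability of a chi-square variable with 1 degree of freedom.

    Uses P(Z**2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
    """
    return math.erfc(math.sqrt(x / 2.0))

def back_transform(b, se, z=1.96):
    """Ratio 10**b on the original scale with an approximate 95% interval."""
    return 10 ** b, 10 ** (b - z * se), 10 ** (b + z * se)

for label, (b, se) in estimates.items():
    t_stat = b / se
    wald = t_stat ** 2
    ratio, lower, upper = back_transform(b, se)
    print(f"{label}: t = {t_stat:.2f}, Wald = {wald:.2f}, "
          f"P(chi2, 1 df) = {chi2_1df_pvalue(wald):.4f}, "
          f"ratio = {ratio:.2f} ({lower:.2f}, {upper:.2f})")
```

The P-values of the t-tests need the t distribution, which is not in the Python standard library, so they are left out of this sketch.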
Note: in the residual plot, two of the four predicted values are almost identical.

Variance homogeneity, cont'd

SDs of log10 values, with numbers of women in brackets:

                 Poland        Ireland
Normal weight    0.126 (12)    0.164 (16)
Overweight       0.213 (53)    0.192 (25)

Levene's test of equal SDs yields P = 0.37
(for untransformed data: P = 0.027, i.e. rejection).

Now to: General categorical covariates

3 BMI groups, 4 countries

Average log10 vitamin D values (and numbers of women):

           Normal weight    Slight overweight    Obese
Denmark    1.692 (20)       1.545 (21)           1.603 (12)
Finland    1.664 (9)        1.665 (32)           1.562 (13)
Ireland    1.720 (16)       1.626 (16)           1.534 (9)
Poland     1.598 (12)       1.393 (25)           1.488 (28)

Graphical display of averages

[Figure: average log10 vitamin D by BMI group, one profile per country.]

Parallel profiles? Additivity?

Model

$x_{i,1}$: body mass index (BMI) for the $i$th woman
- normal weight (18.5 < BMI < 25)
- slight overweight (25 ≤ BMI < 30)
- obese (BMI ≥ 30)
$x_{i,2}$: country for the $i$th woman (all four)
$y_i$: log10 of vitamin D for the $i$th woman

Additive model (two-way anova):

$E(y_i) = a + b_{1,1} I(25 \leq x_{i,1} < 30) + b_{1,2} I(30 \leq x_{i,1}) + b_{2,1} I(x_{i,2} = \text{Denmark}) + b_{2,2} I(x_{i,2} = \text{Finland}) + b_{2,3} I(x_{i,2} = \text{Ireland})$

Adjusted and unadjusted estimates

Parameter                                           Adjusted estimate (SD)    Unadjusted estimate (SD)
$b_{1,1}$: slight overweight vs. normal weight      -0.116 (0.036)            -0.116 (0.037)
$b_{1,2}$: obese vs. normal weight                  -0.113 (0.040)            -0.143 (0.040)
$b_{2,1}$: Denmark vs. Poland                       0.120 (0.040)             0.142 (0.040)
$b_{2,2}$: Finland vs. Poland                       0.171 (0.039)             0.168 (0.040)
$b_{2,3}$: Ireland vs. Poland                       0.147 (0.043)             0.171 (0.043)

Test of no effect

Omnibus test for no effect of
- Body stature:
  - Likelihood ratio: $11.67 \sim \chi^2(2)$, P = 0.003
  - F-test: $5.83 \sim F(2, 207)$, P = 0.003
- Country:
  - Likelihood ratio: $21.78 \sim \chi^2(3)$, P < 0.0001
  - F-test: $7.43 \sim F(3, 207)$, P < 0.0001

Clear evidence of effect of both covariates.

BMI as a quantitative covariate, Ancova

1. One binary covariate ($x_2$: country, Ireland and Poland)
2. One quantitative covariate ($x_1$: BMI), assumed to have a linear effect on the mean value of the outcome

Additive model:

$E(y_i) = a + b_1 (x_{i,1} - 25) + b_2 I(x_{i,2} = \text{Ireland})$

Estimates:
- $\hat{b}_1 = -0.0152$ (0.0045), slope of BMI effect
- $\hat{b}_2 = 0.131$ (0.040), Ireland vs. Poland
- $\hat{a} = 1.532$ (0.030), Polish women with BMI = 25

Estimated relation: Parallel lines

[Figure: fitted parallel regression lines; Ireland (dots) and Poland (circles).]

Adjusted estimates of "Ireland vs Poland"

Ireland and Poland differ a little in BMI:

$\bar{x}_{1,Irl} = 26.36$, $\bar{x}_{1,Pol} = 28.94$

Therefore, unadjusted estimates of vitamin D level (marginal averages) will not be directly comparable.

Least squares means: adjust to the overall BMI average (27.94):

$\bar{y}^{adj}_{Irl} = 1.643 + (-0.0152)(27.94 - 26.36) = 1.618$
$\bar{y}^{adj}_{Pol} = 1.472 + (-0.0152)(27.94 - 28.94) = 1.487$

Difference: $\bar{y}^{adj}_{Irl} - \bar{y}^{adj}_{Pol} = 1.618 - 1.487 = 0.131$

Illustration of adjusted country effect

[Figure: illustration of the adjusted country effect.]

Adjusted vs. unadjusted effect estimates

              BMI                  Ireland vs. Poland
Unadjusted    -0.0195 (0.0045)     0.171 (0.040)
Adjusted      -0.0152 (0.0045)     0.131 (0.040)

Unadjusted estimates:
- Effect of BMI: too large, because women with high BMI are predominantly from Poland
- Ireland vs. Poland: too large, because Poland has so many women with high BMI

Interpretation of effects, I

Since we have analyzed logarithmic values, we have to make a back-transformation:

Ireland vs. Poland: $10^{0.131} = 1.35$

Irish women have a 35% higher level of vitamin D, compared to Polish women with the same BMI.

Confidence limits: $(10^{0.131 - 1.96 \cdot 0.040},\; 10^{0.131 + 1.96 \cdot 0.040}) = (1.13, 1.62)$

Interpretation of effects, II

Effect of a difference of 5 kg/m² in BMI: $10^{-0.0152 \cdot 5} = 0.84$

A woman with a BMI 5 kg/m² higher than another woman from the same country will have a 16% lower level of vitamin D.

Confidence limits: $(10^{5(-0.0152 - 1.96 \cdot 0.0045)},\; 10^{5(-0.0152 + 1.96 \cdot 0.0045)}) = (0.76, 0.93)$

Model assumptions

- No interaction between BMI and country, i.e. additivity, with parallel regression lines
  - later ...
- Linear effect of BMI on log10(25OHD)
  - Plot residuals vs. covariate (BMI)
  - Test quadratic effect, or linear spline
- Equal SDs (variance homogeneity)
  - Plot residuals vs. predicted
  - F-test for equality between groups
- Normality (or at least symmetry)
  - Histogram of residuals
  - Quantile plot of residuals

Linear effect of BMI?

[Figure: smooth curves of log10(25OHD) against BMI for each country.]

Additivity

Are the slopes reasonably equal?
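The least squares means and the back-transformed BMI effect from the Ancova slides can be checked numerically. The sketch below is illustrative only, with hypothetical function names, and uses the rounded numbers quoted above, so results agree with the slides only up to rounding in the last digit.

```python
# Values quoted on the slides (log10 scale for vitamin D).
slope_bmi = -0.0152   # change in log10(vitamin D) per BMI unit
se_slope = 0.0045     # standard error of the slope
crude_mean = {"Ireland": 1.643, "Poland": 1.472}
mean_bmi = {"Ireland": 26.36, "Poland": 28.94}
overall_bmi = 27.94

def ls_mean(country):
    """Move the crude country mean along the common slope to the overall BMI average."""
    return crude_mean[country] + slope_bmi * (overall_bmi - mean_bmi[country])

adjusted = {country: ls_mean(country) for country in crude_mean}
adjusted_diff = adjusted["Ireland"] - adjusted["Poland"]

def ratio_per_bmi_units(units, z=1.96):
    """Multiplicative effect of a BMI difference of `units`, with approximate 95% limits.

    The whole interval on the log10 scale is multiplied by `units` before exponentiating.
    """
    point = 10 ** (units * slope_bmi)
    lower = 10 ** (units * (slope_bmi - z * se_slope))
    upper = 10 ** (units * (slope_bmi + z * se_slope))
    return point, lower, upper

print({country: round(value, 3) for country, value in adjusted.items()})
print(round(adjusted_diff, 3))
print(tuple(round(v, 2) for v in ratio_per_bmi_units(5)))
```

Calling `ratio_per_bmi_units(1)` gives the effect per single BMI unit. The factor 5 has to scale the confidence limits as well as the point estimate.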
[Figure: separate regression lines for each country.]

Test of non-linearity using a linear spline, with cutpoints at 25 and 30

Country    BMI                  (BMI-25)·I(BMI ≥ 25)    (BMI-30)·I(BMI ≥ 30)    P for linearity
Poland     -0.0422 (0.0301)     0.0393 (0.0412)         -0.0033 (0.0248)        0.53
Ireland    -0.0221 (0.0201)     -0.0084 (0.0348)        0.0203 (0.0430)         0.89

Predicted values for linear spline model

[Figure: predicted values from the linear spline model; blue: Ireland, red: Poland.]

Variance homogeneity

[Figure: residuals vs. fitted values.]

Pattern?

Test for identical variances

Do we see the same residual variation in the two countries?

$F = \dfrac{s^2_{Pol}}{s^2_{Irl}} = \dfrac{0.205^2}{0.166^2} = 1.53 \sim F(63, 39)$

resulting in P = 0.16.

Example: The birthweight study

For 107 babies, we have ultrasound measurements of
- abdominal diameter (AD)
- biparietal diameter (BPD)
shortly before birth.

Purpose of this study: describe the relationship between birthweight and these two ultrasound measurements, with the aim of predicting birthweight.

Suggested model

Because of the "geometry" of the problem (volume of a child), AD and BPD are likely to affect birthweight multiplicatively:

$BW \approx c_0 \cdot AD^{b_1} \cdot BPD^{b_2}$

In order to work with a linear model, we therefore make a logarithmic transformation of all variables:

$y_i = \log_{10}(BW_i)$, $x_{i,1} = \log_{10}(AD_i)$, $x_{i,2} = \log_{10}(BPD_i)$.
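A small numerical check shows why the logarithmic transformation turns the multiplicative birthweight model into a linear one. All parameter values and measurements in this sketch are hypothetical and chosen only for illustration; they are not estimates from the study.

```python
import math

# Hypothetical parameter values, for illustration only.
c0, b1, b2 = 0.003, 1.5, 1.5

def birthweight(ad, bpd):
    """Multiplicative model BW = c0 * AD**b1 * BPD**b2."""
    return c0 * ad ** b1 * bpd ** b2

def log10_birthweight(ad, bpd):
    """The same model on the log10 scale: linear in log10(AD) and log10(BPD)."""
    return math.log10(c0) + b1 * math.log10(ad) + b2 * math.log10(bpd)

ad, bpd = 100.0, 90.0  # hypothetical measurements

# Both routes give the same value of log10(BW).
print(math.log10(birthweight(ad, bpd)), log10_birthweight(ad, bpd))

# A 10% larger AD multiplies BW by 1.1**b1, whatever the starting level.
ratio_small = birthweight(1.1 * ad, bpd) / birthweight(ad, bpd)
ratio_large = birthweight(1.1 * 2 * ad, bpd) / birthweight(2 * ad, bpd)
print(ratio_small, ratio_large, 1.1 ** b1)
```

The intercept of the linear model corresponds to $\log_{10}(c_0)$, and the slopes are the exponents $b_1$ and $b_2$.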
Marginal relation to AD

$E(y_i) = -1.062 + 2.237\, x_{i,1}$

[Figure: log10(BW) plotted against log10(AD) with fitted line.]

Marginal relation to BPD

$E(y_i) = -3.077 + 3.332\, x_{i,2}$

[Figure: log10(BW) plotted against log10(BPD) with fitted line.]

Two linear effects in the same model

$E(y_i) = LP_i = a + b_1 x_{i,1} + b_2 x_{i,2}$

describes a plane.

Estimation

Fit a plane to the data by minimizing the residual sum of squares

$\sum_i \left( y_i - (a + b_1 x_{i,1} + b_2 x_{i,2}) \right)^2$

For the birthweight example the fitted plane is

$\widehat{LP}_i = -2.546 + 1.467\, x_{i,1} + 1.552\, x_{i,2}$

with a residual standard deviation of 0.0464.

Observations and fitted plane

[Figure: observations and fitted plane.]

Adjusted vs unadjusted estimates

Model                   $\hat{b}$ for log10(AD) (SD)    $\hat{b}$ for log10(BPD) (SD)    Residual SD
Mutually adjusted       1.467 (0.147)                   1.552 (0.229)                    0.0464
Unadjusted, only AD     2.237 (0.111)                   -                                0.0554
Unadjusted, only BPD    -                               3.332 (0.202)                    0.0646

Relation between the two covariates

[Figure: log10(BPD) plotted against log10(AD).]

Relations:
log10bpd = 3.176 + 0.496 * log10ad
log10ad = -1.205 + 1.214 * log10bpd

Confounding

The two explanatory variables are
- strongly related to each other
- both strongly related to the outcome

Therefore, the effect of, say, $x_{i,1}$ depends on whether we adjust for $x_{i,2}$ or not, and vice versa, and the interpretation changes!

This is not to be confused with interaction (later....)

Interpretations

Marginal (unadjusted) effect of e.g. a 10% increase in BPD: $1.1^{3.332} = 1.37$

A 10% difference in BPD corresponds to a 37% difference in expected birthweight.

The corresponding adjusted estimate is $1.1^{1.552} = 1.16$

Two fetuses with identical AD but with a 10% difference in BPD will differ by 16% in expected birthweight.

But: when BPD differs by 10%, AD will typically also differ, typically by a little more than 10% ($1.1^{1.214} = 1.12$), giving rise to a factor on birthweight of

$1.1^{1.552} \cdot 1.12^{1.467} = 1.37$

i.e. precisely the unadjusted effect.

Residual plots for linearity

[Figure: residual plots for linearity.]

Residual plots for variance homogeneity and normality

[Figure: residual plots for variance homogeneity and normality.]

Test for linearity by including quadratic effects for log10(AD) and log10(BPD)

log10(AD)          log10(BPD)           (log10(AD))^2             (log10(BPD))^2
1.467 (0.147)      1.552 (0.229)
-5.688 (5.098)     1.699 (0.251)        1.785 (1.271), P=0.16
1.418 (0.146)      -18.933 (9.852)                                5.354 (2.574), P=0.04

Problem with linearity in log10(BPD), mostly due to the fetus with the lowest BPD.

Diagnostics

[Figure: Cook's distance for the model with quadratic effect of log10(BPD).]

- The fetus with the smallest BPD has a large influence
- Omitting this fetus, the quadratic effect of BPD is no longer important
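Returning to the adjusted and unadjusted birthweight effects: in a least-squares fit, the marginal slope of one covariate equals its adjusted slope plus the other covariate's adjusted slope times the slope linking the two covariates. The sketch below checks this with the rounded slopes quoted on the slides. The variable and function names are illustrative.

```python
# Slopes quoted on the slides (all variables on the log10 scale).
adj_ad, adj_bpd = 1.467, 1.552      # mutually adjusted effects
marg_ad, marg_bpd = 2.237, 3.332    # unadjusted (marginal) effects
bpd_on_ad = 0.496                   # slope of log10(BPD) regressed on log10(AD)
ad_on_bpd = 1.214                   # slope of log10(AD) regressed on log10(BPD)

# Marginal slope = own adjusted slope + other adjusted slope * linking slope.
implied_marg_bpd = adj_bpd + adj_ad * ad_on_bpd
implied_marg_ad = adj_ad + adj_bpd * bpd_on_ad

def factor(increase, slope):
    """Multiplicative change in the outcome for a relative covariate increase."""
    return (1.0 + increase) ** slope

unadjusted_10pct = factor(0.10, marg_bpd)   # BPD 10% larger, AD free to vary
adjusted_10pct = factor(0.10, adj_bpd)      # BPD 10% larger, AD held fixed
ad_change = factor(0.10, ad_on_bpd)         # typical factor on AD when BPD is 10% larger
combined = adjusted_10pct * ad_change ** adj_ad

print(round(implied_marg_bpd, 3), round(implied_marg_ad, 3))
print(round(unadjusted_10pct, 2), round(adjusted_10pct, 2),
      round(ad_change, 2), round(combined, 2))
```

The implied marginal slopes match the quoted ones up to rounding, which is why the combined factor reproduces the unadjusted effect.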