As we saw with linear models and logistic models, it's possible to combine the fixed effects and random effects approaches to get some of the virtues of each. As before, the first step is to calculate the mean of each time-varying predictor variable for each individual, and then calculate the deviations from those means:

    PROC SORT DATA=patents2;
      BY id;
    PROC MEANS DATA=patents2 NWAY NOPRINT;
      CLASS id;
      VAR rd_0-rd_5;
      OUTPUT OUT=means MEAN=mrd_0 mrd_1 mrd_2 mrd_3 mrd_4 mrd_5;
    DATA patcomb;
      MERGE patents2 means;
      BY id;
      drd_0=rd_0-mrd_0;
      drd_1=rd_1-mrd_1;
      drd_2=rd_2-mrd_2;
      drd_3=rd_3-mrd_3;
      drd_4=rd_4-mrd_4;
      drd_5=rd_5-mrd_5;
    RUN;

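A quick check on the constructed variables can catch merge problems before any model is fitted. The following sketch is not part of the original program: it verifies that each deviation variable averages to zero within every value of id. The data set name devcheck and the variable names cd_0 through cd_5 are hypothetical.

```sas
/* Within-id means of the deviation variables; each should be zero
   apart from floating-point rounding. */
PROC MEANS DATA=patcomb NWAY NOPRINT;
  CLASS id;
  VAR drd_0-drd_5;
  OUTPUT OUT=devcheck MEAN=cd_0 cd_1 cd_2 cd_3 cd_4 cd_5;
RUN;

/* Summarize across ids: the minimum and maximum of every cd_ variable
   should both be essentially zero. */
PROC MEANS DATA=devcheck MIN MAX;
  VAR cd_0-cd_5;
RUN;
```

If any minimum or maximum is far from zero, the most likely cause is a mismatch between the sort order of patents2 and the BY variable used in the merge.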
The next step is to run a regression model with both the deviations and the means as predictor variables. To do this correctly, it's important to use an estimation method that allows for dependence among the multiple observations for each individual. The simplest approach is GEE with PROC GENMOD:

    PROC GENMOD DATA=patcomb;
      CLASS id t;
      MODEL patent = drd_0-drd_5 mrd_0-mrd_5 t / DIST=NB;
      REPEATED SUBJECT=id / TYPE=MDEP(4) CORRW;
      /* Each row tests one deviation coefficient against its mean coefficient */
      CONTRAST 'FE VS. RE' drd_0 1 mrd_0 -1,
                           drd_1 1 mrd_1 -1,
                           drd_2 1 mrd_2 -1,
                           drd_3 1 mrd_3 -1,
                           drd_4 1 mrd_4 -1,
                           drd_5 1 mrd_5 -1;
    RUN;

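Here I've specified a negative binomial distribution with an MDEP(4) working correlation structure, which lets the correlation between two observations on the same individual depend on the number of periods separating them, up to four. The CONTRAST statement produces a chi-square test of the null hypothesis that each deviation coefficient is equal to the corresponding mean coefficient.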
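In equation form, the hybrid model and the hypothesis under test can be sketched as follows. The notation is introduced here for illustration and is not taken from the text: $x_{kit}$ stands for rd_k for individual $i$ at time $t$, $\bar{x}_{ki}$ for its individual-specific mean (mrd_k), $\tau_t$ for the time effects, and the log link is the GENMOD default for the negative binomial distribution.

```latex
\log E(y_{it}) = \mu + \tau_t
  + \sum_{k=0}^{5} \beta_k \,(x_{kit} - \bar{x}_{ki})
  + \sum_{k=0}^{5} \gamma_k \,\bar{x}_{ki}

H_0:\ \beta_k = \gamma_k \quad \text{for } k = 0, 1, \dots, 5
```

The null hypothesis consists of six equalities tested jointly, which is why the contrast has 6 degrees of freedom.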
Results are in Output 4.16. The coefficients for the deviation variables can be interpreted as if they were fixed effects estimates, in the sense that they control for all stable covariates. In fact, they are quite close to the fixed effects coefficients of the R&D variables in Output 4.12. However, the GEE standard error estimates are somewhat larger than those from the unconditional fixed effects method. The coefficients for the means are generally quite different from the deviation coefficients, although none of the mean coefficients is statistically significant. Under the GEE and random effects models of the last section, the deviation and mean coefficients should be the same. The chi-square test of that assumption, reported at the end of the output, gives marginal evidence for rejection (chi-square = 12.99 with 6 d.f., p = .043).
Analysis Of GEE Parameter Estimates (Empirical Standard Error Estimates)

Parameter | Level | Estimate | Standard Error | 95% Lower Limit | 95% Upper Limit | Z | Pr > \|Z\| |
---|---|---|---|---|---|---|---|
Intercept | | 1.0944 | 0.0876 | 0.9227 | 1.2661 | 12.49 | <.0001 |
drd_0 | | 0.3597 | 0.1223 | 0.1201 | 0.5994 | 2.94 | 0.0033 |
drd_1 | | −0.1158 | 0.1177 | −0.3465 | 0.1149 | −0.98 | 0.3252 |
drd_2 | | 0.0529 | 0.0819 | −0.1076 | 0.2134 | 0.65 | 0.5183 |
drd_3 | | −0.0287 | 0.0887 | −0.2024 | 0.1451 | −0.32 | 0.7465 |
drd_4 | | 0.0273 | 0.0852 | −0.1397 | 0.1943 | 0.32 | 0.7485 |
drd_5 | | −0.0775 | 0.0781 | −0.2305 | 0.0755 | −0.99 | 0.3209 |
mrd_0 | | −0.0490 | 0.8545 | −1.7239 | 1.6258 | −0.06 | 0.9542 |
mrd_1 | | 1.0590 | 1.8618 | −2.5902 | 4.7081 | 0.57 | 0.5695 |
mrd_2 | | −0.9196 | 1.9653 | −4.7714 | 2.9322 | −0.47 | 0.6398 |
mrd_3 | | −0.3526 | 1.5650 | −3.4200 | 2.7148 | −0.23 | 0.8218 |
mrd_4 | | 1.3779 | 1.1420 | −0.8603 | 3.6162 | 1.21 | 0.2276 |
mrd_5 | | −0.2428 | 0.4805 | −1.1846 | 0.6990 | −0.51 | 0.6134 |
t | 1 | 0.1924 | 0.0504 | 0.0935 | 0.2912 | 3.81 | 0.0001 |
t | 2 | 0.1428 | 0.0492 | 0.0464 | 0.2391 | 2.90 | 0.0037 |
t | 3 | 0.1537 | 0.0392 | 0.0769 | 0.2305 | 3.92 | <.0001 |
t | 4 | 0.1019 | 0.0411 | 0.0214 | 0.1823 | 2.48 | 0.0131 |
t | 5 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | . |

Contrast Results for GEE Analysis

Contrast | DF | Chi-Square | Pr > ChiSq | Type |
---|---|---|---|---|
FE VS. RE | 6 | 12.99 | 0.0432 | Score |

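The p-value for the contrast can be checked by hand from the reported chi-square and degrees of freedom. The short DATA step below is an illustrative sketch, not part of the original program; it uses the SAS PROBCHI function, which returns the cumulative chi-square probability. The variable names chisq, df, and p are arbitrary.

```sas
/* Upper-tail probability for a chi-square statistic of 12.99 on 6 DF. */
DATA _NULL_;
  chisq = 12.99;
  df = 6;
  p = 1 - PROBCHI(chisq, df);   /* PROBCHI gives the lower tail */
  PUT 'Pr > ChiSq = ' p 6.4;
RUN;
```

The value written to the log should be about 0.043, agreeing with the reported p-value up to rounding of the test statistic.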
The alternative to GEE is to implement the hybrid method in the context of a random effects model. Here's the PROC NLMIXED code for doing that, with results in Output 4.17:

    PROC NLMIXED DATA=patcomb;
      lambda=EXP(int+d0*drd_0+d1*drd_1+d2*drd_2+d3*drd_3+d4*drd_4+d5*drd_5
                 +m0*mrd_0+m1*mrd_1+m2*mrd_2+m3*mrd_3+m4*mrd_4+m5*mrd_5
                 +t1*(t EQ 1)+t2*(t EQ 2)+t3*(t EQ 3)+t4*(t EQ 4)+alpha);
      MODEL patent~NEGBIN(theta,(theta/(lambda+theta)));
      RANDOM alpha~NORMAL(0,s2) SUBJECT=id;
      PARMS int=1 d0=0 d1=0 d2=0 d3=0 d4=0 d5=0
            m0=0 m1=0 m2=0 m3=0 m4=0 m5=0
            t1=0 t2=0 t3=0 t4=0 s2=1 theta=1;
    RUN;

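The second argument of NEGBIN in the MODEL statement may look odd, so a short derivation may help. PROC NLMIXED parameterizes NEGBIN(n, p) so that the mean is $n(1-p)/p$ and the variance is $n(1-p)/p^2$. Substituting $n=\theta$ and $p=\theta/(\lambda+\theta)$, as in the code above, gives a conditional mean of $\lambda$ with overdispersion governed by $1/\theta$:

```latex
E(y \mid \alpha) = \frac{\theta\,(1-p)}{p}
  = \theta \cdot \frac{\lambda/(\lambda+\theta)}{\theta/(\lambda+\theta)}
  = \lambda

\operatorname{Var}(y \mid \alpha) = \frac{\theta\,(1-p)}{p^{2}}
  = \frac{\lambda\,(\lambda+\theta)}{\theta}
  = \lambda + \frac{\lambda^{2}}{\theta}
```

As $\theta$ grows large, the conditional variance approaches $\lambda$, the Poisson case.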
The coefficients for the deviation scores, along with their standard errors, are remarkably close to the fixed effects estimates in Output 4.12. This suggests that the random effects hybrid approach may be superior to GEE in replicating fixed effects results (as we found for logistic models in chapter 3). But beware: while PROC GENMOD took about a fifth of a second to run the GEE model, PROC NLMIXED took seven minutes to estimate the random effects model. For both of these methods I could also have included time-invariant variables like SCIENCE. I did not do so in order to maximize comparability with the fixed effects results.
Fit Statistics

Statistic | Value |
---|---|
−2 Log Likelihood | 9671.0 |
AIC (smaller is better) | 9709.0 |
AICC (smaller is better) | 9709.5 |
BIC (smaller is better) | 9780.9 |

Parameter Estimates

Parameter | Estimate | Standard Error | DF | t Value | Pr > \|t\| | Alpha | Lower | Upper | Gradient |
---|---|---|---|---|---|---|---|---|---|
int | 0.6940 | 0.07057 | 323 | 9.83 | <.0001 | 0.05 | 0.5552 | 0.8328 | 0.009595 |
d0 | 0.3745 | 0.06874 | 323 | 5.45 | <.0001 | 0.05 | 0.2393 | 0.5097 | −0.00483 |
d1 | −0.08268 | 0.07341 | 323 | −1.13 | 0.2609 | 0.05 | −0.2271 | 0.06175 | −0.00721 |
d2 | 0.05900 | 0.06976 | 323 | 0.85 | 0.3984 | 0.05 | −0.07825 | 0.1962 | 0.002148 |
d3 | 0.009520 | 0.06492 | 323 | 0.15 | 0.8835 | 0.05 | −0.1182 | 0.1372 | 0.00983 |
d4 | 0.04112 | 0.06181 | 323 | 0.67 | 0.5063 | 0.05 | −0.08047 | 0.1627 | 0.005421 |
d5 | −0.00224 | 0.05033 | 323 | −0.04 | 0.9646 | 0.05 | −0.1012 | 0.09677 | −0.00302 |
m0 | −0.1543 | 0.7641 | 323 | −0.20 | 0.8401 | 0.05 | −1.6576 | 1.3490 | −0.02112 |
m1 | 2.0226 | 1.5748 | 323 | 1.28 | 0.2000 | 0.05 | −1.0756 | 5.1208 | 0.011452 |
m2 | −2.1181 | 1.7480 | 323 | −1.21 | 0.2265 | 0.05 | −5.5570 | 1.3208 | 0.07705 |
m3 | −0.2077 | 1.5771 | 323 | −0.13 | 0.8953 | 0.05 | −3.3104 | 2.8951 | −0.11503 |
m4 | 1.7695 | 1.2742 | 323 | 1.39 | 0.1659 | 0.05 | −0.7373 | 4.2763 | 0.06146 |
m5 | −0.4168 | 0.5980 | 323 | −0.70 | 0.4864 | 0.05 | −1.5933 | 0.7598 | −0.00998 |
t1 | 0.2285 | 0.02864 | 323 | 7.98 | <.0001 | 0.05 | 0.1722 | 0.2849 | 0.006028 |
t2 | 0.1791 | 0.02824 | 323 | 6.34 | <.0001 | 0.05 | 0.1236 | 0.2347 | 0.00154 |
t3 | 0.1739 | 0.02745 | 323 | 6.34 | <.0001 | 0.05 | 0.1199 | 0.2279 | −0.00316 |
t4 | 0.06997 | 0.02659 | 323 | 2.63 | 0.0089 | 0.05 | 0.01765 | 0.1223 | 0.009996 |
s2 | 0.7530 | 0.06758 | 323 | 11.14 | <.0001 | 0.05 | 0.6201 | 0.8860 | 0.004639 |
theta | 32.1344 | 3.3315 | 323 | 9.65 | <.0001 | 0.05 | 25.5802 | 38.6886 | 0.000046 |
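As noted above, time-invariant variables such as SCIENCE could have been included in either hybrid model. A minimal sketch of the GEE version follows. It assumes that SCIENCE is a numeric variable in patents2 (and is therefore carried into patcomb), and it omits the CONTRAST statement for brevity. No results are shown because this model was not estimated in the text.

```sas
/* Hybrid GEE model with a time-invariant predictor added.
   Assumption: science is a numeric variable in patcomb. */
PROC GENMOD DATA=patcomb;
  CLASS id t;
  MODEL patent = drd_0-drd_5 mrd_0-mrd_5 science t / DIST=NB;
  REPEATED SUBJECT=id / TYPE=MDEP(4) CORRW;
RUN;
```

Because SCIENCE does not vary within individuals, it has no deviation counterpart, and its coefficient does not control for unobserved stable covariates in the way the deviation coefficients do.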