As we saw in chapters 2 and 3, random effects models and GEE estimation are widely used alternatives to fixed effects methods for longitudinal data. Both methods can be applied to count data and are readily available in SAS. The principal attractions of these alternative methods are (1) the ability to estimate effects for time-invariant covariates, and (2) more efficient use of the data (if the assumptions are met). The major disadvantage is that neither method controls for unmeasured time-invariant covariates. I'll briefly describe these methods in this section, both to serve as a point of comparison with the fixed effects methods and because they will be needed for the hybrid method discussed in the next section.
As we've seen before, GEE is a form of iterated generalized least squares that allows for correlations among the repeated observations for each individual. GEE is easily invoked with the REPEATED statement in PROC GENMOD, and can be used with either a negative binomial model or a Poisson model. Here's the SAS code for GEE estimation of a negative binomial model for the patent data, with separate records for each firm-year:
    PROC GENMOD DATA=patents2;
      CLASS id t;
      MODEL patent = rd_0-rd_5 t / D=NB;
      REPEATED SUBJECT=id / TYPE=MDEP(4) CORRW;
    RUN;
The TYPE=MDEP(4) option specifies that the correlation matrix for patent counts among the five years of observation has a "banded" structure. There is one correlation for counts that are one year apart, another correlation for counts that are two years apart, and so on. The correlation for counts more than four years apart is set to 0 (hence the 4 in MDEP(4)), but four years is the maximum distance for these data anyway. This imposed structure can be seen in the estimated "Working Correlation Matrix," requested with the CORRW option and shown in Output 4.13. I also tried other correlation structures, but TYPE=UN (for unstructured) could not be fitted with these data. TYPE=EXCH (for exchangeable) specifies that all the inter-year correlations are identical. Although this specification yielded similar results, it seems unnecessarily restrictive.
Working Correlation Matrix

| | Col1 | Col2 | Col3 | Col4 | Col5 |
|---|---|---|---|---|---|
| Row1 | 1.0000 | 0.7567 | 0.7349 | 0.6655 | 0.6909 |
| Row2 | 0.7567 | 1.0000 | 0.7567 | 0.7349 | 0.6655 |
| Row3 | 0.7349 | 0.7567 | 1.0000 | 0.7567 | 0.7349 |
| Row4 | 0.6655 | 0.7349 | 0.7567 | 1.0000 | 0.7567 |
| Row5 | 0.6909 | 0.6655 | 0.7349 | 0.7567 | 1.0000 |
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

Parameter | | Estimate | Standard Error | 95% Lower Limit | 95% Upper Limit | Z | Pr > \|Z\| |
---|---|---|---|---|---|---|---|
Intercept | | 1.0839 | 0.0884 | 0.9106 | 1.2572 | 12.26 | <.0001 |
rd_0 | | 0.4969 | 0.1131 | 0.2752 | 0.7186 | 4.39 | <.0001 |
rd_1 | | −0.0451 | 0.1162 | −0.2728 | 0.1826 | −0.39 | 0.6977 |
rd_2 | | 0.1613 | 0.0855 | −0.0063 | 0.3289 | 1.89 | 0.0593 |
rd_3 | | 0.0729 | 0.0944 | −0.1121 | 0.2579 | 0.77 | 0.4401 |
rd_4 | | 0.1380 | 0.0735 | −0.0061 | 0.2821 | 1.88 | 0.0605 |
rd_5 | | 0.0247 | 0.0544 | −0.0818 | 0.1313 | 0.45 | 0.6492 |
t | 1 | 0.2326 | 0.0497 | 0.1351 | 0.3301 | 4.68 | <.0001 |
t | 2 | 0.1825 | 0.0465 | 0.0914 | 0.2736 | 3.93 | <.0001 |
t | 3 | 0.1855 | 0.0383 | 0.1104 | 0.2606 | 4.84 | <.0001 |
t | 4 | 0.1169 | 0.0403 | 0.0380 | 0.1958 | 2.90 | 0.0037 |
t | 5 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | . | . |
Parameter estimates in Output 4.13 are roughly similar to those in Output 4.12 for the fixed effects negative binomial model. But unlike the fixed effects method, two of the lagged R & D measures have GEE coefficients that approach statistical significance. Interestingly, the standard errors for the GEE estimates are generally larger than those for the fixed effects method, which is the opposite of what would ordinarily be expected.
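For readers who want to try the alternatives mentioned above, here is a minimal sketch of the same GEE setup with a Poisson model and an exchangeable working correlation. It changes only the D= and TYPE= options of the program shown earlier. No output for this variant is reproduced here, so treat it as a starting point rather than as a fitted result.

```sas
/* Sketch: GEE for the patent data with a Poisson model (D=POISSON)
   and an exchangeable working correlation (TYPE=EXCH).
   Data set and variable names are those of the program above. */
PROC GENMOD DATA=patents2;
  CLASS id t;
  MODEL patent = rd_0-rd_5 t / D=POISSON;
  REPEATED SUBJECT=id / TYPE=EXCH CORRW;
RUN;
```

With TYPE=EXCH, the working correlation matrix printed by CORRW should show a single common value in every off-diagonal cell.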
Random effects models can be fitted with PROC NLMIXED for either the Poisson or negative binomial distributions. Let's first consider a Poisson model. As before, we begin by assuming that $y_{it}$ has a Poisson distribution with expected value $\lambda_{it}$. As with the fixed effects model, we then assume that $\log \lambda_{it} = \mu_t + \beta x_{it} + \alpha_i$. Now, however, instead of treating $\alpha_i$ as a set of fixed constants, we assume that it is a random variable, normally distributed with a mean of 0 and a variance $\sigma^2$. We also assume that $\alpha_i$ is independent of all measured variables in the model, and that the $y_{it}$ terms are independent of each other, conditional on $\alpha_i$. Under these assumptions, NLMIXED produces maximum likelihood estimates of all parameters. Here's the code for the patent data:
    PROC NLMIXED DATA=patents2;
      lambda = EXP(int + brd0*rd_0 + brd1*rd_1 + brd2*rd_2 + brd3*rd_3
                   + brd4*rd_4 + brd5*rd_5
                   + d1*(t EQ 1) + d2*(t EQ 2) + d3*(t EQ 3) + d4*(t EQ 4)
                   + alpha);
      MODEL patent ~ POISSON(lambda);
      RANDOM alpha ~ NORMAL(0, s2) SUBJECT=id;
      PARMS int=1 brd0=0 brd1=0 brd2=0 brd3=0 brd4=0 brd5=0
            d1=0 d2=0 d3=0 d4=0 s2=1;
    RUN;
The statement that begins with LAMBDA defines the expected patent count as a function of the explanatory variables. Note the inclusion of ALPHA, which is the random, firm-level effect. The MODEL statement says that patent counts have a Poisson distribution with parameter LAMBDA. The RANDOM statement declares that ALPHA has a normal distribution with a mean of 0 and variance of S2. This variance is assumed to be constant across firms and across time. Alternatively, it could be written as a function of other variables simply by including another assignment equation similar to the one for LAMBDA.
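To illustrate that last point, here is a hedged sketch in which the random effect variance depends on a firm-level covariate. The variable z and the parameters g0 and g1 are hypothetical names introduced only for this illustration; they are not part of the patent data program above. The sketch also assumes that z is constant within each firm.

```sas
/* Sketch only: z is a HYPOTHETICAL firm-level covariate, and g0 and g1
   are HYPOTHETICAL parameter names, introduced for illustration. */
PROC NLMIXED DATA=patents2;
  lambda = EXP(int + brd0*rd_0 + brd1*rd_1 + brd2*rd_2 + brd3*rd_3
               + brd4*rd_4 + brd5*rd_5
               + d1*(t EQ 1) + d2*(t EQ 2) + d3*(t EQ 3) + d4*(t EQ 4)
               + alpha);
  /* The log-linear form keeps the variance positive for any g0 and g1 */
  s2 = EXP(g0 + g1*z);
  MODEL patent ~ POISSON(lambda);
  RANDOM alpha ~ NORMAL(0, s2) SUBJECT=id;
  /* s2 is now computed, so g0 and g1 replace it in the PARMS list */
  PARMS int=1 brd0=0 brd1=0 brd2=0 brd3=0 brd4=0 brd5=0
        d1=0 d2=0 d3=0 d4=0 g0=0 g1=0;
RUN;
```

The exponential form is one convenient choice, not the only one. The results reported next come from the constant-variance program shown earlier, not from this sketch.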
The Poisson random effects model took about 19 seconds to estimate on my computer, as compared with about a quarter second for the GEE model with PROC GENMOD. Results are shown in Output 4.14. The coefficients are roughly similar to those we just saw with GEE estimation, but the standard errors are quite a bit smaller. This is probably because the GEE estimates presumed a negative binomial distribution, whereas the random effects model presumes a Poisson distribution, which allows for less overdispersion.
Fit Statistics | |
---|---|
−2 Log Likelihood | 10410 |
AIC (smaller is better) | 10434 |
AICC (smaller is better) | 10435 |
BIC (smaller is better) | 10480 |
Parameter Estimates

Parameter | Estimate | Standard Error | DF | t Value | Pr > \|t\| | Alpha | Lower | Upper | Gradient |
---|---|---|---|---|---|---|---|---|---|
int | 0.8460 | 0.06729 | 323 | 12.57 | <.0001 | 0.05 | 0.7136 | 0.9784 | −0.26972 |
brd0 | 0.4762 | 0.04227 | 323 | 11.26 | <.0001 | 0.05 | 0.3930 | 0.5593 | 0.043797 |
brd1 | −0.00684 | 0.04797 | 323 | −0.14 | 0.8867 | 0.05 | −0.1012 | 0.08754 | 0.258257 |
brd2 | 0.1333 | 0.04473 | 323 | 2.98 | 0.0031 | 0.05 | 0.04532 | 0.2213 | −0.08825 |
brd3 | 0.05825 | 0.04126 | 323 | 1.41 | 0.1589 | 0.05 | −0.02291 | 0.1394 | 0.260459 |
brd4 | 0.02590 | 0.03761 | 323 | 0.69 | 0.4916 | 0.05 | −0.04810 | 0.09989 | −0.02615 |
brd5 | 0.07911 | 0.03100 | 323 | 2.55 | 0.0112 | 0.05 | 0.01812 | 0.1401 | 0.076259 |
d1 | 0.2520 | 0.01422 | 323 | 17.72 | <.0001 | 0.05 | 0.2240 | 0.2799 | 0.048431 |
d2 | 0.2053 | 0.01422 | 323 | 14.43 | <.0001 | 0.05 | 0.1773 | 0.2333 | −0.03654 |
d3 | 0.1962 | 0.01394 | 323 | 14.07 | <.0001 | 0.05 | 0.1687 | 0.2236 | 0.030349 |
d4 | 0.06218 | 0.01378 | 323 | 4.51 | <.0001 | 0.05 | 0.03507 | 0.08929 | 0.006942 |
s2 | 0.8169 | 0.07580 | 323 | 10.78 | <.0001 | 0.05 | 0.6677 | 0.9660 | 0.149421 |
To get a fairer comparison, let's estimate a random effects negative binomial model. While this can also be done with PROC NLMIXED, it's a little tricky because the parameterization of the negative binomial distribution in NLMIXED is different from the one I've used here. NLMIXED labels the parameters N and p (Johnson and Kotz 1969) while I use λ and Θ. The functional relationship is N = Θ and p = Θ/(λ+Θ). Here's how to set it up:
    PROC NLMIXED DATA=patents2;
      lambda = EXP(int + brd0*rd_0 + brd1*rd_1 + brd2*rd_2 + brd3*rd_3
                   + brd4*rd_4 + brd5*rd_5
                   + d1*(t EQ 1) + d2*(t EQ 2) + d3*(t EQ 3) + d4*(t EQ 4)
                   + alpha);
      MODEL patent ~ NEGBIN(theta, (theta/(lambda+theta)));
      RANDOM alpha ~ NORMAL(0, s2) SUBJECT=id;
      PARMS int=1 brd0=0 brd1=0 brd2=0 brd3=0 brd4=0 brd5=0
            d1=0 d2=0 d3=0 d4=0 s2=1 theta=1;
    RUN;
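As a check on the second argument of NEGBIN in the MODEL statement, the following short derivation substitutes N = Θ and p = Θ/(λ+Θ) into the moments of the negative binomial distribution. It assumes the NEGBIN(N, p) density is proportional to $p^{N}(1-p)^{y}$, so that the mean is N(1 − p)/p and the variance is N(1 − p)/p²; both moments are conditional on the random effect.

```latex
\begin{aligned}
E(y_{it}) &= \frac{N(1-p)}{p}
  = \Theta \cdot \frac{\lambda_{it}/(\lambda_{it}+\Theta)}{\Theta/(\lambda_{it}+\Theta)}
  = \lambda_{it}, \\[4pt]
\operatorname{Var}(y_{it}) &= \frac{N(1-p)}{p^{2}}
  = \lambda_{it}\,\frac{\lambda_{it}+\Theta}{\Theta}
  = \lambda_{it} + \frac{\lambda_{it}^{2}}{\Theta}.
\end{aligned}
```

Under that assumption, LAMBDA keeps its interpretation as the expected count, and the variance exceeds the mean by a term that shrinks as Θ grows, so the Poisson model is the limiting case as Θ becomes large.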
Results are shown in Output 4.15.
Fit Statistics | |
---|---|
−2 Log Likelihood | 9703.9 |
AIC (smaller is better) | 9729.9 |
AICC (smaller is better) | 9730.1 |
BIC (smaller is better) | 9779.0 |
Parameter Estimates

Parameter | Estimate | Standard Error | DF | t Value | Pr > \|t\| | Alpha | Lower | Upper | Gradient |
---|---|---|---|---|---|---|---|---|---|
int | 0.7069 | 0.06960 | 323 | 10.16 | <.0001 | 0.05 | 0.5699 | 0.8438 | −0.01105 |
brd0 | 0.5021 | 0.06226 | 323 | 8.06 | <.0001 | 0.05 | 0.3796 | 0.6245 | 0.024034 |
brd1 | −0.01835 | 0.07302 | 323 | −0.25 | 0.8018 | 0.05 | −0.1620 | 0.1253 | 0.015229 |
brd2 | 0.1205 | 0.06923 | 323 | 1.74 | 0.0828 | 0.05 | −0.01573 | 0.2567 | 0.026795 |
brd3 | 0.06403 | 0.06473 | 323 | 0.99 | 0.3233 | 0.05 | −0.06331 | 0.1914 | 0.020925 |
brd4 | 0.1044 | 0.06142 | 323 | 1.70 | 0.0901 | 0.05 | −0.01642 | 0.2252 | 0.057457 |
brd5 | 0.07823 | 0.04764 | 323 | 1.64 | 0.1015 | 0.05 | −0.01548 | 0.1720 | 0.08812 |
d1 | 0.2802 | 0.02719 | 323 | 10.31 | <.0001 | 0.05 | 0.2268 | 0.3337 | −0.00773 |
d2 | 0.2244 | 0.02722 | 323 | 8.24 | <.0001 | 0.05 | 0.1708 | 0.2779 | 0.032592 |
d3 | 0.2074 | 0.02702 | 323 | 7.68 | <.0001 | 0.05 | 0.1542 | 0.2606 | −0.04431 |
d4 | 0.08709 | 0.02680 | 323 | 3.25 | 0.0013 | 0.05 | 0.03436 | 0.1398 | 0.006565 |
s2 | 0.7720 | 0.06956 | 323 | 11.10 | <.0001 | 0.05 | 0.6351 | 0.9088 | 0.003151 |
theta | 30.2799 | 3.0701 | 323 | 9.86 | <.0001 | 0.05 | 24.2400 | 36.3199 | 0.000062 |
In Output 4.15, the coefficients are quite similar in magnitude to those in Output 4.14 for the Poisson model, but the standard errors are somewhat larger. These standard errors are about on par with those for the fixed effects negative binomial model in Output 4.12, but still not as large as those for the GEE estimates in Output 4.13. For this model, like the fixed effects model, the only significant R & D coefficient is for the contemporaneous year. A chi-square statistic for testing the Poisson random effects model versus the negative binomial random effects model can be obtained by calculating the difference in their −2 log-likelihoods: 10410 – 9704 = 706. With 1 d.f., this chi-square is highly significant, implying a strong preference for the less restrictive negative binomial model.
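The arithmetic of this likelihood ratio test can be reproduced in a short DATA step. This is a minimal sketch: the two −2 log-likelihood values are typed in from Outputs 4.14 and 4.15, and the data set and variable names (lrtest, m2ll_poisson, m2ll_negbin) are chosen here for illustration only.

```sas
/* Likelihood ratio test: Poisson vs. negative binomial random effects.
   The -2 log-likelihoods are copied by hand from Outputs 4.14 and 4.15. */
DATA lrtest;
  m2ll_poisson = 10410;
  m2ll_negbin  = 9703.9;
  chisq = m2ll_poisson - m2ll_negbin;  /* difference in -2 log L     */
  df    = 1;                           /* one extra parameter: theta */
  p     = SDF('CHISQ', chisq, df);     /* upper-tail probability     */
  PUT chisq= df= p=;                   /* write results to the log   */
RUN;
```

Because the Poisson model corresponds to a boundary value of the overdispersion parameter, the ordinary chi-square reference distribution is, if anything, conservative for this comparison. With a statistic this large, the conclusion is the same either way.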