Comparison with Random Effects Models and GEE Estimation

4.5. Comparison with Random Effects Models and GEE Estimation

As we saw in chapters 2 and 3, random effects models and GEE estimation are widely used alternatives to fixed effects methods for longitudinal data. Both methods can be applied to count data and are readily available in SAS. The principal attractions of these alternative methods are (1) the ability to estimate effects for time-invariant covariates, and (2) more efficient use of the data (if the assumptions are met). The major disadvantage is that neither method controls for unmeasured time-invariant covariates. I'll briefly describe these methods in this section, both to serve as a point of comparison with the fixed effects methods and because they will be needed for the hybrid method discussed in the next section.

As we've seen before, GEE is a form of iterated generalized least squares that allows for correlations among the repeated observations for each individual. GEE is easily invoked with the REPEATED statement in PROC GENMOD, and can be used with either a negative binomial model or a Poisson model. Here's the SAS code for GEE estimation of a negative binomial model for the patent data, with separate records for each firm-year:

PROC GENMOD DATA=patents2;
   CLASS id t;
   MODEL patent= rd_0-rd_5 t / D=NB;
   REPEATED SUBJECT=id / TYPE=MDEP(4) CORRW;
RUN;

The TYPE=MDEP(4) option specifies that the correlation matrix for patent counts among the five years of observation has a "banded" structure. There is one correlation for counts that are one year apart, another correlation for counts that are two years apart, and so on. The correlation for counts more than four years apart is set to 0 (hence the 4 in MDEP(4)), but four years is the maximum distance for these data anyway. This imposed structure can be seen in the estimated "Working Correlation Matrix," requested with the CORRW option and shown in Output 4.13. I also tried other correlation structures, but a model with TYPE=UN (unstructured) could not be fitted with these data. TYPE=EXCH (exchangeable) specifies that all the inter-year correlations are identical. Although this specification yielded similar results, it seems unnecessarily restrictive.
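For comparison, the exchangeable alternative mentioned above can be requested by changing only the TYPE= option. This is a minimal sketch that reuses the data set and variable names from the code above. Its output is not reproduced here.

```sas
/* Same GEE negative binomial model, but with an exchangeable */
/* working correlation: one common correlation for all pairs  */
/* of years instead of one correlation per lag.               */
PROC GENMOD DATA=patents2;
   CLASS id t;
   MODEL patent = rd_0-rd_5 t / D=NB;
   REPEATED SUBJECT=id / TYPE=EXCH CORRW;
RUN;
```

With TYPE=EXCH, every off-diagonal entry of the working correlation matrix printed by CORRW is the same single estimate.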

Table 4.13. Output 4.13 GEE Estimates for a Negative Binomial Model
Working Correlation Matrix
	Col1	Col2	Col3	Col4	Col5
Row1	1.0000	0.7567	0.7349	0.6655	0.6909
Row2	0.7567	1.0000	0.7567	0.7349	0.6655
Row3	0.7349	0.7567	1.0000	0.7567	0.7349
Row4	0.6655	0.7349	0.7567	1.0000	0.7567
Row5	0.6909	0.6655	0.7349	0.7567	1.0000

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Parameter	Level	Estimate	Standard Error	Lower 95% Limit	Upper 95% Limit	Z	Pr > |Z|
Intercept		1.0839	0.0884	0.9106	1.2572	12.26	<.0001
rd_0		0.4969	0.1131	0.2752	0.7186	4.39	<.0001
rd_1		−0.0451	0.1162	−0.2728	0.1826	−0.39	0.6977
rd_2		0.1613	0.0855	−0.0063	0.3289	1.89	0.0593
rd_3		0.0729	0.0944	−0.1121	0.2579	0.77	0.4401
rd_4		0.1380	0.0735	−0.0061	0.2821	1.88	0.0605
rd_5		0.0247	0.0544	−0.0818	0.1313	0.45	0.6492
t	1	0.2326	0.0497	0.1351	0.3301	4.68	<.0001
t	2	0.1825	0.0465	0.0914	0.2736	3.93	<.0001
t	3	0.1855	0.0383	0.1104	0.2606	4.84	<.0001
t	4	0.1169	0.0403	0.0380	0.1958	2.90	0.0037
t	5	0.0000	0.0000	0.0000	0.0000	.	.

Parameter estimates in Output 4.13 are roughly similar to those in Output 4.12 for the fixed effects negative binomial model. But unlike the fixed effects method, two of the lagged R & D measures have GEE coefficients that approach statistical significance. Interestingly, the standard errors for the GEE estimates are generally larger than those for the fixed effects method, which is the opposite of what would ordinarily be expected.

Random effects models can be fitted with PROC NLMIXED for either the Poisson or negative binomial distributions. Let's first consider a Poisson model. As before, we begin by assuming that y_it has a Poisson distribution with expected value λ_it. As with the fixed effects model, we then assume that log λ_it is a linear function of the explanatory variables: log λ_it = μ_t + βx_it + α_i, where μ_t is an intercept that varies with time and x_it is the vector of explanatory variables. Now, however, instead of treating α_i as a set of fixed constants, we assume that it is a random variable, normally distributed with a mean of 0 and a variance σ². We also assume that α_i is independent of all measured variables in the model, and that the y_it terms are independent of each other, conditional on α_i. Under these assumptions, NLMIXED produces maximum likelihood estimates of all parameters. Here's the code for the patent data:

PROC NLMIXED DATA=patents2;
   lambda=EXP(int+brd0*rd_0+brd1*rd_1+brd2*rd_2+brd3*rd_3+brd4*rd_4
      +brd5*rd_5+d1*(t EQ 1)+d2*(t EQ 2)+d3*(t EQ 3)+d4*(t EQ 4)+alpha);
   MODEL patent~POISSON(lambda);
   RANDOM alpha~NORMAL(0,s2) SUBJECT=id;
   PARMS int=1 brd0=0 brd1=0 brd2=0 brd3=0 brd4=0 brd5=0 d1=0 d2=0 d3=0 d4=0 s2=1;
RUN;

The statement that begins with LAMBDA defines the expected patent count as a function of the explanatory variables. Note the inclusion of ALPHA, which is the random, firm-level effect. The MODEL statement says that patent counts have a Poisson distribution with parameter LAMBDA. The RANDOM statement declares that ALPHA has a normal distribution with a mean of 0 and variance of S2. This variance is assumed to be constant across firms and across time. Alternatively, it could be written as a function of other variables simply by including another assignment equation similar to the one for LAMBDA.
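As an illustration of that last point, here is a sketch in which the variance of ALPHA depends on a firm-level variable. The variable z and the parameters g0 and g1 are hypothetical names introduced only for this example. They do not appear in the patent data as described here, and this model was not fitted in the text.

```sas
/* Sketch only: z is a HYPOTHETICAL firm-level variable,   */
/* and g0, g1 are hypothetical parameter names.            */
PROC NLMIXED DATA=patents2;
   lambda = EXP(int+brd0*rd_0+brd1*rd_1+brd2*rd_2+brd3*rd_3+brd4*rd_4
      +brd5*rd_5+d1*(t EQ 1)+d2*(t EQ 2)+d3*(t EQ 3)+d4*(t EQ 4)+alpha);
   /* Assignment statement for the variance, analogous to  */
   /* the one for lambda; EXP keeps the variance positive. */
   s2 = EXP(g0 + g1*z);
   MODEL patent ~ POISSON(lambda);
   RANDOM alpha ~ NORMAL(0,s2) SUBJECT=id;
   PARMS int=1 brd0=0 brd1=0 brd2=0 brd3=0 brd4=0 brd5=0
         d1=0 d2=0 d3=0 d4=0 g0=0 g1=0;
RUN;
```

Starting values of g0=0 and g1=0 give an initial variance of 1, matching the starting value s2=1 in the constant-variance model.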

This model took about 19 seconds to estimate on my computer, as compared with about a quarter second for the GEE model with PROC GENMOD. Results are shown in Output 4.14. The coefficients are roughly similar to those we just saw with GEE estimation, but the standard errors are quite a bit smaller. This is probably because the GEE estimates presumed a negative binomial distribution, whereas the random effects model presumes a Poisson distribution, which allows for less overdispersion.
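Part of the reason for the longer run time is probably that the likelihood for this model has no closed form. Each firm's contribution is an integral over the distribution of the random effect, which NLMIXED approximates numerically (by default with adaptive Gaussian quadrature). Writing η_it for the linear predictor inside the EXP function, excluding ALPHA (η_it is notation introduced here for convenience), the contribution of firm i is

```latex
L_i = \int_{-\infty}^{\infty}
      \left[ \prod_{t=1}^{5}
      \frac{e^{-\lambda_{it}} \lambda_{it}^{y_{it}}}{y_{it}!} \right]
      \frac{1}{\sqrt{2\pi\sigma^2}}
      \exp\!\left( -\frac{\alpha^2}{2\sigma^2} \right) d\alpha ,
\qquad
\lambda_{it} = \exp(\eta_{it} + \alpha).
```

The full likelihood is the product of the L_i over firms, and the integral must be re-evaluated for every firm at every iteration.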

Table 4.14. Output 4.14 NLMIXED Output for a Random Effects Poisson Model
Fit Statistics
−2 Log Likelihood	10410
AIC (smaller is better)	10434
AICC (smaller is better)	10435
BIC (smaller is better)	10480

Parameter Estimates
Parameter	Estimate	Standard Error	DF	t Value	Pr > |t|	Alpha	Lower	Upper	Gradient
int	0.8460	0.06729	323	12.57	<.0001	0.05	0.7136	0.9784	−0.26972
brd0	0.4762	0.04227	323	11.26	<.0001	0.05	0.3930	0.5593	0.043797
brd1	−0.00684	0.04797	323	−0.14	0.8867	0.05	−0.1012	0.08754	0.258257
brd2	0.1333	0.04473	323	2.98	0.0031	0.05	0.04532	0.2213	−0.08825
brd3	0.05825	0.04126	323	1.41	0.1589	0.05	−0.02291	0.1394	0.260459
brd4	0.02590	0.03761	323	0.69	0.4916	0.05	−0.04810	0.09989	−0.02615
brd5	0.07911	0.03100	323	2.55	0.0112	0.05	0.01812	0.1401	0.076259
d1	0.2520	0.01422	323	17.72	<.0001	0.05	0.2240	0.2799	0.048431
d2	0.2053	0.01422	323	14.43	<.0001	0.05	0.1773	0.2333	−0.03654
d3	0.1962	0.01394	323	14.07	<.0001	0.05	0.1687	0.2236	0.030349
d4	0.06218	0.01378	323	4.51	<.0001	0.05	0.03507	0.08929	0.006942
s2	0.8169	0.07580	323	10.78	<.0001	0.05	0.6677	0.9660	0.149421

To get a fairer comparison, let's estimate a random effects negative binomial model. While this can also be done with PROC NLMIXED, it's a little tricky because the parameterization of the negative binomial distribution in NLMIXED is different from the one I've used here. NLMIXED labels the parameters n and p (Johnson and Kotz 1969), while I use λ and θ. The functional relationship is n = θ and p = θ/(λ+θ). Here's how to set it up:

PROC NLMIXED DATA=patents2;
   lambda=EXP(int+brd0*rd_0+brd1*rd_1+brd2*rd_2+brd3*rd_3+brd4*rd_4
      +brd5*rd_5+d1*(t EQ 1)+d2*(t EQ 2)+d3*(t EQ 3)+d4*(t EQ 4)+alpha);
   MODEL patent~NEGBIN(theta,(theta/(lambda+theta)));
   RANDOM alpha~NORMAL(0,s2) SUBJECT=id;
   PARMS int=1 brd0=0 brd1=0 brd2=0 brd3=0 brd4=0 brd5=0 d1=0 d2=0 d3=0 d4=0 s2=1 theta=1;
RUN;
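The mapping used in the MODEL statement can be checked against the moments of the NEGBIN(n, p) distribution, which has mean n(1−p)/p and variance n(1−p)/p². The short derivation below assumes that parameterization. Substituting n = θ and p = θ/(λ+θ) gives

```latex
1 - p = \frac{\lambda}{\lambda + \theta}, \qquad
E(y) = \frac{n(1-p)}{p}
     = \theta \cdot \frac{\lambda}{\lambda+\theta} \cdot \frac{\lambda+\theta}{\theta}
     = \lambda ,
\qquad
\mathrm{Var}(y) = \frac{n(1-p)}{p^{2}}
     = \frac{\lambda(\lambda+\theta)}{\theta}
     = \lambda + \frac{\lambda^{2}}{\theta}.
```

So LAMBDA is the conditional mean, as in the Poisson model, and the variance exceeds the mean by λ²/θ. As θ grows large the extra term vanishes and the model approaches the Poisson.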

Results are shown in Output 4.15.

Table 4.15. Output 4.15 NLMIXED Output for a Random Effects Negative Binomial Model
Fit Statistics
−2 Log Likelihood	9703.9
AIC (smaller is better)	9729.9
AICC (smaller is better)	9730.1
BIC (smaller is better)	9779.0

Parameter Estimates
Parameter	Estimate	Standard Error	DF	t Value	Pr > |t|	Alpha	Lower	Upper	Gradient
int	0.7069	0.06960	323	10.16	<.0001	0.05	0.5699	0.8438	−0.01105
brd0	0.5021	0.06226	323	8.06	<.0001	0.05	0.3796	0.6245	0.024034
brd1	−0.01835	0.07302	323	−0.25	0.8018	0.05	−0.1620	0.1253	0.015229
brd2	0.1205	0.06923	323	1.74	0.0828	0.05	−0.01573	0.2567	0.026795
brd3	0.06403	0.06473	323	0.99	0.3233	0.05	−0.06331	0.1914	0.020925
brd4	0.1044	0.06142	323	1.70	0.0901	0.05	−0.01642	0.2252	0.057457
brd5	0.07823	0.04764	323	1.64	0.1015	0.05	−0.01548	0.1720	0.08812
d1	0.2802	0.02719	323	10.31	<.0001	0.05	0.2268	0.3337	−0.00773
d2	0.2244	0.02722	323	8.24	<.0001	0.05	0.1708	0.2779	0.032592
d3	0.2074	0.02702	323	7.68	<.0001	0.05	0.1542	0.2606	−0.04431
d4	0.08709	0.02680	323	3.25	0.0013	0.05	0.03436	0.1398	0.006565
s2	0.7720	0.06956	323	11.10	<.0001	0.05	0.6351	0.9088	0.003151
theta	30.2799	3.0701	323	9.86	<.0001	0.05	24.2400	36.3199	0.000062

In Output 4.15, the coefficients are quite similar in magnitude to those in Output 4.14 for the Poisson model, but the standard errors are somewhat larger. These are about on par with those for the fixed effects negative binomial model in Output 4.12, but still not as large as those for the GEE estimates in Output 4.13. For this model, like the fixed effects model, the only significant R & D coefficient is for the contemporaneous year. A chi-square statistic for testing the Poisson random effects model versus the negative binomial random effects model can be obtained by calculating the difference in their −2 log-likelihoods: 10410 − 9704 = 706. With 1 d.f., this chi-square is highly significant, implying a strong preference for the less restrictive negative binomial model.
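The p-value for this likelihood ratio test can be computed in a short DATA step. This is a minimal sketch using the rounded −2 log-likelihood values quoted above. The variable names are chosen only for this example.

```sas
/* Likelihood ratio test: Poisson vs. negative binomial      */
/* random effects models, using the rounded -2 log           */
/* likelihoods from Outputs 4.14 and 4.15.                   */
DATA _NULL_;
   chisq = 10410 - 9704;               /* difference in -2 log L */
   df = 1;                             /* one extra parameter    */
   p = SDF('CHISQUARE', chisq, df);    /* upper-tail probability */
   PUT chisq= df= p=;                  /* results go to the log  */
RUN;
```

The SDF function is used here rather than 1 − PROBCHI(chisq, df) because the upper-tail probability is extremely small, and subtracting from 1 would lose it to rounding.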
