4.4. Fixed Effects Negative Binomial Models for Count Data

As we just saw in the last section, Poisson regression models often run into problems with overdispersion. That's a bit surprising for fixed effects models because these models already allow for unobserved heterogeneity across individuals by way of the αi parameters. But that heterogeneity is presumed to be time-invariant. There might still be unobserved heterogeneity that is specific to particular points in time, leading to observed overdispersion. As we've seen, the standard errors can be corrected for overdispersion by a simple method based on the ratio of the deviance (or Pearson chi-square) to its degrees of freedom.

Although that's not a bad method, we might do better by directly building overdispersion into the model for event counts. Specifically, we will now assume that the patent counts have a negative binomial distribution for each firm at each point in time. The negative binomial distribution can be regarded as a generalization of the Poisson distribution with an additional parameter that allows for overdispersion. The attraction of this approach is that the estimated regression coefficients might be more efficient (in the statistical sense), and the standard errors and test statistics might be more accurate than those produced by the simpler, after-the-fact correction method.

Negative binomial regression models can be formulated in different ways. The model we shall use here is what Cameron and Trivedi (1998) call an NB2 model. In this model, the probability mass function for yit is given by

Pr(yit = y) = [Γ(Θ + y) / (Γ(y + 1) Γ(Θ))] [Θ / (Θ + λit)]^Θ [λit / (Θ + λit)]^y
In this equation, λit is the expected value of yit, Θ is the overdispersion parameter, and Γ(·) is the gamma function. As Θ → ∞, this distribution converges to the Poisson distribution. As before, we assume a log-linear regression decomposition of the expected value,
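As a quick check on this definition, the NB2 probability mass function can be coded directly. The following Python sketch (illustrative only, not part of the SAS analysis) evaluates it on the log scale with math.lgamma to avoid overflow, and confirms that for very large Θ it approaches the Poisson probability with the same mean:

```python
from math import exp, lgamma, log

def nb2_pmf(y, lam, theta):
    """NB2 probability of count y, with mean lam and overdispersion parameter theta."""
    return exp(lgamma(theta + y) - lgamma(theta) - lgamma(y + 1)
               + theta * log(theta / (theta + lam))
               + y * log(lam / (theta + lam)))

def poisson_pmf(y, lam):
    """Poisson probability of count y with mean lam."""
    return exp(y * log(lam) - lam - lgamma(y + 1))

# The probabilities sum to 1 (to numerical precision over a long range of counts)
total = sum(nb2_pmf(y, 2.0, 5.0) for y in range(200))

# As theta grows, the NB2 distribution converges to the Poisson distribution
gap = abs(nb2_pmf(3, 2.0, 1e8) - poisson_pmf(3, 2.0))
```

Under this parameterization the variance of yit is λit + λit²/Θ, which exceeds the Poisson variance λit whenever Θ is finite; that extra term is what absorbs the overdispersion.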

log λit = μt + β xit + αi
where the αi are treated as fixed effects. Conditional on αi, the multiple event counts for each individual (in this case, a firm) are assumed to be independent. But unconditionally, they may be dependent.

How can we estimate this model? Unlike the Poisson model, conditional likelihood is not an option here. In technical terminology, the total count for each individual is not a complete sufficient statistic for αi, so conditioning on the total does not remove αi from the likelihood function. Hausman, Hall and Griliches (1984) proposed a rather different fixed effects negative binomial regression model, and they derived a conditional maximum likelihood estimator for that model. In fact, their method has been incorporated into procedures in some widely available commercial software packages (not SAS). But Allison and Waterman (2002) have shown that this is not a true fixed effects regression model, and the method does not, in fact, control for all stable covariates.

Instead, we shall consider unconditional maximum likelihood estimation of models that include dummy variables for all individuals (except one). The following program estimates an unconditional negative binomial model in PROC GENMOD for the patent data.

PROC GENMOD DATA=patents2;
   CLASS t id;
   MODEL patent = rd_0-rd_5 t id / DIST=NB SCALE=0;
RUN;

The key difference from Poisson regression is the DIST=NB option on the MODEL statement. The SCALE option sets the starting value for the dispersion parameter. That option isn't necessary for most applications, but in this case the default starting value was poor and the model failed to converge without it.

Results in Output 4.11 should be compared with those for Poisson regression in Output 4.7 (without overdispersion correction) and Output 4.8 (with overdispersion correction). It's apparent that the coefficients for the negative binomial model are very similar to those for the Poisson model. Moreover, the standard errors and test statistics for the negative binomial model are close to those for the Poisson model with overdispersion adjustment. (Note that the dispersion estimate reported in the last line of Output 4.11 is actually an estimate of 1/Θ, where Θ is the overdispersion parameter in the probability distribution given above.)

Output 4.11 Fixed Effects Negative Binomial Regression Model

Criteria For Assessing Goodness Of Fit

Criterion             DF      Value          Value/DF
Deviance              1286    1704.1804      1.3252
Scaled Deviance       1286    1704.1804      1.3252
Pearson Chi-Square    1286    1618.5570      1.2586
Scaled Pearson X2     1286    1618.5570      1.2586
Log Likelihood                224419.2756

Analysis Of Parameter Estimates

Parameter    DF   Estimate   Standard   Wald 95% Confidence   Chi-     Pr > ChiSq
                             Error      Limits                Square
Intercept    1     2.5055    0.2809      1.9550    3.0560     79.56    <.0001
rd_0         1     0.3706    0.0634      0.2464    0.4948     34.22    <.0001
rd_1         1    −0.0827    0.0676     −0.2152    0.0499      1.49    0.2216
rd_2         1     0.0636    0.0641     −0.0621    0.1892      0.98    0.3214
rd_3         1     0.0136    0.0596     −0.1032    0.1305      0.05    0.8193
rd_4         1     0.0345    0.0565     −0.0763    0.1452      0.37    0.5420
rd_5         1     0.0018    0.0464     −0.0890    0.0927      0.00    0.9685
t1           1     0.2237    0.0254      0.1738    0.2736     77.27    <.0001
t2           1     0.1750    0.0251      0.1258    0.2241     48.69    <.0001
t3           1     0.1722    0.0243      0.1246    0.2199     50.22    <.0001
t4           1     0.0649    0.0235      0.0188    0.1110      7.62    0.0058
t5           0     0.0000    0.0000      0.0000    0.0000       .      .
id1          1     0.9477    0.2177      0.5210    1.3745     18.95    <.0001
id2          1    −1.8294    0.4892     −2.7882   −0.8705     13.98    0.0002
id3          1    −0.0103    0.1329     −0.2708    0.2502      0.01    0.9382
.            .
.            .
.            .
id344        1    −2.7074    0.5289     −3.7439   −1.6708     26.21    <.0001
id345        1     1.0781    0.1957      0.6945    1.4616     30.35    <.0001
id346        0     0.0000    0.0000      0.0000    0.0000       .      .
Dispersion   1     0.0196    0.0020      0.0156    0.0236
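Because the reported dispersion of .0196 is an estimate of 1/Θ, the implied value of Θ is simply its reciprocal. A short Python check (illustrative arithmetic only, using the estimate from Output 4.11) recovers Θ and the implied NB2 variance function:

```python
dispersion = 0.0196      # GENMOD's reported dispersion, an estimate of 1/theta
theta = 1 / dispersion   # implied overdispersion parameter, about 51

# Under NB2, Var(y) = lam + lam**2 / theta, so the variance always exceeds
# the mean (the Poisson variance) whenever theta is finite.
def variance(lam, theta):
    return lam + lam ** 2 / theta
```

A firm with an expected count of, say, 10 patents per year thus has an implied variance of about 10 + 100/51 ≈ 12, a modest but statistically detectable departure from the Poisson assumption.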

Because the Poisson model is a special case of the negative binomial regression model, we can compare the two by constructing a likelihood ratio chi-square statistic. This is accomplished by taking the difference in their log-likelihoods and multiplying by 2:

2(224419 − 224169) = 500.

With only 1 degree of freedom, this result is statistically significant by any standard. (Note that one cannot take differences in the deviance to construct this test because the deviance is computed differently for Poisson and negative binomial models). The implication is that we should reject the Poisson model in favor of the negative binomial model. Equivalently, we reject the hypothesis that 1/Θ is equal to 0.
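The arithmetic of this likelihood ratio test is easy to reproduce. The Python sketch below uses the rounded log-likelihoods from the text (224,419 for the negative binomial model in Output 4.11 and 224,169 for the Poisson model in Output 4.7) and obtains a p-value for 1 degree of freedom from the normal tail, since a chi-square variate with 1 df is the square of a standard normal:

```python
from math import erfc, sqrt

ll_negbin = 224419   # rounded log-likelihood, Output 4.11
ll_poisson = 224169  # rounded log-likelihood, Output 4.7 (previous section)

lr = 2 * (ll_negbin - ll_poisson)   # likelihood ratio chi-square, 1 df -> 500

# For 1 df, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
p_value = erfc(sqrt(lr / 2))
```

The resulting p-value is vanishingly small, consistent with rejecting the Poisson model by any conventional standard.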

There are a couple of things worth noting about this test. First, some readers will be puzzled by the fact that both of the log-likelihoods are positive, although log-likelihoods for these models must in fact be negative. The reason is that the log-likelihood reported in GENMOD is not the true log-likelihood, but differs from it by a constant that depends only on the data. Because that constant is the same for both models fit to the same data, differences in the reported log-likelihoods are the same as differences in the true log-likelihoods. The other thing to remember is that you can't compare the log-likelihood for the negative binomial model with the log-likelihood for the Poisson model with overdispersion correction (reported in Output 4.8). That's because the overdispersion correction rescales the log-likelihood as well as the standard errors and test statistics.

So the negative binomial model is clearly a better fit to these data than the Poisson model. But, unlike the Poisson model (where conditional and unconditional estimates must be identical), we have no guarantee that unconditional negative binomial estimation is resistant to the incidental parameters problem (discussed for the logistic model in chapter 2). Allison and Waterman (2002) investigated this question with Monte Carlo simulations. They found that the unconditional estimator did not show any substantial bias, even under conditions most likely to produce bias from incidental parameters. Their simulations also showed that negative binomial estimators had substantially smaller true standard errors than Poisson estimators. Furthermore, confidence intervals produced by the Poisson method, even with the overdispersion correction, tended to be much too small under many conditions.

In sum, negative binomial estimation seems substantially superior to Poisson estimation for many applications. Nevertheless, the simulations also showed that the negative binomial method produced confidence intervals that tended to be too small, although the undercoverage was not nearly as severe as for the Poisson. Under many conditions, nominal 95% confidence intervals covered the true value only about 85% of the time. This problem is easily corrected by using the DSCALE option in GENMOD (not the PSCALE option) to introduce additional correction for overdispersion. When this was done in simulations, the actual coverage rates were very close to the nominal 95% confidence intervals for nearly all conditions. Output 4.12 shows the results of applying the DSCALE correction to the model for the patent data. For this set of covariates, these estimates are the best among the various estimation methods we have considered.

Output 4.12 Fixed Effects Negative Binomial Model with Overdispersion Correction

Criteria For Assessing Goodness Of Fit

Criterion             DF      Value          Value/DF
Deviance              1286    1704.1754      1.3252
Scaled Deviance       1286    1286.0000      1.0000
Pearson Chi-Square    1286    1618.5521      1.2586
Scaled Pearson X2     1286    1221.3872      0.9498
Log Likelihood                169350.6376

Analysis Of Parameter Estimates

Parameter    DF   Estimate   Standard   Wald 95% Confidence   Chi-     Pr > ChiSq
                             Error      Limits                Square
Intercept    1     1.4442    0.1343      1.1810    1.7074     115.66   <.0001
rd_0         1     0.3706    0.0729      0.2277    0.5136      25.82   <.0001
rd_1         1    −0.0827    0.0779     −0.2352    0.0699       1.13   0.2884
rd_2         1     0.0636    0.0738     −0.0811    0.2082       0.74   0.3891
rd_3         1     0.0136    0.0686     −0.1209    0.1482       0.04   0.8427
rd_4         1     0.0345    0.0651     −0.0931    0.1620       0.28   0.5963
rd_5         1     0.0018    0.0534     −0.1028    0.1064       0.00   0.9726
t1           1     0.0965    0.0175      0.0622    0.1309      30.38   <.0001
t2           1     0.0478    0.0170      0.0144    0.0812       7.87   0.0050
t3           1     0.0451    0.0167      0.0124    0.0778       7.29   0.0069
t4           1    −0.0623    0.0171     −0.0958   −0.0288      13.28   0.0003
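The DSCALE correction is straightforward to verify by hand: GENMOD multiplies every standard error by the square root of the deviance divided by its degrees of freedom. A small Python check (using numbers read off Outputs 4.11 and 4.12) reproduces the corrected standard error for rd_0:

```python
from math import sqrt

deviance, df = 1704.1754, 1286   # deviance and its df, from Output 4.12
scale = sqrt(deviance / df)      # DSCALE multiplier, about 1.151

se_rd0_uncorrected = 0.0634      # rd_0 standard error in Output 4.11
# About 0.0730, matching the 0.0729 in Output 4.12 up to rounding of the inputs
se_rd0_corrected = se_rd0_uncorrected * scale
```

The same multiplier reproduces the other rd_ standard errors in Output 4.12 (e.g., 0.0676 × 1.151 ≈ 0.0779 for rd_1), confirming that the correction rescales the standard errors uniformly.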

Although computation time for the unconditional negative binomial estimates was quite tolerable for the patent data, it could become a burden for very large data sets with lots of dummy variable coefficients to estimate. Again, Greene (2001) has shown how such computational difficulties can be readily overcome, but that would require modification of GENMOD algorithms.
