4.4. Fixed Effects Negative Binomial Models for Count Data

As we just saw in the last section, Poisson regression models often run into problems with overdispersion. That's a bit surprising for fixed effects models because these models already allow for unobserved heterogeneity across individuals by way of the αi parameters. But that heterogeneity is presumed to be time-invariant. There might still be unobserved heterogeneity that is specific to particular points in time, leading to observed overdispersion. As we've seen, the standard errors can be corrected for overdispersion by a simple method based on the ratio of the deviance (or Pearson chi-square) to its degrees of freedom.

Although that's not a bad method, we might do better by directly building overdispersion into the model for event counts. Specifically, we will now assume that the patent counts have a negative binomial distribution for each firm at each point in time. The negative binomial distribution can be regarded as a generalization of the Poisson distribution with an additional parameter that allows for overdispersion. The attraction of this approach is that the estimated regression coefficients might be more efficient (in the statistical sense), and the standard errors and test statistics might be more accurate than those produced by the simpler, after-the-fact correction method.

Negative binomial regression models can be formulated in different ways. The model we shall use here is what Cameron and Trivedi (1998) call an NB2 model. In this model, the probability mass function for yit is given by

Pr(yit = y) = [Γ(Θ + y) / (Γ(y + 1) Γ(Θ))] [Θ / (Θ + λit)]^Θ [λit / (Θ + λit)]^y
In this equation, λit is the expected value of yit, Θ is the overdispersion parameter, and Γ(·) is the gamma function. As Θ → ∞, this distribution converges to the Poisson distribution. As before, we assume a log-linear regression decomposition of the expected value,
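As a quick check on this definition, the NB2 probability mass function can be coded directly. The following Python sketch (illustrative only, not part of the SAS analysis) evaluates it on the log scale with math.lgamma to avoid overflow, and confirms that for very large Θ it approaches the Poisson probability with the same mean:

```python
from math import exp, lgamma, log

def nb2_pmf(y, lam, theta):
    """NB2 probability of count y, with mean lam and overdispersion parameter theta."""
    return exp(lgamma(theta + y) - lgamma(theta) - lgamma(y + 1)
               + theta * log(theta / (theta + lam))
               + y * log(lam / (theta + lam)))

def poisson_pmf(y, lam):
    """Poisson probability of count y with mean lam."""
    return exp(y * log(lam) - lam - lgamma(y + 1))

# The probabilities sum to 1 (to numerical precision over a long range of counts)
total = sum(nb2_pmf(y, 2.0, 5.0) for y in range(200))

# As theta grows, the NB2 distribution converges to the Poisson distribution
gap = abs(nb2_pmf(3, 2.0, 1e8) - poisson_pmf(3, 2.0))
```

Under this parameterization the variance of yit is λit + λit²/Θ, which exceeds the Poisson variance λit whenever Θ is finite; that extra term is what absorbs the overdispersion.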

log λit = μt + β xit + αi
where the αi are treated as fixed effects. Conditional on αi, the multiple event counts for each individual (in this case, a firm) are assumed to be independent. But unconditionally, they may be dependent.

How can we estimate this model? Unlike the Poisson model, conditional likelihood is not an option here. In technical terminology, the total count for each individual is not a complete sufficient statistic for αi, so conditioning on the total does not remove αi from the likelihood function. Hausman, Hall and Griliches (1984) proposed a rather different fixed effects negative binomial regression model, and they derived a conditional maximum likelihood estimator for that model. In fact, their method has been incorporated into procedures in some widely available commercial software packages (not SAS). But Allison and Waterman (2002) have shown that this is not a true fixed effects regression model, and the method does not, in fact, control for all stable covariates.

Instead, we shall consider unconditional maximum likelihood estimation of models that include dummy variables for all individuals (except one). The following program estimates an unconditional negative binomial model in PROC GENMOD for the patent data.

PROC GENMOD DATA=patents2;
   CLASS t id;
   MODEL patent = rd_0-rd_5 t id / DIST=NB SCALE=0;
RUN;

The key difference from Poisson regression is the DIST=NB option on the MODEL statement. The SCALE option sets the starting value for the dispersion parameter. That option isn't necessary for most applications, but in this case the default starting value was poor and the model failed to converge without it.

Results in Output 4.11 should be compared with those for Poisson regression in Output 4.7 (without overdispersion correction) and Output 4.8 (with overdispersion correction). It's apparent that the coefficients for the negative binomial model are very similar to those for the Poisson model. Moreover, the standard errors and test statistics for the negative binomial model are close to those for the Poisson model with overdispersion adjustment. (Note that the dispersion estimate reported in the last line of Output 4.11 is actually an estimate of 1/Θ, where Θ is the overdispersion parameter in the probability distribution given above.)

Output 4.11 Fixed Effects Negative Binomial Regression Model

Criteria For Assessing Goodness Of Fit

Criterion             DF      Value          Value/DF
Deviance              1286    1704.1804      1.3252
Scaled Deviance       1286    1704.1804      1.3252
Pearson Chi-Square    1286    1618.5570      1.2586
Scaled Pearson X2     1286    1618.5570      1.2586
Log Likelihood                224419.2756

Analysis Of Parameter Estimates

Parameter    DF   Estimate   Standard   Wald 95% Confidence   Chi-     Pr > ChiSq
                             Error      Limits                Square
Intercept    1     2.5055    0.2809      1.9550    3.0560     79.56    <.0001
rd_0         1     0.3706    0.0634      0.2464    0.4948     34.22    <.0001
rd_1         1    −0.0827    0.0676     −0.2152    0.0499      1.49    0.2216
rd_2         1     0.0636    0.0641     −0.0621    0.1892      0.98    0.3214
rd_3         1     0.0136    0.0596     −0.1032    0.1305      0.05    0.8193
rd_4         1     0.0345    0.0565     −0.0763    0.1452      0.37    0.5420
rd_5         1     0.0018    0.0464     −0.0890    0.0927      0.00    0.9685
t1           1     0.2237    0.0254      0.1738    0.2736     77.27    <.0001
t2           1     0.1750    0.0251      0.1258    0.2241     48.69    <.0001
t3           1     0.1722    0.0243      0.1246    0.2199     50.22    <.0001
t4           1     0.0649    0.0235      0.0188    0.1110      7.62    0.0058
t5           0     0.0000    0.0000      0.0000    0.0000       .      .
id1          1     0.9477    0.2177      0.5210    1.3745     18.95    <.0001
id2          1    −1.8294    0.4892     −2.7882   −0.8705     13.98    0.0002
id3          1    −0.0103    0.1329     −0.2708    0.2502      0.01    0.9382
.            .
.            .
.            .
id344        1    −2.7074    0.5289     −3.7439   −1.6708     26.21    <.0001
id345        1     1.0781    0.1957      0.6945    1.4616     30.35    <.0001
id346        0     0.0000    0.0000      0.0000    0.0000       .      .
Dispersion   1     0.0196    0.0020      0.0156    0.0236
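Because the reported dispersion of .0196 is an estimate of 1/Θ, the implied value of Θ is simply its reciprocal. A short Python check (illustrative arithmetic only, using the estimate from Output 4.11) recovers Θ and the implied NB2 variance function:

```python
dispersion = 0.0196      # GENMOD's reported dispersion, an estimate of 1/theta
theta = 1 / dispersion   # implied overdispersion parameter, about 51

# Under NB2, Var(y) = lam + lam**2 / theta, so the variance always exceeds
# the mean (the Poisson variance) whenever theta is finite.
def variance(lam, theta):
    return lam + lam ** 2 / theta
```

A firm with an expected count of, say, 10 patents per year thus has an implied variance of about 10 + 100/51 ≈ 12, a modest but statistically detectable departure from the Poisson assumption.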

Because the Poisson model is a special case of the negative binomial regression model, we can compare the two by constructing a likelihood ratio chi-square statistic. This is accomplished by taking the difference in their log-likelihoods and multiplying by 2:

2(224419 − 224169) = 500.

With only 1 degree of freedom, this result is statistically significant by any standard. (Note that one cannot take differences in the deviance to construct this test because the deviance is computed differently for Poisson and negative binomial models). The implication is that we should reject the Poisson model in favor of the negative binomial model. Equivalently, we reject the hypothesis that 1/Θ is equal to 0.
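The arithmetic of this likelihood ratio test is easy to reproduce. The Python sketch below uses the rounded log-likelihoods from the text (224,419 for the negative binomial model in Output 4.11 and 224,169 for the Poisson model in Output 4.7) and obtains a p-value for 1 degree of freedom from the normal tail, since a chi-square variate with 1 df is the square of a standard normal:

```python
from math import erfc, sqrt

ll_negbin = 224419   # rounded log-likelihood, Output 4.11
ll_poisson = 224169  # rounded log-likelihood, Output 4.7 (previous section)

lr = 2 * (ll_negbin - ll_poisson)   # likelihood ratio chi-square, 1 df -> 500

# For 1 df, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
p_value = erfc(sqrt(lr / 2))
```

The resulting p-value is vanishingly small, consistent with rejecting the Poisson model by any conventional standard.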

There are a couple of things worth noting about this test. First, some readers will be puzzled by the fact that both of the log-likelihoods are positive, although log-likelihoods for these models must in fact be negative. The reason is that the log-likelihood reported in GENMOD is not the true log-likelihood, but differs from it by a constant that depends only on the data. Because that constant is the same for both models fit to the same data, differences in the reported log-likelihoods are the same as differences in the true log-likelihoods. The other thing to remember is that you can't compare the log-likelihood for the negative binomial model with the log-likelihood for the Poisson model with overdispersion correction (reported in Output 4.8). That's because the overdispersion correction rescales the log-likelihood as well as the standard errors and test statistics.

So the negative binomial model is clearly a better fit to these data than the Poisson model. But, unlike the Poisson model (where conditional and unconditional estimates must be identical), we have no guarantee that unconditional negative binomial estimation is resistant to the incidental parameters problem (discussed for the logistic model in chapter 2). Allison and Waterman (2002) investigated this question with Monte Carlo simulations. They found that the unconditional estimator did not show any substantial bias, even under conditions most likely to produce bias from incidental parameters. Their simulations also showed that negative binomial estimators had substantially smaller true standard errors than Poisson estimators. Furthermore, confidence intervals produced by the Poisson method, even with the overdispersion correction, tended to be much too small under many conditions.

In sum, negative binomial estimation seems substantially superior to Poisson estimation for many applications. Nevertheless, the simulations also showed that the negative binomial method produced confidence intervals that tended to be too small, although the undercoverage was not nearly as severe as for the Poisson. Under many conditions, nominal 95% confidence intervals covered the true value only about 85% of the time. This problem is easily corrected by using the DSCALE option in GENMOD (not the PSCALE option) to introduce additional correction for overdispersion. When this was done in simulations, the actual coverage rates were very close to the nominal 95% confidence intervals for nearly all conditions. Output 4.12 shows the results of applying the DSCALE correction to the model for the patent data. For this set of covariates, these estimates are the best among the various estimation methods we have considered.

Output 4.12 Fixed Effects Negative Binomial Model with Overdispersion Correction

Criteria For Assessing Goodness Of Fit

Criterion             DF      Value          Value/DF
Deviance              1286    1704.1754      1.3252
Scaled Deviance       1286    1286.0000      1.0000
Pearson Chi-Square    1286    1618.5521      1.2586
Scaled Pearson X2     1286    1221.3872      0.9498
Log Likelihood                169350.6376

Analysis Of Parameter Estimates

Parameter    DF   Estimate   Standard   Wald 95% Confidence   Chi-     Pr > ChiSq
                             Error      Limits                Square
Intercept    1     1.4442    0.1343      1.1810    1.7074     115.66   <.0001
rd_0         1     0.3706    0.0729      0.2277    0.5136      25.82   <.0001
rd_1         1    −0.0827    0.0779     −0.2352    0.0699       1.13   0.2884
rd_2         1     0.0636    0.0738     −0.0811    0.2082       0.74   0.3891
rd_3         1     0.0136    0.0686     −0.1209    0.1482       0.04   0.8427
rd_4         1     0.0345    0.0651     −0.0931    0.1620       0.28   0.5963
rd_5         1     0.0018    0.0534     −0.1028    0.1064       0.00   0.9726
t1           1     0.0965    0.0175      0.0622    0.1309      30.38   <.0001
t2           1     0.0478    0.0170      0.0144    0.0812       7.87   0.0050
t3           1     0.0451    0.0167      0.0124    0.0778       7.29   0.0069
t4           1    −0.0623    0.0171     −0.0958   −0.0288      13.28   0.0003
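The DSCALE correction is straightforward to verify by hand: GENMOD multiplies every standard error by the square root of the deviance divided by its degrees of freedom. A small Python check (using numbers read off Outputs 4.11 and 4.12) reproduces the corrected standard error for rd_0:

```python
from math import sqrt

deviance, df = 1704.1754, 1286   # deviance and its df, from Output 4.12
scale = sqrt(deviance / df)      # DSCALE multiplier, about 1.151

se_rd0_uncorrected = 0.0634      # rd_0 standard error in Output 4.11
# About 0.0730, matching the 0.0729 in Output 4.12 up to rounding of the inputs
se_rd0_corrected = se_rd0_uncorrected * scale
```

The same multiplier reproduces the other rd_ standard errors in Output 4.12 (e.g., 0.0676 × 1.151 ≈ 0.0779 for rd_1), confirming that the correction rescales the standard errors uniformly.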

Although computation time for the unconditional negative binomial estimates was quite tolerable for the patent data, it could become a burden for very large data sets with lots of dummy variable coefficients to estimate. Again, Greene (2001) has shown how such computational difficulties can be readily overcome, but that would require modification of GENMOD algorithms.
