
Generalized Linear Models and Extensions
James Hardin, Joseph Hilbe
Table of contents
List of Tables
List of Figures
List of Listings
Preface
1 Introduction
- 1.1 Origins and motivation
- 1.2 Notational conventions
- 1.3 Applied or theoretical?
- 1.4 Road map
- 1.5 Installing the support materials
Part I - Foundations of Generalized Linear Models
2 Generalized Linear Models
- 2.1 Components
- 2.2 Assumptions
- 2.3 Exponential family
- 2.4 Example: Using an offset in a GLM
- 2.5 Summary
3 GLM estimation algorithms
- 3.1 Newton–Raphson (using the observed Hessian)
- 3.2 Starting values for Newton–Raphson
- 3.3 IRLS (using the expected Hessian)
- 3.4 Starting values for IRLS
- 3.5 Goodness of fit
- 3.6 Estimated variance matrices
- 3.6.1 Hessian
- 3.6.2 Outer product of the gradient (OPG)
- 3.6.3 Sandwich
- 3.6.4 Modified sandwich
- 3.6.5 Unbiased sandwich
- 3.6.6 Modified unbiased sandwich
- 3.6.7 Weighted sandwich: Newey–West
- 3.6.8 Jackknife
- 3.6.8.1 Usual jackknife
- 3.6.8.2 One-step jackknife
- 3.6.8.3 Weighted jackknife
- 3.6.8.4 Variable jackknife
- 3.6.9 Bootstrap
- 3.6.9.1 Usual bootstrap
- 3.6.9.2 Grouped bootstrap
- 3.7 Estimation algorithms
- 3.8 Summary
4 Analysis of fit
- 4.1 Deviance
- 4.2 Diagnostics
- 4.2.1 Cook's distance
- 4.2.2 Overdispersion
- 4.3 Assessing the link function
- 4.4 Checks for systematic departure from the model
- 4.5 Residual analysis
- 4.5.1 Response residuals
- 4.5.2 Working residuals
- 4.5.3 Pearson residuals
- 4.5.4 Partial residuals
- 4.5.5 Anscombe residuals
- 4.5.6 Deviance residuals
- 4.5.7 Adjusted deviance residuals
- 4.5.8 Likelihood residuals
- 4.5.9 Score residuals
- 4.6 Model statistics
- 4.6.1 Criterion measures
- 4.6.1.1 Akaike information criterion (AIC)
- 4.6.1.2 Bayesian information criterion (BIC)
- 4.6.2 The interpretation of R² in linear regression
- 4.6.2.1 Percent variance explained
- 4.6.2.2 The ratio of variances
- 4.6.2.3 A transformation of the likelihood ratio
- 4.6.2.4 A transformation of the F test
- 4.6.2.5 Squared correlation
- 4.6.3 Generalizations of linear regression R² interpretations
- 4.6.3.1 Efron's pseudo-R²
- 4.6.3.2 McFadden's likelihood-ratio index
- 4.6.3.3 Ben-Akiva and Lerman adjusted likelihood-ratio index
- 4.6.3.4 McKelvey and Zavoina ratio of variances
- 4.6.3.5 Transformation of likelihood ratio
- 4.6.3.6 Cragg and Uhler normed measure
- 4.6.4 Additional R² measures
- 4.6.4.1 The count R²
- 4.6.4.2 The adjusted count R²
- 4.6.4.3 Veall and Zimmermann R²
- 4.6.4.4 Cameron–Windmeijer R²
Part II - Continuous Response Models
5 The Gaussian family
- 5.1 Derivation of the GLM Gaussian family
- 5.2 Derivation in terms of the mean
- 5.3 IRLS GLM algorithm (non-binomial)
- 5.4 Maximum likelihood estimation
- 5.5 GLM log-normal models
- 5.6 Expected versus observed information matrix
- 5.7 Other Gaussian links
- 5.8 Example: Relation to OLS
- 5.9 Example: Beta-carotene
6 The gamma family
- 6.1 Derivation of the gamma model
- 6.2 Example: Reciprocal link
- 6.3 Maximum likelihood estimation
- 6.4 Log-gamma models
- 6.5 Identity-gamma models
- 6.6 Using the gamma model for survival analysis
7 The inverse Gaussian family
- 7.1 Derivation of the inverse Gaussian model
- 7.2 The inverse Gaussian algorithm
- 7.3 Maximum likelihood algorithm
- 7.4 Example: The canonical inverse Gaussian
- 7.5 Non-canonical links
8 The power family and link
- 8.1 Power links
- 8.2 Example: Power link
- 8.3 The power family
Part III - Binomial Response Models
9 The binomial-logit family
- 9.1 Derivation of the binomial model
- 9.2 Derivation of the Bernoulli model
- 9.3 The binomial regression algorithm
- 9.4 Example: Logistic regression
- 9.4.1 Model producing logistic coefficients: The heart data
- 9.4.2 Model producing logistic odds ratios
- 9.5 Goodness-of-fit statistics (GOF)
- 9.6 Interpretation of parameter estimates
10 The general binomial family
- 10.1 Non-canonical binomial models
- 10.2 Non-canonical binomial links (binary form)
- 10.3 The probit model
- 10.4 The complementary log-log and log-log models
- 10.5 Other links
- 10.6 Interpretation of coefficients
- 10.6.1 Identity link
- 10.6.2 Logit link
- 10.6.3 Log link
- 10.6.4 Log complement link
- 10.6.5 Summary
- 10.7 Generalized binomial regression
11 The problem of overdispersion
- 11.1 Overdispersion
- 11.2 Scaling of standard errors
- 11.3 Williams' procedure
- 11.4 Robust standard errors
Part IV - Count Response Models
12 The Poisson family
- 12.1 Count response regression models
- 12.2 Derivation of the Poisson algorithm
- 12.3 Poisson regression: Examples
- 12.4 Example: Testing overdispersion in the Poisson model
- 12.5 Using the Poisson model for survival analysis
- 12.6 Using offsets to compare models
- 12.7 Interpretation of coefficients
13 The negative binomial family
- 13.1 Constant overdispersion
- 13.2 Variable overdispersion
- 13.2.1 Derivation in terms of a Poisson–gamma mixture
- 13.2.2 Derivation in terms of the negative binomial probability function
- 13.2.3 The canonical link negative binomial parameterization
- 13.3 The log-negative binomial parameterization
- 13.4 Negative binomial examples
- 13.5 The geometric family
- 13.6 Interpretation of coefficients
14 Other count data models
- 14.1 Count response regression models
- 14.2 Zero-truncated models
- 14.3 Zero-inflated models
- 14.4 Hurdle models
- 14.5 Heterogeneous negative binomial models
- 14.6 Generalized Poisson regression models
- 14.7 Censored count response models
Part V - Multinomial Response Models
15 The ordered-response family
- 15.1 Ordered outcomes for general link
- 15.2 Ordered outcomes for specific links
- 15.2.1 Ordered logit
- 15.2.2 Ordered probit
- 15.2.3 Ordered clog-log
- 15.2.4 Ordered log-log
- 15.2.5 Ordered cauchit
- 15.3 Generalized ordered outcome models
- 15.4 Example: Synthetic data
- 15.5 Example: Automobile data
- 15.6 Partial proportional-odds models
- 15.7 Continuation ratio models
16 Unordered-response family
- 16.1 The multinomial logit model
- 16.1.1 Example: Relation to logistic regression
- 16.1.2 Example: Relation to conditional logistic regression
- 16.1.3 Example: Extensions with conditional logistic regression
- 16.1.4 The independence of irrelevant alternatives
- 16.1.5 Example: Assessing the IIA
- 16.1.6 Interpretation of coefficients
- 16.1.7 Example: Medical admissions - introduction
- 16.1.8 Example: Medical admissions - summary
- 16.2 The multinomial probit models
- 16.2.1 Example: A comparison of the models
- 16.2.2 Example: Comparing probit and multinomial probit
- 16.2.3 Example: Concluding remarks
Part VI - Extensions to the GLM
17 Extending the likelihood
- 17.1 The quasi-likelihood
- 17.2 Example: Wedderburn's leaf blotch data
- 17.3 Generalized additive models
18 Clustered data
- 18.1 Generalization from individual to clustered data
- 18.2 Pooled estimators
- 18.3 Fixed effects
- 18.3.1 Unconditional fixed effects estimators
- 18.3.2 Conditional fixed effects estimators
- 18.4 Random effects
- 18.4.1 Maximum likelihood estimation
- 18.4.2 Gibbs sampling
- 18.5 GEEs
- 18.6 Other models
Part VII - Stata Software
19 Programs for Stata
- 19.1 The glm command
- 19.1.1 Syntax
- 19.1.2 Description
- 19.1.3 Options
- 19.2 The predict command after glm
- 19.2.1 Syntax
- 19.2.2 Options
- 19.3 User-written programs
- 19.3.1 Global macros available for user-written programs
- 19.3.2 User-written variance functions
- 19.3.3 User-written programs for link functions
- 19.3.4 User-written programs for Newey–West weights
- 19.4 Remarks
- 19.4.1 Equivalent commands
- 19.4.2 Special comments on family(Gaussian) models
- 19.4.3 Special comments on family(binomial) models
- 19.4.4 Special comments on family(nbinomial) models
- 19.4.5 Special comment on family(gamma) link(log) models
A Tables
References
Author index
Subject index