
Multilevel and Longitudinal Modeling Using Stata
Sophia Rabe-Hesketh and Anders Skrondal
Table of Contents
List of Tables
List of Figures
Preface
Part I - Preliminaries
1 Review of linear regression
- 1.1 Introduction
- 1.2 Is there gender discrimination in faculty salaries?
- 1.3 Independent-samples t test
- 1.4 One-way analysis of variance
- 1.5 Simple linear regression
- 1.6 Dummy variables
- 1.7 Multiple linear regression
- 1.8 Interactions
- 1.9 Dummies for more than two groups
- 1.10 Other types of interactions
- 1.10.1 Interaction between dummy variables
- 1.10.2 Interaction between continuous covariates
- 1.11 Nonlinear effects
- 1.12 Residual diagnostics
- 1.13 Summary and further reading
- 1.14 Exercises
Part II - Two-level linear models
2 Variance-components models
- 2.1 Introduction
- 2.2 How reliable are peak-expiratory-flow measurements?
- 2.3 The variance-components model
- 2.3.1 Model specification and path diagram
- 2.3.2 Error components, variance components, and reliability
- 2.3.3 Intraclass correlation
- 2.4 Fixed versus random effects
- 2.5 Estimation using Stata
- 2.5.1 Data preparation
- 2.5.2 Using xtreg
- 2.5.3 Using xtmixed
- 2.5.4 Using gllamm
- 2.6 Hypothesis tests and confidence intervals
- 2.6.1 Hypothesis test and confidence interval for the population mean
- 2.6.2 Hypothesis test and confidence interval for the between-cluster variance
- 2.7 More on statistical inference
- 2.7.1 Different estimation models
- 2.7.2 Inference for β
Estimate and standard error: Balanced case
Estimate: Unbalanced case
- 2.8 Crossed versus nested effects
- 2.9 Assigning values to the random intercepts
- 2.9.1 Maximum likelihood estimation
Implementation via OLS regression
Implementation via the mean total residual
- 2.9.2 Empirical Bayes prediction
- 2.9.3 Empirical Bayes variances
- 2.10 Summary and further reading
- 2.11 Exercises
3 Random-intercept models with covariates
- 3.1 Introduction
- 3.2 Does smoking during pregnancy affect birthweight?
- 3.3 The linear random-intercept model with covariates
- 3.3.1 Model specification
- 3.3.2 Residual variance and intraclass correlation
- 3.4 Estimation using Stata
- 3.4.1 Using xtreg
- 3.4.2 Using xtmixed
- 3.4.3 Using gllamm
- 3.5 Coefficients of determination or variance explained
- 3.6 Hypothesis tests and confidence intervals
- 3.6.1 Hypothesis tests for regression coefficients
Hypothesis tests for individual regression coefficients
Joint hypothesis tests for several regression coefficients
- 3.6.2 Predicted means and confidence intervals
- 3.6.3 Hypothesis test for between-cluster variance
- 3.7 Between and within effects
- 3.7.1 Between-mother effects
- 3.7.2 Within-mother effects
- 3.7.3 Relations among estimators
- 3.7.4 Endogeneity and different within- and between-mother effects
- 3.7.5 Hausman endogeneity test
- 3.8 Fixed versus random effects revisited
- 3.9 Residual diagnostics
- 3.10 More on statistical inference for regression coefficients
- 3.10.1 Consequences of using ordinary regression for clustered data
- 3.10.2 Power and sample-size determination
- 3.11 Summary and further reading
- 3.12 Exercises
4 Random-coefficient models
- 4.1 Introduction
- 4.2 How effective are different schools?
- 4.3 Separate linear regressions for each school
- 4.4 Specification and interpretation of a random-coefficient model
- 4.4.1 Specification of random-coefficient model
- 4.4.2 Interpretation of the random-effects variances and covariances
- 4.5 Estimation using Stata
- 4.5.1 Using xtmixed
Random-intercept model
Random-coefficient model
- 4.5.2 Using gllamm
Random-intercept model
Random-coefficient model
- 4.6 Testing the slope variance
- 4.7 Interpretation of estimates
- 4.8 Assigning values to the random intercepts and slopes
- 4.8.1 Maximum likelihood estimation
- 4.8.2 Empirical Bayes prediction
- 4.8.3 Model visualization
- 4.8.4 Residual diagnostics
- 4.8.5 Inferences for individual schools
- 4.9 Two-stage model formulation
- 4.10 Some warnings about random-coefficient models
- 4.11 Summary and further reading
- 4.12 Exercises
5 Longitudinal, panel, and growth-curve models
- 5.1 Introduction
- 5.2 How and why do wages change over time?
- 5.3 Data structure
- 5.3.1 Missing data
- 5.3.2 Time-varying and time-constant variables
- 5.4 Time scales in longitudinal data
- 5.5 Random- and fixed-effects approaches
- 5.5.1 Correlated residuals
- 5.5.2 Fixed-intercept model
Using xtreg
Using anova
- 5.5.3 Random-intercept model
- 5.5.4 Random-coefficient model
- 5.5.5 Marginal mean and covariance structure induced by random effects
Marginal mean and covariance structure for random-intercept models
Marginal mean and covariance structure for random-coefficient models
- 5.6 Marginal modeling
- 5.6.1 Covariance structures
Compound symmetric or exchangeable structure
Random-coefficient structure
Autoregressive residual structure
Unstructured covariance matrix
- 5.6.2 Marginal modeling using Stata
- 5.7 Autoregressive- or lagged-response models
- 5.8 Hybrid approaches
- 5.8.1 Autoregressive response and random effects
- 5.8.2 Autoregressive responses and autoregressive residuals
- 5.8.3 Autoregressive residuals and random or fixed effects
- 5.9 Missing data
- 5.9.1 Maximum likelihood estimation under MAR: A simulation
- 5.10 How do children grow?
- 5.10.1 Observed growth trajectories
- 5.11 Growth-curve modeling
- 5.11.1 Random-intercept model
- 5.11.2 Random-coefficient model
- 5.11.3 Two-stage model formulation
- 5.12 Prediction of trajectories for individual children
- 5.13 Prediction of mean growth trajectory and 95% band
- 5.14 Complex level-1 variation or heteroskedasticity
- 5.15 Summary and further reading
- 5.16 Exercises
Part III - Two-level generalized linear models
6 Dichotomous or binary responses
- 6.1 Introduction
- 6.2 Single-level models for dichotomous responses
- 6.2.1 Generalized linear model formulation
- 6.2.2 Latent-response formulation
Logistic regression
Probit regression
- 6.3 Which treatment is best for toenail infection?
- 6.4 Longitudinal data structure
- 6.5 Population-averaged or marginal probabilities
- 6.6 Random-intercept logistic regression
- 6.7 Estimation of logistic random-intercept models
- 6.7.1 Using xtlogit
- 6.7.2 Using xtmelogit
- 6.7.3 Using gllamm
- 6.8 Inference for logistic random-intercept models
- 6.9 Subject-specific vs. population-averaged relationships
- 6.10 Measures of dependence and heterogeneity
- 6.10.1 Conditional or residual intraclass correlation of the latent responses
- 6.10.2 Median odds ratio
- 6.11 Maximum likelihood estimation
- 6.11.1 Adaptive quadrature
- 6.11.2 Some speed considerations
Advice for speeding up gllamm
- 6.12 Assigning values to random effects
- 6.12.1 Maximum likelihood estimation
- 6.12.2 Empirical Bayes prediction
- 6.12.3 Empirical Bayes modal prediction
- 6.13 Different kinds of predicted probabilities
- 6.13.1 Predicted population-averaged probabilities
- 6.13.2 Predicted subject-specific probabilities
Predictions for hypothetical subjects
Predictions for the subjects in the sample
- 6.14 Other approaches to clustered dichotomous data
- 6.14.1 Conditional logistic regression
- 6.14.2 Generalized estimating equations (GEE)
- 6.15 Summary and further reading
7 Ordinal responses
- 7.1 Introduction
- 7.2 Single-level cumulative models for ordinal responses
- 7.2.1 Generalized linear model formulation
- 7.2.2 Latent-response formulation
- 7.2.3 Proportional odds
- 7.2.4 Identification
- 7.3 Are antipsychotic drugs effective for patients with schizophrenia?
- 7.4 Longitudinal data structure and graphs
- 7.4.1 Longitudinal data structure
- 7.4.2 Plotting cumulative proportions
- 7.4.3 Plotting estimated cumulative logits and transforming the time scale
- 7.5 A single-level proportional odds model
- 7.5.1 Model specification
- 7.5.2 Estimation using Stata
- 7.6 A random-intercept proportional odds model
- 7.6.1 Model specification
- 7.6.2 Estimation using Stata
- 7.7 A random-coefficient proportional odds model
- 7.7.1 Model specification
- 7.7.2 Estimation using gllamm
- 7.8 Different kinds of predicted probabilities
- 7.8.1 Predicted population-averaged probabilities
- 7.8.2 Predicted patient-specific probabilities
- 7.9 Do experts differ in the grading of student essays?
- 7.10 A random-intercept probit model with grader bias
- 7.10.1 Model specification
- 7.10.2 Estimation
- 7.11 Including grader-specific measurement error variances
- 7.11.1 Model specification
- 7.11.2 Estimation
- 7.12 Including grader-specific thresholds
- 7.12.1 Model specification
- 7.12.2 Estimation
- 7.13 Summary and further reading
- 7.14 Exercises
8 Discrete-time survival
- 8.1 Introduction
- 8.1.1 Censoring and truncation
- 8.1.2 Time-varying covariates and different time-scales
- 8.1.3 Discrete- versus continuous-time survival data
- 8.2 Single-level models for discrete-time survival data
- 8.2.1 Discrete-time hazard and discrete-time survival
- 8.2.2 Data expansion for discrete-time survival analysis
- 8.2.3 Estimation via regression models for dichotomous responses
- 8.2.4 Including covariates
Time-constant covariates
Time-varying covariates
- 8.2.5 Handling left-truncated data
- 8.3 How does birth history affect child mortality?
- 8.4 Data expansion
- 8.5 Proportional hazards and interval censoring
- 8.6 Complementary log-log models
- 8.7 A random-intercept complementary log-log model
- 8.7.1 Model specification
- 8.7.2 Estimation using Stata
- 8.8 Marginal and conditional survival probabilities
- 8.9 Summary and further reading
- 8.10 Exercises
9 Counts
- 9.1 Introduction
- 9.2 What are counts?
- 9.2.1 Counts versus proportions
- 9.2.2 Counts as aggregated event-history data
- 9.3 Single-level Poisson models for counts
- 9.4 Did the German health-care reform reduce the number of doctor visits?
- 9.5 Longitudinal data structure
- 9.6 Single-level Poisson regression
- 9.6.1 Model specification
- 9.6.2 Estimation using Stata
- 9.7 Random-intercept Poisson regression
- 9.7.1 Model specification
- 9.7.2 Estimation using Stata
Using xtpoisson
Using xtmepoisson
Using gllamm
- 9.8 Random-coefficient Poisson regression
- 9.8.1 Model specification
- 9.8.2 Estimation using Stata
Using xtmepoisson
Using gllamm
- 9.8.3 Interpretation of estimates
- 9.9 Overdispersion in single-level models
- 9.9.1 Normally distributed random intercept
- 9.9.2 Negative binomial models
Mean dispersion or NB2
Constant dispersion or NB1
- 9.9.3 Quasilikelihood or robust standard errors
- 9.10 Level-1 overdispersion in two-level models
- 9.11 Other approaches to two-level count data
- 9.11.1 Conditional Poisson regression
- 9.11.2 Conditional negative binomial regression
- 9.11.3 Generalized estimating equations
- 9.11.4 Marginal and conditional estimates when responses are MAR
Simulation
- 9.12 How does birth history affect child mortality?
- 9.12.1 Simple piecewise exponential survival model
- 9.12.2 Piecewise exponential survival model with covariates and frailty
- 9.13 Which Scottish counties have a high risk of lip cancer?
- 9.14 Standardized mortality ratios
- 9.15 Random-intercept Poisson regression
- 9.15.1 Model specification
- 9.15.2 Estimation using gllamm
- 9.15.3 Prediction of standardized mortality ratios
- 9.16 Nonparametric maximum likelihood estimation
- 9.16.1 Specification
- 9.16.2 Estimation using gllamm
- 9.16.3 Prediction
- 9.17 Summary and further reading
- 9.18 Exercises
Part IV - Models with nested and crossed random effects
10 Higher-level models with nested random effects
- 10.1 Introduction
- 10.2 Do peak-expiratory-flow measurements vary between methods?
- 10.3 Two-level variance-components models
- 10.3.1 Model specification
- 10.3.2 Estimation using xtmixed
- 10.4 Three-level variance-components models
- 10.4.1 Model specification
- 10.4.2 Different types of intraclass correlation
- 10.4.3 Three-stage formulation
- 10.4.4 Estimation using xtmixed
- 10.4.5 Empirical Bayes prediction using xtmixed
- 10.5 Did the Guatemalan immunization campaign work?
- 10.6 A three-level logistic random-intercept model
- 10.6.1 Model specification
- 10.6.2 Different types of intraclass correlations for the latent responses
- 10.6.3 Different kinds of median odds ratios
- 10.6.4 Three-stage formulation
- 10.7 Estimation of three-level logistic random-intercept models using Stata
- 10.7.1 Using gllamm
- 10.7.2 Using xtmelogit
- 10.8 A three-level logistic random-coefficient model
- 10.9 Estimation of three-level logistic random-coefficient models using Stata
- 10.9.1 Using gllamm
- 10.9.2 Using xtmelogit
- 10.10 Prediction of random effects
- 10.10.1 Empirical Bayes prediction
- 10.10.2 Empirical Bayes modal prediction
- 10.11 Different kinds of predicted probabilities
- 10.11.1 Predicted marginal probabilities
- 10.11.2 Predicted median or conditional probabilities
- 10.11.3 Predicted posterior mean probabilities
- 10.12 Summary and further reading
- 10.13 Exercises
11 Crossed random effects
- 11.1 Introduction
- 11.2 How does investment depend on expected profit and capital stock?
- 11.3 A two-way error-components model
- 11.3.1 Model specification
- 11.3.2 Residual intraclass correlations
- 11.3.3 Estimation
- 11.3.4 Prediction
- 11.4 How much do primary and secondary schools affect attainment at age 16?
- 11.5 An additive crossed random-effects model
- 11.5.1 Specification
- 11.5.2 Estimation using xtmixed
- 11.6 Including a random interaction
- 11.6.1 Model specification
- 11.6.2 Intraclass correlations
- 11.6.3 Estimation using xtmixed
- 11.6.4 Some diagnostics
- 11.7 A trick requiring fewer random effects
- 11.8 Do salamanders from different populations mate successfully?
- 11.9 Crossed random-effects logistic regression
- 11.10 Summary and further reading
- 11.11 Exercises
A Syntax for gllamm, eq, and gllapred: The bare essentials
B Syntax for gllamm
C Syntax for gllapred
D Syntax for gllasim
References
Author index
Subject index