List of tables
List of figures
1 Stata basics
- 1.1 Interactive use
- 1.2 Documentation
- 1.2.1 Stata manuals
- 1.2.2 Additional Stata resources
- 1.2.3 The help command
- 1.2.4 The search, findit, and hsearch commands
- 1.3 Command syntax and operators
- 1.3.1 Basic command syntax
- 1.3.2 Example: The summarize command
- 1.3.3 Example: The regress command
- 1.3.4 Abbreviations, case sensitivity, and wildcards
- 1.3.5 Arithmetic, relational, and logical operators
- 1.3.6 Error messages
- 1.4 Do-files and log files
- 1.4.1 Writing a do-file
- 1.4.2 Running do-files
- 1.4.3 Log files
- 1.4.4 A three-step process
- 1.4.5 Comments and long lines
- 1.4.6 Different implementations of Stata
- 1.5 Scalars and matrices
- 1.6 Using results from Stata commands
- 1.6.1 Using results from the r-class command summarize
- 1.6.2 Using results from the e-class command regress
- 1.7 Global and local macros
- 1.8 Looping commands
- 1.9 Some useful commands
- 1.10 Template do-file
- 1.11 User-written commands
- 1.12 Stata resources
- 1.13 Exercises
2 Data management and graphics
- 2.1 Introduction
- 2.2 Types of data
- 2.2.1 Text or ASCII data
- 2.2.2 Internal numeric data
- 2.2.3 String data
- 2.2.4 Formats for displaying numeric data
- 2.3 Inputting data
- 2.3.1 General principles
- 2.3.2 Inputting data already in Stata format
- 2.3.3 Inputting data from the keyboard
- 2.3.4 Inputting nontext data
- 2.3.5 Inputting text data from a spreadsheet
- 2.3.6 Inputting text data in free format
- 2.3.7 Inputting text data in fixed format
- 2.3.8 Dictionary files
- 2.3.9 Common pitfalls
- 2.4 Data management
- 2.4.1 PSID example
- 2.4.2 Naming and labeling variables
- 2.4.3 Viewing data
- 2.4.4 Using original documentation
- 2.4.5 Missing values
- 2.4.6 Imputing missing data
- 2.4.7 Transforming data (generate, replace, egen, recode)
- The generate and replace commands
- The egen command
- The recode command
- The by prefix
- Indicator variables
- Set of indicator variables
- Interactions
- Demeaning
- 2.4.8 Saving data
- 2.4.9 Selecting the sample
- 2.5 Manipulating datasets
- 2.5.1 Ordering observations and variables
- 2.5.2 Preserving and restoring a dataset
- 2.5.3 Wide and long forms for a dataset
- 2.5.4 Merging datasets
- 2.5.5 Appending datasets
- 2.6 Graphical display of data
- 2.6.1 Stata graph commands
- 2.6.2 Box-and-whisker plot
- 2.6.3 Histogram
- 2.6.4 Kernel density plot
- 2.6.5 Twoway scatterplots and fitted lines
- 2.6.6 Lowess, kernel, local linear, and nearest-neighbor regression
- 2.6.7 Multiple scatterplots
- 2.7 Stata resources
- 2.8 Exercises
3 Linear regression basics
- 3.1 Introduction
- 3.2 Data and data summary
- 3.2.1 Data description
- 3.2.2 Variable description
- 3.2.3 Summary statistics
- 3.2.4 More-detailed summary statistics
- 3.2.5 Tables for data
- 3.2.6 Statistical tests
- 3.2.7 Data plots
- 3.3 Regression in levels and logs
- 3.3.1 Basic regression theory
- 3.3.2 OLS regression and matrix algebra
- 3.3.3 Properties of the OLS estimator
- 3.3.4 Heteroskedasticity-robust standard errors
- 3.3.5 Cluster-robust standard errors
- 3.3.6 Regression in logs
- 3.4 Basic regression analysis
- 3.4.1 Correlations
- 3.4.2 The regress command
- 3.4.3 Hypothesis tests
- 3.4.4 Tables of output from several regressions
- 3.4.5 Even better tables of regression output
- 3.4.6 Factor variables for categorical variables and interactions
- 3.5 Specification analysis
- 3.5.1 Specification tests and model diagnostics
- 3.5.2 Residual diagnostic plots
- 3.5.3 Influential observations
- 3.5.4 Specification tests
- Test of omitted variables
- Test of the Box–Cox model
- Test of the functional form of the conditional mean
- Heteroskedasticity test
- Omnibus test
- 3.5.5 Tests have power in more than one direction
- 3.6 Prediction
- 3.6.1 In-sample prediction
- 3.6.2 MEs and elasticities
- 3.6.3 Prediction in logs: The retransformation problem
- 3.6.4 Prediction exercise
- 3.7 Sampling weights
- 3.8 OLS using Mata
- 3.9 Stata resources
- 3.10 Exercises
4 Simulation
- 4.1 Introduction
- 4.2 Pseudorandom-number generators: Introduction
- 4.2.1 Uniform random-number generation
- 4.2.2 Draws from normal
- 4.2.3 Draws from t, chi-squared, F, gamma, and beta
- 4.2.4 Draws from binomial, Poisson, and negative binomial
- Independent (but not identically distributed) draws from binomial
- Independent (but not identically distributed) draws from Poisson
- Histograms and density plots
- 4.3 Distribution of the sample mean
- 4.3.1 Stata program
- 4.3.2 The simulate command
- 4.3.3 Central limit theorem simulation
- 4.3.4 The postfile command
- 4.3.5 Alternative central limit theorem simulation
- 4.4 Pseudorandom-number generators: Further details
- 4.4.1 Inverse-probability transformation
- 4.4.2 Direct transformation
- 4.4.3 Other methods
- 4.4.4 Draws from truncated normal
- 4.4.5 Draws from multivariate normal
- 4.4.6 Draws using Markov chain Monte Carlo method
- 4.5 Computing integrals
- 4.6 Simulation for regression: Introduction
- 4.6.1 Simulation example: OLS with chi-squared errors
- 4.6.2 Interpreting simulation output
- 4.6.3 Variations
- 4.6.4 Estimator inconsistency
- 4.6.5 Simulation with endogenous regressors
- 4.7 Stata resources
- 4.8 Exercises
5 GLS regression
- 5.1 Introduction
- 5.2 GLS and FGLS regression
- 5.2.1 GLS for heteroskedastic errors
- 5.2.2 GLS and FGLS
- 5.2.3 Weighted least squares and robust standard errors
- 5.2.4 Leading examples
- 5.3 Modeling heteroskedastic data
- 5.3.1 Simulated dataset
- 5.3.2 OLS estimation
- 5.3.3 Detecting heteroskedasticity
- 5.3.4 FGLS estimation
- 5.3.5 WLS estimation
- 5.4 System of linear regressions
- 5.4.1 SUR model
- 5.4.2 The sureg command
- 5.4.3 Application to two categories of expenditures
- 5.4.4 Robust standard errors
- 5.4.5 Testing cross-equation constraints
- 5.4.6 Imposing cross-equation constraints
- 5.5 Survey data: Weighting, clustering, and stratification
- 5.6 Stata resources
- 5.7 Exercises
6 Linear instrumental-variables regression
- 6.1 Introduction
- 6.2 IV estimation
- 6.2.1 Basic IV theory
- 6.2.2 Model setup
- 6.2.3 IV estimators: IV, 2SLS, and GMM
- 6.2.4 Instrument validity and relevance
- 6.2.5 Robust standard-error estimates
- 6.3 IV example
- 6.3.1 The ivregress command
- 6.3.2 Medical expenditures with one endogenous regressor
- 6.3.3 Available instruments
- 6.3.4 IV estimation of an exactly identified model
- 6.3.5 IV estimation of an overidentified model
- 6.3.6 Testing for regressor endogeneity
- 6.3.7 Tests of overidentifying restrictions
- 6.3.8 IV estimation with a binary endogenous regressor
- 6.4 Weak instruments
- 6.4.1 Finite-sample properties of IV estimators
- 6.4.2 Weak instruments
- 6.4.3 The estat firststage command
- 6.4.4 Just-identified model
- 6.4.5 Overidentified model
- 6.4.6 More than one endogenous regressor
- 6.4.7 Sensitivity to choice of instruments
- 6.5 Better inference with weak instruments
- 6.5.1 Conditional tests and confidence intervals
- 6.5.2 LIML estimator
- 6.5.3 Jackknife IV estimator
- 6.5.4 Comparison of 2SLS, LIML, JIVE, and GMM
- 6.6 3SLS systems estimation
- 6.7 Stata resources
- 6.8 Exercises
7 Quantile regression
- 7.1 Introduction
- 7.2 QR
- 7.2.1 Conditional quantiles
- 7.2.2 Computation of QR estimates and standard errors
- 7.2.3 The qreg, bsqreg, and sqreg commands
- 7.3 QR for medical expenditures data
- 7.3.1 Data summary
- 7.3.2 QR estimates
- 7.3.3 Interpretation of conditional quantile coefficients
- 7.3.4 Retransformation
- 7.3.5 Comparison of estimates at different quantiles
- 7.3.6 Heteroskedasticity test
- 7.3.7 Hypothesis tests
- 7.3.8 Graphical display of coefficients over quantiles
- 7.4 QR for generated heteroskedastic data
- 7.5 QR for count data
- 7.5.1 Quantile count regression
- 7.5.2 The qcount command
- 7.5.3 Summary of doctor visits data
- 7.5.4 Results from QCR
- 7.6 Stata resources
- 7.7 Exercises
8 Linear panel-data models: Basics
- 8.1 Introduction
- 8.2 Panel-data methods overview
- 8.2.1 Some basic considerations
- 8.2.2 Some basic panel models
- Individual-effects model
- Fixed-effects model
- Random-effects model
- Pooled model or population-averaged model
- Two-way-effects model
- Mixed linear models
- 8.2.3 Cluster-robust inference
- 8.2.4 The xtreg command
- 8.2.5 Stata linear panel-data commands
- 8.3 Panel-data summary
- 8.3.1 Data description and summary statistics
- 8.3.2 Panel-data organization
- 8.3.3 Panel-data description
- 8.3.4 Within and between variation
- 8.3.5 Time-series plots for each individual
- 8.3.6 Overall scatterplot
- 8.3.7 Within scatterplot
- 8.3.8 Pooled OLS regression with cluster-robust standard errors
- 8.3.9 Time-series autocorrelations for panel data
- 8.3.10 Error correlation in the RE model
- 8.4 Pooled or population-averaged estimators
- 8.4.1 Pooled OLS estimator
- 8.4.2 Pooled FGLS estimator or population-averaged estimator
- 8.4.3 The xtreg, pa command
- 8.4.4 Application of the xtreg, pa command
- 8.5 Within estimator
- 8.5.1 Within estimator
- 8.5.2 The xtreg, fe command
- 8.5.3 Application of the xtreg, fe command
- 8.5.4 Least-squares dummy-variables regression
- 8.6 Between estimator
- 8.6.1 Between estimator
- 8.6.2 Application of the xtreg, be command
- 8.7 RE estimator
- 8.7.1 RE estimator
- 8.7.2 The xtreg, re command
- 8.7.3 Application of the xtreg, re command
- 8.8 Comparison of estimators
- 8.8.1 Estimates of variance components
- 8.8.2 Within and between R-squared
- 8.8.3 Estimator comparison
- 8.8.4 Fixed effects versus random effects
- 8.8.5 Hausman test for fixed effects
- 8.8.6 Prediction
- 8.9 First-difference estimator
- 8.10 Long panels
- 8.10.1 Long-panel dataset
- 8.10.2 Pooled OLS and PFGLS
- 8.10.3 The xtpcse and xtgls commands
- 8.10.4 Application of the xtgls, xtpcse, and xtscc commands
- 8.10.5 Separate regressions
- 8.10.6 FE and RE models
- 8.10.7 Unit roots and cointegration
- 8.11 Panel-data management
- 8.11.1 Wide-form data
- 8.11.2 Convert wide form to long form
- 8.11.3 Convert long form to wide form
- 8.11.4 An alternative to wide-form data
- 8.12 Stata resources
- 8.13 Exercises
9 Linear panel-data models: Extensions
- 9.1 Introduction
- 9.2 Panel IV estimation
- 9.2.1 Panel IV
- 9.2.2 The xtivreg command
- 9.2.3 Application of the xtivreg command
- 9.2.4 Panel IV extensions
- 9.3 Hausman–Taylor estimator
- 9.3.1 Hausman–Taylor estimator
- 9.3.2 The xthtaylor command
- 9.3.3 Application of the xthtaylor command
- 9.4 Arellano–Bond estimator
- 9.4.1 Dynamic model
- 9.4.2 IV estimation in the FD model
- 9.4.3 The xtabond command
- 9.4.4 Arellano–Bond estimator: Pure time series
- 9.4.5 Arellano–Bond estimator: Additional regressors
- 9.4.6 Specification tests
- 9.4.7 The xtdpdsys command
- 9.4.8 The xtdpd command
- 9.5 Mixed linear models
- 9.5.1 Mixed linear model
- 9.5.2 The xtmixed command
- 9.5.3 Random-intercept model
- 9.5.4 Cluster-robust standard errors
- 9.5.5 Random-slopes model
- 9.5.6 Random-coefficients model
- 9.5.7 Two-way random-effects model
- 9.6 Clustered data
- 9.6.1 Clustered dataset
- 9.6.2 Clustered data using nonpanel commands
- 9.6.3 Clustered data using panel commands
- 9.6.4 Hierarchical linear models
- 9.7 Stata resources
- 9.8 Exercises
10 Nonlinear regression methods
- 10.1 Introduction
- 10.2 Nonlinear example: Doctor visits
- 10.3 Nonlinear regression methods
- 10.3.1 MLE
- 10.3.2 The poisson command
- 10.3.3 Postestimation commands
- 10.3.4 NLS
- 10.3.5 The nl command
- 10.3.6 GLM
- 10.3.7 The glm command
- 10.3.8 The gmm command
- 10.3.9 Other estimators
- 10.4 Different estimates of the VCE
- 10.4.1 General framework
- 10.4.2 The vce() option
- 10.4.3 Application of the vce() option
- 10.4.4 Default estimate of the VCE
- 10.4.5 Robust estimate of the VCE
- 10.4.6 Cluster-robust estimate of the VCE
- 10.4.7 Heteroskedasticity- and autocorrelation-consistent estimate of the VCE
- 10.4.8 Bootstrap standard errors
- 10.4.9 Statistical inference
- 10.5 Prediction
- 10.5.1 The predict and predictnl commands
- 10.5.2 Application of predict and predictnl
- 10.5.3 Out-of-sample prediction
- 10.5.4 Prediction at a specified value of one of the regressors
- 10.5.5 Prediction at a specified value of all the regressors
- 10.5.6 Prediction of other quantities
- 10.5.7 The margins command for prediction
- 10.6 Marginal effects
- 10.6.1 Calculus and finite-difference methods
- 10.6.2 ME estimates: AME, MEM, and MER
- 10.6.3 Elasticities and semielasticities
- 10.6.4 Simple interpretations of coefficients in single-index models
- 10.6.5 The margins command for marginal effects
- 10.6.6 MEM: Marginal effect at mean
- 10.6.7 MER: Marginal effect at representative value
- 10.6.8 AME: Average marginal effect
- 10.6.9 Elasticities and semielasticities
- 10.6.10 AME computed manually
- 10.6.11 Polynomial regressors
- 10.6.12 Interacted regressors
- 10.6.13 Complex interactions and nonlinearities
- 10.7 Model diagnostics
- 10.7.1 Goodness-of-fit measures
- 10.7.2 Information criteria for model comparison
- 10.7.3 Residuals
- 10.7.4 Model-specification tests
- 10.8 Stata resources
- 10.9 Exercises
11 Nonlinear optimization methods
- 11.1 Introduction
- 11.2 Newton–Raphson method
- 11.2.1 NR method
- 11.2.2 NR method for Poisson
- 11.2.3 Poisson NR example using Mata
- 11.3 Gradient methods
- 11.3.1 Maximization options
- 11.3.2 Gradient methods
- 11.3.3 Messages during iterations
- 11.3.4 Stopping criteria
- 11.3.5 Multiple maximums
- 11.3.6 Numerical derivatives
- 11.4 The ml command: lf method
- 11.4.1 The ml command
- 11.4.2 The lf method
- 11.4.3 Poisson example: Single-index model
- 11.4.4 Negative binomial example: Two-index model
- 11.4.5 NLS example: Nonlikelihood model
- 11.5 Checking the program
- 11.5.1 Program debugging using ml check and ml trace
- 11.5.2 Getting the program to run
- 11.5.3 Checking the data
- 11.5.4 Multicollinearity and near collinearity
- 11.5.5 Multiple optimums
- 11.5.6 Checking parameter estimation
- 11.5.7 Checking standard-error estimation
- 11.6 The ml command: d0, d1, d2, lf0, lf1, and lf2 methods
- 11.6.1 Evaluator functions
- 11.6.2 The d0 method
- 11.6.3 The d1 method
- 11.6.4 The lf1 method with the robust estimate of the VCE
- 11.6.5 The d2 and lf2 methods
- 11.7 The Mata optimize() function
- 11.7.1 Type d and gf evaluators
- 11.7.2 Optimize functions
- 11.7.3 Poisson example
- 11.8 Generalized method of moments
- 11.9 Stata resources
- 11.10 Exercises
12 Testing methods
- 12.1 Introduction
- 12.2 Critical values and p-values
- 12.2.1 Standard normal compared with Student's t
- 12.2.2 Chi-squared compared with F
- 12.2.3 Plotting densities
- 12.2.4 Computing p-values and critical values
- 12.2.5 Which distributions does Stata use?
- 12.3 Wald tests and confidence intervals
- 12.3.1 Wald test of linear hypotheses
- 12.3.2 The test command
- Test single coefficient
- Test several hypotheses
- Test of overall significance
- Test calculated from retrieved coefficients and VCE
- 12.3.3 One-sided Wald tests
- 12.3.4 Wald test of nonlinear hypotheses (delta method)
- 12.3.5 The testnl command
- 12.3.6 Wald confidence intervals
- 12.3.7 The lincom command
- 12.3.8 The nlcom command (delta method)
- 12.3.9 Asymmetric confidence intervals
- 12.4 Likelihood-ratio tests
- 12.5 Lagrange multiplier test (or score test)
- 12.6 Test size and power
- 12.6.1 Simulation DGP: OLS with chi-squared errors
- 12.6.2 Test size
- 12.6.3 Test power
- 12.6.4 Asymptotic test power
- 12.7 Specification tests
- 12.7.1 Moment-based tests
- 12.7.2 Information matrix test
- 12.7.3 Chi-squared goodness-of-fit test
- 12.7.4 Overidentifying restrictions test
- 12.7.5 Hausman test
- 12.7.6 Other tests
- 12.8 Stata resources
- 12.9 Exercises
13 Bootstrap methods
- 13.1 Introduction
- 13.2 Bootstrap methods
- 13.2.1 Bootstrap estimate of standard error
- 13.2.2 Bootstrap methods
- 13.2.3 Asymptotic refinement
- 13.2.4 Use the bootstrap with caution
- 13.3 Bootstrap pairs using the vce(bootstrap) option
- 13.3.1 Bootstrap-pairs method to estimate VCE
- 13.3.2 The vce(bootstrap) option
- 13.3.3 Bootstrap standard-errors example
- 13.3.4 How many bootstraps?
- 13.3.5 Clustered bootstraps
- 13.3.6 Bootstrap confidence intervals
- 13.3.7 The postestimation estat bootstrap command
- 13.3.8 Bootstrap confidence-intervals example
- 13.3.9 Bootstrap estimate of bias
- 13.4 Bootstrap pairs using the bootstrap command
- 13.4.1 The bootstrap command
- 13.4.2 Bootstrap parameter estimate from a Stata estimation command
- 13.4.3 Bootstrap standard error from a Stata estimation command
- 13.4.4 Bootstrap standard error from a user-written estimation command
- 13.4.5 Bootstrap two-step estimator
- 13.4.6 Bootstrap Hausman test
- 13.4.7 Bootstrap standard error of the coefficient of variation
- 13.5 Bootstraps with asymptotic refinement
- 13.5.1 Percentile-t method
- 13.5.2 Percentile-t Wald test
- 13.5.3 Percentile-t Wald confidence interval
- 13.6 Bootstrap pairs using bsample and simulate
- 13.6.1 The bsample command
- 13.6.2 The bsample command with simulate
- 13.6.3 Bootstrap Monte Carlo exercise
- 13.7 Alternative resampling schemes
- 13.7.1 Bootstrap pairs
- 13.7.2 Parametric bootstrap
- 13.7.3 Residual bootstrap
- 13.7.4 Wild bootstrap
- 13.7.5 Subsampling
- 13.8 The jackknife
- 13.9 Stata resources
- 13.10 Exercises
14 Binary outcome models
- 14.1 Introduction
- 14.2 Some parametric models
- 14.2.1 Basic model
- 14.2.2 Logit, probit, linear probability, and clog-log models
- 14.3 Estimation
- 14.3.1 Latent-variable interpretation and identification
- 14.3.2 ML estimation
- 14.3.3 The logit and probit commands
- 14.3.4 Robust estimate of the VCE
- 14.3.5 OLS estimation of LPM
- 14.4 Example
- 14.4.1 Data description
- 14.4.2 Logit regression
- 14.4.3 Comparison of binary models and parameter estimates
- 14.5 Hypothesis and specification tests
- 14.5.1 Wald tests
- 14.5.2 Likelihood-ratio tests
- 14.5.3 Additional model-specification tests
- 14.5.4 Model comparison
- 14.6 Goodness of fit and prediction
- 14.6.1 Pseudo-R² measure
- 14.6.2 Comparing predicted probabilities with sample frequencies
- 14.6.3 Comparing predicted outcomes with actual outcomes
- 14.6.4 The predict command for fitted probabilities
- 14.6.5 The prvalue command for fitted probabilities
- 14.7 Marginal effects
- 14.7.1 Marginal effect at a representative value (MER)
- 14.7.2 Marginal effect at the mean (MEM)
- 14.7.3 Average marginal effect (AME)
- 14.7.4 The prchange command
- 14.8 Endogenous regressors
- 14.8.1 Example
- 14.8.2 Model assumptions
- 14.8.3 Structural-model approach
- 14.8.4 IVs approach
- 14.9 Grouped data
- 14.10 Stata resources
- 14.11 Exercises
15 Multinomial models
- 15.1 Introduction
- 15.2 Multinomial models overview
- 15.2.1 Probabilities and MEs
- 15.2.2 Maximum likelihood estimation
- 15.2.3 Case-specific and alternative-specific regressors
- 15.2.4 Additive random-utility model
- 15.2.5 Stata multinomial model commands
- 15.3 Multinomial example: Choice of fishing mode
- 15.4 Multinomial logit model
- 15.4.1 The mlogit command
- 15.4.2 Application of the mlogit command
- 15.4.3 Coefficient interpretation
- 15.4.4 Predicted probabilities
- 15.4.5 MEs
- 15.5 Conditional logit model
- 15.5.1 Creating long-form data from wide-form data
- 15.5.2 The asclogit command
- 15.5.3 The clogit command
- 15.5.4 Application of the asclogit command
- 15.5.5 Relationship to multinomial logit model
- 15.5.6 Coefficient interpretation
- 15.5.7 Predicted probabilities
- 15.5.8 MEs
- 15.6 Nested logit model
- 15.6.1 Relaxing the independence of irrelevant alternatives assumption
- 15.6.2 NL model
- 15.6.3 The nlogit command
- 15.6.4 Model estimates
- 15.6.5 Predicted probabilities
- 15.6.6 MEs
- 15.6.7 Comparison of logit models
- 15.7 Multinomial probit model
- 15.7.1 MNP
- 15.7.2 The mprobit command
- 15.7.3 Maximum simulated likelihood
- 15.7.4 The asmprobit command
- 15.7.5 Application of the asmprobit command
- 15.7.6 Predicted probabilities and MEs
- 15.8 Random-parameters logit
- 15.8.1 Random-parameters logit
- 15.8.2 The mixlogit command
- 15.8.3 Data preparation for mixlogit
- 15.8.4 Application of the mixlogit command
- 15.9 Ordered outcome models
- 15.9.1 Data summary
- 15.9.2 Ordered outcomes
- 15.9.3 Application of the ologit command
- 15.9.4 Predicted probabilities
- 15.9.5 MEs
- 15.9.6 Other ordered models
- 15.10 Multivariate outcomes
- 15.11 Stata resources
- 15.12 Exercises
16 Tobit and selection models
- 16.1 Introduction
- 16.2 Tobit model
- 16.2.1 Regression with censored data
- 16.2.2 Tobit model setup
- 16.2.3 Unknown censoring point
- 16.2.4 Tobit estimation
- 16.2.5 ML estimation in Stata
- 16.3 Tobit model example
- 16.3.1 Data summary
- 16.3.2 Tobit analysis
- 16.3.3 Prediction after tobit
- 16.3.4 Marginal effects
- Left-truncated, left-censored, and right-truncated examples
- Left-censored case computed directly
- Marginal impact on probabilities
- 16.3.5 The ivtobit command
- 16.3.6 Additional commands for censored regression
- 16.4 Tobit for lognormal data
- 16.4.1 Data example
- 16.4.2 Setting the censoring point for data in logs
- 16.4.3 Results
- 16.4.4 Two-limit tobit
- 16.4.5 Model diagnostics
- 16.4.6 Tests of normality and homoskedasticity
- 16.4.7 Next step?
- 16.5 Two-part model in logs
- 16.6 Selection model
- 16.6.1 Model structure and assumptions
- 16.6.2 ML estimation of the sample-selection model
- 16.6.3 Estimation without exclusion restrictions
- 16.6.4 Two-step estimation
- 16.6.5 Estimation with exclusion restrictions
- 16.7 Prediction from models with outcome in logs
- 16.7.1 Predictions from tobit
- 16.7.2 Predictions from two-part model
- 16.7.3 Predictions from selection model
- 16.8 Stata resources
- 16.9 Exercises
17 Count-data models
- 17.1 Introduction
- 17.2 Features of count data
- 17.2.1 Generated Poisson data
- 17.2.2 Overdispersion and negative binomial data
- 17.2.3 Modeling strategies
- 17.2.4 Estimation methods
- 17.3 Empirical example 1
- 17.3.1 Data summary
- 17.3.2 Poisson model
- Poisson model results
- Robust estimate of VCE for Poisson MLE
- Test of overdispersion
- Coefficient interpretation and marginal effects
- 17.3.3 NB2 model
- NB2 model results
- Fitted probabilities for Poisson and NB2 models
- The countfit command
- The prvalue command
- Discussion
- Generalized NB model
- 17.3.4 Nonlinear least-squares estimation
- 17.3.5 Hurdle model
- 17.3.6 Finite-mixture models
- FMM specification
- Simulated FMM sample with comparisons
- ML estimation of the FMM
- The fmm command
- Application: Poisson finite-mixture model
- Interpretation
- Comparing marginal effects
- Application: NB finite-mixture model
- Model selection
- Cautionary note
- 17.4 Empirical example 2
- 17.4.1 Zero-inflated data
- 17.4.2 Models for zero-inflated data
- 17.4.3 Results for the NB2 model
- 17.4.4 Results for ZINB
- 17.4.5 Model comparison
- 17.5 Models with endogenous regressors
- 17.5.1 Structural-model approach
- 17.5.2 Nonlinear IV method
- 17.6 Stata resources
- 17.7 Exercises
18 Nonlinear panel models
- 18.1 Introduction
- 18.2 Nonlinear panel-data overview
- 18.2.1 Some basic nonlinear panel models
- 18.2.2 Dynamic models
- 18.2.3 Stata nonlinear panel commands
- 18.3 Nonlinear panel-data example
- 18.3.1 Data description and summary statistics
- 18.3.2 Panel-data organization
- 18.3.3 Within and between variation
- 18.3.4 FE or RE model for these data?
- 18.4 Binary outcome models
- 18.4.1 Panel summary of the dependent variable
- 18.4.2 Pooled logit estimator
- 18.4.3 The xtlogit command
- 18.4.4 The xtgee command
- 18.4.5 PA logit estimator
- 18.4.6 RE logit estimator
- 18.4.7 FE logit estimator
- 18.4.8 Panel logit estimator comparison
- 18.4.9 Prediction and marginal effects
- 18.4.10 Mixed-effects logit estimator
- 18.5 Tobit model
- 18.5.1 Panel summary of the dependent variable
- 18.5.2 RE tobit model
- 18.5.3 Generalized tobit models
- 18.5.4 Parametric nonlinear panel models
- 18.6 Count-data models
- 18.6.1 The xtpoisson command
- 18.6.2 Panel summary of the dependent variable
- 18.6.3 Pooled Poisson estimator
- 18.6.4 PA Poisson estimator
- 18.6.5 RE Poisson estimators
- 18.6.6 FE Poisson estimator
- 18.6.7 Panel Poisson estimators comparison
- 18.6.8 Negative binomial estimators
- 18.7 Stata resources
- 18.8 Exercises
A Programming in Stata
- A.1 Stata matrix commands
- A.1.1 Stata matrix overview
- A.1.2 Stata matrix input and output
- A.1.3 Stata matrix subscripts and combining matrices
- A.1.4 Matrix operators
- A.1.5 Matrix functions
- A.1.6 Matrix accumulation commands
- A.1.7 OLS using Stata matrix commands
- A.2 Programs
- A.2.1 Simple programs (no arguments or access to results)
- A.2.2 Modifying a program
- A.2.3 Programs with positional arguments
- A.2.4 Temporary variables
- A.2.5 Programs with named positional arguments
- A.2.6 Storing and retrieving program results
- A.2.7 Programs with arguments using standard Stata syntax
- A.2.8 Ado-files
- A.3 Program debugging
B Mata
- B.1 How to run Mata
- B.1.1 Mata commands in Mata
- B.1.2 Mata commands in Stata
- B.1.3 Stata commands in Mata
- B.1.4 Interactive versus batch use
- B.1.5 Mata help
- B.2 Mata matrix commands
- B.2.1 Mata matrix input
- Matrix input by hand
- Identity matrices, unit vectors, and matrices of constants
- Matrix input from Stata data
- Matrix input from Stata matrix
- Stata interface functions
- B.2.2 Mata matrix operators
- B.2.3 Mata functions
- B.2.4 Mata cross products
- B.2.5 Mata matrix subscripts and combining matrices
- B.2.6 Transferring Mata data and matrices to Stata
- B.3 Programming in Mata
- B.3.1 Declarations
- B.3.2 Mata program
- B.3.3 Mata program with results output to Stata
- B.3.4 Stata program that calls a Mata program
- B.3.5 Using Mata in ado-files
Glossary of abbreviations
References