Ritme Informatique
specialist in scientific software

Microeconometrics Using Stata, Revised Edition

A. Colin Cameron and Pravin K. Trivedi

Table of Contents

List of Tables
List of Figures

1 Stata basics

  • 1.1 Interactive use
  • 1.2 Documentation
    • 1.2.1 Stata manuals
    • 1.2.2 Additional Stata resources
    • 1.2.3 The help command
    • 1.2.4 The search, findit, and hsearch commands
  • 1.3 Command syntax and operators
    • 1.3.1 Basic command syntax
    • 1.3.2 Example: The summarize command
    • 1.3.3 Example: The regress command
    • 1.3.4 Abbreviations, case sensitivity, and wildcards
    • 1.3.5 Arithmetic, relational, and logical operators
    • 1.3.6 Error messages
  • 1.4 Do-files and log files
    • 1.4.1 Writing a do-file
    • 1.4.2 Running do-files
    • 1.4.3 Log files
    • 1.4.4 A three-step process
    • 1.4.5 Comments and long lines
    • 1.4.6 Different implementations of Stata
  • 1.5 Scalars and matrices
    • 1.5.1 Scalars
    • 1.5.2 Matrices
  • 1.6 Using results from Stata commands
    • 1.6.1 Using results from the r-class command summarize
    • 1.6.2 Using results from the e-class command regress
  • 1.7 Global and local macros
    • 1.7.1 Global macros
    • 1.7.2 Local macros
    • 1.7.3 Scalar or macro?
  • 1.8 Looping commands
    • 1.8.1 The foreach loop
    • 1.8.2 The forvalues loop
    • 1.8.3 The while loop
    • 1.8.4 The continue command
  • 1.9 Some useful commands
  • 1.10 Template do-file
  • 1.11 User-written commands
  • 1.12 Stata resources
  • 1.13 Exercises

2 Data management and graphics

  • 2.1 Introduction
  • 2.2 Types of data
    • 2.2.1 Text or ASCII data
    • 2.2.2 Internal numeric data
    • 2.2.3 String data
    • 2.2.4 Formats for displaying numeric data
  • 2.3 Inputting data
    • 2.3.1 General principles
    • 2.3.2 Inputting data already in Stata format
    • 2.3.3 Inputting data from the keyboard
    • 2.3.4 Inputting nontext data
    • 2.3.5 Inputting text data from a spreadsheet
    • 2.3.6 Inputting text data in free format
    • 2.3.7 Inputting text data in fixed format
    • 2.3.8 Dictionary files
    • 2.3.9 Common pitfalls
  • 2.4 Data management
    • 2.4.1 PSID example
    • 2.4.2 Naming and labeling variables
    • 2.4.3 Viewing data
    • 2.4.4 Using original documentation
    • 2.4.5 Missing values
    • 2.4.6 Imputing missing data
    • 2.4.7 Transforming data (generate, replace, egen, recode)
      • The generate and replace commands
      • The egen command
      • The recode command
      • The by prefix
      • Indicator variables
      • Set of indicator variables
      • Interactions
      • Demeaning
    • 2.4.8 Saving data
    • 2.4.9 Selecting the sample
  • 2.5 Manipulating datasets
    • 2.5.1 Ordering observations and variables
    • 2.5.2 Preserving and restoring a dataset
    • 2.5.3 Wide and long forms for a dataset
    • 2.5.4 Merging datasets
    • 2.5.5 Appending datasets
  • 2.6 Graphical display of data
    • 2.6.1 Stata graph commands
      • Example graph commands
      • Saving and exporting graphs
      • Learning how to use graph commands
    • 2.6.2 Box-and-whisker plot
    • 2.6.3 Histogram
    • 2.6.4 Kernel density plot
    • 2.6.5 Twoway scatterplots and fitted lines
    • 2.6.6 Lowess, kernel, local linear, and nearest-neighbor regression
    • 2.6.7 Multiple scatterplots
  • 2.7 Stata resources
  • 2.8 Exercises

3 Linear regression basics

  • 3.1 Introduction
  • 3.2 Data and data summary
    • 3.2.1 Data description
    • 3.2.2 Variable description
    • 3.2.3 Summary statistics
    • 3.2.4 More-detailed summary statistics
    • 3.2.5 Tables for data
    • 3.2.6 Statistical tests
    • 3.2.7 Data plots
  • 3.3 Regression in levels and logs
    • 3.3.1 Basic regression theory
    • 3.3.2 OLS regression and matrix algebra
    • 3.3.3 Properties of the OLS estimator
    • 3.3.4 Heteroskedasticity-robust standard errors
    • 3.3.5 Cluster–robust standard errors
    • 3.3.6 Regression in logs
  • 3.4 Basic regression analysis
    • 3.4.1 Correlations
    • 3.4.2 The regress command
    • 3.4.3 Hypothesis tests
    • 3.4.4 Tables of output from several regressions
    • 3.4.5 Even better tables of regression output
    • 3.4.6 Factor variables for categorical variables and interactions
  • 3.5 Specification analysis
    • 3.5.1 Specification tests and model diagnostics
    • 3.5.2 Residual diagnostic plots
    • 3.5.3 Influential observations
    • 3.5.4 Specification tests
      • Test of omitted variables
      • Test of the Box–Cox model
      • Test of the functional form of the conditional mean
      • Heteroskedasticity test
      • Omnibus test
    • 3.5.5 Tests have power in more than one direction
  • 3.6 Prediction
    • 3.6.1 In-sample prediction
    • 3.6.2 MEs and elasticities
    • 3.6.3 Prediction in logs: The retransformation problem
    • 3.6.4 Prediction exercise
  • 3.7 Sampling weights
    • 3.7.1 Weights
    • 3.7.2 Weighted mean
    • 3.7.3 Weighted regression
    • 3.7.4 Weighted prediction and MEs
  • 3.8 OLS using Mata
  • 3.9 Stata resources
  • 3.10 Exercises

4 Simulation

  • 4.1 Introduction
  • 4.2 Pseudorandom-number generators: Introduction
    • 4.2.1 Uniform random-number generation
    • 4.2.2 Draws from normal
    • 4.2.3 Draws from t, chi-squared, F, gamma, and beta
    • 4.2.4 Draws from binomial, Poisson, and negative binomial
      • Independent (but not identically distributed) draws from binomial
      • Independent (but not identically distributed) draws from Poisson
      • Histograms and density plots
  • 4.3 Distribution of the sample mean
    • 4.3.1 Stata program
    • 4.3.2 The simulate command
    • 4.3.3 Central limit theorem simulation
    • 4.3.4 The postfile command
    • 4.3.5 Alternative central limit theorem simulation
  • 4.4 Pseudorandom-number generators: Further details
    • 4.4.1 Inverse-probability transformation
    • 4.4.2 Direct transformation
    • 4.4.3 Other methods
    • 4.4.4 Draws from truncated normal
    • 4.4.5 Draws from multivariate normal
      • Direct draws from multivariate normal
      • Transformation using Cholesky decomposition
    • 4.4.6 Draws using Markov chain Monte Carlo method
  • 4.5 Computing integrals
    • 4.5.1 Quadrature
    • 4.5.2 Monte Carlo integration
    • 4.5.3 Monte Carlo integration using different S
  • 4.6 Simulation for regression: Introduction
    • 4.6.1 Simulation example: OLS with chi-squared errors
    • 4.6.2 Interpreting simulation output
      • Unbiasedness of estimator
      • Standard errors
      • t statistic
      • Test size
      • Number of simulations
    • 4.6.3 Variations
      • Different sample size and number of simulations
      • Test power
      • Different error distributions
    • 4.6.4 Estimator inconsistency
    • 4.6.5 Simulation with endogenous regressors
  • 4.7 Stata resources
  • 4.8 Exercises

5 GLS regression

  • 5.1 Introduction
  • 5.2 GLS and FGLS regression
    • 5.2.1 GLS for heteroskedastic errors
    • 5.2.2 GLS and FGLS
    • 5.2.3 Weighted least squares and robust standard errors
    • 5.2.4 Leading examples
  • 5.3 Modeling heteroskedastic data
    • 5.3.1 Simulated dataset
    • 5.3.2 OLS estimation
    • 5.3.3 Detecting heteroskedasticity
    • 5.3.4 FGLS estimation
    • 5.3.5 WLS estimation
  • 5.4 System of linear regressions
    • 5.4.1 SUR model
    • 5.4.2 The sureg command
    • 5.4.3 Application to two categories of expenditures
    • 5.4.4 Robust standard errors
    • 5.4.5 Testing cross-equation constraints
    • 5.4.6 Imposing cross-equation constraints
  • 5.5 Survey data: Weighting, clustering, and stratification
    • 5.5.1 Survey design
    • 5.5.2 Survey mean estimation
    • 5.5.3 Survey linear regression
  • 5.6 Stata resources
  • 5.7 Exercises

6 Linear instrumental-variables regression

  • 6.1 Introduction
  • 6.2 IV estimation
    • 6.2.1 Basic IV theory
    • 6.2.2 Model setup
    • 6.2.3 IV estimators: IV, 2SLS, and GMM
    • 6.2.4 Instrument validity and relevance
    • 6.2.5 Robust standard-error estimates
  • 6.3 IV example
    • 6.3.1 The ivregress command
    • 6.3.2 Medical expenditures with one endogenous regressor
    • 6.3.3 Available instruments
    • 6.3.4 IV estimation of an exactly identified model
    • 6.3.5 IV estimation of an overidentified model
    • 6.3.6 Testing for regressor endogeneity
    • 6.3.7 Tests of overidentifying restrictions
    • 6.3.8 IV estimation with a binary endogenous regressor
  • 6.4 Weak instruments
    • 6.4.1 Finite-sample properties of IV estimators
    • 6.4.2 Weak instruments
      • Diagnostics for weak instruments
      • Formal tests for weak instruments
    • 6.4.3 The estat firststage command
    • 6.4.4 Just-identified model
    • 6.4.5 Overidentified model
    • 6.4.6 More than one endogenous regressor
    • 6.4.7 Sensitivity to choice of instruments
  • 6.5 Better inference with weak instruments
    • 6.5.1 Conditional tests and confidence intervals
    • 6.5.2 LIML estimator
    • 6.5.3 Jackknife IV estimator
    • 6.5.4 Comparison of 2SLS, LIML, JIVE, and GMM
  • 6.6 3SLS systems estimation
  • 6.7 Stata resources
  • 6.8 Exercises

7 Quantile regression

  • 7.1 Introduction
  • 7.2 QR
    • 7.2.1 Conditional quantiles
    • 7.2.2 Computation of QR estimates and standard errors
    • 7.2.3 The qreg, bsqreg, and sqreg commands
  • 7.3 QR for medical expenditures data
    • 7.3.1 Data summary
    • 7.3.2 QR estimates
    • 7.3.3 Interpretation of conditional quantile coefficients
    • 7.3.4 Retransformation
    • 7.3.5 Comparison of estimates at different quantiles
    • 7.3.6 Heteroskedasticity test
    • 7.3.7 Hypothesis tests
    • 7.3.8 Graphical display of coefficients over quantiles
  • 7.4 QR for generated heteroskedastic data
    • 7.4.1 Simulated dataset
    • 7.4.2 QR estimates
  • 7.5 QR for count data
    • 7.5.1 Quantile count regression
    • 7.5.2 The qcount command
    • 7.5.3 Summary of doctor visits data
    • 7.5.4 Results from QCR
  • 7.6 Stata resources
  • 7.7 Exercises

8 Linear panel-data models: Basics

  • 8.1 Introduction
  • 8.2 Panel-data methods overview
    • 8.2.1 Some basic considerations
    • 8.2.2 Some basic panel models
      • Individual-effects model
      • Fixed-effects model
      • Random-effects model
      • Pooled model or population-averaged model
      • Two-way–effects model
      • Mixed linear models
    • 8.2.3 Cluster–robust inference
    • 8.2.4 The xtreg command
    • 8.2.5 Stata linear panel-data commands
  • 8.3 Panel-data summary
    • 8.3.1 Data description and summary statistics
    • 8.3.2 Panel-data organization
    • 8.3.3 Panel-data description
    • 8.3.4 Within and between variation
    • 8.3.5 Time-series plots for each individual
    • 8.3.6 Overall scatterplot
    • 8.3.7 Within scatterplot
    • 8.3.8 Pooled OLS regression with cluster–robust standard errors
    • 8.3.9 Time-series autocorrelations for panel data
    • 8.3.10 Error correlation in the RE model
  • 8.4 Pooled or population-averaged estimators
    • 8.4.1 Pooled OLS estimator
    • 8.4.2 Pooled FGLS estimator or population-averaged estimator
    • 8.4.3 The xtreg, pa command
    • 8.4.4 Application of the xtreg, pa command
  • 8.5 Within estimator
    • 8.5.1 Within estimator
    • 8.5.2 The xtreg, fe command
    • 8.5.3 Application of the xtreg, fe command
    • 8.5.4 Least-squares dummy-variables regression
  • 8.6 Between estimator
    • 8.6.1 Between estimator
    • 8.6.2 Application of the xtreg, be command
  • 8.7 RE estimator
    • 8.7.1 RE estimator
    • 8.7.2 The xtreg, re command
    • 8.7.3 Application of the xtreg, re command
  • 8.8 Comparison of estimators
    • 8.8.1 Estimates of variance components
    • 8.8.2 Within and between R-squared
    • 8.8.3 Estimator comparison
    • 8.8.4 Fixed effects versus random effects
    • 8.8.5 Hausman test for fixed effects
      • The hausman command
      • Robust Hausman test
    • 8.8.6 Prediction
  • 8.9 First-difference estimator
    • 8.9.1 First-difference estimator
    • 8.9.2 Strict and weak exogeneity
  • 8.10 Long panels
    • 8.10.1 Long-panel dataset
    • 8.10.2 Pooled OLS and PFGLS
    • 8.10.3 The xtpcse and xtgls commands
    • 8.10.4 Application of the xtgls, xtpcse, and xtscc commands
    • 8.10.5 Separate regressions
    • 8.10.6 FE and RE models
    • 8.10.7 Unit roots and cointegration
  • 8.11 Panel-data management
    • 8.11.1 Wide-form data
    • 8.11.2 Convert wide form to long form
    • 8.11.3 Convert long form to wide form
    • 8.11.4 An alternative to wide-form data
  • 8.12 Stata resources
  • 8.13 Exercises

9 Linear panel-data models: Extensions

  • 9.1 Introduction
  • 9.2 Panel IV estimation
    • 9.2.1 Panel IV
    • 9.2.2 The xtivreg command
    • 9.2.3 Application of the xtivreg command
    • 9.2.4 Panel IV extensions
  • 9.3 Hausman–Taylor estimator
    • 9.3.1 Hausman–Taylor estimator
    • 9.3.2 The xthtaylor command
    • 9.3.3 Application of the xthtaylor command
  • 9.4 Arellano–Bond estimator
    • 9.4.1 Dynamic model
    • 9.4.2 IV estimation in the FD model
    • 9.4.3 The xtabond command
    • 9.4.4 Arellano–Bond estimator: Pure time series
    • 9.4.5 Arellano–Bond estimator: Additional regressors
    • 9.4.6 Specification tests
    • 9.4.7 The xtdpdsys command
    • 9.4.8 The xtdpd command
  • 9.5 Mixed linear models
    • 9.5.1 Mixed linear model
    • 9.5.2 The xtmixed command
    • 9.5.3 Random-intercept model
    • 9.5.4 Cluster–robust standard errors
    • 9.5.5 Random-slopes model
    • 9.5.6 Random-coefficients model
    • 9.5.7 Two-way random-effects model
  • 9.6 Clustered data
    • 9.6.1 Clustered dataset
    • 9.6.2 Clustered data using nonpanel commands
    • 9.6.3 Clustered data using panel commands
    • 9.6.4 Hierarchical linear models
  • 9.7 Stata resources
  • 9.8 Exercises

10 Nonlinear regression methods

  • 10.1 Introduction
  • 10.2 Nonlinear example: Doctor visits
    • 10.2.1 Data description
    • 10.2.2 Poisson model description
  • 10.3 Nonlinear regression methods
    • 10.3.1 MLE
    • 10.3.2 The poisson command
    • 10.3.3 Postestimation commands
    • 10.3.4 NLS
    • 10.3.5 The nl command
    • 10.3.6 GLM
    • 10.3.7 The glm command
    • 10.3.8 The gmm command
    • 10.3.9 Other estimators
  • 10.4 Different estimates of the VCE
    • 10.4.1 General framework
    • 10.4.2 The vce() option
    • 10.4.3 Application of the vce() option
    • 10.4.4 Default estimate of the VCE
    • 10.4.5 Robust estimate of the VCE
    • 10.4.6 Cluster–robust estimate of the VCE
    • 10.4.7 Heteroskedasticity- and autocorrelation-consistent estimate of the VCE
    • 10.4.8 Bootstrap standard errors
    • 10.4.9 Statistical inference
  • 10.5 Prediction
    • 10.5.1 The predict and predictnl commands
    • 10.5.2 Application of predict and predictnl
    • 10.5.3 Out-of-sample prediction
    • 10.5.4 Prediction at a specified value of one of the regressors
    • 10.5.5 Prediction at a specified value of all the regressors
    • 10.5.6 Prediction of other quantities
    • 10.5.7 The margins command for prediction
  • 10.6 Marginal effects
    • 10.6.1 Calculus and finite-difference methods
    • 10.6.2 MEs estimates AME, MEM, and MER
    • 10.6.3 Elasticities and semielasticities
    • 10.6.4 Simple interpretations of coefficients in single-index models
    • 10.6.5 The margins command for marginal effects
    • 10.6.6 MEM: Marginal effect at mean
      • Comparison of calculus and finite-difference methods
    • 10.6.7 MER: Marginal effect at representative value
    • 10.6.8 AME: Average marginal effect
    • 10.6.9 Elasticities and semielasticities
    • 10.6.10 AME computed manually
    • 10.6.11 Polynomial regressors
    • 10.6.12 Interacted regressors
    • 10.6.13 Complex interactions and nonlinearities
  • 10.7 Model diagnostics
    • 10.7.1 Goodness-of-fit measures
    • 10.7.2 Information criteria for model comparison
    • 10.7.3 Residuals
    • 10.7.4 Model-specification tests
  • 10.8 Stata resources
  • 10.9 Exercises

11 Nonlinear optimization methods

  • 11.1 Introduction
  • 11.2 Newton–Raphson method
    • 11.2.1 NR method
    • 11.2.2 NR method for Poisson
    • 11.2.3 Poisson NR example using Mata
      • Core Mata code for Poisson NR iterations
      • Complete Stata and Mata code for Poisson NR iterations
  • 11.3 Gradient methods
    • 11.3.1 Maximization options
    • 11.3.2 Gradient methods
    • 11.3.3 Messages during iterations
    • 11.3.4 Stopping criteria
    • 11.3.5 Multiple maximums
    • 11.3.6 Numerical derivatives
  • 11.4 The ml command: lf method
    • 11.4.1 The ml command
    • 11.4.2 The lf method
    • 11.4.3 Poisson example: Single-index model
    • 11.4.4 Negative binomial example: Two-index model
    • 11.4.5 NLS example: Nonlikelihood model
  • 11.5 Checking the program
    • 11.5.1 Program debugging using ml check and ml trace
    • 11.5.2 Getting the program to run
    • 11.5.3 Checking the data
    • 11.5.4 Multicollinearity and near collinearity
    • 11.5.5 Multiple optimums
    • 11.5.6 Checking parameter estimation
    • 11.5.7 Checking standard-error estimation
  • 11.6 The ml command: d0, d1, d2, lf0, lf1, and lf2 methods
    • 11.6.1 Evaluator functions
    • 11.6.2 The d0 method
    • 11.6.3 The d1 method
    • 11.6.4 The lf1 method with the robust estimate of the VCE
    • 11.6.5 The d2 and lf2 methods
  • 11.7 The Mata optimize() function
    • 11.7.1 Type d and gf evaluators
    • 11.7.2 Optimize functions
    • 11.7.3 Poisson example
      • Evaluator program for Poisson MLE
      • The optimize() function for Poisson MLE
  • 11.8 Generalized method of moments
    • 11.8.1 Definition
    • 11.8.2 Nonlinear IV example
    • 11.8.3 GMM using the Mata optimize() function
  • 11.9 Stata resources
  • 11.10 Exercises

12 Testing methods

  • 12.1 Introduction
  • 12.2 Critical values and p-values
    • 12.2.1 Standard normal compared with Student's t
    • 12.2.2 Chi-squared compared with F
    • 12.2.3 Plotting densities
    • 12.2.4 Computing p-values and critical values
    • 12.2.5 Which distributions does Stata use?
  • 12.3 Wald tests and confidence intervals
    • 12.3.1 Wald test of linear hypotheses
    • 12.3.2 The test command
      • Test single coefficient
      • Test several hypotheses
      • Test of overall significance
      • Test calculated from retrieved coefficients and VCE
    • 12.3.3 One-sided Wald tests
    • 12.3.4 Wald test of nonlinear hypotheses (delta method)
    • 12.3.5 The testnl command
    • 12.3.6 Wald confidence intervals
    • 12.3.7 The lincom command
    • 12.3.8 The nlcom command (delta method)
    • 12.3.9 Asymmetric confidence intervals
  • 12.4 Likelihood-ratio tests
    • 12.4.1 Likelihood-ratio tests
    • 12.4.2 The lrtest command
    • 12.4.3 Direct computation of LR tests
  • 12.5 Lagrange multiplier test (or score test)
    • 12.5.1 LM tests
    • 12.5.2 The estat command
    • 12.5.3 LM test by auxiliary regression
  • 12.6 Test size and power
    • 12.6.1 Simulation DGP: OLS with chi-squared errors
    • 12.6.2 Test size
    • 12.6.3 Test power
    • 12.6.4 Asymptotic test power
  • 12.7 Specification tests
    • 12.7.1 Moment-based tests
    • 12.7.2 Information matrix test
    • 12.7.3 Chi-squared goodness-of-fit test
    • 12.7.4 Overidentifying restrictions test
    • 12.7.5 Hausman test
    • 12.7.6 Other tests
  • 12.8 Stata resources
  • 12.9 Exercises

13 Bootstrap methods

  • 13.1 Introduction
  • 13.2 Bootstrap methods
    • 13.2.1 Bootstrap estimate of standard error
    • 13.2.2 Bootstrap methods
    • 13.2.3 Asymptotic refinement
    • 13.2.4 Use the bootstrap with caution
  • 13.3 Bootstrap pairs using the vce(bootstrap) option
    • 13.3.1 Bootstrap-pairs method to estimate VCE
    • 13.3.2 The vce(bootstrap) option
    • 13.3.3 Bootstrap standard-errors example
    • 13.3.4 How many bootstraps?
    • 13.3.5 Clustered bootstraps
    • 13.3.6 Bootstrap confidence intervals
    • 13.3.7 The postestimation estat bootstrap command
    • 13.3.8 Bootstrap confidence-intervals example
    • 13.3.9 Bootstrap estimate of bias
  • 13.4 Bootstrap pairs using the bootstrap command
    • 13.4.1 The bootstrap command
    • 13.4.2 Bootstrap parameter estimate from a Stata estimation command
    • 13.4.3 Bootstrap standard error from a Stata estimation command
    • 13.4.4 Bootstrap standard error from a user-written estimation command
    • 13.4.5 Bootstrap two-step estimator
    • 13.4.6 Bootstrap Hausman test
    • 13.4.7 Bootstrap standard error of the coefficient of variation
  • 13.5 Bootstraps with asymptotic refinement
    • 13.5.1 Percentile-t method
    • 13.5.2 Percentile-t Wald test
    • 13.5.3 Percentile-t Wald confidence interval
  • 13.6 Bootstrap pairs using bsample and simulate
    • 13.6.1 The bsample command
    • 13.6.2 The bsample command with simulate
    • 13.6.3 Bootstrap Monte Carlo exercise
  • 13.7 Alternative resampling schemes
    • 13.7.1 Bootstrap pairs
    • 13.7.2 Parametric bootstrap
    • 13.7.3 Residual bootstrap
    • 13.7.4 Wild bootstrap
    • 13.7.5 Subsampling
  • 13.8 The jackknife
    • 13.8.1 Jackknife method
    • 13.8.2 The vce(jackknife) option and the jackknife command
  • 13.9 Stata resources
  • 13.10 Exercises

14 Binary outcome models

  • 14.1 Introduction
  • 14.2 Some parametric models
    • 14.2.1 Basic model
    • 14.2.2 Logit, probit, linear probability, and clog-log models
  • 14.3 Estimation
    • 14.3.1 Latent-variable interpretation and identification
    • 14.3.2 ML estimation
    • 14.3.3 The logit and probit commands
    • 14.3.4 Robust estimate of the VCE
    • 14.3.5 OLS estimation of LPM
  • 14.4 Example
    • 14.4.1 Data description
    • 14.4.2 Logit regression
    • 14.4.3 Comparison of binary models and parameter estimates
  • 14.5 Hypothesis and specification tests
    • 14.5.1 Wald tests
    • 14.5.2 Likelihood-ratio tests
    • 14.5.3 Additional model-specification tests
      • Lagrange multiplier test of generalized logit
      • Heteroskedastic probit regression
    • 14.5.4 Model comparison
  • 14.6 Goodness of fit and prediction
    • 14.6.1 Pseudo-R² measure
    • 14.6.2 Comparing predicted probabilities with sample frequencies
    • 14.6.3 Comparing predicted outcomes with actual outcomes
    • 14.6.4 The predict command for fitted probabilities
    • 14.6.5 The prvalue command for fitted probabilities
  • 14.7 Marginal effects
    • 14.7.1 Marginal effect at a representative value (MER)
    • 14.7.2 Marginal effect at the mean (MEM)
    • 14.7.3 Average marginal effect (AME)
    • 14.7.4 The prchange command
  • 14.8 Endogenous regressors
    • 14.8.1 Example
    • 14.8.2 Model assumptions
    • 14.8.3 Structural-model approach
      • The ivprobit command
      • Maximum likelihood estimates
      • Two-step sequential estimates
    • 14.8.4 IVs approach
  • 14.9 Grouped data
    • 14.9.1 Estimation with aggregate data
    • 14.9.2 Grouped-data application
  • 14.10 Stata resources
  • 14.11 Exercises

15 Multinomial models

  • 15.1 Introduction
  • 15.2 Multinomial models overview
    • 15.2.1 Probabilities and MEs
    • 15.2.2 Maximum likelihood estimation
    • 15.2.3 Case-specific and alternative-specific regressors
    • 15.2.4 Additive random-utility model
    • 15.2.5 Stata multinomial model commands
  • 15.3 Multinomial example: Choice of fishing mode
    • 15.3.1 Data description
    • 15.3.2 Case-specific regressors
    • 15.3.3 Alternative-specific regressors
  • 15.4 Multinomial logit model
    • 15.4.1 The mlogit command
    • 15.4.2 Application of the mlogit command
    • 15.4.3 Coefficient interpretation
    • 15.4.4 Predicted probabilities
    • 15.4.5 MEs
  • 15.5 Conditional logit model
    • 15.5.1 Creating long-form data from wide-form data
    • 15.5.2 The asclogit command
    • 15.5.3 The clogit command
    • 15.5.4 Application of the asclogit command
    • 15.5.5 Relationship to multinomial logit model
    • 15.5.6 Coefficient interpretation
    • 15.5.7 Predicted probabilities
    • 15.5.8 MEs
  • 15.6 Nested logit model
    • 15.6.1 Relaxing the independence of irrelevant alternatives assumption
    • 15.6.2 NL model
    • 15.6.3 The nlogit command
    • 15.6.4 Model estimates
    • 15.6.5 Predicted probabilities
    • 15.6.6 MEs
    • 15.6.7 Comparison of logit models
  • 15.7 Multinomial probit model
    • 15.7.1 MNP
    • 15.7.2 The mprobit command
    • 15.7.3 Maximum simulated likelihood
    • 15.7.4 The asmprobit command
    • 15.7.5 Application of the asmprobit command
    • 15.7.6 Predicted probabilities and MEs
  • 15.8 Random-parameters logit
    • 15.8.1 Random-parameters logit
    • 15.8.2 The mixlogit command
    • 15.8.3 Data preparation for mixlogit
    • 15.8.4 Application of the mixlogit command
  • 15.9 Ordered outcome models
    • 15.9.1 Data summary
    • 15.9.2 Ordered outcomes
    • 15.9.3 Application of the ologit command
    • 15.9.4 Predicted probabilities
    • 15.9.5 MEs
    • 15.9.6 Other ordered models
  • 15.10 Multivariate outcomes
    • 15.10.1 Bivariate probit
    • 15.10.2 Nonlinear SUR
  • 15.11 Stata resources
  • 15.12 Exercises

16 Tobit and selection models

  • 16.1 Introduction
  • 16.2 Tobit model
    • 16.2.1 Regression with censored data
    • 16.2.2 Tobit model setup
    • 16.2.3 Unknown censoring point
    • 16.2.4 Tobit estimation
    • 16.2.5 ML estimation in Stata
  • 16.3 Tobit model example
    • 16.3.1 Data summary
    • 16.3.2 Tobit analysis
    • 16.3.3 Prediction after tobit
    • 16.3.4 Marginal effects
      • Left-truncated, left-censored, and right-truncated examples
      • Left-censored case computed directly
      • Marginal impact on probabilities
    • 16.3.5 The ivtobit command
    • 16.3.6 Additional commands for censored regression
  • 16.4 Tobit for lognormal data
    • 16.4.1 Data example
    • 16.4.2 Setting the censoring point for data in logs
    • 16.4.3 Results
    • 16.4.4 Two-limit tobit
    • 16.4.5 Model diagnostics
    • 16.4.6 Tests of normality and homoskedasticity
      • Generalized residuals and scores
      • Test of normality
      • Test of homoskedasticity
    • 16.4.7 Next step?
  • 16.5 Two-part model in logs
    • 16.5.1 Model structure
    • 16.5.2 Part 1 specification
    • 16.5.3 Part 2 of the two-part model
  • 16.6 Selection model
    • 16.6.1 Model structure and assumptions
    • 16.6.2 ML estimation of the sample-selection model
    • 16.6.3 Estimation without exclusion restrictions
    • 16.6.4 Two-step estimation
    • 16.6.5 Estimation with exclusion restrictions
  • 16.7 Prediction from models with outcome in logs
    • 16.7.1 Predictions from tobit
    • 16.7.2 Predictions from two-part model
    • 16.7.3 Predictions from selection model
  • 16.8 Stata resources
  • 16.9 Exercises

17 Count-data models

  • 17.1 Introduction
  • 17.2 Features of count data
    • 17.2.1 Generated Poisson data
    • 17.2.2 Overdispersion and negative binomial data
    • 17.2.3 Modeling strategies
    • 17.2.4 Estimation methods
  • 17.3 Empirical example 1
    • 17.3.1 Data summary
    • 17.3.2 Poisson model
      • Poisson model results
      • Robust estimate of VCE for Poisson MLE
      • Test of overdispersion
      • Coefficient interpretation and marginal effects
    • 17.3.3 NB2 model
      • NB2 model results
      • Fitted probabilities for Poisson and NB2 models
      • The countfit command
      • The prvalue command
      • Discussion
      • Generalized NB model
    • 17.3.4 Nonlinear least-squares estimation
    • 17.3.5 Hurdle model
      • Variants of the hurdle model
      • Application of the hurdle model
    • 17.3.6 Finite-mixture models
      • FMM specification
      • Simulated FMM sample with comparisons
      • ML estimation of the FMM
      • The fmm command
      • Application: Poisson finite-mixture model
      • Interpretation
      • Comparing marginal effects
      • Application: NB finite-mixture model
      • Model selection
      • Cautionary note
  • 17.4 Empirical example 2
    • 17.4.1 Zero-inflated data
    • 17.4.2 Models for zero-inflated data
    • 17.4.3 Results for the NB2 model
      • The prcounts command
    • 17.4.4 Results for ZINB
    • 17.4.5 Model comparison
      • The countfit command
      • Model comparison using countfit
  • 17.5 Models with endogenous regressors
    • 17.5.1 Structural-model approach
      • Model and assumptions
      • Two-step estimation
      • Application
    • 17.5.2 Nonlinear IV method
  • 17.6 Stata resources
  • 17.7 Exercises

18 Nonlinear panel models

  • 18.1 Introduction
  • 18.2 Nonlinear panel-data overview
    • 18.2.1 Some basic nonlinear panel models
      • FE models
      • RE models
      • Pooled models or population-averaged models
      • Comparison of models
    • 18.2.2 Dynamic models
    • 18.2.3 Stata nonlinear panel commands
  • 18.3 Nonlinear panel-data example
    • 18.3.1 Data description and summary statistics
    • 18.3.2 Panel-data organization
    • 18.3.3 Within and between variation
    • 18.3.4 FE or RE model for these data?
  • 18.4 Binary outcome models
    • 18.4.1 Panel summary of the dependent variable
    • 18.4.2 Pooled logit estimator
    • 18.4.3 The xtlogit command
    • 18.4.4 The xtgee command
    • 18.4.5 PA logit estimator
    • 18.4.6 RE logit estimator
    • 18.4.7 FE logit estimator
    • 18.4.8 Panel logit estimator comparison
    • 18.4.9 Prediction and marginal effects
    • 18.4.10 Mixed-effects logit estimator
  • 18.5 Tobit model
    • 18.5.1 Panel summary of the dependent variable
    • 18.5.2 RE tobit model
    • 18.5.3 Generalized tobit models
    • 18.5.4 Parametric nonlinear panel models
  • 18.6 Count-data models
    • 18.6.1 The xtpoisson command
    • 18.6.2 Panel summary of the dependent variable
    • 18.6.3 Pooled Poisson estimator
    • 18.6.4 PA Poisson estimator
    • 18.6.5 RE Poisson estimators
    • 18.6.6 FE Poisson estimator
    • 18.6.7 Panel Poisson estimators comparison
    • 18.6.8 Negative binomial estimators
  • 18.7 Stata resources
  • 18.8 Exercises

A Programming in Stata

  • A.1 Stata matrix commands
    • A.1.1 Stata matrix overview
    • A.1.2 Stata matrix input and output
      • Matrix input by hand
      • Matrix input from Stata estimation results
    • A.1.3 Stata matrix subscripts and combining matrices
    • A.1.4 Matrix operators
    • A.1.5 Matrix functions
    • A.1.6 Matrix accumulation commands
    • A.1.7 OLS using Stata matrix commands
  • A.2 Programs
    • A.2.1 Simple programs (no arguments or access to results)
    • A.2.2 Modifying a program
    • A.2.3 Programs with positional arguments
    • A.2.4 Temporary variables
    • A.2.5 Programs with named positional arguments
    • A.2.6 Storing and retrieving program results
    • A.2.7 Programs with arguments using standard Stata syntax
    • A.2.8 Ado-files
  • A.3 Program debugging
    • A.3.1 Some simple tips
    • A.3.2 Error messages and return code
    • A.3.3 Trace

B Mata

  • B.1 How to run Mata
    • B.1.1 Mata commands in Mata
    • B.1.2 Mata commands in Stata
    • B.1.3 Stata commands in Mata
    • B.1.4 Interactive versus batch use
    • B.1.5 Mata help
  • B.2 Mata matrix commands
    • B.2.1 Mata matrix input
      • Matrix input by hand
      • Identity matrices, unit vectors, and matrices of constants
      • Matrix input from Stata data
      • Matrix input from Stata matrix
      • Stata interface functions
    • B.2.2 Mata matrix operators
      • Element-by-element operators
    • B.2.3 Mata functions
      • Scalar and matrix functions
      • Matrix inversion
    • B.2.4 Mata cross products
    • B.2.5 Mata matrix subscripts and combining matrices
    • B.2.6 Transferring Mata data and matrices to Stata
      • Creating Stata matrices from Mata matrices
      • Creating Stata data from a Mata vector
  • B.3 Programming in Mata
    • B.3.1 Declarations
    • B.3.2 Mata program
    • B.3.3 Mata program with results output to Stata
    • B.3.4 Stata program that calls a Mata program
    • B.3.5 Using Mata in ado-files

Glossary of abbreviations
References