Stata basics
1/ Read this: it will help
- 3.1 Overview
- 3.2 Stata on the Internet (www.stata.com and other resources)
- 3.2.1 The Stata website (www.stata.com)
- 3.2.2 The Stata Blog: Not Elsewhere Classified
- 3.2.3 Stata on Twitter and Facebook
- 3.2.4 Other Internet resources on Stata
- 3.3 Stata Press
- 3.4 The Stata listserver
- 3.5 The Stata Journal
- 3.6 Updating and adding features from the web
- 3.6.1 Official updates
- 3.6.2 Unofficial updates
- 3.7 Conferences and training
- 3.7.1 Conferences and users group meetings
- 3.7.2 NetCourses
- 3.7.3 Public training courses
- 3.7.4 On-site training courses
- 3.8 Books and other support materials
- 3.8.1 For readers
- 3.8.2 For authors
- 3.9 Technical support
- 3.9.1 Register your software
- 3.9.2 Before contacting technical support
- 3.9.3 Technical support by email
- 3.9.4 Technical support by phone or fax
- 3.9.5 Comments and suggestions for our technical staff
4/ Stata’s help and search facilities
- 4.1 Introduction
- 4.2 Getting started
- 4.3 help: Stata’s help system
- 4.4 Accessing PDF manuals from help entries
- 4.5 Searching
- 4.6 More on search
- 4.7 More on help
- 4.8 help contents: Table of contents for Stata’s help system
- 4.9 search: All the details
- 4.9.1 How search works
- 4.9.2 Author searches
- 4.9.3 Entry ID searches
- 4.9.4 FAQ searches
- 4.9.5 Return codes
- 4.10 net search: Searching net resources
- 4.11 hsearch: An alternative to search
5/ Flavors of Stata
- 5.1 Platforms
- 5.2 Stata/MP, Stata/SE, Stata/IC, and Small Stata
- 5.2.1 Determining which version you own
- 5.2.2 Determining which version is installed
- 5.3 Size limits of Stata/MP, SE, IC, and Small Stata
- 5.4 Speed comparison of Stata/MP, SE, IC, and Small Stata
- 5.5 Feature comparison of Stata/MP, SE, and IC
6/ Managing memory
- 6.1 Memory-size considerations
- 6.2 Compressing data
- 6.3 Setting maxvar
- 6.4 Setting matsize
- 6.5 The memory command
7/ --more-- conditions
8/ Error messages and return codes
9/ The Break key
- 9.1 Making Stata stop what it is doing
- 9.2 Side effects of clicking on Break
- 9.3 Programming considerations
10/ Keyboard use
- 10.1 Description
- 10.2 F-keys
- 10.3 Editing keys in Stata
- 10.4 Editing keys in Stata for Unix (console)
- 10.5 Editing previous lines in Stata
- 10.6 Tab expansion of variable names
Elements of Stata
11/ Language syntax
- 11.1 Overview
- 11.1.1 varlist
- 11.1.2 by varlist:
- 11.1.3 if exp
- 11.1.4 in range
- 11.1.5 =exp
- 11.1.6 weight
- 11.1.7 options
- 11.1.8 numlist
- 11.1.9 datelist
- 11.1.10 Prefix commands
- 11.2 Abbreviation rules
- 11.2.1 Command abbreviation
- 11.2.2 Option abbreviation
- 11.2.3 Variable-name abbreviation
- 11.3 Naming conventions
- 11.4 varlists
- 11.5 by varlist: construct
- 11.6 Filenaming conventions
- 11.6.1 A special note for Mac users
- 11.6.2 A special note for Unix users
- 11.7 References
12/ Data
- 12.1 Data and datasets
- 12.2 Numbers
- 12.2.1 Missing values
- 12.2.2 Numeric storage types
- 12.3 Dates and times
- 12.4 Strings
- 12.4.1 Strings containing identifying data
- 12.4.2 Strings containing categorical data
- 12.4.3 Strings containing numeric data
- 12.4.4 String storage types
- 12.5 Formats: Controlling how data are displayed
- 12.5.1 Numeric formats
- 12.5.2 European numeric formats
- 12.5.3 Date and time formats
- 12.5.4 String formats
- 12.6 Dataset, variable, and value labels
- 12.6.1 Dataset labels
- 12.6.2 Variable labels
- 12.6.3 Value labels
- 12.6.4 Labels in other languages
- 12.7 Notes attached to data
- 12.8 Characteristics
- 12.9 Data Editor and Variables Manager
- 12.10 Reference
13/ Functions and expressions
- 13.1 Overview
- 13.2 Operators
- 13.2.1 Arithmetic operators
- 13.2.2 String operators
- 13.2.3 Relational operators
- 13.2.4 Logical operators
- 13.2.5 Order of evaluation, all operators
- 13.3 Functions
- 13.4 System variables (_variables)
- 13.5 Accessing coefficients and standard errors
- 13.5.1 Single-equation models
- 13.5.2 Multiple-equation models
- 13.5.3 Factor variables and time-series operators
- 13.6 Accessing results from Stata commands
- 13.7 Explicit subscripting
- 13.7.1 Generating lags and leads
- 13.7.2 Subscripting within groups
- 13.8 Indicator values for levels of factor variables
- 13.9 Time-series operators
- 13.9.1 Generating lags, leads, and differences
- 13.9.2 Time-series operators and factor variables
- 13.9.3 Operators within groups
- 13.10 Label values
- 13.11 Precision and problems therein
- 13.12 References
14/ Matrix expressions
- 14.1 Overview
- 14.1.1 Definition of a matrix
- 14.1.2 matsize
- 14.2 Row and column names
- 14.2.1 The purpose of row and column names
- 14.2.2 Two-part names
- 14.2.3 Setting row and column names
- 14.2.4 Obtaining row and column names
- 14.3 Vectors and scalars
- 14.4 Inputting matrices by hand
- 14.5 Accessing matrices created by Stata commands
- 14.6 Creating matrices by accumulating data
- 14.7 Matrix operators
- 14.8 Matrix functions
- 14.9 Subscripting
- 14.10 Using matrices in scalar expressions
15/ Saving and printing output: log files
- 15.1 Overview
- 15.1.1 Starting and closing logs
- 15.1.2 Appending to an existing log
- 15.1.3 Temporarily suspending and resuming logging
- 15.2 Placing comments in logs
- 15.3 Logging only what you type
- 15.4 The log-button alternative
- 15.5 Printing logs
- 15.6 Creating multiple log files simultaneously
16/ Do-files
- 16.1 Description
- 16.1.1 Version
- 16.1.2 Comments and blank lines in do-files
- 16.1.3 Long lines in do-files
- 16.1.4 Error handling in do-files
- 16.1.5 Logging the output of do-files
- 16.1.6 Preventing --more-- conditions
- 16.2 Calling other do-files
- 16.3 Creating and running do-files
- 16.3.1 Creating and running do-files for Windows
- 16.3.2 Creating and running do-files for Mac
- 16.3.3 Creating and running do-files for Unix
- 16.4 Programming with do-files
- 16.4.1 Argument passing
- 16.4.2 Suppressing output
- 16.5 References
17/ Ado-files
- 17.1 Description
- 17.2 What is an ado-file?
- 17.3 How can I tell if a command is built in or an ado-file?
- 17.4 How can I look at an ado-file?
- 17.5 Where does Stata look for ado-files?
- 17.5.1 Where are the official ado-directories?
- 17.5.2 Where is my personal ado-directory?
- 17.6 How do I install an addition?
- 17.7 How do I add my own ado-files?
- 17.8 How do I install official updates?
- 17.9 How do I install updates to user-written additions?
- 17.10 Reference
18/ Programming Stata
- 18.1 Description
- 18.2 Relationship between a program and a do-file
- 18.3 Macros
- 18.3.1 Local macros
- 18.3.2 Global macros
- 18.3.3 The difference between local and global macros
- 18.3.4 Macros and expressions
- 18.3.5 Double quotes
- 18.3.6 Extended macro functions
- 18.3.7 Macro increment and decrement functions
- 18.3.8 Macro expressions
- 18.3.9 Advanced local macro manipulation
- 18.3.10 Advanced global macro manipulation
- 18.3.11 Constructing Windows filenames by using macros
- 18.3.12 Accessing system values
- 18.3.13 Referring to characteristics
- 18.4 Program arguments
- 18.4.1 Named positional arguments
- 18.4.2 Incrementing through positional arguments
- 18.4.3 Using macro shift
- 18.4.4 Parsing standard Stata syntax
- 18.4.5 Parsing immediate commands
- 18.4.6 Parsing nonstandard syntax
- 18.5 Scalars and matrices
- 18.6 Temporarily destroying the data in memory
- 18.7 Temporary objects
- 18.7.1 Temporary variables
- 18.7.2 Temporary scalars and matrices
- 18.7.3 Temporary files
- 18.8 Accessing results calculated by other programs
- 18.9 Accessing results calculated by estimation commands
- 18.10 Saving results
- 18.10.1 Saving results in r()
- 18.10.2 Saving results in e()
- 18.10.3 Saving results in s()
- 18.11 Ado-files
- 18.11.1 Version
- 18.11.2 Comments and long lines in ado-files
- 18.11.3 Debugging ado-files
- 18.11.4 Local subroutines
- 18.11.5 Development of a sample ado-command
- 18.11.6 Writing online help
- 18.11.7 Programming dialog boxes
- 18.12 A compendium of useful commands for programmers
- 18.13 References
19/ Immediate commands
- 19.1 Overview
- 19.1.1 Examples
- 19.1.2 A list of the immediate commands
- 19.2 The display command
20/ Estimation and postestimation commands
- 20.1 All estimation commands work the same way
- 20.2 Standard syntax
- 20.3 Replaying prior results
- 20.4 Cataloging estimation results
- 20.5 Saving estimation results
- 20.6 Specifying the estimation subsample
- 20.7 Specifying the width of confidence intervals
- 20.8 Formatting the coefficient table
- 20.9 Obtaining the variance-covariance matrix
- 20.10 Obtaining predicted values
- 20.10.1 Using predict
- 20.10.2 Making in-sample predictions
- 20.10.3 Making out-of-sample predictions
- 20.10.4 Obtaining standard errors, tests, and confidence intervals for predictions
- 20.11 Accessing estimated coefficients
- 20.12 Performing hypothesis tests on the coefficients
- 20.12.1 Linear tests
- 20.12.2 Using test
- 20.12.3 Likelihood-ratio tests
- 20.12.4 Nonlinear Wald tests
- 20.13 Obtaining linear combinations of coefficients
- 20.14 Obtaining nonlinear combinations of coefficients
- 20.15 Obtaining marginal means, adjusted predictions, and predictive margins
- 20.15.1 Obtaining estimated marginal means
- 20.15.2 Obtaining adjusted predictions
- 20.15.3 Obtaining predictive margins
- 20.16 Obtaining conditional and average marginal effects
- 20.16.1 Obtaining conditional marginal effects
- 20.16.2 Obtaining average marginal effects
- 20.17 Obtaining pairwise comparisons
- 20.18 Obtaining contrasts, tests of interactions, and main effects
- 20.19 Graphing margins, marginal effects, and contrasts
- 20.20 Obtaining robust variance estimates
- 20.21 Obtaining scores
- 20.22 Weighted estimation
- 20.22.1 Frequency weights
- 20.22.2 Analytic weights
- 20.22.3 Sampling weights
- 20.22.4 Importance weights
- 20.23 A list of postestimation commands
- 20.24 References
Advice
21/ Inputting data and importing data
- 21.1 Overview
- 21.2 Determining which input method to use
- 21.2.1 Entering data interactively
- 21.2.2 Copying and pasting data
- 21.2.3 If the dataset is in binary format
- 21.2.4 If the data are simple
- 21.2.5 If the dataset is formatted and the formatting is significant
- 21.2.6 If there are no string variables
- 21.2.7 If all the string variables are enclosed in quotes
- 21.2.8 If the undelimited strings have no blanks
- 21.2.9 If you have EBCDIC data
- 21.2.10 If you make it to here
- 21.3 If you run out of memory
- 21.4 Transfer programs
- 21.5 ODBC sources
- 21.6 References
22/ Combining datasets
23/ Working with strings
- 23.1 Description
- 23.2 Categorical string variables
- 23.3 Mistaken string variables
- 23.4 Complex strings
- 23.5 Reference
24/ Working with dates and times
- 24.1 Overview
- 24.2 Inputting dates and times
- 24.3 Displaying dates and times
- 24.4 Typing dates and times (datetime literals)
- 24.5 Extracting components of dates and times
- 24.6 Converting between date and time values
- 24.7 Business dates and calendars
- 24.8 Reference
25/ Working with categorical data and factor variables
- 25.1 Continuous, categorical, and indicator variables
- 25.1.1 Converting continuous variables to indicator variables
- 25.1.2 Converting continuous variables to categorical variables
- 25.2 Estimation with factor variables
- 25.2.1 Including factor variables
- 25.2.2 Specifying base levels
- 25.2.3 Setting base levels permanently
- 25.2.4 Testing significance of a main effect
- 25.2.5 Specifying indicator (dummy) variables as factor variables
- 25.2.6 Including interactions
- 25.2.7 Testing significance of interactions
- 25.2.8 Including factorial specifications
- 25.2.9 Including squared terms and polynomials
- 25.2.10 Including interactions with continuous variables
- 25.2.11 Parentheses binding
- 25.2.12 Including indicators for single levels
- 25.2.13 Including subgroups of levels
- 25.2.14 Combining factor variables and time-series operators
- 25.2.15 Treatment of empty cells
26/ Overview of Stata estimation commands
- 26.1 Introduction
- 26.2 Linear regression with simple error structures
- 26.3 Structural equation models (SEM)
- 26.4 ANOVA, ANCOVA, MANOVA, and MANCOVA
- 26.5 Generalized linear models
- 26.6 Binary-outcome qualitative dependent-variable models
- 26.7 ROC analysis
- 26.8 Conditional logistic regression
- 26.9 Multiple-outcome qualitative dependent-variable models
- 26.10 Count dependent-variable models
- 26.11 Exact estimators
- 26.12 Linear regression with heteroskedastic errors
- 26.13 Stochastic frontier models
- 26.14 Regression with systems of equations
- 26.15 Models with endogenous sample selection
- 26.16 Models with time-series data
- 26.17 Panel-data models
- 26.17.1 Linear regression with panel data
- 26.17.2 Censored linear regression with panel data
- 26.17.3 Generalized linear models with panel data
- 26.17.4 Qualitative dependent-variable models with panel data
- 26.17.5 Count dependent-variable models with panel data
- 26.17.6 Random-coefficient models with panel data
- 26.18 Survival-time (failure-time) models
- 26.19 Generalized method of moments (GMM)
- 26.20 Estimation with correlated errors
- 26.21 Survey data
- 26.22 Multiple imputation
- 26.23 Multivariate and cluster analysis
- 26.24 Pharmacokinetic data
- 26.25 Specification search tools
- 26.26 Obtaining new estimation commands
- 26.27 Reference
27/ Commands everyone should know
28/ Using the Internet to keep up to date
- 28.1 Overview
- 28.2 Sharing datasets (and other files)
- 28.3 Official updates
- 28.3.1 Frequently asked questions about updating
- 28.4 Downloading and managing additions by users
- 28.4.1 Downloading files
- 28.4.2 Managing files
- 28.4.3 Finding files to download
- 28.4.4 Updating additions by users
- 28.5 Making your own download site