Creating Stata datasets
- input data from the command line
- input data saved from spreadsheets
- read data using a dictionary
- read any type of ASCII data
- read and write data in the SAS XPORT Transport format required by the FDA for NDA submissions
- read and write XML-formatted data files, including those produced by Microsoft Excel
- convert datasets directly from other statistical packages, spreadsheets, and databases using third-party software
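
As a rough illustration of the first two items, the sketch below types a small dataset at the command line, saves it as comma-separated text of the kind a spreadsheet would produce, and reads it back. The variable names, values, and the file name scores.csv are invented for the example. It assumes Stata 13 or later, where import delimited and export delimited are available; older releases used insheet and outsheet for the same job.

```stata
* Type a tiny dataset straight into memory (made-up values)
clear
input int id str10 name double score
1 "Ann" 82.5
2 "Bob" 77
3 "Cara" 91.25
end

* Write it out as comma-separated text, as a spreadsheet export would
export delimited using "scores.csv", replace

* Read the text file back in, replacing the data in memory
import delimited using "scores.csv", clear
list
```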
ODBC support
- import data from any ODBC data source, such as Access, Excel, PostgreSQL, or MySQL
- export data to new or existing ODBC tables
- execute raw SQL commands individually or in batches
- support for ODBC on Windows, Macintosh, and Linux
Built-in spreadsheet editor
- for Windows, Macintosh, and Unix
Data-management functions
Data reorganization
- row–column transposition
- data reshaping
- stacking of variables
- collapsing into means, totals, etc.
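
Reshaping and collapsing are easiest to see on a toy dataset. The sketch below uses invented variables (id, inc2019, inc2020) to turn wide data into long form and then collapse it to one mean per year. It is an example of the two commands, not a complete tour of the reorganization tools.

```stata
* Wide layout: one row per person, one income column per year (made-up values)
clear
input int id double inc2019 double inc2020
1 5000 5500
2 2000 2200
end

* Wide to long: one row per person-year, with the year taken from the column suffix
reshape long inc, i(id) j(year)
list

* Collapse to one row per year holding the mean income
* (3500 for 2019 and 3850 for 2020 with the values above)
collapse (mean) mean_inc = inc, by(year)
list
```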
Labels
- dataset labels
- variable labels
- value labels (e.g., male and female for 0 and 1)
- ability to switch between multiple sets of data, variable, and value labels
- missing value labels
- multiple-language support
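
The three kinds of labels can be sketched as follows, using the 0/1 coding mentioned above. The variable name sex and the label name sexlbl are hypothetical and chosen only for this example.

```stata
* A two-observation dataset with a 0/1 coded variable
clear
input byte sex
0
1
end

* Dataset label, variable label, and value label in turn
label data "Toy dataset for label examples"
label variable sex "Sex of respondent"
label define sexlbl 0 "male" 1 "female"
label values sex sexlbl

* list now shows the label text; the stored values are still 0 and 1
list
list, nolabel
```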
Notes
- extensive notes can be attached to a dataset
Sorting
- ascending or descending sorts
- multiple-key sorts
- numeric and string sorts
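
A minimal sketch of the sorting options, using the auto dataset that ships with Stata: sort handles ascending multiple-key sorts, and gsort accepts a minus sign for descending order and also works with string keys.

```stata
sysuse auto, clear

* Ascending sort on two keys: origin first, then price within origin
sort foreign price
list make foreign price in 1/5

* Descending on mpg, ties broken by ascending make (a string variable)
gsort -mpg make
list make mpg in 1/5
```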
Merging datasets
- merge datasets
  - by key variables
  - by observations
- join datasets
  - outer join
- append datasets
- append time series
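
A key-variable merge might look like the sketch below. The two small datasets, their variable names, and the file name wages are invented for illustration, and the merge 1:1 syntax assumes Stata 11 or later. By default all observations from both sides are kept, which corresponds to an outer join. Adding the keep(match) option would keep matched observations only.

```stata
* First dataset: wages by id, saved to disk so it can serve as the using file
clear
input int id double wage
1 15.5
2 22
3 18.25
end
save wages, replace

* Second dataset: regions by id, left in memory as the master
clear
input int id str8 region
1 "North"
2 "South"
4 "West"
end

* Match on the key variable id; _merge records where each observation came from
merge 1:1 id using wages
tabulate _merge
list
```

With these values, ids 1 and 2 match, id 4 appears only in the master data, and id 3 only in the using data.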
Special datasets
Utilities
- compress (make dataset as small as possible without loss of accuracy)
- formatted and unformatted disk I/O
Variable management
- generation of new variables
- replacement of existing variables
- encoding and decoding string variables
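
The variable-management items above can be sketched on the auto dataset that ships with Stata. The new variable names (price_k, origin_str, origin_num) are made up for the example.

```stata
sysuse auto, clear

* Generate a new variable, then replace its contents in place
generate price_k = price / 1000
replace price_k = round(price_k, 0.1)

* foreign is numeric with a value label; decode makes a string copy of the labels
decode foreign, generate(origin_str)

* encode goes the other way: string values become a labeled numeric variable
encode origin_str, generate(origin_num)

describe price_k origin_str origin_num
```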
Dataset reports
- flexible description of variables, labels, and types
- codebooks for variables
- value-label reports
- duplicates and missing values
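
Each report type listed above has a corresponding command. The sketch below runs them on the auto dataset that ships with Stata, and assumes Stata 11 or later for misstable.

```stata
sysuse auto, clear

* Variables, storage types, display formats, and labels
describe

* Codebook entries for two variables
codebook rep78 foreign

* Contents of the value label attached to foreign (named origin in this dataset)
label list origin

* Duplicate and missing-value reports
duplicates report make
misstable summarize
```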
Variable types
- byte
- integer (int)
- long
- float
- double
- string
- dates
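
A short sketch of how storage types are chosen when a variable is created. All variable names are hypothetical. Dates are stored as numbers (days since 01jan1960 for daily dates) and shown as dates through a display format.

```stata
clear
set obs 3

* The storage type is given explicitly before the variable name
generate byte flag = 1
generate int count = _n * 100
generate long big = _n * 100000
generate double ratio = _n / 3
generate str5 code = "A" + string(_n)

* A daily date: numeric underneath, formatted for display with %td
generate visit = mdy(1, _n, 2020)
format visit %td

describe
list
```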