Data management


Creating Stata datasets
  • input data from command line
  • input data saved from spreadsheets
  • read data using a dictionary
  • read any type of ASCII data
  • read and write data in the format required by the FDA for NDA submittals
  • read and write XML-formatted data files, including those produced by Microsoft Excel
  • convert datasets directly from other statistical packages, spreadsheets, and databases using third-party software

ODBC support

  • import data from any ODBC data source, such as Access, Excel, PostgreSQL, or MySQL
  • export data to new or existing ODBC tables
  • execute raw SQL commands individually or in batches
  • support for ODBC on Windows, Macintosh, and Linux

Built-in spreadsheet editor

  • for Windows, Macintosh, and Unix

Data-management functions

Data reorganization

  • row–column transposition
  • data reshaping (wide to long and long to wide)
  • stacking of variables
  • collapsing into means, totals, etc.
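
As a rough illustration of reshaping and collapsing, the sketch below builds a tiny made-up dataset, converts it from wide to long form, and then reduces it to one mean per year. The variable names (id, inc2003, inc2004) and all values are invented for the example.

```stata
* Hypothetical wide-form data: one row per person, one income column per year
clear
input id inc2003 inc2004
1 100 110
2 200 190
end

* Wide to long: one row per person-year, with the year stored in a new variable
reshape long inc, i(id) j(year)
list

* Collapse to the mean income in each year (one row per year)
collapse (mean) inc, by(year)
list
```

Running reshape wide inc, i(id) j(year) on the long-form data (before the collapse) would convert it back to the original layout.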

Labels

  • dataset labels
  • variable labels
  • value labels (e.g., male and female for 0 and 1)
  • ability to switch between multiple sets of data, variable, and value labels
  • missing value labels
  • multiple-language support
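
A minimal sketch of how the three kinds of labels might be attached. The dataset, the variable sex, and the label name sexlbl are all invented for illustration.

```stata
* Hypothetical data: sex coded 0/1
clear
input id sex
1 0
2 1
3 1
end

* Dataset label
label data "Example survey extract"

* Variable label
label variable sex "Sex of respondent"

* Value label: define the mapping, then attach it to the variable
label define sexlbl 0 "male" 1 "female"
label values sex sexlbl

describe
list
```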

Notes

  • extensive notes can be attached to a dataset
 

Sorting

  • ascending or descending sorts
  • multiple-key sorts
  • numeric and string sorts
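
The following sketch shows one way ascending, descending, and multiple-key sorts could be written, using sort and gsort on made-up data (the variables grp and x are hypothetical).

```stata
* Hypothetical data with one string key and one numeric key
clear
input str1 grp x
"b" 2
"a" 3
"a" 1
end

* Ascending sort on two keys: grp first, then x within grp
sort grp x
list

* Descending sort on a numeric variable
gsort -x
list

* Mixed: ascending on the string key, descending on x within each group
gsort grp -x
list
```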

Merging datasets

  • merge datasets
    • by key variables
    • by observations
  • join datasets
  • outer join
  • append datasets
  • append time series
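
As a hedged sketch of appending and of merging by a key variable, the do-file fragment below uses temporary files and invented data. The merge 1:1 syntax shown is the form used in Stata 11 and later; earlier releases use a different merge syntax, so treat this as an assumption about the installed version.

```stata
* Hypothetical second batch of observations, saved to a temporary file
clear
input id score
3 70
4 80
end
tempfile extra
save "`extra'"

* Hypothetical lookup table keyed on id
clear
input id age
1 30
2 41
3 25
4 52
end
tempfile ages
save "`ages'"

* Start from the first batch and append the second (adds observations)
clear
input id score
1 90
2 85
end
append using "`extra'"

* Merge by key variable (adds variables); _merge records the match status
merge 1:1 id using "`ages'"
list
```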

Special datasets

Utilities

  • compress (make dataset as small as possible without loss of accuracy)
  • formatted and unformatted disk I/O
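
To illustrate compress, the sketch below deliberately stores small whole numbers as double and then lets compress choose a smaller storage type. The variable x is invented for the example.

```stata
* Store the integers 1 through 5 in an 8-byte double on purpose
clear
set obs 5
generate double x = _n
describe

* compress demotes x to the smallest type that holds its values exactly
compress
describe
```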

Variable management

  • generation of new variables
  • replacement of existing variables
  • encoding and decoding string variables
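
A small sketch of the operations listed above, using generate, replace, encode, and decode on made-up data. All variable names here are hypothetical.

```stata
* Hypothetical data: a string variable and a numeric variable
clear
input str5 region sales
"north" 10
"south" 20
"north" 30
end

* Generate a new variable, then replace some of its values
generate double_sales = sales * 2
replace double_sales = 0 if region == "south"

* encode: string to labeled numeric; decode: labeled numeric back to string
encode region, generate(region_code)
decode region_code, generate(region_text)
list
```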

Dataset reports

  • flexible description of variables, labels, and types
  • codebooks for variables
  • value-label reports
  • duplicates and missing values

Variable types

  • byte
  • integer (int)
  • long
  • float
  • double
  • string
  • dates

© Copyright 2005 Stata Corporation.