Table of descriptives

[This example is outdated; see the estpost examples on Summary statistics (summarize) and Summary statistics (tabstat).]

Research papers usually contain a table of descriptive statistics for all variables in the analysis. The following example shows how to produce such a table with estadd summ and esttab. Assume your analysis uses price as the dependent variable and weight, mpg, and foreign as independent variables. To create a descriptives table that includes all four variables, type:

. sysuse auto, clear
(1978 Automobile Data)

. generate y = uniform()

. quietly regress y price weight mpg foreign, noconstant

. estadd summ

added matrices:
                 e(sd) :  1 x 4
                e(max) :  1 x 4
                e(min) :  1 x 4
               e(mean) :  1 x 4

. esttab, cells("mean sd min max") nogap nomtitle nonumber

----------------------------------------------------------------
                     mean           sd          min          max
----------------------------------------------------------------
price            6165.257     2949.496         3291        15906
weight           3019.459     777.1936         1760         4840
mpg               21.2973     5.785503           12           41
foreign          .2972973     .4601885            0            1
----------------------------------------------------------------
N                      74                                       
----------------------------------------------------------------

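The trick is to generate a placeholder variable and regress it on all variables of interest, including the actual dependent variable. Because estadd summ summarizes the regressors of the stored model, every variable then appears in the table.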
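The note at the top of this section points to estpost as the newer route. As a rough sketch, assuming an installed version of the estout package that includes estpost, a comparable table could be requested without the placeholder regression. The variable list and cell names mirror the example above; the output is not reproduced here.

```stata
sysuse auto, clear

* Post summary statistics for all four variables directly in e()
* (assumes estpost from the estout package is installed)
estpost summarize price weight mpg foreign

* Tabulate the posted mean, sd, min, and max vectors
esttab, cells("mean sd min max") nomtitle nonumber
```

With this approach no fake dependent variable is needed, because the statistics are posted in e() directly.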

Descriptives by subgroups

[This example is outdated; see the estpost examples on Summarize by subgroups and Tabstat by subgroups.]

A table of descriptive statistics by subgroups can easily be produced using by and eststo:

. sysuse auto, clear
(1978 Automobile Data)

. generate y = uniform()

. by foreign: eststo: quietly regress y price weight mpg, nocons

---------------------------------------------------------------------------------------
-> Domestic
(est1 stored)

---------------------------------------------------------------------------------------
-> Foreign
(est2 stored)

. estadd summ : *

. esttab, main(mean) aux(sd) label nodepvar nostar nonote

----------------------------------------------
                              (1)          (2)
                         Domestic      Foreign
----------------------------------------------
Price                      6072.4       6384.7
                         (3097.1)     (2621.9)

Weight (lbs.)              3317.1       2315.9
                          (695.4)      (433.0)

Mileage (mpg)               19.83        24.77
                          (4.743)      (6.611)
----------------------------------------------
Observations                   52           22
----------------------------------------------

. eststo clear
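For comparison, the estpost variant mentioned in the note above might look like the following sketch. It assumes estpost is available from the installed estout package and reuses the variables from the example; the layout differs slightly, since mean and sd are placed side by side rather than stacked.

```stata
sysuse auto, clear

* Store one set of posted summary statistics per category of foreign
* (assumes estpost from the estout package is installed)
by foreign: eststo: quietly estpost summarize price weight mpg

* Mean and standard deviation side by side, one block per subgroup
esttab, cells("mean sd") label nodepvar

* Drop the stored sets again
eststo clear
```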

Tabulating results from t-tests

[This example is outdated; see the estpost examples on Two-sample t-tests.]

Almost anything can be tabulated by estout or esttab once it has been posted in e(). Here is an example with t-tests:

. capt prog drop myttests

. *! version 1.0.0  14aug2007  Ben Jann
. program myttests, eclass
  1.     version 8
  2.     syntax varlist [if] [in], by(varname) [ * ]
  3.     marksample touse
  4.     markout `touse' `by'
  5.     tempname mu_1 mu_2 d d_se d_t d_p
  6.     foreach var of local varlist {
  7.         qui ttest `var' if `touse', by(`by') `options'
  8.         mat `mu_1' = nullmat(`mu_1'), r(mu_1)
  9.         mat `mu_2' = nullmat(`mu_2'), r(mu_2)
 10.         mat `d'    = nullmat(`d'   ), r(mu_1)-r(mu_2)
 11.         mat `d_se' = nullmat(`d_se'), r(se)
 12.         mat `d_t'  = nullmat(`d_t' ), r(t)
 13.         mat `d_p'  = nullmat(`d_p' ), r(p)
 14.     }
 15.     foreach mat in mu_1 mu_2 d d_se d_t d_p {
 16.         mat coln ``mat'' = `varlist'
 17.     }
 18.     tempname b V
 19.     mat `b' = `mu_1'*0
 20.     mat `V' = `b''*`b'
 21.     eret post `b' `V'
 22.     eret local cmd "myttests"
 23.     foreach mat in mu_1 mu_2 d d_se d_t d_p {
 24.         eret mat `mat' = ``mat''
 25.     }
 26. end

. sysuse auto, clear
(1978 Automobile Data)

. myttests price weight mpg, by(foreign)

. ereturn list

macros:
                e(cmd) : "myttests"
         e(properties) : "b V"

matrices:
                  e(b) :  1 x 3
                  e(V) :  3 x 3
                e(d_p) :  1 x 3
                e(d_t) :  1 x 3
               e(d_se) :  1 x 3
                  e(d) :  1 x 3
               e(mu_2) :  1 x 3
               e(mu_1) :  1 x 3

. esttab, nomtitle nonumbers noobs ///
>     cells("mu_1(fmt(a3)) mu_2 d(star pvalue(d_p))" ". . d_se(par)")

------------------------------------------------------
                     mu_1         mu_2       d/d_se   
------------------------------------------------------
price              6072.4       6384.7       -312.3   
                                            (754.4)   
weight             3317.1       2315.9       1001.2***
                                            (160.3)   
mpg                 19.83        24.77       -4.946***
                                            (1.362)   
------------------------------------------------------

An alternative approach would be to save three sets of estimates, one for each group, and one for the differences.
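The estpost command referenced in the note at the top of this section offers a shorter route than a hand-written program. The following is only a sketch, assuming estpost from the estout package is installed. It posts the differences in means rather than the group means as the main result, so the table layout differs from the one above.

```stata
sysuse auto, clear

* Post two-sample t-test results for each variable in e()
* (assumes estpost from the estout package is installed)
estpost ttest price weight mpg, by(foreign)

* Differences in means with t statistics in the adjacent column
esttab, wide nonumber mtitle("diff.")
```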

Frequency tables

[This example is outdated; see the estpost examples on One-way frequency table and Two-way frequency table.]

With a little programming, you can even produce frequency tables with estout. Here is an example for a one-way table:

. capt prog drop e_tabulate

. *! version 1.0.0  24sep2007  Ben Jann
. prog e_tabulate, eclass
  1.     version 8.2
  2.     syntax varname(numeric) [if] [in] [fw aw iw] [, noTOTal * ]
  3.     tempname count percent vals V
  4.     tab `varlist' `if' `in' [`weight'`exp'], ///
>         matcell(`count') matrow(`vals') `options'
  5.     local N = r(N)
  6.     mat `count' = `count''
  7.     forv r =1/`=rowsof(`vals')' {
  8.         local value: di `vals'[`r',1]
  9.         local label: label (`varlist') `value'
 10.         local values "`values' `value'"
 11.         local labels `"`labels' `value' `"`label'"'"'
 12.     }
 13.     if "`total'"=="" {
 14.         mat `count' = `count', `N'
 15.         local values "`values' total"
 16.         local labels `"`labels' total `"Total"'"'
 17.     }
 18.     mat colname `count' = `values'
 19.     mat `percent' = `count'/`N'*100
 20.     mat `V' = `count''*`count'*0
 21.     eret post `count' `V', depname(`varlist') obs(`N')
 22.     eret local cmd "e_tabulate"
 23.     eret local depvar "`varlist'"
 24.     eret local labels `"`labels'"'
 25.     eret mat percent = `percent'
 26. end

. sysuse auto, clear
(1978 Automobile Data)

. e_tabulate foreign

   Car type |      Freq.     Percent        Cum.
------------+-----------------------------------
   Domestic |         52       70.27       70.27
    Foreign |         22       29.73      100.00
------------+-----------------------------------
      Total |         74      100.00

. ereturn list

scalars:
                  e(N) =  74

macros:
             e(labels) : " 0 `"Domestic"' 1 `"Foreign"' total `"Total"'"
             e(depvar) : "foreign"
                e(cmd) : "e_tabulate"
         e(properties) : "b V"

matrices:
                  e(b) :  1 x 3
                  e(V) :  3 x 3
            e(percent) :  1 x 3

. mat list e(b)

e(b)[1,3]
        0      1  total
y1     52     22     74

. mat list e(percent)

e(percent)[1,3]
           0         1     total
c1  70.27027  29.72973       100

. esttab, cell("b percent") noobs nonumbers nomtitles ///
>     collabels(Freq. Percent, lhs(`:var lab `e(depvar)'')) ///
>     varlabels(`e(labels)', blist(total "{hline @width}{break}"))

--------------------------------------
Car type            Freq.      Percent
--------------------------------------
Domestic               52     70.27027
Foreign                22     29.72973
--------------------------------------
Total                  74          100
--------------------------------------
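As the note at the top of this section indicates, estpost now covers this case. A minimal sketch, assuming estpost from the estout package is installed and that it posts frequencies in e(b) and percentages in e(pct) and e(cumpct), might read:

```stata
sysuse auto, clear

* Post the one-way frequency table in e()
* (assumes estpost from the estout package is installed)
estpost tabulate foreign

* Frequencies, percentages, and cumulative percentages in one row each
esttab, cells("b(label(freq)) pct(fmt(2)) cumpct(fmt(2))") ///
    nonumber nomtitle noobs
```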

To construct a two-way table, store the conditional distribution for each table column as a separate set of estimates. Example:

. bys foreign: eststo: e_tabulate rep

---------------------------------------------------------------------------------------
-> Domestic

     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2        4.17        4.17
          2 |          8       16.67       20.83
          3 |         27       56.25       77.08
          4 |          9       18.75       95.83
          5 |          2        4.17      100.00
------------+-----------------------------------
      Total |         48      100.00
(est1 stored)

---------------------------------------------------------------------------------------
-> Foreign

     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          3 |          3       14.29       14.29
          4 |          9       42.86       57.14
          5 |          9       42.86      100.00
------------+-----------------------------------
      Total |         21      100.00
(est2 stored)

. esttab, main(percent 2) not nostar mtitles noobs nonote  ///
>     varlab(`e(labels)', blist(total "{hline @width}{break}"))

--------------------------------------
                      (1)          (2)
                 Domestic      Foreign
--------------------------------------
1                    4.17             
2                   16.67             
3                   56.25        14.29
4                   18.75        42.86
5                    4.17        42.86
--------------------------------------
Total              100.00       100.00
--------------------------------------

. eststo clear
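A two-way table can probably also be obtained in a single step with estpost. The sketch below assumes estpost from the estout package is installed and that it posts column percentages in e(colpct); the variable names follow the example above.

```stata
sysuse auto, clear

* Post the two-way table of rep78 by foreign in e()
* (assumes estpost from the estout package is installed)
estpost tabulate rep78 foreign

* Column percentages, one column per category of foreign
esttab, cell(colpct(fmt(2))) unstack noobs
```

The unstack option spreads the posted results across columns, which takes the place of storing one set of estimates per subgroup.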