Using arrows to illustrate change

Here is an example that shows how arrows can be used to illustrate change:

. webuse nlswork, clear
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. mean ln_wage if year==88, over(ind_code)
(output omitted)

. matrix b88 = e(b)

. mean ln_wage if year==78, over(ind_code)
(output omitted)

. mata: assert(st_matrixcolstripe("b88")==st_matrixcolstripe("e(b)"))

. quietly estadd matrix b88

. coefplot, ci((b b) (b b88)) ciopts(recast(rcap pcarrow)) cionly ///
>     vertical sort ///
>     xtitle("Industry code") ytitle("Change in ln(wage) from 78 to 88")
varia/arrows.svg

The trick is to add the results from both years to the same estimation set (e.g., using the estadd command from the estout package) and then plot them with the ci() option. The first CI has zero width and draws a cap at the origin; the second CI draws an arrow from the origin to the destination. The mata: assert() command in the example verifies that both result vectors have the same structure (coefplot does not check this).
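
The structural check can also be written without Mata. The following is a minimal sketch, not part of the original example: it stores both coefficient vectors as matrices (b78 is a name introduced here for illustration) and compares their full column names using the colfullnames extended macro function and the macro list operator ==.

```stata
* Sketch: verify that two result vectors share the same column names
webuse nlswork, clear
quietly mean ln_wage if year==88, over(ind_code)
matrix b88 = e(b)
quietly mean ln_wage if year==78, over(ind_code)
matrix b78 = e(b)                      // b78 is an illustrative name

local names88 : colfullnames b88
local names78 : colfullnames b78
local same : list names88 == names78   // 1 if same names in the same order
assert `same'                          // stops with an error if the names differ
```

If the assertion fails, the two years do not cover the same set of industries and the arrows would connect mismatched categories.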

Results are sorted by the initial level in the example. To sort by destination level, you could type:

. coefplot, ci((b b) (b b88)) ciopts(recast(rcap pcarrow)) cionly ///
>     vertical sort(, by(ul 2)) ///
>     xtitle("Industry code") ytitle("Change in ln(wage) from 78 to 88") 
varia/arrows2.svg

Bar charts

Values as marker labels

Unfortunately, twoway bar (which is used by coefplot if you specify recast(bar)) does not seem to support marker labels. You can add the marker labels by superimposing an (invisible) scatter plot. Here is an example using internal temporary variables and the addplot() option:

. sysuse auto, clear
(1978 Automobile Data)

. proportion rep if foreign==0 & rep78>=3
(output omitted)

. estimates store domestic

. proportion rep if foreign==1 & rep78>=3
(output omitted)

. estimates store foreign

. coefplot domestic foreign, vertical recast(bar) barwidth(0.3) fcolor(*.5) ///
>     ciopts(recast(rcap)) citop citype(logit) format(%9.2f) ///
>     addplot(scatter @b @at, ms(i) mlabel(@b) mlabpos(2) mlabcolor(black))
varia/barlbl.svg
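
The same idea works outside coefplot. As a rough sketch in plain twoway (the variable lbl and the use of mean prices are illustrative choices, not taken from the example above), an invisible scatter is overlaid on the bars only to carry the labels:

```stata
* Sketch: label bars in plain twoway by overlaying an invisible scatter
sysuse auto, clear
collapse (mean) price, by(rep78)       // replaces the data: one row per rep78 level
drop if missing(rep78)
generate lbl = string(price, "%9.0f")  // formatted label text (illustrative variable)

twoway (bar price rep78, barwidth(0.6)) ///
       (scatter price rep78, msymbol(i) mlabel(lbl) mlabposition(12)), ///
       legend(off) ytitle("Mean price")
```

Option msymbol(i) hides the markers, so only the labels of the scatter layer are visible.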

Here is an example in which the marker labels are plotted inside the bars:

. coefplot domestic foreign, vertical noci format(%9.1f) rescale(100) ///
>     recast(bar) barwidth(0.3) fcolor(*.5) plotregion(margin(b=0)) ///
>     rename(* = .rep78) coeflabels(, notick labgap(2)) ///
>     ylabel(0(10)70, angle(horizontal) format(%9.0f)) ytitle(Percent) ///
>     addplot(scatter @b @at if @plot==1, ms(i) mlabel(@b) mlabpos(6) pstyle(p1) ///
>          || scatter @b @at if @plot==2, ms(i) mlabel(@b) mlabpos(6) pstyle(p2))
varia/barlbl2.svg

Instead of using the addplot() option you can also use a second layer of regular plots, but this is a bit more complicated:

. coefplot (domestic, recast(bar) barwidth(0.3) fcolor(*.5) offset(-.175)) ///
>          (foreign, recast(bar) barwidth(0.3) fcolor(*.5) offset(.175)) ///
>          (domestic, mlabel mlabpos(6) nokey ms(i) pstyle(p1) offset(-.175)) ///
>          (foreign,  mlabel mlabpos(6) nokey ms(i) pstyle(p2) offset(.175)) ///
>     , vertical noci rescale(100) format(%9.1f) plotregion(margin(b=0)) ///
>       rename(* = .rep78) coeflabels(, notick labgap(2)) ///
>       ylabel(0(10)70, angle(horizontal) format(%9.0f)) ytitle(Percent)
varia/barlbl3.svg

Different plot style for each bar

If you want to use different styling for each bar in a bar chart you need to place each bar in a separate series. In the following example the first bar is plotted using the first pstyle, the second using the second pstyle, and so on:

. sysuse auto, clear
(1978 Automobile Data)

. logit foreign i.rep78 weight if rep78>=3
(output omitted)

. margins i.rep78, post
(output omitted)

. coefplot (., keep(3.rep78)) ///
>          (., keep(4.rep78)) ///
>          (., keep(5.rep78)) ///
>     , vertical legend(off) nooffsets recast(bar) barwidth(0.8) fcolor(*.8) ///
>       citop ciopts(recast(rcap)) citype(logit) ///
>       coeflabels(, notick labgap(2)) plotregion(margin(b=0))
varia/barpsty.svg

By default, coefplot adds some offset to the plot positions if multiple series are plotted, so that the elements do not overlap. Here the offsets are pointless because each series contains only one bar. Option nooffsets has been specified to remove the offsets and make sure the bars are centered.

Here is an example in which the same pstyle is used for all bars, but additional styling is added to one of the bars:

. coefplot (., keep(3.rep78)) ///
>          (., keep(4.rep78) lwidth(2) lcolor(red)) ///
>          (., keep(5.rep78)) ///
>     , vertical legend(off) nooffsets pstyle(p1) ///
>       recast(bar) barwidth(0.8) fcolor(*.8) ///
>       citop ciopts(recast(rcap)) citype(logit) ///
>       coeflabels(, notick labgap(2)) plotregion(margin(b=0))
varia/barpsty2.svg

Stacked bar chart

Here is a rather involved example illustrating how to produce a stacked bar chart:

. use http://repec.sowi.unibe.ch/files/wp8/ASQ-ETHBE-2011.dta, clear
(Online Survey on "Exams and Written assignments" 2011)

. local vars    q21_1 q21_2 q21_3 q21_4

. local lblname q21_

. local levels  1 2 3 4 5

. local nvars: list sizeof vars

. local nlevels: list sizeof levels

. matrix p = J(`nlevels', `nvars', .)

. matrix colnames p = `vars'

. matrix rownames p = `levels'

. local i 0

. foreach v of local vars {
  2.     local ++i
  3.     quietly proportion `v'
  4.     matrix p[1,`i'] = e(b)' * 100
  5. }

. matrix r = p

. mata: st_replacematrix("r", mm_colrunsum(st_matrix("p")))

. mata: st_matrix("l", (J(1,`nvars',0) \ st_matrix("r")[1::`nlevels'-1,]))

. matrix m = r

. mata: st_replacematrix("m", (st_matrix("l") :+ st_matrix("r"))/2)

. local plots

. local i 0

. foreach l of local levels {
  2.     local ++i
  3.     local lbl: lab `lblname' `l'
  4.     local plots `plots' (matrix(m[`i']), ci((l[`i'] r[`i'])) aux(p[`i']) ///
>         key(ci) label(`lbl'))
  5. }

. coefplot `plots', nooffset ms(i) mlabel(@aux) mlabpos(0) format(%9.0f) ///
>     coeflabels(, wrap(30)) ciopts(recast(rbar) barwidth(0.5)) ///
>     legend(rows(1) span stack)
varia/stackedbars.svg
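
To see what the matrices l, r, and m hold, here is a small sketch for a single variable with made-up percentages. It uses a plain loop instead of the Mata calls in the example, so it illustrates the logic rather than reproducing the original code: l contains the lower bound of each segment, r the upper bound (the running sum), and m the midpoint at which the label is placed.

```stata
* Sketch: segment bounds and midpoints for one stacked bar (made-up percentages)
matrix p = (10 \ 20 \ 40 \ 20 \ 10)
matrix l = J(5, 1, 0)
matrix r = J(5, 1, 0)
matrix m = J(5, 1, 0)
local cum 0
forvalues i = 1/5 {
    matrix l[`i', 1] = `cum'                        // segment starts at previous total
    local cum = `cum' + p[`i', 1]
    matrix r[`i', 1] = `cum'                        // segment ends at running sum
    matrix m[`i', 1] = (l[`i', 1] + r[`i', 1]) / 2  // label position
}
matrix list m
```

In the coefplot call, each segment is then drawn as a "confidence interval" from l to r (recast as rbar), and the percentage from p is printed at m as an invisible marker's label.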

The addplot command

Subgraph-specific xlines

The addplot command is a tool that allows you to add elements to a graph after it has been created (type ssc install addplot to install the command; also see Jann 2015). For example, coefplot does not support subgraph-specific xline() options. However, you can use addplot to add the lines after coefplot has created the graph:

. sysuse auto, clear
(1978 Automobile Data)

. logit foreign mpg trunk length turn
(output omitted)

. coefplot ., bylabel(Log odds) || ., bylabel(Odds ratios) eform ///
>     || , drop(_cons) nolabel byopts(xrescale)

. addplot 1: , xline(0) norescaling

. addplot 2: , xline(1) norescaling
varia/addplot.svg

The norescaling option is essential so that addplot does not mess up the labeling of the axes. Always include this option when applying addplot to a graph produced by coefplot.

Subgraph-specific axis titles

Likewise, you can use addplot to add separate axis titles to the subgraphs:

. sysuse auto, clear
(1978 Automobile Data)

. regress price mpg trunk length turn
(output omitted)

. estimates store Price

. regress weight mpg trunk length turn
(output omitted)

. estimates store Weight

. coefplot Price || Weight, drop(_cons) xline(0) byopts(xrescale) 

. addplot 1: , b1title("Dollars") norescaling

. addplot 2: , b1title("Pounds") norescaling
varia/addplot2.svg

Subgraph-specific legends

It is also possible to use addplot to create a separate legend for each subgraph:

. sysuse auto, clear
(1978 Automobile Data)

. forvalues i = 2/5 {
  2.     quietly regress price mpg trunk length turn if rep78==`i'
  3.     estimates store rep`i'
  4. }

. coefplot rep2 rep3, bylabel(Low record)  ///
>     ||   rep4 rep5, bylabel(High record) ///
>     || , drop(_cons) xline(0) norecycle byopts(legend(off))

. addplot 1: , legend(order(2 "rep78=2" 4 "rep78=3") on) norescaling

. addplot 2: , legend(order(6 "rep78=4" 8 "rep78=5") on) norescaling
varia/addplot3.svg

In the logic of the legend() option, the confidence spikes are separate objects. This is why the relevant legend keys in this example are 2, 4, 6, and 8, not 1–4.
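
The key numbers can be worked out mechanically. The following sketch assumes, as the keys used in the example suggest, that each series contributes one confidence spike followed by one marker, so that the marker of the s-th series has key 2s. The macro names are illustrative.

```stata
* Sketch: build the legend order() list for four series (rep78 = 2, ..., 5)
local order
forvalues s = 1/4 {
    local key = 2 * `s'          // spike is object 2s-1, marker is object 2s
    local rep = `s' + 1          // series s holds the model for rep78 = s+1
    local order `order' `key' "rep78=`rep'"
}
display `"`order'"'
```

With more than one confidence level per series, the spacing between marker keys would be larger, so the factor 2 is specific to this setup.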

Subgraphs with different sizes

If you want to create subgraphs with different sizes you need to generate separate graphs and then combine them with graph combine. Option fxsize() can be used to control the sizes of the subgraphs:

. sysuse auto, clear
(1978 Automobile Data)

. regress price mpg trunk length turn
(output omitted)

. estimates store price

. regress weight mpg trunk length turn
(output omitted)

. estimates store weight

. coefplot price, drop(_cons) subtitle(Price, box bexpand lstyle(none)) ///
>     name(price) nodraw

. coefplot weight, drop(_cons) subtitle(Weight, box bexpand lstyle(none)) ///
>     name(weight) nodraw yscale(off) fxsize(40)

. graph combine price weight, imargin(small)

. graph drop price weight
varia/fxsize.svg

Model names as coefficient names

There are many applications in which one wants to display a series of coefficients that all have the same name. For example, you may want to display how the effect of a specific variable differs by subpopulation, by dependent variable, by choice of control variables, or by estimation methodology. Unfortunately, if such coefficients are merged into a single series of estimates, they will be printed on top of each other. To assign unique names to the coefficients in such a situation, the asequation and swapnames options can be useful. Here is an example:

. sysuse nlsw88, clear
(NLSW, 1988 extract)

. foreach i in 4 6 7 11 12 {
  2.     quietly regress wage grade ttl_exp tenure if industry==`i'
  3.     estimates store industry_`i'
  4. }

. coefplot (industry_*), keep(grade) asequation swapnames ///
>     title("Effect of grade on wages by industry")
varia/aseqswap.svg

The trick is to store the models under the names you want to use for the coefficients. asequation will cause coefplot to use these names as equation names, and swapnames interchanges equation names and coefficient names. The result of both options together is that the model names are used as coefficient names.

Note that swapnames is applied after processing possible rename() and eqrename() options. Hence, use eqrename() to edit the coefficient names in this case. In the following example, the names are modified in a way such that coefplot picks up the industry value labels (e.g. industry_4 becomes 4.industry):

. coefplot (industry_*), keep(grade) asequation swapnames ///
>     eqrename(^industry_(.*)$ = \1.industry, regex) ///
>     title("Effect of grade on wages by industry")
varia/aseqswap2.svg
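
To check what such a pattern does to a single name, the transformation can be tried in isolation. This is only a sketch using Stata's regexm() and regexs() functions; how coefplot processes the expression internally may differ, and the macro names are illustrative.

```stata
* Sketch: apply the pattern ^industry_(.*)$ to one model name
local name industry_4
if regexm("`name'", "^industry_(.*)$") {
    local newname = regexs(1) + ".industry"   // captured part, then ".industry"
}
display "`newname'"
```

The captured group is everything after the prefix industry_, which corresponds to \1 in the eqrename() specification.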