The basic procedure is to compute one or more sets of estimates (e.g.
regression models) and then apply coefplot to these estimation sets to draw
a plot displaying the point estimates and their confidence intervals.
Estimation commands store their results in the so-called e() returns (type
ereturn list after running an estimation command to see a list of what has
been stored). By default, coefplot retrieves the point estimates from (the
first equation in) vector e(b) and computes confidence intervals from the
variance estimates found in matrix e(V). See the Estimates and Confidence
intervals examples for information on how to change these defaults.
Furthermore, coefplot can also read results from matrices that are not
stored as part of an estimation set; see Plotting results from matrices
below.
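To see what coefplot works with by default, you can inspect the stored
results yourself. The following is a minimal sketch that assumes the auto
dataset shipped with Stata; the particular regression is an arbitrary choice
for illustration:
. sysuse auto, clear
. quietly regress price mpg trunk length turn
. ereturn list
. matrix list e(b)
. matrix list e(V)
The elements of e(b) are the point estimates that coefplot would plot for
this model, and the variance estimates in e(V) are what the default
confidence intervals are computed from.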
The syntax to produce a plot of the coefficients of a single model is
coefplot [name] [, options]
where name is the name of a stored model (see help estimates store), or . or
an empty string for the active model. For example, to plot the point
estimates and 95% confidence intervals for the most recent model, type:
. sysuse auto, clear
(1978 Automobile Data)
. regress price mpg trunk length turn

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(4, 69)        =      5.79
       Model |   159570047         4  39892511.8   Prob > F        =    0.0004
    Residual |   475495349        69  6891236.94   R-squared       =    0.2513
-------------+----------------------------------   Adj R-squared   =    0.2079
       Total |   635065396        73  8699525.97   Root MSE        =    2625.1

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -186.8417   88.17601    -2.12   0.038     -362.748   -10.93533
       trunk |  -12.72642   104.8785    -0.12   0.904    -221.9534    196.5005
      length |   54.55294   35.56248     1.53   0.130    -16.39227    125.4981
        turn |  -200.3248   140.0166    -1.43   0.157    -479.6502    79.00066
       _cons |   8009.893   6205.538     1.29   0.201    -4369.817     20389.6
------------------------------------------------------------------------------

. coefplot, drop(_cons) xline(0)
Option drop(_cons) has been added to exclude the constant of the model;
option xline(0) has been added to draw a reference line at zero so one can
better see which coefficients are significantly different from zero.
By default, coefplot uses a horizontal layout in which the names of the
coefficients are placed on the Y-axis and the estimates and their
confidence intervals are plotted along the X-axis. Specify option vertical
to use a vertical layout:
. coefplot, vertical drop(_cons) yline(0)
Note that, because the axes were flipped, we now have to use yline(0)
instead of xline(0).
By default, coefplot displays all coefficients from the first equation
of a model. Alternatively, options keep() and drop() can be used to
specify the elements to be displayed. For example, above, option drop(_cons)
was used to exclude the constant. Furthermore, coefplot automatically
excludes coefficients that are flagged as "omitted" or as "base levels". To
include such coefficients in the plot, specify options omitted and
baselevels.
For example, if you want to display all equations from a multinomial logit model (including the equation for the base outcome for which all coefficients are zero by definition), type:
. sysuse auto, clear
(1978 Automobile Data)
. gen mpp = mpg/8
. mlogit rep78 mpp i.foreign if rep>=3
(output omitted)
. coefplot, nolabel drop(_cons) keep(*:) omitted baselevels
Option keep(*:) selects all equations for display, not just the first one.
For detailed information on the syntax, see the description of the keep()
option in the help file.
Here is a further example that illustrates how keep() can be used to select
different coefficients depending on the equation:
. coefplot, nolabel keep(3:*.foreign 4:mpp 5:mpp _cons) omitted baselevels
The syntax to include multiple models as separate series in the same graph is
coefplot (name [, plotopts]) (name [, plotopts]) ... [, globalopts]
where plotopts are options that apply to a single series. These options
specify the information to be collected, affect the rendition of the series,
and provide a label for the series in the legend. globalopts are options
that apply to the overall graph, such as titles or axis labels, but may also
contain any options allowed as plot options to provide defaults for the
single series. A basic example is as follows:
. sysuse auto, clear
(1978 Automobile Data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store F
. coefplot D F, drop(_cons) xline(0)
To specify separate options for an individual model, enclose the model and its options in parentheses. For example, to add a label for each plot in the legend, to use alternative plot styles, and to change the marker symbol, you could type:
. coefplot (D, label(Domestic Cars) pstyle(p3)) ///
> (F, label(Foreign Cars) pstyle(p4)) ///
> , drop(_cons) xline(0) msymbol(S)
Option msymbol() is specified as a global option so that the same symbol is
used in both series. To use different symbols, include an individual
msymbol() option for each model.
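For instance, assuming models D and F are stored as above, a sketch with a
separate symbol for each series could look as follows (the symbols S and T,
that is, square and triangle, are arbitrary choices for illustration):
. coefplot (D, label(Domestic Cars) pstyle(p3) msymbol(S)) ///
> (F, label(Foreign Cars) pstyle(p4) msymbol(T)) ///
> , drop(_cons) xline(0)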
Alternatively, you can also use p1(), p2(), etc. to specify options for the
different series:
. coefplot D F, drop(_cons) xline(0) msymbol(S) ///
> p1(label(Domestic Cars) pstyle(p3)) ///
> p2(label(Foreign Cars) pstyle(p4))
coefplot offsets the plot positions of the coefficients so that the
confidence spikes do not overlap. To deactivate the automatic offsets, you
can specify global option nooffsets.
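A minimal sketch, assuming models D and F are still stored from the example
above:
. coefplot D F, drop(_cons) xline(0) nooffsets
With this option both series are drawn at the same positions, so markers and
confidence spikes may be printed on top of each other.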
Alternatively, custom offsets may be specified by the offset() option (if
offset() is specified for at least one model, automatic offsets are
disabled). The spacing between coefficients is one unit, so usually offsets
between -0.5 and 0.5 make sense. For example, if you want to use smaller
offsets than the default, you could type:
. sysuse auto, clear
(1978 Automobile Data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store F
. coefplot (D, offset(0.05)) (F, offset(-0.05)), drop(_cons) xline(0)
If the dependent variables of the models you want to include in the graph
have different scales, it can be useful to employ the axis() plot option to
assign specific axes to the models. For example, to include a regression on
price and a regression on weight in the same graph, type:
. sysuse auto, clear
(1978 Automobile Data)
. regress price mpg trunk length turn
(output omitted)
. estimates store Price
. regress weight mpg trunk length turn
(output omitted)
. estimates store Weight
. coefplot Price (Weight, axis(2)), drop(_cons) xtitle(Price) xtitle(Weight, axis(2))
The syntax to merge multiple models into the same series is
coefplot (namelist [, plotopts]) ...
or, more precisely,
coefplot (namelist [, modelopts] \ namelist [, modelopts] \ ... [, plotopts]) ...
where modelopts are options that apply to a single model. For example, if
you want to draw a graph comparing bivariate and multivariate effects, you
could type:
. sysuse auto, clear
(1978 Automobile Data)
. regress price mpg trunk length turn
(output omitted)
. estimates store multivariate
. foreach var in mpg trunk length turn {
  2.     quietly regress price `var'
  3.     estimates store `var'
  4. }
. coefplot (mpg trunk length turn, label(bivariate)) ///
> (multivariate) ///
> , drop(_cons) xline(0)
When merging multiple models you may need to apply some renaming of
coefficients, because coefficients that have the same name will be printed
on top of each other. This can be achieved by applying the rename() option
to the individual models. An alternative approach is presented in
Model names as coefficient names.
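As a sketch of the rename() approach, assume the bivariate models mpg and
trunk and the model multivariate are stored as in the example above. The
following merges all three into a single series and renames the bivariate
coefficients so that they do not collide with the multivariate ones (the new
names mpg_biv and trunk_biv are hypothetical and chosen for illustration
only):
. coefplot (mpg, rename(mpg = mpg_biv) \ ///
> trunk, rename(trunk = trunk_biv) \ ///
> multivariate) ///
> , drop(_cons) xline(0)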
The syntax to create subgraphs is
coefplot plotlist [, subgropts] || plotlist [, subgropts] || ... [, globalopts]
where plotlist is a list of models as above and subgropts are options that
apply to a single subgraph. An example with one model per subgraph is as
follows:
. sysuse auto, clear
(1978 Automobile Data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store F
. coefplot D, bylabel(Domestic Cars) ///
> || F, bylabel(Foreign Cars) ///
> ||, drop(_cons) xline(0)
An example with multiple models per subgraph is:
. sysuse auto, clear
(1978 Automobile Data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store F
. regress weight mpg trunk length turn if foreign==0
(output omitted)
. estimates store D_weight
. regress weight mpg trunk length turn if foreign==1
(output omitted)
. estimates store F_weight
. coefplot (D, label(Domestic)) (F, label(Foreign)), bylabel(Price) ///
> || (D_weight) (F_weight) , bylabel(Weight) ///
> ||, drop(_cons) xline(0) byopts(xrescale)
Option byopts(xrescale) has been added so that the two subgraphs can have
different scales. Furthermore, the plot labels for the legend were set
within the first subgraph. They could also have been specified within the
second subgraph, as plot styles are recycled with each new subgraph and plot
options are collected across subgraphs (unless norecycle is specified; see
below).
If the subgraphs do not contain the same number of models, it may be
necessary to insert "empty" models to achieve the correct alignment. This
can be achieved by typing _skip:
. coefplot (D, label(Domestic)) (F, label(Foreign)), bylabel(Price) ///
> || _skip (F_weight) , bylabel(Weight) ///
> ||, drop(_cons) xline(0) byopts(xrescale)
As evident in the last example, coefplot recycles plot styles within each
subgraph. If you want each subgraph to use its own set of styles, apply the
norecycle option:
. sysuse auto, clear
(1978 Automobile Data)
. forvalues i = 2/5 {
  2.     quietly regress price mpg trunk length turn if rep78==`i'
  3.     estimates store rep`i'
  4. }
. coefplot (rep2, label(rep78=2)) (rep3, label(rep78=3)), bylabel(Low record) ///
> || (rep4, label(rep78=4)) (rep5, label(rep78=5)), bylabel(High record) ///
> ||, drop(_cons) xline(0) norecycle legend(colfirst)
Use option byopts(byopts) to determine how subgraphs are combined. See
help by_option for available byopts. For example, to use a compact style
and stack the subgraphs in one column, you could type:
. sysuse auto, clear
(1978 Automobile Data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store F
. coefplot D, bylabel(Domestic Cars) || F, bylabel(Foreign Cars) ///
> ||, drop(_cons) xline(0) byopts(compact cols(1))
Note that Stata renders the titles of the subgraphs as "subtitles". Hence,
you can use the subtitle() option to change their styling:
. coefplot D, bylabel(Domestic Cars) || F, bylabel(Foreign Cars) ///
> ||, drop(_cons) xline(0) byopts(compact cols(1)) ///
> subtitle(, size(vlarge) margin(medium) justification(left) ///
> color(white) bcolor(black) bmargin(top_bottom))
Sometimes it makes sense to arrange coefficients in separate subgraphs with individual scales, as the size of coefficients may vary considerably. For example, when comparing results by subgroups or estimation techniques, the focus usually lies on differences across models and less on differences within models, so that it appears natural to use individual subgraphs for the different coefficients.
Creating subgraphs by coefficients requires lengthy commands, as a separate
piece of subgraph syntax has to be put together for each coefficient. To
avoid this extra typing you can use the bycoefs option. Technically, bycoefs
flips coefficients and subgraphs, that is, the coefficients are treated as
"subgraphs" and what was specified as subgraphs is treated as
"coefficients". This may sound difficult to understand, but should become
clear in the following example:
. sysuse auto, clear
(1978 Automobile Data)
. forv i = 3/5 {
  2.     quietly regress price mpg headroom weight turn if rep78==`i'
  3.     estimate store rep78_`i'
  4. }
. coefplot rep78_3 || rep78_4 || rep78_5, drop(_cons) xline(0) ///
> bycoefs byopts(xrescale)
As some people prefer vertical mode for such a graph, you might want to
specify the vertical option:
. coefplot rep78_3 || rep78_4 || rep78_5, drop(_cons) yline(0) ///
> bycoefs byopts(yrescale) vertical
Here is an example that adds another dimension. Displayed are the means of some variables by repair record and car type:
. sysuse auto, clear
(1978 Automobile Data)
. forv s = 0/1 {
  2.     forv i = 3/5 {
  3.         quietly mean price mpg headroom weight if foreign==`s' & rep78==`i'
  4.         estimate store m`s'_`i'
  5.     }
  6. }
. coefplot m0_3 m1_3, bylabel(rep78=3) ///
> || m0_4 m1_4, bylabel(rep78=4) ///
> || m0_5 m1_5, bylabel(rep78=5) ///
> || , bycoefs byopts(xrescale) ///
> plotlabels("Domestic cars" "Foreign cars")
Instead of providing distinct model names to coefplot, you can also
specify a name pattern containing * (any string) and ? (any single
character) wildcards. coefplot will then plot the results from all matching
models.
If a name pattern is specified within parentheses, the results from the matching models will be combined into the same plot. An example is as follows:
. sysuse auto, clear
(1978 Automobile Data)
. foreach var of varlist mpg trunk length turn {
  2.     quietly regress price `var' if foreign==0
  3.     estimates store d_`var'
  4.     quietly regress price `var' if foreign==1
  5.     estimates store f_`var'
  6. }
. estimates dir

-------------------------------------------------------
        name | command      depvar       npar  title
-------------+-----------------------------------------
       d_mpg | regress      price           2
       f_mpg | regress      price           2
     d_trunk | regress      price           2
     f_trunk | regress      price           2
    d_length | regress      price           2
    f_length | regress      price           2
      d_turn | regress      price           2
      f_turn | regress      price           2
-------------------------------------------------------

. coefplot (d*, label(domestic)) (f*, label(foreign)) ///
> , drop(_cons) xline(0) title("Bivariate effects on price by car type")
This is equivalent to the following command using explicit names:
. coefplot (d_mpg d_trunk d_length d_turn, label(domestic)) ///
> (f_mpg f_trunk f_length f_turn, label(foreign)) ///
> , drop(_cons) xline(0) title("Bivariate effects on price by car type")
If multiple patterns are specified, options attached to a specific pattern will be applied to all matching models. Example:
. coefplot (d*, asequation(Domestic) \ f*, asequation(Foreign) \ , pstyle(p4)) ///
> , drop(_cons) xline(0) title("Bivariate effects on price by car type")
This is equivalent to the following command using explicit names:
. coefplot (d_mpg d_trunk d_length d_turn, asequation(Domestic) \ ///
> f_mpg f_trunk f_length f_turn, asequation(Foreign) \ ///
> , pstyle(p4)) ///
> , drop(_cons) xline(0) title("Bivariate effects on price by car type")
If a name pattern is specified without parentheses, the matching models will be treated as separate series:
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. generate lnwage = ln(wage)
. forvalues i=0/1 {
  2.     forvalues j=1/2 {
  3.         quietly regress lnwage grade ttl_exp tenure if south==`i' & race==`j'
  4.         estimates store est`i'_`j'
  5.     }
  6. }
. coefplot est0* || est1*, drop(_cons) xline(0) ///
> plotlabels("White" "Black") bylabels("North" "South")
This is equivalent to the following command using explicit names:
. coefplot est0_1 est0_2 || est1_1 est1_2, drop(_cons) xline(0) ///
> plotlabels("White" "Black") bylabels("North" "South")
When using a name pattern that is expanded into multiple series, you need to
use p1(), p2(), etc. to provide separate options for the different series:
. sysuse auto, clear
(1978 Automobile Data)
. forvalues i=3/5 {
  2.     quietly regress price mpg trunk if rep78==`i'
  3.     estimates store rep_`i'
  4. }
. coefplot rep*, drop(_cons) xline(0) ///
> p1(pstyle(p3) label("Rep=3")) ///
> p2(pstyle(p4) label("Rep=4")) ///
> p3(pstyle(p5) label("Rep=5"))
The default for coefplot is to use the first (nonzero) equation from
each model and match coefficients across models by their names (ignoring
equation names). For example, regress returns one (unnamed) equation
containing the regression coefficients whereas tobit returns two equations,
equation model containing the regression coefficients and equation sigma
containing the standard error of the regression. Hence, the default for
coefplot is to match the regression coefficients from the two models and
ignore equation sigma from the Tobit model:
. webuse laborsub, clear
. regress whrs kl6 k618 wa we

      Source |       SS           df       MS      Number of obs   =       250
-------------+----------------------------------   F(4, 245)       =      5.27
       Model |  16526046.1         4  4131511.52   Prob > F        =    0.0004
    Residual |   192218058       245    784563.5   R-squared       =    0.0792
-------------+----------------------------------   Adj R-squared   =    0.0641
       Total |   208744104       249  838329.733   Root MSE        =    885.76

------------------------------------------------------------------------------
        whrs |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         kl6 |  -462.1233   124.6768    -3.71   0.000    -707.6985   -216.5481
        k618 |    -91.141   45.85001    -1.99   0.048    -181.4515   -.8305151
          wa |   -13.1577   8.334958    -1.58   0.116    -29.57502    3.259612
          we |   53.26156   26.09369     2.04   0.042     1.864986    104.6581
       _cons |   940.0593   530.7197     1.77   0.078     -105.296    1985.415
------------------------------------------------------------------------------

. estimate store regress
. tobit whrs kl6 k618 wa we, ll(0)

Tobit regression                                Number of obs     =        250
                                                   Uncensored     =        150
Limits: lower = 0                                  Left-censored  =        100
        upper = +inf                               Right-censored =          0

                                                LR chi2(4)        =      23.03
                                                Prob > chi2       =     0.0001
Log likelihood = -1367.0903                     Pseudo R2         =     0.0084

------------------------------------------------------------------------------
        whrs |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         kl6 |  -827.7657   214.7407    -3.85   0.000    -1250.731   -404.8008
        k618 |  -140.0192   74.22303    -1.89   0.060    -286.2129    6.174547
          wa |  -24.97919   13.25639    -1.88   0.061    -51.08969    1.131317
          we |   103.6896   41.82393     2.48   0.014     21.31093    186.0683
       _cons |   589.0001   841.5467     0.70   0.485    -1068.556    2246.556
-------------+----------------------------------------------------------------
      /sigma |   1309.909   82.73335                      1146.953    1472.865
------------------------------------------------------------------------------

. estimate store tobit
. coefplot regress tobit, xline(0)
Even though the collected results from regress and tobit have different
equation names (_ and model, respectively), coefplot matches their
coefficients, that is, the equation names are ignored. This is the default
if only one equation per model is collected. If you want to take equation
names into account nonetheless, you can specify the eqstrict option:
. coefplot regress tobit, xline(0) eqstrict
Although eqstrict causes equation names to be relevant, the second equation
from the tobit model is still ignored. To include all equations, type:
. coefplot regress tobit, xline(0) keep(*:)
Furthermore, to match the coefficients from regress with the first equation
from tobit and also print the second equation from tobit, you can use
asequation() to set the equation name of regress to model:
. coefplot (regress, asequation(model)) (tobit, keep(*:)), xline(0)
Alternatively, you could also use eqrename(_ = model) to rename equation _
to model or eqrename(model = _) to rename equation model to _.
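As a sketch based on this description, and assuming the stored models
regress and tobit from above, the asequation() variant could presumably be
rewritten with eqrename() like this:
. coefplot (regress, eqrename(_ = model)) (tobit, keep(*:)), xline(0)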
The following example further illustrates how you can get rid of the
equation labels and give the coefficient in the sigma equation a meaningful
name using the rename() option:
. coefplot (regress, asequation(model)) ///
> (tobit, keep(*:) rename(sigma:_cons = Sigma)) ///
> , xline(0) noeqlabels
The rename() option is also useful if you want to match coefficients that
have different names in the input models. Here is an example that
illustrates the effect of measurement error in regression models:
. drop _all
. matrix C = ( 1, .5, 0 \ .5, 1, .3 \ 0, .3, 1 )
. drawnorm x1 x2 x3, n(10000) corr(C)
(obs 10,000)
. generate y = 1 + x1 + x2 + x3 + 5 * invnorm(uniform())
. regress y x1 x2 x3
(output omitted)
. estimates store m1
. generate x1err = x1 + 2 * invnorm(uniform())
. regress y x1err x2 x3
(output omitted)
. estimates store m2
. coefplot (m1, label(Without error)) ///
> (m2, label(With error)) ///
> , xline(1) rename(x1err = x1)
We can see how measurement error in x1 distorts all slope coefficients in
the model, even for variable x3 that is uncorrelated with x1 (due to the
indirect correlation through x2).
In general, coefficients are plotted in the same order (from top to bottom) as
they appear in the input models. However, coefficients appearing only in
later models are placed after coefficients from earlier models (with the
exception of _cons, which is always placed last). Have a look at the
following example:
. sysuse auto, clear
(1978 Automobile Data)
. label variable mpg "1. mpg"
. label variable trunk "{bf:2. trunk}"
. label variable length "{bf:3. length}"
. label variable turn "4. turn"
. regress price mpg length
(output omitted)
. estimate store m1
. regress price mpg trunk turn
(output omitted)
. estimate store m2
. regress price mpg trunk length turn
(output omitted)
. estimate store m3
. coefplot m1 || m2 || m3, xline(0) drop(_cons) byopts(row(1))
Even though in the full model (m3) trunk comes before length, the order of
the two coefficients is reversed in the plot. This is because length but not
trunk is part of the first model. That is, because trunk only appears in the
later models, it is placed after length, which already appears in the first
model.
To establish an order as in the full model, you can use the orderby()
option:
. coefplot m1 || m2 || m3, xline(0) drop(_cons) byopts(row(1)) orderby(3:)
Typing orderby(3:) instructs coefplot to use the model displayed in the
third subgraph to determine the order of the coefficients. (Typing
orderby(3) would refer to the third model in the first subgraph, which
doesn't exist in the example above; orderby(3:3) would refer to the third
model in the third subgraph.)
Alternatively, you can specify an explicit order of coefficients using the
order() option:
. coefplot m1 || m2 || m3, xline(0) drop(_cons) byopts(row(1)) ///
> order(mpg trunk length)
Within order(), you can use the * (any string) and ? (any single character)
wildcards. Furthermore, you can type . to insert gaps. Example:
. label variable mpg
. label variable trunk
. label variable length
. label variable turn
. coefplot m1 || m2 || m3, xline(0) drop(_cons) byopts(row(1)) ///
> order(. mpg . t* . length .)
In case of multiple equations, the specified order of coefficients applies to each equation:
. sysuse auto, clear
(1978 Automobile Data)
. gen mpp = mpg/8
. mlogit rep78 mpp if rep>=3
(output omitted)
. estimates store m1
. mlogit rep78 mpp i.foreign if rep>=3
(output omitted)
. estimates store m2
. coefplot m1 || m2, xline(0) nolabel keep(*:) order(_cons 1.foreign mpp)
Alternatively, to change the order of equations without changing the order of coefficients, type:
. coefplot m1 || m2, xline(0) nolabel keep(*:) order(5: 4:)
You can also specify a separate order for each equation or even take equations apart, as in the following example:
. coefplot m1 || m2, xline(0) nolabel keep(*:) ///
> order(4:1.foreign 5:1.foreign 4:_cons mpp)
Coefficients can be ordered by size using the sort() option. Here is an
example that displays average wages by industry, from lowest to highest:
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. drop if inlist(industry,2)
(4 observations deleted)
. regress wage ibn.industry, nocons noheader
-----------------------------------------------------------------------------------------
                   wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------------+----------------------------------------------------------------
               industry |
 Ag/Forestry/Fisheries  |   5.621121   1.348538     4.17   0.000     2.976592     8.26565
          Construction  |   7.564934   1.032496     7.33   0.000     5.540173    9.589695
         Manufacturing  |   7.501578   .2902381    25.85   0.000     6.932411    8.070745
Transport/Comm/Utility  |   11.44335   .5860926    19.52   0.000       10.294     12.5927
Wholesale/Retail Trade  |   6.125897   .3046951    20.11   0.000     5.528379    6.723414
Finance/Ins/Real Est..  |   9.843174   .4012702    24.53   0.000     9.056269    10.63008
   Business/Repair Svc  |    7.51579   .5995678    12.54   0.000     6.340017    8.691564
     Personal Services  |   4.401093    .564549     7.80   0.000     3.293993    5.508193
 Entertainment/Rec Svc  |   6.724409   1.348538     4.99   0.000      4.07988    9.368938
 Professional Services  |   7.871186   .1936975    40.64   0.000     7.491338    8.251033
 Public Administration  |   9.148407   .4191131    21.83   0.000     8.326512    9.970302
-----------------------------------------------------------------------------------------

. coefplot, sort
To sort from highest to lowest, specify the descending suboption:
. coefplot, sort(, descending)
sort() has a by() suboption to select the statistic by which the
coefficients are ordered. For example, to sort by estimation precision
(standard errors) you could type:
. coefplot, sort(, by(se))
Note how confidence intervals increase from top to bottom.
If a graph contains multiple series, it usually makes sense to select a specific series for sorting the coefficients (the default is to take all available estimates into account; this is equivalent to sorting coefficients based on their minimums across series). An example is as follows:
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. drop if inlist(industry,2)
(4 observations deleted)
. regress wage ibn.industry, nocons noheader
(output omitted)
. estimates store overall
. regress wage ibn.industry if union==0, nocons
(output omitted)
. estimates store nonunion
. regress wage ibn.industry if union==1, nocons
(output omitted)
. estimates store union
. coefplot overall, nokey ///
> || nonunion union, bylabel(by union status) ///
> || , norecycle byopts(legend(position(5))) sort(1, descending)
In the example, the first series (overall means) is used for sorting. To sort by the third series (wages of the unionized; green markers in the right panel), you could type:
. coefplot overall, nokey ///
> || nonunion union, bylabel(by union status) ///
> || , norecycle byopts(legend(position(5))) sort(3, descending)
Because in this example the norecycle option has been specified, the series
are uniquely identified across the subgraphs. If norecycle is omitted,
series are repeated by subgraph. Hence, to sort by a series in a specific
subgraph in this case you need to provide both the subgraph number and the
series number, as in the following example:
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. drop if inlist(industry,1,2,3,10)
(67 observations deleted)
. regress wage ibn.industry if union==0 & south==0, nocons
(output omitted)
. estimates store nonunionnorth
. regress wage ibn.industry if union==1 & south==0, nocons
(output omitted)
. estimates store unionnorth
. regress wage ibn.industry if union==0 & south==1, nocons
(output omitted)
. estimates store nonunionsouth
. regress wage ibn.industry if union==1 & south==1, nocons
(output omitted)
. estimates store unionsouth
. coefplot nonunionnorth unionnorth, bylabel(North) ///
> || nonunionsouth unionsouth, bylabel(South) ///
> || , plotlabels("nonunion" "union") sort(2:1, descending)
sort(2:1) instructs coefplot to sort the coefficients according to the first
series in the second subgraph (wages of nonunionized in the south; blue
series in right panel).
By default, coefplot reads results from the e() returns of stored estimation
sets. To read results from a matrix, type matrix(name) instead of name when
referring to the results. coefplot will then collect the point estimates
from the first row of matrix name instead of from e(b) of estimation set
name. If plotting results from matrices, you also have to specify how to
obtain the confidence intervals, using the v(), se(), or ci() option. For
example, to plot medians and their confidence intervals as computed by
centile you could type:
. sysuse auto, clear
(1978 Automobile Data)
. matrix median = J(1,3,.)
. matrix colnames median = mpg trunk turn
. matrix CI = J(2,3,.)
. matrix colnames CI = mpg trunk turn
. matrix rownames CI = ll95 ul95
. local i 0
. foreach v of var mpg trunk turn {
  2.     local ++i
  3.     quietly centile `v'
  4.     matrix median[1, `i'] = r(c_1)
  5.     matrix CI[1, `i'] = r(lb_1) \ r(ub_1)
  6. }
. matrix list median

median[1,3]
      mpg  trunk   turn
r1     20     14     40

. matrix list CI

CI[2,3]
            mpg      trunk       turn
ll95         19         12  37.078729
ul95         22         16         42

. coefplot matrix(median), ci(CI)
By default, coefplot reads the point estimates from the first row of the
specified matrix, and the CIs from the first two rows of the matrix specified
in ci(). Depending on the layout of your results matrices, you will need to
specify the rows and columns to read from (see the next example, or the
remark on Plotting results from matrices in the help file).
Results from estimation commands and from matrices can be combined in the same graph. Here is an example that displays means and medians of price by repair record:
. sysuse auto, clear
(1978 Automobile Data)
. matrix R = J(5, 3, .)
. matrix coln R = median ll95 ul95
. matrix rown R = 1 2 3 4 5
. forv i = 1/5 {
  2.     quietly centile price if rep78==`i'
  3.     matrix R[`i',1] = r(c_1), r(lb_1), r(ub_1)
  4. }
. matrix list R

R[5,3]
      median       ll95       ul95
1     4564.5       4195       4934
2       4638   3898.525    8993.35
3       4741  4484.8407  5714.9172
4     5751.5  4753.4403  7055.1933
5       5397  3930.5673  6988.0509

. mean price, over(rep78)

Mean estimation                   Number of obs   =         69

            1: rep78 = 1
            2: rep78 = 2
            3: rep78 = 3
            4: rep78 = 4
            5: rep78 = 5

--------------------------------------------------------------
        Over |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
price        |
           1 |     4564.5      369.5      3827.174    5301.826
           2 |   5967.625   1265.494      3442.372    8492.878
           3 |   6429.233   643.5995       5144.95    7713.516
           4 |     6071.5   402.9585      5267.409    6875.591
           5 |       5913   788.6821      4339.209    7486.791
--------------------------------------------------------------

. coefplot (., label(mean)) ///
> (matrix(R[,1]), ci((2 3)) label(median)) ///
> , ytitle(Repair Record 1978) xtitle(Price)
By default, coefplot uses marker symbols for point estimates and spikes for
confidence intervals. To change the plot types, use the recast() and
ciopts(recast()) options. For example, to use bars for point estimates and
capped spikes for confidence intervals, you could type:
. sysuse auto, clear
(1978 Automobile Data)
. regress price mpg trunk length if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length if foreign==1
(output omitted)
. estimates store F
. coefplot (D, label(Domestic Cars)) (F, label(Foreign Cars)) ///
> , drop(_cons) xline(0) recast(bar) ciopts(recast(rcap)) citop barwidth(0.3)
Option citop has been added so that the confidence spikes are plotted in
front of the bars.
The recast() option can also be used to select other plot types such as
connected-line plots or area plots. Furthermore, different plot types can be
combined in a single graph by specifying a separate recast() option for each
series. Here is an example in which proportions are displayed as bars
(without confidence intervals, using the noci option) and means are
displayed as a connected-line plot with capped spikes for confidence
intervals:
. sysuse auto, clear
(1978 Automobile Data)
. proportion rep78
(output omitted)
. estimates store prop
. mean price, over(rep78)
(output omitted)
. estimates store mean
. coefplot (prop, recast(bar) noci barwidth(0.5) color(*.6)) ///
> (mean, recast(connected) ciopts(recast(rcap)) axis(2)) ///
> , vertical nooffsets plotlabels("Proportion" "Price") ///
> xtitle("Repair record") ytitle("Proportion") ytitle("Price", axis(2))
The example also illustrates some other options. Option vertical has been
specified to flip the axes; option nooffsets suppresses the offsetting of
plot positions so that bars and markers are both centered above the
categories; and option axis() has been applied to the second series so that
proportions and means have different axes. Furthermore, option plotlabels()
provides an alternative way to specify legend labels for the series (instead
of specifying separate label() options).
The coefficients provided to coefplot may represent estimates along a
continuous dimension. Examples are predictive margins or marginal effects
computed over values of a continuous variable. In such a case, use the at()
option to provide the plot positions to coefplot. Here is an example where
predictive margins of foreign are computed by level of mpg, once from a
bivariate model and once from a multivariate model:
. sysuse auto, clear
(1978 Automobile Data)
. logit foreign mpg
  (output omitted)
. margins, at(mpg=(10(2)40)) post
  (output omitted)
. estimates store bivariate
. logit foreign mpg turn price
  (output omitted)
. margins, at(mpg=(10(2)40)) post
  (output omitted)
. estimates store multivariate
. coefplot bivariate multivariate, ytitle(Pr(foreign=1)) xtitle(Miles per Gallon) ///
>     at recast(line) lwidth(*2) ciopts(recast(rline) lpattern(dash))
The example also illustrates how to change the plot types such that the
estimates and their confidence intervals are displayed as lines.
Furthermore, note that automatic offsetting of plot positions is
deactivated by default if the at() option is specified.
coefplot has four levels of options:
modelopts
are options that apply to a single model (or matrix). They specify the
information to be collected from the model.
plotopts
are options that apply to a single plot (i.e., a single series of points
displayed in the same style), possibly containing results
from multiple models. They affect the rendition of markers and
confidence intervals and provide a label for the plot.
subgropts
are options that apply to a single subgraph, possibly containing
multiple plots.
globalopts
are options that apply to the overall graph. This also includes option
byopts() to determine how subgraphs are combined.
The levels are nested in the sense that upper level options include all
lower level options. That is, globalopts includes subgropts, plotopts, and
modelopts; subgropts includes plotopts and modelopts; plotopts includes
modelopts. However, upper level options may not be specified at a lower
level.
If lower level options are specified at an upper level, they serve as
defaults for all included lower level elements. For example, if you want
to draw 99% and 95% confidence intervals for all included models,
specify levels(99 95) as a global option:
coefplot model1 model2 model3, levels(99 95)
Options specified with an individual element override the defaults set by upper level options. For example, if you want to draw 99% and 95% confidence intervals for model 1 and model 2 and 90% confidence intervals for model 3, you could type:
coefplot model1 model2 (model3, level(90)), levels(99 95)
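The same pattern can be tried on real data. The following is a minimal sketch, assuming coefplot is installed and using the auto dataset with the stored names D and F from the earlier examples; the choice of models is illustrative only:

```stata
* Assumes coefplot is installed (e.g., via: ssc install coefplot)
sysuse auto, clear

* Two illustrative models, stored under the names D and F
regress price mpg trunk length if foreign==0
estimates store D
regress price mpg trunk length if foreign==1
estimates store F

* levels(99 95) is given globally and serves as the default;
* levels(90) given with F overrides that default for F only
coefplot D (F, levels(90)), drop(_cons) xline(0) levels(99 95)
```

Here drop(_cons) and xline(0) are also specified at the global level, so they apply to both models.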
There are some fine distinctions about the placement of options and how they are interpreted. For example, if you type
coefplot m1, opts1 || m2, opts2 opts3
then opts2 and opts3 are interpreted as global options. If you want to apply
opts2 only to m2, then type
coefplot m1, opts1 || m2, opts2 ||, opts3
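As a concrete illustration of this placement rule, here is a short sketch, assuming coefplot is installed; the models and the particular drop() lists are chosen only for demonstration:

```stata
* Assumes coefplot is installed (e.g., via: ssc install coefplot)
sysuse auto, clear

regress price mpg trunk length if foreign==0
estimates store D
regress price mpg trunk length if foreign==1
estimates store F

* drop(_cons) belongs to the first subgraph (D),
* drop(_cons trunk) belongs to the second subgraph (F),
* and xline(0), placed after the final ||, is a global option
coefplot D, drop(_cons) || F, drop(_cons trunk) ||, xline(0)
```

Without the trailing "||, xline(0)", the options following F would be read as global options, following the rule stated above.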
Similarly, if you type
coefplot (m1, opts1 \ m2, opts2)
then opts2 will be applied to both models. To apply opts2 only to m2, type
coefplot (m1, opts1 \ m2, opts2 \)
or, if you also want opts3 to be applied to both models, type
coefplot (m1, opts1 \ m2, opts2 \, opts3)
or
coefplot (m1, opts1 \ m2, opts2 \), opts3
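A small sketch of the backslash syntax with actual models may help. It assumes coefplot is installed; the two bivariate regressions and the names m1 and m2 are illustrative:

```stata
* Assumes coefplot is installed (e.g., via: ssc install coefplot)
sysuse auto, clear

* Two bivariate models whose coefficients are appended into one plot
regress price mpg
estimates store m1
regress price trunk
estimates store m2

* drop(_cons) is given separately for each model; label() after the
* trailing backslash applies to the plot containing both models
coefplot (m1, drop(_cons) \ m2, drop(_cons) \, label(Bivariate)), xline(0)
```

Since drop(_cons) is wanted for both models here, it could equally be moved after the trailing backslash or to the global level.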
In case of multiple subgraphs there is some ambiguity about where to
specify the plot options (unless global option norecycle is specified). You
can provide plot options within any of the subgraphs, as plot options are
collected across subgraphs. However, in case of conflict, the plot options
from the later subgraphs usually take precedence over earlier plot options.
In addition, you can also use global options p1(), p2(), etc. to provide
options for specific plots. In case of conflict, options specified within a
plot take precedence over options provided via p1(), p2(), etc.
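For instance, the legend labels of two plots can be set from the global level via p1() and p2(). This is a minimal sketch, assuming coefplot is installed; the models are the illustrative D and F used in earlier examples:

```stata
* Assumes coefplot is installed (e.g., via: ssc install coefplot)
sysuse auto, clear

regress price mpg trunk length if foreign==0
estimates store D
regress price mpg trunk length if foreign==1
estimates store F

* p1() passes plot options to the first plot (D), p2() to the second (F)
coefplot D F, drop(_cons) xline(0) ///
    p1(label(Domestic Cars)) p2(label(Foreign Cars))
```

This is equivalent in intent to writing (D, label(Domestic Cars)) (F, label(Foreign Cars)); a label() given within a plot would take precedence over one given via p1() or p2().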