The LOGISTIC procedure offers the ODDSRATIO and UNITS statements, which you can use to estimate odds ratios, and the PLOTS=ODDSRATIO option for creating plots of odds ratios. The following shows how you can produce plots of odds ratios in several situations. For similar examples of plotting hazard ratios from Cox models fit by the PHREG procedure, see SAS KB0044200, "Plot hazard ratio change over a continuous variable in polynomial, spline, or interaction effect."
An odds ratio for a continuous predictor is the change in odds that results from increasing the predictor by some number of units. You can specify the interval width in units of the predictor using the UNITS statement. You can specify the starting point of the interval or intervals using the AT option in the ODDSRATIO statement.
The odds ratio is computed as the ratio of the odds at the interval maximum over the odds at the interval minimum (reversed if you specify a negative value for the predictor in the UNITS statement). By default, the odds ratio for a one-unit increase in the predictor is computed. For a categorical predictor, the odds ratio compares the odds at two levels of the predictor. You can specify how levels are compared using the DIFF= option in the ODDSRATIO statement. By default, odds ratios for all pairs of levels are estimated.
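The definition for a continuous predictor can be written out as a short derivation. This is a sketch that assumes the predictor x enters the model only through a single linear term with coefficient beta. The symbols eta, alpha, beta, x_0, and c are notation introduced here for illustration and are not PROC LOGISTIC names. Here x_0 is the interval starting point (the AT value) and c is the interval width (the UNITS value):

```latex
\begin{align*}
\eta(x) &= \log\bigl(\mathrm{odds}(x)\bigr) = \alpha + \beta x + (\text{terms not involving } x) \\
\mathrm{OR}(x_0, c) &= \frac{\mathrm{odds}(x_0 + c)}{\mathrm{odds}(x_0)}
  = \exp\bigl\{\eta(x_0 + c) - \eta(x_0)\bigr\} = \exp(c\,\beta)
\end{align*}
```

Under this assumption the result does not depend on x_0. A negative width gives exp(-|c| beta) = 1 / exp(|c| beta), which is the reversed ratio.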
A continuous predictor or a binary categorical predictor that does not interact with another variable, is not involved in polynomial (higher-order) effects, and is not represented in the model using a spline has only a single odds ratio. A multi-level categorical predictor in this situation has a single set of odds ratios comparing its levels. However, if the predictor is involved in interactions, splines, or polynomial effects (such as a quadratic effect), then the odds ratio for the predictor depends on the value of the interacting variable, and on its own value if the variable appears in a spline or in a polynomial effect.
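To see why the odds ratio stops being a single number in these cases, consider two hedged sketches: a model in which a continuous predictor x interacts with a second variable z, and a model with a quadratic effect of x. The coefficient symbols, the interval starting point x_0, and the interval width c are illustrative notation, not output from PROC LOGISTIC.

```latex
\begin{align*}
% Interaction model: the odds ratio for x depends on z
\eta &= \alpha + \beta_1 x + \beta_2 z + \beta_3 x z
  & \mathrm{OR}_x(c \mid z) &= \exp\bigl\{c\,(\beta_1 + \beta_3 z)\bigr\} \\
% Quadratic model: the odds ratio for x depends on its own starting value x_0
\eta &= \alpha + \beta_1 x + \beta_2 x^2
  & \mathrm{OR}_x(x_0, c) &= \exp\bigl\{c\,\beta_1 + \beta_2\bigl[(x_0 + c)^2 - x_0^2\bigr]\bigr\}
  = \exp\bigl\{c\,\beta_1 + c\,\beta_2\,(2 x_0 + c)\bigr\}
\end{align*}
```

In the first case the odds ratio varies with z. In the second it varies with the starting point x_0, which is why the AT option is needed to say where the interval begins.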
For more information about odds ratios, see "Odds Ratio Estimation" in the Details section of the LOGISTIC procedure documentation. Except where noted, the following examples use the neuralgia data shown in the example titled "Logistic Modeling with Categorical Predictors" in the LOGISTIC procedure documentation to show how you can produce plots of changing odds ratios in a variety of model types.
The following statements fit a logistic model on the probability of no pain in which the binary Sex variable (with levels M and F) interacts with the continuous Age variable. The ODDSRATIO statement requests odds ratios comparing the two Sex levels at each integer value of Age from 65 to 75. The DIFF=REF option specifies that each odds ratio has the odds for Females (Sex='F') in the denominator because 'F' is specified as the reference level for Sex in the CLASS statement. The ODS OUTPUT statement saves the table of odds ratio estimates in a data set. You can use the PRINT procedure to examine the variable names and contents of the saved data set.
The first plot is created by the PLOTS option in the PROC LOGISTIC statement. The TYPE=VERTICAL suboption requests vertical confidence bars rather than the default horizontal bars. If you prefer, you can display the odds ratios and confidence limits as a line and band by using the DATA step and the SGPLOT procedure. The SCAN function in the DATA step extracts the Age values from the character Effect variable in the data set, and the result is stored in a numeric Age variable for use on the horizontal axis. The SERIES and BAND statements in PROC SGPLOT produce the second plot.
proc logistic data=neuralgia plots(only)=oddsratio(type=vertical);
class sex(ref='F');
model Pain(event='No') = age|sex;
oddsratio sex / at (age=65 to 75) diff=ref;
ods output oddsratioswald=or;
run;
data or; set or;
/* Take the text after the last '=' in Effect and convert it to a number */
length Age 8;
Age=input(scan(effect,-1,'='),best12.);
run;
proc sgplot data=or noautolegend;
band upper=uppercl lower=lowercl
x=age / transparency=.5;
series y=oddsratioest x=age;
xaxis grid;
yaxis label='Odds Ratio' grid;
refline 1 / axis=y;
title "Sex odds ratios (M/F) and 95% CI at Age values";
run;
The resulting plot shows decreasing Male versus Female odds ratios as Age increases, becoming significantly less than one for Ages between 70 and 73, suggesting that being male significantly reduces the odds of no pain in that Age range.
The following model includes two continuous variables, Duration and Age, and their interaction.
In the ODDSRATIO statement, the AT option requests odds ratios for Age (measured in years) at Duration values from 0 to 40. Since the UNITS statement is not specified, each estimated odds ratio reflects a one-unit increase in Age. As in the previous example, two versions of the plot of Age odds ratios are presented. The first is produced by the PLOTS option and shows the discrete odds ratio estimates with confidence bars. The second plot, produced by PROC SGPLOT, shows the odds ratio estimates connected by a line with a confidence band.
proc logistic data=neuralgia plots(only)=oddsratio(type=vertical);
model Pain(event='No') = age|duration;
oddsratio age / at (duration=0 to 40 by 5);
ods output oddsratioswald=or;
run;
data or; set or;
/* Take the text after the last '=' in Effect and convert it to a number */
length Duration 8;
Duration=input(scan(effect,-1,'='),best12.);
run;
proc sgplot data=or noautolegend;
band upper=uppercl lower=lowercl
x=duration / transparency=.5;
series y=oddsratioest x=duration;
xaxis grid;
yaxis label='Odds Ratio' grid;
refline 1 / axis=y;
title "Age odds ratios and 95% CI at Duration values";
run;
The resulting plots show that the one-unit odds ratios for Age decrease with Duration. The odds ratios between Duration=15 and Duration=35 are significantly less than one, suggesting that a one-year increase in Age decreases the odds of no pain in that Duration range.
Adding a spline or polynomial effect in the model for a continuous predictor allows for a nonlinear fit to the log odds of the event. While polynomial effects (such as quadratic or cubic effects) allow for a variety of shapes, splines are even more flexible. When the predictor has a nonlinear effect, its odds ratio changes over the range of the predictor. In the following statements, the EFFECT statement produces a natural cubic spline named SPL for the Age predictor (measured in years). The spline then represents the effect of Age in the model.
In addition to the following, you can find another example of plotting the changing odds ratio for a continuous predictor used in a spline in SAS Note 70221, "Estimate and plot the effect of changing a continuous predictor in a spline."
The ODDSRATIO statement below requests odds ratio estimates for increases over one-year intervals of AGE (since the UNITS statement is not specified). Since the odds ratio for AGE depends on where those intervals are located in the range of AGE, the AT option is used to specify the starting points of several one-unit intervals: 60 to 61, 61 to 62, and so on. Rather than use the character Effect variable in the saved data set, the SCAN function in the DATA step below captures the values of Age appearing at the end of the Effect values and uses the resulting numeric Age variable for the horizontal axis.
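Written as a formula, and assuming f denotes the fitted spline function of Age on the log odds scale (f and a are notation introduced here for illustration), each requested estimate has the form:

```latex
\mathrm{OR}(a) = \frac{\mathrm{odds}(a + 1)}{\mathrm{odds}(a)}
  = \exp\bigl\{f(a + 1) - f(a)\bigr\},
\qquad a = 60, 61, \ldots, 80
```

Because f is not a straight line, the difference f(a + 1) - f(a) varies with a, which is why a separate estimate is needed at each starting point.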
proc logistic data=neuralgia plots(only)=oddsratio(type=vertical);
effect spl=spline(age/naturalcubic basis=tpf(noint));
model Pain(event='No') = spl;
oddsratio age / at (age=60 to 80);
ods output oddsratioswald=or;
run;
data or; set or;
/* Take the text after the last '=' in Effect and convert it to a number */
length Age 8;
Age=input(scan(effect,-1,'='),best12.);
run;
proc sgplot data=or noautolegend;
band upper=uppercl lower=lowercl
x=age / transparency=.5;
series y=oddsratioest x=age;
xaxis grid;
yaxis label='Odds Ratio' grid;
refline 1 / axis=y;
title 'Odds ratios for 1 unit increase in Age';
run;
As in previous examples, two versions of the plot are presented. Both plots show that the odds ratios for one-year increases in Age decrease between Age=66 and 76 and plateau at more extreme Ages. The odds ratios from Age=70 and beyond are significantly below one, suggesting that a one-year increase in Age beyond 70 lowers the odds of no pain.
In a model that has two interacting continuous variables, the odds ratio for one variable depends on the value of the other. In the case of a splined continuous variable that interacts with another continuous variable, the odds ratio for the splined variable depends on the values of both variables. In this latter scenario, a plot of the changing odds ratios becomes three-dimensional, varying both variables with the odds ratio estimate at each combination. You can present this scenario as a two-dimensional contour plot with contour lines that indicate the odds ratio values.
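One way to see the dependence on both variables is to write out the linear predictor. The following is a sketch under the assumption that the splined variable x is represented by basis functions B_k(x) and that each basis function also interacts with the other continuous variable z. All symbols are illustrative notation rather than PROC LOGISTIC names.

```latex
\begin{align*}
\eta(x, z) &= \alpha + \gamma z + \sum_k \beta_k B_k(x) + z \sum_k \delta_k B_k(x) \\
\mathrm{OR}_x(x_0, c \mid z) &= \exp\bigl\{\eta(x_0 + c, z) - \eta(x_0, z)\bigr\}
  = \exp\Bigl\{\sum_k (\beta_k + \delta_k z)\bigl[B_k(x_0 + c) - B_k(x_0)\bigr]\Bigr\}
\end{align*}
```

The bracketed differences depend on the starting point x_0 and the multipliers depend on z, so the odds ratio forms a surface over the two variables.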
The following uses the diabetes data in the example titled "Nonparametric Logistic Regression" of the GAMPL procedure documentation. The data include all 532 observations regardless of the value of the Test variable. As in the previous examples, the PLOTS option requests an odds ratio plot with vertical confidence bars. The RANGE=CLIP option allows the confidence bars to be clipped so that you can more easily distinguish the odds ratio values. The EFFECT statement defines a natural cubic spline on the continuous Pedigree predictor. The MODEL statement specifies a model on the probability of a positive diabetes test that includes the effects of Glucose, the spline on Pedigree, and their interaction. In the ODDSRATIO statement, the AT option specifies interval starting values for Glucose and Pedigree at which to estimate the odds ratio. The odds ratios measure the effect of one-unit increases in Pedigree since the UNITS statement is not specified. As before, the ODS OUTPUT statement saves the estimated odds ratios in a data set.
proc logistic data=DiabetesStudy
plots(only)=oddsratio(type=vertical range=clip);
effect splp=spline(pedigree/naturalcubic basis=tpf(noint));
model Diabetes(event='1') = glucose|splp;
oddsratio pedigree / at (glucose=50 to 200 by 50 pedigree=0 to 2.5 by .5);
ods output oddsratioswald=or;
run;
Since the saved data set only shows the values of Glucose and Pedigree within the character Effect variable, the following DATA step adds separate Glucose and Pedigree variables, taking care that they match the values as shown in Effect. Following the DATA step, the TEMPLATE procedure is used to define a contour plot template named ORcontourplot for producing the desired contour plot. In the CONTOURPLOTPARM statement, Pedigree is specified as the horizontal axis variable, Glucose as the vertical axis variable, and OddsRatioEst (created by the ODDSRATIO statement) as the variable that defines the contours. The LEVELS= option requests a specific set of contour lines. The CONTOURTYPE=LABELEDLINEFILL option requests labels on the contour lines and colored contour intervals between the contour lines. The CONTINUOUSLEGEND statement provides a scale of colors and the associated odds ratio values at the side of the plot. For more information about the CONTOURPLOTPARM and other statements in PROC TEMPLATE, see SAS® Graph Template Language: Reference in the SAS documentation.
data or2;
do Glucose=50 to 200 by 50;
do Pedigree=0 to 2.5 by .5;
set or; output;
end; end;
run;
proc template;
define statgraph ORcontourplot;
begingraph;
entrytitle "Pedigree Odds Ratio Plot";
layout overlay /
xaxisopts=(offsetmin=0 offsetmax=0
linearopts=(thresholdmin=0 thresholdmax=0))
yaxisopts=(offsetmin=0 offsetmax=0
linearopts=(thresholdmin=0 thresholdmax=0));
contourplotparm x=pedigree y=glucose z=oddsratioest /
contourtype=labeledlinefill levels=(0 1 2 4 6 8 10 15 20) name="Contour";
continuouslegend "Contour" / title="Odds Ratio";
endlayout;
endgraph;
end;
run;
proc sgrender data=or2 template=ORcontourplot;
run;
The odds ratio plots produced by the PLOTS option and the SGRENDER procedure show how the one-unit odds ratio estimates for Pedigree change as Pedigree and Glucose are varied. The pattern of change is most easily grasped from the contour plot. The contour line labeled 1 represents no change in odds. Below and to the left of that line, the odds ratio exceeds 1, meaning that a one-unit increase in Pedigree increases the odds of a positive diabetes test. The amount of the increase, as measured by the odds ratio, grows as both variables decrease. The same can be seen, though perhaps less clearly, in the first odds ratio plot with confidence bars.
Returning to the model above with a single splined predictor, this example produces a plot that lets you see how the odds ratio changes as you change both the size and the location of the interval over which the odds ratio is computed. Recall that the odds ratio for a continuous predictor is the odds at the interval endpoint divided by the odds at the interval starting point (although this can be reversed by using a negative value for the variable in the UNITS statement).
The following statements fit the same logistic model as above on a spline of Age and add the UNITS statement requesting odds ratios for one-, five-, and ten-unit (year) increases. As before, the odds ratios are estimated for increases starting at Age=60 to 80, but this time in jumps of five years. The DATA step then adds separate Age and Units variables (repeating their values embedded in the Effect variable) in the saved data set, making them conveniently available for plotting. The EFFECTPLOT statement with the LINK option is added to show how the log odds of the event (no pain) change over Age.
Similar to the previous example, PROC TEMPLATE defines a template named ORcontourplot for producing the contour plot with the starting point of the Age interval on the horizontal axis, the width of the interval (Units) on the vertical axis, and the odds ratio estimate (OddsRatioEst) as the variable defining the contours. Horizontal reference lines are added at levels in the plot indicating one-, five-, and ten-year changes.
proc logistic data=neuralgia
plots(only)=oddsratio(type=vertical range=clip);
effect spl=spline(age/naturalcubic basis=tpf(noint));
model Pain(event='No') = spl;
oddsratio age / at (age=60 to 80 by 5);
units age=1 5 10;
effectplot/link noobs;
ods output oddsratioswald=or;
run;
data or2;
do Age=60 to 80 by 5;
do Units=1,5,10;
set or; output;
end; end;
run;
proc template;
define statgraph ORcontourplot;
begingraph;
entrytitle "Age Odds Ratio Plot";
layout overlay /
xaxisopts=(offsetmin=0 offsetmax=0
linearopts=(thresholdmin=0 thresholdmax=0))
yaxisopts=(offsetmin=0.025 offsetmax=0.025
linearopts=(thresholdmin=0 thresholdmax=0));
contourplotparm x=age y=units z=oddsratioest /
contourtype=labeledlinefill nhint=12 name="Contour";
continuouslegend "Contour" / title="Odds Ratio";
referenceline y=1 / lineattrs=(thickness=2 pattern=mediumdash color=black);
referenceline y=5 / lineattrs=(thickness=2 pattern=mediumdash color=black);
referenceline y=10 / lineattrs=(thickness=2 pattern=mediumdash color=black);
endlayout;
endgraph;
end;
run;
proc sgrender data=or2 template=ORcontourplot;
run;
The resulting plots show that the odds ratios are generally below 1 except when the Age interval begins at 60. The first plot with confidence bars shows that all odds ratio estimates for increases beginning at Age=65 and higher are significantly less than 1, except for the five-year increase starting at Age=65. In the contour plot, it is perhaps most useful to consider how the odds ratio changes with Age for a fixed interval width. For example, to see how the odds ratio changes when Age is increased by five years, scan along the horizontal reference line at Units=5. Along that line, the odds ratio is 1 at about Age=63, indicating no change in the odds of no pain for a five-year increase starting there. The odds ratio is about 0.4 for a five-year increase starting at Age=70 and is close to 0.2 for five-year Age intervals above 75.
The final effect plot can help you understand any particular pattern of odds ratio values. The vertical axis in this plot, labeled "Linear Predictor", is the log odds of the event (no pain). Recall that an Age odds ratio is the ratio of the odds at two Age values. Equivalently, the odds ratio is the exponentiated difference of two log odds. Consider one-, five-, and ten-year increases in Age starting at Age=65. Moving along the line that represents the fitted model in the effect plot, the log odds changes little for a one-year increase from 65 to 66, drops a little from 65 to 70, and drops even more from 65 to 75. This means that the log odds at the higher Age minus the log odds at 65 is about 0 for an increase of one year and increasingly negative for the five- and ten-year changes. Those differences are also the log odds ratios. Exponentiating them yields odds ratios that are about 1, then increasingly dropping below 1 for the five- and ten-year changes. You can see that pattern in both odds ratio plots.
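As a worked illustration of the exponentiation step, suppose the three log odds differences for the one-, five-, and ten-year increases were 0, -0.5, and -1.5. These values are hypothetical, chosen only to mimic the pattern described above. They are not read from the fitted model or from the plots.

```latex
\begin{align*}
\exp(0)    &= 1 \\
\exp(-0.5) &\approx 0.61 \\
\exp(-1.5) &\approx 0.22
\end{align*}
```

A difference near zero gives an odds ratio near 1, and increasingly negative differences give odds ratios that fall further below 1.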