This example shows how to use the MIANALYZE procedure to combine results that the LOGISTIC procedure produces from multiply imputed data. The data are a slightly altered version of those in the example titled "Reading Logistic Model Results from a PARMS= Data Set" in the MIANALYZE documentation. Rather than being treated as a continuous variable, Length is categorized in order to demonstrate how MIANALYZE combines LOGISTIC results that involve CLASS variables.
The following statements create the data set and apply a format that groups the Length variable into three categories.
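proc format;
/* The <- ranges exclude the lower endpoint, so no Length value can fall in a gap between categories */
value lengthfmt
low-25 = '10-25'
25<-29 = '25.1-29'
29<-high = '29.1+';
run;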
data Fish2;
title 'Fish Measurement Data';
format length lengthfmt.;
input Species $ Length Width @@;
datalines;
Parkki 16.5 2.3265 Parkki 17.4 2.3142 . 19.8 .
Parkki 21.3 2.9181 Parkki 22.4 3.2928 . 23.2 3.2944
Parkki 23.2 3.4104 Parkki 24.1 3.1571 . 25.8 3.6636
Parkki 28.0 4.1440 Parkki 29.0 4.2340 Perch 8.8 1.4080
. 14.7 1.9992 Perch 16.0 2.4320 Perch 17.2 2.6316
Perch 18.5 2.9415 Perch 19.2 3.3216 . 19.4 .
Perch 20.2 3.0502 Perch 20.8 3.0368 Perch 21.0 2.7720
Perch 22.5 3.5550 Perch 22.5 3.3075 . 22.5 .
Perch 22.8 3.5340 . 23.5 . Perch 23.5 3.5250
Perch 23.5 3.5250 Perch 23.5 3.5250 Perch 23.5 3.9950
. 24.0 . Perch 24.0 3.6240 Perch 24.2 3.6300
Perch 24.5 3.6260 Perch 25.0 3.7250 . 25.5 3.7230
Perch 25.5 3.8250 Perch 26.2 4.1658 Perch 26.5 3.6835
. 27.0 4.2390 Perch 28.0 4.1440 Perch 28.7 5.1373
. 28.9 4.3350 . 28.9 . . 28.9 4.5662
Perch 29.4 4.2042 Perch 30.1 4.6354 Perch 31.6 4.7716
Perch 34.0 6.0180 . 36.5 6.3875 . 37.3 7.7957
Perch 39.0 . Perch 38.3 . Perch 39.4 6.2646
Perch 39.3 6.3666 Perch 41.4 7.4934 Perch 41.4 6.0030
Perch 41.3 7.3514 Parkki 42.3 . Perch 42.5 7.2250
Perch 42.4 7.4624 Perch 42.5 6.6300 Perch 44.6 6.8684
Perch 45.2 7.2772 Perch 45.5 7.4165 Perch 46.0 8.1420
Perch 46.6 7.5958
;
To impute missing values in both the response (Species) and the continuous predictor (Width), the FCS statements in the following PROC MI step use the logistic and regression methods, respectively. The default 25 imputations are created and saved in the data set OutFish2.
proc mi data=Fish2 seed=1305417 out=OutFish2;
class Species Length;
fcs logistic(Species = Length Width);
fcs regression (Width = Species Length);
var Length Width Species;
run;
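Before fitting models, it can be useful to confirm that the imputation step filled in every missing value. The following is a minimal sketch of such a check, assuming the OutFish2 data set created above; it is not part of the original example.

```sas
/* Count the observations in each imputed data set by Species.
   The MISSING option would display any Species value left unimputed. */
proc freq data=OutFish2;
   tables _Imputation_*Species / missing norow nocol nopercent;
run;

/* Report N and NMISS for Width within each imputation.
   NMISS is expected to be 0 after imputation. */
proc means data=OutFish2 n nmiss;
   class _Imputation_;
   var Width;
run;
```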
The following statements fit a logistic model to each of the 25 imputed data sets. The ODS SELECT NONE statement suppresses the displayed results, and the ODS OUTPUT statement saves the parameter estimates from each model in the data set LGSPARMS. The PRINT procedure then prints the parameter estimates from the model that was fit to the first imputed data set.
ods select none;
proc logistic data=OutFish2;
class length / param=glm;
model Species = Length Width;
by _Imputation_;
ods output ParameterEstimates=lgsparms;
run;
ods select all;
proc print data=lgsparms noobs;
where _imputation_=1;
title 'LOGISTIC Model Coefficients (First Imputation)';
run;
Because the LGSPARMS data set identifies the CLASS levels in the variable ClassVal0, the CLASSVAR=CLASSVAL option is specified so that MIANALYZE reads the classification levels correctly from the PARMS= data set.
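One way to confirm which convention applies is to list the variables in the saved data set. This is a small sketch, assuming the LGSPARMS data set created above; the presence of ClassVal0 is what indicates that CLASSVAR=CLASSVAL is the appropriate choice.

```sas
/* List the variables in the saved parameter estimates in creation order. */
proc contents data=lgsparms varnum;
run;
```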
The following statements combine the results using the MIANALYZE procedure. The combined parameter estimates are saved in the data set MI_PARMS.
proc mianalyze parms(classvar=classval)=lgsparms;
class Length;
modeleffects intercept Length Width;
ods output ParameterEstimates=mi_parms;
run;
The MIANALYZE step displays the combined parameter estimates and saves them in the data set MI_PARMS.
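For reference, the standard combining rules (Rubin's rules) that underlie these results can be written out as follows. This is a summary of the general method, not output from this example. The symbols are notation introduced here: $m$ is the number of imputations, $\hat{Q}_i$ is the estimate of a parameter from imputation $i$, and $\hat{U}_i$ is its squared standard error.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, &
\bar{U} &= \frac{1}{m}\sum_{i=1}^{m}\hat{U}_i, \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^{2}, &
T &= \bar{U}+\Bigl(1+\frac{1}{m}\Bigr)B, \\
\nu &= (m-1)\left[1+\frac{\bar{U}}{\bigl(1+m^{-1}\bigr)B}\right]^{2}.
\end{align*}
```

Here $\bar{Q}$ is the combined estimate, $\sqrt{T}$ is its standard error, and confidence limits are based on a $t$ distribution with $\nu$ degrees of freedom when no complete-data degrees of freedom are specified.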
Multiple imputation theory requires a point estimate and a standard error for each quantity that is combined. Because LOGISTIC does not report standard errors for the odds ratios, it is necessary to first combine the parameter estimates, as was done above, and then exponentiate the combined estimates and their confidence limits to obtain the odds ratios and their confidence limits. This approach assumes that either PARAM=GLM or PARAM=REF was used in the LOGISTIC step so that the exponentiated estimates represent true odds ratios.
data mi_parms;
set mi_parms;
where upcase(parm) ne 'INTERCEPT'; /* case-insensitive match drops the intercept row */
OR=exp(estimate);
LCL_OR=exp(LCLMean);
UCL_OR=exp(UCLMean);
run;
proc print data=mi_parms noobs;
var parm length OR LCL_OR UCL_OR;
title 'Combined odds ratio estimates and confidence limits';
run;
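The back-transformation in the DATA step above can be summarized in one line. As a sketch, write $\hat{\beta}$ for a combined estimate on the logit scale and $[L, U]$ for its combined confidence limits (the Estimate, LCLMean, and UCLMean variables in MI_PARMS); these symbols are notation introduced here.

```latex
\[
\widehat{\mathrm{OR}} = e^{\hat{\beta}}, \qquad
\bigl[\mathrm{LCL}_{\mathrm{OR}},\ \mathrm{UCL}_{\mathrm{OR}}\bigr]
  = \bigl[e^{L},\ e^{U}\bigr].
\]
```

Because the exponential function is increasing, the transformed interval keeps the confidence level of the original interval, although it is generally not symmetric around the odds ratio estimate.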
The following statements add an LSMEANS statement and two ESTIMATE statements to the LOGISTIC step above. The ODS OUTPUT statement saves the LS-means, their differences, and the ESTIMATE results in the data sets LSM_MI, DIFF_DS, and EST_MI, respectively.
ods select none;
proc logistic data=OutFish2;
class length / param=glm;
model Species = Length Width;
by _Imputation_;
lsmeans length / diff;
estimate 'Width=5' width 5;
estimate 'Length 10-25 vs 25 and above' length 1 -.5 -.5;
ods output lsmeans=lsm_mi diffs=diff_ds estimates=est_mi;
run;
ods select all;
To combine the results from the ESTIMATE statements, first sort the EST_MI data set by Label, the variable that LOGISTIC uses to identify each ESTIMATE statement. A BY LABEL statement in MIANALYZE then provides separate combined results for each ESTIMATE statement. Note that the data are read with the DATA= option rather than the PARMS= option; the MODELEFFECTS and STDERR statements name the variables that contain the estimates and their standard errors.
proc sort data=est_mi;
by label;
run;
proc mianalyze data=est_mi;
by label;
modeleffects estimate;
stderr stderr;
title 'Combined results for the ESTIMATE statements';
run;
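The combined ESTIMATE results are on the logit scale. If odds ratios are wanted, the same exponentiation used earlier for the parameter estimates can be applied. The following is a minimal sketch, assuming EST_MI is already sorted by Label as above; the data set names est_comb and est_or are hypothetical names chosen for illustration.

```sas
/* Combine the ESTIMATE results and save the combined estimates. */
proc mianalyze data=est_mi;
   by label;
   modeleffects estimate;
   stderr stderr;
   ods output ParameterEstimates=est_comb;   /* hypothetical data set name */
run;

/* Exponentiate the combined estimate and its confidence limits. */
data est_or;                                  /* hypothetical data set name */
   set est_comb;
   OR     = exp(Estimate);
   LCL_OR = exp(LCLMean);
   UCL_OR = exp(UCLMean);
run;

proc print data=est_or noobs;
   var label OR LCL_OR UCL_OR;
run;
```

For the 'Width=5' estimate, which is 5 times the Width coefficient, the exponentiated value is the odds ratio for a 5-unit increase in Width.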
A similar approach is used in the following code to combine the LSMEANS estimates.
proc sort data=lsm_mi;
by length;
run;
proc mianalyze data=lsm_mi;
by length;
modeleffects estimate;
stderr stderr;
title 'Combined results for the LSMEANS estimates';
run;
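The LS-means from LOGISTIC are computed on the logit scale, so the combined values are log odds rather than probabilities. The following sketch applies the inverse logit to the combined LS-means and their confidence limits. It assumes LSM_MI is already sorted by Length as above; the data set names lsm_comb and lsm_prob and the variable names Prob, LCL_Prob, and UCL_Prob are hypothetical.

```sas
/* Combine the LS-means and save the combined estimates. */
proc mianalyze data=lsm_mi;
   by length;
   modeleffects estimate;
   stderr stderr;
   ods output ParameterEstimates=lsm_comb;   /* hypothetical data set name */
run;

/* Apply the inverse logit to the combined LS-mean and its limits. */
data lsm_prob;                                /* hypothetical data set name */
   set lsm_comb;
   Prob     = 1 / (1 + exp(-Estimate));
   LCL_Prob = 1 / (1 + exp(-LCLMean));
   UCL_Prob = 1 / (1 + exp(-UCLMean));
run;

proc print data=lsm_prob noobs;
   var length Prob LCL_Prob UCL_Prob;
run;
```

The resulting values are predicted probabilities for whichever Species level the LOGISTIC step models as the event.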
Combining the differences among the LS-means requires first creating a variable, Comparison, that identifies the two levels involved in each comparison. The remaining steps are similar to those used above for the LS-means and the ESTIMATE statement results.
data diff2;
set diff_ds;
Comparison=length||' vs '||left(_length);
run;
proc sort data=diff2;
by comparison _imputation_;
run;
proc mianalyze data=diff2;
by comparison;
modeleffects estimate;
stderr stderr;
title 'Combined results for the LSMEANS differences';
run;
Full Code
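proc format;
/* The <- ranges exclude the lower endpoint, so no Length value can fall in a gap between categories */
value lengthfmt
low-25 = '10-25'
25<-29 = '25.1-29'
29<-high = '29.1+';
run;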
data Fish2;
title 'Fish Measurement Data';
format length lengthfmt.;
input Species $ Length Width @@;
datalines;
Parkki 16.5 2.3265 Parkki 17.4 2.3142 . 19.8 .
Parkki 21.3 2.9181 Parkki 22.4 3.2928 . 23.2 3.2944
Parkki 23.2 3.4104 Parkki 24.1 3.1571 . 25.8 3.6636
Parkki 28.0 4.1440 Parkki 29.0 4.2340 Perch 8.8 1.4080
. 14.7 1.9992 Perch 16.0 2.4320 Perch 17.2 2.6316
Perch 18.5 2.9415 Perch 19.2 3.3216 . 19.4 .
Perch 20.2 3.0502 Perch 20.8 3.0368 Perch 21.0 2.7720
Perch 22.5 3.5550 Perch 22.5 3.3075 . 22.5 .
Perch 22.8 3.5340 . 23.5 . Perch 23.5 3.5250
Perch 23.5 3.5250 Perch 23.5 3.5250 Perch 23.5 3.9950
. 24.0 . Perch 24.0 3.6240 Perch 24.2 3.6300
Perch 24.5 3.6260 Perch 25.0 3.7250 . 25.5 3.7230
Perch 25.5 3.8250 Perch 26.2 4.1658 Perch 26.5 3.6835
. 27.0 4.2390 Perch 28.0 4.1440 Perch 28.7 5.1373
. 28.9 4.3350 . 28.9 . . 28.9 4.5662
Perch 29.4 4.2042 Perch 30.1 4.6354 Perch 31.6 4.7716
Perch 34.0 6.0180 . 36.5 6.3875 . 37.3 7.7957
Perch 39.0 . Perch 38.3 . Perch 39.4 6.2646
Perch 39.3 6.3666 Perch 41.4 7.4934 Perch 41.4 6.0030
Perch 41.3 7.3514 Parkki 42.3 . Perch 42.5 7.2250
Perch 42.4 7.4624 Perch 42.5 6.6300 Perch 44.6 6.8684
Perch 45.2 7.2772 Perch 45.5 7.4165 Perch 46.0 8.1420
Perch 46.6 7.5958
;
run;
proc mi data=Fish2 seed=1305417 out=outfish2;
class Species length;
fcs logistic( Species= Length Width);
fcs regression (width=species length);
var Length Width Species;
run;
ods select none;
proc logistic data=outfish2;
class length/param=glm;
model Species= Length Width;
by _Imputation_;
ods output ParameterEstimates=lgsparms;
run;
ods select all;
proc print data=lgsparms noobs;
where _imputation_=1;
title 'LOGISTIC Model Coefficients (First Imputation)';
run;
proc mianalyze parms(classvar=classval)=lgsparms;
class length;
modeleffects intercept length width;
ods output ParameterEstimates=mi_parms;
title 'Proc MIANALYZE results for the combined Parameter Estimates';
run;
data mi_parms;
set mi_parms;
where upcase(parm) ne 'INTERCEPT'; /* case-insensitive match drops the intercept row */
OR=exp(estimate);
LCL_OR=exp(LCLMean);
UCL_OR=exp(UCLMean);
run;
proc print data=mi_parms noobs;
var parm length OR LCL_OR UCL_OR;
title 'Combined Estimate for OR and CLs';
run;
ods select none;
proc logistic data=outfish2;
class length/param=glm;
model Species= Length Width;
by _Imputation_;
lsmeans length/diff;
estimate 'Width=5' width 5;
estimate 'Length 10-25 vs 25 and above' length 1 -.5 -.5;
ods output lsmeans=lsm_mi diffs=diff_ds estimates=est_mi;
run;
ods select all;
proc sort data=est_mi;
by label;
run;
proc mianalyze data=est_mi;
by label;
modeleffects estimate;
stderr stderr;
title 'Combined results for the ESTIMATE statements';
run;
proc sort data=lsm_mi;
by length;
run;
proc mianalyze data=lsm_mi;
by length;
modeleffects estimate;
stderr stderr;
title 'Combined results for the LSMeans';
run;
data diff2;
set diff_ds;
comparison=length||' vs '||left(_length);
run;
proc sort data=diff2;
by comparison _imputation_;
run;
proc mianalyze data=diff2;
by comparison;
modeleffects estimate;
stderr stderr;
title 'Combined results for the LSMean differences';
run;