Additional examples of combining PROC LOGISTIC results in PROC MIANALYZE


The following demonstrates how results from the analysis of imputed data from the LOGISTIC procedure can be combined using the MIANALYZE procedure. Slightly altered data from the example titled "Reading Logistic Model Results from a PARMS= Data Set" in the MIANALYZE documentation are used below. Rather than using Length as a continuous variable, it is categorized in order to demonstrate how to combine results involving CLASS variables from LOGISTIC in MIANALYZE.

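The statements that follow create the data and attach a format to the Length variable that groups it into three categories. Because SAS procedures form CLASS levels from formatted values, Length is treated as a three-level classification variable in the steps that follow.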

     proc format;
        value lengthfmt
           low-25    = '10-25'
           25.1-29   = '25.1-29'
           29.1-high = '29.1+';
        run;
     data Fish2;
        title 'Fish Measurement Data';
        format length lengthfmt.;
        input Species $ Length Width @@;
        datalines;
     Parkki  16.5  2.3265    Parkki  17.4  2.3142    .      19.8   .
     Parkki  21.3  2.9181    Parkki  22.4  3.2928    .      23.2  3.2944
     Parkki  23.2  3.4104    Parkki  24.1  3.1571    .      25.8  3.6636
     Parkki  28.0  4.1440    Parkki  29.0  4.2340    Perch   8.8  1.4080
     .       14.7  1.9992    Perch   16.0  2.4320    Perch  17.2  2.6316
     Perch   18.5  2.9415    Perch   19.2  3.3216    .      19.4   .
     Perch   20.2  3.0502    Perch   20.8  3.0368    Perch  21.0  2.7720
     Perch   22.5  3.5550    Perch   22.5  3.3075    .      22.5   .
     Perch   22.8  3.5340    .       23.5   .        Perch  23.5  3.5250
     Perch   23.5  3.5250    Perch   23.5  3.5250    Perch  23.5  3.9950
     .       24.0   .        Perch   24.0  3.6240    Perch  24.2  3.6300
     Perch   24.5  3.6260    Perch   25.0  3.7250    .      25.5  3.7230
     Perch   25.5  3.8250    Perch   26.2  4.1658    Perch  26.5  3.6835
     .       27.0  4.2390    Perch   28.0  4.1440    Perch  28.7  5.1373
     .       28.9  4.3350    .       28.9   .        .      28.9  4.5662
     Perch   29.4  4.2042    Perch   30.1  4.6354    Perch  31.6  4.7716
     Perch   34.0  6.0180    .       36.5  6.3875    .      37.3  7.7957
     Perch   39.0   .        Perch   38.3   .        Perch  39.4  6.2646
     Perch   39.3  6.3666    Perch   41.4  7.4934    Perch  41.4  6.0030
     Perch   41.3  7.3514    Parkki  42.3   .        Perch  42.5  7.2250
     Perch   42.4  7.4624    Perch   42.5  6.6300    Perch  44.6  6.8684
     Perch   45.2  7.2772    Perch   45.5  7.4165    Perch  46.0  8.1420
     Perch   46.6  7.5958
     ; 
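
Because the levels of a CLASS variable come from its formatted values, it can be worth confirming that the format yields the three intended categories before imputing. The step below is a minimal sketch of such a check and is not part of the original example; the MISSING option is included only so that any missing Length values would show up in the table.

```sas
/* Sketch: tabulate Length by its formatted value to confirm three levels */
proc freq data=Fish2;
   tables Length / missing;
   format Length lengthfmt.;   /* already attached in the DATA step; repeated for clarity */
run;
```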

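To impute missing values in both the response (Species) and the continuous predictor (Width), the PROC MI step below uses the FCS logistic regression method for Species and the FCS regression method for Width. Because the NIMPUTE= option is not specified, the default of 25 imputations is created. The imputed data sets are stacked, indexed by the variable _Imputation_, and saved in the data set OutFish2.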

     proc mi data=Fish2 seed=1305417 out=OutFish2;
       class Species Length;
       fcs logistic(Species = Length Width);
       fcs regression (Width = Species Length);
       var Length Width Species;

     run;
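
Before fitting models, a quick look at the imputed data can confirm that each imputation is complete. The following is a sketch rather than part of the original example; the column aliases (n_obs, n_missing_width, n_missing_species) are illustrative names. One row per value of _Imputation_ is expected, with both missing counts equal to zero if the imputation filled every missing value.

```sas
/* Sketch: count observations and remaining missing values per imputation */
proc sql;
   select _Imputation_,
          count(*)              as n_obs,
          sum(missing(Width))   as n_missing_width,
          sum(missing(Species)) as n_missing_species
      from OutFish2
      group by _Imputation_;
quit;
```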

Combining parameter estimates and odds ratios for continuous and CLASS predictors

These statements fit a logistic model to each of the 25 imputed data sets. The ODS SELECT NONE statement suppresses all displayed output, while the ODS OUTPUT statement saves the parameter estimates from each model in the data set LGSPARMS. ODS SELECT ALL then restores displayed output, and the PRINT procedure prints the parameter estimates from the model fit to the first imputed data set.

     ods select none;
     proc logistic data=OutFish2;
       class length / param=glm;
       model Species = Length Width;
       by _Imputation_;
       ods output ParameterEstimates=lgsparms;
       run;
     ods select all;
     proc print data=lgsparms noobs;
       where _imputation_=1;
       title 'LOGISTIC Model Coefficients (First Imputation)';
       run;

LOGISTIC Model Coefficients (First Imputation)
 
_Imputation_  Variable   ClassVal0  DF  Estimate  StdErr  WaldChiSq  ProbChiSq  _ESTTYPE_
           1  Intercept              1    1.7663  3.0698     0.3311     0.5650  MLE
           1  length     10-25       1   -0.3779  1.7135     0.0486     0.8254  MLE
           1  length     25.1-29     1   -0.2000  1.5119     0.0175     0.8948  MLE
           1  length     29.1+       0         0       .          .          .  MLE
           1  Width                  1   -0.7912  0.5171     2.3416     0.1260  MLE

 

Because the LGSPARMS data set identifies the levels of the CLASS variable in the variable ClassVal0, the CLASSVAR=CLASSVAL option is specified so that MIANALYZE can correctly read the CLASS variable levels from the PARMS= data set.

The following statements combine the results using the MIANALYZE procedure. The combined parameter estimates are saved in the data set MI_PARMS.

    proc mianalyze parms(classvar=classval)=lgsparms;
      class Length;
      modeleffects intercept Length Width;
      ods output ParameterEstimates=mi_parms;
      run; 
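
For intuition, the combination that MIANALYZE performs for a single parameter follows Rubin's rules: the combined estimate is the average of the per-imputation estimates, and the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance, where m is the number of imputations. The PROC SQL step below is a hand-check sketch, not part of the original example, that applies these rules to the Width coefficient in LGSPARMS; the column aliases (m, Qbar, W, B, T, StdErr_MI) are illustrative names. Its Qbar and StdErr_MI values should agree with the Estimate and Std Error that MIANALYZE reports for Width.

```sas
/* Sketch: Rubin's rules applied by hand to the Width coefficient */
proc sql;
   select count(*)        as m,          /* number of imputations            */
          mean(Estimate)  as Qbar,       /* combined point estimate          */
          mean(StdErr**2) as W,          /* within-imputation variance       */
          var(Estimate)   as B,          /* between-imputation variance      */
          calculated W + (1 + 1/calculated m) * calculated B as T,  /* total variance */
          sqrt(calculated T) as StdErr_MI
      from lgsparms
      where upcase(Variable) = 'WIDTH';
quit;
```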

The combined parameter estimates from MIANALYZE are shown below.

Parameter Estimates (25 Imputations)
Parameter  Length    Estimate  Std Error  95% Confidence Limits      DF    Minimum    Maximum  Theta0  t for H0: Parameter=Theta0  Pr > |t|
intercept            0.733197   4.032199   -7.23512    8.701512  147.56  -5.057848   5.560202       0                        0.18    0.8560
Length     10-25     0.082886   2.157870   -4.16479    4.330565  280.44  -1.937045   2.880325       0                        0.04    0.9694
Length     25.1-29   0.834282   1.785409   -2.67787    4.346433  331.75  -0.737383   2.572472       0                        0.47    0.6406
Length     29.1+            0          .          .           .       .          0          0       0                           .         .
Width               -0.613070   0.677676   -1.95509    0.728950  117.68  -1.534527   0.296636       0                       -0.90    0.3675

 

The theory supporting multiple imputation requires a point estimate and a standard error for each quantity that is combined. Because LOGISTIC does not report a standard error for the odds ratios, first combine the parameter estimates, as was done above, and then exponentiate the combined estimates and their confidence limits to obtain the odds ratios and their confidence limits. This approach assumes that either PARAM=GLM or PARAM=REF was used in the LOGISTIC step so that the exponentiated estimates represent true odds ratios.

     data mi_parms;
       set mi_parms;
       where parm ne 'intercept';
       OR=exp(estimate);
       LCL_OR=exp(LCLMean);
       UCL_OR=exp(UCLMean);
       run;
     proc print data=mi_parms noobs;
       var parm length OR LCL_OR UCL_OR;
       title 'Combined odds ratio estimates and confidence limits';
       run;

Combined odds ratio estimates and confidence limits
 
Parm    Length        OR   LCL_OR   UCL_OR
Length  10-25    1.08642  0.01553  75.9872
Length  25.1-29  2.30316  0.06871  77.2026
Length  29.1+    1.00000        .        .
Width            0.54169  0.14155   2.0729
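
As a quick arithmetic check on the exponentiation step, the Width row can be reproduced from the combined estimate and confidence limits that MIANALYZE reported for Width:

```latex
\begin{aligned}
\mathrm{OR}_{\text{Width}}  &= \exp(-0.613070) \approx 0.54169 \\
\mathrm{LCL}_{\text{Width}} &= \exp(-1.95509)  \approx 0.14155 \\
\mathrm{UCL}_{\text{Width}} &= \exp(0.728950)  \approx 2.0729
\end{aligned}
```

The resulting interval is not symmetric about the odds ratio because it is symmetric on the log-odds scale, where the combination was carried out.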

 

Combining results from the LSMEANS and ESTIMATE statements

The following statements add an LSMEANS statement and two ESTIMATE statements to the LOGISTIC step above. The ODS OUTPUT statement saves the results from each of these statements in data sets.

     ods select none;
     proc logistic data=OutFish2;
       class length / param=glm;
       model Species = Length Width;
       by _Imputation_;
       lsmeans length / diff;
       estimate 'Width=5' width 5;
       estimate 'Length 10-25 vs 25 and above' length 1 -.5 -.5;
       ods output lsmeans=lsm_mi diffs=diff_ds estimates=est_mi;
       run;
     ods select all; 

To combine the results from the ESTIMATE statements, first sort the EST_MI data set by Label, because LOGISTIC identifies the results of each ESTIMATE statement by the variable Label. The BY LABEL statement can then be used in MIANALYZE to produce separate combined results for each ESTIMATE statement. Note that the DATA= option is used rather than the PARMS= option to input the data into MIANALYZE. The MODELEFFECTS statement names the variable that holds the estimates, and the STDERR statement names the variable that holds their standard errors.

     proc sort data=est_mi;
       by label;
       run;
     proc mianalyze data=est_mi;
       by label;
       modeleffects estimate;
       stderr stderr;
       title 'Combined results for the ESTIMATE statements';
       run; 

Parameter Estimates (25 Imputations)
Parameter   Estimate  Std Error  95% Confidence Limits      DF    Minimum    Maximum  Theta0  t for H0: Parameter=Theta0  Pr > |t|
estimate   -3.065352   3.388381   -9.77546    3.644752  117.68  -7.672635   1.483179       0                       -0.90    0.3675
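
The table above appears to correspond to the 'Width=5' ESTIMATE statement, since that estimate is simply five times the Width coefficient. As a consistency check, using the combined Width estimate and standard error reported earlier by MIANALYZE (small differences in the last digit reflect rounding of the displayed values):

```latex
\begin{aligned}
5 \times (-0.613070) &= -3.065350 \approx -3.065352 \\
5 \times 0.677676    &= 3.388380  \approx 3.388381
\end{aligned}
```

The degrees of freedom (117.68), t statistic (-0.90), and p-value (0.3675) likewise match those of the Width parameter, as expected when an estimate is rescaled by a constant.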

 

A similar approach is used in the following code to combine the LSMEANS estimates.

    proc sort data=lsm_mi;
      by length;
      run;
    proc mianalyze data=lsm_mi;
      by length;
      modeleffects estimate;
      stderr stderr;
      title 'Combined results for the LSMEANS estimates';
      run;

Parameter Estimates (25 Imputations)
Parameter   Estimate  Std Error  95% Confidence Limits      DF    Minimum    Maximum  Theta0  t for H0: Parameter=Theta0  Pr > |t|
estimate   -1.996411   1.441610   -4.82848    0.835656  522.35  -3.747256  -0.802452       0                       -1.38    0.1667

 

Combining the differences among the LS-means requires creating a variable, Comparison, that identifies the two levels involved in each comparison. The steps are then similar to those used above for the LSMEANS estimates and the ESTIMATE statement results.

     data diff2;
       set diff_ds;
       Comparison=length||' vs '||left(_length);
       run;
     proc sort data=diff2;
       by comparison _imputation_;
       run;
     proc mianalyze data=diff2;
       by comparison;
       modeleffects estimate;
       stderr stderr;
       title 'Combined results for the LSMEANS differences';
       run;

Parameter Estimates (25 Imputations)
Parameter   Estimate  Std Error  95% Confidence Limits      DF    Minimum    Maximum  Theta0  t for H0: Parameter=Theta0  Pr > |t|
estimate    0.834282   1.785409   -2.67787    4.346433  331.75  -0.737383   2.572472       0                        0.47    0.6406

Full Code

proc format;
   value lengthfmt low-25    = '10-25'
                   25.1-29   = '25.1-29'
                   29.1-high = '29.1+';
run;

data Fish2;
   title 'Fish Measurement Data';
format length lengthfmt.;
   input Species $ Length Width @@;
   datalines;
Parkki  16.5  2.3265    Parkki  17.4  2.3142    .      19.8   .
Parkki  21.3  2.9181    Parkki  22.4  3.2928    .      23.2  3.2944
Parkki  23.2  3.4104    Parkki  24.1  3.1571    .      25.8  3.6636
Parkki  28.0  4.1440    Parkki  29.0  4.2340    Perch   8.8  1.4080
.       14.7  1.9992    Perch   16.0  2.4320    Perch  17.2  2.6316
Perch   18.5  2.9415    Perch   19.2  3.3216    .      19.4   .
Perch   20.2  3.0502    Perch   20.8  3.0368    Perch  21.0  2.7720
Perch   22.5  3.5550    Perch   22.5  3.3075    .      22.5   .
Perch   22.8  3.5340    .       23.5   .        Perch  23.5  3.5250
Perch   23.5  3.5250    Perch   23.5  3.5250    Perch  23.5  3.9950
.       24.0   .        Perch   24.0  3.6240    Perch  24.2  3.6300
Perch   24.5  3.6260    Perch   25.0  3.7250    .      25.5  3.7230
Perch   25.5  3.8250    Perch   26.2  4.1658    Perch  26.5  3.6835
.       27.0  4.2390    Perch   28.0  4.1440    Perch  28.7  5.1373
.       28.9  4.3350    .       28.9   .        .      28.9  4.5662
Perch   29.4  4.2042    Perch   30.1  4.6354    Perch  31.6  4.7716
Perch   34.0  6.0180    .       36.5  6.3875    .      37.3  7.7957
Perch   39.0   .        Perch   38.3   .        Perch  39.4  6.2646
Perch   39.3  6.3666    Perch   41.4  7.4934    Perch  41.4  6.0030
Perch   41.3  7.3514    Parkki  42.3   .        Perch  42.5  7.2250
Perch   42.4  7.4624    Perch   42.5  6.6300    Perch  44.6  6.8684
Perch   45.2  7.2772    Perch   45.5  7.4165    Perch  46.0  8.1420
Perch   46.6  7.5958
; run;

proc mi data=Fish2 seed=1305417 out=outfish2;
   class Species length;
   fcs logistic( Species= Length Width);
   fcs regression (width=species length);
   var Length Width Species;
run;

ods select none;
proc logistic data=outfish2;
   class length/param=glm;
   model Species= Length Width;
   by _Imputation_;
   ods output ParameterEstimates=lgsparms;
run;
ods select all;

proc print data=lgsparms noobs;
where _imputation_=1;
title 'LOGISTIC Model Coefficients (First Imputation)';
run;

proc mianalyze parms(classvar=classval)=lgsparms;
class length;
modeleffects intercept length width;
ods output ParameterEstimates=mi_parms;
title 'Proc MIANALYZE results for the combined Parameter Estimates';
run;

data mi_parms;
set mi_parms;
where parm ne 'intercept';
OR=exp(estimate);
LCL_OR=exp(LCLMean);
UCL_OR=exp(UCLMean);
run;

proc print data=mi_parms noobs;
var parm length OR LCL_OR UCL_OR;
title 'Combined Estimate for OR and CLs';
run;

ods select none;
proc logistic data=outfish2;
   class length/param=glm;
   model Species= Length Width;
   by _Imputation_;
   lsmeans length/diff;
   estimate 'Width=5' width 5;
   estimate 'Length 10-25 vs 25 and above' length 1 -.5 -.5;
   ods output lsmeans=lsm_mi diffs=diff_ds estimates=est_mi;
run;

ods select all;


proc sort data=est_mi;
by label;
run;

proc mianalyze data=est_mi;
by label;
modeleffects estimate;
stderr stderr;
title 'Combined results for the ESTIMATE statements';
run;

 

proc sort data=lsm_mi;
by length;
run;

proc mianalyze data=lsm_mi;
by length;
modeleffects estimate;
stderr stderr;
title 'Combined results for the LSMeans';
run;

data diff2;
set diff_ds;
comparison=length||' vs '||left(_length);
run;

proc sort data=diff2;
by comparison _imputation_;
run;

proc mianalyze data=diff2;
by comparison;
modeleffects estimate;
stderr stderr;
title 'Combined results for the LSMean differences';

run;