Naming convention for statistics in an output data set created by PROC TABULATE


The OUT= option in the PROC TABULATE statement creates an output data set. Statistics in the output data set are named variablename_statisticname. If appending the statistic name to the variable name would exceed the 32-character limit on SAS variable names, PROC TABULATE names the variable statisticname<n> instead. The first such variable receives the bare statistic name, and later ones receive a numeric suffix. When more than one variable has a long name, notes similar to the following appear in the SAS log:

    NOTE: Variable Mean already exists on file WORK.NEW, using Mean2 instead.
    NOTE: Variable PctSum already exists on file WORK.NEW, using PctSum2 instead.


For the percentage statistics, the variable name does not have to be very long for PROC TABULATE to use the alternative naming convention. The default name in the output data set appends a series of 0s and 1s to the variable name and the percentage statistic name. Each digit indicates whether the corresponding CLASS variable contributed to the statistic. PROC TABULATE considers all CLASS variables, even those that are not used in the TABLE statement, so the suffix grows with the number of CLASS variables.
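
To see the default percentage names for a short variable name, a minimal sketch such as the following can be run. It assumes the SASHELP.CLASS sample data set is available; the output data set name PCT_NAMES is hypothetical. AGE is listed in the CLASS statement but deliberately left out of the TABLE statement, so the listing from PROC CONTENTS shows how it is still reflected in the name of the PCTSUM variable.

    /* Sketch only: SASHELP.CLASS and the name PCT_NAMES are assumptions */
    proc tabulate data=sashelp.class out=pct_names;
     class sex age;                    /* AGE is not used in the TABLE statement */
     var height;
     table sex, height*(mean pctsum);
    run;

    /* List only the variable names in the output data set */
    proc contents data=pct_names short;
    run;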

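The code below reproduces the alternative naming behavior. The two analysis variable names are 29 and 30 characters long, so appending _Mean or _PctSum to either one exceeds the 32-character limit.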


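    /* One observation with two analysis variables whose names are */
    /* 29 and 30 characters long                                   */
    data one;
     x=111;
     y=1;
     first_variable_with_long_name=5;
     second_variable_with_long_name=10;
    run;

    /* OUT= writes the requested statistics to WORK.NEW */
    proc tabulate data=one out=new;
     class x y;
     var first_variable_with_long_name
         second_variable_with_long_name;
     table x*y, first_variable_with_long_name*(mean pctsum)
                second_variable_with_long_name*(mean pctsum);
    run;

    /* Display the output data set to see the generated variable names */
    proc print data=new;
    run;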
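
One possible way to avoid the alternative names is to shorten the analysis variable names on input with the RENAME= data set option, so that the variable name plus the statistic name fits within 32 characters. The sketch below reuses data set ONE from the example above. The names FIRST_VAR, SECOND_VAR, and NEW_SHORT are hypothetical and chosen only for illustration. With names this short, the output variables should follow the variablename_statisticname convention. The PROC CONTENTS step is included so that the resulting names can be verified rather than assumed.

    /* Sketch only: FIRST_VAR, SECOND_VAR, and NEW_SHORT are illustrative names */
    proc tabulate data=one(rename=(first_variable_with_long_name=first_var
                                   second_variable_with_long_name=second_var))
                  out=new_short;
     class x y;
     var first_var second_var;         /* use the renamed variables */
     table x*y, first_var*(mean pctsum)
                second_var*(mean pctsum);
    run;

    /* Confirm the names that PROC TABULATE generated */
    proc contents data=new_short short;
    run;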