Estimating RERI: the Relative Excess Risk due to Interaction


When a binary response model includes the interaction of two binary predictors, such as two treatments or two types of exposure, an assessment of the interaction effect is needed. Specifically, does the presence of both treatments or exposures have more or less effect than each separately? For this model, one assessment statistic is the so-called difference in difference (DID) of event probabilities. For example, if the two binary predictors are Group and Exposure, the difference in difference tells you whether the Exposure results in a change in the probability of the response event in one group that differs from the change in the other group. More generally, the DID measures how much the effect of one predictor changes over the levels of the other predictor. Estimating the DID in various scenarios is discussed and illustrated in SAS Note 61830.

Assuming A and B are two binary predictors, each with levels 1 and 0 reflecting presence or absence respectively, VanderWeele, et al. (2014) shows that the effect of A and B together, minus their separate effects, equals the DID which is the difference of the B effect in the two levels of A. This is a measure of the interaction effect in an additive sense.

DID =(p11 - p00) - (p10 - p00) - (p01 - p00)
=(p11 - p10) - (p01 - p00)
=p11 - p10 - p01 + p00 ,

where pij is the event probability for A=i and B=j.

Another statistic that assesses additive interaction is the relative excess risk due to interaction or RERI. It results from dividing DID by p00 to create a function of relative risks rather than probabilities. RERI is positive, zero, or negative in agreement with DID. So, while DID provides the more direct and intuitive assessment of interaction in an additive sense, RERI can indicate the direction of the interaction effect if not its magnitude.

RERIrr =DID/p00
=p11/p00 - p10/p00 - p01/p00 + p00/p00
=rr11 - rr10 - rr01 + 1

A similar statistic that is a function of odds ratios rather than relative risks can be defined by replacing each probability in the DID with an odds ratio that divides each odds by p00/(1-p00).

RERIor =(p11/(1-p11))/(p00/(1-p00)) - (p10/(1-p10))/(p00/(1-p00)) -
 (p01/(1-p01))/(p00/(1-p00)) + (p00/(1-p00))/(p00/(1-p00))
=or11 - or10 - or01 + 1

As is well known, the odds ratio approximates the relative risk when the event is rare. When the event is rare in all AB combinations, RERIor approximates RERIrr. Since RERIor makes use of odds ratios, it is suitable for case-control studies in which the overall response rate is set by the design.

RERIrr can be directly estimated from the parameters of a log-linked binomial model,

log(p) = b0 + b1A + b2B + b3AB ,

which can be fit using the GENMOD or GLIMMIX procedure. For example,

   proc genmod;
      class a b;
      model y(event='1') = a|b / dist=bin link=log;
      run;

With this model, RERIrr can be estimated as

RERIrr = exp(b1+b2+b3) - exp(b1) - exp(b2) + 1 .

This nonlinear function of the model parameters can be estimated using the NLEST macro (SAS Note 58775) or the NLMeans macro (SAS Note 62362). However, estimation of this model can encounter convergence problems since the log link does not ensure that the estimated mean is a valid probability (see "Log-linked Binomial Model" in SAS Notes 23003 and 23001). Under this log-linked binomial model, it can be shown that exp(b3) estimates the ratio of relative risks (p11/p10)/(p01/p00). This is the ratio of the relative risk of B in A=1 to the relative risk of B in A=0. A value of 1 indicates no interaction. Also, exp(b3) estimates rr11/(rr10•rr01), which is more easily seen to measure the effect of A and B together compared to the product of their individual effects. It measures interaction in a multiplicative sense, so is a multiplicative analog of the additive measure, DID.

RERIor can be directly estimated from the parameters of a logistic model,

logit(p) = log(p/(1-p)) = b0 + b1A + b2B + b3AB ,

which can be fit using the LOGISTIC, GENMOD, or GLIMMIX procedure. For example,

   proc logistic;
      class a b / param=glm;
      model y(event='1') = a|b;
      run;

The same function of model parameters as above for estimating RERIrr can be used with this logistic model to estimate RERIor.

RERIor = exp(b1+b2+b3) - exp(b1) - exp(b2) + 1

Again, this nonlinear function can be estimated using the NLEST or NLMeans macro. The logistic model is much less likely to suffer from convergence problems than the log-binomial model above. This makes RERIor an attractive approximation of RERI when the event is rare.

Under the logistic model, it can be shown that exp(b3) estimates or11/(or10•or01) which, like the measure from the log-binomial model above, is a multiplicative measure of the effect of A and B together compared to the product of their individual effects. A value of 1 indicates no interaction.

The following examples illustrate using both RERI estimators when the event is common and when it is rare. The same estimates are obtained using NLEST and NLMeans. As shown in the last example, RERI can be estimated in more complex models that include covariates by defining the above function of relative risks or odds ratios using the NLMeans macro. Based on LS-means from the fitted model, this is done in the same way as described for the DID in the "Estimating the Difference in Difference in Models with Covariates" section of SAS Note 61830. The last example illustrates RERI with non-binary predictors.

Examples

The following uses example data analyzed in the "Generalized Linear Models with a Non-Identity Link" section of SAS Note 61830 on estimating the DID. As shown there, the estimated DID is -0.26. The following call of the NLEST macro uses the parameters of the logistic model that is fitted in the SAS Note to estimate RERIor. Equivalently, the NLMeans macro uses the coefficients of the LS-means and specifies the function of odds ratios for the RERIor.

   %nlest(instore=log, label=RERI_or, 
      f=exp(b_p2+b_p4+b_p6)-exp(b_p2)-exp(b_p4)+1)
   %nlmeans(instore=lmod, coef=c, link=logit, flabel=RERI_or,
      f=(mu1/(1-mu1))/(mu4/(1-mu4)) - (mu2/(1-mu2))/(mu4/(1-mu4)) -
        (mu3/(1-mu3))/(mu4/(1-mu4)) + 1)

The estimated RERIor from either macro is -2.5906 and differs significantly from zero (p=0.0369).

RERIrr can be estimated using the log-binomial model discussed above, followed by NLEST or NLMeans. With NLMeans, specify the function of LS-means that defines the function of relative risks as shown above.

   proc genmod data=a;
      class a b / param=glm ref=first;
      model y(event="1") = a b a*b / dist=bin link=log;
      lsmeans a*b / e ilink;
      ods output coef=c;
      store gen;
      run;
   %nlest(instore=gen, label=RERI_rr, 
      f=exp(b_p2+b_p4+b_p6)-exp(b_p2)-exp(b_p4)+1)
   %nlmeans(instore=gen, coef=c, link=log, flabel=RERI_rr, 
      f=mu1/mu4 - mu2/mu4 - mu3/mu4 + 1)

The RERIrr estimate from both NLEST and NLMeans is -0.6667 (p=0.0258).

The relative risk and odds ratio based estimates differ substantially because the event is not rare in this example, so RERIrr is the more appropriate measure. As shown in the Note from the LSMEANS statement, the event probabilities in the four AB combinations are 0.53, 0.72, 0.46, and 0.39.

As noted above, DID=RERIrr • p00. From the LSMEANS statement results shown in the Note, p00 = 0.39. This is used to multiply the estimates from NLEST and NLMeans to estimate DID.

   %nlest(instore=gen, label=DID, 
          f=0.39*(exp(b_p2+b_p4+b_p6)-exp(b_p2)-exp(b_p4)+1))
   %nlmeans(instore=gen, coef=c, link=log, flabel=RERI_rr, 
      f=0.39*(mu1/mu4 - mu2/mu4 - mu3/mu4 + 1))

The resulting estimate from each macro matches the DID directly estimated in the Note. The standard errors differ slightly due to the use of different estimation methods. Note that both RERIrr and DID agree in sign (negative) and are significantly different from zero (p=0.0258) but differ in magnitude as discussed above.

Rare Event

Dividing the event probabilities in the SAS Note example by 10 creates a scenario in which the event is rare. The following repeats the logistic and log-binomial models and computation of both estimators of RERI.

   data a;
     do a=1,0; 
     do b=1,0; 
       input mean @@; 
       n=ranbin(34534,100,mean/10);
       do rep=1 to n;     y=1; output; end; 
       do rep=n+1 to 100; y=0; output; end; 
     end; end;
     datalines;
   .5 .7 .4 .4
   ;

   proc logistic data=a;
     class a b / param=glm ref=first;
     model y(event="1") = a b a*b;
     lsmeans a*b / e ilink;
     ods output coef=coeffs;
     store log;
     run;
   %nlest(instore=log, label=RERI_or, 
      f=exp(b_p2+b_p4+b_p6)-exp(b_p2)-exp(b_p4)+1)
   %nlmeans(instore=log, coef=coeffs, link=logit, flabel=RERI_or,
      f=(mu1/(1-mu1))/(mu4/(1-mu4)) - (mu2/(1-mu2))/(mu4/(1-mu4)) -
        (mu3/(1-mu3))/(mu4/(1-mu4)) + 1)

   proc genmod data=a;
      class a b / param=glm ref=first;
      model y(event="1") = a b a*b / dist=bin link=log;
      lsmeans a*b / e ilink;
      ods output coef=c;
      store gen;
      run;
   %nlest(instore=gen, label=RERI_rr, 
      f=exp(b_p2+b_p4+b_p6)-exp(b_p2)-exp(b_p4)+1)
   %nlmeans(instore=gen, coef=c, link=log, flabel=RERI_rr, 
      f=mu1/mu4 - mu2/mu4 - mu3/mu4 + 1)

Since the event is rare in all of the AB combinations, as shown in the LSMEANS results, the two RERI estimators produce very similar estimates.



As before, DID can be estimated by multiplying RERIrr by p00. As seen in the LSMEANS results above, p00 = 0.01.

   %nlest(instore=gen, label=DID, 
      f=0.01*(exp(b_p2+b_p4+b_p6)-exp(b_p2)-exp(b_p4)+1))
   %nlmeans(instore=gen, coef=c, link=log, flabel=RERI_rr, 
      f=0.01*(mu1/mu4 - mu2/mu4 - mu3/mu4 + 1))

The DID estimate, -0.01, is easily confirmed by computing it directly from the probabilities in the LSMEANS results: (0.05-0.03)-(0.04-0.01) = -0.01. Note again that RERIrr and DID agree in sign, but not magnitude.

Covariates and Non-Binary Predictors

You can estimate RERI in more complex models that contain covariates or if one or both predictors of interest have more than two levels. The following uses the neuralgia data analyzed in the example titled "Logistic Modeling with Categorical Predictors" in the LOGISTIC documentation. The primary predictors are Treatment and Sex and an assessment of their interaction is of interest. Additionally, there are two continuous covariates, Duration and Age. Note that Treatment has three levels – two experimental treatments, A and B, and a placebo, P. The PROC RANK step below categorizes Age into three levels to be used as a categorical predictor.

Since the response in these data is not rare, a model for relative risks is used rather than a logistic model that would use odds ratios so that RERIrr can be estimated. PROC GENMOD fits a modified Poisson model, as discussed in SAS Note 23003, to estimate the probabilities of Pain=Yes (equivalently, PainNum=1) in each of the six Treatment-Sex combinations adjusted for the effects of Duration and Age.

   proc rank data=neuralgia out=neur group=3;
      var age;
      run;
   data neur; set neur;
      id+1; PainNum=(Pain='Yes');
      run;
   proc genmod data=neur;
      class treatment sex age id;
      model PainNum = treatment|sex duration age / dist=poisson;
      repeated subject=id;
      lsmeans treatment*sex / e ilink;
      ods output coef=c;
      store modp;
      run;

The following shows the adjusted Pain=Yes probabilities in the Mean column produced by the ILINK option in the LSMEANS statement.

When one or both predictors of interest are non-binary, you can estimate RERI for each meaningful dichotomization of both predictors, with one of the resulting four combinations considered as a reference or "unexposed" combination. In this case, the Treatment=P, Sex=M combination is treated as the reference. RERI is then estimated separately for Treatments A and B. Since GENMOD models the log probability, the following statements use RERIrr to estimate RERI for each experimental treatment adjusted for the covariates.

   %nlmeans(instore=modp, coef=c, link=log, flabel=RERI_rr A vs P, 
      f=mu1/mu6 - mu2/mu6 - mu5/mu6 + 1, title=RERI A vs P)
   %nlmeans(instore=modp, coef=c, link=log, flabel=RERI_rr B vs P, 
      f=mu3/mu6 - mu4/mu6 - mu5/mu6 + 1, title=RERI B vs P)

With estimates of RERIrr, the DID is easily estimated by multiplying by the probability of the reference combination, Treatment=P Sex=M, which is shown above to be 0.8081.

   %nlmeans(instore=modp, coef=c, link=log, flabel=DID A vs P, 
      f=.8081*(mu1/mu6 - mu2/mu6 - mu5/mu6 + 1), title=DID Estimate from RERI)
   %nlmeans(instore=modp, coef=c, link=log, flabel=DID B vs P, 
      f=.8081*(mu3/mu6 - mu4/mu6 - mu5/mu6 + 1), title=DID Estimate from RERI)

However, the DID can also be directly estimated using the difference in difference expression.

   %nlmeans(instore=modp, coef=c, link=log, flabel=DID A vs P, 
      f=(mu1-mu2)-(mu5-mu6), title=Direct Estimate of DID)
   %nlmeans(instore=modp, coef=c, link=log, flabel=DID B vs P, 
      f=(mu3-mu4)-(mu5-mu6), title=Direct Estimate of DID)

__________

Richardson DB, Kaufman JS. Estimation of the relative excess risk due to interaction and associated confidence bounds. Am J Epidemiol. 2009 Mar 15;169(6):756-60.

VanderWeele TJ, Knol MJ. A tutorial on interaction. Epidemiologic methods. 2014 Dec 1;3(1):33-72.