Timeouts Occur for Transactions on the SAS® OnDemand Decision Engine (ODE) Server


This article details the reasons that timeouts can occur on the ODE server and provides some workarounds. Timeouts can occur in SAS® Fraud Management 6.2 and earlier releases. 

Analysis

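Timeouts can happen for many reasons, including but not limited to a slow-performing MEH database, contention for the same MEH keys, and network latency between the ODE server and the MEH database.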

Reviewing the console log and ODE log for errors, as well as the interval stats for that timeframe, can help clarify the situation.

Whenever there is a timeout on the ODE server, an entry similar to the following is written to ode.log:

2025-04-22T14:13:36,449 [Engine  3] INFO  SLA_NOTICE 80ms SLA exceeded elapsed=380.6ms cmx_tran_id=0032039680429853 smh_acct_type="CC" smh_activity_type="BF" smh_rtn_code="00" smh_reason_code="0000" srp_timestamp_1=0 srp_timestamp_2=263 srp_timestamp_3=231 srp_timestamp_4=0 srp_timestamp_5=378,439 srp_timestamp_6=341 srp_timestamp_9=380,493

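Reviewing this log entry gives you information that can help pinpoint the cause, such as the SLA that was exceeded (80ms in this example), the elapsed time (elapsed), the transaction ID (cmx_tran_id), and the individual SRP timestamps (srp_timestamp_N).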
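As a rough illustration, the fields in an SLA_NOTICE entry can be pulled apart with a short script. This is a minimal sketch, not a SAS-supplied tool: the function name parse_sla_notice and the output keys sla_ms and elapsed_ms are hypothetical, and the parsing assumes that every entry follows the layout of the sample above (key=value pairs, with thousands separators in the SRP values).

```python
import re

# Sample entry copied from the article.
SAMPLE_ENTRY = (
    '2025-04-22T14:13:36,449 [Engine  3] INFO  SLA_NOTICE 80ms SLA exceeded '
    'elapsed=380.6ms cmx_tran_id=0032039680429853 smh_acct_type="CC" '
    'smh_activity_type="BF" smh_rtn_code="00" smh_reason_code="0000" '
    'srp_timestamp_1=0 srp_timestamp_2=263 srp_timestamp_3=231 '
    'srp_timestamp_4=0 srp_timestamp_5=378,439 srp_timestamp_6=341 '
    'srp_timestamp_9=380,493'
)

# key=value pairs; a value is either a quoted string or a run of non-space characters.
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')
# The SLA threshold that precedes "SLA exceeded".
HEADER_RE = re.compile(r'SLA_NOTICE\s+(\d+)ms SLA exceeded')


def parse_sla_notice(line):
    """Return a dict of fields from an SLA_NOTICE entry, or None for any other line."""
    header = HEADER_RE.search(line)
    if header is None:
        return None
    fields = {"sla_ms": int(header.group(1))}  # hypothetical key name
    for key, raw in FIELD_RE.findall(line):
        value = raw.strip('"')
        if key.startswith("srp_timestamp_"):
            # Assumption: commas are thousands separators, as in 378,439.
            fields[key] = int(value.replace(",", ""))
        elif key == "elapsed":
            number = value[:-2] if value.endswith("ms") else value
            fields["elapsed_ms"] = float(number)  # hypothetical key name
        else:
            fields[key] = value  # keep IDs and codes as strings (leading zeros matter)
    return fields
```

Calling parse_sla_notice(SAMPLE_ENTRY) returns a dictionary in which the SRP values are integers and the transaction ID remains a string, which makes it easier to sort or filter a large number of entries.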

In real-life use cases, high SRP5 times have been observed as a major cause of timeouts. These timeouts typically occur when reading from the MEH database is very slow. As a result, engine threads become unavailable to process new transactions, and some transactions time out while waiting for an available engine thread (srp_timestamp_2).

High SRP5 is caused by MEH contention, either due to a slow-performing MEH database or a High Velocity Low Cardinality (HVLC) scenario, where multiple transactions are competing for the same User Defined Segment key(s) within a short time frame. An example is multiple transactions with the same account number or customer number.
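To see whether SRP5 or the wait for an engine thread dominates across many timeouts, the SLA_NOTICE entries can be tallied by their largest SRP value. The sketch below rests on two assumptions that the sample entry suggests but this article does not state: each srp_timestamp_N holds the time attributed to that step, and srp_timestamp_9 is an overall total (it is close to the elapsed value in the sample), so it is excluded. The function names are hypothetical.

```python
import re
from collections import Counter

# Captures the step number and its value, e.g. ("5", "378,439").
SRP_RE = re.compile(r'srp_timestamp_(\d+)=([\d,]+)')


def dominant_stage(line, total_stage=9):
    """Return (step_number, value) for the largest SRP value in one entry.

    Assumption: srp_timestamp_<total_stage> is an overall total and is ignored.
    Returns None if the line carries no other SRP values.
    """
    stages = {
        int(number): int(value.replace(",", ""))
        for number, value in SRP_RE.findall(line)
    }
    stages.pop(total_stage, None)
    if not stages:
        return None
    number = max(stages, key=stages.get)
    return number, stages[number]


def tally_dominant_stages(lines):
    """Count, per SRP step, how many SLA_NOTICE entries it dominated."""
    counts = Counter()
    for line in lines:
        if "SLA_NOTICE" not in line:
            continue
        result = dominant_stage(line)
        if result is not None:
            counts[result[0]] += 1
    return counts
```

A tally weighted heavily toward step 5 would be consistent with the MEH contention described above, while a mix of steps 5 and 2 would be consistent with transactions waiting for an engine thread.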

Network latency, in addition to MEH contention, might also be reflected in the INTERVAL_STATS.

Workarounds

If SRP timestamp details are not present in the logs, include them by completing the steps in SAS KB0041973, "Introduction to SRP timestamps and how to configure them." 

For poor MEH performance, consult your DBA to review the health of the MEH database. Checking the database’s health does not just mean looking for ERRORs or WARNINGs. You need to examine how the MEH database was performing during periods with a high number of SLA violations and compare that with periods where SLA violations were minimal or absent. The DBA or database vendor might be able to suggest tuning or resource allocation adjustments to improve performance.
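To give the DBA concrete time windows to compare, the SLA_NOTICE entries can be counted per minute. This is a minimal sketch that assumes every entry starts with a timestamp in the same format as the sample entry (for example, 2025-04-22T14:13:36,449). The function name is hypothetical.

```python
from collections import Counter
from datetime import datetime


def sla_notices_per_minute(lines):
    """Return a Counter that maps each minute to its number of SLA_NOTICE entries."""
    counts = Counter()
    for line in lines:
        if "SLA_NOTICE" not in line:
            continue
        stamp = line.split(" ", 1)[0]
        try:
            # The fractional part (",449") is parsed as a fraction of a second.
            when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S,%f")
        except ValueError:
            continue  # skip lines that do not start with the expected timestamp
        counts[when.replace(second=0, microsecond=0)] += 1
    return counts
```

Passing the lines of ode.log to this function and sorting the result by count highlights the busiest minutes, which can then be lined up against the database's own performance metrics for the same periods.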

For the HVLC scenario, work with the rule writers at your organization to see whether the way the rules reference User Defined Variables (UDVs) can be optimized. Optimized rules can help reduce the impact of HVLC.

To identify the segment(s) causing contention, check the ODE's INTERVAL_STATS for the string:

MEHCORE.Fetch.LocalLock
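Searching for that string is a plain text match, so any search tool works. The sketch below shows one way to do it in Python. It makes no assumption about the layout of the INTERVAL_STATS output, which this article does not show, and only returns the matching lines with their line numbers. The function name is hypothetical.

```python
def find_local_lock_lines(lines, marker="MEHCORE.Fetch.LocalLock"):
    """Return (line_number, text) for every line that contains the marker string."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if marker in line
    ]
```

The function accepts any iterable of lines, such as an open file handle for the log that contains the INTERVAL_STATS output.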

For network latency, consult your network team about improving performance and addressing any intermittent network glitches so that there is no slowness between the ODE server and the MEH database.