Using the SAS/ACCESS Interface to Spark or SAS/ACCESS Interface to JDBC LIBNAME statement to access Databricks causes Hive/Hadoop configuration errors


When you submit a SAS/ACCESS Interface to Spark or SAS/ACCESS Interface to JDBC LIBNAME statement to connect to Databricks, an error similar to one of the following might occur:

ERROR: SAS_HADOOP_JAR_PATH environment variable not set. It must be set for SAS/ACCESS to connect. See the SAS Configuration Guide for
SAS_HADOOP_JAR_PATH requirements.
ERROR: Error in the LIBNAME statement.

ERROR: Error trying to establish connection:
[Databricks] [Databricks JDBC Driver] (500051) ERROR processing query/statement. Error Code: 0, SQL state:
TStatus (statusCode: ERROR STATUS,
infoMessages: [*org.apache.hive.service.cli.HiveSQLException: Configuration sas.spark.jdbc.spark.conf.dir is not available.]

This issue is caused by a bug in the code of the SAS/ACCESS Interface to Spark and SAS/ACCESS Interface to JDBC engines. The issue is resolved in SAS® 9.4M8 (TS1M8) and later releases.
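
For orientation, the following is a minimal sketch of the general shape of a SAS/ACCESS Interface to JDBC LIBNAME statement that connects to Databricks. It is an illustration only, not the exact statement from any affected site. The libref dbx, the jar directory, the schema name, and every value in angle brackets are hypothetical placeholders. The driver class and URL format are assumed from the Databricks JDBC driver and should be checked against the driver version that you have installed.

```sas
/* Sketch only: all angle-bracket values and the classpath are placeholders. */
/* DRIVERCLASS= and URL= follow the Databricks JDBC driver conventions;      */
/* verify both against the documentation for your driver version.            */
libname dbx jdbc
   driverclass="com.databricks.client.jdbc.Driver"
   url="jdbc:databricks://<server-hostname>:443;httpPath=<http-path>;AuthMech=3;UID=token;PWD=<personal-access-token>"
   classpath="/path/to/databricks-jdbc-driver"   /* directory that holds the driver JAR file */
   schema="default";                              /* assumed schema name */
```

In an affected release, a statement of this kind can fail with the second error shown above even when the connection values themselves are correct.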

Resolution

To fix this issue in SAS® 9.4M7 (TS1M7), do as follows:

In SAS® 9.4M9 (TS1M9), SAS/ACCESS Interface to Spark is the preferred method for accessing Databricks. It provides enhanced data type fidelity and native support for bulk-load and bulk-unload operations, which improve both compatibility and performance.