Frequently asked questions about preparing your site to use UTF-8 SAS® session encoding


This SAS KB article covers what you need to consider when you are deciding whether to run your SAS programs with UTF-8 SAS session encoding. UTF-8 is an encoding form of the Unicode standard.

Why do I want to use UTF-8 encoding?

How do I know whether my SAS session already uses UTF-8 encoding?

If you are unsure whether SAS is already running in UTF-8 encoding, look in the SAS log after submitting the following code:

proc options option=encoding;
run;
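As a possible alternative to reading the PROC OPTIONS output, the session encoding can also be written to the log directly. The following is a minimal sketch that uses the SYSENCODING automatic macro variable and the GETOPTION function. In a UTF-8 session, both lines should report a UTF-8 value.

```sas
/* Write the current session encoding to the SAS log in two ways. */
%put NOTE: Session encoding from SYSENCODING is &sysencoding;
%put NOTE: Session encoding from the ENCODING option is %sysfunc(getoption(encoding));
```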

What issues could I encounter with the use of UTF-8?

 

   ERROR: Some character data was lost during transcoding in the dataset libref.data-set-name.
   NOTE: The data step has been abnormally terminated. 


   Some character data was lost during transcoding in the data set libref.data-set-name. Either the 
   data contains characters that are not representable in the new encoding or truncation occurred
   during transcoding.

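   /* The CVP engine expands character variable lengths on read */
   /* so that transcoded values are less likely to be truncated. */
   libname mylib cvp 'path';
   data new;
      set mylib.wlatin1;
   run;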
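By default, the CVP engine expands character variable lengths by a factor of 1.5. If that does not leave enough room for the transcoded data, a larger factor can be requested with the CVPMULTIPLIER= option. The following is a minimal sketch. The value 2.5 is only an illustrative choice, and 'path' and wlatin1 are the same placeholders used in the preceding example.

```sas
/* Read through the CVP engine and expand character lengths by 2.5x. */
/* The value 2.5 is illustrative; choose one that suits your data.   */
libname mylib cvp 'path' cvpmultiplier=2.5;

data new;
   set mylib.wlatin1;
run;

/* Review the expanded lengths and the encoding of the new data set. */
proc contents data=new;
run;
```

A larger multiplier reduces the risk of truncation but also increases the storage that each character variable uses, so it is worth checking the PROC CONTENTS output before you settle on a value.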

The warning is as follows:

WARNING: The destination buffer size was not sufficient for the transcoded data


To prevent the warning, you can do one of the following:

PROC CIMPORT in SAS® Viya® 3.5 has new options that enable you to specify a multiplier for character columns and to automatically expand the size of formats.
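The following sketch shows how such an import might look. It assumes that the options in question are EXTENDVAR= (the multiplier for character columns) and EXTENDFORMAT= (format expansion), so verify both names in the PROC CIMPORT documentation for your release. The fileref, the libref, and both paths are hypothetical placeholders.

```sas
/* Hypothetical file and library locations; replace with your own. */
filename tranfile 'transport-file-path';
libname target 'path';

/* Assumed options: EXTENDVAR= multiplies character column lengths, */
/* and EXTENDFORMAT=YES widens the associated formats to match.     */
proc cimport infile=tranfile library=target extendvar=1.5 extendformat=yes;
run;
```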
 

I want my data sets to all be in UTF-8 encoding. How do I do that?
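One possible approach, sketched here under stated assumptions, is to start a SAS session that uses UTF-8 encoding, read the existing library through the CVP engine, and copy the data sets with PROC COPY and the NOCLONE option so that the copies take the session encoding. The librefs inlib and outlib and both paths are hypothetical placeholders, and wlatin1 is the data set name from the earlier CVP example.

```sas
/* Run this in a SAS session that uses UTF-8 encoding. */
libname inlib cvp 'source-path';   /* hypothetical location of the existing data */
libname outlib 'target-path';      /* hypothetical location for the UTF-8 copies */

/* NOCLONE lets the copies take the session encoding instead of */
/* inheriting the encoding of the source data sets.             */
proc copy in=inlib out=outlib noclone;
   select wlatin1;
run;

/* Check the Encoding field in the output to confirm the result. */
proc contents data=outlib.wlatin1;
run;
```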

Do I need to modify existing programs?

Resources

Bales, Elizabeth, and Wei Zheng. 2017. “SAS® and UTF-8: Ultimately the Finest. Your Data and Applications Will Thank You!” Proceedings of the SAS Global Forum 2017 Conference. Cary, NC: SAS Institute Inc. http://support.sas.com/resources/papers/proceedings17/SAS0296-2017.pdf.

Bouedo, Mickaël. 2020. “The SAS® encoding journey: A byte at a time.” Proceedings of the SAS Global Forum 2020 Conference. Cary, NC: SAS Institute Inc. https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2020/4561-2020.pdf.

Carlton, Jody. 2018. “A transcoding story (or, How Oliver S. Füßling lost his last name and comes to find it again).” Cary, NC: SAS Institute Inc. https://blogs.sas.com/content/sgf/2018/06/22/a-transcoding-story-or-how-oliver-s-fusling-lost-his-last-name-and-comes-to-find-it-again/.

Carlton, Jody. 2017. “Demystifying and resolving common transcoding problems.” Cary, NC: SAS Institute Inc. https://blogs.sas.com/content/sgf/2017/05/19/demystifying-and-resolving-common-transcoding-problems/.

Lawhorn, Bari. 2014. “Encoding: helping SAS speak your language.” Cary, NC: SAS Institute Inc. https://blogs.sas.com/content/sgf/2014/09/26/encoding-helping-sas-speak-your-language/.

SAS Institute Inc. 2019. Migration Focus Area. Cary, NC: SAS Institute Inc. http://support.sas.com/rnd/migration/index.html.

SAS Institute Inc. 2019. SAS® 9.4 National Language Support (NLS): Reference Guide, Fifth Edition. Cary, NC: SAS Institute Inc. https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=nlsref&docsetTarget=titlepage.htm&locale=en.

SAS Institute Inc. 2018. Migrating Data to UTF-8 for SAS® Viya® 3.4. Cary, NC: SAS Institute Inc. https://go.documentation.sas.com/?docsetId=viyadatamig&docsetTarget=p1e9huvrtpq0upn1jjht4vn2gctb.htm&docsetVersion=3.4&locale=en.

Xie, Edwin (You). 2020. “Your data will go on: Practice for character data migration.” Proceedings of the SAS Global Forum 2020 Conference. Cary, NC: SAS Institute Inc. https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2020/4195-2020.pdf.