Lookup data for city geocoding with PROC GEOCODE


The default lookup data set for U.S. City geocoding is Mapsgfk.Uscity_all.

If the SAS/GRAPH® Mapsgfk.Uscity_all data set is not installed, the Geocode procedure will issue the following error:

ERROR: The default lookup data set MAPSGFK.USCITY_ALL was not found. You can specify an alternate data set for CITY geocoding with the LOOKUPCITY= option.

To circumvent the problem, install the Mapsgfk.Uscity_all data set or download it from Mapsonline if SAS/GRAPH is licensed and installed.

Otherwise, specify an alternate city lookup data set, such as Sashelp.Zipcode, in the Lookupcity= option on the PROC GEOCODE statement. 

For best results, make sure you are using the latest version of the Sashelp.Zipcode data set.

For information on updating the Sashelp.Zipcode data set, refer to SAS KB0040528, "How to replace the Sashelp.Zipcode data set".

 

The Sashelp.Zipcode data set City variable contains one city per ZIP code, the city in which the U.S. Post Office is located.

However, multiple cities can be associated with the same ZIP code. In these circumstances, the Alias_City variable in the Sashelp.Zipcode data set contains a list of those cities.

To obtain a city match for the alternate cities, modify the Sashelp.Zipcode data set to create separate observations for each city listed in the Alias_City variable, where the City variable contains the alternate city name and the City2 variable contains the normalized version of the city name.

The code below shows how to do this. Note that you must have Write access to the SASHELP library to make changes to the Sashelp.Zipcode data set.

/* create a copy of the original Sashelp.Zipcode data set */

proc datasets lib=sashelp nolist;

  change zipcode=zipcode_original;

run;

quit;

/* Create a modified version with City and City2 from the

   Alias_City variable */

data Sashelp.Zipcode(label="Modified SASHELP.ZIPCODE data");

  set Sashelp.Zipcode_original;

  output;

  if alias_city ne '' then do;

    number=countw(alias_city,'||');

    do i=1 to number;

      city=scan(alias_city,i,'||');

      city2=upcase(compress(city,,'kad'));

      output;

    end;

  end;

run; 

/* Create the data set indexes for faster geocoding */

proc datasets lib=sashelp nolist;

  modify zipcode;

  index create zip;

  index create city2_statecode =(CITY2 STATECODE);

  index create city2_statename2=(CITY2 STATENAME2);

run;

quit;

 

Add the Lookupcity=Sashelp.Zipcode option on the Proc Geocode statement and run your code.

Once you have completed your city geocoding, remove the modified Sashelp.Zipcode data set and rename the backup copy to Sashelp.Zipcode using the following code:

/* delete the modified data set and rename the copy back to ZIPCODE */

proc datasets lib=sashelp nolist;

  delete zipcode;

  run;

  /* rename the imported data set to zipcode */

  change zipcode_original=zipcode;

run;

quit;