Understanding the Result Set Size from a query on a SAS OLAP cube


The following information describes how the structure of a SAS OLAP
cube and the report design can impact the result set size and therefore
the memory usage of a query.

Suppose a cube has the following structure. Each dimension contains a
single level. The cardinality of the level for this query is included
next to the dimension name.

   Dimension A - 45   unique members
   Dimension B - 60      "      "
   Dimension C - 20      "      "
   Dimension D - 10      "      "
   Dimension E - 15      "      "

   Measures - 10 measures

First design a report with the five dimensions placed on rows and the
measures placed on columns. The set size can be calculated with this
formula:

  Axis Set Size =
     Number of Dimensions *
        (Cardinality of Dimension 1 *
         Cardinality of Dimension 2 * ... *
         Cardinality of Dimension N)

  Result Set Size = Axis 1 Set Size + Axis 2 Set Size + ... + Axis N Set
  Size

The set for this report would then be:

  Row Set Size = 5 * (45 * 60 * 20 * 10 * 15) = 40,500,000 tuples
  Column Set Size = 1 * (10) = 10 tuples

  Result Set Size = 40,500,000 + 10 = 40,500,010

However, by moving the two smallest dimensions to the columns axis, the
set size and memory usage can be significantly decreased:

  Row Set Size = 3 * (45 * 60 * 20) = 162,000 tuples
  Column Set Size = 3 * (10 * 10 * 15) = 4,500 tuples

  Result Set Size = 162,000 + 4,500 = 166,500 tuples

As you can see, proper distribution of the dimensions is essential for
an efficient query. Alternatively, if the levels do have high
cardinality, you can impose subsets by creating logical buckets. These
would enable you to navigate down into smaller portions of the data, and
limit the result set size being requested. For example, if you have user
IDs from 10000 to 99999, you might consider using logical buckets based
on several levels of detail. For example, groups of 10000, then groups
of 1000, then groups of 100, and so on.

If a measure is defined to use the NUNIQUE statistic, you must also 
consider the size of the set that is requested by this measure. The
number of unique values for the level might be the maximum set size for 
your particular query.