Analytic Views are one of the main features for Business Intelligence introduced with Oracle 12c Release 2. They provide a lot of new possibilities for analytical queries on star schemas. This blog post gives an overview of the new functionality.
In my last blog post, I wrote about the new objects Attribute Dimensions and Hierarchies in Oracle 12.2. They are required to define additional metadata and hierarchy information for the dimension tables of a star schema. This information is needed for another new type of objects in Oracle 12c Release 2: Analytic Views.
An Analytic View is created on top of a fact table and several dimensions and hierarchies. It supports aggregations of measures on different hierarchy levels for each dimension. Additionally, different kinds of calculated measures can be defined in an Analytic View. This simplifies many queries in BI reports and OLAP tools.
The following picture shows the different types of metadata objects that are created in this context:
- For each dimension table, an Attribute Dimension is defined.
- On top of each Attribute Dimension, at least one Hierarchy is defined. A dimension may have multiple hierarchies.
- On the fact table, one or more Analytic Views are defined, each of them referring several Attributes Dimensions and Hierarchies.
Creating an Analytic View
The Analytic View created here is an example of Oracle’s Live SQL Tutorial Creating an Analytic View for the Sales History (SH) Sample Schema. The Attribute Dimension and Hierarchy for the dimension table PRODUCTS are explained in my last blog post. The complete definition of the metadata objects for all dimensions of the SH schema and the corresponding DDL statements are described in the Live SQL Tutorial.
The first version of the Analytic View SH_SALES_HISTORY_AV contains only the base measures AMOUNT_SOLD and QUANTITY_SOLD. It can be created with the following DDL statement (which I copied from the tutorial, too):
CREATE OR REPLACE ANALYTIC VIEW sh_sales_history_av
USING sh.sales
DIMENSION BY
(
sh_times_attr_dim
KEY time_id REFERENCES time_id
HIERARCHIES (sh_times_calendar_hier DEFAULT, sh_times_fiscal_hier),
sh_products_attr_dim
KEY prod_id REFERENCES prod_id
HIERARCHIES (sh_products_hier DEFAULT),
sh_customers_attr_dim
KEY cust_id REFERENCES cust_id
HIERARCHIES (sh_customers_hier DEFAULT),
sh_channels_attr_dim
KEY channel_id REFERENCES channel_id
HIERARCHIES (sh_channels_hier DEFAULT),
sh_promotions_attr_dim
KEY promo_id REFERENCES promo_id
HIERARCHIES (sh_promotions_hier DEFAULT)
)
MEASURES (
amount_sold FACT amount_sold,
quantity_sold FACT quantity_sold
)
DEFAULT MEASURE amount_sold;
The statement does not create a new table or any kind of database segment, but an additional metadata layer on top of a star schema. It consists of the following parts:
- USING: This part defines the base fact table (or view) that is used for the Analytic View. In this example, the fact table SALES of the SH schema is the base table.
- DIMENSION BY: All dimensions that are referred in the Analytic View must be defined here. Important: Not the dimension tables are named, but the Attribute Dimensions that were created before with a CREATE ATTRIBUTE DIMENSION command.
- KEY … REFERENCES: For each dimension, the dimension key of the fact table and the corresponding key in the Attribute Dimension must be specified.
- HIERARCHIES: For each dimension, one default hierarchy and optionally additional hierarchies can be specified. The Hierarchy objects have to be defined before with a CREATE HIERARCHY command.
- MEASURES: Here, the measures are defined. These can be either base measures of the fact table, as in the example above, or calculated measures, as we will see later. One of the measures can be specified as default measure. Optionally, an aggregation function can be defined. The default is SUM, which is the typical aggregation function for additive measures.
The example does not contain all options that are available. There are much more possibilities like user-defined classifications, aggregate clauses, caching specifications, etc.
Queries on an Analytic View
A query on an Analytic View looks like a regular SELECT statement at the first glance, but the syntax is slightly different. After the FROM clause, the hierarchies for the specific query must be defined. In the SELECT clause and in the WHERE condition, all attributes including additional metadata attributes (e.g. MEMBER_NAME, LEVEL_NAME, DEPTH) can be used. The attribute HIER_ORDER of each hierarchy is typically used to sort the result set.
The following query shows the amount and quantity sold per year, along the product hierarchy for product category ‘Electronics’:
SELECT sh_times_calendar_hier.member_name AS calendar_hier
, RPAD(‘ ‘, sh_products_hier.depth * 2)
||sh_products_hier.member_name AS products_hier
, amount_sold
, quantity_sold
FROM sh_sales_history_av
HIERARCHIES (sh_times_calendar_hier, sh_products_hier)
WHERE sh_times_calendar_hier.level_name = ‘CALENDAR_YEAR’
AND sh_products_hier.prod_category = ‘Electronics’
ORDER BY
sh_times_calendar_hier.hier_order
, sh_products_hier.hier_order
, sh_customers_hier.hier_order;
The result of the query returns 66 rows and does not only contain the measures per product, but also the total for the hierarchy levels ‘Subcategory’ and ‘Category’:
CALENDAR_HIER PRODUCTS_HIER AMOUNT_SOLD QUANTITY_SOLD
————- —————————————————— ———– ————-
1998 Electronics 1775688.87 9853
1998 Game Consoles 11.99 1
1998 Y Box 11.99 1
1998 Home Audio 1568900.57 2688
1998 8.3 Minitower Speaker 578374.62 1043
1998 Home Theatre Package with DVD-Audio/Video Play 990525.95 1645
1998 Y Box Accessories 73744.81 2684
1998 Extension Cable 1685.89 211
1998 Xtend Memory 72058.92 2473
1998 Y Box Games 133031.5 4480
1998 Adventures with Numbers 35771.66 2147
1998 Endurance Racing 97259.84 2333
1999 Electronics 2916369.92 24105
1999 Game Consoles 272901.18 884
1999 Y Box 272901.18 884
…
2001 Fly Fishing 16726.41 2102
2001 Martial Arts Champions 53100.3 2506
2001 Smash up Boxing 100546.16 3121
66 rows selected.
Of course, it would be possible to generate the same result with a “traditional” SQL statement, but because of the subtotals of each hierarchy level, this statement would be more complex. The following SQL query returns the same result:
WITH prod_level AS (
SELECT t.calendar_year
, p.prod_subcategory
, p.prod_category
, ‘ ‘||p.prod_name AS products_hier
, SUM(s.amount_sold) AS amount_sold
, SUM(s.quantity_sold) AS quantity_sold
, t.calendar_year||’/’||p.prod_category||’/’||p.prod_subcategory||’/’||p.prod_name AS sort_order
FROM sales s
JOIN times t ON (s.time_id = t.time_id)
JOIN products p ON (s.prod_id = p.prod_id)
WHERE p.prod_category = ‘Electronics’
GROUP BY
t.calendar_year
, p.prod_name
, p.prod_subcategory
, p.prod_category
ORDER BY
t.calendar_year
, p.prod_subcategory
, p.prod_name
)
, subcat_level AS (
SELECT calendar_year
, ‘ ‘||prod_subcategory AS products_hier
, SUM(amount_sold) AS amount_sold
, SUM(quantity_sold) AS quantity_sold
, calendar_year||’/’||prod_category||’/’||prod_subcategory AS sort_order
FROM prod_level
GROUP BY
calendar_year
, prod_category
, prod_subcategory
)
, cat_level AS (
SELECT calendar_year
, ‘ ‘||prod_category AS products_hier
, SUM(amount_sold) AS amount_sold
, SUM(quantity_sold) AS quantity_sold
, calendar_year||’/’||prod_category AS sort_order
FROM prod_level
GROUP BY
calendar_year
, prod_category
)
SELECT calendar_year
, products_hier
, amount_sold
, quantity_sold
, sort_order
FROM prod_level
UNION ALL
SELECT calendar_year
, products_hier
, amount_sold
, quantity_sold
, sort_order
FROM subcat_level
UNION ALL
SELECT calendar_year
, products_hier
, amount_sold
, quantity_sold
, sort_order
FROM cat_level
ORDER BY sort_order;
This simple example shows one advantage of Analytic Views: The complexity of the SQL statements can be reduced in ad-hoc queries, reports or generated SQL statements of OLAP tools. But how about performance? This is another story – and probably the subject of a future blog post.
Calculated Measures in Analytic Views
The Analytic View above contains only base measures, i.e. columns of the fact table. But the most powerful advantage of Analytic Views is the variety of calculated measures that can be defined in the Analytic View definition. These can be simple single row expression, e.g. total_amount AS (amount_sold * quantity_sold), but also complex windowing expressions like those you know from Analytic Functions. The following DDL statement shows how the existing Analytic View SH_SALES_HISTORY_AV can be enhanced with several calculated measures (in white color). The example is a subset of the use case in the Live SQL Tutorial – the original view definition contains some more calculated measures.
CREATE OR REPLACE ANALYTIC VIEW sh_sales_history_av
USING sh.sales
DIMENSION BY
(
sh_times_attr_dim
KEY time_id REFERENCES time_id
HIERARCHIES (sh_times_calendar_hier DEFAULT, sh_times_fiscal_hier),
sh_products_attr_dim
KEY prod_id REFERENCES prod_id
HIERARCHIES (sh_products_hier DEFAULT),
sh_customers_attr_dim
KEY cust_id REFERENCES cust_id
HIERARCHIES (sh_customers_hier DEFAULT),
sh_channels_attr_dim
KEY channel_id REFERENCES channel_id
HIERARCHIES (sh_channels_hier DEFAULT),
sh_promotions_attr_dim
KEY promo_id REFERENCES promo_id
HIERARCHIES (sh_promotions_hier DEFAULT)
)
MEASURES (
— Amount sold maps to the fact table.
amount_sold FACT amount_sold,
— Quantity sold maps to the fact table.
quantity_sold FACT quantity_sold,
— Ratio of amount sold for the current value to the parent product value.
amt_sold_shr_parent_prod AS
(SHARE_OF(amount_sold HIERARCHY sh_products_hier PARENT)),
— Ratio of amount sold for the current value to the parent customer value.
sales_shr_parent_cust AS
(SHARE_OF(amount_sold HIERARCHY sh_customers_hier PARENT)),
— Sales Calendar Year to Date
sales_cal_ytd AS (SUM(amount_sold) OVER
(HIERARCHY sh_times_calendar_hier BETWEEN UNBOUNDED PRECEDING AND CURRENT MEMBER WITHIN ANCESTOR AT LEVEL calendar_year)),
— Sales same period 1 year ago.
sales_cal_year_ago as (LAG(amount_sold) OVER
(HIERARCHY sh_times_calendar_hier OFFSET 1 ACROSS ANCESTOR AT LEVEL calendar_year)),
— Change in sales for the current period as compared to the same period 1 year ago.
sales_chg_cal_year_ago as (LAG_DIFF(amount_sold) OVER
(HIERARCHY sh_times_calendar_hier OFFSET 1 ACROSS ANCESTOR AT LEVEL calendar_year)),
— Percent change in sales for the current period as compared to the same period 1 year ago.
sales_pctchg_cal_year_ago as (LAG_DIFF_PERCENT(amount_sold) OVER
(HIERARCHY sh_times_calendar_hier OFFSET 1 ACROSS ANCESTOR AT LEVEL calendar_year))
)
DEFAULT MEASURE amount_sold;
Whereas single row expressions can also be implemented with Virtual Columns (see blog post Derived Measures and Virtual Columns), the flexibility of calculated measure expressions in an Analytic View is much higher. An overview of possible expressions with examples can be found in the Oracle Database Data Warehousing Guide 12c Release 2, chapter 26. In the Database SQL Language Reference manual, the syntax definition of Calculated Measures Expressions is explained.
Conclusion
I think for most of us, it will take some time to get experience with these new expressions and appreciate their power. This was already the case with Analytic Functions in Oracle 8i, with the Model Clause in Oracle 10g or with SQL Pattern Matching in Oracle 12c Release 1. I am sure this will not be my last blog post about Analytic Views. This powerful feature gives us a lot of new benefits in BI applications and ad-hoc queries.
Reblogged this on simple. and commented:
Oracle 12.2: Analytic Views
LikeLiked by 1 person