Incremental statistics reduce the time to gather global statistics for a partitioned table. Really? In some situations it may happen that incremental statistics slow down statistics calculation dramatically. An example of a real project in Oracle 12.1 and how it can be improved with Oracle 12.2.
Gathering global statistics of a partitioned table is time-consuming. Normally, all partitions of a (large) partitioned table must be scanned to calculate the statistics on table level. To reduce this time, incremental statistics were introduced with Oracle 11g. If this feature is enabled, additional statistics information is gathered for each partition and stored as “synopsis”. To (re-)calculate the global statistics, only the synopses must be read instead of the whole table. A good description of this concept can be found in the blog post Efficient Statistics Maintenance for Partitioned Tables Using Incremental Statistics – Part 1 of Nigel Bayliss.
Incremental statistics are especially useful for environments where only a few partitions are changed at a time, but many partitions exist per table. This is usually the case in data warehouses where new data is loaded into a current partition. When the statistics on partition-level are gathered for the current partition, the global statistics are updated using the synopsis information of all the historical partitions and the current partition.
In a large data warehouse at one of my customers, we decided to introduce incremental statistics. Most of the tables are INTERVAL partitioned by date, containing daily partitions. We enabled interval partitioning for all of the 891 partitioned tables and started the DBMS_STATS.GATHER_SCHEMA_STATS procedure. The job was running for two weeks(!). While it was quite efficient at the beginning, it became slower and slower with every table. Finally, when all partition-level and global statistics were up-to-date, I thought the time-consuming part is over now. The initial effort was quite high, but now we are able to benefit from the synopses that were calculated during this long time.
This was a wrong assumption: Whenever we started a job to gather statistics for an individual partition (even if the partition contained only a small number of rows), it took around 40 minutes to calculate the synopsis information for this partition. What was going on here? For this, let’s have to look under the hood of DBMS_STATS.
A Look Under the Hood of DBMS_STATS
The synopsis information used for incremental statistics is stored in the SYSAUX tablespace in the two tables SYS.WRI$_OPTSTAT_SYNOPSIS_HEAD$ and SYS.WRI$_OPTSTAT_SYNOPSIS$. Since Oracle 12.1, the detail table WRI$_OPTSTAT_SYNOPSIS$ is partitioned using composite LIST-HASH partitioning.
On the customer database (Oracle Database 220.127.116.11 Enterprise Edition, around 24 TB of data), the head table WRI$_OPTSTAT_SYNOPSIS_HEAD$ contains about 13 million rows, the detail table WRI$_OPTSTAT_SYNOPSIS$ about 10 billion rows. The two tables together use about 320 GB of disk space in the SYSAUX tablespace. The reason for this high amount of synopsis data is the number of partitions in this database. All 891 partitioned tables together contained 325851 partitions. This was the state at end of November, about one month ago. Because new partitions are created every day for most of the tables, there will be a few thousand more now.
On the list of long-running SQL statements, the following query was always one of the most time-consuming statements. It seems that the query is executed very often (for each table or even each partition?) as part of the incremental statistics calculation. On a “normal” database, this is not critical, but on the 13 million rows of the synopsis table, each execution was running for several minutes. As far as I could see (without knowing the internal code of the DBMS_STATS package), this seems to be the “bad guy” that wasted most of the time.
SELECT DISTINCT BO#, GROUP#
FROM SYS.WRI$_OPTSTAT_SYNOPSIS_HEAD$ H
WHERE GROUP# <> 0
AND GROUP# NOT IN
(SELECT T.OBJ# * 2 FROM SYS.TABPART$ T
SELECT T.OBJ# * 2 FROM SYS.TABCOMPART$ T
It would be possible to rewrite this SQL statement to run faster, but of course this was not an option. I’m not able nor allowed to change the internal queries of DBMS_STATS. But at least I found out that this statement is executed only on Oracle 12.1. I did some tests on my laptop with incremental statistics on Oracle version 11.2, 12.1 and 12.2. Only in the trace file of Oracle 12.1 I found this statement.
Searching for a Solution
While looking around for a solution or a workaround, I found several notes on My Oracle Support (MOS), for example the document Gather_Database_Stats_Job_Proc Taking More Time in 18.104.22.168 Than 22.214.171.124 (Doc ID 2041541.1). But the more documents I browsed, the more I got the impression that there are many bugs on 126.96.36.199 with incremental statistics.
In the blog post Incremental Statistics Collection in Oracle 188.8.131.52 – A True Story of Mike Dietrich, I read that there are really several issues with incremental statistics in Oracle 184.108.40.206, but there are multiple patches and bug fixes available. Unfortunately, the referenced MOS document Things to Consider When Using Incremental Statistics (Doc ID 2107602.1) is rather confusing, and some of the patches mentioned in the document exclude each other.
Finally, we decided to solve the performance problem with a simple, but effective solution: We turned off incremental statistics, as I also mentioned on Twitter.
Instead of gathering incremental statistics after each load job, a statistics job will run now every Sunday. It gathers the partition-level statistics of all stale partitions and global statistics of tables, if required. Interestingly, this job runs much faster than the previous job with incremental statistics – although all partitions must be scanned now to gather the global statistics.
I’m not very happy with this (temporary) workaround. I still hope to find a better and more elegant way to gather global statistics on partitioned tables. Probably, Oracle 12.2 will solve our problem?
A Better Solution in Oracle 12.2?
A few days after this story, I had the chance to meet Nigel Bayliss, product manager of the Oracle optimizer, at the UKOUG Tech17 conference. He explained me that there is a new way to calculate the synopsis values in Oracle 12c Release 2. Instead of storing a lot of detail data in table WRI$_OPTSTAT_SYNOPSIS$ to retrieve the number of distinct values for each column, a HyperLogLog algorithm is used to calculate an approximate number of distinct values. If only this new algorithm is used, the (large) synopsis detail table will not be used anymore. It is still in place for backward compatibility. With the new DBMS_STATS preference APPROXIMATE_NDV_ALGORITHM, the method to calculate synopses can be defined. The default REPEAT OR HYPERLOGLOG uses the old method for existing tables, but the HyperLogLog algorithm for all new tables. If the preference is switched to HYPERLOGLOG, much less synopsis data is stored in the SYSAUX tablespace. An overview of this new way to calculate synopses for incremental statistics he described in his blog post Efficient Statistics Maintenance for Partitioned Tables Using Incremental Statistics – Part 3. Additionally, a set of demo scripts is available on GitHub.
I played around with these demo scripts and did some tests with the DBMS_STATS preference APPROXIMATE_NDV_ALGORITHM on my laptop. At least, the new method to calculate incremental statistics looks promising, and the “bad statement” described above seems to be gone. But currently I can’t say (yet) whether it will help to solve the performance issue of my customer’s database. Hopefully, my customer will decide to migrate to Oracle 12c Release 2 soon. Then I’m looking forward to enable incremental statistics again.