Tuesday, April 17, 2018

Choosing the Right Compression

Choosing the right database table compression matters because it affects both the size of your objects and the performance of your application.  Compression makes objects smaller, and for many systems that is an objective in itself, but it also affects performance.  Because compressed objects are smaller, less physical I/O is required to access them.  However, that saving is paid for in CPU: more CPU is required to compress the data when it is written, and more again (albeit usually less) to decompress it when it is read.  The question is whether the saving in I/O pays for the additional CPU.

There are a number of forms of compression to choose between, each with different characteristics: the resulting objects are different sizes, and they require different amounts of CPU to compress and decompress.
  • Basic Compression:  This is not really compression, but a form of block-level de-duplication.  It has been available since Oracle 9i.  It is available on all platforms.
  • Hybrid Columnar Compression (HCC) is only available on Engineered Systems, the FS Flash Storage System and the ZFS Storage Appliance (ZFSSA).  It offers 4 different levels of compression that use 3 different compression algorithms.  HCC is frequently positioned as being highly effective at reducing the size of data warehouses, but it also has implications for performance.  (A DDL sketch showing how each level is specified follows this list.)
    • Query Low: LZO algorithm
    • Query High: ZLIB algorithm
    • Archive Low: ZLIB algorithm at a higher level, with a larger block size.
    • Archive High: BZIP2 algorithm.  The higher compression ratio comes at the expense of significantly increased CPU overhead on decompression.
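Compression is declared at the object (or partition) level in the DDL.  As a minimal sketch, using a hypothetical table name, each level is specified like this (note that the attribute only applies to data written subsequently by direct-path operations, whereas ALTER TABLE ... MOVE rebuilds the existing rows with the new setting):

    ALTER TABLE my_ledger_tab MOVE COMPRESS BASIC;             -- basic compression
    ALTER TABLE my_ledger_tab MOVE COMPRESS FOR QUERY LOW;     -- HCC, LZO
    ALTER TABLE my_ledger_tab MOVE COMPRESS FOR QUERY HIGH;    -- HCC, ZLIB
    ALTER TABLE my_ledger_tab MOVE COMPRESS FOR ARCHIVE LOW;   -- HCC, ZLIB (higher level)
    ALTER TABLE my_ledger_tab MOVE COMPRESS FOR ARCHIVE HIGH;  -- HCC, BZIP2
    ALTER TABLE my_ledger_tab MOVE NOCOMPRESS;                 -- remove compression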
Which one is right for your application?  It depends.  Ultimately, you need to test it for yourself.  In this blog, I am going to relate my experience of one particular system, including how I tested it.  Our objective was optimising performance, rather than reducing database size.

My Test System

  • SuperCluster M7 (i.e. an Engineered System), Oracle 12.1.0.2, 2-node RAC, running PeopleSoft Financials 9.2.
  • In particular, this system runs a large number of complex PeopleSoft General Ledger nVision reports that use a number of reporting views on summary ledgers.  There are materialized views built on the reporting views.  The summary ledgers are updated nightly.  Then the materialized views are fully refreshed in non-atomic mode, so the tables are truncated and repopulated in direct-path mode.  Hence, they can be Hybrid Columnar Compressed without needing the Advanced Compression licence.  The materialized views are not indexed.  Thus they can only be full scanned, and these scans can be offloaded to the storage cells (i.e. smart scan).  A sketch of the corresponding compression and refresh commands follows this list.
The metrics presented here are extracted from a test system where the summary ledger tables are updated, the materialized views refreshed, and then 136 General Ledger (GL) report books run on a regular daily schedule.  This approach ensures that each test is run, as far as possible, with the same initial conditions, and that only known parameters are changed each time.  In each test, a different compression level was set on the materialized views.
  • I have not investigated ARCHIVE compression.
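As a minimal sketch of the refresh approach described above (the materialized view name is hypothetical, and the refresh call is simplified), the compression attribute is set on the materialized view, and a non-atomic complete refresh then truncates and repopulates it in direct-path mode, compressing the newly written rows:

    ALTER MATERIALIZED VIEW ps_xx_ledger_mv COMPRESS FOR QUERY LOW;

    BEGIN
      DBMS_MVIEW.REFRESH(list           => 'PS_XX_LEDGER_MV'
                        ,method         => 'C'     -- complete refresh
                        ,atomic_refresh => FALSE); -- truncate + direct-path insert
    END;
    /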

Writing

Description          Size (MB)   Comp. %   Comp. Ratio   Elapsed   Total DB Time
No Compression         151,868        0%           1.0      1:01           18:13
Basic Compression       34,720       77%           4.4      1:12           22:54
Query Low               22,666       85%           6.7      1:00           18:27
Query High              14,086       91%          10.8      1:05           22:56
(Materialized view sizes and refresh times; all times expressed in hours:minutes)
  • While basic compression significantly reduces the size of the tables (4.4x), Hybrid Columnar Compression does better still (6.7x – 10.8x); a query sketch for measuring segment sizes follows this list.
  • HCC Query Low has only a very small overhead when writing data.  The materialized views are refreshed in parallel, and although the total DB time is slightly higher than without compression, the elapsed time is actually lower!
  • Both basic compression and HCC Query High take about 20% longer to write the data in direct path mode.
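Segment sizes such as those in the table above can be obtained from the data dictionary.  As a minimal sketch (the schema and naming pattern are hypothetical; adjust them for your own system):

    SELECT segment_name
    ,      ROUND(SUM(bytes)/1024/1024) size_mb  -- total segment size in MB
    FROM   dba_segments
    WHERE  owner = 'SYSADM'                     -- hypothetical schema
    AND    segment_name LIKE 'PS%MV%'           -- hypothetical naming pattern
    GROUP  BY segment_name
    ORDER  BY size_mb DESC;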

Reading

                     nVision DB Time                               % of Total DB Time
Description          On CPU+     Cell Smart   Others     Total     On CPU+    Cell Smart   Others
                     CPU Wait    Table Scan                        CPU Wait   Table Scan
No Compression         201:26        97:12     18:15    316:54         64%          31%       6%
Basic Compression      466:13        66:03     31:07    563:24         83%          12%       6%
Query Low              241:13        12:34     44:57    298:45         81%           4%      15%
Query High             278:20         4:43     11:38    294:42         94%           2%       4%
(all times expressed in hours:minutes)
The nVision GL reports only read the materialized views (although they do write some temporary working storage tables).
  • As the level of compression of the materialized views increases, the time spent on cell smart table scan (i.e. physical read) decreases, mainly because the objects are smaller.
  • The slightly surprising result is that the CPU overhead of basic compression is higher than that of either form of HCC, and more than twice that of uncompressed tables.  Hence, it significantly increases the total database response time.
  • Both forms of Query HCC perform better than the uncompressed table.  The increase in CPU time is more than compensated for by the reduction in time spent on physical I/O.
  • The Query Low test was affected by factors external to the test (considerably more time was spent on other wait events).  Otherwise, it would have outperformed Query High.  Looking at just CPU and physical read time, Query Low is clearly the best option (a sketch of the underlying ASH query follows this list).
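A DB time breakdown like the one in the table above can be approximated from Active Session History.  As a minimal sketch (the module filter is hypothetical; each DBA_HIST_ACTIVE_SESS_HISTORY sample represents roughly 10 seconds of DB time):

    SELECT CASE WHEN session_state = 'ON CPU'        THEN 'On CPU+CPU Wait'
                WHEN event = 'cell smart table scan' THEN 'cell smart table scan'
                ELSE 'Others' END category
    ,      ROUND(COUNT(*)*10/3600,1) db_time_hrs   -- ~10s per AWR ASH sample
    FROM   dba_hist_active_sess_history
    WHERE  sample_time BETWEEN :test_start AND :test_end
    AND    module LIKE 'RPTBOOK%'                  -- hypothetical nVision filter
    GROUP  BY CASE WHEN session_state = 'ON CPU'        THEN 'On CPU+CPU Wait'
                   WHEN event = 'cell smart table scan' THEN 'cell smart table scan'
                   ELSE 'Others' END
    ORDER  BY db_time_hrs DESC;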
In an offloaded (i.e. smart) scan on an engineered system, decompression, as well as the physical I/O, is done in the storage cell.  The saving in time spent on physical I/O between Query Low and Query High is small because the disk subsystem performs so well on the engineered system.
On a non-engineered system, by contrast, decompression and I/O must be done by the database server; there is nowhere else to do them.  You may well find that a different level of compression is optimal there.

Conclusion

  • Basic compression exhibits a surprisingly high CPU overhead.  In my test, the performance was poorer than with uncompressed tables.
  • HCC Query Low was generally the best option; the saving in I/O more than paid for the increase in CPU.
  • I also analysed ASH data for the different tests broken down by materialized view.  I found it was better to leave smaller materialized views uncompressed but to compress the larger ones.  This is not surprising since HCC adds additional structures.
These results and conclusions come from one set of tests, on one particular application, on one particular 'engineered' hardware configuration, using one particular data set.  You need to conduct your own comparative tests and decide what is best for your own system.