Monday, November 12, 2007

Oracle 11g Advanced Compression

Oracle 11g introduces a new feature that offers organizations the promise of tables that take less space, and therefore a smaller database. A smaller database takes less disk space, which in turn lowers the cost of disk storage for databases.

With database sizes continually growing at an alarming rate, the ability to increase the amount of data stored per gigabyte is exciting.

Oracle first introduced compression in 9i, where you could compress tables upon creation via operations like CREATE TABLE AS SELECT or direct-load operations. This is well suited for initial loads of data, but keeping the data compressed required ongoing maintenance. With pressure to increase the availability of databases, that compression was not well suited to our OLTP systems, since most data is not direct loaded. Oracle’s introduction of Advanced Compression changes that. Keep in mind, though, that Advanced Compression is a new option with 11g and therefore carries additional license requirements.
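To make the difference concrete, here is a minimal sketch (using a hypothetical ORDERS table) contrasting the 9i-style table compression with the 11g syntax used later in this test:

-- 9i-style compression: rows are compressed only by direct-path
-- operations such as CREATE TABLE AS SELECT or direct-path loads;
-- later conventional inserts are stored uncompressed.
create table orders_basic COMPRESS
as select * from orders ;

-- 11g Advanced Compression (separately licensed): rows remain
-- compressed through conventional-path DML as well.
create table orders_oltp COMPRESS FOR ALL OPERATIONS
as select * from orders ;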

Oracle boasts 2-3x disk space savings with minimal DML overhead.

Is that really true? Can we expect a table to be half the size it is today uncompressed? If so, will it stay at half the size over time?

Also, how badly does compression affect my read and write operations? Oracle claims that read operations are not impacted and that write operations see only a minimal impact.

I did a small test to see if I could, at least at some level, quickly get some answers to those questions.

In my test I create a compressed table with a CREATE TABLE AS SELECT, much as I would when creating a compressed table and loading it with initial data in 9i, and I create an uncompressed table with the same initial data load. Then I load more data with an INSERT ... SELECT, looping through that insert up to 30 times and measuring the table size and load time along the way.

I have included the script I used and the output from an execution with the load operation looping 30 times. Keep in mind that this is a small test, meant to get familiar with the feature and form some initial thoughts.

Script used in testing:

-------------------------------------------------
-- script: test_11g_compression.sql
-- author: Michael Messina
--
-- description: Script that will create tables
-- with and without compression
-- load tables with data and
-- show sizes and timings
-------------------------------------------------
-- spool to a log file for us
spool test_11g_compression.log

-- Let's drop the tables for our test
drop table test_compression_compressed ;
drop table test_compression_uncompressed ;

-- Let's see how long it takes to create the tables
-- and load the data for compressed and
-- uncompressed versions.
set timing on

-- create a table compressed and load with data
create table test_compression_compressed
COMPRESS FOR ALL OPERATIONS
as
select * from dba_tables ;

-- create a table and load with data uncompressed
create table test_compression_uncompressed
as
select * from dba_tables ;

set timing off

-- Let's see how large our compressed table is
select sum(bytes)/1024 from dba_segments where owner = 'SYSTEM' and segment_name = 'TEST_COMPRESSION_COMPRESSED' ;

-- Let's see how large our uncompressed table is
select sum(bytes)/1024 from dba_segments where owner = 'SYSTEM' and segment_name = 'TEST_COMPRESSION_UNCOMPRESSED' ;

-----------------------------------------------------------------------------------------------------------------------
set timing on

-- Now let's insert many more records into the compressed table and show timing
begin
  -- conventional-path inserts so the new rows go through normal DML
  -- rather than a direct load
  for l in 1 .. 30 loop
    insert into test_compression_compressed select * from dba_tables ;
  end loop ;
end ;
/


-- Now let's insert many more records into the uncompressed table and show timing
begin
  -- same conventional-path load against the uncompressed table
  for l in 1 .. 30 loop
    insert into test_compression_uncompressed select * from dba_tables ;
  end loop ;
end ;
/

set timing off

-- Let's see how large our compressed table is
select sum(bytes)/1024 from dba_segments where owner = 'SYSTEM' and segment_name = 'TEST_COMPRESSION_COMPRESSED' ;

-- Let's see how large our uncompressed table is
select sum(bytes)/1024 from dba_segments where owner = 'SYSTEM' and segment_name = 'TEST_COMPRESSION_UNCOMPRESSED' ;

-- Turn the spool off
spool off


Results from script execution:

SQL> @test_11g_compression.sql

Table dropped.


Table dropped.


Table created.

Elapsed: 00:00:00.48

Table created.

Elapsed: 00:00:00.39

SUM(BYTES)/1024
---------------
192


SUM(BYTES)/1024
---------------
704


PL/SQL procedure successfully completed.

Elapsed: 00:00:09.17

PL/SQL procedure successfully completed.

Elapsed: 00:00:04.79

SUM(BYTES)/1024
---------------
9216


SUM(BYTES)/1024
---------------
20480

SQL>



Query performance results:

Uncompressed table:

SQL> alter system flush buffer_cache ;

System altered.

Elapsed: 00:00:00.40
SQL> select count(*) from test_compression_uncompressed ;

COUNT(*)
----------
80767

Elapsed: 00:00:00.60
SQL> alter system flush buffer_cache ;

System altered.

Elapsed: 00:00:00.07
SQL> select count(*) from test_compression_uncompressed ;

COUNT(*)
----------
80767

Elapsed: 00:00:00.57
SQL> alter system flush buffer_cache ;

System altered.

Elapsed: 00:00:00.07
SQL> select count(*) from test_compression_uncompressed ;

COUNT(*)
----------
80767

Elapsed: 00:00:00.61
SQL>


Compressed table:

SQL> alter system flush buffer_cache ;

System altered.

Elapsed: 00:00:00.06
SQL> select count(*) from test_compression_compressed ;

COUNT(*)
----------
80766

Elapsed: 00:00:00.40
SQL> alter system flush buffer_cache ;

System altered.

Elapsed: 00:00:00.09
SQL> select count(*) from test_compression_compressed ;

COUNT(*)
----------
80766

Elapsed: 00:00:00.39
SQL>


Autotrace results from full table scan reads:

SQL> set autotrace traceonly
SQL> set timing on
SQL>
SQL> alter system flush buffer_cache ;

System altered.

Elapsed: 00:00:00.03
SQL> select * from test_compression_compressed ;

80766 rows selected.

Elapsed: 00:00:01.93

Execution Plan
----------------------------------------------------------
Plan hash value: 1423006034

-------------------------------------------------------------------------------------------------
| Id  | Operation          | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                             | 57433 |   26M |   311   (1)| 00:00:04 |
|   1 |  TABLE ACCESS FULL | TEST_COMPRESSION_COMPRESSED | 57433 |   26M |   311   (1)| 00:00:04 |
-------------------------------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1156 consistent gets
1114 physical reads
0 redo size
3523277 bytes sent via SQL*Net to client
821 bytes received via SQL*Net from client
42 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
80766 rows processed

SQL>
SQL> alter system flush buffer_cache ;

System altered.

Elapsed: 00:00:00.04
SQL> select * from test_compression_uncompressed ;

80767 rows selected.

Elapsed: 00:00:01.84

Execution Plan
----------------------------------------------------------
Plan hash value: 3640389637

---------------------------------------------------------------------------------------------------
| Id  | Operation          | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                               | 92920 |   43M |   690   (1)| 00:00:09 |
|   1 |  TABLE ACCESS FULL | TEST_COMPRESSION_UNCOMPRESSED | 92920 |   43M |   690   (1)| 00:00:09 |
---------------------------------------------------------------------------------------------------

Note
-----
- dynamic sampling used for this statement


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2559 consistent gets
2513 physical reads
0 redo size
3557685 bytes sent via SQL*Net to client
821 bytes received via SQL*Net from client
42 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
80767 rows processed

SQL>
SQL> set timing off
SQL>


Summary:

As you can see from my very small test, the compression savings are a little more than 50%, and they appeared to be consistent with loads at 10 times, 20 times, and 30 times. This shows that Oracle maintains the compression consistently through conventional inserts.
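One way to watch the ratio as the load progresses, rather than only before and after, is a variation on the load loop along these lines (a sketch, not part of my original script, again assuming the tables live in the SYSTEM schema):

set serveroutput on

declare
  l_kb number ;
begin
  for l in 1 .. 30 loop
    insert into test_compression_compressed select * from dba_tables ;

    -- report the compressed segment size after each batch insert
    select sum(bytes)/1024 into l_kb
    from dba_segments
    where owner = 'SYSTEM'
    and segment_name = 'TEST_COMPRESSION_COMPRESSED' ;

    dbms_output.put_line('after load '||l||': '||l_kb||' KB') ;
  end loop ;
end ;
/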

If you look at the load times, however, you can clearly see the impact the compression has. At 10, 20, and 30 iterations the load into the compressed table consistently took about twice as long as the load into the uncompressed table.

At very low volumes this impact may not be noticeable, but for a large load the impact on runtime may be significant. This is especially true if you have a short window in which to get your data loaded.

Overall, the compression write overhead is light enough that small transactional activity could handle it; larger loads, however, might pose a problem.
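As a rough follow-up I did not run here, the per-transaction overhead could be sampled with single-row conventional inserts against both tables, for example:

set timing on
insert into test_compression_compressed select * from dba_tables where rownum = 1 ;
insert into test_compression_uncompressed select * from dba_tables where rownum = 1 ;
commit ;
set timing off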

The disk space savings can be huge for large tables; a savings of 50% is big when you are talking about very large tables and very large databases.

A quick look at query performance with full table scans shows little impact to overall performance. It does appear, however, that on the compressed table roughly half as many blocks are visited as on the non-compressed table. With half the block visits I would expect a slight performance advantage to go to the compressed table, yet the results do not bear that out.
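The consistent gets difference can also be sanity-checked against the space actually allocated to each segment; a query like the following (again assuming the SYSTEM schema) should show roughly the same two-to-one ratio in blocks:

select segment_name, blocks, bytes/1024 kb
from dba_segments
where owner = 'SYSTEM'
and segment_name in ('TEST_COMPRESSION_COMPRESSED', 'TEST_COMPRESSION_UNCOMPRESSED') ;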

Another interesting difference is in the explain plan for the full table scan. The cost of the query is lower for the compressed table than for the uncompressed table, and the cost difference may be entirely due to the difference in the number of I/O operations. This would also lead you to believe that the scan on the compressed table might have a slight performance advantage, yet the results do not back that assumption up.

My small test mainly showed me that I need to do more extensive testing, but the initial results do show that there are advantages to this new feature, and further testing will be needed to determine how best to apply it.
