Thursday, February 21, 2008

Oracle 11g REF Partitioning

Oracle REF partitioning is a new partitioning feature with the 11g version of the Oracle database. REF Partitioning is the ability to partition a table based on the foreign key parent-child relationship. In REF partitioning the partitioning key of the child table is inherited from the parent table. REF partitioning also has all partition maintenance operations that change the logical shape of the parent table into the child table. REF Partitioning also improves performance for joins between the parent and child table by enabling partition-wise joins.

Ref partitioning can not be used when the parent table is partitioned using range interval partitioning or virtual column partitioning. The following message will happen upon attempt to create the partitioned table with ref partitioning.

CREATE TABLE order_items
*
ERROR at line 1:
ORA-14659: Partitioning method of the parent table is not supported

Another oddity found in this test was that if I created a partitioned index on the parent table with parallel I could not create the child table with ref partitioning as it got an ORA-0600, however if I created the index without parallel the child table with ref partitioning created ok. To get around this issue I created the partitioned index on the parent table after creating the child table with the ref partitioning.

For the basis as an example lets use the OE.ORDERS and OE.ORDER_ITEMS so create our own versions of the tables insert records where the rows in both sets are exactly the same. This will allow the comparison of a query that could be used against both sets of tables for a comparison. The original tables from OE schema will remain unchanged. For the copies for the partitioning testing we will use a range interval partition for the orders table and a ref partition for the order_items table.

Create the tables and indexes

CREATE TABLE orders
( order_id NUMBER(12) CONSTRAINT orders_order_id_nn NOT NULL,
order_date DATE CONSTRAINT orders_order_date_nn NOT NULL,
order_mode VARCHAR2(8),
customer_id NUMBER(6) CONSTRAINT orders_customer_id_nn NOT NULL,
order_status VARCHAR2(2),
order_total NUMBER(8,2),
sales_rep_id NUMBER(6),
promotion_id NUMBER(6),
CONSTRAINT orders_order_id_pk PRIMARY KEY (order_id)
)
PARTITION BY RANGE (order_date)
--INTERVAL (NUMTOYMINTERVAL(1,'month'))
( PARTITION p_pre_1999 VALUES LESS THAN (TO_DATE('01-JAN-1999','dd-MON-yyyy')),
PARTITION p_JAN_1999 VALUES LESS THAN (TO_DATE('01-FEB-1999','dd-MON-yyyy')),
PARTITION p_FEB_1999 VALUES LESS THAN (TO_DATE('01-MAR-1999','dd-MON-yyyy')),
PARTITION p_MAR_1999 VALUES LESS THAN (TO_DATE('01-APR-1999','dd-MON-yyyy')),
PARTITION p_APR_1999 VALUES LESS THAN (TO_DATE('01-MAY-1999','dd-MON-yyyy')),
PARTITION p_MAY_1999 VALUES LESS THAN (TO_DATE('01-JUN-1999','dd-MON-yyyy')),
PARTITION p_JUN_1999 VALUES LESS THAN (TO_DATE('01-JUL-1999','dd-MON-yyyy')),
PARTITION p_JUL_1999 VALUES LESS THAN (TO_DATE('01-AUG-1999','dd-MON-yyyy')),
PARTITION p_AUG_1999 VALUES LESS THAN (TO_DATE('01-SEP-1999','dd-MON-yyyy')),
PARTITION p_SEP_1999 VALUES LESS THAN (TO_DATE('01-OCT-1999','dd-MON-yyyy')),
PARTITION p_OCT_1999 VALUES LESS THAN (TO_DATE('01-NOV-1999','dd-MON-yyyy')),
PARTITION p_NOV_1999 VALUES LESS THAN (TO_DATE('01-DEC-1999','dd-MON-yyyy')),
PARTITION p_DEC_1999 VALUES LESS THAN (TO_DATE('01-JAN-2000','dd-MON-yyyy')),
PARTITION p_JAN_2000 VALUES LESS THAN (TO_DATE('01-FEB-2000','dd-MON-yyyy')),
PARTITION p_FEB_2000 VALUES LESS THAN (TO_DATE('01-MAR-2000','dd-MON-yyyy')),
PARTITION p_MAR_2000 VALUES LESS THAN (TO_DATE('01-APR-2000','dd-MON-yyyy')),
PARTITION p_APR_2000 VALUES LESS THAN (TO_DATE('01-MAY-2000','dd-MON-yyyy')),
PARTITION p_MAY_2000 VALUES LESS THAN (TO_DATE('01-JUN-2000','dd-MON-yyyy')),
PARTITION p_JUN_2000 VALUES LESS THAN (TO_DATE('01-JUL-2000','dd-MON-yyyy')),
PARTITION p_JUL_2000 VALUES LESS THAN (TO_DATE('01-AUG-2000','dd-MON-yyyy')),
PARTITION p_AUG_2000 VALUES LESS THAN (TO_DATE('01-SEP-2000','dd-MON-yyyy')),
PARTITION p_SEP_2000 VALUES LESS THAN (TO_DATE('01-OCT-2000','dd-MON-yyyy')),
PARTITION p_OCT_2000 VALUES LESS THAN (TO_DATE('01-NOV-2000','dd-MON-yyyy')),
PARTITION p_NOV_2000 VALUES LESS THAN (TO_DATE('01-DEC-2000','dd-MON-yyyy')),
PARTITION p_DEC_2000 VALUES LESS THAN (TO_DATE('01-JAN-2001','dd-MON-yyyy'))
)
PARALLEL ;

CREATE TABLE order_items
( order_id NUMBER(12) CONSTRAINT oitems_order_id_nn NOT NULL,
line_item_id NUMBER(3) CONSTRAINT oitems_line_item_id_nn NOT NULL,
product_id NUMBER(6) CONSTRAINT oitems_product_id_nn NOT NULL,
unit_price NUMBER(8) CONSTRAINT oitems_unit_price_nn NOT NULL,
quantity NUMBER(8,2) CONSTRAINT oitems_quantity_nn NOT NULL,
sales_amount NUMBER(12,2) CONSTRAINT oitems_sales_amount_nn NOT NULL,
CONSTRAINT order_items_orders_fk
FOREIGN KEY (order_id) REFERENCES orders(order_id)
)
PARTITION BY REFERENCE (order_items_orders_fk)
PARALLEL ;

CREATE UNIQUE INDEX ORDER_ITEMS_PK
ON ORDER_ITEMS (ORDER_ID, LINE_ITEM_ID)
PARALLEL ;

-- have to create this index with parallel after the ref partition
-- table is created to avoid ORA-0600.
CREATE INDEX ORDERS_ORDER_DATE_NU
ON ORDERS (ORDER_DATE, ORDER_ID)
LOCAL
( PARTITION p_pre_1999,
PARTITION p_JAN_1999,
PARTITION p_FEB_1999,
PARTITION p_MAR_1999,
PARTITION p_APR_1999,
PARTITION p_MAY_1999,
PARTITION p_JUN_1999,
PARTITION p_JUL_1999,
PARTITION p_AUG_1999,
PARTITION p_SEP_1999,
PARTITION p_OCT_1999,
PARTITION p_NOV_1999,
PARTITION p_DEC_1999,
PARTITION p_JAN_2000,
PARTITION p_FEB_2000,
PARTITION p_MAR_2000,
PARTITION p_APR_2000,
PARTITION p_MAY_2000,
PARTITION p_JUN_2000,
PARTITION p_JUL_2000,
PARTITION p_AUG_2000,
PARTITION p_SEP_2000,
PARTITION p_OCT_2000,
PARTITION p_NOV_2000,
PARTITION p_DEC_2000
)
PARALLEL ;
-- Populate the new partitioned tables with the same data
-- from the original tables
insert into orders select * from oe.orders ;
commit ;

insert into order_items
select order_id, line_item_id, product_id, unit_price, quantity, (unit_price*quantity) sales_amount
from oe.order_items ;
commit;

execute dbms_stats.gather_table_stats ('MRMESSIN','ORDERS') ;
execute dbms_stats.gather_table_stats ('MRMESSIN','ORDER_ITEMS') ;


Test the Outcome

Execution One

SQL> set autotrace on
SQL> SELECT TO_DATE(to_char(o.order_date, 'DD-MON-YYYY'), 'DD-MON-YYYY') order_date, sum(oi.sales_amount) sum_sales
2 FROM oe.orders o,
3 oe.order_items oi
4 WHERE o.order_id = oi.order_id
5 AND o.order_date BETWEEN TO_DATE('01-JAN-1999','DD-MON-YYYY')
6 AND TO_DATE('01-MAY-1999','DD-MON-YYYY')
7 GROUP BY order_date
8 ORDER BY order_date ;

ORDER_DAT SUM_SALES
--------- ----------

26-FEB-99 3322700.8
27-FEB-99 526310.4
11-MAR-99 1688284.8
12-MAR-99 2330752
13-MAR-99 952000

20-MAR-99 61651.2
20-MAR-99 1010368
29-MAR-99 345427.2
29-MAR-99 12288
11-APR-99 52352

10 rows selected.

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------

Plan hash value: 795276733

------------------------------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
------------------------------------------------------------------------------------------------------
0 SELECT STATEMENT 1 24 6 (17) 00:00:01
1 SORT ORDER BY 1 24 6 (17) 00:00:01
2 SORT GROUP BY NOSORT 1 24 6 (17) 00:00:01
* 3 FILTER
4 NESTED LOOPS
5 NESTED LOOPS 2 48 5 (0) 00:00:01
6 TABLE ACCESS BY INDEX ROWID ORDERS 1 15 2 (0) 00:00:01
* 7 INDEX RANGE SCAN ORD_ORDER_DATE_IX 1 1 (0) 00:00:01
* 8 INDEX RANGE SCAN ITEM_ORDER_IX 6 1 (0) 00:00:01
9 TABLE ACCESS BY INDEX ROWID ORDER_ITEMS 6 54 3 (0) 00:00:01
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):

---------------------------------------------------
3 - filter(TIMESTAMP' 1999-01-01 00:00:00'<=TIMESTAMP' 1999-05-01 00:00:00')
7 - access("O"."ORDER_DATE">=TIMESTAMP' 1999-01-01 00:00:00' AND "O"."ORDER_DATE"<=TIMESTAMP' 1999-05-01 00:00:00')

8 - access("O"."ORDER_ID"="OI"."ORDER_ID")

Statistics
----------------------------------------------------------

0 recursive calls
0 db block gets
875 consistent gets

0 physical reads
0 redo size
667 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
10 rows processed


Execution Two

SQL> SELECT o.order_date, sum(oi.sales_amount) sum_sales
2 FROM orders o,
3 order_items oi
4 WHERE o.order_date BETWEEN TO_DATE('01-JAN-1999','DD-MON-YYYY')
5 AND TO_DATE('01-MAY-1999','DD-MON-YYYY')
6 AND o.order_id = oi.order_id
7 GROUP BY order_date
8 ORDER BY order_date ;

ORDER_DAT SUM_SALES
--------- ----------
26-FEB-99 3322700.8

27-FEB-99 526310.4
11-MAR-99 1688284.8
12-MAR-99 2330752
13-MAR-99 952000
20-MAR-99 61651.2
20-MAR-99 1010368
29-MAR-99 345427.2
29-MAR-99 12288
11-APR-99 52352

10 rows selected.

Elapsed: 00:00:00.07

Execution Plan
----------------------------------------------------------

Plan hash value: 1413427856
----------------------------------------------------------------------------------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time Pstart Pstop TQ IN-OUT PQ Distrib
----------------------------------------------------------------------------------------------------------------------------------------------------------
0 SELECT STATEMENT 5 105 14 (15) 00:00:01
1 PX COORDINATOR
2 PX SEND QC (ORDER) :TQ10002 5 105 14 (15) 00:00:01 Q1,02 P->S QC (ORDER)
3 SORT GROUP BY 5 105 14 (15) 00:00:01 Q1,02 PCWP
4 PX RECEIVE 5 105 14 (15) 00:00:01 Q1,02 PCWP
5 PX SEND RANGE :TQ10001 5 105 14 (15) 00:00:01 Q1,01 P->P RANGE
6 SORT GROUP BY 5 105 14 (15) 00:00:01 Q1,01 PCWP
* 7 HASH JOIN 1115 23415 13 (8) 00:00:01 Q1,01 PCWP
8 PX RECEIVE 176 2112 1 (0) 00:00:01 Q1,01 PCWP
9 PX SEND BROADCAST :TQ10000 176 2112 1 (0) 00:00:01 Q1,00 P->P BROADCAST
10 PX PARTITION RANGE ITERATOR 176 2112 1 (0) 00:00:01 2 6 Q1,00 PCWC
* 11 INDEX RANGE SCAN ORDERS_ORDER_DATE_NU 176 2112 1 (0) 00:00:01 2 6 Q1,00 PCWP
12 PX BLOCK ITERATOR 21280 187K 11 (0) 00:00:01 1 25 Q1,01 PCWC
13 TABLE ACCESS FULL ORDER_ITEMS 21280 187K 11 (0) 00:00:01 1 25 Q1,01 PCWP
----------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------
7 - access("O"."ORDER_ID"="OI"."ORDER_ID")
11 - access("O"."ORDER_DATE">=TO_DATE(' 1999-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "O"."ORDER_DATE"<=TO_DATE(' 1999-05-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Statistics
----------------------------------------------------------

45 reursive calls
3 db block gets

357 consistent gets
0 physical reads
664 redo size

683 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
34 sorts (memory)
0 sorts (disk)

10 rows processed

SQL>


Execution Three

SQL> set linesize 200
SQL> set timing on
SQL>
SQL> spool ref_partitioning_test.out
SQL>
SQL> set autotrace on
SQL> SELECT TO_DATE(to_char(o.order_date, 'DD-MON-YYYY'), 'DD-MON-YYYY') order_date, sum(oi.sales_am
2 FROM oe.orders o,
3 oe.order_items oi
4 WHERE o.order_id = oi.order_id
5 AND o.order_date BETWEEN TO_DATE('01-JAN-1999','DD-MON-YYYY')
6 AND TO_DATE('01-MAY-1999','DD-MON-YYYY')
7 GROUP BY order_date
8 ORDER BY order_date ;

ORDER_DAT SUM_SALES
--------- ----------
26-FEB-99 6645401.6
27-FEB-99 1052620.8
11-MAR-99 3376569.6
12-MAR-99 4661504
13-MAR-99 1904000
20-MAR-99 123302.4
20-MAR-99 2020736
29-MAR-99 690854.4
29-MAR-99 24576
11-APR-99 104704

10 rows selected.

Elapsed: 00:00:00.04

Execution Plan
----------------------------------------------------------
Plan hash value: 795276733

------------------------------------------------------------------------------------
Id Operation Name Rows Bytes Cost (%CPU) Time
------------------------------------------------------------------------------------
0 SELECT STATEMENT 1 24 6 (17) 00:00:01
1 SORT ORDER BY 1 24 6 (17) 00:00:01
2 SORT GROUP BY NOSORT 1 24 6 (17) 00:00:01
* 3 FILTER
4 NESTED LOOPS
5 NESTED LOOPS 2 48 5 (0) 00:00:01
6 TABLE ACCESS BY INDEX ROWID ORDERS 1 15 2 (0) 00:00:01
* 7 INDEX RANGE SCAN ORD_ORDER_DATE_IX 1 1 (0) 00:00:01
* 8 INDEX RANGE SCAN ITEM_ORDER_IX 6 1 (0) 00:00:01
9 TABLE ACCESS BY INDEX ROWID ORDER_ITEMS 6 54 3 (0) 00:00:01
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(TIMESTAMP' 1999-01-01 00:00:00'<=TIMESTAMP' 1999-05-01 00:00:00')
7 - access("O"."ORDER_DATE">=TIMESTAMP' 1999-01-01 00:00:00' AND
"O"."ORDER_DATE"<=TIMESTAMP' 1999-05-01 00:00:00')
8 - access("O"."ORDER_ID"="OI"."ORDER_ID")


Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1724 consistent gets
0 physical reads
116 redo size
669 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
10 rows processed

SQL> SELECT o.order_date, sum(oi.sales_amount) sum_sales
2 FROM orders o,
3 order_items oi
4 WHERE o.order_date BETWEEN TO_DATE('01-JAN-1999','DD-MON-YYYY')
5 AND TO_DATE('01-MAY-1999','DD-MON-YYYY')
6 AND o.order_id = oi.order_id
7 GROUP BY order_date
8 ORDER BY order_date ;

ORDER_DAT SUM_SALES
--------- ----------
26-FEB-99 6645401.6
27-FEB-99 1052620.8
11-MAR-99 3376569.6
12-MAR-99 4661504
13-MAR-99 1904000
20-MAR-99 123302.4
20-MAR-99 2020736
29-MAR-99 690854.4
29-MAR-99 24576
11-APR-99 104704

10 rows selected.

Elapsed: 00:00:00.07

Execution Plan
----------------------------------------------------------
Plan hash value: 1413427856
------------------------------------------------------------------------------------ Id Operation Name Rows Bytes Cost (%CPU) Time Pstart Pstop
------------------------------------------------------------------------------------
0 SELECT STATEMENT 5 105 21 (10) 00:00:01
1 PX COORDINATOR
2 PX SEND QC (ORDER) :TQ10002 5 105 21 (10) 00:00:01
3 SORT GROUP BY 5 105 21 (10) 00:00:01
4 PX RECEIVE 5 105 21 (10) 00:00:01
5 PX SEND RANGE :TQ10001 5 105 21 (10) 00:00:01
6 SORT GROUP BY 5 105 21 (10) 00:00:01
* 7 HASH JOIN 2231 46851 20 (5) 00:00:01
8 PX RECEIVE 352 4224 1 (0) 00:00:01
9 PX SEND BROADCAST :TQ10000 352 4224 1 (0) 00:00:01
10 PX PARTITION RANGE ITERATOR 352 4224 1 (0) 00:00:01 2 6
* 11 INDEX RANGE SCAN ORDERS_ORDER_DATE_NU 352 4224 1 (0) 00:00:01 2
12 PX BLOCK ITERATOR 42560 374K 18 (0) 00:00:01 1 25
13 TABLE ACCESS FULL ORDER_ITEMS 42560 374K 18 (0) 00:00:01 1 25
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
7 - access("O"."ORDER_ID"="OI"."ORDER_ID")
11 - access("O"."ORDER_DATE">=TO_DATE(' 1999-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "O"."O
'syyyy-mm-dd hh24:mi:ss'))

Statistics
----------------------------------------------------------
45 recursive calls
3 db block gets
481 consistent gets
0 physical reads
664 redo size
685 bytes sent via SQL*Net to client
396 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
34 sorts (memory)
0 sorts (disk)
10 rows processed


Summary

Initial tests were done without parallel and did not yield expected results, expected results did not really start to happen until parallel was used and the number of rows in the parent and child tables significantly increased. Keep in mind that parallel operations increase CPU utilization and therefore a CPU bound system may not encounter the same results.

Running many tests and increasing the number of rows proportionately in each table with each execution. As the number of rows in orders and orders items increased the larger the gap in buffer gets to resolve the query. On a very small table the non-partitioned table used far less resources, as the number of rows increased the closer the buffer gets got until the partitioned table was taking less resources. Even as the difference in the buffer gets changed as the number of records increased the number of sorts in memory stayed consistent. Another thing noticed in the testing is that the runtime performance stayed more consistent with the partitioned tables more so the run time increase was lower then with the non-partitioned tables as the number of rows increased.

This is a nice new partitioning feature for parent-child relationships and initial testing shows this partitioning could have benefits for tables that are large. Smaller tables did not show any advantages, but the advantages seem to gain as the table sizes/number of rows increases.

Initial testing has shown that this new Oracle Partitioning Feature is worth the time to do further testing on very large data sets.

No comments: