Partitioning vs. Traditional DML to Maintain a Rolling Set of Data

Posted at Monday, June 13, 2005
The Partitioning option of the Oracle RDBMS is heralded as important by many, but understood - in practice - by relatively few. The general rule of thumb is that partitioning should be employed when table and/or index segments grow beyond 2GB in size. Beyond its inherent strength of breaking large segments into smaller, more manageable pieces, partitioning can be used to solve a variety of other problems. The subject of this post is just such a case.

A customer has a large transaction auditing table used in Customer Support operations. Transaction volume is high, and the data is only useful for the most recent 30 days. Because of the volume of data and the fact that internal customers need it only for this short window, we must prune stale data from the table on a regular basis - in essence, maintaining a rolling data set representing the last 30 days of activity.

Normally, the aforementioned requirement is easy to satisfy with simple DML - namely, the DELETE statement. However, when dealing with very large amounts of data (VLDB) in a database running in ARCHIVELOG mode, problems can occur due to insufficient available undo space and/or insufficient disk space to hold the potentially staggering volume of archive logs generated during the transaction. Applying the NOLOGGING attribute to the segments in question offers no reprieve, since conventional DML is always logged. Here is where partitioning comes in handy. Instead of performing a daily DELETE of rows WHERE created < TRUNC(SYSDATE) - 30 (or something similar) against the transaction archive table, we use a table partitioned by day, truncating the oldest partition and adding a new one each day as appropriate.
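As a rough sketch of the idea (table, column, and partition names here are illustrative, not taken from the test cases), the table is range-partitioned on the audit date, and the daily maintenance job adds a partition for the new day and truncates the oldest one:

```sql
-- Hypothetical rolling-window table, range-partitioned by day.
CREATE TABLE txn_audit (
  txn_id   NUMBER,
  created  DATE,
  details  VARCHAR2(200)
)
PARTITION BY RANGE (created) (
  PARTITION p20050514 VALUES LESS THAN (TO_DATE('2005-05-15', 'YYYY-MM-DD')),
  PARTITION p20050515 VALUES LESS THAN (TO_DATE('2005-05-16', 'YYYY-MM-DD'))
  -- ... one partition per day, 30 in all ...
);

-- Daily maintenance: add a partition for the new day at the top of the range,
-- then truncate the oldest one. Both are DDL and generate almost no redo.
ALTER TABLE txn_audit ADD PARTITION p20050614
  VALUES LESS THAN (TO_DATE('2005-06-15', 'YYYY-MM-DD'));
ALTER TABLE txn_audit TRUNCATE PARTITION p20050514;
```

In practice the partition names and dates would be generated dynamically (e.g. from a scheduled PL/SQL job) rather than hard-coded as above.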

In Test Case 1, I created an example that satisfies our requirement, using the 'Traditional' method of maintaining the rolling data set
(Note: Right-click 'Save Target As'):
Test Case 1 - Traditional DML | Log of Test Case 1

In Test Case 2, I created an example that satisfies our requirement using the Partitioning method, in which partitions are dynamically created and truncated each day so that only partitions representing data from the last 30 days are retained
(Note: Right-click 'Save Target As'):
Test Case 2 - Partitioning Method | Log of Test Case 2

Results:

The Traditional DML test case (Test Case 1) deletes 250,000 rows from a small table in 14.63 seconds and generates a lot of logging in the process. This technique will only scale so far, due to limited host resources.
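For reference, the delete exercised by the traditional approach is of roughly this form (table and column names are hypothetical, matching the sketch above rather than the actual test case):

```sql
-- Conventional, fully logged delete: every removed row generates
-- undo and redo, regardless of any NOLOGGING attribute on the segment.
DELETE FROM txn_audit
 WHERE created < TRUNC(SYSDATE) - 30;
COMMIT;
```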

The Partitioning technique (Test Case 2) truncates 250,000 rows' worth of data from a small table in 4.42 seconds with virtually no logging. This technique is quite scalable. In cases where the data needs to be rolled out of the table but kept for posterity, you could use the ALTER TABLE ... EXCHANGE PARTITION statement. Partitioning, in this case, is the way to go!
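A sketch of that archival variant (names are again illustrative): EXCHANGE PARTITION swaps the segment of the oldest partition with that of an empty standalone table of identical structure. It is a near-instant data dictionary operation, so the rows survive outside the partitioned table without being copied:

```sql
-- Create an empty standalone table with the same column structure.
CREATE TABLE txn_audit_archive AS
  SELECT * FROM txn_audit WHERE 1 = 0;

-- Swap segments: the archive table now holds the oldest day's rows,
-- and the partition is left empty, ready to be truncated or dropped.
ALTER TABLE txn_audit
  EXCHANGE PARTITION p20050514 WITH TABLE txn_audit_archive;
```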
