Pipelined SQL Function Example

Posted at Wednesday, June 18, 2008
Here is a short example on how to create and use a pipelined sql function. In my example, I use a pipelined function along with a MERGE statement to keep a star schema's dimension table up to date.

Pipelined SQL Example

Pipelined SQL Example Source (Right-Click - Save Target As...)

Labels: ,

Bugs when installing OWB 10g Release 2

Posted at Wednesday, May 14, 2008
When installing Oracle Warehouse Builder - specifically 10g Release 2 - there are a couple of Oracle bugs that need to be addressed:

Depending upon your platform, you may encounter:

Error in invoking target 'isqlldr' of makefile 
'$ORACLE_HOME/rdbms/lib/ins_rdbms.mk'.

The following Oracle Forum posting identifies the workaround:

Thread: OWB paris install on 10g rel1 RAC cluster

Depending upon your platform, you may also encounter the following bug while installing the optional Oracle Workflow - version 2.6.4 - component.

After issuing this command:

$ORACLE_HOME/wf/install/wfinstall.csh

You will encounter an error which mentions WorkflowCA: WFTLoad Exception: could not upload the wfstd.wft file.

The above issues is resolved with patch:

4778368 - WFTLOAD EXCEPTION:COULD NOT UPLOAD FILE /RDBMS/10201/WF/RES/US/WFSTD.WFT


Labels: ,

Installing 11g Oracle Warehouse Builder

Posted at Saturday, February 09, 2008
Here is my step-by-step instructions for installing and configuring Oracle 11g Warehouse Builder: Installing 11g Oracle Warehouse Builder.

This example uses one database and therefore doesn't employ remote RUNTIME schemas, etc.

Labels: , ,

Star Schema/Transformation Example

Posted at Friday, November 23, 2007
Here is a home-grown example of a star schema at work. I created a couple of dimension tables and a single fact, that represent baseball statistics. In my example, I have a couple of fictious players that have been batting in every game since the dawn of the last millenium. I then query this to see whether oracle decides to perform star transformations in order to enhance the execution plans.

In theory, the optimizer should consider the bitmap indexes I have created and transform the query in question if considered a more efficient mechanism for identifying the requested rows from the fact table.

Here is the DDL as well as the corresponding log from my test:

create_sample_star_schema.sql

create_sample_star_schema.log

As you can see, the design is fairly simple. I utilize B-TREE indexes for the primary keys and the alternate/unique keys. When the dimensions' primary keys migrate into the fact table (as foreign keys), however, I index them using a BITMAP strategy. I also bitmap all the remaining columns that would ostensibly be queried by users in real life.

Afterwards, I test running an analytical query with STAR_TRANSFORMATION_ENABLED set to 'TRUE' and 'FALSE' to see how having star transformation at the optimizer's disposal effects the results.

Here is the DDL as well as the corresponding log from my queries:

star_transform_perf_test.sql

star_transform_perf_test.log

If STAR_TRANSFORMATION_ENABLED is set to 'TRUE' then the optimizer does choose to transform the query: it returns my query in 680 milliseconds performing 4,087 consistent gets and 603 physical reads at an optimizer cost of 169.

If STAR_TRANSFORMATION_ENABLED is set to 'FALSE' then the optimizer doesn't choose to transform the query: it returns my query in 280 milliseconds performing 1,776 consistent gets and 51 physical reads at an optimizer cost of 102.

Granted, my example doesn't have a lot of data (and I didn't partition, since I could perform my desired tests that notwithstanding), but I find it strange that Oracle chooses a more resource-intensive plan if STAR_TRANSFORMATION_ENABLED is 'TRUE'. Perhaps a better strategy is to use the */+ STAR_TRANSFORMATION */ hint for those queries that have been tested and identified to work better with the transformation. As data volumes increase in this example, I am pretty sure that the star transformation would gain ground performance-wise; therefore, testing is key.

Without question, however, designs that use BITMAP indexes have a much smaller footprint than their B-TREE counterparts and herego are associated with shorter ETL times.

Just to prove this, I ran the exact same previous example, but instead I replaced all BITMAP indexes with BTREE indexes:

create_sample_star_schema_btree.sql

create_sample_star_schema_btree.log

You should notice just how much more space the BTREE indexes consume.

Now I run the aforementioned query and see what the results are:

star_transform_perf_test_btree.sql

star_transform_perf_test_btree.log

If we replace the BITMAP indexes in the previous example with BTREE indexes: it returns my query in 670 milliseconds performing 4,034 consistent gets and 276 physical reads at an optimizer cost of 38. Not much of a performance gain given that it takes almost twice the storage to house the BTREE indexes vs. the BITMAP indexes.

So, in conclusion, using bitmap indexes in a star schema design can save a lot of space and can meet or out-perform the same design with btree indexes. As far as the star transformations, I would test thoroughly in your environment. This can always be set via a hint and or at the session level. It is my understanding that the decision to use star transformation is very environment dependent.

According to Barnali Banerjee, star transformations are best used in the following circumstances, when:

"
• The query patterns are undefined so building and maintaining multiple composite keys would be too difficult, if not impossible.

• The number of joined dimensions is large (over 5).

• Cartesian products would be too costly due to the size of the individual dimension tables.

• The fact table is sparsely populated.

• The cardinality is low between fact foreign key columns and the total fact number of rows making it a good candidate for bitmap indexes.

• Space is an issue. Bitmap indexes do not store the column value in the index, which greatly reduces the space requirements.

• In complex queries where the conditions in the WHERE clause are based on non-foreign key fact table columns. These columns need to have bitmap indexes to benefit from the bitmap transformation.
"

I obtained the following from his presentation Tuning Data Warehouse Query Performance (A Case Study)

Labels:

A few notes on Oracle and Parallel Execution

Posted at Wednesday, November 21, 2007
As Data Warehouses proliferate and as data volumes - in general - continue to increase, employing parallelism for relevant database operations has become much more important.

Setting up and using Oracle's parallel execution feature is pretty straight-forward and is covered adequately in available manuals, etc. However, there are some things that aren't quite as straight-forward.

For example, from a DBA's perspective, I want to know when I hop onto a system what parallel operations are currently being performed so that I can get a general idea of that system's profile.

I use the following query, which is restricted to solely parallel user operations. I can get a lot of information from this one: what degree of parallelism was asked for and whether this was factored down; how many physical reads each PX Slave has performed; how long each operation (at the Coordinator level) has lasted, and what the SQL_HASH_VALUE is (which I can, in turn, get the full statement from). Optionally, I can use the values from this high-level query to drill-down using other dynamic performance views such as V$SESSION_LONGOPS, V$SESSION_WAIT, and V$PX_SESSTAT.

select decode(ps.server_set,'',s.program,'  PX Slave') "Program",
ps.qcsid "Parent SID",
decode(ps.server_set,'',' --', ps.sid) "Child SID",
ps.degree ||decode(ps.degree,'',' --','/')||ps.req_degree "DOP/REQ",
decode(ps.server_set,'','Coordinator Process', ps.server_set) "Server Set",
nvl(p.server_name,' --') "PX Server",
nvl(p.status,' --') "PX Server Status",
pss.value "Physical Reads",
decode(ps.server_set,'',s.sql_hash_value,'') "SQL Hash Value",
decode(ps.server_set,'',s.last_call_et,'') "Seconds Elapsed"
from
v$px_session ps,
v$session s,
v$px_process p,
v$px_sesstat pss,
v$statname sn
where
s.sid = ps.qcsid (+)
and ps.sid = p.sid(+)
and ps.sid = pss.sid(+)
and pss.statistic# = sn.statistic# (+)
and sn.name='physical reads'
and s.status='ACTIVE'
and not s.program like '%(A%' --Eliminate Streams Apply Process
and not s.program like '%(C%' --Eliminate Streams Capture Process
order by ps.qcsid, "Server Set" desc, "PX Server";

Here is an example output of the above query: Sample Output

Here is a monitoring script from Oracle's docs which shows the wait events for the slaves:

SELECT   px.SID "SID", p.pid, p.spid "SPID", px.inst_id "Inst",
px.server_group "Group", px.server_set "Set", px.DEGREE "Degree",
px.req_degree "Req Degree", w.event "Wait Event"
FROM gv$session s, gv$px_session px, gv$process p, gv$session_wait w
WHERE s.SID(+) = px.SID
AND s.inst_id(+) = px.inst_id
AND s.SID = w.SID(+)
AND s.inst_id = w.inst_id(+)
AND s.paddr = p.addr(+)
AND s.inst_id = p.inst_id(+)
ORDER BY DECODE (px.qcinst_id, NULL, px.inst_id, px.qcinst_id),
px.qcsid,
DECODE (px.server_group, NULL, 0, px.server_group),
px.server_set,
px.inst_id;

You can find a couple more monitoring queries at this site by Doug Burns: Suck It Dry - Tuning Parallel Execution. A fancier pdf version can be found here: Tuning Parallel Execution

Of course, the use of parallelism can make tracing a process a real mess. Luckily with 10g, the ability to concatenate trace files using DBMS_MONITOR functionality finally makes this feasible.

Here is a great article regarding how to do this: Tracing Parallel Execution

Here is a quick example:

SQL> exec dbms_session.set_identifier('pxtrace_test1');
SQL> alter session set tracefile_identifier = 'pxtrace_test1';
SQL> exec dbms_monitor.client_id_trace_enable(client_id=>'pxtrace_test1');
SQL> select /*+ PARALLEL(sales,8) */ count(*) from sh.sales;

COUNT(*)
----------
51455208

SQL> exec dbms_monitor.client_id_trace_disable(client_id=>'pxtrace_test1');

Here is the results from the trace file (notice the values of 'os threads','parse',and 'execute')

select /*+ PARALLEL(sales,8) */ count(*)
from
sh.sales


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 9 0.05 0.05 0 0 0 0
Execute 9 33.84 1685.92 223362 449329 0 0
Fetch 2 0.01 215.03 0 0 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 20 33.91 1901.00 223362 449329 0 1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 45 (recursive depth: 1)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
PX Deq: Execution Msg 115 0.09 0.32
direct path read 8667 0.00 0.07
os thread startup 9 0.05 0.28
PX Deq: Join ACK 7 0.00 0.01
PX Deq: Parse Reply 7 0.01 0.03
SQL*Net message to client 2 0.00 0.00
PX Deq: Execute Reply 212 1.95 214.75
PX qref latch 12 0.02 0.02
PX Deq: Signal ACK 5 0.00 0.00
enq: PS - contention 1 0.00 0.00
SQL*Net message from client 2 20.92 20.92
********************************************************************************

One of the biggest challenges in employing parallelism optimally is determining the appropriate Degree of Parallelism (DOP). This debate is probably as old as the feature itself. Here is a great article which shows just how subjective determining DOP can be: How many Slaves?. At the end of the day, the correct setting for DOP varies by system configuration and must take into account key usage characteristics.

In terms of help in setting global init.ora parameters, the following queries may be helpful:

SELECT NAME, VALUE
FROM gv$sysstat
WHERE UPPER (NAME) LIKE '%PARALLEL OPERATIONS%'
OR UPPER (NAME) LIKE '%PARALLELIZED%'
OR UPPER (NAME) LIKE '%PX%';

SELECT *
FROM v$px_buffer_advice;

The following are considered to be the basic parallelism parameters (values are samples):
*.parallel_adaptive_multi_user=TRUE
*.parallel_execution_message_size=16384
*.parallel_max_servers=72
*.parallel_min_servers=0
*.parallel_threads_per_cpu=2


Labels: , ,

Change Data Capture: Implementing Distributed Asynchronous Autolog Archive CDC

Posted at Saturday, November 17, 2007
I have talked to a lot of folks lately that are moving towards using Oracle's Change Data Capture functionality for their Data Warehouses' ETL processes.

Change Data Capture (CDC for short), introduced with 9i, is used for propagating changes - specifically, those that are incremental - to non-OLTP environments. In other words, this functionality is largely for the Data Warehouse/Business Intelligence Stakeholder.

CDC can be configured a lot of different ways: synchronous vs. asynchronous; distributed vs. non-distributed; ARCH vs. LGWR transport method - just to name a few.

I believe that the Distributed Asynchronous Autolog configuration holds the most promise namely because it has little performance impact on the source operational (OLTP) environment (an important factor in firms' ETL tool selection decisions). Additionally, the configuration changes necessary on the source database are reasonable and shouldn't have an impact on availability/SLA requirements.

I implemented this on a couple of Dell machines using 10.2.0.3 and Enterprise Linux (2.6.9-55). I followed the following 'cookbook' from Oracle: Asynchronous Autolog Change Data Capture Cookbook (September 2007)

I also referred to the following Oracle Data Warehousing Guide (Reference Chapter 16): Data Warehousing Guide 10g Release 2 (10.2) B14223-02

To make this easier for you to do the same, I have uploaded tar archives from each machine - both the SOURCE and STAGE targets - which include all the scripts and log files I used in my implementation. This should be most of what you will need to successfully create the databases/listeners and test the aforementioned Cookbook's test-case, except for manual steps such as editing /etc/oratab and running orapwd.

Tarball Archive for the STAGING Database (cdc_autolog_arch_stage.tar)

Tarball Archive for the SOURCE Database (cdc_autolog_arch_source.tar)

Labels: ,

Great Page about PL/SQL Pipelined Functions

Posted at Saturday, June 16, 2007
In addition to this great article, you will find other
posts and utilities worth reviewing here.

Labels: ,

Creating a Fast Refreshable, ON COMMIT Materialized View

Posted at Monday, July 11, 2005
Although most people normally associate materialized views with OLAP environments, it has been my experience there are many uses for them in OLTP systems as well. Oracle 10g has brought with it many enhancements that aim to assist users in creating fast refreshable, query rewrite compatible materialized views; however, a basic understanding of the fast refreshable materialized view and query rewrite concepts should be prologue.

I have constructed a simple example that exhibits how to create a basic materialized view that is fast refreshable ON COMMIT. Because of this, queries against the materialized view's source tables are rewritten effectively. Note in the example how the materialized view stays up to date as DML is performed on its source tables. Additionally, note how queries are rewritten to take advantage of the preaggregated data. At the end of the script, I alter the materialized view to be fast refreshable ON DEMAND; therefore, subsequent DMLs disable the query rewrite due to the ENFORCED query_rewrite_integrity setting. Be aware that although this example works in both 9i and 10g versions, that only the 10g version will show the rewrite operation in the output of the EXPLAIN PLAN - this capability is new in 10g. (Note: Right-click 'Save Target As'):
Sample Script | Script Log

Labels:

Partitioning vs. Traditional DML to Maintain a Rolling Set of Data

Posted at Monday, June 13, 2005
The Partitioning option of the Oracle RDBMS is heralded as important by many, but understood - in practice - by relatively few. The general rule of thumb is that partitioning should be employed in cases where table and/or index segments are 2GB+ in size. Partitioning can be used to solve a variety of problems, beyond its inherent strength in breaking entities' contents into smaller, more manageable pieces. The subject of this post is just such a case.

A customer has a large transaction auditing table, which is used in Customer Support operations. There are many transactions and the data is only useful going back 30 days in the past. Due to the amount of data and the fact that internal customers only use this data for an ephemeral period of time, we need to prune unnecessary data from the table on a regular basis; in essence, maintaining a rolling set of data, representing the last 30 days of activity.

Normally, the aforementioned requirement is easy to satisfy using simple DML - namely the DELETE statement. However, when dealing with large amounts of data (VLDB) within a database in ARCHIVELOG mode, problems can occur due to insufficient amounts of available undo space and/or insufficient disk space to house the potentially staggering amount of archive logs generated during the transaction. Applying the NOLOGGING attribute to the segments in question offers no reprieve. Here is where my example that uses partitioning comes in handy. Instead of daily performing a delete of created < TRUNC(SYSDATE) - 30 (or something similar) against the transaction archive table, we use a partitioned table with a partition for each day, which we truncate and add partitions to as appropriate.

In Test Case 1, I created an example that satisfies our requirement, using the 'Traditional' method of maintaining the rolling data set
(Note: Right-click 'Save Target As'):
Test Case 1 - Traditional DML | Log of Test Case 1

In Test Case 2, I created an example that satisfies our requirement, using the Partitioning method, in which partitions are dynamically created and truncated each day to retain only partitions, representing data from the last 30 days
(Note: Right-click 'Save Target As'):
Test Case 2 - Partitioning Method | Log of Test Case 2

Results:

The Traditional DML Test case (Test Case 1), deletes 250,000 rows from a small table in 14.63 seconds and generates a lot of logging in the process. This technique will only scale so far, due to limited host resources.

The Partitioning Technique (Test Case 2), truncates 250,000 rows worth of data from a small table in 4.42 seconds with virtually no logging. This technique is quite scalable. In cases, in which the data needs to be rolled out of the table, but kept for posterity, you could use the ALTER TABLE ...EXCHANGE PARTITION statement. Partitioning, in this case, is the way to go!

Labels: ,