A few notes on Oracle and Parallel Execution

Posted at Wednesday, November 21, 2007
As Data Warehouses proliferate and as data volumes - in general - continue to increase, employing parallelism for relevant database operations has become much more important.

Setting up and using Oracle's parallel execution feature is pretty straight-forward and is covered adequately in available manuals, etc. However, there are some things that aren't quite as straight-forward.

For example, from a DBA's perspective, I want to know when I hop onto a system what parallel operations are currently being performed so that I can get a general idea of that system's profile.

I use the following query, which is restricted to solely parallel user operations. I can get a lot of information from this one: what degree of parallelism was asked for and whether this was factored down; how many physical reads each PX Slave has performed; how long each operation (at the Coordinator level) has lasted, and what the SQL_HASH_VALUE is (which I can, in turn, get the full statement from). Optionally, I can use the values from this high-level query to drill-down using other dynamic performance views such as V$SESSION_LONGOPS, V$SESSION_WAIT, and V$PX_SESSTAT.

select decode(ps.server_set,'',s.program,'  PX Slave') "Program",
ps.qcsid "Parent SID",
decode(ps.server_set,'',' --', ps.sid) "Child SID",
ps.degree ||decode(ps.degree,'',' --','/')||ps.req_degree "DOP/REQ",
decode(ps.server_set,'','Coordinator Process', ps.server_set) "Server Set",
nvl(p.server_name,' --') "PX Server",
nvl(p.status,' --') "PX Server Status",
pss.value "Physical Reads",
decode(ps.server_set,'',s.sql_hash_value,'') "SQL Hash Value",
decode(ps.server_set,'',s.last_call_et,'') "Seconds Elapsed"
from
v$px_session ps,
v$session s,
v$px_process p,
v$px_sesstat pss,
v$statname sn
where
s.sid = ps.qcsid (+)
and ps.sid = p.sid(+)
and ps.sid = pss.sid(+)
and pss.statistic# = sn.statistic# (+)
and sn.name='physical reads'
and s.status='ACTIVE'
and not s.program like '%(A%' --Eliminate Streams Apply Process
and not s.program like '%(C%' --Eliminate Streams Capture Process
order by ps.qcsid, "Server Set" desc, "PX Server";

Here is an example output of the above query: Sample Output

Here is a monitoring script from Oracle's docs which shows the wait events for the slaves:

SELECT   px.SID "SID", p.pid, p.spid "SPID", px.inst_id "Inst",
px.server_group "Group", px.server_set "Set", px.DEGREE "Degree",
px.req_degree "Req Degree", w.event "Wait Event"
FROM gv$session s, gv$px_session px, gv$process p, gv$session_wait w
WHERE s.SID(+) = px.SID
AND s.inst_id(+) = px.inst_id
AND s.SID = w.SID(+)
AND s.inst_id = w.inst_id(+)
AND s.paddr = p.addr(+)
AND s.inst_id = p.inst_id(+)
ORDER BY DECODE (px.qcinst_id, NULL, px.inst_id, px.qcinst_id),
px.qcsid,
DECODE (px.server_group, NULL, 0, px.server_group),
px.server_set,
px.inst_id;

You can find a couple more monitoring queries at this site by Doug Burns: Suck It Dry - Tuning Parallel Execution. A fancier pdf version can be found here: Tuning Parallel Execution

Of course, the use of parallelism can make tracing a process a real mess. Luckily with 10g, the ability to concatenate trace files using DBMS_MONITOR functionality finally makes this feasible.

Here is a great article regarding how to do this: Tracing Parallel Execution

Here is a quick example:

SQL> exec dbms_session.set_identifier('pxtrace_test1');
SQL> alter session set tracefile_identifier = 'pxtrace_test1';
SQL> exec dbms_monitor.client_id_trace_enable(client_id=>'pxtrace_test1');
SQL> select /*+ PARALLEL(sales,8) */ count(*) from sh.sales;

COUNT(*)
----------
51455208

SQL> exec dbms_monitor.client_id_trace_disable(client_id=>'pxtrace_test1');

Here is the results from the trace file (notice the values of 'os threads','parse',and 'execute')

select /*+ PARALLEL(sales,8) */ count(*)
from
sh.sales


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 9 0.05 0.05 0 0 0 0
Execute 9 33.84 1685.92 223362 449329 0 0
Fetch 2 0.01 215.03 0 0 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 20 33.91 1901.00 223362 449329 0 1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 45 (recursive depth: 1)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
PX Deq: Execution Msg 115 0.09 0.32
direct path read 8667 0.00 0.07
os thread startup 9 0.05 0.28
PX Deq: Join ACK 7 0.00 0.01
PX Deq: Parse Reply 7 0.01 0.03
SQL*Net message to client 2 0.00 0.00
PX Deq: Execute Reply 212 1.95 214.75
PX qref latch 12 0.02 0.02
PX Deq: Signal ACK 5 0.00 0.00
enq: PS - contention 1 0.00 0.00
SQL*Net message from client 2 20.92 20.92
********************************************************************************

One of the biggest challenges in employing parallelism optimally is determining the appropriate Degree of Parallelism (DOP). This debate is probably as old as the feature itself. Here is a great article which shows just how subjective determining DOP can be: How many Slaves?. At the end of the day, the correct setting for DOP varies by system configuration and must take into account key usage characteristics.

In terms of help in setting global init.ora parameters, the following queries may be helpful:

SELECT NAME, VALUE
FROM gv$sysstat
WHERE UPPER (NAME) LIKE '%PARALLEL OPERATIONS%'
OR UPPER (NAME) LIKE '%PARALLELIZED%'
OR UPPER (NAME) LIKE '%PX%';

SELECT *
FROM v$px_buffer_advice;

The following are considered to be the basic parallelism parameters (values are samples):
*.parallel_adaptive_multi_user=TRUE
*.parallel_execution_message_size=16384
*.parallel_max_servers=72
*.parallel_min_servers=0
*.parallel_threads_per_cpu=2


Labels: , ,