诊断RAC数据库的启动

上周为一位网友诊断了RAC数据库手动添加节点以后,所添加的节点可以手动startup,但是无法利用srvctl工具启动的问题。实际上是因为srvctl启动实例时优先使用的是ASM中的spfile,而手动startup则使用$ORACLE_HOME/dbs下的spfile,因为这位网友没有通过DBCA工具来添加实例,所以ASM中的spfile没有正确被配置。对于该类使用srvctl无法正常启动RAC数据库的”常见”问题,我们可以从以下几个方向入手:

1.使用”srvctl config database -d PROD -a”命令打印OCR中数据库的详细配置信息

[oracle@rh2 ~]$ srvctl  config database -d PROD -h

Displays the configuration for the database.

Usage: srvctl config database [-d  [-a] ] [-v]
    -d       Unique name for the database
    -a                       Print detailed configuration information
    -v                       Verbose output
    -h                       Print usage

[oracle@rh2 ~]$ srvctl  config database -d PROD -a
Database unique name: PROD
Database name: PROD
Oracle home: /s01/oracle/product/11.2.0/dbhome_1
Oracle user: oracle
Spfile: +DATA/PROD/spfilePROD.ora
Domain: 
Start options: open
Stop options: normal
Database role: PRIMARY
Management policy: MANUAL
Server pools: PROD
Database instances: PROD1,PROD2
Disk Groups: DATA
Mount point paths: 
Services: maclean,maclean_pre,maclean_taf
Type: RAC
Database is enabled
Database is administrator managed

可以看到以上PROD数据库在CRS管理时使用+DATA/PROD/spfilePROD.ora ASM内的spfile启动。

2.分析sqlnet.ora配置文件,该文件位于$ORACLE_HOME/network/admin目录下

3.分析$ORACLE_HOME/log//racg目录下的文件

Script:收集RAC性能诊断信息

以下脚本可以用于收集RAC性能诊断信息:

spool rac_diag.log
SELECT B1.INST_ID,
B2.VALUE blocks,
Round(B1.VALUE / 100) total_time,
round((B1.VALUE / B2.VALUE) * 10, 2) avg_time
FROM GV$SYSSTAT B1, GV$SYSSTAT B2
WHERE B1.NAME = 'gc cr block receive time'
AND B2.NAME = 'gc cr blocks received'
AND B1.INST_ID = B2.INST_ID
AND B2.VALUE > 0
Order by 1
/
select v.banner, i.instance_name
from gv$version v, gv$instance i
where v.inst_id = i.inst_id
and v.banner in
(select banner
from (select banner, count(*) cnt from gv$version group by banner)
where cnt <> 2)
order by 1, 2
/
SELECT INST_ID,
SND_Q_LEN,
SND_Q_MAX,
SND_Q_TOT,
TCKT_AVAIL,
TCKT_LIMIT,
TCKT_RCVD,
TCKT_WAIT
FROM GV$DLM_TRAFFIC_CONTROLLER
WHERE (SND_Q_LEN > 0)
OR ((TCKT_LIMIT - TCKT_AVAIL) >= (TCKT_LIMIT * 0.6))
OR TCKT_WAIT = 'YES'
/
SELECT A1.INST_ID,
A1.VALUE blocks_lost,
A2.VALUE blocks_corrupt
FROM   GV$SYSSTAT A1,
GV$SYSSTAT A2
WHERE  A1.NAME = 'gc blocks lost'
AND    A2.NAME = 'gc blocks corrupt'
AND    A1.INST_ID = A2.INST_ID
AND    (a1.value > 0 or a2.value > 0)
/
select *
from (SELECT INST_ID,
OWNER#,
NAME,
KIND,
FILE#,
SUM(FORCED_READS) READS,
SUM(FORCED_WRITES) WRITES,
MAX(XNC) XNC
FROM GV$CACHE_TRANSFER
GROUP BY INST_ID, OWNER#, NAME, KIND, FILE#
ORDER BY 8 DESC)
where rownum <= 10
/
select o.parameter, o.value, i.instance_name
from gv$option o, gv$instance i
where o.inst_id = i.inst_id
and o.Parameter in (select Parameter
from (select Parameter, value, count(*) cnt
from gv$option
group by Parameter, value)
where cnt <> 2)
order by 1, 2
/
select p.name, p.value, i.instance_name
from gv$parameter p, gv$instance i
where p.inst_id = i.inst_id
and p.name in (select name
from (select name, value, count(*) cnt
from gv$parameter
where name in ('archive_lag_target',
'control_management_pack_access',
'diagnostic_dest',
'redo_transport_user',
'trace_enabled',
'license_max_users',
'log_archive_format',
'spfile',
'undo_retention')
group by name, value)
where cnt <> 2)
order by 1, 2
/
select p.name, p.value, i.instance_name
from gv$parameter p, gv$instance i
where p.inst_id = i.inst_id
and p.name in (select name
from (select name, value, count(*) cnt
from gv$parameter
where name in ('active_instance_count',
'cluster_database',
'cluster_database_instances',
'compatible',
'control_files',
'db_block_size',
'db_domain',
'db_files',
'db_name',
'db_recovery_file_dest',
'db_recovery_file_dest_size',
'db_unique_name',
'instance_type',
'max_parallel_servers',
'parallel_execution_message_size',
'dml_locks',
'remote_login_passwordfile',
'result_cache_max_size',
'undo_management')
and not ((name = 'dml_locks') and (value = '0'))
group by name, value)
where cnt <> 2)
order by 1, 2
/
select p.name, p.value, i.instance_name
from gv$parameter p, gv$instance i
where p.inst_id = i.inst_id
and p.name in
(select name
from (select name, value, count(*) cnt
from gv$parameter
where name in ('instance_name',
'instance_number',
'thread',
'rollback_segments',
'undo_tablespace')
and not ((name = 'rollback_segments') and (value = null))
and not ((name = 'instance_name') and (value = null))
group by name, value)
where cnt <> 1)
order by 1, 2
/
select s.inst_id,
s.blocks_served,
Round(1000000 * s.pin_time / s.blocks_served) / 1000 avg_pin_time,
Round(1000000 * s.flush_time / s.blocks_served) / 1000 avg_flush_time,
Round(1000000 * s.send_time / s.blocks_served) / 1000 avg_send_time,
Round((1000000 * (s.pin_time + s.flush_time + s.send_time)) /
s.blocks_served) / 1000 avg_service_time
from (select inst_id,
sum(decode(name, 'gc current block pin time', value, 0)) pin_time,
sum(decode(name, 'gc current block pin flush', value, 0)) flush_time,
sum(decode(name, 'gc current block pin send', value, 0)) send_time,
sum(decode(name, 'gc current block blocks served', value, 0)) blocks_served
from gv$sysstat
where name in ('gc current block pin time',
'gc current block pin flush',
'gc current block pin send',
'gc current block blocks served')
group by inst_id) s
where s.blocks_served > 0
/
spool off

What’s preconnect.svc in 11g RAC?

有网友反映在11.2的RAC中有一个名如*_preconnect.svc的资源一直处于OFFLINE状态,而其TARGET则为ONLINE状态,无法通过重启CRS或者手动start该资源来使之ONLINE。我们来具体看一下什么情况下回产生这种资源?:

[oracle@rh2 ~]$ srvctl add service -h
Adds a service configuration to the Oracle Clusterware.
Usage: srvctl add service -d  -s  {-r "" 
[-a ""] [-P {BASIC | NONE | PRECONNECT}] | -g  
[-c {UNIFORM | SINGLETON}] } [-k   ] [-l [PRIMARY][,PHYSICAL_STANDBY]
[,LOGICAL_STANDBY][,SNAPSHOT_STANDBY]] [-y {AUTOMATIC | MANUAL}] [-q {TRUE|FALSE}] 
[-x {TRUE|FALSE}] [-j {SHORT|LONG}] [-B {NONE|SERVICE_TIME|THROUGHPUT}] 
[-e {NONE|SESSION|SELECT}] [-m {NONE|BASIC}] [-z ] 
[-w ] [-t ] [-f]
-d       Unique name for the database
-s              Service name
-r ""    Comma separated list of preferred instances
-a ""    Comma separated list of available instances
-g            Server pool name
-c {UNIFORM | SINGLETON} Service runs on every active server in the 
server pool hosting this service (UNIFORM) or just one server (SINGLETON)
-k              network number (default number is 1)
-P {NONE | BASIC | PRECONNECT}        TAF policy specification
-l                 Role of the service (primary, physical_standby, logical_standby, snapshot_standby)
-y               Management policy for the service (AUTOMATIC or MANUAL)
-e        Failover type (NONE, SESSION, or SELECT)
-m      Failover method (NONE or BASIC)
-w              Failover delay
-z              Failover retries
-t              Edition (or "" for empty edition value)
-j   Connection Load Balancing Goal (SHORT or LONG). Default is LONG.
-B      Runtime Load Balancing Goal (SERVICE_TIME, THROUGHPUT, or NONE)
-x   Distributed Transaction Processing (TRUE or FALSE)
-q  AQ HA notifications (TRUE or FALSE)
Usage: srvctl add service -d  -s  -u {-r "" | -a ""} [-f]
-d       Unique name for the database
-s              Service name
-u                       Add a new instance to service configuration
-r        Name of new preferred instance
-a       Name of new available instance
-f                       Force the add operation even though a listener is not configured for a network
-h                       Print usage
[oracle@rh2 ~]$ srvctl add service -d PROD -s maclean -r "PROD1,PROD2" -P BASIC
[oracle@rh2 ~]$ crs_stat|grep maclean
NAME=ora.prod.maclean.svc
NAME=ora.prod.maclean_taf.svc
[oracle@rh2 ~]$ srvctl add service -d PROD -s maclean_pre -r "PROD1,PROD2" -P PRECONNECT
[oracle@rh2 ~]$ crs_stat|grep maclean_pre            
NAME=ora.prod.maclean_pre.svc
NAME=ora.prod.maclean_pre_preconnect.svc
/* 可以看到仅当创建的service使用preconnect的TAF policy时会附带创建{service_name}_preconnect.svc的服务 */
[oracle@rh2 ~]$ srvctl start service -d PROD -s maclean    
[oracle@rh2 ~]$ srvctl start service -d PROD -s maclean_pre
[oracle@rh2 ~]$ crs_stat |grep -A3 maclean
NAME=ora.prod.maclean.svc
TYPE=ora.service.type
TARGET=ONLINE
STATE=ONLINE on rh2
--
NAME=ora.prod.maclean_pre.svc
TYPE=ora.service.type
TARGET=ONLINE
STATE=ONLINE on rh2
--
NAME=ora.prod.maclean_pre_preconnect.svc
TYPE=ora.service.type
TARGET=ONLINE
STATE=OFFLINE
--
NAME=ora.prod.maclean_taf.svc
TYPE=ora.service.type
TARGET=OFFLINE
STATE=OFFLINE
[oracle@rh2 ~]$ srvctl start service -d prod -s maclean_pre_preconnect
PRCD-1084 : Failed to start service maclean_pre_preconnect
PRCR-1079 : Failed to start resource ora.prod.maclean_pre_preconnect.svc
CRS-2674: Start of 'ora.prod.maclean_pre_preconnect.svc' on 'rh2' failed
CRS-2674: Start of 'ora.prod.maclean_pre_preconnect.svc' on 'rh3' failed
CRS-2632: There are no more servers to try to place resource 
'ora.prod.maclean_pre_preconnect.svc' on that would satisfy its placement policy
/* 无法手动启动该preconnect.svc服务 */
[oracle@rh2 ~]$ crs_stat |grep preconnect
NAME=ora.prod.maclean_pre_preconnect.svc
[oracle@rh2 ~]$ srvctl remove service -d PROD -s maclean_pre_preconnect
PRCD-1107 : Removed service maclean_pre_preconnect but failed to remove its 
underlying server pool PROD_maclean_pre
PRCS-1012 : Failed to remove server pool PROD_maclean_pre
PRCR-1072 : Failed to unregister server pool ora.PROD_maclean_pre
CRS-2554: Server pool 'ora.PROD_maclean_pre' cannot be unregistered as 
it is referenced by resource 'ora.prod.maclean_pre.svc'
[oracle@rh2 ~]$ crs_stat |grep preconnec
/* 以上remove命令报错但该preconnect.svc服务还是被删除了,虽然我们并不推荐这样做 */
[oracle@rh2 admin]$ srvctl start service -d PROD -s maclean_pre
[oracle@rh2 admin]$ lsnrctl service
LSNRCTL for Linux: Version 11.2.0.2.0 - Production on 30-MAR-2011 18:55:43
Copyright (c) 1991, 2010, Oracle.  All rights reserved.
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
Services Summary...
Service "+ASM" has 1 instance(s).
Instance "+ASM1", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
LOCAL SERVER
Service "PROD" has 1 instance(s).
Instance "PROD1", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
LOCAL SERVER
Service "PRODXDB" has 1 instance(s).
Instance "PROD1", status READY, has 1 handler(s) for this service...
Handler(s):
"D000" established:0 refused:0 current:0 max:1022 state:ready
DISPATCHER 
(ADDRESS=(PROTOCOL=tcp)(HOST=rh2)(PORT=36196))
Service "maclean" has 1 instance(s).
Instance "PROD1", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
LOCAL SERVER
Service "maclean_pre" has 1 instance(s).
Instance "PROD1", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
LOCAL SERVER
The command completed successfully
[oracle@rh2 admin]$ tnsping PROD_TAF
TNS Ping Utility for Linux: Version 11.2.0.2.0 - Production on 30-MAR-2011 18:54:38
Copyright (c) 1997, 2010, Oracle.  All rights reserved.
Used parameter files:
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)
(HOST = rh-cluster-scan)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) 
(SERVICE_NAME = maclean_pre)))
OK (10 msec)
[oracle@rh2 admin]$ sqlplus  maclean/maclean@PROD_TAF
SQL> select instance_name,host_name from v$instance;
INSTANCE_NAME
----------------
HOST_NAME
----------------------------------------------------------------
PROD2
rh3.oracle.com
shutdown PROD2 instance:SQL> shutdown abort
SQL> select instance_name,host_name from v$instance;
INSTANCE_NAME
----------------
HOST_NAME
----------------------------------------------------------------
PROD1
rh2.oracle.com
/* 可以看到即使删除preconnect.svc服务也不会影响到TAF的正常使用,
在生产环境中无需刻意去调试该服务,最好的应对方式是放任不管
*/

所以如果某天你看到你的crs_stat -t输出里有一条莫名的类似”ora….ect.svc ora….ce.type ONLINE OFFLINE”的记录,那么大可以忽略该资源的OFFLINE状态;这是正常现象,用不着紧张!

Would It affect RAC clusterware and database If we adjust OS time/Clock?

Question:

在RAC环境中节点之间的OS操作系统时钟一致是clusterware能够稳定运行的重要因素之一,但是如果我们确实有调整OS时间的需求,那么是否真的会影响到RAC的正常运行呢? 具体的影响是如何的呢?  又需要注意哪些方面的因素呢?

Answer:

RAC: Frequently Asked Questions (Doc ID 220970.1)

Does Oracle RAC work with NTP (Network Time Protocol)?
YES! NTP and Oracle RAC are compatible, as a matter of fact, it is recommended to setup NTP in an Oracle RAC cluster, for Oracle 9i Database, Oracle Database 10g, and Oracle Database 11g Release 1.


Keep the following points in mind:

# Minor changes in time (in the seconds range) are harmless for Oracle RAC and the Oracle Clusterware. If you intend on making large time changes it is best to shutdown the instances and the entire Oracle Clusterware stack on that node to avoid a false eviction, especially if you are using the Oracle RAC 10g low-brownout patches, which allow really low misscount settings.

# Backup/recovery aspect of large time changes are documented in Note: 77370.1, basically you can’t use RECOVER DATABASE UNTIL TIME to reach the second recovery point, It is possible to overcome with RECOVER DATABASE UNTIL CANCEL or UNTIL CHANGE. If you are doing complete recovery (most of the times) then this is not an issue since the Oracle recovery code uses SCN (System Change Numbers) to advance in the redo/archive logs. The SCN numbers never go back in time (unless a reset-logs operation is performed), there is always an association of an SCN to a human readable timestamp (which may change forward or backwards), hence the issue with recovery until point in time vs. until SCN/Cancel.

# If DBMS_SCHEDULER is in usage it will be affected by time changes, as it’s using actual clock rather than SCN.

# On platforms with OPROCD get fix for <> “OPROCD REBOOTS NODE WHEN TIME IS SET BACK BY XNTPD”

# If NTP is not configured correctly (using -x flag), and diagwait not set to 13 Note: 559365.1 10.2/11.1 RAC systems can be rebooted due to OPROCD, during a leap second event, see Note: 759143.1.
# Daylight saving time adjustments do not affect the system clock, only the displayed time, hence have no impact on the Oracle software.

Apart from these issues, the Oracle RDBMS server is immuned to time changes, i.e. will not affect transaction/read consistency operations.

Also please refer to note:
Dates & Calendars – Frequently Asked Questions (Doc ID 227334.1)

So please perform time changes in small amount using date command only.  Doing it precisely will  be difficult manually. Therefore using ntpd with -x option could be better solution for this case as well.

In general step in not more than 3 seconds when tuning time backward should be fine.

RAC动态资源(DRM)管理介绍

以下文本摘自: metalink doc 390483.1

Subject:  DRM – Dynamic Resource management
Doc ID:  390483.1  Type:  BULLETIN
Modified Date :  13-JAN-2009  Status:  PUBLISHED

In this Document
Purpose
Scope and Application
DRM – Dynamic Resource management
DRM – Dynamic Resource Mastering
References

Applies to:
Oracle Server – Enterprise Edition – Version: 10.1.0.2 to 11.1.0
Oracle Server – Standard Edition – Version: 10.1.0.2 to 11.1.0
Information in this document applies to any platform.
Oracle Real application Clusters
Purpose

To describe the concept of DRM (Dynamic Resource Mastering)
Scope and Application

This note in intended for experienced Real application cluster DBA’s
DRM – Dynamic Resource management
DRM – Dynamic Resource Mastering

When using Real application Clusters (RAC), Each instance has its own SGA and buffer cache. RAC will ensure that these block changes are co -ordinated to maximize performance and to ensure data intergrity. Each copy of the buffer also called as a cache resource has a master which is one of the nodes of the cluster.

In database releases before 10g (10.1.0.2) once a cache resource is mastered on an instance, a re-mastering or a change in the master would take place only during a reconfiguration that would happen automatically during both normal operations like instance startup or instance shutdown or abnormal events like Node eviction by Cluster Manager. So if Node B is the master of a cache resource, this resource will remain mastered on Node B until reconfiguration.

10g  introduces a concept of resource remastering via DRM. With DRM a resource can be re-mastered on another node say from Node B to Node A if it is found that the cache resource is accessed more frequently from Node A. A reconfiguration is no longer the only reason for a resource to be re-mastered.

In 10gR1 DRM is driven by affinity of files and in 10gR2 it is based on objects.

Sample LMD trace file during a DRM operation

Begin DRM(202) - transfer pkey 4294951314 to 0 oscan 1.1
*** 2006-08-01 17:34:54.645
Begin DRM(202) - transfer pkey 4294951315 to 0 oscan 1.1
*** 2006-08-01 17:34:54.646
Begin DRM(202) - transfer pkey 4294951316 to 0 oscan 1.1
*** 2006-08-01 17:34:54.646
Begin DRM(202) - transfer pkey 4294951317 to 0 oscan 1.1

DRM attributes are intentionally undocumented since they may change depending on the version. These attributes should not be changed without discussing with Support.
@DRM is driven by the following
@ 1.) _gc_affinity_time = Time in minutes at which statistics will be evaluated (default = 10 mins)
@ 2.) _gc_affinity_limit = # of times a node accesses a file/object (default = 50)
@ 3.) _gc_affinity_minimum = minimum # of times per minute a file/object is accessed before affinity kicks in
@ (default = 600 per minute per cpu )

It is important to note that

  1. Two instance will not start a DRM operation at the same time however lmd,lms,lmon processes from all instances collectively take part in the DRM operation.
  2. Normal activity on the database is not affected due to DRM. This means users continue insert/update/delete operations without any interruptions. Also DRM operations complete very quickly.

@

Disable DRM
Generally DRM should not be disabled unless Oracle Support/Development has suggested turning it off due to some known issues.
@To disable DRM, set
@To disable DRM, set
@_gc_affinity_time=0                                 # Only if DB version is 10.1 or 10.2
@_gc_undo_affinity=FALSE                       # Only if Db version is 10.2
@_gc_policy_time=FALSE                         # Only if DB version is 11.1 or higher
@_gc_affinity_time has been renamed to _gc_policy_time in 11g

Building a RAC Test Environment on VMWare For Free

•VMWare Orientation
•Importance of prerequisite checking
•Configuring node #1
•Cloning (VM) node #1
•Installing Clusterware
•Installing ASM and Database
•Creating Database
•Common Challenges and Issues

VMWare Orientation
•Desktop Products
–VMWare Workstation: “Full” product, $
–VMWare Player: Free, no creation, just use
•Server Products
–VMWare Infrastructure (ESX Server): $$$,
robust, clustering capable, OS replacement
–VMWare VirtualCenter: $$, management
tool for whole environment, Windows
–VMWare Server: free, limited, on top of OS

How many LMS processes for Oracle Rac 9i?

Question:

How many LMS processes for Oracle 9i Rac?

Can you check if you have increased the CPU in the machine during the upgrade.
Also please let me know if you have changed any underscore parameter.

Do you have some formula for calculate LMS processes and Num of CPU ?? Let say 2 CPU for 1 LMS process ??

 

Answer:

This can go dynamically as and when required and the # of startup of lms is also controlled by
_lm_lms,_lm_max_lms _lm_min_lms , later 2 seems to be for dynamic control of this numbers.
These parameters should not be set manually.
Also the no. of lms process should be one less than the number of CPU on the node.

Know about RAC Clusterware Process OPROCD

OPROCD introduced in 10.2.0.4 Linux and other Unix platform.

  • Fencing
    • Cluster handling of nodes that should not have access to shared resources
    • STONITH – Power cycle the node
    • PCW – nodes fence themselves through the reboot(8) command
    • Fabric Fencing from Polyserve
      • Healthy nodes send SNMP msgs to Fabric switch to disable SAN access from unhealthy nodes [ fence them out ]
      • Server is left in up state to view logs etc.
  • Oracle’s Cluster I/O Fencing solution
  • Only started on Unix platforms when vendor Clusterware is not running
  • Does not run on Windows and Linux!
  • Takes 2 parameters
    • Timeout value [ length of time between executions ]
    • Margin [ leeway for dispatches ]
    • Oproc.debug –t 1000 –m 500
  • In fatal mode node will get reboot’ed
  • In non-fatal mode error messages will be logged

OPROCD – This process is spawned in any non-vendor clusterware environment, except
on Windows where Oracle uses a kernel driver to perform the same actions and Linux
prior to version 10.2.0.4. If oprocd detects problems, it will kill a node via C
code. It is spawned in init.cssd and runs as root. This daemon is used to detect
hardware and driver freezes on the machine. If a machine were frozen for long enough
that the other nodes evicted it from the cluster, it needs to kill itself to prevent
any IO from getting reissued to the disk after the rest of the cluster has remastered
locks.”

*** Oprocd log locations:
In /etc/oracle/oprocd or /var/opt/oracle/oprocd depending on version/platform.

Note that oprocd only runs when no vendor clusterware is running or on Linux > 10.2.0.4

COMMON CAUSES OF OPROCD REBOOTS

– A problem detected by the OPROCD process. This can be caused by 4 things:1) An OS scheduler problem.
2) The OS is getting locked up in a driver or hardware.
3) Excessive amounts of load on the machine, thus preventing the scheduler from
behaving reasonably.
4) An Oracle bug.OPROCD Bugs Known to Cause Reboots:

Bug 5015469 – OPROCD may reboot the node whenever the system date is moved
backwards.
Fixed in 10.2.0.3+

Bug 4206159 – Oprocd is prone to time regression due to current API used (AIX only)
Fixed in 10.1.0.3 + One off patch for Bug 4206159.

Diagnostic Fixes (VERY NECESSARY IN MOST CASES):

Bug 5137401 – Oprocd logfile is cleared after a reboot
Fixed in 10.2.0.4+

Bug 5037858 – Increase the warning levels if a reboot is approaching
Fixed in 10.2.0.3+

FILES TO REVIEW AND GATHER FOR OPROCD REBOOTS

If logging a service request, please provide ALL of the following files to Oracle
Support if possible:

– Oprocd logs in /etc/oracle/oprocd or /var/opt/oracle/oprocd depending on version/platform.

– All the files in the following directories from all nodes.

For 10.2 and above, all files under:

<CRS_HOME>/log

Recommended method for gathering these for each node would be to run the
diagcollection.pl script.

For 10.1:

<CRS_HOME>/crs/log
<CRS_HOME>/crs/init
<CRS_HOME>/css/log
<CRS_HOME>/css/init
<CRS_HOME>/evm/log
<CRS_HOME>/evm/init
<CRS_HOME>/srvm/log

Recommended method for gathering these for each node:

cd <CRS_HOME>
tar cf crs.tar crs/init crs/log css/init css/log evm/init evm/log srvm/log

– Messages or Syslog from all nodes from the time of the problem:

Sun: /var/adm/messages
HP-UX: /var/adm/syslog/syslog.log
Tru64: /var/adm/messages
Linux: /var/log/messages
IBM: /bin/errpt -a > messages.out

– ‘opatch lsinventory -detail’ output for the CRS home

– It would also be useful to get the following from each node leading up to the time
of the reboot:

– netstat -is (or equivelant)
– iostat -x (or equivelant)
– vmstat (or equivelant)

There is a tool called “OS Watcher” that helps gather this information. This tool
will dump netstat, vmstat, iostat, and other output at an inverval and save x number
of hours of archived data. For more information about this tool see Note 301137.1.

 

The OPROCD executable sets a signal handler for the SIGALRM handler and sets the interval timer based on the to-millisec parameter provided. The alarm handler gets the current time and checks it against the time that the alarm handler was last entered. If the difference exceeds (to-millisec + margin-millisec), it will fail; the production version will cause a node reboot.

In fatal mode, OPROCD will reboot the node if it detects excessive wait. In Non Fatal mode, it will write an error message out to the file .oprocd.log in one of the following directories.

Oracle clusterware has the following three daemons which may be responsible for panicing the node. It is possible that some other external entity may have rebooted the node. In the context of this discussion, we will assume that the reboot/panic was done by an Oracle clusterware daemon.

* Oprocd – Cluster fencing module
* Cssd – Cluster sychronization module which manages node membership
* Oclsomon – Cssd monitor which will monitor for cssd hangs

OPROCD This is a daemon that only gets activated when there is no vendor clusterware present on the OS.This daemon is also not activated to run on Windows/Linux. This daemon runs a tight loop and if it is not scheduled for 1.5 seconds, will reboot the node.
CSSD This daemon pings the other members of the cluster over the private network and Voting disk. If this does not get a response for Misscount seconds and Disktimeout seconds respectively, it will reboot the node.
Oclsomon This daemon monitors the CSSD to ensure that CSSD is scheduled by the OS, if it detects any problems it will reboot the node.

A sample log looks like
May 11 18:13:15.528 | INF | monitoring started with timeout(1000), margin(500)
May 11 18:13:15.548 | INF | normal startup, setting process to fatal mode
May 12 11:43:00.899 | INF | shutting down from client request
May 12 11:43:00.899 | INF | exiting current process in NORMAL mode
May 12 12:10:43.984 | INF | monitoring started with timeout(1000), margin(500)
May 13 11:29:37.528 | INF | shutting down from client request
May 13 11:29:37.528 | INF | exiting current process in NORMAL mode
When fatal mode is disabled, OPROCD will write the following to the log file and exit:
May 10 18:01:40.668 | INF | monitoring started with timeout(1000), margin(500)
May 10 18:23:02.490 | ERR | AlarmHandler:? timeout(1739751316) exceeds interval(1000000000)+margin(500000000)
[root@rh2 ~]# ps -ef|grep oprocd|grep -v grep
root     19763     1  0 Jun27 ?        00:00:00 oprocd start
[root@rh2 oprocd]# cd /etc/oracle/oprocd
[root@rh2 oprocd]# ls -l
total 20
drwxrwx--- 2 root oinstall 4096 Jun 27 23:52 check
drwxrwx--- 2 root oinstall 4096 Mar 29 22:37 fatal
-rwxr--r-- 1 root root      512 Jun 27 23:52 rh2.oprocd.lgl
-rw-r--r-- 1 root root      171 Jun 27 23:52 rh2.oprocd.log
drwxrwx--- 2 root oinstall 4096 Jun 27 23:52 stop
[root@rh2 oprocd]# cat rh2.oprocd.log
Jun 27 23:52:47.861 | INF | monitoring started with timeout(1000), margin(500), skewTimeout(125)
Jun 27 23:52:47.864 | INF | normal startup, setting process to fatal mode
[root@rh2 oprocd]# oprocd
usage:  oprocd [start | startInstall | stop | check | enableFatal| help | -?]
run [ -t | -m | -g | -f  | -e]   foreground startup
-t           timeout in ms
-m            timout margin in ms
-e           clock skew epsilon in ms
-g         group name to enable fatal
-f                    fatal startup
start  [-t | -m  | -e]           starts the daemon
-t         timeout in ms
-m          timout margin in ms
-e         clock skew epsilon in ms
startInstall [ -t | -m | -g  | - e] start process in install mode
-t    timeout in ms
-m     timout margin in ms
-e    clock skew epsilon in ms
-g  group name to enable fatal
enableFatal  [ -t ]             force install mode process to fatal
-t    timeout for response in ms
stop         [ -t ]             stops running daemon
-t    timeout for response in ms
check        [ -t ]           checks status of daemon
-t    timeout for response in ms
help                          this help information
-?                            same as help above
[root@rh2 oprocd]# oprocd stop
Jun 28 00:17:36.604 | INF | daemon shutting down

Oracle Clusterware Process Monitor (OPROCD) From Julian Dyke

Process Monitor Daemon
Provides Cluster I/O Fencing
Implemented on Unix systems
Not required with third-party clusterware
Implemented in Linux in 10.2.0.4 and above
In 10.2.0.3 and below hangcheck timer module is used
Provides hangcheck timer functionality to maintain cluster integrity
Behaviour similar to hangcheck timer
Runs as root
Locked in memory
Failure causes reboot of system
See /etc/init.d/init.cssd for operating system reboot commands
OPROCD takes two parameters
-t  - Timeout value
Length of time between executions (milliseconds)
Normally defaults to 1000
-m - Margin
Acceptable margin before rebooting (milliseconds)
Normally defaults to 500
Parameters are specified in /etc/init.d/init.cssd
OPROCD_DEFAULT_TIMEOUT=1000
OPROCD_DEFAULT_MARGIN=500
Contact Oracle Support before changing these values
/etc/init.d/init.cssd can increase OPROCD_DEFAULT_MARGIN based on two CSS variables
reboottime (mandatory)
diagwait (optional)
Values can for these be obtained using
[root@server3]# crsctl get css reboottime
[root@server3]# crsctl get css diagwait
Both values are reported in seconds
The algorithm is
If diagwait > reboottime then
OPROCD_DEFAULT_MARGIN := (diagwait - reboottime) * 1000
Therefore increasing diagwait will reduce frequency of reboots e.g
[root@server3]# crsctl set css diagwait 13

RAC Deadlock For Example

Single resource deadlock: blocking enqueue which blocks itself, f 0

Single resource deadlock: blocking enqueue which blocks itself, f 0
Granted global enqueue 0xd8578490
----------enqueue 0xd8578490------------------------
lock version     : 1
Owner inst       : 2
grant_level      : KJUSERCW
req_level        : KJUSERPW
bast_level       : KJUSERNL
notify_func      : 0x4fe4b6e
resp             : 0xd9c7ad50
procp            : 0xd9971780
pid              : 0
proc version     : 0
oprocp           : (nil)
opid             : 0
group lock owner : (nil)
xid              : 0000-0000-00000000
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     :
Convert options  : KJUSERGETVALUE
History          : 0xd7d8d7da
Msg_Seq          : 0x60005
res_seq          : 2
valblk           : 0x00000000000000000000000000000000 .
DUMP LOCAL BLOCKER/HOLDER: block level 4 res [0x1451c][0x0],[TM][ext 0x0,0x0]
----------resource 0xd9c7ad50----------------------
resname       : [0x1451c][0x0],[TM][ext 0x0,0x0]
hash mask     : x3
Local inst    : 1
dir_inst      : 1
master_inst   : 1
hv idx        : 112
hv last r.inc : 56
current inc   : 56
hv status     : 0
hv master     : 0
open options  : dd cached
grant_bits    : KJUSERCW
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 0         0         2         0         0         0
val_state     : KJUSERVS_NOVALUE
valblk        : 0x00000000000000000000000000000000 .
access_inst   : 1
vbreq_state   : 0
state         : x0
resp          : 0xd9c7ad50
On Scan_q?    : N
Total accesses: 150
Imm.  accesses: 143
Granted_locks : 1
Cvting_locks  : 1
value_block:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
GRANTED_Q :
lp 0xd876ab70 gl KJUSERCW rp 0xd9c7ad50 [0x1451c][0x0],[TM][ext 0x0,0x0]
master 1 gl owner 0xda7233c0 possible pid 1993 xid 2B000-0001-000000B5 bast 0 rseq 1 mseq 0 history 0x49a51495
open opt KJUSERDEADLOCK
CONVERT_Q:
lp 0xd8578490 gl KJUSERCW rl KJUSERPW rp 0xd9c7ad50 [0x1451c][0x0],[TM][ext 0x0,0x0]
master 1 owner 2  bast 1 rseq 2 mseq 0x60005 history 0xd7d8d7da
convert opt KJUSERGETVALUE
----------enqueue 0xd876ab70------------------------
lock version     : 2071
Owner inst       : 1
grant_level      : KJUSERCW
req_level        : KJUSERPW
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9c7ad50
procp            : 0xd9979b08
pid              : 1993
proc version     : 81
oprocp           : (nil)
opid             : 1993
group lock owner : 0xda7233c0
possible pid     : 1993
xid              : 2B000-0001-000000B5
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : CONVERTING
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERGETVALUE
History          : 0x49a51495
Msg_Seq          : 0x0
res_seq          : 1
valblk           : 0x00000000000000000000000000000000 .
user session for deadlock lock 0xd876ab70
sid: 416 ser: 217 audsid: 2301258 user: 95/SPOT
flags: (0x10041) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 43 O/S info: user: oracle, term: UNKNOWN, ospid: 1993
image: oracle@rh2.oracle.com (J000)
client details:
O/S info: user: oracle, term: UNKNOWN, ospid: 1993
machine: rh2.oracle.com program: oracle@rh2.oracle.com (J000)
application name: DBMS_SCHEDULER, hash value=2478762354
action name: QUEST_PPCM_JOB_PM_1, hash value=3637730750
current SQL:
DELETE FROM QUEST_PPCM_SNAPSHOT WHERE SNAPSHOT_TYPE = :B2 AND INSTANCE_ID > 0
AND SNAPSHOT_TIMESTAMP < TRUNC (SYSDATE) - :B1  
----------enqueue 0xd8578490------------------------
lock version     : 1
Owner inst       : 2
grant_level      : KJUSERCW
req_level        : KJUSERPW
bast_level       : KJUSERNL
notify_func      : 0x4fe4b6e
resp             : 0xd9c7ad50
procp            : 0xd9971780
pid              : 0
proc version     : 0
oprocp           : (nil)
opid             : 0
group lock owner : (nil)
xid              : 0000-0000-00000000
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     :
Convert options  : KJUSERGETVALUE
History          : 0xd7d8d7da
Msg_Seq          : 0x60005
res_seq          : 2
valblk           : 0x03000000000000000100000000000000 . 
Requesting global enqueue 0xd876ab70 
----------enqueue 0xd876ab70------------------------
lock version     : 2071
Owner inst       : 1
grant_level      : KJUSERCW
req_level        : KJUSERPW
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9c7ad50
procp            : 0xd9979b08
pid              : 1993
proc version     : 81
oprocp           : (nil)
opid             : 1993
group lock owner : 0xda7233c0
possible pid     : 1993
xid              : 2B000-0001-000000B5
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : CONVERTING
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERGETVALUE
History          : 0x49a51495
Msg_Seq          : 0x0
res_seq          : 1
valblk           : 0x00000000000000000000000000000000 . u
ser session for deadlock lock 0xd876ab70
sid: 416 ser: 217 audsid: 2301258 user: 95/SPOT
flags: (0x10041) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 43 O/S info: user: oracle, term: UNKNOWN, ospid: 1993
image: oracle@rh2.oracle.com (J000)
client details:     O/S info: user: oracle, term: UNKNOWN, ospid: 1993
machine: rh2.oracle.com program: oracle@rh2.oracle.com (J000)
application name: DBMS_SCHEDULER, hash value=2478762354
action name: QUEST_PPCM_JOB_PM_1, hash value=3637730750
current SQL:
DELETE FROM QUEST_PPCM_SNAPSHOT WHERE SNAPSHOT_TYPE = :B2 AND INSTANCE_ID > 0 AND SNAPSHOT_TIMESTAMP < TRUNC (SYSDATE) - :B1
----------resource 0xd9c7ad50----------------------
resname       : [0x1451c][0x0],[TM][ext 0x0,0x0]
hash mask     : x3
Local inst    : 1
dir_inst      : 1
master_inst   : 1
hv idx        : 112
hv last r.inc : 56
current inc   : 56
hv status     : 0
hv master     : 0
open options  : dd cached
grant_bits    : KJUSERCW
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 0         0         2         0         0         0
val_state     : KJUSERVS_NOVALUE
valblk        : 0x00000000000000000000000000000000 .
access_inst   : 1
vbreq_state   : 0
state         : x0
resp          : 0xd9c7ad50
On Scan_q?    : N
Total accesses: 150
Imm.  accesses: 143
Granted_locks : 1
Cvting_locks  : 1
value_block:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
GRANTED_Q :
lp 0xd876ab70 gl KJUSERCW rp 0xd9c7ad50 [0x1451c][0x0],[TM][ext 0x0,0x0]
master 1 gl owner 0xda7233c0 possible pid 1993 xid 2B000-0001-000000B5 bast 0 rseq 1 mseq 0 history 0x49a51495
open opt KJUSERDEADLOCK
CONVERT_Q:
lp 0xd8578490 gl KJUSERCW rl KJUSERPW rp 0xd9c7ad50 [0x1451c][0x0],[TM][ext 0x0,0x0]
master 1 owner 2  bast 1 rseq 2 mseq 0x60005 history 0xd7d8d7da
convert opt KJUSERGETVALUE
-------------------------------------------------------------------------------
Trace Bucket Dump Begin: default bucket for process 43 (osid: 1993, J000)
TIME(*=approx):SEQ:COMPONENT:FILE@LINE:FUNCTION:SECT/DUMP: [EVENT#:PID:SID] DATA

TM DEADLOCK IN RAC:

* End DRM for pkey remastering request(s) (locally requested)
ENQUEUE DUMP REQUEST: from 2 spnum 12 on [0x1475a][0x0],[TM][ext 0x0,0x0] for reason 3 mtype 0
*** 2011-06-21 21:57:42.212
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x1475a][0x0],[TM][ext 0x0,0x0]
----------resource 0xd9e62b48----------------------
resname       : [0x1475a][0x0],[TM][ext 0x0,0x0]
hash mask     : x3
Local inst    : 1
dir_inst      : 1
master_inst   : 1
hv idx        : 46
hv last r.inc : 56
current inc   : 56
hv status     : 0
hv master     : 0
open options  : dd cached
grant_bits    : KJUSERNL KJUSEREX
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
val_state     : KJUSERVS_NOVALUE
valblk        : 0x00000000000000000000000000000000 .
access_inst   : 1
vbreq_state   : 0
state         : x0
resp          : 0xd9e62b48
On Scan_q?    : N
Total accesses: 30
Imm.  accesses: 25
Granted_locks : 1
Cvting_locks  : 1
value_block:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
GRANTED_Q :
lp 0xd87530a8 gl KJUSEREX rp 0xd9e62b48 [0x1475a][0x0],[TM][ext 0x0,0x0]
master 1 gl owner 0xda7233c0 possible pid 26255 xid 2B000-0001-00000576 bast 0 rseq 3 mseq 0 history 0x9a514495
open opt KJUSERDEADLOCK
CONVERT_Q:
lp 0xd85b60e0 gl KJUSERNL rl KJUSEREX rp 0xd9e62b48 [0x1475a][0x0],[TM][ext 0x0,0x0]
master 1 owner 2  bast 0 rseq 12 mseq 0x1 history 0x97ad
convert opt KJUSERGETVALUE
----------enqueue 0xd87530a8------------------------
lock version     : 10199
Owner inst       : 1
grant_level      : KJUSEREX
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9e62b48
procp            : 0xd99750b0
pid              : 26255
proc version     : 229
oprocp           : (nil)
opid             : 26255
group lock owner : 0xda7233c0
possible pid     : 26255
xid              : 2B000-0001-00000576
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERGETVALUE
History          : 0x9a514495
Msg_Seq          : 0x0
res_seq          : 3
valblk           : 0x00000000000000000000000000000000 .
user session for deadlock lock 0xd87530a8
sid: 416 ser: 1053 audsid: 4294967295 user: 0/SYS
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 43 O/S info: user: oracle, term: UNKNOWN, ospid: 26255
image: oracle@rh2.oracle.com (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/3, ospid: 26254
machine: rh2.oracle.com program: sqlplus@rh2.oracle.com (TNS V1-V3)
application name: sqlplus@rh2.oracle.com (TNS V1-V3), hash value=3660119738
current SQL:
lock table lock2 in exclusive mode
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[43.26255] on resource TM-0001475A-00000000
*** 2011-06-21 21:57:42.215
Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
----------enqueue 0xd85b60e0------------------------
lock version     : 1
Owner inst       : 2
grant_level      : KJUSERNL
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9e62b48
procp            : 0xd9971780
pid              : 0
proc version     : 0
oprocp           : (nil)
opid             : 0
group lock owner : (nil)
xid              : 0000-0000-00000000
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     :  KJUSERNO_XID
Convert options  : KJUSERGETVALUE
History          : 0x97ad
Msg_Seq          : 0x1
res_seq          : 12
valblk           : 0xe0720660ff7f000020700660ff7f0000 .r` p`
Global blockers dump start:---------------------------------
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x1485a][0x0],[TM][ext 0x0,0x0]
----------resource 0xd9c5a7c8----------------------
resname       : [0x1485a][0x0],[TM][ext 0x0,0x0]
hash mask     : x3
Local inst    : 1
dir_inst      : 1
master_inst   : 1
hv idx        : 46
hv last r.inc : 56
current inc   : 56
hv status     : 0
hv master     : 0
open options  : dd cached
grant_bits    : KJUSERNL KJUSEREX
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
val_state     : KJUSERVS_NOVALUE
valblk        : 0x000000000a0000000a00000001000000 .
access_inst   : 2
vbreq_state   : 0
state         : x0
resp          : 0xd9c5a7c8
On Scan_q?    : N
Total accesses: 29
Imm.  accesses: 20
Granted_locks : 1
Cvting_locks  : 1
value_block:  00 00 00 00 0a 00 00 00 0a 00 00 00 01 00 00 00
GRANTED_Q :
lp 0xd85b5f20 gl KJUSEREX rp 0xd9c5a7c8 [0x1485a][0x0],[TM][ext 0x0,0x0]
master 1 owner 2  bast 0 rseq 15 mseq 0x2 history 0x977d8d
open opt  KJUSERNO_XID
CONVERT_Q:
lp 0xd8757ff8 gl KJUSERNL rl KJUSEREX rp 0xd9c5a7c8 [0x1485a][0x0],[TM][ext 0x0,0x0]
master 1 gl owner 0xda7233c0 possible pid 26255 xid 2B000-0001-00000576 bast 0 rseq 2 mseq 0 history 0xda51449a
convert opt KJUSERGETVALUE
----------enqueue 0xd85b5f20------------------------
lock version     : 1
Owner inst       : 2
grant_level      : KJUSEREX
req_level        : KJUSEREX
bast_level       : KJUSEREX
notify_func      : (nil)
resp             : 0xd9c5a7c8
procp            : 0xd9971780
pid              : 0
proc version     : 0
oprocp           : (nil)
opid             : 0
group lock owner : (nil)
xid              : 0000-0000-00000000
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     :  KJUSERNO_XID
Convert options  : KJUSERGETVALUE
History          : 0x977d8d
Msg_Seq          : 0x2
res_seq          : 15
valblk           : 0x00000000000000000000000000000000 .
----------enqueue 0xd8757ff8------------------------
lock version     : 10261
Owner inst       : 1
grant_level      : KJUSERNL
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9c5a7c8
procp            : 0xd99750b0
pid              : 26255
proc version     : 229
oprocp           : (nil)
opid             : 26255
group lock owner : 0xda7233c0
possible pid     : 26255
xid              : 2B000-0001-00000576
dd_time          : 10.0 secs
dd_count         : 1
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : Y
lock_state       : OPENING CONVERTING
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERGETVALUE
History          : 0xda51449a
Msg_Seq          : 0x0
res_seq          : 2
valblk           : 0x00000000000000000000000000000000 .
user session for deadlock lock 0xd8757ff8
sid: 416 ser: 1053 audsid: 4294967295 user: 0/SYS
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 43 O/S info: user: oracle, term: UNKNOWN, ospid: 26255
image: oracle@rh2.oracle.com (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/3, ospid: 26254
machine: rh2.oracle.com program: sqlplus@rh2.oracle.com (TNS V1-V3)
application name: sqlplus@rh2.oracle.com (TNS V1-V3), hash value=3660119738
current SQL:
lock table lock2 in exclusive mode
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[43.26255] on resource TM-0001485A-00000000
*** 2011-06-21 21:57:42.219
Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x1475a][0x0],[TM][ext 0x0,0x0]
----------resource 0xd9e62b48----------------------
resname       : [0x1475a][0x0],[TM][ext 0x0,0x0]
hash mask     : x3
Local inst    : 1
dir_inst      : 1
master_inst   : 1
hv idx        : 46
hv last r.inc : 56
current inc   : 56
hv status     : 0
hv master     : 0
open options  : dd cached
grant_bits    : KJUSERNL KJUSEREX
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
val_state     : KJUSERVS_NOVALUE
valblk        : 0x7c5b5c0900000000806a0660ff7f0000 |[\j`
access_inst   : 1
vbreq_state   : 0
state         : x0
resp          : 0xd9e62b48
On Scan_q?    : N
Total accesses: 32
Imm.  accesses: 26
Granted_locks : 1
Cvting_locks  : 1
value_block:  7c 5b 5c 09 00 00 00 00 80 6a 06 60 ff 7f 00 00
GRANTED_Q :
lp 0xd87530a8 gl KJUSEREX rp 0xd9e62b48 [0x1475a][0x0],[TM][ext 0x0,0x0]
master 1 gl owner 0xda7233c0 possible pid 26255 xid 2B000-0001-00000576 bast 0 rseq 3 mseq 0 history 0x9a514495
open opt KJUSERDEADLOCK
CONVERT_Q:
lp 0xd85b60e0 gl KJUSERNL rl KJUSEREX rp 0xd9e62b48 [0x1475a][0x0],[TM][ext 0x0,0x0]
master 1 owner 2  bast 0 rseq 12 mseq 0x1 history 0x97ad
convert opt KJUSERGETVALUE
----------enqueue 0xd87530a8------------------------
lock version     : 10199
Owner inst       : 1
grant_level      : KJUSEREX
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9e62b48
procp            : 0xd99750b0
pid              : 26255
proc version     : 229
oprocp           : (nil)
opid             : 26255
group lock owner : 0xda7233c0
possible pid     : 26255
xid              : 2B000-0001-00000576
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERGETVALUE
History          : 0x9a514495
Msg_Seq          : 0x0
res_seq          : 3
valblk           : 0x00000000ff7f000031000502ff7f0000 .1
user session for deadlock lock 0xd87530a8
sid: 416 ser: 1053 audsid: 4294967295 user: 0/SYS
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 43 O/S info: user: oracle, term: UNKNOWN, ospid: 26255
image: oracle@rh2.oracle.com (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/3, ospid: 26254
machine: rh2.oracle.com program: sqlplus@rh2.oracle.com (TNS V1-V3)
application name: sqlplus@rh2.oracle.com (TNS V1-V3), hash value=3660119738
current SQL:
lock table lock2 in exclusive mode
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[43.26255] on resource TM-0001475A-00000000
*** 2011-06-21 21:57:42.220
Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
----------enqueue 0xd85b60e0------------------------
lock version     : 1
Owner inst       : 2
grant_level      : KJUSERNL
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9e62b48
procp            : 0xd9971780
pid              : 0
proc version     : 0
oprocp           : (nil)
opid             : 0
group lock owner : (nil)
xid              : 0000-0000-00000000
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     :  KJUSERNO_XID
Convert options  : KJUSERGETVALUE
History          : 0x97ad
Msg_Seq          : 0x1
res_seq          : 12
valblk           : 0x406f0660ff7f0000806c0660ff7f0000 @o`l`
Global blockers dump end:-----------------------------------
Global Wait-For-Graph(WFG) at ddTS[0.1] :
BLOCKED 0xd8757ff8 5 wq 2 cvtops x1 TM 0x1485a.0x0(ext 0x0,0x0)[2B000-0001-00000576] inst 1
BLOCKER 0xd87520d0 5 wq 1 cvtops x1 TM 0x1485a.0x0(ext 0x0,0x0)[22000-0002-000010C6] inst 2
BLOCKED 0xd8766338 5 wq 2 cvtops x1 TM 0x1475a.0x0(ext 0x0,0x0)[22000-0002-000010C6] inst 2
BLOCKER 0xd87530a8 5 wq 1 cvtops x1 TM 0x1475a.0x0(ext 0x0,0x0)[2B000-0001-00000576] inst 1

TX DEADLOCK in RAC:

ENQUEUE DUMP REQUEST: from 2 spnum 12 on [0x10001][0x7b3],[TX][ext 0x2,0x0] for reason 3 mtype 0
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x10001][0x7b3],[TX][ext 0x2,0x0]
----------resource 0xd9c5a6a0----------------------
resname       : [0x10001][0x7b3],[TX][ext 0x2,0x0]
hash mask     : x3
Local inst    : 1
dir_inst      : 1
master_inst   : 1
hv idx        : 8
hv last r.inc : 56
current inc   : 56
hv status     : 0
hv master     : 0
open options  : dd
grant_bits    : KJUSERNL KJUSEREX
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
val_state     : KJUSERVS_NOVALUE
valblk        : 0x00000000000000000000000000000000 .
access_inst   : 1
vbreq_state   : 0
state         : x0
resp          : 0xd9c5a6a0
On Scan_q?    : N
Total accesses: 54
Imm.  accesses: 44
Granted_locks : 1
Cvting_locks  : 1
value_block:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
GRANTED_Q :
lp 0xd85b62a0 gl KJUSEREX rp 0xd9c5a6a0 [0x10001][0x7b3],[TX][ext 0x2,0x0]
master 1 gl owner 0xda7233c0 possible pid 26287 xid 2B000-0001-00000578 bast 0 rseq 8 mseq 0 history 0x95
open opt KJUSERDEADLOCK
CONVERT_Q:
lp 0xd85b6460 gl KJUSERNL rl KJUSEREX rp 0xd9c5a6a0 [0x10001][0x7b3],[TX][ext 0x2,0x0]
master 1 owner 2  bast 0 rseq 22 mseq 0x1 history 0x97ad
convert opt KJUSERGETVALUE
----------enqueue 0xd85b62a0------------------------
lock version     : 1
Owner inst       : 1
grant_level      : KJUSEREX
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9c5a6a0
procp            : 0xd996b090
pid              : 22148
proc version     : 0
oprocp           : (nil)
opid             : 22148
group lock owner : 0xda7233c0
possible pid     : 26287
xid              : 2B000-0001-00000578
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERNOQUEUE KJUSERNODEADLOCKWAIT
History          : 0x95
Msg_Seq          : 0x0
res_seq          : 8
valblk           : 0x00000000000000000000000000000000 .
user session for deadlock lock 0xd85b62a0
sid: 416 ser: 1055 audsid: 4294967295 user: 0/SYS
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 43 O/S info: user: oracle, term: UNKNOWN, ospid: 26287
image: oracle@rh2.oracle.com (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/3, ospid: 26286
machine: rh2.oracle.com program: sqlplus@rh2.oracle.com (TNS V1-V3)
application name: sqlplus@rh2.oracle.com (TNS V1-V3), hash value=3660119738
current SQL:
update lock1 set t1=t1+10 where t1=2
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[43.26287] on resource TX-00010001-000007B3
*** 2011-06-21 22:08:18.048
Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
----------enqueue 0xd85b6460------------------------
lock version     : 1
Owner inst       : 2
grant_level      : KJUSERNL
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9c5a6a0
procp            : 0xd9971780
pid              : 0
proc version     : 0
oprocp           : (nil)
opid             : 0
group lock owner : (nil)
xid              : 0000-0000-00000000
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     :  KJUSERNO_XID
Convert options  : KJUSERGETVALUE
History          : 0x97ad
Msg_Seq          : 0x1
res_seq          : 22
valblk           : 0xe0720660ff7f000020700660ff7f0000 .r` p`
Global blockers dump start:---------------------------------
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0xb0000][0xc8],[TX][ext 0x5,0x0]
----------resource 0xd9e626a8----------------------
resname       : [0xb0000][0xc8],[TX][ext 0x5,0x0]
hash mask     : x3
Local inst    : 1
dir_inst      : 2
master_inst   : 2
hv idx        : 28
hv last r.inc : 56
current inc   : 56
hv status     : 0
hv master     : 0
open options  : dd
Held mode     : KJUSERNL
Cvt mode      : KJUSEREX
Next Cvt mode : KJUSERNL
msg_seq       : 0x1
res_seq       : 2
grant_bits    : KJUSERNL
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         0
val_state     : KJUSERVS_NOVALUE
valblk        : 0x000000000a0000000a00000001000000 .
access_inst   : 2
vbreq_state   : 0
state         : x8
resp          : 0xd9e626a8
On Scan_q?    : N
Total accesses: 24
Imm.  accesses: 15
Granted_locks : 0
Cvting_locks  : 1
value_block:  00 00 00 00 0a 00 00 00 0a 00 00 00 01 00 00 00
GRANTED_Q :
CONVERT_Q:
lp 0xd8757ff8 gl KJUSERNL rl KJUSEREX rp 0xd9e626a8 [0xb0000][0xc8],[TX][ext 0x5,0x0]
master 2 gl owner 0xda7233c0 possible pid 26287 xid 2B000-0001-00000578 bast 0 rseq 2 mseq 0 history 0x495149da
convert opt KJUSERGETVALUE
----------enqueue 0xd8757ff8------------------------
lock version     : 11019
Owner inst       : 1
grant_level      : KJUSERNL
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9e626a8
procp            : 0xd99750b0
pid              : 26287
proc version     : 230
oprocp           : (nil)
opid             : 26287
group lock owner : 0xda7233c0
possible pid     : 26287
xid              : 2B000-0001-00000578
dd_time          : 10.0 secs
dd_count         : 1
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : Y
lock_state       : OPENING CONVERTING
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERGETVALUE
History          : 0x495149da
Msg_Seq          : 0x0
res_seq          : 2
valblk           : 0x00000000000000000000000000000000 .
user session for deadlock lock 0xd8757ff8
sid: 416 ser: 1055 audsid: 4294967295 user: 0/SYS
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 43 O/S info: user: oracle, term: UNKNOWN, ospid: 26287
image: oracle@rh2.oracle.com (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/3, ospid: 26286
machine: rh2.oracle.com program: sqlplus@rh2.oracle.com (TNS V1-V3)
application name: sqlplus@rh2.oracle.com (TNS V1-V3), hash value=3660119738
current SQL:
update lock1 set t1=t1+10 where t1=2
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[43.26287] on resource TX-000B0000-000000C8
*** 2011-06-21 22:08:18.051
Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x10001][0x7b3],[TX][ext 0x2,0x0]
----------resource 0xd9c5a6a0----------------------
resname       : [0x10001][0x7b3],[TX][ext 0x2,0x0]
hash mask     : x3
Local inst    : 1
dir_inst      : 1
master_inst   : 1
hv idx        : 8
hv last r.inc : 56
current inc   : 56
hv status     : 0
hv master     : 0
open options  : dd
grant_bits    : KJUSERNL KJUSEREX
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
val_state     : KJUSERVS_NOVALUE
valblk        : 0x7c5b5c0900000000806a0660ff7f0000 |[\j`
access_inst   : 1
vbreq_state   : 0
state         : x0
resp          : 0xd9c5a6a0
On Scan_q?    : N
Total accesses: 56
Imm.  accesses: 45
Granted_locks : 1
Cvting_locks  : 1
value_block:  7c 5b 5c 09 00 00 00 00 80 6a 06 60 ff 7f 00 00
GRANTED_Q :
lp 0xd85b62a0 gl KJUSEREX rp 0xd9c5a6a0 [0x10001][0x7b3],[TX][ext 0x2,0x0]
master 1 gl owner 0xda7233c0 possible pid 26287 xid 2B000-0001-00000578 bast 0 rseq 8 mseq 0 history 0x95
open opt KJUSERDEADLOCK
CONVERT_Q:
lp 0xd85b6460 gl KJUSERNL rl KJUSEREX rp 0xd9c5a6a0 [0x10001][0x7b3],[TX][ext 0x2,0x0]
master 1 owner 2  bast 0 rseq 22 mseq 0x1 history 0x97ad
convert opt KJUSERGETVALUE
----------enqueue 0xd85b62a0------------------------
lock version     : 1
Owner inst       : 1
grant_level      : KJUSEREX
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9c5a6a0
procp            : 0xd996b090
pid              : 22148
proc version     : 0
oprocp           : (nil)
opid             : 22148
group lock owner : 0xda7233c0
possible pid     : 26287
xid              : 2B000-0001-00000578
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERNOQUEUE KJUSERNODEADLOCKWAIT
History          : 0x95
Msg_Seq          : 0x0
res_seq          : 8
valblk           : 0x00000000ff7f000031000502ff7f0000 .1
user session for deadlock lock 0xd85b62a0
sid: 416 ser: 1055 audsid: 4294967295 user: 0/SYS
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 43 O/S info: user: oracle, term: UNKNOWN, ospid: 26287
image: oracle@rh2.oracle.com (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/3, ospid: 26286
machine: rh2.oracle.com program: sqlplus@rh2.oracle.com (TNS V1-V3)
application name: sqlplus@rh2.oracle.com (TNS V1-V3), hash value=3660119738
current SQL:
update lock1 set t1=t1+10 where t1=2
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[43.26287] on resource TX-00010001-000007B3
*** 2011-06-21 22:08:18.053
Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
----------enqueue 0xd85b6460------------------------
lock version     : 1
Owner inst       : 2
grant_level      : KJUSERNL
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9c5a6a0
procp            : 0xd9971780
pid              : 0
proc version     : 0
oprocp           : (nil)
opid             : 0
group lock owner : (nil)
xid              : 0000-0000-00000000
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     :  KJUSERNO_XID
Convert options  : KJUSERGETVALUE
History          : 0x97ad
Msg_Seq          : 0x1
res_seq          : 22
valblk           : 0x406f0660ff7f0000806c0660ff7f0000 @o`l`
Global blockers dump end:-----------------------------------
Global Wait-For-Graph(WFG) at ddTS[0.6] :
BLOCKED 0xd8757ff8 5 wq 2 cvtops x1 TX 0xb0000.0xc8(ext 0x5,0x0)[2B000-0001-00000578] inst 1
BLOCKER 0xd8561ee0 5 wq 1 cvtops x28 TX 0xb0000.0xc8(ext 0x5,0x0)[22000-0002-000010C6] inst 2
BLOCKED 0xd8766338 5 wq 2 cvtops x1 TX 0x10001.0x7b3(ext 0x2,0x0)[22000-0002-000010C6] inst 2
BLOCKER 0xd85b62a0 5 wq 1 cvtops x28 TX 0x10001.0x7b3(ext 0x2,0x0)[2B000-0001-00000578] inst 1 
*** 2011-06-21 22:08:19.059
* Cancel deadlock victim lockp 0xd8757ff8

TX DEADLOCK LOCAL only:

*** 2011-06-21 22:27:00.022
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x70015][0x81e],[TX][ext 0x2,0x0]
----------resource 0xd9e62330----------------------
resname       : [0x70015][0x81e],[TX][ext 0x2,0x0]
hash mask     : x3
Local inst    : 1
dir_inst      : 1
master_inst   : 1
hv idx        : 7
hv last r.inc : 42
current inc   : 56
hv status     : 0
hv master     : 1
open options  : dd
grant_bits    : KJUSERNL KJUSEREX
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
val_state     : KJUSERVS_NOVALUE
valblk        : 0x506b0660ff7f00006d6cf50400000000 Pk`ml
access_inst   : 1
vbreq_state   : 0
state         : x0
resp          : 0xd9e62330
On Scan_q?    : N
Total accesses: 23
Imm.  accesses: 15
Granted_locks : 1
Cvting_locks  : 1
value_block:  50 6b 06 60 ff 7f 00 00 6d 6c f5 04 00 00 00 00
GRANTED_Q :
lp 0xd8767a10 gl KJUSEREX rp 0xd9e62330 [0x70015][0x81e],[TX][ext 0x2,0x0]
master 1 gl owner 0xda2cff40 possible pid 26847 xid 2E000-0001-00000347 bast 0 rseq 1 mseq 0 history 0x14951495
open opt KJUSERDEADLOCK
CONVERT_Q:
lp 0xd876a630 gl KJUSERNL rl KJUSEREX rp 0xd9e62330 [0x70015][0x81e],[TX][ext 0x2,0x0]
master 1 gl owner 0xda7233c0 possible pid 26843 xid 2B000-0001-0000057A bast 0 rseq 1 mseq 0 history 0x1495149a
convert opt KJUSERGETVALUE
----------enqueue 0xd8767a10------------------------
lock version     : 8523
Owner inst       : 1
grant_level      : KJUSEREX
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9e62330
procp            : 0xd99750b0
pid              : 26843
proc version     : 231
oprocp           : (nil)
opid             : 26843
group lock owner : 0xda2cff40
possible pid     : 26847
xid              : 2E000-0001-00000347
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERNOQUEUE KJUSERNODEADLOCKWAIT
History          : 0x14951495
Msg_Seq          : 0x0
res_seq          : 1
valblk           : 0x00000000000000000000000000000000 .
user session for deadlock lock 0xd8767a10
sid: 16 ser: 851 audsid: 4294967295 user: 0/SYS
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 46 O/S info: user: oracle, term: UNKNOWN, ospid: 26847
image: oracle@rh2.oracle.com (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/3, ospid: 26846
machine: rh2.oracle.com program: sqlplus@rh2.oracle.com (TNS V1-V3)
application name: sqlplus@rh2.oracle.com (TNS V1-V3), hash value=3660119738
current SQL:
update lock1 set t1=t1+10 where t1=11
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[46.26847] on resource TX-00070015-0000081E
*** 2011-06-21 22:27:00.024
Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
----------enqueue 0xd876a630------------------------
lock version     : 9399
Owner inst       : 1
grant_level      : KJUSERNL
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9e62330
procp            : 0xd99750b0
pid              : 26843
proc version     : 231
oprocp           : (nil)
opid             : 26843
group lock owner : 0xda7233c0
possible pid     : 26843
xid              : 2B000-0001-0000057A
dd_time          : 10.0 secs
dd_count         : 1
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : Y
lock_state       : OPENING CONVERTING
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERGETVALUE
History          : 0x1495149a
Msg_Seq          : 0x0
res_seq          : 1
valblk           : 0xa06e0660ff7f0000e06b0660ff7f0000 .n`k`
user session for deadlock lock 0xd876a630
sid: 416 ser: 1057 audsid: 4294967295 user: 0/SYS
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 43 O/S info: user: oracle, term: UNKNOWN, ospid: 26843
image: oracle@rh2.oracle.com (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/4, ospid: 26842
machine: rh2.oracle.com program: sqlplus@rh2.oracle.com (TNS V1-V3)
application name: sqlplus@rh2.oracle.com (TNS V1-V3), hash value=3660119738
current SQL:
update lock1 set t1=t1+10 where t1=12
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[43.26843] on resource TX-00070015-0000081E
*** 2011-06-21 22:27:00.025
Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x40008][0x7d9],[TX][ext 0x2,0x0]
----------resource 0xd9e62208----------------------
resname       : [0x40008][0x7d9],[TX][ext 0x2,0x0]
hash mask     : x3
Local inst    : 1
dir_inst      : 1
master_inst   : 1
hv idx        : 53
hv last r.inc : 42
current inc   : 56
hv status     : 0
hv master     : 1
open options  : dd
grant_bits    : KJUSERNL KJUSEREX
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
val_state     : KJUSERVS_NOVALUE
valblk        : 0x7c5b5c0900000000e0690660ff7f0000 |[\i`
access_inst   : 1
vbreq_state   : 0
state         : x0
resp          : 0xd9e62208
On Scan_q?    : N
Total accesses: 92
Imm.  accesses: 82
Granted_locks : 1
Cvting_locks  : 1
value_block:  7c 5b 5c 09 00 00 00 00 e0 69 06 60 ff 7f 00 00
GRANTED_Q :
lp 0xd876a7f0 gl KJUSEREX rp 0xd9e62208 [0x40008][0x7d9],[TX][ext 0x2,0x0]
master 1 gl owner 0xda7233c0 possible pid 26843 xid 2B000-0001-0000057A bast 0 rseq 6 mseq 0 history 0x14951495
open opt KJUSERDEADLOCK
CONVERT_Q:
lp 0xd876ab70 gl KJUSERNL rl KJUSEREX rp 0xd9e62208 [0x40008][0x7d9],[TX][ext 0x2,0x0]
master 1 gl owner 0xda2cff40 possible pid 26847 xid 2E000-0001-00000347 bast 0 rseq 6 mseq 0 history 0x1495149a
convert opt KJUSERGETVALUE
----------enqueue 0xd876a7f0------------------------
lock version     : 6107
Owner inst       : 1
grant_level      : KJUSEREX
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9e62208
procp            : 0xd9978428
pid              : 26847
proc version     : 238
oprocp           : (nil)
opid             : 26847
group lock owner : 0xda7233c0
possible pid     : 26843
xid              : 2B000-0001-0000057A
dd_time          : 0.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : N
lock_state       : GRANTED
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERNOQUEUE KJUSERNODEADLOCKWAIT
History          : 0x14951495
Msg_Seq          : 0x0
res_seq          : 6
valblk           : 0x00000000ff7f000031000502ff7f0000 .1
user session for deadlock lock 0xd876a7f0
sid: 416 ser: 1057 audsid: 4294967295 user: 0/SYS
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 43 O/S info: user: oracle, term: UNKNOWN, ospid: 26843
image: oracle@rh2.oracle.com (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/4, ospid: 26842
machine: rh2.oracle.com program: sqlplus@rh2.oracle.com (TNS V1-V3)
application name: sqlplus@rh2.oracle.com (TNS V1-V3), hash value=3660119738
current SQL:
update lock1 set t1=t1+10 where t1=12
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[43.26843] on resource TX-00040008-000007D9
*** 2011-06-21 22:27:00.029
Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
----------enqueue 0xd876ab70------------------------
lock version     : 3827
Owner inst       : 1
grant_level      : KJUSERNL
req_level        : KJUSEREX
bast_level       : KJUSERNL
notify_func      : (nil)
resp             : 0xd9e62208
procp            : 0xd9978428
pid              : 26847
proc version     : 238
oprocp           : (nil)
opid             : 26847
group lock owner : 0xda2cff40
possible pid     : 26847
xid              : 2E000-0001-00000347
dd_time          : 5.0 secs
dd_count         : 0
timeout          : 0.0 secs
On_timer_q?      : N
On_dd_q?         : Y
lock_state       : OPENING CONVERTING
ast_flag         : 0x0
Open Options     : KJUSERDEADLOCK
Convert options  : KJUSERGETVALUE
History          : 0x1495149a
Msg_Seq          : 0x0
res_seq          : 6
valblk           : 0xa06e0660ff7f0000e06b0660ff7f0000 .n`k`
user session for deadlock lock 0xd876ab70
sid: 16 ser: 851 audsid: 4294967295 user: 0/SYS
flags: (0x41) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40008) -/-
pid: 46 O/S info: user: oracle, term: UNKNOWN, ospid: 26847
image: oracle@rh2.oracle.com (TNS V1-V3)
client details:
O/S info: user: oracle, term: pts/3, ospid: 26846
machine: rh2.oracle.com program: sqlplus@rh2.oracle.com (TNS V1-V3)
application name: sqlplus@rh2.oracle.com (TNS V1-V3), hash value=3660119738
current SQL:
update lock1 set t1=t1+10 where t1=11
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK
possible owner[46.26847] on resource TX-00040008-000007D9
*** 2011-06-21 22:27:00.031
Submitting asynchronized dump request [28]. summary=[ges process stack dump (kjdglblkrdm1)].
Global blockers dump end:-----------------------------------
Global Wait-For-Graph(WFG) at ddTS[0.db] :
BLOCKED 0xd876a630 5 wq 2 cvtops x1 TX 0x70015.0x81e(ext 0x2,0x0)[2B000-0001-0000057A] inst 1
BLOCKER 0xd8767a10 5 wq 1 cvtops x28 TX 0x70015.0x81e(ext 0x2,0x0)[2E000-0001-00000347] inst 1
BLOCKED 0xd876ab70 5 wq 2 cvtops x1 TX 0x40008.0x7d9(ext 0x2,0x0)[2E000-0001-00000347] inst 1
BLOCKER 0xd876a7f0 5 wq 1 cvtops x28 TX 0x40008.0x7d9(ext 0x2,0x0)[2B000-0001-0000057A] inst 1

Summary Of  Bugs Which Could Cause Deadlock In RAC Environment

Applies to:

Oracle Server - Enterprise Edition - Version: 9.2.0.4 to 11.1 Information in this document applies to any platform.

Purpose

The purpose of this Note is to explain various bugs filed specifically for the Dead lock errors in a RAC environment  against specific Oracle database versions (This Note covers bugs reported versions  above 9.2.0.4), and explain the symptoms of each bug, workarounds if any and references the patch availability at the time this article was written.

Scope

This article is a consolidated effort to summarisze top bugs reported specifically for the Dead lock  error in RAC environment  (This Note covers bugs reported versions above 9.2.0.4) which have been fixed. It is directed towards Oracle Support Analysts and Oracle Customers to have an overview of various bugs logged for the same error .

Summary Of Bugs Which Could Cause Deadlock In RAC Environment

Bugs Fixed in Version 9.2.0.5 Note 2941738.8 Bug 2941738   SMON processes may deadlock in RAC Note 2902030.8 Bug 2902030   Deadlocks in RAC not logged in alert log Bugs Fixed in Version 9.2.0.6 10.1.0.4 10.2.0.1 Note 3268802.8 Bug 3268802 Additional diagnostics for deadlock in RAC environment Note.3646162.8 Bug 3646162 False deadlock (ORA-60) in a RAC environment / TM lock mode change Note.3627263.8 Bug 3627263 DEADLOCK OPS STARTUP Deadlock / hang during RAC instance startup Bugs Fixed in Version 9.2.0.7 10.2.0.1 Note.3992847.8 Bug 3992847 Deadlocks are not detected in rac when one node rebooted more recently than the rest Note 3641819.8 Bug 3641819  Undetected deadlock possible in RAC Note 3777178.8 Bug 3777178  TA / US enqueue deadlock during transaction recovery in RAC Note 4220161.8 Bug 4220161 OPS Deadlock between SMON processes on different instances Bug Fixed in Version  9.2.0.8 10.2.0.1 Note 4371923.8 Bug 4371923 SMON may deadlock on TX enqueue waits for updates to COL_USAGE$ in RAC Bug Fixed in  Version 10.2.0.2 Note 4579381.8 Bug 4579381  Deadlock on DC_USERS in RAC (ORA-4020) Bug Fixed in Version 10.2.0.3 Note 5012368.8 Bug 5012368 Undetected deadlock in RAC  Note 4913415.8 Bug 4913415 Global deadlock not reported in RAC Bug Fixed in Version 10.2.0.4 Note 5470095.8 Bug 5470095  Self deadlock should provide more targeted diagnostics Note 5454831.8 Bug 5454831  deadlock possible on working set latches Note 5334733.8 Bug 5334733  Deadlock resolution can be slow in RAC  Note 4441119.8 Bug 4441119 Not enough information dumped when RAC detects a deadlock  Note 5883112.8 Bug 5883112 False deadlock in RAC Bug Fixed in Version 10.2.0.5 Note 6145177.8 Bug 6145177 Single resource deadlock with a zero DID For summary of bugs which could cause deadlock in single instance see Note 554616.1

沪ICP备14014813号

沪公网安备 31010802001379号