GoldenGate实现Live Standby主备库切换(1)

Oracle Goldengate目前支持主被动式的双向配置,换而言之OGG可以将来自于激活的主库的数据变化完全复制到从库中,从库在不断同步数据的同时已经为计划内的和计划外的outages做好了故障切换的准备,也就是我们说的Live Standby。这里我们重点介绍一下配置Oracle Goldengate Live Standby系统的步骤,和具体的故障切换过程。

SQL> conn clinic/clinic
Connected.
SQL> drop table tv;
create table tv (t1 int primary key,t2 int,t3 varchar2(30));
Table dropped.

SQL> 

Table created.

SQL> drop sequence seqt1;

create sequence seqt1 start with 1 increment by 1;
Sequence dropped.

SQL> SQL>
Sequence created.

declare
  rnd number(9,2);
begin
   for i in 1..100000 loop
     insert into tv values(seqt1.nextval,i*dbms_random.value,'MACLEAN IS TESTING');
     commit;
   end loop;
end;
/

/* 以上脚本在primary主库的某个应用账户下创建了测试用的数据,
    接着我们可以使用各种工具将数据初始化到从库中,如果在这个过程中
    希望实时在线数据迁移的话,可以参考《Goldengate实现在线数据迁移》
*/

/* 注意我们在Live Standby的环境中往往需要复制sequence序列,以保证切换到备库时业务可以正常进行  */

/* 初始化备库数据后,确保已与主库完全一致 */
primary :
SQL> select sum(t2) from tv;

   SUM(T2)
----------
2498624495

SQL> select last_number from user_sequences;

LAST_NUMBER
-----------
     100001

standby:
SQL> select sum(t2) from tv;

   SUM(T2)
----------
2498624495

SQL> select last_number from user_sequences;

LAST_NUMBER
-----------
     100001

以上完成准备工作后,我们可以进入到正式配置Goldengate live stanby的阶段,包括以下步骤:

  1. 配置由主库到备库的extract、replicat、data pump,该步骤同普通的单向复制没有太大的区别
  2. 配置由备库到主库的extract、replicat、data pump
  3. 启动由主库到备库的extract、replicat、data pump

接下来我们会实践整个配置过程:

1.
创建由主库到备库的extract、data pump、replicat
GGSCI (rh2.oracle.com) 10> dblogin userid maclean
Password: 
Successfully logged into database.
GGSCI (rh2.oracle.com) 11> add trandata clinic.*
Logging of supplemental redo data enabled for table CLINIC.TV
GGSCI (rh2.oracle.com) 4> add extract extstd1,tranlog,begin now
EXTRACT added.
GGSCI (rh2.oracle.com) 5> add exttrail /d01/ext/cl,megabytes 100,extract extstd1
EXTTRAIL added.
GGSCI (rh2.oracle.com) 7> view params extstd1
-- Identify the Extract group:
EXTRACT extstd1
-- Specify database login information as needed for the database:
userid maclean, password maclean
-- Specify the local trail that this Extract writes to:
EXTTRAIL /d01/ext/cl
-- Specify sequences to be captured:
SEQUENCE clinic.seqt1;
-- Specify tables to be captured:
TABLE clinic.*;
-- Exclude specific tables from capture if needed:
-- TABLEEXCLUDE 
GGSCI (rh2.oracle.com) 17> add extract pumpstd1,exttrailsource /d01/ext/cl,begin now
EXTRACT added.
GGSCI (rh2.oracle.com) 98> add rmttrail /d01/rmt/cl,megabytes 100,extract pumpstd1
RMTTRAIL added.
GGSCI (rh2.oracle.com) 129> view params pumpstd1
-- Identify the data pump group:
EXTRACT pumpstd1
userid maclean, password maclean
-- Specify database login information as needed for the database:
userid maclean, password maclean
RMTHOST rh3.oracle.com, MGRPORT 7809
-- Specify the remote trail on the standby system:
RMTTRAIL /d01/rmt/cl
-- Pass data through without mapping, filtering, conversion:
PASSTHRU
sequence clinic.seqt1;
Table clinic.*;
在备库上配置由主库到备库的replicat:
GGSCI (rh3.oracle.com) 4> add replicat repstd1,exttrail /d01/rmt/cl,begin now
REPLICAT added.
GGSCI (rh3.oracle.com) 49> view params repstd1
-- Identify the Replicat group:
REPLICAT repstd1
-- State that source and target definitions are identical:
ASSUMETARGETDEFS
-- Specify database login information as needed for the database:
userid maclean, password maclean
-- Specify tables for delivery:
MAP clinic.*, TARGET clinic.*;
-- Exclude specific tables from delivery if needed:
-- MAPEXCLUDE 
2.
创建由备库到主库的extract、data pump、replicat
GGSCI (rh3.oracle.com) 51> dblogin userid maclean
Password: 
Successfully logged into database.
GGSCI (rh3.oracle.com) 52> add trandata clinic.*
Logging of supplemental redo data enabled for table CLINIC.TV.
/* 不要忘记在备库端的相关表加上追加日志 */
GGSCI (rh3.oracle.com) 53> add extract extstd2,tranlog,begin now
EXTRACT added.
GGSCI (rh3.oracle.com) 54> add exttrail /d01/ext/cl,megabytes 100,extract extstd2
EXTTRAIL added.
GGSCI (rh3.oracle.com) 58> view params extstd2
-- Identify the Extract group:
EXTRACT extstd2
-- Specify database login information as needed for the database:
userid maclean, password maclean
-- Specify the local trail that this Extract writes to:
EXTTRAIL /d01/ext/cl
-- Specify sequences to be captured:
SEQUENCE clinic.seqt1;
-- Specify tables to be captured:
TABLE clinic.*;
-- Exclude specific tables from capture if needed:
-- TABLEEXCLUDE 
GGSCI (rh3.oracle.com) 59> add extract pumpstd2,exttrailsource /d01/ext/cl,begin now
EXTRACT added.
GGSCI (rh3.oracle.com) 60> add rmttrail /d01/rmt/cl,megabytes 100,extract pumpstd2
RMTTRAIL added.
GGSCI (rh3.oracle.com) 63> view params pumpstd2
-- Identify the data pump group:
EXTRACT pumpstd2
userid maclean, password maclean
-- Specify database login information as needed for the database:
userid maclean, password maclean
RMTHOST rh2.oracle.com, MGRPORT 7809
-- Specify the remote trail on the standby system:
RMTTRAIL /d01/rmt/cl
-- Pass data through without mapping, filtering, conversion:
PASSTHRU
sequence clinic.seqt1;
Table clinic.*;
在主库上配置replicat:
GGSCI (rh2.oracle.com) 136> add replicat repstd2,exttrail /d01/rmt/cl,begin now,checkpointtable maclean.ck
REPLICAT added.
GGSCI (rh2.oracle.com) 138> view params repstd2
-- Identify the Replicat group:
REPLICAT repstd2
-- State that source and target definitions are identical:
ASSUMETARGETDEFS
-- Specify database login information as needed for the database:
userid maclean, password maclean
-- Specify tables for delivery:
MAP clinic.*, TARGET clinic.*;
-- Exclude specific tables from delivery if needed:
-- MAPEXCLUDE 
3.
完成以上OGG配置后,可以启动主库到备库的extract、pump、以及replicat:
GGSCI (rh2.oracle.com) 141> start extstd1
Sending START request to MANAGER ...
EXTRACT EXTSTD1 starting
GGSCI (rh2.oracle.com) 142> start pumpstd1
Sending START request to MANAGER ...
EXTRACT PUMPSTD1 starting
GGSCI (rh3.oracle.com) 70> start repstd1
Sending START request to MANAGER ...
REPLICAT REPSTD1 starting
/* 如果你是在offline状态下配置的话,那么此时可以启用应用了*/

接下来我们尝试做有计划的主备库切换演练:

1.
首先停止一切在主库上的应用,这一点和DataGuard Switchover一样。在保证没有活动事务的情况下,才能切换干净。
2.
在主库端使用LAG等命令了解extract的延迟,若返回如"At EOF, no more records to process"的信息,则说明所有事务均已被抽取。
GGSCI (rh2.oracle.com) 144> lag extstd1
Sending GETLAG request to EXTRACT EXTSTD1 ...
Last record lag: 0 seconds.
At EOF, no more records to process.
在EOF的前提下关闭extract:
GGSCI (rh2.oracle.com) 146> stop extstd1 
Sending STOP request to EXTRACT EXTSTD1 ...
Request processed.
3.
同样对pump使用LAG命令,若返回如"At EOF, no more records to process"的信息,则说明已抽取的数据都被发送到备库了。
GGSCI (rh2.oracle.com) 147> lag pumpstd1
Sending GETLAG request to EXTRACT PUMPSTD1 ...
Last record lag: 3 seconds.
At EOF, no more records to process.
在EOF的前提下,关闭data pump
GGSCI (rh2.oracle.com) 148> stop pumpstd1
Sending STOP request to EXTRACT PUMPSTD1 ...
Request processed.
3.
检查备库端replicat的同步情况,如返回"At EOF, no more records to process.",则说明所有记录均被复制。
GGSCI (rh3.oracle.com) 71> lag repstd1
Sending GETLAG request to REPLICAT REPSTD1 ...
Last record lag: 5 seconds.
At EOF, no more records to process.
在EOF的前提下关闭replicat
GGSCI (rh3.oracle.com) 72> stop repstd1
Sending STOP request to REPLICAT REPSTD1 ...
Request processed.
4.
紧接着我们可以在备库上为业务应用用户赋予必要的insert、update、delete权限,启用各种触发器trigger及cascade delete约束等;
以上手段在主库上对应的操作是收回应用业务的权限,disable掉各种触发器及cascade delete约束,
之所以这样做是为了保证在任何时候扮演备库角色的数据库均不应当接受任何除了OGG外的手动的或者应用驱动的业务数据变更,
以保证主备库间的数据一致。
5.
修改原备库上的extract的启动时间到现在,已保证它不去抽取那些之前的重做日志
GGSCI (rh3.oracle.com) 75> alter extstd2 ,begin now
EXTRACT altered.
GGSCI (rh3.oracle.com) 76> start extstd2
Sending START request to MANAGER ...
EXTRACT EXTSTD2 starting
若之前没有启动由备库到主库的pump和replicat的话可以在此时启动:
GGSCI (rh3.oracle.com) 78> start pumpstd2
Sending START request to MANAGER ...
EXTRACT PUMPSTD2 starting
GGSCI (rh2.oracle.com) 161> start repstd2
Sending START request to MANAGER ...
REPLICAT REPSTD2 starting
6.此时我们可以正式启动在原备库现在的主库上的应用了
接下来我们尝试回切到原主库上:
1.前提步骤与之前的切换相似,首先停止在原备库上的任何应用,
之后使用LAG命令确认extract和replicat的进度,在确认后关闭extract和replicat。
完成在主库上的维护工作:包括赋予权限,启用触发器等等。
2.修改原主库上的extract的开始时间为当前,保证它不去处理之前的重做日志:
GGSCI (rh2.oracle.com) 165> alter extract extstd1,begin now
EXTRACT altered.
3.此时我们已经可以启动在原主库现在的主库上的应用了
4.启动最早配置的由主库到备库的extract、pump、replicat:
GGSCI (rh2.oracle.com) 166> start extstd1
Sending START request to MANAGER ...
EXTRACT EXTSTD1 starting
GGSCI (rh2.oracle.com) 171> start pumpstd1
Sending START request to MANAGER ...
EXTRACT PUMPSTD1 starting
GGSCI (rh3.oracle.com) 86> start repstd1
Sending START request to MANAGER ...
REPLICAT REPSTD1 starting
以上完成了OGG的Live Standby中主备库之间的计划内的切换Switchover,That's Great!

ARCHIVER ERROR ORA-00354: CORRUPT REDO LOG BLOCK HEADER

Problem Description:
ORA-16038: log 2 sequence# 13831 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 2 thread 1: ‘/oradata/3/TOOLS/stdby_redo/srl1.log’

LOG FILE
---------------
Filename = alert_TOOLS5_from_1021.log
See ...
...
Wed Oct 28 11:41:59 2009
Primary database is in MAXIMUM AVAILABILITY mode
Standby controlfile consistent with primary
RFS[1]: Successfully opened standby log 1: '/oradata/3/TOOLS/stdby_redo/srl0.log'
Wed Oct 28 11:42:00 2009
ARC0: Log corruption near block 604525 change 10551037679542 time ?
Wed Oct 28 11:42:00 2009
Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc0_2143.trc:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 604525 change 10551037679542 time 10/28/2009 11:29:50
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
ARC0: All Archive destinations made inactive due to error 354
Wed Oct 28 11:42:00 2009
ARC0: Closing local archive destination LOG_ARCHIVE_DEST_2: '/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc' (error 354)
(TOOLS)
Committing creation of archivelog '/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc' (error 354)
ARCH: Archival stopped, error occurred. Will continue retrying
Wed Oct 28 11:42:05 2009
ORACLE Instance TOOLS - Archival Error
Wed Oct 28 11:42:05 2009
ORA-16038: log 2 sequence# 13831 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
Wed Oct 28 11:42:05 2009
Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc0_2143.trc:
ORA-16038: log 2 sequence# 13831 cannot be archived
ORA-00354: corrupt redo log block header
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
Wed Oct 28 11:43:04 2009
ARCH: Archival stopped, error occurred. Will continue retrying
Wed Oct 28 11:43:04 2009
ORACLE Instance TOOLS - Archival Error
Wed Oct 28 11:43:04 2009
Primary database is in MAXIMUM AVAILABILITY mode
Changing standby controlfile to RESYNCHRONIZATION level
Wed Oct 28 11:43:04 2009
ORA-16014: log 1 sequence# 13832 not archived, no available destinations
ORA-00312: online log 1 thread 1: '/oradata/3/TOOLS/stdby_redo/srl0.log'
Wed Oct 28 11:43:04 2009
Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc1_2145.trc:
ORA-16014: log 1 sequence# 13832 not archived, no available destinations
ORA-00312: online log 1 thread 1: '/oradata/3/TOOLS/stdby_redo/srl0.log'
RFS[1]: Successfully opened standby log 2: '/oradata/3/TOOLS/stdby_redo/srl1.log'
Wed Oct 28 11:43:13 2009
RFS[3]: Archived Log: '/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc'
Wed Oct 28 11:43:14 2009
RFS LogMiner: Registered logfile [/oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc] to LogMiner session id [4]
Wed Oct 28 11:43:15 2009
LOGMINER: Begin mining logfile for session 4 thread 1 sequence 13831, /oradata/3/TOOLS/archive/dgarc/1_13831_635534096.arc
Wed Oct 28 11:44:03 2009
RFS[3]: Archived Log: '/oradata/3/TOOLS/archive/dgarc/1_13832_635534096.arc'
...
LOG FILE
---------------
Filename = alert_TOOLS6_from_1021.log
See ...
...
Wed Oct 28 11:16:01 2009
Thread 1 advanced to log sequence 13830 (LGWR switch)
Current log# 8 seq# 13830 mem# 0: /oradata/1/redo/TOOLS/redo1a.log
Current log# 8 seq# 13830 mem# 1: /oradata/2/redo/TOOLS/redo1b.log
Current log# 8 seq# 13830 mem# 2: /oradata/3/redo/TOOLS/redo1c.log
Wed Oct 28 11:29:50 2009
LGWR: Standby redo logfile selected to archive thread 1 sequence 13831
LGWR: Standby redo logfile selected for thread 1 sequence 13831 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:29:50 2009
Thread 1 advanced to log sequence 13831 (LGWR switch)
Current log# 9 seq# 13831 mem# 0: /oradata/1/redo/TOOLS/redo2a.log
Current log# 9 seq# 13831 mem# 1: /oradata/2/redo/TOOLS/redo2b.log
Current log# 9 seq# 13831 mem# 2: /oradata/3/redo/TOOLS/redo2c.log
Wed Oct 28 11:41:59 2009
LGWR: Standby redo logfile selected to archive thread 1 sequence 13832
LGWR: Standby redo logfile selected for thread 1 sequence 13832 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:41:59 2009
Thread 1 advanced to log sequence 13832 (LGWR switch)
Current log# 10 seq# 13832 mem# 0: /oradata/1/redo/TOOLS/redo3a.log
Current log# 10 seq# 13832 mem# 1: /oradata/2/redo/TOOLS/redo3b.log
Current log# 10 seq# 13832 mem# 2: /oradata/3/redo/TOOLS/redo3c.log
Wed Oct 28 11:43:04 2009
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 13833
LGWR: Standby redo logfile selected for thread 1 sequence 13833 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:43:04 2009
Thread 1 advanced to log sequence 13833 (LGWR switch)
Current log# 11 seq# 13833 mem# 0: /oradata/1/redo/TOOLS/redo4a.log
Current log# 11 seq# 13833 mem# 1: /oradata/2/redo/TOOLS/redo4b.log
Current log# 11 seq# 13833 mem# 2: /oradata/3/redo/TOOLS/redo4c.log
Wed Oct 28 11:45:04 2009
Destination LOG_ARCHIVE_DEST_2 is SYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 13834
LGWR: Standby redo logfile selected for thread 1 sequence 13834 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:45:05 2009
Thread 1 advanced to log sequence 13834 (LGWR switch)
Current log# 8 seq# 13834 mem# 0: /oradata/1/redo/TOOLS/redo1a.log
Current log# 8 seq# 13834 mem# 1: /oradata/2/redo/TOOLS/redo1b.log
Current log# 8 seq# 13834 mem# 2: /oradata/3/redo/TOOLS/redo1c.log
Wed Oct 28 11:46:03 2009
Thread 1 cannot allocate new log, sequence 13835
Checkpoint not complete
Current log# 8 seq# 13834 mem# 0: /oradata/1/redo/TOOLS/redo1a.log
Current log# 8 seq# 13834 mem# 1: /oradata/2/redo/TOOLS/redo1b.log
Current log# 8 seq# 13834 mem# 2: /oradata/3/redo/TOOLS/redo1c.log
Wed Oct 28 11:46:10 2009
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 13835
LGWR: Standby redo logfile selected for thread 1 sequence 13835 for destination LOG_ARCHIVE_DEST_2
Wed Oct 28 11:46:11 2009
Thread 1 advanced to log sequence 13835 (LGWR switch)
Current log# 9 seq# 13835 mem# 0: /oradata/1/redo/TOOLS/redo2a.log
Current log# 9 seq# 13835 mem# 1: /oradata/2/redo/TOOLS/redo2b.log
Current log# 9 seq# 13835 mem# 2: /oradata/3/redo/TOOLS/redo2c.log
Wed Oct 28 11:48:03 2009
Thread 1 cannot allocate new log, sequence 13836
Checkpoint not complete
Current log# 9 seq# 13835 mem# 0: /oradata/1/redo/TOOLS/redo2a.log
Current log# 9 seq# 13835 mem# 1: /oradata/2/redo/TOOLS/redo2b.log
Current log# 9 seq# 13835 mem# 2: /oradata/3/redo/TOOLS/redo2c.log
Wed Oct 28 11:48:06 2009
...
From the standby, as at 2009-10-28, 11:42, when the archiver tried to archive the standby
redo logfile. it encountered this error:
ORA-00354: corrupt redo log block header
ORA-00353: log corruption near block 604525 change 10551037679542 time 10/28/2009 11:29:50
ORA-00312: online log 2 thread 1: '/oradata/3/TOOLS/stdby_redo/srl1.log'
Errors in file /tools/oracle/admin/TOOLS/bdump/tools_arc0_2143.trc

The real logfile is retrieved from primary by the standby RFS process, then the log apply continue as usual.
The fact that the standby redo logs are corrupted and identified as corrupt by the ARC process , makes it clear that there could be some sort of I/O errors which has caused.
Reviewing the alert.log file it is clear that the RFS process fetched the new copy of the file which is corrupted and the issue has been resolved.
This is more an issue to be concentrated from the system adminisration end to determine in case there are any issues at the I.O subsystem
.

list some Script to Collect Data Guard Primary Site Diagnostic Information:

Overview
——–
This script is intended to provide an easy method to provide information
necessary to troubleshoot Data Guard issues.

Script Notes
————-
This script is intended to be run via sqlplus as the SYS or Internal user.

Script
——-
– – – – – – – – – – – – – – – – Script begins here – – – – – – – – – – – – – – – –

— NAME: dg_prim_diag.sql (Run on PRIMARY with a LOGICAL or PHYSICAL STANDBY)
— ————————————————————————
— Copyright 2002, Oracle Corporation
— LAST UPDATED: 2/23/04

— Usage: @dg_prim_diag
— ————————————————————————
— PURPOSE:
— This script is to be used to assist in collection information to help
— troubeshoot Data Guard issues with an emphasis on Logical Standby.
— ————————————————————————
— DISCLAIMER:
— This script is provided for educational purposes only. It is NOT
— supported by Oracle World Wide Technical Support.
— The script has been tested and appears to work as intended.
— You should always run new scripts on a test instance initially.
— ————————————————————————
— Script output is as follows:

set echo off
set feedback off
column timecol new_value timestamp
column spool_extension new_value suffix
select to_char(sysdate,’Mondd_hhmi’) timecol,
‘.out’ spool_extension from sys.dual;
column output new_value dbname
select value || ‘_’ output
from v$parameter where name = ‘db_name’;
spool dg_prim_diag_&&dbname&&timestamp&&suffix
set linesize 79
set pagesize 35
set trim on
set trims on
alter session set nls_date_format = ‘MON-DD-YYYY HH24:MI:SS’;
set feedback on
select to_char(sysdate) time from dual;

set echo on

— In the following the database_role should be primary as that is what
— this script is intended to be run on. If protection_level is different
— than protection_mode then for some reason the mode listed in
— protection_mode experienced a need to downgrade. Once the error
— condition has been corrected the protection_level should match the
— protection_mode after the next log switch.

column role format a7 tru
column name format a10 wrap

select name,database_role role,log_mode,
protection_mode,protection_level
from v$database;

— ARCHIVER can be (STOPPED | STARTED | FAILED). FAILED means that the
— archiver failed to archive a log last time, but will try again within 5
— minutes. LOG_SWITCH_WAIT The ARCHIVE LOG/CLEAR LOG/CHECKPOINT event log
— switching is waiting for. Note that if ALTER SYSTEM SWITCH LOGFILE is
— hung, but there is room in the current online redo log, then value is
— NULL

column host_name format a20 tru
column version format a9 tru

select instance_name,host_name,version,archiver,log_switch_wait
from v$instance;

— The following query give us information about catpatch.
— This way we can tell if the procedure doesn’t match the image.

select version, modified, status from dba_registry
where comp_id = ‘CATPROC’;

— Force logging is not mandatory but is recommended. Supplemental
— logging must be enabled if the standby associated with this primary is
— a logical standby. During normal operations it is acceptable for
— SWITCHOVER_STATUS to be SESSIONS ACTIVE or TO STANDBY.

column force_logging format a13 tru
column remote_archive format a14 tru
column dataguard_broker format a16 tru

select force_logging,remote_archive,
supplemental_log_data_pk,supplemental_log_data_ui,
switchover_status,dataguard_broker
from v$database;

— This query produces a list of all archive destinations. It shows if
— they are enabled, what process is servicing that destination, if the
— destination is local or remote, and if remote what the current mount ID
— is.

column destination format a35 wrap
column process format a7
column archiver format a8
column ID format 99
column mid format 99

select dest_id “ID”,destination,status,target,
schedule,process,mountid mid
from v$archive_dest order by dest_id;

— This select will give further detail on the destinations as to what
— options have been set. Register indicates whether or not the archived
— redo log is registered in the remote destination control file.

set numwidth 8
column ID format 99

select dest_id “ID”,archiver,transmit_mode,affirm,async_blocks async,
net_timeout net_time,delay_mins delay,reopen_secs reopen,
register,binding
from v$archive_dest order by dest_id;

— The following select will show any errors that occured the last time
— an attempt to archive to the destination was attempted. If ERROR is
— blank and status is VALID then the archive completed correctly.

column error format a55 wrap

select dest_id,status,error from v$archive_dest;

— The query below will determine if any error conditions have been
— reached by querying the v$dataguard_status view (view only available in
— 9.2.0 and above):

column message format a80

select message, timestamp
from v$dataguard_status
where severity in (‘Error’,’Fatal’)
order by timestamp;

— The following query will determine the current sequence number
— and the last sequence archived. If you are remotely archiving
— using the LGWR process then the archived sequence should be one
— higher than the current sequence. If remotely archiving using the
— ARCH process then the archived sequence should be equal to the
— current sequence. The applied sequence information is updated at
— log switch time.

select ads.dest_id,max(sequence#) “Current Sequence”,
max(log_sequence) “Last Archived”
from v$archived_log al, v$archive_dest ad, v$archive_dest_status ads
where ad.dest_id=al.dest_id
and al.dest_id=ads.dest_id
group by ads.dest_id;

— The following select will attempt to gather as much information as
— possible from the standby. SRLs are not supported with Logical Standby
— until Version 10.1.

set numwidth 8
column ID format 99
column “SRLs” format 99
column Active format 99

select dest_id id,database_mode db_mode,recovery_mode,
protection_mode,standby_logfile_count “SRLs”,
standby_logfile_active ACTIVE,
archived_seq#
from v$archive_dest_status;

— Query v$managed_standby to see the status of processes involved in
— the shipping redo on this system. Does not include processes needed to
— apply redo.

select process,status,client_process,sequence#
from v$managed_standby;

— The following query is run on the primary to see if SRL’s have been
— created in preparation for switchover.

select group#,sequence#,bytes from v$standby_log;

— The above SRL’s should match in number and in size with the ORL’s
— returned below:

select group#,thread#,sequence#,bytes,archived,status from v$log;

— Non-default init parameters.

set numwidth 5
column name format a30 tru
column value format a48 wra
select name, value
from v$parameter
where isdefault = ‘FALSE’;

spool off

– – – – – – – – – – – – – – – – Script ends here – – – – – – – – – – – – – – – –

another one:

Overview
——–

This script is intended to provide an easy method to provide information
necessary to troubleshoot Data Guard issues.

Script Notes
————-

This script is intended to be run via sqlplus as the SYS or Internal user.

Script
——-

– – – – – – – – – – – – – – – – Script begins here – – – – – – – – – – – – – – – –

— NAME: DG_phy_stby_diag.sql
— ————————————————————————
— AUTHOR:
— Michael Smith – Oracle Support Services – DataServer Group
— Copyright 2002, Oracle Corporation
— ————————————————————————
— PURPOSE:
— This script is to be used to assist in collection information to help
— troubeshoot Data Guard issues.
— ————————————————————————
— DISCLAIMER:
— This script is provided for educational purposes only. It is NOT
— supported by Oracle World Wide Technical Support.
— The script has been tested and appears to work as intended.
— You should always run new scripts on a test instance initially.
— ————————————————————————
— Script output is as follows:

set echo off
set feedback off
column timecol new_value timestamp
column spool_extension new_value suffix
select to_char(sysdate,’Mondd_hhmi’) timecol,
‘.out’ spool_extension from sys.dual;
column output new_value dbname
select value || ‘_’ output
from v$parameter where name = ‘db_name’;
spool dgdiag_phystby_&&dbname&&timestamp&&suffix
set lines 200
set pagesize 35
set trim on
set trims on
alter session set nls_date_format = ‘MON-DD-YYYY HH24:MI:SS’;
set feedback on
select to_char(sysdate) time from dual;

set echo on


— ARCHIVER can be (STOPPED | STARTED | FAILED) FAILED means that the archiver failed
— to archive a — log last time, but will try again within 5 minutes. LOG_SWITCH_WAIT
— The ARCHIVE LOG/CLEAR LOG/CHECKPOINT event log switching is waiting for. Note that
— if ALTER SYSTEM SWITCH LOGFILE is hung, but there is room in the current online
— redo log, then value is NULL

column host_name format a20 tru
column version format a9 tru
select instance_name,host_name,version,archiver,log_switch_wait from v$instance;

— The following select will give us the generic information about how this standby is
— setup. The database_role should be standby as that is what this script is intended
— to be ran on. If protection_level is different than protection_mode then for some
— reason the mode listed in protection_mode experienced a need to downgrade. Once the
— error condition has been corrected the protection_level should match the protection_mode
— after the next log switch.

column ROLE format a7 tru
select name,database_role,log_mode,controlfile_type,protection_mode,protection_level
from v$database;

— Force logging is not mandatory but is recommended. Supplemental logging should be enabled
— on the standby if a logical standby is in the configuration. During normal
— operations it is acceptable for SWITCHOVER_STATUS to be SESSIONS ACTIVE or NOT ALLOWED.

column force_logging format a13 tru
column remote_archive format a14 tru
column dataguard_broker format a16 tru
select force_logging,remote_archive,supplemental_log_data_pk,supplemental_log_data_ui,
switchover_status,dataguard_broker from v$database;

— This query produces a list of all archive destinations and shows if they are enabled,
— what process is servicing that destination, if the destination is local or remote,
— and if remote what the current mount ID is. For a physical standby we should have at
— least one remote destination that points the primary set but it should be deferred.

COLUMN destination FORMAT A35 WRAP
column process format a7
column archiver format a8
column ID format 99

select dest_id “ID”,destination,status,target,
archiver,schedule,process,mountid
from v$archive_dest;

— If the protection mode of the standby is set to anything higher than max performance
— then we need to make sure the remote destination that points to the primary is set
— with the correct options else we will have issues during switchover.

select dest_id,process,transmit_mode,async_blocks,
net_timeout,delay_mins,reopen_secs,register,binding
from v$archive_dest;

— The following select will show any errors that occured the last time an attempt to
— archive to the destination was attempted. If ERROR is blank and status is VALID then
— the archive completed correctly.

column error format a55 tru
select dest_id,status,error from v$archive_dest;

— Determine if any error conditions have been reached by querying thev$dataguard_status
— view (view only available in 9.2.0 and above):

column message format a80
select message, timestamp
from v$dataguard_status
where severity in (‘Error’,’Fatal’)
order by timestamp;

— The following query is ran to get the status of the SRL’s on the standby. If the
— primary is archiving with the LGWR process and SRL’s are present (in the correct
— number and size) then we should see a group# active.

select group#,sequence#,bytes,used,archived,status from v$standby_log;

— The above SRL’s should match in number and in size with the ORL’s returned below:

select group#,thread#,sequence#,bytes,archived,status from v$log;

— Query v$managed_standby to see the status of processes involved in the
— configuration.

select process,status,client_process,sequence#,block#,active_agents,known_agents
from v$managed_standby;

— Verify that the last sequence# received and the last sequence# applied to standby
— database.

select al.thrd “Thread”, almax “Last Seq Received”, lhmax “Last Seq Applied”
from (select thread# thrd, max(sequence#) almax
from v$archived_log
where resetlogs_change#=(select resetlogs_change# from v$database)
group by thread#) al,
(select thread# thrd, max(sequence#) lhmax
from v$log_history
where first_time=(select max(first_time) from v$log_history)
group by thread#) lh
where al.thrd = lh.thrd;

— The V$ARCHIVE_GAP fixed view on a physical standby database only returns the next
— gap that is currently blocking redo apply from continuing. After resolving the
— identified gap and starting redo apply, query the V$ARCHIVE_GAP fixed view again
— on the physical standby database to determine the next gap sequence, if there is
— one.

select * from v$archive_gap;

— Non-default init parameters.

set numwidth 5
column name format a30 tru
column value format a50 wra
select name, value
from v$parameter
where isdefault = ‘FALSE’;

spool off

– – – – – – – – – – – – – – – – Script ends here – – – – – – – – – – – – – – – –

Database Force open example

帮网友强制打开了一个没有备份的测试库,这个库没有备份也没有打开归档,因为之前也出现过active日志文件损毁,一直使用隐式参数才能正常打开:

                 _allow_resetlogs_corruption= TRUE

 

如果自己搞不定可以找诗檀软件专业ORACLE数据库修复团队成员帮您恢复!

诗檀软件专业数据库修复团队

服务热线 : 13764045638   QQ号:47079569    邮箱:service@parnassusdata.com

 

 

这次一开始这个库报ORA-600[2662]错误:

Mon Aug 23 09:37:00 2010
Errors in file /oracle/QAS/saptrace/usertrace/qas_ora_852096.trc:
ORA-00600: internal error code, arguments: [2662], [0], [130131504], [0], [130254136], [4264285], [], []
Mon Aug 23 09:37:02 2010
Errors in file /oracle/QAS/saptrace/usertrace/qas_ora_852096.trc:
ORA-00600: internal error code, arguments: [2662], [0], [130131506], [0], [130254136], [4264285], [], []

ORA-600 [2662] “Block SCN is ahead of Current SCN”错误是当数据块中的SCN领先于current SCN,由于后台进程或服务进程都会比对UGA中的dependent SCN和数据库当前的SCN,如果数据库当前SCN小于dependent SCN,那么该进程就会报ORA-600 [2662]错误,如果遭遇该错误的是服务进程,那么服务进程一般会异常终止;如果遭遇该错误的是后台进程譬如SMON,则会导致实例CRASH。
ORA-600 [2662]错误可以能由以下几种情况引起:
1.启用隐含参数_ALLOW_RESETLOGS_CORRUPTION后,以resetlogs形式打开数据库;这种情况下发生2662错误,根本原因是没有完全前滚导致控制文件中的SCN滞后于数据块中的SCN。
2.硬件故障导致数据库没法写控制文件和联机日志文件
3.错误的部分恢复数据库
4.恢复了控制文件,但是没有使用recover database using backup controlfile进行恢复
5.数据库crash后设置了_DISABLE_LOGGING隐含参数
6.在并行服务器环境中DLM存在问题

该错误的5个参数的具体含义如下:
ARGUMENTS:
Arg [a] Current SCN WRAP
Arg [b] Current SCN BASE
Arg [c] dependent SCN WRAP
Arg [d] dependent SCN BASE
Arg [e] Where present this is the DBA where the dependent SCN came from.

我们的case当中dependent SCN为130254136,而当前SCN为130131506,其差值为122630;从以上告警日志中可以看到数据库的当前SCN是在不断缓慢增长的,当我们遭遇到2662错误时,很滑稽的一点是只要不断重启数据库保持current SCN的增长,一段时间后2662错误会不药而愈。当然我们也可以不用这种笨办法,10015事件可以帮助我们调整数据库当前SCN:

/* 当数据库处于mount状态,可以使用10015事件来调整scn */
alter session  set events '10015 trace name adjust_scn level 1';
/* 这里可以设置level 2..10等 (level 1是在每次打开数据库时scn增加1000k)*/
/* 需要注意的是10g某些版本不同于9i,需要设置隐式参数_allow_error_simulation,才能真正增进scn */
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> col current_scn format 999,999,999,999
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
1141408
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 1653518336 bytes
Fixed Size                  2213896 bytes
Variable Size             989857784 bytes
Database Buffers          654311424 bytes
Redo Buffers                7135232 bytes
Database mounted.
SQL> alter session set events '10015 trace name adjust_scn level 1';
Session altered.
SQL> alter database open;
Database altered.
SQL>  select current_scn from v$database;
CURRENT_SCN
-----------
1142031
/* 可以看到current_scn并未大量增加,10.2.0.4上默认10015 adjust_scn不被触发 */
SQL>  alter system set "_allow_error_simulation"=true scope=spfile;
System altered.
SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 1653518336 bytes
Fixed Size                  2213896 bytes
Variable Size             989857784 bytes
Database Buffers          654311424 bytes
Redo Buffers                7135232 bytes
Database mounted.
SQL> alter session set events '10015 trace name adjust_scn level 1';
Session altered.
SQL> alter database open;
Database altered.
SQL>select current_scn from v$database;
CURRENT_SCN
----------------
1,073,741,980

在接手之前该网友已经通过反复重启数据库将数据库的当前SCN提高到dependent SCN的127037138;原以为这样就可以打开数据库了,谁知道又出现了一下错误:

Wed Aug 25 07:43:53 2010
Errors in file /oracle/QAS/saptrace/usertrace/qas_ora_929958.trc:
ORA-00600: internal error code, arguments: [4000], [8], [], [], [], [], [], []
Wed Aug 25 07:43:53 2010
Errors in file /oracle/QAS/saptrace/usertrace/qas_ora_929958.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [8], [], [], [], [], [], []
Wed Aug 25 07:43:53 2010
Error 704 happened during db open, shutting down database

bootstrap自举过程中遭遇了ORA-600 [4000]错误,该错误一般当Oracle尝试读取数据字典(主要是undo$基表)中记录的USN对应的回滚段失败引起.,通过设置隐式参数_corrupted_rollback_segments可以一定程度上规避该错误,强制打开数据库,其Argument[a]代表造成读取失败的USN(undo segment number),但实际上有问题的回滚段可能不止这一个:

/* 通过strings工具从system表空间上找回各回滚段的名字  */
$strings system.dbf |grep _SYSSMU|less
_SYSSMU1$
_SYSSMU2$
_SYSSMU3$
_SYSSMU4$
_SYSSMU5$
_SYSSMU6$
_SYSSMU7$
_SYSSMU8$
_SYSSMU9$
.........
alter system set "_corrupted_rollback_segments"='(_SYSSMU1$, _SYSSMU2$, _SYSSMU3$, _SYSSMU4$, _SYSSMU5$, _SYSSMU6$, _SYSSMU7$, _SYSSMU8$, _SYSSMU9$, _SYSSMU10$, _SYSSMU11$, _SYSSMU12$)' scope=spfile;
System altered.
/* 即便设置了_corrupted_rollback_segments隐式参数,也还有一定概率遭遇4000错误,尝试加上10513事件,并多次重启数据库 */
SQL> alter system set event='10513 trace name context forever,level 2' scope=spfile;
System altered.
/* 再次出现4000 错误 */
Errors in file /oracle/QAS/saptrace/usertrace/qas_ora_1016014.trc:
ORA-00600: internal error code, arguments: [4000], [8], [], [], [], [], [], []
Thu Aug 26 09:43:39 2010
Errors in file /oracle/QAS/saptrace/usertrace/qas_ora_1016014.trc:
ORA-00704: bootstrap process failure
ORA-00704: bootstrap process failure
ORA-00600: internal error code, arguments: [4000], [8], [], [], [], [], [], []
Thu Aug 26 09:43:39 2010
Error 704 happened during db open, shutting down database
/* 再次重启后发现4000错误不再出现 * /

再次重启发现不再出现ORA-600[4000]错误,但在字典检查阶段Oracle认为数据文件227不匹配于当前的incarnation:

Thu Aug 26 11:13:22 2010
Dictionary check beginning
Thu Aug 26 09:46:00 2010
Errors in file /oracle/QAS/saptrace/usertrace/qas_ora_897162.trc:
ORA-01177: data file does not match dictionary - probably old incarnation
ORA-01110: data file 227: '/oracle/QAS/sapdata2/qas_192/qas.data196'
Error 1177 happened during db open, shutting down database
USER: terminating instance due to error 1177
Instance terminated by USER, pid = 897162

初步判断出现ORA-01177可能为2种可能性:
1.数据字典出现讹误,227号文件对应的incarnation信息不正确
2.在之前的某次resetlogs open过程中,227号文件头由于某些原因没有正确更新incarnation信息

针对这样的情况如果一定要找回该数据文件上的数据的话只能通过手动修改数据字典或文件头,当然也可以尝试使用一些直接从数据文件上抽取数据的工具。
因为这是一次友情协助,就没有继续深入下去,通过重建控制文件并跳过该数据文件解决了:

CREATE CONTROLFILE REUSE DATABASE "QAS" RESETLOGS  NOARCHIVELOG
--  SET STANDBY TO MAXIMIZE PERFORMANCE
MAXLOGFILES 255
MAXLOGMEMBERS 3
MAXDATAFILES 254
MAXINSTANCES 50
MAXLOGHISTORY 36302
LOGFILE
GROUP 1 (
'/oracle/QAS/redolog/redolog11A.dbf',
'/oracle/QAS/redolog/redolog11B.dbf'
) SIZE 500M,
GROUP 2 (
'/oracle/QAS/redolog/redolog12A.dbf',
'/oracle/QAS/redolog/redolog12B.dbf'
) SIZE 500M
-- STANDBY LOGFILE
DATAFILE
'/oracle/QAS/sapdata1/system_1/system.data1',
........
'/oracle/QAS/sapdata2/qas_192/qas.data195'
CHARACTER SET WE8DEC
Thu Aug 26 14:04:50 2010
Successful mount of redo thread 1, with mount id 2117500093
Thu Aug 26 14:04:50 2010
Completed: CREATE CONTROLFILE REUSE DATABASE "QAS" RESETLOGS
Thu Aug 26 14:05:05 2010
alter database mount
Thu Aug 26 14:05:05 2010
ORA-1100 signalled during: alter database mount...
Thu Aug 26 14:05:15 2010
alter database open resetlogs
RESETLOGS is being done without consistancy checks. This may result
in a corrupted database. The database should be recreated.
RESETLOGS after incomplete recovery UNTIL CHANGE 1125281471596
Resetting resetlogs activation ID 0 (0x0)
Online log 1 of thread 1 was previously cleared
Thu Aug 26 14:05:36 2010
Assigning activation ID 2117500093 (0x7e367cbd)
Thread 1 opened at log sequence 1
Current log# 2 seq# 1 mem# 0: /oracle/QAS/redolog/redolog12A.dbf
Current log# 2 seq# 1 mem# 1: /oracle/QAS/redolog/redolog12B.dbf
Successful open of redo thread 1
Thu Aug 26 14:05:36 2010
SMON: enabling cache recovery
Thu Aug 26 14:05:36 2010
Dictionary check beginning
Tablespace 'PSAPTEMP' #2 found in data dictionary,
but not in the controlfile. Adding to controlfile.
File #227 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00227' in the controlfile.
This file can no longer be recovered so it must be dropped.
File #228 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00228' in the controlfile.
This file can no longer be recovered so it must be dropped.
File #229 found in data dictionary but not in controlfile.
Creating OFFLINE file 'MISSING00229' in the controlfile.
This file can no longer be recovered so it must be dropped.
Dictionary check complete
Thu Aug 26 14:05:38 2010
SMON: enabling tx recovery
Thu Aug 26 14:05:38 2010
Database Characterset is WE8DEC
replication_dependency_tracking turned off (no async multimaster replication found)
Completed: alter database open resetlogs

这个case告诉我们,测试库并不一定就不重要,测试库也是需要备份的。

DataGuard Managed recovery hang

Our team deleted some archivelog by mistake. Rolled the database forwards by RMAN incremental recovery to an SCN. Did a manual recovery to sync it with the primary. Managed recovery is now failing.
alter database recover managed standby database disconnect

Alert log has :

Fri Jan 22 13:50:22 2010
Attempt to start background Managed Standby Recovery process
MRP0 started with pid=12
MRP0: Background Managed Standby Recovery process started
Media Recovery Waiting for thread 1 seq# 193389
Fetching gap sequence for thread 1, gap sequence 193389-193391
Trying FAL server: ITS
Fri Jan 22 13:50:28 2010
Completed: alter database recover managed standby database d
Fri Jan 22 13:53:25 2010
Failed to request gap sequence. Thread #: 1, gap sequence: 193389-193391
All FAL server has been attempted.

Managed recovery was working earlier today after the Rman incremental and resolved two gaps automatically. But it now appears hung with the standby falling behind the primary.

SQL> show parameter fal
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
fal_client string ITS_STBY
fal_server string ITS
[v08k608:ITS:oracle]$ tnsping ITS_STBY
TNS Ping Utility for Solaris: Version 9.2.0.7.0 - Production on 22-JAN-2010 15:01:17
Copyright (c) 1997 Oracle Corporation. All rights reserved.
Used parameter files:
/oracle/product/9.2.0/network/admin/sqlnet.ora
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL= TCP)(Host= v08k608.am.mot.com)(Port= 1526)) (CONNECT_DATA = (SID = ITS)))
OK (10 msec)
[v08k608:ITS:oracle]$ tnsping ITS
TNS Ping Utility for Solaris: Version 9.2.0.7.0 - Production on 22-JAN-2010 15:01:27
Copyright (c) 1997 Oracle Corporation. All rights reserved.
Used parameter files:
/oracle/product/9.2.0/network/admin/sqlnet.ora
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL= TCP)(Host= 187.10.68.75)(Port= 1526)) (CONNECT_DATA = (SID = ITS)))
OK (320 msec)
Primary has :
SQL> show parameter log_archive_dest_2
log_archive_dest_2 string SERVICE=DRITS_V08K608 reopen=6
0 max_failure=10 net_timeout=1
80 LGWR ASYNC=20480 OPTIONAL
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_state_2 string ENABLE
[ITS]/its15/oradata/ITS/arch> tnsping DRITS_V08K608
TNS Ping Utility for Solaris: Version 9.2.0.7.0 - Production on 22-JAN-2010 15:03:24
Copyright (c) 1997 Oracle Corporation. All rights reserved.
Used parameter files:
/oracle/product/9.2.0/network/admin/sqlnet.ora
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL= TCP)(Host= 10.177.13.57)(Port= 1526)) (CONNECT_DATA = (SID = ITS)))
OK (330 msec)

The arch process on the primary database might hang due to a bug below so that it couldn’t ship the missing archive log
files to the standby database.

BUG 6113783 ARC PROCESSES CAN HANG INDEFINITELY ON NETWORK
[ Not published so not viewable in My Oracle Support ]
Fixed 11.2, 10.2.0.5 patchset

We could work workaround the issue by killing the arch processes on the primary site and they will be respawned
automatically immediately without harming the primary database.

[maclean@rh2 ~]$ ps -ef|grep arc
maclean   8231     1  0 22:24 ?        00:00:00 ora_arc0_PROD
maclean   8233     1  0 22:24 ?        00:00:00 ora_arc1_PROD
maclean   8350  8167  0 22:24 pts/0    00:00:00 grep arc
[maclean@rh2 ~]$ kill -9 8231 8233
[maclean@rh2 ~]$ ps -ef|grep arc
maclean   8389     1  0 22:25 ?        00:00:00 ora_arc0_PROD
maclean   8391     1  1 22:25 ?        00:00:00 ora_arc1_PROD
maclean   8393  8167  0 22:25 pts/0    00:00:00 grep arc
and alert log will have:
Fri Jul 30 22:25:27 EDT 2010
ARCH: Detected ARCH process failure
ARCH: Detected ARCH process failure
ARCH: STARTING ARCH PROCESSES
ARC0 started with pid=26, OS id=8389
Fri Jul 30 22:25:27 EDT 2010
ARC0: Archival started
ARC1: Archival started
ARCH: STARTING ARCH PROCESSES COMPLETE
ARC1 started with pid=27, OS id=8391
Fri Jul 30 22:25:27 EDT 2010
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
Fri Jul 30 22:25:27 EDT 2010
ARC1: Becoming the heartbeat ARCH

Actually if we don’t kill some fatal process in 10g , oracle will respawn all nonfatal processes.
For example:

[maclean@rh2 ~]$ ps -ef|grep ora_|grep -v grep
maclean  14264     1  0 23:16 ?        00:00:00 ora_pmon_PROD
maclean  14266     1  0 23:16 ?        00:00:00 ora_psp0_PROD
maclean  14268     1  0 23:16 ?        00:00:00 ora_mman_PROD
maclean  14270     1  0 23:16 ?        00:00:00 ora_dbw0_PROD
maclean  14272     1  0 23:16 ?        00:00:00 ora_lgwr_PROD
maclean  14274     1  0 23:16 ?        00:00:00 ora_ckpt_PROD
maclean  14276     1  0 23:16 ?        00:00:00 ora_smon_PROD
maclean  14278     1  0 23:16 ?        00:00:00 ora_reco_PROD
maclean  14338     1  0 23:16 ?        00:00:00 ora_arc0_PROD
maclean  14340     1  0 23:16 ?        00:00:00 ora_arc1_PROD
maclean  14452     1  0 23:17 ?        00:00:00 ora_s000_PROD
maclean  14454     1  0 23:17 ?        00:00:00 ora_d000_PROD
maclean  14456     1  0 23:17 ?        00:00:00 ora_cjq0_PROD
maclean  14458     1  0 23:17 ?        00:00:00 ora_qmnc_PROD
maclean  14460     1  0 23:17 ?        00:00:00 ora_mmon_PROD
maclean  14462     1  0 23:17 ?        00:00:00 ora_mmnl_PROD
maclean  14467     1  0 23:17 ?        00:00:00 ora_q000_PROD
maclean  14568     1  0 23:18 ?        00:00:00 ora_q001_PROD
[maclean@rh2 ~]$ ps -ef|grep ora_|grep -v pmon|grep -v ckpt |grep -v lgwr|grep -v smon|grep -v grep|grep -v dbw|grep -v psp|grep -v mman |grep -v rec|awk '{print $2}'|xargs kill -9
and alert log will have:
Fri Jul 30 23:20:58 EDT 2010
ARCH: Detected ARCH process failure
ARCH: Detected ARCH process failure
ARCH: STARTING ARCH PROCESSES
ARC0 started with pid=20, OS id=14959
Fri Jul 30 23:20:58 EDT 2010
ARC0: Archival started
ARC1: Archival started
ARCH: STARTING ARCH PROCESSES COMPLETE
Fri Jul 30 23:20:58 EDT 2010
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC1 started with pid=21, OS id=14961
ARC1: Becoming the heartbeat ARCH
Fri Jul 30 23:21:29 EDT 2010
found dead shared server 'S000', pid = (10, 3)
found dead dispatcher 'D000', pid = (11, 3)
Fri Jul 30 23:22:29 EDT 2010
Restarting dead background process CJQ0
Restarting dead background process QMNC
CJQ0 started with pid=12, OS id=15124
Fri Jul 30 23:22:29 EDT 2010
Restarting dead background process MMON
QMNC started with pid=13, OS id=15126
Fri Jul 30 23:22:29 EDT 2010
Restarting dead background process MMNL
MMON started with pid=14, OS id=15128
MMNL started with pid=16, OS id=15132
That's all right!

Fail to queue the whole FAL gap in dataguard一例

近日告警日志中出现以下记录:
FAL[server]: Fail to queue the whole FAL gap
GAP – thread 1 sequence 180-180
DBID 3731271451 branch 689955035

这是一个10.2.0.3的dataguard环境,采用物理备库,归档传输模式;查询metalink发现相关note:

Symptoms

When using ARCH transport, gaps could be flagged in the alert log even though the single log gap was for a log that had not been written at the primary yet.

alert.log on primary shows:

FAL[server]: Fail to queue the whole FAL gap

GAP – thread 1 sequence 63962-63962

DBID 1243807152 branch 631898097

or alert.log on standby shows:

Fetching gap sequence in thread 1, gap sequence 63962-63962

Thu Jan 24 14:36:30 2008

FAL[client]: Failed to request gap sequence

GAP – thread 1 sequence 63962-63962

DBID 2004523329 branch 594300676

FAL[client]: All defined FAL servers have been attempted.

v$archive_gap returns no rows

SQL> select * from v$archive_gap;

no rows selected

Cause

Bug 5526409 – FAL gaps reported at standby for log not yet written at primary

Solution

Bug 5526409 is fixed in 10.2.0.4 and 11.1.

Upgrade to 10.2.0.4 as Bug 5526409 is fixed in 10.2.0.4.

Their is no impact of these messages on the database. You can safely ignore these messages.

One-off Patch for Bug 5526409 on top of 10.2.0.3 is available for some platforms. Please check Patch 5526409 for your platform.

该note描述在10.1.0.2-10.2.0.3版本中,在ARCH传输的DataGuard环境中可能出现日志传输gap为单个在primary库中尚未写出的日志,该gap可能会在告警日志中以以上形式标示。
该bug(问题)在版本10.2.0.4和11.1中得到了修复,在10.2.0.3版本中部分平台上有one-off补丁。但实际上该bug(问题)对于主备库不会有任何影响,我们也可以将之忽略。

7月最新发布11.2.0.1.2 Patch set update

7月13日,11g release 2 的第二个补丁集更新发布了;9i的最终版本为9.2.0.8,10g上10.2.0.5很有可能成为最终版本,我们预期今后(11g,12g)中Patch set数量会有效减少,而patch set update数量可能大幅增加;这样的更新形式可以为Oracle Database提升一定的软件形象。可以猜想11gr2的最终版本号可能是11.2.0.2/3.x。

附该psu的readme note:

Released: July 13, 2010

This document is accurate at the time of release. For any changes and additional information regarding PSU 11.2.0.1.2, see these related documents that are available at My Oracle Support (http://support.oracle.com/):

  • Note 854428.1 Patch Set Updates for Oracle Products
  • Note 1089071.1 Oracle Database Patch Set Update 11.2.0.1.2 Known Issues

This document includes the following sections:

1 Patch Information

Patch Set Update (PSU) patches are cumulative. That is, the content of all previous PSUs is included in the latest PSU patch.

PSU 11.2.0.1.2 includes the fixes listed in Section 5, “Bugs Fixed by This Patch”.

Table 1 describes installation types and security content. For each installation type, it indicates the most recent PSU patch to include new security fixes that are pertinent to that installation type. If there are no security fixes to be applied to an installation type, then “None” is indicated. If a specific PSU is listed, then apply that or any later PSU patch to be current with security fixes.

Table 1 Installation Types and Security Content

Installation Type Latest PSU with Security Fixes
Server homes PSU 11.2.0.1.2


Client-Only Installations None
Instant Client Installations None

(The Instant Client installation is not the same as the client-only Installation. For additional information about Instant Client installations, see Oracle Database Concepts.)

2 Patch Installation and Deinstallation

This section includes the following sections:

2.1 Platforms for PSU 11.2.0.1.2

For a list of platforms that are supported in this Patch Set Update, see My Oracle Support Note 1060989.1 Critical Patch Update July 2010 Patch Availability Document for Oracle Products.

2.2 OPatch Utility Information

You must use the OPatch utility version 11.2.0.1.0 or later to apply this patch. Oracle recommends that you use the latest released OPatch 11.2, which is available for download from My Oracle Support patch 6880880 by selecting the 11.2.0.0.0 release.

For information about OPatch documentation, including any known issues, see My Oracle Support Note 293369.1 OPatch documentation list.

2.3 Patch Installation

These instructions are for all Oracle Database installations.

2.3.1 Patch Pre-Installation Instructions

Before you install PSU 11.2.0.1.2, perform the following actions to check the environment and to detect and resolve any one-off patch conflicts.

2.3.1.1 Environments with ASM

If you are installing the PSU to an environment that has Automatic Storage Management (ASM), note the following:

  • For Linux x86 and Linux x86-64 platforms, install either (A) the bug fix for 8898852 and the Database PSU patch 9654983, or (B) the Grid Infrastructure PSU patch 9343627.
  • For all other platforms, no action is required. The fix for 8898852 was included in the base 11.2.0.1.0 release.

2.3.1.2 Environment Checks
  1. Ensure that the $PATH definition has the following executables: make, ar, ld, and nm.The location of these executables depends on your operating system. On many operating systems, they are located in /usr/ccs/bin, in which case you can set your PATH definition as follows:
    export PATH=$PATH:/usr/ccs/bin
    

2.3.1.3 One-off Patch Conflict Detection and Resolution

For an introduction to the PSU one-off patch concepts, see “Patch Set Updates Patch Conflict Resolution” in My Oracle Support Note 854428.1 Patch Set Updates for Oracle Products.

The fastest and easiest way to determine whether you have one-off patches in the Oracle home that conflict with the PSU, and to get the necessary conflict resolution patches, is to use the Patch Recommendations and Patch Plans features on the Patches & Updates tab in My Oracle Support. These features work in conjunction with the My Oracle Support Configuration Manager. Recorded training sessions on these features can be found in Note 603505.1.

However, if you are not using My Oracle Support Patch Plans, follow these steps:

  1. Determine whether any currently installed one-off patches conflict with the PSU patch as follows:
    unzip p9654983_11201_<platform>.zip
    opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir ./9654983
    
  2. The report will indicate the patches that conflict with PSU 9654983 and the patches for which PSU 9654983 is a superset.Note that Oracle proactively provides PSU 11.2.0.1.2 one-off patches for common conflicts.
  3. Use My Oracle Support Note 1061295.1 Patch Set Updates – One-off Patch Conflict Resolution to determine, for each conflicting patch, whether a conflict resolution patch is already available, and if you need to request a new conflict resolution patch or if the conflict may be ignored.
  4. When all the one-off patches that you have requested are available at My Oracle Support, proceed with Section 2.3.2, “Patch Installation Instructions”.

2.3.2 Patch Installation Instructions

Follow these steps:

  1. If you are using a Data Guard Physical Standby database, you must first install this patch on the primary database before installing the patch on the physical standby database. It is not supported to install this patch on the physical standby database before installing the patch on the primary database. For more information, see My Oracle Support Note 278641.1.
  2. Do one of the following, depending on whether this is a RAC environment:
    • If this is a RAC environment, choose one of the patch installation methods provided by OPatch (rolling, all node, or minimum downtime), and shutdown instances and listeners as appropriate for the installation method selected.This PSU patch is rolling RAC installable. Refer to My Oracle Support Note 244241.1 Rolling Patch – OPatch Support for RAC.
    • If this is not a RAC environment, shut down all instances and listeners associated with the Oracle home that you are updating. For more information, see Oracle Database Administrator’s Guide.
  3. Set your current directory to the directory where the patch is located and then run the OPatch utility by entering the following commands:
    unzip p9654983_11201_<platform>.zip
    cd 9654983
    opatch apply
    
  4. If there are errors, refer to Section 3, “Known Issues”.

2.3.3 Patch Post-Installation Instructions

After installing the patch, perform the following actions:

  1. Apply conflict resolution patches as explained in Section 2.3.3.1.
  2. Load modified SQL files into the database, as explained in Section 2.3.3.2.

2.3.3.1 Applying Conflict Resolution Patches

Apply the patch conflict resolution one-off patches that were determined to be needed when you performed the steps in Section 2.3.1.3, “One-off Patch Conflict Detection and Resolution”.

2.3.3.2 Loading Modified SQL Files into the Database

The following steps load modified SQL files into the database. For a RAC environment, perform these steps on only one node.

  1. For each database instance running on the Oracle home being patched, connect to the database using SQL*Plus. Connect as SYSDBA and run the catbundle.sql script as follows:
    cd $ORACLE_HOME/rdbms/admin
    sqlplus /nolog
    SQL> CONNECT / AS SYSDBA
    SQL> STARTUP
    SQL> @catbundle.sql psu apply
    SQL> QUIT
    

    The catbundle.sql execution is reflected in the dba_registry_history view by a row associated with bundle series PSU.

    For information about the catbundle.sql script, see My Oracle Support Note 605795.1 Introduction to Oracle Database catbundle.sql.

  2. Check the following log files in $ORACLE_HOME/cfgtoollogs/catbundle for any errors:
    catbundle_PSU_<database SID>_APPLY_<TIMESTAMP>.log
    catbundle_PSU_<database SID>_GENERATE_<TIMESTAMP>.log
    

    where TIMESTAMP is of the form YYYYMMMDD_HH_MM_SS. If there are errors, refer to Section 3, “Known Issues”.

2.3.4 Patch Post-Installation Instructions for Databases Created or Upgraded after Installation of PSU 11.2.0.1.2 in the Oracle Home

These instructions are for a database that is created or upgraded after the installation of PSU 11.2.0.1.2.

You must execute the steps in Section 2.3.3.2, “Loading Modified SQL Files into the Database” for any new database only if it was created by any of the following methods:

  • Using DBCA (Database Configuration Assistant) to select a sample database (General, Data Warehouse, Transaction Processing)
  • Using a script that was created by DBCA that creates a database from a sample database

2.4 Patch Deinstallation

These instructions apply if you need to deinstall the patch.

2.4.1 Patch Deinstallation Instructions for a Non-RAC Environment

Follow these steps:

  1. Verify that an $ORACLE_HOME/rdbms/admin/catbundle_PSU_<database SID>_ROLLBACK.sql file exists for each database associated with this ORACLE_HOME. If this is not the case, you must execute the steps in Section 2.3.3.2, “Loading Modified SQL Files into the Database” against the database before deinstalling the PSU.
  2. Shut down all instances and listeners associated with the Oracle home that you are updating. For more information, see Oracle Database Administrator’s Guide.
  3. Run the OPatch utility specifying the rollback argument as follows.
    opatch rollback -id 9654983
    
  4. If there are errors, refer to Section 3, “Known Issues”.

2.4.2 Patch Post-Deinstallation Instructions for a Non-RAC Environment

Follow these steps:

  1. Start all database instances running from the Oracle home. (For more information, see Oracle Database Administrator’s Guide.)
  2. For each database instance running out of the ORACLE_HOME, connect to the database using SQL*Plus as SYSDBA and run the rollback script as follows:
    cd $ORACLE_HOME/rdbms/admin
    sqlplus /nolog
    SQL> CONNECT / AS SYSDBA
    SQL> STARTUP
    SQL> @catbundle_PSU_<database SID>_ROLLBACK.sql
    SQL> QUIT
    

    In a RAC environment, the name of the rollback script will have the format catbundle_PSU_<database SID PREFIX>_ROLLBACK.sql.

  3. Check the log file for any errors. The log file is found in $ORACLE_HOME/cfgtoollogs/catbundle and is named catbundle_PSU_<database SID>_ROLLBACK_<TIMESTAMP>.log where TIMESTAMP is of the form YYYYMMMDD_HH_MM_SS. If there are errors, refer to Section 3, “Known Issues”.

2.4.3 Patch Deinstallation Instructions for a RAC Environment

Follow these steps for each node in the cluster, one node at a time:

  1. Shut down the instance on the node.
  2. Run the OPatch utility specifying the rollback argument as follows.
    opatch rollback -id 9654983
    

    If there are errors, refer to Section 3, “Known Issues”.

  3. Start the instance on the node as follows:
    srvctl start instance
    

2.4.4 Patch Post-Deinstallation Instructions for a RAC Environment

Follow the instructions listed in Section Section 2.4.2, “Patch Post-Deinstallation Instructions for a Non-RAC Environment” only on the node for which the steps in Section 2.3.3.2, “Loading Modified SQL Files into the Database” were executed during the patch application.

All other instances can be started and accessed as usual while you are executing the deinstallation steps.

3 Known Issues

For information about OPatch issues, see My Oracle Support Note 293369.1 OPatch documentation list.

For issues documented after the release of this PSU, see My Oracle Support Note 1089071.1 Oracle Database Patch Set Update 11.2.0.1.2 Known Issues.

Other known issues are as follows.

Issue 1
The following ignorable errors may be encountered while running the catbundle.sql script or its rollback script:

ORA-29809: cannot drop an operator with dependent objects
ORA-29931: specified association does not exist
ORA-29830: operator does not exist
ORA-00942: table or view does not exist
ORA-00955: name is already used by an existing object
ORA-01430: column being added already exists in table
ORA-01432: public synonym to be dropped does not exist
ORA-01434: private synonym to be dropped does not exist
ORA-01435: user does not exist
ORA-01917: user or role 'XDB' does not exist
ORA-01920: user name '<user-name>' conflicts with another user or role name
ORA-01921: role name '<role name>' conflicts with another user or role name
ORA-01952: system privileges not granted to 'WKSYS'
ORA-02303: cannot drop or replace a type with type or table dependents
ORA-02443: Cannot drop constraint - nonexistent constraint
ORA-04043: object <object-name> does not exist
ORA-29832: cannot drop or replace an indextype with dependent indexes
ORA-29844: duplicate operator name specified
ORA-14452: attempt to create, alter or drop an index on temporary table already in use
ORA-06512: at line <line number>. If this error follow any of above errors, then can be safely ignored.
ORA-01927: cannot REVOKE privileges you did not grant

4 References

The following documents are references for this patch.

Note 293369.1 OPatch documentation list

Note 360870.1 Impact of Java Security Vulnerabilities on Oracle Products

Note 468959.1 Enterprise Manager Grid Control Known Issues

Note 9352237.8 Bug 9352237 – 11.2.0.1.1 Patch Set Update (PSU)

5 Bugs Fixed by This Patch

This patch includes the following bug fixes.

5.1 CPU Molecules

CPU molecules in PSU 11.2.0.1.2:

PSU 11.2.0.1.2 contains the following new CPU molecules:

9676419 – DB-11.2.0.1-MOLECULE-004-CPUJUL2010

9676420 – DB-11.2.0.1-MOLECULE-005-CPUJUL2010

5.2 Bug Fixes

PSU 11.2.0.1.2 contains the following new fixes:

Automatic Storage Management

8755082 – ORA-00600: [KCFIS_TRANSLATE4:VOLUME LOOKUP], [2], [WRONG DEVICE NAME], [], [], [

8890026 – ASM PARTNERING CREATES IMBALANCED PARTNERSHIPS

9170608 – STBH:DD BLOCKS PINNED FOR QUERIES THAT DO NOT REQUEST USED SPACE

9363145 – STBH:DB INSTANCES TERMINATED BY ASMB DUE TO ORA-00600 [KFDSKALLOC0]

Buffer Cache

8330783 – HANGING DB WITH “CACHE BUFFER CHAINS” AND “BUFFER DEADLOCK” WAITS DURING INSERT

8822531 – TAKING AWR SNAP HANGS

Data Guard Broker

8918433 – UNPERSISTED FSFO STATE BITS CAN GET PERSISTED

9363384 – PHYSICAL STANDBY SERVICES NOT STARTED AFTER CONVERT FROM SNAPSHOT

9467635 – BROKER’S METADATA FILE UPGRADE TO 11.2 IS BROKEN

9467727 – GETSTATUS DOC YIELDS INCORRECT RESULT IF DBRESOURCE_ID PROP VALUE IS USED

Data Guard Logical

8774868 – LGSBFSFO: ORA-600 [3020], [3], [138] RAISED IN RECOVERY SLAVE

8822832 – V$ARCHIVE_DEST_STATUS HAS INCORRECT VALUE FOR APPLIED_SEQ#

DataGuard Redo Transport

8872096 – ARCHIVING FORCED DURING CLOSE WHEN NO STANDBY IS PRESENT

9399090 – STBH: CONSTANT/HIGH FREQUENT LOG SWITCHES ON BEEHIVE DATABASE IN THE LAST 3 DAYS

Shared Cursors

8865718 – RECURSIVE CURSORS CONTAINING “AS OF SNAPSHOT” CLAUSE ARE NOT SHARED

8981059 – HIGH VERSION COUNT:BIND_MISMATCH,USER_BIND_PEEK_MISMATCH,OPTIMIZER_MODE_MISMATCH

9010222 – APPS ST 11G ORA-00600 [KKSFBC-REPARSE-INFINITE-LOOP]

9067282 – TB:SH:ORA-00600:[KKSFBC-WRONG-KKSCSFLGS] WHILE RUNNING TPC-H

DML Drivers

9255542 – ARRAY INSERT TO PARTITIONED TABLE LOOSES ROWS DUE TO CONCURRENT DDL (ORA-14403)

9488887 – FORIEGN KEY VIOLATION WITH ARRAY-INSERT AND ONLINE IDX REBUILD AFTER BUG-9255542

Flashback Database

8834425 – ORA-240 IN RVWR PROCESS CAUSING 5MIN TRANSACTIONAL HANG

PLSQL

9210925 – AFTER MANUAL UPGRADE TO 11.1.0.7 PL/SQL CALLS INCORRECT FUNCTION

Automatic Memory Management

8505803 – PRE_PAGE_SGA RESULTS IN EXCESSIVE PAGE TABLE SIZE WHEN USING MEMORY_TARGET [AMM]

Partitioning

9165206 – PARTITIONING ORA-600 [KKPOLLS1] / [KKDOILSF1] – DURING PARTITION MAINTANANCE

Real Application Cluster

8875671 – LX64: ORA-600 ARGS [KJPNP_CHK:!MASTER_READY],

9093300 – LOTS OF REPEATED KJXOCDR: DROP DUPLICATE OPEN MESSAGE IN LMD TRACE

Row Access Method

8544696 – TABLE GROWTH – BLOCKS ARE NOT REUSED

Streams

8650719 – DOWNSTREAM CAPTURE ABORTS WITH ORA-26766

Secure Files

8856478 – RAM SECUREFILE PERF DEGRADATION WITH SF COMPRESSION ON SMALL LOBS DURING ATB MOVE

9272086 – STBH: DATA PUMP WRITER SEEMS TO BE WAITING ON WAIT FOR UNREAD MESSAGE ON BROADCA

DB Recovery

8909984 – APPSST GSI 11G: GAPS IN AWR SNAPSHOTS

9068088 – MEDIA RECOVERY WAS HUNG ON STANDBY

9145541 – ORA-600 [25027] / ORA-600 [4097] FOR ACTIVE TX IN A PLUGGED TABLESPACE

9167285 – PKT-BUGOLTP: ORA-07445: [KCRALC()+87]

Space Management

7519406 – ‘J000’ TRACE FILE REGARDING GATHER_STATS_JOB INTERMITTENTLY SINCE 10.2.0.4

8815639 – [11GR2-LNX-090813] MULTIPLE INSERT CAUSE DATA ALLOCATION ABOVE HHWM

9216806 – HIGH “ENQ: TS – CONTENTION” FOR TEMPORARY SEGMENT WHILE SQLLDR DIRECT PATH LOAD

9242411 – STRESS-BIGBH: LOTS OF OR-3113S IN BIGBH STRESS TEST

9461782 – ORA-7445 [KTSLF_SUMFSG()+54] [SIGSEGV] AND KTSLFSUM_CFS ON CALL STACK

Compression

9011088 – [11GR2]ADDING COLUMN TO COMPRESSED TABLE, DATA LOSS OCCURED.

9275072 – APPSST GSI 11G : BUFFER BUSY WAITS INSERTING INTO TABLES

9341448 – APPSST GSI 11G : BUFFER BUSY WAITS AND LATCH: CACHE BUFFERS WAITS WHEN INSERTING

9637033 – ORA-07445[KDR9IR2RST0] INSERT AS SELECT IN A COMPRESSED TABLE WITH > 255 COLUMNS

SQL Execution

8664189 – ORA-00600 [KDISS_UNCOMPRESS: BUFFER LENGTH]

9119194 – PSRC: DISTRIBUTED QUERY SLOWER IN 10.2.0.4 COMPARED TO 10.2.0.3

Transaction Management

8268775 – PERF: HIGH US ENQUEUE CONTENTION DURING A LOGIN STORM OR SESSION FAILOVER

8803762 – ORA-00600 [KDSGRP1] BLOCK CORRUPTION ON 11G DATABASE UPGRADE

Memory Management

8431487 – INSTANCE CRASH ORA-07445 [KGGHSTFEL()+192] ORA-07445[KGGHSTMAP()+241]

Message

9713537 – ENHANCE CAUSE/ACTION FIELDS OF THE INTERNAL ERROR ORA-00600

9714832 – ENHANCE CAUSE/ACTION FIELDS OF THE INTERNAL ERROR ORA-07445

x$ksusecst 内部视图详解

9i 中v$session_wait 是Oracle wait interface的一个主要用户接口,而该动态视图的内容来源于x$ksusecst内部视图:

SQL> select view_definition from v$fixed_view_definition where view_name='GV$SESSION_WAIT';
VIEW_DEFINITION
--------------------------------------------------------------------------------
select s.inst_id,s.indx,s.ksussseq,e.kslednam, e.ksledp1,s.ksussp1,s.ksussp1r,e.
ksledp2, s.ksussp2,s.ksussp2r,e.ksledp3,s.ksussp3,s.ksussp3r, decode(s.ksusstim,
0,0,-1,-1,-2,-2,   decode(round(s.ksusstim/10000),0,-1,round(s.ksusstim/10000)))
, s.ksusewtm, decode(s.ksusstim, 0, 'WAITING', -2, 'WAITED UNKNOWN TIME',  -1, '
WAITED SHORT TIME', 'WAITED KNOWN TIME')  from x$ksusecst s, x$ksled e where bit
and(s.ksspaflg,1)!=0 and bitand(s.ksuseflg,1)!=0 and s.ksussseq!=0 and s.ksussop
c=e.indx
SQL> desc x$ksusecst
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
ADDR                                               RAW(4)
//即 v$session中 saddr 会话的起始地址
INDX                                               NUMBER
//即 instance_id
INST_ID                                            NUMBER
//即 sid
KSSPAFLG                                           NUMBER
KSUSEFLG                                           NUMBER
//该session是否仍活着, 1 为 alive
KSUSENUM                                           NUMBER
//另一个固有编号
KSUSSSEQ                                           NUMBER
// 相当于v$session 视图的SERIAL#列
KSUSSOPC                                           NUMBER
// 对应x$ksled视图indx列,等待事件列表的一个序列号
KSUSSP1                                            NUMBER
// 即v$session_wait表的p1列
KSUSSP1R                                           RAW(4)
// 即v$session_wait表的p1raw
KSUSSP2                                            NUMBER
// 即v$session_wait表的p2
KSUSSP2R                                           RAW(4)
// 即v$session_wait表的p2raw
KSUSSP3                                            NUMBER
// 即v$session_wait表的p3
KSUSSP3R                                           RAW(4)
// 即v$session_wait表的p3raw
KSUSSTIM                                           NUMBER
// 即v$session_wait表的wait_time,但单位为微秒
KSUSEWTM                                           NUMBER
// 即v$session_wait表的seconds_in_wait,单位仍为秒

粗略写了一个可以代替v$session_wait视图的查询语句,过滤了可能出现的空闲等待事件,同时细化wait_time列到us级别:

select s.inst_id,
s.indx sid,
s.ksussseq seq#,
e.kslednam event,
e.ksledp1 p1text,
s.ksussp1 p1,
s.ksussp1r p1raw,
e.ksledp2 p2text,
s.ksussp2 p2,
s.ksussp2r p2raw,
e.ksledp3 p3text,
s.ksussp3 p3,
s.ksussp3r p3raw,
s.ksusstim wait_time,
s.ksusewtm seconds_in_wait,
decode(s.ksusstim,
0,
'WAITING',
-2,
'WAITED UNKNOWN TIME',
-1,
'WAITED SHORT TIME',
'WAITED KNOWN TIME') state
from x$ksusecst s, x$ksled e
where bitand(s.ksspaflg, 1) != 0
and bitand(s.ksuseflg, 1) != 0
and s.ksussseq != 0
and s.ksussopc = e.indx
and e.kslednam not in ('pmon timer',
'VKTM Logical Idle Wait',
'VKTM Init Wait for GSGA',
'IORM Scheduler Slave Idle Wait',
'rdbms ipc message',
'i/o slave wait',
'VKRM Idle',
'wait for unread message on broadcast channel',
'wait for unread message on multiple broadcast channels',
'class slave wait',
'KSV master wait',
'PING',
'watchdog main loop',
'DIAG idle wait',
'ges remote message',
'gcs remote message',
'heartbeat monitor sleep',
'SGA: MMAN sleep for component shrink',
'MRP redo arrival',
'LNS ASYNC archive log',
'LNS ASYNC dest activation',
'LNS ASYNC end of log',
'simulated log write delay',
'LGWR real time apply sync',
'parallel recovery slave idle wait',
'LogMiner builder: idle',
'LogMiner builder: branch',
'LogMiner preparer: idle',
'LogMiner reader: log (idle)',
'LogMiner reader: redo (idle)',
'LogMiner client: transaction',
'LogMiner: other',
'LogMiner: activate',
'LogMiner: reset',
'LogMiner: find session',
'LogMiner: internal',
'Logical Standby Apply Delay',
'parallel recovery coordinator waits for slave cleanup',
'parallel recovery control message reply',
'parallel recovery slave next change',
'PX Deq: Txn Recovery Start',
'PX Deq: Txn Recovery Reply',
'fbar timer',
'smon timer',
'PX Deq: Metadata Update',
'Space Manager: slave idle wait',
'PX Deq: Index Merge Reply',
'PX Deq: Index Merge Execute',
'PX Deq: Index Merge Close',
'PX Deq: kdcph_mai',
'PX Deq: kdcphc_ack',
'shared server idle wait',
'dispatcher timer',
'cmon timer',
'pool server timer',
'JOX Jit Process Sleep',
'jobq slave wait',
'pipe get',
'PX Deque wait',
'PX Idle Wait',
'PX Deq: Join ACK',
'PX Deq Credit: need buffer',
'PX Deq Credit: send blkd',
'PX Deq: Msg Fragment',
'PX Deq: Parse Reply',
'PX Deq: Execute Reply',
'PX Deq: Execution Msg',
'PX Deq: Table Q Normal',
'PX Deq: Table Q Sample',
'Streams fetch slave: waiting for txns',
'Streams: waiting for messages',
'Streams capture: waiting for archive log',
'single-task message',
'SQL*Net message from client',
'SQL*Net vector message from client',
'SQL*Net vector message from dblink',
'PL/SQL lock timer',
'Streams AQ: emn coordinator idle wait',
'EMON slave idle wait',
'Streams AQ: waiting for messages in the queue',
'Streams AQ: waiting for time management or cleanup tasks',
'Streams AQ: delete acknowledged messages',
'Streams AQ: deallocate messages from Streams Pool',
'Streams AQ: qmn coordinator idle wait',
'Streams AQ: qmn slave idle wait',
'Streams AQ: RAC qmn coordinator idle wait',
'HS message to agent',
'ASM background timer',
'auto-sqltune: wait graph update',
'WCR: replay client notify',
'WCR: replay clock',
'WCR: replay paused',
'JS external job',
'cell worker idle',
'SQL*Net message to client');

关于参数log_file_name_convert

Oracle文档对于该参数的描述十分容易产生歧义:converts the filename of a new log file on the primary database to the filename of a log file on the standby database,有时被误解为归档日志的文件名转换。

如在某standby备库进行以下测试:

 

alter system set log_file_name_convert='orcl','ZZZZZZ' scope=spfile;
SQL> select fnnam,fnonm from x$kccfn;
FNNAM
--------------------------------------------------------------------------------
FNONM
--------------------------------------------------------------------------------
/u01/oradata/ZZZZZZ/redo03.log
/u01/oradata/orcl/redo03.log
/u01/oradata/ZZZZZZ/redo02.log
/u01/oradata/orcl/redo02.log
/u01/oradata/ZZZZZZ/redo01.log
/u01/oradata/orcl/redo01.log
alter system set log_file_name_convert='orcl','8888888' scope=spfile;
SQL> select fnnam,fnonm from x$kccfn;
FNNAM
--------------------------------------------------------------------------------
FNONM
--------------------------------------------------------------------------------
/u01/oradata/8888888/redo03.log
/u01/oradata/orcl/redo03.log
/u01/oradata/8888888/redo02.log
/u01/oradata/orcl/redo02.log
/u01/oradata/8888888/redo01.log
/u01/oradata/orcl/redo01.log

v$datafile中的大部分信息来源于x$kccfn内部视图,kccfn意为[F]ile [N]ames来源于Controlfile,其中 fnnam为经过对controlfile中文件名记录转制(由db_file_name_convert或 log_file_name_convert等参数convert)后的记录,而fnonm为控制文件中的原始文件名(或曰文件路径)。若在Data Guard配置过程中遭遇到日志文件名或数据文件名的转制问题,可以通过查询该视图进一步分析。

author: maclean
permanent link:https://www.askmaclean.com/2010/05/31/%E5%85%B3%E4%BA%8E%E5%8F%82%E6%95%B0log_file_name_convert/
date:2010-05-31
All rights reserved.

famous summary stack trace from Oracle Version 8.1.7.4.0 Bug Note

as this bug note claimed that:

PROBLEM:
——–
Customer frequently receives the following errors while rollback of a
transcation using Portal application:

ORA-603: ORACLE server session terminated by fatal error
ORA-600: internal error code, arguments: [6856], [0], [0], [], [], [], [],
[]

ORA-600: internal error code, arguments: [25012], [3], [15], [], [], [], [],
[]

DIAGNOSTIC ANALYSIS:
——————–
Alert.log:
~~~~~~~~~~
Wed May 19 12:47:28 2004
Errors in file /opt/oracle/admin/ORTPTP/udump/ortptp_ora_6363.trc:
ORA-603: ORACLE server session terminated by fatal error
ORA-600: internal error code, arguments: [6856], [0], [0], [], [], [], [],
[]
Wed May 19 14:38:39 2004
Errors in file /opt/oracle/admin/ORTPTP/udump/ortptp_ora_782.trc:
ORA-600: internal error code, arguments: [25012], [3], [15], [], [], [], [],
[]

Tablespace 3 = TEMP tablespace.

Block dump in tracefile ortptp_ora_21207.trc points to TEMP tablespace and
TEMP segment:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Block header dump:  0x00c0b917
Object id on Block? Y
seg/obj: 0xc0b916  csc: 0x00.18f4bc  itc: 1  flg: O  typ: 1 – DATA
fsl: 0  fnx: 0x0 ver: 0x01
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

WORKAROUND:
———–

RELATED BUGS:
————-
3562030

REPRODUCIBILITY:
—————-
Frequently

TEST CASE:
———-

STACK TRACE:
————
Summary Stack   (to Full stack)   (to Function List)
ksedmp             # KSE: dump the process state
kgeriv             # KGE Record Internal error code (with Va_list) (IGNORE)
kgeasi             # Raise an error on an ASSERTION failure (IGNORE)
kdbmrd             ? Module Notes: kdb.c – Kernel Data Block structure and
internal manipulation
kdoqmd             ? Module Notes: kdo.c – Kernel Data Operations
kcoapl             NAME: kcoapl – Kernel Cache Op APpLy
kcbchg1
kcbchg
ktuapundo          ktuapundo – Kernel Transaction Undo APply UNdo
ktbapundo          ktbapundo – Kernel Transaction Block APply UNdo
kdoiur             declare local objects */
kcoubk             kcoubk – Kernel Cache Op Undo callBacK — invoke undo
callback routine    */
ktundo             ktundo – Kernel Transaction UNDO
ktubko             Get undo record to rollback transaction, non-CR only */
ktuabt             ktuabt – Kernel Transaction Undo ABorT
*/
ktcrab             KTC: Kernel Transaction Control Real ABort – Abort a
transaction.
ktdabt
k2labo             abort session: first abort aborts tx
k2send             TESTING SUPPORT:
xctrol             XaCTion ROLlback: Rollback the current transaction of the
current session.
opiodr             OPIODR: ORACLE code request driver – route the current
request
ttcpip             TTCPIP: Two Task Common PIPe read/write
opitsk             opitsk – Two Task Oracle Side Function Dispatcher
opiino             opiino – ORACLE Program Interface INitialize Opi
opiodr             OPIODR: ORACLE code request driver – route the current
request
opidrv             # opidrv – ORACLE Program Interface DRiVer (IGNORE)
sou2o              # Main Oracle executable entry point
main               # Standard executable entry point
start              # C program entry point (IGNORE)
**********************************************************************************************

another summary:

drepprep     perform the document indexing
evapls    EVAluate any PLSql function
kcmclscn    check Lamport SCN
kcsadj1    adjust SCN
kgesinv    KGE Signal Internal (Named) error (with VA_list)
kghalo    KGH: main allocation entry point
kghalp    KGH: Allocate permanent memory
kghfnd    KGH: Find a chunk of at least the minimum size
kghfrunp    KGH: Ask client to free unpinned space
kghfrx    Free extent. This is called when a heap is unpinned to request that it
kghgex    KGH: Get a new extent
kghnospc    KGH: There is no space available in the heap
kghpmalo    KGH: Find and return a permanent chunk of space
kghxal    Allocate a fixed size piece of shared memory.
kglhpd    KGL HeaP Deallocate
kglobcl    KGL OBject CLear all tables
kglpnal    KGL PiN ALlOcate
kglpnc    KGL: PiN heaps and load data pieces of a Cursor object
kglpndl    KGL PiN DeLete
kglrfcl    KGL ReFerence CLear
kgmexec    KGM EXECute
kkmpost    POST PROCESSING
kksalx    ALlocate ‘size’ bytes from the eXecution-time heap
kkscls    KKS: Close the cursor, user is done with it
kkspfda    Multiple context area management
kkssbt    KKS: set bind types
kksscl    KKS: scan child list?
koklcopy    KOK Lob COPY.
koklcpb2c    KOK Lob CoPy Binary data (BFILE/BLOB) into Clob
kolfgdir    KOL File Get DIRectory object, path and FileNames.
kpuexec    KPU: Execute
kpuexecv8    KPU: Execute V8
kpurcsc    KPU Remote Call with ServiceContext, Callbacks
kqdgtc    return an open and parsed cursor for the given statement
kqldprr    KQLD Parent Referential constraint Read
kqllod    KQL: database object load
kqlsadd    kqlsadd – KQLS ADD a new element to a subordinate set
kqlslod    KQLS: Load all subordinate set elements for a given heap
kslcll    KSL: Clean up after a given latch
kslcllt    Clean up after a given latch
kslilcr    invoke latch cleanup routine:
ksmapg    KSM: Callback function for allocating a PGA extent, calls OSD to alloc
ksmasg    Callback function for allocating an SGA extent.
kssxdl    KSS: delete SO ignoring all except severe errors. cleans latches
ksucln    KSUCLN: Cleanup detached process
ksudlc    delete call
ksudlp    KSU: delete process.called when user detaches or during cleanup by PMON
ksuxda    KSUCLN: Attempt to delete all processes that are marked dead.
ksuxdl    KSUCLN: Delete state object for PMON
ksuxfl    KSU: Find dead processes and cleanup their latches. Called by PMON
kxfpbgpc    Get Permanent Chunks
kxfpbgtc    Buffer Allocation Get Chunk
kxfpnfy    KXFP: NotiFY (component notifier)
kxfxse    KXFX: execute
kxstcls    Trace cursor closing
opicca    ORACLE Program Interface: Clear Context Area
opiclo    ORACLE Program Interface: CLOse cursor
opiprs    ORACLE Program Interface: PaRSe
opitca    OPITCA: sets up the context area
pextproc    Pefm call EXTernal PROCedure
qerocStart    This function creates a collection iterator row-source to iterate
qkadrv    QKADRV: allocate query structures
qkajoi    QKAJOI: Query Kernel Allocation: JOIn processing
qximeop    QXIM Evaluate OPerand
rpicls    RPI: Recursive Program Interface CLoSe
selexe    SELEXE: prepare context area for fetch
xtyinpr    XTY Insert Numeric PRecision operator

 

ORA-600 Lookup Error Categories

Applies to:

Oracle Server – Enterprise Edition – Version:
Oracle Server – Personal Edition – Version:
Oracle Server – Standard Edition – Version:
Information in this document applies to any platform.
Checked for relevance 04-Jun-2009

Purpose

This note aims to provide a high level overview of the internal errors which may be encountered on the Oracle Server (sometimes referred to as the Oracle kernel). It is written to provide a guide to where a particular error may live and give some indication as to what the impact of the problem may be. Where a problem is reproducible and connected with a specific feature, you might obviously try not using the feature. If there is a consistent nature to the problem, it is good practice to ensure that the latest patchsets are in place and that you have taken reasonable measures to avoid known issues.

For repeatable issues which the ora-600 tool has not listed a likely cause , it is worth constructing a test case. Where this is possible, it greatly assists in the resolution time of any issue. It is important to remember that, in a many instances , the Server is very flexible and a workaround can very often be achieved.

Scope and Application

This bulletin provides Oracle DBAs with an overview of internal database errors.

Disclaimer: Every effort has been made to provide a reasonable degree of accuracy in what has been stated. Please consider that the details provided only serve to provide an indication of functionality and, in some cases, may not be wholly correct.

ORA-600 Lookup Error Categories

In the Oracle Server source, there are two types of ora-600 error :

  • the first parameter is a number which reflects the source component or layer the error is connected with; or
  • the first parameter is a mnemonic which indicates the source module where the error originated. This type of internal error is now used in preference to an internal error number.

Both types of error may be possible in the Oracle server.

Internal Errors Categorised by number range

The following table provides an indication of internal error codes used in the Oracle server. Thus, if ora-600[X] is encountered, it is possible to glean some high level background information : the error in generated in the Y layer which indicates that there may be a problem with Z.

Ora-600 Base Functionality Description
1 Service Layer The service layer has within it a variety of service related components which are associated with in memory related activities in the SGA such as, for example : the management of Enqueues, System Parameters, System state objects (these objects track the use of structures in the SGA by Oracle server processes), etc.. In the main, this layer provides support to allow process communication and provides support for locking and the management of structures to support multiple user processes connecting and interacting within the SGA. Note : vos  – Virtual Operating System provides features to support the functionality above.  As the name suggests it provides base functionality in much the same way as is provided by an Operating System.

Ora-600 Base Functionality Description
1 vos Component notifier
100 vos Debug
300 vos Error
500 vos Lock
700 vos Memory
900 vos System Parameters
1100 vos System State object
1110 vos Generic Linked List management
1140 vos Enqueue
1180 vos Instance Locks
1200 vos User State object
1400 vos Async Msgs
1700 vos license Key
1800 vos Instance Registration
1850 vos I/O Services components
2000 Cache Layer Where errors are generated in this area, it is advisable to check whether the error is repeatable and whether the error is perhaps associated with recovery or undo type operations; where this is the case and the error is repeatable, this may suggest some kind of hardware or physical issue with a data file, control file or log file. The Cache layer is responsible for making the changes to the underlying files and well as managing the related memory structures in the SGA. Note : rcv indicates recovery. It is important to remember that the Oracle cache layer is effectively going through the same code paths as used by the recovery mechanism.

Ora-600 Base Functionality Description
2000 server/rcv Cache Op
2100 server/rcv Control File mgmt
2200 server/rcv Misc (SCN etc.)
2400 server/rcv Buffer Instance Hash Table
2600 server/rcv Redo file component
2800 server/rcv Db file
3000 server/rcv Redo Application
3200 server/cache Buffer manager
3400 server/rcv Archival & media recovery component
3600 server/rcv recovery component
3700 server/rcv Thread component
3800 server/rcv Compatibility segment

It is important  to consider when the error occurred and the context in which the error was generated. If the error does not reproduce, it may be an in memory issue.

4000 Transaction Layer Primarily the transaction layer is involved with maintaining structures associated with the management of transactions.  As with the cache layer , problems encountered in this layer may indicate some kind of issue at a physical level. Thus it is important to try and repeat the same steps to see if the problem recurs.

Ora-600 Base Functionality Description
4000 server/txn Transaction Undo
4100 server/txn Transaction Undo
4210 server/txn Transaction Parallel
4250 server/txn Transaction List
4300 space/spcmgmt Transaction Segment
4400 txn/lcltx Transaction Control
4450 txn/lcltx distributed transaction control
4500 txn/lcltx Transaction Block
4600 space/spcmgmt Transaction Table
4800 dict/rowcache Query Row Cache
4900 space/spcmgmt Transaction Monitor
5000 space/spcmgmt Transaction Extent

It is important to try and determine what the object involved in any reproducible problem is. Then use the analyze command. For more information, please refer to the analyze command as detailed in the context of  Note:28814.1; in addition, it may be worth using the dbverify as discussed in Note:35512.1.

6000 Data Layer The data layer is responsible for maintaining and managing the data in the database tables and indexes. Issues in this area may indicate some kind of physical issue at the object level and therefore, it is important to try and isolate the object and then perform an anlayze on the object to validate its structure.

Ora-600 Base Functionality Description
6000 ram/data
ram/analyze
ram/index
data, analyze command and index related activity
7000 ram/object lob related errors
8000 ram/data general data access
8110 ram/index index related
8150 ram/object general data access

Again, it is important to try and determine what the object involved in any reproducible problem is. Then use the analyze command. For more information, please refer to the analyze command as detailed in the context of  Note:28814.1; in addition, it may be worth using the dbverify as discussed in Note:35512.1.

12000 User/Oracle Interface & SQL Layer Components This layer governs the user interface with the Oracle server. Problems generated by this layer usually indicate : some kind of presentation or format error in the data received by the server, i.e. the client may have sent incomplete information; or there is some kind of issue which indicates that the data is received out of sequence

Ora-600 Base Functionality Description
12200 progint/kpo
progint/opi
lob related
errors at interface level on server side, xa , etc.
12300 progint/if OCI interface to coordinating global transactions
12400 sqlexec/rowsrc table row source access
12600 space/spcmgmt operations associated with tablespace : alter / create / drop operations ; operations associated with create table / cluster
12700 sqlexec/rowsrc bad rowid
13000 dict/if dictionary access routines associated with kernel compilation
13080 ram/index kernel Index creation
13080 sqllang/integ constraint mechanism
13100 progint/opi archival and Media Recovery component
13200 dict/sqlddl alter table mechanism
13250 security/audit audit statement processing
13300 objsupp/objdata support for handling of object generation and object access
14000 dict/sqlddl sequence generation
15000 progint/kpo logon to Oracle
16000 tools/sqlldr sql loader related

You should try and repeat the issue and with the use of sql trace , try and isolate where exactly the issue may be occurring within the application.

14000 System Dependent Component internal error values This layer manages interaction with the OS. Effectively it acts as the glue which allows the Oracle server to interact with the OS. The types of operation which this layer manages are indicated as follows.

Ora-600 Base Functionality Description
14000 osds File access
14100 osds Concurrency management;
14200 osds Process management;
14300 osds Exception-handler or signal handler management
14500 osds Memory allocation
15000 security/dac,
security/logon
security/ldap
local user access validation; challenge / response activity for remote access validation; auditing operation; any activities associated with granting and revoking of privileges; validation of password with external password file
15100 dict/sqlddl this component manages operations associated with creating, compiling (altering), renaming, invalidating, and dropping  procedures, functions, and packages.
15160 optim/cbo cost based optimizer layer is used to determine optimal path to the data based on statistical information available on the relevant tables and indexes.
15190 optim/cbo cost based optimizer layer. Used in the generation of a new index to determine how the index should be created. Should it be constructed from the table data or from another index.
15200 dict/shrdcurs used to in creating sharable context area associated with shared cursors
15230 dict/sqlddl manages the compilation of triggers
15260 dict/dictlkup
dict/libcache
dictionary lookup and library cache access
15400 server/drv manages alter system and alter session operations
15410 progint/if manages compilation of pl/sql packages and procedures
15500 dict/dictlkup performs dictionary lookup to ensure semantics are correct
15550 sqlexec/execsvc
sqlexec/rowsrc
hash join execution management;
parallel row source management
15600 sqlexec/pq component provides support for Parallel Query operation
15620 repl/snapshots manages the creation of snapshot or materialized views as well as related snapshot / MV operations
15640 repl/defrdrpc layer containing various functions for examining the deferred transaction queue and retrieving information
15660 jobqs/jobq manages the operation of the Job queue background processes
15670 sqlexec/pq component provides support for Parallel Query operation
15700 sqlexec/pq component provides support for Parallel Query operation; specifically mechanism for starting up and shutting down query slaves
15800 sqlexec/pq component provides support for Parallel Query operation
15810 sqlexec/pq component provides support for Parallel Query operation; specifically functions for creating mechanisms through which Query co-ordinator can communicate with PQ slaves;
15820 sqlexec/pq component provides support for Parallel Query operation
15850 sqlexec/execsvc component provides support for the execution of SQL statements
15860 sqlexec/pq component provides support for Parallel Query operation
16000 loader sql Loader direct load operation;
16150 loader this layer is used for ‘C’ level call outs to direct loader operation;
16200 dict/libcache this is part of library Cache operation. Amongst other things it manages the dependency of SQL objects and tracks who is permitted to access these objects;
16230 dict/libcache this component is responsible for managing access to remote objects as part of library Cache operation;
16300 mts/mts this component relates to MTS (Multi Threaded Server) operation
16400 dict/sqlddl this layer contains functionality which allows tables to be loaded / truncated and their definitions to be modified. This is part of dictionary operation;
16450 dict/libcache this layer layer provides support for multi-instance access to the library cache; this functionality is applicable therefore to OPS environments;
16500 dict/rowcache this layer provides support to load / cache Oracle’s dictionary in memory in the library cache;
16550 sqlexec/fixedtab this component maps data structures maintained in the Oracle code to fixed tables such that they can be queried using the SQL layer;
16600 dict/libcache this layer performs management of data structures within the library cache;
16651 dict/libcache this layer performs management of dictionary related information within library Cache;
16701 dict/libcache this layer provides library Cache support to support database creation and forms part of the bootstrap process;
17000 dict/libcache this is the main library Cache manager. This Layer maintains the in memory representation of cached sql statements together will all the necessary support that this demands;
17090 generic/vos this layer implementations error management operations: signalling errors, catching  errors, recovering from errors, setting error frames, etc.;
17100 generic/vos Heap manager. The Heap manager manages the storage of internal data in an orderly and consistent manner. There can be many heaps serving various purposes; and heaps within heaps. Common examples are the SGA heap, UGA heap and the PGA heap. Within a Heap there are consistency markers which aim to ensure that the Heap is always in a consistent state. Heaps are use extensively and are in memory structures – not on disk.
17200 dict/libcache this component deals with loading remote library objects into the local library cache with information from the remote database.
17250 dict/libcache more library cache errors ; functionality for handling pipe operation associated with dbms_pipe
17270 dict/instmgmt this component manages instantiations of procedures, functions, packages, and cursors in a session. This provides a means to keep track of what has been loaded in the event of process death;
17300 generic/vos manages certain types of memory allocation structure.  This functionality is an extension of the Heap manager.
17500 generic/vos relates to various I/O operations. These relate to async i/o operation,  direct i/o operation and the management of writing buffers from the buffer cache by potentially a number of database writer processes;
17625 dict/libcache additional library Cache supporting functions
17990 plsql plsql ‘standard’ package related issues
18000 txn/lcltx transaction and savepoint management operations
19000 optim/cbo cost based optimizer related operations
20000 ram/index bitmap index and index related errors.
20400 ram/partnmap operations on partition related objects
20500 server/rcv server recovery related operation
21000 repl/defrdrpc,
repl/snapshot,
repl/trigger
replication related features
23000 oltp/qs AQ related errors.
24000 dict/libcache operations associated with managing stored outlines
25000 server/rcv tablespace management operations

Internal Errors Categorised by mnemonic

The following table details mnemonics error stems which are possible. If you have encountered : ora-600[kkjsrj:1] for example, you should look down the Error Mnemonic column (errors in alphabetical order) until you find the matching stem. In this case, kkj indicates that something unexpected has occurred in job queue operation.

Error Mnemonic(s) Functionality Description
ain ainp ram/index ain – alter index; ainp –  alter index partition management operation
apacb optim/rbo used by optimizer in connect by processing
atb atbi atbo ctc ctci cvw dict/sqlddl alter table , create table (IOT) or cluster operations as well as create view related operations (with constraint handling functionality)
dbsdrv sqllang/parse alter / create database operation
ddfnet progint/distrib various distributed operations on remote dictionary
delexe sqlexec/dmldrv manages the delete statement operation
dix ram/index manages drop index or validate index operation
dtb dict/sqlddl manages drop table operation
evaa2g evah2p evaa2g dbproc/sqlfunc various functions involves in evaluating operand outcomes such as : addition , average, OR operator, bites AND , bites OR, concatenation, as well as Oracle related functions : count(), dump() , etc. The list is extensive.
expcmo expgon dbproc/expreval handles expression evaluation with respect to two operands being equivalent
gra security/dac manages the granting and revoking of privilege rights to a user
gslcsq plsldap support for operations with an LDAP server
insexe sqlexec/dmldrv handles the insert statement operation
jox progint/opi functionality associated with the Java compiler and with the Java runtime environment within the Server
k2c k2d progint/distrib support for database to database operation in distributed environements as well as providing, with respect to the 2-phase commit protocol, a globally unique Database id
k2g k2l txn/disttx support for the 2 phase commit protocol protocol and the coordination of the various states in managing the distributed transaction
k2r k2s k2sp progint/distrib k2r – user interface for managing distributed transactions and combining distributed results ; k2s – handles logging on, starting a transaction, ending a transaction and recovering a transaction; k2sp – management of savepoints in a distributed environment.
k2v txn/disttx handles distributed recovery operation
kad cartserv/picklercs handles OCIAnyData implementation
kau ram/data manages the modification of indexes for inserts, updates and delete operations for IOTs as well as modification of indexes for IOTs
kcb kcbb kcbk kcbl kcbs kcbt kcbw kcbz cache manages Oracle’s buffer cache operation as well as operations used by capabilities such as direct load, has clusters , etc.
kcc kcf rcv manages and coordinates operations on the control file(s)
kcit context/trigger internal trigger functionality
kck rcv compatibility related checks associated with the compatible parameter
kcl cache background lck process which manages locking in a RAC or parallel server multiple instance environment
kco kcq kcra kcrf kcrfr kcrfw kcrp kcrr kcs kct kcv rcv various buffer cache operation such as quiesce operation , managing fast start IO target, parallel recovery operation , etc.
kd ram/data support for row level dependency checking and some log miner operations
kda ram/analyze manages the analyze command and collection of statistics
kdbl kdc kdd ram/data support for direct load operation, cluster space management and deleting rows
kdg ram/analyze gathers information about the underlying data and is used by the analyze command
kdi kdibc3 kdibco kdibh kdibl kdibo kdibq kdibr kdic kdici kdii kdil kdir kdis kdiss kdit kdk ram/index support of the creation of indexes on tables an IOTs and index look up
kdl kdlt ram/object lob and temporary lob management
kdo ram/data operations on data such as inserting a row piece or deleting a row piece
kdrp ram/analyze underlying support for operations provided by the dbms_repair package
kds kdt kdu ram/data operations on data such as retrieving a row and updating existing row data
kdv kdx ram/index functionality for dumping index and managing index blocks
kfc kfd kfg asm support for ASM file and disk operations
kfh kfp kft rcv support for writing to file header and transportable tablespace operations
kgaj kgam kgan kgas kgat kgav kgaz argusdbg/argusdbg support for Java Debug Wire Protocol (JDWP) and debugging facilites
kgbt kgg kgh kghs kghx kgkp vos kgbt – support for BTree operations; kgg – generic lists processing; kgh – Heap Manager : managing the internal structures withing the SGA / UGA / PGA and ensures their integrity; kghs – Heap manager with Stream support; kghx – fixed sized shared memory manager; kgkp – generic services scheduling policies
kgl kgl2 kgl3 kgla kglp kglr kgls dict/libcache generic library cache operation
kgm kgmt ilms support for inter language method services – or calling one language from another
kgrq kgsk kgski kgsn kgss vos support for priority queue and scheduling; capabilities for Numa support;  Service State object manager
kgupa kgupb kgupd0 kgupf kgupg kgupi kgupl kgupm kgupp kgupt kgupx kguq2 kguu vos Service related activities activities associated with for Process monitor (PMON); spawning or creating of background processes; debugging; managing process address space;  managing the background processes; etc.
kgxp vos inter process communication related functions
kjak kjat kjb kjbl kjbm kjbr kjcc kjcs kjctc kjcts kjcv kjdd kjdm kjdr kjdx kjfc kjfm kjfs kjfz kjg kji kjl kjm kjp kjr kjs kjt kju kjx ccl/dlm dlm related functionality ; associated with RAC or parallel server operation
kjxgf kjxgg kjxgm kjxgn kjxgna kjxgr ccl/cgs provides communication & synchronisation associated with GMS or OPS related functionality as well as name service and OPS Instance Membership Recovery Facility
kjxt ccl/dlm DLM request message management
kjzc kjzd kjzf kjzg kjzm ccl/diag support for diagnosibility amongst OPS related services
kkb dict/sqlddl support for operatoins which load/change table definitions
kkbl kkbn kkbo objsupp/objddl support for tables with lobs , nested tables and varrays as well as columns with objects
kkdc kkdl kkdo dict/dictlkup support for constraints, dictionary lookup and dictionary support for objects
kke optim/cbo query engine cost engine; provides support functions that provide cost estimates for queries under a number of different circumstances
kkfd sqlexec/pq support for performing parallel query operation
kkfi optim/cbo optimizer support for matching of expressions against functional ndexes
kkfr kkfs sqlexec/pq support for rowid range handling as well as for building parallel query query operations
kkj jobqs/jobq job queue operation
kkkd kkki dict/dbsched resource manager related support. Additionally, provides underlying functions provided by dbms_resource_manager and dbms_resource_manager_privs packages
kklr dict/sqlddl provides functions used to manipulate LOGGING and/or RECOVERABLE attributes of an object (non-partitioned table or index or  partitions of a partitioned table or index)
kkm kkmi dict/dictlkup provides various semantic checking functions
kkn ram/analyze support for the analyze command
kko kkocri optim/cbo Cost based Optimizer operation : generates alternative execution plans in order to find the optimal / quickest access to the data.  Also , support to determine cost and applicability of  scanning a given index in trying to create or rebuild an index or a partition thereof
kkpam kkpap ram/partnmap support for mapping predicate keys expressions to equivalent partitions
kkpo kkpoc kkpod dict/partn support for creation and modification of partitioned objects
kkqg kkqs kkqs1 kkqs2 kkqs3 kkqu kkqv kkqw optim/vwsubq query rewrite operation
kks kksa kksh kksl kksm dict/shrdcurs support for managing shared cursors/ shared sql
kkt dict/sqlddl support for creating, altering and dropping trigger definitions as well as handling the trigger operation
kkxa repl/defrdrpc underlying support for dbms_defer_query package operations
kkxb dict/sqlddl library cache interface for external tables
kkxl dict/plsicds underlying support for the dbms_lob package
kkxm progint/opi support for inter language method services
kkxs dict/plsicds underlying support for the dbms_sys_sql package
kkxt repl/trigger support for replication internal trigger operation
kkxwtp progint/opi entry point into the plsql compiler
kky drv support for alter system/session commands
kkz kkzd kkzf kkzg kkzi kkzj kkzl kkzo kkzp kkzq kkzr kkzu kkzv repl/snapshot support for snapshots or Materialized View validation and operation
kla klc klcli klx tools/sqlldr support for direct path sql loader operation
kmc kmcp kmd kmm kmr mts/mts support for Multi Threaded server operation (MTS) : manange and operate the virtual circuit mechanism, handle the dispatching of massages, administer shared servers and for collecting and maintaining statistics associated with MTS
knac knafh knaha knahc knahf knahs repl/apply replication apply operation associated with Oracle streams
kncc repl/repcache support for replication related information stored and maintained in library cache
kncd knce repl/defrdrpc replication related enqueue and dequeue of transction data as well as other queue related operations
kncog repl/repcache support for loading replicaiton object group information into library cache
kni repl/trigger support for replication internal trigger operation
knip knip2 knipi knipl knipr knipu knipu2 knipx repl/intpkg support for replication internal package operation.
kno repl/repobj support for replication objects
knp knpc knpcb knpcd knpqc knps repl/defrdrpc operations assocaied with propagating transactions to a remote node and coordination of this activity.
knst repl/stats replication statistics collection
knt kntg kntx repl/trigger support for replication internal trigger operation
koc objmgmt/objcache support for managing ADTs objects in the OOCI heap
kod objmgmt/datamgr support for persistent storage for objects : for read/write objects, to manage object IDs, and to manage object concurrency and recovery.
koh objmgmt/objcache object heap manager provides memory allocation services for objects
koi objmgmt/objmgr support for object types
koka objsupp/objdata support for reading images, inserting images, updating images, and deleting images based on object references (REFs).
kokb kokb2 objsupp/objsql support for nested table objects
kokc objmgmt/objcache support for pinning , unpinning and freeing objects
kokd objsupp/datadrv driver on the server side for managing objects
koke koke2 koki objsupp/objsql support for managing objects
kokl objsupp/objdata lob access
kokl2 objsupp/objsql lob DML and programmatic interface support
kokl3 objsupp/objdata object temporary LOB support
kokle kokm objsupp/objsql object SQL evaluation functions
kokn objsupp/objname naming support for objects
koko objsupp/objsup support functions to allow oci/rpi to communicate with Object Management Subsystem (OMS).
kokq koks koks2 koks3 koksr objsupp/objsql query optimisation for objects , semantic checking and semantic rewrite operations
kokt kokt2 kokt3 objsupp/objddl object compilation type manager
koku kokv objsupp/objsql support for unparse object operators and object view support
kol kolb kole kolf kolo objmgmt/objmgr support for object Lob buffering , object lob evaluation and object Language/runtime functions for Opaque types
kope2 kopi2 kopo kopp2 kopu koputil kopz objmgmt/pickler 8.1 engine implementation,  implementation of image ops for 8.1+ image format together with various pickler related support functions
kos objsupp/objsup object Stream interfaces for images/objects
kot kot2 kotg objmgmt/typemgr support for dynamic type operations to create, delete, and  update types.
koxs koxx objmgmt/objmgt object generic image Stream routines and miscellaneous generic object functions
kpcp kpcxlt progint/kpc Kernel programmatic connection pooling and kernel programmatic common type XLT translation routines
kpki progint/kpki kernel programatic interface support
kpls cartserv/corecs support for string formatting operations
kpn progint/kpn support for server to server communication
kpoal8 kpoaq kpob kpodny kpodp kpods kpokgt kpolob kpolon kpon progint/kpo support for programmatic operations
kpor progint/opi support for streaming protocol used by replication
kposc progint/kpo support for scrollable cursors
kpotc progint/opi oracle side support functions for setting up trusted external procedure callbacks
kpotx kpov progint/kpo support for managing local and distributed transaction coordination.
kpp2 kpp3 sqllang/parse kpp2 – parse routines for dimensions;
kpp3 – parse support for create/alter/drop summary  statements
kprb kprc progint/rpi support for executing sql efficiently on the Oracle server side as well as for copying data types during rpi operations
kptsc progint/twotask callback functions provided to all streaming operation as part of replication functionality
kpu kpuc kpucp progint/kpu Oracle kernel side programmatic user interface,  cursor management functions and client side connection pooling support
kqan kqap kqas argusdbg/argusdbg server-side notifiers and callbacks for debug operations.
kql kqld kqlp dict/libcache SQL Library Cache manager – manages the sharing of sql statements in the shared pool
kqr dict/rowcache row cache management. The row cache consists of a set of facilities to provide fast access to table definitions and locking capabilities.
krbi krbx krby krcr krd krpi rcv Backup and recovery related operations :
krbi – dbms_backup_restore package underlying support.; krbx –  proxy copy controller; krby – image copy; krcr – Recovery Controlfile Redo; krd – Recover Datafiles (Media & Standby Recovery);  krpi – support for the package : dbms_pitr
krvg krvt rcv/vwr krvg – support for generation of redo associated with DDL; krvt – support for redo log miner viewer (also known as log miner)
ksa ksdp ksdx kse ksfd ksfh ksfq ksfv ksi ksim ksk ksl ksm ksmd ksmg ksn ksp kspt ksq ksr kss ksst ksu ksut vos support for various kernel associated capabilities
ksx sqlexec/execsvc support for query execution associated with temporary tables
ksxa ksxp ksxr vos support for various kernel associated capabilities in relation to OPS or RAC operation
kta space/spcmgmt support for DML locks and temporary tables associated with table access
ktb ktbt ktc txn/lcltx transaction control operations at the block level : locking block, allocating space within the block , freeing up space, etc.
ktec ktef ktehw ktein ktel kteop kteu space/spcmgmt support for extent management operations :
ktec – extent concurrency operations; ktef – extent format; ktehw – extent high water mark operations; ktein – extent  information operations; ktel – extent support for sql loader; kteop – extent operations : add extent to segment, delete extent, resize extent, etc. kteu – redo support for operations changing segment header / extent map
ktf txn/lcltx flashback support
ktfb ktfd ktft ktm space/spcmgmt ktfb – support for bitmapped space manipulation of files/tablespaces;  ktfd – dictionary-based extent management; ktft – support for temporary file manipulation; ktm – SMON operation
ktp ktpr ktr ktri txn/lcltx ktp – support for parallel transaction operation; ktpr – support for parallel transaction recovery; ktr – kernel transaction read consistency;
ktri – support for dbms_resumable package
ktsa ktsap ktsau ktsb ktscbr ktsf ktsfx ktsi ktsm ktsp ktss ktst ktsx ktt kttm space/spcmgmt support for checking and verifying space usage
ktu ktuc ktur ktusm txn/lcltx internal management of undo and rollback segments
kwqa kwqi kwqic kwqid kwqie kwqit kwqj kwqm kwqn kwqo kwqp kwqs kwqu kwqx oltp/qs support for advanced queuing :
kwqa – advanced queue administration; kwqi – support for AQ PL/SQL trusted callouts; kwqic – common AQ support functions; kwqid – AQ dequeue support; kwqie – AQ enqueu support ; kwqit – time management operation ; kwqj – job queue scheduler for propagation; kwqm – Multiconsumer queue IOT support; kwqn – queue notifier; kwqo – AQ support for checking instType checking options; kwqp – queueing propagation; kwqs – statistics handling; kwqu – handles lob data. ; kwqx – support for handling transformations
kwrc kwre oltp/re rules engine evaluation
kxcc kxcd kxcs sqllang/integ constraint processing
kxdr sqlexec/dmldrv DML driver entrypoint
kxfp kxfpb kxfq kxfr kxfx sqlexec/pq parallel query support
kxhf kxib sqlexec/execsvc khhf- support for hash join file and memory management; kxib – index buffering operations
kxs dict/instmgmt support for executing shared cursors
kxti kxto kxtr dbproc/trigger support for trigger operation
kxtt ram/partnmap support for temporary table operations
kxwph ram/data support for managing attributes of the segment of a table / cluster / table-partition
kza security/audit support for auditing operations
kzar security/dac support for application auditing
kzck security/crypto encryption support
kzd security/dac support for dictionary access by security related functions
kzec security/dbencryption support inserting and retrieving encrypted objects into and out of the database
kzfa kzft security/audit support for fine grained auditing
kzia security/logon identification and authentication operations
kzp kzra kzrt kzs kzu kzup security/dac security related operations associated with privileges
msqima msqimb sqlexec/sqlgen support for generating sql statments
ncodef npi npil npixfr progint/npi support for managing remote network connection from  within the server itself
oba sqllang/outbufal operator buffer allocate for various types of operators : concatenate, decode, NVL, etc.  the list is extensive.
ocik progint/oci OCI oracle server functions
opiaba opidrv opidsa opidsc opidsi opiexe opifch opiino opilng opipar opipls opirip opitsk opix progint/opi OPI Oracle server functions – these are at the top of the server stack and are called indirectly by ythe client in order to server the client request.
orlr objmgmt/objmgr support for  C langauge interfaces to user-defined types (UDTs)
orp objmgmt/pickler oracle’s external pickler / opaque type interfaces
pesblt pfri pfrsqc plsql/cox pesblt – pl/sql built in interpreter; pfri – pl/sql runtime; pfrsqc – pl/sql callbacks for array sql and dml with returning
piht plsql/gen/utl support for pl/sql implementation of utl_http package
pirg plsql/cli/utl_raw support for pl/sql implementation of utl_raw package
pism plsql/cli/utl_smtp support for pl/sql implementation of utl_smtp package
pitcb plsql/cli/utl_tcp support for pl/sql implementation of utl_tcp package
piur plsql/gen/utl_url support for pl/sql implementation of utl_url package
plio plsql/pkg pl/sql object instantiation
plslm plsql/cox support for NCOMP processing
plsm pmuc pmuo pmux objmgmt/pol support for pl/sql handling of collections
prifold priold plsql/cox support to allow rpc forwarding to an older release
prm sqllang/param parameter handling associated with sql layer
prsa prsc prssz sqllang/parse prsa – parser for alter cluster command; prsc – parser for create database command; prssz – support for parse context to be saved
psdbnd psdevn progint/dbpsd psdbnd – support for managing bind variables; psdevn – support for pl/sql debugger
psdicd progint/plsicds small number of ICD to allow pl/sql to call into ‘C’ source
psdmsc psdpgi progint/dbpsd psdmsc – pl/sql system dependent miscellaneous functions ; psdpgi – support for opening and closing cursors in pl/sql
psf plsql/pls pl/sql service related functions for instantiating called pl/sql unit in library cache
qbadrv qbaopn sqllang/qrybufal provides allocation of buffer and control structures in query execution
qcdl qcdo dict/dictlkup qcdl – query compile semantic analysis; qcdo – query compile dictionary support for objects
qci dict/shrdcurs support for SQL language parser and semantic analyser
qcop qcpi qcpi3 qcpi4 qcpi5 sqllang/parse support for query compilation parse phase
qcs qcs2 qcs3 qcsji qcso dict/dictlkup support for semantic analysis by SQL compiler
qct qcto sqllang/typeconv qct – query compile type check operations; qcto –  query compile type check operators
qcu sqllang/parse various utilities provided for sql compilation
qecdrv sqllang/qryedchk driver performing high level checks on sql language query capabilities
qerae qerba qerbc qerbi qerbm qerbo qerbt qerbu qerbx qercb qercbi qerco qerdl qerep qerff qerfi qerfl qerfu qerfx qergi qergr qergs qerhc qerhj qeril qerim qerix qerjm qerjo qerle qerli qerlt qerns qeroc qeroi qerpa qerpf qerpx qerrm qerse qerso qersq qerst qertb qertq qerua qerup qerus qervw qerwn qerxt sqlexec/rowsrc row source operators :
qerae – row source (And-Equal) implementation; qerba – Bitmap Index AND row source; qerbc – bitmap index compaction row source; qerbi – bitmap index creation row source; qerbm – QERB Minus row source; qerbo  – Bitmap Index OR row source; qerbt – bitmap convert row source; qerbu – Bitmap Index Unlimited-OR row source; qerbx – bitmap index access row source; qercb – row source: connect by; qercbi – support for connect by; qerco – count row source; qerdl – row source delete; qerep – explosion row source; qerff – row source fifo buffer; qerfi  – first row row source; qerfl  – filter row source definition; qerfu – row source: for update; qerfx – fixed table row source; qergi – granule iterator row source; qergr – group by rollup row source; qergs – group by sort row source; qerhc – row sources hash clusters; qerhj – row source Hash Join;  qeril  – In-list row source; qerim – Index Maintenance row source; qerix – Index row source; qerjo – row source: join; qerle – linear execution row source implementation; qerli – parallel create index; qerlt – row source populate Table;  qerns  – group by No Sort row source; qeroc – object collection iterator row source; qeroi – extensible indexing query component; qerpa – partition row sources; qerpf – query execution row source: prefetch; qerpx – row source: parallelizer; qerrm – remote row source; qerse – row source: set implementation; qerso – sort row source; qersq – row source for sequence number; qerst  – query execution row sources: statistics; qertb – table row source; qertq  – table queue row source; qerua – row source : union-All;
qerup – update row source; qerus – upsert row source ; qervw – view row source; qerwn – WINDOW row source; qerxt – external table fetch row source
qes3t qesa qesji qesl qesmm qesmmc sqlexec/execsvc run time support for sql execution
qkacon qkadrv qkajoi qkatab qke qkk qkn qkna qkne sqlexec/rwsalloc SQL query dynamic structure allocation routines
qks3t sqlexec/execsvc query execution service associated with temp table transformation
qksmm qksmms qksop sqllang/compsvc qksmm –  memory management services for the SQL compiler; qksmms – memory management simulation services for the SQL compiler; qksop – query compilation service for operand processing
qkswc sqlexec/execsvc support for temp table transformation associated for with clause.
qmf xmlsupp/util support for ftp server; implements processing of ftp commands
qmr qmrb qmrs xmlsupp/resolver support hierarchical resolver
qms xmlsupp/data support for storage and retrieval of XOBs
qmurs xmlsupp/uri support for handling URIs
qmx qmxsax xmlsupp/data qmx – xml support; qmxsax – support for handling sax processing
qmxtc xmlsupp/sqlsupp support for ddl  and other operators related to the sql XML support
qmxtgx xmlsupp support for transformation : ADT -> XML
qmxtsk xmlsupp/sqlsupp XMLType support functions
qsme summgmt/dict summary management expression processing
qsmka qsmkz dict/dictlkup qsmka – support to analyze request in order to determine whether a summary could be created that would be useful; qsmkz – support for create/alter summary semantic analysis
qsmp qsmq qsmqcsm qsmqutl summgmt/dict qsmp – summary management partition processing; qsmq – summary management dictionary access; qsmqcsm – support for create / drop / alter summary and related dimension operations; qsmqutl – support for summaries
qsms summgmt/advsvr summary management advisor
qxdid objsupp/objddl support for domain index ddl operations
qxidm objsupp/objsql support for extensible index dml operations
qxidp objsupp/objddl support for domain index ddl partition operations
qxim objsupp/objsql extensible indexing support for objects
qxitex qxopc qxope objsupp/objddl qxitex – support for create / drop indextype; qxope – execution time support for operator  callbacks; qxope – execution time support for operator DDL
qxopq qxuag qxxm objsupp/objsql qxopq – support for queries with user-defined operators; qxuag – support for user defined aggregate processing; qxxm – queries involving external tables
rfmon rfra rfrdb rfrla rfrm rfrxpt drs implements 9i data guard broker monitor
rnm dict/sqlddl manages rename statement operation
rpi progint/rpi recursive procedure interface which handles the the environment setup where multiple recursize statements are executed from one top level statement
rwoima sqlexec/rwoprnds row operand operations
rwsima sqlexec/rowsrc row source implementation/retrieval according to the defining query
sdbima sqlexec/sort manages and performs sort operation
selexe sqlexec/dmldrv handles the operation of select statement execution
skgm osds platform specific memory management rountines interfacing with O.S. allocation functions
smbima sor sqlexec/sort manages and performs sort operation
sqn dict/sqlddl support for parsing references to sequences
srdima srsima stsima sqlexec/sort manages and performs sort operation
tbsdrv space/spcmgmt operations for executing create / alter / drop tablespace and related supporting functions
ttcclr ttcdrv ttcdty ttcrxh ttcx2y progint/twotask two task common layer which provides high level interaction and negotiation functions for Oracle client when communicating with the server.  It also provides important function of converting client side data / data types into equivalent on the server and vice versa
uixexe ujiexe updexe upsexe sqlexec/dmldrv support for : index maintenance operations, the execution of the update statement and associated actions connected with update as well as the upsert command which combines the operations of update and insert
vop optim/vwsubq view optimisation related functionality
xct txn/lcltx support for the management of transactions and savepoint operations
xpl sqlexec/expplan support for the explain plan command
xty sqllang/typeconv type checking functions
zlke security/ols/intext label security error handling component

Script to Collect Data Guard Diagnostic Information

Overview
——–

This script is intended to provide an easy method to provide information
necessary to troubleshoot Data Guard issues.

Script Notes
————-

This script is intended to be run via sqlplus as the SYS or Internal user.

Script
——-

- - - - - - - - - - - - - - - - Script begins here - - - - - - - - - - - - - - - -
-- NAME: dg_prim_diag.sql  (Run on PRIMARY with a LOGICAL or PHYSICAL STANDBY)
set echo off
set feedback off
column timecol new_value timestamp
column spool_extension new_value suffix
select to_char(sysdate,'Mondd_hhmi') timecol,
'.out' spool_extension from sys.dual;
column output new_value dbname
select value || '_' output
from v$parameter where name = 'db_name';
spool dg_prim_diag_&&dbname&&timestamp&&suffix
set linesize 79
set pagesize 35
set trim on
set trims on
alter session set nls_date_format = 'MON-DD-YYYY HH24:MI:SS';
set feedback on
select to_char(sysdate) time from dual;
set echo on
-- In the following the database_role should be primary as that is what
-- this script is intended to be run on.  If protection_level is different
-- than protection_mode then for some reason the mode listed in
-- protection_mode experienced a need to downgrade.  Once the error
-- condition has been corrected the protection_level should match the
-- protection_mode after the next log switch.
column role format a7 tru
column name format a10 wrap
select name,database_role role,log_mode,
protection_mode,protection_level
from v$database;
-- ARCHIVER can be (STOPPED | STARTED | FAILED). FAILED means that the
-- archiver failed to archive a log last time, but will try again within 5
-- minutes. LOG_SWITCH_WAIT The ARCHIVE LOG/CLEAR LOG/CHECKPOINT event log
-- switching is waiting for.  Note that if ALTER SYSTEM SWITCH LOGFILE is
-- hung, but there is room in the current online redo log, then value is
-- NULL
column host_name format a20 tru
column version format a9 tru
select instance_name,host_name,version,archiver,log_switch_wait
from v$instance;
-- The following query give us information about catpatch.
-- This way we can tell if the procedure doesn't match the image.
select version, modified, status from dba_registry
where comp_id = 'CATPROC';
-- Force logging is not mandatory but is recommended.  Supplemental
-- logging must be enabled if the standby associated with this primary is
-- a logical standby. During normal operations it is acceptable for
-- SWITCHOVER_STATUS to be SESSIONS ACTIVE or TO STANDBY.
column force_logging format a13 tru
column remote_archive format a14 tru
column dataguard_broker format a16 tru
select force_logging,remote_archive,
supplemental_log_data_pk,supplemental_log_data_ui,
switchover_status,dataguard_broker
from v$database;
-- This query produces a list of all archive destinations.  It shows if
-- they are enabled, what process is servicing that destination, if the
-- destination is local or remote, and if remote what the current mount ID
-- is.
column destination format a35 wrap
column process format a7
column archiver format a8
column ID format 99
column mid format 99
select dest_id "ID",destination,status,target,
schedule,process,mountid  mid
from v$archive_dest order by dest_id;
-- This select will give further detail on the destinations as to what
-- options have been set.  Register indicates whether or not the archived
-- redo log is registered in the remote destination control file.
set numwidth 8
column ID format 99
select dest_id "ID",archiver,transmit_mode,affirm,async_blocks async,
net_timeout net_time,delay_mins delay,reopen_secs reopen,
register,binding
from v$archive_dest order by dest_id;
-- The following select will show any errors that occured the last time
-- an attempt to archive to the destination was attempted.  If ERROR is
-- blank and status is VALID then the archive completed correctly.
column error format a55 wrap
select dest_id,status,error from v$archive_dest;
-- The query below will determine if any error conditions have been
-- reached by querying the v$dataguard_status view (view only available in
-- 9.2.0 and above):
column message format a80
select message, timestamp
from v$dataguard_status
where severity in ('Error','Fatal')
order by timestamp;
-- The following query will determine the current sequence number
-- and the last sequence archived.  If you are remotely archiving
-- using the LGWR process then the archived sequence should be one
-- higher than the current sequence.  If remotely archiving using the
-- ARCH process then the archived sequence should be equal to the
-- current sequence.  The applied sequence information is updated at
-- log switch time.
select ads.dest_id,max(sequence#) "Current Sequence",
max(log_sequence) "Last Archived"
from v$archived_log al, v$archive_dest ad, v$archive_dest_status ads
where ad.dest_id=al.dest_id
and al.dest_id=ads.dest_id
group by ads.dest_id;
-- The following select will attempt to gather as much information as
-- possible from the standby.  SRLs are not supported with Logical Standby
-- until Version 10.1.
set numwidth 8
column ID format 99
column "SRLs" format 99
column Active format 99
select dest_id id,database_mode db_mode,recovery_mode,
protection_mode,standby_logfile_count "SRLs",
standby_logfile_active ACTIVE,
archived_seq#
from v$archive_dest_status;
-- Query v$managed_standby to see the status of processes involved in
-- the shipping redo on this system.  Does not include processes needed to
-- apply redo.
select process,status,client_process,sequence#
from v$managed_standby;
-- The following query is run on the primary to see if SRL's have been
-- created in preparation for switchover.
select group#,sequence#,bytes from v$standby_log;
-- The above SRL's should match in number and in size with the ORL's
-- returned below:
select group#,thread#,sequence#,bytes,archived,status from v$log;
-- Non-default init parameters.
set numwidth 5
column name format a30 tru
column value format a48 wra
select name, value
from v$parameter
where isdefault = 'FALSE';
spool off
- - - - - - - - - - - - - - - -  Script ends here  - - - - - - - - - - - - - - - -

Overview
——–

This script is intended to provide an easy method to provide information
necessary to troubleshoot Data Guard issues.

Script Notes
————-

This script is intended to be run via sqlplus as the SYS or Internal user.

Script
——-

- - - - - - - - - - - - - - - - Script begins here - - - - - - - - - - - - - - - -
-- NAME: DG_phy_stby_diag.sql
set echo off
set feedback off
column timecol new_value timestamp
column spool_extension new_value suffix
select to_char(sysdate,'Mondd_hhmi') timecol,
'.out' spool_extension from sys.dual;
column output new_value dbname
select value || '_' output
from v$parameter where name = 'db_name';
spool dgdiag_phystby_&&dbname&&timestamp&&suffix
set lines 200
set pagesize 35
set trim on
set trims on
alter session set nls_date_format = 'MON-DD-YYYY HH24:MI:SS';
set feedback on
select to_char(sysdate) time from dual;
set echo on
--
-- ARCHIVER can be  (STOPPED | STARTED | FAILED) FAILED means that the archiver failed
-- to archive a -- log last time, but will try again within 5 minutes. LOG_SWITCH_WAIT
-- The ARCHIVE LOG/CLEAR LOG/CHECKPOINT event log switching is waiting for. Note that
-- if ALTER SYSTEM SWITCH LOGFILE is hung, but there is room in the current online
-- redo log, then value is NULL
column host_name format a20 tru
column version format a9 tru
select instance_name,host_name,version,archiver,log_switch_wait from v$instance;
-- The following select will give us the generic information about how this standby is
-- setup.  The database_role should be standby as that is what this script is intended
-- to be ran on.  If protection_level is different than protection_mode then for some
-- reason the mode listed in protection_mode experienced a need to downgrade.  Once the
-- error condition has been corrected the protection_level should match the protection_mode
-- after the next log switch.
column ROLE format a7 tru
select name,database_role,log_mode,controlfile_type,protection_mode,protection_level
from v$database;
-- Force logging is not mandatory but is recommended.  Supplemental logging should be enabled
-- on the standby if a logical standby is in the configuration. During normal
-- operations it is acceptable for SWITCHOVER_STATUS to be SESSIONS ACTIVE or NOT ALLOWED.
column force_logging format a13 tru
column remote_archive format a14 tru
column dataguard_broker format a16 tru
select force_logging,remote_archive,supplemental_log_data_pk,supplemental_log_data_ui,
switchover_status,dataguard_broker from v$database;
-- This query produces a list of all archive destinations and shows if they are enabled,
-- what process is servicing that destination, if the destination is local or remote,
-- and if remote what the current mount ID is. For a physical standby we should have at
-- least one remote destination that points the primary set but it should be deferred.
COLUMN destination FORMAT A35 WRAP
column process format a7
column archiver format a8
column ID format 99
select dest_id "ID",destination,status,target,
archiver,schedule,process,mountid
from v$archive_dest;
-- If the protection mode of the standby is set to anything higher than max performance
-- then we need to make sure the remote destination that points to the primary is set
-- with the correct options else we will have issues during switchover.
select dest_id,process,transmit_mode,async_blocks,
net_timeout,delay_mins,reopen_secs,register,binding
from v$archive_dest;
-- The following select will show any errors that occured the last time an attempt to
-- archive to the destination was attempted.  If ERROR is blank and status is VALID then
-- the archive completed correctly.
column error format a55 tru
select dest_id,status,error from v$archive_dest;
-- Determine if any error conditions have been reached by querying thev$dataguard_status
-- view (view only available in 9.2.0 and above):
column message format a80
select message, timestamp
from v$dataguard_status
where severity in ('Error','Fatal')
order by timestamp;
-- The following query is ran to get the status of the SRL's on the standby.  If the
-- primary is archiving with the LGWR process and SRL's are present (in the correct
-- number and size) then we should see a group# active.
select group#,sequence#,bytes,used,archived,status from v$standby_log;
-- The above SRL's should match in number and in size with the ORL's returned below:
select group#,thread#,sequence#,bytes,archived,status from v$log;
-- Query v$managed_standby to see the status of processes involved in the
-- configuration.
select process,status,client_process,sequence#,block#,active_agents,known_agents
from v$managed_standby;
-- Verify that the last sequence# received and the last sequence# applied to standby
-- database.
select al.thrd "Thread", almax "Last Seq Received", lhmax "Last Seq Applied"
from (select thread# thrd, max(sequence#) almax
from v$archived_log
where resetlogs_change#=(select resetlogs_change# from v$database)
group by thread#) al,
(select thread# thrd, max(sequence#) lhmax
from v$log_history
where first_time=(select max(first_time) from v$log_history)
group by thread#) lh
where al.thrd = lh.thrd;
-- The V$ARCHIVE_GAP fixed view on a physical standby database only returns the next
-- gap that is currently blocking redo apply from continuing. After resolving the
-- identified gap and starting redo apply, query the V$ARCHIVE_GAP fixed view again
-- on the physical standby database to determine the next gap sequence, if there is
-- one.
select * from v$archive_gap;
-- Non-default init parameters.
set numwidth 5
column name format a30 tru
column value format a50 wra
select name, value
from v$parameter
where isdefault = 'FALSE';
spool off
- - - - - - - - - - - - - - - -  Script ends here  - - - - - - - - - - - - - - - -

Overview
——–

This script is intended to provide an easy method to provide information
necessary to troubleshoot Data Guard issues.

Script Notes
————-

This script is intended to be run via sqlplus as the SYS or Internal user.

Script
——-

- - - - - - - - - - - - - - - - Script begins here - - - - - - - - - - - - - - - -
-- NAME: dg_lsby_diag.sql  (Run on LOGICAL STANDBY)
set echo off
set feedback off
column timecol new_value timestamp
column spool_extension new_value suffix
select to_char(sysdate,'Mondd_hhmi') timecol,
'.out' spool_extension from sys.dual;
column output new_value dbname
select value || '_' output
from v$parameter where name = 'db_name';
spool dg_lsby_diag_&&dbname&&timestamp&&suffix
set linesize 79
set pagesize 180
set long 1000
set trim on
set trims on
alter session set nls_date_format = 'MM/DD HH24:MI:SS';
set feedback on
select to_char(sysdate) time from dual;
set echo on
-- The following select will give us the generic information about how
-- this standby is setup.  The database_role should be logical standby as
-- that is what this script is intended to be ran on.
column ROLE format a7 tru
column NAME format a8 wrap
select name,database_role,log_mode,protection_mode
from v$database;
-- ARCHIVER can be (STOPPED | STARTED | FAILED). FAILED means that the
-- archiver failed to archive a log last time, but will try again within 5
-- minutes. LOG_SWITCH_WAIT The ARCHIVE LOG/CLEAR LOG/CHECKPOINT event log
-- switching is waiting for. Note that if ALTER SYSTEM SWITCH LOGFILE is
-- hung, but there is room in the current online redo log, then value is
-- NULL
column host_name format a20 tru
column version format a9 tru
select instance_name,host_name,version,archiver,log_switch_wait
from v$instance;
-- The following query give us information about catpatch.
-- This way we can tell if the procedure doesn't match the image.
select version, modified, status from dba_registry
where comp_id = 'CATPROC';
-- Force logging and supplemental logging are not mandatory but are
-- recommended if you plan to switchover.  During normal operations it is
-- acceptable for SWITCHOVER_STATUS to be SESSIONS ACTIVE or NOT ALLOWED.
column force_logging format a13 tru
column remote_archive format a14 tru
column dataguard_broker format a16 tru
select force_logging,remote_archive,supplemental_log_data_pk,
supplemental_log_data_ui,switchover_status,dataguard_broker
from v$database;
-- This query produces a list of all archive destinations.  It shows if
-- they are enabled, what process is servicing that destination, if the
-- destination is local or remote, and if remote what the current mount ID
-- is.
column destination format a35 wrap
column process format a7
column archiver format a8
column ID format 99
column mid format 99
select dest_id "ID",destination,status,target,
schedule,process,mountid  mid
from v$archive_dest order by dest_id;
-- This select will give further detail on the destinations as to what
-- options have been set.  Register indicates whether or not the archived
-- redo log is registered in the remote destination control file.
set numwidth 8
column ID format 99
select dest_id "ID",archiver,transmit_mode,affirm,async_blocks async,
net_timeout net_time,delay_mins delay,reopen_secs reopen,
register,binding
from v$archive_dest order by dest_id;
-- Determine if any error conditions have been reached by querying the
-- v$dataguard_status view (view only available in 9.2.0 and above):
column message format a80
select message, timestamp
from v$dataguard_status
where severity in ('Error','Fatal')
order by timestamp;
-- Query v$managed_standby to see the status of processes involved in
-- the shipping redo on this system.  Does not include processes needed to
-- apply redo.
select process,status,client_process,sequence#
from v$managed_standby;
-- Verify that log apply services on the standby are currently
-- running. If the query against V$LOGSTDBY returns no rows then logical
-- apply is not running.
column status format a50 wrap
column type format a11
set numwidth 15
SELECT TYPE, STATUS, HIGH_SCN
FROM V$LOGSTDBY;
-- The DBA_LOGSTDBY_PROGRESS view describes the progress of SQL apply
-- operations on the logical standby databases.  The APPLIED_SCN indicates
-- that committed transactions at or below that SCN have been applied. The
-- NEWEST_SCN is the maximum SCN to which data could be applied if no more
-- logs were received. This is usually the MAX(NEXT_CHANGE#)-1 from
-- DBA_LOGSTDBY_LOG.  When the value of NEWEST_SCN and APPLIED_SCN are the
-- equal then all available changes have been applied.  If your
-- APPLIED_SCN is below NEWEST_SCN and is increasing then SQL apply is
-- currently processing changes.
set numwidth 15
select
(case
when newest_scn = applied_scn then 'Done'
when newest_scn <= applied_scn + 9 then 'Done?'
when newest_scn > (select max(next_change#) from dba_logstdby_log)
then 'Near done'
when (select count(*) from dba_logstdby_log
where (next_change#, thread#) not in
(select first_change#, thread# from dba_logstdby_log)) > 1
then 'Gap'
when newest_scn > applied_scn then 'Not Done'
else '---' end) "Fin?",
newest_scn, applied_scn, read_scn from dba_logstdby_progress;
select newest_time, applied_time, read_time from dba_logstdby_progress;
-- Determine if apply is lagging behind and by how much.  Missing
-- sequence#'s in a range indicate that a gap exists.
set numwidth 15
column trd format 99
select thread# trd, sequence#,
first_change#, next_change#,
dict_begin beg, dict_end end,
to_char(timestamp, 'hh:mi:ss') timestamp,
(case when l.next_change# < p.read_scn then 'YES'
when l.first_change# < p.applied_scn then 'CURRENT'
else 'NO' end) applied
from dba_logstdby_log l, dba_logstdby_progress p
order by thread#, first_change#;
-- Get a history on logical standby apply activity.
set numwidth 15
select to_char(event_time, 'MM/DD HH24:MI:SS') time,
commit_scn, current_scn, event, status
from dba_logstdby_events
order by event_time, commit_scn, current_scn;
-- Dump logical standby stats
column name format a40
column value format a20
select * from v$logstdby_stats;
-- Dump logical standby parameters
column name format a33 wrap
column value format a33 wrap
column type format 99
select name, value, type from system.logstdby$parameters
order by type, name;
-- Gather log miner session and dictionary information.
set numwidth 15
select * from system.logmnr_session$;
select * from system.logmnr_dictionary$;
select * from system.logmnr_dictstate$;
select * from v$logmnr_session;
-- Query the log miner dictionary for key tables necessary to process
-- changes for logical standby Label security will move AUD$ from SYS to
-- SYSTEM.  A synonym will remain in SYS but Logical Standby does not
-- support this.
set numwidth 5
column name format a9 wrap
column owner format a6 wrap
select o.logmnr_uid, o.obj#, o.objv#, u.name owner, o.name
from system.logmnr_obj$ o, system.logmnr_user$ u
where
o.logmnr_uid = u.logmnr_uid and
o.owner# = u.user# and
o.name in ('JOB$','JOBSEQ','SEQ$','AUD$',
'FGA_LOG$','IND$','COL$','LOGSTDBY$PARAMETER')
order by u.name;
-- Non-default init parameters.
column name format a30 tru
column value format a48 wra
select name, value
from v$parameter
where isdefault = 'FALSE';
spool off
- - - - - - - - - - - - - - - -  Script ends here  - - - - - - - - - - - - - - - -

沪ICP备14014813号

沪公网安备 31010802001379号