Oracle Internal Event:10200 Consistent Read诊断事件

10200(consistent read buffer status)内部诊断事件可以用于探测一致性读CR(consistent read)块的访问情况,虽然cr读的统计信息可以从v$sysstat或AWR/statspack中获取,但是10200 event还是我们研究Consistent Read一致性读的有力工具。该事件可以通过在会话session级别设置ALTER SESSION SET EVENTS 或 DBMS_SYSTEM.SET_EV. Set 来开启,一般调用级别为Level 10。

该事件返回的trace跟踪文件,记录了为了实现一致性读的目的,哪些数据块以及这些块的各历史版本在执行过程中被创建(CR block creation)并检验(CR block inspection),以找出Best CR block满足Consistent一致性。

注意10200 Internal Event主要是被ktrgtc和ktrget(call ktrget to get one block ->calling KTR layer to apply RBS to have consistent read Block;)这2个Oracle内核功能函数触发,这2个内部函数Internal Function的主要作用:

ktrget:

  • Initializes a buffer cache CR scan request
  • Calls kcbgtcr for the best resident buffer to start from to build the CR buffer
  • Calls ktrgcm to build the CR buffer by applying undo
  • Returns CR buffer to the requestor

kcbgtcr:

  • If successful, returns the “best” candidate (performed by ktrexf or examination function)
  • Scans the hash bucket for the DBA for buffers that may be used to build a CR buffer
  • If not successful, calls kcbget

 

10200 event trace example:

 

[oracle@rh2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.4.0 - Production on Fri Sep 30 21:23:47 2011

Copyright (c) 1982, 2007, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing option

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmaclean.com & www.askmaclean.com

SQL> create table tv(t1 int);

Table created.

SQL> alter session set events '10200 trace name context forever,level 10';

Session altered.

SQL> select * from tv;

        T1
----------
         1

10200 trace

Consistent read started for block 0 : 0040081a
  env: (scn: 0x0000.000cf852 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00  statement num=0

parent xid: xid: 0x0000.000.00000000  scn: 0x0000.00000000 8sch: scn: 0x0000.00000000)
 CR exa ret 2 on: 0x600139d0  scn: 0xffff.ffffffff  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00

scn: 0xffff.ffffffff  sfl: 0

Consistent read finished for block 0 : 40081a

Consistent read started for block 0 : 0040e508
  env: (scn: 0x0000.000cf852  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00  statement num=0
parent xid: xid: 0x0000.000.00000000  scn: 0x0000.00000000 8sch: scn:
 0x0000.00000000)
Consistent read finished for block 0 : 40e508

 

以上trace中各代码的含义如下:

Consistent read started for block 0 : 0040081a
0 -> tablespace number 0040081a -> DBA

 

env: (scn: 0x0000.000cf852 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00
以上为环境/会话信息,这个scn是env_scn ,一般就是数据库的current_scn

 

SFL :0  -> SFL 为 Snapshot Flag

 

CR exa ret 2 -> 此处的ret为reture code返回代码,是ktrgtc/ktrget函数的返回码

 

以下为ktrgtc/ktrget函数部分可能返回代码的含义:

 

#define KCBRSTOP (8|0) /* return this one now */
#define KCBRSAVE (8|1) /* save this one and continue */
#define KCBRSKIP (0|1) /* skip over this one and continue */
#define KCBRQUIT (0|2) /* chuck all, return nothing and stop */
#define KCBRLAST (0|3) /* quit if read from disk else skip */

 

10200诊断事件在11g中得到了加强,通过该event我们可以获得更多有用的trace信息了:

 

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
MACLEAN

SQL> alter session set events '10200 trace name context forever, level 10';

Session altered.

SQL> select * from tv;

        T1
----------
         1

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug tracefile_name;
/s01/orabase/diag/rdbms/vprod/VPROD1/trace/VPROD1_ora_28365.trc

trace content

ktrget2(): started for block   objd: 0x000040e1
env [0x2b54cde2a704]: (scn: 0x0000.0026b064  xid: 0x0000.000.00000000  uba: 0x00000000.0000.00
statement num=0  parent xid: 0x0000.000.00000000  st-scn: 0x0000.00000000  hi-scn: 0x0000.00000000
ma-scn: 0x0000.0026b053  flg: 0x00000660)
ktrexf(): returning 9 on:  0xbb132d0  cr-scn: 0xffff.ffffffff  xid: 0x0000.000.00000000
uba: 0x00000000.0000.00  cl-scn: 0xffff.ffffffff  sfl: 0
ktrgcm(): completed for block   objd: 0x000040e1
ktrget3(): completed for  block  objd: 0x000040e1

了解你所不知道的SMON功能(七):清理IND$字典基表

SMON的作用还包括清理IND$字典基表(cleanup ind$):
ind$_flags

触发场景

当我们在线创建或重建索引时(create or rebuild index online),服务进程会到IND$字典基表中将该索引对应的记录的FLAGS字段修改为十进制的256或者512(见上图0x100=256,0×200=512),如:

SQL> create index macleans_index on larges(owner,object_name) online;

SQL> select obj# from obj$ where name='MACLEANS_INDEX';

      OBJ#
----------
   1343842

SQL> select FLAGS from ind$ where obj#=1343842;

     FLAGS
----------
       256

ind_online$字典基表记录了索引在线创建/重建的历史

SQL> select * from ind_online$;

      OBJ#      TYPE#      FLAGS
---------- ---------- ----------
   1343839          1        256
   1343842          1        256

create table ind_online$
( obj#          number not null,
  type#         number not null,              /* what kind of index is this? */
                                                               /* normal : 1 */
                                                               /* bitmap : 2 */
                                                              /* cluster : 3 */
                                                            /* iot - top : 4 */
                                                         /* iot - nested : 5 */
                                                            /* secondary : 6 */
                                                                 /* ansi : 7 */
                                                                  /* lob : 8 */
                                             /* cooperative index method : 9 */
  flags         number not null
                                      /* index is being online built : 0x100 */
                                    /* index is being online rebuilt : 0x200 */
)

原则上online create/rebuild index的的清理工作由实际操作的服务进程负责完成,这种清理在DDL语句成功的情况下包括一系列数据字典的维护,在该DDL语句失败的情形中包括对临时段的清理和数据字典的维护,无论如何都需要drop在线日志中间表 SYS_JOURNAL_nnnnn(nnnn为该索引的obj#)。数据字典的维护工作就包含对IND$基表中相应索引记录的FLAGS标志位的恢复,但是如果服务进程在语句执行过程中意外终止的话,那么短时间内FLAGS标志位字段就无法得到恢复,这将导致对该索引的后续操作因ORA-8104错误而无法继续:

SQL> drop index macleans_index;
drop index macleans_index
           *
ERROR at line 1:
ORA-08104: this index object 1343842 is being online built or rebuilt

08104, 00000, "this index object %s is being online built or rebuilt"
// *Cause:  the index is being created or rebuild or waited for recovering
//          from the online (re)build
// *Action: wait the online index build or recovery to complete

SMON负责在启动后(startup)的每小时执行一次对IND$基表中因在线创建/重建索引失败所留下记录的清理,这种清理工作由kdicclean函数驱动(kdicclean is run by smon every 1 hour,called from SMON to find if there is any online builder death and cleanup our ind$ and obj$ and drop the journal table, stop journaling)。
这种清理工作典型的调用堆栈stack call如下:

ksbrdp -> ktmSmonMain  ktmmon -> kdicclean -> kdic_cleanup -> ktssdrp_segment

注意因为SMON进程的清理工作每小时才执行一次,而且在工作负载很高的情况下可能实际很久都不会得到清理,在这种情景中我们总是希望能尽快完成对索引的在线创建或重建,在10gr2以后的版本中我们可以直接使用dbms_repair.online_index_clean来手动清理online index rebuild的遗留问题:

SQL> drop index macleans_index;
drop index macleans_index
           *
ERROR at line 1:
ORA-08104: this index object 1343842 is being online built or rebuilt

DECLARE
 isClean BOOLEAN;
BEGIN
  isClean := FALSE;
  WHILE isClean=FALSE
  LOOP
    isClean := dbms_repair.online_index_clean(
    dbms_repair.all_index_id, dbms_repair.lock_wait);
    dbms_lock.sleep(10);
  END LOOP;
END;
/

SQL>  drop index macleans_index;
 drop index macleans_index
            *
ERROR at line 1:
ORA-01418: specified index does not exist

成功清理

但是如果在9i中的话就比较麻烦,可以尝试用以下方法(不是很推荐,除非你已经等了很久):

1.首先手工删除在线日志表,通过以下手段找出这个中间表的名字

select object_name
  from dba_objects
 where object_name like
       (select '%' || object_id || '%'
          from dba_objects
         where object_name = '&INDEX_NAME')
/
Enter value for index_name: MACLEANS_INDEX
old   6:          where object_name = '&INDEX_NAME')
new   6:          where object_name = 'MACLEANS_INDEX')

OBJECT_NAME
--------------------------------------------------------------------------------
SYS_JOURNAL_1343845

SQL> drop table SYS_JOURNAL_1343845;

Table dropped.

2.第二步要手动修改IND$字典基表

!!!!!! 注意!手动修改数据字典要足够小心!!

select flags from ind$ where obj#=&INDEX_OBJECT_ID;
Enter value for index_object_id: 1343845
old   1: select flags from ind$ where obj#=&INDEX_OBJECT_ID
new   1: select flags from ind$ where obj#=1343845

     FLAGS
----------
       256

a) 针对online create index,手动删除对应的记录
delete from IND$ where obj#=&INDEX_OBJECT_ID

b) 针对online rebuild index,手动恢复对应记录的FLAGS标志位
update IND$ set FLAGS=FLAGS-512 where obj#=&INDEX_OBJECT_ID

接下来我们实际观察一下清理工作的细节:

SQL> select obj# from obj$ where name='MACLEANS_INDEX';

      OBJ#
----------
   1343854

SQL> select FLAGS from ind$ where obj#=1343854;

     FLAGS
----------
       256

SQL> oradebug setmypid;
Statement processed.
SQL> oradebug event 10046 trace name context forever,level 8;
Statement processed.
SQL> DECLARE
  2   isClean BOOLEAN;
  3  BEGIN
  4    isClean := FALSE;
  5    WHILE isClean=FALSE
  6    LOOP
  7      isClean := dbms_repair.online_index_clean(
  8      dbms_repair.all_index_id, dbms_repair.lock_wait);
  9
 10      dbms_lock.sleep(10);
 11    END LOOP;
 12  END;
 13  /

PL/SQL procedure successfully completed.

===============================10046 trace=============================
select i.obj#, i.flags, u.name, o.name, o.type#
  from sys.obj$ o, sys.user$ u, sys.ind_online$ i
 where (bitand(i.flags, 256) = 256 or bitand(i.flags, 512) = 512)
   and (not ((i.type# = 9) and bitand(i.flags, 8) = 8))
   and o.obj# = i.obj#
   and o.owner# = u.user#

select u.name,
       o.name,
       o.namespace,
       o.type#,
       decode(bitand(i.property, 1024), 0, 0, 1)
  from ind$ i, obj$ o, user$ u
 where i.obj# = :1
   and o.obj# = i.bo#
   and o.owner# = u.user#

delete from object_usage
 where obj# in (select a.obj#
                  from object_usage a, ind$ b
                 where a.obj# = b.obj#
                   and b.bo# = :1)

drop table "SYS"."SYS_JOURNAL_1343854" purge
delete from icoldep$ where obj# in (select obj# from ind$ where bo#=:1)

delete from ind$ where bo#=:1

delete from ind$ where obj#=:1

我们可以利用以下语句找出系统中可能需要恢复的IND$记录,注意不要看到查询有结果就认为这是操作失败的征兆,很可能是有人在线创建或重建索引:

select i.obj#, i.flags, u.name, o.name, o.type#
  from sys.obj$ o, sys.user$ u, sys.ind_online$ i
 where (bitand(i.flags, 256) = 256 or bitand(i.flags, 512) = 512)
   and (not ((i.type# = 9) and bitand(i.flags, 8) = 8))
   and o.obj# = i.obj#
   and o.owner# = u.user#
/

相关诊断事件可以通过设置诊断事件event=’8105 trace name context forever’
来禁止SMON清理IND$(Oracle event to turn off smon cleanup for online index build)

 alter system set events '8105 trace name context forever';

了解你所不知道的SMON功能(六):Instance Recovery

SMON的作用还包括RAC环境中的Instance Recovery,注意虽然Instance Recovery可以翻做实例恢复,但实际上和我们口头所说的实例恢复是不同的。我们口头语言所说的实例恢复很大程度上是指Crash Recovery崩溃恢复,Instance Recovery与Crash Recovery是存在区别的:针对单实例(single instance)或者RAC中所有节点全部崩溃后的恢复,我们称之为Crash Recovery。而对于RAC中的某一个节点失败,存活节点(surviving instance)试图对失败节点线程上redo做应用的情况,我们称之为Instance Recovery。对于Crash Recovery更多的内容可见<还原真实的cache recovery>一文。

现象

Instance Recovery期间分别存在cache recovery和ges/gcs remaster2个recovery stage,注意这2个舞台的恢复是同时进行的。cache recovery的主角是存活节点上的SMON进程,SMON负责分发redo给slave进程。而实施ges/gcs remaster的是RAC专有进程LMON。

整个Reconfiuration的过程如下图:
rac_instance_recovery_reconfiguration
 
注意以上Crash Detected时数据库进入部分可用(Partial Availability)状态,从Freeze Lockdb开始None Availability,到IR applies redo即前滚时转换为Partial Availability,待前滚完成后会实施回滚,但是此时数据库已经进入完全可用(Full Availability)状态了,如下图:
 
oracledatabase12g_instance_recovery

The graphic illustrates the degree of database availability during each step of Oracle instance recovery: 

A.         Real Application Clusters is running on multiple nodes.

B.         Node failure is detected.

C.         The enqueue part of the GRD is reconfigured; resource management is redistributed to the surviving nodes. This operation occurs relatively quickly.

D.        The cache part of the GRD is reconfigured and SMON reads the redo log of the failed instance to identify the database blocks that it needs to recover.

E.         SMON issues the GRD requests to obtain all the database blocks it needs for recovery. After the requests are complete, all other blocks are accessible.

F.         The Oracle server performs roll forward recovery. Redo logs of the failed threads are applied to the database, and blocks are available right after their recovery is completed.

G.        The Oracle server performs rollback recovery. Undo blocks are applied to the database for all uncommitted transactions.

H.        Instance recovery is complete and all data is accessible.

Note: The dashed line represents the blocks identified in step 2 in the previous slide. Also, the dotted steps represent the ones identified in the previous slide.

我们来实际观察一下Instance Recovery的过程:

INST 1:

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmaclean.com

SQL> alter system set event='10426 trace name context forever,level 12' scope=spfile;  -- 10426 event Reconfiguration trace event
System altered.

SQL> startup force;
ORACLE instance started.

INST 2:
SQL> shutdown abort
ORACLE instance shut down.

=============================================================
========================alert.log============================

Reconfiguration started (old inc 4, new inc 6)
List of instances:
 1 (myinst: 1)
 Global Resource Directory frozen
 * dead instance detected - domain 0 invalid = TRUE
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
 LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived
 Set master node info
 Submitted all remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
 Post SMON to start 1st pass IR
Instance recovery: looking for dead threads
Beginning instance recovery of 1 threads
 parallel recovery started with 2 processes                 --2 recovery slave
 Submitted all GCS remote-cache requests
 Post SMON to start 1st pass IR
 Fix write in gcs resources
Reconfiguration complete
Started redo scan
Completed redo scan
 read 88 KB redo, 82 data blocks need recovery
Started redo application at
 Thread 2: logseq 374, block 2, scn 54624376
Recovery of Online Redo Log: Thread 2 Group 4 Seq 374 Reading mem 0
  Mem# 0: +DATA/prod/onlinelog/group_4.271.747100549
  Mem# 1: +DATA/prod/onlinelog/group_4.272.747100553
Completed redo application of 0.07MB
Completed instance recovery at
 Thread 2: logseq 374, block 178, scn 54646382
 73 data blocks read, 83 data blocks written, 88 redo k-bytes read
Thread 2 advanced to log sequence 375 (thread recovery)
Redo thread 2 internally disabled at seq 375 (SMON)
ARC3: Creating local archive destination LOG_ARCHIVE_DEST_1: '/s01/arch/2_374_747100216.dbf' (thread 2 sequence 374) (PROD1)
Setting Resource Manager plan SCHEDULER[0x310B]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
ARC3: Closing local archive destination LOG_ARCHIVE_DEST_1: '/s01/arch/2_374_747100216.dbf' (PROD1)
2011-06-27 22:19:29.280000 +08:00
Archived Log entry 792 added for thread 2 sequence 374 ID 0x9790ab2 dest 1:
ARC0: Creating local archive destination LOG_ARCHIVE_DEST_1: '/s01/arch/2_375_747100216.dbf' (thread 2 sequence 375) (PROD1)
2011-06-27 22:19:30.336000 +08:00
ARC0: Archiving disabled thread 2 sequence 375
ARC0: Closing local archive destination LOG_ARCHIVE_DEST_1: '/s01/arch/2_375_747100216.dbf' (PROD1)
Archived Log entry 793 added for thread 2 sequence 375 ID 0x9790ab2 dest 1:
minact-scn: Master considers inst:2 dead

==================================================================
===========================smon trace begin=======================

*** 2011-06-27 22:19:28.279
2011-06-27 22:19:28.279849 : Start recovery for domain=0, valid=0, flags=0x0
Successfully allocated 2 recovery slaves
Using 67 overflow buffers per recovery slave
Thread 2 checkpoint: logseq 374, block 2, scn 54624376
  cache-low rba: logseq 374, block 2
    on-disk rba: logseq 374, block 178, scn 54626382
  start recovery at logseq 374, block 2, scn 54624376
Instance recovery not required for thread 1

*** 2011-06-27 22:19:28.487
Started writing zeroblks thread 2 seq 374 blocks 178-185

*** 2011-06-27 22:19:28.487
Completed writing zeroblks thread 2 seq 374
==== Redo read statistics for thread 2 ====
Total physical reads (from disk and memory): 4096Kb
-- Redo read_disk statistics --
Read rate (ASYNC): 88Kb in 0.18s => 0.48 Mb/sec
Longest record: 8Kb, moves: 0/186 (0%)
Longest LWN: 33Kb, moves: 0/47 (0%), moved: 0Mb
Last redo scn: 0x0000.0341884d (54626381)
----------------------------------------------
----- Recovery Hash Table Statistics ---------
Hash table buckets = 262144
Longest hash chain = 1
Average hash chain = 82/82 = 1.0
Max compares per lookup = 1
Avg compares per lookup = 248/330 = 0.8
----------------------------------------------

*** 2011-06-27 22:19:28.489
KCRA: start recovery claims for 82 data blocks

*** 2011-06-27 22:19:28.526
KCRA: blocks processed = 82/82, claimed = 81, eliminated = 1
2011-06-27 22:19:28.526088 : Validate domain 0
 **************** BEGIN RECOVERY HA STATS  ****************

 I'm the recovery instance

 smon posted (1278500359646), recovery started 0.027 secs,(1278500359673)
 domain validated 0.242 secs (1278500359888)
 claims opened 70, claims converted 11, claims preread 0

 ****************  END RECOVERY HA STATS  *****************
2011-06-27 22:19:28.526668 : Validated domain 0, flags = 0x0

*** 2011-06-27 22:19:28.556
Recovery of Online Redo Log: Thread 2 Group 4 Seq 374 Reading mem 0

*** 2011-06-27 22:19:28.560
Completed redo application of 0.07MB

*** 2011-06-27 22:19:28.569
Completed recovery checkpoint
----- Recovery Hash Table Statistics ---------
Hash table buckets = 262144
Longest hash chain = 1
Average hash chain = 82/82 = 1.0
Max compares per lookup = 1
Avg compares per lookup = 330/330 = 1.0
----------------------------------------------
*** 2011-06-27 22:19:28.572 5401 krsg.c
Acquiring RECOVERY INFO PING latch from [krsg.c:5401] IX0
*** 2011-06-27 22:19:28.572 5401 krsg.c
Successfully acquired RECOVERY INFO PING latch IX+
*** 2011-06-27 22:19:28.572 5406 krsg.c
Freeing RECOVERY INFO PING latch from [krsg.c:5406] IX0
*** 2011-06-27 22:19:28.572 5406 krsg.c
Successfully freed RECOVERY INFO PING latch IX-
krss_sched_work: Prod archiver request from process SMON (function:0x2000)
krss_find_arc: Evaluating ARC3 to receive message (flags 0x0)
krss_find_arc: Evaluating ARC0 to receive message (flags 0x0)
krss_find_arc: Evaluating ARC1 to receive message (flags 0xc)
krss_find_arc: Evaluating ARC2 to receive message (flags 0x2)
krss_find_arc: Selecting ARC2 to receive REC PING message
*** 2011-06-27 22:19:28.572 3093 krsv.c
krsv_send_msg: Sending message to process ARC2
*** 2011-06-27 22:19:28.572 1819 krss.c
krss_send_arc: Sent message to ARC2 (message:0x2000)
Recovery sets nab of thread 2 seq 374 to 178 with 8 zeroblks
Retrieving log 4
pre-aal: xlno:4 flno:0 arf:0 arb:2 arh:2 art:4
Updating log 3 thread 2 sequence 375
Previous log 3 thread 2 sequence 0
Updating log 4 thread 2 sequence 374
Previous log 4 thread 2 sequence 374
post-aal: xlno:4 flno:0 arf:3 arb:2 arh:2 art:3
krss_sched_work: Prod archiver request from process SMON (function:0x1)
krss_find_arc: Evaluating ARC3 to receive message (flags 0x0)
krss_find_arc: Selecting ARC3 to receive message
*** 2011-06-27 22:19:28.589 3093 krsv.c
krsv_send_msg: Sending message to process ARC3
*** 2011-06-27 22:19:28.589 1819 krss.c
krss_send_arc: Sent message to ARC3 (message:0x1)
Retrieving log 2
Kicking thread 1 to switch logfile
Retrieving log 4
Retrieving log 3
krss_sched_work: Prod archiver request from process SMON (function:0x1)
krss_find_arc: Evaluating ARC0 to receive message (flags 0x0)
krss_find_arc: Selecting ARC0 to receive message
*** 2011-06-27 22:19:28.599 3093 krsv.c
krsv_send_msg: Sending message to process ARC0
*** 2011-06-27 22:19:28.599 1819 krss.c
krss_send_arc: Sent message to ARC0 (message:0x1)
*** 2011-06-27 22:19:28.599 838 krsv.c
krsv_dpga: Waiting for pending I/O to complete

*** 2011-06-27 22:19:29.304
krss_sched_work: Prod archiver request from process SMON (function:0x1)
krss_find_arc: Evaluating ARC1 to receive message (flags 0xc)
krss_find_arc: Selecting ARC1 to receive message
*** 2011-06-27 22:19:29.304 3093 krsv.c
krsv_send_msg: Sending message to process ARC1
*** 2011-06-27 22:19:29.304 1819 krss.c
krss_send_arc: Sent message to ARC1 (message:0x1)
SMON[INST-TXN-RECO]:about to recover undo segment 11 status:3 inst:2
SMON[INST-TXN-RECO]: mark undo segment 11 as available status:2 ret:0
SMON[INST-TXN-RECO]:about to recover undo segment 12 status:3 inst:2
SMON[INST-TXN-RECO]: mark undo segment 12 as available status:2 ret:0
SMON[INST-TXN-RECO]:about to recover undo segment 13 status:3 inst:2
SMON[INST-TXN-RECO]: mark undo segment 13 as available status:2 ret:0
SMON[INST-TXN-RECO]:about to recover undo segment 14 status:3 inst:2
SMON[INST-TXN-RECO]: mark undo segment 14 as available status:2 ret:0
SMON[INST-TXN-RECO]:about to recover undo segment 15 status:3 inst:2
SMON[INST-TXN-RECO]: mark undo segment 15 as available status:2 ret:0
SMON[INST-TXN-RECO]:about to recover undo segment 16 status:3 inst:2
SMON[INST-TXN-RECO]: mark undo segment 16 as available status:2 ret:0
SMON[INST-TXN-RECO]:about to recover undo segment 17 status:3 inst:2
SMON[INST-TXN-RECO]: mark undo segment 17 as available status:2 ret:0
SMON[INST-TXN-RECO]:about to recover undo segment 18 status:3 inst:2
SMON[INST-TXN-RECO]: mark undo segment 18 as available status:2 ret:0
SMON[INST-TXN-RECO]:about to recover undo segment 19 status:3 inst:2
SMON[INST-TXN-RECO]: mark undo segment 19 as available status:2 ret:0
SMON[INST-TXN-RECO]:about to recover undo segment 20 status:3 inst:2
SMON[INST-TXN-RECO]: mark undo segment 20 as available status:2 ret:0

*** 2011-06-27 22:19:43.299
* kju_tsn_aff_drm_pending TRACEUD: called with tsn x2, dissolve 0
* kju_tsn_aff_drm_pending TRACEUD: tsn_pkey = x2.1
* >> RM REQ QS ---:
 single window RM request queue is empty
 multi-window RM request queue is empty
* Global DRM state ---:
   There is no dynamic remastering
   RM lock state = 0
 pkey 2.1 undo 1 stat 0 masters[32768, 1->1] reminc 4 RM# 1
 flg x0 type x0 afftime x36e6e3a8
 nreplays by lms 0 = 0
* kju_tsn_aff_drm_pending TRACEUD: matching request not found on swin queue
* kju_tsn_aff_drm_pending TRACEUD: pp found, stat x0
* kju_tsn_aff_drm_pending TRACEUD: 2 return true

*** 2011-06-27 22:22:18.333
* kju_tsn_aff_drm_pending TRACEUD: called with tsn x2, dissolve 0
* kju_tsn_aff_drm_pending TRACEUD: tsn_pkey = x2.1

* >> RM REQ QS ---:
 single window RM request queue is empty
 multi-window RM request queue is empty
* Global DRM state ---:
   There is no dynamic remastering
   RM lock state = 0
 pkey 2.1 undo 1 stat 0 masters[32768, 1->1] reminc 4 RM# 1
 flg x0 type x0 afftime x36e6e3a8
 nreplays by lms 0 = 0
* kju_tsn_aff_drm_pending TRACEUD: matching request not found on swin queue
* kju_tsn_aff_drm_pending TRACEUD: pp found, stat x0
* kju_tsn_aff_drm_pending TRACEUD: 2 return true

*** 2011-06-27 22:24:53.365
* kju_tsn_aff_drm_pending TRACEUD: called with tsn x2, dissolve 0
* kju_tsn_aff_drm_pending TRACEUD: tsn_pkey = x2.1

* >> RM REQ QS ---:
 single window RM request queue is empty
 multi-window RM request queue is empty
* Global DRM state ---:
   There is no dynamic remastering
   RM lock state = 0
 pkey 2.1 undo 1 stat 0 masters[32768, 1->1] reminc 4 RM# 1
 flg x0 type x0 afftime x36e6e3a8
 nreplays by lms 0 = 0
* kju_tsn_aff_drm_pending TRACEUD: matching request not found on swin queue
* kju_tsn_aff_drm_pending TRACEUD: pp found, stat x0
* kju_tsn_aff_drm_pending TRACEUD: 2 return true

========================================================================
==============================lmon trace begin==========================

 *** 2011-06-27 22:19:27.748
kjxgmpoll reconfig instance map: 1

*** 2011-06-27 22:19:27.748
kjxgmrcfg: Reconfiguration started, type 1
CGS/IMR TIMEOUTS:
  CSS recovery timeout = 31 sec (Total CSS waittime = 65)
  IMR Reconfig timeout = 75 sec
  CGS rcfg timeout = 85 sec
kjxgmcs: Setting state to 4 0.

*** 2011-06-27 22:19:27.759
     Name Service frozen
kjxgmcs: Setting state to 4 1.
kjxgrdecidever: No old version members in the cluster
kjxgrssvote: reconfig bitmap chksum 0x2137452d cnt 1 master 1 ret 0
kjxgrpropmsg: SSMEMI: inst 1 - no disk vote
kjxgrpropmsg: SSVOTE: Master indicates no Disk Voting
2011-06-27 22:19:27.760783 : kjxgrDiskVote: nonblocking method is chosen
kjxggpoll: change poll time to 50 ms
2011-06-27 22:19:27.918847 : kjxgrDiskVote: Obtained RR update lock for sequence 5, RR seq 4
2011-06-27 22:19:28.023160 : kjxgrDiskVote: derive membership from CSS (no disk votes)
2011-06-27 22:19:28.023240 : proposed membership: 1

*** 2011-06-27 22:19:28.081
2011-06-27 22:19:28.081952 : kjxgrDiskVote: new membership is updated by inst 1, seq 6
2011-06-27 22:19:28.082073 : kjxgrDiskVote: bitmap: 1
CGS/IMR TIMEOUTS:
  CSS recovery timeout = 31 sec (Total CSS waittime = 65)
  IMR Reconfig timeout = 75 sec
  CGS rcfg timeout = 85 sec
kjxgmmeminfo: can not invalidate inst 2
kjxgmps: proposing substate 2
kjxgmcs: Setting state to 6 2.
kjfmSendAbortInstMsg: send an abort message to instance 2
 kjfmuin: inst bitmap 1
 kjfmmhi: received msg from inst 1 (inc 2)
     Performed the unique instance identification check
kjxgmps: proposing substate 3
kjxgmcs: Setting state to 6 3.
     Name Service recovery started
     Deleted all dead-instance name entries
kjxgmps: proposing substate 4
kjxgmcs: Setting state to 6 4.
     Multicasted all local name entries for publish
     Replayed all pending requests
kjxgmps: proposing substate 5
kjxgmcs: Setting state to 6 5.
     Name Service normal
     Name Service recovery done

*** 2011-06-27 22:19:28.191
kjxgmps: proposing substate 6
kjxgmcs: Setting state to 6 6.
kjxgmcs: total reconfig time 0.432 seconds (from 2895072218 to 2895072650)
kjxggpoll: change poll time to 600 ms
kjfmact: call ksimdic on instance (2)
2011-06-27 22:19:28.211846 :

********* kjfcrfg() called, BEGIN LMON RCFG *********
2011-06-27 22:19:28.211906 : * Begin lmon rcfg step KJGA_RCFG_BEGIN
* kjfcrfg: Resource broadcasting disabled
* kjfcrfg: kjfcqiora returned success
kjfcrfg: DRM window size = 4096->4096 (min lognb = 15)
2011-06-27 22:19:28.211954 :
Reconfiguration started (old inc 4, new inc 6)
TIMEOUTS:
  Local health check timeout: 70 sec
  Rcfg process freeze timeout: 70 sec
  Remote health check timeout: 140 sec
  Defer Queue timeout: 163 secs
  CGS rcfg timeout: 85 sec
  Synchronization timeout: 248 sec
  DLM rcfg timeout: 744 sec
List of instances:
 1 (myinst: 1)
Undo tsn affinity 1
OMF 0
2011-06-27 22:19:28.212394 : * Begin lmon rcfg step KJGA_RCFG_FREEZE

*** 2011-06-27 22:19:28.233
* published: inc 6, isnested 0, rora req 0,
  rora start 0, rora invalid 0, (roram 32767), isrcvinst 1,
  (rcvinst 1), isdbopen 1, drh 0, (myinst 1)
  thread 1, isdbmounted 1, sid hash x0
* kjfcrfg: published bigns successfully
* Force-published at step 3
2011-06-27 22:19:28.233575 :  Global Resource Directory frozen
* roram 32767, rcvinst 1
* kjfc_thread_qry: instance 1 flag 3 thread 1 sid 0
* kjfcrfg: queried bigns successfully
inst 1
* kjfcrfg: single_instance_kjga = TRUE

 asby init, 0/1/x2

 asby returns, 0/1/x2/false

* Domain maps before reconfiguration:
*   DOMAIN 0 (valid 1): 1 2
* End of domain mappings

 * dead instance detected - domain 0 invalid = TRUE

* Domain maps after recomputation:
*   DOMAIN 0 (valid 0): 1
* End of domain mappings
2011-06-27 22:19:28.235110 : * Begin lmon rcfg step KJGA_RCFG_COMM
2011-06-27 22:19:28.235242 : GSIPC:KSXPCB: msg 0xd8b84550 status 34, type 2, dest 2, rcvr 0
2011-06-27 22:19:28.235339 : GSIPC:KSXPCB: msg 0xd8b80180 status 34, type 2, dest 2, rcvr 1
 Active Sendback Threshold = 50 %
 Communication channels reestablished
2011-06-27 22:19:28.240076 : * Begin lmon rcfg step KJGA_RCFG_EXCHANGE
2011-06-27 22:19:28.240192 : * Begin lmon rcfg step KJGA_RCFG_ENQCLEANUP
 Master broadcasted resource hash value bitmaps
2011-06-27 22:19:28.251474 :
 Non-local Process blocks cleaned out
2011-06-27 22:19:28.251822 : * Begin lmon rcfg step KJGA_RCFG_CLEANUP
2011-06-27 22:19:28.265220 : * Begin lmon rcfg step KJGA_RCFG_TIMERQ
2011-06-27 22:19:28.265308 : * Begin lmon rcfg step KJGA_RCFG_DDQ
2011-06-27 22:19:28.265393 : * Begin lmon rcfg step KJGA_RCFG_SETMASTER
2011-06-27 22:19:28.271551 :
 Set master node info
2011-06-27 22:19:28.271931 : * Begin lmon rcfg step KJGA_RCFG_ENQREPLAY
2011-06-27 22:19:28.275490 :  Submitted all remote-enqueue requests
2011-06-27 22:19:28.275596 : * Begin lmon rcfg step KJGA_RCFG_ENQDUBIOUS
 Dwn-cvts replayed, VALBLKs dubious
2011-06-27 22:19:28.277223 : * Begin lmon rcfg step KJGA_RCFG_ENQGRANT
 All grantable enqueues granted
2011-06-27 22:19:28.277992 : * Begin lmon rcfg step KJGA_RCFG_PCMREPLAY
2011-06-27 22:19:28.279234 :
2011-06-27 22:19:28.279255 :  Post SMON to start 1st pass IR               --SMON posted by LMON
2011-06-27 22:19:28.307890 :  Submitted all GCS cache requests             --IR acquires all gcs resource needed for recovery
2011-06-27 22:19:28.308038 : * Begin lmon rcfg step KJGA_RCFG_FIXWRITES
 Post SMON to start 1st pass IR
 Fix write in gcs resources
2011-06-27 22:19:28.313508 : * Begin lmon rcfg step KJGA_RCFG_END
2011-06-27 22:19:28.313720 :
2011-06-27 22:19:28.313733 :
Reconfiguration complete

*   domain 0 valid?: 0
* kjfcrfg: ask RMS0 to do pnp work
 **************** BEGIN DLM RCFG HA STATS  ****************

 Total dlm rcfg time (inc 6): 0.100 secs (1278500359581, 1278500359681)
   Begin step .........: 0.001 secs (1278500359581, 1278500359582)
   Freeze step ........: 0.020 secs (1278500359582, 1278500359602)
   Remap step .........: 0.002 secs (1278500359602, 1278500359604)
   Comm step ..........: 0.005 secs (1278500359604, 1278500359609)
   Sync 1 step ........: 0.000 secs (0, 0)
   Exchange step ......: 0.000 secs (1278500359609, 1278500359609)
   Sync 2 step ........: 0.000 secs (0, 0)
   Enqueue cleanup step: 0.011 secs (1278500359609, 1278500359620)
   Sync pcm1 step .....: 0.000 secs (0, 0)
   Cleanup step .......: 0.013 secs (1278500359620, 1278500359633)
   Timerq step ........: 0.000 secs (1278500359633, 1278500359633)
   Ddq step ...........: 0.000 secs (1278500359633, 1278500359633)
   Set master step ....: 0.006 secs (1278500359633, 1278500359639)
   Sync 3 step ........: 0.000 secs (0, 0)
   Enqueue replay step : 0.004 secs (1278500359639, 1278500359643)
   Sync 4 step ........: 0.000 secs (0, 0)
   Enqueue dubious step: 0.001 secs (1278500359643, 1278500359644)
   Sync 5 step ........: 0.000 secs (0, 0)
   Enqueue grant step .: 0.001 secs (1278500359644, 1278500359645)
   Sync 6 step ........: 0.000 secs (0, 0)
   PCM replay step ....: 0.030 secs (1278500359645, 1278500359675)
   Sync 7 step ........: 0.000 secs (0, 0)
   Fixwrt replay step .: 0.003 secs (1278500359675, 1278500359678)
   Sync 8 step ........: 0.000 secs (0, 0)
   End step ...........: 0.001 secs (1278500359680, 1278500359681)
 Number of replayed enqueues sent / received .......: 0 / 0
 Number of replayed fusion locks sent / received ...: 0 / 0
 Number of enqueues mastered before / after rcfg ...: 2217 / 2941
 Number of fusion locks mastered before / after rcfg: 3120 / 5747

 ****************  END DLM RCFG HA STATS  *****************

*** 2011-06-27 22:19:36.589
kjxgfipccb: msg 0x0x7ff526139320, mbo 0x0x7ff526139310, type 19, ack 0, ref 0, stat 34

=====================================================================
============================lms trace begin==========================


*** 2011-06-27 22:38:54.663
2011-06-27 22:38:54.663764 :  0 GCS shadows cancelled, 0 closed, 0 Xw survived
2011-06-27 22:38:54.673539 :  5230 GCS resources traversed, 0 cancelled
2011-06-27 22:38:54.707671 :  9322 GCS shadows traversed, 0 replayed, 0 duplicates,
5183 not replayed, dissolve 0 timeout 0 RCFG(10) lms 0 finished replaying gcs resources
2011-06-27 22:38:54.709132 :  0 write requests issued in 384 GCS resources        --check past image
 0 PIs marked suspect, 0 flush PI msgs
2011-06-27 22:38:54.709520 :  0 write requests issued in 273 GCS resources
 1 PIs marked suspect, 0 flush PI msgs
2011-06-27 22:38:54.709842 :  0 write requests issued in 281 GCS resources
 0 PIs marked suspect, 0 flush PI msgs
2011-06-27 22:38:54.710159 :  0 write requests issued in 233 GCS resources
 0 PIs marked suspect, 0 flush PI msgs
2011-06-27 22:38:54.710531 :  0 write requests issued in 350 GCS resources
 lms 0 finished fixing gcs write protocol

Instance Recovery和普通的Crash Recovery最大的区别在于实例恢复过程中的GRD Frozen和对GES/GCS资源的Remaster,这部分工作主要由LMON进程完成,可以从以上trace中发现一些KJGA_RCFG_*形式的Reconfiguration步骤,它们的含义:

Reconfiguration Steps:

1.    KJGA_RCFG_BEGIN
LMON continuously polling for reconfiguration event. Once cgs reports a change in cluster membership,
LMON starts reconfiguration

2.    KJGA_RCFG_FREEZE
All processes acknowledges to the reconfiguration freeze before LMON continue 

3.    KJGA_RCFG_REMAP
Updates new instance map (kjfchsu), re-distributes resource mastership. Invalidate recovery domains
if reconfiguration is caused by instance death.

4.    KJGA_RCFG_COMM
Reinitialize communication channel

5.    KJGA_RCFG_EXCHANGE
Exchange of master information of gcs, ges and file affinity master

6.    KJGA_RCFG_ENQCLEANUP
Delete remote dead gcs/ges locks. Cancel converting gcs requests. 

7.    KJGA_RCFG_CLEANUP
Cleanup/remove ges resources

8.    KJGA_RCFG_TIMERQ
Restore relative timeout for enqueue locks on timeout queue

9.    KJGA_RCFG_DDQ
Clean out enqueue locks on deadlock queue

10.  KJGA_RCFG_SETMASTER
Update master info for each enqueue resources that need to be remastered.

11.  KJGA_RCFG_REPLAY
Replay enqueue locks

12.  KJGA_RCFG_ENQDUBIOUS
Invalidates ges resources without established value

13.  KJGA_RCFG_ENQGRANT
Grants all grantable ges lock requests

14.  KJGA_RCFG_REPLAY2
Enqueue reconfiguration complete. Post SMON to start instance recovery.  Starts replaying gcs resources.

15.  KJGA_RCFG_FIXWRITES2
Fix write state of gcs resources

16.  KJGA_RCFG_END
Unfreeze lock database

Instance Recovery相关的诊断事件
我们无法禁止Instance Recovery的发生,事实上一旦出现Instance Crash那么Instance Recovery就是必须的。
与Instance Recovery相关的诊断事件主要有10426和29717等:

    10426 – Reconfiguration trace event
    10425 – Enqueue operations
    10432 – Fusion activity
    10429 – IPC tracing

oerr ora 10425
10425, 00000, "enable global enqueue operations event trace"
// *Document: NO
// *Cause:
// *Action: Dump trace for global enqueue operations.

oerr ora 10426
10426, 00000, "enable ges/gcs reconfiguration event trace"
// *Document: NO
// *Cause:
// *Action: Dump trace for ges/gcs reconfiguration.

oerr ora 10430
10430, 00000, "enable ges/gcs dynamic remastering event trace"
// *Document: NO
// *Cause:
// *Action: Dump trace for ges/gcs dynamic remastering.

oerr ora 10401
10401, 00000, "turn on IPC (ksxp) debugging"
// *Cause:
// *Action: Enables debugging code for IPC service layer (ksxp)

oerr ora 10708
10708, 00000, "print out trace information from the RAC buffer cache"
// *Cause: N/A
// *Action: THIS IS NOT A USER ERROR NUMBER/MESSAGE.  THIS DOES NOT NEED TO BE
//          TRANSLATED OR DOCUMENTED. IT IS USED ONLY FOR DEBUGGING.

oerr ora 29717
29717, 00000, "enable global resource directory freeze/unfreeze event trace"
// *Document: NO
// *Cause:
// *Action: Dump trace for global resource directory freeze/unfreeze.

diag RAC INSTANCE SHUTDOWN LMON
LMON will dump more informations to trace during reconfig and freeze.

event="10426 trace name context forever, level 8"
event="29717 trace name context forever, level 5"
or
event="10426 trace name context forever, level 12"
event="10430 trace name context forever, level 12"
event="10401 trace name context forever, level 8"
event="10046 trace name context forever, level 8"
event="10708 trace name context forever, level 15"
event="29717 trace name context forever, level 5"

see 29717  grd frozen trace

alter system set event='29717 trace name context forever, level 5' scope=spfile;

=========================================================================
============================lmon trace begin=============================

********* kjfcrfg() called, BEGIN LMON RCFG *********
2011-06-27 23:13:16.693089 : * Begin lmon rcfg step KJGA_RCFG_BEGIN
* kjfcrfg: Resource broadcasting disabled
* kjfcrfg: kjfcqiora returned success
kjfcrfg: DRM window size = 4096->4096 (min lognb = 15)
2011-06-27 23:13:16.693219 :
Reconfiguration started (old inc 4, new inc 6)
TIMEOUTS:
  Local health check timeout: 70 sec
  Rcfg process freeze timeout: 70 sec
  Remote health check timeout: 140 sec
  Defer Queue timeout: 163 secs
  CGS rcfg timeout: 85 sec
  Synchronization timeout: 248 sec
  DLM rcfg timeout: 744 sec
List of instances:
 1 (myinst: 1)
Undo tsn affinity 1
OMF 0
[FDB][start]
2011-06-27 23:13:16.701320 : * Begin lmon rcfg step KJGA_RCFG_FREEZE
[FACK][18711 not frozen]          --fack means acknowledge in advance
[FACK][18713 not frozen]
[FACK][18719 not frozen]
[FACK][18721 not frozen]
[FACK][18723 not frozen]
[FACK][18729 not frozen]
[FACK][18739 not frozen]
[FACK][18743 not frozen]
[FACK][18745 not frozen]
[FACK][18747 not frozen]
[FACK][18749 not frozen]
[FACK][18751 not frozen]
[FACK][18753 not frozen]
[FACK][18755 not frozen]
[FACK][18757 not frozen]
[FACK][18759 not frozen]
[FACK][18763 not frozen]
[FACK][18765 not frozen]
[FACK][18767 not frozen]
[FACK][18769 not frozen]
[FACK][18771 not frozen]
[FACK][18775 not frozen]
[FACK][18777 not frozen]
[FACK][18816 not frozen]
[FACK][18812 not frozen]
[FACK][18818 not frozen]
[FACK][18820 not frozen]
[FACK][18824 not frozen]
[FACK][18826 not frozen]
[FACK][18830 not frozen]
[FACK][18835 not frozen]
[FACK][18842 not frozen]
[FACK][18860 not frozen]
[FACK][18865 not frozen]
[FACK][18881 not frozen]
[FACK][18883 not frozen]
[FACK][18909 not frozen]

*** 2011-06-27 23:13:16.724
* published: inc 6, isnested 0, rora req 0,
  rora start 0, rora invalid 0, (roram 32767), isrcvinst 0,
  (rcvinst 32767), isdbopen 1, drh 0, (myinst 1)
  thread 1, isdbmounted 1, sid hash x0
* kjfcrfg: published bigns successfully
* Force-published at step 3
2011-06-27 23:13:16.724764 :  Global Resource Directory frozen
* kjfc_qry_bigns: noone has the rcvinst established yet, set it to the highest open instance = 1
* roram 32767, rcvinst 1
* kjfc_thread_qry: instance 1 flag 3 thread 1 sid 0
* kjfcrfg: queried bigns successfully

=====================================================================
==========================lmd0 trace begin===========================

*** 2011-06-27 23:13:16.700
[FFCLI][frozen]
[FUFCLI][normal]

了解你所不知道的SMON功能(一):清理临时段

SMON(system monitor process)系统监控后台进程,有时候也被叫做system cleanup process,这么叫的原因是它负责完成很多清理(cleanup)任务。但凡学习过Oracle基础知识的技术人员都会或多或少对该background process的功能有所了解。

曾几何时对SMON功能的了解程度可以作为评判一位DBA理论知识的重要因素,至今仍有很多公司在DBA面试中会问到SMON有哪些功能这样的问题。首先这是一道开放式的题目,并不会奢求面试者能够打全(答全几乎是不可能的,即便是在你阅读本篇文章之后),答出多少可以作为知识广度的评判依据(如果面试人特意为这题准备过,那么也很好,说明他已经能系统地考虑问题了),接着还可以就具体的某一个功能说开去,来了解面试者的知识深度,当然这扯远了。

我们所熟知的SMON是个兢兢业业的家伙,它负责完成一些列系统级别的任务。与PMON(Process Monitor)后台进程不同的是,SMON负责完成更多和整体系统相关的工作,这导致它会去做一些不知名的”累活”,当系统频繁产生这些”垃圾任务”,则SMON可能忙不过来。因此在10g中SMON变得有一点懒惰了,如果它在短期内接收到过多的工作通知(SMON: system monitor process posted),那么它可能选择消极怠工以便让自己不要过于繁忙(SMON: Posted too frequently, trans recovery disabled),之后会详细介绍。

SMON的主要作用包括:

1.清理临时段(SMON cleanup temporary segments)

触发场景

很多人错误地理解了这里所说的临时段temporary segments,认为temporary segments是指temporary tablespace临时表空间上的排序临时段(sort segment)。事实上这里的临时段主要指的是永久表空间(permanent tablespace)上的临时段,当然临时表空间上的temporary segments也是由SMON来清理(cleanup)的,但这种清理仅发生在数据库实例启动时(instance startup)。

永久表空间上同样存在临时段,譬如当我们在某个永久表空间上使用create table/index等DDL命令创建某个表/索引时,服务进程一开始会在指定的永久表空间上分配足够多的区间(Extents),这些区间在命令结束之前都是临时的(Temporary Extents),直到表/索引完全建成才将该temporary segment转换为permanent segment。另外当使用drop命令删除某个段时,也会先将该段率先转换为temporary segment,之后再来清理该temporary segment(DROP object converts the segment to temporary and then cleans up the temporary segment)。 常规情况下清理工作遵循谁创建temporary segment,谁负责清理的原则。换句话说,因服务进程rebuild index所产生的temporary segment在rebuild完成后应由服务进程自行负责清理。一旦服务进程在成功清理temporary segment之前就意外终止了,亦或者服务进程在工作过程中遇到了某些ORA-错误导致语句失败,那么SMON都会被要求(posted)负责完成temporary segment的清理工作。

对于永久表空间上的temporary segment,SMON会三分钟清理一次(前提是接到post),如果SMON过于繁忙那么可能temporary segment长期不被清理。temporary segment长期不被清理可能造成一个典型的问题是:在rebuild index online失败后,后续执行的rebuild index命令要求之前产生的temporary segment已被cleanup,如果cleanup没有完成那么就需要一直等下去。在10gR2中我们可以使用dbms_repair.online_index_clean来手动清理online index rebuild的遗留问题:

The dbms_repair.online_index_clean function has been created to cleanup online index rebuilds.
Use the dbms_repair.online_index_clean function to resolve the issue.
Please note if you are unable to run the dbms_repair.online_index_clean function it is due to the fact
that you have not installed the patch for Bug 3805539 or are not running on a release that includes this fix.
The fix for this bug is a new function in the dbms_repair package called dbms_repair.online_index_clean,
which has been created to cleanup online index [[sub]partition] [re]builds.

New functionality is not allowed in patchsets;
therefore, this is not available in a patchset but is available in 10gR2.

Check your patch list to verify the database is patched for Bug 3805539
using the following command and patch for the bug if it is not listed:

opatch lsinventory -detail

Cleanup after a failed online index [re]build can be slow to occurpreventing subsequent such operations
until the cleanup has occured.

接着我们通过实践来看一下smon是如何清理永久表空间上的temporary segment的:

设置10500事件以跟踪smon进程,这个诊断事件后面会介绍

SQL> alter system set events '10500 trace name context forever,level 10';
System altered.

在第一个会话中执行create table命令,这将产生一定量的Temorary Extents

SQL> create table smon as select * from ymon;

在另一个会话中执行对DBA_EXTENTS视图的查询,可以发现产生了多少临时区间

SQL> SELECT COUNT(*) FROM DBA_EXTENTS WHERE SEGMENT_TYPE='TEMPORARY';

COUNT(*)
----------
117

终止以上create table的session,等待一段时间后观察smon后台进程的trc可以发现以下信息:

*** 2011-06-07 21:18:39.817
SMON: system monitor process posted msgflag:0x0200 (-/-/-/-/TMPSDROP/-/-)

*** 2011-06-07 21:18:39.818
SMON: Posted, but not for trans recovery, so skip it.

*** 2011-06-07 21:18:39.818
SMON: clean up temp segments in slave

SQL> SELECT COUNT(*) FROM DBA_EXTENTS WHERE SEGMENT_TYPE='TEMPORARY';

COUNT(*)
----------
0

可以看到smon通过slave进程完成了对temporary segment的清理

与永久表空间上的临时段不同,出于性能的考虑临时表空间上的Extents并不在操作(operations)完成后立即被释放和归还。相反,这些Temporary Extents会被标记为可用,以便用于下一次的排序操作。SMON仍会清理这些Temporary segments,但这种清理仅发生在实例启动时(instance startup):

For performance issues, extents in TEMPORARY tablespaces are not released ordeallocated
once the operation is complete.Instead, the extent is simply marked as available for the next sort operation.
SMON cleans up the segments at startup.

A sort segment is created by the first statement that used a TEMPORARY tablespacefor sorting, after startup.
A sort segment created in a TEMPOARY tablespace is only released at shutdown.
The large number of EXTENTS is caused when the STORAGE clause has been incorrectly calculated.

现象

可以通过以下查询了解数据库中Temporary Extent的总数,在一定时间内比较其总数,若有所减少那么说明SMON正在清理Temporary segment

SELECT COUNT(*) FROM DBA_EXTENTS WHERE SEGMENT_TYPE='TEMPORARY';

也可以通过v$sysstat视图中的”SMON posted for dropping temp segment”事件统计信息来了解SMON收到清理要求的情况:

SQL> select name,value from v$sysstat where name like '%SMON%';
 
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
total number of times SMON posted                                         8
SMON posted for undo segment recovery                                     0
SMON posted for txn recovery for other instances                          0
SMON posted for instance recovery                                         0
SMON posted for undo segment shrink                                       0
SMON posted for dropping temp segment                                     1

另外在清理过程中SMON会长期持有Space Transacton(ST)队列锁,其他会话可能因为得不到ST锁而等待超时出现ORA-01575错误:

01575, 00000, "timeout waiting for space management resource"
// *Cause: failed to acquire necessary resource to do space management.
// *Action: Retry the operation.

如何禁止SMON清理临时段

可以通过设置诊断事件event=’10061 trace name context forever, level 10’禁用SMON清理临时段(disable SMON from cleaning temp segments)。

alter system set events '10061 trace name context forever, level 10';

相关诊断事件

除去10061事件外还可以用10500事件来跟踪smon的post信息,具体的事件设置方法见<EVENT: 10500 “turn on traces for SMON>

Oracle Event 10357 and 10351

[oracle@rh2 ~]$ oerr ora 10357
10357, 00000, "turn on debug information for direct path"
// *Cause:
// *Action:  turn on debug information for direct path

The cause of this issue was identified as unpublished Bug 9650718

In the bug, it was found that during cleanup from a direct path load, qesmm context was freed
without the pointer to the context being set to zero. Because the pointer was non zero,
there was code that tried to use the context.

To help in diagnosing the issue, the following was used:

This command will enable tracing for direct path operations in the server:
ALTER SYSTEM SET EVENTS 'TRACE [DIRPATH_LOAD] DISK=high';
This command will enable tracing for the I/O component of direct path load:
alter system set events '10357 trace name context forever, level 12';




klmalp: heap=0x7f9bd14efea8 size=72 comment=klcliti:klclih+kltb+klpt mem=0x7f9bd14eab98
klmalp: heap=0x7f9bd14efea8 size=224 comment=klcliti:klclih+kltb+klpt mem=0x7f9bd14eabe0
klmalp: heap=0x7f9bd14efea8 size=336 comment=klcliti:klclih+kltb+klpt mem=0x7f9bd14eacc0
klmalp: heap=0x7f9bd14efea8 size=80 comment=klcliti:klclih+kltb+klpt mem=0x7f9bd14eae10
klmalp: heap=0x7f9bd14efea8 size=8168 comment=kdobjrbb mem=0x7f9bd14edbc0
klmalp: heap=0x7f9bd14efea8 size=1150 comment=ktbbhs:kdbh mem=0x7f9bd14eb230
klmalp: heap=0x7f9bd14efea8 size=432 comment=kcbl_structure_instance mem=0x7f9bd14eb6b0
klmalp: heap=0x7f9bd14efea8 size=64 comment=kllcqc:kllcq mem=0x7f9bd14eb860
klmalp: heap=0x7f9bd14efea8 size=1312 comment=kllcqc:kllcqslt mem=0x7f9bd14ecb40
kcblin: lbs=0x7f9bd14eb6b0 flg=104 slt=(nil) cnt=2 sz=262144 st_obj=0x8b0cc8c0 fb=0
kcblin: state objects are: Call=8b0cc8c0, Current Call=8b0cc8c0, Session=8b082f18, Proc=8bae46c8
kdblil2<-klclil1r<-qerltFRop<-qercoRop<-kdstf0000101km<-kdsttgr<-qertbFetch<-qercoFetch<-rwsfcd<-qerltFetch
<-insdlexe<-insExecStmtExecIniEngine<-insexe<-opiexe<-kpoal8<
-opiodr<-ttcpip<-opitsk<-opiino<-opiodr
klmalp: heap=0x7f9bd14efea8 size=48 comment=kdblix:klixllkey mem=0x7f9bd14ed060
klmalp: heap=0x7f9bd14efea8 size=528 comment=kcblinlm mem=0x7f9bd14ed090
kcblnb: lbs=d14eb6b0, slt=0, d=10dd2a9, nwr=0, cnt=0
klmalp: heap=0x7f9bd14efea8 size=262656 comment=kllcqgf:kllsltba mem=0x7f9bd135de00
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2aa, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ab, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ac, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ad, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ae, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2af, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b0, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b1, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b2, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b3, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b4, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b5, nwr=0, cnt=0
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2b6, nwr=0, cnt=0
kcblcow: lbs=d14eb6b0, nslots=2, nwr=1
kcblco:lbs=d14eb6b0 slt=d14ecb40 typ=1 afn=4 blk=905897 d=10dd2a9 cnt=17 buf=d135de00 rq=d14ecc18fob=898375c8
kcblco ret: lbs=d14eb6b0 slt=d14ecb40 type=1 afn=4 blk=905897 cnt=17 buf=d135de00 rq=d14ecc18 fob=898375c8, wait=0, more=-783370504, wrc=0
kcblcow: slt=d14ecb40 dba=10dd2a9, sz=32, blks=23, st=1, idx=0
kcblio: lbs=d14eb6b0 slt=d14ecdd0 nslt=2, d=10dd2c0, st=2, fbon=0, flg=104
kcblio: Logging lbs=d14eb6b0 slt=d14ecdd0 nslt=2, d=10dd2c0, st=2
kcblgr: lbs=d14eb6b0 slt=d14ecdd0, rd=10dd2c0, flg=104, blks=32
kcbldio:lbs=d14eb6b0 flg=104 slt=d14ecdd0 typ=1 async=16 afn=4 blk=905920 cnt=20 buf=d13ade00 rq=d14ecea8 fob=0 blks=32 st=2
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e1, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e2, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e3, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e4, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e5, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e6, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e7, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e8, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2e9, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ea, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2eb, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ec, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ed, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ee, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ef, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f0, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f1, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f2, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f3, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f4, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f5, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f6, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f7, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f8, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2f9, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2fa, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2fb, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2fc, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2fd, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2fe, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd2ff, nwr=1, cnt=23
kcblnb: lbs=d14eb6b0, slt=d14ecb40, d=10dd300, nwr=1, cnt=23

[oracle@rh2 ~]$ oerr ora 10351
10351, 00000, "size of slots"
// *Cause:
// *Action:  sets the size of slots to use
// *Comment: a slot is a unit of I/O and this factor controls the size
// *Comment: of the IO.

alter session set events '10351 trace name context forever, level 128';

level 128 -> direct path write max block 128

I generated a new run of the big testcase with event 10357, Patch 4417285
applied, manual workarea_size_policy, sort_area_size=50000000,
db_file_multiblock_read_count=16 and event 10351 with level 128.

I tried it with disk_asynch_io=TRUE and FALSE just to be certain this is not
something related to the async.

In the trace files I see something very peculiar. The slots size is 128 as
expected and I see many writes of 128 blocks but not all of them are and they
look like the they come in clusters. A few 128 writes, then a lot smaller of
different sizes but mainly less than 16 blocks and then another cluster of
big ones and so on.

kcblcow: dba=100c91b, sz=128, blks=117, st=3, idx=14
kcblcow: dba=100c91b, sz=128, blks=117, st=3, idx=14
kcblcow: dba=100c991, sz=128, blks=1, st=3, idx=15
kcblcow: dba=100c91b, sz=128, blks=117, st=3, idx=14
kcblcow: dba=100c991, sz=128, blks=1, st=3, idx=15
kcblcow: dba=100c990, sz=128, blks=1, st=3, idx=0
kcblcow: dba=100c992, sz=128, blks=128, st=3, idx=1
kcblcow: dba=100c992, sz=128, blks=128, st=3, idx=1
kcblcow: dba=100ca12, sz=128, blks=39, st=3, idx=2
kcblcow: dba=100c992, sz=128, blks=128, st=3, idx=1
kcblcow: dba=100ca12, sz=128, blks=39, st=3, idx=2
kcblcow: dba=100ca3a, sz=128, blks=1, st=3, idx=3
kcblcow: dba=100ca12, sz=128, blks=39, st=3, idx=2
kcblcow: dba=100ca3a, sz=128, blks=1, st=3, idx=3
kcblcow: dba=100ca39, sz=128, blks=1, st=3, idx=4

but
it is possible that there are other factor out of our control that forces
Oracle to stop adding blocks to the slot and write small batches.

In conclusion, in order to have the least ammount of direct operations and
have the maximum possible read/write batches these are the parameters to set
:

alter session set events '10351 trace name context forever, level 128';
alter session set workarea_size_policy=manual;
alter session set sort_area_size=50000000;

Trace obtained enqueue information by set event 10704

Oracle事件10704可以帮助我们了解队列Enqueue的使用情况,使用方法如下:

[oracle@rh2 bdump]$ oerr ora 10704
10704, 00000, "Print out information about what enqueues are being obtained"
// *Cause:  When enabled, prints out arguments to calls to ksqcmi and
//          ksqlrl and the return values.
// *Action: Level indicates details:
//   Level: 1-4: print out basic info for ksqlrl, ksqcmi
//          5-9: also print out stuff in callbacks:  ksqlac, ksqlop
//          10+: also print out time for each line

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug event 10704 trace name context forever,level 10;
Statement processed.

SQL> lock  table tm in share mode;
Table(s) Locked.

SQL> oradebug tracefile_name;
/s01/admin/G10R2/udump/g10r2_ora_28400.trc

ksqgtl *** CU-9fec6e30-00000000 mode=6 flags=0x10 timeout=300 ***
ksqgtl: no transaction
ksqgtl: use existing ksusetxn DID
ksqgtl:
        ksqlkdid: 0001-0017-00000008
*** 2011-05-07 21:17:16.139
*** ksudidTrace: ksqgtl
        ksusesdi:   0000-0000-00000000
        ksusetxn:   0001-0017-00000008
ksqgtl: RETURNS 0
*** 2011-05-07 21:17:16.160
ksqrcl: CU,9fec6e30,0
ksqrcl: returns 0
*** 2011-05-07 21:17:23.884
ksqgtl *** CU-9fec69f8-00000000 mode=6 flags=0x10 timeout=300 ***
ksqgtl: no transaction
ksqgtl: use existing ksusetxn DID
ksqgtl:
        ksqlkdid: 0001-0017-00000008
*** 2011-05-07 21:17:23.884
*** ksudidTrace: ksqgtl
        ksusesdi:   0000-0000-00000000
        ksusetxn:   0001-0017-00000008
ksqgtl: RETURNS 0
*** 2011-05-07 21:17:23.898
ksqrcl: CU,9fec69f8,0
ksqrcl: returns 0
*** 2011-05-07 21:17:23.899
ksqgtl *** TM-0000d06b-00000000 mode=4 flags=0x401 timeout=21474836 ***
ksqgtl: xcb=0xa69d0a00, ktcdix=2147483647, topxcb=0xa69d0a00
        ktcipt(topxcb)=0x0
*** 2011-05-07 21:17:23.899
ksucti: init txn DID from session DID 0001-0017-00000008
ksqgtl:
        ksqlkdid: 0001-0017-00000008
*** 2011-05-07 21:17:23.899

*** ksudidTrace: ksqgtl
        ktcmydid(): 0001-0017-00000008
        ksusesdi:   0000-0000-00000000
        ksusetxn:   0001-0017-00000008
ksqgtl: RETURNS 0

该10704事件可以配合10046事件一起诊断异常的队列锁问题,记以录之!

EVENT 10235:"check memory manager internal structures"

Event:10235                     
~~~~~~~~~~~ 

Version/Use: 

  7.0 - 10.1.X   Check memory manager internal structures. 

  7.0 - 10.1.X "Check memory manager internal structures" 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 

  NOTE: Events should NEVER be set by customers unless advised to do so by


      Oracle Support Services. Read [NOTE:75713.1] before setting any event. 

Summary Syntax: 

~~~~~~~~~~~~~~~ 

  EVENT="10235 trace name context forever,  level LL" 

  (Always comment exactly when and why this event is being set) 

   ** IMPORTANT: Do **NOT** use ALTER SESSION SET EVENTS or ORADEBUG 

                syntax to set this event in sessions. This can cause 

                lots of ORA-600 errors against SGA heaps as not all  

                sessions using the SGA heaps will be using the same event 

                level. This applies to ALL levels except level 65536. 

Levels: 

~~~~~~~ 

  The event being set at all causes certain heap checks to be performed. 

   ***  WARNING *********************************************************** 

    ***   This event should only EVER be set at the request of Oracle Support. 

    ***   It can impact performance on most types of system. 

    ***   Level 2 and above can impact latch contention. 

    ***   Level 3 and above can have a *SEVERE* impact on performance. 

  ************************************************************************ 

  The bottom 3 bits of the level cause the following checks to occur: 

     ~~~~~         ~~~~~~~~~~~~ 

     Level         Description 

     ~~~~~         ~~~~~~~~~~~~ 

        1             Fast check on heap free (kghfrh) 

        2             Do 1 AND fill memory with junk on alloc / free 

        3             Do 2 AND ensure the chunk belongs to given heap on free 

        4             Do 3 AND make permanent chunks freeable so they can  

                      also be checked 

                       This level can give rise to increased memory use 

                       and can trigger false ORA-4030 and false ORA-4031 

                       errors. 

 

  Oracle 9205 onwards only: 

    65536             This is introduced by the diagnostic enhancement in 

                      bug 3293155. It is a totally independent bit setting 

                      which has minimal impact on performance (unless ORed  

                      with other levels). When this is set Oracle tries to 

                      keep comments with "permanent" memory allocations 

                      which can be useful for memory leak problems if the 

                      leaked memory appears to be a leak of "perm" memory. 

                      This level can be set/unset dynamically but will only  

                      store comments in "perm" memory allocated when the  

                      event is set. 

 

  There are additional values which Oracle Support can use. 

 

Description/Steps: 

~~~~~~~~~~~~~~~~~~ 

  This event may be used to try to catch HEAP corruption problems closer  

  to when they occur.  Typically level 12 is required to get close to the 

  corruption but this can impact performance too much to be useful. 

 

Example Output / Interpreting Output: 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 

  The event should cause an ORA-600 and heapdump to be produced if an 

  error is detected.  

Related: 

~~~~~~~~ 

 

EVENT:10052 "Stop SMON from cleaning up obj$"

Text:   Stop SMON from cleaning up obj$
——————————————————————————-

NOTE:
Events should NEVER be set by customers unless advised to do so by
Oracle Support Services. You must read <Note:75713.1> before setting ANY
event on a system.

Usage:
~~~~~~
To disable cleanup of OBJ$ by SMON set this event in the init.ora file
thus:

EVENT=”10052 trace name context forever”

Explanation:
~~~~~~~~~~~~
Setting event 10052 stops SMON trying to clean up OBJ$.

The implication of this is that OBJ$ will grow with entries for
non-existent objects, eg: objects which have been dropped and
object entries created for negative dependency reasons.

The usual reason for setting this event is to prevent SMON from
crashing / spinning in the OBJ$ cleanup code whilst diagnostics
or further investigations are performed. It is also sometimes
used in Parallel Server environments so that OBJ$ cleanup is
only performed on selected nodes.

@   The routine it disables is <Function:kqlclo> (KQL Clean Obj$) which scans
@   obj$ for objects marked as non-existent and then deletes them if
@   they are not child dependencies of anything.

沪公网安备 31010802001379号

TEL/電話+86 13764045638
Email service@parnassusdata.com
QQ 47079569