Advanced Diagnostic using oradebug dumpvar

oradebug工具是Oracle数据库调优和诊断的利器,合理运用oradebug可以大幅减少我们收集诊断信息所花费的时间。当然前提是合理运用,对于初级DBA有这样一个忠告,不要在生产环境中去接触或修改自己所不熟悉的领域的东西,这一点很重要。并不是说中高级DBA在知识和经验上能达到巨细靡遗的状态,实际上中高级Oracle DBA可能每天都在和新事物打交道。恰恰相反,中高级DBA是对以上忠告最忠实的践行者,在生产环境中绝不随意涉险于不确定、不明确、不熟悉的区域,很多时候这会让人觉得是一种”好奇心丧失”的表现。

言归正传,我们所要介绍的是oradebug的dumpvar命令,该命令用于打印转储固定的PGA/UGA/SGA的变量:,其语法如下:

oradebug dumpvar
Print/dump a fixed PGA/SGA/UGA variable.
Syntax                                                              Parameter
oradebug dumpvar <p|s|uga> <variable name> [level]                  <p|s|uga> PGA,SGA or UGA
                                                                    fixed variable name
                                                                    [level]
使用示例
SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmaclean.com


SQL> show parameter db_files

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_files                             integer     200

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug dumpvar sga kcfdpk
kfil kcfdpk_ [698B950, 698B954) = 000000C8

这里的KCFDPK变量保存了db_files参数的值,000000C8 == 200

利用oradebug工具我们不仅可以做到对这些内部变量的窥视,还可以做到修改,
但是这对我们实际的诊断没有益处,仅仅可以用来满足某些模拟实验

仅仅了解dumpvar命令的使用方法并没有意义,只有和有意义的内部变量结合起来才能真正起到高级诊断的作用,在这里抛砖引玉罗列出部分诊断变量:

sgauso --  该变量可以用于判断use_stored_outlines
SQL> oradebug setmypid;
Statement processed.
Default use_stored_outlines=false;
SQL> oradebug dumpvar sga sgauso
qolprm sgauso_ [060021418, 06002143C) =
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
SQL> alter system set use_stored_outlines=true;
System altered.
SQL> oradebug dumpvar sga sgauso
qolprm sgauso_ [060021418, 06002143C) =
00000001 45440007 4C554146 00000054 00000000 00000000 00000000 00000000 00000000
45440007 4C554146 ==>      DEFAULT,意为优化器使用DEFAULT category中存放的outline.
ksmvpa -- the size of the variable part of the pga
SQL> oradebug dumpvar pga ksmvpa
b4 ksmvpa_ [0068AA6B4, 0068AA6B8) = 0000E920          -- 59680 bytes
kkjsre        -- The SGA variable kkjsre must be 1 for jobs to execute automatically.
SQL> oradebug  dumpvar sga kkjsre
word kkjsre_ [060025760, 060025764) = 00000001
SQL> exec  dbms_ijob.set_enabled(false);
PL/SQL procedure successfully completed.
SQL> oradebug  dumpvar sga kkjsre
word kkjsre_ [060025760, 060025764) = 00000000

kcmsmx          --the MAX reasonable scn 
SQL> oradebug dumpvar pga kcmsmx
kscn kcmsmx_ [00B7E811C, 00B7E8124) = CA7F0000 00000C6C         -- 11.2.0.2
SQL> oradebug dumpvar pga kcmsmx
kscn kcmsmx_ [0068AB02C, 0068AB034) = A701C000 00000B3D         -- 10.2.0.4 
Notice the MAX scn allowed on 11.2 is far higher that that at earlier releases
(due to the fix from bug 9254170).
Use the fix for 9254170 on ALL DBs with an SCN rate set to match between versions.
Avoid excessive SCN generation rates.
kcvlcm - logging checkpoints to alert - FALSE
SQL> show parameter log_checkpoints_to_alert
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_checkpoints_to_alert             boolean     TRUE
SQL> oradebug dumpvar sga kcvlcm
word kcvlcm_ [060019E88, 060019E8C) = 00000001
SQL> alter system set log_checkpoints_to_alert=false;
System altered.
SQL> oradebug dumpvar sga kcvlcm
word kcvlcm_ [060019E88, 060019E8C) = 00000000
kcvcpr - false (a checkpoint is requested)
SQL> oradebug dumpvar sga  kcvcpr
word kcvcpr_ [060019E90, 060019E94) = 00000000
kcvcpf - false (checkpointing fast)
kcvgcw -false a global checkpointing is waiting
SQL> oradebug dumpvar sga kcvgcw
sword kcvgcw_ [060025748, 06002574C) = 00000000

kcfcpd - true if checkpoint writes done
kcvblg  kcvblg[0] is s 1 incidate some file/s in backup mode
SQL> oradebug dumpvar sga kcvblg
ub4 * kcvblg_ [060019E18, 060019E20) = 9A248508 00000000
SQL> oradebug peek 0x9A248508 100
[09A248508, 09A24856C) =
00000001 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000002 00000000
00000002 00000002 00000000 00000000 ...
alter database begin backup;
SQL>  oradebug peek 0x9A248508 100
[09A248508, 09A24856C) =
00000001 00000003 00000003 00000003 00000003
00000003 00000003 00000003 00000003 00000003
00000003 00000003 00000003 00000003 ...
kcsgscn_           -- current scn 
SQL> oradebug dumpvar sga kcsgscn_
kcslf kcsgscn_ [0600122B0, 0600122E0) =
01C7DB4B 00000000 00000000 00000000              01C7DB4B  -- 29875019
0001E907 00000000 00000000 00000000
00000000 00000000 60011F90 00000000              60011F90 is fixed
SQL> select current_scn from v$database;
CURRENT_SCN
-----------
29875044

kslwlst_  --  waiter list latch
SQL> oradebug dumpvar sga kslwlst
ksllt kslwlst_ [200040AC, 20004174) =
00000000 00000009 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 ...
kcfdfk -- 2 * db_files

SQL> oradebug dumpvar sga kcfdfk
kfil kcfdfk_ [060017EF0, 060017EF4) = 00000190            --400    when db_files=200 
SQL> alter system set db_files=2000 scope=spfile;
System altered.
SQL> oradebug dumpvar sga kcfdfk
kfil kcfdfk_ [060017EF0, 060017EF4) = 00000FA0             --4000
kcffsz  Address of Datafile Size 
So kcbzib() passes in information about how many blocks to read
( and where to start reading from ) The assertion is 
if (bno < 2 || bno + nblks - 1 > (kcflsz[fno] ? kcflsz[fno] : kcffsz[fno]))
==============================================================
.......NOTE: This section is removed after 9g..................
{
/* if the user gave us the block number then this is an external error,
* but if the dba was internally generated then this is a corruption of
* some kind. */
ASSERTNM5(usrdba, OERINM("kcfrbd_2"),
fno, bno, nblks, kcflsz[fno], kcffsz[fno]);
SQL> oradebug dumpvar sga kcffsz
ub4 * kcffsz_ [060017F20, 060017F28) = 9988E700 00000000
SQL> oradebug peek 0x9988E700 200
[09988E700, 09988E7C8) =
00000001
0007B200
00030980
0000AF00
00067CA0
00003200
00280000
00000500
00000000
00000A00
00000000
00000000
00000600
00040000 ...
SQL> select blocks,to_char(blocks,'XXXXXXXX') hex_blocks from v$datafile;
BLOCKS HEX_BLOCK
---------- ---------
504320     7B200
199040     30980
44800      AF00
425120     67CA0
12800      3200
2621440    280000
1280       500
0         0
2560       A00
38400      9600
6400      1900
BLOCKS HEX_BLOCK
---------- ---------
1536       600
262144     40000
13 rows selected.
kcflsz -- like kcffsz

SQL> oradebug dumpvar  sga kcflsz
ub4 * kcflsz_ [060017F28, 060017F30) = 99892588 00000000
SQL> oradebug peek 0x99892588 200
[099892588, 099892650) =
00000000 0007B200 00030980 0000AF00 00067CA0
00003200 00280000 00000500 00000000 00000A00
00000000 00000000 00000600 00040000 ...
THe arguments are file number, block number, number of blocks, kcflsz[fno],
kcffsz[fno]
Error comes out of kcf.c.
Can they try a validate structure cascade on this table and see what happens?
Please update the customer field so that it does not read internal.

kcf.c kcf.c Kernel Cache Files component.
ksl.c Kernel Service layer Latching & Wait-post Implement.
ksm.c Kernel Service Memory component implementation.
kkj.c Kernel Kompiletime Job queue.

还原真实的cache recovery

我们在学习Oracle基础知识的时候会了解到实例恢复(Instance Recovery)或者说崩溃恢复(Crash recovery)的概念,有时候甚至于这2个名词在我们日常的语言中表达同样的意思。实际上Instance Recovery与Crash Recovery是存在区别的:针对单实例(single instance)或者RAC中所有节点全部崩溃后的恢复,我们称之为Crash Recovery。而对于RAC中的某一个节点失败,存活节点(surviving instance)试图对失败节点线程上redo做应用的情况,我们称之为Instance Recovery。

不管是Instance Recovery还是Crash Recovery,都由2个部分组成:cache recovery继以transaction recovery。

根据官方文档的介绍,Cache Recovery也叫Rolling Forward,就是我们常说的前滚;而Transaction Recovery也叫Rolling Back,就是我们常说的回滚。前滚和回滚贯穿Oracle恢复的基本概念,是我们入门必要学习的知识,在次不多介绍。

有文事者,必济之以武略。理论学得再好,不实践也无用。所幸Crash Recovery是很容易做成的实验,我们不妨来看一看:

SQL> shutdown abort;
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 1065353216 bytes
Fixed Size                  2089336 bytes
Variable Size             486542984 bytes
Database Buffers          570425344 bytes
Redo Buffers                6295552 bytes
Database mounted.
SQL> alter database open;
Crash Recovery将从alter database open开始,我们来观察其日志
==================alert.log====================
alter database open
Tue Jun 14 18:19:53 2011
Beginning crash recovery of 1 threads
parallel recovery started with 2 processes
Tue Jun 14 18:19:53 2011
Started redo scan
Tue Jun 14 18:19:53 2011
Completed redo scan
0 redo blocks read, 0 data blocks need recovery
Tue Jun 14 18:19:53 2011
Started redo application at
Thread 1: logseq 1004, block 1124, scn 17136185
Tue Jun 14 18:19:53 2011
Recovery of Online Redo Log: Thread 1 Group 2 Seq 1004 Reading mem 0
Mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_2_6v34jokt_.log
Mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_2_6v34jotq_.log
Tue Jun 14 18:19:53 2011
Completed redo application
Tue Jun 14 18:19:53 2011
Completed crash recovery at
Thread 1: logseq 1004, block 1124, scn 17156186
0 data blocks read, 0 data blocks written, 0 redo blocks read
Tue Jun 14 18:19:53 2011
LGWR: STARTING ARCH PROCESSES
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0 started with pid=16, OS id=7829
Tue Jun 14 18:19:53 2011
Thread 1 advanced to log sequence 1005 (thread open)
Thread 1 opened at log sequence 1005
Current log# 3 seq# 1005 mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_3_6v34jpmp_.log
Current log# 3 seq# 1005 mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_3_6v34jpyn_.log
Successful open of redo thread 1
Tue Jun 14 18:19:53 2011
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC0: Becoming the heartbeat ARCH
Tue Jun 14 18:19:53 2011
SMON: enabling cache recovery
Tue Jun 14 18:19:53 2011
db_recovery_file_dest_size of 204800 MB is 6.81% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Tue Jun 14 18:19:54 2011
Successfully onlined Undo Tablespace 1.
Tue Jun 14 18:19:54 2011
SMON: enabling tx recovery
Database Characterset is UTF8
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 2
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=17, OS id=7831
Tue Jun 14 18:19:55 2011
Completed: alter database open

注意上述单实例Crash Recovery到数据库打开的整个过程:

  • alter database open
  • Beginning crash recovery of 1 threads
  • Started redo scan
  • Completed redo scan
  • Started redo application at
  • Completed redo application
  • Completed crash recovery at
  • SMON: enabling cache recovery
  • Successfully onlined Undo Tablespace 1
  • SMON: enabling tx recovery
  • Completed: alter database open

从上述步骤中我们可以看到三种恢复名词,即:

  • crash recovery
  • cache recovery
  • tx recovery

这和官方文档所描述的Crash Recovery概念是不一致的,我们现在来理清这几种recovery。

crash recovery包含对redo的scan和application,显然其完成的是Rolling Forward前滚的工作,告警日志中出现的crash recovery等同于官方文档中介绍的”cache recovery”,我们可以将” Completed crash recovery”看做前滚完成的标志。而tx recovery从字面就可以看出实际上是Transaction Recovery,tx recovery发生在Undo Tablespace online之后(回滚事务的前提是Undo可用),数据完成打开操作之前(“Completed: alter database open”)。注意tx recovery并不要求数据库打开前完成,仅仅是在数据库打开之前由smon启动(“SMON: enabling tx recovery”)。

剩下的唯一的问题是,这里的cache recovery是什么?显然它不是官方文档中所描述的”cache recovery”,几乎没有任何文档介绍存在这样一个recovery操作,这也是本文重点要介绍的。

我们来看另一个演示,这个演示用以说明cache recovery还存在于最普通的不包含Crash Recovery的数据库打开过程中:

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 1065353216 bytes
Fixed Size                  2089336 bytes
Variable Size             486542984 bytes
Database Buffers          570425344 bytes
Redo Buffers                6295552 bytes
Database mounted.
SQL> alter database open;
Database altered.
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> select * from global_name;
GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmaclean.com
==================alert.log====================
alter database open
Tue Jun 14 18:43:52 2011
LGWR: STARTING ARCH PROCESSES
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0 started with pid=14, OS id=8133
Tue Jun 14 18:43:52 2011
Thread 1 opened at log sequence 1005
Current log# 3 seq# 1005 mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_3_6v34jpmp_.log
Current log# 3 seq# 1005 mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_3_6v34jpyn_.log
Successful open of redo thread 1
Tue Jun 14 18:43:52 2011
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC0: Becoming the heartbeat ARCH
Tue Jun 14 18:43:52 2011
SMON: enabling cache recovery
Tue Jun 14 18:43:53 2011
Successfully onlined Undo Tablespace 1.
Tue Jun 14 18:43:53 2011
SMON: enabling tx recovery
Database Characterset is UTF8
Opening with internal Resource Manager plan
where NUMA PG = 1, CPUs = 2
replication_dependency_tracking turned off (no async multimaster replication found)
Tue Jun 14 18:43:53 2011
Incremental checkpoint up to RBA [0x3ed.624.0], current log tail at RBA [0x3ed.944.0]
Tue Jun 14 18:43:53 2011
Starting background process QMNC
QMNC started with pid=15, OS id=8135
Tue Jun 14 18:43:53 2011
Completed: alter database open

因为是clean shutdown,所以这里不存在crash recovery。但这里同样出现了”SMON: enabling cache recovery”,可见cache recovery是每次实例启动instance startup必要执行的一种恢复操作。但问题是,这个恢复操作到底针对何种对象?

实际上cache recovery所要恢复的是rowcache,也就是我们常说的字典缓存(dictionary cache)。关于这个结论,肯定有很多人要问我这样说的依据是什么,对应于这个”cache recovery”的问题,我们很难从google中得到一些启示,因为它和官方文档所描述的”cache recovery-rolling forward”存在重名的关系。

为了证明cache recovery所恢复的是rowcache,我们需要一个实证,从正式的系统中得到验证。要做到这一点是比较困难的,我们需要Oracle愿意把整个database open的过程变成慢动作来供我们参考,验证要用到一些调试工具,例如gdb或者dbx。

我们首先将实例启动到mount状态,并对执行startup的LOCAL进程做gdb的breakpoint断点调试:

SQL> shutdown abort;
ORACLE instance shut down.
SQL> startup mount;
ORACLE instance started.
Total System Global Area 1065353216 bytes
Fixed Size                  2089336 bytes
Variable Size             486542984 bytes
Database Buffers          570425344 bytes
Redo Buffers                6295552 bytes
Database mounted.
找出LOCAL进程的系统进程号SPID
SQL> select spid from v$process
2  where addr in (
3  select paddr from v$session
4  where sid=(select distinct sid from v$mystat))
5  /
SPID
------------
8326
在实例startup nomount/mount后共享池的library cache就是可用的

SQL> select namespace from v$librarycache where gets!=0;
NAMESPACE
---------------
SQL AREA
TABLE/PROCEDURE
而rowcache则尚未被填充,因为字典缓存来源于自举对象(bootstrap$)和字典基表
SQL> select parameter,count,gets from v$rowcache where count!=0;
no rows selected
另开一个terminal窗口,并执行对LOCAL进程8326的gdb breakpoint调试
[oracle@rh2 ~]$ gdb $ORACLE_HOME/bin/oracle 8326
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-23.el5)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /s01/db_1/bin/oracle...(no debugging symbols found)...done.
Attaching to program: /s01/db_1/bin/oracle, process 8326
Reading symbols from /s01/db_1/lib/libskgxp10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libskgxp10.so
Reading symbols from /s01/db_1/lib/libhasgen10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libhasgen10.so
Reading symbols from /s01/db_1/lib/libskgxn2.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libskgxn2.so
Reading symbols from /s01/db_1/lib/libocr10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libocr10.so
Reading symbols from /s01/db_1/lib/libocrb10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libocrb10.so
Reading symbols from /s01/db_1/lib/libocrutl10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libocrutl10.so
Reading symbols from /s01/db_1/lib/libjox10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libjox10.so
Reading symbols from /s01/db_1/lib/libclsra10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libclsra10.so
Reading symbols from /s01/db_1/lib/libdbcfg10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libdbcfg10.so
Reading symbols from /s01/db_1/lib/libnnz10.so...(no debugging symbols found)...done.
Loaded symbols for /s01/db_1/lib/libnnz10.so
Reading symbols from /usr/lib64/libaio.so.1...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libaio.so.1
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
0x0000003181a0d8e0 in __read_nocancel () from /lib64/libpthread.so.0
输入断点kcrf_commit_force和kqlobjlod
(gdb) break kcrf_commit_force
Breakpoint 1 at 0x2724b6c
(gdb) break kqlobjlod
Breakpoint 2 at 0x1ac5e8c

在之前的terminal中执行数据库打开操作,因为breakpoint的关系这个open操作会hang住,
这时我们通过观察告警日志来了解恢复进度

SQL> alter database open;                              --这里会hang住
在gdb下输入continue,
(gdb) c
Continuing.
Breakpoint 1, 0x0000000002724b6c in kcrf_commit_force ()
观察告警日志可以发现redo application已经完成,但还未进入cache recovery阶段
alter database open
Tue Jun 14 19:14:33 2011
Beginning crash recovery of 1 threads
parallel recovery started with 2 processes
Tue Jun 14 19:14:33 2011
Started redo scan
Tue Jun 14 19:14:33 2011
Completed redo scan
39 redo blocks read, 74 data blocks need recovery
Tue Jun 14 19:14:33 2011
Started redo application at
Thread 1: logseq 1006, block 1155
Tue Jun 14 19:14:33 2011
Recovery of Online Redo Log: Thread 1 Group 1 Seq 1006 Reading mem 0
Mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_1_6v34jnkn_.log
Mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_1_6v34jnst_.log
Tue Jun 14 19:14:33 2011
Completed redo application
Tue Jun 14 19:14:33 2011
Completed crash recovery at
Thread 1: logseq 1006, block 1194, scn 17200193
74 data blocks read, 74 data blocks written, 39 redo blocks read
Tue Jun 14 19:14:33 2011
LGWR: STARTING ARCH PROCESSES
ARC0: Archival started
LGWR: STARTING ARCH PROCESSES COMPLETE
ARC0 started with pid=17, OS id=8656
Tue Jun 14 19:14:33 2011
Thread 1 advanced to log sequence 1007 (thread open)
Thread 1 opened at log sequence 1007
Current log# 2 seq# 1007 mem# 0: /flashcard/oradata/G10R2/onlinelog/o1_mf_2_6v34jokt_.log
Current log# 2 seq# 1007 mem# 1: /s01/flash_recovery_area/G10R2/onlinelog/o1_mf_2_6v34jotq_.log
Successful open of redo thread 1
Tue Jun 14 19:14:33 2011
ARC0: Becoming the 'no FAL' ARCH
ARC0: Becoming the 'no SRL' ARCH
ARC0: Becoming the heartbeat ARCH
db_recovery_file_dest_size of 204800 MB is 6.81% used. This is a
user-specified limit on the amount of space that will be used by this
database for recovery-related files, and does not reflect the amount of
space available in the underlying filesystem or ASM diskgroup.
Tue Jun 14 19:14:37 2011
Incremental checkpoint up to RBA [0x3ef.3.0], current log tail at RBA [0x3ef.3.0]
且此时rowcache仍未被填充
SQL> select parameter,count,gets from v$rowcache where count!=0;
no rows selected

在gdb界面下再次执行continue 2次
(gdb) c
Continuing.
Breakpoint 1, 0x0000000002724b6c in kcrf_commit_force ()
(gdb) c
Continuing.
Breakpoint 2, 0x0000000001ac5e8c in kqlobjlod ()
观察告警日志可以发现已开始cache recovery,但也卡陷在cache recovery上,这保证我们的演示不受骚扰
Tue Jun 14 19:16:44 2011
SMON: enabling cache recovery
此时rowcache中出现唯一的一个dc_objects对象
select parameter,count,gets from v$rowcache where count!=0;
PARAMETER                             COUNT       GETS
-------------------------------- ---------- ----------
dc_objects                                1          1
这个对象是什么呢?也许你已经猜到了,我们做一个rowcache dump来看一下:
SQL> ALTER SESSION SET EVENTS 'immediate trace name row_cache level 10';
================row_cache trace===================
BUCKET 43170:
row cache parent object: address=0x92326060 cid=8(dc_objects)
hash=f3d1a8a1 typ=11 transaction=(nil) flags=00000001
own=0x92326130[0x9230f628,0x9230f628] wat=0x92326140[0x92326140,0x92326140] mode=S
status=EMPTY/-/-/-/-/-/-/-/-
set=0, complete=FALSE
data=
00000000 4f42000a 5453544f 24504152 00000000 00000000 00000000 00000000
00000000 00000001 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000
BUCKET 43170 total object count=1
可以看到上述dc_objects尚未完成加载(status=EMPTY & complete=FALSE ),那么这是一个什么object呢?
4f42000a 5453544f 24504152 => 转换为文本即:OB TSTO$PAR也就是BOOTSTRAP$
换而言之在cache recovery时第一个恢复的字典缓存对象是BOOTSTRAP$,这并不出乎我们的意料。
启动实例的LOCAL进程的等待事件为instance state change,这是常规情况下我们观察不到得
SQL> select event,p1text,p1 from v$session where wait_class!='Idle';
EVENT                                    P1TEXT                                           P1
---------------------------------------- ---------------------------------------- ----------
instance state change                    layer                                             2
在gdb界面下再次continue,将载入更多的rowcache对象
(gdb) c
Continuing.
Breakpoint 2, 0x0000000001ac5e8c in kqlobjlod ()
BUCKET 37:
row cache parent object: address=0x916cd980 cid=3(dc_rollback_segments)
hash=5fed2a24 typ=9 transaction=(nil) flags=000000a6
own=0x916cda50[0x916cda50,0x916cda50] wat=0x916cda60[0x916cda60,0x916cda60] mode=N
status=VALID/INSERT/-/FIXED/-/-/-/-/-
data=
00000000 00000000 00000001 00000009 59530006 4d455453 00000000 00000000
00000000 00000000 00000000 00000000 00000003 00000000 00000000 00000000
00000000 00000000 00000000 00000000
BUCKET 37 total object count=1

595300064d455453 -> SYSTEM 属于dc_rollback_segments 也就是著名的system回滚段
BUCKET 55556:
row cache parent object: address=0x916d8cd0 cid=8(dc_objects)
hash=ce89d903 typ=11 transaction=(nil) flags=00000001
own=0x916d8da0[0x9230f628,0x9230f628] wat=0x916d8db0[0x916d8db0,0x916d8db0] mode=S
status=EMPTY/-/-/-/-/-/-/-/-
set=0, complete=FALSE
data=
00000000 5f430006 234a424f 00000000 00000000 00000000 00000000 00000000
00000000 00000005 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000
BUCKET 55556 total object count=1

5f430006 234a424f -> C_OBJ# 是著名的bootstrap对象之一,可以从$ORACLE_HOME/rdbms/admin/sql.bsq中找到它
create rollback segment SYSTEM tablespace SYSTEM
storage (initial 50K next 50K)
/
create cluster c_obj# (obj# number)
pctfree 5 size 800                           /* don't waste too much space */
/* A table of 32 cols, 2 index, 2 col per index requires about 2K.
* A table of 10 cols, 2 index, 2 col per index requires about 750.
*/
storage (initial 130K next 200k maxextents unlimited pctincrease 0)
/* avoid space management during IOR I */
/

我们还可以通过v$rowcache_parent视图来了解dictionary cache的情况

SQL> col cache_name for a20
SQL> col keystr for a31
SQL> set linesize 200
SQL> select address,cache_name,existent,lock_mode,saddr,substr(key,1,30) keystr from v$rowcache_parent;
ADDRESS          CACHE_NAME           E  LOCK_MODE SADDR            KEYSTR
---------------- -------------------- - ---------- ---------------- -------------------------------
00000000916CCE20 dc_tablespaces       N          0 00               000000000000000000000000000000
00000000916CD980 dc_rollback_segments Y          0 00               000000000000000000000000000000
0000000092326060 dc_objects           Y          0 00               000000000A00424F4F545354524150
00000000916D8CD0 dc_objects           N          3 000000009BD91328 000000000600435F4F424A23000000
00000000916DA830 dc_object_ids        Y          0 00               380000000000000000000000000000

可以看到持有row cache lock的会话是'000000009BD91328',
且该dc_objects对象还处于non-existent状态,
换而言之真正装载rowcache的是启动实例的LOCAL服务进程
SQL> select sid,program,event,p1,p2,p3 from v$session where saddr='000000009BD91328';
SID PROGRAM                                          EVENT                                    P1   P2 P3
----- ------------------------------------------------ ---------------------------------------- -- ---- --
3294 sqlplus@rh2.oracle.com (TNS V1-V3)               db file scattered read                    1  378  3
该进程正在等待db file scattered read,fileid->1,block-378,这些块属于BOOTSTRAP$表

BOOTSTRAP$对象已从rowcache被载入到library cache中
SQL> select kglhdadr,kglnaobj from x$kglob where kglobtyp=2 and kglnaobj not like 'X$%';
KGLHDADR             KGLNAOBJ
-------------------- --------------------
0000000092326990     BOOTSTRAP$
SQL> select owner||'.'||Name from v$db_object_cache where type='TABLE' and name not like 'X$%';
OWNER||'.'||NAME
--------------------------------------------------------------------------------
SYS.BOOTSTRAP$

初步总结:

  1. 在数据库正式open前需要恢复字典缓存,这个步骤被称为cache recovery,其实是row cache recovery。与官方文档中描述的”cache recovery”不同,row cache recovery应当是Oracle Internal的叫法。
  2. 实际执行row cache recovery的不是SMON进程,而是启动实例的服务进程

【书籍推荐】Oracle 8i Internal Services

直到今天仍是经典,无可替代,绝不落伍。想了解Oracle internal 必读的一本书,每次翻开影印本都会有所收获,感谢作者的无私贡献!
[gview file=”http://askmaclean.com/wp-content/uploads/resource/oracle8i.internal.services.for.waits.latches.locks.pdf”]

Know more about redo log buffer and latches

1.
The Total size of the log buffer is determined by LOG_BUFFER parameter.

2.
Only Server process may pin a data block in exclusive mode.

3.
LGWR writes to the redo log files when:

  • The redo log buffer is 1/3 full.
  • 1 MB of data have been written to the redo log buffer.
  • A 3-second time-out occurs.
  • commit!
  • before dbwr write out dirty buffers
  • A thread is closed

 

4.
the correct sequence for the LGWR algorithm is :

  1. Acquire the redo writing and redo allocation latches
  2. determine the buffer to write out
  3. Release the redo allocation latch
  4. determine how many writes are required

 

5.
In Oracle8i, the redo copy latch is always acquired regardless of the redo size.

Because the parameter LOG_SMALL_ENTRY_MAX_SIZE is obsolete, a redo copy latch is always acquired.

6.
Three most relevant redo events under normal operations:

  • Log file parallel write
  • Log buffer space
  • Log file sync

 

7.
relevant redo statistics and their purpose

  • REDO writes                                                 === Number of times the log buffer is written
  • redo blocks written                                === Number of times the log buffer is written
  • redo write time                                          === Total time required to write all the redos to disk
  • redo buffer allocation retires         === Total number of retries necessary to allocate space in the redo buffer

 

8.
Log buffer contention is typically indicated by Redo buffer allocation retries .

The buffer allocation retries indicates that a process has to continually try to allocate buffers that are not available

9.
Redo buffer allocation retries can be gathered by querying the name and value columns from V$ dynamic views:
V$sesstat
V$sysstat

10.
Stripe the redo log files across physical disks will help flush the redo buffer faster.

The redo log file write batch is 128K, so striping the redo logs across 8 disks with a stripe size of 16K could improve the write time by as much as 75%.

11.
The number of redo copy latches can be defined by setting _Log_simultaneous_copies .

The parameter begins with an underscore because in 8i it is a hidden parameter.

12.
The purpose of the redo writing latch is to: Prevent multiple processes from posting
LGWR when there is no space in the buffer .

The redo writing latch frees space in log buffer and does not want
processes writing to the buffer while space is being freed

13.
_log_io_size is set to reduce contention for the redo writing latch.

But notice that it is a hidden parameter. This is the number of used redo blocks that will automatically initiate a log write.

14.
A standby database must be refreshed following a nologging operation.

When nologging occurs, the redo logs of the standby database are not written to.

15.
log_small_entry_max_size is used to determine if a redo copy latch is required based upon the redo entry size,

however, this parameter is obsolete in Oracle8i, as a copy latch is always acquired.

_log_simultaneous_copies determines the number of redo copy latches to allocate.

Note that this is a hidden parameter in Oracle8i because it is not recommended to have more than one copy latch.

log_checkpoint_interval determines the need for a checkpoint.

EVENT 10235:"check memory manager internal structures"

Event:10235                     
~~~~~~~~~~~ 
Version/Use: 
  7.0 - 10.1.X   Check memory manager internal structures. 
  7.0 - 10.1.X "Check memory manager internal structures" 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
  NOTE: Events should NEVER be set by customers unless advised to do so by

      Oracle Support Services. Read [NOTE:75713.1] before setting any event. 
Summary Syntax: 
~~~~~~~~~~~~~~~ 
  EVENT="10235 trace name context forever,  level LL" 
  (Always comment exactly when and why this event is being set) 
   ** IMPORTANT: Do **NOT** use ALTER SESSION SET EVENTS or ORADEBUG 
                syntax to set this event in sessions. This can cause 
                lots of ORA-600 errors against SGA heaps as not all  
                sessions using the SGA heaps will be using the same event 
                level. This applies to ALL levels except level 65536. 
Levels: 
~~~~~~~ 
  The event being set at all causes certain heap checks to be performed. 
   ***  WARNING *********************************************************** 
    ***   This event should only EVER be set at the request of Oracle Support. 
    ***   It can impact performance on most types of system. 
    ***   Level 2 and above can impact latch contention. 
    ***   Level 3 and above can have a *SEVERE* impact on performance. 
  ************************************************************************ 
  The bottom 3 bits of the level cause the following checks to occur: 
     ~~~~~         ~~~~~~~~~~~~ 
     Level         Description 
     ~~~~~         ~~~~~~~~~~~~ 
        1             Fast check on heap free (kghfrh) 
        2             Do 1 AND fill memory with junk on alloc / free 
        3             Do 2 AND ensure the chunk belongs to given heap on free 
        4             Do 3 AND make permanent chunks freeable so they can  
                      also be checked 
                       This level can give rise to increased memory use 
                       and can trigger false ORA-4030 and false ORA-4031 
                       errors. 
 
  Oracle 9205 onwards only: 
    65536             This is introduced by the diagnostic enhancement in 
                      bug 3293155. It is a totally independent bit setting 
                      which has minimal impact on performance (unless ORed  
                      with other levels). When this is set Oracle tries to 
                      keep comments with "permanent" memory allocations 
                      which can be useful for memory leak problems if the 
                      leaked memory appears to be a leak of "perm" memory. 
                      This level can be set/unset dynamically but will only  
                      store comments in "perm" memory allocated when the  
                      event is set. 
 
  There are additional values which Oracle Support can use. 
 
Description/Steps: 
~~~~~~~~~~~~~~~~~~ 
  This event may be used to try to catch HEAP corruption problems closer  
  to when they occur.  Typically level 12 is required to get close to the 
  corruption but this can impact performance too much to be useful. 
 
Example Output / Interpreting Output: 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
  The event should cause an ORA-600 and heapdump to be produced if an 
  error is detected.  
Related: 
~~~~~~~~ 
 

Know more about Oracle Latches

Two purposes of latches were:Serialize Execution,Serialize Access

Match the characteristics with latches and enqueues.

Several modes of access                   Enqueue
FIFO queue acquisition                    Enqueue
Mostly exclusive access                   Latch
Non-deterministic acquisition             Latch
No atomicity                              Enqueue
Atomicity                                 Latch

Most latches have a level between 0 and 8.

Usually a process can request a latch X only:After obtaining latch Y and if level X > level Y.
Remember that the level of X, which is child latch, must be greater than level Y, which is most likely a parent latch.

The structure for parent and child latches can be seen by querying X$KSLLT .
There is one KSLLT structure for each parent and child latch in an instance.

The characteristic of latch acquisition in NO-WAIT mode is:
One attempt to get a latch, no spin or sleep .
This is typically for latches where there are many of the same type. The process will attempt to acquire another latch.

The sleep time on latches approximately doubles with each wait until upper limit is reached. This is an exponential backoff.

The statistic waiters_woken counts the number of times a posting for a latch has occurred.
It can be found in V$LATCH.This view contains aggregate statistics for latches

PMON invokes a clean up function for releasing latches if the holding process dies.PMON will initiate the clean up function.

_MAX_SLEEP_HOLDING_LATCH The maximum exponential backoff when waiting for a latch and already holding anothers
_MAX_EXPONENTIAL_SLEEP Controls the maximum time a process has to sleep before requesting the latch again
_SPIN_COUNT Designates the number of times a process will spin while trying to acquire a latchs

V$SESSION_WAIT dynamic performance view gives real time information regarding what sessions are currently waiting or have just waited for an event.

When diagnosing latch contention, we should typically look for Latches with the most impact.

We should use below formula  to calculate the total cost of latch spinning:

_spin_count * sleeps/misses

Statistics from V$LATCH with their associated description:

IMMEDIATE_MISSES Number of times the fast get failed when requested in “no-wait” mode

MISSES Number of times the fast get call failed when requested in “willing-to-wait” modes

SPIN_GETS Number of times where the fast get failed but the latch was acquired during the first spin before having to yield to the CPU

GETS Number of times the latch was obtained when requested in “willing-to-wait” mode

SLEEPi Number of times a “willing-to-wait” request had to sleep I times

IMMEDIATE_GETS Number of times the latch was obtained when requested in “no-wait” mode

We can set event 10005 to perform latch tracing, or oradebug dump latches 3 ;

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production
SQL> select * from global_name;
GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmaclean.com
[oracle@rh2 ~]$ oerr ora 10005
10005, 00000, "trace latch operations for debugging"
// *Document: NO
// *Cause:
// *Action: Enable tracing for various latch operations
// *Comment:
//    level 1 - trace latch gets and frees
//    level 4 - trace multiple posts by processes when latch is freed
//
Wait and latch (KSL) layer (10005)
We can set events 10005 (trace latch gets and frees) and 10073 (have PMON
dump info before latch cleanup).  Level does not matter for these events.
We can also set event="600 trace name LATCHES level 1" to dump latch info
when PMON hits the error (I'm not so sure this one is going to work).
event = "600 trace name latches level 10"
event = "10005 trace name context forever, level 1"
10005 event has been removed above 9i , try dump KSTDUMPCURPROC 1
SQL> oradebug setmypid;
SQL> oradebug unlimit;
SQL> oradebug dump KSTDUMPCURPROC 1
SQL> oradebug tracefile_name;
KSTDUMP: In-memory trace dump
TIME(usecs):SEQ# ORAPID   SID EVENT  OP DATA
========================================================================
D9396B13:000044F8    14     0 10005   1 KSL WAIT BEG [PX Idle Wait] 200/0xc8 146/0x92 0/0x0
D9573C3D:00004568    14     0 10005   2 KSL WAIT END [PX Idle Wait] 200/0xc8 146/0x92 0/0x0 time=1954087
D9573C4F:00004569    14     0 10005   1 KSL WAIT BEG [PX Idle Wait] 200/0xc8 147/0x93 0/0x0
D9750D82:000045FC    14     0 10005   2 KSL WAIT END [PX Idle Wait] 200/0xc8 147/0x93 0/0x0 time=1954097
D9750D8F:000045FD    14     0 10005   1 KSL WAIT BEG [PX Idle Wait] 200/0xc8 148/0x94 0/0x0
D992DEA2:00004639    14     0 10005   2 KSL WAIT END [PX Idle Wait] 200/0xc8 148/0x94 0/0x0 time=1954065
D992DEB2:0000463A    14     0 10005   1 KSL WAIT BEG [PX Idle Wait] 200/0xc8 149/0x95 0/0x0
D9B0AFD8:000046A9    14     0 10005   2 KSL WAIT END [PX Idle Wait] 200/0xc8 149/0x95 0/0x0 time=1954084
D9B0AFEB:000046AA    14     0 10005   1 KSL WAIT BEG [PX Idle Wait] 200/0xc8 150/0x96 0/0x0
D9CE8110:00004742    14     0 10005   2 KSL WAIT END [PX Idle Wait] 200/0xc8 150/0x96 0/0x0 time=1954083
D9CE811E:00004743    14     0 10005   1 KSL WAIT BEG [PX Idle Wait] 200/0xc8 151/0x97 0/0x0
D9EC5246:00004790    14     0 10005   2 KSL WAIT END [PX Idle Wait] 200/0xc8 151/0x97 0/0x0 time=1954086
D9ECA80D:00004792    14     0 10280   8 kst: process state object about to be deleted
E308018E:00006780    14     0 10280   1 kst: process state object created on 06-30 22:11:24.211
E308018F:00006781    14     0 10280   3 kst: process info: ospid=10380 pso_num=14 pso_serial#=2
E30801D0:00006782    14     0 10420   1 kso: new process: pid=10380 (legacy spawn)
E4ACF279:00006D4A    14     0 10280   8 kst: process state object about to be deleted
E4B2CD42:00006D4B    14     0 10280   1 kst: process state object created on 06-30 22:11:52.853
E4B2CD43:00006D4C    14     0 10280   3 kst: process info: ospid=10382 pso_num=14 pso_serial#=3
E4B2CDA3:00006D4D    14     0 10420   1 kso: new process: pid=10382 (legacy spawn)
KSTDUMP: End of in-memory trace dump
**** LGWR trace
05A46022:0000DEB0     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[58] comment=[Redo writer log switch operations]
05A46025:0000DEB1     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[59] comment=[Redo writer generate offline immed marker]
05A46026:0000DEB2     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[170] comment=[kfr Poke LGWR]
05A46027:0000DEB3     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[171] comment=[kfr Incr Ckpt]
05A46029:0000DEB4     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[172] comment=[kfr ACD relocation]
05A4602A:0000DEB5     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[1] comment=[KSB action for X-instance calls]
05A4602A:0000DEB6     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[63] comment=[Scumnt mount lock]
05A4602B:0000DEB7     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[64] comment=[Poll system events broadcast channel]
05A4602E:0000DEB8     6   221 10005   1 KSL WAIT BEG [rdbms ipc message] 300/0x12c 0/0x0 0/0x0
05D117DB:0000DF45     6   221 10005   2 KSL WAIT END [rdbms ipc message] 300/0x12c 0/0x0 0/0x0 time=2930603
05D117E6:0000DF46     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[1] comment=[KSB action for X-instance calls]
05D117E7:0000DF47     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[63] comment=[Scumnt mount lock]
05D117E8:0000DF48     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[64] comment=[Poll system events broadcast channel]
05D117ED:0000DF49     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[0] comment=[Monitor Cleanup]
05D117EE:0000DF4A     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[56] comment=[Redo writer IO's]
05D117FE:0000DF4B     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[57] comment=[Redo writer post action]
05D117FF:0000DF4C     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[58] comment=[Redo writer log switch operations]
05D11803:0000DF4D     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[59] comment=[Redo writer generate offline immed marker]
05D11804:0000DF4E     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[170] comment=[kfr Poke LGWR]
05D11805:0000DF4F     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[171] comment=[kfr Incr Ckpt]
05D11807:0000DF50     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[172] comment=[kfr ACD relocation]
05D11808:0000DF51     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[1] comment=[KSB action for X-instance calls]
05D11808:0000DF52     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[63] comment=[Scumnt mount lock]
05D11809:0000DF53     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[64] comment=[Poll system events broadcast channel]
05D1180C:0000DF54     6   221 10005   1 KSL WAIT BEG [rdbms ipc message] 300/0x12c 0/0x0 0/0x0
05FDCFC0:0000DFE1     6   221 10005   2 KSL WAIT END [rdbms ipc message] 300/0x12c 0/0x0 0/0x0 time=2930610
05FDCFCB:0000DFE2     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[1] comment=[KSB action for X-instance calls]
05FDCFCC:0000DFE3     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[63] comment=[Scumnt mount lock]
05FDCFCD:0000DFE4     6   221 10254  71 KSBCTI: (LGWR) : (interrupt action) : acnum=[64] comment=[Poll system events broadcast channel]
05FDCFD1:0000DFE5     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[0] comment=[Monitor Cleanup]
05FDCFD2:0000DFE6     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[56] comment=[Redo writer IO's]
05FDCFEF:0000DFE7     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[57] comment=[Redo writer post action]
05FDCFF0:0000DFE8     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[58] comment=[Redo writer log switch operations]
05FDCFF3:0000DFE9     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[59] comment=[Redo writer generate offline immed marker]
05FDCFF4:0000DFEA     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[170] comment=[kfr Poke LGWR]
05FDCFF5:0000DFEB     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[171] comment=[kfr Incr Ckpt]
05FDCFF7:0000DFEC     6   221 10254  73 KSBCTI: (LGWR) : (timeout action)   : acnum=[172] comment=[kfr ACD relocation]

Know more about Buffer Cache and Latch

We can examine X$BH table to obtain buffer header information,The BH stands for buffer header.

Structures that maintain a list of data buffer headers are called:Hash buckets.

Hash buckets are grouped by relative DBA and class number.
Hash chains list data buffer headers in one hash bucket.

Cache Recycle Pool For randomly accessed large tables

Default Pool For normally accessed tables

Keep Pool for frequently updated small tables

Set DB_BLOCK_LRU_LATCHES parameter to create multiple LRU lists.This parameter sets the number of latches,
and it is best used on a multi-CPU system.

DB_BLOCK_BUFFERS sets the number of buffers to allocate in the buffer cache.
DB_BLOCK_LRU_EXTENDED_STATISTICS is obsolete in Oracle8i.
DB_WRITER_PROCESSES sets the number of database writer processes.

Cache buffers LRU chain protects LRU lists.

The total number of working sets per instance is defined by DB_BLOCK_LRU_LATCHES. There is one latch per set.

The number of DBWR processes cannot be greater than the number of working sets.

Oracle get buffers as below steps:

  1. get a buffer descriptor
  2. specify a mode in which a buffer should be obtained
  3. scan the appropriate hash chain
  4. find the appropriate buffer in the chain or read from disk
  5. attach a state object to the buffer header

Buffers are initially hashed to LRU_AUX.This contains buffers that are candidates for reuse.
LRU_MAIN List houses buffers that are in use (pinned or dirty).
LRU_W List is the write list for dirty buffers.
LRU_XR List is the reuse range list for buffers that are to be written for reuse.

DBWR processes operate on working sets that are assigned to them in a cyclical manner.

There are three reasons may cause DBWR write buffers to disk:

  • To provide checkpoints
  • To do free requests
  • To do ping writes(obsolete)

The write batch size controls the number of asynchronous writes slots allocated to each DBWR.The write batch size controls the number of asynchronous writes slots allocated to each DBWR.

Two Most important DBWR statistics: DBWR make free request,DBWR lru scans 。

Two circumstances where DBWR purges dirty buffers whose DBA falls between the lowest and highest DBA of the datafiles:

  • ALTER TABLESPACE…BEGIN BACKUP
  • Make a tablespace read only

Of the nine latches that protect the buffer cache,the three that are most important:

  • Cache buffers lru chain
  • Cache buffers chains
  • Checkpoint queue latch

each of the fixed tables below with its purpose:

  • X$KCBWAIT         Wait statistics by block class
  • X$KCBFWAIT      Wait statistics by file id
  • X$KCBBHS              DBWR histogram statistics
  • X$KCBWBPD         Buffer pool descriptors

A target buffer cache hit ratio is above 90%.
But 60% could be a valid hit ratio for a DSS or Data Warehouse application.

Two primary events that relate to the buffer cache:

  • Buffer busy waits
  • Free buffer waits

We can use the views and table:V$SYSTEM_EVENT, V$WAITSTAT, and X$KCBFWAIT to determine the reason and block class for buffer busy waits.

VIEW: X$KTUXE – Transaction Entry (table)

View:   X$KTUXE
[K]ernel [T]ransaction [U]ndo
Transa[x]tion [E]ntry (table)
This view is very useful as it gives an indication of the state
of the transaction tables in the rollback segment headers.
The information here can be used to see the state of transactions
requiring transaction recovery do not show in <View:V$TRANSACTION>
<Event:10013> may be useful to trace transaction recovery.
Column          Type               Description
--------        ----               --------
ADDR            RAW(4|8)           address of this row/entry in the array or SGA
INDX            NUMBER             index number of this row in the fixed table array
INST_ID         NUMBER       8.x   oracle instance number
Transaction ID
KTUXEUSN        NUMBER             undo seg number
KUSNOLTP UB2MAXVAL           is no-undo xac
KTUXESLT        NUMBER             slot number
KSLTINV  UB2MAXVAL           not a valid slot num
KTUXESQN        NUMBER             wrap number, is savept # at start if no-undo xac
position within transaction
KTUXERDBF       NUMBER             relative File
KTUXERDBB       NUMBER             relative Block
KTUXESCNB       NUMBER             SCN base for prepare/commit
KTUXESCNW       NUMBER             SCN wrap for prepare/commit
KTUXESTA        VARCHAR2(16)       Transaction Status
KTUXECFL        VARCHAR2(24)       Transaction flags
KTUXEUEL        NUMBER             Used for extent of tx and link to commit list
Distributed tx: collecting dba and undo bk to start retrieving collecting info rec
KTUXEDDBF       NUMBER             relative file
KTUXEDDBB       NUMBER             relative dba
Parent transaction id
KTUXEPUSN       NUMBER       8.x   undo seg number
KUSNOLTP UB2MAXVAL           is no-undo xac
KTUXEPSLT       NUMBER       8.x   slot number
KSLTINV  UB2MAXVAL           not a valid slot num
KTUXEPSQN       NUMBER       8.x   wrap number, is savept # at start if no-undo xac
position within transaction
KTUXESIZ        NUMBER       8.1  number of undo blocks used by the transaction
Notes:
To see any DEAD transactions for deferred transaction recovery
after startup:
select * from x$ktuxe where ktuxecfl='DEAD';

Known Oracle Internal Stack Call Meaning

 ksedmp             # KSE: dump the process state
ksfdmp             # Call relevant dump routine
kgeasi             # Raise an error on an ASSERTION failure (IGNORE)
ktcrab             KTC: Kernel Transaction Control Real ABort - Abort a
transaction.
ktcsod             KTC: Transaction Control: STATE OBJECT PROCEDURE VECTOR
DEFINITION
kssdch_stage       
kssdch             KSS: delete children of state obj.
ksures             
ktmres             ktmres - KTM Resource cleanup routine.
ktmmon             KTM: TX Monitor: background timeout action
ksbrdp             KSB: run a detached (background) process
opirip             # Oracle Program Interface Run Independent Process
(IGNORE)
opidrv             # opidrv - ORACLE Program Interface DRiVer (IGNORE)
sou2o              # Main Oracle executable entry point
main               # Standard executable entry point

DBMS_REPAIR example


PURPOSE
This document provides an example of DBMS_REPAIR as introduced in Oracle 8i.
Oracle provides different methods for detecting and correcting data block
corruption - DBMS_REPAIR is one option.
WARNING: Any corruption that involves the loss of data requires analysis to
understand how that data fits into the overall database system. Depending on
the nature of the repair, you may lose data and logical inconsistencies can
be introduced; therefore you need to carefully weigh the gains and losses
associated with using DBMS_REPAIR.
SCOPE & APPLICATION
This article is intended to assist an experienced DBA working with an Oracle
Worldwide Support analyst only.  This article does not contain general
information regarding the DBMS_REPAIR package, rather it is designed to provide
sample code that can be customized by the user (with the assistance of
an Oracle support analyst) to address database corruption.  The
"Detecting and Repairing Data Block Corruption" Chapter of the Oracle8i
Administrator's  Guide should be read and risk assessment analyzed prior to
proceeding.
RELATED DOCUMENTS
Oracle 8i Administrator's Guide,  DBMS_REPAIR Chapter
Introduction
=============
Note: The DBMS_REPAIR package is used to work with corruption in the
transaction layer and the data layer only (software corrupt blocks).
Blocks with physical corruption (ex. fractured block) are marked as
the block is read into the buffer cache and DBMS_REPAIR ignores all
blocks marked corrupt.
The only block repair in the initial release of DBMS_REPAIR is to
*** mark the block software corrupt ***.
DB_BLOCK_CHECKING and DB_BLOCK_CHECKSUM must both be set to FALSE.
A backup of the file(s) with corruption should be made before using package.
Database Summary
===============
A corrupt block exists in table T1.
SQL> desc t1
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
COL1                                      NOT NULL NUMBER(38)
COL2                                               CHAR(512)
SQL> analyze table t1 validate structure;
analyze table t1 validate structure
*
ERROR at line 1:
ORA-01498: block check failure - see trace file
---> Note: In the trace file produced from the ANALYZE, it can be determined
---        that the corrupt block contains 3 rows of data (nrows = 3).
---        The leading lines of the trace file follows:
Dump file /export/home/oracle/product/8.1.5/admin/V815/udump/v815_ora_2835.trc
Oracle8 Enterprise Edition Release 8.1.5.0.0 - Beta
With the Partitioning option
*** 1998.12.16.15.53.02.000
*** SESSION ID:(7.6) 1998.12.16.15.53.02.000
kdbchk: row locked by non-existent transaction
table=0   slot=0
lockid=32   ktbbhitc=1
Block header dump:  0x01800003
Object id on Block? Y
seg/obj: 0xb6d  csc: 0x00.1cf5f  itc: 1  flg: -  typ: 1 - DATA
fsl: 0  fnx: 0x0 ver: 0x01
Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   xid:  0x0002.011.00000121    uba: 0x008018fb.0345.0d  --U-    3  fsc
0x0000.0001cf60
data_block_dump
===============
tsiz: 0x7b8
hsiz: 0x18
pbl: 0x28088044
bdba: 0x01800003
flag=-----------
ntab=1
nrow=3
frre=-1
fsbo=0x18
fseo=0x19d
avsp=0x185
tosp=0x185
0xe:pti[0]      nrow=3  offs=0
0x12:pri[0]     offs=0x5ff
0x14:pri[1]     offs=0x3a6
0x16:pri[2]     offs=0x19d
block_row_dump:
[... remainder of file not included]
end_of_block_dump
DBMS_REPAIR.ADMIN_TABLES (repair and orphan key
================================================
ADMIN_TABLES provides administrative functions for repair and orphan key tables.
SQL> @adminCreate
SQL> connect sys/change_on_install
Connected.
SQL>
SQL> -- Repair Table
SQL>
SQL> declare
2  begin
3  -- Create repair table
4  dbms_repair.admin_tables (
5  --    table_name => 'REPAIR_TABLE',
6      table_type => dbms_repair.repair_table,
7      action => dbms_repair.create_action,
8      tablespace => 'USERS');          -- default TS of SYS if not specified
9  end;
10  /
PL/SQL procedure successfully completed.
SQL> select owner, object_name, object_type
2  from dba_objects
3  where object_name like '%REPAIR_TABLE';
OWNER                 OBJECT_NAME                      OBJECT_TYPE
------------------------------------------------------------------
SYS                   DBA_REPAIR_TABLE                 VIEW
SYS                   REPAIR_TABLE                     TABLE
SQL>
SQL> -- Orphan Key Table
SQL>
SQL> declare
2  begin
3  -- Create orphan key table
4  dbms_repair.admin_tables (
5      table_type => dbms_repair.orphan_table,
6      action => dbms_repair.create_action,
7      tablespace => 'USERS');          -- default TS of SYS if not specified
8  end;
9  /
PL/SQL procedure successfully completed.
SQL> select owner, object_name, object_type
2  from dba_objects
3  where object_name like '%ORPHAN_KEY_TABLE';
OWNER                 OBJECT_NAME                      OBJECT_TYPE
------------------------------------------------------------------
SYS                   DBA_ORPHAN_KEY_TABLE             VIEW
SYS                   ORPHAN_KEY_TABLE                 TABLE
DBMS_REPAIR.CHECK_OBJECT
=========================
CHECK_OBJECT procedure checks the specified object and populates the repair
table with information about corruption and repair directive(s).  Validation
consists of block checking all blocks in the object.  All blocks previously
marked corrupt will be skipped.
Note: In the initial release of DBMS_REPAIR the only repair is to mark the
block as software corrupt.
SQL> @checkObject
SQL> set serveroutput on
SQL>
SQL> declare
2     rpr_count int;
3  begin
4     rpr_count := 0;
5  dbms_repair.check_object (
6     schema_name => 'SYSTEM',
7     object_name => 'T1',
8     repair_table_name => 'REPAIR_TABLE',
9     corrupt_count => rpr_count);
10     dbms_output.put_line('repair count: ' || to_char(rpr_count));
11  end;
12  /
repair count: 1
PL/SQL procedure successfully completed.
SQL> desc repair_table
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
OBJECT_ID                                 NOT NULL NUMBER
TABLESPACE_ID                             NOT NULL NUMBER
RELATIVE_FILE_ID                          NOT NULL NUMBER
BLOCK_ID                                  NOT NULL NUMBER
CORRUPT_TYPE                              NOT NULL NUMBER
SCHEMA_NAME                               NOT NULL VARCHAR2(30)
OBJECT_NAME                               NOT NULL VARCHAR2(30)
BASEOBJECT_NAME                                    VARCHAR2(30)
PARTITION_NAME                                     VARCHAR2(30)
CORRUPT_DESCRIPTION                                VARCHAR2(2000)
REPAIR_DESCRIPTION                                 VARCHAR2(200)
MARKED_CORRUPT                            NOT NULL VARCHAR2(10)
CHECK_TIMESTAMP                           NOT NULL DATE
FIX_TIMESTAMP                                      DATE
REFORMAT_TIMESTAMP                                 DATE
SQL> select object_name, block_id, corrupt_type, marked_corrupt,
2  corrupt_description, repair_description
3  from repair_table;
OBJECT_NAME                      BLOCK_ID CORRUPT_TYPE MARKED_COR
------------------------------ ---------- ------------ ----------
CORRUPT_DESCRIPTION
--------------------------------------------------------------------------------
REPAIR_DESCRIPTION
--------------------------------------------------------------------------------
T1                                      3            1 FALSE
kdbchk: row locked by non-existent transaction
table=0   slot=0
lockid=32   ktbbhitc=1
mark block software corrupt
Data Extraction
===============
The repair table indicates that block 3 of file 6 is corrupt - but remember
that this block has not yet been marked as corrupt, therefore now is the
time to extract any meaningful data.  After the block is marked corrupt,
the entire block must be skipped.
1. Determine the number of rows in the block from ALTER SYSTEM DUMP (nrows = 3).
2. Query the corrupt object and extract as much information as possible.
SQL> -- The following query can be used to salvage data from a corrupt block.
SQL> -- Creating a temporary table facilitates data insertion.
SQL> create table temp_t1 as
2  select * from system.t1
3  where dbms_rowid.rowid_block_number(rowid) = 3
4  and dbms_rowid.rowid_to_absolute_fno (rowid, 'SYSTEM','T1') = 6;
Table created.
SQL> select col1 from temp_t1;
COL1
----------
2
3
DBMS_REPAIR.FIX_CORRUPT_BLOCKS  (ORA-1578)
============================================
FIX_CORRUPT_BLOCKS procedure fixes the corrupt blocks in the specified objects
based on information in the repair table.  After the block has been marked as
corrupt,  an ORA-1578 results when a full table scan is performed.
SQL> declare
2     fix_count int;
3  begin
4     fix_count := 0;
5  dbms_repair.fix_corrupt_blocks (
6     schema_name => 'SYSTEM',
7     object_name => 'T1',
8     object_type => dbms_repair.table_object,
9     repair_table_name => 'REPAIR_TABLE',
10     fix_count => fix_count);
11     dbms_output.put_line('fix count: ' || to_char(fix_count));
12  end;
13  /
fix count: 1
PL/SQL procedure successfully completed.
SQL> select object_name, block_id, marked_corrupt
2  from repair_table;
OBJECT_NAME                      BLOCK_ID MARKED_COR
------------------------------ ---------- ----------
T1                                      3 TRUE
SQL> select * from system.t1;
select * from system.t1
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 6, block # 3)
ORA-01110: data file 6: '/tmp/ts_corrupt.dbf'
DBMS_REPAIR.DUMP_ORPHAN_KEYS
==============================
DUMP_ORPHAN_KEYS reports on index entries that point to rows in corrupt data
blocks.
SQL> select index_name from dba_indexes
2  where table_name in (select distinct object_name from repair_table);
INDEX_NAME
------------------------------
T1_PK
SQL> @dumpOrphanKeys
SQL> set serveroutput on
SQL>
SQL> declare
2     key_count int;
3  begin
4     key_count := 0;
5  dbms_repair.dump_orphan_keys (
6     schema_name => 'SYSTEM',
7     object_name => 'T1_PK',
8     object_type => dbms_repair.index_object,
9     repair_table_name => 'REPAIR_TABLE',
10     orphan_table_name => 'ORPHAN_KEY_TABLE',
11     key_count => key_count);
12     dbms_output.put_line('orphan key count: ' || to_char(key_count));
13  end;
14  /
orphan key count: 3
PL/SQL procedure successfully completed.
SQL> desc orphan_key_table
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
SCHEMA_NAME                               NOT NULL VARCHAR2(30)
INDEX_NAME                                NOT NULL VARCHAR2(30)
IPART_NAME                                         VARCHAR2(30)
INDEX_ID                                  NOT NULL NUMBER
TABLE_NAME                                NOT NULL VARCHAR2(30)
PART_NAME                                          VARCHAR2(30)
TABLE_ID                                  NOT NULL NUMBER
KEYROWID                                  NOT NULL ROWID
KEY                                       NOT NULL ROWID
DUMP_TIMESTAMP                            NOT NULL DATE
SQL> select index_name, count(*) from orphan_key_table
2  group by index_name;
INDEX_NAME                       COUNT(*)
------------------------------ ----------
T1_PK                                   3
Note: Index entry in the orphan key table implies that the index should be
rebuilt to guarantee the a table probe and an index probe return the same
result set.
DBMS_REPAIR.SKIP_CORRUPT_BLOCKS
===============================
SKIP_CORRUPT_BLOCKS enables/disables the skipping of corrupt blocks during
index and table scans of a specified object.
Note: If an index and table are out of sync, then a SET TRANSACTION READ ONLY
transaction may be inconsistent in situations where one query probes only
the index and then a subsequent query probes both the index and the table.
If the table block is marked corrupt, then the two queries will return
different results.
Suggestion: If SKIP_CORRUPT_BLOCKS is enabled, then rebuild any indexes
identified in the orphan key table (or all index associated with object
if DUMP_ORPHAN_KEYS was omitted).
SQL> @skipCorruptBlocks
SQL> declare
2  begin
3  dbms_repair.skip_corrupt_blocks (
4     schema_name => 'SYSTEM',
5     object_name => 'T1',
6     object_type => dbms_repair.table_object,
7     flags => dbms_repair.skip_flag);
8  end;
9  /
PL/SQL procedure successfully completed.
SQL> select table_name, skip_corrupt from dba_tables
2  where table_name = 'T1';
TABLE_NAME                     SKIP_COR
------------------------------ --------
T1                             ENABLED
SQL> -- rows in corrupt block skipped, no errors on full table scan
SQL> select * from system.t1;
COL1              COL2
--------------------------------------------------------------------------------
4                 dddd
5                 eeee
--> Notice the pk index has not yet been corrected.
SQL> insert into system.t1 values (1,'aaaa');
insert into system.t1 values (1,'aaaa')
*
SQL> select * from system.t1 where col1 = 1;
no rows selected
DBMS_REPAIR.REBUILD_FREELISTS
===============================
REBUILD_FREELISTS rebuilds freelists for the specified object.
SQL> declare
2  begin
3  dbms_repair.rebuild_freelists (
4     schema_name => 'SYSTEM',
5     object_name => 'T1',
6     object_type => dbms_repair.table_object);
7  end;
8  /
PL/SQL procedure successfully completed.
Rebuild Index
=============
Note:  Every index identified in the orphan key table should be rebuilt to
ensure consistent results.
SQL> alter index system.t1_pk rebuild online;
Index altered.
SQL> insert into system.t1 values (1, 'aaaa');
1 row created.
SQL> select * from system.t1;
COL1              COL2
--------------------------------------------------------------------------------
4                 dddd
5                 eeee
1                 aaaa
Note - The above insert statement was used to provide a simple example.
This is the perfect world - we know the data that was lost.  The temporary
table (temp_t1) should also be used to include all rows extracted from
the corrupt block.
Conclusion
==========
At this point the table T1 is available but data loss was incurred.  In general,
data loss must be seriously considered before using the DBMS_REPAIR package for
mining the index segment and/or table block dumps is very complicated and
logical inconsistencies may be introduced.  In the initial release, the only
repair affected by DBMS_REPAIR is to mark the block as software corrupt.
<<End of Article>

沪ICP备14014813号

沪公网安备 31010802001379号