实测DB_BLOCK_CHECKSUM=FULL的作用

从10.2.0.3开始 DB_BLOCK_CHECKSUM有三个选项:OFF TYPICAL FULL
在10.2.0.3之前 DB_BLOCK_CHECKSUM有2个选项: TRUE FALSE

在11g中DB_BLOCK_CHECKSUM和 DB_BLOCK_CHECKING参数 被 DB_ULTRA_SAFE参数整合到一起:

DB_ULTRA_SAFE sets the default values for other parameters that control protection levels.
Values:

OFF
When any of DB_BLOCK_CHECKING, DB_BLOCK_CHECKSUM, or DB_LOST_WRITE_PROTECT are explicitly set, no changes are made.

DATA_ONLY
DB_BLOCK_CHECKING will be set to MEDIUM.
DB_LOST_WRITE_PROTECT will be set to TYPICAL.
DB_BLOCK_CHECKSUM will be set to FULL.

 

DATA_AND_INDEX
DB_BLOCK_CHECKING will be set to FULL.
DB_LOST_WRITE_PROTECT will be set to TYPICAL.
DB_BLOCK_CHECKSUM will be set to FULL.

DB_BLOCK_CHECKSUM 参数决定了DBWn进程和直接路径读取进程是否为块计算checksum并将该checksum存放在每个数据块的cache header并写入到磁盘中。当该数据块被读取时,该checksum会受到验证, 前提是DB_BLOCK_CHECKSUM 被设置为TYPICAL 或 FULL,且最近一次该块的写出中存有checksum。

在FULL模式下,Oracle还会当块要发生变化应用前对该块验证checksum,并会在DML update/insert/delete语句引起变化被应用到块后再次计算该checksum。
此外,Oracle会对写入到当前redo日志文件的每一个redo block计算checksum。

若该参数设置为OFF,则 DBWn进程仅为system表空间上的对象计算checksum, 而对于普通表空间不计算。

checksum让Oracle具备检测由底层磁盘、存储子系统、IO子系统引起的坏块。

若设置为FULL, 则DB_BLOCK_CHECKSUM还会捕捉内存讹误并避免将存在逻辑讹误的块被写入到磁盘上。

官方文档介绍设置DB_BLOCK_CHECKSUM为TYPICAL模式可能引起1%-2%的性能损耗,设置为FULL mode可能引起4%-5%的性能损耗。

 

除非有极高的数据安全性要求,否则推荐用户设置DB_BLOCK_CHECKSUM为TYPICAL。
实际测试发现在 Linux +版本11.2.0.3上DB_BLOCK_CHECKSUM=FULL对普通的内存讹误(memory corruption)几乎无效,详见以下测试:

 

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> show parameter db_block
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_block_buffers                     integer     0
db_block_checking                    string      FULL
db_block_checksum                    string      FULL
db_block_size                        integer     8192
create table maclean_corruption(t1 int) tablespace system;
insert into maclean_corruption values(1);
insert into maclean_corruption values(1);
commit;
1*  select t1,dump(t1,16) from maclean_corruption
SQL> /
T1 DUMP(T1,16)
---------- ------------------------------
1 Typ=2 Len=2: c1,2
1 Typ=2 Len=2: c1,2
SQL> select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from maclean_corruption;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
89985                                    1
89985                                    1
ALTER SESSION SET EVENTS 'immediate trace name buffers level 3';
SQL> oradebug setmypid
Statement processed.
SQL>  oradebug tracefile_name
/s01/diag/rdbms/prodb/PRODB/trace/PRODB_ora_28545.trc							   
BH (0xb0f748c8) file#: 1 rdba: 0x00415f81 (1/89985) class: 1 ba: 0xb0154000
set: 6 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
dbwrid: 0 obj: 76977 objn: 76977 tsn: 0 afn: 1 hint: f
hash: [0xce69cd58,0xce69cd58] lru: [0xb0f74ae0,0xb0f74880]
obj-flags: object_ckpt_list
ckptq: [0xb0f6c168,0xb0f76a08] fileq: [0xb0f6c178,0xb0f76a18] objq: [0xc70de768,0xc70de768] objaq: [0xc70de748,0xc70de748]
st: XCURRENT md: NULL fpin: 'kcbwh6: kcbnew' tch: 4
flags: buffer_dirty redo_since_read
LRBA: [0xc.92.0] LSCN: [0x0.2329de] HSCN: [0x0.2329e0] HSUB: [1]
buffer tsn: 0 rdba: 0x00415f81 (1/89985)
scn: 0x0000.002329e0 seq: 0x01 flg: 0x06 tail: 0x29e00601
frmt: 0x02 chkval: 0x0a60 type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x00000000B0154000 to 0x00000000B0156000
0B0154000 0000A206 00415F81 002329E0 06010000  [....._A..)#.....]
0B0154010 00000A60 00000001 00012CB1 002329DD  [`........,...)#.]
0B0154020 00000000 00030002 00000000 001E0002  [................]
0B0154030 00000348 00C0312E 0032008E 00002002  [H....1....2.. ..]
0B0154040 002329E0 00000000 00000000 00000000  [.)#.............]
0B0154050 00000000 00000000 00000000 00020100  [................]
0B0154060 0016FFFF 1F781F94 00001F78 1F9A0002  [......x.x.......]
0B0154070 00001F94 00000000 00000000 00000000  [................]
0B0154080 00000000 00000000 00000000 00000000  [................]
Repeat 497 times
0B0155FA0 5F840000 C1020041 41000645 41004C50  [..._A...E..APL.A]
0B0155FB0 00415F83 0432C102 275F4100 00000000  [._A...2..A_'....]
0B0155FC0 00000000 00000000 00000000 00000000  [................]
Repeat 2 times
0B0155FF0 0201012C 012C02C1 02C10201 29E00601  [,.....,........)]
Block header dump:  0x00415f81
Object id on Block? Y
seg/obj: 0x12cb1  csc: 0x00.2329dd  itc: 2  flg: O  typ: 1 - DATA
fsl: 0  fnx: 0x0 ver: 0x01
Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0002.01e.00000348  0x00c0312e.008e.32  --U-    2  fsc 0x0000.002329e0
0x02   0x0000.000.00000000  0x00000000.0000.00  ----    0  fsc 0x0000.00000000
bdba: 0x00415f81
data_block_dump,data header at 0xb015405c
===============
tsiz: 0x1fa0
hsiz: 0x16
pbl: 0xb015405c
76543210
flag=--------
ntab=1
nrow=2
frre=-1
fsbo=0x16
fseo=0x1f94
avsp=0x1f78
tosp=0x1f78
0xe:pti[0]      nrow=2  offs=0
0x12:pri[0]     offs=0x1f9a
0x14:pri[1]     offs=0x1f94
block_row_dump:
tab 0, row 0, @0x1f9a
tl: 6 fb: --H-FL-- lb: 0x1  cc: 1
col  0: [ 2]  c1 02
tab 0, row 1, @0x1f94
tl: 6 fb: --H-FL-- lb: 0x1  cc: 1
col  0: [ 2]  c1 02
end_of_block_dump
0B0155FF0 0201012C 012C02C1 02C10201 29E00601  [,.....,........)]
0B0155FF4  012C02C1  这里的 02C1即 C102=> 十进制的 "1"
0B0155FF8  02C10201  同样的 02C1即 C102=> 十进制的 "1"
修改 02C1为 04C1 即将 1修改为 3 
oradebug setmypid
oradebug poke 0x0B0155FF4 4 0x012C04C1
SQL> select t1,dump(t1,16) from maclean_corruption;
T1 DUMP(T1,16)
---------- ------------------------------
1 Typ=2 Len=2: c1,2
3 Typ=2 Len=2: c1,4
SQL> delete maclean_corruption where t1=3;
1 row deleted.
SQL> commit;
Commit complete.
SQL> alter system flush buffer_cache;
System altered.
SQL> select  * from maclean_corruption;
T1
----------
1
SQL> ALTER SESSION SET EVENTS 'immediate trace name buffers level 3';
Session altered.
0B004BFF0 0201023C 002C04C1 02C10201 2B350601  
oradebug poke 0x0B004BFF8 4 0x50C10201;
BEFORE: [0B004BFF8, 0B004BFFC) = FFC10201
AFTER:  [0B004BFF8, 0B004BFFC) = 50C10201
SQL> 
SQL> 
SQL> select  * from maclean_corruption;
T1
----------
79
SQL> update maclean_corruption set t1=t1+1 where t1=80;
0 rows updated.
SQL> update maclean_corruption set t1=t1+1 where t1=79;
1 row updated.
SQL> select  * from maclean_corruption;
T1
----------
80
SQL> 
SQL>  oradebug peek 0x0B004BFF8 4
[0B004BFF8, 0B004BFFC) = 50C10201
BH (0xafff92d8) file#: 1 rdba: 0x00415f81 (1/89985) class: 1 ba: 0xaff4a000
set: 6 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,0
dbwrid: 0 obj: 76977 objn: 76977 tsn: 0 afn: 1 hint: f
hash: [0xb0f6ab88,0xce69cd58] lru: [0xafff94f0,0xcd1371e8]
obj-flags: object_ckpt_list
ckptq: [0xcd1554b8,0xcd1554b8] fileq: [0xcd1554d8,0xcd1554d8] objq: [0xc80e2bc0,0xc80e2bc0] objaq: [0xc80e2ba0,0xc80e2ba0]
st: XCURRENT md: NULL fpin: 'kdswh11: kdst_fetch' tch: 5
flags: buffer_dirty redo_since_read
LRBA: [0xc.1d7.0] LSCN: [0x0.232b9d] HSCN: [0x0.232bb7] HSUB: [1]
buffer tsn: 0 rdba: 0x00415f81 (1/89985)
scn: 0x0000.00232bb7 seq: 0x01 flg: 0x06 tail: 0x2bb70601
frmt: 0x02 chkval: 0x5e6a type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x00000000AFF4A000 to 0x00000000AFF4C000
0AFF4A000 0000A206 00415F81 00232BB7 06010000  [....._A..+#.....]
0AFF4A010 00005E6A 00000001 00012CB1 00232B33  [j^.......,..3+#.]
0AFF4A020 00000000 00030002 00000000 000D0003  [................]
0AFF4A030 00000346 00C03094 0021008F 00002001  [F....0....!.. ..]
0AFF4A040 00232BB7 0004000A 00000290 00C00AF0  [.+#.............]
0AFF4A050 0028007B 00072001 00232B35 00020100  [{.(.. ..5+#.....]
0AFF4A060 0016FFFF 1F781F94 00001F81 1F9A0002  [......x.........]
0AFF4A070 00001F94 00000000 00000000 00000000  [................]
0AFF4A080 00000000 00000000 00000000 00000000  [................]
Repeat 497 times
0AFF4BFA0 5F840000 C1020041 41000645 41004C50  [..._A...E..APL.A]
0AFF4BFB0 00415F83 0432C102 275F4100 00000000  [._A...2..A_'....]
0AFF4BFC0 00000000 00000000 00000000 00000000  [................]
Repeat 2 times
0AFF4BFF0 0201023C 012C04C1 51C10201 2BB70601  [
Block header dump:  0x00415f81
Object id on Block? Y
seg/obj: 0x12cb1  csc: 0x00.232b33  itc: 2  flg: O  typ: 1 - DATA
fsl: 0  fnx: 0x0 ver: 0x01
Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0003.00d.00000346  0x00c03094.008f.21  --U-    1  fsc 0x0000.00232bb7
0x02   0x000a.004.00000290  0x00c00af0.007b.28  --U-    1  fsc 0x0007.00232b35
bdba: 0x00415f81
data_block_dump,data header at 0xaff4a05c
===============
tsiz: 0x1fa0
hsiz: 0x16
pbl: 0xaff4a05c
76543210
flag=--------
ntab=1
nrow=2
frre=-1
fsbo=0x16
fseo=0x1f94
avsp=0x1f78
tosp=0x1f81
0xe:pti[0]      nrow=2  offs=0
0x12:pri[0]     offs=0x1f9a
0x14:pri[1]     offs=0x1f94
block_row_dump:
tab 0, row 0, @0x1f9a
tl: 6 fb: --H-FL-- lb: 0x1  cc: 1
col  0: [ 2]  c1 51
tab 0, row 1, @0x1f94
tl: 2 fb: --HDFL-- lb: 0x2
end_of_block_dump
0AFF4BFF0 0201023C 012C04C1 51C10201 2BB70601  
 oradebug poke 0x0AFF4BFF8 4 0x59C10201; ==> 修改 51=》59 记修改T1=80 为88
select * from maclean_corruption;
T1
----------
88
SQL> delete maclean_corruption where t1=88;
1 row deleted.
SQL> commit;
Commit complete.
SQL> alter system flush buffer_cache;
System altered.
SQL> /
System altered.
SQL> alter system archive log current;
System altered.
SQL> /
System altered.
SQL> select * from maclean_corruption;
no rows selected
SQL> 
SQL> analyze table maclean_corruption validate structure;
Table analyzed.

 

 

以上通过oradebug poke命令反复修改数据库块内的T1字段,模拟数据块在内存中出现内存讹误(block corruption in memory),发现DB_BLOCK_CHECKSUM=FULL对该种简单的内存讹误却无法实际检测到(这不代表DB_BLOCK_CHECKSUM=TYPICAL无法检查到检测由底层磁盘、存储子系统、IO子系统引起的坏块,DB_BLOCK_CHECKSUM=TYPICAL仍是检测物理坏块的最佳默认推荐配置),通过update/delete修改对应手工讹误的内存块,均可以正常工作并生成redo且flush脏块(dirty buffer)到磁盘上,则会将该内存讹误的情况持久化,且即便在db_block_checking=FULL的情况下也无法检测到一丁点逻辑讹误,这说明不管是db_block_checksum还是db_block_checking都对内存中数据块的细微讹误无法有效检测,虽然这不代表checksum+checking无法检测到更破坏块结构的内存讹误,但多少还是有些悲哀的。

 

DML UPDATE/DELETE与CR一致性读的秘密

这个问题源于OTN中文论坛的一个帖子<大事务中的更新丢失问题>

环境为Oracle 10.2.0.4 on Linux x64
有一个大表,百万级,col1字段全为0
t1 事务A启动,把所有记录col2全更新为1
t2 事务B启动,根据主键,把一条记录更新为2,然后commit
t3 事务A执行完成,并COMMIT
t4 查询此表,发现col1全部为1,事务B的更新丢失了。
这是为什么呢,其中逻辑是怎样的
谢谢!

 

对于这个问题我想说明的是对于事务transaction 而言Oracle同样提供读一致性,称为transaction-level read consistency:

The database can also provide read consistency to all queries in a transaction, known as transaction-level read consistency. In this case, each statement in a transaction sees data from the same point in time, which is the time at which the transaction began.

Oracle Database Concepts
11g Release 2 (11.2)
Part Number E10713-04

 

为了证明和演示该事务级别的读一致性,我设计了以下演示:

有一张2个列的大表(T1,T2), 其中分别有 T1=600000,T2=’U2′ 和  T1=900000和 T2=’U1’的 2行数据,T1为主键。

在A)时刻,Session A使用SELECT .. FOR UPDATE锁住T1=600000的这一行

在之后的B)时刻,Session B尝试update TAB set t2=’U3′ where t2=’U2′ 即更新所有T2=’U2’的数据行,但是因为 T1=600000,T2=’U2’这一行正好被Session A锁住了,所以Session B会一直等待’enq: TX – row lock contention’;T1=900000和 T2=’U1’的数据行位于Session B处理 T1=600000,T2=’U2’行等待之后才能处理到的数据块中。

在之后的C)时刻,Session C更新update TAB set t2=’U2′ where t1=900000;并commit, 即将T1=900000和 T2=’U1’更新为 T1=900000和 T2=’U2’,这样就符合Session B 更新Update的条件t2=’U2’了。

在D)时刻, Session A执行commit释放锁,Session B得以继续工作,当他处理到T1=900000的记录时存在以下分歧:

 

1)若update DML满足transaction-level read consistency,则它应当看到的是session B事务开始环境SCN(env SCN)时的块的前镜像,即虽然session C更新了t2=’U2’满足其条件,但是为了一致性,session B仍需要对该行所在数据块做APPLY UNDO,直到满足该session B事务开始时间点的Best CR块,而CR一致镜像中t2=’U1’,不满足Session B的更新条件, 那么session B在整个事务中仅更新一行数据 T1=600000,T2=’U2’,session B only Update One Rows。

2) 若update DML不满足transaction-level read consistency,则session B看到的是当前read commited的镜像,即是Session C已更新并提交的块镜像,此时的记录为T1=900000和 T2=’U2’符合session B的更新条件,则session B要更新2行数据。

 

我们来看一下实际实验的结果:

构建实验环境:

 

 

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
create table update_cr tablespace users as select rownum t1, 'MACLEAN     ' T2 
from dual connect by level  update update_cr set t2='U2' where t1=600000;
1 row updated.
SQL> update update_cr set t2='U1' where t1=900000;
1 row updated.
SQL> commit;
Commit complete.
SQL> select * from update_cr where t1=600000 or t1=900000;
T1 T2
---------- ------------
600000 U2
900000 U1

 

 

在A)时刻,Session A使用SELECT .. FOR UPDATE锁住T1=600000的这一行:

 

session A:
SQL> select * from update_cr where t1=600000 for update;
T1 T2
---------- ------------
600000 U2

 

在之后的B)时刻,Session B尝试update TAB set t2=’U3′ where t2=’U2′ 即更新所有T2=’U2’的数据行,但是因为 T1=600000,T2=’U2’这一行正好被Session A锁住了,所以Session B会一直等待’enq: TX – row lock contention’;T1=900000和 T2=’U1’的数据行位于Session B处理 T1=600000,T2=’U2’行等待之后才能处理到的数据块中。

 

 

session B:
SQL> alter system flush buffer_cache;
System altered.
SQL> /
System altered.
SQL> alter session set events '10046 trace name context forever,level 8 : 10201 trace name context forever,level 10';
Session altered.
SQL> update update_cr set t2='U3' where t2='U2';

 

 

在之后的C)时刻,Session C更新update TAB set t2=’U2′ where t1=900000;并commit, 即将T1=900000和 T2=’U1’更新为 T1=900000和 T2=’U2’,这样就符合Session B 更新Update的条件t2=’U2’了。

session C:
SQL> select * from update_cr where t1=900000;
T1 T2
---------- ------------
900000 U1
SQL> update update_cr set t2='U2' where t1=900000;
1 row updated.
SQL> commit;
Commit complete.
SQL> alter system flush buffer_cache;
System altered.

 

 

之后session A 执行 commit;session B得以继续update,得到实验的结果:

 

session A 执行 commit;
session B;
SQL> update update_cr set t2='U3' where t2='U2';
1 row updated.

 

 

以看到以上实验的结果 update仅更新了一行数据,证明了观点1″若update DML满足transaction-level read consistency,则它应当看到的是session B事务开始环境SCN(env SCN)时的块的前镜像,即虽然session C更新了t2=’U2’满足其条件,但是为了一致性,session B仍需要对该行所在数据块做APPLY UNDO,直到满足该session B事务开始时间点的Best CR块,而CR一致镜像中t2=’U1’,不满足Session B的更新条件, 那么session B在整个事务中仅更新一行数据 T1=600000,T2=’U2’,session B only Update One Rows。”的正确性。

即update/delete之类的DML在Oracle中满足transaction-level read consistency,保证其所”看到的”是满足事务开始时间点读一致性的Consistent Read,这也是为什么DML会产生Consistent Read的原因之一。

我们回过头来看一下 上面Session B产生的10201 undo apply和10046 trace的内容,可以发现更多内容:

 

 

SQL> select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from update_cr where t1=600000 or t1=900000;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
2589                                    4
3558                                    4
WAIT #139924171154784: nam='db file sequential read' ela= 19504 file#=3 block#=128 blocks=1 obj#=0 tim=1258427129863987
Applying CR undo to block 4 : 1000a1d itl entry 03:
xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e
flg: ---- lkc: 1 fsc: 0x0000.00000000
WAIT #139924171154784: nam='db file sequential read' ela= 12957 file#=3 block#=12302 blocks=1 obj#=0 tim=1258427129877291
CRS upd rd env [0x7f42a284a704]: (scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00223e78 flg: 0x00000060) undo env [0x7fffe3fe7f60]: (
scn: 0x0000.00223eb9 xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e statement num=2 parent xid: 0x0000.000.00000000 st-scn: 0xe404.00000000 hi-scn: 0x2100.00000000 ma-scn: 0x0000.00000000 flg: 0x62515e33)
CRS upd (before): 0xa5fe9198 cr-scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 cl-scn: 0x0000.00223eba sfl: 0
CRS upd (after) : 0xa5fe9198 cr-scn: 0x0000.00223eb9 xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e cl-scn: 0x0000.00223eba sfl: 0
*** 2009-11-16 22:06:01.253
WAIT #139924171154784: nam='enq: TX - row lock contention' ela= 31358812 name|mode=1415053318 usn<<16 | slot=65552 sequence=631 obj#=76969 tim=1258427161253791
WAIT #139924171154784: nam='db file sequential read' ela= 36051 file#=4 block#=2589 blocks=1 obj#=76969 tim=1258427161290180
WAIT #139924171154784: nam='db file scattered read' ela= 14718 file#=4 block#=2590 blocks=98 obj#=76969 tim=1258427161305684
WAIT #139924171154784: nam='db file scattered read' ela= 26432 file#=4 block#=2690 blocks=126 obj#=76969 tim=1258427161338364
WAIT #139924171154784: nam='db file scattered read' ela= 38289 file#=4 block#=2818 blocks=126 obj#=76969 tim=1258427161384445
WAIT #139924171154784: nam='db file scattered read' ela= 24265 file#=4 block#=2946 blocks=126 obj#=76969 tim=1258427161416302
WAIT #139924171154784: nam='db file scattered read' ela= 21288 file#=4 block#=3074 blocks=126 obj#=76969 tim=1258427161444684
WAIT #139924171154784: nam='db file scattered read' ela= 23840 file#=4 block#=3202 blocks=126 obj#=76969 tim=1258427161476063
WAIT #139924171154784: nam='db file scattered read' ela= 27439 file#=4 block#=3330 blocks=126 obj#=76969 tim=1258427161511429
WAIT #139924171154784: nam='db file scattered read' ela= 24424 file#=4 block#=3458 blocks=126 obj#=76969 tim=1258427161543429
Applying CR undo to block 4 : 1000de6 itl entry 03:
xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34
flg: --U- lkc: 1 fsc: 0x0000.00223ef9
WAIT #139924171154784: nam='db file sequential read' ela= 8033 file#=3 block#=12434 blocks=1 obj#=0 tim=1258427161557534
CRS upd rd env [0x7f42a284a704]: (scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00223e78 flg: 0x00000060) undo env [0x7fffe3fe7f60]: (
WAIT #139924171154784: nam='db file sequential read' ela= 8033 file#=3 block#=12434 blocks=1 obj#=0 tim=1258427161557534
CRS upd rd env [0x7f42a284a704]: (scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00223e78 flg: 0x00000060) undo env [0x7fffe3fe7f60]: (
scn: 0x0000.00223ef8 xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34 statement num=8192 parent xid: 0x0000.000.0baf3fa0 st-scn: 0x3fa0.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00000000 flg: 0x00000000)
CRS upd (before): 0xa4fdaa08 cr-scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 cl-scn: 0x0000.00223eff sfl: 0
CRS upd (after) : 0xa4fdaa08 cr-scn: 0x0000.00223eb9 xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34 cl-scn: 0x0000.00223eff sfl: 0
WAIT #139924171154784: nam='db file scattered read' ela= 13029 file#=4 block#=3586 blocks=126 obj#=76969 tim=1258427161572511
WAIT #139924171154784: nam='db file scattered read' ela= 22282 file#=4 block#=3714 blocks=126 obj#=76969 tim=1258427161602324
WAIT #139924171154784: nam='db file sequential read' ela= 6127 file#=4 block#=522 blocks=1 obj#=76969 tim=1258427161615461
WAIT #139924171154784: nam='db file scattered read' ela= 13503 file#=4 block#=3842 blocks=42 obj#=76969 tim=1258427161629323

 

 

 

以上可以看到对T1=600000所在的数据块1000a1d=》datafile 4 259 apply了UNDO,其环境SCN 为scn: 0x0000.00223eb9:

Applying CR undo to block 4 : 1000a1d itl entry 03:
xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e
flg: —- lkc: 1 fsc: 0x0000.00000000
WAIT #139924171154784: nam=’db file sequential read’ ela= 12957 file#=3 block#=12302 blocks=1 obj#=0 tim=1258427129877291
CRS upd rd env [0x7f42a284a704]: (scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00223e78 flg: 0x00000060) undo env [0x7fffe3fe7f60]: (
scn: 0x0000.00223eb9 xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e statement num=2 parent xid: 0x0000.000.00000000 st-scn: 0xe404.00000000 hi-scn: 0x2100.00000000 ma-scn: 0x0000.00000000 flg: 0x62515e33)
CRS upd (before): 0xa5fe9198 cr-scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 cl-scn: 0x0000.00223eba sfl: 0
CRS upd (after) : 0xa5fe9198 cr-scn: 0x0000.00223eb9 xid: 0x0001.010.00000277 uba: 0x00c0300e.007c.1e cl-scn: 0x0000.00223eba sfl: 0

对于T1=900000的数据行所在块1000de6=》 datafile 4 3558 block同样apply了UNDO,其环境SCN均为00223eb9

Applying CR undo to block 4 : 1000de6 itl entry 03:
xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34
flg: –U- lkc: 1 fsc: 0x0000.00223ef9
WAIT #139924171154784: nam=’db file sequential read’ ela= 8033 file#=3 block#=12434 blocks=1 obj#=0 tim=1258427161557534
CRS upd rd env [0x7f42a284a704]: (scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 statement num=0 parent xid: 0x0000.000.00000000 st-scn: 0x0000.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00223e78 flg: 0x00000060) undo env [0x7fffe3fe7f60]: (
scn: 0x0000.00223ef8 xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34 statement num=8192 parent xid: 0x0000.000.0baf3fa0 st-scn: 0x3fa0.00000000 hi-scn: 0x0000.00000000 ma-scn: 0x0000.00000000 flg: 0x00000000)
CRS upd (before): 0xa4fdaa08 cr-scn: 0x0000.00223eb9 xid: 0x0000.000.00000000 uba: 0x00000000.0000.00 cl-scn: 0x0000.00223eff sfl: 0
CRS upd (after) : 0xa4fdaa08 cr-scn: 0x0000.00223eb9 xid: 0x0003.01f.00000345 uba: 0x00c03092.008f.34 cl-scn: 0x0000.00223eff sfl: 0

Buffer cache spillover: only buffers

在启动数据库 (指database open)时若需要recovery的redo较多,涉及到的数据块多时可能会看到以下信息:

> Buffer cache spillover: only 32768 of 117227 buffers available$

 

这是由于crash recovery的redo apply本身需要用到buffer cache,此时buffer cache被称作recovery buffer。 一旦分配了recovery buffer,其将不会被释放或age out直到所有数据块的redo change都被应用并写入到磁盘。

 

由于只有一个实例可以参与recovery(不管是crash recovery还是instance recovery),恢复实例的buffer cache 大小将会是大规模恢复数据文件(recovery)的限制。 在锁声明阶段(lock claim phase),若实例耗尽了空余的buffer cache 则会spillover溢出到磁盘上(有点像swap)并必须予以处理。此时锁声明会暂时中止,oracle会先应用日志将redo change应用到哪些已经声明过的buffer cache中。直到所有这些recovery buffer都被恢复并被写出,此时它们才变得对下一次后续的锁声明可用。 由于以上说明的磁盘溢出恢复(spillover recovery)持续多次锁声明和日志应用阶段(理想的是一次锁声明,一次日志应用即完成redo apply),直到本次完整的恢复完成。

需要注意的是,以上crash recovery的算法在 buffer cache很小的情况下性能较差; 常见的例子是这样的, 在产品数据库中由于有着较大buffer cache而不会遇到该问题,而在某些基于存储或卷管理软件实现的复制、测试环境中,由于需要恢复的数据集合较大而且往往db_cache_size比产品库小很多,所以alter database open时往往看到该(Buffer cache spillover: only 32768 of 117227 buffers available$),且打开数据库耗费比产品库更多的时间。

 

我们一般不用担心buffer cache spillover,因为即便速度缓慢在10分钟内数据库往往还是能够打开的,但是如果你监控着alert.log 30分钟都没有任何新日志,那么可能是spillover导致的hang,如果遇到该问题请去OTN Ask Maclean

 

Crash/Instance Recovery与Media Recovery的本质区别

Crash/Instance Recovery与Media Recovery的本质区别在于:

Crash/Instance Recovery针对需要恢复的实例从增量检查点(incremental checkpoint)开始apply redo应用重做日志。由于日志覆盖的先提条件是完成相关日志的logfile switch checkpoint,且从定义上说归档日志总是落后于实例的检查点,所以对于crash/instance recovery崩溃或实例恢复总是只需要访问读取在线的重做日志(online redo logfile)。

 

介质恢复Media Recovery从旧数据文件的检查点开始apply redo引用重做日志,这些旧的数据文件可能来源于备份。 介质恢复情况下需要用到归档重做日志,因此RMAN或DBA(用户管理的备份)也需要将备份相关的归档日志还原出来。

 

Crash/Instance Recovery总是保证仅当所有的持久重做数据被应用之后才算恢复完成。 在Oracle保证所有已提交的事务都已经被包含恢复的情况下,才认为崩溃实例完成了恢复工作。

 

相反,介质恢复有不完全恢复(incomplete recovery)和部分恢复(partial recovery)的提法,以实现恢复数据库(db)到某个时间点的一致性。

 

 

Crash/Instance Recovery与Media Recovery的相同点在于:

 

Crash/Instance Recovery与Media Recovery都是传统的前滚恢复方式(rolling forward),原理上都是对持久redo log数据的重演。 不管是Crash/Instance Recovery还是Media Recovery的前滚,都需要继之以事务回滚以便回滚未提交的事务,虽然前滚完成后数据库即可以打开而不用等回滚完成,但是仅在回滚完成的时候我们认为数据库是真正一致的。

 

扩展阅读:

了解你所不知道的SMON功能(五):Recover Dead transaction
深入了解Oracle前滚恢复rolling forward(一)
了解你所不知道的SMON功能(六):Instance Recovery

_CORRUPTED_ROLLBACK_SEGMENTS隐藏参数

_CORRUPTED_ROLLBACK_SEGMENTS(corrupted undo segment list)隐藏参数所独有的功能:

  • 在实例启动startup并open database的阶段_CORRUPTED_ROLLBACK_SEGMENTS所列出的undo segments(撤销段/回滚段)将不会被访问读取
  • 所有指向这些被_CORRUPTED_ROLLBACK_SEGMENTS列出的undo segments的事务都被认为已经提交了commit,和这个undo segments已经被drop时类似
  • 衷心地建议用_CORRUPTED_ROLLBACK_SEGMENTS这个参数打开数据库后导出数据并重建数据库,这个参数使用的后遗症可能很顽固
  • Oracle公司内部有叫做TXChecker的工具可以检查问题事务

 

如果自己搞不定可以找诗檀软件专业ORACLE数据库修复团队成员帮您恢复!

诗檀软件专业数据库修复团队

服务热线 : 13764045638   QQ号:47079569    邮箱:service@parnassusdata.com

 

_offline_rollback_segments 和 _corrupted_rollback_segments 均会造成的实例行为变化:

  • 以上2个参数所列出的Undo Segments(撤销段/回滚段)将不会被在线使用online
  • 在UNDO$数据字典基表中将体现为OFFLINE的记录
  • 在实例instance的生命周期中将不会再给新的事务分配使用
  • 参数所列出的Undo Segments列表上的活跃事务active transaction将即不被回滚亦不被标记为dead以便SMON去回滚(了解你所不知道的SMON功能(五):Recover Dead transaction)

_OFFLINE_ROLLBACK_SEGMENTS隐藏参数

_OFFLINE_ROLLBACK_SEGMENTS(offline undo segment list)隐藏参数(hidden parameter)的独有作用:

  • 在实例startup启动并open database的阶段仍将读取_OFFLINE_ROLLBACK_SEGMENTS所列出的Undo segments(撤销段/回滚段),若访问这些undo segments出现了问题则将在alert.log和其他TRACE中体现出来,但不影响实际的startup进程
    • 若查询数据块发现活跃的事务,并ITL指向对应的undo segments则:
    • 若读取undo segments的transaction table事务表发现事务已提交则做数据块的清除
    • 若读取发现事务仍活动未commit,则生成一个CR块拷贝
    • 若读取该undo segments存在问题(可能是corrupted讹误,可能是missed丢失)则产生一个错误并写出到alert.log,查询将异常终止
  • 若DML更新相关的数据块会导致服务进程为了恢复活跃事务而进入死循环消耗大量CPU,解决方法是通过可以进行的查询工作重建相关表

 

 

_offline_rollback_segments 和 _corrupted_rollback_segments 均会造成的实例行为变化:

  • 以上2个参数所列出的Undo Segments(撤销段/回滚段)将不会被在线使用online
  • 在UNDO$数据字典基表中将体现为OFFLINE的记录
  • 在实例instance的生命周期中将不会再给新的事务分配使用
  • 参数所列出的Undo Segments列表上的活跃事务active transaction将即不被回滚亦不被标记为dead以便SMON去回滚(了解你所不知道的SMON功能(五):Recover Dead transaction)

在Oracle中如何让SELECT查询绕过UNDO

是否有想过如何在Oracle中实现脏读(dirty read),在Oracle官方文档或者Asktom的时候显然会提到Oracle是不实现脏读的, 总是有undo来提供数据块的前镜像(before image)以维护一致性Consistent, 通过正常途径我们几乎不可能破坏Oracle中查询的一致性来实现脏读。

 

如果自己搞不定可以找诗檀软件专业ORACLE数据库修复团队成员帮您恢复!

诗檀软件专业数据库修复团队

服务热线 : 13764045638   QQ号:47079569    邮箱:service@parnassusdata.com

 

是否真的无计可施? 非也,非也,Oracle作为一个高度复杂而又可控的RDBMS,仍是有很多空子可以钻的。

 

我们先来介绍以下的2个隐藏参数:

_offline_rollback_segments or _corrupted_rollback_segments 这2个隐藏参数对于熟悉Oracle数据库异常恢复或者解决ORA-600[4XXX]问题有经验的同学来说肯定不陌生,因为这2个参数是针对Undo存在Corruption讹误时忽略问题的有力工具,而大家对这2个参数实际起到的作用有多少认识呢?

 

我们可能在以下几种场景中用到_offline_rollback_segments 和 _corrupted_rollback_segments 这2个隐藏参数:

  • 强制打开数据库(FORCE OPEN DATABASE)
  • 控制一致性读和延迟块清除(consistent read & delayed block cleanout)
  • 强制删除某个rollback segment回滚段

 

但是请注意:千万不要在没有Oracle专家建议的情况下,在产品环境设置以上这2个隐藏参数,可能造成数据逻辑讹误等影响!!

 

_offline_rollback_segments 和 _corrupted_rollback_segments 均会造成的实例行为变化:

  • 以上2个参数所列出的Undo Segments(撤销段/回滚段)将不会被在线使用online
  • 在UNDO$数据字典基表中将体现为OFFLINE的记录
  • 在实例instance的生命周期中将不会再给新的事务分配使用
  • 参数所列出的Undo Segments列表上的活跃事务active transaction将即不被回滚亦不被标记为dead以便SMON去回滚(了解你所不知道的SMON功能(五):Recover Dead transaction)

 

_OFFLINE_ROLLBACK_SEGMENTS(offline undo segment list)隐藏参数(hidden parameter)的独有作用:

  • 在实例startup启动并open database的阶段仍将读取_OFFLINE_ROLLBACK_SEGMENTS所列出的Undo segments(撤销段/回滚段),若访问这些undo segments出现了问题则将在alert.log和其他TRACE中体现出来,但不影响实际的startup进程
    • 若查询数据块发现活跃的事务,并ITL指向对应的undo segments则:
    • 若读取undo segments的transaction table事务表发现事务已提交则做数据块的清除
    • 若读取发现事务仍活动未commit,则生成一个CR块拷贝
    • 若读取该undo segments存在问题(可能是corrupted讹误,可能是missed丢失)则产生一个错误并写出到alert.log,查询将异常终止
  • 若DML更新相关的数据块会导致服务进程为了恢复活跃事务而进入死循环消耗大量CPU,解决方法是通过可以进行的查询工作重建相关表

 

_CORRUPTED_ROLLBACK_SEGMENTS(corrupted undo segment list)隐藏参数所独有的功能:

  • 在实例启动startup并open database的阶段_CORRUPTED_ROLLBACK_SEGMENTS所列出的undo segments(撤销段/回滚段)将不会被访问读取
  • 所有指向这些被_CORRUPTED_ROLLBACK_SEGMENTS列出的undo segments的事务都被认为已经提交了commit,和这个undo segments已经被drop时类似
  • 衷心地建议用_CORRUPTED_ROLLBACK_SEGMENTS这个参数打开数据库后导出数据并重建数据库,这个参数使用的后遗症可能很顽固
  • Oracle公司内部有叫做TXChecker的工具可以检查问题事务

 

好了了解了以上2个参数之后,你将不难明白下面演示如何通过_CORRUPTED_ROLLBACK_SEGMENTS隐藏参数让SELECT查询绕过UNDO实现脏读的过程:

 

SQL> alter system set event= '10513 trace name context forever, level 2' scope=spfile;
System altered.
SQL> alter system set "_in_memory_undo"=false scope=spfile;
System altered.
10513 level 2 event可以禁止SMON 回滚rollback 死事务 dead transaction
_in_memory_undo 禁用 in memory undo 特性
SQL> startup force;
ORACLE instance started.
Total System Global Area 3140026368 bytes
Fixed Size                  2232472 bytes
Variable Size            1795166056 bytes
Database Buffers         1325400064 bytes
Redo Buffers               17227776 bytes
Database mounted.
Database opened.
session A:
SQL> conn maclean/maclean
Connected.
SQL> create table maclean tablespace users as select 1 t1 from dual connect by level exec dbms_stats.gather_table_stats('','MACLEAN');
PL/SQL procedure successfully completed.
SQL> set autotrace on;   
SQL> select sum(t1) from maclean;
SUM(T1)
----------
501
Execution Plan
----------------------------------------------------------
Plan hash value: 1679547536
------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |         |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| MACLEAN |   501 |  1503 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
3  consistent gets
0  physical reads
0  redo size
515  bytes sent via SQL*Net to client
492  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processe
在没有活跃事务的情况下,直接读取current block,
全表扫描一致性读,consistent gets只要3次                  
SQL> update maclean set t1=0;
501 rows updated.
SQL> alter system checkpoint;
System altered.
这里session A不commit;
另开一个 session:
SQL> conn maclean/maclean
Connected.
SQL> 
SQL> set autotrace on;
SQL>  select sum(t1) from maclean;
SUM(T1)
----------
501
Execution Plan
----------------------------------------------------------
Plan hash value: 1679547536
------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |         |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| MACLEAN |   501 |  1503 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
505  consistent gets
0  physical reads
108  redo size
515  bytes sent via SQL*Net to client
492  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed
为了一致性读 上面的查询需要通过undo构造CR块,这导致consistent gets上升到 505
[oracle@vrh8 ~]$ ps -ef|grep LOCAL=YES |grep -v grep
oracle    5841  5839  0 09:17 ?        00:00:00 oracleG10R25 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
[oracle@vrh8 ~]$ kill -9 5841
杀掉session A对应的Server Process服务进程,这导致dead transaction 但是不被smon回滚
select ktuxeusn,
to_char(sysdate, 'DD-MON-YYYY HH24:MI:SS') "Time",
ktuxesiz,
ktuxesta
from x$ktuxe
where ktuxecfl = 'DEAD';
KTUXEUSN Time                   KTUXESIZ KTUXESTA
---------- -------------------- ---------- ----------------
2 06-AUG-2012 09:20:45          7 ACTIVE
此时有1个active rollback segment 
SQL> conn maclean/maclean
Connected.
SQL> set autotrace on;
SQL> select sum(t1) from maclean;
SUM(T1)
----------
501
Execution Plan
----------------------------------------------------------
Plan hash value: 1679547536
------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |         |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| MACLEAN |   501 |  1503 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
411  consistent gets
0  physical reads
108  redo size
515  bytes sent via SQL*Net to client
492  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed
到上面为止 虽然通过kill进程 和禁止smon 回滚dead transaction ,
形成了一个不回滚的死事务 但是仍通过undo实现了一致性读
找出当前active的rollback segment的名字
SQL> select segment_name from dba_rollback_segs where segment_id=2;
SEGMENT_NAME
------------------------------
_SYSSMU2$
SQL> alter system set "_corrupted_rollback_segments"='_SYSSMU2$'  scope=spfile;
System altered.
用 _corrupted_rollback_segments 废掉 上面的2个rollback segment, 这将导致无法提供undo 
SQL> startup force;
ORACLE instance started.
Total System Global Area 3140026368 bytes
Fixed Size                  2232472 bytes
Variable Size            1795166056 bytes
Database Buffers         1325400064 bytes
Redo Buffers               17227776 bytes
Database mounted.
Database opened.
SQL> conn maclean/maclean
Connected.
SQL> set autotrace on;
SQL> select sum(t1) from maclean;
SUM(T1)
----------
94
Execution Plan
----------------------------------------------------------
Plan hash value: 1679547536
------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |         |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| MACLEAN |   501 |  1503 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
228  recursive calls
0  db block gets
29  consistent gets
5  physical reads
116  redo size
514  bytes sent via SQL*Net to client
492  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
4  sorts (memory)
0  sorts (disk)
1  rows processed
SQL> /
SUM(T1)
----------
94
Execution Plan
----------------------------------------------------------
Plan hash value: 1679547536
------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |         |     1 |     3 |            |          |
|   2 |   TABLE ACCESS FULL| MACLEAN |   501 |  1503 |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
3  consistent gets
0  physical reads
0  redo size
514  bytes sent via SQL*Net to client
492  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed

 

 

以上可以看到 consistent gets下降到3,服务进程读取数据块发现存在活跃事务,但是ITL指向的UNDO SEGMENTS在_corrupted_rollback_segments的列表中,所以直接认为该事务已经COMMIT提交,以便绕过UNDO。

这里实现了脏读,虽然通过上述方法去实现脏读在产品环境中没有实际收益(有部分数据库软件允许脏读来做统计信息收集),破坏了一致性且需要设置需重启实例的 隐藏参数 , 仅供参考。

再议OPEN CURSOR与BULK COLLECT

有同学在T.askmaclean.com上发帖关于bulk collect与open cursor的问题, 帖子的地址在这里。  他的疑问在于:

 

这么说来 OPEN_CURSOR 负责解析SQL语句 和生成执行计划.

会不会去执行 执行计划?

是不是在第一次提取的时候才会执行 执行计划?

test_soruce

create table zengfankun_temp01 as select * from dba_objects;
select count(*) from zengfankun_temp01;–12,6826
analyze table zengfankun_temp01 compute statistics;

create or replace procedure test_open_cursor is
type type_owner is table of zengfankun_temp01.owner%type index by binary_integer;
type type_object_name is table of zengfankun_temp01.object_name%type index by binary_integer;
type type_object_id is table of zengfankun_temp01.object_id%type index by binary_integer;
type type_object_type is table of zengfankun_temp01.object_type%type index by binary_integer;
type type_last_ddl_time is table of zengfankun_temp01.last_ddl_time%type index by binary_integer;

l_ary_owner type_owner;
l_ary_object_name type_object_name;
l_ary_object_id type_object_id;
l_ary_object_type type_object_type;
l_ary_last_ddl_time type_last_ddl_time;

cursor cur_object is
select owner,object_name,object_id,object_type,last_ddl_time
from zengfankun_temp01
order by owner,object_name,object_type,last_ddl_time;
OPEN_START number;
OPEN_END number;
FETCH_START number;
FETCH_END number;
begin
DBMS_OUTPUT.ENABLE (buffer_size=>null) ;
OPEN_START:=dbms_utility.get_time();
open cur_object;
OPEN_END :=dbms_utility.get_time();
dbms_output.put_line(‘OPEN_TIME:’||TO_CHAR(OPEN_END-OPEN_START));
loop
FETCH_START:=dbms_utility.get_time();
fetch cur_object bulk collect into
l_ary_owner,
l_ary_object_name,
l_ary_object_id,
l_ary_object_type,
l_ary_last_ddl_time
limit 10000;
FETCH_END:=dbms_utility.get_time();
dbms_output.put_line(‘FETCH_TIME:’||TO_CHAR(FETCH_END-FETCH_START)||’ ROWCOUNT:’||cur_object%rowCount);

exit when cur_object%notfound or cur_object%notfound is null;
end loop;
end test_open_cursor;

OPEN_TIME:12
FETCH_TIME:21 ROWCOUNT:10000
FETCH_TIME:3 ROWCOUNT:20000
FETCH_TIME:3 ROWCOUNT:30000
FETCH_TIME:3 ROWCOUNT:40000
FETCH_TIME:3 ROWCOUNT:50000
FETCH_TIME:3 ROWCOUNT:60000
FETCH_TIME:3 ROWCOUNT:70000
FETCH_TIME:3 ROWCOUNT:80000
FETCH_TIME:3 ROWCOUNT:90000
FETCH_TIME:3 ROWCOUNT:100000
FETCH_TIME:3 ROWCOUNT:110000
FETCH_TIME:3 ROWCOUNT:120000
FETCH_TIME:1 ROWCOUNT:126826

重复执行

OPEN_TIME:0
FETCH_TIME:18 ROWCOUNT:10000
FETCH_TIME:3 ROWCOUNT:20000
FETCH_TIME:3 ROWCOUNT:30000
FETCH_TIME:3 ROWCOUNT:40000
FETCH_TIME:3 ROWCOUNT:50000
FETCH_TIME:3 ROWCOUNT:60000
FETCH_TIME:3 ROWCOUNT:70000
FETCH_TIME:3 ROWCOUNT:80000
FETCH_TIME:3 ROWCOUNT:90000
FETCH_TIME:3 ROWCOUNT:100000
FETCH_TIME:3 ROWCOUNT:110000
FETCH_TIME:3 ROWCOUNT:120000
FETCH_TIME:2 ROWCOUNT:126826

SQL已经处于软件解析了, 块大部分缓存到了内存当中.
因此OPEN CURSOR 时间接近0
第一次提取时间也降低了3个点.
而第N次 时间并没改变.

怎么说了 第N次 的时间只有提取到数组 所花费的时间.

很显然必有在某个地方存放结果集 游标指针指向该结果集. 否则如何知道要提取10000条呢?

 

 

实际我没有完全理解这位同学想表达的完全clear的观点, 我想着重和重复说明的是:

 

当OPEN CURSOR 操作发生时, PL/SQL引擎转到SQL引擎负责PARSE SQL语句获得执行计划, 同时它会记录OPEN CURSOR这一刻的SNAPSHOT SCN 快照SCN, 但是Oracle并不会实际FETCH相关的数据,也不会将这些数据复制到某个地方。

直到实际FETCH 数据时才会去访问实际的数据块,这些块一般都是Current Block, The most recent version of block , 这样的块的SCN >> Snapshot scn, 需要通过UNDO数据构建 出一个SCN 合适的Best Block ,以满足Read Consistentcy;如果此时 存在的UNDO SNAPSHOT不足以构造出这样一个很久之前的Best Block的话,那么就可能出现ORA-1555错误。

 

为了证明我的观点, 我会创建一个环境测试,这个环境会利用一张小表但是有这char(2000)这样的列, 这导致一条记录将占用一个数据块,我会使用bulk collect fetch一次fetch 10 条记录,如果实验理想那么OPEN CURSOR时将只完成PARSE解析SQL和开始执行的操作, 之后当每需要完成一次fetch bulk collect一次都需要去逻辑读取10个数据块,通过”_trace_pin_time”可以捕获Server Process去pin CR block的行为,换句话说可以看到一次Fetch Bulk Collect limit 10触发10个buffer被pin。

 

 

[oracle@nas ~]$ sqlplus  / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Aug 1 11:36:52 2012
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select * from global_name;
GLOBAL_NAME
--------------------------------------------------------------------------------
了解 Maclean Liu 刘相兵 @诗檀软件
SQL> create table maclean (t1 char(2000)) tablespace users pctfree 99; Table created. SQL> begin 2 for i in 1..200 loop 3 insert into maclean values('MACLEAN'); 4 commit ; 5 end loop; 6 end; 7 / PL/SQL procedure successfully completed. SQL> exec dbms_stats.gather_table_stats('','MACLEAN'); PL/SQL procedure successfully completed. SQL> select count(*) from maclean; COUNT(*) ---------- 200 SQL> select blocks,num_rows from dba_tables where table_name='MACLEAN'; BLOCKS NUM_ROWS ---------- ---------- 244 200 SQL> alter system set "_trace_pin_time"=1 scope=spfile; System altered. SQL> startup force; ORACLE instance started. Total System Global Area 3140026368 bytes Fixed Size 2232472 bytes Variable Size 1795166056 bytes Database Buffers 1325400064 bytes Redo Buffers 17227776 bytes Database mounted. Database opened. SQL> alter session set events '10046 trace name context forever,level 12'; Session altered. SQL> SQL> SQL> declare 2 cursor v_cursor is 3 select * from sys.maclean; 4 type v_type is table of sys.maclean%rowtype index by binary_integer; 5 rec_tab v_type; 6 begin 7 open v_cursor; 8 dbms_lock.sleep(30); 9 loop 10 fetch v_cursor bulk collect 11 into rec_tab limit 10; 12 dbms_lock.sleep(10); 13 exit when v_cursor%notfound; 14 end loop; 15 end; 16 / 看一下它的10046 trace+ pin trace: PARSING IN CURSOR #47499559136872 len=337 dep=0 uid=0 oct=47 lid=0 tim=1343836146412056 hv=496860239 ad='11a11dbb0' sqlid='4zh7954ftuz2g' declare cursor v_cursor is select * from sys.maclean; type v_type is table of sys.maclean%rowtype index by binary_integer; rec_tab v_type; begin open v_cursor; dbms_lock.sleep(30); loop fetch v_cursor bulk collect into rec_tab limit 10; dbms_lock.sleep(10); exit when v_cursor%notfound; end loop; end; END OF STMT PARSE #47499559136872:c=0,e=346,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=0,tim=1343836146412051 ===================== PARSING IN CURSOR #47499559126280 len=25 dep=1 uid=0 oct=3 lid=0 tim=1343836146414939 hv=3296884535 ad='11a11d250' sqlid='2mb1493284xtr' SELECT * FROM SYS.MACLEAN END OF STMT PARSE #47499559126280:c=1999,e=2427,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=2568761675,tim=1343836146414937 EXEC #47499559126280:c=0,e=55,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,plh=2568761675,tim=1343836146415104 上面完成了 对 SELECT * FROM SYS.MACLEAN的 PARSE 并开始执行 , 但是没有FETCH任何记录也没有pin 逻辑读任何数据块, 这说明了OPEN CURSOR操作的本质 *** 2012-08-01 11:49:36.424 WAIT #47499559136872: nam='PL/SQL lock timer' ela= 30009361 duration=0 p2=0 p3=0 obj#=-1 tim=1343836176424782 等待了30s pin ktewh26: kteinpscan dba 0x10a6202:4 time 1039048805 pin ktewh27: kteinmap dba 0x10a6202:4 time 1039048847 pin kdswh11: kdst_fetch dba 0x10a6203:1 time 1039048898 pin kdswh11: kdst_fetch dba 0x10a6204:1 time 1039048961 pin kdswh11: kdst_fetch dba 0x10a6205:1 time 1039049004 pin kdswh11: kdst_fetch dba 0x10a6206:1 time 1039049042 pin kdswh11: kdst_fetch dba 0x10a6207:1 time 1039049089 pin kdswh11: kdst_fetch dba 0x10a6208:1 time 1039049123 pin kdswh11: kdst_fetch dba 0x10a6209:1 time 1039049159 pin kdswh11: kdst_fetch dba 0x10a620a:1 time 1039049191 pin kdswh11: kdst_fetch dba 0x10a620b:1 time 1039049225 pin kdswh11: kdst_fetch dba 0x10a620c:1 time 1039049260 kdst_fetch是实际fetch块中记录的函数 , 这里fetch了10个块 完成一次实际的FETCH FETCH #47499559126280:c=0,e=536,p=0,cr=12,cu=0,mis=0,r=10,dep=1,og=1,plh=2568761675,tim=1343836176425542 *** 2012-08-01 11:49:46.428 WAIT #47499559136872: nam='PL/SQL lock timer' ela= 10002694 duration=0 p2=0 p3=0 obj#=-1 tim=134383618642829 再次休眠10s pin kdswh11: kdst_fetch dba 0x10a620d:1 time 1049052211 pin kdswh11: kdst_fetch dba 0x10a620e:1 time 1049052264 pin kdswh11: kdst_fetch dba 0x10a620f:1 time 1049052299 pin kdswh11: kdst_fetch dba 0x10a6211:1 time 1049052332 pin kdswh11: kdst_fetch dba 0x10a6212:1 time 1049052364 pin kdswh11: kdst_fetch dba 0x10a6213:1 time 1049052398 pin kdswh11: kdst_fetch dba 0x10a6214:1 time 1049052430 pin kdswh11: kdst_fetch dba 0x10a6215:1 time 1049052462 pin kdswh11: kdst_fetch dba 0x10a6216:1 time 1049052494 pin kdswh11: kdst_fetch dba 0x10a6217:1 time 1049052525 FETCH #47499559126280:c=0,e=371,p=0,cr=10,cu=0,mis=0,r=10,dep=1,og=1,plh=2568761675,tim=1343836186428807 接着pin 10个数据块, 并实际fetch 一次 WAIT #47499559136872: nam='PL/SQL lock timer' ela= 10002864 duration=0 p2=0 p3=0 obj#=-1 tim=1343836196431754 pin kdswh11: kdst_fetch dba 0x10a6218:1 time 1059055662 pin kdswh11: kdst_fetch dba 0x10a6219:1 time 1059055714 pin kdswh11: kdst_fetch dba 0x10a621a:1 time 1059055748 pin kdswh11: kdst_fetch dba 0x10a621b:1 time 1059055781 pin kdswh11: kdst_fetch dba 0x10a621c:1 time 1059055815 pin kdswh11: kdst_fetch dba 0x10a621d:1 time 1059055848 pin kdswh11: kdst_fetch dba 0x10a621e:1 time 1059055883 pin kdswh11: kdst_fetch dba 0x10a621f:1 time 1059055915 pin kdswh11: kdst_fetch dba 0x10a6221:1 time 1059055953 pin kdswh11: kdst_fetch dba 0x10a6222:1 time 1059055992 FETCH #47499559126280:c=0,e=385,p=0,cr=10,cu=0,mis=0,r=10,dep=1,og=1,plh=2568761675,tim=1343836196432274 以下类似 可以看到上面的 DBA都是连续的 ............................ 末尾部分 WAIT #47499559136872: nam='PL/SQL lock timer' ela= 10002933 duration=0 p2=0 p3=0 obj#=-1 tim=1343836366495589 pin kdswh11: kdst_fetch dba 0x10a62f6:1 time 1229119497 pin kdswh11: kdst_fetch dba 0x10a62f7:1 time 1229119545 pin kdswh11: kdst_fetch dba 0x10a62f8:1 time 1229119576 pin kdswh11: kdst_fetch dba 0x10a62f9:1 time 1229119610 pin kdswh11: kdst_fetch dba 0x10a62fa:1 time 1229119644 pin kdswh11: kdst_fetch dba 0x10a62fb:1 time 1229119671 pin kdswh11: kdst_fetch dba 0x10a62fc:1 time 1229119703 pin kdswh11: kdst_fetch dba 0x10a62fd:1 time 1229119730 pin kdswh11: kdst_fetch dba 0x10a62fe:1 time 1229119760 pin kdswh11: kdst_fetch dba 0x10a62ff:1 time 1229119787 FETCH #47499559126280:c=0,e=340,p=0,cr=10,cu=0,mis=0,r=10,dep=1,og=1,plh=2568761675,tim=1343836366496067 可以看到起始DBA是 0x10a6203 , 末尾DBA 是 0x10a62ff 以下验证了起始DBA正是MACLEAN表的第一个数据块,而末尾DBA也正是Maclean表高水位块 getbfno函数用于将dba转换为数据文件号和块号 CREATE OR REPLACE FUNCTION getbfno (p_dba IN VARCHAR2) RETURN VARCHAR2 IS l_str VARCHAR2 (255) DEFAULT NULL; l_fno VARCHAR2 (15); l_bno VARCHAR2 (15); BEGIN l_fno := DBMS_UTILITY.data_block_address_file (TO_NUMBER (LTRIM (p_dba, '0x'), 'xxxxxxxx' ) ); l_bno := DBMS_UTILITY.data_block_address_block (TO_NUMBER (LTRIM (p_dba, '0x'), 'xxxxxxxx' ) ); l_str := 'datafile# is:' || l_fno || CHR (10) || 'datablock is:' || l_bno || CHR (10) || 'dump command:alter system dump datafile ' || l_fno || ' block ' || l_bno || ';'; RETURN l_str; END; / Function created. SQL> select getbfno('0x10a6203') from dual; GETBFNO('0X10A6203') -------------------------------------------------------------------------------- datafile# is:4 datablock is:680451 dump command:alter system dump datafile 4 block 680451; SQL> select getbfno('0x10a62ff') from dual; GETBFNO('0X10A62FF') -------------------------------------------------------------------------------- datafile# is:4 datablock is:680703 dump command:alter system dump datafile 4 block 680703; SQL> select dbms_rowid.rowid_block_number(min(rowid)),dbms_rowid.rowid_relative_fno(min(rowid)) from maclean; DBMS_ROWID.ROWID_BLOCK_NUMBER(MIN(ROWID)) ----------------------------------------- DBMS_ROWID.ROWID_RELATIVE_FNO(MIN(ROWID)) ----------------------------------------- 680451 4 SQL> select dbms_rowid.rowid_block_number(max(rowid)),dbms_rowid.rowid_relative_fno(max(rowid)) from maclean; DBMS_ROWID.ROWID_BLOCK_NUMBER(MAX(ROWID)) ----------------------------------------- DBMS_ROWID.ROWID_RELATIVE_FNO(MAX(ROWID)) ----------------------------------------- 680703 4

 

 

以上演示验证了3个观点:

1.当OPEN CURSOR 操作发生时, PL/SQL引擎转到SQL引擎负责PARSE SQL语句获得执行计划, 同时它会记录OPEN CURSOR这一刻的SNAPSHOT SCN 快照SCN, 但是Oracle并不会实际FETCH相关的数据,也不会将这些数据复制到某个地方。

2.直到实际FETCH 数据时才会去访问实际的数据块

3. 单纯的open cursor+ fetch bulk collect不会在”某个地方存放结果集”

增量检查点如何更新控制文件?

有同学在 T.askmaclean.com上提问关于增量检查点更新控制文件的问题:

 

 

Know more about checkpoint

checkpoint 分成很多种 full 、file、thread、parallel query、 object 、incremental 、logfile switch

每一种checkpoint 都有其自身的特性,例如Incremental Checkpoint会要求ckpt 每3s 更新一次controlfile 但是不更新datafile header, 而FULL CHECKPOINT要求立即完成(同步的) 且会同时更新 controlfile 和 datafile header。

Incremental Checkpoint会要求ckpt 每3s 更新一次controlfile
>>我想问的时:如何查看此时控制文件中更新的SCN?除了DUMP控制文件,有没有命令查询?

 

 

 

我希望通过以下演示说明该问题:

 

 

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump controlf 4;
Statement processed.
SQL> oradebug tracefile_name ;
/s01/admin/G10R25/udump/g10r25_ora_4660.trc
另开一个窗口等待6s在做一次controlf DUMP 
SQL>  exec dbms_lock.sleep(6);
oradebug setmypid;
oradebug dump controlf 4;
oradebug tracefile_name ;
PL/SQL procedure successfully completed.
SQL> Statement processed.
SQL> Statement processed.
SQL> /s01/admin/G10R25/udump/g10r25_ora_4663.trc

 

 

比较以上获得的2个前后有6s间隔的CONTROLFILE DUMP 文件:

 

 

[oracle@vrh8 udump]$ diff /s01/admin/G10R25/udump/g10r25_ora_4660.trc /s01/admin/G10R25/udump/g10r25_ora_4663.trc
1c1
< /s01/admin/G10R25/udump/g10r25_ora_4660.trc --- > /s01/admin/G10R25/udump/g10r25_ora_4663.trc
13c13
< Unix process pid: 4660, image: oracle@vrh8.oracle.com (TNS V1-V3) --- > Unix process pid: 4663, image: oracle@vrh8.oracle.com (TNS V1-V3)
15,18c15,19
< *** ACTION NAME:() 2012-07-22 07:59:08.215
< *** MODULE NAME:(sqlplus@vrh8.oracle.com (TNS V1-V3)) 2012-07-22 07:59:08.215
< *** SERVICE NAME:(SYS$USERS) 2012-07-22 07:59:08.215
< *** SESSION ID:(159.7) 2012-07-22 07:59:08.215 --- > *** 2012-07-22 07:59:31.779
> *** ACTION NAME:() 2012-07-22 07:59:31.779
> *** MODULE NAME:(sqlplus@vrh8.oracle.com (TNS V1-V3)) 2012-07-22 07:59:31.779
> *** SERVICE NAME:(SYS$USERS) 2012-07-22 07:59:31.779
> *** SESSION ID:(159.9) 2012-07-22 07:59:31.779
96,98c97,99
< THREAD #1 - status:0x2 flags:0x0 dirty:56
< low cache rba:(0x1a.3.0) on disk rba:(0x1a.121.0)
< on disk scn: 0x0000.013fe7a8 07/22/2012 07:59:02 --- > THREAD #1 - status:0x2 flags:0x0 dirty:57
> low cache rba:(0x1a.3.0) on disk rba:(0x1a.148.0)
> on disk scn: 0x0000.013fe7c2 07/22/2012 07:59:27
100,101c101,102
< heartbeat: 789262462 mount id: 2675014163
< Flashback log tail log# 15 thread# 1 seq 229 block 274 byte 0 --- > heartbeat: 789262470 mount id: 2675014163
> Flashback log tail log# 15 thread# 1 seq 229 block 275 byte 0
2490c2491
<   V$RMAN_STATUS: recid=140734752341296, stamp=140734752341288 --- >   V$RMAN_STATUS: recid=140733792718464, stamp=140733792718456
2501c2502
<   V$RMAN_STATUS: recid=140734752341296, stamp=140734752341288 --- >   V$RMAN_STATUS: recid=140733792718464, stamp=140733792718456
2511c2512
<   V$RMAN_STATUS: recid=140734752341296, stamp=140734752341288 --- >   V$RMAN_STATUS: recid=140733792718464, stamp=140733792718456
2521c2522
<   V$RMAN_STATUS: recid=140734752341296, stamp=140734752341288 --- >   V$RMAN_STATUS: recid=140733792718464, stamp=140733792718456
2531c2532
<   V$RMAN_STATUS: recid=140734752341296, stamp=140734752341288 --- >   V$RMAN_STATUS: recid=140733792718464, stamp=140733792718456

 

 

 

排除部分V$RMAN_STATUS记录存在差异外,最主要的差别在于: CHECKPOINT PROGRESS RECORDS 这是因为 ckpt 每3s一次对controlfile做heartbeat 更新 CHECKPOINT PROGRESS RECORDS。

第一次 controlf dump:

 

***************************************************************************
CHECKPOINT PROGRESS RECORDS
***************************************************************************
(size = 8180, compat size = 8180, section max = 11, section in-use = 0,
last-recid= 0, old-recno = 0, last-recno = 0)
(extent = 1, blkno = 2, numrecs = 11)
THREAD #1 – status:0x2 flags:0x0 dirty:56
low cache rba:(0x1a.3.0) on disk rba:(0x1a.121.0)
on disk scn: 0x0000.013fe7a8 07/22/2012 07:59:02
resetlogs scn: 0x0000.01394f1a 07/19/2012 07:27:21
heartbeat: 789262462 mount id: 2675014163
Flashback log tail log# 15 thread# 1 seq 229 block 274 byte 0
THREAD #2 – status:0x0 flags:0x0 dirty:0
low cache rba:(0x0.0.0) on disk rba:(0x0.0.0)
on disk scn: 0x0000.00000000 01/01/1988 00:00:00
resetlogs scn: 0x0000.00000000 01/01/1988 00:00:00
heartbeat: 0 mount id: 0
Flashback log tail log# 0 thread# 0 seq 0 block 0 byte 0

 

 

第二次 controlf dump:

 

***************************************************************************
CHECKPOINT PROGRESS RECORDS
***************************************************************************
(size = 8180, compat size = 8180, section max = 11, section in-use = 0,
last-recid= 0, old-recno = 0, last-recno = 0)
(extent = 1, blkno = 2, numrecs = 11)
THREAD #1 – status:0x2 flags:0x0 dirty:57
low cache rba:(0x1a.3.0) on disk rba:(0x1a.148.0)
on disk scn: 0x0000.013fe7c2 07/22/2012 07:59:27
resetlogs scn: 0x0000.01394f1a 07/19/2012 07:27:21
heartbeat: 789262470 mount id: 2675014163
Flashback log tail log# 15 thread# 1 seq 229 block 275 byte 0
THREAD #2 – status:0x0 flags:0x0 dirty:0
low cache rba:(0x0.0.0) on disk rba:(0x0.0.0)
on disk scn: 0x0000.00000000 01/01/1988 00:00:00
resetlogs scn: 0x0000.00000000 01/01/1988 00:00:00

 

差异在于
on disk rba
on disk scn
heartbeat
Flashback log tail log#

 

即实际CKPT每3s更新heartbeat控制文件一次,更新的内容是 on disk rba、on disk scn、heartbeat 如果启用了闪回日志的话那么还有Flashback log , 而并不更新数据库当前的SCN(CURRENT SCN)。

 

如果你想查看ckpt每3s更新的 on disk scn的话可以参考 内部视图X$KCCCP–[K]ernel [C]ache [C]ontrolfile management [c]heckpoint [p]rogress  X$KCCCP Checkpoint Progress Records:

 

 

 

SQL> desc x$kcccp;
Name                                      Null?    Type
----------------------------------------- -------- ----------------------------
ADDR                                               RAW(8)
INDX                                               NUMBER
INST_ID                                            NUMBER
CPTNO                                              NUMBER
CPSTA                                              NUMBER
CPFLG                                              NUMBER
CPDRT                                              NUMBER
CPRDB                                              NUMBER
CPLRBA_SEQ                                         NUMBER
CPLRBA_BNO                                         NUMBER
CPLRBA_BOF                                         NUMBER
CPODR_SEQ                                          NUMBER
CPODR_BNO                                          NUMBER
CPODR_BOF                                          NUMBER
CPODS                                              VARCHAR2(16)
CPODT                                              VARCHAR2(20)
CPODT_I                                            NUMBER
CPHBT                                              NUMBER
CPRLS                                              VARCHAR2(16)
CPRLC                                              NUMBER
CPMID                                              NUMBER
CPSDR_SEQ                                          NUMBER
CPSDR_BNO                                          NUMBER
CPSDR_ADB                                          NUMBER

 

 

其中cpods 为 ” on disk scn” ,cpodr_seq||cpodr_bno||cpodr_bof为”on disk rba”,CPHBT为heartbeat number:

 

 

SQL> select cpods "on disk scn",
2         to_char(cpodr_seq, 'xxxxxx') || ',' ||
3         to_char(cpodr_bno, 'xxxxxxxxx') || ',' ||
4         to_char(cpodr_bof, 'xxxxxxxxx') "on disk rba",
5         CPHBT "heartbeat number"
6    from x$kcccp;
on disk scn      on disk rba                    heartbeat number
---------------- ------------------------------ ----------------
20968609              1a,      240a,         0         789263152
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
8 rows selected.
SQL> 
SQL> 
SQL> exec dbms_lock.sleep(3);
PL/SQL procedure successfully completed.
SQL> select cpods "on disk scn",
2         to_char(cpodr_seq, 'xxxxxx') || ',' ||
3         to_char(cpodr_bno, 'xxxxxxxxx') || ',' ||
4         to_char(cpodr_bof, 'xxxxxxxxx') "on disk rba",
5         CPHBT "heartbeat number"
6    from x$kcccp;
on disk scn      on disk rba                    heartbeat number
---------------- ------------------------------ ----------------
20968613              1a,      2410,         0         789263154
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
8 rows selected.
SQL> 
SQL> exec dbms_lock.sleep(3);
PL/SQL procedure successfully completed.
SQL> select cpods "on disk scn",
2         to_char(cpodr_seq, 'xxxxxx') || ',' ||
3         to_char(cpodr_bno, 'xxxxxxxxx') || ',' ||
4         to_char(cpodr_bof, 'xxxxxxxxx') "on disk rba",
5         CPHBT "heartbeat number"
6    from x$kcccp;
on disk scn      on disk rba                    heartbeat number
---------------- ------------------------------ ----------------
20968623              1a,      241e,         0         789263156
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
0                      0,         0,         0                 0
8 rows selected.

Know more about shared pool subpool

有同学在T.askmaclean.com上求助Shared Pool里SubPool子分区问题,这里我们来搞清楚_kghdsidx_count 与 subpool 以及subpool中的分区( 实际是duration)的关系:

 

SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS for Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production
SQL> set linesize 200 pagesize 1400
SQL> show parameter kgh
NAME                                 TYPE                             VALUE
------------------------------------ -------------------------------- ------------------------------
_kghdsidx_count                      integer                          7
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump heapdump 536870914;
Statement processed.
SQL> oradebug tracefile_name
/s01/admin/G10R25/udump/g10r25_ora_11783.trc
[oracle@vrh8 dbs]$ grep "sga heap"  /s01/admin/G10R25/udump/g10r25_ora_11783.trc
HEAP DUMP heap name="sga heap"  desc=0x60000058
HEAP DUMP heap name="sga heap(1,0)"  desc=0x60036110
FIVE LARGEST SUB HEAPS for heap name="sga heap(1,0)"   desc=0x60036110
HEAP DUMP heap name="sga heap(2,0)"  desc=0x6003f938
FIVE LARGEST SUB HEAPS for heap name="sga heap(2,0)"   desc=0x6003f938
HEAP DUMP heap name="sga heap(3,0)"  desc=0x60049160
FIVE LARGEST SUB HEAPS for heap name="sga heap(3,0)"   desc=0x60049160
HEAP DUMP heap name="sga heap(4,0)"  desc=0x60052988
FIVE LARGEST SUB HEAPS for heap name="sga heap(4,0)"   desc=0x60052988
HEAP DUMP heap name="sga heap(5,0)"  desc=0x6005c1b0
FIVE LARGEST SUB HEAPS for heap name="sga heap(5,0)"   desc=0x6005c1b0
HEAP DUMP heap name="sga heap(6,0)"  desc=0x600659d8
FIVE LARGEST SUB HEAPS for heap name="sga heap(6,0)"   desc=0x600659d8
HEAP DUMP heap name="sga heap(7,0)"  desc=0x6006f200
FIVE LARGEST SUB HEAPS for heap name="sga heap(7,0)"   desc=0x6006f200
SQL> alter system set "_kghdsidx_count"=6 scope=spfile;
System altered.
SQL> startup force;
ORACLE instance started.
Total System Global Area  859832320 bytes
Fixed Size                  2100104 bytes
Variable Size             746587256 bytes
Database Buffers          104857600 bytes
Redo Buffers                6287360 bytes
Database mounted.
Database opened.
SQL> 
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump heapdump 536870914;
Statement processed.
SQL> oradebug tracefile_name
/s01/admin/G10R25/udump/g10r25_ora_11908.trc
[oracle@vrh8 dbs]$ grep "sga heap"  /s01/admin/G10R25/udump/g10r25_ora_11908.trc
HEAP DUMP heap name="sga heap"  desc=0x60000058
HEAP DUMP heap name="sga heap(1,0)"  desc=0x600360f0
FIVE LARGEST SUB HEAPS for heap name="sga heap(1,0)"   desc=0x600360f0
HEAP DUMP heap name="sga heap(2,0)"  desc=0x6003f918
FIVE LARGEST SUB HEAPS for heap name="sga heap(2,0)"   desc=0x6003f918
HEAP DUMP heap name="sga heap(3,0)"  desc=0x60049140
FIVE LARGEST SUB HEAPS for heap name="sga heap(3,0)"   desc=0x60049140
HEAP DUMP heap name="sga heap(4,0)"  desc=0x60052968
FIVE LARGEST SUB HEAPS for heap name="sga heap(4,0)"   desc=0x60052968
HEAP DUMP heap name="sga heap(5,0)"  desc=0x6005c190
FIVE LARGEST SUB HEAPS for heap name="sga heap(5,0)"   desc=0x6005c190
HEAP DUMP heap name="sga heap(6,0)"  desc=0x600659b8
FIVE LARGEST SUB HEAPS for heap name="sga heap(6,0)"   desc=0x600659b8
SQL> 
SQL> alter system set "_kghdsidx_count"=2 scope=spfile;
System altered.
SQL> 
SQL> startup force;
ORACLE instance started.
Total System Global Area  851443712 bytes
Fixed Size                  2100040 bytes
Variable Size             738198712 bytes
Database Buffers          104857600 bytes
Redo Buffers                6287360 bytes
Database mounted.
Database opened.
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump heapdump 2;
Statement processed.
SQL> oradebug tracefile_name
/s01/admin/G10R25/udump/g10r25_ora_12003.trc
[oracle@vrh8 ~]$ grep "sga heap"  /s01/admin/G10R25/udump/g10r25_ora_12003.trc
HEAP DUMP heap name="sga heap"  desc=0x60000058
HEAP DUMP heap name="sga heap(1,0)"  desc=0x600360b0
HEAP DUMP heap name="sga heap(2,0)"  desc=0x6003f8d
SQL> alter system set cpu_count=16 scope=spfile;
System altered.
SQL> startup force;
ORACLE instance started.
Total System Global Area  851443712 bytes
Fixed Size                  2100040 bytes
Variable Size             738198712 bytes
Database Buffers          104857600 bytes
Redo Buffers                6287360 bytes
Database mounted.
Database opened.
SQL> oradebug setmypid;
Statement processed.
SQL>  oradebug dump heapdump 2;
Statement processed.
SQL> oradebug tracefile_name
/s01/admin/G10R25/udump/g10r25_ora_12065.trc
[oracle@vrh8 ~]$ grep "sga heap"  /s01/admin/G10R25/udump/g10r25_ora_12065.trc
HEAP DUMP heap name="sga heap"  desc=0x60000058
HEAP DUMP heap name="sga heap(1,0)"  desc=0x600360b0
HEAP DUMP heap name="sga heap(2,0)"  desc=0x6003f8d8
SQL> show parameter sga_target
NAME                                 TYPE                             VALUE
------------------------------------ -------------------------------- ------------------------------
sga_target                           big integer                      0
SQL> alter system set sga_target=1000M scope=spfile;
System altered.
SQL> startup force;
ORACLE instance started.
Total System Global Area 1048576000 bytes
Fixed Size                  2101544 bytes
Variable Size             738201304 bytes
Database Buffers          301989888 bytes
Redo Buffers                6283264 bytes
Database mounted.
Database opened.
SQL> alter system set sga_target=1000M scope=spfile;
System altered.
SQL> startup force;
ORACLE instance started.
Total System Global Area 1048576000 bytes
Fixed Size                  2101544 bytes
Variable Size             738201304 bytes
Database Buffers          301989888 bytes
Redo Buffers                6283264 bytes
Database mounted.
Database opened.
SQL> 
SQL> 
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump heapdump 2;
Statement processed.
SQL>  oradebug tracefile_name
/s01/admin/G10R25/udump/g10r25_ora_12148.trc
SQL> 
SQL> Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@vrh8 dbs]$ grep "sga heap"  /s01/admin/G10R25/udump/g10r25_ora_12148.trc
HEAP DUMP heap name="sga heap"  desc=0x60000058
HEAP DUMP heap name="sga heap(1,0)"  desc=0x60036690
HEAP DUMP heap name="sga heap(1,1)"  desc=0x60037ee8
HEAP DUMP heap name="sga heap(1,2)"  desc=0x60039740
HEAP DUMP heap name="sga heap(1,3)"  desc=0x6003af98
HEAP DUMP heap name="sga heap(2,0)"  desc=0x6003feb8
HEAP DUMP heap name="sga heap(2,1)"  desc=0x60041710
HEAP DUMP heap name="sga heap(2,2)"  desc=0x60042f68
_enable_shared_pool_durations:该参数控制是否启用10g中特有的shared pool duration特性,当我们设置sga_target为0时该参数为false;
同时在10.2.0.5前若cursor_space_for_time设置为true时该参数也为false,不过在10.2.0.5以后cursor_space_for_time参数被废弃
SQL> alter system set "_enable_shared_pool_durations"=false scope=spfile;
System altered.
SQL> 
SQL> startup force;
ORACLE instance started.
Total System Global Area 1048576000 bytes
Fixed Size                  2101544 bytes
Variable Size             738201304 bytes
Database Buffers          301989888 bytes
Redo Buffers                6283264 bytes
Database mounted.
Database opened.
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump heapdump 2;
Statement processed.
SQL> oradebug tracefile_name
/s01/admin/G10R25/udump/g10r25_ora_12233.trc
SQL> 
SQL> Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options\
[oracle@vrh8 dbs]$ grep "sga heap"   /s01/admin/G10R25/udump/g10r25_ora_12233.trc
HEAP DUMP heap name="sga heap"  desc=0x60000058
HEAP DUMP heap name="sga heap(1,0)"  desc=0x60036690
HEAP DUMP heap name="sga heap(2,0)"  desc=0x6003feb8

 

 

结论:
1.
_kghdsidx_count 控制了 shared pool subpool的数量, _kghdsidx_count的最大允许值是7 即最多 7个 shared pool subpool

2.
为什么会在一个 subpool中还有4个 sub partition 如:

sga heap(1,0) sga heap(1,1) sga heap(1,2) sga heap(1,3)

这不是因为 cpu的数目 也不是因为_kghdsidx_count, 而是因为 在10g 中AUTO SGA 引入了 shared pool duration的概念,

duration 分成4类:

  • Session duration
  • Instance duration (never freed)
  • Execution duration (freed fastest)
  • Free memory

引入了 shared pool duration的目的是

在10gR1中Shared Pool的shrink收缩操作存在一些缺陷,造成缺陷的原因是在该版本中Buffer Cache还没有能力共享使用一个granule,这是因为Buffer Cache的granule的尾部由granule header和Metadata(可能是buffer header或者RAC中的Lock Elements)拼接组成,在其尾部不容许存在空洞。另一个原因是当时的shared pool允许不同生命周期duration(以后会介绍)的chunk存放在同一个granule中,这造成共享池无法完全释放granule。到10gR2中通过对Buffer Cache Granule结构的修改允许在granule header和buffer及Metadata(buffer header或LE)存在缝隙,同时shared pool中不同duration的chunk将不在共享同一个granule,通过以上改进buffer cache与shared pool间的内存交换变得可行。此外在10gr2中streams pool也开始支持内存交换(实际根据不同的streams pool duration存在限制)

reference : https://www.askmaclean.com/archives/understanding-automatic-sga-memory-management.html

沪ICP备14014813号

沪公网安备 31010802001379号