【案例分享】RAC中可能由于节点cache temp space导致ora-1652错误

11.2.0.3  RAC环境中, 可能由于节点 cache temp space,导致虽然free temp space还有很多,但仍有节点因为无法分配temp space而出现ORA-1652错误。

具体可以参考如下note, 目前已经提供了补丁Bug 14383007 – Sort runs out of temp space in RAC even when temp space is available

 

 

Applies to:

Oracle Database – Enterprise Edition – Version 11.2.0.3 to 11.2.0.3 [Release 11.2]
Information in this document applies to any platform.

Symptoms

Temporary tablespace space allocation fails in RAC even when there is still free temp space.

ORA-12801: error signaled in parallel query server P017
ORA-01652: unable to extend temp segment by 640 in tablespace XY_TEMP

 

Cause

Unbalanced temp space distribution in RAC.  One instance seems to consume and cache most of the temp space, causing another instance to hit the ora-1652.

SQL> select inst_id, tablespace_name, round((total_blocks*8192)/(1024*1024*1024),2) “Space(GB)”
2 from gv$sort_segment
3 where tablespace_name=’XY_TEMP’
4 order by 1;

INST_ID  TABLESPACE_NAME        Space(GB)
———- —————————— ———-
1           XY_TEMP                           33.39
2           XY_TEMP                           33.77
3           XY_TEMP                           34.16
4           XY_TEMP                           33.39
5           XY_TEMP                           33.46
6           XY_TEMP                       1118.79   <<<<very unbalanced
7           XY_TEMP                           33.28
8           XY_TEMP                           34.26

This is reported in Bug 14383007 – sort runs out of temp space on 2 nodes even when temp space is available
This bug will be fixed in 11.2.0.4 (future release). Refer < Document 14383007.8> for more details.

Useful queries for debugging:

Collect the information every few seconds:
1. select  * from gv$sort_segment
2. select sum(bytes), owner from gv$temp_extent_map group by owner;
3. select inst_id, blocks_cached, blocks_used, extents_cached, extents_used from GV$TEMP_EXTENT_POOL;

Solution

Workaround is:
Retry the operation.

One-off patch 14383007 has been provided for certain platform, please check My Oracle Support for patch detail.

Bug 14383007 Sort runs out of temp space in RAC even when temp space is available
Versions confirmed as being affected 11.2.0.3
Description
Temp space allocation fails with out-of-space ORA-1652 errors in RAC even when
there is still free temp space available.
Rediscovery Notes:
User might hit this issue if temp space allocation fails with out-of-space in
RAC even when there is still free temp space.

gc buffer busy/gcs log flush sync与log file sync

这篇博文整理自我的帖子: RAC中的gc current block busy与redo log flush

 

对于log file sync(本质上是 write redolog慢)引发gc buffer busy acquire /release 集群等待事件的这个命题的真伪,其实Oracle在开发性能调优组件ADDM时一早给了我们答案:

 

RECOMMENDATION 2: Host Configuration, 12% benefit (507182 seconds)
ACTION: Investigate the possibility of improving the performance of I/O
to the online redo log files.
RATIONALE: The average size of writes to the online redo log files was
40 K and the average time per write was 10 milliseconds.

ADDITIONAL INFORMATION:

Waits on event “log file sync” were the cause of significant database
wait on “gc buffer busy” when releasing a data block. Waits on event
“log file sync” in this instance can cause global cache contention on
remote instances.

 

如果你在ADDM(?/rdbms/admin/addmrpt)中找到上述文字,那么基本可以确认gc buffer busy的源头是log file sync(虽然本质上不是),那么优先解决log file sync的问题; log file sync 当然有少数的bug存在,但更多的是存储、板卡、链路等硬件因素造成的。解决了log file sync后,那么gc buffer busy往往也就解决了。

 

gc current block busy 等待是RAC中global cache全局缓存当前块的争用等待事件, 该等待事件时长由三个部分组成:

Time to process current block request in the cache= (pin time + flush time + send time)

gc current block flush time
The current block flush time is part of the service (or processing) time for a current block. The pending redo needs to be flushed to the log file by LGWR before LMS sends it. The operation is asynchronous in that LMS queues the request, posts LGWR, and continues processing. The LMS would check its log flush queue for completions and then send the block, or go to sleep and be posted by LGWR. The redo log write time and redo log sync time can influence the overall service time significantly.
flush time 是Oracle为了保证Instance Recovery实例恢复机制,而要求每一个current block在本地节点local instance被修改后(modify/update) 必须要将该current block相关的redo 写入到logfile 后(要求LGWR必须完成写入后才能返回),才能由LMS进程传输给其他节点使用。

 

而gc buffer busy acquire/release 往往是 gc current block busy的衍生产品, 当同一实例内的 多个进程并发地访问同一个数据块时 , 首先发起的进程 将进入 gc current block busy的等待 ,而在 buffer waiter list 上的后续进程 会陷入gc buffer busy acquire/release 等待(A user on the same instance has started a remote operation on the same resource and the request has not completed yet or the block was requested by another node and the block has not been released by the local instance when the new local access was made), 这里存在一个排队效应, 即 gc current block busy是缓慢的,那么在 排队的gc buffer busy acquire/release就会更慢:

Pin time = (time to read the block into cache) + (time to modify/process the buffer)
Busy time = (average pin time) * (number of interested users waiting ahead of me)

不局限于current block (reference AWR Avg global cache current block flush time (ms)), cr block(Avg global cache cr block flush time (ms)) 也存在flush time。

可以通过 设置_cr_server_log_flush to false(LMS are/is waiting for LGWR to flush the pending redo during CR fabrication. Without going too much in to details, you can turn off the behaviour by setting _cr_server_log_flush to false.) 来禁止cr server flush redo log,_gc_log_flush(if TRUE, flush redo log before a current block transfer)来让current block transfer不用flush redo。 但是上述2个参数是有其副作用的……….. 大多数情况不要考虑去设置它们,用它们是个馊主意。

 

_gc_split_flush if TRUE, flush index split redo before rejecting bast FALSE  ==> 控制index split redo flush,默认为FALSE

 

 

以上告诉我们 IO 在RAC中是十分重要的,特别是log file的write性能, 其重要性不亚于CPU 和 Interconnect network。

 

log file sync with gc buffer busy acquire release

 

 

Avg global cache current block flush time (ms)

 

 

 

gcs log flush sync

 

 

When LMS processes a current or cr request, it may encounter certain conditions requiring that the request be queued until some action is performed on the block. A very common condition is that the block is frequently changed and has changes pending which have not been written to the logs. In this case, LMS would have to ask LGWR to flush the log and must check the log flush completion queue or wait for a notification from LGWR. This means that more work must be done to process that block and there may even be considerable time waited for the log IO to complete or the notification to be sent. The processing at the cache layer is CPU intensive. One must also bear in mind that before and after entering the Cache processing, LMS must have picked up the message from the IPC layer and pass it on for sending. In both cases, a message may be sitting in a queue if LMS is busy and/or cannot be scheduled predictably. The scenarios briefly described here may present themselves to the requesting process as “busy” or “congested” waits ( which will be discussed later )

对于index block split/ enq: TX index contention而言,慢的redo flush也可能是催化剂

index split index contention avg flush ping times

 

Log file sync delays due to slow log IO can impact the block shipping
LGWR does not post LMS when the redo is on disk
Can be considered and indicator of contention for “hot” blocks between instance competing for write or read-after-write access

 

【视频教学】Maclean教你用Vbox在Linux 6.3上安装Oracle 11gR2 RAC

建设完成!!

 
 OEL 6.3 11.2.0.3 RAC VBOX安装脚本.txt (5.08 KB, 下载次数: 25)

Vbox+ Oracle Linux 6.3 + 11.2.0.3 RAC安装第一讲安装OS操作系统 by Maclean Liu

 

 

 

Vbox+ Oracle Linux 6.3 + 11.2.0.3 RAC安装第二讲克隆并配置虚拟主机 Maclean Liu

 

 

 

Vbox+ Oracle Linux 6.3 + 11.2.0.3 RAC 第三讲配置Vbox虚拟共享存储并使用udev绑定

 

 

Vbox+ Oracle Linux 6.3 + 11.2.0.3 第四讲安装11.2.0.3 Grid Infrastructure和db

 

 

Vbox+Linux 6.3+11.2.0.3 RAC安装第五讲安装GI+RDBMS 11.2.0.3.5 PSU并DBCA建库

 

针对11.2 RAC丢失OCR和Votedisk所在ASM Diskgroup的恢复手段

之前有同学在我的Oracle Allstarts群里讨论关于丢失包含ocr和votedisk的ASM diskgroup导致11gR2 RAC cluster无法正常启动的问题,最早我在《在11gR2 RAC中修改ASM DISK Path磁盘路径》一文中介绍了,如何在不启动CRS的情况下启动11.2中的ASM实例并实施操作, 这里也需要用到同样的命令”crsctl start crs -excl -nocrs “;

 

 

注意以下的恢复手段,针对ASM中单独的ocr或者单独的votedisk丢失也有效,因为11.2中普遍把ocr和votedisk存放在ASM中,而ASM的启动又依赖于ocr和votedisk,所以在丢失ocr或votedisk仍一一者都会导致cluter无法正常启动;这里我们仅仅讨论如何让CRS正常启动,如果丢失的diskgroup中还存放有数据库的话,数据的恢复不属于本篇文章的讨论范畴。

 

前提:恢复的前提是你仍有和故障前一样多的ASM LUN DISK,且你有OCR的自动备份,注意默认情况下每4个小时会自动备份一次,只要你没有删除$GI_HOME,一般都会有备份可用;不要求有votedisk备份

 

恢复场景: 利用dd命令清空ocr和votedisk所在diskgroup header,模拟diskgroup corruption:

 

 

1. 检查votedisk和 ocr备份

[root@vrh1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a853d6204bbc4feabfd8c73d4c3b3001 (/dev/asm-diskh) [SYSTEMDG]
 2. ONLINE   a5b37704c3574f0fbf21d1d9f58c4a6b (/dev/asm-diskg) [SYSTEMDG]
 3. ONLINE   36e5c51ff0294fc3bf2a042266650331 (/dev/asm-diski) [SYSTEMDG]
 4. ONLINE   af337d1512824fe4bf6ad45283517aaa (/dev/asm-diskj) [SYSTEMDG]
 5. ONLINE   3c4a349e2e304ff6bf64b2b1c9d9cf5d (/dev/asm-diskk) [SYSTEMDG]
Located 5 voting disk(s).

su - grid

[grid@vrh1 ~]$ ocrconfig -showbackup
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy

vrh1     2012/08/09 01:59:56     /g01/11.2.0/maclean/grid/cdata/vrh-cluster/backup00.ocr

vrh1     2012/08/08 21:59:56     /g01/11.2.0/maclean/grid/cdata/vrh-cluster/backup01.ocr

vrh1     2012/08/08 17:59:55     /g01/11.2.0/maclean/grid/cdata/vrh-cluster/backup02.ocr

vrh1     2012/08/08 05:59:54     /g01/11.2.0/grid/cdata/vrh-cluster/day.ocr

vrh1     2012/08/08 05:59:54     /g01/11.2.0/grid/cdata/vrh-cluster/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available

2. 彻底关闭所有节点上的clusterware ,OHASD 

crsctl stop has -f

3. GetAsmDH.sh ==> GetAsmDH.sh是ASM disk header的备份脚本 
请养成良好的习惯,做危险操作前备份asm header

[grid@vrh1 ~]$ ./GetAsmDH.sh 

############################################
 1) Collecting Information About the Disks:
############################################

SQL*Plus: Release 11.2.0.3.0 Production on Thu Aug 9 03:28:13 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

SQL> Connected.
SQL> SQL> SQL> SQL> SQL> SQL> SQL> 
           1           0 /dev/asm-diske
           1           1 /dev/asm-diskd
           2           0 /dev/asm-diskb
           2           1 /dev/asm-diskc
           2           2 /dev/asm-diskf
           3           0 /dev/asm-diskh
           3           1 /dev/asm-diskg
           3           2 /dev/asm-diski
           3           3 /dev/asm-diskj
           3           4 /dev/asm-diskk
SQL> SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

-rw-r--r-- 1 grid oinstall 1048 Aug  9 03:28 /tmp/HC/asmdisks.lst

############################################
 2) Generating asm_diskh.sh script.
############################################

-rwx------ 1 grid oinstall 666 Aug  9 03:28 /tmp/HC/asm_diskh.sh

############################################
 3) Executing  asm_diskh.sh script to 
    generate dd  dumps.
############################################

-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_1_0.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_1_1.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_2_0.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_2_1.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_2_2.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_3_0.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_3_1.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_3_2.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_3_3.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_3_4.dd

############################################
 4) Compressing dd dumps in the next format:
    (asm_dd_header_all_.tar)
############################################

/tmp/HC/dsk_1_0.dd
/tmp/HC/dsk_1_1.dd
/tmp/HC/dsk_2_0.dd
/tmp/HC/dsk_2_1.dd
/tmp/HC/dsk_2_2.dd
/tmp/HC/dsk_3_0.dd
/tmp/HC/dsk_3_1.dd
/tmp/HC/dsk_3_2.dd
/tmp/HC/dsk_3_3.dd
/tmp/HC/dsk_3_4.dd
./GetAsmDH.sh: line 81: compress: command not found
ls: /tmp/HC/*.Z: No such file or directory
[grid@vrh1 ~]$

4. 使用dd 命令 破坏ocr和votedisk所在diskgroup

[root@vrh1 ~]# dd if=/dev/zero of=/dev/asm-diskh bs=1024k count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00423853 seconds, 247 MB/s

[root@vrh1 ~]# dd if=/dev/zero of=/dev/asm-diskg bs=1024k count=1 
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0045179 seconds, 232 MB/s

[root@vrh1 ~]# dd if=/dev/zero of=/dev/asm-diski bs=1024k count=1 
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00469976 seconds, 223 MB/s

[root@vrh1 ~]# dd if=/dev/zero of=/dev/asm-diskj bs=1024k count=1 
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00344262 seconds, 305 MB/s
[root@vrh1 ~]# dd if=/dev/zero of=/dev/asm-diskk bs=1024k count=1 
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0053518 seconds, 196 MB/s

5. 在一个节点上尝试重新启动HAS

[root@vrh1 ~]# crsctl start has
CRS-4123: Oracle High Availability Services has been started.

 

 

 

但是因为ocr和votedisk所在diskgroup丢失,所以CSS将无法正常启动,如以下日志所示:

 

 

alertvrh1.log 

[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/vrh1/cssd/ocssd.log
2012-08-09 03:35:41.207
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/vrh1/cssd/ocssd.log
2012-08-09 03:35:56.240
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/vrh1/cssd/ocssd.log
2012-08-09 03:36:11.284
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/vrh1/cssd/ocssd.log
2012-08-09 03:36:26.305
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/vrh1/cssd/ocssd.log
2012-08-09 03:36:41.328

ocssd.log
2012-08-09 03:40:26.662: [    CSSD][1078700352]clssnmReadDiscoveryProfile: voting file discovery string(/dev/asm*)
2012-08-09 03:40:26.662: [    CSSD][1078700352]clssnmvDDiscThread: using discovery string /dev/asm* for initial discovery
2012-08-09 03:40:26.662: [   SKGFD][1078700352]Discovery with str:/dev/asm*:

2012-08-09 03:40:26.662: [   SKGFD][1078700352]UFS discovery with :/dev/asm*:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskf:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskb:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskj:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskh:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskc:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskd:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diske:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskg:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diski:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskk:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]OSS discovery with :/dev/asm*:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Handle 0xdf22a0 from lib :UFS:: for disk :/dev/asm-diskf:

2012-08-09 03:40:26.665: [   SKGFD][1078700352]Handle 0xf412a0 from lib :UFS:: for disk :/dev/asm-diskb:

2012-08-09 03:40:26.666: [   SKGFD][1078700352]Handle 0xf3a680 from lib :UFS:: for disk :/dev/asm-diskj:

2012-08-09 03:40:26.666: [   SKGFD][1078700352]Handle 0xf93da0 from lib :UFS:: for disk :/dev/asm-diskh:

2012-08-09 03:40:26.667: [    CSSD][1078700352]clssnmvDiskVerify: Successful discovery of 0 disks
2012-08-09 03:40:26.667: [    CSSD][1078700352]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2012-08-09 03:40:26.667: [    CSSD][1078700352]clssnmvFindInitialConfigs: No voting files found
2012-08-09 03:40:26.667: [    CSSD][1078700352](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds

 

 

 

正式的恢复ocr和votedisk所在diskgroup的步骤如下:

 

 

1. 以-excl -nocrs 方式启动cluster,这将可以启动ASM实例 但不启动CRS

 [root@vrh1 vrh1]# crsctl start crs -excl -nocrs 

CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'vrh1'
CRS-2676: Start of 'ora.mdnsd' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'vrh1'
CRS-2676: Start of 'ora.gpnpd' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'vrh1'
CRS-2672: Attempting to start 'ora.gipcd' on 'vrh1'
CRS-2676: Start of 'ora.cssdmonitor' on 'vrh1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'vrh1'
CRS-2672: Attempting to start 'ora.diskmon' on 'vrh1'
CRS-2676: Start of 'ora.diskmon' on 'vrh1' succeeded
CRS-2676: Start of 'ora.cssd' on 'vrh1' succeeded
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'vrh1'
CRS-2672: Attempting to start 'ora.ctssd' on 'vrh1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'vrh1'
CRS-2676: Start of 'ora.ctssd' on 'vrh1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'vrh1'
CRS-2676: Start of 'ora.asm' on 'vrh1' succeeded

 

 

 

2.重建原ocr和votedisk所在diskgroup,注意compatible.asm必须是11.2:

 

 

[root@vrh1 vrh1]# su - grid
[grid@vrh1 ~]$ sqlplus  / as sysasm

SQL*Plus: Release 11.2.0.3.0 Production on Thu Aug 9 04:16:58 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> create diskgroup systemdg high redundancy disk '/dev/asm-diskh','/dev/asm-diskg','/dev/asm-diski','/dev/asm-diskj','/dev/asm-diskk'
 ATTRIBUTE 'compatible.rdbms' = '11.2', 'compatible.asm' = '11.2';

 

 

 

3.从ocr backup中恢复ocr并做ocrcheck检验:

 

 

[root@vrh1 ~]# ocrconfig -restore /g01/11.2.0/grid/cdata/vrh-cluster/backup00.ocr

[root@vrh1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       3180
         Available space (kbytes) :     258940
         ID                       : 1238458014
         Device/File Name         :  +systemdg
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

 

 

 

4. 准备恢复votedisk ,可能会遇到下面的错误:

 

[grid@vrh1 ~]$ crsctl replace votedisk  +SYSTEMDG
CRS-4602: Failed 27 to add voting file 2e4e0fe285924f86bf5473d00dcc0388.
CRS-4602: Failed 27 to add voting file 4fa54bb0cc5c4fafbf1a9be5479bf389.
CRS-4602: Failed 27 to add voting file a109ead9ea4e4f28bfe233188623616a.
CRS-4602: Failed 27 to add voting file 042c9fbd71b54f5abfcd3ab3408f3cf3.
CRS-4602: Failed 27 to add voting file 7b5a8cd24f954fafbf835ad78615763f.
Failed to replace voting disk group with +SYSTEMDG.
CRS-4000: Command Replace failed, or completed with errors.

 

 

需要重新配置一下ASM的参数,并重启ASM:

 

 

SQL> alter system set asm_diskstring='/dev/asm*';

System altered.

SQL> create spfile from memory;

File created.

SQL>  startup force mount;
ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance
ASM instance started

Total System Global Area  283930624 bytes
Fixed Size                  2227664 bytes
Variable Size             256537136 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted

SQL> show parameter spfile

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      /g01/11.2.0/grid/dbs/spfile+AS
                                                 M1.ora

[grid@vrh1 trace]$  crsctl replace votedisk  +SYSTEMDG
CRS-4256: Updating the profile
Successful addition of voting disk 85edc0e82d274f78bfc58cdc73b8c68a.
Successful addition of voting disk 201ffffc8ba44faabfe2efec2aa75840.
Successful addition of voting disk 6f2a25c589964faabf6980f7c5f621ce.
Successful addition of voting disk 93eb315648454f25bf3717df1a2c73d5.
Successful addition of voting disk 3737240678964f88bfbfbd31d8b3829f.
Successfully replaced voting disk group with +SYSTEMDG.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced

 

 

 

 

5. 重启has服务,检验cluster是否正常:

 

 

[root@vrh1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[root@vrh1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   85edc0e82d274f78bfc58cdc73b8c68a (/dev/asm-diskh) [SYSTEMDG]
 2. ONLINE   201ffffc8ba44faabfe2efec2aa75840 (/dev/asm-diskg) [SYSTEMDG]
 3. ONLINE   6f2a25c589964faabf6980f7c5f621ce (/dev/asm-diski) [SYSTEMDG]
 4. ONLINE   93eb315648454f25bf3717df1a2c73d5 (/dev/asm-diskj) [SYSTEMDG]
 5. ONLINE   3737240678964f88bfbfbd31d8b3829f (/dev/asm-diskk) [SYSTEMDG]
Located 5 voting disk(s).

[root@vrh1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.BACKUPDG.dg
               ONLINE  ONLINE       vrh1                                         
ora.DATA.dg
               ONLINE  ONLINE       vrh1                                         
ora.LISTENER.lsnr
               ONLINE  ONLINE       vrh1                                         
ora.LSN_MACLEAN.lsnr
               ONLINE  ONLINE       vrh1                                         
ora.SYSTEMDG.dg
               ONLINE  ONLINE       vrh1                                         
ora.asm
               ONLINE  ONLINE       vrh1                     Started             
ora.gsd
               OFFLINE OFFLINE      vrh1                                         
ora.net1.network
               ONLINE  ONLINE       vrh1                                         
ora.ons
               ONLINE  ONLINE       vrh1                                         
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr  http://www.askmaclean.com
      1        ONLINE  ONLINE       vrh1                                         
ora.cvu
      1        OFFLINE OFFLINE                                                   
ora.oc4j
      1        OFFLINE OFFLINE                                                   
ora.scan1.vip
      1        ONLINE  ONLINE       vrh1                                         
ora.vprod.db
      1        ONLINE  OFFLINE                                                   
      2        ONLINE  OFFLINE                                                   
ora.vrh1.vip
      1        ONLINE  ONLINE       vrh1                                         
ora.vrh2.vip
      1        ONLINE  INTERMEDIATE vrh1                     FAILED OVER

从此SQLPLUS有了Top命令

Jagjeet Singh为我们提供了这样一个sqlplus下的RAC dashboard,实时的性能工具,看上去和top或者nmon命令的界面很相似,借助于这个工具我们将可以实时监控RAC的性能指标而不依赖于图形界面。

 

点击这里下载Jagjeet Singh的sqlplus dashboard for rac工具

 

具体的安装方法如下:

 

 

解压zip包 

[oracle@vrh1 ~]$ unzip Pck.zip 
Archive:  Pck.zip
   creating: Pck/
  inflating: Pck/1_grants.sql        
  inflating: Pck/2_Types.sql         
  inflating: Pck/3_pkg.sql           
  inflating: Pck/drop.sql            
  inflating: Pck/howtoinstall.sql    
  inflating: Pck/outputexplained.pdf  
  inflating: Pck/sampleoutput.png    

[oracle@vrh1 ~]$ cd Pck

[oracle@vrh1 Pck]$ sqlplus  / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Wed Aug 8 02:08:49 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

1.  执行授权文件1_grants.sql        

SQL> start 1_grants.sql
Enter value for _usr: maclean

需要指定安装在哪个schema下

2. 登录上面指定的schema,并执行 2_Types.sql 创建type

SQL> conn maclean/maclean
Connected.
SQL> 
SQL> 
SQL> start 2_Types.sql

Type created.

Type created.

Type created.

Type created.

Type created.

Type created.

Type created.

Type created.

Type created.

Type created.

3. 执行3_pkg.sql 创建包体

SQL> start 3_pkg.sql

Package created.

No errors.

Package body created.

 若想删除该工具吗,卸载的方法 

SQL> connect maclean/maclean
SQL> start drop.sql

 

 

 

具体使用的方法如下:

 

 

 SQL>set lines 190 pages 0 arraysize 50
 SQL>select * from table(jss.gtop(50,10)) ;

50代表 arraysize 一般不用修改
10代表 10s采样一次,采样间隔

 

示例输出:

 

这很酷,仿佛文字版的spotlight  FOR RAC

【教学视频】Maclean教你一步一步使用Vbox在Linux 5上安装Oracle 11gR2 RAC

【教学视频】Maclean教你一步一步使用Vbox在Linux 5上安装Oracle 11gR2 RAC

该安装讲座相关的所有脚本和命令都保存在下面这个附件中:

11gR2-RAC-install-command-list.txt (1.43 KB)

 

 

第一讲创建VBOX虚拟主机并安装Oracle Enterprise Linux 5.7

 

 

 

第二讲配置并clone Vbox虚拟主机 创建共享存储磁盘

 

 

 

 

第三讲安装11gR2 11.2.0.3 Grid Infrastructure Clusterware GI集群软件

 

 

 

 

第四讲安装11gR2 RAC DB RDBMS HOME数据库软件并DBCA创建数据库

 

 

[视频教学]Maclean教你用Vbox在Enterprise Linux 5上安装Oracle 10gR2 RAC

【视频教学】Maclean手把手教你用Vbox在Enterprise Linux 5上安装Oracle 10gR2 RAC数据库

WebEx视频播放软件 , 以下wrf格式文件 需要使用webex播放,

下载WebEX: 

WebEx.zip (2.55 MB)

Agenda:

相关脚本 和命令附件:   脚本命令.txt (1.71 KB)

第一讲视频-在Windows上安装Vbox

第一讲视频-在Windows上安装Vbox.zip (4.41 MB)

第二讲视频-在Vbox上安装Oracle Enterprise Linux 5.7虚拟主机

第二讲视频-在Vbox上安装Oracle Enterprise Linux 5.7虚拟主机.zip (12.05 MB)

第三讲视频-配置并克隆Oracle Enterprise 5.7虚拟主机

第三讲视频-配置并克隆Oracle Enterprise 5.7虚拟主机.zip (14.82 MB)

第四讲视频-配置Vbox共享存储和UDEV存储设备绑定

第四讲视频-配置Vbox共享存储和UDEV存储设备绑定.zip (13.16 MB)

第五讲视频-安装10.2.0.1 x64 Linux Oracle Clusterware CRS

第五讲视频-安装10.2.0.1 x64 Linux Oracle Clusterware CRS.zip (10.48 MB)

第六讲视频-升级Clusterware CRS 10.2.0.1 到 Patchset 10.2.0.5

第六讲视频-升级Clusterware CRS 10.2.0.1 到 Patchset 10.2.0.5 .zip (3.2 MB)

第七讲视频-安装10.2.0.1 DB数据库并升级到10.2.0.5 Patchset 并DBCA创建数据库

第七讲视频-安装10.2.0.1DB数据库升级到10.2.0.5创建数据库.zip (10.37 MB)

Global Cache CR Requested But Current Block Received

这篇文章和之前的《MINSCN与Cache Fusion Read Consistent》 是姊妹篇,他们源于同一个问题帖子

我们来重现提问者所看到的这样一个场景:

 

SQL> select * from V$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> select count(*) from gv$instance;

  COUNT(*)
----------
         2

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.oracledatabase12g.com

 

 

 

在11gR2 2节点RAC环境中将一个数据块的status修改为XG,假设这个Xcurrent block当前在INSTANCE 2被hold住,这时我们在INSTANCE 1反复查询这个数据块,并观察结果:

 

SQL> select  * from test;

        ID
----------
         1
         2

SQL> select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from test;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
                               89233                                    1
                               89233                                    1

SQL> alter system flush buffer_cache;

System altered.

INSTANCE 1 Session A:

SQL>  update test set id=id+1 where id=1;

1 row updated.

INSTANCE 1 Session B:

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1755287

SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump gc_elements 255;
Statement processed.
SQL> oradebug tracefile_name;
/s01/orabase/diag/rdbms/vprod/VPROD1/trace/VPROD1_ora_19111.trc

GLOBAL CACHE ELEMENT DUMP (address: 0xa4ff3080):
  id1: 0x15c91 id2: 0x1 pkey: OBJ#76896 block: (1/89233)
  lock: X rls: 0x0 acq: 0x0 latch: 3
  flags: 0x20 fair: 0 recovery: 0 fpin: 'kdswh11: kdst_fetch'
  bscn: 0x0.146e20 bctx: (nil) write: 0 scan: 0x0
  lcp: (nil) lnk: [NULL] lch: [0xa9f6a6f8,0xa9f6a6f8]
  seq: 32 hist: 58 145:0 118 66 144:0 192 352 197 48 121 113 424 180 58
  LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
    flg: 0x02000001 lflg: 0x1 state: XCURRENT tsn: 0 tsh: 2
      addr: 0xa9f6a5c8 obj: 76896 cls: DATA bscn: 0x0.1ac898
BH (0xa9f6a5c8) file#: 1 rdba: 0x00415c91 (1/89233) class: 1 ba: 0xa9e56000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 3 pwc: 0,15
  dbwrid: 0 obj: 76896 objn: 76896 tsn: 0 afn: 1 hint: f
  hash: [0x91f4e970,0xbae9d5b8] lru: [0x91f58848,0xa9f6a828]
  lru-flags: debug_dump
  obj-flags: object_ckpt_list
  ckptq: [0x9df6d1d8,0xa9f6a740] fileq: [0xa2ece670,0xbdf4ed68] objq: [0xb4964e00,0xb4964e00] objaq: [0xb4964de0,0xb4964de0]
  st: XCURRENT md: NULL fpin: 'kdswh11: kdst_fetch' tch: 2 le: 0xa4ff3080
  flags: buffer_dirty redo_since_read
  LRBA: [0x19.5671.0] LSCN: [0x0.1ac898] HSCN: [0x0.1ac898] HSUB: [1]
  buffer tsn: 0 rdba: 0x00415c91 (1/89233)
  scn: 0x0000.001ac898 seq: 0x01 flg: 0x00 tail: 0xc8980601
  frmt: 0x02 chkval: 0x0000 type: 0x06=trans data

 

 

可以看到此时block: (1/89233)的GLOBAL CACHE ELEMENT DUMP中LOCK状态仍是X 而非XG , 这是因为这个Current Block仅在一个Instance中被modify修改过,没有在全局范围内被更新过。

 

 

紧接着在Instance 2 修改该块:

 

 

Instance 2 Session C:

SQL> update test set id=id+1 where id=2;

1 row updated.

Instance 2 Session D:

SQL>  select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1756658

SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump gc_elements 255;
Statement processed.
SQL> oradebug tracefile_name;
/s01/orabase/diag/rdbms/vprod/VPROD2/trace/VPROD2_ora_13038.trc

GLOBAL CACHE ELEMENT DUMP (address: 0x89fb25a0):
  id1: 0x15c91 id2: 0x1 pkey: OBJ#76896 block: (1/89233)
  lock: XG rls: 0x0 acq: 0x0 latch: 3
  flags: 0x20 fair: 0 recovery: 0 fpin: 'kduwh01: kdusru'
  bscn: 0x0.1acdf3 bctx: (nil) write: 0 scan: 0x0
  lcp: (nil) lnk: [NULL] lch: [0x96f4cf80,0x96f4cf80]
  seq: 61 hist: 324 21 143:0 19 16 352 329 144:6 14 7 352 197
  LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
    flg: 0x0a000001 state: XCURRENT tsn: 0 tsh: 1
      addr: 0x96f4ce50 obj: 76896 cls: DATA bscn: 0x0.1acdf6
BH (0x96f4ce50) file#: 1 rdba: 0x00415c91 (1/89233) class: 1 ba: 0x96bd4000
  set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,15
  dbwrid: 0 obj: 76896 objn: 76896 tsn: 0 afn: 1 hint: f
  hash: [0x96ee1fe8,0xbae9d5b8] lru: [0x96f4d0b0,0x96f4cdc0]
  obj-flags: object_ckpt_list
  ckptq: [0xbdf519b8,0x96f4d5a8] fileq: [0xbdf519d8,0xbdf519d8] objq: [0xb4a47b90,0xb4a47b90] objaq: [0x96f4d0e8,0xb4a47b70]
  st: XCURRENT md: NULL fpin: 'kduwh01: kdusru' tch: 1 le: 0x89fb25a0
  flags: buffer_dirty redo_since_read remote_transfered
  LRBA: [0x11.9e18.0] LSCN: [0x0.1acdf6] HSCN: [0x0.1acdf6] HSUB: [1]
  buffer tsn: 0 rdba: 0x00415c91 (1/89233)
  scn: 0x0000.001acdf6 seq: 0x01 flg: 0x00 tail: 0xcdf60601
  frmt: 0x02 chkval: 0x0000 type: 0x06=trans data

 GCS CLIENT 0x89fb2618,6 resp[(nil),0x15c91.1] pkey 76896.0
   grant 2 cvt 0 mdrole 0x42 st 0x100 lst 0x20 GRANTQ rl G0
   master 1 owner 2 sid 0 remote[(nil),0] hist 0x94121c601163423c
   history 0x3c.0x4.0xd.0xb.0x1.0xc.0x7.0x9.0x14.0x1.
   cflag 0x0 sender 1 flags 0x0 replay# 0 abast (nil).x0.1 dbmap (nil)
   disk: 0x0000.00000000 write request: 0x0000.00000000
   pi scn: 0x0000.00000000 sq[(nil),(nil)]
   msgseq 0x1 updseq 0x0 reqids[6,0,0] infop (nil) lockseq x2b8
   pkey 76896.0
   hv 93 [stat 0x0, 1->1, wm 32768, RMno 0, reminc 18, dom 0]
   kjga st 0x4, step 0.0.0, cinc 20, rmno 6, flags 0x0
   lb 0, hb 0, myb 15250, drmb 15250, apifrz 0

 

 

在Instance 2中被修改过后block: (1/89233)的 GLOBAL CACHE ELEMENT Lock Convert成lock: XG

除了通过GC_ELEMENTS DUMP来分析XCUR Cache Fusion外,也可以直接查询X$ VIEW,主要是 X$LE X$KJBR X$KJBL, 这三个X$ VIEW的更多信息可以很方便地从我的博客中找到:

 

 

INSTANCE 2 Session D:

SELECT *
  FROM x$le
 WHERE le_addr IN (SELECT le_addr
                     FROM x$bh
                    WHERE obj IN (SELECT data_object_id
                                    FROM dba_objects
                                   WHERE owner = 'SYS'
                                     AND object_name = 'TEST')
                      AND class = 1
                      AND state != 3);

ADDR                   INDX    INST_ID LE_ADDR              LE_ID1     LE_ID2
---------------- ---------- ---------- ---------------- ---------- ----------
    LE_RLS     LE_ACQ   LE_FLAGS    LE_MODE   LE_WRITE   LE_LOCAL LE_RECOVERY
---------- ---------- ---------- ---------- ---------- ---------- -----------
   LE_BLKS    LE_TIME LE_KJBL
---------- ---------- ----------------
00007F94CA14CF60       7003          2 0000000089FB25A0      89233          1
         0          0         32          2          0          1           0
         1          0 0000000089FB2618

 

 

PCM Resource NAME由[ID1][ID2],[BL]等组成, ID1和ID2 通过blockno和 fileno计算获得, 这里我们直接参考以上GC_elements dump中的 id1: 0x15c91 id2: 0×1 pkey: OBJ#76896 block: (1/89233)信息 ,则  kjblname 和 kjbrname 应以”[0x15c91][0x1],[BL]” 开头:

 

 

INSTANCE 2 Session D:

SQL> set linesize 80 pagesize 1400
SQL> SELECT *
  2    FROM x$kjbl l
  3   WHERE l.kjblname LIKE '%[0x15c91][0x1],[BL]%';

ADDR                   INDX    INST_ID KJBLLOCKP        KJBLGRANT KJBLREQUE
---------------- ---------- ---------- ---------------- --------- ---------
  KJBLROLE KJBLRESP         KJBLNAME
---------- ---------------- ------------------------------
KJBLNAME2                       KJBLQUEUE
------------------------------ ----------
KJBLLOCKST                                                       KJBLWRITING
---------------------------------------------------------------- -----------
KJBLREQWRITE  KJBLOWNER KJBLMASTER KJBLBLOCKED KJBLBLOCKER    KJBLSID KJBLRDOMID
------------ ---------- ---------- ----------- ----------- ---------- ----------
  KJBLPKEY
----------
00007F94CA22A288        451          2 0000000089FB2618 KJUSEREX  KJUSERNL
         0 00               [0x15c91][0x1],[BL][ext 0x0,0x
89233,1,BL                              0
GRANTED                                                                    0
           0          1          0           0           0          0          0
     76896

SQL> SELECT r.* FROM x$kjbr r WHERE r.kjbrname LIKE '%[0x15c91][0x1],[BL]%';

no rows selected

Instance 1 session B:

SQL>  SELECT r.* FROM x$kjbr r WHERE r.kjbrname LIKE '%[0x15c91][0x1],[BL]%';

ADDR                   INDX    INST_ID KJBRRESP         KJBRGRANT KJBRNCVL
---------------- ---------- ---------- ---------------- --------- ---------
  KJBRROLE KJBRNAME                       KJBRMASTER KJBRGRANTQ
---------- ------------------------------ ---------- ----------------
KJBRCVTQ         KJBRWRITER          KJBRSID KJBRRDOMID   KJBRPKEY
---------------- ---------------- ---------- ---------- ----------
00007F801ACA68F8       1355          1 00000000B5A62AE0 KJUSEREX  KJUSERNL
         0 [0x15c91][0x1],[BL][ext 0x0,0x          0 00000000B48BB330
00               00                        0          0      76896

 

 

接着我们将在Instance 1上查询block: (1/89233),这应当会引发Instance 2 build cr block 并传输给Instance 1, 为了进一步了解其细节 我们分别对 Instance 1的 Foreground Process 和 Instance 2的LMS进程做详细的RAC  TRACE:

 

Instance 2:

[oracle@vrh2 ~]$ ps -ef|grep ora_lms|grep -v grep
oracle   23364     1  0 Apr29 ?        00:33:15 ora_lms0_VPROD2

SQL> oradebug setospid 23364
Oracle pid: 13, Unix process pid: 23364, image: oracle@vrh2.oracle.com (LMS0)

SQL> oradebug event 10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high;
Statement processed.

SQL> oradebug tracefile_name
/s01/orabase/diag/rdbms/vprod/VPROD2/trace/VPROD2_lms0_23364.trc

Instance 1 session B :

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1756658
         3    1756661
         3    1755287

Instance 1 session A :

SQL> alter session set events '10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high';

Session altered.

SQL> select * from test;

        ID
----------
         2
         2

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1761520

 

从x$BH的信息来看,如我们所预期的Instance 1收到了build好的CR block,进一步分析 TRACE 文件:

 

Instance 1 foreground Process:

PARSING IN CURSOR #140336527348792 len=18 dep=0 uid=0 oct=3 lid=0 tim=1335939136125254 hv=1689401402 ad='b1a4c828' sqlid='c99yw1xkb4f1u'
select * from test
END OF STMT
PARSE #140336527348792:c=2999,e=2860,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=1357081020,tim=1335939136125253
EXEC #140336527348792:c=0,e=40,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939136125373
WAIT #140336527348792: nam='SQL*Net message to client' ela= 6 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335939136125420

*** 2012-05-02 02:12:16.125
kclscrs: req=0 block=1/89233
2012-05-02 02:12:16.125574 : kjbcro[0x15c91.1 76896.0][4]

*** 2012-05-02 02:12:16.125
kclscrs: req=0 typ=nowait-abort

*** 2012-05-02 02:12:16.125
kclscrs: bid=1:3:1:0:f:1e:0:0:10:0:0:0:1:2:4:1:20:0:0:0:c3:49:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:1c:0:4d:26:a3:52:0:0:0:0:c7:c:ca:62:c3:49:0:0:0:0:1:0:14:8e:47:76:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:99:ed:0:0:0:0:0:0:10:0:0:0
2012-05-02 02:12:16.125718 : kjbcro[0x15c91.1 76896.0][4]
2012-05-02 02:12:16.125751 : GSIPC:GMBQ: buff 0xba0ee018, queue 0xbb79a7b8, pool 0x60013fa0, freeq 0, nxt 0xbb79a7b8, prv 0xbb79a7b8
2012-05-02 02:12:16.125780 : kjbsentscn[0x0.1ae0f0][to 2]
2012-05-02 02:12:16.125806 : GSIPC:SENDM: send msg 0xba0ee088 dest x20001 seq 177740 type 36 tkts xff0000 mlen x1680198
2012-05-02 02:12:16.125918 : kjbmscr(0x15c91.1)reqid=0x8(req 0xa4ff30f8)(rinst 1)hldr 2(infosz 200)(lseq x2b8)
2012-05-02 02:12:16.126959 : GSIPC:KSXPCB: msg 0xba0ee088 status 30, type 36, dest 2, rcvr 1

*** 2012-05-02 02:12:16.127
kclwcrs: wait=0 tm=1233

*** 2012-05-02 02:12:16.127
kclwcrs: got 1 blocks from ksxprcv
WAIT #140336527348792: nam='gc cr block 2-way' ela= 1233 p1=1 p2=89233 p3=1 obj#=76896 tim=1335939136127199
2012-05-02 02:12:16.127272 : kjbcrcomplete[0x15c91.1 76896.0][0]
2012-05-02 02:12:16.127309 : kjbrcvdscn[0x0.1ae0f0][from 2][idx 2012-05-02 02:12:16.127329 : kjbrcvdscn[no bscn <= rscn 0x0.1ae0f0][from 2]

 

 

前台进程 kjbcro[0x15c91.1 76896.0][4] kjbsentscn[0x0.1ae0f0][to 2] 向Instance 2请求SCN=1ae0f0=1761520的 block: (1/89233),并进入’gc cr block 2-way’ 等待,之后成功收到请求的CR block。

 

Instance 2 LMS TRACE

 

2012-05-02 02:12:15.634057 : GSIPC:RCVD: ksxp msg 0x7f16e1598588 sndr 1 seq 0.177740 type 36 tkts 0
2012-05-02 02:12:15.634094 : GSIPC:RCVD: watq msg 0x7f16e1598588 sndr 1, seq 177740, type 36, tkts 0
2012-05-02 02:12:15.634108 : GSIPC:TKT: collect msg 0x7f16e1598588 from 1 for rcvr -1, tickets 0
2012-05-02 02:12:15.634162 : kjbrcvdscn[0x0.1ae0f0][from 1][idx 2012-05-02 02:12:15.634186 : kjbrcvdscn[no bscn1, wm 32768, RMno 0, reminc 18, dom 0]
   kjga st 0x4, step 0.0.0, cinc 20, rmno 6, flags 0x0
   lb 0, hb 0, myb 15250, drmb 15250, apifrz 0
 GCS CLIENT END
2012-05-02 02:12:15.635211 : kjbdowncvt[0x15c91.1 76896.0][1][options x0]
2012-05-02 02:12:15.635230 : GSIPC:AMBUF: rcv buff 0x7f16e1c56420, pool rcvbuf, rqlen 1103
2012-05-02 02:12:15.635308 : GSIPC:GPBMSG: new bmsg 0x7f16e1c56490 mb 0x7f16e1c56420 msg 0x7f16e1c564b0 mlen 152 dest x101 flushsz -1
2012-05-02 02:12:15.635334 : kjbmslset(0x15c91.1)) seq 0x4 reqid=0x6 (shadow 0xb48bb330.xb)(rsn 2)(mas@1)
2012-05-02 02:12:15.635355 : GSIPC:SPBMSG: send bmsg 0x7f16e1c56490 blen 184 msg 0x7f16e1c564b0 mtype 33 attr|dest x30101 bsz|fsz x1ffff
2012-05-02 02:12:15.635377 : GSIPC:SNDQ: enq msg 0x7f16e1c56490, type 65521 seq 118669, inst 1, receiver 1, queued 1

*** 2012-05-02 02:12:15.635
kclccctx: cleanup copy 0x7f16e1d94798
2012-05-02 02:12:15.635479 : [kjmpmsgi:compl][type 36][msg 0x7f16e1598588][seq 177740.0][qtime 0][ptime 1257]
2012-05-02 02:12:15.635511 : GSIPC:BSEND: flushing sndq 0xb491dd28, id 1, dcx 0xbc516778, inst 1, rcvr 1  qlen 0 1
2012-05-02 02:12:15.635536 : GSIPC:BSEND: no batch1 msg 0x7f16e1c56490 type 65521 len 184 dest (1:1)
2012-05-02 02:12:15.635557 : kjbsentscn[0x0.1ae0f1][to 1]
2012-05-02 02:12:15.635578 : GSIPC:SENDM: send msg 0x7f16e1c56490 dest x10001 seq 118669 type 65521 tkts x10002 mlen xb800e8
WAIT #0: nam='gcs remote message' ela= 180 waittime=1 poll=0 event=0 obj#=0 tim=1335939135635819
2012-05-02 02:12:15.635853 : GSIPC:RCVD: ksxp msg 0x7f16e167e0b0 sndr 1 seq 0.177741 type 32 tkts 0
2012-05-02 02:12:15.635875 : GSIPC:RCVD: watq msg 0x7f16e167e0b0 sndr 1, seq 177741, type 32, tkts 0
2012-05-02 02:12:15.636012 : GSIPC:TKT: collect msg 0x7f16e167e0b0 from 1 for rcvr -1, tickets 0
2012-05-02 02:12:15.636040 : kjbrcvdscn[0x0.1ae0f1][from 1][idx 2012-05-02 02:12:15.636060 : kjbrcvdscn[no bscn <= rscn 0x0.1ae0f1][from 1]
2012-05-02 02:12:15.636082 : GSIPC:TKT: dest (1:1) rtkt not acked 1  unassigned bufs 0  tkts 0  newbufs 0
2012-05-02 02:12:15.636102 : GSIPC:TKT: remove ctx dest (1:1)
2012-05-02 02:12:15.636125 : [kjmxmpm][type 32][seq 0.177741][msg 0x7f16e167e0b0][from 1]
2012-05-02 02:12:15.636146 : kjbmpocr(0xb0.6)seq 0x1,reqid=0x23a,(client 0x9fff7b58,0x1)(from 1)(lseq xdf0)

 

2号实例上LMS进程的情况是这样的 收到gcs remote message GSIPC 关于要求SCN=[0x0.1ae0f0] block=1/89233的请求,进入BAST kjbmpbast(0x15c91.1),因为 block=1/89233已被写入到磁盘 触发fairness算法(在11.2.0.3中默认_fairness_threshold=2),对current block做KCL: F156: fairness downconvert,从Xcurrent DownConvert为 Scurrent:

 

Instance 2:
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

STATE CR_SCN_BAS
---------- ----------
2 0
3 1756658

 

之后Instance 2 LMS 将cr block加入到 kjbmslset(0x15c91.1)) 传送队列SEND QUEUE GSIPC:SNDQ: enq msg 0x7f16e1c56490。

 

接着我们再次在Instance 1上运行对 block: (1/89233)的查询 看会发生什么:

 

Instance 2:

SQL> select CURRENT_RESULTS,LIGHT_WORKS from v$cr_block_server;

CURRENT_RESULTS LIGHT_WORKS
--------------- -----------
          29273         437

Instance 1 session A:

SQL>
SQL> select * from test;

        ID
----------
         2
         2

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         3    1761942
         3    1761932
         1          0
         3    1761520

Instance 2:

SQL> select CURRENT_RESULTS,LIGHT_WORKS from v$cr_block_server;

CURRENT_RESULTS LIGHT_WORKS
--------------- -----------
          29274         437

select * from test
END OF STMT
PARSE #140336529675592:c=0,e=337,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939668940051
EXEC #140336529675592:c=0,e=96,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939668940204
WAIT #140336529675592: nam='SQL*Net message to client' ela= 5 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335939668940348

*** 2012-05-02 02:21:08.940
kclscrs: req=0 block=1/89233
2012-05-02 02:21:08.940676 : kjbcro[0x15c91.1 76896.0][5]

*** 2012-05-02 02:21:08.940
kclscrs: req=0 typ=nowait-abort

*** 2012-05-02 02:21:08.940
kclscrs: bid=1:3:1:0:f:21:0:0:10:0:0:0:1:2:4:1:20:0:0:0:c3:49:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:1f:0:4d:26:a3:52:0:0:0:0:c7:c:ca:62:c3:49:0:0:0:0:1:0:17:8e:47:76:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:99:ed:0:0:0:0:0:0:10:0:0:0
2012-05-02 02:21:08.940799 : kjbcro[0x15c91.1 76896.0][5]
2012-05-02 02:21:08.940833 : GSIPC:GMBQ: buff 0xba0ee018, queue 0xbb79a7b8, pool 0x60013fa0, freeq 0, nxt 0xbb79a7b8, prv 0xbb79a7b8
2012-05-02 02:21:08.940859 : kjbsentscn[0x0.1ae28c][to 2]
2012-05-02 02:21:08.940870 : GSIPC:SENDM: send msg 0xba0ee088 dest x20001 seq 177810 type 36 tkts xff0000 mlen x1680198
2012-05-02 02:21:08.940976 : kjbmscr(0x15c91.1)reqid=0xa(req 0xa4ff30f8)(rinst 1)hldr 2(infosz 200)(lseq x2b8)
2012-05-02 02:21:08.941314 : GSIPC:KSXPCB: msg 0xba0ee088 status 30, type 36, dest 2, rcvr 1

*** 2012-05-02 02:21:08.941
kclwcrs: wait=0 tm=707

*** 2012-05-02 02:21:08.941
kclwcrs: got 1 blocks from ksxprcv
2012-05-02 02:21:08.941818 : kjbassume[0x15c91.1][sender 2][mymode x1][myrole x0][srole x0][flgs x0][spiscn 0x0.0][swscn 0x0.0]
2012-05-02 02:21:08.941852 : kjbrcvdscn[0x0.1ae28d][from 2][idx 2012-05-02 02:21:08.941871 : kjbrcvdscn[no bscn

 

前台进程与上一次类似申请的是SCN=[0x0.1ae28c]=1761932 Version的CR block, 但这一次receive的居然是Xcurrent Block且其SCN=1ae28d=1761933,Instance 1收到了Xcurrent Block后自己build出了查询所需要的SCN=1761932的CR BLOCK, 因为实际收到的是Current block,所以等待事件变成了'gc current block 2-way'。

这里可以看到前台进程并没有request current block,而仍是使用kjbcro;追根述源是Instance 2的LMS进程给它传送了Current Block:

 

Instance 2 LMS trace:

2012-05-02 02:21:08.448743 : GSIPC:RCVD: ksxp msg 0x7f16e14a4398 sndr 1 seq 0.177810 type 36 tkts 0
2012-05-02 02:21:08.448778 : GSIPC:RCVD: watq msg 0x7f16e14a4398 sndr 1, seq 177810, type 36, tkts 0
2012-05-02 02:21:08.448798 : GSIPC:TKT: collect msg 0x7f16e14a4398 from 1 for rcvr -1, tickets 0
2012-05-02 02:21:08.448816 : kjbrcvdscn[0x0.1ae28c][from 1][idx 2012-05-02 02:21:08.448834 : kjbrcvdscn[no bscn <= rscn 0x0.1ae28c][from 1]
2012-05-02 02:21:08.448857 : GSIPC:TKT: dest (1:1) rtkt not acked 2  unassigned bufs 0  tkts 0  newbufs 0
2012-05-02 02:21:08.448875 : GSIPC:TKT: remove ctx dest (1:1)
2012-05-02 02:21:08.448970 : [kjmxmpm][type 36][seq 0.177810][msg 0x7f16e14a4398][from 1]
2012-05-02 02:21:08.448993 : kjbmpbast(0x15c91.1) reqid=0x6 (req 0xa4ff30f8)(reqinst 1)(reqid 10)(flags x0)

*** 2012-05-02 02:21:08.449
kclcrrf: req=48054 block=1/89233

*** 2012-05-02 02:21:08.449
kcl_compress_block: compressed: 6 free space: 7680
2012-05-02 02:21:08.449085 : kjbsentscn[0x0.1ae28d][to 1]
2012-05-02 02:21:08.449142 : kjbdeliver[to 1][0xa4ff30f8][10][current 1]
2012-05-02 02:21:08.449164 : kjbmssch(reqlock 0xa4ff30f8,10)(to 1)(bsz 344)
2012-05-02 02:21:08.449183 : GSIPC:AMBUF: rcv buff 0x7f16e18bcec8, pool rcvbuf, rqlen 1102

*** 2012-05-02 02:21:08.449
kclccctx: cleanup copy 0x7f16e1d94838

*** 2012-05-02 02:21:08.449
kcltouched: touch seconds 3271

*** 2012-05-02 02:21:08.449
kclgrantlk: req=48054
2012-05-02 02:21:08.449347 : [kjmpmsgi:compl][type 36][msg 0x7f16e14a4398][seq 177810.0][qtime 0][ptime 1119]
WAIT #0: nam='gcs remote message' ela= 568 waittime=1 poll=0 event=0 obj#=0 tim=1335939668449962
2012-05-02 02:21:08.450001 : GSIPC:RCVD: ksxp msg 0x7f16e1bb22a0 sndr 1 seq 0.177811 type 32 tkts 0
2012-05-02 02:21:08.450024 : GSIPC:RCVD: watq msg 0x7f16e1bb22a0 sndr 1, seq 177811, type 32, tkts 0
2012-05-02 02:21:08.450043 : GSIPC:TKT: collect msg 0x7f16e1bb22a0 from 1 for rcvr -1, tickets 0
2012-05-02 02:21:08.450060 : kjbrcvdscn[0x0.1ae28e][from 1][idx 2012-05-02 02:21:08.450078 : kjbrcvdscn[no bscn <= rscn 0x0.1ae28e][from 1]
2012-05-02 02:21:08.450097 : GSIPC:TKT: dest (1:1) rtkt not acked 3  unassigned bufs 0  tkts 0  newbufs 0
2012-05-02 02:21:08.450116 : GSIPC:TKT: remove ctx dest (1:1)
2012-05-02 02:21:08.450136 : [kjmxmpm][type 32][seq 0.177811][msg 0x7f16e1bb22a0][from 1]
2012-05-02 02:21:08.450155 : kjbmpocr(0xb0.6)seq 0x1,reqid=0x23e,(client 0x9fff7b58,0x1)(from 1)(lseq xdf4)

 

 

为什么Instance 2上的LMS要偷懒,不构造build cr block,而直接传输给Instance 1自己所有的Current Block呢?通过观察Instance 2上的v$cr_block_server视图可以发现LIGHT_WORKS字段在发生current block transfer后并没有增长,说明并没有触发 CR server的 Light Work Rule(Light Work Rule是8i Cr Server中就存在的优化算法,当Remote LMS创建 build CR涉及的工作过多时,resource holder的LMS会选择传输现有block,而不完成CR build If creating the consistent read version block involves too much work (such as reading blocks from disk), then the holder sends the block to the requestor, and the requestor completes the CR fabrication. The holder maintains a fairness counter of CR requests. After the fairness threshold is reached, the holder downgrades it to lock mode.)。

 

到底何种条件下 CR Request 才会收到Current Block呢?

答案是:针对不同种类class的block,CR server有不同的做法。 对于undo block或者 undo header block的CR quest, LMS总是传输Current Block, 这是为了给 远程节点 可能正在进行的 block cleanout、 CR  Version构造提供便利。

 

对于普通的数据块 data blocks, 默认情况下总是 CR quest  & CR received的(排除之前所说的Light Work Rule,LMS"偷懒"), 除非Current Block已经DownConvert降级为S lock,那么LMS进程会宣称直接ship一个current version的block。

 

为了证明这一点 , 我们再次测试 ,这次我们将控制DownConvert的隐藏参数”_fairness_threshold“提高到200,这将造成Xcurrent Block无法降级为Scurrent, 也将导致LMS更少地传输Current Version的Data Block:

 

 

 

SQL> show parameter fair

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_fairness_threshold                  integer     200

Instance 1:

SQL>  update test set id=id+1 where id=4;

1 row updated.

Instance 2:

SQL>  update test set id=id+1 where id=2;

1 row updated.

SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1838166

在Instance 1上 不断查询,并 通过instance 2的 v$cr_block_server验证

instance 1
SQL> select * from test;

        ID
----------
        10
         3

instance 2:

SQL>  select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1883707
         8          0

SQL> select * from test;

        ID
----------
        10
         3

SQL>  select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;

     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1883707
         8          0

...................

SQL> /

     STATE CR_SCN_BAS
---------- ----------
         2          0
         3    1883707
         3    1883695

repeat cr request on Instance 1

SQL> /

     STATE CR_SCN_BAS
---------- ----------
         8          0
         3    1883707
         3    1883695

 

 

实际测试发现_fairness_threshold似乎存在某个上限,虽然设置为200 但是在不多的几次CR serve后就Downgrade了lock, 这造成对data block的 CR Request最终还是Receive了 Current Block。