Know more _high_priority_processes


What about LMS process priority? In release 9i, raising the LMS process priority to RT or FX (scheduling classes with larger CPU quanta) helps. Oracle identified this potential issue, and in release 10g it is addressed: LMS processes run with RT priority, which alleviates many issues with global cache transfer.
Two parameters control this behaviour:
a. _high_priority_processes: Controls which processes get higher priority; v$parameter describes it as “High priority process name mask”, with a default value of LMS*.
b. _os_sched_high_priority: Controls whether OS scheduling high priority is enabled; defaults to 1.
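Judging by the “High priority process name mask” description and the default value, the parameter appears to be a '|'-separated list of glob patterns matched against background process names. This interpretation is an assumption, sketched below:

```python
from fnmatch import fnmatch

def matches_mask(process_name, mask="LMS*|VKTM"):
    """Check a background process name against a '|'-separated list of
    glob patterns, the apparent format of _high_priority_processes."""
    return any(fnmatch(process_name, pat) for pat in mask.split("|"))

# LMS0, LMS1, ... match LMS*; VKTM matches literally; DBW0 matches neither.
print(matches_mask("LMS0"))  # True
print(matches_mask("VKTM"))  # True
print(matches_mask("DBW0"))  # False
```

With the 11g default of "LMS*|VKTM", only the LMSn and VKTM processes would be selected for priority elevation.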


 

In Oracle 11g, it is NOT RECOMMENDED to use OS facilities like “chrt” or “priocntl” to manually manipulate the scheduling classes of Oracle background processes. It is extremely important that Oracle can keep track of which processes are running in RT and which are not. Process scheduling classes should only be manipulated using the _HIGH_PRIORITY_PROCESSES parameter, as described below.

 

_HIGH_PRIORITY_PROCESSES is a new underscore parameter that specifies which processes should run in
RT class. The default value is “LMS*|VKTM”, meaning that the LMSn and VKTM processes would
automatically run in RT at instance startup. If, for some reason, you decide to run your LMSn processes in the time-sharing scheduling class (TS), change this parameter value to “VKTM”, so that Oracle manipulates the scheduling class only for VKTM.
_OS_SCHED_HIGH_PRIORITY is another parameter related to process scheduling class manipulation. The
default value is 1, and changing this to 0 means that all scheduling class manipulation will be disabled,
regardless of the value of _HIGH_PRIORITY_PROCESSES. We strongly recommend leaving this parameter set to 1 and manipulating scheduling classes only through _HIGH_PRIORITY_PROCESSES.
If everything is configured correctly, messages like the following, indicating that VKTM and LMSn are running in the RT class, will appear in the alert log. Alternatively, you can use the “ps” command to see which scheduling class each process is running in.
VKTM started with pid=3, OS id=14839 at elevated priority
LMS0 started with pid=10, OS id=14857 at elevated priority

 

If you do not see these messages, Oracle has failed to manipulate the scheduling class for some reason. To
understand the cause of the failure, check the trace file for each process to see if there is an error. A common
cause of failure would be incorrect permissions on the oradism binary.
ksesethighpri:(ksb.c:3489) Failed to elevate LMS0’s priority from 0 to 1
Error Info: Category(-2), Opname(skgdism_create), Loc(sp.c:1612),\
ErrMsg(Error 0) Dism(-1073762366)
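On Unix platforms the priority change is performed via the oradism binary, which normally needs the setuid bit (mode 4750, owner root, as in -rwsr-x---). A minimal sketch of checking a file mode for the setuid bit; the $GRID_HOME/bin/oradism path in the comment is illustrative, not quoted from this article:

```python
import stat

def has_setuid(mode):
    """Return True if the setuid bit is set in a file mode word,
    e.g. the mode of $GRID_HOME/bin/oradism obtained via os.stat()."""
    return bool(mode & stat.S_ISUID)

# 0o104750: regular file, permissions 4750 (setuid root, rwsr-x---).
print(has_setuid(0o104750))  # True
# 0o100755: ordinary 755 binary, setuid not set.
print(has_setuid(0o100755))  # False
```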
Increasing the priority of other, non-RAC-related background processes such as DBWR or LGWR is generally not advised unless it is done in a benchmark environment.

Recovering the ASM Diskgroup Holding OCR and Votedisk in 11.2 RAC

Some readers in my Oracle Allstars group discussed the problem of an 11gR2 RAC cluster failing to start after the ASM diskgroup containing the OCR and votedisk was lost. I first described how to start the 11.2 ASM instance and operate on it without starting CRS in the article "Changing the ASM DISK Path in 11gR2 RAC"; the same command, "crsctl start crs -excl -nocrs", is needed here as well.

 


 

Note that the recovery procedure below also works when only the OCR or only the votedisk stored in ASM is lost. In 11.2 the OCR and votedisk are commonly kept in ASM, and ASM startup in turn depends on them, so losing either one prevents the cluster from starting normally. Here we only discuss how to bring CRS back up; if the lost diskgroup also contained database data, recovering that data is outside the scope of this article.

 

Prerequisites: you still have the same number of ASM LUN disks as before the failure, and you have an automatic OCR backup. By default the OCR is backed up automatically every 4 hours, so as long as you have not deleted $GI_HOME a backup is usually available. No votedisk backup is required.

 

Recovery scenario: use the dd command to wipe the header of the diskgroup containing the OCR and votedisk, simulating diskgroup corruption:

 

 

1. Check the votedisk and OCR backups
[root@vrh1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   a853d6204bbc4feabfd8c73d4c3b3001 (/dev/asm-diskh) [SYSTEMDG]
2. ONLINE   a5b37704c3574f0fbf21d1d9f58c4a6b (/dev/asm-diskg) [SYSTEMDG]
3. ONLINE   36e5c51ff0294fc3bf2a042266650331 (/dev/asm-diski) [SYSTEMDG]
4. ONLINE   af337d1512824fe4bf6ad45283517aaa (/dev/asm-diskj) [SYSTEMDG]
5. ONLINE   3c4a349e2e304ff6bf64b2b1c9d9cf5d (/dev/asm-diskk) [SYSTEMDG]
Located 5 voting disk(s).
su - grid
[grid@vrh1 ~]$ ocrconfig -showbackup
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy
vrh1     2012/08/09 01:59:56     /g01/11.2.0/maclean/grid/cdata/vrh-cluster/backup00.ocr
vrh1     2012/08/08 21:59:56     /g01/11.2.0/maclean/grid/cdata/vrh-cluster/backup01.ocr
vrh1     2012/08/08 17:59:55     /g01/11.2.0/maclean/grid/cdata/vrh-cluster/backup02.ocr
vrh1     2012/08/08 05:59:54     /g01/11.2.0/grid/cdata/vrh-cluster/day.ocr
vrh1     2012/08/08 05:59:54     /g01/11.2.0/grid/cdata/vrh-cluster/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available
2. Completely shut down clusterware and OHASD on all nodes
crsctl stop has -f
3. GetAsmDH.sh ==> GetAsmDH.sh is a script that backs up the ASM disk headers.
Get into the good habit of backing up the ASM headers before any risky operation.
[grid@vrh1 ~]$ ./GetAsmDH.sh 
############################################
1) Collecting Information About the Disks:
############################################
SQL*Plus: Release 11.2.0.3.0 Production on Thu Aug 9 03:28:13 2012
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
SQL> Connected.
SQL> SQL> SQL> SQL> SQL> SQL> SQL> 
1           0 /dev/asm-diske
1           1 /dev/asm-diskd
2           0 /dev/asm-diskb
2           1 /dev/asm-diskc
2           2 /dev/asm-diskf
3           0 /dev/asm-diskh
3           1 /dev/asm-diskg
3           2 /dev/asm-diski
3           3 /dev/asm-diskj
3           4 /dev/asm-diskk
SQL> SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
-rw-r--r-- 1 grid oinstall 1048 Aug  9 03:28 /tmp/HC/asmdisks.lst
############################################
2) Generating asm_diskh.sh script.
############################################
-rwx------ 1 grid oinstall 666 Aug  9 03:28 /tmp/HC/asm_diskh.sh
############################################
3) Executing  asm_diskh.sh script to 
generate dd  dumps.
############################################
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_1_0.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_1_1.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_2_0.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_2_1.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_2_2.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_3_0.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_3_1.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_3_2.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_3_3.dd
-rw-r--r-- 1 grid oinstall 1048576 Aug  9 03:28 /tmp/HC/dsk_3_4.dd
############################################
4) Compressing dd dumps in the next format:
(asm_dd_header_all_.tar)
############################################
/tmp/HC/dsk_1_0.dd
/tmp/HC/dsk_1_1.dd
/tmp/HC/dsk_2_0.dd
/tmp/HC/dsk_2_1.dd
/tmp/HC/dsk_2_2.dd
/tmp/HC/dsk_3_0.dd
/tmp/HC/dsk_3_1.dd
/tmp/HC/dsk_3_2.dd
/tmp/HC/dsk_3_3.dd
/tmp/HC/dsk_3_4.dd
./GetAsmDH.sh: line 81: compress: command not found
ls: /tmp/HC/*.Z: No such file or directory
[grid@vrh1 ~]$
4. Use dd to corrupt the diskgroup containing the OCR and votedisk
[root@vrh1 ~]# dd if=/dev/zero of=/dev/asm-diskh bs=1024k count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00423853 seconds, 247 MB/s
[root@vrh1 ~]# dd if=/dev/zero of=/dev/asm-diskg bs=1024k count=1 
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0045179 seconds, 232 MB/s
[root@vrh1 ~]# dd if=/dev/zero of=/dev/asm-diski bs=1024k count=1 
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00469976 seconds, 223 MB/s
[root@vrh1 ~]# dd if=/dev/zero of=/dev/asm-diskj bs=1024k count=1 
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00344262 seconds, 305 MB/s
[root@vrh1 ~]# dd if=/dev/zero of=/dev/asm-diskk bs=1024k count=1 
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0053518 seconds, 196 MB/s
5. Try restarting HAS on one node
[root@vrh1 ~]# crsctl start has
CRS-4123: Oracle High Availability Services has been started.

 

 

 

However, because the diskgroup containing the OCR and votedisk is lost, CSS cannot start normally, as the following logs show:

 

 

alertvrh1.log 
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/vrh1/cssd/ocssd.log
2012-08-09 03:35:41.207
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/vrh1/cssd/ocssd.log
2012-08-09 03:35:56.240
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/vrh1/cssd/ocssd.log
2012-08-09 03:36:11.284
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/vrh1/cssd/ocssd.log
2012-08-09 03:36:26.305
[cssd(5162)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /g01/11.2.0/grid/log/vrh1/cssd/ocssd.log
2012-08-09 03:36:41.328
ocssd.log
2012-08-09 03:40:26.662: [    CSSD][1078700352]clssnmReadDiscoveryProfile: voting file discovery string(/dev/asm*)
2012-08-09 03:40:26.662: [    CSSD][1078700352]clssnmvDDiscThread: using discovery string /dev/asm* for initial discovery
2012-08-09 03:40:26.662: [   SKGFD][1078700352]Discovery with str:/dev/asm*:
2012-08-09 03:40:26.662: [   SKGFD][1078700352]UFS discovery with :/dev/asm*:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskf:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskb:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskj:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskh:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskc:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskd:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diske:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskg:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diski:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Fetching UFS disk :/dev/asm-diskk:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]OSS discovery with :/dev/asm*:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Handle 0xdf22a0 from lib :UFS:: for disk :/dev/asm-diskf:
2012-08-09 03:40:26.665: [   SKGFD][1078700352]Handle 0xf412a0 from lib :UFS:: for disk :/dev/asm-diskb:
2012-08-09 03:40:26.666: [   SKGFD][1078700352]Handle 0xf3a680 from lib :UFS:: for disk :/dev/asm-diskj:
2012-08-09 03:40:26.666: [   SKGFD][1078700352]Handle 0xf93da0 from lib :UFS:: for disk :/dev/asm-diskh:
2012-08-09 03:40:26.667: [    CSSD][1078700352]clssnmvDiskVerify: Successful discovery of 0 disks
2012-08-09 03:40:26.667: [    CSSD][1078700352]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2012-08-09 03:40:26.667: [    CSSD][1078700352]clssnmvFindInitialConfigs: No voting files found
2012-08-09 03:40:26.667: [    CSSD][1078700352](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds

 

 

 

The formal steps to recover the diskgroup containing the OCR and votedisk are as follows:

 

 

1. Start the cluster with -excl -nocrs, which starts the ASM instance but not CRS:

 [root@vrh1 vrh1]# crsctl start crs -excl -nocrs 
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'vrh1'
CRS-2676: Start of 'ora.mdnsd' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'vrh1'
CRS-2676: Start of 'ora.gpnpd' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'vrh1'
CRS-2672: Attempting to start 'ora.gipcd' on 'vrh1'
CRS-2676: Start of 'ora.cssdmonitor' on 'vrh1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'vrh1'
CRS-2672: Attempting to start 'ora.diskmon' on 'vrh1'
CRS-2676: Start of 'ora.diskmon' on 'vrh1' succeeded
CRS-2676: Start of 'ora.cssd' on 'vrh1' succeeded
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'vrh1'
CRS-2672: Attempting to start 'ora.ctssd' on 'vrh1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'vrh1'
CRS-2676: Start of 'ora.ctssd' on 'vrh1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'vrh1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'vrh1'
CRS-2676: Start of 'ora.asm' on 'vrh1' succeeded

 

 

 

2. Recreate the diskgroup that held the OCR and votedisk; note that compatible.asm must be 11.2:

 

 

[root@vrh1 vrh1]# su - grid
[grid@vrh1 ~]$ sqlplus  / as sysasm
SQL*Plus: Release 11.2.0.3.0 Production on Thu Aug 9 04:16:58 2012
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create diskgroup systemdg high redundancy disk '/dev/asm-diskh','/dev/asm-diskg','/dev/asm-diski','/dev/asm-diskj','/dev/asm-diskk'
ATTRIBUTE 'compatible.rdbms' = '11.2', 'compatible.asm' = '11.2';
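A high-redundancy diskgroup lets clusterware place five voting files, of which a strict majority must remain accessible for CSS to stay up. The quorum arithmetic is simply:

```python
def votes_needed(voting_files):
    """Minimum number of voting files CSS must access: a strict majority."""
    return voting_files // 2 + 1

def tolerable_losses(voting_files):
    """How many voting files can be lost while quorum survives."""
    return voting_files - votes_needed(voting_files)

# High redundancy: 5 voting files, 3 required, 2 may be lost.
print(votes_needed(5), tolerable_losses(5))  # 3 2
# Normal redundancy: 3 voting files, 2 required, 1 may be lost.
print(votes_needed(3), tolerable_losses(3))  # 2 1
```

This is why the earlier "crsctl query css votedisk" output listed five voting disks in the high-redundancy SYSTEMDG diskgroup.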

 

 

 

3. Restore the OCR from an OCR backup and verify it with ocrcheck:

 

 

[root@vrh1 ~]# ocrconfig -restore /g01/11.2.0/grid/cdata/vrh-cluster/backup00.ocr
[root@vrh1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       3180
Available space (kbytes) :     258940
ID                       : 1238458014
Device/File Name         :  +systemdg
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded

 

 

 

4. Prepare to recover the votedisk; you may hit the following error:

 

[grid@vrh1 ~]$ crsctl replace votedisk  +SYSTEMDG
CRS-4602: Failed 27 to add voting file 2e4e0fe285924f86bf5473d00dcc0388.
CRS-4602: Failed 27 to add voting file 4fa54bb0cc5c4fafbf1a9be5479bf389.
CRS-4602: Failed 27 to add voting file a109ead9ea4e4f28bfe233188623616a.
CRS-4602: Failed 27 to add voting file 042c9fbd71b54f5abfcd3ab3408f3cf3.
CRS-4602: Failed 27 to add voting file 7b5a8cd24f954fafbf835ad78615763f.
Failed to replace voting disk group with +SYSTEMDG.
CRS-4000: Command Replace failed, or completed with errors.

 

 

You need to reconfigure the ASM parameters and restart ASM:

 

 

SQL> alter system set asm_diskstring='/dev/asm*';
System altered.
SQL> create spfile from memory;
File created.
SQL>  startup force mount;
ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance
ASM instance started
Total System Global Area  283930624 bytes
Fixed Size                  2227664 bytes
Variable Size             256537136 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
SQL> show parameter spfile
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      /g01/11.2.0/grid/dbs/spfile+AS
M1.ora
[grid@vrh1 trace]$  crsctl replace votedisk  +SYSTEMDG
CRS-4256: Updating the profile
Successful addition of voting disk 85edc0e82d274f78bfc58cdc73b8c68a.
Successful addition of voting disk 201ffffc8ba44faabfe2efec2aa75840.
Successful addition of voting disk 6f2a25c589964faabf6980f7c5f621ce.
Successful addition of voting disk 93eb315648454f25bf3717df1a2c73d5.
Successful addition of voting disk 3737240678964f88bfbfbd31d8b3829f.
Successfully replaced voting disk group with +SYSTEMDG.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced

 

 

 

 

5. Restart the HAS service and verify that the cluster is healthy:

 

 

[root@vrh1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[root@vrh1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   85edc0e82d274f78bfc58cdc73b8c68a (/dev/asm-diskh) [SYSTEMDG]
2. ONLINE   201ffffc8ba44faabfe2efec2aa75840 (/dev/asm-diskg) [SYSTEMDG]
3. ONLINE   6f2a25c589964faabf6980f7c5f621ce (/dev/asm-diski) [SYSTEMDG]
4. ONLINE   93eb315648454f25bf3717df1a2c73d5 (/dev/asm-diskj) [SYSTEMDG]
5. ONLINE   3737240678964f88bfbfbd31d8b3829f (/dev/asm-diskk) [SYSTEMDG]
Located 5 voting disk(s).
[root@vrh1 ~]# crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.BACKUPDG.dg
ONLINE  ONLINE       vrh1                                         
ora.DATA.dg
ONLINE  ONLINE       vrh1                                         
ora.LISTENER.lsnr
ONLINE  ONLINE       vrh1                                         
ora.LSN_MACLEAN.lsnr
ONLINE  ONLINE       vrh1                                         
ora.SYSTEMDG.dg
ONLINE  ONLINE       vrh1                                         
ora.asm
ONLINE  ONLINE       vrh1                     Started             
ora.gsd
OFFLINE OFFLINE      vrh1                                         
ora.net1.network
ONLINE  ONLINE       vrh1                                         
ora.ons
ONLINE  ONLINE       vrh1                                         
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1        ONLINE  ONLINE       vrh1                                         
ora.cvu
1        OFFLINE OFFLINE                                                   
ora.oc4j
1        OFFLINE OFFLINE                                                   
ora.scan1.vip
1        ONLINE  ONLINE       vrh1                                         
ora.vprod.db
1        ONLINE  OFFLINE                                                   
2        ONLINE  OFFLINE                                                   
ora.vrh1.vip
1        ONLINE  ONLINE       vrh1                                         
ora.vrh2.vip
1        ONLINE  INTERMEDIATE vrh1                     FAILED OVER

SQL*Plus Finally Gets a Top Command

Jagjeet Singh has given us a RAC dashboard for SQL*Plus, a real-time performance tool whose display closely resembles top or nmon. With this tool we can monitor RAC performance metrics in real time without depending on a GUI.

 

Download Jagjeet Singh's sqlplus dashboard for RAC tool here

 

The installation steps are as follows:

 

 

Unzip the package:
[oracle@vrh1 ~]$ unzip Pck.zip 
Archive:  Pck.zip
creating: Pck/
inflating: Pck/1_grants.sql        
inflating: Pck/2_Types.sql         
inflating: Pck/3_pkg.sql           
inflating: Pck/drop.sql            
inflating: Pck/howtoinstall.sql    
inflating: Pck/outputexplained.pdf  
inflating: Pck/sampleoutput.png    
[oracle@vrh1 ~]$ cd Pck
[oracle@vrh1 Pck]$ sqlplus  / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Wed Aug 8 02:08:49 2012
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
1. Run the grants script 1_grants.sql
SQL> start 1_grants.sql
Enter value for _usr: maclean
You must specify which schema to install into.
2. Log in to the schema specified above and run 2_Types.sql to create the types
SQL> conn maclean/maclean
Connected.
SQL> 
SQL> 
SQL> start 2_Types.sql
Type created.
Type created.
Type created.
Type created.
Type created.
Type created.
Type created.
Type created.
Type created.
Type created.
3. Run 3_pkg.sql to create the package
SQL> start 3_pkg.sql
Package created.
No errors.
Package body created.
To uninstall the tool:
SQL> connect maclean/maclean
SQL> start drop.sql

 

 

 

Usage is as follows:

 

 

 SQL>set lines 190 pages 0 arraysize 50
SQL>select * from table(jss.gtop(50,10)) ;
50 is the arraysize and generally does not need changing.
10 means a sample is taken every 10 seconds (the sampling interval).

 

Sample output:

 

This is very cool, like a text-mode Spotlight for RAC.

CTSSD Fails to Start: Failure 4 in trying to open SV key PROCL-4/PROCL-5 clsctss_r_av2

If the CTSSD service cannot start, then in 11.2 HAS and CRS will not start either, leaving RAC unusable.

For diagnosis, consult the $GRID_HOME/log/$HOSTNAME/ctssd/octssd.log log, for example:

 

 

2010-12-23 16:45:31.287: [ GIPC][1257901344] gipcCheckInitialization: possible incompatible non-threaded init from [prom.c : 687], original from [clsss.c : 5325]
[ default][1257901344]Failure 4 in trying to open SV key SYSTEM.version.localhost
[ default][1257901344]procr_open_key error 4 errorbuf : PROCL-4: The local registry key to be operated on does not exist.
or PROCL-5
2010-12-23 16:45:31.289: [ CTSS][1257901344]clsctss_r_av2: Error [3] retrieving Active Version from OLR. Returns [19].
2010-12-23 16:45:31.292: [ CTSS][1257901344](:ctss_init16:): Error [19] retrieving active version. Returns [19].
2010-12-23 16:45:31.292: [ CTSS][1257901344]ctss_main: CTSS init failed [19]
2010-12-23 16:45:31.292: [ CTSS][1257901344]ctss_main: CTSS daemon aborting [19].

 

 

This error can be triggered in several situations:

  1. Incorrect user permissions on the ASM disks holding the OCR
  2. The device names of the OCR ASM disks differ between nodes
  3. HAS is in an abnormal state
  4. Bug 10420872

 

For cause 2, run kfed read /dev/oracleasm/disks/OCRVOTE01 | grep kfbh.type to confirm that the ASM disks are consistent across nodes.

For cause 3, try restarting the HAS service:

su - root
crsctl stop has -f
crsctl start has

[Video tutorial] Maclean shows you step by step how to install Oracle 11gR2 RAC on Linux 5 with VBox

 

https://zcdn.askmaclean.com/dbDao_install%2011gr2%20RAC%20on%20Linux%205.7%20in%20vbox.pdf

[Video tutorial] Maclean teaches you to install Oracle 10gR2 RAC on Enterprise Linux 5 with VBox

[Video tutorial] Maclean walks you step by step through installing an Oracle 10gR2 RAC database on Enterprise Linux 5 with VBox

WebEx video player software; the .wrf files below must be played with WebEx.

Download WebEx:

 

 
 WebEx.zip (2.55 MB, downloads: 6539)

Agenda:

Related scripts and command attachment:    脚本命令.txt (1.71 KB, downloads: 1444)

Lecture 1: Installing VBox on Windows

 第一讲视频-在Windows上安装Vbox.zip (4.41 MB, downloads: 4930)

Lecture 2: Installing an Oracle Enterprise Linux 5.7 virtual machine in VBox

 第二讲视频-在Vbox上安装Oracle Enterprise Linux 5.7虚拟主机.zip (12.05 MB, downloads: 11741)

Lecture 3: Configuring and cloning the Oracle Enterprise 5.7 virtual machine

 第三讲视频-配置并克隆Oracle Enterprise 5.7虚拟主机.zip (14.82 MB, downloads: 30549)

Lecture 4: Configuring VBox shared storage and UDEV storage device binding

 第四讲视频-配置Vbox共享存储和UDEV存储设备绑定.zip (13.16 MB, downloads: 29091)

Lecture 5: Installing 10.2.0.1 x64 Linux Oracle Clusterware CRS

 第五讲视频-安装10.2.0.1 x64 Linux Oracle Clusterware CRS.zip (10.48 MB, downloads: 8603)

Lecture 6: Upgrading Clusterware CRS 10.2.0.1 to Patchset 10.2.0.5

 第六讲视频-升级Clusterware CRS 10.2.0.1 到 Patchset 10.2.0.5 .zip (3.2 MB, downloads: 5716)

Lecture 7: Installing the 10.2.0.1 database software, upgrading to the 10.2.0.5 Patchset, and creating a database with DBCA

 第七讲视频-安装10.2.0.1DB数据库升级到10.2.0.5创建数据库.zip (10.37 MB, downloads: 8511)

More About ORACLE_AFFINITY ORACLE_PRIORITY Windows Process Scheduling

Using ORACLE_AFFINITY

Under certain conditions, it can be desirable to reserve a CPU for a particular thread or set of threads. Good candidates are usually threads that either run single-threaded, with performance that degrades under higher load, or that adversely affect performance on all CPUs in the system.

 

When running RAC, the CPU utilization of the LMS processes increases faster than the number of lock, CR, and current block requests. The throughput and scalability of LMS are crucial to RAC performance, since current and CR server latencies, as well as lock request latencies, are directly correlated with how quickly and efficiently LMS can process requests. Tests on Solaris with Oracle8i showed that binding LMD0 to a separate CPU gives nearly optimal performance. On NT, the LMS threads can also be bound to a dedicated processor.

 

Example:

 

HKEY_LOCAL_MACHINE->SOFTWARE->ORACLE->

 

  • ORACLE_AFFINITY=”LMS0:8;LMS1:8;DEF:7;USER:7”

On a 4-CPU machine, this indicates that the LMS threads run on CPU 3, and all other Oracle threads on CPUs 0, 1, and 2.
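The numbers in ORACLE_AFFINITY are CPU bitmasks: bit n set means CPU n is allowed. A small sketch decoding the masks used in the examples:

```python
def cpus_in_mask(mask):
    """Decode a CPU affinity bitmask into the set of allowed CPU numbers."""
    return {cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)}

# ORACLE_AFFINITY="LMS0:8;LMS1:8;DEF:7;USER:7" on a 4-CPU box:
print(cpus_in_mask(8))   # {3}        LMS threads pinned to CPU 3
print(cpus_in_mask(7))   # {0, 1, 2}  everything else on CPUs 0-2
# ORACLE_AFFINITY="LMS0:1;DEF:14":
print(cpus_in_mask(1))   # {0}
print(cpus_in_mask(14))  # {1, 2, 3}
```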

 

  • ORACLE_AFFINITY=”LMS0:1;DEF:14”

LMS0 runs on CPU 0; all other threads on CPUs 1, 2, and 3.

 

Using ORACLE_PRIORITY

 

As with ORACLE_AFFINITY, the priority of an Oracle thread can be modified. There are six thread priorities:

 

  • lowest
  • below_normal
  • normal
  • above_normal
  • highest
  • time_critical

 

Since the system can dynamically adjust thread priorities within a process in the idle, normal, or high priority class, using ORACLE_PRIORITY may not have a big impact. Also, when CPU resources are already limited, running a single thread at increased priority will not increase throughput any further. This was observed when running the LMD thread with priority set to HIGHEST in an Oracle8i environment.
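The six names map onto the standard Win32 thread priority levels. The numeric values below are the usual THREAD_PRIORITY_* constants (relative to the process priority class); they are given as general Windows background, not quoted from Oracle documentation:

```python
# Usual Win32 THREAD_PRIORITY_* values, relative to the process priority class.
THREAD_PRIORITIES = {
    "lowest": -2,         # THREAD_PRIORITY_LOWEST
    "below_normal": -1,   # THREAD_PRIORITY_BELOW_NORMAL
    "normal": 0,          # THREAD_PRIORITY_NORMAL
    "above_normal": 1,    # THREAD_PRIORITY_ABOVE_NORMAL
    "highest": 2,         # THREAD_PRIORITY_HIGHEST
    "time_critical": 15,  # THREAD_PRIORITY_TIME_CRITICAL
}

# Listed from lowest to highest effective priority.
print(sorted(THREAD_PRIORITIES, key=THREAD_PRIORITIES.get))
```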

 

More information can be found in the “Oracle Database 10g Architecture on Windows” white paper on OTN, in the Windows specific 10g documentation.

Global Cache CR Requested But Current Block Received

This article is a companion piece to the earlier "MINSCN and Cache Fusion Read Consistency"; both originated from the same forum question.

Let's reproduce the scenario the poster observed:

 

SQL> select * from V$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> select count(*) from gv$instance;
COUNT(*)
----------
2
SQL> select * from global_name;
GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmaclean.com

 

 

 

In a 2-node 11gR2 RAC environment we will drive a data block's status to XG. Suppose this Xcurrent block is currently held by INSTANCE 2; we then repeatedly query the block from INSTANCE 1 and observe the results:

 

SQL> select  * from test;
ID
----------
1
2
SQL> select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from test;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
89233                                    1
89233                                    1
SQL> alter system flush buffer_cache;
System altered.
INSTANCE 1 Session A:
SQL>  update test set id=id+1 where id=1;
1 row updated.
INSTANCE 1 Session B:
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
1          0
3    1755287
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump gc_elements 255;
Statement processed.
SQL> oradebug tracefile_name;
/s01/orabase/diag/rdbms/vprod/VPROD1/trace/VPROD1_ora_19111.trc
GLOBAL CACHE ELEMENT DUMP (address: 0xa4ff3080):
id1: 0x15c91 id2: 0x1 pkey: OBJ#76896 block: (1/89233)
lock: X rls: 0x0 acq: 0x0 latch: 3
flags: 0x20 fair: 0 recovery: 0 fpin: 'kdswh11: kdst_fetch'
bscn: 0x0.146e20 bctx: (nil) write: 0 scan: 0x0
lcp: (nil) lnk: [NULL] lch: [0xa9f6a6f8,0xa9f6a6f8]
seq: 32 hist: 58 145:0 118 66 144:0 192 352 197 48 121 113 424 180 58
LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
flg: 0x02000001 lflg: 0x1 state: XCURRENT tsn: 0 tsh: 2
addr: 0xa9f6a5c8 obj: 76896 cls: DATA bscn: 0x0.1ac898
BH (0xa9f6a5c8) file#: 1 rdba: 0x00415c91 (1/89233) class: 1 ba: 0xa9e56000
set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 3 pwc: 0,15
dbwrid: 0 obj: 76896 objn: 76896 tsn: 0 afn: 1 hint: f
hash: [0x91f4e970,0xbae9d5b8] lru: [0x91f58848,0xa9f6a828]
lru-flags: debug_dump
obj-flags: object_ckpt_list
ckptq: [0x9df6d1d8,0xa9f6a740] fileq: [0xa2ece670,0xbdf4ed68] objq: [0xb4964e00,0xb4964e00] objaq: [0xb4964de0,0xb4964de0]
st: XCURRENT md: NULL fpin: 'kdswh11: kdst_fetch' tch: 2 le: 0xa4ff3080
flags: buffer_dirty redo_since_read
LRBA: [0x19.5671.0] LSCN: [0x0.1ac898] HSCN: [0x0.1ac898] HSUB: [1]
buffer tsn: 0 rdba: 0x00415c91 (1/89233)
scn: 0x0000.001ac898 seq: 0x01 flg: 0x00 tail: 0xc8980601
frmt: 0x02 chkval: 0x0000 type: 0x06=trans data

 

 

At this point the GLOBAL CACHE ELEMENT DUMP for block (1/89233) still shows lock status X rather than XG, because this current block has only been modified within a single instance and has never been updated across the cluster.
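The rdba value in the dump packs the relative file number and block number into one 32-bit word: the top 10 bits are the file number, the low 22 bits the block number. A sketch that cross-checks the dump above:

```python
def decode_rdba(rdba):
    """Split a 32-bit relative data block address into (file#, block#).
    Top 10 bits: relative file number; low 22 bits: block number."""
    return rdba >> 22, rdba & ((1 << 22) - 1)

# rdba 0x00415c91 from the BH dump decodes to file 1, block 89233 (0x15c91),
# matching the (1/89233) annotation printed next to it.
print(decode_rdba(0x00415C91))  # (1, 89233)
```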

 

 

Next, modify the block on Instance 2:

 

 

Instance 2 Session C:
SQL> update test set id=id+1 where id=2;
1 row updated.
Instance 2 Session D:
SQL>  select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
1          0
3    1756658
SQL> oradebug setmypid;
Statement processed.
SQL> oradebug dump gc_elements 255;
Statement processed.
SQL> oradebug tracefile_name;
/s01/orabase/diag/rdbms/vprod/VPROD2/trace/VPROD2_ora_13038.trc
GLOBAL CACHE ELEMENT DUMP (address: 0x89fb25a0):
id1: 0x15c91 id2: 0x1 pkey: OBJ#76896 block: (1/89233)
lock: XG rls: 0x0 acq: 0x0 latch: 3
flags: 0x20 fair: 0 recovery: 0 fpin: 'kduwh01: kdusru'
bscn: 0x0.1acdf3 bctx: (nil) write: 0 scan: 0x0
lcp: (nil) lnk: [NULL] lch: [0x96f4cf80,0x96f4cf80]
seq: 61 hist: 324 21 143:0 19 16 352 329 144:6 14 7 352 197
LIST OF BUFFERS LINKED TO THIS GLOBAL CACHE ELEMENT:
flg: 0x0a000001 state: XCURRENT tsn: 0 tsh: 1
addr: 0x96f4ce50 obj: 76896 cls: DATA bscn: 0x0.1acdf6
BH (0x96f4ce50) file#: 1 rdba: 0x00415c91 (1/89233) class: 1 ba: 0x96bd4000
set: 5 pool: 3 bsz: 8192 bsi: 0 sflg: 2 pwc: 0,15
dbwrid: 0 obj: 76896 objn: 76896 tsn: 0 afn: 1 hint: f
hash: [0x96ee1fe8,0xbae9d5b8] lru: [0x96f4d0b0,0x96f4cdc0]
obj-flags: object_ckpt_list
ckptq: [0xbdf519b8,0x96f4d5a8] fileq: [0xbdf519d8,0xbdf519d8] objq: [0xb4a47b90,0xb4a47b90] objaq: [0x96f4d0e8,0xb4a47b70]
st: XCURRENT md: NULL fpin: 'kduwh01: kdusru' tch: 1 le: 0x89fb25a0
flags: buffer_dirty redo_since_read remote_transfered
LRBA: [0x11.9e18.0] LSCN: [0x0.1acdf6] HSCN: [0x0.1acdf6] HSUB: [1]
buffer tsn: 0 rdba: 0x00415c91 (1/89233)
scn: 0x0000.001acdf6 seq: 0x01 flg: 0x00 tail: 0xcdf60601
frmt: 0x02 chkval: 0x0000 type: 0x06=trans data
GCS CLIENT 0x89fb2618,6 resp[(nil),0x15c91.1] pkey 76896.0
grant 2 cvt 0 mdrole 0x42 st 0x100 lst 0x20 GRANTQ rl G0
master 1 owner 2 sid 0 remote[(nil),0] hist 0x94121c601163423c
history 0x3c.0x4.0xd.0xb.0x1.0xc.0x7.0x9.0x14.0x1.
cflag 0x0 sender 1 flags 0x0 replay# 0 abast (nil).x0.1 dbmap (nil)
disk: 0x0000.00000000 write request: 0x0000.00000000
pi scn: 0x0000.00000000 sq[(nil),(nil)]
msgseq 0x1 updseq 0x0 reqids[6,0,0] infop (nil) lockseq x2b8
pkey 76896.0
hv 93 [stat 0x0, 1->1, wm 32768, RMno 0, reminc 18, dom 0]
kjga st 0x4, step 0.0.0, cinc 20, rmno 6, flags 0x0
lb 0, hb 0, myb 15250, drmb 15250, apifrz 0

 

 

After the block (1/89233) is modified on Instance 2, its GLOBAL CACHE ELEMENT lock converts to lock: XG.
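The state values returned by the x$bh queries above follow a decoding that is widely cited in the community (it is not in the official reference); state 1 is XCUR and 3 is CR, which matches what we saw:

```python
# Commonly cited x$bh.state decoding (community knowledge, not official docs).
XBH_STATES = {
    0: "FREE",  # unused buffer
    1: "XCUR",  # exclusive current
    2: "SCUR",  # shared current
    3: "CR",    # consistent-read clone
    4: "READ",  # being read from disk
    5: "MREC",  # media recovery
    6: "IREC",  # instance recovery
}

# The queries above returned state 1 (the XCUR buffer) and state 3 (a CR clone).
print(XBH_STATES[1], XBH_STATES[3])  # XCUR CR
```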

Besides analyzing XCUR Cache Fusion through GC_ELEMENTS dumps, you can also query the X$ views directly, mainly X$LE, X$KJBR and X$KJBL; more information on these three X$ views can easily be found on my blog:

 

 

INSTANCE 2 Session D:
SELECT *
FROM x$le
WHERE le_addr IN (SELECT le_addr
FROM x$bh
WHERE obj IN (SELECT data_object_id
FROM dba_objects
WHERE owner = 'SYS'
AND object_name = 'TEST')
AND class = 1
AND state != 3);
ADDR                   INDX    INST_ID LE_ADDR              LE_ID1     LE_ID2
---------------- ---------- ---------- ---------------- ---------- ----------
LE_RLS     LE_ACQ   LE_FLAGS    LE_MODE   LE_WRITE   LE_LOCAL LE_RECOVERY
---------- ---------- ---------- ---------- ---------- ---------- -----------
LE_BLKS    LE_TIME LE_KJBL
---------- ---------- ----------------
00007F94CA14CF60       7003          2 0000000089FB25A0      89233          1
0          0         32          2          0          1           0
1          0 0000000089FB2618

 

 

A PCM resource name is composed of [ID1][ID2],[BL] and so on, where ID1 and ID2 are computed from the block number and file number. Referring to the GC_elements dump above (id1: 0x15c91 id2: 0x1 pkey: OBJ#76896 block: (1/89233)), the kjblname and kjbrname should begin with "[0x15c91][0x1],[BL]":

 

 

INSTANCE 2 Session D:
SQL> set linesize 80 pagesize 1400
SQL> SELECT *
2    FROM x$kjbl l
3   WHERE l.kjblname LIKE '%[0x15c91][0x1],[BL]%';
ADDR                   INDX    INST_ID KJBLLOCKP        KJBLGRANT KJBLREQUE
---------------- ---------- ---------- ---------------- --------- ---------
KJBLROLE KJBLRESP         KJBLNAME
---------- ---------------- ------------------------------
KJBLNAME2                       KJBLQUEUE
------------------------------ ----------
KJBLLOCKST                                                       KJBLWRITING
---------------------------------------------------------------- -----------
KJBLREQWRITE  KJBLOWNER KJBLMASTER KJBLBLOCKED KJBLBLOCKER    KJBLSID KJBLRDOMID
------------ ---------- ---------- ----------- ----------- ---------- ----------
KJBLPKEY
----------
00007F94CA22A288        451          2 0000000089FB2618 KJUSEREX  KJUSERNL
0 00               [0x15c91][0x1],[BL][ext 0x0,0x
89233,1,BL                              0
GRANTED                                                                    0
0          1          0           0           0          0          0
76896
SQL> SELECT r.* FROM x$kjbr r WHERE r.kjbrname LIKE '%[0x15c91][0x1],[BL]%';
no rows selected
Instance 1 session B:
SQL>  SELECT r.* FROM x$kjbr r WHERE r.kjbrname LIKE '%[0x15c91][0x1],[BL]%';
ADDR                   INDX    INST_ID KJBRRESP         KJBRGRANT KJBRNCVL
---------------- ---------- ---------- ---------------- --------- ---------
KJBRROLE KJBRNAME                       KJBRMASTER KJBRGRANTQ
---------- ------------------------------ ---------- ----------------
KJBRCVTQ         KJBRWRITER          KJBRSID KJBRRDOMID   KJBRPKEY
---------------- ---------------- ---------- ---------- ----------
00007F801ACA68F8       1355          1 00000000B5A62AE0 KJUSEREX  KJUSERNL
0 [0x15c91][0x1],[BL][ext 0x0,0x          0 00000000B48BB330
00               00                        0          0      76896

Next we query block (1/89233) from Instance 1, which should cause Instance 2 to build a CR block and ship it to Instance 1. To look into the details, we enable detailed RAC traces on the Instance 1 foreground process and on the Instance 2 LMS process:

Instance 2:
[oracle@vrh2 ~]$ ps -ef|grep ora_lms|grep -v grep
oracle   23364     1  0 Apr29 ?        00:33:15 ora_lms0_VPROD2
SQL> oradebug setospid 23364
Oracle pid: 13, Unix process pid: 23364, image: oracle@vrh2.oracle.com (LMS0)
SQL> oradebug event 10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high;
Statement processed.
SQL> oradebug tracefile_name
/s01/orabase/diag/rdbms/vprod/VPROD2/trace/VPROD2_lms0_23364.trc
Instance 1 session B :
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
3    1756658
3    1756661
3    1755287
Instance 1 session A :
SQL> alter session set events '10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high';
Session altered.
SQL> select * from test;
ID
----------
2
2
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
3    1761520

From the X$BH information, Instance 1 received the built CR block just as we expected. Let's dig further into the trace files:

Instance 1 foreground Process:
PARSING IN CURSOR #140336527348792 len=18 dep=0 uid=0 oct=3 lid=0 tim=1335939136125254 hv=1689401402 ad='b1a4c828' sqlid='c99yw1xkb4f1u'
select * from test
END OF STMT
PARSE #140336527348792:c=2999,e=2860,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=1357081020,tim=1335939136125253
EXEC #140336527348792:c=0,e=40,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939136125373
WAIT #140336527348792: nam='SQL*Net message to client' ela= 6 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335939136125420
*** 2012-05-02 02:12:16.125
kclscrs: req=0 block=1/89233
2012-05-02 02:12:16.125574 : kjbcro[0x15c91.1 76896.0][4]
*** 2012-05-02 02:12:16.125
kclscrs: req=0 typ=nowait-abort
*** 2012-05-02 02:12:16.125
kclscrs: bid=1:3:1:0:f:1e:0:0:10:0:0:0:1:2:4:1:20:0:0:0:c3:49:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:1c:0:4d:26:a3:52:0:0:0:0:c7:c:ca:62:c3:49:0:0:0:0:1:0:14:8e:47:76:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:99:ed:0:0:0:0:0:0:10:0:0:0
2012-05-02 02:12:16.125718 : kjbcro[0x15c91.1 76896.0][4]
2012-05-02 02:12:16.125751 : GSIPC:GMBQ: buff 0xba0ee018, queue 0xbb79a7b8, pool 0x60013fa0, freeq 0, nxt 0xbb79a7b8, prv 0xbb79a7b8
2012-05-02 02:12:16.125780 : kjbsentscn[0x0.1ae0f0][to 2]
2012-05-02 02:12:16.125806 : GSIPC:SENDM: send msg 0xba0ee088 dest x20001 seq 177740 type 36 tkts xff0000 mlen x1680198
2012-05-02 02:12:16.125918 : kjbmscr(0x15c91.1)reqid=0x8(req 0xa4ff30f8)(rinst 1)hldr 2(infosz 200)(lseq x2b8)
2012-05-02 02:12:16.126959 : GSIPC:KSXPCB: msg 0xba0ee088 status 30, type 36, dest 2, rcvr 1
*** 2012-05-02 02:12:16.127
kclwcrs: wait=0 tm=1233
*** 2012-05-02 02:12:16.127
kclwcrs: got 1 blocks from ksxprcv
WAIT #140336527348792: nam='gc cr block 2-way' ela= 1233 p1=1 p2=89233 p3=1 obj#=76896 tim=1335939136127199
2012-05-02 02:12:16.127272 : kjbcrcomplete[0x15c91.1 76896.0][0]
2012-05-02 02:12:16.127309 : kjbrcvdscn[0x0.1ae0f0][from 2][idx 2012-05-02 02:12:16.127329 : kjbrcvdscn[no bscn <= rscn 0x0.1ae0f0][from 2]


The foreground process, via kjbcro[0x15c91.1 76896.0][4] and kjbsentscn[0x0.1ae0f0][to 2], requested the SCN=1ae0f0=1761520 version of block (1/89233) from Instance 2, entered a 'gc cr block 2-way' wait, and then successfully received the requested CR block.

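The SCNs in these traces are printed as a wrap.base pair of hex values. A small sketch (helper name ours) converts them to decimal; with wrap 0x0 the result equals the cr_scn_bas values seen in x$bh:

```python
def scn_from_trace(pair: str) -> int:
    """Convert a trace-style SCN like '0x0.1ae0f0' (wrap.base, both hex)
    into the full decimal SCN: wrap * 2**32 + base."""
    wrap_s, base_s = pair.split(".")
    return (int(wrap_s, 16) << 32) | int(base_s, 16)

print(scn_from_trace("0x0.1ae0f0"))  # 1761520, the CR version requested above
```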
Instance 2 LMS TRACE

 

2012-05-02 02:12:15.634057 : GSIPC:RCVD: ksxp msg 0x7f16e1598588 sndr 1 seq 0.177740 type 36 tkts 0
2012-05-02 02:12:15.634094 : GSIPC:RCVD: watq msg 0x7f16e1598588 sndr 1, seq 177740, type 36, tkts 0
2012-05-02 02:12:15.634108 : GSIPC:TKT: collect msg 0x7f16e1598588 from 1 for rcvr -1, tickets 0
2012-05-02 02:12:15.634162 : kjbrcvdscn[0x0.1ae0f0][from 1][idx 2012-05-02 02:12:15.634186 : kjbrcvdscn[no bscn1, wm 32768, RMno 0, reminc 18, dom 0]
kjga st 0x4, step 0.0.0, cinc 20, rmno 6, flags 0x0
lb 0, hb 0, myb 15250, drmb 15250, apifrz 0
GCS CLIENT END
2012-05-02 02:12:15.635211 : kjbdowncvt[0x15c91.1 76896.0][1][options x0]
2012-05-02 02:12:15.635230 : GSIPC:AMBUF: rcv buff 0x7f16e1c56420, pool rcvbuf, rqlen 1103
2012-05-02 02:12:15.635308 : GSIPC:GPBMSG: new bmsg 0x7f16e1c56490 mb 0x7f16e1c56420 msg 0x7f16e1c564b0 mlen 152 dest x101 flushsz -1
2012-05-02 02:12:15.635334 : kjbmslset(0x15c91.1)) seq 0x4 reqid=0x6 (shadow 0xb48bb330.xb)(rsn 2)(mas@1)
2012-05-02 02:12:15.635355 : GSIPC:SPBMSG: send bmsg 0x7f16e1c56490 blen 184 msg 0x7f16e1c564b0 mtype 33 attr|dest x30101 bsz|fsz x1ffff
2012-05-02 02:12:15.635377 : GSIPC:SNDQ: enq msg 0x7f16e1c56490, type 65521 seq 118669, inst 1, receiver 1, queued 1
*** 2012-05-02 02:12:15.635
kclccctx: cleanup copy 0x7f16e1d94798
2012-05-02 02:12:15.635479 : [kjmpmsgi:compl][type 36][msg 0x7f16e1598588][seq 177740.0][qtime 0][ptime 1257]
2012-05-02 02:12:15.635511 : GSIPC:BSEND: flushing sndq 0xb491dd28, id 1, dcx 0xbc516778, inst 1, rcvr 1  qlen 0 1
2012-05-02 02:12:15.635536 : GSIPC:BSEND: no batch1 msg 0x7f16e1c56490 type 65521 len 184 dest (1:1)
2012-05-02 02:12:15.635557 : kjbsentscn[0x0.1ae0f1][to 1]
2012-05-02 02:12:15.635578 : GSIPC:SENDM: send msg 0x7f16e1c56490 dest x10001 seq 118669 type 65521 tkts x10002 mlen xb800e8
WAIT #0: nam='gcs remote message' ela= 180 waittime=1 poll=0 event=0 obj#=0 tim=1335939135635819
2012-05-02 02:12:15.635853 : GSIPC:RCVD: ksxp msg 0x7f16e167e0b0 sndr 1 seq 0.177741 type 32 tkts 0
2012-05-02 02:12:15.635875 : GSIPC:RCVD: watq msg 0x7f16e167e0b0 sndr 1, seq 177741, type 32, tkts 0
2012-05-02 02:12:15.636012 : GSIPC:TKT: collect msg 0x7f16e167e0b0 from 1 for rcvr -1, tickets 0
2012-05-02 02:12:15.636040 : kjbrcvdscn[0x0.1ae0f1][from 1][idx 2012-05-02 02:12:15.636060 : kjbrcvdscn[no bscn <= rscn 0x0.1ae0f1][from 1]
2012-05-02 02:12:15.636082 : GSIPC:TKT: dest (1:1) rtkt not acked 1  unassigned bufs 0  tkts 0  newbufs 0
2012-05-02 02:12:15.636102 : GSIPC:TKT: remove ctx dest (1:1)
2012-05-02 02:12:15.636125 : [kjmxmpm][type 32][seq 0.177741][msg 0x7f16e167e0b0][from 1]
2012-05-02 02:12:15.636146 : kjbmpocr(0xb0.6)seq 0x1,reqid=0x23a,(client 0x9fff7b58,0x1)(from 1)(lseq xdf0)

On Instance 2, the LMS process received a gcs remote message (GSIPC) requesting block=1/89233 at SCN=[0x0.1ae0f0] and entered BAST kjbmpbast(0x15c91.1). Because block=1/89233 had already been written to disk, the fairness algorithm was triggered (the default _fairness_threshold is 2 in 11.2.0.3) and the current block underwent a KCL: F156: fairness downconvert, from Xcurrent down to Scurrent:
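
In rough terms the fairness logic looks like this (a simplified sketch, not Oracle's kernel code; the class and method names are ours, and resetting the counter after a downconvert is our assumption):

```python
FAIRNESS_THRESHOLD = 2  # default _fairness_threshold in 11.2.0.3

class GlobalCacheElement:
    """Holder-side view of one global cache element (sketch)."""

    def __init__(self):
        self.mode = "X"          # holder mode: Xcurrent
        self.cr_requests = 0     # fairness counter of CR requests served

    def on_cr_request(self):
        """Called on the holder for each incoming CR request for this block."""
        self.cr_requests += 1
        if self.mode == "X" and self.cr_requests >= FAIRNESS_THRESHOLD:
            # fairness downconvert: Xcurrent -> Scurrent, so subsequent
            # CR requests can simply be served with the current block
            self.mode = "S"
            self.cr_requests = 0

element = GlobalCacheElement()
element.on_cr_request()
print(element.mode)  # X, still below the threshold
element.on_cr_request()
print(element.mode)  # S, downconverted after the threshold is reached
```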

 

Instance 2:
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
2 0
3 1756658

The Instance 2 LMS then enqueued the CR block via kjbmslset(0x15c91.1) onto the send queue: GSIPC:SNDQ: enq msg 0x7f16e1c56490.

 

Next we run the query against block (1/89233) on Instance 1 once more and see what happens:

Instance 2:
SQL> select CURRENT_RESULTS,LIGHT_WORKS from v$cr_block_server;
CURRENT_RESULTS LIGHT_WORKS
--------------- -----------
29273         437
Instance 1 session A:
SQL>
SQL> select * from test;
ID
----------
2
2
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
3    1761942
3    1761932
1          0
3    1761520
Instance 2:
SQL> select CURRENT_RESULTS,LIGHT_WORKS from v$cr_block_server;
CURRENT_RESULTS LIGHT_WORKS
--------------- -----------
29274         437
select * from test
END OF STMT
PARSE #140336529675592:c=0,e=337,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939668940051
EXEC #140336529675592:c=0,e=96,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335939668940204
WAIT #140336529675592: nam='SQL*Net message to client' ela= 5 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335939668940348
*** 2012-05-02 02:21:08.940
kclscrs: req=0 block=1/89233
2012-05-02 02:21:08.940676 : kjbcro[0x15c91.1 76896.0][5]
*** 2012-05-02 02:21:08.940
kclscrs: req=0 typ=nowait-abort
*** 2012-05-02 02:21:08.940
kclscrs: bid=1:3:1:0:f:21:0:0:10:0:0:0:1:2:4:1:20:0:0:0:c3:49:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:1f:0:4d:26:a3:52:0:0:0:0:c7:c:ca:62:c3:49:0:0:0:0:1:0:17:8e:47:76:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:99:ed:0:0:0:0:0:0:10:0:0:0
2012-05-02 02:21:08.940799 : kjbcro[0x15c91.1 76896.0][5]
2012-05-02 02:21:08.940833 : GSIPC:GMBQ: buff 0xba0ee018, queue 0xbb79a7b8, pool 0x60013fa0, freeq 0, nxt 0xbb79a7b8, prv 0xbb79a7b8
2012-05-02 02:21:08.940859 : kjbsentscn[0x0.1ae28c][to 2]
2012-05-02 02:21:08.940870 : GSIPC:SENDM: send msg 0xba0ee088 dest x20001 seq 177810 type 36 tkts xff0000 mlen x1680198
2012-05-02 02:21:08.940976 : kjbmscr(0x15c91.1)reqid=0xa(req 0xa4ff30f8)(rinst 1)hldr 2(infosz 200)(lseq x2b8)
2012-05-02 02:21:08.941314 : GSIPC:KSXPCB: msg 0xba0ee088 status 30, type 36, dest 2, rcvr 1
*** 2012-05-02 02:21:08.941
kclwcrs: wait=0 tm=707
*** 2012-05-02 02:21:08.941
kclwcrs: got 1 blocks from ksxprcv
2012-05-02 02:21:08.941818 : kjbassume[0x15c91.1][sender 2][mymode x1][myrole x0][srole x0][flgs x0][spiscn 0x0.0][swscn 0x0.0]
2012-05-02 02:21:08.941852 : kjbrcvdscn[0x0.1ae28d][from 2][idx 2012-05-02 02:21:08.941871 : kjbrcvdscn[no bscn

As before, the foreground process requested the SCN=[0x0.1ae28c]=1761932 version of the CR block, but this time it received an Xcurrent block at SCN=1ae28d=1761933. After receiving the Xcurrent block, Instance 1 built the SCN=1761932 CR block it needed by itself. Because what it actually received was a current block, the wait event became 'gc current block 2-way'.

Note that the foreground process did not request a current block; it still used kjbcro. The root cause is that the LMS process on Instance 2 shipped it the current block:

 

Instance 2 LMS trace:
2012-05-02 02:21:08.448743 : GSIPC:RCVD: ksxp msg 0x7f16e14a4398 sndr 1 seq 0.177810 type 36 tkts 0
2012-05-02 02:21:08.448778 : GSIPC:RCVD: watq msg 0x7f16e14a4398 sndr 1, seq 177810, type 36, tkts 0
2012-05-02 02:21:08.448798 : GSIPC:TKT: collect msg 0x7f16e14a4398 from 1 for rcvr -1, tickets 0
2012-05-02 02:21:08.448816 : kjbrcvdscn[0x0.1ae28c][from 1][idx 2012-05-02 02:21:08.448834 : kjbrcvdscn[no bscn <= rscn 0x0.1ae28c][from 1]
2012-05-02 02:21:08.448857 : GSIPC:TKT: dest (1:1) rtkt not acked 2  unassigned bufs 0  tkts 0  newbufs 0
2012-05-02 02:21:08.448875 : GSIPC:TKT: remove ctx dest (1:1)
2012-05-02 02:21:08.448970 : [kjmxmpm][type 36][seq 0.177810][msg 0x7f16e14a4398][from 1]
2012-05-02 02:21:08.448993 : kjbmpbast(0x15c91.1) reqid=0x6 (req 0xa4ff30f8)(reqinst 1)(reqid 10)(flags x0)
*** 2012-05-02 02:21:08.449
kclcrrf: req=48054 block=1/89233
*** 2012-05-02 02:21:08.449
kcl_compress_block: compressed: 6 free space: 7680
2012-05-02 02:21:08.449085 : kjbsentscn[0x0.1ae28d][to 1]
2012-05-02 02:21:08.449142 : kjbdeliver[to 1][0xa4ff30f8][10][current 1]
2012-05-02 02:21:08.449164 : kjbmssch(reqlock 0xa4ff30f8,10)(to 1)(bsz 344)
2012-05-02 02:21:08.449183 : GSIPC:AMBUF: rcv buff 0x7f16e18bcec8, pool rcvbuf, rqlen 1102
*** 2012-05-02 02:21:08.449
kclccctx: cleanup copy 0x7f16e1d94838
*** 2012-05-02 02:21:08.449
kcltouched: touch seconds 3271
*** 2012-05-02 02:21:08.449
kclgrantlk: req=48054
2012-05-02 02:21:08.449347 : [kjmpmsgi:compl][type 36][msg 0x7f16e14a4398][seq 177810.0][qtime 0][ptime 1119]
WAIT #0: nam='gcs remote message' ela= 568 waittime=1 poll=0 event=0 obj#=0 tim=1335939668449962
2012-05-02 02:21:08.450001 : GSIPC:RCVD: ksxp msg 0x7f16e1bb22a0 sndr 1 seq 0.177811 type 32 tkts 0
2012-05-02 02:21:08.450024 : GSIPC:RCVD: watq msg 0x7f16e1bb22a0 sndr 1, seq 177811, type 32, tkts 0
2012-05-02 02:21:08.450043 : GSIPC:TKT: collect msg 0x7f16e1bb22a0 from 1 for rcvr -1, tickets 0
2012-05-02 02:21:08.450060 : kjbrcvdscn[0x0.1ae28e][from 1][idx 2012-05-02 02:21:08.450078 : kjbrcvdscn[no bscn <= rscn 0x0.1ae28e][from 1]
2012-05-02 02:21:08.450097 : GSIPC:TKT: dest (1:1) rtkt not acked 3  unassigned bufs 0  tkts 0  newbufs 0
2012-05-02 02:21:08.450116 : GSIPC:TKT: remove ctx dest (1:1)
2012-05-02 02:21:08.450136 : [kjmxmpm][type 32][seq 0.177811][msg 0x7f16e1bb22a0][from 1]
2012-05-02 02:21:08.450155 : kjbmpocr(0xb0.6)seq 0x1,reqid=0x23e,(client 0x9fff7b58,0x1)(from 1)(lseq xdf4)

Why did the LMS on Instance 2 take a shortcut and ship the current block it held to Instance 1 instead of building a CR block? Watching v$cr_block_server on Instance 2 shows that the LIGHT_WORKS column did not increase after the current block transfer, so the CR server's Light Work Rule was not triggered. (The Light Work Rule is an optimization that has existed since the 8i CR server: if creating the consistent read version would involve too much work for the resource holder's LMS, such as reading blocks from disk, the holder ships the block as-is and the requestor completes the CR fabrication.) Separately, the holder maintains a fairness counter of CR requests; after the fairness threshold is reached, the holder downconverts its lock.

 

So under exactly what conditions does a CR request receive a current block?

The answer: the CR server treats different block classes differently. For CR requests against undo blocks or undo segment header blocks, LMS always ships the current block; this makes it easier for the remote node to perform any in-flight block cleanout or CR version construction.

For ordinary data blocks, the default is CR requested, CR received (setting aside the Light Work Rule "shortcut" described above), unless the current block has already been downconverted to an S lock, in which case the LMS process ships a current version of the block directly.

 

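Putting these rules together, the holder-side decision can be sketched like this (our simplification of the behaviour described above, not actual Oracle code; all names are ours):

```python
def block_to_ship(block_class: str, holder_mode: str,
                  build_too_costly: bool) -> str:
    """Decide what the holder LMS ships in answer to a CR request.

    Rules as described above:
      - undo / undo header blocks: always the current block;
      - data blocks: a built CR block, unless the Light Work Rule fires
        (building the CR version is too costly) or the holder has already
        been fairness-downconverted to S.
    """
    if block_class in ("undo", "undo_header"):
        return "current"
    if holder_mode == "S" or build_too_costly:
        return "current"
    return "cr"

print(block_to_ship("data", "X", False))  # cr
print(block_to_ship("data", "S", False))  # current, after fairness downconvert
print(block_to_ship("undo", "X", False))  # current, always for undo
```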
To prove this, we test again. This time we raise the hidden parameter "_fairness_threshold", which controls the downconvert, to 200. This prevents the Xcurrent block from being downgraded to Scurrent and makes LMS ship current versions of data blocks far less often:


SQL> show parameter fair
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_fairness_threshold                  integer     200
Instance 1:
SQL>  update test set id=id+1 where id=4;
1 row updated.
Instance 2:
SQL>  update test set id=id+1 where id=2;
1 row updated.
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
1          0
3    1838166
Query repeatedly on Instance 1, verifying through v$cr_block_server on Instance 2
instance 1
SQL> select * from test;
ID
----------
10
3
instance 2:
SQL>  select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
1          0
3    1883707
8          0
SQL> select * from test;
ID
----------
10
3
SQL>  select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
1          0
3    1883707
8          0
...................
SQL> /
STATE CR_SCN_BAS
---------- ----------
2          0
3    1883707
3    1883695
repeat cr request on Instance 1
SQL> /
STATE CR_SCN_BAS
---------- ----------
8          0
3    1883707
3    1883695

Actual testing shows that _fairness_threshold seems to have an upper bound: although it was set to 200, the lock was downgraded after only a few CR serves, so the CR requests for the data block still ended up receiving a current block.

MINSCN and Cache Fusion Read Consistency

The question came from a reader's post on Ask Maclean Home about Past Image (PI) blocks in RAC. The reader posted a demo, which we reproduce here in an 11.2.0.3 two-node RAC environment:

 

SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
SQL> select * from global_name;
GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmaclean.com
SQL> drop table test purge;
Table dropped.
SQL> alter system flush buffer_cache;
System altered.
SQL> create table test(id number);
insert into test values(1);
insert into test values(2);
commit;
/* Use rowid to locate the data block that holds the only 2 rows of table TEST */
select dbms_rowid.rowid_block_number(rowid),dbms_rowid.rowid_relative_fno(rowid) from test;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID) DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID)
------------------------------------ ------------------------------------
                               89233                                    1
                               89233                                    1 
SQL> alter system flush buffer_cache;
System altered.
Instance 1 Session A performs the UPDATE:
SQL> update test set id=id+1 where id=1;
1 row updated.
Instance 1 Session B queries the X$BH buffer header view to check the state of the relevant buffers:
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
     STATE CR_SCN_BAS
---------- ----------
         1          0
         3    1227595

 

The STATE column of the X$BH view represents the buffer state. The list of states:

 

STATE       NUMBER
KCBBHFREE         0       buffer free
KCBBHEXLCUR       1       buffer current (and if DFS locked X)
KCBBHSHRCUR       2       buffer current (and if DFS locked S)
KCBBHCR           3       buffer consistent read
KCBBHREADING      4       Being read
KCBBHMRECOVERY    5       media recovery (current & special)
KCBBHIRECOVERY    6       Instance recovery (somewhat special)


In this demo we are concerned with: state=1 Xcurrent, state=2 Scurrent and state=3 CR.
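
For convenience when reading the x$bh output below, the state table above can be wrapped in a tiny decoder (a sketch; the short labels are ours, and the state 8 entry is our addition for the PI buffers that appear later in this article):

```python
X_BH_STATE = {
    0: "FREE",   # KCBBHFREE: buffer free
    1: "XCUR",   # KCBBHEXLCUR: current, exclusive
    2: "SCUR",   # KCBBHSHRCUR: current, shared
    3: "CR",     # KCBBHCR: consistent read clone
    4: "READ",   # KCBBHREADING: being read
    5: "MREC",   # media recovery
    6: "IREC",   # instance recovery
    8: "PI",     # past image; seen in the outputs below, not in the list above
}

def decode_state(state: int) -> str:
    """Turn an x$bh.state value into a readable label."""
    return X_BH_STATE.get(state, f"UNKNOWN({state})")

print(decode_state(1))  # XCUR
print(decode_state(3))  # CR
```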

 

Next, update another row in the same data block from Instance 2. This triggers gc current block 2-way and transfers the current block to Instance 2, while the original "current block" on Instance 1 is converted into a Past Image:


Instance 2 Session C 
SQL> update test set id=id+1 where id=2;
1 row updated.
Instance 2 Session D
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
1          0
3    1227641
3    1227638

 

The STATE=1 Xcurrent block has been transferred to Instance 2. Now look at the GC state on Instance 1:

 

Instance 1 Session B 
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
     STATE CR_SCN_BAS
---------- ----------
         3    1227641
         3    1227638
         8          0
         3    1227595


The problem appears here: when the reader ran the SELECT against TEST again from Session A on Instance 1, the three original state=3 CR blocks decreased in number:

 

Instance 1 session A, i.e. the session that originally ran the UPDATE
SQL> alter session set events '10046 trace name context forever,level 8:10708 trace name context forever,level 103: trace[rac.*] disk high';
Session altered.
SQL> select * from test;
        ID
----------
         2
         2
select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
      STATE CR_SCN_BAS
---------- ----------
         3    1227716
         3    1227713
         8          0


In the original thread the reader watched the number of CR blocks through the v$BH view and was puzzled that the count actually dropped after the SELECT. Observing the X$BH view directly in our demo above, the three original CR versions, at SCNs 1227641, 1227638 and 1227595, were replaced by two CR blocks of different SCN versions, 1227716 and 1227713, once the SELECT completed. Why does Oracle do this?

 

Fortunately we set event 10708 and the rac.* diagnostic traces before actually running the SELECT. Let's look at the trace contents first:

 

PARSING IN CURSOR #140444679938584 len=337 dep=1 uid=0 oct=3 lid=0 tim=1335698913632292 hv=3345277572 ad='bc0e68c8' sqlid='baj7tjm3q9sn4'
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),0), NVL(SUM(C2),0) FROM (SELECT /*+ NO_PARALLEL("TEST") FULL("TEST") NO_PARALLEL_INDEX("TEST") */ 1 AS C1, 1 AS C2 FROM "SYS"."TEST" "TEST") SAMPLESUB
END OF STMT
PARSE #140444679938584:c=1000,e=27630,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=1950795681,tim=1335698913632252
EXEC #140444679938584:c=0,e=44,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,plh=1950795681,tim=1335698913632390
*** 2012-04-29 07:28:33.632
kclscrs: req=0 block=1/89233
*** 2012-04-29 07:28:33.632
kclscrs: bid=1:3:1:0:7:80:1:0:4:0:0:0:1:2:4:1:26:0:0:0:70:1a:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:3f:0:1c:86:2d:4:0:0:0:0:a2:3c:7c:b:70:1a:0:0:0:0:1:0:7a:f8:76:1d:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:63:e5:0:0:0:0:0:0:10:0:0:0
2012-04-29 07:28:33.632578 : kjbcrc[0x15c91.1 76896.0][9]
2012-04-29 07:28:33.632616 : GSIPC:GMBQ: buff 0xba1e8f90, queue 0xbb79f278, pool 0x60013fa0, freeq 1, nxt 0xbb79f278, prv 0xbb79f278
2012-04-29 07:28:33.632634 : kjbmscrc(0x15c91.1)seq 0x2 reqid=0x1c(shadow 0xb4bb4458,reqid x1c)mas@2(infosz 200)(direct 1)
2012-04-29 07:28:33.632654 : kjbsentscn[0x0.12bbc1][to 2]
2012-04-29 07:28:33.632669 : GSIPC:SENDM: send msg 0xba1e9000 dest x20001 seq 24026 type 32 tkts xff0000 mlen x17001a0
2012-04-29 07:28:33.633385 : GSIPC:KSXPCB: msg 0xba1e9000 status 30, type 32, dest 2, rcvr 1
*** 2012-04-29 07:28:33.633
kclwcrs: wait=0 tm=689
*** 2012-04-29 07:28:33.633
kclwcrs: got 1 blocks from ksxprcv
WAIT #140444679938584: nam='gc cr block 2-way' ela= 689 p1=1 p2=89233 p3=1 obj#=76896 tim=1335698913633418
2012-04-29 07:28:33.633490 : kjbcrcomplete[0x15c91.1 76896.0][0]
2012-04-29 07:28:33.633510 : kjbrcvdscn[0x0.12bbc1][from 2][idx 2012-04-29 07:28:33.633527 : kjbrcvdscn[no bscn <= rscn 0x0.12bbc1][from 2]
*** 2012-04-29 07:28:33.633
kclwcrs: req=0 typ=cr(2) wtyp=2hop tm=689


It is clear from the trace that, because statistics had never been gathered on table TEST, dynamic sampling was triggered here. The sampling itself issues CR read requests against TEST and actually produced one 'gc cr block 2-way' wait:

2012-04-29 07:28:33.632654 : kjbsentscn[0x0.12bbc1][to 2]

12bbc1 = 1227713, matching one of the CR blocks in X$BH above. kjbsentscn[0x0.12bbc1][to 2] can be read as: a CR request for DBA=0x15c91.1 76896.0 (obj#=76896) at SCN=12bbc1=1227713 was sent to Instance 2.

The kjbrcvdscn function then confirmed [no bscn <= rscn 0x0.12bbc1][from 2], i.e. there is no best version better than the received SCN version 12bbc1.

 

CR Server Arch

CR Server in Oracle9i Release 2: three instances are involved, the requestor instance, the lock master instance, and the current block owner instance. The lock is granted if one of the following is true:
   - Resource held mode is NULL.
   - Resource held mode is S and there is no holder of an S lock on the master node.
Otherwise, the master forwards the CR request to the holder node. If the lock is global, the node to forward the CR request to is chosen as follows:
   - If there is a past image (PI) at the lock master instance, and the PI SCN is greater than snap-SCN, then the master node is chosen.
   - Otherwise, choose the PI with the smallest SCN among those whose PI-SCN is greater than snap-SCN; the owner node of that PI is the node the CR request is forwarded to. The PI with the smallest SCN is the most interesting one, because less UNDO has to be applied.
   - If there is no PI at all, choose the node that the current buffer belongs to.
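
The forwarding choice described above can be condensed into a short sketch (a simplification of the 9iR2 description, not Oracle code; function and parameter names are ours):

```python
def choose_cr_target(master, current_owner, snap_scn, pis):
    """Pick the node a global CR request is forwarded to.

    pis: list of (node, pi_scn) past images known to the master.
    Rules from the CR server description above:
      1. a PI on the master node with pi_scn > snap_scn wins;
      2. else the PI with the smallest pi_scn still > snap_scn
         (least undo to apply);
      3. else the node that holds the current buffer.
    """
    usable = [(node, scn) for node, scn in pis if scn > snap_scn]
    if any(node == master for node, _ in usable):
        return master
    if usable:
        return min(usable, key=lambda p: p[1])[0]
    return current_owner

# snapshot SCN 100: node 3's PI at SCN 120 needs the least undo applied
print(choose_cr_target(master=1, current_owner=2, snap_scn=100,
                       pis=[(3, 120), (2, 150), (3, 90)]))  # 3
```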


Only after dynamic sampling completed was the user's SELECT actually executed:


PARSING IN CURSOR #140444682869592 len=18 dep=0 uid=0 oct=3 lid=0 tim=1335698913635874 hv=1689401402 ad='b1a188f0' sqlid='c99yw1xkb4f1u'
select * from test
END OF STMT
PARSE #140444682869592:c=4999,e=34017,p=0,cr=7,cu=0,mis=1,r=0,dep=0,og=1,plh=1357081020,tim=1335698913635870
EXEC #140444682869592:c=0,e=23,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1357081020,tim=1335698913635939
WAIT #140444682869592: nam='SQL*Net message to client' ela= 7 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1335698913636071
*** 2012-04-29 07:28:33.636
kclscrs: req=0 block=1/89233
*** 2012-04-29 07:28:33.636
kclscrs: bid=1:3:1:0:7:83:1:0:4:0:0:0:1:2:4:1:26:0:0:0:70:1a:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:4:3:2:1:2:0:2:0:1c:86:2d:4:0:0:0:0:a2:3c:7c:b:70:1a:0:0:0:0:1:0:7d:f8:76:1d:1:2:dc:5:a9:fe:17:75:0:0:0:0:0:0:0:0:0:0:0:0:63:e5:0:0:0:0:0:0:10:0:0:0
2012-04-29 07:28:33.636209 : kjbcrc[0x15c91.1 76896.0][9]
2012-04-29 07:28:33.636228 : GSIPC:GMBQ: buff 0xba0e5d50, queue 0xbb79f278, pool 0x60013fa0, freeq 1, nxt 0xbb79f278, prv 0xbb79f278
2012-04-29 07:28:33.636244 : kjbmscrc(0x15c91.1)seq 0x3 reqid=0x1d(shadow 0xb4bb4458,reqid x1d)mas@2(infosz 200)(direct 1)
2012-04-29 07:28:33.636252 : kjbsentscn[0x0.12bbc4][to 2]
2012-04-29 07:28:33.636358 : GSIPC:SENDM: send msg 0xba0e5dc0 dest x20001 seq 24029 type 32 tkts xff0000 mlen x17001a0
2012-04-29 07:28:33.636861 : GSIPC:KSXPCB: msg 0xba0e5dc0 status 30, type 32, dest 2, rcvr 1
*** 2012-04-29 07:28:33.637
kclwcrs: wait=0 tm=865
*** 2012-04-29 07:28:33.637
kclwcrs: got 1 blocks from ksxprcv
WAIT #140444682869592: nam='gc cr block 2-way' ela= 865 p1=1 p2=89233 p3=1 obj#=76896 tim=1335698913637294
2012-04-29 07:28:33.637356 : kjbcrcomplete[0x15c91.1 76896.0][0]
2012-04-29 07:28:33.637374 : kjbrcvdscn[0x0.12bbc4][from 2][idx 2012-04-29 07:28:33.637389 : kjbrcvdscn[no bscn <= rscn 0x0.12bbc4][from 2]
*** 2012-04-29 07:28:33.637
kclwcrs: req=0 typ=cr(2) wtyp=2hop tm=865

 

Similarly, "SELECT * FROM TEST" also produced one 'gc cr block 2-way' wait:
2012-04-29 07:28:33.637374 : kjbrcvdscn[0x0.12bbc4][from 2][idx 2012-04-29 07:28:33.637389 : kjbrcvdscn[no bscn

Finally, the foreground process got the SCN=1227716 version of the CR block from the remote LMS, again matching the SCN we saw earlier in the X$BH view.

 

This explains why two CR blocks with higher SCNs appeared on Instance 1, but it still does not explain why the three older CR blocks with smaller SCN versions disappeared from Instance 1's buffer cache.

 

Consider the following demo:


SQL> alter system set "_enable_minscn_cr"=false scope=spfile;
System altered.
SQL> alter system set "_db_block_max_cr_dba"=20 scope=spfile;
System altered.
SQL> startup force;
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.
Total System Global Area 1570009088 bytes
Fixed Size                  2228704 bytes
Variable Size             989859360 bytes
Database Buffers          570425344 bytes
Redo Buffers                7495680 bytes
Database mounted.
Database opened.

 

With "_enable_minscn_cr"=false and "_db_block_max_cr_dba"=20 set as above and all RAC instances restarted, repeat the earlier demo:


Each time the block is updated in Instance 2 Session C, run one query on Instance 1, so that Instance 1 repeatedly issues CR requests
SQL> update test set id=id+1 where id=2;              -- Instance 2
1 row updated.
SQL> select * From test;                         -- Instance 1
ID
----------
1
2
Below are the X$BH records on Instance 1
select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
 STATE CR_SCN_BAS
---------- ----------
3    1273080
3    1273071
3    1273041
3    1273039
8          0
SQL>  update test set id=id+1 where id=3;
1 row updated.
SQL> select * From test;
ID
----------
1
2
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
3    1273091
3    1273080
3    1273071
3    1273041
3    1273039
8          0
...................
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
3    1273793
3    1273782
3    1273780
3    1273769
3    1273734
3    1273715
3    1273691
3    1273679
3    1273670
3    1273643
3    1273635
3    1273623
3    1273106
3    1273091
3    1273080
3    1273071
3    1273041
3    1273039
3    1273033
19 rows selected.
SQL> select state,cr_scn_bas from x$bh where file#=1 and dbablk=89233 and state!=0;
STATE CR_SCN_BAS
---------- ----------
3    1274993


As the demo above shows, after setting the two parameters "_enable_minscn_cr" (enable/disable minscn optimization for CR) = false and "_db_block_max_cr_dba" = 20 (maximum allowed number of CR buffers per dba), Instance 1 at its peak cached as many as 19 CR versions of the same data block.

 

"_enable_minscn_cr" is a new hidden parameter introduced in 11g. It controls whether Oracle computes the minimum SCN for CR blocks: when a foreground process receives a newer (higher-SCN) CR version of a data block, it may purge the older, lower-SCN CR copies of that block from the cache buffers chains. The purpose is to limit the number of CR versions of the same data block held in the buffer cache. Note that, for both statement-level and transaction-level consistency, the required snapshot SCN is always the current SCN at the moment the statement or transaction starts; keeping a few old CR blocks may benefit long-running queries or cursors, but the total number of CR versions of one data block in an instance's buffer cache is bounded by the hidden parameter "_db_block_max_cr_dba". With it set to 20, as in our demo above, at most 19 CR versions were cached; note that its default value is 6, i.e. an instance's buffer cache holds no more than 6 CR versions of the same data block at once.
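
The effect of "_db_block_max_cr_dba" on the CR cache can be pictured with a small sketch (our simplification; Oracle's real replacement policy is internal and more involved):

```python
MAX_CR_PER_DBA = 6  # default _db_block_max_cr_dba

def add_cr_version(cr_scns, new_scn, limit=MAX_CR_PER_DBA):
    """Cache a new CR version of one data block, keeping at most `limit`
    versions and discarding the oldest (smallest-SCN) ones first."""
    cr_scns = sorted(set(cr_scns) | {new_scn})
    return cr_scns[-limit:]

# feed in the SCN versions from the demo with a limit of 3
versions = []
for scn in (1227595, 1227638, 1227641, 1227713, 1227716):
    versions = add_cr_version(versions, scn, limit=3)
print(versions)  # only the 3 newest SCN versions survive
```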

 

There is a reason for introducing the "_enable_minscn_cr" MINSCN optimization for CR: even when the older CR versions have been aged out, the foreground process can, whenever needed, issue a CR request and have the holder instance's LMS build a best CR for it, so this is nothing we need to worry about.

Script: Collecting RAC Diagnostic Information

The following checklist can be used to gather RAC diagnostic information. Not every output or log has to be read; during diagnosis, look at the pieces relevant to the problem at hand. For remote assistance, have the user collect everything and package it up, which saves rounds of back-and-forth:


      - Cluster startup status
   - hostname
   - uptime 
   - vmstat iostat nmon OS stats 
   - /etc/sysctl.conf /etc/system  vmo -a no -a 
   - /etc/inittab
   - /etc/init.d/init.crs
   - /etc/init.d/init.crsd
   - /etc/init.d/init.cssd
   - /etc/init.d/init.evmd
   - olsnodes
   - crsctl check css
   - crsctl check crs status
   - crsctl check crsd status
   - crsctl check cssd status
   - crsctl check evmd status
   - crsctl query css votedisk
   - crsctl get css priority
   - crsctl get css diagwait
   - crsctl get css disktimeout
   - crsctl get css misscount
   - crsctl get css reboottime
   - crsctl query crs softwareversion
   - crsctl query crs activeversion
   - opatch lsinventory -detail 
   - /var/opt/oracle/ocr.loc or /etc/oracle/ocr.loc
   - cluvfy comp sys -n <hostname> -p crs -verbose
   - cluvfy comp sys -n <hostname> -p database -verbose
   - cluvfy stage -post crsinst -n <hostname> -verbose
   - cluvfy comp software -verbose -n all
   - Clusterware version
   - $ORA_CRS_HOME installed product information
   - $ORA_CRS_HOME applied patch information
   - crs_stat -u
   - crs_stat -p
   - $ORA_CRS_HOME/log/<hostname>/alert<hostname>.log
   - $ORA_CRS_HOME/crs/log/*.log
   - $ORA_CRS_HOME/log/<hostname>/crsd/*
   - $ORA_CRS_HOME/css/log/*.log
   - $ORA_CRS_HOME/css/init/*.log
   - $ORA_CRS_HOME/log/<hostname>/cssd/*
   - $ORA_CRS_HOME/racg/dump/*
   - $ORACLE_HOME/racg/dump/*
   - $ORA_CRS_HOME/log/<hostname>/racg/*
   - $ORACLE_HOME/log/<hostname>/racg/*
   - $ORA_CRS_HOME/evm/log/*.log
   - $ORA_CRS_HOME/evm/init/*.log
   - $ORA_CRS_HOME/log/<hostname>/evmd/*
   - $ORA_CRS_HOME/log/<hostname>/client/*
   - $ORACLE_HOME/log/<hostname>/client/*
   - $ORA_CRS_HOME/log/<hostname>/ohasd/ohasd.log
   - $ORA_CRS_HOME/log/<hostname>/gpnpd/gpnpd.log
   - /etc/oracle/oprocd/*oprocd.log or /var/opt/oracle/oprocd/*oprocd.log
   - /etc/oracle/lastgasp/*.lgl or /var/opt/oracle/lastgasp/*.lgl
   - debug resource
  # cd $ORA_CRS_HOME/bin
  # crsctl debug log res <resname:level>
  # crsctl debug log res <vip resourfce name>:5
   - ocrdump
   - ocrcheck
   - ocrconfig -showbackup
   - /var/opt/oracle/srvConfig.loc or /etc/srvConfig.loc
   - srvctl config nodeapps -n <hostname> -a -l -g -s
   - srvctl config
   - srvctl config database -d <dbname> -a
   - srvctl config scan_listener
   - srvctl status scan_listener
   - srvctl config scan
   - srvctl status scan
   - srvctl config srvpool
   - srvctl status srvpool -a
   - srvctl config listener
   - srvctl status listener
   - srvctl config listener -n <hostname>
   - oifcfg iflist
   - oifcfg getif
   - interconnect information: x$skgxpia or oradebug ipc
   - IPC information
   - messages or syslog OS logs
   IBM: /bin/errpt -a > messages.out
      Linux: /var/log/messages
      Solaris: /var/adm/messages
   - netstat -rn
   - ifconfig -a
   - cat /etc/hosts
   - netstat -s
