Oracle内部视图X$KFFXP

X$KFFXP是ASM(Automatic Storage Management)自动存储管理特性的重要内部视图,该视图反应了File Extent Map映射关系,ASM会将文件split成多个多个piece分片,这些分片被称为Extents。 在Disk上存放这些Extent的位置,就是我们常说的”Allocation Unit”。

 

KFF意为Kernel File,X$KFFXP即Kernel File Extent Maps, 该内部视图的一条记录代表一个Extent。

 

其字段含义如下:

 

GROUP_KFFXP        diskgroup number (1 - 63) ASM disk group number. Join with v$asm_disk and v$asm_diskgroup

NUMBER_KFFXP      file number for the extent ASM file number. Join with v$asm_file and v$asm_alias

COMPOUND_KFFXP    (group_kffxp << 24) + file # File identifier. Join with compound_index in v$asm_file

INCARN_KFFXP      file incarnation number File incarnation id. Join with incarnation in v$asm_file

PXN_KFFXP            physical extent number  Extent number per file

XNUM_KFFXP          extent number bit 31 set if indirect Logical extent
number per file (mirrored extents have the same value)

LXN_KFFXP            logical extent number 0,1 used to identify primary/mirror extent,
2 identifies file header allocation unit (hypothesis) used in the query such that
we go after only the primary extents, not secondary extents 

DISK_KFFXP          disk on which AU is located  Disk number where the extent is allocated.
Join with v$asm_disk Relative position of the allocation unit from the beginning of the disk. 

AU_KFFXP              AU number on disk of AU allocation unit size (1 MB) in v$asm_diskgroup

从11g开始加入了CHK_KFFXP SIZE_KFFXP 2个新的字段

CHK_KFFXP   未知 可能是范围为[0-256]的某种校验值

SIZE_KFFXP  size_kffxp is used such that we account for variable sized extents. 
sum(size_kffxp) provides the number of AUs that are on that disk.

 

在实例级别控制ASM Diskgroup AU 和 stripe size的是2个隐藏参数 _asm_ausize 1048576 以及 _asm_stripesize 131072。从11g开始一个Extent可能包含多个AU。

 

可以通过以下脚本查询文件与Extent等ASM属性的映射关系:

 

set linesize 140 pagesize 1400
col "FILE NAME" format a40
set head on
select NAME         "FILE NAME",
       NUMBER_KFFXP "FILE NUMBER",
       XNUM_KFFXP   "EXTENT NUMBER",
       DISK_KFFXP   "DISK NUMBER",
       AU_KFFXP     "AU NUMBER",
       SIZE_KFFXP   "NUMBER of AUs"
  from x$kffxp, v$asm_alias
 where GROUP_KFFXP = GROUP_NUMBER
   and NUMBER_KFFXP = FILE_NUMBER
   and system_created = 'Y'
   and lxn_kffxp = 0
 order by name;

ASM file metadata operation等待事件

什么是” ASM file metadata operation “等待事件?

这是一个内部的undocumented等待事件:

“ASM file metadata operation” event is associated with database (instance) communication with ASM (instance). If the database has files in ASM disk group(s) it needs to access disk group(s), get extent maps for files that it already has, get updated extent info (e.g. after a rebalance), create new files in ASM etc.

High waits for ‘ksv master wait’ while doing an ASM file metadata operation were reported when a data migration utility was running. This wait was also seen for a drop of a tablespace.

Many waits for “ASM metadata file operation” may be seen under kfnConnect
when sessions query ASM group statistics in a DB instance.

Rediscovery Notes:
Waits occur under repeated kfnConnect calls when querying certain
views based on fixed (X$) views, such as V$ASM_DISKGROUP_STAT.

At this time on the ASM only a few process(one or two) will be on non-idle
wait events such as “ASM file metadata operation” .

WAITING FOR ‘ASM FILE METADATA OPERATION’ DURING RMAN BACKUP & ASM HANGS

Oracle安装与操作系统用户组

Oracle软件在安装维护过程中长要和操作用户组(OS user group)打交道,从早前的只有oracle用户和dba组发展到今天11gr2中的grid用户和asm组,Oracle管理的日新月异可见一斑。

我们在单实例(single-instance)环境中常用的三个操作用户组,分别是:

oinstall用户组

oinstall 组是Oracle推荐创建的OS用户组之一,建议在系统第一次安装oracle软件产品之前创建该oinstall组,理论上该oinstall组应当拥有oracle软件产品目录(例如$CRS_HOME和$ORACLE_HOME)和oracle Inventory信息目录仓库,oracle Inventory信息目录记录了系统上安装过的oracle产品的记录。关于oracle Inventory产品信息仓库更多内容可以参考<深入理解Oracle Universal Installer (OUI)>一文。

若系统中已有安装过oracle产品软件,则现有的oracle Inventory目录的所有组必须是今后用来安装新oracle软件产品的用户的主组(primary group)。

现有的oracle Inventory拥有者组可以通过/etc/oraInst.loc位置文件了解:

inventory_loc=/u01/app/oracle/oraInventory
inst_group=oinstall

若/etc/oraInst.loc(少数平台不在该位置)位置文件不存在,那么建议创建oinstall用户组,注意在RAC环境中要保持各节点上用户组的gid一致:

# /usr/sbin/groupadd -g GID oinstall

OSDBA用户组(dba)

OSDBA是我们必须要创建的一种系统DBA用户组(dba),若没有该用户组我们将无法安装数据库软件及执行管理数据库的任务。

OSOPER用户组(oper)

OSOPER是一种额外的用户组(oper),我们可以选择要不要创建该用户组,创建该用户组可以满足让os用户行使某些数据库管理权限(包括SYSOPER角色权限)的目的。注意SYSOPER的权限包括startup和shutdown,所以要小心为该用户组添加成员。

 

创建OSOPER用户组的方法:

# /usr/sbin/groupadd oper

综上所述在单机环境(single-instance)中oracle软件拥有者用户(常见的oracle或者orauser),因该同时是oinstall、dba、oper用户组的成员。同时该用户的主用户组必须是oinstall。

Oracle Database 11g release 2中选择Privileged Operating System Groups

rdbms_os_groups_dba_oper

 

而在11.2的GI/CRS环境中数据库软件拥有者用户(oracle或orauser)还必须是asmdba用户组的成员。

usermod -g oinstall -G dba,oper,asmdba [oracle|orauser]

id oracle
uid=54321(oracle) gid=54321(oinstall) groups=54321(oinstall),54322(dba),701(asmdba),54324(oper)

注意OSDBA和OSOPER用户组都受到$ORACLE_HOME/rdbms/lib/config.c 源文件的影响,该文件定义了默认的 SS_DBA_GRP “dba” 和SS_OPER_GRP “oper”,该源文件内容如下:

/*  Refer to the Installation and User's Guide for further information.  */

/* IMPORTANT: this file needs to be in sync with
              rdbms/src/server/osds/config.c, specifically regarding the
              number of elements in the ss_dba_grp array.
 */

#define SS_DBA_GRP "dba"
#define SS_OPER_GRP "oper"
#define SS_ASM_GRP ""

char *ss_dba_grp[] = {SS_DBA_GRP, SS_OPER_GRP, SS_ASM_GRP};
~

11g release2中oracle建议独立地管理Grid Infrastructure和ASM实例,因此有必要创建更多的os用户组以满足不同的权限分配。

我们在11.2的GI中常用的ASM用户组有以下三个:

OSASM(asmadmin)用户组

如果使用ASM,那么我们必须创建osasm(asmadmin)用户组,该OSASM用户组的成员将被赋予SYSASM权限,以满足组成员管理Oracle Clusterware和Oracle ASM的权限需求。

OSDBA for ASM group(asmdba)用户组

OSDBA(asmdba)用户组的成员将被赋予读写访问ASM文件的权限。GI/CRS拥有者用户和所有oracle数据库软件的拥有者必须是该组的成员。同时所有OSDBA(dba)用户组的成员也必须是asmdba组的成员。

OSOPER for ASM(asmoper)用户组

asmoper和osoper类似都是额外的可选择创建的用户组,创建该独立的用户组以满足赋予用户一套受限的ASM实例管理权限(ASM的SYSOPER角色),该权限包括了启动和停止ASM实例,默认情况下OSASM(asmadmin)组成员将拥有所有SYSOPER的ASM管理权限。

在11.2的GI/CRS环境中一般会创建grid或griduser用户来管理GI软件和ASM实例,以如下方式创建grid用户:

 useradd -g oinstall -G asmadmin,asmdba,asmoper grid  

 id grid
 uid=54322(grid) gid=54321(oinstall) groups=54321(oinstall),700(asmadmin),701(asmdba),55000(asmoper)

Oracle 11g release2 Grid Infrastructure中选择Privileged Operating System Groups:

rdbms_os_groups_dba_oper

综合上述OS用户和用户组间的关系:

os_user_group_gi_rac

更多内容可以参考下文:

The OSDBA group (typically, dba)

You must create this group the first time you install Oracle Database software on the system. This group identifies operating system user accounts that have database administrative privileges (the SYSDBA privilege). If you do not create separate OSDBA, OSOPER and OSASM groups for the Oracle ASM instance, then operating system user accounts that have the SYSOPER and SYSASM privileges must be members of this group. The name used for this group in Oracle code examples is dba. If you do not designate a separate group as the OSASM group, then the OSDBA group you define is also by default the OSASM group.

To specify a group name other than the default dba group, then you must choose the Advanced installation type to install the software or start Oracle Universal Installer (OUI) as a user that is not a member of this group. In this case, OUI prompts you to specify the name of this group.

Members of the OSDBA group formerly were granted SYSASM privileges on Oracle ASM instances, including mounting and dismounting disk groups. This privileges grant is removed with Oracle Grid Infrastructure 11g release 2, if different operating system groups are designated as the OSDBA and OSASM groups. If the same group is used for both OSDBA and OSASM, then the privilege is retained.

The OSOPER group for Oracle Database (typically, oper)

This is an optional group. Create this group if you want a separate group of operating system users to have a limited set of database administrative privileges (the SYSOPER privilege). By default, members of the OSDBA group also have all privileges granted by the SYSOPER privilege.

To use the OSOPER group to create a database administrator group with fewer privileges than the default dba group, then you must choose the Advanced installation type to install the software or start OUI as a user that is not a member of the dba group. In this case, OUI prompts you to specify the name of this group. The usual name chosen for this group is oper.

The Oracle Automatic Storage Management Group (typically asmadmin)

This is a required group. Create this group as a separate group if you want to have separate administration privilege groups for Oracle ASM and Oracle Database administrators. In Oracle documentation, the operating system group whose members are granted privileges is called the OSASM group, and in code examples, where there is a group specifically created to grant this privilege, it is referred to as asmadmin.

If you have multiple databases on your system, and use multiple OSDBA groups so that you can provide separate SYSDBA privileges for each database, then you should create a separate OSASM group, and use a separate user from the database users to own the Oracle Grid Infrastructure installation (Oracle Clusterware and Oracle ASM). Oracle ASM can support multiple databases.

Members of the OSASM group can use SQL to connect to an Oracle ASM instance as SYSASM using operating system authentication. The SYSASM privileges permit mounting and dismounting disk groups, and other storage administration tasks. SYSASM privileges provide no access privileges on an RDBMS instance.

The Oracle ASM Database Administrator group (OSDBA for ASM, typically asmdba)

Members of the Oracle ASM Database Administrator group (OSDBA for ASM) are granted read and write access to files managed by Oracle ASM. The Oracle Grid Infrastructure installation owner and all Oracle Database software owners must be a member of this group, and all users with OSDBA membership on databases that have access to the files managed by Oracle ASM must be members of the OSDBA group for ASM.

Members of the Oracle ASM Operator Group (OSOPER for ASM, typically asmoper)

This is an optional group. Create this group if you want a separate group of operating system users to have a limited set of Oracle ASM instance administrative privileges (the SYSOPER for ASM privilege), including starting up and stopping the Oracle ASM instance. By default, members of the OSASM group also have all privileges granted by the SYSOPER for ASM privilege.

To use the Oracle ASM Operator group to create an ASM administrator group with fewer privileges than the default asmadmin group, then you must choose the Advanced installation type to install the software, In this case, OUI prompts you to specify the name of this group. In code examples, this group is asmoper.

An Oracle central inventory group, or oraInventory group (oinstall). Members who have the central inventory group as their primary group, are granted the OINSTALL permission to write to the oraInventory directory.

A single system privileges group that is used as the OSASM, OSDBA, OSDBA for ASM, and OSOPER for ASM group (dba), whose members are granted the SYSASM and SYSDBA privilege to administer Oracle Clusterware, Oracle ASM, and Oracle Database, and are granted SYSASM and OSOPER for ASM access to the Oracle ASM storage.

An Oracle grid installation for a cluster owner (grid), with the oraInventory group as its primary group, and with the OSASM group as the secondary group, with its Oracle base directory /u01/app/grid.

An Oracle Database owner (oracle) with the oraInventory group as its primary group, and the OSDBA group as its secondary group, with its Oracle base directory /u01/app/oracle.

/u01/app owned by grid:oinstall with 775 permissions before installation, and by root after the root.sh script is run during installation. This ownership and permissions enables OUI to create the Oracle Inventory directory, in the path /u01/app/oraInventory.

/u01 owned by grid:oinstall before installation, and by root after the root.sh script is run during installation.

/u01/app/11.2.0/grid owned by grid:oinstall with 775 permissions. These permissions are required for installation, and are changed during the installation process.

/u01/app/grid owned by grid:oinstall with 775 permissions before installation, and 755 permissions after installation.

/u01/app/oracle owned by oracle:oinstall with 775 permissions.

An Oracle central inventory group, or oraInventory group (oinstall), whose members that have this group as their primary group are granted permissions to write to the oraInventory directory.

A separate OSASM group (asmadmin), whose members are granted the SYSASM privilege to administer Oracle Clusterware and Oracle ASM.

A separate OSDBA for ASM group (asmdba), whose members include grid, oracle1 and oracle2, and who are granted access to Oracle ASM.

A separate OSOPER for ASM group (asmoper), whose members are granted limited Oracle ASM administrator privileges, including the permissions to start and stop the Oracle ASM instance.

An Oracle grid installation for a cluster owner (grid), with the oraInventory group as its primary group, and with the OSASM (asmadmin), OSDBA for ASM (asmdba) group as a secondary group.

Two separate OSDBA groups for two different databases (dba1 and dba2) to establish separate SYSDBA privileges for each database.

Two Oracle Database software owners (oracle1 and oracle2), to divide ownership of the Oracle database binaries, with the OraInventory group as their primary group, and the OSDBA group for their database (dba1 or dba2) and the OSDBA for ASM group (asmdba) as their secondary groups.

An OFA-compliant mount point /u01 owned by grid:oinstall before installation.

An Oracle base for the grid installation owner /u01/app/grid owned by grid:oinstall with 775 permissions, and changed during the installation process to 755 permissions.

An Oracle base /u01/app/oracle1 owned by oracle1:oinstall with 775 permissions.

An Oracle base /u01/app/oracle 2 owned by oracle2:oinstall with 775 permissions.

A Grid home /u01/app/11.2.0/grid owned by grid:oinstall with 775 (drwxdrwxr-x) permissions. These permissions are required for installation, and are changed during the installation process to root:oinstall with 755 permissions (drwxr-xr-x).

/u01/app/oraInventory. This path remains owned by grid:oinstall, to enable other Oracle software owners to write to the central inventory.

利用Cluster Verify Utility工具体验RAC最佳实践

Cluster Verification Utilit(CVU)是Oracle所推荐的一种集群检验工具。该检验工具帮助用户在Cluter部署的各个阶段验证集群的重要组件,这些阶段包括硬件搭建、Clusterware的安装、RDBMS的安装、存储等等。我们既可以在Cluster安装之前使用CVU来帮我们检验所配置的环境正确可用,也可以在软件安装完成后使用CVU来做对集群的验收。

CVU提供了一种可扩展的框架,其所实施的常规检验活动是独立于具体的平台,并且向存储和网络的检验提供了厂商接口(Vendor Interface)。
CVU工具不依赖于其他Oracle软件,仅使用命令cluvfy,如cluvfy stage -pre crsinst -n vrh1,vrh2。

cluvfy的部署十分简单,在本地节点安装后,该工具在运行过程中会自动部署到远程主机上。具体的自动部署流程如下:

  1. 用户在本地节点安装CVU
  2. 用户针对多个节点实施Verify检验命令
  3. CVU工具将拷贝自身必要的文件到远程节点
  4. CVU会在所有节点执行检验任务并生成报告

CVU工具可以为我们提供以下功能:

  1. 验证Cluster集群是否规范配置以便后续的RAC安装、配置和操作顺利
  2. 全类型的验证
  3. 非破坏性的验证
  4. 提供了易于使用的接口
  5. 支持各种平台和配置的RAC,明确完善的统一行为方式

注意不要误解cluvfy的作用,它仅仅是一个检验者,而不负责实际的配置或修复工作:

  1. cluvfy不支持任何类型的cluster或RAC操作
  2. 在检验到问题或失败后,cluvfy不会采取任何修正行为
  3. cluvfy不是性能调优或监控工具
  4. cluvfy不会尝试帮助你验证RAC数据库的内部结构

RAC的实际部署可以被逻辑地区分为多个操作阶段,这些阶段被称作是”stage”,在实际的部署过程中每一个stage由一系列的操作组成。每一个stage的都有自身的预检查(pre-check)和验收检查(post-check),如图:

cluvfy_stage_list

 

我们会在CRS和RAC数据库的安装过程中具体使用Cluster Verify Utility的不同”stage”,各种不同的stage可以使用cluvfy stage -list命令列出:

cluvfy stage -list

USAGE:
cluvfy stage {-pre|-post} <stage-name> <stage-specific options>  [-verbose]

Valid Stages are:
      -pre cfs        : pre-check for CFS setup
      -pre crsinst    : pre-check for CRS installation
      -pre acfscfg    : pre-check for ACFS Configuration.
      -pre dbinst     : pre-check for database installation
      -pre dbcfg      : pre-check for database configuration
      -pre hacfg      : pre-check for HA configuration
      -pre nodeadd    : pre-check for node addition.
      -post hwos      : post-check for hardware and operating system
      -post cfs       : post-check for CFS setup
      -post crsinst   : post-check for CRS installation
      -post acfscfg   : post-check for ACFS Configuration.
      -post hacfg     : post-check for HA configuration
      -post nodeadd   : post-check for node addition.
      -post nodedel   : post-check for node deletion.

在RAC Cluster中独立的子系统或者模块被称作组件(component),集群组件的可用性、完整性、稳定性以及其他一些表现均可以使用CVU来验证。简单如某个存储设备、复杂如包含CRSD、EVMD、CSSD、OCR等多个子组件的CRS stack都可以被认为是一个组件。

在CRS运行过程中为了检验Cluster中的某个特定组件(component)或者为了独立诊断某个Cluster集群子系统,需要用到合适的组件检查命令;各种不同组件的检验可以使用cluvfy comp -list命令列出:

 cluvfy comp -list

USAGE:
cluvfy comp     [-verbose]

Valid Components are:
      nodereach       : checks reachability between nodes
      nodecon         : checks node connectivity
      cfs             : checks CFS integrity
      ssa             : checks shared storage accessibility
      space           : checks space availability
      sys             : checks minimum system requirements
      clu             : checks cluster integrity
      clumgr          : checks cluster manager integrity
      ocr             : checks OCR integrity
      olr             : checks OLR integrity
      ha              : checks HA integrity
      crs             : checks CRS integrity
      nodeapp         : checks node applications existence
      admprv          : checks administrative privileges
      peer            : compares properties with peers
      software        : checks software distribution
      acfs            : checks ACFS integrity
      asm             : checks ASM integrity
      gpnp            : checks GPnP integrity
      gns             : checks GNS integrity
      scan            : checks SCAN configuration
      ohasd           : checks OHASD integrity
      clocksync       : checks Clock Synchronization
      vdisk           : checks Voting Disk configuration and UDEV settings
      dhcp            : Checks DHCP configuration
      dns             : Checks DNS configuration

我们可以从OTN的CVU专栏内下载到最新版本的cluvfy,如果没有特殊的要求那么我们总是推荐使用最新版本。一般在完成RAC安装后也可以从以下2个路径找到cluvfy:

Clusterware Home
<crs_home>/bin/cluvfy

Oracle Home
$ORACLE_HOME/bin/cluvfy

使用最为频繁的几个cluvfy命令如下:

Verify the hardware and operating system:检验操作系统和硬件的配置

cluvfy stage -post hwos -n vrh1,vrh2

cluvfy stage -post hwos -n vrh1,vrh2

Performing post-checks for hardware and operating system setup 

Checking node reachability...
Node reachability check passed from node "vrh1"

Checking user equivalence...
User equivalence check passed for user "grid"

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Node connectivity passed for subnet "192.168.1.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.1.0"

Node connectivity passed for subnet "192.168.2.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.2.0"

Node connectivity passed for subnet "169.254.0.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "169.254.0.0"

Interfaces found on subnet "192.168.1.0" that are likely candidates for VIP are:
vrh2 eth0:192.168.1.163 eth0:192.168.1.164 eth0:192.168.1.166
vrh1 eth0:192.168.1.161 eth0:192.168.1.190 eth0:192.168.1.162

Interfaces found on subnet "169.254.0.0" that are likely candidates for VIP are:
vrh2 eth1:169.254.8.92
vrh1 eth1:169.254.175.195

Interfaces found on subnet "192.168.2.0" that are likely candidates for a private interconnect are:
vrh2 eth1:192.168.2.19
vrh1 eth1:192.168.2.18

Node connectivity check passed

Check for multiple users with UID value 0 passed
Time zone consistency check passed

Checking shared storage accessibility...

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdb                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdc                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdd                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sde                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdf                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdg                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdh                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdi                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdj                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdk                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdl                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdm                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdn                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdo                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdp                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdq                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdr                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sds                              vrh2 vrh1               

  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdt                              vrh2 vrh1               
Shared storage check was successful on nodes "vrh2,vrh1"
Post-check for hardware and operating system setup was successful.

Cluster Installation Ready check on all nodes:安装Clusterware前执行以下命令

 cluvfy stage -pre crsinst -n vrh1,vrh2

cluvfy stage -pre crsinst -n vrh1,vrh2
Performing pre-checks for cluster services setup
Checking node reachability...
Node reachability check passed from node "vrh1"
Checking user equivalence...
User equivalence check passed for user "grid"
Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful
Node connectivity passed for subnet "192.168.1.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.1.0"

Node connectivity passed for subnet "192.168.2.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "192.168.2.0"

Node connectivity passed for subnet "169.254.0.0" with node(s) vrh2,vrh1
TCP connectivity check passed for subnet "169.254.0.0"

Interfaces found on subnet "192.168.1.0" that are likely candidates for VIP are:
vrh2 eth0:192.168.1.163 eth0:192.168.1.164 eth0:192.168.1.166
vrh1 eth0:192.168.1.161 eth0:192.168.1.190 eth0:192.168.1.162

Interfaces found on subnet "169.254.0.0" that are likely candidates for VIP are:
vrh2 eth1:169.254.8.92
vrh1 eth1:169.254.175.195

Interfaces found on subnet "192.168.2.0" that are likely candidates for a private interconnect are:
vrh2 eth1:192.168.2.19
vrh1 eth1:192.168.2.18

Node connectivity check passed

Checking ASMLib configuration.
Check for ASMLib configuration passed.
Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "vrh2:/tmp"
Free disk space check passed for "vrh1:/tmp"
Check for multiple users with UID value 54322 passed
User existence check passed for "grid"
Group existence check passed for "oinstall"
Group existence check passed for "dba"
Membership check for user "grid" in group "oinstall" [as Primary] failed
Check failed on nodes:
        vrh2,vrh1
Membership check for user "grid" in group "dba" failed
Check failed on nodes:
        vrh2,vrh1
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "make-3.81( x86_64)"
Package existence check passed for "binutils-2.17.50.0.6( x86_64)"
Package existence check passed for "gcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "glibc-2.5-24 (x86_64)( x86_64)"
Package existence check passed for "compat-libstdc++-33-3.2.3 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-0.125 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-devel-0.125( x86_64)"
Package existence check passed for "glibc-common-2.5( x86_64)"
Package existence check passed for "glibc-devel-2.5 (x86_64)( x86_64)"
Package existence check passed for "glibc-headers-2.5( x86_64)"
Package existence check passed for "gcc-c++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-devel-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "libgcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-devel-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "sysstat-7.0.2( x86_64)"
Package existence check passed for "ksh-20060214( x86_64)"
Check for multiple users with UID value 0 passed
Current group ID check passed

Starting Clock synchronization checks using Network Time Protocol(NTP)...

NTP Configuration file check started...
No NTP Daemons or Services were found to be running

Clock synchronization check using Network Time Protocol(NTP) passed

Core file name pattern consistency check passed.

User "grid" is not part of "root" group. Check passed
Default user file creation mask check passed
Checking consistency of file "/etc/resolv.conf" across nodes

File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
All nodes have one search entry defined in file "/etc/resolv.conf"
The DNS response time for an unreachable node is within acceptable limit on all nodes

File "/etc/resolv.conf" is consistent across nodes

Time zone consistency check passed

Starting check for Huge Pages Existence ...

Check for Huge Pages Existence passed

Starting check for Hardware Clock synchronization at shutdown ...

Check for Hardware Clock synchronization at shutdown passed

Pre-check for cluster services setup was unsuccessful on all the nodes.

Database Installation Ready check on all nodes:安装RDBMS前执行以下命令
cluvfy stage -pre dbinst -n vrh1,vrh2

cluvfy stage -pre dbinst -n vrh1,vrh2 

Performing pre-checks for database installation 

Checking node reachability...
Node reachability check passed from node "vrh1"

Checking user equivalence...
User equivalence check passed for user "grid"

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"

Check: Node connectivity for interface "eth1"
Node connectivity passed for interface "eth1"

Node connectivity check passed

Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "vrh2:/tmp"
Free disk space check passed for "vrh1:/tmp"
Check for multiple users with UID value 54322 passed
User existence check passed for "grid"
Group existence check passed for "asmadmin"
Group existence check passed for "dba"
Membership check for user "grid" in group "asmadmin" [as Primary] passed
Membership check for user "grid" in group "dba" failed
Check failed on nodes:
        vrh2,vrh1
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Soft limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
Soft limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "make-3.81( x86_64)"
Package existence check passed for "binutils-2.17.50.0.6( x86_64)"
Package existence check passed for "gcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "glibc-2.5-24 (x86_64)( x86_64)"
Package existence check passed for "compat-libstdc++-33-3.2.3 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-0.125 (x86_64)( x86_64)"
Package existence check passed for "elfutils-libelf-devel-0.125( x86_64)"
Package existence check passed for "glibc-common-2.5( x86_64)"
Package existence check passed for "glibc-devel-2.5 (x86_64)( x86_64)"
Package existence check passed for "glibc-headers-2.5( x86_64)"
Package existence check passed for "gcc-c++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libaio-devel-0.3.106 (x86_64)( x86_64)"
Package existence check passed for "libgcc-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "libstdc++-devel-4.1.2 (x86_64)( x86_64)"
Package existence check passed for "sysstat-7.0.2( x86_64)"
Package existence check passed for "ksh-20060214( x86_64)"
Check for multiple users with UID value 0 passed
Current group ID check passed
Default user file creation mask check passed

Checking CRS integrity...

CRS integrity check passed

Checking Cluster manager integrity... 

Checking CSS daemon...
Oracle Cluster Synchronization Services appear to be online.

Cluster manager integrity check passed

Checking node application existence...

Checking existence of VIP node application (required)
VIP node application check passed

Checking existence of NETWORK node application (required)
NETWORK node application check passed

Checking existence of GSD node application (optional)
GSD node application is offline on nodes "vrh2,vrh1"

Checking existence of ONS node application (optional)
ONS node application check passed

Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed

Checking if CTSS Resource is running on all nodes...
CTSS resource check passed

Querying CTSS for time offset on all nodes...
Query of CTSS for time offset passed

Check CTSS state started...
CTSS is in Active state. Proceeding with check of clock time offsets on all nodes...
Check of clock time offsets passed

Oracle Cluster Time Synchronization Services check passed
Checking consistency of file "/etc/resolv.conf" across nodes

File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
All nodes have one search entry defined in file "/etc/resolv.conf"
The DNS response time for an unreachable node is within acceptable limit on all nodes

File "/etc/resolv.conf" is consistent across nodes

Time zone consistency check passed
Checking VIP configuration.
Checking VIP Subnet configuration.
Check for VIP Subnet configuration passed.
Checking VIP reachability
Check for VIP reachability passed.

Pre-check for database installation was unsuccessful on all the nodes.

利用cluvfy检验ocr组件:
cluvfy comp ocr

cluvfy comp ocr
Verifying OCR integrity

Checking OCR integrity...

Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations

ASM Running check passed. ASM is running on all specified nodes

Checking OCR config file "/etc/oracle/ocr.loc"...

OCR config file "/etc/oracle/ocr.loc" check successful

Disk group for ocr location "+SYSTEMDG" available on all the nodes

Disk group for ocr location "+FRA" available on all the nodes

Disk group for ocr location "+DATA" available on all the nodes

NOTE:
This check does not verify the integrity of the OCR contents.
Execute 'ocrcheck' as a privileged user to verify the contents of OCR.

OCR integrity check passed

Verification of OCR integrity was successful.

CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM

客户的一套11.2.0.1 RAC系统采用ASM diskgroup 存放ocr和votedisk,该REG diskgroup中的某个LUN disk由于硬件的原因损坏了,导致冗余的votedisk表决磁盘有一个处于OFFLINE状态,客户希望能删除该OFFLINE的votedisk并新增一个可用的。

在删除该votedisk文件时出现了CRS-4258的错误,错误如下:

crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. OFFLINE  5b3380d6367e4f94bf19e9db5f2f684e ()  []
 2. ONLINE   6802e6d139354fb3bf95725dd01a02fd (/dev/ocr2) [REG]
 3. ONLINE   a433d51ebd2d4facbfc8e95b017f5393 (/dev/asm-disk1) [REG]
 4. ONLINE   3784d344bffa4f6ebff21c4dd3c873bd (/dev/asm-disk2) [REG]
Located 4 voting disk(s).

crsctl delete css votedisk 5b3380d6367e4f94bf19e9db5f2f684e
CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM

居然无法移除ASM存储上的voting files,太搞笑了。

客户在MOS上找到了CRS-4258相关问题的Note:

CRS-4258: Addition and Deletion of Voting Files are not Allowed Because the Voting Files are on ASM in 11gR2 [ID 1060146.1]
Applies to:
Oracle Server - Enterprise Edition - Version: 11.2.0.1 to 11.2.0.1 - Release: 11.2 to 11.2
Information in this document applies to any platform.
Symptoms

CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM in 11gR2.


Changes
Stale voting files are seen after accidently dropping one of ASM disks belonging to the ASM diskgroup where voting files are stored.
And CRS-4258 occurs when trying to delete the stale voting files using crsctl delete css votedisk FUID.

[root@grid]# crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 5b91aad0a2184f3dbfa8f970e8ae4d49 (/dev/oracleasm/disks/ASM10) [PLAY]
2. ONLINE 53b1b40b73164f9ebf3f498f6d460187 (/dev/oracleasm/disks/ASM9) [PLAY]
3. OFFLINE 82dfd04b96f14f6dbf36f5a62b118f61 () []

[root@grid]# crsctl delete css votedisk 82dfd04b96f14f6dbf36f5a62b118f61
CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM
Cause

1. Seeing stale voting files is due to bug 9024611.

2. "delete" command is not available , only "replace" command is available when voting files are stored on  ASM diskgroup.    

    Please see Oracle Clusterware Administration and Deployment Guide11g Release 2 (11.2)

Solution


1. This issue is permanently fixed in 11.2.0.2.0.

2. Apply patch 9024611. Please contact Oracle support if this patch is not available on your platform.

3. If CSS has stale voting files even after applying patch 9024611, do the following workaround -

WORKAROUND:
Do something to trigger ASM to try to relocate the voting file.

e.g)  $ crsctl replace votedisk  +asm_disk_group   --- Put available ASM diskgroup

        $ crsctl query css votedisk         --- Check if voting files are all online on the new ASM diskgroup
        $ crsctl replace votedisk +PLAY    -- Put the original ASM diskgroup where voting files were 

4. If the workaround above cannot be followed for any reason then you can request the patch for unpublished bug 9409327 for your platform.

References
BUG:9294664 - NOT ABLE TO REMOVE THE VOTEDISK WHICH IS OFFILNE

Hdr: 9294664 11.2.0.1 PCW 11.2.0.1 ADMUTL PRODID-5 PORTID-226 9024611
Abstract: NOT ABLE TO REMOVE THE VOTEDISK WHICH IS OFFILNE

PROBLEM:
--------
crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   9f7f4f7f798d4f69bfe31653894421a2 (ORCL:GRID1) [GRID]
2. OFFLINE  a9b785a59c3c4f67bf15babc67ffb79a () []
3. OFFLINE  29988f37fa794f12bfea3f3672c99609 () []
4. ONLINE   a8b3a040195c4f54bfce8ef21bd4fa07 (ORCL:GRID3) [GRID]
5. ONLINE   a1e4fbd9df6f4f67bf8fc12fe9780721 (ORCL:GRID2) [GRID]
Located 5 voting disk(s).


[root@sdc-drrac01 grid]# crsctl delete css votedisk 
a9b785a59c3c4f67bf15babc67ffb79a
CRS-4258: Addition and deletion of voting files are not allowed because the 
voting files are on ASM

DIAGNOSTIC ANALYSIS:
--------------------
Ct is performing some voting disk failover scenarios in which he has removed 
the 2 votedisk which were on ASM buy drop disk using asmlib and after that 
recreating the disk again and start the cluster in exclusive mode and start 
the ASM and mount the diskgourp So that rebalancing has been done but after 
that

 crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   9f7f4f7f798d4f69bfe31653894421a2 (ORCL:GRID1) [GRID]
2. OFFLINE  a9b785a59c3c4f67bf15babc67ffb79a () []
3. OFFLINE  29988f37fa794f12bfea3f3672c99609 () []
4. ONLINE   a8b3a040195c4f54bfce8ef21bd4fa07 (ORCL:GRID3) [GRID]
5. ONLINE   a1e4fbd9df6f4f67bf8fc12fe9780721 (ORCL:GRID2) [GRID]
Located 5 voting disk(s).

and not able to drop the vote disk which is offiline 

WORKAROUND:
-----------
n/a

RELATED BUGS:
-------------
as per bug 9024611 tried the workaround:

but while running 

crsctl css votedisk delete 

we got syntax error and found that there is no command with crsctl css ...

这个Note说明不能移除ASM存储内voting files的问题在11.2.0.2.0上已经解决了,也可以通过安装one-off patch 9024611来修复。

但是实际在11.2.0.2上测试可以发现仍旧无法删除ASM上的voting files:

root@rh2 ~]# crsctl query crs  releaseversion
Oracle High Availability Services release version on the local node is [11.2.0.2.0]

[root@rh2 ~]# crsctl query crs  activeversion
Oracle Clusterware active version on the cluster is [11.2.0.2.0]


[grid@rh2 ~]$ /s01/grid/OPatch/opatch lsinventory
Invoking OPatch 11.2.0.1.1

Oracle Interim Patch Installer version 11.2.0.1.1
Copyright (c) 2009, Oracle Corporation.  All rights reserved.


Oracle Home       : /s01/grid
Central Inventory : /s01/app/oraInventory
   from           : /etc/oraInst.loc
OPatch version    : 11.2.0.1.1
OUI version       : 11.2.0.2.0
OUI location      : /s01/grid/oui
Log file location : /s01/grid/cfgtoollogs/opatch/opatch2011-08-04_18-50-34PM.log

Patch history file: /s01/grid/cfgtoollogs/opatch/opatch_history.txt

Lsinventory Output file location : /s01/grid/cfgtoollogs/opatch/lsinv/lsinventory2011-08-04_18-50-34PM.txt

--------------------------------------------------------------------------------
Installed Top-level Products (1): 

Oracle Grid Infrastructure                                           11.2.0.2.0
There are 1 products installed in this Oracle Home.


There are no Interim patches installed in this Oracle Home.


Rac system comprising of multiple nodes
  Local node = rh2
  Remote node = rh3

--------------------------------------------------------------------------------

OPatch succeeded.


[root@rh2 ~]# crsctl delete css votedisk a433d51ebd2d4facbfc8e95b017f5393

CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM

又是一个伪修复的Bug….!!

无法,寄希望与replace能解决问题,结果发现:

crsctl replace votedisk +DATA
Failed to create voting files on disk group DATA.
Change to configuration failed, but was successfully rolled back.
CRS-4000: Command Replace failed, or completed with errors.

方法四中指出的unpublished bug 9409327(Patch 9409327: OFFLINE VF ENTRY REMAINS AFTER PATCH FOR BUG 9024611),目前仅在IBM AIX on POWER Systems (64-bit)的11.2.0.1上有对应的补丁。

利用UDEV服务解决RAC ASM存储设备名

<Why ASMLIB and why not?>我们介绍了使用ASMLIB作为一种专门为Oracle Automatic Storage Management特性设计的内核支持库(kernel support library)的优缺点,同时建议使用成熟的UDEV方案来替代ASMLIB。

这里我们就给出配置UDEV的具体步骤,还是比较简单的:

1.确认在所有RAC节点上已经安装了必要的UDEV包

[root@rh2 ~]# rpm -qa|grep udev
udev-095-14.21.el5

2.通过scsi_id获取设备的块设备的唯一标识名,假设系统上已有LUN sdc-sdp

for i in c d e f g h i j k l m n o p ;
do
echo "sd$i" "`scsi_id -g -u -s /block/sd$i` ";
done

sdc 1IET_00010001
sdd 1IET_00010002
sde 1IET_00010003
sdf 1IET_00010004
sdg 1IET_00010005
sdh 1IET_00010006
sdi 1IET_00010007
sdj 1IET_00010008
sdk 1IET_00010009
sdl 1IET_0001000a
sdm 1IET_0001000b
sdn 1IET_0001000c
sdo 1IET_0001000d
sdp 1IET_0001000e 

以上列出于块设备名对应的唯一标识名

3.创建必要的UDEV配置文件,

首先切换到配置文件目录

[root@rh2 ~]# cd /etc/udev/rules.d

定义必要的规则配置文件

[root@rh2 rules.d]# touch 99-oracle-asmdevices.rules 

[root@rh2 rules.d]# cat 99-oracle-asmdevices.rules
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_00010001", NAME="ocr1", OWNER="grid", GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_00010002", NAME="ocr2", OWNER="grid", GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_00010003", NAME="asm-disk1",  OWNER="grid",  GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_00010004", NAME="asm-disk2",  OWNER="grid",  GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_00010005", NAME="asm-disk3",  OWNER="grid",  GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_00010006", NAME="asm-disk4",  OWNER="grid",  GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_00010007", NAME="asm-disk5",  OWNER="grid",  GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_00010008", NAME="asm-disk6",  OWNER="grid",  GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_00010009", NAME="asm-disk7",  OWNER="grid",  GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_0001000a", NAME="asm-disk8",  OWNER="grid",  GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_0001000b", NAME="asm-disk9",  OWNER="grid",  GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_0001000c", NAME="asm-disk10", OWNER="grid",  GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_0001000d", NAME="asm-disk11", OWNER="grid",  GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -s %p", RESULT=="1IET_0001000e", NAME="asm-disk12", OWNER="grid",  GROUP="asmadmin", MODE="0660"

Result 为/sbin/scsi_id -g -u -s %p的输出--Match the returned string of the last PROGRAM call. This key may be
used in any following rule after a PROGRAM call.
按顺序填入刚才获取的唯一标识名即可

OWNER为安装Grid Infrastructure的用户,在11gr2中一般为grid,GROUP为asmadmin
MODE采用0660即可

NAME为UDEV映射后的设备名,
建议为OCR和VOTE DISK创建独立的DISKGROUP,为了容易区分将该DISKGROUP专用的设备命名为ocr1..ocrn的形式
其余磁盘可以根据其实际用途或磁盘组名来命名

4.将该规则文件拷贝到其他节点上
[root@rh2 rules.d]# scp 99-oracle-asmdevices.rules Other_node:/etc/udev/rules.d

5.在所有节点上启动udev服务,或者重启服务器即可

[root@rh2 rules.d]# /sbin/udevcontrol reload_rules
[root@rh2 rules.d]# /sbin/start_udev
Starting udev:                                            [  OK  ]

6.检查设备是否到位

[root@rh2 rules.d]# cd /dev
[root@rh2 dev]# ls -l ocr*
brw-rw---- 1 grid asmadmin 8, 32 Jul 10 17:31 ocr1
brw-rw---- 1 grid asmadmin 8, 48 Jul 10 17:31 ocr2

[root@rh2 dev]# ls -l asm-disk*
brw-rw---- 1 grid asmadmin 8,  64 Jul 10 17:31 asm-disk1
brw-rw---- 1 grid asmadmin 8, 208 Jul 10 17:31 asm-disk10
brw-rw---- 1 grid asmadmin 8, 224 Jul 10 17:31 asm-disk11
brw-rw---- 1 grid asmadmin 8, 240 Jul 10 17:31 asm-disk12
brw-rw---- 1 grid asmadmin 8,  80 Jul 10 17:31 asm-disk2
brw-rw---- 1 grid asmadmin 8,  96 Jul 10 17:31 asm-disk3
brw-rw---- 1 grid asmadmin 8, 112 Jul 10 17:31 asm-disk4
brw-rw---- 1 grid asmadmin 8, 128 Jul 10 17:31 asm-disk5
brw-rw---- 1 grid asmadmin 8, 144 Jul 10 17:31 asm-disk6
brw-rw---- 1 grid asmadmin 8, 160 Jul 10 17:31 asm-disk7
brw-rw---- 1 grid asmadmin 8, 176 Jul 10 17:31 asm-disk8
brw-rw---- 1 grid asmadmin 8, 192 Jul 10 17:31 asm-disk9

Oracle等待事件kfk:async disk IO

kfk: async disk IO等待事件是ASM下异步的System I/O等待事件,kfk内核层面在disk_asynch_io=true时被激活。当rbal或其他ASM相关后台进程在维护ASM磁盘组时可能进入kfk: async disk IO等待。

SQL> col name for a20
SQL> col PARAMETER1 for a10
SQL> col PARAMETER2 for a10
SQL> col PARAMETER3 for a10
SQL> col WAIT_CLASS for a15

SQL> select name,parameter1,parameter2,parameter3,wait_class from v$event_name where name='kfk: async disk IO';

NAME                 PARAMETER1 PARAMETER2 PARAMETER3 WAIT_CLASS
-------------------- ---------- ---------- ---------- ---------------
kfk: async disk IO   count      intr       timeout    System I/O

SQL> select * from v$version;    

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

SQL> select name,value from v$system_parameter where name in ('instance_type','asm_power_limit');

NAME                 VALUE
-------------------- ----------
instance_type        asm
asm_power_limit      10

SQL> conn / as sysasm
Connected.

SQL> oradebug setmypid;
Statement processed.
SQL> oradebug event 10046 trace name context forever,level 8;
Statement processed.
SQL> alter diskgroup data check all;

Diskgroup altered.

SQL> oradebug event 10046 trace name context off;
Statement processed.

SQL> oradebug tracefile_name;
/s01/orabase/diag/asm/+asm/+ASM1/trace/+ASM1_ora_29405.trc

=====================trace=====================
PARSING IN CURSOR #140442102181424 len=30 dep=0 uid=0 oct=193 
lid=0 tim=1313673029551496 hv=2849532521 ad='6bd58b50' sqlid='ft5h7dunxhum9'
alter diskgroup data check all
END OF STMT
PARSE #140442102181424:c=1999,e=14171,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=1313673029551493
WAIT #140442102181424: nam='Disk file operations I/O' ela= 573 FileOperation=2 fileno=0 filetype=15 obj#=-1 
WAIT #140442102181424: nam='Disk file operations I/O' ela= 33 FileOperation=2 fileno=0 filetype=15 obj#=-1 
WAIT #140442102181424: nam='Disk file operations I/O' ela= 29 FileOperation=2 fileno=0 filetype=15 obj#=-1 
WAIT #140442102181424: nam='kfk: async disk IO' ela= 941 count=1 intr=0 timeout=4294967295 obj#=-1

fdp_checkDsk(): 20
----- Abridged Call Stack Trace -----
ksedsts()+461<-kfdp_checkDsk()+476<-kfdCheck()+1649<-kfgCheck()+477<-kfxdrvAl
terOne()+5976<-kfxdrvAlter()+2287<-kfxdrvEntry()+1306<-opiexe()+20028<-opiosq
0()+3993<-kpooprx()+274<-kpoal8()+800<-opiodr()+910<-ttcpip()+2289<-opitsk()+
1670<-opiino()+966<-opiodr()+910
<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+252<-main()+201<
-__libc_start_main()+244<-_start()+36
----- End of Abridged Call Stack Trace -----
WAIT #140442102181424: nam='rdbms ipc reply' ela= 1610 from_process=19 timeou
t=2147483647 p3=0 obj#=-1 tim=1313673029798048
kfdp_checkDsk(): 21
----- Abridged Call Stack Trace -----
ksedsts()+461<-kfdp_checkDsk()+476<-kfdCheck()+1649<-kfgCheck()+477<-kfxdrvAl
terOne()+5976<-kfxdrvAlter()+2287<-kfxdrvEntry()+1306<-opiexe()+20028<-opiosq
0()+3993<-kpooprx()+274<-kpoal8()+800<-opiodr()+910<-ttcpip()+2289<-opitsk()+
1670<-opiino()+966<-opiodr()+910
<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+252<-main()+201<
-__libc_start_main()+244<-_start()+36
----- End of Abridged Call Stack Trace -----
WAIT #140442102181424: nam='rdbms ipc reply' ela= 1677 from_process=19 timeou
t=2147483647 p3=0 obj#=-1 tim=1313673029885645
kfdp_checkDsk(): 22
----- Abridged Call Stack Trace -----
ksedsts()+461<-kfdp_checkDsk()+476<-kfdCheck()+1649<-kfgCheck()+477<-kfxdrvAl
terOne()+5976<-kfxdrvAlter()+2287<-kfxdrvEntry()+1306<-opiexe()+20028<-opiosq
0()+3993<-kpooprx()+274<-kpoal8()+800<-opiodr()+910<-ttcpip()+2289<-opitsk()+
1670<-opiino()+966<-opiodr()+910
<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+252<-main()+201<
-__libc_start_main()+244<-_start()+36
----- End of Abridged Call Stack Trace -----
WAIT #140442102181424: nam='rdbms ipc reply' ela= 1350 from_process=19 timeou
t=2147483647 p3=0 obj#=-1 tim=1313673029970397
kfdp_checkDsk(): 23
----- Abridged Call Stack Trace -----
kfdp_checkDsk(): 24
----- Abridged Call Stack Trace -----
ksedsts()+461<-kfdp_checkDsk()+476<-kfdCheck()+1649<-kfgCheck()+477<-kfxdrvAl
terOne()+5976<-kfxdrvAlter()+2287<-kfxdrvEntry()+1306<-opiexe()+20028<-opiosq
0()+3993<-kpooprx()+274<-kpoal8()+800<-opiodr()+910<-ttcpip()+2289<-opitsk()+
1670<-opiino()+966<-opiodr()+910
<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+252<-main()+201<
-__libc_start_main()+244<-_start()+36
----- End of Abridged Call Stack Trace -----

Script:收集ASM诊断信息

收集ASM诊断信息最佳的工具仍是RDA,对于不能使用RDA的环境可以采用如下脚本:

 

spool asm_diag1.txt
set pagesize 1000
set lines 500
col "Group Name"   form a25
col "Disk Name"    form a30
col "State"  form a15
col "Type"   form a7
col "Free GB"   form 9,999
alter session set nls_date_format='DD-MON-YYYY HH24:MI:SS';
select sysdate "Date and Time" from dual;

select * from v$asm_diskgroup order by 1;
select * from v$asm_disk order by 1, 2, 3;
select * from gv$asm_operation order by 1;
select * from v$version where banner like '%Database%' order by 1;
select * from gv$asm_client order by 1;

prompt

prompt ASM Disk Groups
prompt ===============

select group_number  "Group"
,      name          "Group Name"
,      state         "State"
,      type          "Type"
,      total_mb/1024 "Total GB"
,      free_mb/1024  "Free GB"
from   v$asm_diskgroup
/

prompt
prompt ASM Disks
prompt ==============

col "Group"          form 999
col "Disk"           form 999
col "Header"         form a9
col "Mode"           form a8
col "Redundancy"     form a10
col "Failure Group"  form a10
col "Path"           form a19

 select group_number  "Group"
,      disk_number   "Disk"
,      header_status "Header"
,      mode_status   "Mode"
,      state         "State"
,      redundancy    "Redundancy"
,      total_mb      "Total MB"
,      free_mb       "Free MB"
,      name          "Disk Name"
,      failgroup     "Failure Group"
,      path          "Path"
from   v$asm_disk
order by group_number
,        disk_number

/

prompt
prompt Instances currently accessing these diskgroups
prompt ==============================================

select c.group_number  "Group"
,      g.name          "Group Name"
,      c.instance_name "Instance"
from   v$asm_client c
,      v$asm_diskgroup g
where  g.group_number=c.group_number
/

prompt
prompt Report the Percentage of Imbalance in all Mounted Diskgroups
prompt ==============================================
select dfail, count(dfail) from
(
select disk, count(failgroup) as dfail
from x$kfdpartner, v$asm_disk where
number_kfdpartner=disk_number and grp=group_number
group by disk, failgroup
)
group by dfail; 

select g.name as "GROUP", d.name as "DISK", d.failgroup, fcnt, pcnt,
decode(pcnt - fcnt, 0, 'MUST', 'SHOULD') as action from
(select gnum, DISK1, failgroup, count(failgroup) as fcnt from
(select gnum, DISK1
from
(
select d.group_number as gnum, disk as disk1,
count(distinct failgroup) as dfail
from x$kfdpartner, v$asm_disk_stat d where
number_kfdpartner=disk_number and grp=d.group_number
and active_kfdpartner=1
group by d.group_number, disk
), v$asm_disk_stat
where dfail < 3
and disk1=disk_number
and gnum=group_number),
x$kfdpartner, v$asm_disk_stat d where
number_kfdpartner=disk_number and grp=d.group_number and grp=gnum
and disk1=disk
and active_kfdpartner=1
group by gnum, disk1, failgroup),
(select grp, disk, count(disk) as pcnt from x$kfdpartner where
active_kfdpartner=1 group by grp, disk),
v$asm_diskgroup_stat g, v$asm_disk_stat d
where gnum=grp and gnum=g.group_number and gnum=d.group_number and
disk=disk1 and disk=disk_number and
((fcnt = 1 and (pcnt - fcnt) > 3) or ((pcnt - fcnt) = 0))
/

col TYPE form a15
col FILE_NUMBER form 9999 head FILE_NUM
col GROUP_NUMBER form 9999 head GR_NUM
col GB for 9999.99

select GROUP_NUMBER   ,
 FILE_NUMBER          ,
 COMPOUND_INDEX       ,
 INCARNATION          ,
 BLOCK_SIZE           ,
 BLOCKS               ,
 BYTES/1024/1024/1024 GB ,
 TYPE                 ,
 STRIPED              ,
 CREATION_DATE        ,
 MODIFICATION_DATE
from v$asm_file
where TYPE != 'ARCHIVELOG'
/

prompt
prompt free ASM disks and their paths
prompt ===========================
select header_status , mode_status, path from V$asm_disk
where header_status in ('FORMER','CANDIDATE')
/

show parameter asm
show parameter size
show parameter proc
show parameter cluster
show parameter instance_type
show parameter instance_name

show parameter pfile

show sga

spool off

 

Code to be run on the ASM instance. Use file asmdebug.sql

 

set newpage none
set linesize 100
spool /tmp/asmdebug.out
--
-- Get a timestamp
select rpad('>', 10, '>'), to_char(sysdate, 'MON DD HH24:MM:SS') from dual;
--
-- Diskgroup information
set head off
select 'Diskgroup Information' from dual;
set head on
column name format a15
column DG# format 99
select group_number DG#, name, state, type, total_mb, free_mb from
v$asm_diskgroup;
--
-- Get the # of Allocation Units  per DG
set head off
select 'Number of AUs per diskgroup' from dual;
set head on
select count(number_kfdat) AU_count, group_kfdat DG# from x$kfdat
group by group_kfdat;
--
-- Get the # of Allocation Units  per DiskGroup and Disk
set head off
select 'Number of AUs per Diskgroup,Disk' from dual;
col "group#,disk#" for a30
set head on
select count(*)AU_count, GROUP_KFDAT||','||number_kfdat "group#,disk#" from x$kfdat group by GROUP_KFDAT,number_kfdat;
--
-- Get the # of allocated (V) and free (F) Allocation Units
set head off
select 'Number of allocated (V) and free (F) Allocation Units' from dual;
col "VF" for a2
set head on
select GROUP_KFDAT "group#", number_kfdat "disk#", v_kfdat "VF", count(*)
from x$kfdat
group by GROUP_KFDAT, number_kfdat, v_kfdat;
--
-- Get the # of Allocation Units per ASM file
set head off
select 'Number of AUs per ASM file ordered by AU count for metadata only'
from dual;
set head on
select count(XNUM_KFFXP) AU_count,  NUMBER_KFFXP file#, GROUP_KFFXP DG# from x$kffxp where NUMBER_KFFXP < 256
group by NUMBER_KFFXP, GROUP_KFFXP
order by count(XNUM_KFFXP) ;
--
-- Get the # of Allocation Units per ASM file by file alias.  Change the
-- system_created Y|N depending if you want the short or long ASM name
set head off
select 'Number of AUs per ASM file ordered by AU count.  This is for non
metadata' from dual;
col name format a60
set head on
select GROUP_KFFXP, NUMBER_KFFXP, name, count(*)
from x$kffxp, v$asm_alias
where GROUP_KFFXP=GROUP_NUMBER and NUMBER_KFFXP=FILE_NUMBER and
system_created='Y'
     group by GROUP_KFFXP, NUMBER_KFFXP, name
     order by GROUP_KFFXP, NUMBER_KFFXP;
--
-- Get partner information.  This is really only useful if redundancy is other than
-- external.
set head off
select 'The following shows the disk to partner relationship.  This is really only
useful if using normal or high redundancy.' from dual;
set head on
select grp DG#, disk, NUMBER_KFDPARTNER partner, PARITY_KFDPARTNER parity, ACTIVE_KFDPARTNER active
from x$kfdpartner;
--
-- Another look at file utilization.
set head off
set linesize 132
select 'bytes is the sum of AUs with data in them * 1024^2
space is the sum of all AUs allocated for this file * 1024^2'
from dual;
set head on
col Name format a60
select f.group_number, f.file_number, bytes, space, space/(1024*1024) "InMB", a.name "Name"
from v$asm_file f, v$asm_alias a
where f.group_number=a.group_number and f.file_number=a.file_number
    and system_created='Y'
    order by f.group_number, f.file_number;
--
-- Get robust disk information
set linesize 400
col failgroup format a20
col label format a20
col name format a40
col path format a40
set head off
select 'Robust disk information' from dual;
set head on
select GROUP_NUMBER, DISK_NUMBER, INCARNATION, MOUNT_STATUS, HEADER_STATUS, MODE_STATUS, STATE, LIBRARY, TOTAL_MB, FREE_MB,
NAME, FAILGROUP, LABEL, PATH, CREATE_DATE, MOUNT_DATE, READS,
WRITES, READ_ERRS, WRITE_ERRS, READ_TIME, WRITE_TIME, BYTES_READ, BYTES_WRITTEN
from v$asm_disk;
--
spool off

 

 

Code to be executed on the database instances using this ASM instance. Use file rdbmsdebug.sql

 

set newpage none
spool /tmp/rdbmsdebug.out
--
-- Get a timestamp
select rpad('>', 10, '>'), to_char(sysdate, 'MON DD HH24:MM:SS') from dual;
--
-- Get datafile information as the database sees it
set head off
select 'V$DATAFILE information' from dual;
set head on
set linesize 132
col name format a60
select file#, name, block_size, blocks, bytes, bytes/(1024*1024) "InMB",
status from v$datafile;
--
-- Get controlfile information as the database sees it
set head off
select 'V$CONTROLFILE information' from dual;
set head on
select * from v$controlfile;
--
-- Get archivelog information as the database sees it
set head off
select 'GV$ARCHIVED_LOG information' from dual;
set head on
select name, thread#, sequence#, blocks*block_size "size", status
status from gv$archived_log
order by thread#,sequence#;
--
-- Get redolog information as the database sees it
set head off
select 'v$log and v$logfile information' from dual;
set head on
col member format a60
select a.group#, member, thread#, sequence#, bytes, a.status
from v$log a, v$logfile b where a.group# = b.group#
order by thread#;
--
-- Get tempfiles information as the database sees it
set head off
select 'GV$TEMPFILE information' from dual;
set head on
col name format a60
select INST_ID,TS#,FILE#, RFILE#,NAME, CREATION_CHANGE# , CREATION_TIME,  STATUS, BYTES, BLOCKS, CREATE_BYTES, BLOCK_SIZE
from gv$tempfile order by inst_id, file#;
spool off

 

Other items to collect:

System error logs
Alert logs from all database and ASM instances
All recent trace files from all databases involved and all of the ASM instances
All “.trc” files from the ASM instances

 

check ASM disk space

 

1) Determine which (if any) disks contain no free space (ie are below the threshold)
select group_kfdat "group #",
       number_kfdat "disk #",
       count(*) "# AU's"
from x$kfdat a
where v_kfdat = 'V'
and not exists (select *
                from x$kfdat b
                where a.group_kfdat = b.group_kfdat
                and a.number_kfdat = b.number_kfdat
                and b.v_kfdat = 'F')
group by GROUP_KFDAT, number_kfdat;

If no rows are returned ... the following query can also be used

select disk_number "Disk #", free_mb
from v$asm_disk
where group_number = *** disk group number ***
order by 2;

If rows are returned from the first query ... or FREE_MB is less than 100mb in the second ... then there is probably insufficient disk space to allow a rebalance to occur ... Note the Disk #'s for later

2) Determine which files have allocation units on the disk(s) that are on exhausted disks

select name, file_number
from v$asm_alias
where group_number in (select group_kffxp
                                           from x$kffxp
                                           where group_kffxp=*** disk group number ***
                                           and disk_kffxp in (*** disk list from #1 above ***)
                                           and au_kffxp != 4294967294
                                           and number_kffxp >= 256)
and file_number in (select number_kffxp
                                  from x$kffxp
                                  where group_kffxp=*** disk group number ***
                                  and disk_kffxp in (*** disk list from #1 above ***)
                                  and au_kffxp != 4294967294
                                  and number_kffxp >= 256)
and system_created='Y';

3) Free up space so that the rebalance can occur

Using the file list from #2 above ... we will need to either drop or move tablespace(s)/datafile(s) such that all disks that are exhausted have at least 100mb free ...

NOTE ... the AU count above ... should relate to 1mb AU size ... so if a single file ... with at least 100 au's can be dropped or moved ... this
should be sufficient to free up enough space to allow the rebalance to occur

Droppable tablespaces may be things like:
    * temporary tablespaces
    * index tablespaces (assuming you know how to rebuild the indexes)

If none of the tablespaces are droppable then the tablespace(s)/datafile(s) will need to be
    * moved to another diskgroup (at least temporarily) ...
    * dropped using RMAN (with the database shutdown) and will be restored later

   Note 330103.1  How to Move Asm Database Files From one Diskgroup To Another ?

4) Check to see if there is sufficient FREE_MB on the problem disks

select disk_number "Disk #", free_mb
from v$asm_disk
where disk_group = *** disk group number ***
and disk_number in (*** disk list from #1 above ***)
order by 2;

 

Check Diskgroup Balance

If you want to perform an imbalance check for all mounted diskgroups, run the script in MOS Note 367445.1.

If you want to determine if you already have imbalance for a file in an existing diskgroup, use the following query:

select disk_kffxp, sum(size_kffxp) from x$kffxp where group_kffxp=AAA and number_kffxp=BBB and lxn_kffxp=0 group by disk_kffxp order by 2;

Breakdown of input/output is as follows:

  • AAA is the group_number in v$asm_alias
  • BBB is file_number in v$asm_alias
  • disk_kffxp gives us the disk number.
  • size_kffxp is used such that we account for variable sized extents.
  • sum(size_kffxp) provides the number of AUs that are on that disk.
  • lxn_kffxp is used in the query such that we go after only the primary extents, not secondary extents

If you want to check balance from an IO perspective, query the statistics in v$asm_disk_iostat before and after running a large SQL statement. For example, if the running a large query that does just reads, the reads and read_bytes columns should be roughly the same for all disks in the diskgroup.