Troubleshooting Steps for ORA-15063 / ORA-15042

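If you cannot resolve the problem yourself, the professional Oracle database recovery team at ParnassusData (诗檀软件) can help you recover your data.

ParnassusData professional database recovery team

Hotline: 13764045638    QQ: 47079569    Email: service@parnassusdata.com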

 

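Applies to:

Oracle Database - Enterprise Edition
The information in this document applies to any platform.

Purpose

Self-service troubleshooting steps for when a diskgroup cannot be mounted because of ORA-15063: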

ORA-15063: ASM discovered an insufficient number of disks for diskgroup %s

ORA-15040: diskgroup is incomplete

ORA-15042: ASM disk "%s" is missing

 

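Troubleshooting Steps

SECTION A - Getting Started

Start by reviewing NOTE 452770.1 "TROUBLESHOOTING - ASM disk not found/visible/discovered issues".
First identify all disks that belong to the affected diskgroup by looking for its last successful mount in alert_+ASM*.log.

You should search for a section like the following: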

SQL> ALTER DISKGROUP DATA MOUNT /* asm agent *//* {0:0:214} */

NOTE: cache registered group DATA number=1 incarn=0x44bef6bb

NOTE: cache began mount (not first) of group DATA number=1 incarn=0x44bef6bb

NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so

NOTE: Assigning number (1,0) to disk (ORCL:DATA01P)

NOTE: Assigning number (1,1) to disk (ORCL:DATA02P)

NOTE: Assigning number (1,2) to disk (ORCL:DATA03P)

NOTE: Assigning number (1,3) to disk (ORCL:DATA04P)

NOTE: Assigning number (1,4) to disk (ORCL:DATA05P)

..

NOTE: cache opening disk 0 of grp 1: DATA01P label:DATA01P

NOTE: cache opening disk 1 of grp 1: DATA02P label:DATA02P

..

SUCCESS: diskgroup DATA was mounted

Note: when ASMLIB is not used, the ASM disk paths are shown in the mount section:

NOTE: cache opening disk 1 of grp 1: REDO3_0001 path:/dev/mpath/3600601600ba12c00d4b784363e69e211

 NOTE: cache opening disk 2 of grp 1: REDO3_0002 path:/dev/mpath/3600601600ba12c00d4b784363e69e212

 …

As described in NOTE 452770.1, isolate the device(s) reported as "missing".
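
The alert log search described above can be scripted. The following is a minimal sketch, not an Oracle-supplied tool: the function name last_mount_disks is hypothetical, and it assumes the log lines look exactly like the excerpt above ("ALTER DISKGROUP <name> MOUNT", "NOTE: Assigning number ... to disk (...)", "SUCCESS: diskgroup <name> was mounted") and that mounts of different diskgroups are not interleaved in the log.

```shell
#!/bin/sh
# Sketch (assumed helper, not part of Oracle): list the disks assigned during
# the LAST SUCCESSFUL mount of a diskgroup, based on the alert log format
# shown in the excerpt above.
# Usage: last_mount_disks DATA /path/to/alert_+ASM1.log
last_mount_disks() {
    dg=$1
    log=$2
    awk -v dg="$dg" '
        # A new mount attempt starts: discard disks from the previous attempt.
        $0 ~ ("ALTER DISKGROUP " dg " MOUNT") { n = 0; collecting = 1; next }
        # Remember the disk named in the trailing parentheses.
        collecting && /NOTE: Assigning number \([0-9]+,[0-9]+\) to disk \(/ {
            disk = $0
            sub(/^.*to disk \(/, "", disk)
            sub(/\).*$/, "", disk)
            cur[++n] = disk
            next
        }
        # The attempt succeeded: keep its disk list as the latest good one.
        collecting && $0 ~ ("SUCCESS: diskgroup " dg " was mounted") {
            last_n = n
            for (i = 1; i <= n; i++) last[i] = cur[i]
            collecting = 0
        }
        END { for (i = 1; i <= last_n; i++) print last[i] }
    ' "$log"
}
```

Comparing this list with the devices currently discovered shows which disk is the "missing" one. Failed mount attempts that follow the last success are ignored on purpose.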

 

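Finally, work through the following checks:

A1) Whether the OS logs report any I/O, storage, or multipath errors. If they do, investigate and resolve those problems.

This step is mandatory, because ORA-15063 / ORA-15042 are usually caused by underlying I/O or storage errors.

A2) Whether the devices used by the ASM disks are correctly presented and configured at the OS level.

If "ORA-15075: disk(s) are not visible cluster-wide" is also reported, make sure all devices are visible cluster-wide.

A3) Whether all ASM disks have the appropriate permissions (for example, they should be owned by the grid owner).

If the ownership of the ASM disks has changed for any reason, correct it.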
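
The ownership check in step A3 could be automated along these lines. This is only a sketch: check_asm_disk_owner is a hypothetical helper name, and the owner name (for example grid) and the device paths are whatever applies on your system.

```shell
#!/bin/sh
# Sketch (assumed helper): report every device whose owning user differs
# from the expected grid/ASM owner. Returns 0 if all match, 1 otherwise.
# Usage: check_asm_disk_owner grid /dev/oracleasm/disks/*
check_asm_disk_owner() {
    expected=$1
    shift
    rc=0
    for dev in "$@"; do
        # -L follows symlinks (multipath aliases are often symlinks);
        # field 3 of the long listing is the owning user.
        owner=$(ls -ldL "$dev" | awk '{print $3}')
        if [ "$owner" != "$expected" ]; then
            echo "$dev: owner is '$owner', expected '$expected'"
            rc=1
        fi
    done
    return $rc
}
```

Running it on every node is useful, since ownership can differ between nodes after a reboot or a udev rule change.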

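A4) Whether and how the "missing" devices are reported when querying V$ASM_DISK
------------------------------------------------------------------------
If a device is reported with one of the following statuses:

=> "PROVISIONED/CANDIDATE" - this indicates that the ASM disk header has been damaged

-> Investigate the I/O problem behind the damage (see step A1). Oracle never wipes out its own metadata! A checksum is computed for every write before it is accepted.

-> Check the state of the disk header to confirm the damage: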

$> kfed read <path_to_your_missing_devices>

       

        kfbh.endian:                          0 ; 0x000: 0x00

        kfbh.hard:                            0 ; 0x001: 0x00

        kfbh.type:                            0 ; 0x002: KFBTYP_INVALID

        kfbh.datfmt:                          0 ; 0x003: 0x00

        kfbh.block.blk:                       0 ; 0x004: blk=0

        kfbh.block.obj:                       0 ; 0x008: file=0

        ….
 -> Try to repair the disk header and see whether the diskgroup can be mounted:

$> kfed repair <path_to_your_missing_devices>
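-> Check whether ASM or your database reports additional corruption (for example ORA-15196), because I/O or storage problems may affect more than one block.

If any corruption is reported, open an SR with Oracle Support.

 Notes:
1) With a non-default AU size, AUSZ=<AU_SIZE> must be specified on every KFED command.
2) "kfed repair" is only available in 11g!

=> "UNKNOWN/IGNORED" - this indicates that the ASM disk is not visible at the OS level
    -> Revisit steps A1, A2 and A3.
------------------------------------------------------------------------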
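
The decision in step A4 can be summarised as a small lookup. The sketch below is illustrative only: explain_asm_disk_status is a hypothetical name, and the status value is assumed to come from a query such as "select path, header_status, mount_status from v$asm_disk" run in the ASM instance.

```shell
#!/bin/sh
# Sketch (assumed helper): map a status reported for a "missing" disk to the
# follow-up action described in step A4.
# Usage: explain_asm_disk_status PROVISIONED
explain_asm_disk_status() {
    case $1 in
        PROVISIONED|CANDIDATE)
            echo "ASM disk header damaged: check I/O errors (A1), then kfed read / kfed repair" ;;
        UNKNOWN|IGNORED)
            echo "disk not visible at the OS level: revisit A1, A2 and A3" ;;
        *)
            echo "status not covered by step A4" ;;
    esac
}
```

Any status other than the four listed in step A4 falls through to the last branch, because this document does not describe what to do for it.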

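A5) Whether asm_diskstring is still set correctly.

For Windows configurations, you can also refer to NOTE 880061.1 "ASM Is Unable To Detect SCSI Disks On Windows"

SECTION B - Using ASMLIB

When ASMLIB is used, follow the steps above (Section A) and also check for the following errors accompanying ORA-15063:

B1) ORA-15183: unable to initialize ASMLIB in oracle / ORA-15183: ASMLIB initialization error [driver/agent not installed]

Reference: NOTE 340519.1 Cannot Start ASM Ora-15063/ORA-15183

B2) ORA-15186: ASMLIB error function = [asm_open], error = [1], mesg = [Operation not permitted]

Check the health of your ASMLIB installation:

 => Correct the installed rpms

 => Correct the symlinks - all nodes should show: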
  
    # ls -l  /etc/sysconfig/oracleasm
       lrwxrwxrwx 1 root root 24 Sep 18 22:10 /etc/sysconfig/oracleasm -> oracleasm-_dev_oracleas

 => Correct the ASMLIB configuration (/etc/sysconfig/oracleasm) when multipathing is used:

     # ORACLEASM_SCANORDER: Matching patterns to order disk scanning
        ORACLEASM_SCANORDER="dm"
     # ORACLEASM_SCANEXCLUDE: Matching patterns to exclude disks from scan
        ORACLEASM_SCANEXCLUDE="sd"
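
A quick way to verify these two settings on each node is sketched below. The function name check_oracleasm_multipath_cfg is hypothetical, and the sketch assumes the configuration file is a plain shell-style KEY="value" file, as in the lines shown above.

```shell
#!/bin/sh
# Sketch (assumed helper): verify that the ASMLIB configuration scans "dm"
# devices first and excludes single-path "sd" devices.
# Usage: check_oracleasm_multipath_cfg [/etc/sysconfig/oracleasm]
check_oracleasm_multipath_cfg() {
    cfg=${1:-/etc/sysconfig/oracleasm}
    # Source the file in a subshell so its variables do not leak out.
    (
        . "$cfg" || exit 2
        rc=0
        case " ${ORACLEASM_SCANORDER-} " in
            *" dm "*) ;;
            *) echo "ORACLEASM_SCANORDER does not list dm"; rc=1 ;;
        esac
        case " ${ORACLEASM_SCANEXCLUDE-} " in
            *" sd "*) ;;
            *) echo "ORACLEASM_SCANEXCLUDE does not list sd"; rc=1 ;;
        esac
        exit $rc
    )
}
```

The function returns 0 when both patterns are present and prints one line per setting that needs correcting otherwise.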
 
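B3) Check whether the ASMLIB disks are present under the /dev/oracleasm/disks directory

=> The devices under /dev/oracleasm/disks/* must be reported as DM devices on all nodes (not as single-path sd* devices). If they are not, correct this! (see step B2)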
$> ls -al /dev/oracleasm/disks

 

brw-rw---- 1 grid dba 253, 29 Feb 12 11:44 /dev/oracleasm/disks/DATA01P

brw-rw---- 1 grid dba 253, 35 Feb 12 11:44 /dev/oracleasm/disks/DATA02P

brw-rw---- 1 grid dba 253, 27 Feb 15 16:04 /dev/oracleasm/disks/DATA03P

brw-rw---- 1 grid dba 253, 24 Feb 12 11:44 /dev/oracleasm/disks/DATA04P

brw-rw---- 1 grid dba 253, 25 Feb 12 11:44 /dev/oracleasm/disks/DATA05P

 

=> If an ASMLIB disk is missing from the output above, first try rescanning the devices as root:

 # /etc/init.d/oracleasm scandisks

=> If the ASMLIB disk is still missing from /dev/oracleasm/disks, ask your system administrator to investigate (see steps A1, A2 and A3).
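
To compare the directory against the disks you expect (for example the list taken from the last successful mount in the alert log), something like the following sketch can be used. missing_asmlib_disks is a hypothetical helper, and the disk names passed to it are your own.

```shell
#!/bin/sh
# Sketch (assumed helper): print every expected ASMLIB disk name that has no
# entry in the given directory.
# Usage: missing_asmlib_disks /dev/oracleasm/disks DATA01P DATA02P DATA03P
missing_asmlib_disks() {
    dir=$1
    shift
    for name in "$@"; do
        # -e keeps the sketch usable on plain files; on a real system
        # -b (block device) is the stricter test.
        [ -e "$dir/$name" ] || echo "$name"
    done
}
```

An empty result means every expected disk has an entry; it does not prove that the entry points at a DM device, which still has to be checked as described in step B3.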

B4) Check whether the ASMLIB disk(s) have the correct ASMLIB stamp and status:
 $> kfed read <ASMLIB_device> |grep provstr

      kfdhdb.driver.provstr: ORCLDISK<diskname> ; 0x000: length=20

 

 $> kfed read <ASMLIB_device> | egrep 'kfbh.type|kfdhdb.dskname|kfdhdb.hdrsts'

      kfbh.type:      1 ; 0x002: KFBTYP_DISKHEAD

      kfdhdb.dskname: DATA01P ; 0x028: length=14

      kfdhdb.hdrsts:  3 ; 0x027: KFDHDR_MEMBER    
=> If the output shows "kfdhdb.driver.provstr: ORCLCLRD" (but kfdhdb.hdrsts = KFDHDR_MEMBER and kfbh.type = KFBTYP_DISKHEAD), then your disk has been deleted with "oracleasm deletedisk".

=> If kfbh.type = KFBTYP_INVALID -> see step A4) to check whether "kfed repair" can resolve the problem.
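
The three outcomes of step B4 can be told apart mechanically from the kfed output. The sketch below is an illustration, not a supported tool: kfed_b4_verdict is a hypothetical name, and it assumes the kfed output lines have the same layout as the samples shown above (field name, value, ";", offset, annotation).

```shell
#!/bin/sh
# Sketch (assumed helper): read "kfed read" output on stdin and report which
# case of step B4 applies.
# Usage: kfed read <ASMLIB_device> | kfed_b4_verdict
kfed_b4_verdict() {
    awk '
        # The last field of these lines is the symbolic annotation.
        $1 == "kfbh.type:"             { btype  = $NF }
        $1 == "kfdhdb.hdrsts:"         { hdrsts = $NF }
        # The second field is the stamp itself (";" means it is empty).
        $1 == "kfdhdb.driver.provstr:" { prov   = ($2 == ";") ? "" : $2 }
        END {
            if (btype == "KFBTYP_INVALID")
                print "header invalid: see step A4"
            else if (prov == "ORCLCLRD" && btype == "KFBTYP_DISKHEAD" && hdrsts == "KFDHDR_MEMBER")
                print "ASMLIB stamp cleared: disk was removed with oracleasm deletedisk"
            else if (index(prov, "ORCLDISK") == 1 && btype == "KFBTYP_DISKHEAD" && hdrsts == "KFDHDR_MEMBER")
                print "stamp and header look as expected"
            else
                print "not covered by step B4: inspect the kfed output manually"
        }
    '
}
```

Anything that does not match one of the documented combinations is reported as uncovered rather than guessed at, so the full kfed output should still be kept for Oracle Support.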

B5) You can also refer to the following notes:

NOTE: 398622.1     ORA-15186: ASMLIB error function = [asm_open], error = [1], mesg = [Operation not permitted]
NOTE: 1384504.1   Mount ASM Disk Group Fails : ORA-15186, ORA-15025, ORA-15063  
NOTE: 967461.1    “Multipath: error getting device” seen in OS log causes ASM/ASMlib to shutdown by itself
NOTE: 1526920.1   ORA-15186 ORA-15063 on node 2

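SECTION C - Other Considerations

If all of the checks above have been completed but the error persists, review the following notes as they apply to your configuration or situation: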

NOTE:  577526.1     ORA-15063 ASM Discovered An Insufficient Number Of Disks For Diskgroup using NetApp Storage
NOTE:  784776.1     ORA-15063 When Mounting a Diskgroup After Storage Cloning ( BCV / Split Mirror / SRDF / HDS / Flash Copy )
NOTE:  555918.1     ORA-15038 On Diskgroup Mount After Node Eviction
NOTE:  1484723.1   ASM Candidate Raw Device Is Not Presented As A RAC Cluster Wide Shared character Devices On Unix.
NOTE:  1534211.1   ORA-15017 and ORA-15063 errors for unused diskgroups in 11.2
NOTE:  1487443.1   Mounting Diskgroup Fails With ORA-15063 and V$ASM_DISK Shows PROVISIONED
NOTE:  742832.1     AIX:After changing Multipathing drivers from RDAC to MPIO ASM discovered an insufficient number of disks
NOTE:  1276913.1   Unable to discover or use raw devices for ASM in HP-UX Itanium in 11.2.0.2 ( ORA-15063 )

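SECTION D - Information to Collect When Opening an SR

If you cannot resolve the problem yourself, collect the following information and open an SR with Oracle Support: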

D1) alert_+ASM*.log (from all nodes if RAC)

D2) script#1 from NOTE 470211.1 How To Gather/Backup ASM Metadata In A Formatted Manner version 10.1, 10.2, 11.1 & 11.2?

D3) KFED reports

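#!/bin/sh
# kfed.sh: dump the ASM disk header and its backup block for every ASM disk.
# Replace <your_path_to_asm_disks> with a glob such as /dev/oracleasm/disks/*.
# With a non-default AU size, add AUSZ=<AU_SIZE> to every kfed command.

rm -f /tmp/kfed_DH.out /tmp/kfed_BK.out

for i in <your_path_to_asm_disks>
do
  echo "$i" >> /tmp/kfed_DH.out
  kfed read "$i" >> /tmp/kfed_DH.out                  # disk header
  echo "$i" >> /tmp/kfed_BK.out
  kfed read "$i" aun=1 blkn=254 >> /tmp/kfed_BK.out   # disk header backup block
done

Run kfed.sh as the GRID/ASM owner and upload /tmp/kfed_DH.out and /tmp/kfed_BK.out! Be aware of non-default AU sizes: if a non-default AU size is used, it must be specified (see NOTE 1485597.1 "ASM tools used by Support : KFOD, KFED, AMDU").

D4) ASMLIB information
NOTE : 869526.1 Collecting The Required Information For Support To Troubleshoot ASM/ASMLIB Issues.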

D5) ASM device listing

$> ls -al <path_to_ASM_devices>

D6) OS logs (from all nodes if this is a RAC configuration)

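SECTION E - Disks Reported as Missing After a Failed Disk Addition

If you hit ORA-15063 after a failed disk addition, collect the following information and open an SR with Oracle Support.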

E1) alert_+ASM*.log (from all nodes if RAC)

E2) script#1 from NOTE 470211.1 How To Gather/Backup ASM Metadata In A Formatted Manner version 10.1, 10.2, 11.1 & 11.2?

E3) KFED reports

#!/bin/sh
# kfed.sh: dump the ASM metadata blocks requested by Oracle Support for every ASM disk.
# Replace <your_path_to_asm_disks> with a glob such as /dev/oracleasm/disks/*.
# With a non-default AU size, add AUSZ=<AU_SIZE> to every kfed command.

rm -f /tmp/kfed_*.out

for i in <your_path_to_asm_disks>
do
  echo "$i" >> /tmp/kfed_DH.out
  kfed read "$i" >> /tmp/kfed_DH.out
  echo "$i" >> /tmp/kfed_BK.out
  kfed read "$i" aun=1 blkn=254 >> /tmp/kfed_BK.out
  echo "$i" >> /tmp/kfed_PST.out
  kfed read "$i" aun=1 blkn=2 >> /tmp/kfed_PST.out
  echo "$i" >> /tmp/kfed_FS.out
  kfed read "$i" blkn=1 >> /tmp/kfed_FS.out
  echo "$i" >> /tmp/kfed_FD.out
  kfed read "$i" aun=2 blkn=1 >> /tmp/kfed_FD.out
  echo "$i" >> /tmp/kfed_DD.out
  # More than one block might be needed if there is a large number of disks;
  # Oracle Support may ask for this later.
  kfed read "$i" aun=2 blkn=0 >> /tmp/kfed_DD.out
done

Run kfed.sh as the GRID/ASM owner and upload /tmp/kfed_*.out! Be aware of non-default AU sizes: if a non-default AU size is used, it must be specified (see NOTE 1485597.1 "ASM tools used by Support : KFOD, KFED, AMDU").

E4) AMDU output

amdu -diskstring '<ASM_DISKSTRING>' -dump '<DISKGROUP_NAME>' -noimage

amdu -diskstring '<ASM_DISKSTRING>' -print '<DISKGROUP_NAME>.F2.V0.C2' > DG.amdu
#### F2.V0.C2 -> This extracts information for up to 16 disks only. If there is a large number of disks, a larger output is needed.

 

 

References

NOTE:1678139.1 – KFED Reports “KFBTYP_INVALID” & OS Metadata [LVM2 001] In “/dev/emcpower” Disk /ASM disk Member (ASM Disk Overlapping : Scenario #2).
NOTE:452770.1 – TROUBLESHOOTING – ASM disk not found/visible/discovered issues
NOTE:1526920.1 – ORA-15186 ORA-15063 on node 2
NOTE:1534211.1 – ORA-15017 and ORA-15063 errors for unused diskgroups in 11.2
NOTE:577526.1 – Ora-15063: Asm Discovered An Insufficient Number Of Disks For Diskgroup using NetApp Storage
NOTE:470211.1 – How To Gather & Backup ASM/ACFS Metadata In A Formatted Manner version 10.1, 10.2, 11.1, 11.2 and 12.1?
NOTE:1484723.1 – ASM Candidate Raw Device Is Not Presented As A RAC Cluster Wide Shared character Devices On Unix.
NOTE:742832.1 – After changing Multipathing drivers from RDAC to MPIO ASM discovered an insufficient number of disks
NOTE:880061.1 – ASM Is Unable To Detect SCSI Disks On Windows.
NOTE:398622.1 – ORA-15186: ASMLIB error function = [asm_open], error = [1], mesg = [Operation not permitted]
NOTE:869526.1 – Collecting The Required Information For Support To Validate & Troubleshoot ASM/ASMLIB Issues.
NOTE:1276913.1 – Unable to discover or use raw devices for ASM in HP-UX Itanium in 11.2.0.2 ( ORA-15063 )
NOTE:340519.1 – Cannot Start Asm Ora-15063/ORA-15183
NOTE:1384504.1 – Mount ASM Disk Group Fails : ORA-15186, ORA-15025, ORA-15063
NOTE:1485597.1 – ASM tools used by Support : KFOD, KFED, AMDU
NOTE:967461.1 – “Multipath: error getting device” seen in OS log causes ASM/ASMlib to shutdown by itself

NOTE:784776.1 – ORA-15063 When Mounting a Diskgroup After Storage Cloning ( BCV / Split Mirror / SRDF / HDS / Flash Copy )
NOTE:555918.1 – ORA-15038 On Diskgroup Mount After Node Eviction
NOTE:1487443.1 – Mounting Diskgroup Fails With ORA-15063 and V$ASM_DISK Shows PROVISIONED
