Oracle ASM CORRUPTED AT BLOCKS : ORA-15196: INVALID ASM BLOCK HEADER

If you cannot recover the data yourself, ask Parnassusdata, the professional Oracle database recovery team, for help.

Parnassusdata Software Database Recovery Team

Service Hotline: +86 13764045638

 


1) The +ASM2 instance reported corrupted blocks during an add-disk operation
on the DG01 diskgroup, and the diskgroup was dismounted as a result:
======================================================

Thu Oct 04 19:32:32 2012
SUCCESS: ALTER DISKGROUP DG01 ADD DISK 'ORCL:DISK024' SIZE 100821 M
,'ORCL:DISK025' SIZE 100821 M
NOTE: starting rebalance of group 3/0xf8194f61 (DG01) at power 5

Starting background process ARB0
NOTE: assigning ARB3 to group 3/0xf8194f61 (DG01)
NOTE: assigning ARB4 to group 3/0xf8194f61 (DG01)
Thu Oct 04 19:36:43 2012
WARNNING: cache read a corrupted block group=DG01 dsk=6 blk=40 from disk 6
NOTE: a corrupted block from group DG01 was dumped to
/orabase/diag/asm/+asm/+ASM2/trace/+ASM2_arb1_11708.trc
WARNNING: cache read(retry) a corrupted block group=DG01 dsk=6 blk=40 from
disk 6
ERROR: cache failed to read group=DG01 dsk=6 blk=40 from disk(s): 6 DISK007
ORA-15196: invalid ASM block header [kfc.c:23908] [check_kfbh] [2147483654]
[40] [2202114410 != 1169765350]
ORA-15196: invalid ASM block header [kfc.c:23908] [check_kfbh] [2147483654]
[40] [2202114410 != 1169765350]
System State dumped to trace file
/orabase/diag/asm/+asm/+ASM2/trace/+ASM2_arb1_11708.trc
NOTE: failed to create amdu dump with error -1
NOTE: cache initiating offline of disk 6 group DG01
NOTE: process 11708 initiating offline of disk 6.3916021716 (DISK007) with
mask 0x7e in group 3
Thu Oct 04 19:36:43 2012
WARNING: Disk DISK007 in mode 0x7f is now being offlined
NOTE: initiating PST update: grp = 3, dsk = 6/0xe969bfd4, mode = 0x15
kfdp_updateDsk(): 22
Thu Oct 04 19:36:43 2012
kfdp_updateDskBg(): 22
ERROR: too many offline disks in PST (grp 3)
WARNING: Disk DISK007 in mode 0x7f offline aborted
Thu Oct 04 19:36:43 2012
NOTE: active pin 0x0x6b834e68 found in ARB1
ERROR: ORA-15130 thrown in ARB1 for group number 3
Thu Oct 04 19:36:43 2012
ERROR: ORA-15130 thrown in ARB2 for group number 3
Errors in file /orabase/diag/asm/+asm/+ASM2/trace/+ASM2_arb1_11708.trc:
ORA-15130: diskgroup "DG01" is being dismounted
ORA-15066: offlining disk "DISK007" may result in a data loss
Errors in file /orabase/diag/asm/+asm/+ASM2/trace/+ASM2_arb2_11710.trc:
ORA-15130: diskgroup "DG01" is being dismounted.

2) The affected disk is disk 6 of group 3 (DISK007):
======================================================
WARNNING: cache read a corrupted block group=DG01 dsk=6 blk=40 from disk 6
======================================================
NOTE: offline of disk(s) signalled ORA-15130
ORA-15130: diskgroup "DG01" is being dismounted
ORA-15066: offlining disk "DISK007" may result in a data loss

======================================================
=)> disk 6 of grp 3: DISK007 label:DISK007
======================================================

3) This is an 11.2.0.1.0 ASM RAC configuration.
4) The problem occurred at "Thu Oct 04 19:36:43 2012":

======================================================
Thu Oct 04 19:36:43 2012
WARNNING: cache read a corrupted block group=DG01 dsk=6 blk=40 from disk 6
NOTE: a corrupted block from group DG01 was dumped to
/orabase/diag/asm/+asm/+ASM2/trace/+ASM2_arb1_11708.trc
WARNNING: cache read(retry) a corrupted block group=DG01 dsk=6 blk=40 from
disk 6
ERROR: cache failed to read group=DG01 dsk=6 blk=40 from disk(s): 6 DISK007
ORA-15196: invalid ASM block header [kfc.c:23908] [check_kfbh] [2147483654]
[40] [2202114410 != 1169765350]
ORA-15196: invalid ASM block header [kfc.c:23908] [check_kfbh] [2147483654]
[40] [2202114410 != 1169765350]
======================================================
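
The bracketed arguments of the ORA-15196 message can be pulled apart mechanically. The sketch below is illustrative only: the regular expression, the function name parse_ora_15196 and the field names are assumptions, not Oracle definitions. It shows that the third argument, 2147483654, equals 0x80000000 + 6, which lines up with dsk=6 in the alert log, and that the fourth argument is the block number 40. The log does not say which side of the "!=" is the stored value and which is the computed one, so the code only labels them left and right.

```python
import re

# Pattern for the message as it appears in the alert log; \s+ tolerates
# the line wrap between the third and fourth arguments.
ORA_15196 = re.compile(
    r"ORA-15196: invalid ASM block header\s+"
    r"\[(?P<source>[^\]]+)\]\s+\[(?P<check>[^\]]+)\]\s+\[(?P<obj>\d+)\]\s+"
    r"\[(?P<blk>\d+)\]\s+\[(?P<left>\d+) != (?P<right>\d+)\]"
)

def parse_ora_15196(text):
    """Return the ORA-15196 arguments as a dict, or None if not found."""
    m = ORA_15196.search(text)
    if m is None:
        return None
    obj = int(m.group("obj"))
    return {
        "source": m.group("source"),          # e.g. kfc.c:23908
        "check": m.group("check"),            # e.g. check_kfbh
        "object": obj,
        "high_bit_set": bool(obj & 0x80000000),
        "object_low_bits": obj & 0x7FFFFFFF,  # 6 here, matching dsk=6
        "block": int(m.group("blk")),
        "left": int(m.group("left")),
        "right": int(m.group("right")),
    }

message = (
    "ORA-15196: invalid ASM block header [kfc.c:23908] [check_kfbh] [2147483654]\n"
    "[40] [2202114410 != 1169765350]"
)
info = parse_ora_15196(message)
print(info["object_low_bits"], info["block"])  # 6 40
```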

5) The AMDU dump reports 2 allocation table (AT) blocks and 2 ASM metadata
blocks as corrupted:
=====================================================

******************************* AMDU Settings ********************************
ORACLE_HOME = /oragridbase/product/11.2.0/grid
System name: Linux
Node name: ausu596a
Release: 2.6.18-194.17.1.el5
Version: #1 SMP Mon Sep 20 07:12:06 EDT 2010
Machine: x86_64
amdu run: 04-OCT-12 23:53:23
Endianess: 1
--------------------------------- Operations ---------------------------------
-dump DG01
------------------------------- Disk Selection -------------------------------
-diskstring '/dev/oracleasm/disks/*'
------------------------------ Reading Control -------------------------------
------------------------------- Output Control -------------------------------
********************************* DISCOVERY **********************************
---------------------------- SCANNING DISK N0023 -----------------------------
Disk N0023: '/dev/oracleasm/disks/DISK007'
AMDU-00209: Corrupt block found: Disk N0023 AU [0] block [40] type [0]
AMDU-00201: Disk N0023: '/dev/oracleasm/disks/DISK007'
AMDU-00209: Corrupt block found: Disk N0023 AU [0] block [41] type [0]
AMDU-00201: Disk N0023: '/dev/oracleasm/disks/DISK007'
AMDU-00209: Corrupt block found: Disk N0023 AU [0] block [40] type [3]
AMDU-00201: Disk N0023: '/dev/oracleasm/disks/DISK007'


** UNABLE TO SCAN AU 17024 THROUGH 17471 **
AMDU-00209: Corrupt block found: Disk N0023 AU [0] block [41] type [3]
AMDU-00201: Disk N0023: '/dev/oracleasm/disks/DISK007'
** UNABLE TO SCAN AU 17024 THROUGH 17471 **
Allocated AU's: 94948
Free AU's: 5873
AU's read for dump: 11
Block images saved: 1655
Map lines written: 11
Heartbeats seen: 0
Corrupt metadata blocks: 2
Corrupt AT blocks: 2
---------------------------- SCANNING DISK N0005 -----------------------------
------------------------- SUMMARY FOR DISKGROUP DG01 -------------------------
Allocated AU's: 2252039
Free AU's: 268486
AU's read for dump: 296
Block images saved: 40920
Map lines written: 296
Heartbeats seen: 0
Corrupt metadata blocks: 2
Corrupt AT blocks: 2
======================================================
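
As a cross-check, the AU counters in the AMDU report can be added up. The sketch below uses a hypothetical helper, au_counts, and the counter values copied from the report. For disk N0023 the allocated and free AUs sum to 100821. If DISK007 has the same size as the disks in the ADD DISK statement (SIZE 100821 M), which the log does not state explicitly, that works out to one AU per MiB, so a 1 MiB allocation unit is a reasonable reading. The diskgroup total is exactly 25 times the per-disk total.

```python
import re

def au_counts(report):
    """Return (allocated, free) AU counts from the first counters found."""
    allocated = re.search(r"Allocated AU's:\s*(\d+)", report)
    free = re.search(r"Free AU's:\s*(\d+)", report)
    if allocated is None or free is None:
        raise ValueError("AU counters not found")
    return int(allocated.group(1)), int(free.group(1))

# Counter lines copied from the AMDU report.
disk_n0023 = "Allocated AU's: 94948\nFree AU's: 5873\n"
diskgroup = "Allocated AU's: 2252039\nFree AU's: 268486\n"

disk_total = sum(au_counts(disk_n0023))
group_total = sum(au_counts(diskgroup))

print(disk_total)                       # 100821
print(divmod(group_total, disk_total))  # (25, 0)
```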

6) Related bugs (closed as vendor issue):
======================================================
=)> Bug.13829821 (45) ORA-15196 [KFC.C 25210] [CHECK_KFBH] [2147483649]
[8] [2170839822 != 2170840087]
=)> Bug.13591322 (45) ORA-15196 AT BLOCK CORRUPTION
=)> Bug.12861891 (32) CORRUPTED BLOCK FOUND IN ASM_YB_FLASH DISK GROUP ON
RAC NODE
=)> Bug.10267691 (45) ORA-15196 INVALID ASM BLOCK HEADER [KFC.C 9195]
[HARD_KFBH] [2147483648] [1

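o The customer hit a corruption on an AT block while adding disks to DG01,
a diskgroup created with external redundancy. With external redundancy ASM
keeps no mirror copy, so it could not take DISK007 offline and dismounted the
diskgroup instead. They collected AMDU and kfed output for the corrupted
block, but they did not take a dd copy of the blocks.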

1) They confirmed that the I/O problems reported by the OS on the physical
disks are not associated with the ASM member disks or with the affected ASM
disk.

2) They explained that the dd dump provided so far was taken from the entire
disk rather than from the partition created on that disk, which is the device
used for the '/dev/oracleasm/disks/DISK007' ASMLIB disk:

[orcl tar7]$ kfed read dd.out
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
2B272579F400 00000000 00000000 00000000 00000000 […………….]
Repeat 26 times
2B272579F5B0 00000000 00000000 00000000 01000000 […………….]
2B272579F5C0 FE830001 003FFFFF AFB60000 00000C4E [……?…..N…]
2B272579F5D0 00000000 00000000 00000000 00000000 […………….]
Repeat 1 times
2B272579F5F0 00000000 00000000 00000000 AA550000 […………..U.]
2B272579F600 00000000 00000000 00000000 00000000 […………….]
Repeat 223 times
KFED-00322: Invalid content encountered during block traversal:
[kfbtTraverseBlock][Invalid OSM block type][][0]

[orcl tar7]$ od -c dd.out | more
.
.
.
0077040 O R C L D I S K D I S K 0 0 7 \0
0077060 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0077100 \0 \0 \v 006 \0 001 003 D I S K 0 0 7 \0
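
The two dumps above are consistent with a whole-disk image whose first sector is an MBR. The sketch below is a hedged illustration, not part of the original analysis: first_partition_start is a hypothetical helper that follows the standard MBR layout (partition entry 1 at byte 446, starting LBA at offset 8 of the entry, 0x55 0xAA signature at bytes 510-511). Reading the kfed hex words as little-endian 32-bit values, which is an assumption about how kfed prints them, gives a type 0x83 partition starting at LBA 63. The example MBR here is synthetic and only carries that start LBA.

```python
import struct

SECTOR = 512  # bytes per sector, as assumed by the classic MBR layout

def first_partition_start(mbr):
    """Return the byte offset at which the first MBR partition begins."""
    if len(mbr) < SECTOR or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not an MBR sector")
    # Starting LBA: little-endian 32-bit integer at offset 8 of entry 1.
    (start_lba,) = struct.unpack_from("<I", mbr, 446 + 8)
    return start_lba * SECTOR

# Synthetic MBR for illustration: one entry starting at LBA 63.
mbr = bytearray(SECTOR)
struct.pack_into("<I", mbr, 446 + 8, 63)
mbr[510:512] = b"\x55\xaa"

start = first_partition_start(bytes(mbr))
od_offset = int("0077040", 8)   # od prints offsets in octal

print(start)              # 32256
print(od_offset - start)  # 32
```

Under these assumptions the ORCLDISK string sits 32 bytes into the partition, directly after the 32-byte kfbh block header (fields 0x000 through 0x01c in the kfed output), so the ASM disk header starts at the partition, not at the start of the whole-disk image that kfed was pointed at.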

It appears that some operation has wiped out (that is, written all zeros over)
part of AT block 40 and all of block 41:
.
aunum=0 blknum=41 | more
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 0 ; 0x008: TYPE=0x0 NUMB=0x0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
F5285200 00000000 00000000 00000000 00000000 […………….]
Repeat 31 times
.
.
offsets :00281f0: 001c 0000 8701 8000 e428 0000 3101 8000 ………(..1…
0028200: 0000 0000 0000 0000 0000 0000 0000 0000 …………….
0028210: 0000 0000 0000 0000 0000 0000 0000 0000 …………….
0028220: 0000 0000 0000 0000 0000 0000 0000 0000 …………….
0028230: 0000 0000 0000 0000 0000 0000 0000 0000 …………….
0028240: 0000 0000 0000 0000 0000 0000 0000 0000 …………….
0028250: 0000 0000 0000 0000 0000 0000 0000 0000 …………….
0028260: 0000 0000 0000 0000 0000 0000 0000 0000 …………….
0028270: 0000 0000 0000 0000 0000 0000 0000 0000 …………….
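
The hex dump offsets can be mapped back to ASM block numbers with simple arithmetic. This is a sketch under stated assumptions: a 1 MiB allocation unit, 4 KiB ASM metadata blocks, and offsets counted from the start of the ASM disk (the partition) rather than the whole disk. The function names are hypothetical. With those assumptions the zeros that begin at 0x28200 start 512 bytes into block 40 of AU 0, which agrees with block 40 being partly wiped.

```python
AU_SIZE = 1024 * 1024   # assumption: 1 MiB allocation units
BLOCK_SIZE = 4096       # assumption: 4 KiB ASM metadata blocks

def block_offset(au, blk):
    """Byte offset of block `blk` of allocation unit `au` on the ASM disk."""
    return au * AU_SIZE + blk * BLOCK_SIZE

def locate(offset):
    """Map a byte offset to (au, block, byte within block)."""
    au, rest = divmod(offset, AU_SIZE)
    blk, byte = divmod(rest, BLOCK_SIZE)
    return au, blk, byte

print(hex(block_offset(0, 40)))  # 0x28000
print(locate(0x28200))           # (0, 40, 512)
print(hex(block_offset(0, 41)))  # 0x29000
```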

The ASM disk header is duplicated at locations 0x00007e10 and 0x00205e10.
ASM backs up the disk header for repairs, in case it is overwritten by
an external entity.
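
The distance between the two locations can be checked against the commonly described placement of the backup header, the second-to-last block of AU 1. That placement, the 1 MiB allocation unit and the 4 KiB block size are all assumptions here, and backup_header_offset is a hypothetical helper, so treat this as a plausibility check only.

```python
def backup_header_offset(au_size=1024 * 1024, block_size=4096):
    """Offset of the second-to-last block of AU 1, relative to AU 0 block 0."""
    blocks_per_au = au_size // block_size
    return au_size + (blocks_per_au - 2) * block_size

primary = 0x00007e10  # first location quoted in the analysis
backup = 0x00205e10   # second location quoted in the analysis

print(hex(backup - primary))        # 0x1fe000
print(hex(backup_header_offset()))  # 0x1fe000
```

Both values agree under these assumptions, which is consistent with the second location being the backup copy of the header found at the first.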
.
The corruption seen here is in allocation table blocks. In 12.1, ASM can
survive such a corruption provided the diskgroup compatibility is
advanced to 12.1.

