[Repost] ASM Continuing Operations Directory


Some long-running ASM operations, such as rebalance, drop disk, and create/delete/resize file, cannot be described by a single record in the ASM active change directory. Those operations are tracked via the ASM continuing operations directory (COD), which is ASM file number 4. There is one COD per disk group.

 

If the process performing the long-running operation dies before completing it, a recovery process will look at the entry and either complete or roll back the operation. There are two types of continuing operations: background and rollback.

 

Background operation

 

A background operation is performed by an ASM instance background process. It is done as part of disk group maintenance and continues until it either completes or the ASM instance dies. If the instance dies, the recovering instance needs to resume the background operation. The disk group rebalance is the best example of a background operation.

 

Let’s query the X$KFFXP view to find the COD allocation units for disk group 3 (group_kffxp=3). COD is ASM file number 4, hence number_kffxp=4 in the query:

 

SQL> SELECT x.xnum_kffxp "Extent",
       x.au_kffxp "AU",
       x.disk_kffxp "Disk #",
       d.name "Disk name"
     FROM x$kffxp x, v$asm_disk_stat d
     WHERE x.group_kffxp=d.group_number
       and x.disk_kffxp=d.disk_number
       and x.group_kffxp=3
       and x.number_kffxp=4
     ORDER BY 1, 2;

 

    Extent         AU     Disk # Disk name
---------- ---------- ---------- ------------------------------
         0          8          0 ASMDISK5

 

SQL>

 

This tells us that the COD is in allocation unit 8 on disk ASMDISK5. Let's have a closer look (note the AU size of 4 MB for this disk group):

 

$ kfed read /dev/oracleasm/disks/ASMDISK5 ausz=4m aun=8 blkn=0 | more

kfbh.endian:                          1 ; 0x000: 0x01

kfbh.hard:                          130 ; 0x001: 0x82

kfbh.type:                            9 ; 0x002: KFBTYP_COD_BGO

kfrcbg.size:                          0 ; 0x000: 0x0000

kfrcbg.op:                            0 ; 0x002: 0x0000

kfrcbg.inum:                          0 ; 0x004: 0x00000000

kfrcbg.iser:                          0 ; 0x008: 0x00000000

$

 

This shows the COD block for a background operation (kfbh.type=KFBTYP_COD_BGO) and not much happening at the moment – all kfrcbg fields are 0. Most notably the operation code (kfrcbg.op) is 0, which means that there are no active background operations. The op code 1 would indicate an active disk rebalance operation.
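The check described above can be scripted. The following is a minimal Python sketch, not an Oracle tool: it assumes kfed output in the "name: value ; offset: raw" layout shown in the listing, and the helper names parse_kfed and background_op_active are made up for illustration.

```python
import re

# Sample copied from the kfed listing for block 0 of the COD.
KFED_OUTPUT = """\
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            9 ; 0x002: KFBTYP_COD_BGO
kfrcbg.size:                          0 ; 0x000: 0x0000
kfrcbg.op:                            0 ; 0x002: 0x0000
kfrcbg.inum:                          0 ; 0x004: 0x00000000
kfrcbg.iser:                          0 ; 0x008: 0x00000000
"""

# One kfed line: field name, decimal value, offset, raw value or symbol.
LINE_RE = re.compile(r"^(\S+):\s+(\d+)\s*;\s*0x[0-9a-fA-F]+:\s*(\S+)")


def parse_kfed(text):
    """Return {field: (decimal_value, raw_text)} for every recognised line."""
    fields = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            fields[match.group(1)] = (int(match.group(2)), match.group(3))
    return fields


def background_op_active(fields):
    """True when the background-operation block has a non-zero op code."""
    return fields["kfrcbg.op"][0] != 0


fields = parse_kfed(KFED_OUTPUT)
print("block type:", fields["kfbh.type"][1])
print("background operation active:", background_op_active(fields))
```

In practice the text would come from capturing the kfed command's standard output rather than from a hard-coded string.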

 

Rollback operation

 

A rollback operation is similar to a database transaction. It is started at the request of an ASM foreground process. To begin a rollback operation, a slot must be found in the rollback directory, which is block 1 of the ASM continuing operations directory. If all slots are busy, the operation sleeps until one becomes free. During the operation the disk group is in an inconsistent state, so the operation must either complete or roll back all of its changes to the disk group. The foreground process is usually performing the operation on behalf of a database instance. If the database instance dies, the ASM foreground process dies, or an unrecoverable error occurs, the operation must be terminated.

 

Creating a file is a good example of a rollback operation. If an error occurs while allocating the space for the file, then the partially created file must be deleted. If the database instance does not commit the file creation, the file must be automatically deleted. If the ASM instance dies then this must be done by the recovering instance.

 

Let’s have a look at block 1 of the COD:

 

$ kfed read /dev/oracleasm/disks/ASMDISK5 ausz=4m aun=8 blkn=1 | more

kfbh.endian:                          1 ; 0x000: 0x01

kfbh.hard:                          130 ; 0x001: 0x82

kfbh.type:                           15 ; 0x002: KFBTYP_COD_RBO

kfrcrb10[0].opcode:                   1 ; 0x000: 0x0001

kfrcrb10[0].inum:                     1 ; 0x002: 0x0001

kfrcrb10[0].iser:                     1 ; 0x004: 0x00000001

kfrcrb10[0].pnum:                    18 ; 0x008: 0x00000012

kfrcrb10[1].opcode:                   0 ; 0x00c: 0x0000

kfrcrb10[1].inum:                     0 ; 0x00e: 0x0000

kfrcrb10[1].iser:                     0 ; 0x010: 0x00000000

kfrcrb10[1].pnum:                     0 ; 0x014: 0x00000000

$

 

The kfrcrb10[i] fields track the active rollback operations. There is one operation in progress (the kfrcrb10[0] fields have non-zero values), and the opcode list below identifies opcode 1 as a file create operation. The value kfrcrb10[0].inum=1 means that the operation is running on ASM instance 1.
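To pick out the busy slots without reading the listing by eye, the kfrcrb10[i] lines can be grouped per slot. This is a rough Python sketch under the assumption that the kfed output keeps the layout shown above; the RollbackSlot class and the function names are illustrative only and are not part of any Oracle tool.

```python
import re
from dataclasses import dataclass

# Sample copied from the kfed listing for block 1 of the COD.
KFED_OUTPUT = """\
kfbh.type:                           15 ; 0x002: KFBTYP_COD_RBO
kfrcrb10[0].opcode:                   1 ; 0x000: 0x0001
kfrcrb10[0].inum:                     1 ; 0x002: 0x0001
kfrcrb10[0].iser:                     1 ; 0x004: 0x00000001
kfrcrb10[0].pnum:                    18 ; 0x008: 0x00000012
kfrcrb10[1].opcode:                   0 ; 0x00c: 0x0000
kfrcrb10[1].inum:                     0 ; 0x00e: 0x0000
kfrcrb10[1].iser:                     0 ; 0x010: 0x00000000
kfrcrb10[1].pnum:                     0 ; 0x014: 0x00000000
"""

# Matches lines such as "kfrcrb10[0].opcode:   1 ; ..."
SLOT_RE = re.compile(r"^kfrcrb10\[(\d+)\]\.(\w+):\s+(\d+)\s*;")


@dataclass
class RollbackSlot:
    """One slot of the rollback directory (hypothetical helper type)."""
    index: int
    opcode: int = 0
    inum: int = 0
    iser: int = 0
    pnum: int = 0


def parse_slots(text):
    """Group the kfrcrb10[i].* lines into one RollbackSlot per index."""
    slots = {}
    for line in text.splitlines():
        match = SLOT_RE.match(line.strip())
        if not match:
            continue  # header lines such as kfbh.type are skipped
        index, name, value = int(match.group(1)), match.group(2), int(match.group(3))
        slot = slots.setdefault(index, RollbackSlot(index))
        if hasattr(slot, name):
            setattr(slot, name, value)
    return [slots[i] for i in sorted(slots)]


def active_slots(slots):
    """Slots with a non-zero opcode, i.e. an operation in progress."""
    return [slot for slot in slots if slot.opcode != 0]


slots = parse_slots(KFED_OUTPUT)
for slot in active_slots(slots):
    print(f"slot {slot.index}: opcode {slot.opcode}, ASM instance {slot.inum}")
```

Treating a zero opcode as an unused slot is an inference from the listing, where the second slot is all zeros.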

 

The rollback operation opcodes are:

 

1 – Create a file

2 – Delete a file

3 – Resize a file

4 – Drop alias entry

5 – Rename alias entry

6 – Rebalance space COD

7 – Drop disks force

8 – Attribute drop

9 – Disk Resync

10 – Disk Repair Time

11 – Volume create

12 – Volume delete

13 – Attribute directory creation

14 – Set zone attributes

15 – User drop
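For scripted decoding, the list above can be kept as a small lookup table. The Python sketch below simply restates the opcodes from this article. The function name describe_opcode is hypothetical, and reading opcode 0 as an unused slot is an assumption based on the empty slot in the kfed output.

```python
# Rollback operation opcodes, as listed in this article.
ROLLBACK_OPCODES = {
    1: "Create a file",
    2: "Delete a file",
    3: "Resize a file",
    4: "Drop alias entry",
    5: "Rename alias entry",
    6: "Rebalance space COD",
    7: "Drop disks force",
    8: "Attribute drop",
    9: "Disk Resync",
    10: "Disk Repair Time",
    11: "Volume create",
    12: "Volume delete",
    13: "Attribute directory creation",
    14: "Set zone attributes",
    15: "User drop",
}


def describe_opcode(opcode):
    """Translate a kfrcrb10[i].opcode value into a readable name."""
    if opcode == 0:
        # Assumption: an all-zero slot means no operation is recorded.
        return "no operation (slot unused)"
    return ROLLBACK_OPCODES.get(opcode, f"unknown opcode {opcode}")


print(describe_opcode(1))
print(describe_opcode(0))
```

Opcodes outside the table are reported as unknown rather than guessed, since other ASM versions may define additional values.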

 

Conclusion

 

The ASM continuing operations directory (COD) keeps track of long-running ASM operations. If a problem occurs, the COD entries are used to either complete or roll back the operation. The cleanup is performed by another ASM instance (in a cluster environment) or by the same ASM instance, usually after an instance restart.
