ASM is a critical component of the Exadata software stack, and it is a bit different compared to non-Exadata environments. It still manages your disk groups, but builds them with grid disks. It still takes care of disk errors, but also handles predictive disk failures. It doesn't support external redundancy, but it makes the disk group smart scan capable. Let's have a closer look.
Grid disks
In Exadata the ASM disks live on storage cells and are presented to compute nodes (where ASM instances run) via the Oracle proprietary iDB protocol. Each storage cell has 12 hard disks and 16 flash disks. During Exadata deployment, grid disks are created on those 12 hard disks. Flash disks are used for the flash cache and the flash log, so grid disks are normally not created on flash disks.
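For the cell side view of the same disks, CellCLI can be used on the storage cell itself. A minimal sketch (run as root or celladmin on a cell; the attribute list is just an example):
# cellcli -e list griddisk attributes name, size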
Grid disks are not exposed to the operating system, so only database instances, ASM and related utilities that speak iDB can see them. One such utility is kfod, the ASM disk discovery tool. Here is an example of kfod discovering grid disks in an Exadata environment:
$ kfod disks=all
-----------------------------------------------------------------
Disk Size Path User Group
=================================================================
1: 433152 Mb o/192.168.10.9/DATA_CD_00_exacell01
2: 433152 Mb o/192.168.10.9/DATA_CD_01_exacell01
3: 433152 Mb o/192.168.10.9/DATA_CD_02_exacell01
...
13: 29824 Mb o/192.168.10.9/DBFS_DG_CD_02_exacell01
14: 29824 Mb o/192.168.10.9/DBFS_DG_CD_03_exacell01
15: 29824 Mb o/192.168.10.9/DBFS_DG_CD_04_exacell01
...
23: 108224 Mb o/192.168.10.9/RECO_CD_00_exacell01
24: 108224 Mb o/192.168.10.9/RECO_CD_01_exacell01
25: 108224 Mb o/192.168.10.9/RECO_CD_02_exacell01
...
474: 108224 Mb o/192.168.10.22/RECO_CD_09_exacell14
475: 108224 Mb o/192.168.10.22/RECO_CD_10_exacell14
476: 108224 Mb o/192.168.10.22/RECO_CD_11_exacell14
-----------------------------------------------------------------
ORACLE_SID ORACLE_HOME
=================================================================
+ASM1 /u01/app/11.2.0.3/grid
+ASM2 /u01/app/11.2.0.3/grid
+ASM3 /u01/app/11.2.0.3/grid
...
+ASM8 /u01/app/11.2.0.3/grid
$
Note that grid disks are prefixed with either DATA, RECO or DBFS_DG. Those are ASM disk group names in this environment. Each grid disk name ends with the storage cell name. It is also important to note that disks with the same prefix have the same size. The above example is from a full rack - hence 14 storage cells and 8 ASM instances.
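To confirm that grid disks with the same prefix really are the same size, we can group them by the prefix taken from the path. This is my own quick check, not part of the original output; it assumes the ASM instance can already see the grid disks:
SQL> select substr(gd, 1, instr(gd, '_CD_') - 1) prefix,
            count(*) disks,
            min(os_mb) min_mb,
            max(os_mb) max_mb
     from (select substr(path, instr(path, '/', -1) + 1) gd, os_mb
           from v$asm_disk)
     group by substr(gd, 1, instr(gd, '_CD_') - 1);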
ASM_DISKSTRING
In Exadata, ASM_DISKSTRING='o/*/*'. This tells ASM that it is running on an Exadata compute node and that it should expect grid disks.
$ sqlplus / as sysasm
SQL> show parameter asm_diskstring
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string      o/*/*
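If the parameter ever needs to be set by hand (it is normally configured at deployment time), a minimal sketch:
SQL> alter system set asm_diskstring = 'o/*/*' scope=both sid='*';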
Automatic failgroups
There are no external redundancy disk groups in Exadata - you have a choice of either normal or high redundancy. When creating disk groups, ASM automatically puts all grid disks from the same storage cell into the same failgroup, and the failgroup is named after the storage cell.
This would be an example of creating a disk group in an Exadata environment (note how the grid disk prefix comes in handy):
SQL> create diskgroup RECO
disk 'o/*/RECO*'
attribute
'COMPATIBLE.ASM'='11.2.0.0.0',
'COMPATIBLE.RDBMS'='11.2.0.0.0',
'CELL.SMART_SCAN_CAPABLE'='TRUE';
Once the disk group is created we can check the disk and failgroup names:
SQL> select name, failgroup, path from v$asm_disk_stat where name like 'RECO%';
NAME FAILGROUP PATH
-------------------- --------- -----------------------------------
RECO_CD_08_EXACELL01 EXACELL01 o/192.168.10.3/RECO_CD_08_exacell01
RECO_CD_07_EXACELL01 EXACELL01 o/192.168.10.3/RECO_CD_07_exacell01
RECO_CD_01_EXACELL01 EXACELL01 o/192.168.10.3/RECO_CD_01_exacell01
...
RECO_CD_00_EXACELL02 EXACELL02 o/192.168.10.4/RECO_CD_00_exacell02
RECO_CD_05_EXACELL02 EXACELL02 o/192.168.10.4/RECO_CD_05_exacell02
RECO_CD_04_EXACELL02 EXACELL02 o/192.168.10.4/RECO_CD_04_exacell02
...
SQL>
Note that we did not specify the failgroup names in the CREATE DISKGROUP statement. ASM has automatically put grid disks from the same storage cell in the same failgroup.
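To confirm the distribution, we can count the disks per failgroup; this is my own quick check, not part of the original example:
SQL> select failgroup, count(*) disks
     from v$asm_disk_stat
     where name like 'RECO%'
     group by failgroup
     order by failgroup;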
cellip.ora
The cellip.ora is the configuration file, present on every database server, that tells ASM instances which storage cells are available to the cluster. Here is the content of a typical cellip.ora file for a quarter rack system:
$ cat /etc/oracle/cell/network-config/cellip.ora
cell="192.168.10.3"
cell="192.168.10.4"
cell="192.168.10.5"
Now that we have seen what is in cellip.ora, the grid disk paths in the examples above should make more sense.
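Each IP address from cellip.ora should also be visible to the instances as a cell path; a minimal sketch, assuming the V$CELL view is available on the compute nodes:
SQL> select cell_path from v$cell;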
Disk group attributes
The following attributes and their values are recommended in Exadata environments:
- COMPATIBLE.ASM - Should be set to the ASM software version in use.
- COMPATIBLE.RDBMS - Should be set to the database software version in use.
- CELL.SMART_SCAN_CAPABLE - Has to be set to TRUE. This attribute/value is actually mandatory in Exadata.
- AU_SIZE - Should be set to 4M. This is the default value in recent ASM versions for Exadata environments.
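To verify these attributes on an existing disk group, we can query V$ASM_ATTRIBUTE; a minimal sketch (the disk group name RECO is just taken from the earlier example):
SQL> select a.name, a.value
     from v$asm_attribute a, v$asm_diskgroup d
     where a.group_number = d.group_number
     and d.name = 'RECO'
     and a.name in ('compatible.asm', 'compatible.rdbms', 'cell.smart_scan_capable', 'au_size');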
ASM initialization parameters
The following ASM initialization parameters and their values are recommended in Exadata environments:
| Parameter | Value |
|---|---|
| CLUSTER_INTERCONNECTS | bondib0 IP address for X2-2. Colon-delimited bondib* IP addresses for X2-8. |
| ASM_POWER_LIMIT | 1 for a quarter rack, 2 for all other racks. |
| SGA_TARGET | 1250 MB |
| PGA_AGGREGATE_TARGET | 400 MB |
| MEMORY_TARGET | 0 |
| MEMORY_MAX_TARGET | 0 |
| PROCESSES | For fewer than 10 database instances per node: 50*(#db instances per node + 1). For 10 or more database instances per node: [50*MIN(#db instances per node + 1, 11)] + [10*MAX(#db instances per node - 10, 0)] |
| USE_LARGE_PAGES | ONLY |
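A quick way to compare an existing ASM instance against these recommendations (my own sketch, run in the ASM instance):
SQL> select name, value
     from v$parameter
     where name in ('cluster_interconnects', 'asm_power_limit',
                    'sga_target', 'pga_aggregate_target',
                    'memory_target', 'memory_max_target',
                    'processes', 'use_large_pages');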
Diskmon
On every database server we find the master diskmon process and one dskm slave process per ASM and database instance:
# ps -ef | egrep "diskmon|dskm" | grep -v grep
oracle 3205 1 0 Mar16 ? 00:01:18 ora_dskm_ONE2
oracle 10755 1 0 Mar16 ? 00:32:19 /u01/app/11.2.0.3/grid/bin/diskmon.bin -d -f
oracle 17292 1 0 Mar16 ? 00:01:17 asm_dskm_+ASM2
oracle 24388 1 0 Mar28 ? 00:00:21 ora_dskm_TWO2
oracle 27962 1 0 Mar27 ? 00:00:24 ora_dskm_THREE2
#
In Exadata, the diskmon is responsible for:
- Handling of storage cell failures and I/O fencing
- Monitoring of Exadata Server state on all storage cells in the cluster (heartbeat)
- Broadcasting intra-database IORM (I/O Resource Manager) plans from databases to storage cells
- Monitoring of the control messages from database and ASM instances to storage cells
- Communicating with other diskmons in the cluster
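If diskmon misbehaves, its log on the database server is a good place to start. A hedged sketch, assuming the standard 11.2 Grid Infrastructure log layout and the Grid home from the examples above (adjust both to your environment):
# tail -f /u01/app/11.2.0.3/grid/log/`hostname -s`/diskmon/diskmon.log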