ASM + AMDU

AMDU simply extracts data from an ASM diskgroup onto a regular file system.  It writes the database datafiles out with a “.f” extension.  All you have to do is issue an ALTER DATABASE RENAME FILE … TO … command, and you can use the extracted files to mount and open your database.

 

A little history on our unique situation, which will hopefully make the following steps clear:
1) We were using two ASM diskgroups, +DATA and +FRA.
2) The DATA diskgroup had 16 disks of 1.03TB each.
3) The DATA diskgroup ended up with corrupted disk headers after we accidentally added 2 new disks to it that were each over 2TB in size.
4) We then removed these 2 new disks by creating a new diskgroup (DATA1) with them.
5) The FRA diskgroup was OK, and it contained a copy of the controlfile plus multiplexed online redo logs and multiplexed archived logs.  This is one recommendation I can’t stress enough: multiplex your controlfile, online logs and archived logs.  We also had a pfile converted from the spfile available.  Recovery becomes far more painful without these things, so I’d strongly advise at least making sure they are backed up.
6) The result was that the DATA diskgroup thought it was supposed to have these 2 disks as members, while the 2 disks had no disk header information about the DATA diskgroup; those 2 disks thought they were part of the diskgroup DATA1 only.

Damn, what a mess.  But the good news is, we fixed it using AMDU.  AMDU is just a bit-by-bit data extraction tool.  It’s similar to RMAN only in that it extracts your Oracle data to another location.  It doesn’t check for corruption or anything else; it just copies stuff from one place to another.  There are no tricks or tuning or fancy parameters in AMDU.  It’s just a straightforward, simple extraction tool.

AMDU isn’t documented, and Oracle Support will tell you not to do the following steps without their guidance.  But if you just want to mess around with AMDU for educational purposes, or you’re in a totally desperate situation, you can try the following at your own risk.  We did, however, use this on our live environment.  Oh, and I forgot to mention: we didn’t have ANY backup at all because of a mixup in hardware, so we were basically running live without a net.  Anyway, AMDU saved the day.  Here’s how:

1) Find the file names for each datafile in your ASM diskgroup.  Ours looked something like this:
+FRA/orcl/datafile/media.260.739318209
+DATA/orcl/datafile/system.256.739321475
+DATA/orcl/datafile/sysaux.257.739321555
+DATA/orcl/datafile/undotbs.258.739321589
+DATA/orcl/datafile/users.259.739321609

These file names are another thing I strongly recommend you keep in a separate text file in a safe location.  And while I’m thinking about it, also keep the DBID that RMAN reports when you connect to the database.  You’ll need RMAN after AMDU is done, so it’s good to check it out anyway.
The important part we needed from the datafile names was the digits “.260”, “.256”, “.257”, “.258”, “.259”.  We had these because we were using OMF with our ASM.  I’m not sure how it works if you’re not using OMF, but I imagine some research and experimentation would solve that.
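
Pulling the diskgroup name and the middle number out of each file name can be scripted rather than done by eye.  The following is only a minimal sketch, assuming OMF-style names shaped like the ones listed above; the function name asm_extract_arg is made up for this illustration and is not part of AMDU.

```shell
# Hypothetical helper: turn an OMF-style ASM file name into "<diskgroup>.<file number>".
# Assumes names shaped like +DATA/orcl/datafile/system.256.739321475.
# The body runs in a subshell "( ... )" so its variables stay local.
asm_extract_arg() (
    path=$1
    group=${path#+}        # drop the leading "+"
    group=${group%%/*}     # keep everything before the first "/": DATA
    base=${path##*/}       # last path component: system.256.739321475
    base=${base%.*}        # drop the trailing number: system.256
    number=${base##*.}     # keep what follows the last ".": 256
    printf '%s.%s\n' "$group" "$number"
)

asm_extract_arg '+DATA/orcl/datafile/system.256.739321475'   # prints DATA.256
```

The result has the same diskgroup-dot-number shape that the amdu extract step later in this post expects.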

2) Next you have to make a place for AMDU to extract the data to.  This needs to be a file system, and it needs to be at least as large as the storage used by the DB.  For example, our DB was 9TB in size, so we needed a 9TB file system.  We made it 10TB just to be sure.
AMDU just extracts data from ASM.  It doesn’t check or verify anything.
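
Before starting a multi-terabyte extract it is worth confirming the target really has the room.  Here is a rough sketch using the standard df utility; the function name has_room_for and the mount point /amdu_target in the commented example are placeholders I made up, not anything from our setup.

```shell
# Hypothetical helper: succeed if DIR has at least NEEDED_KB kilobytes available.
# "df -P" forces the POSIX one-line-per-file-system format; "-k" reports 1024-byte units.
has_room_for() (
    dir=$1
    needed_kb=$2
    # Line 2 of the df output describes the file system; field 4 is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$needed_kb" ]
)

# Example (placeholder mount point): 9TB expressed in KB.
# has_room_for /amdu_target $((9 * 1024 * 1024 * 1024)) && echo "enough space"
```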

3) Next, make sure AMDU works.  What you’ll do here is a -dump of the metadata which AMDU will use to find the data to extract.  This command will produce 3 small files on your file system in the same directory from which you launch the amdu command. This can be run harmlessly.
$ amdu -diskstring '/dev/rdsk/*' -dump DATA

If you get a Bus Error (core dump), make sure your NLS_ environment variables are cleared, and also try export LD_LIBRARY_PATH=/path/of/amdu

4) Check the report.txt file for anything strange or any errors.  In the report.txt file we had, we saw things like this:
AMDU-00201: Disk N0018: '/dev/rdsk/c7t60080E5000185EB00000037A4D0DE994d0s6'
AMDU-00209: Corrupt block found: Disk N0038 AU [1] block [254] type [0]
AMDU-00201: Disk N0038: '/dev/rdsk/c7t60080E5000185EB20000039B4D0DE979d0s6'
AMDU-00209: Corrupt block found: Disk N0040 AU [1] block [254] type [0]
AMDU-00201: Disk N0040: '/dev/rdsk/c7t60080E5000185EB20000039D4D0DE9B5d0s6'
AMDU-00209: Corrupt block found: Disk N0020 AU [1] block [254] type [0]
AMDU-00201: Disk N0020: '/dev/rdsk/c7t60080E5000185EB00000037C4D0DE9CDd0s6'
AMDU-00209: Corrupt block found: Disk N0042 AU [1] block [254] type [0]
AMDU-00201: Disk N0042: '/dev/rdsk/c7t60080E5000185EB20000039F4D0DE9F3d0s6'
AMDU-00209: Corrupt block found: Disk N0056 AU [1] block [254] type [0]

but it turned out this was OK and not a cause for concern.  I know I said earlier that AMDU doesn’t check for corruption, but the errors above come from a second step in which, by default, each allocation unit is checked for corruption and I/O errors.  Even if we did have corruption, we would still have needed to extract with AMDU and then do an RMAN backup to another set of disks, which would verify whether we had corruption and where it was.  Then we could work on fixing it.
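
With dozens of disks the report gets long, so a quick per-disk tally of the AMDU-00209 lines helps you see whether the messages are spread evenly or concentrated on one disk.  This is a sketch only, assuming the line layout shown in the excerpt above; corrupt_blocks_per_disk is a name invented for this example.

```shell
# Hypothetical helper: count AMDU-00209 "Corrupt block found" lines per disk in a report file.
corrupt_blocks_per_disk() (
    report=$1
    # Field 6 of an AMDU-00209 line is the disk label, e.g. N0038.
    awk '/^AMDU-00209:/ { count[$6]++ }
         END { for (disk in count) print disk, count[disk] }' "$report" | sort
)

# Usage, from the directory holding the report:
# corrupt_blocks_per_disk report.txt
```

Each output line is a disk label followed by the number of corrupt-block messages reported for it.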

6) When you’re ready to extract your data using AMDU, do this:
$ cd <directory where you want to extract the data to>
$ amdu -diskstring '/dev/rdsk/*' -extract 'DATA.258'

where -diskstring is the same value as the asm_diskstring parameter in your ASM instance, and -extract takes an argument of the form <diskgroup>.<middle_number_of_the_OMF_file_name>

This will extract the ASM datafile to your file system.  It’ll create a folder in the directory where you launched the amdu command, containing two files: DATA_258.f and report.txt.  The file DATA_258.f is basically the same thing you’d get from a regular copy out of ASM onto a regular file system, and the database can use it with just a few configuration changes.
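
Since every datafile needs its own extract, it can help to generate the whole list of commands up front and review it before running anything.  The sketch below only prints the commands; it uses the same -diskstring and -extract options shown above and nothing else, and print_extract_commands is a made-up name.

```shell
# Hypothetical helper: print (not run) one amdu extract command per ASM file number.
# Usage: print_extract_commands DISKSTRING DISKGROUP NUMBER...
print_extract_commands() (
    diskstring=$1
    group=$2
    shift 2
    for number in "$@"; do
        printf "amdu -diskstring '%s' -extract '%s.%s'\n" "$diskstring" "$group" "$number"
    done
)

# The four DATA file numbers from the listing in step 1:
print_extract_commands '/dev/rdsk/*' DATA 256 257 258 259
```

Printing first and running later is a deliberate choice here: each extract of a large datafile takes a long time, so you want to catch a wrong number before it starts.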

7) Now that we have the file extracted to a file system, we have to tell the DB to start using this “.f” file.  This can be done with the standard ALTER DATABASE RENAME FILE command while the database is mounted:
ALTER DATABASE RENAME FILE '+DATA/orcl/datafile/system.258.738387863' TO '/test/amdu_2010_12_31_01_29_39/DATA2_258.f';

8) Follow the above steps for all your datafiles.  Don’t forget the tempfile, or recreate it.
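
Writing one rename statement per datafile by hand is error-prone, so here is a sketch that builds the statement from the ASM file name and the folder the file was extracted into.  It assumes the extracted file is named <diskgroup>_<number>.f as described above; rename_file_sql and the /test/amdu_dir folder are placeholders invented for this example.

```shell
# Hypothetical helper: print the RENAME FILE statement for one extracted datafile.
# Assumes the extracted file is named <DISKGROUP>_<number>.f inside EXTRACT_DIR.
# Usage: rename_file_sql ASM_FILE_NAME EXTRACT_DIR
rename_file_sql() (
    asm_name=$1
    extract_dir=$2
    group=${asm_name#+}      # drop the leading "+"
    group=${group%%/*}       # diskgroup name, e.g. DATA
    base=${asm_name##*/}     # e.g. users.259.739321609
    base=${base%.*}          # e.g. users.259
    number=${base##*.}       # e.g. 259
    printf "ALTER DATABASE RENAME FILE '%s' TO '%s/%s_%s.f';\n" \
        "$asm_name" "$extract_dir" "$group" "$number"
)

rename_file_sql '+DATA/orcl/datafile/users.259.739321609' '/test/amdu_dir'
```

Check each generated statement against the files actually on disk before running it in SQL*Plus, since every extract lands in its own folder.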

9) You’ll also have to do the standard DBA stuff to get the database instance open, like modifying the pfile to point to your controlfiles (if they’re in different places from where they’re supposed to be), etc.

10) If you had multiplexed your online redo logs and archived logs on your DATA diskgroup, you’ll have to drop the ENTIRE online redo log groups and recreate them.  That is, create new groups first on +FRA or your file system, then drop the old redo log groups in their entirety.  You can create more after rebuilding your DATA diskgroup.

11) Now that your data is off the ASM diskgroup and your database is open and running on the AMDU-extracted files, you can clear the disk headers on your ASM disks, rebuild your ASM diskgroup, launch an RMAN backup, then do an RMAN switch-to-copy.  See below:

a) Clear the disk headers using dd:
dd if=/dev/zero of=/dev/rdsk/c7t60080E5000185EB2000003924D0DE891d0s6 bs=8192 count=12800
where /dev/rdsk/c7t60080E5000185EB2000003924D0DE891d0s6 is the location of a disk in your corrupted diskgroup.  Repeat for every disk in your diskgroup.

b) Recreate your diskgroup:
create diskgroup DATA1 external redundancy disk '/dev/rdsk/c*****';
where /dev/rdsk/****** is the location of a disk you want to add to the diskgroup.

c) With your DB in MOUNT state, you can switch over the SYSTEM, SYSAUX, UNDO tablespaces using these commands:
RMAN> copy datafile 1 to '+DATA';
RMAN> switch datafile 1 to copy;

To move other tablespaces while the DB is open:

RMAN> backup as copy tablespace MEDIA format '+DATA';
(here tablespace MEDIA is associated with datafile #5)

RMAN> sql 'alter database datafile 5 offline';
RMAN> list copy of datafile 5;
RMAN> switch datafile 5 to copy;
RMAN> list copy of datafile 5;
RMAN> recover datafile 5;
RMAN> sql 'alter database datafile 5 online';
RMAN> report schema;
RMAN> backup current controlfile;
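
If you have many tablespaces to move back, the same sequence can be templated.  The sketch below just prints the core RMAN commands from the list above for one tablespace and datafile number, so you can paste them into an RMAN session; print_rman_move is a made-up helper name, and the optional list and report commands are left out for brevity.

```shell
# Hypothetical helper: print the RMAN commands for moving one tablespace back into ASM.
# Usage: print_rman_move TABLESPACE DATAFILE_NUMBER DISKGROUP
print_rman_move() (
    tablespace=$1
    file_no=$2
    group=$3
    # Unquoted here-document: the shell variables expand, the single quotes stay literal.
    cat <<EOF
backup as copy tablespace $tablespace format '+$group';
sql 'alter database datafile $file_no offline';
switch datafile $file_no to copy;
recover datafile $file_no;
sql 'alter database datafile $file_no online';
EOF
)

print_rman_move MEDIA 5 DATA
```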

And you’re back in business.

