maclean
2011-05-29
<p>注意启用slaves I/O 可能导致额外的CI enqueue争用<br>
</p>
<p>SQL> show parameter dbwr</p>
<p>NAME TYPE VALUE<br>
------------------------------------ ----------- ------------------------------<br>
dbwr_io_slaves integer 3</p>
<p>SQL> select vs.sid,vs.program,vl.type,vl.id1,vl.id2,vl.lmode,vl.request,vl.ctime,vl.block from v$lock vl ,v$session vs<br>
2 where <br>
3 vl.sid=vs.sid<br>
4 and<br>
5 vl.type='CI'<br>
6 ;</p>
<p> SID PROGRAM TY ID1 ID2 LMODE REQUEST CTIME BLOCK<br>
---------- ------------------------------------------------ -- --- --- ----- ------- ---------- -----<br>
136 sqlplus@rh2.oracle.com (TNS V1-V3) CI 1 5 6 0 25149 1<br>
165 oracle@rh2.oracle.com (CKPT) CI 1 1 4 0 25215 1<br>
165 oracle@rh2.oracle.com (CKPT) CI 2 1 4 0 25374 0<br>
136 sqlplus@rh2.oracle.com (TNS V1-V3) CI 1 1 0 6 25149 0<br>
164 oracle@rh2.oracle.com (SMON) CI 1 5 0 6 8284 0<br>
</p>
<p>SQL> oradebug setmypid;<br>
Statement processed.</p>
<p>SQL> oradebug hanganalyze 3;<br>
Hang Analysis in /s01/admin/G10R2/udump/g10r2_ora_5744.trc</p>
<p>==============<br>
HANG ANALYSIS:<br>
==============<br>
Open chains found:<br>
Chain 1 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :<br>
<0/165/1/0xdc2f8260/4703/rdbms ipc message><br>
-- <0/136/34/0xdc2fd170/1569/enq: CI - contention><br>
-- <0/164/1/0xdc2f8a48/4705/enq: CI - contention><br>
-- <0/160/1/0xdc2fa9e8/4713/buffer busy waits><br>
Other chains found:<br>
Chain 2 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :<br>
<0/138/55/0xdc3000e0/5744/No Wait><br>
Chain 3 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :<br>
<0/139/1/0xdc302868/4862/i/o slave wait><br>
Chain 4 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :<br>
<0/145/3/0xdc2ff8f8/4840/Streams AQ: waiting for time man><br>
Chain 5 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :<br>
<0/146/1981/0xdc2fc988/5308/i/o slave wait><br>
Chain 6 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :<br>
<0/149/1/0xdc2ff110/4815/Streams AQ: qmn coordinator idle><br>
Chain 7 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :<br>
<0/153/1/0xdc2fe928/4813/i/o slave wait><br>
Chain 8 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :<br>
<0/158/10/0xdc302080/4860/Streams AQ: qmn slave idle wait><br>
Chain 9 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :<br>
<0/159/255/0xdc2fc1a0/31232/i/o slave wait><br>
Chain 10 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :<br>
<0/161/1/0xdc2fa200/4711/buffer busy waits><br>
Chain 11 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :<br>
<0/167/1/0xdc2f7290/4699/io done><br>
</p>
<p>see enq: CI - contention、i/o slave wait、buffer busy waits</p>
<p>Foreground Process alter system checkpoint <br>
CI 1 5 -- Reuse (checkpoint and invalidate) block range |Used to queue up interested clients</p>
<p>CKPT <br>
CI 1 1 -- same | Used to pass in parameters<br>
CI 2 1 -- LGWR Checkpointing and Hot Backup | Used to pass in parameters</p>
<p>Foreground Process alter system flush buffer_cache<br>
CI 1 1</p>
<p>SMON<br>
CI 1 5<br>
</p>
<p>SQL> select * From x$ksqrs where ksqrsidt='CI';</p>
<p>ADDR INDX INST_ID KSQRSID1 KSQRSID2 KS KSQRSFLG<br>
---------------- ---------- ---------- ---------- ---------- -- ----------<br>
00000000DC579A48 101 1 2 1 CI 2<br>
00000000DC581910 337 1 1 1 CI 2<br>
00000000DC583FB0 409 1 1 5 CI 2</p>
<p>resource structure , one for unique id1|id2<br>
SELECT ksqlkres,<br>
ksqlkadr,<br>
ksqlkmod,<br>
ksqlkreq,<br>
ksqlkctim,<br>
KSQLKLBLK<br>
FROM x$ksqeq<br>
WHERE ksqlkres IN (SELECT addr<br>
FROM x$ksqrs<br>
WHERE ksqrsidt = 'CI');<br>
</p>
<p>KSQLKRES KSQLKADR KSQLKMOD KSQLKREQ KSQLKCTIM KSQLKLBLK<br>
---------------- ---------------- ---------- ---------- ---------- ----------<br>
00000000DC583FB0 00000000DC51FDE8 6 0 25923 1<br>
00000000DC581910 00000000DC51FF18 4 0 25989 1<br>
00000000DC579A48 00000000DC51FFB0 4 0 26148 0<br>
00000000DC581910 00000000DC520650 0 6 25923 0<br>
00000000DC583FB0 00000000DC5208B0 0 6 9058 0<br>
00000000DC581910 00000000DC522318 0 0 27369 27369<br>
00000000DC579A48 00000000DC5223B0 0 0 27369 27369<br>
</p>
<p>SQL> select sid,serial# from v$session where sid=136;</p>
<p> SID SERIAL#<br>
---------- ----------<br>
136 34</p>
<p>SQL> alter system kill session '136,34';<br>
hanging here..............we KILL process</p>
<p>still hang</p>
<p>SQL> select vs.sid,vs.program,vl.type,vl.id1,vl.id2,vl.lmode,vl.request,vl.ctime,vl.block from v$lock vl ,v$session vs<br>
2 where <br>
3 vl.sid=vs.sid<br>
4 and<br>
5 vl.type='CI'<br>
6 ;</p>
<p> SID PROGRAM TY ID1 ID2 LMODE REQUEST CTIME BLOCK<br>
---------- ------------------------------------------------ -- --- --- ----- ------- ---------- -----<br>
165 oracle@rh2.oracle.com (CKPT) CI 1 1 4 0 26175 1<br>
165 oracle@rh2.oracle.com (CKPT) CI 2 1 4 0 26334 0<br>
164 oracle@rh2.oracle.com (SMON) CI 1 1 0 6 42 0<br>
164 oracle@rh2.oracle.com (SMON) CI 1 5 6 0 42 0<br>
</p>
<p> </p>
<p>Sun May 29 16:30:30 2011<br>
Errors in file /s01/admin/G10R2/bdump/g10r2_ora_4864.trc:<br>
ORA-00600: internal error code, arguments: [ksfdcko3], [0x0DBED8338], [], [], [], [], [], []<br>
ORA-27041: unable to open file<br>
Linux-x86_64 Error: 2: No such file or directory<br>
Additional information: 3<br>
Sun May 29 16:30:30 2011<br>
Errors in file /s01/admin/G10R2/bdump/g10r2_ora_4866.trc:<br>
ORA-00600: internal error code, arguments: [ksfdcko3], [0x0DBED8338], [], [], [], [], [], []<br>
ORA-27041: unable to open file</p>
<p>ksedmp: internal or fatal error<br>
ORA-00600: internal error code, arguments: [ksfdcko3], [0x0DBED8338], [], [], [], [], [], []<br>
ORA-27041: unable to open file<br>
Linux-x86_64 Error: 2: No such file or directory<br>
Additional information: 3<br>
----- Call Stack Trace -----<br>
calling call entry argument values in hex<br>
location type point (? means dubious value)<br>
-------------------- -------- -------------------- ----------------------------<br>
ksedst()+31 call ksedst1() 000000000 ? 000000001 ?<br>
7FFFAC32C340 ? 7FFFAC32C3A0 ?<br>
7FFFAC32C2E0 ? 000000000 ?<br>
ksedmp()+610 call ksedst() 000000000 ? 000000001 ?<br>
7FFFAC32C340 ? 7FFFAC32C3A0 ?<br>
7FFFAC32C2E0 ? 000000000 ?<br>
ksfdmp()+21 call ksedmp() 000000003 ? 000000001 ?<br>
7FFFAC32C340 ? 7FFFAC32C3A0 ?<br>
7FFFAC32C2E0 ? 000000000 ?<br>
kgerinv()+161 call ksfdmp() 000000003 ? 000000001 ?<br>
7FFFAC32C340 ? 7FFFAC32C3A0 ?<br>
7FFFAC32C2E0 ? 000000000 ?<br>
kgesinv()+33 call kgerinv() 0068D41C0 ? 006B8ABF0 ?<br>
7FFFAC32C3A0 ? 7FFFAC32C2E0 ?<br>
000000000 ? 000000000 ?<br>
ksesin()+211 call kgesinv() 0068D41C0 ? 006B8ABF0 ?<br>
7FFFAC32C3A0 ? 7FFFAC32C2E0 ?<br>
000000000 ? 000000000 ?<br>
ksfdcko()+1173 call ksesin() 0053F31C4 ? 000000001 ?<br>
000000002 ? 0DBED8338 ?<br>
000000016 ? 0DC303858 ?<br>
ksfvcko()+67 call ksfdcko() 0DBD764A8 ? 000000003 ?<br>
000000002 ? 0DBED8338 ?<br>
000000016 ? 0DC303858 ?<br>
ksfdsubmit()+76 call ksfvcko() 0D8BE81D0 ? 000000002 ?<br>
000000002 ? 0DBED8338 ?<br>
000000016 ? 0DC303858 ?<br>
ksfvdoaio()+373 call ksfdsubmit() 0D8BE81D0 ? 0D8BE8160 ?<br>
000000003 ? 000000001 ?<br>
000000016 ? 0DC303858 ?<br>
ksfvrdp()+6683 call ksfvdoaio() 0D8BE8130 ? 0D8BE8160 ?<br>
000000003 ? 000000001 ?<br>
000000016 ? 0DC303858 ?<br>
opirip()+1157 call ksfvrdp() 0D8BE8130 ? 0D9D1BD18 ?<br>
000000003 ? 0D7AAB568 ?<br>
0068D4340 ? 7FFF00000000 ?<br>
opidrv()+582 call opirip() 000000032 ? 000000004 ?<br>
7FFFAC32E978 ? 0D7AAB568 ?<br>
0068D4340 ? 7FFF00000000 ?<br>
sou2o()+114 call opidrv() 000000032 ? 000000004 ?<br>
7FFFAC32E978 ? 0D7AAB568 ?<br>
0068D4340 ? 7FFF00000000 ?<br>
opimai_real()+317 call sou2o() 7FFFAC32E950 ? 000000032 ?<br>
000000004 ? 7FFFAC32E978 ?<br>
0068D4340 ? 7FFF00000000 ?<br>
main()+116 call opimai_real() 000000003 ? 7FFFAC32E9E0 ?<br>
000000004 ? 7FFFAC32E978 ?<br>
0068D4340 ? 7FFF00000000 ?<br>
__libc_start_main() call main() 000000003 ? 7FFFAC32E9E0 ?<br>
+244 000000004 ? 7FFFAC32E978 ?<br>
0068D4340 ? 7FFF00000000 ?<br>
_start()+41 call __libc_start_main() 000723748 ? 000000001 ?<br>
7FFFAC32EB38 ? 000000000 ?<br>
0068D4340 ? 000000003 ?</p>
<p>RMAN> report schema;</p>
<p>using target database control file instead of recovery catalog<br>
Report of database schema</p>
<p>List of Permanent Datafiles<br>
===========================<br>
File Size(MB) Tablespace RB segs Datafile Name<br>
---- -------- -------------------- ------- ------------------------<br>
hanging here...............</p>
<p>RMAN> restore datafile 9;</p>
<p>Starting restore at 29-MAY-11<br>
using channel ORA_DISK_1<br>
using channel ORA_DISK_2</p>
<p>channel ORA_DISK_1: starting datafile backupset restore<br>
channel ORA_DISK_1: specifying datafile(s) to restore from backup set<br>
restoring datafile 00009 to /flashcard/wwli.dbf<br>
channel ORA_DISK_1: reading from backup piece /s01/flash_recovery_area/G10R2/backupset/2011_05_29/o1_mf_nnndf_TAG20110529T162547_6y40rd0f_.bkp<br>
RMAN-00571: ===========================================================<br>
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============<br>
RMAN-00571: ===========================================================<br>
RMAN-03002: failure of restore command at 05/29/2011 23:56:23<br>
ORA-19870: error reading backup piece /s01/flash_recovery_area/G10R2/backupset/2011_05_29/o1_mf_nnndf_TAG20110529T162547_6y40rd0f_.bkp<br>
ORA-19573: cannot obtain exclusive enqueue for datafile 9</p>
<p>[oracle@rh2 bdump]$ oerr ora 19573 <br>
19573, 00000, "cannot obtain %s enqueue for datafile %s"<br>
// *Cause: The file access enqueue could not be obtained for a file<br>
// specified in a backup, copy or restore operation.<br>
// If the enqueue type shown is 'shared', then the file is the<br>
// input file for a backup or copy. If the type is 'exclusive', then<br>
// the file is the output file for a datafile copy or restore which<br>
// is attempting to overwrite the currently active version of that<br>
// file - in this case, the file must be offline or the database must<br>
// be closed. If the type is 'read-only', then you are attempting<br>
// to back up or copy this file while the database is in NOARCHIVELOG<br>
// mode.<br>
// *Action: Wait until the conflicting operation is complete, then retry<br>
// the copy or backup. If the database is in NOARCHIVELOG mode, then<br>
// all files being backed up must be closed cleanly.<br>
</p>
<p>SQL> select vs.sid,vs.program,vl.type,vl.id1,vl.id2,vl.lmode,vl.request,vl.ctime,vl.block from v$lock vl ,v$session vs<br>
2 where <br>
3 vl.sid=vs.sid<br>
4 and<br>
5 vl.type='MR' and vl.id1=9;</p>
<p> SID PROGRAM TY ID1 ID2 LMODE REQUEST CTIME BLOCK<br>
---- ------------------------------------------------ -- --- --- ----- ------- ---------- -----<br>
167 oracle@rh2.oracle.com (DBW0) MR 9 0 4 0 269474 0</p>
<p>MR enqueue lock is still taken by DBW0 ,reset dbwr_io_slaves and restart instance </p>
<p>SQL> alter system set dbwr_io_slaves=0 scope=spfile;</p>
<p>System altered.</p>
<p>SQL> shutdown abort;<br>
ORACLE instance shut down.<br>
SQL> startup;<br>
ORACLE instance started.</p>
<p>Total System Global Area 1065353216 bytes<br>
Fixed Size 2089336 bytes<br>
Variable Size 486542984 bytes<br>
Database Buffers 570425344 bytes<br>
Redo Buffers 6295552 bytes<br>
Database mounted.<br>
ORA-01157: cannot identify/lock data file 9 - see DBWR trace file<br>
ORA-01110: data file 9: '/flashcard/wwli.dbf'<br>
</p>
<p>RMAN> restore datafile 9;</p>
<p>Starting restore at 30-MAY-11<br>
using target database control file instead of recovery catalog<br>
allocated channel: ORA_DISK_1<br>
channel ORA_DISK_1: sid=155 devtype=DISK<br>
allocated channel: ORA_DISK_2<br>
channel ORA_DISK_2: sid=154 devtype=DISK</p>
<p>channel ORA_DISK_1: starting datafile backupset restore<br>
channel ORA_DISK_1: specifying datafile(s) to restore from backup set<br>
restoring datafile 00009 to /flashcard/wwli.dbf<br>
channel ORA_DISK_1: reading from backup piece /s01/flash_recovery_area/G10R2/backupset/2011_05_29/o1_mf_nnndf_TAG20110529T162547_6y40rd0f_.bkp<br>
channel ORA_DISK_1: restored backup piece 1<br>
piece handle=/s01/flash_recovery_area/G10R2/backupset/2011_05_29/o1_mf_nnndf_TAG20110529T162547_6y40rd0f_.bkp tag=TAG20110529T162547<br>
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01<br>
Finished restore at 30-MAY-11</p>
<p>RMAN> recover datafile 9;</p>
<p>Starting recover at 30-MAY-11<br>
using channel ORA_DISK_1<br>
using channel ORA_DISK_2</p>
<p>starting media recovery</p>
<p>channel ORA_DISK_2: starting archive log restore to default destination<br>
channel ORA_DISK_2: restoring archive log<br>
archive log thread=1 sequence=710<br>
channel ORA_DISK_1: starting archive log restore to default destination<br>
channel ORA_DISK_2: restoring archive log<br>
archive log thread=1 sequence=711<br>
channel ORA_DISK_1: restoring archive log<br>
archive log thread=1 sequence=704<br>
channel ORA_DISK_1: restoring archive log<br>
archive log thread=1 sequence=705<br>
channel ORA_DISK_2: restoring archive log<br>
archive log thread=1 sequence=712<br>
channel ORA_DISK_1: restoring archive log<br>
archive log thread=1 sequence=706<br>
channel ORA_DISK_2: restoring archive log<br>
archive log thread=1 sequence=713<br>
channel ORA_DISK_2: reading from backup piece /s01/flash_recovery_area/G10R2/backupset/2011_05_29/o1_mf_annnn_TAG20110529T162617_6y40scvo_.bkp<br>
channel ORA_DISK_1: restoring archive log<br>
archive log thread=1 sequence=707<br>
channel ORA_DISK_1: restoring archive log<br>
archive log thread=1 sequence=708<br>
channel ORA_DISK_1: restoring archive log<br>
archive log thread=1 sequence=709<br>
channel ORA_DISK_1: reading from backup piece /s01/flash_recovery_area/G10R2/backupset/2011_05_29/o1_mf_annnn_TAG20110529T162617_6y40sbqg_.bkp<br>
channel ORA_DISK_1: restored backup piece 1<br>
piece handle=/s01/flash_recovery_area/G10R2/backupset/2011_05_29/o1_mf_annnn_TAG20110529T162617_6y40sbqg_.bkp tag=TAG20110529T162617<br>
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_704_6y4vf989_.arc thread=1 sequence=704<br>
channel default: deleting archive log(s)<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_704_6y4vf989_.arc recid=716 stamp=752457641<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_705_6y4vf98w_.arc thread=1 sequence=705<br>
channel default: deleting archive log(s)<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_705_6y4vf98w_.arc recid=718 stamp=752457641<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_706_6y4vf99k_.arc thread=1 sequence=706<br>
channel default: deleting archive log(s)<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_706_6y4vf99k_.arc recid=720 stamp=752457641<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_707_6y4vf9b3_.arc thread=1 sequence=707<br>
channel default: deleting archive log(s)<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_707_6y4vf9b3_.arc recid=721 stamp=752457641<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_708_6y4vf9bn_.arc thread=1 sequence=708<br>
channel default: deleting archive log(s)<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_708_6y4vf9bn_.arc recid=722 stamp=752457641<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_709_6y4vf9cm_.arc thread=1 sequence=709<br>
channel default: deleting archive log(s)<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_709_6y4vf9cm_.arc recid=723 stamp=752457641<br>
channel ORA_DISK_2: restored backup piece 1<br>
piece handle=/s01/flash_recovery_area/G10R2/backupset/2011_05_29/o1_mf_annnn_TAG20110529T162617_6y40scvo_.bkp tag=TAG20110529T162617<br>
channel ORA_DISK_2: restore complete, elapsed time: 00:00:01<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_710_6y4vf976_.arc thread=1 sequence=710<br>
channel default: deleting archive log(s)<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_710_6y4vf976_.arc recid=714 stamp=752457641<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_711_6y4vf97v_.arc thread=1 sequence=711<br>
channel default: deleting archive log(s)<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_711_6y4vf97v_.arc recid=715 stamp=752457641<br>
channel default: deleting archive log(s)<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_712_6y4vf98h_.arc recid=717 stamp=752457641<br>
channel default: deleting archive log(s)<br>
archive log filename=/s01/flash_recovery_area/G10R2/archivelog/2011_05_30/o1_mf_1_713_6y4vf993_.arc recid=719 stamp=752457641<br>
media recovery complete, elapsed time: 00:00:03<br>
Finished recover at 30-MAY-11<br>
</p>
<p>SQL> alter database open ;</p>
<p>Database altered.<br>
</p>
<p>Never Lost your datafile while the instance is still running ,if you have enabled DBWR slaves IO!!</p>