Oracle内部错误:ORA-00600[OSDEP_INTERNAL]一例

一套HP-UX上的9.2.0.5系统在shutdown abort时出现ORA-00600: internal error code, arguments: [OSDEP_INTERNAL], [], [], [], [], [], [], []内部错误,伴随有ORA-27302: failure occurred at: skgpwinit4,ORA-27303: additional information: attach to invalid skgp shared ctx,具体日志如下:

/opt/oracle/product/9.2.0.5/rdbms/log/ngende_ora_7669.trc
Oracle9i Enterprise Edition Release 9.2.0.5.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.5.0 - Production
ORACLE_HOME = /opt/oracle/product/9.2.0.5
System name: HP-UX
Node name: yictngd3
Release: B.11.23
Version: U
Machine: ia64
Instance name: nGende
Redo thread mounted by this instance: 0 
Oracle process number: 0
7669

*** 2010-09-08 00:10:02.985
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [OSDEP_INTERNAL], [], [], [], [], [], [], []
ORA-27302: failure occurred at: skgpwinit4
ORA-27303: additional information: attach to invalid skgp shared ctx
Current SQL information unavailable - no session.

Call stack
--------------
ksedmp <- ksfdmp <- kgerinv <- kgerin <- kgerecoserr <- ksucrp <- ksucresg <- kpolna 
<- kpogsk <- opiodr <- ttcpip <- opitsk <- Cannot <- Cannot <- Cannot <- Cannot <- opiino 
<- opiodr <- opidrv <- sou2o <- main <- main_opd_entry

经查该内部错误与操作系统共享内存有关,相关的Note有:

Ora-00600: Internal Error Code, Arguments: [Osdep_internal] [ID 304027.1]
Applies to:
Oracle Server - Enterprise Edition - Version: 9.2.0.2 to 10.2.0.3 - Release: 9.2 to 10.2
Information in this document applies to any platform.
***Checked for relevance on 03-NOV-2010***

Getting ORA-600 [OSDEP_INTERNAL] errors while starting up the database:

ORA-00600: internal error code, arguments: [OSDEP_INTERNAL],
[], [], [], [], [], [], []
ORA-27302: failure occurred at: skgpwreset1
ORA-27303: additional information: invalid shared ctx
ORA-27146: post/wait initialization failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpsemsper
Symptoms
Getting ORA-600 [OSDEP_INTERNAL]
Accompanied by the following errors
ORA-27302:Failure occured at: skgpwreset1
ORA-27303:additional information: invalid shared ctx
ORA-27146: post/wait initialization failed
ORA-27300: OS system dependent operation: segment failed with error 28
ORA-27301: OS system Failure message: No space left on device
ORA-27302: failure occured at: sskgpsemsper

Cause
The functions in the trace file generated point to the semaphore settings .
Smmns is set too low.

Solution
set semmns 32767
Arrange to make the changes persistent as per the Operating system then restart the server and check if the changes are persistent.
eg: Linux /etc/sysctl.conf

sem = semmsl semmns semopm semmni
kernel.sem = 256 32768 100 228

Getting ORA-00600 [OSDEP_INTERNAL]: Internal Error While Trying To Connect / As Sysdba [ID 253885.1]

Applies to:
Oracle Server - Enterprise Edition - Version: 9.2.0.3 and later   [Release: 9.2 and later ]
HP-UX PA-RISC (64-bit)
Symptoms
Getting following error while trying to connect as sysdba using sqlplus:

SQL> conn / as sysdba
ERROR:
ORA-01041: internal error. hostdef extension doesn't exist

Alert.log shows:

ORA-00600: internal error code, arguments: [OSDEP_INTERNAL], [], [], [], [], [],[], []
ORA-27302: failure occurred at: skgpwinit4
ORA-27303: additional information: attach to invalid skgp shared ctx
Cause
- Database was shutdown using "shutdown abort" option.
- Shared memory segment was not removed even though the instance was down.
Solution
+ Check which shared memory segments are owned by the oracle owner

Use the ipcs -bm command:

% ipcs -bm

m 34034336 0xf8f18468 --rw-r----- ORACLE dba 16777216

+ Delete the 'orphan' shared memory segments:

% ipcrm -m 34034336

If there is more than one instance running on the server and you are not sure how to identify the shared
memory segments then please contact support.

不恰当的设置OS VM参数可能导致该问题,而在HP-UX PA-RISC平台上使用'shotdown abort'命令时可能因为共享内存未能正常移除而出现该内部错误;因为实例还是以'abort'方式关闭的,仅仅是共享内存未能释放,所以只需要以ipcs->ipcrm等os命令将相应的共享内存段释放就可以了,不会造成其他影响。

11g内存管理新特性的internal表现

11g中自动内存管理(Automatic Memory Management ,amm), 令dba在数据库内存配置的相关工作更加简单. AMM现在将SGA与PGA整合到一起管理,而您只需要设置memory_target参数即可限定Oracle将使用到的内存尺寸,Oracle将自动分配这些内存空间.

您一定很困惑Oracle在unix平台上是如何对共享的sga内存空间与私有的pga内存空间进行切换的?这意味着Oracle需要经常释放sga中的部分内存以便允许pga去使用它们.传统的sys V 使用的共享内存shm接口不具备如此的灵活性.我们来看看Oracle是如何做到的?

先来获取我们需要的11g实例共享内存id(shared memory id)

[oracle@rh2 ~]$ sysresv                // 该命令需要设置了正确的LD_LIBRARY_PATH
IPC Resources for ORACLE_SID “T11” :
Shared Memory:
ID              KEY
65537           0x95c84bb8
Semaphores:
ID              KEY
327681          0xdf521034
Oracle Instance alive for sid “T11”

试着找出对应的sys V共享内存段:

[oracle@rh2 ~]$ ipcs -m

—— Shared Memory Segments ——–
key        shmid      owner      perms      bytes      nattch     status
0x95c84bb8 65537      oracle    660        4096       0

对应的存在着共享内存段,但该段很小只有 4096 byte哦,既然Oracle不再把sga放到共享段中,那藏到哪里去了呢?

我们接下来检查Oracle实例进程的内存影射状况.

[oracle@rh2 ~]$  pmap `pgrep -f lgwr`|less
14889:   ora_lgwr_T11
0000000000400000 155016K r-x–  /usr/oracle/product/11g/db_1/bin/oracle
0000000009c62000  12404K rw—  /usr/oracle/product/11g/db_1/bin/oracle
000000000a87f000    732K rwx–    [ anon ]
0000000060000000      4K r–s-  /dev/shm/ora_T11_65537_0
0000000060001000  16380K rw-s-  /dev/shm/ora_T11_65537_0
0000000061000000  16384K rw-s-  /dev/shm/ora_T11_65537_1
0000000062000000  16384K rw-s-  /dev/shm/ora_T11_65537_2
0000000063000000  16384K rw-s-  /dev/shm/ora_T11_65537_3
0000000064000000  16384K rw-s-  /dev/shm/ora_T11_65537_4
0000000065000000  16384K rw-s-  /dev/shm/ora_T11_65537_5
0000000066000000  16384K rw-s-  /dev/shm/ora_T11_65537_6
0000000067000000  16384K rw-s-  /dev/shm/ora_T11_65537_7
0000000068000000  16384K rw-s-  /dev/shm/ora_T11_65537_8
0000000069000000  16384K rw-s-  /dev/shm/ora_T11_65537_9
000000006a000000  16384K rw-s-  /dev/shm/ora_T11_65537_10
000000006b000000  16384K rw-s-  /dev/shm/ora_T11_65537_11
000000006c000000  16384K rw-s-  /dev/shm/ora_T11_65537_12
000000006d000000  16384K rw-s-  /dev/shm/ora_T11_65537_13
000000006e000000  16384K rw-s-  /dev/shm/ora_T11_65537_14
000000006f000000  16384K rw-s-  /dev/shm/ora_T11_65537_15
0000000070000000  16384K rw-s-  /dev/shm/ora_T11_65537_16
0000000071000000  16384K rw-s-  /dev/shm/ora_T11_65537_17
0000000072000000  16384K rw-s-  /dev/shm/ora_T11_65537_18
0000000073000000  16384K rw-s-  /dev/shm/ora_T11_65537_19
0000000074000000  16384K rw-s-  /dev/shm/ora_T11_65537_20
0000000075000000  16384K rw-s-  /dev/shm/ora_T11_65537_21
0000000076000000  16384K rw-s-  /dev/shm/ora_T11_65537_22
0000000077000000  16384K rw-s-  /dev/shm/ora_T11_65537_23
0000000078000000  16384K rw-s-  /dev/shm/ora_T11_65537_24
0000000079000000  16384K rw-s-  /dev/shm/ora_T11_65537_25
000000007a000000  16384K rw-s-  /dev/shm/ora_T11_65537_26
000000007b000000  16384K rw-s-  /dev/shm/ora_T11_65537_27
000000007c000000  16384K rw-s-  /dev/shm/ora_T11_65537_28
000000007d000000  16384K rw-s-  /dev/shm/ora_T11_65537_29
000000007e000000  16384K rw-s-  /dev/shm/ora_T11_65537_30
000000007f000000  16384K rw-s-  /dev/shm/ora_T11_65537_31
0000000080000000  16384K rw-s-  /dev/shm/ora_T11_65537_32
0000000081000000  16384K rw-s-  /dev/shm/ora_T11_65537_33
0000000082000000  16384K rw-s-  /dev/shm/ora_T11_65537_34
0000000083000000  16384K rw-s-  /dev/shm/ora_T11_65537_35
0000000084000000  16384K rw-s-  /dev/shm/ora_T11_65537_36
0000000085000000  16384K rw-s-  /dev/shm/ora_T11_65537_37
0000000086000000  16384K rw-s-  /dev/shm/ora_T11_65537_38
0000000087000000  16384K rw-s-  /dev/shm/ora_T11_65537_39
0000000088000000  16384K rw-s-  /dev/shm/ora_T11_65537_40
0000000089000000  16384K rw-s-  /dev/shm/ora_T11_65537_41
000000008a000000  16384K rw-s-  /dev/shm/ora_T11_65537_42

............

0000003e79109000      4K rw—  /lib64/tls/librt-2.3.4.so
0000003e7910a000     64K rw—    [ anon ]
0000003e79600000     84K r-x–  /lib64/libnsl-2.3.4.so
0000003e79615000   1020K —–  /lib64/libnsl-2.3.4.so
0000003e79714000      4K r—-  /lib64/libnsl-2.3.4.so
0000003e79715000      4K rw—  /lib64/libnsl-2.3.4.so
0000003e79716000      8K rw—    [ anon ]
0000007fbfff3000     52K rwx–    [ stack ]
ffffffffff600000      4K r-x–    [ anon ]
total          2497724K

pmap工具诠释了进程相关共享内存的情况,可以看到许多个16MB的"文件"对应了Oracle服务进程的空间地址.这是linux上POSIX风格的共享内存管理模式,使用"文件"形式包含共享内存段.

借助于将sga分割成许多块,Oracle可以很容易地把sga部分内存返回给OS,而服务器进程即可以利用到这些内存.(当memory_max_target>1024时,颗粒为16MB,否则为4MB).

接下来我们测试下Oracle是如何释放部分sga内存的.

对比实例启动前后:

启动前:

[oracle@rh2 ~]$ ls -l /dev/shm
总用量 0

启动后:
[oracle@rh2 ~]$ ls -l /dev/shm
总用量 1373704
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_0
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_1
-rw-r—–  1 oracle oinstall        0  9月 27 18:59 ora_T11_327680_10
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_100
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_101
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_102
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_103
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_104
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_105
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_106
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_107
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_108
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_109
-rw-r—–  1 oracle oinstall        0  9月 27 18:59 ora_T11_327680_11
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_110
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_111
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_112
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_113
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_114
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_115
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_116
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_117
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_118
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_119
-rw-r—–  1 oracle oinstall        0  9月 27 18:59 ora_T11_327680_12

可以看到启动后出现的16MB文件形式共享内存中部分大小为0,这些块被选出当发生内存交换时来被’destory’.使用pmap工具仍可以看到该部分影射,而实际上已经被Oracle释放了.

现在我们加大pga,观察其交换情况.

SQL> alter system set pga_aggregate_target=1900M ;

System altered.

[oracle@rh2 ~]$ ls -l /dev/shm
总用量 289984
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_0
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_1
-rw-r—–  1 oracle oinstall        0  9月 27 18:59 ora_T11_327680_10
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_100
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_101
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_102
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_103
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_104
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_105
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_106
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_107
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_108
-rw-r—–  1 oracle oinstall 16777216  9月 27 18:59 ora_T11_327680_109
-rw-r—–  1 oracle oinstall        0  9月 27 18:59 ora_T11_327680_11
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_110
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_111
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_112
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_113
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_114
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_115
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_116
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_117
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_118
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_119
-rw-r—–  1 oracle oinstall        0  9月 27 18:59 ora_T11_327680_12
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_120
-rw-r—–  1 oracle oinstall        0  9月 27 19:09 ora_T11_327680_121

可以看到出现了大量size为0的"文件",期许的交换出现了.

可见在11g中Oracle采用了新的共享内存实现方式,区别于旧的"一块式"共享段,更为灵活了.

沪ICP备14014813号

沪公网安备 31010802001379号