如何使用gdb工具对Oracle系统状态(systemstate)做trace

当Oracle系统hang住 ,无法使用一切方法登录时 (包括 sqlplus -prelim / as sysdba),我们可以使用gdb调试工具来对 Oracle做系统 dump ,通过 系统 dump信息 判断 具体hang的原因 。 若直接 将 进程 kill 掉,则将失去现场 无法帮助今后避免 这样的hang情况。

要使用gdb 外部工具, 就需要知道目前实例中后台进程的进程号。

我们一般通过 以下命令列出 Oracle 进程:ps -ef|grep <SID>

[oracle@rh2 ~]$ ps -ef|grep oraclewebmoney
oracle   16996 16995  0 21:55 ?        00:00:00 oraclewebmoney (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

然后启动gdb ,指定Oracle软件中二进制文件 oracle的位置和 进程id

[oracle@rh2 udump]$ gdb $ORACLE_HOME/bin/oracle  16996
GNU gdb Red Hat Linux (6.3.0.0-1.159.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type “show copying” to see the conditions.
There is absolutely no warranty for GDB.  Type “show warranty” for details.
This GDB was configured as “x86_64-redhat-linux-gnu”…
(no debugging symbols found)
Using host libthread_db library “/lib64/tls/libthread_db.so.1”.

Attaching to program: /u01/oracle/product/10.2.0/db_1/bin/oracle, process 14594
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libskgxp10.so…(no debugging symbols found)…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libskgxp10.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libhasgen10.so…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libhasgen10.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libskgxn2.so…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libskgxn2.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libocr10.so…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libocr10.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libocrb10.so…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libocrb10.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libocrutl10.so…done.
Loaded symbols for /u01/oracle/product/10.2.0/db_1/lib/libocrutl10.so
Reading symbols from /u01/oracle/product/10.2.0/db_1/lib/libjox10.so…

在gdb 提示行中 输入 print ksudss(10),即

(gdb) print ksudss(10)

之后将在udump目录中产生相关<SID>_ora_<pid>的trace文件,我们通过分析trace可以发现hang的主要原因。

trace文件示例如下:

System name:    Linux
Node name:      rh2
Release:        2.6.9-78.ELsmp
Version:        #1 SMP Wed Jul 9 15:46:26 EDT 2008
Machine:        x86_64
Instance name: webmoney
Redo thread mounted by this instance: 1
Oracle process number: 15
Unix process pid: 16996, image: oracle@rh2 (TNS V1-V3)

*** 2009-09-07 21:57:14.100
*** SERVICE NAME:(SYS$USERS) 2009-09-07 21:57:14.100
*** SESSION ID:(528.2041) 2009-09-07 21:57:14.100
===================================================
SYSTEM STATE
————
System global information:
processes: base 0x91637c30, size 500, cleanup 0x9167a2e0
allocation: free sessions 0x91779840, free calls (nil)
control alloc errors: 0 (process), 0 (session), 0 (call)
PMON latch cleanup depth: 0
seconds since PMON’s last scan for dead processes: 45
system statistics:

GDE Error: Error retrieving file - if necessary turn off error checking (404:Not Found)

关注dbDao.com的新浪微博

扫码关注dbDao.com 微信公众号:

Comments

  1. wangliang says:

    要使用gdb 外部工具, 就需要知道目前实例中后台进程的进程号;
    “目前实例中后台进程的进程号”这个指的是哪个进程,那么多oralce进程呢?

  2. 学习了

  3. edwin yao says:

    很不错,以后可以用上了.

Speak Your Mind

TEL/電話+86 13764045638
Email service@parnassusdata.com
QQ 47079569