如何诊断ORA-125XX连接问题

以下这个Action Script是我收集的在解决ORA-125XX(如ORA-12560)这类网络链接故障时的一些思路,主要包括 现有的网络配置(client & server side)、监听日志、SQLNET Client trace等信息– How to troubleshooting ORA-125** connection issues:

 

ORA-12560

 

# CHECK FIREWALL, WINDOWS FIREWALL , ANTI-Virus Software First !
ping hostname
tnsping TNS
trcroute TNS
telnet  <hostname> <port>  

tracert hostname

client side
sqlplus scott/tiger@TNS

&
server side
sqlplus scott/tiger@TNS

cat /etc/hosts
cat /etc/resolv.conf
cat /etc/nsswitch.conf
ipconfig -a
ping 127.0.0.1

$ORACLE_HOME/network/admin/sqlnet.ora
$ORACLE_HOME/network/admin/tnsnames.ora
$ORACLE_HOME/network/admin/listener.ora
$ORACLE_HOME/network/admin/endpoints_listener.ora
$ORACLE_HOME/network/log/*
sqlnet.log listener.log
/var/log/messages
/var/adm/messages
errpt -a

ls -ld $ORACLE_HOME
netstat -rn
ps -ef | grep -i tns
lsnrctl status {listener_name}
lsnrvtl services {listener_name}

ulimit -a

1. Complete database alert log.

2. If the database was not restarted from the time of last occurance of the
issue,

select * from v$resource_limit

3. RAM and SWAP configured on the server.

4. ulimit settings for oracle user:

ulimit -aS
ulimit -aH

5. Kernel parameter settings:

/etc/sysctl.conf

dblogin

show parameter cluster_database

show parameter listener

$srvctl config vip -n {nodename}

$lsnrctl status listener

agent.log and the crsd.log ..

crsd agent log and the crsd.log
$crsctl getperm resource ora.LISTENER.lsnr

sql net client trace , Client side tracing is done by adding the following syntax to the client’s sqlnet.ora file:

We will need a timestamped matching set of client/listener sqlnet traces while error is reproduced in order to find the root cause of the issue.

++ Enable client sqlnet tracing.
=======================

To do this add the following to client sqlnet.ora:

TRACE_LEVEL_CLIENT=16
TRACE_UNIQUE_CLIENT=TRUE
TRACE_DIRECTORY_CLIENT=path
TRACE_FILE_CLIENT=client
TRACE_TIMESTAMP_CLIENT=ON

replace path with a local directory for the trace files. (for example c:\temp)
Do a test connection from the problematic client and check if the trace files are created.
Upload the traces containing the error to me on metalink.

++ Enable listener sqlnet tracing.
==========================
To do this edit the listener.ora and add,

TRACE_LEVEL_{listener name}=16
TRACE_TIMESTAMP_{listener name}=TRUE
TRACE_DIRECTORY_{listener name}=/tmp {– this can be any directory other than a top level directory like / or c:\

Replace {listener name} with the name of the listener. For example if your listener was called LISTENER then TRACE_LEVEL_LISTENER=16

You need to restrict the amount of disk space used by the tracing then you must also set,

TRACE_FILELEN_{listener name}=500000 {– size of the files in K
TRACE_FILENO_{listener name}=10 {– number of files

This will limit the traces to 10 files of around 500Mb, so 5000Mb in total. When the 10th file is full it will reuse file number one.
You will need to stop/start the listener for this to take effect.
When the problem reproduces please can you upload the listener trace and the listener log.

Trace_level_client=16
Trace_directory_client={path_to_the_trace_directory} # use the full path to the trace directory
Trace_unique_client=on
Trace_timestamp_client=on
Diag_adr_enabled=off

trace Local listener or SCAN listeners
TRACE_LEVEL_{listener_name}= 16
TRACE_TIMESTAMP_{listener_name}=on
TRACE_DIRECTORY_{listener_name}={path_to_the_trace_directory}

truss -o /tmp/lisener.out -fae lsnrctl start {listener_name}

Some Useful Note:

Note.444705.1 TroubleShooting Guide For ORA-12514 TNS listener could not resolve SERVICE_NAME given in connect descriptor
Note.761740.1 Technicians Unable To Receive Orders While MWM Components Display ODBC Errors And Are Connected
Note.119007.1 ORA-12560: Administering the Listener on UNIX – Troubleshooting
Note 276812.1 TNS-12542 Error When Executing Batch Jobs or in High Transaction Environment
Note.219208.1 Ext/Pub Client Connection via Connect Manager Fails with TNS-12564
Note.393941.1 Ext/Mod ORA-12564 Reported When Using 10g Connection Manager
Note.1116960.1 ORA-609 TNS-12537 and TNS-12547 in 11g Alert.log
Note.550859.1 Abstract TROUBLESHOOTING GUIDE TNS-12518 TNS listener could not hand off client connection.
Note.207303.1 Client / Server / Interoperability Support Between Different Oracle Versions
Note.119706.1 Troubleshooting Guide TNS-12535 or ORA-12535 or ORA-12170 Errors

For database links between different Oracle versions connections must be supported in BOTH directions in the matrix found in Note 207303.1
eg: As 9.2 -} 7.3.4 is not supported then database links between these version
are not supported in either direction.
You are trying to connect two versions (client-server) that are not certified (as confirmed by Note 207303.1) and between which exist many technical incompatibilities.

CLIENT — LISTENER — SERVER RESULT

8 11.1 8 OK
9 11.1 9 OK
10 11.1 10 OK
11 11.1 11 OK

8 11.2 8 FAILS
9 11.2 9 OK
10 11.2 10 OK
11 11.2 11 OK

9 11.1 8 OK
10 11.1 8 OK
11 11.1 8 OK

9 11.2 8 FAILS
10 11.2 8 FAILS
11 11.2 8 FAILS

The relevant relationship that appears to be at issue is LISTENER and DATABASE. Client version is not a factor.

But if the ultimate outcome is that the 11.2 (11gR2) LISTENER is indicated (though I still haven’t seen documentation of this) as not compatible with use on a ORACLE 8i (8.1.7.0) DATABASE, then we’ll capture that here and move on. I would, however, like to see some evidence of this, if it is available. I can find notes in the KB about 10gR2’s listener not supporting 8i database, and I can find notes about 11gR1 having resolved that regression. But I can find nothing regarding listener/database compatibility that mentions 11gR2, that would explain our results.
Clients should be complied with Servers , For Sever 11.2 the only supported clients are 11.2.0 , 11.1.0 , 10.2.0 : 10g end MUST be at 10.2.0.2 (or higher) respectively in order to use PLSQL between those versions. See Note:4511371.8 for more details and finally 10.1.0.5 only with extended support .

On the other Side in order to connect from listener to DB server in a supported way , Listener version should be greater than or equal to the server version .

Note 207303.1 should still be followed.

Poor Data Export Performance(revised)

这是一个十分简单的单表导出作业,表上大约有200万条数据;但实际exp导出时dump文件每秒增长不足1MB,速度十分缓慢。

SQL> show user;
USER is "SH"
SQL> select count(*) from orders;
COUNT(*)
----------
2319232
SQL> select bytes/1024/1024 from dba_segments where owner='SH' and segment_name='ORDERS';
BYTES/1024/1024
---------------
261
/*该表大小为261MB*/
/*这里我们使用直接路径导出*/
[maclean@rh2 ~]$ exp sh/sh file=/s01/orders.dmp direct=y tables=orders
Export: Release 11.2.0.1.0 - Production on Wed Aug 18 21:52:56 2010
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses WE8MSWIN1252 character set (possible charset conversion)
About to export specified tables via Direct Path ...
. . exporting table                         ORDERS
/* 我们通过脚本观察导出文件大小增长速度,每秒大约0.4MB */
[maclean@rh2 s01]$ while (1>0) do ls -lh orders.dmp ; sleep 1; done
-rw-r--r-- 1 maclean oinstall 912K Aug 18 21:54 orders.dmp
-rw-r--r-- 1 maclean oinstall 1.4M Aug 18 21:54 orders.dmp
-rw-r--r-- 1 maclean oinstall 1.8M Aug 18 21:54 orders.dmp
-rw-r--r-- 1 maclean oinstall 2.2M Aug 18 21:54 orders.dmp
-rw-r--r-- 1 maclean oinstall 2.6M Aug 18 21:54 orders.dmp
-rw-r--r-- 1 maclean oinstall 3.0M Aug 18 21:54 orders.dmp
-rw-r--r-- 1 maclean oinstall 3.5M Aug 18 21:54 orders.dmp
-rw-r--r-- 1 maclean oinstall 3.9M Aug 18 21:54 orders.dmp
-rw-r--r-- 1 maclean oinstall 4.3M Aug 18 21:54 orders.dmp
-rw-r--r-- 1 maclean oinstall 4.7M Aug 18 21:54 orders.dmp
-rw-r--r-- 1 maclean oinstall 5.1M Aug 18 21:54 orders.dmp
/*是什么导致了exp如此缓慢呢,有必看一下该导出服务进程的trace文件*/
SQL> select spid from v$process where addr=(select paddr from v$session where module like 'exp@%');
SPID
------------------------
22624
SQL> oradebug setospid 22624;
Oracle pid: 37, Unix process pid: 22624, image: oracle@rh2 (TNS V1-V3)
SQL> oradebug tracefile_name;
/rdbms/sprod/SPROD/trace/SPROD_ora_22624.trc
SQL> host
[maclean@rh2 trace]$ ls -lh SPROD_ora_22624.trc
-rw-r----- 1 maclean oinstall 1.2G Aug 18 21:58 SPROD_ora_22624.trc
/* 这个trace文件居然有1.2G之巨,会存放了些什么呢?*/
[maclean@rh2 trace]$ tail -100 SPROD_ora_22624.trc
2010-08-18 21:59:14.748194 : nsbasic_bsd:41 20 43 4C 41 53 53 07  |A.CLASS.|
2010-08-18 21:59:14.748201 : nsbasic_bsd:00 78 6D 08 0F 01 1A 2E  |.xm.....|
2010-08-18 21:59:14.748212 : nsbasic_bsd:07 00 78 6D 08 0F 01 1A  |..xm....|
2010-08-18 21:59:14.748219 : nsbasic_bsd:2E 13 00 32 30 30 39 2D  |...2009-|
2010-08-18 21:59:14.748225 : nsbasic_bsd:30 38 2D 31 35 3A 30 30  |08-15:00|
2010-08-18 21:59:14.748232 : nsbasic_bsd:3A 32 35 3A 34 35 05 00  |:25:45..|
2010-08-18 21:59:14.748239 : nsbasic_bsd:43 48 45 43 4B 01 00 4E  |CHECK..N|
2010-08-18 21:59:14.748246 : nsbasic_bsd:01 00 4E 01 00 4E 02 00  |..N..N..|
2010-08-18 21:59:14.748253 : nsbasic_bsd:C1 02 FE FF 06 00 50 55  |......PU|
2010-08-18 21:59:14.748260 : nsbasic_bsd:42 4C 49 43 15 00 2F 39  |BLIC../9|
/*绝大多数是Oracle sqlnet trace的内容,是不是因为设置了Oracle SqlNet端的trace选项,从而导致了导出服务进程写出大量sqlnet trace,最终引发exp的缓慢?*/
[maclean@rh2 trace]$ cat $ORACLE_HOME/network/admin/sqlnet.ora
TRACE_LEVEL_SERVER = 16
TRACE_FILE_SERVER = SERVER
TRACE_TIMESTAMP_ SERVER = ON
TRACE_UNIQUE_SERVER = ON
/*的确启用了server端sqlnet trace*/
[maclean@rh2 trace]$ echo "" >  /s01/11gdb/network/admin/sqlnet.ora
/*清空sqlnet.ora配置文件,禁用server端sqlnet trace*/
/*再次尝试exp导出*/
[maclean@rh2 ~]$ exp sh/sh file=/s01/orders.dmp direct=y tables=orders
Export: Release 11.2.0.1.0 - Production on Wed Aug 18 22:00:35 2010
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses WE8MSWIN1252 character set (possible charset conversion)
About to export specified tables via Direct Path ...
. . exporting table                         ORDERS
[maclean@rh2 s01]$ while (1>0) do ls -lh orders.dmp ; sleep 1; done
-rw-r--r-- 1 maclean oinstall 61M Aug 18 22:00 orders.dmp
-rw-r--r-- 1 maclean oinstall 94M Aug 18 22:00 orders.dmp
-rw-r--r-- 1 maclean oinstall 108M Aug 18 22:00 orders.dmp
-rw-r--r-- 1 maclean oinstall 140M Aug 18 22:00 orders.dmp
-rw-r--r-- 1 maclean oinstall 162M Aug 18 22:00 orders.dmp
-rw-r--r-- 1 maclean oinstall 181M Aug 18 22:00 orders.dmp
/*dump文件每秒增长20MB左右,exp导出速度恢复正常*/
/*验证了之前设置server端16级的sqlnet trace会引起exp性能下降的猜测*/
/*如果是expdp数据泵导出工具是否会受到该sqlnet trace的影响呢*/
[maclean@rh2 s01]$ echo "TRACE_LEVEL_SERVER = 16
> TRACE_FILE_SERVER = SERVER
> TRACE_TIMESTAMP_ SERVER = ON
> TRACE_UNIQUE_SERVER = ON" > $ORACLE_HOME/network/admin/sqlnet.ora
[maclean@rh2 dump]$ expdp sh/sh directory=ordump dumpfile=orders.dmp tables=orders
Export: Release 11.2.0.1.0 - Production on Wed Aug 18 22:05:17 2010
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "SH"."SYS_EXPORT_TABLE_01":  sh/******** directory=ordump dumpfile=orders.dmp tables=orders
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 261 MB
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
. . exported "SH"."ORDERS"                               223.8 MB 2319232 rows
Master table "SH"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SH.SYS_EXPORT_TABLE_01 is:
/s01/dump/orders.dmp
Job "SH"."SYS_EXPORT_TABLE_01" successfully completed at 22:05:29
/* expdp数据泵在12s内完成了导出工作,可见其并不受到sqlnet trace的不良影响*/
/* 如果我们降低Sqlnet trace的级别,是否可以降低其对exp导出性能的影响呢?*/
/*尝试将trace level改成1*/
[maclean@rh2 s01]$ echo "TRACE_LEVEL_SERVER = 1
> TRACE_FILE_SERVER = SERVER
> TRACE_TIMESTAMP_ SERVER = ON
> TRACE_UNIQUE_SERVER = ON"   >  $ORACLE_HOME/network/admin/sqlnet.ora
maclean@rh2 ~]$ exp sh/sh file=/s01/orders.dmp direct=y tables=orders
Export: Release 11.2.0.1.0 - Production on Wed Aug 18 22:11:57 2010
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses WE8MSWIN1252 character set (possible charset conversion)
About to export specified tables via Direct Path ...
. . exporting table                         ORDERS    2319232 rows exported
EXP-00091: Exporting questionable statistics.
Export terminated successfully with warnings.
[maclean@rh2 s01]$ while (1>0) do ls -lh orders.dmp ; sleep 1; done
-rw-r--r-- 1 maclean oinstall 0 Aug 18 22:11 orders.dmp
-rw-r--r-- 1 maclean oinstall 33M Aug 18 22:11 orders.dmp
-rw-r--r-- 1 maclean oinstall 74M Aug 18 22:11 orders.dmp
-rw-r--r-- 1 maclean oinstall 108M Aug 18 22:12 orders.dmp
-rw-r--r-- 1 maclean oinstall 146M Aug 18 22:12 orders.dmp
-rw-r--r-- 1 maclean oinstall 183M Aug 18 22:12 orders.dmp
/* 导出速度没有明显下降,较低级别的server端sqlnet trace不会对exp导出性能造成影响*/

这个例子告诉我们在使用高级别的sqlnet trace诊断Oracle网络问题后,一定要记得要还原现场的配置,否则可能造成”莫名的性能问题”。

如何跟踪Oracle动态服务注册

如何trace Oracle PMON进程动态注册过程?这个问题我想到2个答案,对PMON做event trace或者采用Oracle Network Server因该都可以达到目的。
让我们来实践一下!

Oracle Network Server Trace模式

1. 启用Oracle SqlNet服务器端trace,这需要我们修改sqlnet.ora配置文件

[maclean@rh2 ~]$ echo "TRACE_LEVEL_SERVER = 16
> TRACE_FILE_SERVER = SERVER
> TRACE_DIRECTORY_SERVER= /home/maclean/ntrc"  > $ORACLE_HOME/network/admin/sqlnet.ora
[maclean@rh2 ~]$ cat $ORACLE_HOME/network/admin/sqlnet.ora
TRACE_LEVEL_SERVER = 16
TRACE_FILE_SERVER = SERVER
TRACE_DIRECTORY_SERVER= /home/maclean/ntrc

2. 触发trace

SQL> conn / as sysdba
Connected.
SQL>  select spid from V$process, V$session where audsid=userenv('SESSIONID') and paddr=addr;
SPID
------------
4264
SQL> alter system register;
System altered.
/*在之前指定的TRACE_DIRECTORY_SERVER目录下将出现形如server_$spid.trc的trace文件*/
[maclean@rh2 ntrc]$ cat server_4290.trc|grep nsprecv|grep -A 2 -B 2 reg
[20-AUG-2010 10:42:53:896] nsprecv: 00 00 00 00 00 15 61 6C  |......al|
[20-AUG-2010 10:42:53:896] nsprecv: 74 65 72 20 73 79 73 74  |ter.syst|
[20-AUG-2010 10:42:53:896] nsprecv: 65 6D 20 72 65 67 69 73  |em.regis|
[20-AUG-2010 10:42:53:896] nsprecv: 74 65 72 01 00 00 00 01  |ter.....|
[20-AUG-2010 10:42:53:896] nsprecv: 00 00 00 00 00 00 00 00  |........|
/*分析该trace文件后可以匹配到以上动态注册语句*/
/*记得将server端sqlnet trace设置disable掉*/
[maclean@rh2 ntrc]$ echo "" > $ORACLE_HOME/network/admin/sqlnet.ora

PMON TRACE模式
1.找出PMON的进程号

SQL> select spid ,program from v$process where program like '%PMON%';
SPID         PROGRAM
------------ ------------------------------------------------
4050         oracle@rh2 (PMON)

2.执行10257事件trace

SQL> oradebug setospid 4050;
Oracle pid: 2, Unix process pid: 4050, image: oracle@rh2 (PMON)
SQL> oradebug event 10257 trace name context forever,level 16;
Statement processed.
SQL> alter system register;
System altered.
SQL> oradebug event 10257 trace name context off;
Statement processed.
SQL> oradebug tracefile_name;
/s01/10gdb/admin/YOUYUS/bdump/youyus_pmon_4050.trc
cat /s01/10gdb/admin/YOUYUS/bdump/youyus_pmon_4050.trc
.......................
kmmlrl: register now
kmmgdnu: cXDB
goodness=0, delta=1,
flags=0x5:unblocked/not overloaded, update=0x6:G/D/-
kmmgdnu: YOUYUS
goodness=0, delta=1,
flags=0x4:unblocked/not overloaded, update=0x6:G/D/-
kmmlrl: nsgr update returned 0
kmmlrl: nsgr register returned 0
/*可以看到注册了YOUYUS和cXDB2个service的过程*/

若无法正常注册可以按照以下步骤检查:
1. 检查Oracle Net命名方式是否正确设置,该参数由sqlnet.ora配置文件中的NAMES.DIRECTORY_PATH指定。
2. 确认SQL NET配置文件如sqlnet.ora,tnsnames.ora等的位置正确配载。
3. 检查是否设置了TNS_ADMIN环境变量,该变量将影响配置文件的正确路径。
4. 确认LOCAL_LISTENER或REMOTE_LISTENER中指定的service可以被正常tnsping通,若以上参数未指定值则尝试步骤6。
5. 确认主机名可以被正确解析为ip地址,如以下示例:

C:\Users\weisly>nslookup www.oracle.com
服务器:  dir-605
Address:  192.168.0.1
非权威应答:
名称:    a398.g.akamai.net.0.1.cn.akamaitech.net
Addresses:  117.104.136.98
117.104.136.128
Aliases:  www.oracle.com
www.oracle.com.edgesuite.net
a398.g.akamai.net

6. 修改LOCAL_LISTENER为非服务别名解析的形式,这样可以排除配置文件可能造成的影响,从而确认问题是否由配置文件设置不当而引起:

SQL> conn / as sysdba
Connected.
SQL> alter system set local_listener='(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=rh2)(PORT=1521)))';
System altered.
SQL> alter system set REMOTE_LISTENER=' (ADDRESS_LIST =(ADDRESS = (PROTOCOL = TCP)(HOST = rh1)(PORT = 1521))(ADDRESS = (PROTOCOL = TCP)(HOST = rh2)(PORT = 1521)))';
System altered.

7. 将主机名替换为ip地址并测试是否解决问题。
8. 以IPC协议替代常见的TCP协议测试,绕过TCP和主机名等因素可能造成的问题,以确认问题是否由这些因素引起:

SQL> alter system set local_listener='(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=KEY1)))';
System altered.
/*其中KEY值需要和listener.ora中指定的一致。*/
SQL> host
[maclean@rh2 admin]$ lsnrctl status KEY1
LSNRCTL for Linux: Version 10.2.0.4.0 - Production on 20-AUG-2010 11:30:06
Copyright (c) 1991, 2007, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=KEY1)))
STATUS of the LISTENER
------------------------
Alias                     KEY1
Version                   TNSLSNR for Linux: Version 10.2.0.4.0 - Production
Start Date                20-AUG-2010 11:29:15
Uptime                    0 days 0 hr. 0 min. 50 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /s01/10gdb/network/admin/listener.ora
Listener Log File         /s01/10gdb/network/log/key1.log
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=KEY1)))
Services Summary...
Service "YOUYUS" has 1 instance(s).
Instance "YOUYUS", status READY, has 1 handler(s) for this service...
Service "YOUYUS_XPT" has 1 instance(s).
Instance "YOUYUS", status READY, has 1 handler(s) for this service...
Service "cXDB" has 1 instance(s).
Instance "YOUYUS", status READY, has 10 handler(s) for this service...
The command completed successfully

实在不行的话,我们还是提SR吧!

沪ICP备14014813号

沪公网安备 31010802001379号