Quickly Deploying RDA (Remote Diagnostic Agent)

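RDA (Remote Diagnostic Agent) is one of the standard tools used by Oracle Support. When you file an SR (TAR) on My Oracle Support (MOS, formerly Metalink), Oracle GCS (Global Customer Service) may ask you to download RDA from MOS and use it to collect detailed information about the database environment (OS, DB, CRS, and so on). Support engineers can then pull diagnostic data directly from the RDA report, which avoids the time lost to repeated back-and-forth caused by insufficient information. Oracle ACS (Advanced Customer Services) also uses RDA during on-site engagements; for example, when a customer asks ACS to perform a monthly or quarterly on-site health check, RDA is the standard inspection tool.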

 

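If all that still leaves you unsure what RDA is, don't worry. Think of RDA as a toolbox of scripts, organized into many modules defined by Oracle, that collects the various pieces of information needed to diagnose complex Oracle product problems. The latest RDA for each platform can be downloaded from the My Oracle Support (Metalink) note "Remote Diagnostic Agent (RDA) 4 – Getting Started [ID 314422.1]".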

 

 

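Although RDA is a tool used by Oracle Support, that does not mean people outside Oracle cannot read its output or make use of it. Everything in RDA, from the scripts and modules to the final report, is human-readable. I (Maclean Liu) use RDA myself when diagnosing more complex problems, and I even recommend it for database inspections and health checks, alongside other tools of course.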

 

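Before using RDA we need to configure it. Configuration means choosing which RDA modules to use and setting some temporary settings. Let's first see which modules are available: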

 

Unzip the downloaded RDA zip package:

[oracle@vrh8 ~]$ unzip /tmp/p9079828_418_LINUX.zip 

[oracle@vrh8 ~]$ cd rda

[oracle@vrh8 rda]$ ./rda.sh -h
Usage: rda.pl [-bcdflntvwxy] [-ABCDEHIKLMPQRSTV] [-e list] [-m dir]
              [-s name] [-o out] [-p prof] arg ...
        -A      Authentify user through the setup file
        -B      Start background collection
        -C      Collect diagnostic information
        -D      Delete specified modules from the setup
        -E      Explain specified error numbers
        -H      Halt background collection
        -I      Regenerate the index
        -K      Kill background collection
        -L      List the modules available
        -M      Display the related manual pages
        -O      Render output specifications from STDIN
        -P      Package the reports (tar or zip)
        -Q      Display the related setup questions
        -R      Generate specified reports
        -S      Setup specified modules
        -T      Execute test modules
        -V      Display component version numbers
        -b      Don't backup setup file before saving
        -c      Check the RDA installation and exit
        -d      Set debug mode
        -e list Specify a list of alternate setting definitions (var=val,...)
        -f      Set force mode
        -h      Display the command usage and exit
        -l      Use a lock file to prevent concurrent usage of a setup file
        -m dir  Specify the module directory ('modules' by default)
        -n      Start a new data collection
        -o out  Specify the file for background collection output redirection
        -p prof Specify the setup profile ('Default' by default)
        -q      Set quiet mode
        -s name Specify the setup name ('setup' by default)
        -t      Set trace mode
        -v      Set verbose mode
        -w      Wait as long as the background collection daemon is active
        -x      Produce module cross reference
        -y      Accept all defaults and skip all pauses

List all available modules:

[oracle@vrh8 rda]$ ./rda.sh -L Module

Available data collection modules are:
  ACFS     Collects ASM Cluster File System Information
  ACT      Collects Oracle E-Business Suite Application Information
  ADBA     Collects ACS Oracle Database Assessment
  ADX      Collects AutoConfig and Rapid Clone Information
  AGT      Collects Enterprise Manager Agent Information
  APEX     Collects APEX Information
  ASAP     Collects Oracle Communications ASAP Information
  ASBR     Collects Application Server Backup and Recovery Information
  ASG      Collects Application Server Guard Information
  ASIT     Collects Oracle Application Server Installation Information
  ASM      Collects Automatic Storage Management Information
  B2B      Collects Oracle Business to Business Information
  BAM      Collects Business Activity Monitoring Information
  BEE      Collects Beehive Information
  BI       Collects Oracle Business Intelligence Enterprise Edition Info.
  BPEL     Collects Oracle BPEL Process Manager Information
  BR       Collects Database Backup and Recovery Information
  BRM      Collects Oracle Communications BRM Information
  CCR      Collects OCM Diagnostic Information
  CFG      Collects Key Configuration Information
  COHR     Collects Oracle Coherence Information
  CONT     Collects Oracle Content Services Information
  CRID     Collects Oracle Access Manager (COREid) Information
  D2PC     Collects Distributed Transaction Information
  DB       Controls RDBMS Data Collection
  DBA      Collects RDBMS Information
  DBC      Collects Database Control Information
  DBM      Collects RDBMS Memory Information
  DEV      Collects Oracle Developer Information
  DG       Collects Data Guard Information
  DNFS     Collects Direct NFS Information
  DSCS     Collects Discussions Information
  DSCV     Collects Oracle Discoverer Information
  ECM      Controls Oracle Enterprise Content Management 11g Data Collection
  EM       Collects Enterprise Manager OMS and Repository Info (Obsolete)
  END      Finalizes the Data Collection
  EPMA     Collects Enterprise Performance Management Architect Information
  ESB      Collects Enterprise Service Bus Information
  ESS      Collects Oracle Essbase Information
  ESSO     Collects Oracle Enterprise Single Sign-On Information
  EXA      Collects Exadata Information
  FLTR     Controls Report Content Filtering
  GRDN     Collects Oracle Guardian Information
  GRID     Controls Grid Control Data Collection
  GTW      Collects Transparent/Procedural Gateway Information
  HFM      Collects Oracle Hyperion Financial Management information
  HPL      Collects Oracle Hyperion Planning Information
  IA       Collects Intelligent Agent Information
  IAS      Collects Web Server Information
  IFS      Collects iFS (iFS, CMSDK, Files) Information
  INI      Initializes the Data Collection
  INST     Collects the Oracle Installation Information
  IPSA     Collects Oracle Communications IP Service Activator Information
  J2EE     Collects J2EE/OC4J Information
  JDBC     Collects Oracle Java DB Connectivity (JDBC) Information
  JDEV     Collects Oracle JDeveloper Information
  JIVE     Collects Jive Information
  LANG     Collects Oracle Language Information
  LOAD     Produces the External Collection Reports
  LOG      Collects Database Trace and Log Files
  MAIL     Collects Oracle Collaboration Suite Mail Information
  MSLG     Collects Microsoft Languages Information
  ND       Collects Oracle Communications Network Discovery Information
  NET      Collects Network Information
  NM       Collects Oracle Communications Network Mediation Information
  NPRF     Samples Performance Information (root not required)
  OCAL     Collects Oracle Calendar Information
  OCFS     Collects Oracle Cluster File System Information
  OCM      Setting up Configuration Manager Interface
  OCS      Controls Oracle Collaboration Suite Data Collection
  ODI      Collects Oracle Data Integrator Information
  ODM      Collects Oracle Data Mining Information
  OES      Collects Oracle Express Server Information
  OID      Collects Oracle Internet Directory Information
  OIM      Collects Oracle Identity Manager Information
  OLAP     Collects OLAP Information
  OMM      Collects Oracle Multimedia or Oracle interMedia Information
  OMS      Collects Oracle Management Server Information (obsolete)
  ONET     Collects Oracle Net Information
  OS       Collects the Operating System Information
  OVD      Collects Oracle Virtual Directory Information
  OVMM     Collects Oracle VM Manager Information
  OVMS     Collects Oracle VM Server Information
  OWB      Collects Oracle Warehouse Builder Information
  OWSM     Collects Oracle Web Services Manager Information
  PDA      Collects Oracle Portal Information
  PDBA     Collects PeopleSoft Information from an Oracle Database
  PERF     Collects Performance Information
  PLNC     Collects Oracle PL/SQL Native Compilation Information
  PROF     Collects the User Profile
  PS       Collects Oracle Communications Policy Services Information
  PWEB     Collects PeopleSoft Information from Web Application Server
  RAC      Collects Cluster Information
  RACD     Performs a Database Hang Analysis
  RDSP     Produces the Remote Data Collection Reports
  RET      Collects Oracle Retail Information
  REXE     Performs the Remote Data Collections
  RPRF     Samples Performance Information (root privileges required)
  RSRC     Collects Database Resource Manager Information
  RTC      Collects Real Time Communication Information
  SEBL     Collects Siebel Information
  SES      Collects Oracle Secure Enterprise Search Information
  SMPL     Controls Sampling
  SOA      Collects Oracle SOA Suite Information
  SP       Collects SQL*Plus/iSQL*Plus Information
  SSO      Collects Single Sign-On Information
  STC      Collects Streams Configuration Information
  STM      Collects Streams Monitoring Information
  TOPL     Collects Oracle TopLink Information
  TTEN     Collects Oracle TimesTen In-Memory Database Information
  UCM      Collects Oracle Universal Content Management Information
  UOA      Collects Oracle Universal Online Archive 11g Information
  WAC      Collects Web Access Client Information
  WCI      Collects Oracle WebCenter Information
  WEBC     Collects Oracle Web Cache Information
  WKSP     Collects Workspaces Information
  WLS      Collects Oracle WebLogic Server Information
  WMC      Collects Webmail Client Information
  WRLS     Collects Wireless Information
  XDB      Collects XDB Information
  XSMP     Samples User Defined Data
  XTRA     Collects User Defined Data

 

 

In the list above, for example, the RAC module collects cluster information, while the RACD module performs a database hang analysis for a RAC database.

 

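When configuring RDA we can simply run ./rda.sh, and the script will prompt us to choose which modules to enable. But because there are so many modules, the whole configuration process wastes a lot of time.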

 

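To avoid laboriously confirming all those modules every time a new environment is configured, RDA defines profiles for many typical scenarios. Each profile comes with a fixed set of modules preselected. Let's look at the RDA profiles: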

 

 

List all currently available profiles:

[oracle@vrh8 rda]$ ./rda.sh -L profiles
Available profiles are:
  9iAS               Oracle Application Server 9i problems
  AS10g              Oracle Application Server 10g problems
  AS10g_Identity     Oracle Identity Management 10g problems
  AS10g_MidTier      Oracle Application Server 10g Middle Tier problems
  AS10g_Repository   Oracle Application Server 10g metadata repository problems
  AS10g_WebTier      Oracle Application Server 10g WebTier problems
  AS_BackupRecovery  Oracle Application Server backup/recovery problems
  Act                Oracle Application Overview
  AppsCheck          Equivalent to AppsCheck
  AsmFileSystem      Oracle ASM Cluster File System problems
  Bam                Business Activity Monitoring problems
  Beehive            Oracle Beehive problems
  DB10g              Oracle Database 10g problems
  DB11g              Oracle Database 11g problems
  DB8i               Oracle Database 8i problems
  DB9i               Oracle Database 9i problems
  DB_Assessment      Oracle Database assessment collections
  DB_BackupRecovery  Oracle Database backup and recovery problems
  DB_Perf            Oracle Database performance problems
  DataGuard          Data Guard problems
  DirectNFS          Direct NFS problems
  Discoverer10g      Oracle Discoverer 10g problems
  Discoverer11g      Oracle Discoverer 11g problems
  EnterpriseSearch   Oracle Secure Enterprise Search problems
  Essbase            Oracle Essbase problems
  FM11g_Bi           Business Intelligence Enterprise Edition 11g problems
  FM11g_Ecm          Oracle Enterprise Content Management 11g problems
  FM11g_Forms        Oracle Forms 11g problems
  FM11g_Identity     Oracle Identity Management 11g problems
  FM11g_Odi          Oracle Data Integrator Standalone 11g problems
  FM11g_Portal       Oracle Portal 11g problems
  FM11g_Reports      Oracle Reports 11g problems
  FM11g_Soa          Oracle SOA Suite 11g problems
  FM11g_WebTier      Oracle Fusion Middleware 11g Web Tier problems
  FM11g_WlsBi        Business Intelligence Enterprise Edition 11g with WLS
  FM11g_WlsForms     Oracle Forms 11g with WLS problems
  FM11g_WlsIdentity  Oracle Identity Management 11g with WLS problems
  FM11g_WlsOdi       Oracle Data Integrator Suite 11g with WLS problems
  FM11g_WlsPortal    Oracle Portal 11g with WLS problems
  FM11g_WlsReports   Oracle Reports 11g with WLS problems
  FM11g_WlsWebTier   Oracle Fusion Middleware 11g Web Tier with WLS problems
  FinManagement      Oracle Hyperion Financial Management problems
  GridControl        Grid Control problems
  InterMedia         Oracle interMedia problems
  Linux              Linux problems
  LinuxPerf          Linux performance problems
  Maa_Assessment     Maximum Availability Architecture assessment collections
  Multimedia         Oracle Multimedia problems
  OSMonitor          Operating System performance sampling
  OVMManager         Oracle VM Manager problems
  Pda10g             Portal 10g problems
  Pda11g             Portal 11g problems
  Pda9i              Portal 9i problems
  PeopleSoft_DB      PeopleSoft Oracle Database tier assessment collections
  PeopleSoft_Web     PeopleSoft Web application server assessment collections
  Rac                Real Application Cluster problems
  Rac_AdvancedAsm    Cluster with ASM problems (ASM advanced mode)
  Rac_Asm            Cluster with ASM problems
  Rac_Assessment     Real Application Cluster assessment collections
  Rac_Perf           Cluster performance problems
  Retail             Oracle Retail problems
  Security           Filter sensitive information from the reports
  SupportInformer70  Oracle Communication BRM 7.0 problems
  SupportInformer72  Oracle Communication BRM 7.2 problems
  SupportInformer73  Oracle Communication BRM 7.3 problems
  SupportInformer74  Oracle Communication BRM 7.4 problems
  TimesTen           Oracle TimesTen problems
  TopLink10g         Oracle TopLink 10g problems
  WebCenter10g       Oracle WebCenter 10g problems
  WebCenter11g       Oracle WebCenter 11g problems
  WebCenterCont10g   Oracle WebCenter 10g with Oracle Content Services problems
  WebLogicServer     Oracle WebLogic Server problems

 

 

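The listing above shows the profiles that ship by default with this version of RDA. For example, the DB11g profile collects diagnostic information for an 11g database, DB10g does the same for a 10g database, and DB_Perf collects database performance diagnostics.

We can check exactly which modules each profile preselects: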

 

 

[oracle@vrh8 rda]$ ./rda.sh -M -p DB11g
NAME
    Profile DB11g - Oracle Database 11g problems

MODULES
    The DB11g profile uses the following modules:
      OS        Collects the Operating System Information
      PROF      Collects the User Profile
      PERF      Collects Performance Information
      NET       Collects Network Information
      ONET      Collects Oracle Net Information
      INST      Collects the Oracle Installation Information
      DB        Controls RDBMS Data Collection
      DBA       Collects RDBMS Information
      DBM       Collects RDBMS Memory Information
      LOG       Collects Database Trace and Log Files
      DNFS      Collects Direct NFS Information
      SP        Collects SQL*Plus/iSQL*Plus Information
      GRID      Controls Grid Control Data Collection
      AGT       Collects Enterprise Manager Agent Information
      DBC       Collects Database Control Information

[oracle@vrh8 rda]$ ./rda.sh -M -p DB10g
NAME
    Profile DB10g - Oracle Database 10g problems

MODULES
    The DB10g profile uses the following modules:
      OS        Collects the Operating System Information
      PROF      Collects the User Profile
      PERF      Collects Performance Information
      NET       Collects Network Information
      ONET      Collects Oracle Net Information
      INST      Collects the Oracle Installation Information
      DB        Controls RDBMS Data Collection
      DBA       Collects RDBMS Information
      DBM       Collects RDBMS Memory Information
      LOG       Collects Database Trace and Log Files
      SP        Collects SQL*Plus/iSQL*Plus Information
      GRID      Controls Grid Control Data Collection
      AGT       Collects Enterprise Manager Agent Information
      DBC       Collects Database Control Information
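Comparing two profiles by eye is tedious. As a rough sketch, the module names can be pulled out of saved -M output with awk and then compared. The function name profile_modules and the inline sample text are illustrative assumptions, not part of RDA; in practice you would pipe the real output of ./rda.sh -M -p <profile> into it.

```shell
#!/bin/sh
# profile_modules: hypothetical helper (not part of RDA) that reads
# "rda.sh -M -p <profile>" output on stdin and prints one module name per line.
profile_modules() {
    LC_ALL=C awk '
        /^MODULES/  { in_modules = 1; next }   # start of the module list
        /^SETTINGS/ { in_modules = 0 }         # the list ends at SETTINGS
        # Module rows start with an upper-case name followed by a description.
        in_modules && NF > 1 && $1 ~ /^[A-Z0-9]+$/ { print $1 }
    '
}

# Abbreviated sample of the -M output shown above (not a full listing).
sample='NAME
    Profile DB11g - Oracle Database 11g problems

MODULES
    The DB11g profile uses the following modules:
      OS        Collects the Operating System Information
      LOG       Collects Database Trace and Log Files
      DNFS      Collects Direct NFS Information'

printf '%s\n' "$sample" | profile_modules
```

Saving the result for two profiles and running diff on the files shows how they differ. In the two listings above, DNFS is the only module that DB11g adds over DB10g.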

 

 

Besides modules, a profile may also define temporary settings, such as force_onet_tests (whether to force the Oracle Net tests). Adding the -f option (set force mode) lists these temporary settings:

 

 

[oracle@vrh8 rda]$ ./rda.sh -fM -p DB10g
NAME
    Profile DB10g - Oracle Database 10g problems

MODULES
    The DB10g profile uses the following modules:
      OS        Collects the Operating System Information
      PROF      Collects the User Profile
      PERF      Collects Performance Information
      NET       Collects Network Information
      ONET      Collects Oracle Net Information
      INST      Collects the Oracle Installation Information
      DB        Controls RDBMS Data Collection
      DBA       Collects RDBMS Information
      DBM       Collects RDBMS Memory Information
      LOG       Collects Database Trace and Log Files
      SP        Collects SQL*Plus/iSQL*Plus Information
      GRID      Controls Grid Control Data Collection
      AGT       Collects Enterprise Manager Agent Information
      DBC       Collects Database Control Information

SETTINGS
    The DB10g profile sets the following temporary settings:
      force_db_tests=1
      force_dba_tests=1
      force_dbm_tests=1
      force_log_tests=1
      force_onet_tests=1

 

 

We can also list the modules of all predefined profiles:

 

 

[oracle@vrh8 rda]$ ./rda.sh -xv profiles
Treating profiles ...
Profile Cross Reference

Defined Profiles:
  9iAS               S100OS, S105PROF, S110PERF, S120NET, S130INST, S300IAS,
                     S305ASBR, S306ASG, S310J2EE, S330SSO, S340OID, S350WEBC
  AS10g              S100OS, S105PROF, S110PERF, S120NET, S130INST, S300IAS,
                     S305ASBR, S306ASG, S310J2EE, S330SSO, S340OID, S350WEBC
  AS10g_Identity     S100OS, S105PROF, S110PERF, S120NET, S130INST, S300IAS,
                     S305ASBR, S306ASG, S310J2EE, S330SSO, S340OID, S342OVD
  AS10g_MidTier      S100OS, S105PROF, S110PERF, S120NET, S130INST, S249WRLS,
                     S290DEV, S300IAS, S310J2EE, S325PDA, S350WEBC, S390DSCV
  AS10g_Repository   S100OS, S105PROF, S110PERF, S120NET, S130INST, S300IAS,
                     S305ASBR, S306ASG, S310J2EE
  AS10g_WebTier      S100OS, S105PROF, S110PERF, S120NET, S130INST, S300IAS,
                     S310J2EE, S350WEBC, S410GRID
  AS_BackupRecovery  S100OS, S300IAS, S305ASBR
  Act                S100OS, S105PROF, S110PERF, S130INST, S500ACT
  AppsCheck          S100OS, S105PROF, S110PERF, S130INST, S500ACT
  AsmFileSystem      S100OS, S105PROF, S120NET, S122ONET, S130INST, S402ASM,
                     S403ACFS
  Bam                S100OS, S105PROF, S110PERF, S120NET, S374BAM
..........

 

 

 

The -Q option lists the setup questions related to a profile in more detail:

 

 

[oracle@vrh8 rda]$ ./rda.sh -Q -p DB11g

NAME
    S120NET - Collects Network Information

SETTING DESCRIPTION
  NETWORK_PING_TESTS
    "Do you want RDA to perform the network ping tests (Y/N)?"

  LOCAL_NODE
    "Enter the name of the node the script is running on (used for ping
    tests)"

  WAN_NODE
    "Enter a remote node connecting to this server (used for ping tests)"

  RDBMS_NODE
    "Enter the node hosting the database instance (used for ping tests)"

  WEB_NODE
    "Enter the node where the Web Server/Forms server is on (used for ping
    tests)"

...............

 

 

By inheriting a profile's definition we can configure RDA quickly. For example, let's try the DB11g profile:

 

 

[oracle@vrh8 rda]$ ./rda.sh -S -p DB11g

With a profile, RDA asks noticeably fewer questions.

After that, running rda.sh again collects the information:

[oracle@vrh8 rda]$ ./rda.sh
-------------------------------------------------------------------------------
RDA Data Collection Started 06-Feb-2012 01:23:22
-------------------------------------------------------------------------------
Processing Initialization module ...
Enter the password for 'SYSTEM':
Please re-enter it to confirm:
Processing OCM module ...
Processing PERF module ...
Processing CFG module ...
Processing OS module ..

 

 

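You may still find this tedious: "I'm logged in as the oracle OS user, so why do I have to type a password every time? Can't it just connect as sysdba?" Queries against some internal X$ views do indeed require sysdba privileges as well. We can configure RDA as follows to connect as sysdba: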

 


[oracle@vrh8 rda]$ ./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p DB11g

[oracle@vrh8 rda]$ ./rda.sh
-------------------------------------------------------------------------------
RDA Data Collection Started 06-Feb-2012 01:27:37
-------------------------------------------------------------------------------
Processing Initialization module ...
Processing OCM module ...
Processing PERF module ...
Processing CFG module ...
Processing OS module ...
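The -e option takes a comma-separated list of var=val pairs, and longer lists are easy to mistype. A minimal shell sketch, assuming a hypothetical helper named join_settings (it is not part of RDA), builds the list from separate arguments and prints the setup command instead of running it:

```shell
#!/bin/sh
# join_settings: hypothetical helper (not part of RDA) that joins its
# arguments with commas, producing the var=val,... list that -e expects.
join_settings() {
    out=""
    for kv in "$@"; do
        if [ -z "$out" ]; then
            out="$kv"
        else
            out="$out,$kv"
        fi
    done
    printf '%s\n' "$out"
}

# Print (do not execute) the setup command for the DB11g profile.
settings=$(join_settings SQL_SYSDBA=1 SQL_LOGIN=/)
echo "./rda.sh -Sy -e $settings -p DB11g"
```

Printing the command first makes it easy to review the settings before running the setup for real.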

 

 

Besides using a single profile, we can also combine several profiles using the -p profile1-profile2 syntax, for example:

 

[oracle@vrh8 rda]$ ./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p DB11g-DataGuard


 

 

Once configured, RDA by default writes the configuration to the setup.cfg file in its own directory, and subsequent rda.sh runs reuse that cfg file:

 

 

cat setup.cfg

# Oracle Remote Diagnostic Agent - Setup Information
###############################################################################

#------------------------------------------------------------------------------
# Data Collection Overview
#------------------------------------------------------------------------------
# S000INI=pending
# S010CFG=pending
# S020SMPL=pending
# S090OCM=pending
# S100OS=pending
# S105PROF=pending
# S110PERF=pending
# S120NET=pending
# S122ONET=pending
# S130INST=pending
# S200DB=skip
# S201DBA=pending
# S203DBM=pending
# S204LOG=pending
# S205BR=pending
# S212DNFS=skip
# S213SP=skip
# S400RAC=pending
# S400RACD=skip
# S401OCFS=skip
# S405DG=pending
# S410GRID=skip
# S420AGT=skip
# S430DBC=skip
# S909RDSP=skip
# S919LOAD=pending
# S990FLTR=skip
# S999END=pending
.................
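The overview section at the top of setup.cfg is plain commented text, so it can be filtered with standard tools. The sketch below assumes the overview lines keep the "# Snnn<NAME>=state" shape shown above; the helper name pending_modules is hypothetical and not part of RDA. It lists the modules marked pending, which appear to be the ones the next collection run will process.

```shell
#!/bin/sh
# pending_modules: hypothetical helper (not part of RDA) that reads a
# setup.cfg on stdin and prints the modules marked "pending" in the
# Data Collection Overview.
pending_modules() {
    # Overview lines look like "# S100OS=pending" or "# S200DB=skip".
    sed -n 's/^# \(S[0-9][0-9][0-9][A-Z0-9]*\)=pending$/\1/p'
}

# Abbreviated sample taken from the listing above.
sample='# S100OS=pending
# S200DB=skip
# S201DBA=pending
# S213SP=skip'

printf '%s\n' "$sample" | pending_modules
```

Against a real configuration the call would be pending_modules < setup.cfg.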

 

 

Besides the most common DB11g and DB10g, there are other very useful profiles that can speed up problem diagnosis. Here are some of them:

 

 

For 11g

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p DB11g

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,ALERT_TEXT=1 -p DB11g
-- Collect the alert log text

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,ALERT_TEXT=1,DBCONTROL_SERVER_IN_USE=1 -p DB11g
-- Collect DB Control information

./rda.sh -vSCRPfy -e SQL_SYSDBA=1,SQL_LOGIN=/,ALERT_TEXT=1 -p DB11g
-- Collect diagnostic information and package it

DB10g 

./rda.sh -S -p DB10g

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p DB10g

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,DBCONTROL_SERVER_IN_USE=1 -p DB10g

DB9i

./rda.sh -S -p DB9i

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p DB9i

DB_BackupRecovery: collects backup and recovery information

./rda.sh -S -p DB_BackupRecovery

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,RMAN_IN_USE=0 -p DB_BackupRecovery
-- RMAN is not used for backups

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,RMAN_IN_USE=1,RMAN_CATALOG=0 -p DB_BackupRecovery
-- RMAN is used, but without a recovery catalog

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,RMAN_IN_USE=1,RMAN_CATALOG=1,RMAN_SCHEMA=rman,RMAN_EXPORT_USER=rman@catlogdb -p DB_BackupRecovery

DB_Perf: collects database performance information

./rda.sh -S -p DB_Perf

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p DB_Perf

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,PERF_START_TIME=11-Mar-2010_12:00,PERF_END_TIME=11-Mar-2010_13:00 -p DB_Perf
-- Specify the time window for the performance data

DataGuard: collects Data Guard information

./rda.sh -S -p DataGuard

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p DataGuard

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,ONET_IN_USE=1,ALERT_TEXT=1 -p DataGuard
-- Also collect Oracle Net Services information

Rac: collects Real Application Cluster and CRS information

./rda.sh -S -p Rac

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p Rac

Rac_Asm: collects RAC + Clusterware + ASM information

./rda.sh -S -p Rac_Asm

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,ASM_ORACLE_SID=+ASM1 -p Rac_Asm

Rac_AdvancedAsm: collects more detailed RAC + Clusterware + ASM information

./rda.sh -S -p Rac_AdvancedAsm

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,ASM_ORACLE_SID=+ASM1 -p Rac_AdvancedAsm

Rac_Perf: collects RAC database performance information

./rda.sh -S -p Rac_Perf

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p Rac_Perf

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,PERF_START_TIME=11-Mar-2010_12:00,PERF_END_TIME=11-Mar-2010_13:00 -p Rac_Perf

DirectNFS 

./rda.sh -S -p DirectNFS

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p DirectNFS

AsmFileSystem

./rda.sh -S -p AsmFileSystem

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p AsmFileSystem

DB_Assessment

./rda.sh -S -p DB_Assessment

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p DB_Assessment

 Rac_Assessment

./rda.sh -S -p Rac_Assessment

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p Rac_Assessment

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,ASM_ORACLE_SID=+ASM1 -p Rac_Assessment

 Maa_Assessment

./rda.sh -S -p Maa_Assessment

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/ -p Maa_Assessment

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,ASM_ORACLE_SID=+ASM1 -p Maa_Assessment

Exadata_Assessment

./rda.sh -S -p Exadata_Assessment

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,ALERT_TEXT=1 -p Exadata_Assessment

 ./rda.sh -vSCRPfy -e SQL_SYSDBA=1,SQL_LOGIN=/,ALERT_TEXT=1,EXA_COLLECT_CELL=0 -p Exadata_Assessment

 ./rda.sh -vSCRPfy -e SQL_SYSDBA=1,SQL_LOGIN=/,ALERT_TEXT=1 -p Exadata_Assessment

Maa_Exa_Assessment

./rda.sh -S -p Maa_Exa_Assessment

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,ALERT_TEXT=1 -p Maa_Exa_Assessment

./rda.sh -Sy -e SQL_SYSDBA=1,SQL_LOGIN=/,ALERT_TEXT=1,EXA_COLLECT_CELL=0 -p Maa_Exa_Assessment

./rda.sh -vSCRPfy -e SQL_SYSDBA=1,SQL_LOGIN=/,ALERT_TEXT=1 -p Maa_Exa_Assessment

 

 

 

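RDA can also run pre-installation checks on the OS before database software is installed. For example, if you are about to install 11g R2 (11.2), run ./rda.sh -T hcve: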

 

 

[oracle@vrh8 rda]$ ./rda.sh -T hcve
Processing HCVE tests ...
Available Pre-Installation Rule Sets:
1. Oracle Database 10g R1 (10.1.0) Preinstall (Linux-x86)
2. Oracle Database 10g R1 (10.1.0) Preinstall (Linux AMD64)
3. Oracle Database 10g R1 (10.1.0) Preinstall (IA-64 Linux)
4. Oracle Database 10g R2 (10.2.0) Preinstall (Linux AMD64)
5. Oracle Database 10g R2 (10.2.0) Preinstall (IA-64 Linux)
6. Oracle Database 10g R2 (10.2.0) Preinstall (Linux-x86)
7. Oracle Database 11g R1 (11.1.0) Preinstall (Linux AMD64)
8. Oracle Database 11g R1 (11.1.0) Preinstall (Linux-x86)
9. Oracle Database 11g R2 (11.2.0) Preinstall (Linux-x86)
ID     NAME                 RESULT  VALUE
====== ==================== ======= ==========================================
A00010 OS Certified?        PASSED  Adequate
A00050 Enter ORACLE_HOME    RECORD  /s01/oracle/product/10.2.0/db_1
A00060 ORACLE_HOME Valid?   PASSED  OHexists
A00070 O_H Permissions OK?  PASSED  CorrectPerms
A00080 oraInventory Permiss PASSED  oraInventoryOK
A00090 Got ld,nm,ar,make?   PASSED  ld_nm_ar_make_found
A00100 Umask Set to 022?    PASSED  UmaskOK
A00120 Limit Processes      PASSED  Adequate
A00130 Limit Descriptors    PASSED  Adequate
A00140 LDLIBRARYPATH Unset? PASSED  UnSet
A00180 JAVA_HOME Unset?     PASSED  UnSet
A00190 Enter JDK Home       RECORD
A00200 JDK Version          FAILED  JDK home is missing
A00210 Other O_Hs in PATH?  FAILED  OratabEntryInPath
A00220 Other OUI Up?        PASSED  NoOtherOUI
A00230 /tmp Adequate?       PASSED  TempSpaceOK
A00240 Disk Space OK?       PASSED  DiskSpaceOK
A00250 Swap (in MB)         RECORD  5951
A00260 RAM (in MB)          PASSED  3955
A00270 Swap OK?             PASSED  SwapToRAMOK
A00280 Network              PASSED  Connected
A00290 IP Address           RECORD  192.168.1.191
A00300 Domain Name          RECORD  oracle.com
A00310 DNS Lookup           FAILED  nslookup host.domain
A00320 /etc/hosts Format    FAILED  Missing host.domain
A00330 Kernel Parameters OK PASSED  KernelOK
A00380 Tainted Kernel?      PASSED  NotVerifiable
A00400 ip_local_port_range  PASSED  RangeOK
A00480 EL4 RPMs OK?         SKIPPED NotEL4
A00490 EL5 RPMs OK?         FAILED  [kernel-headers(i386)] not installed ..>
A00530 RHEL4 RPMs OK?       SKIPPED NotRedHat
A00540 RHEL5 RPMs OK?       SKIPPED NotRedHat
A00570 SUSE SLES10 RPMs OK? SKIPPED NotSuSE
A00580 SUSE SLES11 RPMs OK? SKIPPED NotSuSE
Result file: /home/oracle/rda/output/RDA_HCVE_A200DB11R2_lnx_res.htm

 

 

The example above ran the Oracle Database 11g R2 (11.2.0) Preinstall check against the OS and reported the results.
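In a long HCVE report, the rules that need attention are the ones marked FAILED. The following sketch filters them out of the text output; the helper name hcve_failures is hypothetical, and it assumes the ID / NAME / RESULT / VALUE column layout shown above.

```shell
#!/bin/sh
# hcve_failures: hypothetical helper (not part of RDA) that reads HCVE text
# output on stdin and prints "ID: rule name" for every rule that FAILED.
hcve_failures() {
    LC_ALL=C awk '
        /^[A-Z][0-9]+ / {
            rest = $0
            sub(/^[^ ]+ +/, "", rest)        # drop the rule ID column
            pos = index(rest, " FAILED")     # locate the RESULT column
            if (pos == 0) next               # PASSED, RECORD, SKIPPED rows
            name = substr(rest, 1, pos - 1)
            sub(/ +$/, "", name)             # trim the column padding
            print $1 ": " name
        }
    '
}

# Three rows copied from the report above.
sample='A00180 JAVA_HOME Unset?     PASSED  UnSet
A00200 JDK Version          FAILED  JDK home is missing
A00310 DNS Lookup           FAILED  nslookup host.domain'

printf '%s\n' "$sample" | hcve_failures
```

In the full report above, five rules failed: A00200, A00210, A00310, A00320, and A00490.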

 

 

 

We can also use ./rda.sh -cv to check the integrity of the existing RDA installation and make sure RDA has not been modified:

 

 

[oracle@vrh8 rda]$ ./rda.sh -cv
Loading the file list ...
Checking the directory '.' ...
Checking the directory 'RDA' ...
Checking the directory 'RDA/Handle' ...
Checking the directory 'RDA/Library' ...
Checking the directory 'RDA/Library/Remote' ...
Checking the directory 'RDA/Local' ...
Checking the directory 'RDA/Object' ...
Checking the directory 'RDA/Operator' ...
Checking the directory 'RDA/Value' ...
Checking the directory 'hcve' ...
Checking the directory 'modules' ...
No issues found

Getting to Know Oracle Configuration Manager (OCM)

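In a meeting with a customer today, one of the customer's managers lavishly praised a certain Cisco product. According to him, the product has a built-in facility for collecting operational data, which reduces labor costs and makes remote support and proactive support both feasible and efficient.

The Oracle engineer sitting nearby could not hold back and launched into a lengthy introduction of Oracle Configuration Manager (OCM). As he told it, OCM not only proactively collects Oracle product information and sends it to the My Oracle Support site to facilitate later patch support and health checks, it is also Oracle's savior in customer emergencies: as long as you use OCM and open a severity 1 SR on MOS, you can resolve critical problems even without buying Oracle ACS services. The praise went on and on.

In my view, OCM is a fine tool in terms of functionality. The system information it collects helps Oracle drive the following features: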

Proactive

  • Setup-once. Distribute-to-many
    • Install, configure & communicate
    • Eliminate inaccurate SR profiles
    • Unified Systems region
  • Projects Feature
    • Include Business information  along with Technical information
    • Include all Milestones (code freeze, go, no-go etc.)
  • Systems Details
    • Change Management features
  • Healthchecks Feature
    • Dozens of Healthchecks to check against Support Best Practices
    • Create standards across the Enterprise
  • Patch Recommendations
    • Security CPU, Patchset and High Priority Patches identified
    • Patch Plan creation to analyze and validate for patch conflicts
  • Inventory and Reporting Features
    • Lifecycle Management Consistency and Service Request History
  • My Oracle Support Community

Reactive

  • Priority Handling of Service Requests
    • Log all SRs using Systems/Configurations

 

The working architecture of Oracle Configuration Manager is shown below:

 

oracle_configuration_manager_arch

 

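From the figure above it is easy to spot two critical problems with Oracle Configuration Manager:

1. For OCM to automatically upload the system data it collects to the Software Configuration Manager on the public Internet, you must either expose the server running OCM to the Internet or upload through a proxy server. Either way introduces risk, and for customers who store sensitive data in Oracle, opening a path between the server and the outside world is unacceptable. A disconnected mode exists, but it requires manual intervention.

2. The outside world keeps changing. If the OCM-related interfaces on the My Oracle Support site are updated, the OCM software on the customer's servers needs to be upgraded accordingly. Anyone who has used Support Workbench in the OEM console will know this well: once the software version is no longer the latest, some Support Workbench features stop working, or the whole thing becomes unusable. Updating OCM is neither complicated nor time-consuming, but it still requires regular maintenance by the DBA.

Oracle's goal in developing Oracle Configuration Manager was to integrate customers' specific needs more closely with the MOS site, so that customers who buy only basic PS (Premier Support) still get the added service value that OCM brings. That is a design philosophy I admire.

As an aside, if I were the boss or the architect, I would certainly choose to take full advantage of OCM. In the day-to-day operation of real enterprise production databases, however, realizing those advantages is extremely difficult.

As far as I know, a well-known international telecom equipment maker (the one recently acquired by Google) deployed OCM on dozens of systems two years ago. Today not a single one of those systems still uses OCM, and the System Info on MOS has been wiped clean.

Another case worth pondering is Oracle itself. According to people inside Oracle, dozens of Oracle's own systems also adopted OCM under heavy promotion a few years ago, yet fewer than 10 servers actually ran stably and kept uploading data. All that remained was a row of bright red crosses on the MOS Systems panel indicating failed data updates. But I digress.

For more details on Oracle Configuration Manager, see the official Oracle documentation for Configuration Manager.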

 

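Monitoring Oracle Database Performance on Windows with Procexp

The Sysinternals suite of system-internals debugging tools for Windows can be downloaded from http://technet.microsoft.com/en-us/sysinternals. Most of these tools were written by Mark Russinovich; the best known include Process Explorer and Regmon.

Here we introduce using Procexp to monitor Oracle performance on Windows. Procexp is a full-featured process information management tool with a graphical interface (GUI). You can think of it as an extension of the Windows task manager, taskmgr.exe; in fact it can replace taskmgr entirely, provided the user has some OS background.

Let's see what Procexp can monitor for Oracle running on Windows (a combination I don't particularly recommend):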

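1. Process properties

  • Process performance data, including CPU, Virtual Memory, Physical Memory, I/O, and Handles
  • Detailed thread information (including per-thread CPU usage)
  • Thread stacks
  • Kill/suspend thread

2. System-wide information

  • System-level performance data

3. Creating process dump files

  • Create a full dump or minidump to help diagnose bugs

4. Identifying file handles and dynamic link libraries (DLLs)

  • Determine which DLL files or ordinary file handles a process has locked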

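Once the Oracle instance has started successfully on Windows, we can monitor database performance with Procexp.exe. It is very easy to use: select the "Oracle.exe" process and choose Properties from the right-click menu to browse the process properties: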

procexp_monitor_oracle1

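The Performance tab shows the process's performance data, much like an nmon for Windows. Be sure to run Procexp.exe as an administrator; otherwise the performance data may not be collected correctly and will show as N/A: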

procexp_monitor_oracle2

The Performance Graph tab shows graphical performance trends:

procexp_monitor_oracle3

 

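The Threads tab shows the threads under the Oracle.exe process. One inconvenience on Windows is that the thread information alone does not tell you which "background process" or "server process" a thread is; you need the v$process view to work that out.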

SQL> select spid ,program from v$process;

SPID                     PROGRAM
------------------------ --------------------
                         PSEUDO
3124                     ORACLE.EXE (PMON)
4328                     ORACLE.EXE (VKTM)
5096                     ORACLE.EXE (GEN0)
2840                     ORACLE.EXE (DIAG)
2068                     ORACLE.EXE (DBRM)
2464                     ORACLE.EXE (PSP0)
4468                     ORACLE.EXE (DIA0)
120                      ORACLE.EXE (MMAN)
4424                     ORACLE.EXE (DBW0)
1312                     ORACLE.EXE (LGWR)
684                      ORACLE.EXE (CKPT)
5684                     ORACLE.EXE (SMON)
1016                     ORACLE.EXE (RECO)
4516                     ORACLE.EXE (MMON)
1108                     ORACLE.EXE (MMNL)
6108                     ORACLE.EXE (NSS2)
2728                     ORACLE.EXE (SHAD)

18 rows selected.

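Here SPID=3124 means that the thread with TID 3124 is the PMON "background process". The Threads tab shows each thread's CPU usage directly, which makes it easy to pinpoint the culprit when Oracle.exe is consuming too much CPU. Clicking the Stack button brings up the thread's current call stack, which is very useful when identifying a bug.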

procexp_monitor_oracle4
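When many threads are involved, looking up each TID by hand gets slow. As a small sketch, if the v$process query output has been spooled to a text file and a POSIX shell is available (for example on a workstation), a helper can map a TID from Process Explorer to its Oracle program name. The helper name thread_name and the sample rows are illustrative assumptions.

```shell
#!/bin/sh
# thread_name: hypothetical helper. Reads "SPID PROGRAM" rows (as spooled
# from the v$process query) on stdin and prints the PROGRAM value for the
# thread ID given as the first argument.
thread_name() {
    awk -v tid="$1" '$1 == tid { $1 = ""; sub(/^ +/, ""); print }'
}

# Three rows copied from the query output above.
sample='3124                     ORACLE.EXE (PMON)
1312                     ORACLE.EXE (LGWR)
2728                     ORACLE.EXE (SHAD)'

printf '%s\n' "$sample" | thread_name 1312
```

A TID that does not appear in the spooled output produces no output, which usually means the thread is not registered in v$process.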

 

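The Threads tab also offers Kill and Suspend buttons for killing or suspending a misbehaving thread (provided we have confirmed that the thread is a non-critical background thread). On Windows this would otherwise require the orakill command.

The TCP/IP tab gives a brief view of the process's network activity, including Local Address and Remote Address. For more complete information, combine it with other network monitoring tools (such as TCPView from the same suite):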

procexp_monitor_oracle5

 

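Another very useful feature is the Environment tab, which shows the detailed environment variables, such as Path, TEMP, ORACLE_SID, and CLASSPATH. It is very effective when diagnosing local login problems or abnormal instance behavior: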

procexp_monitor_oracle6

Like taskmgr, Procexp can also monitor system-level performance, and in more detail. Click View -> System Information in the main window:

procexp_monitor_oracle7

 

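As mentioned above, the View DLL/Handle feature helps us see which dynamic link libraries (DLLs) the Oracle process has loaded and which file handles it holds. On Windows, a file that is open cannot be modified or moved at the same time, which can cause a lot of trouble during maintenance. Because Windows has no tools like lsof or fuser, this feature comes in handy when diagnosing file-locking problems with the Oracle software.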

procexp_monitor_oracle8

As the figure above shows, Oracle.exe holds handles to several files, such as "\Device\NamedPipe\*oraspawn_pipe*.4284".

procexp_monitor_oracle9

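As the figure above shows, Oracle.exe has loaded several DLLs whose names begin with ora. Because of the way Windows works, Oracle makes heavy use of DLLs in place of code that on Unix is compiled into the Oracle binary image. This makes upgrades easier (just replace the DLL files, with no recompilation needed), and it also explains why PSU/CPU patches are released differently on Windows. Note that these DLL files also carry version information, mostly 11.02.0000.0001, with a build time of 2010/2/10 9:01.

Newer versions of Procexp also add a create dump feature that rounds out the tool's diagnostic capabilities. For abnormal behavior and bugs in an Oracle instance, you can create a process dump to submit to Oracle Support for analysis. Generally you do not need to analyze the dump file yourself. This is an advanced feature; do not use this last resort on a healthy production database.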

procexp_monitor_oracle10

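Summary

If you are still complaining that Windows has no monitoring software as powerful as NMON on Unix, Procexp is an outstanding choice. Another thing to be grateful for is that it is free. Visit the tool's homepage to learn more.

Common Tools Collection

DBAs inevitably use some off-the-shelf tools during performance tuning or diagnosis. Below are some of the tools I use most often in my work: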

Program                                   Platform           Download URL
nmon                                      AIX POWER          http://www.ibm.com/developerworks/wikis/download/attachments/53871937/nmon4aix12e.zip?version=1
nmon                                      Linux              http://nmon.sourceforge.net/docs/MPG_nmon_for_Linux_14a_binaries.zip
sarmon                                    Solaris            http://sourceforge.net/projects/sarmon/files/
putty                                     Intel x86          http://the.earth.li/~sgtatham/putty/latest/x86/putty.exe
OS Watcher                                Non-Windows        https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=REFERENCE&id=301137.1
Procwatcher                               Non-Windows        https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=BULLETIN&id=459694.1
SQLTXPLAIN                                ALL                https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=TROUBLESHOOTING&id=215187.1
TRCANLZR                                  ALL                https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=TROUBLESHOOTING&id=224270.1
STRMMON                                   Non-Windows        https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=BULLETIN&id=290605.1
Oracle Cluster Verification Utility       ALL                http://www.oracle.com/technetwork/database/clustering/downloads/cvu-download-homepage-099973.html
Oracle Cluster Health Monitor (CHM)       Linux and Windows  http://www.oracle.com/technetwork/database/clustering/downloads/ipd-download-homepage-087212.html
RDA 4                                     ALL                https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=DIAGNOSTIC%20TOOLS&id=250262.1
Latest OPatch                             ALL                https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=BULLETIN&id=224346.1
RAC Diagnostic Information (racdiag.sql)  ALL                https://www.askmaclean.com/archives/script-to-collect-rac-diagnostic-information-racdiag-sql.html
ass.awk                                   ALL                https://www.askmaclean.com/archives/oracle-systemstate-dump-analytic-tool-ass-awk-v1-09.html
LTOM                                      ALL                https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=DIAGNOSTIC%20TOOLS&id=352363.1

This list will be updated over time.
