Grid Control OMS Agent: How It Works (Architecture Diagram)

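When we use Grid Control to centrally manage operating systems and Oracle databases, an Agent must be installed on each host. The Agent periodically collects OS and Oracle information, sends it to the Oracle Grid Control Management Server (OMS), and executes the instructions that the OMS issues.

Most people's knowledge of the Agent stops at how to install and start it. The figure below shows the architecture of the OMS Agent: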

 

 

The Agent consists mainly of two components: the Collector and the Metric Engine.

 

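The Collector is an important subsystem of the agent. It collects metric data and uploads it to the OMS (which ultimately stores the data in the database). The Collector uses the information in the collection files to determine which targets need metric data collected and how often. To obtain the data, the Collector submits queries to the Metric Engine, which performs the actual collection. The Metric Engine uses fetchlets, the metadata files (defined in OH/sysman/admin/metadata), and the discovered-target file (targets defined in OH/sysman/emd/targets.xml) to obtain the metrics for each target. The metadata files also define how each metric is actually computed.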

 

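Based on this information, the Metric Engine uses the appropriate fetchlets to retrieve data from the monitored targets. A fetchlet is a data-access mechanism for a particular kind of data: for example, database performance data is retrieved with the SQL fetchlet, while OS data is retrieved with the OS fetchlets.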
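The dispatch described above can be sketched in Python as follows. This is an illustrative model only, not the agent's actual code: the names `FETCHLETS`, `os_fetchlet`, `sql_fetchlet`, and `collect_metric`, and the dictionary layout of a metric definition, are all hypothetical, and the SQL fetchlet is a stub that does not connect to any database.

```python
import subprocess
import sys


def os_fetchlet(command):
    """Hypothetical OS fetchlet: launch a process and return its output lines."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


def sql_fetchlet(statement):
    """Hypothetical SQL fetchlet stub; a real one would run the statement
    against the monitored database."""
    raise NotImplementedError("no database connection in this sketch")


# Hypothetical registry mapping a fetchlet type to its implementation.
FETCHLETS = {"OS": os_fetchlet, "SQL": sql_fetchlet}


def collect_metric(metric_definition, fetchlets=FETCHLETS):
    """Pick the fetchlet named in the metric definition and run its query."""
    fetchlet = fetchlets[metric_definition["fetchlet"]]
    return fetchlet(metric_definition["query"])


if __name__ == "__main__":
    # Stand-in for an OS command: a child Python process printing one value.
    demo = {"fetchlet": "OS",
            "query": [sys.executable, "-c", "print('load=0.42')"]}
    print(collect_metric(demo))
```

The point of the registry is that the code requesting a metric never needs to know how the data is reached; only the metric definition names the access mechanism.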

 

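Once the Collector has gathered metric data, it compares the values against the defined thresholds to decide whether an alert (warning) must be sent, and it saves the metric data on the local file system (in the $OH/sysman/emd/upload directory). These files are eventually sent over HTTP or HTTPS to a designated URL on the OMS server. That URL is specified by REPOSITORY_URL in the $OH/sysman/config/emd.properties configuration file, as in the following example: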

 

 

[root@nas ~]# cat /w01/wls/agent/core/12.1.0.1.0/stage/sysman/config/emd.properties
#
#   emd Root directory(read-only location). Metrics should not create files
#   under this directory
#
#
emdRoot=/w01/wls/agent/core/12.1.0.1.0
#
#   agent Root directory(writeable).s
#   Use this property to base any temporary file creation.
#
#
agentStateDir=%EMSTATE%
#  perl executable directory  
#
perlBin=/w01/wls/agent/core/12.1.0.1.0/perl/bin
#
# script directory
#
scriptsDir=/w01/wls/agent/core/12.1.0.1.0/sysman/admin/scripts
#
# stage directory for provisioning
#
emStageDir=/tmp
#
#  EMD main servlet URL
#
EMD_URL=http://nas:%EM_SERVLET_PORT%/emd/main/
#
#  OMS Upload URL
#
#  if there is no receiving OMS or if you wish to disable the UploadManager
#  please set this value to empty or comment out below line
#
REPOSITORY_URL=https://:4900/empbs/upload/
#
#The following properties are advanced read-only properties
#
#
# The location of the file that contains the root certificate.
#
emdRootCertLoc=/w01/wls/agent/core/12.1.0.1.0/sysman/config/b64LocalCertificate.txt
internetCertLoc=/w01/wls/agent/core/12.1.0.1.0/sysman/config/b64InternetCertificate.txt
#
# The download URL for the EMD Oracle Wallet and its local file location.
#
# Note: Ensure that this URL references a valid port number at which the
# console is available on http
#
emdWalletSrcUrl=https://:4900/em/wallets/emd
emdWalletDest=/w01/wls/agent/core/12.1.0.1.0/sysman/config/server
# JAVA HOME required for agent operations
#
JAVA_HOME=/w01/wls/agent/core/12.1.0.1.0/jdk
#
# This string is used by the agent to determine which algorithm to use for encrypted data
# The string value will be same as the release version
#
agentVersion=12.1.0.1.0
#
# To enable the metric browser, uncomment the following line
# This is a reloadable parameter
#
#_enableMetricBrowser=true
#
# These are the optional Java flags for the agent
#
agentJavaDefines=-Xmx128m
#
#   The agent base directory.
#
agentBaseDir=/w01/wls/agent
#
############################################################################
########################### Modifiable Properties ##########################
############################################################################
#
#
#### Tracing related properties
#
#
# emagent perl tracing levels
# supported levels: DEBUG, INFO, WARN, ERROR
# default level is WARN
#
#
EMAGENT_PERL_TRACE_LEVEL=INFO
# logging properties
Logger.log4j.appender.Rolling=org.apache.log4j.RollingFileAppender
Logger.log4j.appender.Rolling.File=%EMSTATE%/sysman/log/gcagent.log
Logger.log4j.appender.Rolling.Append=true
Logger.log4j.appender.Rolling.MaxFileSize=5000000
Logger.log4j.appender.Rolling.MaxBackupIndex=10
Logger.log4j.appender.Rolling.layout=oracle.sysman.gcagent.util.logging.GCPattern
# FOR NOW add a nother log for errors
Logger.log4j.appender.Errors=org.apache.log4j.RollingFileAppender
Logger.log4j.appender.Errors.File=%EMSTATE%/sysman/log/gcagent_errors.log
Logger.log4j.appender.Errors.Append=true
Logger.log4j.appender.Errors.Threshold=ERROR
Logger.log4j.appender.Errors.layout=oracle.sysman.gcagent.util.logging.GCPattern
Logger.log4j.appender.Errors.MaxFileSize=50000000
Logger.log4j.appender.Errors.MaxBackupIndex=3
# Add a test appender for individual tests
Logger.log4j.appender.Test=org.apache.log4j.FileAppender
Logger.log4j.appender.Test.File=/dev/null
Logger.log4j.appender.Test.Append=true
Logger.log4j.appender.Test.Threshold=DEBUG
Logger.log4j.appender.Test.layout=oracle.sysman.gcagent.util.logging.GCPattern
#
# If you increase the maximum file size for the Mdu and Errors logs, you
# should consider setting _maxFileSizeToCopy to a value that is higher then the
# new number (please note that this will potnetially increase the size of your
# incidents)
#
#
# Set root category priority to INFO and its only appender to Rolling.
Logger.log4j.rootCategory=INFO, Rolling, Errors, Test
#
# Enable HTTPListener (jetty) at INFO level.
# TODO: remove this when true trace is supported
Logger.log4j.category.oracle.sysman.gcagent.comm.agent.http.HTTPListener=INFO
Logger.log4j.appender.stdout=org.apache.log4j.ConsoleAppender
Logger.log4j.appender.stdout.layout=oracle.sysman.gcagent.util.logging.GCPattern
# Set the class loaders to level INFO
Logger.log4j.category.oracle.sysman.gcagent.metadata.impl.ChainedClassLoader=INFO
Logger.log4j.category.oracle.sysman.gcagent.metadata.impl.ReverseDelegationClassLoader=INFO
Logger.log4j.category.oracle.sysman.gcagent.metadata.impl.PluginLibraryClassLoader=INFO
Logger.log4j.category.oracle.sysman.gcagent.metadata.impl.PluginClassLoader=INFO
# Add an appender for MetaData Updates
Logger.log4j.appender.Mdu=org.apache.log4j.RollingFileAppender
Logger.log4j.appender.Mdu.File=%EMSTATE%/sysman/log/gcagent_mdu.log
Logger.log4j.appender.Mdu.Append=true
Logger.log4j.appender.Mdu.Threshold=INFO
Logger.log4j.appender.Mdu.layout=org.apache.log4j.PatternLayout
Logger.log4j.appender.Mdu.layout.ConversionPattern=%d [%t] - %m%n
Logger.log4j.appender.Mdu.MaxFileSize=50000000
Logger.log4j.appender.Mdu.MaxBackupIndex=3
Logger.log4j.category.oracle.sysman.gcagent.dispatch.MetadataUpdater=INFO, Mdu
Logger.log4j.additivity.oracle.sysman.gcagent.dispatch.MetadataUpdater=false
# Turn off QA log by default
Logger.log4j.category.QA=FATAL, QA
#Logger._enableTrace=true
#
#### Scalability related properties
#
#List of ora errors which can be ignored and need not be uploaded to repos
IgnoreDownOraErrors=12541,01033,01034,12505,03134,12170,12500,01219,1089,12560,12514,12528,12545
################################
#
# Put all additional properties here
#
################################
# uncomment for ease of debugging
#MaxThreads=1
# Set the server's graceful shutdown delay.
GracefulShutdownDelay=3
# Dump the dispatcher when overloaded
_dumpDispatcherWhenOverloaded=true
# Whether the EMD should listen on all NICs on the current host (the default)
# or just the NIC associated with the hostname in EMD_URL
AgentListenOnAllNICs=true
# Dump each request
#_dumpEveryDispatcherRequest=true
# Dynamic properties timeout for specific target types
dynamicPropsComputeTimeout_rac_database=180
dynamicPropsComputeTimeout_cluster=180
dynamicPropsComputeTimeout_has=180
dynamicPropsComputeTimeout_oracle_database=180
dynamicPropsComputeTimeout_oc4jjvm=180
dynamicPropsComputeTimeout_microsoft_sqlserver_database=180
dynamicPropsComputeTimeout_host=180
dynamicPropsComputeTimeout_osm_instance=180
_disableLoadDPFromCacheNormal=true
#Enable jobsystem streams tracing
_enableJobSystemStreamsTracing=true
# Allow beacon aplication to have 500 megabytes of space. Primarily for ATS collections.
# 500 * 1024 * 1024 = 524288000
applicationMetadataQuota_BEACON=524288000
#Enable auto tuning out of the box
enableAutoTuning=true
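
A file in this format is easy to inspect programmatically. The following is a minimal sketch, assuming the file contains only `key=value` lines and `#` comments, as the listing above does; `parse_properties` is a hypothetical helper, not an Oracle utility, and the sample lines are copied from the listing.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values such as -Xmx128m stay intact.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


sample = """\
#  OMS Upload URL
REPOSITORY_URL=https://:4900/empbs/upload/
EMD_URL=http://nas:%EM_SERVLET_PORT%/emd/main/
agentJavaDefines=-Xmx128m
#_enableMetricBrowser=true
"""

props = parse_properties(sample)
print(props["REPOSITORY_URL"])
```

Commented-out entries such as `_enableMetricBrowser` are skipped, so the result reflects only the properties that are actually in effect.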

 

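The files gathered by the Collector are actually sent to the OMS only when at least one of the following conditions is met:

1) An alert needs to be sent.
2) The size of the collected files exceeds a preset limit (20 MB, i.e. 20480 KB, by default), which is set by the UploadFileSize parameter in $OH/sysman/config/emd.properties.
3) More than 30 minutes (the default) have elapsed since the last upload; this limit is set by the UploadInterval parameter in $OH/sysman/config/emd.properties.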
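
The three conditions above amount to a simple "any of" check, which can be sketched as below. The function and parameter names are hypothetical, and the units (KB for the size limit, minutes for the interval) are assumptions based on the defaults quoted above; this is not the agent's actual implementation.

```python
def should_upload(has_pending_alert, pending_kb, minutes_since_last_upload,
                  upload_file_size_kb=20480, upload_interval_min=30):
    """Return True when any one of the three upload conditions holds."""
    return (has_pending_alert                                   # condition 1
            or pending_kb > upload_file_size_kb                 # condition 2
            or minutes_since_last_upload > upload_interval_min)  # condition 3


# A small file collected five minutes after the last upload, with no alert:
print(should_upload(False, 100, 5))
# The same file once an alert is pending:
print(should_upload(True, 100, 5))
```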

 

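Note that the OMS handles this differently from the Agent: alert severities sent by the Agent are written directly into the EM Repository database by the OMS rather than being staged as temporary files.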

 

Besides its two main modules, the Metric Engine and the Collector, the Agent has other subsystems that carry out different tasks:

 

  • Target Manager
    • Target Manager holds monitored targets
    • Target data in $EM/sysman/emd/targets.xml
    • lists managed targets, each with name, type, and other properties
    • Credential properties are encrypted
    • Targets can be marked broken
      • Required properties not provided
      • Dynamic properties take too long to compute
    • Discovery of new target instances possible by running perl scripts that list unmonitored instances.
  • Metric Engine
    • Driven by XML target metadata
    • one file per target-type, found in $OH/sysman/admin/metadata/*.xml
    • defines metrics; each may have multiple columns
    • for each metric, defines how data is collected:
      • QueryDescriptor : by fetchlet
      • PushDescriptor: by recvlet
      • ExecutionDescriptor: aggregation from other metrics
    • Supports multiple target versions with ValidIf
    • Defines properties for target type
      • Instance properties: specified in targets.xml
      • Dynamic properties: computed by metric engine
    • Metric Engine holds target-type metadata
      • given a target and a metric name, calls fetchlet manager and/or metric cache and returns a metric result
    • Metric Cache caches last-collected data for use in computing expressions
    • Aggregate metric support allows metrics to be computed via views, joins and group bys over other metrics
      • GetView: select columns or rows from a MetricResult
      • GroupBy: compute aggregation information (SUM, COUNT, MIN, MAX)
      • Union: add rows returned by multiple MetricResults
      • JoinTables: combine multiple metrics’ columns
  • Fetchlet Manager
    • A fetchlet is a data-access mechanism available to compute metric data
      • OS fetchlets : launch an OS process and interpret output
        • OS Fetchlet
        • OSLine Fetchlet
        • OSLineToken
        • UDM : User Defined Metric
      • SQL fetchlet : run a SQL or PL/SQL statement
      • URL fetchlets
        • HTTP data
        • URLTiming Fetchlet
      • and more…
  • Collection Manager
    • Holds all collections, both default and per-target
    • CollectionItem is the basic unit of scheduled collection
    • multiple metrics collected from the same target at the same interval can be collected in the same thread (MetricColl)
    • Once data is collected for a CollectionItem, any Conditions are evaluated
      • states: Clear, Warning, Critical, or Unknown
      • last evaluated Condition states are stored in $EM/sysman/emd/state/*
    • Collection XML files
      • default collections defined for all targets of a type in $OH/sysman/admin/default_collection/*.xml
      • additional collections for a particular target in $EM/sysman/emd/collection/*.xml
      • specifies, by metric, schedule for collection and thresholds to be applied to columns
  • Blackout Manager
    • Manage blackout information stored in $EM/sysman/emd/blackouts.xml
    • Scheduled collections consult Blackout Manager; if target is currently blacked-out, collection does not proceed
    • Targets may be affected by multiple blackouts; if any blackout is effective on a target, the target is blacked-out
    • Node blackouts affect all targets monitored by the agent
    • Blackouts file :
      • blackouts in $EM/sysman/emd/blackouts.xml
      • each blackout can be applied to one or more targets; if target is node, blackout applies to all targets
      • blackout can be immediate or scheduled; if scheduled, can be one-time or repeated
  • Scheduler
    • Schedules activities in order of next run time
      • multiple schedule formats:
        • Once: happens only once
        • Interval: happens every n minutes/hours/days
        • Week: happens on certain day of week
        • Month: happens on certain day of month
      • can specify begin time/end time
    • Spawns threads to do work whose time has arrived
    • Used by Collector and Blackout Manager
    • Health Monitor checks that the scheduler is doing its work
    • emctl status agent scheduler
      • Dumps out all the scheduled elements
  • Upload Manager
    • As data is collected by other agent components, serializes writing of intermediary .dat files (stored in $AS/sysman/emd/upload)
    • .dat files merged into .xml files on five priority channels
    • XML files sent to OMS as HTTP requests
    • maintains statistics on pending xml files; will disable collections based on number of files, aggregate size of files, and percentage free disk space on upload filesystem
    • Upload interval dynamic, based on properties and previous upload status
  • Ping Manager
    • Periodically, sends HTTP heartbeat request to OMS and verifies response
    • OMS response dictates interval before next ping
    • Exchange timezone information
    • A successful ping from the agent to the OMS is required before any uploads will occur
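
To make the Blackout Manager rule listed above concrete (a target is blacked out if any blackout is currently effective on it, and a node blackout covers every target monitored by the agent), here is a minimal sketch. The `Blackout` class, its fields, and `is_blacked_out` are hypothetical and do not reflect the real blackouts.xml schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Blackout:
    targets: frozenset        # names of the targets this blackout covers
    start: datetime
    end: datetime
    node_level: bool = False  # a node blackout covers every target on the agent

    def is_effective(self, now):
        """True while 'now' falls inside the blackout window."""
        return self.start <= now < self.end


def is_blacked_out(target, blackouts, now):
    """A target is blacked out if ANY effective blackout applies to it."""
    return any(
        b.is_effective(now) and (b.node_level or target in b.targets)
        for b in blackouts
    )


if __name__ == "__main__":
    window = (datetime(2012, 1, 1, 2, 0), datetime(2012, 1, 1, 4, 0))
    blackouts = [Blackout(frozenset({"orcl"}), *window)]
    print(is_blacked_out("orcl", blackouts, datetime(2012, 1, 1, 3, 0)))
```

A scheduled collection would call a check like this before running and skip the collection when it returns True.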
