Know more about RAC GES STATISTICS

GES 全称为Global Enqueue Services是RAC中重要的全局队列锁服务,V$GES_STATISTICS 动态性能视图汇聚了主要的GES STATISTICS 统计信息。为了更好地理解GES STATISTICS ,我们可以通过下表理解这些统计项的含义:

 

V$GES_STATISTICS Reference (10.2.0.1.0)

 

0 messages sent directly         

 

Incremented when any process successfully sends a message to a remote instance without being blocked and without flow control.

 

1 messages flow controlled                 

 

Incremented when any process could not send a message directly because there were not enough tickets available.

 

2 messages sent indirectly

 

Incremented when any process is asked to flow-control the message (i.e. a process tried to send a message indirectly, even if a ticket was available). This can also be incremented when previous message to the same target node had failed or a GCS/GES operation is being frozen.

 

3 messages received logical

 

When LMS receives a GCS/GES message from remote or local client, this statistic is incremented.

 

61 messages received actual

 

When LMS receives a message from a remote instance, this is incremented. A single actual message can contain multiple logical messages. Note, that when remote messages are queued, because they are flow controlled or they are indirectly / intentionally queued, the LMS process tries to send them in batch instead of flushing them individually.

 

4 flow control messages sent

5 flow control messages received

 

Messages flow controlled due to lack of ticket.

 

6 gcs msgs received

7 gcs msgs process time(ms)

8 ges msgs received

9 ges process time(ms)

 

When LMS receives a message, and if the message is related to either GCS (Global Cache Service) or GES (Global Enqueue Service) activity, it is incremented. After a GCS/GES message is processed, the process (typically LMD or LMS) updates the following statistics.

 

  • gcs msgs received
  • gcs msgs process time(ms)
  • ges msgs received
  • ges msgs process time(ms)

 


10 msgs causing lmd to send msgs 

11 lmd msg send time(ms)

65 msgs causing lms to send msgs 

66 lms msg send time(ms)

 

Incremented when the LMD/LMS processes a GCS/GES message and it causes LMD/LMS to send one or more messages. For example, if LMS receives a message, and as part of processing it sends four more messages, the statistic will be incremented by 1, not 4. In order to get the exact number of messages sent by LMS, the session statistic ‘gcs messages sent’ and ‘ges messages sent’ needs to be retrieved for the session running LMS (from V$SESSTAT).

 

12 gcs side channel msgs actual

13 gcs side channel msgs logical

 

‘side channel msgs logical’ indicates the number of blocks shipped from this node to other nodes. ‘side channel msgs actual’ indicates the actual number of messages sent to other nodes. When CR blocks or current blocks are sent to a remote node, the sender actually sends another reliable message to the requestor, because the CR block or current block being shipped could be lost. For example, a node sends 100 CR blocks to another node (logical number of messages). The sender node then may send a message saying ‘I’ve sent 100 blocks’ in a single message (actual number of messages). The init.ora parameter ‘_side_channel_batch_size’ defines the number of side channel messages to be sent in a single message.  With reliable IPC technology such as RSM and HyperFabric, we do not need side channel messages, and this value should be 0. With non-reliable IPC technology like UDP, these should be increased.

 

14 gcs pings refused

 

Incremented when the master node sends a BAST to a holder node, and the holder node is not able to service the BAST for some reason (typically because the block is not present or the ping queue is full).

 

15 gcs writes refused

 

Same as above, except that this is for Writes. In RAC if the blocks are globally dirty the writes are mediated by the GCS.

 

16 gcs error msgs

 

Certain race conditions in the GCS this statistic to be updated. It usually involves sending some extra messages to resolve the race through the use of error messages.

 

17 gcs out-of-order msgs

 

With direct sends, it is possible for two messages, which are sent from the same instance, to be received out-of-order at the master node. This statistic is updated whenever that happens.

 

18 gcs immediate (null) converts

 

Incremented when NULL lock can be granted immediately

 

19 gcs immediate cr (null) converts

 

Incremented when NULL lock for CR request can be granted immediately

 

20 gcs immediate (compatible) converts

 

Incremented when shared lock can be granted immediately

 

21 gcs immediate cr (compatible) converts

 

Incremented when shared lock for CR request can be granted immediately

 

22 gcs blocked converts

 

Incremented when the lock cannot be granted immediately. The lock is on the head of the convert queue.

 

23 gcs queued converts

 

Incremented when the lock cannot be granted immediately, and there is a conflicting lock in the convert queue ahead of this lock.

 

24 gcs blocked cr converts

 

Incremented when a CR request cannot be granted a lock because the lock is already being converted or the lock is in exclusive mode

 

25 gcs compatible basts

 

Number of BAST’s sent to holder node of a compatible lock.

 

26 gcs compatible cr basts (local)

 

CR request can be granted a lock, and BAST is sent to holder node. The lock is in local role.

 

60 gcs compatible cr basts (global)

 

This is incremented when the lock request is compatible but we can’t read from
disk because the block is globally dirty.

 

27 gcs cr basts to PIs

 

CR request is sent to an instance that has a PI buffer that satisfies this CR request.

 

28 dynamically allocated gcs resources

29 dynamically allocated gcs shadows

 

Number of gcs resources / shadows dynamically allocated after the startup of instance. We should not see these increasing at all.  _gcs_resources and _gcs_shadows could be used to change the default number of these resources to avoid dynamic allocation, but we should treat it as a bug (the default should be enough or it could be memory leak.).

 

30 gcs recovery claim msgs

 

Number of recovery claim messages processed by this instance.

 

31 gcs indirect ast

 

AST is sent to LMS instead of foreground process.

 


32 gcs dbwr write request msgs

33 gcs dbwr flush pi msgs

34 gcs lms write request msgs

35 gcs lms flush pi msgs

36 gcs write notification msgs

 

Messages related to flushing dirty XCUR / PI buffers. To flush PI buffers, request master node to write the most recent copy of the block in the global cache, which is ‘write request msgs’. Once the most recent copy of the block in the global cache is written to disk, PI buffers in the global cache can be purged, which is ‘flush pi msgs’. Once the most recent copy is written to disk, ‘write notification’ message is sent to the master node.

 

37 gcs retry convert request

 

Convert request had to be retried due to some race conditions.

 

38 gcs regular cr

 

CR for data blocks

 

39 gcs undo cr

 

CR for undo blocks

 

40 gcs assume no cvt

 

Assume was processed when the convert q is empty.

 

41 gcs assume cvt

 

Assume was processed when the convert q is non-empty.

 

42 broadcast msgs on commit(actual)

 

MCPD=0, number of messages sent to update the SCN.

 

43 broadcast msgs on commit(logical)

 

Same as 42, but logical (because the update may have been piggybacked).

 

44 broadcast msgs on commit(wasted)

 

Update SCN mesage is sent, but it is potentially a waste because receiver may have already updated the SCN.

 

45 acks for commit broadcast(actual)

46 acks for commit broadcast(logical)

 

Same as 42, 43 except that it applies to the receiving instance.

 

47 false posts waiting for scn acks

 

We posted LGWR because we thought MCPD broadcast completed, but it didn’t.

 

48 gcs forward cr to pinged instance

 

CR request is forwarded to the instance that is currently converting the GCS resource

 

49 gcs cr serve without current lock

 

CR block is served by the instance that does not have the current lock.

 

50 msgs sent queued

51 msgs sent queue time (ms)

 

Number of logical messages sent through send queue and their queuing time. Queuing time for queued messages: ‘msgs sent queue time (ms)’ / ‘msgs sent queued’  à Average message queuing time for flow controlled or indirectly sent messages. Note: this is calculated at ‘kjct’ layer (GCS/GES communication layer).

 

52 msgs sent queued on ksxp

53 msgs sent queue time on ksxp (ms)

 

Number of messages queued, and queuing time on ksxp layer. This includes all types of GCS/GES messages sent by any Oracle processes (foreground and background processes). Note: ‘msgs sent queued’ is a statistic from the kjct layer where we know if the messages are directly sent or indirectly sent.

 

54 msgs received queue time (ms)

55 msgs received queued

 

Elapsed time that a message is actually received until it is processed. Number of messages received  (logical). The ratio ‘msgs received queue time (ms)’ / ‘msgs received queued’ gives us the average queuing time between message arrival and start processing.

 

56 implicit batch messages sent

57 implicit batch messages received

 

Number of messages sent/received that are batched implicitly. Note: messages that are queued because of flow control or because of indirect messages, can be batched.

 

58 gcs refuse xid

 

Number of lock request received but refused to process by this instance, because index split is in progress (new in Oracle9i Release 2)

 

59 gcs ast xid

 

Number of lock request canceled because of index split

 

62 process batch messages sent

63 process batch messages received

 

Number of messages sent/received in batch.  When LMS receives multiple remote messages at a time, it processes all of them first, and then sends messages in batch as a result, instead of sending the result for each request individually.

 

64 messages sent pbatched

 

This is the number of messages being sent through process batching. This is the logical number whereas “process batch messages sent” is the actual number of messages sent out. Process batching in 10g is used for multi-block read, newing, receiver direct send (LMD0, LMS*, LMON) and fusion write (DBW*).

 


67 global posts requested

 

AQ requested that a global post be delivered to another instance

 

68 global posts dropped

 

Post was dropped because there was no buffer space.

 

69 global posts queued

 

A post was queued to be sent to another instance

 

70 global posts sent

 

A post was actually sent to another instance

 

71 global posts queue time

 

Time difference between enqueuing and sending the post.

 

72 messages sent not implicit batched

 

This is the number of indirect sent messages not get any batching done from the send proxies due to various reason. For example, the message is big enough or is defined as non-batch type.

 

73 messages queue sent actual

 

Actual number of messages sent indirectly by send proxies.

 

74 messages queue sent logical

 

Logical number of messages sent indirectly by send proxies including the number of embedded message batched either through process batching or batching done in send proxies.

 

实际V$GES_STATISTICS的信息来源于X$KJISFT内部视图

11gR2新特性:LMHB Lock Manager Heart Beat后台进程

LMHB是11gR2中新引入的后台进程,官方文档的介绍是Global Cache/Enqueue Service Heartbeat Monitor,Monitor the heartbeat of LMON, LMD, and LMSn processes,LMHB monitors LMON, LMD, and LMSn processes to ensure they are running normally without blocking or spinning。  Database and ASM instances, Oracle RAC

该进程负责监控LMON、LMD、LMSn等RAC关键的后台进程,保证这些background process不被阻塞或spin。 LMHB可能是Lock Manager Heartbeat的缩写。

 

我们来看一下该进程的trace跟踪文件以便了解其功能:

按照 100s -> 80s -> 100s -> 80s的间隔监控并输出一次LMSn、LCKn、LMON、LMD等进程的状态及wait chain,由kjfmGCR_HBCheckAll函数控制

 

*** 2012-02-03 00:03:10.066
==============================
LMS0 (ospid: 17247) has not moved for 77 sec (1328245389.1328245312)
kjfmGCR_HBCheckAll: LMS0 (ospid: 17247) has status 2
  : waiting for event 'gcs remote message' for 0 secs with wait_id 15327.
  ===[ Wait Chain ]===
  Wait chain is empty.
kjgcr_Main: KJGCR_ACTION - id 5

*** 2012-02-03 00:04:50.091
==============================
LMS0 (ospid: 17247) has not moved for 88 sec (1328245489.1328245401)
kjfmGCR_HBCheckAll: LMS0 (ospid: 17247) has status 2
  : waiting for event 'gcs remote message' for 0 secs with wait_id 24546.
  ===[ Wait Chain ]===
  Wait chain is empty.
kjgcr_Main: KJGCR_ACTION - id 5

 LCK0 (ospid: 2662) has not moved for 95 sec (1309746735.1309746640)
  kjfmGCR_HBCheckAll: LCK0 (ospid: 2662) has status 6
  ==================================================
  === LCK0 (ospid: 2662) Heartbeat Report
  ==================================================
  LCK0 (ospid: 2662) has no heartbeats for 95 sec. (threshold 70 sec)
   : Not in wait; last wait ended 80 secs ago.
   : last wait_id 2317342 at 'libcache interrupt action by LCK'.
  ..
  .
   Session Wait History:
       elapsed time of 1 min 20 sec since last wait
    0: waited for 'libcache interrupt action by LCK'
  ..

 

大约每3分钟输出一次TOP CPU User,CPU使用率高的session信息:

 

*** 2012-02-03 00:05:30.102
kjgcr_SlaveReqBegin: message queued to slave
kjgcr_Main: KJGCR_ACTION - id 3
CPU is high.  Top oracle users listed below:
     Session           Serial         CPU
      29                 7             0
     156                23             0
       3                 1             0
       4                 1             0
       5                 1             0

*** 2012-02-03 00:08:30.147
kjgcr_SlaveReqBegin: message queued to slave
kjgcr_Main: KJGCR_ACTION - id 3
CPU is high.  Top oracle users listed below:
     Session           Serial         CPU
      29                 7             0
     156                23             0
       3                 1             0
       4                 1             0
       5                 1             0

 

如果发现有session的CPU使用率极高,根据内部算法可能会激活 资源计划(resource management plan) ,甚至于kill 进程:

 

*** 2012-02-03 00:08:35.149
kjgcr_Main: Reset called for action high cpu, identify users, count 0

*** 2012-02-03 00:08:35.149
kjgcr_Main: Reset called for action high cpu, kill users, count 0

*** 2012-02-03 00:08:35.149
kjgcr_Main: Reset called for action high cpu, activate RM plan, count 0

*** 2012-02-03 00:08:35.149
kjgcr_Main: Reset called for action high cpu, set BG into RT, count 0

 

从11.2.0.2 开始LMHB开始使用slave 进程GCRn来完成实际的任务(Global Conflict Resolution Slave Process Performs synchronous tasks on behalf of LMHB GCRn processes are transient slaves that are started and stopped as required by LMHB to perform synchronous or resource intensive tasks.) LMHB会控制GCRn进程的启停,以便使用多个GCRn完成同步和缓解资源紧张的任务(例如kill进程)。

可以看到实际LMHB调用的多为kjgcr或kjfmGCR开头的内部函数,GCR意为Global Conflict Resolution。

 

kjgcr_Main: KJGCR_ACTION – id 5

GCR 进程的trace :

*** 2011-11-28 02:42:44.466
kjgcr_SlaveActionCbk: Callback failed, check trace
Dumping GCR slave work message at 0x96b81fc0
GCR layer information: type = 1, index = 0
Unformatted dump of ksv layer header:

 

LMHB进程的出现是为了提高RAC的可用性,特别是在资源紧张的环境中他会主动地去尝试kill掉最耗费资源的服务进程,以保证LMS等关键的RAC后台进程能正常工作; 因为该进程定期监控LMS、LMON等后台进程的等待事件及session的CPU使用率等信息,所以该LMHB进程的跟踪日志也可能成为诊断RAC故障的之一,这是11.2.0.1以来RAC一个潜在的新特性和增强。

相关隐式参数

_lm_hb_callstack_collect_time hb diagnostic call stack collection time in seconds — 5s
_lm_hb_disable_check_list list of process names to be disabled in heartbeat check — none

 

11.2是第一个引入LMHB进程的版本,所以并不是太成熟,在实际过程中对于资源使用率很高的RAC系统而言LMHB可能会帮一些倒忙,若你确实遇到了相关的问题或者是在11.2 RAC上碰到了一些诡异的现象,那么可以关注一下以下这些MOS Note:

 

ORA-29770 LMHB Terminates Instance as LMON Waited for Control File IO or LIBRARY CACHE or ROW CACHE Event for too Long [ID 1197674.1]
Bug 8888434 – LMHB crashes the instance with LMON waiting on controlfile read [ID 8888434.8]
Bug 11890804 – LMHB crashes instance with ORA-29770 after long “control file sequential read” waits [ID 11890804.8]
Bug 11890804: LMHB TERMINATE INSTANCE WHEN LMON WAIT CHANGE FROM CF READ AFTER 60 SEC
Bug 13467673: CSS MISSCOUNT AND ALL ASM DOWN WITH ORA-29770 BY LMHB
Bug 13390052: KJFMGCR_HBCHECKALL MESSAGES ARE CONTINUOUSLY LOGGED IN LMHB TRACE FILE.
Bug 13322797: LMHB TERMINATES THE INSTANCE DUE TO ERROR 29770
Bug 11827088 – Latch ‘gc element’ contention, LMHB terminates the instance [ID 11827088.8]

Bug 13061883: LMHB IS TERMINATING THE INSTANCE DURING SHUTDOWN IMMEDIATE
Bug 12564133 – ORA-600[1433] in LMHB process during RAC reconfiguration [ID 12564133.8]
Bug 12886605: ESSC: LMHB TERMINATE INSTANCE DUE TO 29770 – LMON WAIT ENQ: AM – DISK OFFLINE
Bug 12757321: LMHB TERMINATING THE INSTANCE DUE TO ERROR 29770
Bug 10296263: LMHB (OSPID: 15872): TERMINATING THE INSTANCE DUE TO ERROR 29770
Bug 11899415: ORA-29771 AND LMHB (OSPID: XXXX) KILLS USER (OSPID: XXX
Bug 10431752: SINGLE NODE RAC: LMHB TERMINATES INSTANCE DUE TO 29770
Bug 11656856: LMHB (OSPID: 27701): TERMINATING THE INSTANCE DUE TO ERROR 29770
Bug 10411143: INSTANCE CRASHES WITH IPC SEND TIMEOUT AND LMHB TERMINATES WITH ORA-29770
Bug 11704041: DATABASE INSTANCE CRASH BY LMHB PROCESS
Bug 10412545: ORA-29770 LMHB TERMINATE INSTANCE DUE TO VARIOUS LONG CSS WAIT
Bug 10147827: INSTANCE TERMINATED BY LMHB WITH ERROR ORA-29770
Bug 10016974: ORA-29770 LMD IS HUNG FOR MORE THAN 70 SECONDS AND LMHB TERMINATE INSTANCE
Bug 9376100: LMHB TERMINATING INSTANCE DUE ERROR 29770

 

 

沪ICP备14014813号

沪公网安备 31010802001379号