Understanding Oracle Memory Usage on the AIX Platform

1. Understanding Oracle Processes

The first thing we need to do is understand Oracle's three process types: background processes, server processes (also called foreground processes), and user processes. When we start an Oracle instance, the background processes are summoned first; a set of background processes together with the shared memory components makes up an Oracle instance. These background processes include the log writer (lgwr), the database writer (dbwr), the system monitor (smon), the process monitor (pmon), the distributed recovery process (reco), and the checkpoint process (ckpt); from 11g onward there are even more of them, more than I can remember. On UNIX the args of these processes always take the form ora_functionname_sid, where functionname is the functional name of the background process and sid is the value given by $ORACLE_SID.

The second type is the user process. It might be a sqlplus command line, an imp/exp utility, or a Java program written by a user. When a user process runs locally it does not operate on the SGA or PGA directly, but it undoubtedly still consumes a certain amount of virtual memory.
The third type is what we call the server process. Whenever we start a sqlplus connection (the connection may be local or remote; this makes no difference when we discuss memory usage), we need a server process, which answers directly to our sqlplus terminal. Server processes are sometimes called shadow processes. A shadow process always maps one-to-one to a user process, except when MTS (Multi-Threaded Server) is used. Shadow processes generally show up as oracleSID, where SID has the same meaning as above.
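
As a quick illustration (a minimal sketch; the instance name PROD and the grep patterns are only assumptions for this example), you can list the background and shadow processes of one instance like this:

$ ps -ef | grep "ora_.*_PROD" | grep -v grep     # background processes (lgwr, dbwr, smon, pmon, ...)
$ ps -ef | grep "oraclePROD"  | grep -v grep     # shadow (server) processes serving user connections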

 

2. Understanding Oracle's Memory Usage

Oracle's memory usage falls into two broad categories: private and shared. Private memory is used by a single process only. Shared memory, by contrast, can be used by multiple processes and is considerably more complicated to account for. When totalling shared memory we only need to add each shared segment once (at the OS level the Oracle SGA may map to one or more shared memory segments; we simply add the size of that segment, or those segments, once).
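
As a minimal illustration of counting the SGA only once (the shared memory ID 1048583 is taken from an example later in this post; sysresv is the Oracle-supplied utility under $ORACLE_HOME/bin and will report different IDs on your system):

$ sysresv                     # lists the shared memory ID(s) and semaphores owned by the instance
$ ipcs -mb | grep 1048583     # the last (SEGSZ) column shows that segment's size in bytes; add each segment once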

The largest shared memory segment we will ever use is, without question, the SGA (System Global Area). The SGA is mapped into virtual addresses and attached by every background and foreground process so that it can be used at any time. Plenty of performance tools, such as 'top' and 'ps -lf', can show this memory usage, but none of them can tell the private portion from the shared portion of a foreground or background process's memory (we usually only reach the conclusion that Oracle uses a lot of memory, without knowing whether the PGA or the SGA consumes more of it). If we add up the per-process memory figures obtained this way, the resulting total can be dozens of times SGA+PGA, which defies common sense; that much memory was never actually allocated. To truly understand Oracle's memory usage, the memory inspection command you use must be able to separate the private memory Oracle uses from the shared memory. On AIX that tool is svmon (on other UNIX platforms there is a tool I consider even better, pmap; the corresponding AIX command is procmap, but procmap cannot break down Oracle's private and shared memory usage, so we have to settle for svmon).

 

You can obtain this tool from the AIX installation CDs by installing the fileset "perfagent.tools"; the "smit install_latest" command can be used to install it. As for svmon, speaking as someone who is no AIX expert, I recommend reading the document I quote below:

 

The svmon Command

The svmon command provides a more in-depth analysis of memory usage. It is more informative, but also more intrusive, than the vmstat and ps commands. The svmon command captures a snapshot of the current state of memory. However, it is not a true snapshot because it runs at the user level with interrupts enabled.

To determine whether svmon is installed and available, run the following command:

# lslpp -lI perfagent.tools

The svmon command can only be executed by the root user.

If an interval is used (-i option), statistics will be displayed until the command is killed or until the number of intervals, which can be specified right after the interval, is reached.

You can use four different reports to analyze the displayed information:

Global (-G)
Displays statistics describing the real memory and paging space in use for the whole system.

Process (-P)
Displays memory usage statistics for active processes.

Segment (-S)
Displays memory usage for a specified number of segments or the top ten highest memory-usage segments in descending order.

Detailed Segment (-D)
Displays detailed information on specified segments.

Additional reports are available in AIX 4.3.3 and later, as follows:

User (-U)
Displays memory usage statistics for the specified login names. If no list of login names is supplied, memory usage statistics display all defined login names.

Command (-C)
Displays memory usage statistics for the processes specified by command name.

Workload Management Class (-W)
Displays memory usage statistics for the specified workload management classes. If no classes are supplied, memory usage statistics display all defined classes.

To support 64-bit applications, the output format of the svmon command was modified in AIX 4.3.3 and later.

Additional reports are available in operating system versions later than 4.3.3, as follows:

Frame (-F)
Displays information about frames. When no frame number is specified, the percentage of used memory is reported. When a frame number is specified, information about that frame is reported.

Tier (-T)
Displays information about tiers, such as the tier number, the superclass name when the -a flag is used, and the total number of pages in real memory from segments belonging to the tier.

 

 

 

How Much Memory is in Use

To print out global statistics, use the -G flag. In this example, we will repeat it five times at two-second intervals.

 

 

# svmon -G -i 2 5
memory                      in use                pin              pg space
size inuse free pin work pers clnt work pers clnt size inuse
16384 16250 134 2006 10675 2939 2636 2006 0 0 40960 12674
16384 16254 130 2006 10679 2939 2636 2006 0 0 40960 12676
16384 16254 130 2006 10679 2939 2636 2006 0 0 40960 12676
16384 16254 130 2006 10679 2939 2636 2006 0 0 40960 12676
16384 16254 130 2006 10679 2939 2636 2006 0 0 40960 12676

The columns on the resulting svmon report are described as follows:

memory
Statistics describing the use of real memory, shown in 4 K pages.

size
Total size of memory in 4 K pages.

inuse
Number of pages in RAM that are in use by a process plus the number of persistent pages that belonged to a terminated process and are still resident in RAM. This value is the total size of memory minus the number of pages on the free list.

free
Number of pages on the free list.

pin
Number of pages pinned in RAM (a pinned page is a page that is always resident in RAM and cannot be paged out).

in use
Detailed statistics on the subset of real memory in use, shown in 4 K frames.

work
Number of working pages in RAM.

pers
Number of persistent pages in RAM.

clnt
Number of client pages in RAM (client page is a remote file page).

pin
Detailed statistics on the subset of real memory containing pinned pages, shown in 4 K frames.

work
Number of working pages pinned in RAM.

pers
Number of persistent pages pinned in RAM.

clnt
Number of client pages pinned in RAM.

pg space
Statistics describing the use of paging space, shown in 4 K pages. This data is reported only if the -r flag is not used. The value reported starting with AIX 4.3.2 is the actual number of paging-space pages used (which indicates that these pages were paged out to the paging space). This differs from the vmstat command, whose avm column shows the virtual memory accessed but not necessarily paged out.

size
Total size of paging space in 4 K pages.

inuse
Total number of allocated pages.

In our example, the total size of memory is 16384 pages. Multiply this number by 4096 to see the total real memory size (64 MB). While 16250 pages are in use, there are 134 pages on the free list and 2006 pages are pinned in RAM. Of the total pages in use, there are 10675 working pages in RAM, 2939 persistent pages in RAM, and 2636 client pages in RAM. The sum of these three parts is equal to the inuse column of the memory part. The pin part divides the pinned memory size into working, persistent and client categories. The sum of them is equal to the pin column of the memory part. There are 40960 pages (160 MB) of total paging space, and 12676 pages are in use. The inuse column of memory is usually greater than the inuse column of pg space because memory for file pages is not freed when a program completes, while paging-space allocation is.

In AIX 4.3.3 and later systems, the output of the same command looks similar to the following:

# svmon -G -i 2 5

size inuse free pin virtual
memory 65527 64087 1440 5909 81136
pg space 131072 55824

work pers clnt
pin 5918 0 0
in use 47554 13838 2695

size inuse free pin virtual
memory 65527 64091 1436 5909 81137
pg space 131072 55824

work pers clnt
pin 5918 0 0
in use 47558 13838 2695

size inuse free pin virtual
memory 65527 64091 1436 5909 81137
pg space 131072 55824

work pers clnt
pin 5918 0 0
in use 47558 13838 2695

size inuse free pin virtual
memory 65527 64090 1437 5909 81137
pg space 131072 55824

work pers clnt
pin 5918 0 0
in use 47558 13837 2695

size inuse free pin virtual
memory 65527 64168 1359 5912 81206
pg space 131072 55824

work pers clnt
pin 5921 0 0
in use 47636 13837 2695

The additional output field is the virtual field, which shows the number of pages allocated in the system virtual space.

Who is Using Memory?

The following command displays the memory usage statistics for the top ten processes. If you do not specify a number, it will display all the processes currently running in this system.

# svmon -Pau 10

Pid Command Inuse Pin Pgspace
15012 maker4X.exe 4783 1174 4781
2750 X 4353 1178 5544
15706 dtwm 3257 1174 4003
17172 dtsession 2986 1174 3827
21150 dtterm 2941 1174 3697
17764 aixterm 2862 1174 3644
2910 dtterm 2813 1174 3705
19334 dtterm 2813 1174 3704
13664 dtterm 2804 1174 3706
17520 aixterm 2801 1174 3619

Pid: 15012
Command: maker4X.exe

Segid Type Description Inuse Pin Pgspace Address Range
1572 pers /dev/hd3:62 0 0 0 0..-1
142 pers /dev/hd3:51 0 0 0 0..-1
1bde pers /dev/hd3:50 0 0 0 0..-1
2c1 pers /dev/hd3:49 1 0 0 0..7
9ab pers /dev/hd2:53289 1 0 0 0..0
404 work kernel extension 27 27 0 0..24580
1d9b work lib data 39 0 23 0..607
909 work shared library text 864 0 7 0..65535
5a3 work sreg[4] 9 0 12 0..32768
1096 work sreg[3] 32 0 32 0..32783
1b9d work private 1057 1 1219 0..1306 : 65307..65535
1af8 clnt 961 0 0 0..1716
0 work kernel 1792 1146 3488 0..32767 : 32768..65535
...

 

 

The output is divided into summary and detail sections. The summary section lists the top ten highest memory-usage processes in descending order.

Pid 15012 is the process ID that has the highest memory usage. The Command indicates the command name, in this case maker4X.exe. The Inuse column (total number of pages in real memory from segments that are used by the process) shows 4783 pages (each page is 4 KB). The Pin column (total number of pages pinned from segments that are used by the process) shows 1174 pages. The Pgspace column (total number of paging-space pages that are used by the process) shows 4781 pages.

The detailed section displays information about each segment for each process that is shown in the summary section. This includes the segment ID, the type of the segment, description (a textual description of the segment, including the volume name and i-node of the file for persistent segments), number of pages in RAM, number of pinned pages in RAM, number of pages in paging space, and address range.

The Address Range specifies one range for a persistent or client segment and two ranges for a working segment. The range for a persistent or a client segment takes the form ‘0..x,’ where x is the maximum number of virtual pages that have been used. The range field for a working segment can be ‘0..x : y..65535’, where 0..x contains global data and grows upward, and y..65535 contains stack area and grows downward. For the address range, in a working segment, space is allocated starting from both ends and working towards the middle. If the working segment is non-private (kernel or shared library), space is allocated differently. In this example, the segment ID 1b9d is a private working segment; its address range is 0..1306 : 65307..65535. The segment ID 909 is a shared library text working segment; its address range is 0..65535.

A segment can be used by multiple processes. Each page in real memory from such a segment is accounted for in the Inuse field for each process using that segment. Thus, the total for Inuse may exceed the total number of pages in real memory. The same is true for the Pgspace and Pin fields. The sum of Inuse, Pin, and Pgspace of all segments of a process is equal to the numbers in the summary section.

You can use one of the following commands to display the file name associated with the i-node:

 

* ncheck -i i-node_number volume_name
* find file_system_associated_with_lv_name -xdev -inum inode_number -print

To get a similar output in AIX 4.3.3 and later, use the following command:

# svmon -Put 10

------------------------------------------------------------------------------
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd
2164 X 15535 1461 34577 37869 N N

Vsid Esid Type Description Inuse Pin Pgsp Virtual Addr Range
1966 2 work process private 9984 4 31892 32234 0..32272 :
65309..65535
4411 d work shared library text 3165 0 1264 1315 0..65535
0 0 work kernel seg 2044 1455 1370 4170 0..32767 :
65475..65535
396e 1 pers code,/dev/hd2:18950 200 0 - - 0..706
2ca3 - work 32 0 0 32 0..32783
43d5 - work 31 0 6 32 0..32783
2661 - work 29 0 0 29 0..32783
681f - work 29 0 25 29 0..32783
356d f work shared library data 18 0 18 24 0..310
34e8 3 work shmat/mmap 2 2 2 4 0..32767
5c97 - pers /dev/hd4:2 1 0 - - 0..0
5575 - pers /dev/hd2:19315 0 0 - - 0..0
4972 - pers /dev/hd2:19316 0 0 - - 0..5
4170 - pers /dev/hd3:28 0 0 - - 0..0
755d - pers /dev/hd9var:94 0 0 - - 0..0
6158 - pers /dev/hd9var:90 0 0 - - 0..0

------------------------------------------------------------------------------
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd
25336 austin.ibm. 12466 1456 2797 11638 N N

Vsid Esid Type Description Inuse Pin Pgsp Virtual Addr Range
14c3 2 work process private 5644 1 161 5993 0..6550 :
65293..65535
4411 d work shared library text 3165 0 1264 1315 0..65535
0 0 work kernel seg 2044 1455 1370 4170 0..32767 :
65475..65535
13c5 1 clnt code 735 0 - - 0..4424
d21 - pers /dev/andy:563 603 0 - - 0..618
9e6 f work shared library data 190 0 2 128 0..3303
942 - pers /dev/cache:16 43 0 - - 0..42
2ca3 - work 32 0 0 32 0..32783
49f0 - clnt 10 0 - - 0..471
1b07 - pers /dev/andy:8568 0 0 - - 0..0
623 - pers /dev/hd2:22539 0 0 - - 0..1
2de9 - clnt 0 0 - - 0..0
1541 5 mmap mapped to sid 761b 0 0 - -
5d15 - pers /dev/andy:487 0 0 - - 0..3
4513 - pers /dev/andy:486 0 0 - - 0..45
cc4 4 mmap mapped to sid 803 0 0 - -
242a - pers /dev/andy:485 0 0 - - 0..0
...

The Vsid column is the virtual segment ID, and the Esid column is the effective segment ID. The effective segment ID reflects the segment register that is used to access the corresponding pages.

Detailed Information on a Specific Segment ID

The -D option displays detailed memory-usage statistics for segments.

# svmon -D 404
Segid: 404
Type: working
Description: kernel extension
Address Range: 0..24580
Size of page space allocation: 0 pages ( 0.0 Mb)
Inuse: 28 frames ( 0.1 Mb)
Page Frame Pin Ref Mod
12294 3320 pin ref mod
24580 1052 pin ref mod
12293 52774 pin ref mod
24579 20109 pin ref mod
12292 19494 pin ref mod
12291 52108 pin ref mod
24578 50685 pin ref mod
12290 51024 pin ref mod
24577 1598 pin ref mod
12289 35007 pin ref mod
24576 204 pin ref mod
12288 206 pin ref mod
4112 53007 pin mod
4111 53006 pin mod
4110 53005 pin mod
4109 53004 pin mod
4108 53003 pin mod
4107 53002 pin mod
4106 53001 pin mod
4105 53000 pin mod
4104 52999 pin mod
4103 52998 pin mod
4102 52997 pin mod
4101 52996 pin mod
4100 52995 pin mod
4099 52994 pin mod
4098 52993 pin mod
4097 52992 pin ref mod

The detail columns are explained as follows:

Page
Specifies the index of the page within the segment.

Frame
Specifies the index of the real memory frame that the page resides in.

Pin
Specifies a flag indicating whether the page is pinned.

Ref
Specifies a flag indicating whether the page's reference bit is on.

Mod
Specifies a flag indicating whether the page is modified.

The size of page space allocation is 0 because all the pages are pinned in real memory.

An example output from AIX 4.3.3 and later is very similar to the following:

# svmon -D 629 -b

Segid: 629
Type: working
Address Range: 0..77
Size of page space allocation: 7 pages ( 0.0 Mb)
Virtual: 11 frames ( 0.0 Mb)
Inuse: 7 frames ( 0.0 Mb)

Page Frame Pin Ref Mod
0 32304 N Y Y
3 32167 N Y Y
7 32321 N Y Y
8 32320 N Y Y
5 32941 N Y Y
1 48357 N N Y
77 47897 N N Y

The -b flag shows the status of the reference and modified bits of all the displayed frames. After it is shown, the reference bit of the frame is reset. When used with the -i flag, it detects which frames are accessed between each interval.

Note: Use this flag with caution because of its performance impacts.

List of Top Memory Usage of Segments

The -S option is used to sort segments by memory usage and to display the memory-usage statistics for the top memory-usage segments. If count is not specified, then a count of 10 is implicit. The following command sorts system and non-system segments by the number of pages in real memory and prints out the top 10 segments of the resulting list.

# svmon -Sau

Segid Type Description Inuse Pin Pgspace Address Range
0 work kernel 1990 1408 3722 0..32767 : 32768..65535
1 work private, pid=4042 1553 1 1497 0..1907 : 65307..65535
1435 work private, pid=3006 1391 3 1800 0..4565 : 65309..65535
11f5 work private, pid=14248 1049 1 1081 0..1104 : 65307..65535
11f3 clnt 991 0 0 0..1716
681 clnt 960 0 0 0..1880
909 work shared library text 900 0 8 0..65535
101 work vmm data 497 496 1 0..27115 : 43464..65535
a0a work shared library data 247 0 718 0..65535
1bf9 work private, pid=21094 221 1 320 0..290 : 65277..65535

All output fields are described in the previous examples.

An example output from AIX 4.3.3 and later is similar to the following:

# svmon -Sut 10

Vsid Esid Type Description Inuse Pin Pgsp Virtual Addr Range
1966 - work 9985 4 31892 32234 0..32272 :
65309..65535
14c3 - work 5644 1 161 5993 0..6550 :
65293..65535
5453 - work 3437 1 2971 4187 0..4141 :
65303..65535
4411 - work 3165 0 1264 1315 0..65535
5a1e - work 2986 1 13 2994 0..3036 :
65295..65535
340d - work misc kernel tables 2643 0 993 2645 0..15038 :
63488..65535
380e - work kernel pinned heap 2183 1055 1416 2936 0..65535
0 - work kernel seg 2044 1455 1370 4170 0..32767 :
65475..65535
6afb - pers /dev/notes:92 1522 0 - - 0..10295
2faa - clnt 1189 0 - - 0..2324

Correlating svmon and vmstat Outputs

There are some relationships between the svmon and vmstat outputs. The svmon report of AIX 4.3.2 follows (the example is the same with AIX 4.3.3 and later, although the output format is different):

# svmon -G
m e m o r y i n u s e p i n p g s p a c e
size inuse free pin work pers clnt work pers clnt size inuse
16384 16254 130 2016 11198 2537 2519 2016 0 0 40960 13392

The vmstat command was run in a separate window while the svmon command was running. The vmstat report follows:

# vmstat 5
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
0 0 13392 130 0 0 0 0 2 0 125 140 36 2 1 97 0
0 0 13336 199 0 0 0 0 0 0 145 14028 38 11 22 67 0
0 0 13336 199 0 0 0 0 0 0 141 49 31 1 1 98 0
0 0 13336 199 0 0 0 0 0 0 142 49 32 1 1 98 0
0 0 13336 199 0 0 0 0 0 0 145 49 32 1 1 99 0
0 0 13336 199 0 0 0 0 0 0 163 49 33 1 1 92 6
0 0 13336 199 0 0 0 0 0 0 142 49 32 0 1 98 0

The global svmon report shows related numbers. The vmstat fre column relates to the svmon memory free column. The number that vmstat reports as Active Virtual Memory (avm) is reported by the svmon command as pg space inuse (13392).

The vmstat avm column provides the same figures as the pg space inuse column of the svmon command except starting with AIX 4.3.2 where Deferred Page Space Allocation is used. In that case, the svmon command shows the number of pages actually paged out to paging space whereas the vmstat command shows the number of virtual pages accessed but not necessarily paged out (see Looking at Paging Space and Virtual Memory).

Correlating svmon and ps Outputs

There are some relationships between the svmon and ps outputs. The svmon report of AIX 4.3.2 follows (the example is the same with AIX 4.3.3 and later, although the output format is different):

# svmon -P 7226

Pid Command Inuse Pin Pgspace
7226 telnetd 936 1 69

Pid: 7226
Command: telnetd

Segid Type Description Inuse Pin Pgspace Address Range
828 pers /dev/hd2:15333 0 0 0 0..0
1d3e work lib data 0 0 28 0..559
909 work shared library text 930 0 8 0..65535
1cbb work sreg[3] 0 0 1 0..0
1694 work private 6 1 32 0..24 : 65310..65535
12f6 pers code,/dev/hd2:69914 0 0 0 0..11

Compare with the ps report, which follows:

# ps v 7226
PID TTY STAT TIME PGIN SIZE RSS LIM TSIZ TRS %CPU %MEM COMMAND
7226 - A 0:00 51 240 24 32768 33 0 0.0 0.0 telnetd

SIZE refers to the virtual size in KB of the data section of the process (in paging space). This number is equal to the number of working segment pages of the process that have been touched (that is, the number of paging-space pages that have been allocated) times 4. It must be multiplied by 4 because pages are in 4 K units and SIZE is in 1 K units. If some working segment pages are currently paged out, this number is larger than the amount of real memory being used. The SIZE value (240) correlates with the Pgspace number from the svmon command for private (32) plus lib data (28) in 1 K units.

RSS refers to the real memory (resident set) size in KB of the process. This number is equal to the sum of the number of working segment and code segment pages in memory times 4. Remember that code segment pages are shared among all of the currently running instances of the program. If 26 ksh processes are running, only one copy of any given page of the ksh executable program would be in memory, but the ps command would report that code segment size as part of the RSS of each instance of the ksh program. The RSS value (24) correlates with the Inuse numbers from the svmon command for private (6) working-storage segments, for code (0) segments, and for lib data (0) of the process in 1-K units.

TRS refers to the size of the resident set (real memory) of text. This is the number of code segment pages times four. As was noted earlier, this number exaggerates memory use for programs of which multiple instances are running. This does not include the shared text of the process. The TRS value (0) correlates with the number of the svmon pages in the code segment (0) of the Inuse column in 1 K units. The TRS value can be higher than the TSIZ value because other pages, such as the XCOFF header and the loader section, may be included in the code segment.

The following calculations can be made for the values mentioned:

SIZE = 4 * Pgspace of (work lib data + work private)
RSS = 4 * Inuse of (work lib data + work private + pers code)
TRS = 4 * Inuse of (pers code)

Calculating the Minimum Memory Requirement of a Program

To calculate the minimum memory requirement of a program, the formula would be:

Total memory pages (4 KB units) = T + ( N * ( PD + LD ) ) + F

where:

T
= Number of pages for text (shared by all users)

N
= Number of copies of this program running simultaneously

PD
= Number of working segment pages in process private segment

LD
= Number of shared library data pages used by the process

F
= Number of file pages (shared by all users)

Multiply the result by 4 to obtain the number of kilobytes required. You may want to add in the kernel, kernel extension, and shared library text segment values to this as well even though they are shared by all processes on the system. For example, some applications like CATIA and databases use very large shared library modules. Note that because we have only used statistics from a single snapshot of the process, there is no guarantee that the value we get from the formula will be the correct value for the minimum working set size of a process. To get working set size, one would need to run a tool such as the rmss command or take many snapshots during the life of the process and determine the average values from these snapshots (see Assessing Memory Requirements Through the rmss Command).

If we estimate the minimum memory requirement for the program pacman, shown in Finding Memory-Leaking Programs, the formula would be:

T
= 2 (Inuse of code,/dev/lv01:12302 of pers)

PD
= 1632 (Inuse of private of work)

LD
= 12 (Inuse of lib data of work)

F
= 1 (Inuse of /dev/hd2:53289 of pers)

That is: 2 + (N * (1632 + 12)) + 1, equal to 1644 * N + 3 in 4 KB units.


 

 

One thing to note is that svmon attributes the UNIX file system cache to whichever processes have ever requested those file pages. The irony is that this file system cache is not under Oracle's control at all: it is neither PGA nor SGA; it is allocated by, and controlled exclusively by, the AIX operating system. Memory used for caching files is outside the scope of the Oracle memory usage we are discussing, because it is really managed by AIX and has nothing to do with the PGA/SGA. If our environment used nothing but raw devices (unlikely, of course), there would be no large file system cache to worry about. That does not mean this memory can be ignored or dismissed when we look at total memory usage, because the file system cache also consumes a large amount of physical memory and can trigger unnecessary paging. We can use "svmon -Pau 10" to see how this memory is used; AIX's well-known tuning tool, the virtual memory optimizer (formerly vmtune, now the vmo command), lets us adjust the file system cache thresholds such as maxperm, minperm and strict_maxperm (not covered in detail here). If you are interested, refer to the document quoted below:
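
Before the quoted vmtune document, here is a minimal sketch of the newer interface (assuming AIX 5.3 or later, where vmo replaces vmtune; the percentages shown are placeholders, not recommendations):

# vmo -a | grep -E "minperm%|maxperm%|strict_maxperm"   # show the current file-cache thresholds
# vmo -o minperm%=10 -o maxperm%=50                     # change them on the running system
# vmo -p -o strict_maxperm=0                            # -p also makes the change persist across reboots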

 

 

 

 

Tuning VMM Page Replacement with the vmtune Command

The memory management algorithm, discussed in Real-Memory Management, tries to keep the size of the free list and the percentage of real memory occupied by persistent segment pages within specified bounds. These bounds can be altered with the vmtune command, which can only be run by the root user. Changes made by this tool remain in effect until the next reboot of the system. To determine whether the vmtune command is installed and available, run the following command:

# lslpp -lI bos.adt.samples

Note: The vmtune command is in the samples directory because it is very VMM-implementation dependent. The vmtune code that accompanies each release of the operating system is tailored specifically to the VMM in that release. Running the vmtune command from one release on a different release might result in an operating-system failure. It is also possible that the functions of vmtune may change from release to release. Do not propagate shell scripts or /etc/inittab entries that include the vmtune command to a new release without checking the vmtune documentation for the new release to make sure that the scripts will still have the desired effect.

Executing the vmtune command on AIX 4.3.3 with no options results in the following output:

# /usr/samples/kernel/vmtune
vmtune:  current values:
-p       -P        -r          -R         -f       -F       -N        -W
minperm  maxperm  minpgahead maxpgahead  minfree  maxfree  pd_npages maxrandwrt
52190   208760       2          8        120      128     524288        0

-M      -w      -k      -c        -b         -B           -u        -l    -d
maxpin npswarn npskill numclust numfsbufs hd_pbuf_cnt lvm_bufcnt lrubucket defps

209581    4096    1024       1      93         96          9      131072     1

-s              -n         -S           -h
sync_release_ilock  nokillroot  v_pinshm  strict_maxperm
0               0           0             0

number of valid memory pages = 261976   maxperm=79.7% of real memory
maximum pinable=80.0% of real memory    minperm=19.9% of real memory
number of file memory pages = 19772     numperm=7.5% of real memory

The output shows the current settings for all the parameters.
Choosing minfree and maxfree Settings

The purpose of the free list is to keep track of real-memory page frames released by terminating processes and to supply page frames to requestors immediately, without forcing them to wait for page steals and the accompanying I/O to complete. The minfree limit specifies the free-list size below which page stealing to replenish the free list is to be started. The maxfree parameter is the size above which stealing will end.

The objectives in tuning these limits are to ensure that:

* Any activity that has critical response-time objectives can always get the page frames it needs from the free list.
* The system does not experience unnecessarily high levels of I/O because of premature stealing of pages to expand the free list.

The default values of minfree and maxfree depend on the memory size of the machine. The default value of maxfree is determined by this formula:

maxfree = minimum (# of memory pages/128, 128)

By default the minfree value is the value of maxfree - 8. However, the difference between minfree and maxfree should always be equal to or greater than maxpgahead. Or in other words, the value of maxfree should always be greater than or equal to minfree plus the size of maxpgahead. The minfree/maxfree values will be different if there is more than one memory pool. Memory pools were introduced in AIX 4.3.3 for MP systems with large amounts of RAM. Each memory pool will have its own minfree/maxfree which are determined by the previous formulas, but the minfree/maxfree values shown by the vmtune command will be the sum of the minfree/maxfree for all memory pools.

Remember, that minfree pages in some sense are wasted, because they are available, but not in use. If you have a short list of the programs you want to run fast, you can investigate their memory requirements with the svmon command (see Determining How Much Memory Is Being Used), and set minfree to the size of the largest. This technique risks being too conservative because not all of the pages that a process uses are acquired in one burst. At the same time, you might be missing dynamic demands that come from programs not on your list that may lower the average size of the free list when your critical programs run.

A less precise but more comprehensive tool for investigating an appropriate size for minfree is the vmstat command. The following is a portion of a vmstat command output obtained while running a C compilation on an otherwise idle system.

# vmstat 1
kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
0  0  3085   118   0   0   0   0    0   0 115    2  19  0  0 99  0
0  0  3086   117   0   0   0   0    0   0 119  134  24  1  3 96  0
2  0  3141    55   2   0   6  24   98   0 175  223  60  3  9 54 34
0  1  3254    57   0   0   6 176  814   0 205  219 110 22 14  0 64
0  1  3342    59   0   0  42 104  249   0 163  314  57 43 16  0 42
1  0  3411    78   0   0  49 104  169   0 176  306  51 30 15  0 55
1  0  3528   160   1   0  10 216  487   0 143  387  54 50 22  0 27
1  0  3627    94   0   0   0  72  160   0 148  292  79 57  9  0 34
1  0  3444   327   0   0   0  64  102   0 132  150  41 82  8  0 11
1  0  3505   251   0   0   0   0    0   0 128  189  50 79 11  0 11
1  0  3550   206   0   0   0   0    0   0 124  150  22 94  6  0  0
1  0  3576   180   0   0   0   0    0   0 121  145  30 96  4  0  0
0  1  3654   100   0   0   0   0    0   0 124  145  28 91  8  0  1
1  0  3586   208   0   0   0  40   68   0 123  139  24 91  9  0  0

Because the compiler has not been run recently, the code of the compiler itself must be read in. All told, the compiler acquires about 2 MB in about 6 seconds. On this 32 MB system, maxfree is 64 and minfree is 56. The compiler almost instantly drives the free list size below minfree, and several seconds of rapid page-stealing activity take place. Some of the steals require that dirty working segment pages be written to paging space, which shows up in the po column. If the steals cause the writing of dirty permanent segment pages, that I/O does not appear in the vmstat report (unless you have directed the vmstat command to report on the I/O activity of the physical volumes to which the permanent pages are being written).

This example describes a fork() and exec() environment (not an environment where a process is long lived, such as in a database) and is not intended to suggest that you set minfree to 500 to accommodate large compiles. It suggests how to use the vmstat command to identify situations in which the free list has to be replenished while a program is waiting for space. In this case, about 2 seconds were added to the compiler execution time because there were not enough page frames immediately available. If you observe the page frame consumption of your program, either during initialization or during normal processing, you will soon have an idea of the number of page frames that need to be in the free list to keep the program from waiting for memory.

If we concluded from the example above that minfree needed to be 128, and we had set maxpgahead to 16 to improve sequential performance, we would use the following vmtune command:

# /usr/samples/kernel/vmtune -f 128 -F 144

Tuning Memory Pools

In operating system versions later than AIX 4.3.3, the vmtune -m number_of_memory_pools command allows you to change the number of memory pools that are configured at system boot time. The -m flag is therefore not a dynamic change. The change is written to the kernel file if it is an MP kernel (the change is not allowed on a UP kernel). A value of 0 restores the default number of memory pools.

By default, the vmtune -m command writes to the file /usr/lib/boot/unix_mp, but this can be changed with the command vmtune -U path_to_unix_file. Before changing the kernel file, the vmtune command saves the original file as name_of_original_file.sav.
Tuning lrubucket to Reduce Memory Scanning Overhead

Tuning lrubucket can reduce scanning overhead on large memory systems. In AIX 4.3, a new parameter lrubucket was added. The page-replacement algorithm scans memory frames looking for a free frame. During this scan, reference bits of pages are reset, and if a free frame has not been found, a second scan is done. In the second scan, if the reference bit is still off, the frame will be used for a new page (page replacement).

On large memory systems, there may be too many frames to scan, so now memory is divided up into buckets of frames. The page-replacement algorithm will scan the frames in the bucket and then start over on that bucket for the second scan before moving on to the next bucket. The default number of frames in this bucket is 131072 or 512 MB of RAM. The number of frames is tunable with the command vmtune -l, and the value is in 4 K frames.
Choosing minperm and maxperm Settings

The operating system takes advantage of the varying requirements for real memory by leaving in memory pages of files that have been read or written. If the file pages are requested again before their page frames are reassigned, this technique saves an I/O operation. These file pages may be from local or remote (for example, NFS) file systems.

The ratio of page frames used for files versus those used for computational (working or program text) segments is loosely controlled by the minperm and maxperm values:

* If percentage of RAM occupied by file pages rises above maxperm, page-replacement steals only file pages.
* If percentage of RAM occupied by file pages falls below minperm, page-replacement steals both file and computational pages.
* If percentage of RAM occupied by file pages is between minperm and maxperm, page-replacement steals only file pages unless the number of file repages is higher than the number of computational repages.

In a particular workload, it might be worthwhile to emphasize the avoidance of file I/O. In another workload, keeping computational segment pages in memory might be more important. To understand what the ratio is in the untuned state, we use the vmtune command with no arguments.

# /usr/samples/kernel/vmtune
vmtune:  current values:
-p       -P        -r          -R         -f       -F       -N        -W
minperm  maxperm  minpgahead maxpgahead  minfree  maxfree  pd_npages maxrandwrt
52190   208760       2          8        120      128     524288        0

-M      -w      -k      -c        -b         -B           -u        -l    -d
maxpin npswarn npskill numclust numfsbufs hd_pbuf_cnt lvm_bufcnt lrubucket defps
209581    4096    1024       1      93         96          9      131072     1

-s              -n         -S           -h
sync_release_ilock  nokillroot  v_pinshm  strict_maxperm
0               0           0             0

number of valid memory pages = 261976   maxperm=79.7% of real memory
maximum pinable=80.0% of real memory    minperm=19.9% of real memory
number of file memory pages = 19772     numperm=7.5% of real memory

The default values are calculated by the following algorithm:

minperm (in pages) = ((number of memory frames) - 1024) * .2
maxperm (in pages) = ((number of memory frames) - 1024) * .8

The numperm value gives the number of file pages in memory, 19772. This is 7.5 percent of real memory.

If we know that our workload makes little use of recently read or written files, we may want to constrain the amount of memory used for that purpose. The following command:

# /usr/samples/kernel/vmtune -p 15 -P 50

sets minperm to 15 percent and maxperm to 50 percent of real memory. This would ensure that the VMM would steal page frames only from file pages when the ratio of file pages to total memory pages exceeded 50 percent. This should reduce the paging to page space with no detrimental effect on the persistent storage. The maxperm value is not a strict limit, it is only considered when the VMM needs to perform page replacement. Because of this, it is usually safe to reduce the maxperm value on most systems.

On the other hand, if our application frequently references a small set of existing files (especially if those files are in an NFS-mounted file system), we might want to allow more space for local caching of the file pages by using the following command:

# /usr/samples/kernel/vmtune -p 30 -P 90

NFS servers that are used mostly for reads with large amounts of RAM can benefit from increasing the value of maxperm. This allows more pages to reside in RAM so that NFS clients can access them without forcing the NFS server to retrieve the pages from disk again.

Another example would be a program that reads 1.5 GB of sequential file data into the working storage of a system with 2 GB of real memory. You may want to set maxperm to 50 percent or less, because you do not need to keep the file data in memory.
Placing a Hard Limit on Persistent File Cache with strict_maxperm

Starting with AIX 4.3.3, a new vmtune option (-h) called strict_maxperm has been added. This option, when set to 1, places a hard limit on how much memory is used for a persistent file cache by making the maxperm value be the upper limit for this file cache. When the upper limit is reached, the least recently used (LRU) is performed on persistent pages.

 

 

Another tool you can try is the "ps vg" command; generally speaking the "ps" command is present by default on every AIX release. Typing "ps v" followed by a process ID displays fairly detailed memory usage for that process. Note that there is no "-" before the "v". Below is a comparison of the "ps -lf" and "ps v" commands:

 

$ps -lfp  5029994
F S      UID     PID    PPID   C PRI NI ADDR    SZ    WCHAN    STIME    TTY  TIME CMD
240001 A  orauser 5029994       1   0  60 20 1d2e7b510 98000            Apr 15      - 190:34 ora_pmon_DEC

$ps v   5029994
PID    TTY STAT  TIME PGIN  SIZE   RSS   LIM  TSIZ   TRS %CPU %MEM COMMAND
5029994      - A    190:34    4  9152 144536    xx 88849 135384  0.0  0.0 ora_pm

 

 

The "ps v" command shows the RSS and TRS values we are interested in. RSS is what we call the resident set; it equals the number of working-segment pages * 4 plus the number of code-segment pages * 4, in kbytes, whereas TRS equals only the number of code-segment pages * 4 kbytes.
Note that on AIX a memory page is 4096 bytes, i.e. 4 K per page; this is why the RSS and TRS values above are multiplied by four. For example, if the code segment actually occupies 2 pages of memory (2 * 4096 bytes = 8 K), the TRS shown will be 8. Since RSS includes both the working segment and the code segment, RSS - TRS leaves only the working-segment memory, i.e. the private memory. Taking the example above, the memory used by the pmon background process is:

 

144536(RSS)-135384(TRS)=9152
9152*1024=9371648 bytes

So the private memory used by the pmon background process is 9152 K (9,371,648 bytes), not the 95 MB (98000 K) shown by the "ps -lf" command.


TRS, the memory used by the code segment, is roughly the same size as the $ORACLE_HOME/bin/oracle binary; every Oracle process (background or foreground) references this oracle binary. This code segment is in fact the same concept as the text segment in UNIX C programs.
If you really have the patience to compute the total memory used by the Oracle background processes, you can try estimating it with the following formula:


(P1.RSS-P1.TRS)+(P2.RSS-P2.TRS)+(P3.RSS-P3.TRS)+…+(Pn.RSS-Pn.TRS)+ TRS + SGA
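
A minimal one-liner sketch of that formula (it assumes the truncated COMMAND column of ps vg still begins with ora_, and that columns 7 and 10 are RSS and TRS as in the output above; add one TRS and the SGA size on top of the result):

$ ps vg | grep ora_ | grep -v grep | awk '{sum += $7 - $10} END {printf "%.0f MB private in background processes\n", sum/1024}'

The SGA figure to add on can be obtained with sysresv and ipcs -mb, as demonstrated in the comments below.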

 

Computing the private memory used by foreground processes is somewhat more complicated, because foreground processes use private memory far more actively and Oracle tries to reclaim part of it, so the figure fluctuates much more. You can run the "ps v" command several times (see the sketch below) to tell whether the memory usage of the foreground process you are watching is thrashing.
Well, on a black box like AIX it really is a bit hard to learn the details of Oracle's memory usage; if all else fails, we guess!
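
A minimal sampling loop for that (the PID 5029994 and the 10-second interval are placeholders; average RSS - TRS over the samples):

while true; do
  date
  ps v 5029994 | tail -1      # column 7 is RSS, column 10 is TRS; private memory is roughly RSS - TRS
  sleep 10
done >> ora_mem_samples.log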



Comments

13 responses to "Understanding Oracle Memory Usage on the AIX Platform"

  1. admin

    Use “svmon” on AIX to determine what comprises memory for a Oracle Process

    SCOPE & APPLICATION
    ——————-

    DBAs and UNIX admins looking for memory use for Oracle on AIX

    Monitoring Oracle Memory usage on AIX using svmon
    ————————————————-

    In order to diagnose and breakdown the components of Oracle process memory,
    AIX’s “svmon” is a good utility to use. “svmon” can be installed via the
    AIX OS cd’s and needs root access to use, e.g.

    svmon -P 20876

    ——————————————————————————-
    Pid Command [1] Inuse [2] Pin [3] Pgsp[4] Virtual 64-bit Mthrd
    20876 ora_pmon_V8 29598 1451 182 16560 N N

    Vsid Esid Type Description Inuse Pin Pgsp Virtual Addr Range
    [5] [6] [7] [8]
    1781 3 work shmat/mmap 11824 [9] 0 0 11824 0..24700
    1761 1 pers code,large file /dev 9681 [10] 0 - - 0..9680
    0 0 work kernel seg 3982 [11] 1450 182 3390 0..21804 :
    65474..65535
    18018 d work shared library text 2852 [12] 0 0 158 0..65535
    4764 2 work process private 1127 [15] 1 0 1127 [13] 0..1182 :
    65307..65535
    f74f f work shared library data 81 [16] 0 0 61 [14] 0..1291
    1e59e - pers large file /dev/lv00 33 0 - - 0..32
    e58e - pers large file /dev/lv00 16 0 - - 0..82
    b74b - pers large file /dev/lv00 1 0 - - 0..0
    3703 - pers large file /dev/lv00 1 0 - - 0..0

    Across the top you will see the summary (overall totals for the process):

    [1] “Command” indicates the command name,this case it is the ora_pmon_v817 process
    [2] “Inuse” column is the total number of pages in real memory from segments that
    are used by the process), in this example,a total of 29598 pages for the
    ora_pmon_V8 process (each page is normally 4 KB on AIX, to be confirmed by IBM).
    [3] “Pin” column is the total number of pages pinned from segments that are
    used by the process. Those pages are part of the “Inuse”, but are pinned in
    memory so that they can’t be swapped out.
    [4] “Pgspace” column (total number of paging-space pages that are used by the
    process) shows 6427 pages.

    The detailed section is a breakdown of each segment for the process that is
    shown in the summary section:

    [5] “Vsid” column is the virtual segment ID
    [6] “Esid” column is the effective segment ID.
    [7] “Type” is the type of segment (pers = persistent, work = working)
    [8] “Description” gives more details for the segment id. It normally includes
    volume name and the i-node of the file for persistent segments.

    To display the file name associated with the i-node:

    ncheck -i
    find -xdev -inum -print

    Points to keep in mind:
    -A segment can be used by multiple processes. Typically, the SGA [9] and the
    code segments ([10], [12]) are shared. Issuing the svmon command on different
    background processes gives the same shared segment ID used. The private parts
    segment ID should change.
    -Each page in real memory from a segment is included in the “Inuse” field.
    -The total for “Inuse” may exceed the total number of pages in real memory.
    -The sum of Inuse,Pgspace and Pin of all segments of a process is equal to the
    totals in the summary section.

    Comparing the “svmon” output with “ps v ” output
    —————————————————–

    The 'ps v' command gives less detailed memory information, e.g.

    ps v 20876
    PID TTY STAT TIME PGIN SIZE RSS LIM TSIZ TRS %CPU %MEM COMMAND
    20876 - A 0:00 0 4752 43556 32768 27289 38724 0.0 5.0 ora_pmon_V8
    [1] [2] [3] [4] [5]

    Looking at ‘svmon’ and ‘ps v ‘ output gives the following correspondence:

    [1] “SIZE” value equates with the Pgspace number from the svmon command
    for ‘process private’ [13] + shared library data [14] in 1 K units,
    i.e. 4752 = 4 * (1127 + 61)
    [2] “RSS” value equates with the Inuse numbers from the svmon command
    for private working-storage segments [15], for code segments [10],
    and for shared library data [16] of the process in 1-K units multiplied by 4.
    i.e. 43556 = 4 * ( 1127 + 9681 + 81)
    [3] “TSIZ” value is the size of text (shared-program) image. This is the size
    of the text section of the executable file. Pages of the text section of the
    executable program are only brought into memory when they are touched, that is,
    branched to or loaded from. This TSIZ value can also be seen by executing the
    ‘dump -ov’ command against an executable program.
    [4] “TRS” value equates with the Inuse numbers from the svmon command
    for code segments [10] process in 1-K units multiplied by 4.
    i.e. 38724 = 4 * 9681
    To get the private non-shared memory used, subtract TRS [4] from RSS [2] of
    the process, but remember the value will change if the process is active.
    e.g. 43556 - 38724 = 4832K private memory usage for pmon.
    [5] %MEM is calculated as the sum of the number of working segment and
    code segment pages in memory times 4 (that is, the RSS value), divided
    by the size of the real memory of the machine in KB, times 100, rounded
    to the nearest full percentage point.

    Remarks
    ——-

    1. Determining memory usage for shadow processes is more complicated
    since the amount of memory used can change greatly from one moment
    to the next depending on what the user is doing. For capacity planning
    run the “ps v” command repeatedly at regular intervals while the process
    is under peak load to get an average value. Take this value and multiply
    it by the number of expected users to estimate how much memory will
    be needed on the system.

    2. To detect memory leaks, use svmon and look at the segment of Type = work
    and Description = private. Check how many 4K (4096 byte) pages are used
    under the Pgspace column. This is the minimum number of working pages
    this segment is using in all of virtual memory. A Pgspace number that grows,
    but never decreases, may indicate a memory leak. Memory leaks occur when an
    application fails to deallocate memory.

  2. admin

    Addendum:
    ps vg|grep ora|head -6
    204948 - A 0:00 0 4208 52700 xx 88687 48492 0.0 0.0 oracles
    221262 - A 0:00 0 4212 52704 xx 88687 48492 0.0 0.0 oracles
    225534 - A 184:39 1 6068 54560 xx 88687 48492 6.7 0.0 oracles
    237780 - A 0:00 0 4208 52700 xx 88687 48492 0.0 0.0 oracles
    258260 - A 0:00 0 4956 52984 xx 88687 48492 0.0 0.0 oracles
    286936 - A 9:47 0 6692 52708 xx 88687 48492 0.0 0.0 oracles

    One problem with the ps vg command is that the argument it prints is truncated, so we cannot grep on the argument to process the output further (for example to group the processes by instance).
    We can use the following script to estimate the private memory used by the Oracle processes of all instances on this host:
    ps vg|grep ora|awk '{sum=sum+$7-$10} END {print sum/1024}'
    2432.89
    The private segments (RSS-TRS) of all Oracle processes on this server occupy a total of 2432 MB. The TRS here is 47 MB.
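    A possible workaround for the truncated arguments (a sketch only: SID1 is a placeholder instance name, and it assumes ps -ef still shows the full args) is to take the PIDs of one instance from ps -ef and feed them to ps v one at a time:
    ps -ef | grep -E "ora_.*_SID1|oracleSID1" | grep -v grep | awk '{print $2}' | while read pid; do
      ps v $pid | tail -1
    done | awk '{sum += $7 - $10} END {printf "%.0f MB private for SID1\n", sum/1024}'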

    sysresv

    IPC Resources for ORACLE_SID "shbas" :
    Shared Memory:
    ID KEY
    1048583 0x28b04b6c
    Oracle Instance alive for sid "shbas"
    ipcs -mb|grep 1048583
    m 1048583 0x28b04b6c --rw-r----- oracle dba 1610620928
    The SGA, i.e. the shared-segment memory, occupies 1536 MB, which matches the SGA_MAX_SIZE parameter setting.
    So our estimated total Oracle memory usage is: 2432 + 47 + 1536 = 4015 MB
    svmon -U oracle|head -40

    ===============================================================================
    User Inuse Pin Pgsp Virtual
    oracle 1084391 82284 209510 1131219

    PageSize Inuse Pin Pgsp Virtual
    s 4 KB 977975 156 209510 1024803
    m 64 KB 2555 1037 0 2555

    …………………………………………………………………….
    SYSTEM segments Inuse Pin Pgsp Virtual
    65647 65539 2115 67671

    PageSize Inuse Pin Pgsp Virtual
    s 4 KB 111 3 2115 2135
    m 64 KB 0 0 0 0

    Vsid Esid Type Description PSize Inuse Pin Pgsp Virtual
    0 0 work kernel segment (lgpg_vsid=0) L 16 16 0 16
    4001 9ffffffd work shared library s 80 0 2092 2097
    198066 9ffffffe work shared library s 23 0 17 27
    5ac56d – work s 8 3 6 11

    …………………………………………………………………….
    EXCLUSIVE segments Inuse Pin Pgsp Virtual
    1000521 13833 207071 1045120

    PageSize Inuse Pin Pgsp Virtual
    s 4 KB 977721 153 207071 1022320
    m 64 KB 1425 855 0 1425
    Using the svmon -U command we can see that the memory currently used exclusively by the oracle user is 1022320*4K + 1425*16K = 4015 MB, which matches the 4015 MB we estimated (a test with perfect numbers; your environment may show some discrepancy).

  3. admin

    Let us use svmon to take a closer look at the code segment of Oracle on AIX:
    ps -ef|grep mmnl
    oracle 590060 901148 0 15:10:49 pts/4 0:00 grep mmnl
    oracle 782530 1 0 Mar 23 - 177:56 ora_mmnl_shbas
    svmon -P 782530|grep "clnt text data BSS heap"
    3608dc 10 clnt text data BSS heap, s 32546 0 - -
    32546 pages, i.e. 32546*4k = 130184K
    ls -l `which oracle`
    -rwsr-s--x 1 oracle dba 133931095 Apr 02 2009 /ora10g/app/oracle/product/10.2.0/db10g/bin/oracle
    The oracle binary is 133931095 bytes, i.e. 130792K, corresponding to the "clnt text data BSS heap" segment above, which is what we call TRS.

    Let us look at the situation on another server:
    ps -ef|grep mmnl
    orauser 3276870 1 0 Apr 15 - 529:00 ora_mmnl_CRMDB11
    orauser 3412080 2310302 0 15:18:22 pts/7 0:00 grep -i mmnl
    orauser 4612126 1 1 Apr 15 - 435:01 ora_mmnl_CRMDB21
    orauser 2875856 1 0 Apr 15 - 284:21 ora_mmnl_PRMDB11
    svmon -P 2875856|grep "clnt text data BSS heap"
    620b8c 10 clnt text data BSS heap, s 33846 0 - -
    ls -l `which oracle`
    -rwsr-sr-x 1 orauser dba 138632049 Jan 22 2009 /oravl01/oracle/10.2.0.4/bin/oracle
    33846*4K = 135384K, roughly equal to 138632049 bytes

    We can see that the Oracle binaries on these two servers differ slightly; both are 10.2.0.4, but the latter has very likely had a PSU or one-off patch applied.
    Let us dig further with opatch.
    On host a:
    ./opatch lsinventory
    Invoking OPatch 10.2.0.4.3

    Oracle Interim Patch Installer version 10.2.0.4.3
    Copyright (c) 2007, Oracle Corporation. All rights reserved.

    Oracle Home : /ora10g/app/oracle/product/10.2.0/db10g
    Central Inventory : /ora10g/oraInventory
    from : /etc/oraInst.loc
    OPatch version : 10.2.0.4.3
    OUI version : 10.2.0.4.0
    OUI location : /ora10g/app/oracle/product/10.2.0/db10g/oui
    Log file location : /ora10g/app/oracle/product/10.2.0/db10g/cfgtoollogs/opatch/opatch2010-07-27_15-20-11PM.log

    Lsinventory Output file location : /ora10g/app/oracle/product/10.2.0/db10g/cfgtoollogs/opatch/lsinv/lsinventory2010-07-27_15-20-11PM.txt

    ——————————————————————————–
    Installed Top-level Products (2):

    Oracle Database 10g 10.2.0.1.0
    Oracle Database 10g Release 2 Patch Set 3 10.2.0.4.0
    There are 2 products installed in this Oracle Home.

    There are no Interim patches installed in this Oracle Home.

    ——————————————————————————–

    OPatch succeeded.

    SQL> select comments from registry$history;

    no rows selected

    On host b:
    ./opatch lsinventory
    Invoking OPatch 10.2.0.4.3

    Oracle Interim Patch Installer version 10.2.0.4.3
    Copyright (c) 2007, Oracle Corporation. All rights reserved.

    Oracle Home : /oravl01/oracle/10.2.0.4
    Central Inventory : /oravl01/oracle/oraInventory
    from : /etc/oraInst.loc
    OPatch version : 10.2.0.4.3
    OUI version : 10.2.0.4.0
    OUI location : /oravl01/oracle/10.2.0.4/oui
    Log file location : /oravl01/oracle/10.2.0.4/cfgtoollogs/opatch/opatch2010-07-27_15-23-24PM.log

    Lsinventory Output file location : /oravl01/oracle/10.2.0.4/cfgtoollogs/opatch/lsinv/lsinventory2010-07-27_15-23-24PM.txt

    ——————————————————————————–
    Installed Top-level Products (2):

    Oracle Database 10g 10.2.0.1.0
    Oracle Database 10g Release 2 Patch Set 3 10.2.0.4.0
    There are 2 products installed in this Oracle Home.

    Interim patches (42) :

    Patch 7612639 : applied on Wed Jan 21 21:11:03 GMT+08:00 2009
    Created on 9 Jan 2009, 01:59:24 hrs PST8PDT
    Bugs fixed:
    7612639

    Patch 7278117 : applied on Wed Jan 21 21:10:59 GMT+08:00 2009
    Created on 31 Jul 2008, 10:18:00 hrs UTC
    Bugs fixed:
    7155655, 6917874, 7278117

    Patch 7272297 : applied on Wed Jan 21 21:10:54 GMT+08:00 2009
    Created on 9 Jan 2009, 01:59:22 hrs PST8PDT
    Bugs fixed:
    7272297

    Patch 7189447 : applied on Wed Jan 21 21:10:50 GMT+08:00 2009
    Created on 9 Jan 2009, 01:59:21 hrs PST8PDT
    Bugs fixed:
    7189447

    Patch 7123643 : applied on Wed Jan 21 21:10:44 GMT+08:00 2009
    Created on 13 Aug 2008, 08:30:05 hrs PST8PDT,M3.2.0,M11.1.0
    Bugs fixed:
    7123643

    Patch 7008262 : applied on Wed Jan 21 21:10:29 GMT+08:00 2009
    Created on 9 Jan 2009, 01:59:18 hrs PST8PDT
    Bugs fixed:
    7008262

    Patch 6972843 : applied on Wed Jan 21 21:10:24 GMT+08:00 2009
    Created on 9 Jan 2009, 01:59:16 hrs PST8PDT
    Bugs fixed:
    6972843

    Patch 6678845 : applied on Wed Jan 21 21:10:20 GMT+08:00 2009
    Created on 9 Jan 2009, 01:59:11 hrs PST8PDT
    Bugs fixed:
    6678845

    Patch 6455659 : applied on Wed Jan 21 21:10:15 GMT+08:00 2009
    Created on 23 Oct 2008, 20:58:45 hrs PST8PDT
    Bugs fixed:
    6455659

    Patch 6327692 : applied on Wed Jan 21 21:10:10 GMT+08:00 2009
    Created on 31 Jul 2008, 08:57:01 hrs UTC
    Bugs fixed:
    6327692

    Patch 5868257 : applied on Wed Jan 21 21:10:04 GMT+08:00 2009
    Created on 2 Jul 2008, 06:17:33 hrs UTC
    Bugs fixed:
    5868257

    Patch 7609058 : applied on Wed Jan 21 20:43:02 GMT+08:00 2009
    Created on 12 Dec 2008, 02:33:31 hrs PST8PDT
    Bugs fixed:
    7609058

    Patch 7609057 : applied on Wed Jan 21 20:42:58 GMT+08:00 2009
    Created on 12 Dec 2008, 02:32:11 hrs PST8PDT
    Bugs fixed:
    7609057

    Patch 7592346 : applied on Wed Jan 21 20:42:53 GMT+08:00 2009
    Created on 12 Dec 2008, 02:31:21 hrs PST8PDT
    Bugs fixed:
    7375644, 7150470, 7592346

    Patch 7375617 : applied on Wed Jan 21 20:42:49 GMT+08:00 2009
    Created on 16 Sep 2008, 02:00:43 hrs PST8PDT
    Bugs fixed:
    7375617

    Patch 7375613 : applied on Wed Jan 21 20:42:46 GMT+08:00 2009
    Created on 16 Sep 2008, 02:00:41 hrs PST8PDT
    Bugs fixed:
    7375613

    Patch 7375611 : applied on Wed Jan 21 20:42:42 GMT+08:00 2009
    Created on 16 Sep 2008, 02:00:38 hrs PST8PDT
    Bugs fixed:
    7375611

    Patch 7197583 : applied on Wed Jan 21 20:42:37 GMT+08:00 2009
    Created on 10 Jul 2008, 03:17:32 hrs PST8PDT
    Bugs fixed:
    7197583

    Patch 7155254 : applied on Wed Jan 21 20:42:30 GMT+08:00 2009
    Created on 10 Jul 2008, 03:17:30 hrs PST8PDT
    Bugs fixed:
    7155254

    Patch 7155253 : applied on Wed Jan 21 20:41:33 GMT+08:00 2009
    Created on 10 Jul 2008, 03:17:27 hrs PST8PDT
    Bugs fixed:
    7155253

    Patch 7155252 : applied on Wed Jan 21 20:40:46 GMT+08:00 2009
    Created on 10 Jul 2008, 03:17:25 hrs PST8PDT
    Bugs fixed:
    7155252

    Patch 7155251 : applied on Wed Jan 21 20:40:42 GMT+08:00 2009
    Created on 10 Jul 2008, 03:17:24 hrs PST8PDT
    Bugs fixed:
    7155251

    Patch 7155250 : applied on Wed Jan 21 20:40:37 GMT+08:00 2009
    Created on 10 Jul 2008, 03:17:22 hrs PST8PDT
    Bugs fixed:
    7155250

    Patch 7155249 : applied on Wed Jan 21 20:40:32 GMT+08:00 2009
    Created on 10 Jul 2008, 03:17:01 hrs PST8PDT
    Bugs fixed:
    7155249

    Patch 7155248 : applied on Wed Jan 21 20:40:27 GMT+08:00 2009
    Created on 10 Jul 2008, 03:06:50 hrs PST8PDT
    Bugs fixed:
    7155248

    Patch 7573282 : applied on Wed Jan 21 20:26:11 GMT+08:00 2009
    Created on 21 Nov 2008, 00:35:48 hrs PST8PDT
    Bugs fixed:
    7573282

    Patch 7552082 : applied on Wed Jan 21 20:26:07 GMT+08:00 2009
    Created on 19 Nov 2008, 00:20:43 hrs PST8PDT
    Bugs fixed:
    6658484, 6352003, 6193945, 7552082

    Patch 7552067 : applied on Wed Jan 21 20:26:01 GMT+08:00 2009
    Created on 19 Nov 2008, 00:04:13 hrs PST8PDT
    Bugs fixed:
    5147386, 6799205, 7552067

    Patch 7552042 : applied on Wed Jan 21 20:25:57 GMT+08:00 2009
    Created on 19 Nov 2008, 00:20:20 hrs PST8PDT
    Bugs fixed:
    4637902, 7027551, 6500033, 7552042, 6219529

    Patch 7378735 : applied on Wed Jan 21 20:25:52 GMT+08:00 2009
    Created on 18 Sep 2008, 02:03:45 hrs PST8PDT
    Bugs fixed:
    7378735, 5863926, 6014513

    Patch 7378661 : applied on Wed Jan 21 20:25:47 GMT+08:00 2009
    Created on 17 Sep 2008, 10:30:08 hrs UTC
    Bugs fixed:
    7378661, 7038750, 6145177

    Patch 7196894 : applied on Wed Jan 21 20:25:41 GMT+08:00 2009
    Created on 30 Oct 2008, 22:10:38 hrs PST8PDT
    Bugs fixed:
    7196894

    Patch 6378112 : applied on Wed Jan 21 20:25:37 GMT+08:00 2009
    Created on 19 Nov 2008, 00:11:21 hrs PST8PDT
    Bugs fixed:
    6378112

    Patch 6200820 : applied on Wed Jan 21 20:25:33 GMT+08:00 2009
    Created on 19 Nov 2008, 00:12:17 hrs PST8PDT
    Bugs fixed:
    6200820

    Patch 6163771 : applied on Wed Jan 21 20:25:29 GMT+08:00 2009
    Created on 17 Sep 2008, 18:33:13 hrs PST8PDT
    Bugs fixed:
    6163771

    Patch 6052226 : applied on Wed Jan 21 20:25:23 GMT+08:00 2009
    Created on 17 Sep 2008, 22:42:48 hrs PST8PDT
    Bugs fixed:
    6052226

    Patch 4693355 : applied on Wed Jan 21 20:25:17 GMT+08:00 2009
    Created on 6 Aug 2008, 10:18:36 hrs UTC
    Bugs fixed:
    4693355

    Patch 7493592 : applied on Wed Jan 21 20:09:53 GMT+08:00 2009
    Created on 13 Nov 2008, 06:39:37 hrs PST8PDT,M3.2.0,M11.1.0
    Bugs fixed:
    7493592

    Patch 6827260 : applied on Wed Jan 21 19:11:52 GMT+08:00 2009
    Created on 11 Sep 2008, 10:06:53 hrs UTC
    Bugs fixed:
    6827260

    Patch 6824129 : applied on Wed Jan 21 19:10:11 GMT+08:00 2009
    Created on 3 Nov 2008, 02:23:21 hrs PST8PDT
    Bugs fixed:
    6824129

    Patch 6725855 : applied on Wed Jan 21 19:07:56 GMT+08:00 2009
    Created on 15 Sep 2008, 09:45:24 hrs PST8PDT,M3.2.0,M11.1.0
    Bugs fixed:
    6725855

    Patch 6346115 : applied on Wed Jan 21 19:05:49 GMT+08:00 2009
    Created on 19 Aug 2008, 03:41:09 hrs PST8PDT,M3.2.0,M11.1.0
    Bugs fixed:
    6346115

    Rac system comprising of multiple nodes
    Local node = p595crm1
    Remote node = p595crm2

    ——————————————————————————–

    OPatch succeeded.

    SQL> select comments from registry$history;

    COMMENTS
    ——————————
    CPUJan2009
    CPUJan2009
    view recompilation

    By querying registry$history we can learn which PSU/CPU patches have been applied (this is another story); we can see that host b, whose oracle binary is larger, has the January 2009 CPU patch applied.

  4. admin

    The following is an unresolved puzzle from MOS: on AIX, every server process started appeared to take away 100MB of physical memory, and note that it is real memory that shrinks:
    "The newest Oracle connection process 1556496 is again using 24831 4KB pages, or 101707776 bytes (~97MB), of real system memory, not shared memory."
    Oracle GCS was left more or less speechless by this case!
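    Before reading the bug text, note that the per-process figure the support engineer keeps coming back to is the RSS - TRS rule from Note 123754.1 (quoted below); a minimal sketch, with a hypothetical PID:

    # In AIX "ps v" output RSS is field 7 and TRS (the shared oracle text) is field 10,
    # so RSS - TRS approximates the per-process, non-shared-text memory in KB.
    ps v 602350 | awk 'NR == 2 { print "approx. private memory (KB):", $7 - $10 }'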

    Hdr: 8712291 10.2.0.4 RDBMS 10.2.0.4 MEMORY MGMT PRODID-5 PORTID-212
    Abstract: EXCESSIVE MEMORY USAGE ON AIX

    *** 07/21/09 12:29 am ***
    TAR:
    —-
    7569325.994

    PROBLEM:
    ——–
    Customer claims that each sqlplus connection established is consuming around
    100M of memory.

    He claims so by looking at RSS column of ps command.

    For example

    root@db2 /=>ps v 602350
    PID TTY STAT TIME PGIN SIZE RSS LIM TSIZ TRS %CPU %MEM COMMAND
    602350 – A 0:00 62 7516 102840 xx 88695 95324 0.0 1.0 oracleA

    Explained that RSS includes both private and shared memory of the process.

    Process 602350 memory consumption would be RSS-TRS as per

    Article-ID: Note 123754.1
    Title: AIX: Determining Oracle memory usage on AIX

    DIAGNOSTIC ANALYSIS:
    ——————–

    WORKAROUND:
    ———–

    RELATED BUGS:
    ————-

    REPRODUCIBILITY:
    —————-

    TEST CASE:
    ———-

    STACK TRACE:
    ————

    SUPPORTING INFORMATION:
    ———————–

    24 HOUR CONTACT INFORMATION FOR P1 BUGS:
    —————————————-

    DIAL-IN INFORMATION:
    ——————–

    IMPACT DATE:
    ————

    *** 07/21/09 12:30 am ***
    Also have svmon -P and svmon -G output, but need assistance to prove to the
    customer that the process memory is not actually 100M; this is also confirmed
    by the PGA query output.

    Please help with:

    1) How to explain the breakdown of the process into private and shared memory
    using the svmon -P output below

    root@db2 /=>svmon -P 602350

    ——————————————————————————

    Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
    602350 oracle 886012 8384 5695 891586 Y N N

    PageSize Inuse Pin Pgsp Virtual
    s 4 KB 34220 8336 5695 39794
    m 64 KB 53237 3 0 53237
    L 16 MB 0 0 0 0

    Vsid Esid Type Description PSize Inuse Pin Pgsp Virtual
    d2b5a7 7000000a work default shmat/mmap m 4096 0 0 4096
    f925f0 70000008 work default shmat/mmap m 4096 0 0 4096
    e035c2 70000006 work default shmat/mmap m 4096 0 0 4096
    ce299e 70000007 work default shmat/mmap m 4096 0 0 4096
    c88d93 70000005 work default shmat/mmap m 4096 0 0 4096
    253648 70000003 work default shmat/mmap m 4096 0 0 4096
    298651 70000004 work default shmat/mmap m 4096 0 0 4096
    71b6e1 70000009 work default shmat/mmap m 4096 0 0 4096
    aac17 7000000d work default shmat/mmap m 3920 0 0 3920
    24284a 7000000e work default shmat/mmap m 3920 0 0 3920
    d2afa7 7000000c work default shmat/mmap m 3882 0 0 3882
    4f249c 7000000f work default shmat/mmap m 3600 0 0 3600
    cc279a 7000000b work default shmat/mmap m 3360 0 0 3360
    d82fb2 10 work text data BSS heap s 23831 0 0 23831
    2c2e5a 70000002 work default shmat/mmap m 1188 0 0 1188
    0 0 work kernel segment s 8476 8336 3858 12293
    b7256c 70000001 work default shmat/mmap m 391 0 0 391
    5870b0 90000000 work shared library text m 203 0 0 203
    d9abb1 11 work text data BSS heap s 1370 0 0 1370
    3d2a78 8001000a work private load data s 147 0 0 147
    8eb71f 80020014 work USLA heap s 105 0 0 105
    21ae41 9001000a work shared library data s 93 0 0 93
    87a90d f00000002 work process private m 5 3 0 5
    3c827b 90020014 work shared library s 59 0 45 69
    6936d0 ffffffff work application stack s 48 0 0 48
    8001 9ffffffd work shared library s 38 0 1788 1791
    b22566 8fffffff work private load data s 36 0 0 36
    408283 9fffffff pers USLA text,/dev/hd2:125054 s 10 0 – –
    430086 9ffffffe work shared library s 7 0 4 11
    bb2774 fffffff5 work application stack s 0 0 0 0
    4e2e9e fffffff1 work application stack s 0 0 0 0
    1caa3b fffffffe work application stack s 0 0 0 0
    812500 fffffff3 work application stack s 0 0 0 0
    7e36fe fffffff6 work application stack s 0 0 0 0
    f3b5e5 fffffffa work application stack s 0 0 0 0
    cb2794 – clnt /dev/u06_lv:4158 s 0 0 – –
    fbbbf5 fffffff8 work application stack s 0 0 0 0
    9b0534 – clnt /dev/u01_lv:30156 s 0 0 – –
    c6a78f fffffff7 work application stack s 0 0 0 0
    232644 fffffff2 work application stack s 0 0 0 0
    f42fea fffffffd work application stack s 0 0 0 0
    682ad2 fffffff4 work application stack s 0 0 0 0
    8e251e fffffff9 work application stack s 0 0 0 0
    34ba6b fffffffb work application stack s 0 0 0 0
    8a3916 fffffff0 work application stack s 0 0 0 0
    47b68d fffffffc work application stack s 0 0 0 0
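    One hedged way to answer question 1 from this output is to bucket the work segments by description: the "default shmat/mmap" segments are the attached SGA, while the remaining work segments (heap, stacks, private load data, libraries) are essentially per-process. A rough awk sketch (PSize "s" = 4KB pages, "m" = 64KB pages; an approximation, not an official formula):

    svmon -P 602350 | awk '
      $3 == "work" {
        psize = $(NF-4); inuse = $(NF-3)
        kb = inuse * (psize == "m" ? 64 : 4)
        if ($0 ~ /shmat\/mmap/) shared += kb; else other += kb
      }
      END { printf "attached shared memory ~ %d KB, other work segments ~ %d KB\n", shared, other }'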

    2) Customer's system admin is trying to prove that 100M of memory is indeed
    used, by looking at the number of free pages decreasing as soon as an sqlplus
    session is established.

    See test results provided by customer below and help me understand whether
    the process actually consumes 100m of memory
    ******************************************************************************
    ************
    Oracle Tech Support, we're seeing new connections affecting the system memory
    pool in a way that makes our 16GB system run out of memory. This effect is
    specifically measurable as memory getting removed from the system free
    memory pool.

    Below are specific details of system free memory before and after local
    sqlplus connections are made. As each connection process is created, we see
    the system free memory go down. This is irrespective of the process memory,
    which also shows 100MB in use.

    Before SQLPLUS any sqlplus connections.

    svmon -G
    size inuse free pin virtual
    memory 4194304 3082436 1111868 1670199 2060390
    pg space 8388608 101593

    work pers clnt
    pin 621623 0 0
    in use 2013476 4894 15490

    PageSize PoolSize inuse pgsp pin virtual
    s 4 KB – 1050132 101481 542615 1076646
    m 64 KB – 61483 7 4938 61484
    L 16 MB 256 0 0 256 0

    Before opening any connection, system free memory pages (4k) is 1111868 or
    4554211328 bytes

    SQLPLUS connection #1 (Just a sqlplus connection, no sql was executed)

    ps xauwww | grep -v grep | grep -E '(LOCAL|RSS)'
    USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
    oracle 2711706 0.0 1.0 94608 101240 – A 12:32:44 0:00 oracleAWA029L
    (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

    svmon -G
    size inuse free pin virtual
    memory 4194304 3112970 1081334 1670283 2090904
    pg space 8388608 101593

    work pers clnt
    pin 621707 0 0
    in use 2043990 4894 15510

    PageSize PoolSize inuse pgsp pin virtual
    s 4 KB – 1080506 101481 542603 1107000
    m 64 KB – 61493 7 4944 61494
    L 16 MB 256 0 0 256 0

    After the first sqlplus connection, PS is showing the RSS usage as 101240 K
    (1024 per K).
    The system free memory is 4429144064, a drop of 125067264 bytes (~119MB)

    The Oracle connection process 2711706 is using 24831 4KB pages, or 101707776
    bytes (~97MB), of real system memory, not shared memory.
    #svmon -P 2711706 | grep -E '(Description|BSS)'
    Vsid Esid Type Description PSize Inuse Pin Pgsp Virtual
    a02f42 10 work text data BSS heap s 23831 0 0 23831
    252848 11 work text data BSS heap s 1000 0 0 1000

    SQLPLUS connection #2 (again no sql was executed, in either sql connection)

    ps xauwww | grep -v grep | grep -E '(LOCAL|RSS)'
    USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
    oracle 1556496 0.0 1.0 94608 101240 – A 12:37:43 0:00 oracleAWA029L
    (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
    oracle 2711706 0.0 1.0 94608 101240 – A 12:32:44 0:00 oracleAWA029L
    (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

    svmon -G
    size inuse free pin virtual
    memory 4194304 3143485 1050819 1672298 2121388
    pg space 8388608 101593

    work pers clnt
    pin 623722 0 0
    in use 2074474 4894 15541

    PageSize PoolSize inuse pgsp pin virtual
    s 4 KB – 1110861 101481 544522 1137324
    m 64 KB – 61503 7 4950 61504
    L 16 MB 256 0 0 256 0

    After the second sqlplus connection, PS is showing the RSS usage as 101240 K
    (1024 per K) on both Oracle processes.

    The system free memory is 4304154624, a drop of 124989440 bytes (~119MB)

    The newest Oracle connection process 1556496 is again using 24831 4KB pages,
    or 101707776 bytes (~97MB), of real system memory, not shared memory.

    svmon -P 1556496 | grep -E '(Description|BSS)'
    Vsid Esid Type Description PSize Inuse Pin Pgsp Virtual
    56a4af 10 work text data BSS heap s 23831 0 0 23831
    f42fea 11 work text data BSS heap s 1000 0 0 1000
    ******************************************************************************

  5. admin

    Memory Consumption on AIX [ID 259983.1]

    Symptom
    ~~~~~~~

    If you are running Oracle on AIX and are experiencing high memory usage and
    swapping, there are two things that can help you reduce the
    consumption.

    1) Setting environment variables
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Oracle always recommends using system thread scope for running Oracle
    (please check note 259779.1).

    Additionally, on AIX 5.2 or newer versions of AIX this will also result in
    significantly less memory being allocated per process.

    On AIX 4.3 and AIX 5.1 system thread scope is also recommended, but on
    these systems it will have a less significant impact on memory usage per
    process. On AIX 4.3 and AIX 5.1 you can additionally set the environment
    variable NUM_SPAREVP to 1 for Oracle processes. This will significantly
    reduce the memory allocated per Oracle process.

    Use the following commands to set these environment variables prior to
    starting the Oracle instance and the listener:

    Add the following command into the script you use to set the Oracle
    environment variables (for example where ORACLE_HOME and ORACLE_SID
    are set).

    * Bourne, Bash, or Korn shell:
    Script is typically ~/.profile or /usr/local/bin/oraenv
    $ AIXTHREAD_SCOPE=S; export AIXTHREAD_SCOPE
    On AIX 4.3 and AIX 5.1 also add the following:
    $ NUM_SPAREVP=1; export NUM_SPAREVP

    * C shell:
    Script is typically ~/.login or /usr/local/bin/coraenv
    $ setenv AIXTHREAD_SCOPE S
    On AIX 4.3 and AIX 5.1 also add the following:
    $ setenv NUM_SPAREVP 1
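    To verify that the variables really are present in the environment of a running Oracle process (and not just in your shell), the BSD-style "e" flag of the AIX ps command can be used; a sketch, with a hypothetical PID:

    # print the process environment and pick out the two variables
    ps ewww 602350 | tr ' ' '\n' | grep -E 'AIXTHREAD_SCOPE|NUM_SPAREVP'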

    2) applying APAR IY49415 and relinking Oracle
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    This alternative requires more work, and every time an Oracle patch is applied
    this procedure has to be repeated.
    This solution is available only for AIX 5.2 due to new functionality on
    that OS version.

    Cause
    ~~~~~

    Current versions of AIX (AIX versions prior to AIX 5.2 + APAR IY49415) and the C
    for AIX Compiler (versions prior to VAC 6 with APAR IY50551) have a restriction
    that CONST data which contains pointers has to be loaded into a RW section (.data)
    of memory. The size of the .data section used by the oracle binary can be determined
    using the AIX command: `size -f $ORACLE_HOME/bin/oracle`.

    Solution
    ~~~~~~~~

    Long term solution: Future enhancements to AIX and the C for AIX Compiler will
    eliminate this restriction, allowing CONST data containing pointers to be
    located in the RO section (.text). Future Oracle versions compiled with this
    enhancement will use less total memory for the Oracle processes because the
    .text memory section is shared between multiple processes.

    Short term solution: To reduce the memory requirement for the current Oracle
    releases which were built without the compiler enhancement IBM has provided
    APAR IY49415. APAR IY49415 allows the existing 9iR2 oracle executable to be
    relinked with the CONST data structures containing pointers moved from the
    .data to the .text section, reducing the RW (.data) memory requirement of each
    Oracle process by approximately 1MB. APAR IY49415 is available from the IBM AIX fix
    distribution website. The relinking package is available via ARU 5336110 (bug 3028673).
    ARU 5336110 contains a relinking script and a list of Oracle symbols which the
    relinking script will move from the .data section to the .text section.

    After you install APAR IY49415 and reboot the system you should follow these
    steps to relink the oracle executable:

    You can download the relinking script from:
    http://updates.oracle.com/ARULink/PatchDetails/process_form?aru=5336110

    Relink the oracle binary
    ~~~~~~~~~~~~~~~~~~~~~~~~

    1) save your current version of $ORACLE_HOME/oracle
    2) create a working directory $ORACLE_HOME/relink
    3) cd to $ORACLE_HOME/relink
    4) unzip the relinking package
    5) link $ORACLE_HOME/bin/oracle to ./oracle0
    6) run the script ./genscript to generate some required files and scripts
    7) run ./relink.sh to generate the new oracle binary oracle0.new.$$
    8) copy oracle0.new.$$ to $ORACLE_HOME/bin/oracle and verify that the
    permissions match the original oracle binary (a command-level sketch of these steps follows below).
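    The eight steps above might look roughly like this as commands; this is only a sketch, and the zip file name is hypothetical (use whatever name the ARU 5336110 download actually has):

    cp $ORACLE_HOME/bin/oracle $ORACLE_HOME/bin/oracle.pre_relink   # 1) save the current binary
    mkdir $ORACLE_HOME/relink && cd $ORACLE_HOME/relink             # 2) and 3) working directory
    unzip /tmp/aru5336110_relink.zip                                # 4) relinking package (hypothetical name)
    ln -s $ORACLE_HOME/bin/oracle ./oracle0                         # 5) link the binary to ./oracle0
    ./genscript                                                     # 6) generate the required files and scripts
    ./relink.sh                                                     # 7) produces oracle0.new.<pid>
    cp oracle0.new.* $ORACLE_HOME/bin/oracle                        # 8) install, then compare permissions
    ls -l $ORACLE_HOME/bin/oracle $ORACLE_HOME/bin/oracle.pre_relink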

    Note that although IY50551 is addressing the compiler issue for AIX 5.2,
    Oracle will take advantage of this only in future releases. Currently,
    production releases will have to follow this procedure to circumvent this issue,
    and do not require APAR IY50551.

    References
    ~~~~~~~~~~

    bug 3028673

  6. admin

    Type: B - Defect
    Fixed in Product Version: -
    Severity: 2 - Severe Loss of Service
    Product Version: 9.2.0.7
    Status: 96 - Closed, Duplicate Bug
    Platform: 212 - IBM AIX on POWER Systems (64-bit)
    Created: 03-Mar-2006
    Platform Version: 5.2
    Updated: 28-May-2006
    Base Bug: 3662963
    Database Version: 9.2.0.7
    Affects Platforms: Port-Specific
    Product Source: Oracle

    Related Products: Line: Oracle Database Products / Family: Oracle Database / Area: Oracle Database / Product: 5 - Oracle Server - Enterprise Edition

    Hdr: 5076085 9.2.0.7 RDBMS 9.2.0.7 MEMORY MGMT PRODID-5 PORTID-212 3662963
    Abstract: PRIVATE MEMORY INCREASE WITH ~12% AFTER APPLYING 9207 AS COMPARED TO 9206

    *** 03/03/06 12:59 am ***
    TAR:
    —-
    5142624.993

    PROBLEM:
    ——–
    Customer reported that after applying 9207, the memory usage on the machine
    increased. This showed up as an increase in swap usage. There are
    ~4000 connected users, and the customer has a major concern that this may
    eventually destabilize the system.

    DIAGNOSTIC ANALYSIS:
    ——————–
    I have asked him to install the 9206 database and present comparative values.
    Indeed, from the uploaded files, it seems there is a 12% memory increase
    in private space from 9206 to 9207.

    oracle: 9.2.0.6: DB402> ps vg 1384556
    PID TTY STAT TIME PGIN SIZE RSS LIM TSIZ TRS %CPU %MEM
    COMMAND
    1384556 – A 0:00 0 2656 61048 32768 47178 58392 0.0 2.0
    oracleD

    Oracle9i Enterprise Edition Release 9.2.0.7.0 – 64bit Production

    oracle: 9.2.0.7: DB002_1> ps vg 3162288
    PID TTY STAT TIME PGIN SIZE RSS LIM TSIZ TRS %CPU %MEM
    COMMAND
    3162288 – A 0:00 0 3040 31572 32768 47604 28532 0.0 0.0
    oracle

    WORKAROUND:
    ———–
    none

  7. admin

    Type: B - Defect
    Fixed in Product Version: 9.2.0.99
    Severity: 2 - Severe Loss of Service
    Product Version: 9.2.0.5
    Status: 80 - Development to Q/A
    Platform: 212 - IBM AIX on POWER Systems (64-bit)
    Created: 01-Jun-2004
    Platform Version: 5.2 ML003
    Updated: 22-May-2006
    Base Bug: -
    Database Version: 9.2.0.5
    Affects Platforms: Generic
    Product Source: Oracle

    Related Products: Line: Oracle Database Products / Family: Oracle Database / Area: Net Services / Product: 115 - Oracle Net Services

    Hdr: 3662963 9.2.0.5 NET 9.2.0.5 PRODID-115 PORTID-212
    Abstract: MEMORY REQD FOR SHADOW PROCESS USING DBLINK UP FROM 300K (9203) TO 5MB (9205)

    *** 06/01/04 02:09 am ***
    TAR:
    —-
    3809068.996

    PROBLEM:
    ——–
    CT applied the 9205 on top of 9203 and is seeing high virtual memory demand
    (increased page ins and high scan rate as well as increased paging space) when
    using database links.

    DIAGNOSTIC ANALYSIS:
    ——————–
    PGA and UGA allocation and usage do not seem to be the issue. PGA dumps and
    a query on v$sesstat do not show any PGA leak.

    Used the SVMON tool to trace the memory allocation differences in 9203 and
    9205 before and after using database links

    In 9203
    ——-

    Before using the link

    207e8c 11 work text data BSS heap – 868 0 0 868

    After using the link

    207e8c 11 work text data BSS heap – 954 0 0 954

    In 9205
    ——-

    Before using the link

    56bf57 11 work text data BSS heap – 879 0 0 879

    After using the link

    56bf57 11 work text data BSS heap – 2129 0 0 2129

    Found out that the memory growth is happening in the BSS heap
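    A simple way to reproduce this kind of measurement is to sample the shadow process's BSS heap segments around the db link usage; a minimal sketch, with a hypothetical PID:

    # run before and after the db link is used; growth shows up in the Inuse/Virtual columns
    pid=602350            # hypothetical shadow process PID
    while true; do
      date
      svmon -P $pid | grep "BSS heap"
      sleep 60
    done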

    WORKAROUND:
    ———–
    NA

    RELATED BUGS:
    ————-
    Found similar memory growth reported for PMON process in bug: 3559518 ‘AFTER
    PATCHING AIX 5.2 OS TO LEVEL 5200-02 GET MEMORY LEAK IN PMON’

    REPRODUCIBILITY:
    —————-
    Reproducible at CT’s site

    TEST CASE:
    ———-
    NA

    STACK TRACE:
    ————
    NA

    SUPPORTING INFORMATION:
    ———————–
    Supporting files is put in ess30

    24 HOUR CONTACT INFORMATION FOR P1 BUGS:
    —————————————-
    NA

    DIAL-IN INFORMATION:
    ——————–
    NA

    IMPACT DATE:
    ————

  8. admin

    Type: B - Defect
    Fixed in Product Version: 10.2.0.3
    Severity: 1 - Complete Loss of Service
    Product Version: 10.2.0.3
    Status: 32 - Not a Bug. To Filer
    Platform: 212 - IBM AIX on POWER Systems (64-bit)
    Created: 26-Mar-2009
    Platform Version: NO DATA
    Updated: 30-Mar-2009
    Base Bug: -
    Database Version: 10.2.0.3
    Affects Platforms: Generic
    Product Source: Oracle

    Related Products: Line: Oracle Database Products / Family: Oracle Database / Area: Oracle Database / Product: 5 - Oracle Server - Enterprise Edition

    Hdr: 8372026 10.2.0.3 RDBMS 10.2.0.3 PERFORMANCE PRODID-5 PORTID-212 ORA-27301
    Abstract: DATABASE IS EXHAUSTING MEMORY ON THE SERVER.

    *** 03/26/09 10:25 am ***
    TAR:
    —-

    PROBLEM:
    ——–
    The DB (CRMP) is gradually eating the memory available on the server and then
    eating paging space, bringing down the server once users start connecting
    through the application.

    When memory is exhausted, seeing the following in the alert log:
    ORA-27300: OS system dependent operation:fork failed with status: 12
    ORA-27301: OS failure message: Not enough space
    ORA-27302: failure occurred at: skgpspawn3.

    ORA-7445: exception encountered: core dump [] [] [] [] [] []
    Business Impact
    Once application connects, all DBs on this server are very slow.

    DIAGNOSTIC ANALYSIS:
    ——————–

    WORKAROUND:
    ———–

    RELATED BUGS:
    ————-

    REPRODUCIBILITY:
    —————-

    TEST CASE:
    ———-

    STACK TRACE:
    ————

    SUPPORTING INFORMATION:
    ———————–

    24 HOUR CONTACT INFORMATION FOR P1 BUGS:
    —————————————-

    DIAL-IN INFORMATION:
    ——————–

    IMPACT DATE:
    ————

    *** 03/26/09 02:27 pm ***
    PROBLEM:
    —————–

    Customer has five databases on this non-RAC system.
    Once the databases and the applications are started and users connect,
    they see that memory gets used up and the system starts to swap.
    This memory consumption, paging, and swapping continue to grow until the
    system is unusable.

    Once the applications and databases are stopped, the swap space is released
    and everything returns to normal.

    Topas and NMON show that Oracle processes have the highest memory usage.

    No ORA-4030 errors are reported; only the following have been seen this past
    week:

    When memory is exhausted, seeing the following in the alert log:
    ORA-27300: OS system dependent operation:fork failed with status: 12
    ORA-27301: OS failure message: Not enough space
    ORA-27302: failure occurred at: skgpspawn3.

    ORA-7445: exception encountered: core dump [] [] [] [] [] []
    Business Impact
    Once application connects, all DBs on this server are very slow.

    DIAGNOSTIC ANALYSIS:
    ——————–

    OSWatcher output collected.
    Output for the processes consuming high memory, captured at regular intervals
    using the following command:
    ps -A -o user,pid,pcpu,pmem,vsz,time,args | sort +4 >
    /tmp/memory_usage.lst
    Errorstacks for a couple of sessions consuming high memory

    WORKAROUND:
    ———–
    Customer is working around the issue temporarily by bouncing the database
    when swapping and memory usage increase.

    RELATED BUGS:
    ————-

    REPRODUCIBILITY:
    —————-

    TEST CASE:
    ———-

    STACK TRACE:
    ————

    SUPPORTING INFORMATION:
    ———————–

    24 HOUR CONTACT INFORMATION FOR P1 BUGS:
    —————————————-

    DIAL-IN INFORMATION:
    ——————–

    IMPACT DATE:
    ————

    From the process listing, top memory users:
    ————————————————————-
    Output from: ps -A -o user,pid,pcpu,pmem,vsz,time,args | sort +4 >
    /tmp/memory_usage.lst


    oracle 2621834 0.0 0.0 11116 00:00:11 oracleCRMP (LOCAL=NO)
    oracle 1507622 0.0 0.0 11120 00:00:09 oracleCRMP (LOCAL=NO)
    oracle 3592304 0.0 0.0 11136 00:00:10 oracleCRMP (LOCAL=NO)
    oracle 1929648 0.1 0.0 12508 00:00:06 ora_m000_RGWP
    oracle 3264766 0.0 0.0 12656 00:00:43 ora_dbw0_CRMP
    oracle 2220470 0.0 0.0 13120 00:00:05 oracleCRMP (LOCAL=NO)
    oracle 1372558 0.0 0.0 13212 00:03:33
    /oracle/product/10.2.0/db_1/bin/tnslsnr LISTENER
    -inherit
    oracle 1925360 0.3 0.0 13488 00:14:41 oracleCRMP (LOCAL=NO)
    oracle 3559604 0.2 0.0 16644 00:07:47 oracleCRMP (LOCAL=NO)
    oracle 2666984 0.0 0.0 17940 00:00:20 ora_lgwr_PDMP
    oracle 1790198 0.0 0.0 18552 00:00:19 ora_lgwr_AMTP
    oracle 2060450 0.0 0.0 20652 00:01:07 ora_ckpt_RGWP
    oracle 1835488 0.0 0.0 21020 00:00:07 ora_arc3_CRMP
    oracle 2392222 0.0 0.0 21028 00:00:07 ora_arc1_CRMP
    oracle 2662738 0.0 0.0 21028 00:02:29 ora_lgwr_CRMP
    oracle 1552846 0.0 0.0 21036 00:00:03 ora_arc0_PDMP
    oracle 1769498 0.0 0.0 21036 00:00:06 ora_arc1_AMTP
    oracle 1761484 0.0 0.0 21036 00:00:08 ora_arc1_CRNAP
    oracle 1192400 0.0 0.0 21036 00:00:13 ora_lgwr_CRNAP
    oracle 2052342 0.0 0.0 21148 00:01:10 ora_arc9_RGWP
    oracle 2769092 0.0 0.0 22428 00:00:00
    /oracle/product/10.2.0/oem_1/agent10g/perl/bin/perl

    /oracle/product/10.2.0/oem_1/agent10g/sysman/admin/scripts/db/dg/dgObserverSta
    tus.pl
    oracle 4165634 0.0 0.0 22428 00:00:00
    /oracle/product/10.2.0/oem_1/agent10g/perl/bin/perl

    /oracle/product/10.2.0/oem_1/agent10g/sysman/admin/scripts/db/dg/dgObserverSta
    tus.pl
    oracle 1966348 0.0 0.0 22436 00:00:00
    /oracle/product/10.2.0/oem_1/agent10g/perl/bin/perl

    /oracle/product/10.2.0/oem_1/agent10g/sysman/admin/scripts/db/dg/dgDatabases.p
    l
    oracle 1933522 0.0 0.0 23008 00:02:59 ora_dbw0_RGWP
    oracle 1974342 0.0 0.0 32680 00:00:24 ora_arc0_AMTP
    oracle 1216988 0.0 0.0 32680 00:00:28 ora_arc0_CRNAP
    oracle 2646108 0.0 0.0 32760 00:00:21 ora_arc0_CRMP
    oracle 2654280 0.0 0.0 32760 00:00:23 ora_arc2_CRMP
    oracle 2384296 0.0 0.0 32780 00:01:24 ora_arc3_RGWP
    oracle 3100862 0.0 0.0 32780 00:01:27 ora_arc5_RGWP
    oracle 2048338 0.0 0.0 32780 00:01:29 ora_arc7_RGWP
    oracle 1708448 0.0 0.0 32780 00:01:38 ora_arc1_RGWP
    oracle 1315286 0.0 0.0 32780 00:02:08 ora_arc8_RGWP
    oracle 1831222 0.0 0.0 32780 00:02:20 ora_arc6_RGWP
    oracle 2081132 0.0 0.0 32780 00:02:21 ora_arc4_RGWP
    oracle 1081626 0.0 0.0 32780 00:02:53 ora_arc2_RGWP
    oracle 2543950 0.0 0.0 32784 00:00:16 ora_arc1_PDMP
    oracle 2080964 0.0 0.0 32792 00:02:58 ora_arc0_RGWP
    oracle 2654552 0.1 0.0 37520 00:04:47 ora_lgwr_RGWP
    oracle 2781192 0.0 0.0 53400 00:00:59 oracleCRMP (LOCAL=NO)
    root 2253164 0.1 0.0 70260 00:03:18 ../jre/bin/java
    -Djava.library.path=../lib -jar
    ../lib/cxws-aix.jar
    root 1110490 0.7 0.0 85772 01:17:28 java -Xms32M -Xmx256M
    -DATEVAL_CHAN=jni
    -showversion -DZIP=1 -DARCH=aix4-r1
    -DLCF_DATDIR=/banktools/tivoli/lcf/dat/1
    -DLCF_BINDIR=/banktools/tivoli/lcf/bin/aix4-r1/mrt
    -DLCF_EPLABEL=unxaix_va2pamt010_srv00
    -DTRACE_LEVEL=0 -DTRACE_SIZE=5000000 -DREBOOT=true
    -DIP_ADDRESS=171.160.166.70
    -DINTERP=aix4-r1

    -Ddb2j.stream.error.file=/banktools/tivoli/lcf/dat/1/LCFNEW/AMW/logs/db2j.log
    -Ddb2j.stream.error.logSeverityLevel=50000 -Ddb2j.infolog.append=false
    -Ddb2j.language.logStatementText= -Ddb2j.locks.waitTimeout=360
    -Ddb2j.drda.host=171.160.166.70
    com.tivoli.dmunix.ep.agent.Main
    oracle 1425706 4.4 0.0 96708 08:11:44
    /oracle/product/10.2.0/oem_1/agent10g/bin/emagent

    *** 03/26/09 02:42 pm ***
    *** 03/26/09 02:43 pm *** (CHG: Sta->16)
    *** 03/26/09 02:47 pm ***
    *** 03/26/09 02:51 pm *** (CHG: Asg->NEW OWNER OWNER)
    *** 03/26/09 02:51 pm ***
    *** 03/26/09 04:01 pm ***
    *** 03/26/09 08:01 pm ***
    *** 03/26/09 08:11 pm ***
    *** 03/26/09 09:08 pm ***
    *** 03/26/09 09:13 pm ***
    *** 03/26/09 09:18 pm *** (CHG: Sta->10)
    *** 03/27/09 01:38 am ***
    *** 03/27/09 09:06 am ***
    DB Tuning Doc for AIX
    *** 03/27/09 09:11 am ***
    *** 03/27/09 12:35 pm ***
    DB Tuning Doc for AIX
    *** 03/30/09 11:33 am ***
    The customer confirmed on the 1pm ET status call that databases and
    applications are not performing well since the file system mount changes were
    performed (switching mount options for Oracle software file system from CIO
    to RW).
    There is no further investigation required on this. Thank you for your help.
    *** 03/30/09 11:34 am *** (CHG: Sta->16)
    *** 03/30/09 11:35 am ***
    Correction: that databases and applications ARE performing well.
    *** 03/30/09 12:02 pm *** (CHG: Fixed->10.2.0.3)
    *** 03/30/09 12:02 pm *** (CHG: Sta->32 SubComp->PERFORMANCE)
    *** 03/30/09 12:02 pm ***

  9. maclean

    Oracle's memory footprint is SGA + PGA + a single copy of the shared text (TRS).
    ——————————————————————————————-
    Test environment:
    AIX 5.3L 64bit
    ORACLE 10.2.0.3 enterprise edition
    before and after patch 6110331

    To keep the memory used by Oracle resident in physical memory, and to have it fully allocated at instance startup, set the following two parameters:
    alter system set lock_sga=TRUE scope=spfile;
    alter system set pre_page_sga=true scope=spfile;
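    After restarting the instance, both settings can be confirmed from sqlplus with the standard show parameter commands (shown only for completeness):

    SQL> show parameter lock_sga
    SQL> show parameter pre_page_sga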

    Two tables were created in advance and loaded into memory.
    SEGMENT_NAME BYTES/1024/1024
    ——————– —————
    TEST 104
    TEST1 103
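    (The sizes above were presumably taken from dba_segments; an equivalent query would look something like the following sketch.)

    SQL> select segment_name, bytes/1024/1024
      2  from dba_segments
      3  where segment_name in ('TEST','TEST1');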

    Part 1: without patch 6110331 (SGA=1536M)
    With the database shut down:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 470174 2801085 0 0 0 0 0 0 17 25 74 0 0 99 0
    0 0 470174 2801085 0 0 0 0 0 0 25 8 80 0 0 99 0
    0 0 470174 2801085 0 0 0 0 0 0 26 5 75 0 0 99 0
    0 0 470174 2801085 0 0 0 0 0 0 37 30 88 0 0 99 0
    After the database has finished starting:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 887371 2383875 0 0 0 0 0 0 16 148 114 0 0 99 0
    0 0 887371 2383875 0 0 0 0 0 0 24 30 83 0 0 99 0
    0 0 887371 2383875 0 0 0 0 0 0 28 24 86 0 0 99 0 <-- free memory down by about 1630M
    Output of ps vg|grep ora:
    $ ps vg|grep ora
    401408 – A 0:00 0 19320 98688 32768 81647 79368 0.0 1.0 ora_lgwr
    405674 – A 0:00 0 7288 86656 32768 81647 79368 0.0 1.0 oraclete
    421950 – A 0:00 0 3860 83228 32768 81647 79368 0.0 1.0 ora_mmnl
    430318 – A 0:00 0 5796 85164 32768 81647 79368 0.0 1.0 ora_cjq0
    442526 – A 0:00 0 4572 83940 32768 81647 79368 0.0 1.0 ora_smon
    458962 – A 0:00 0 3712 83080 32768 81647 79368 0.0 1.0 ora_psp0
    471082 – A 0:00 0 7856 87224 32768 81647 79368 0.0 1.0 ora_dbw0
    524542 – A 0:00 0 4556 83924 32768 81647 79368 0.0 1.0 ora_ckpt
    127476 – A 0:00 0 3784 83152 32768 81647 79368 0.0 1.0 ora_q001
    172340 – A 0:01 0 8636 88004 32768 81647 79368 0.1 1.0 ora_pmon
    373014 pts/2 A 0:00 0 192 228 32768 25 36 0.0 0.0 grep ora
    405832 – A 0:00 0 4004 83372 32768 81647 79368 0.0 1.0 ora_qmnc
    414034 – A 0:00 0 7356 86724 32768 81647 79368 0.0 1.0 ora_mmon
    422376 – A 0:00 0 4572 83940 32768 81647 79368 0.0 1.0 ora_q000
    426444 – A 0:00 0 5636 85004 32768 81647 79368 0.0 1.0 ora_reco
    442862 – A 0:00 0 3720 83088 32768 81647 79368 0.0 1.0 ora_mman
    After logging in 10 sqlplus sessions:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 910916 2360127 0 0 0 0 0 0 17 102 99 0 0 99 0
    0 0 910916 2360127 0 0 0 0 0 0 31 86 101 0 0 99 0
    0 0 910916 2360126 0 0 0 0 0 0 28 53 89 0 0 99 0 <-- free memory down by about 92M
    After using these 10 sessions to run concurrent full table scans against table test...
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 910962 2360064 0 0 0 0 0 0 21 38 98 0 0 99 0
    0 0 910961 2360065 0 0 0 0 0 0 18 96 97 0 0 99 0
    0 0 910961 2360065 0 0 0 0 0 0 31 48 95 0 0 99 0 <-- free memory down by about 0.23M
    After running concurrent full table scans against the same table again with these 10 sessions...
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 910965 2360231 0 0 0 0 0 0 22 42 85 1 0 99 0
    0 0 910965 2360231 0 0 0 0 0 0 21 44 94 0 0 99 0
    0 0 910964 2360232 0 0 0 0 0 0 29 28 92 0 0 99 0 <-- free memory up by about 0.65M
    After releasing the 10 sqlplus sessions:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 889309 2381844 0 0 0 0 0 0 17 42 89 0 0 99 0
    0 0 889309 2381844 0 0 0 0 0 0 22 44 89 0 0 99 0
    0 0 889309 2381844 0 0 0 0 0 0 27 11 92 0 0 99 0 <-- free memory up by about 84M
    Log in 10 sqlplus sessions again:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    0 0 908327 2362816 0 0 0 0 0 0 25 11 89 0 0 99 0
    0 0 908327 2362816 0 0 0 0 0 0 17 35 87 0 0 99 0
    0 0 908325 2362818 0 0 0 0 0 0 21 52 95 0 0 99 0 <-- free memory down by about 74M
    Release the 10 sqlplus sessions again:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    0 0 889351 2381788 0 0 0 0 0 0 21 123 96 0 0 99 0
    0 0 889351 2381788 0 0 0 0 0 0 27 31 95 0 0 99 0
    0 0 889351 2381788 0 0 0 0 0 0 29 38 87 0 0 99 0 <-- free memory up by about 74M

    Part 2: with patch 6110331 applied (SGA=1536M)
    With the database shut down:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 473336 2821102 0 0 0 0 0 0 23 68 80 0 0 99 0
    0 0 473336 2821102 0 0 0 0 0 0 25 30 78 0 0 99 0
    0 0 473336 2821102 0 0 0 0 0 0 22 8 76 0 0 99 0
    After the database has finished starting:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 891613 2402799 0 0 0 0 0 0 23 50 89 0 0 99 0
    0 0 891613 2402799 0 0 0 0 0 0 28 45 89 0 0 99 0
    0 0 891613 2402799 0 0 0 0 0 0 21 28 93 0 0 99 0
    0 0 891613 2402799 0 0 0 0 0 0 17 34 83 0 0 99 0 <-- free memory down by about 1634M
    Output of ps vg|grep ora:
    $ ps vg|grep ora
    401438 – A 0:01 0 8624 128564 32768 81648 119940 0.2 2.0 ora_pmon
    405684 – A 0:00 0 3772 123712 32768 81648 119940 0.0 2.0 ora_q001
    421952 – A 0:00 0 5788 125728 32768 81648 119940 0.0 2.0 ora_cjq0
    430322 – A 0:00 2 7168 127108 32768 81648 119940 0.0 2.0 oraclete
    458974 – A 0:00 0 7836 127776 32768 81648 119940 0.0 2.0 ora_dbw0
    471088 – A 0:00 0 19312 139252 32768 81648 119940 0.0 2.0 ora_lgwr
    475186 – A 0:00 0 3708 123648 32768 81648 119940 0.0 2.0 ora_mman
    524348 – A 0:00 0 4028 123968 32768 81648 119940 0.0 2.0 ora_q000
    127478 – A 0:00 0 3848 123788 32768 81648 119940 0.0 2.0 ora_mmnl
    135524 pts/2 A 0:00 0 192 228 32768 25 36 0.0 0.0 grep ora
    221640 – A 0:00 0 3932 123872 32768 81648 119940 0.0 2.0 ora_qmnc
    405834 – A 0:00 0 3700 123640 32768 81648 119940 0.0 2.0 ora_reco
    414036 – A 0:00 0 4540 124480 32768 81648 119940 0.0 2.0 ora_ckpt
    422378 – A 0:00 2 4572 124512 32768 81648 119940 0.0 2.0 ora_smon
    426454 – A 0:00 0 3700 123640 32768 81648 119940 0.0 2.0 ora_psp0
    442646 – A 0:00 1 7376 127316 32768 81648 119940 0.0 2.0 ora_mmon
    After logging in 10 sqlplus sessions:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 908711 2385692 0 0 0 0 0 0 20 67 119 0 0 99 0
    0 0 908711 2385692 0 0 0 0 0 0 21 47 82 0 0 99 0
    0 0 908711 2385692 0 0 0 0 0 0 29 19 91 0 0 99 0
    0 0 908711 2385692 0 0 0 0 0 0 26 114 95 0 0 99 0 <-- free memory down by about 67M
    After using these 10 sessions to run concurrent full table scans against table test1...
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 910608 2385621 0 0 0 0 0 0 26 134 121 2 0 98 0
    0 0 910609 2385620 0 0 0 0 0 0 28 18 88 0 0 99 0
    0 0 910617 2385622 0 0 0 0 0 0 26 603 96 0 0 99 0 <-- free memory down by about 0.27M
    After running concurrent full table scans against the same table again with these 10 sessions...
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 910640 2383782 0 0 0 0 0 0 38 84 103 1 0 98 0
    0 0 910639 2383784 0 0 0 0 0 0 26 50 83 0 0 99 0
    0 0 910639 2383784 0 0 0 0 0 0 21 43 99 0 0 99 0 <-- free memory down by about 0.11M
    After releasing the 10 sqlplus sessions:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    r b avm fre re pi po fr sr cy in sy cs us sy id wa
    0 0 888917 2407304 0 0 0 0 0 0 24 216 102 0 0 99 0
    0 0 888917 2407304 0 0 0 0 0 0 22 62 92 0 0 99 0
    0 0 888917 2407304 0 0 0 0 0 0 25 27 92 0 0 99 0 <-- free memory up by about 85M
    Log in 10 sqlplus sessions again:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    0 0 905971 2390241 0 0 0 0 0 0 25 2048 123 1 0 98 0
    0 0 905971 2390241 0 0 0 0 0 0 21 48 88 0 0 99 0
    0 0 905971 2390241 0 0 0 0 0 0 21 30 85 0 0 99 0 <-- free memory down by about 67M
    Release the 10 sqlplus sessions again:
    kthr memory page faults cpu
    —– ———– ———————— ———— ———–
    0 0 888911 2407302 0 0 0 0 0 0 30 1009 99 0 0 99 0
    0 0 888911 2407302 0 0 0 0 0 0 25 39 82 0 0 99 0
    0 0 888911 2407302 0 0 0 0 0 0 25 33 93 0 0 99 0 <-- free memory up by about 67M

  10. maclean

    After the 570 host was rebooted, free memory was observed to drop: before the reboot free memory stayed at roughly 20-23GB, while after the reboot it stays at around 10GB.
    Problem analysis:
    - IBM support collected the relevant memory usage data for analysis, as follows:
    Current memory usage:
    size inuse free pin virtual
    memory 16252912 13748263 2504649 3880766 10063565
    pg space 2097152 37131
    From this, the system currently has about 10GB of free memory.
    Itemized breakdown produced with the perfpmr tool:
    ====================================================|==========|===========
    Memory Overview | Pages | Megabytes
    —————————————————-|———-|———–
    Total memory in system | 16252912 | 63487.93
    Total memory in use | 13748263 | 53704.15
    Free memory | 2504649 | 9783.78
    ====================================================|==========|===========
    Memory accounting summary | 4K Pages | Megabytes
    —————————————————-|———-|———–
    Total memory in system | 16252912 | 63487.93
    Total memory in use | 13748263 | 53704.15
    Kernel identified memory (segids,wlm_hw_pages) | 5355659 | 20920.54
    Kernel un-identified memory | 7248 | 28.31
    Fork tree pages | 3466 | 13.53
    Large Page Pool free pages | 0 | 0.00
    Huge Page Pool free pages | 0 | 0.00
    User private memory | 495150 | 1934.17
    User shared memory | 4128254 | 16125.99
    User shared library text memory | 99984 | 390.56
    Text/Executable code memory in use | 59869 | 233.86
    Text/Executable code memory not in use | 164772 | 643.64
    File memory | 3474105 | 13570.72
    User un-identifed memory | -40244 | -157.20
    ———————- | |
    Total accounted in-use | 13748263 | 53704.15
    Free memory | 2504649 | 9783.78
    Interpreting the above: the kernel uses about 21GB, user private memory is about 2GB, user shared memory about 16GB, file memory about 14GB, shared library text plus executable code about 1GB, and free memory is roughly 10GB.
    From AIX's point of view, memory falls into two classes. Working storage holds process data, stacks, kernel data and shared memory; if working-storage pages have to be paged out, they go to paging space. Persistent storage is mainly the in-memory cache of file data; when persistent-storage pages are paged out, they are written back to the files they belong to.
    Persistent-storage pages are further divided into client pages and non-client pages: non-client pages only cache data of Journaled File System (JFS) files, while client pages cache data of all other file system types, such as NFS.
    On top of this classification, because pages serve different purposes, the AIX memory manager also classifies pages as computational or non-computational from the paging point of view, in order to improve system efficiency.
    All working-storage pages are computational, while persistent-storage pages depend on the file they cache: when a file is first opened and cached it defaults to non-computational, but once any process executes that file as code, all of the file's pages are marked computational.
    So all pages of a given file belong to a single type, and that type does not change until the system is rebooted.
    After the system has been running for a long time, a fair share of files have been opened as executable code by some process, i.e. marked computational, and those pages are released when no longer in use; file cache marked non-computational, however, is not paged in or out until memory is fully used. Now that the system has been rebooted, most files are again being used as non-computational file cache; we can see the file cache currently holds roughly 14GB, which should be the reason the free memory observed after the reboot is lower than before. This is normal behavior, not a system fault.
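    If you want to watch this yourself, the split between computational memory and file cache can be observed with vmstat -v (the numperm/numclient percentages and the "file pages"/"client pages" counters) or with the pers/clnt columns of svmon -G; a minimal sketch:

    # how much real memory is currently non-computational file cache
    vmstat -v | grep -Ei 'perm|client|file pages'
    # the same picture, in 4KB pages, from the global svmon report
    svmon -G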

  11. admin

    === ODM Action Plan ===
    1. Please, see above comments.

    2. For getting actual memory usage, reference Note 174555.1 .

    < Note 174555.1-UNIX: Determining the Size of an Oracle Process >
    < Note 166491.1-Diagnosing Oracle Memory on AIX using SVMON >

    On the same system, get below results before applying the patch. And obtain below results again after applying the patch.

    1) Identify the problem session

    2) Run the following to demonstrate the memory growth:

    select to_char(sysdate,'dd-mon-yyyy hh24:mi:ss') from dual;
    select value, n.name||'('||s.statistic#||')'
    from v$sesstat s, v$statname n
    where s.statistic# = n.statistic#
    and n.name like '%ga memory%'
    and sid= ;

    1. I searched for information; we can reference the material below.
    As you can see, the 'rss' value does not mean the real-memory size of the process.

    In summary, ps is not a very good tool to measure process memory usage,
    because RSS includes the text portion of shared libraries.

    http://www.faqs.org/faqs/aix-faq/part2/section-15.html
    ==========================================
    Using “ps vg” gives a per process tally of memory usage for each running
    process. Several fields give memory usage in different units, but these
    numbers do not tell the whole story on where all the memory goes.

    First of all, the man page for ps does not give an accurate description
    of the memory related fields. Here is a better description:

    RSS – This tells how much RAM resident memory is currently being used
    for the text and data segments for a particular process in units of
    kilobytes. (this value will always be a multiple of 4 since memory is
    allocated in 4 KB pages).

    2. Exactly; the Oracle patch does not change any OS parameter.

    Today I did a test on our test system (Oracle 10.2.0.3, AIX 5.3). I applied Oracle patch 4430244 and found that the RSS column of 'ps vg' for Oracle processes did not get higher. When I applied Oracle patch 6110331, the RSS of the Oracle processes did get higher. Before applying the patch I made a backup of the product directory; I rolled back the patch, restored the product directory, and started the database, and the RSS value is still high.

    I wonder:
    1. Does the 'rss' value on AIX mean the real-memory size of the process? That is, is it real memory used on the system, so that when it gets higher the process is using more memory?
    2. Why is the memory still higher after I restored the product directory? Does that mean the Oracle patch changed an OS parameter?
    3. Please reply quickly.

    1. According to the "opatch lsinventory" output, you applied patch 6110331 on July 16.
    You mean this problem started immediately after applying patch 6110331, is that right?

    [ From patched lists ]

    Patch 6110331 : applied on Fri Jul 16 01:33:45 GMT+08:00 2010
    Created on 2 Jul 2007, 23:57:24 hrs PST8PDT

    2. Could you please upload the alert log without omitting anything?

    3. Did you check which SQL or application consumes over 100M of memory?
    Is the same SQL really consuming more compared with before applying that patch?

    4. Are there two instances running? If so, did you apply that patch after shutting down both instances?

    135326 – A 0:00 0 4456 124528 xx 87167 120072 0.0 0.0 oraclePS
    139366 – A 0:02 0 7168 127240 xx 87167 120072 0.0 0.0 oracleMS

    5. Could you please upload the diag trace files?

    oracle@/oracle/product/10.2.0/OPatch>ps -efo "vsz,pid,user,cpu,thcount,comm" | sort -n
    VSZ PID USER CP THCNT COMMAND

    46808 6169004 oracle 0 1 oracle

    oracle@/oracle/product/10.2.0/OPatch>ps -ef|grep 6169004
    oracle 6463736 6140230 0 10:18:40 pts/13 0:00 grep 6169004
    oracle 6169004 1 0 Aug 04 – 3:06 ora_dbw0_PREPAID

    oracle@/oracle/product/10.2.0/OPatch>ps vg 6169004
    PID TTY STAT TIME PGIN SIZE RSS LIM TSIZ TRS %CPU %MEM COMMAND
    6169004 – A 3:06 0 46808 175344 32768 87166 128536 0.0 1.0 ora_dbw

    oracle@/oracle/product/10.2.0/OPatch>export ORACLE_SID=PREPAID

    oracle@/oracle/product/10.2.0/OPatch>sqlplus / as sysdba
    SQL> select sid,program from v$session;

    882 oracle@prepaid (DBW0)

    SQL> select value, n.name||'('||s.statistic#||')'
    2 from v$sesstat s, v$statname n
    3 where s.statistic# = n.statistic#
    4 and n.name like '%ga memory%'
    5 and sid= 882;

    VALUE
    ———-
    N.NAME||'(‘||S.STATISTIC#||’)’
    ——————————————————————————–
    158544
    session uga memory(20)

    158544
    session uga memory max(21)

    35764232
    session pga memory(25)

    VALUE
    ———-
    N.NAME||'(‘||S.STATISTIC#||’)’
    ——————————————————————————–
    36091912
    session pga memory max(26)

    For instance, the actual dbw0 process’s memory usage is 36091912 bytes.

    session pga memory(25)
    36091912

    The RSS size is 175344 kbytes which includes the text and data segments
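    For reference, applying the RSS - TRS rule from Note 123754.1 (cited earlier) to the ps vg output above:

    RSS - TRS = 175344 KB - 128536 KB = 46808 KB (about 46 MB, which also matches the SIZE column of ps vg)

    That is the same order of magnitude as the ~36 MB of 'session pga memory', and nowhere near the raw 175344 KB RSS figure.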

    I don't know exactly why the RSS value was still higher after you restored the backup of the Oracle product in your test system.
    I think it might be related to system libraries.
    Would it perhaps work fine if we relink the oracle binary?

    1. Shutdown all instances, all listeners and em agents etc.

    2. Relink like below. And check the error while relinking.

    As Oracle User,
    cd $ORACLE_HOME/bin
    $relink all

    As far as I know, you already have a backup of the Oracle software, so you do not need to back it up again.

    Please, relink the Oracle Binary.

    As Oracle User,

    cd $ORACLE_HOME/bin
    $relink all

    And then check the time of the Oracle Executables.

    ls -al

    Reference
    =========
    < Note 131321.1- How to Relink Oracle Database Software on UNIX >

  12. maclean

    ps vg|grep ora|awk '{sum=sum+$7-$10} END {print sum/1024}'

    svmon -G
    svmon -U oracle
    svmon -U grid

    svmon -Put 10    # list the top 10 processes by memory usage
    svmon -Sut 10    # list the top 10 memory segments
    svmon -P $(ps -elf | egrep " ora_smon_${ORACLE_SID} " | grep -v egrep | awk '{print $4}') | grep shmat
    svmon -O unit=MB

    ls -al $ORACLE_HOME/bin/oracle >> /tmp/support.txt
    whoami >> /tmp/support.txt
    ulimit -a >> /tmp/support.txt
    svmon -O unit=MB >> /tmp/support.txt
    /usr/sbin/lsps -a >> /tmp/support.txt
    /usr/sbin/lsattr -HE -l sys0 -a realmem >> /tmp/support.txt
    ipcs -m >> /tmp/support.txt

    select name, value/1024/1024 as Mbytes from v$pgastat
    where name in ('maximum PGA allocated','aggregate PGA target parameter','aggregate PGA auto target');

    select sum(value)/1024/1024 as Mbytes from v$sga;
    select current_size/1024/1024 Mbytes from V$SGA_DYNAMIC_FREE_MEMORY;
    ipcs -mb
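    Putting these pieces together, a rough cross-check of the OS view against the database view might look like the following sketch ('total PGA allocated' is a v$pgastat statistic name):

    # OS side: sum of (RSS - TRS) over the ora_/oracle processes, in MB
    ps vg | grep ora | grep -v grep | awk '{sum += $7 - $10} END {printf "OS-side non-shared-text memory: %.0f MB\n", sum/1024}'

    # database side: SGA plus currently allocated PGA, in MB
    select round(sum(value)/1024/1024) sga_mb from v$sga;
    select round(value/1024/1024) pga_mb from v$pgastat where name = 'total PGA allocated';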

  13. dba

    Benefited a great deal from this.
