A Case of ORA-600 [17182]

This is an ancient system: SunOS 5.8 running Oracle 8.1.7.4. Recently the old veteran ran into a new problem, and the alert log lit up with errors:

Errors in file /u01/app/oracle/admin/CULPRODB/udump/culprodb_ora_7913.trc:
ORA-00600: internal error code, arguments: [17182], [32438472], [], [], [], [], [], []
Thu Jul 15 16:19:29 2010
Errors in file /u01/app/oracle/admin/CULPRODB/udump/culprodb_ora_7913.trc:
ORA-00600: internal error code, arguments: [17182], [32438472], [], [], [], [], [], []
Thu Jul 15 16:19:30 2010
Errors in file /u01/app/oracle/admin/CULPRODB/udump/culprodb_ora_7913.trc:
ORA-00600: internal error code, arguments: [17182], [32438472], [], [], [], [], [], []
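
Because the same error repeats within seconds, it can help to tally the alert log by argument pair before opening any trace file. The following is a minimal sketch, assuming the alert log has already been read into a string; the function name `tally_ora600` is hypothetical and not part of any Oracle tool.

```python
import re
from collections import Counter

# Captures the first two arguments of an ORA-00600 line, e.g. [17182], [32438472].
ORA600_RE = re.compile(
    r"ORA-00600: internal error code, arguments: \[([^\]]*)\], \[([^\]]*)\]"
)


def tally_ora600(alert_log_text):
    """Count ORA-00600 occurrences keyed by (first argument, second argument)."""
    counts = Counter()
    for match in ORA600_RE.finditer(alert_log_text):
        counts[(match.group(1), match.group(2))] += 1
    return counts


# Three lines copied from the alert log excerpt above.
SAMPLE = (
    "ORA-00600: internal error code, arguments: [17182], [32438472], [], [], [], [], [], []\n"
    "ORA-00600: internal error code, arguments: [17182], [32438472], [], [], [], [], [], []\n"
    "ORA-00600: internal error code, arguments: [17182], [32438472], [], [], [], [], [], []\n"
)

if __name__ == "__main__":
    for (code, arg), n in tally_ora600(SAMPLE).items():
        print(code, arg, n)
```

A single key with a high count, as here, suggests one session hitting the same bad address repeatedly rather than several unrelated failures.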

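If you are as fascinated by ORA-600 errors as I am, the full trace file is worth a read. Below are the SQL statement that was running when the error was raised and the call stack: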

ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [17182], [32438472], [], [], [], [], [], []
Current SQL statement for this session:
select * from olsuser.cardmaster where cm_card_no between '2336330010201570013' and '2336330010201580004' union
select * from olsuser.cardmaster where cm_card_no between '2336330012402300018' and '2336330012402310009' union
select * from olsuser.cardmaster where cm_card_no between '2336330052400220016' and '2336330052400230007' union
select * from olsuser.cardmaster where cm_card_no between '2336330015103900012' and '2336330015138100032' union
select * from olsuser.cardmaster where cm_card_no between '2336330055100910018' and '2336330055100920009'
----- Call Stack Trace -----
calling                   call     entry
location                  type     point
--------------------      -------- --------------------
ksedmp()+220              CALL     ksedst()+0
kgeriv()+268              PTR_CALL 0000000000000000
kgesiv()+140              CALL     kgeriv()+0
kgesic1()+32              CALL     kgesiv()+0
kghfrf()+204              CALL     kgherror()+0
kkscls()+1592             CALL     kghfrf()+0
opicca()+248              CALL     kkscls()+0
opiclo()+8                CALL     opicca()+0
kpoclsa()+60              CALL     opiclo()+0
opiodr()+2540             PTR_CALL 0000000000000000
ttcpip()+5676             PTR_CALL 0000000000000000
opitsk()+2408             CALL     ttcpip()+0
opiino()+2080             CALL     opitsk()+0
opiodr()+2540             PTR_CALL 0000000000000000
opidrv()+1656             CALL     opiodr()+0
sou2o()+16                CALL     opidrv()+0
main()+172                CALL     sou2o()+0
_start()+380              CALL     main()+0
/* In 8.1.7 the stack trace also carries register information, but we can't read that :) */

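The call chain is opicca->kkscls->kghfrf->kgherror (the heap layer raises the error)->kgesic1. The failure occurs during the call to kghfrf. The article "famous summary stack trace from Oracle Version 8.1.7.4.0 Bug Note" lists a number of Oracle stack summaries; there the function kghfrx is described as "Free extent. This is called when a heap is unpinned to request that it". By analogy, kghfrf presumably frees some kind of memory structure. Searching MOS for the keywords "kghfrf 8.1.7.4" turns up Note 291936.1: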
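
Reading the chain off the trace by hand is easy to get wrong, so here is a rough sketch of doing it mechanically. It assumes the 8.1.7-style three-column layout shown above (calling location, call type, entry point); `parse_frames` and `call_chain` are hypothetical helper names, and other platforms print the stack differently.

```python
import re

# Matches 8.1.7-style frames such as "kkscls()+1592   CALL   kghfrf()+0".
FRAME_RE = re.compile(
    r"^\s*(\w+)\(\)\+\d+\s+(CALL|PTR_CALL)\s+(\w+)(?:\(\)\+\d+)?\s*$"
)


def parse_frames(trace_text):
    """Return (caller, callee) pairs, innermost frame first.

    For PTR_CALL frames the entry point is a raw address, so callee is None.
    """
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            caller, call_type, target = match.groups()
            callee = target if call_type == "CALL" else None
            frames.append((caller, callee))
    return frames


def call_chain(frames):
    """Return function names from outermost caller to innermost callee."""
    chain = [caller for caller, _ in reversed(frames)]
    if frames and frames[0][1]:
        chain.append(frames[0][1])  # the entry point of the innermost frame
    return chain


# Four frames copied from the call stack above.
EXCERPT = """\
kghfrf()+204              CALL     kgherror()+0
kkscls()+1592             CALL     kghfrf()+0
opicca()+248              CALL     kkscls()+0
opiclo()+8                CALL     opicca()+0
"""

if __name__ == "__main__":
    print("->".join(call_chain(parse_frames(EXCERPT))))
```

Note that kgherror never appears in the calling-location column of this trace; it only shows up as the entry point called from kghfrf, which is why the sketch appends the innermost callee.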

ORA-00600 [17182] on Oracle 8.1.7.4.0 After a CTRL-C or Client Termination
Applies to:
Oracle Server – Enterprise Edition – Version: 8.1.7.4
This problem can occur on any platform.
Checked for relevance on 06-Mar-2007

Oracle RDBMS Server Versions prior to 9i
Symptoms
1. Intermittent heap corruptions errors like ORA-00600 [17182] are reported in the alert.log file.

2. There is no impact to the database other than the process which encounters the errors getting killed.

3. From the trace file generated for this ORA-00600 error, check if the top few functions are :

kgherror kghfrf kkscls opicca

Cause
If the trace file shows that kkscls calls kghfrf, then it is related to:

Bug 2281320 - ORA-600[17182] POSSIBLE AFTER CTRL-C OR CLIENT DEATH
Solution
The problem is when we call kghfrf to free a chunk of memory, we expect that this chunk to have been allocated from the Heap Memory and hence have a valid header, although internally we have used Frame Memory managed chunk. As a result, kghfrf errors out with the "Bad Magic Number" in the Memory Chunk header error message.

If you are running Oracle 8174, encounter this ORA-00600 [17182], and the call stack indicates the following functions { kgherror kghfrf kkscls }, then download and apply Patch 2281320 from MetaLink.

This issue has been fixed in Oracle Server 8.1.7.5 and later versions.

Note 2281320.8 is not limited to dblinks and can occur during normal database operation as well.

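The note explains that in versions before 9i, heap corruption can surface as ORA-00600 [17182]. The error is not fatal and does not corrupt the database; at worst, the server process that hits it is killed. The main criterion for matching the problem is a stack trace of kgherror kghfrf kkscls opicca, which agrees with what we see here. The internal error can be avoided by applying one-off patch 2281320 or by upgrading to 8.1.7.5. It can also simply be ignored, since it evidently causes little trouble.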
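
The matching rule from the note can be written down as a first-pass filter. This is only a sketch under stated assumptions: the function names have already been extracted from the trace, innermost first, and `matches_bug_2281320` is a hypothetical helper, not an Oracle utility. A positive match still needs to be confirmed against the note itself.

```python
# Signature from Note 291936.1: kgherror, kghfrf, kkscls at the top of the stack,
# with kkscls calling kghfrf directly.
SIGNATURE = ("kgherror", "kghfrf", "kkscls")


def matches_bug_2281320(stack_functions, signature=SIGNATURE):
    """Return True if the signature appears as a consecutive run in the stack.

    stack_functions lists function names innermost first, as in the note.
    """
    names = [name.lower() for name in stack_functions]
    width = len(signature)
    return any(
        tuple(names[i:i + width]) == signature
        for i in range(len(names) - width + 1)
    )


# Names taken from the trace in this post (entry points included).
OUR_STACK = ["ksedmp", "kgeriv", "kgesiv", "kgesic1",
             "kgherror", "kghfrf", "kkscls", "opicca", "opiclo"]

# A stack where another function sits between kghfrf and kkscls.
OTHER_STACK = ["kgherror", "kghfrf", "kxscln", "kkscls"]

if __name__ == "__main__":
    print(matches_bug_2281320(OUR_STACK))
    print(matches_bug_2281320(OTHER_STACK))
```

Requiring a consecutive run is a deliberate choice: the note's condition is that kkscls calls kghfrf, so a stack with an intermediate frame between the two is treated as a different case.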
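As for kghfrf itself, it frees a memory chunk. Oracle development originally assumed that every chunk passed to it had been allocated from heap memory and therefore carried a valid header; in reality some of these chunks are managed as frame memory. kghfrf reads a bad magic number in such a chunk's header and goes astray.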
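
To make the failure mode concrete, here is a toy model of a free routine that validates a magic number in the chunk header. Everything in it is an assumption made for illustration: the header layout, the magic value, and the names `heap_alloc`, `frame_alloc` and `heap_free` are invented and say nothing about Oracle's real KGH structures.

```python
import struct

HEAP_MAGIC = 0xB38F              # hypothetical value, not Oracle's real magic number
HEADER = struct.Struct("<HI")    # toy header: 2-byte magic, 4-byte payload size


class BadMagicNumber(Exception):
    """Raised when a chunk handed to heap_free lacks a valid heap header."""


def heap_alloc(payload: bytes) -> bytes:
    """Allocate from the toy heap: the chunk is prefixed with a valid header."""
    return HEADER.pack(HEAP_MAGIC, len(payload)) + payload


def frame_alloc(payload: bytes) -> bytes:
    """Allocate from toy frame memory: no heap header is written."""
    return payload


def heap_free(chunk: bytes) -> int:
    """Free a chunk, trusting that it starts with a heap header.

    Returns the payload size on success; raises BadMagicNumber otherwise.
    """
    magic, size = HEADER.unpack_from(chunk)
    if magic != HEAP_MAGIC:
        raise BadMagicNumber(f"bad magic number ({magic:x})")
    return size


if __name__ == "__main__":
    print(heap_free(heap_alloc(b"cursor state")))
    try:
        heap_free(frame_alloc(b"define-info buffer"))
    except BadMagicNumber as exc:
        print("free failed:", exc)
```

The point of the sketch is that the free routine itself is behaving as designed; the error comes from handing it memory obtained from a different allocator.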


Comments

4 responses to "A Case of ORA-600 [17182]"

  1. admin

    Hdr: 2281320 8.1.7.3.0 RDBMS 8.1.7.3.0 PRG INTERFACE PRODID-5 PORTID-87 ORA-600
    Abstract: ORA-600[17182] POSSIBLE AFTER CTRL-C OR CLIENT DEATH
    PROBLEM:
    --------
    Regularly an ora-600[17182] is generated. Checking on this kind of files is
    done automatically and DBA’s are informed about this error and have to check
    it at once (even middle in the night).
    Except memory corruption there does not seem to be an impact towards the
    database.

    DIAGNOSTIC ANALYSIS:
    --------------------
    Have checked the objects in question:
    - problem occurs on different tables with different queries
    - execution plan shows usage of bitmap index as well FTS scans
    No regular plan can be found in it.
    Setting of diagnostic event 10235 with level 4 seems to introduce ora-4030 so
    had to be put off.
    Patch for 2177050 has been installed but problem occurred before and after
    installation of this patch.
    According cust the same error occurred in 8.1.7.2.0 as well.

    WORKAROUND:
    -----------
    Have not found any.

    RELATED BUGS:
    -------------
    Have not found any

    REPRODUCIBILITY:
    ----------------
    not reproducible at will

    TEST CASE:
    ----------
    Not applicable

    STACK TRACE:
    ------------
    *** 17:30:34.123
    ksedmp: internal or fatal error
    ORA-600: internal error code, arguments: [17182], [1075716168], [], [], [], [], [], []
    Current SQL statement for this session:
    select * from rcv a where a.clne_seq in (select clne_seq from cdt_lne where val_day = 0)
    and (a.dte_due + 1 )= (select dte_nxt_pay from its_per b where b.iper_seq = a.iper_seq)
    ----- Call Stack Trace -----
    *** 17:30:45.045
    link and map addresses differ for
    /oracle/app/oracle/product/8.1.7/lib/libobk.so
    - 3ffbffe0000, 30000000000
    calling              call     entry                argument values in hex
    location             type     point                (? means dubious value)
    -------------------- -------- -------------------- ----------------------------
    ksedmp:1838[kse.c]   ???      ksedst:2205[kse.c]   12071E6BC ? 0380003D8 ? 1401EB838 ? 100000018 ? 1214707E8 ? 0380003D8 ?
    ksfdmp:917[ksf.c]    ???      ksedmp:1838[kse.c]   121470854 ? 00000431E ? 000000000 ? 000000000 ? 000000001 ? 11FFFCAB0 ?
    kgeriv:1451[kge.c]   ???      ksfdmp:917[ksf.c]    000000000 ? 000000000 ? 000000001 ? 11FFFCAB0 ? 100000018 ? 121471014 ?
    kgesiv:1679[kge.c]   JSR      kgeriv:1451[kge.c]   121470C48 ? 0380003D8 ? 1401EEAF8 ? 11FFFCAB0 ? 100000018 ? 10000431E ?
    kgesic1:1558[kge.c]  ???      kgesiv:1679[kge.c]   12145F8B4 ? 11FFFCAB0 ? 100000018 ? 000000000 ? 000000000 ? 038000000 ?
    kgherror:569[kgh.c]  ???      kgesic1:1558[kge.c]  000000000 ? 000000008 ? 000000010 ? 1214677DC ? 0380003D8 ? 1401EEAF8 ?
    kghfrf:5102[kgh.c]   ???      kgherror:569[kgh.c]  000000001 ? 1206E9D84 ? 1401F9E38 ? 038000000 ? 000000000 ? 000000024 ?
    kkscls:3728[kks.c]   ???      kghfrf:5102[kgh.c]   120FDF020 ? 000000003 ? 1401F1638 ? 1401F9E38 ? 038004318 ? 000000000 ?
    opicca:145[opicca.c  JSR      kkscls:3728[kks.c]   120E202E0 ? 038000000 ? 000000001 ? 000000001 ? 120BB502C ? 11FFFD660 ?
    opiclo:79[opiclo.c]  JSR      opicca:145[opicca.c  120BB502C ? 11FFFD660 ? 000000003 ? 11FFFD660 ? 1206408D4 ? 038000000 ?

    SUPPORTING INFORMATION:
    -----------------------

    24 HOUR CONTACT INFORMATION FOR P1 BUGS:
    ----------------------------------------

    DIAL-IN INFORMATION:
    --------------------

    IMPACT DATE:
    ------------
    Files will be uploaded to ess30
    The ora-600[729] has been solved by installing patch for 2177050.
    The uploaded traces are erroring in kkscls -> kghfrf during
    closing of cursors. The corruption is in private memory, but
    the bad chunk does not appear in the session heap.

    Can you add the following to see if we can get closer to
    the cause:
    event="600 trace name heapdump level 5125"
    event="10501 trace name context forever, level 4109"

    Also can you upload the alert log extract and init.ora
    parameter settings.
    problem has re-occured after enabling above events, have uploaded the
    17182.zip file the cust has provided.
    Cust provided a new tracefile containing the ora-600[17182]
    have uploaded files of customer: trace + alertfile: 17182_18apr2002.zip
    An ORA-3113 is seen from the DBLINK during execution of this statement:
    select * from pmm_rcp_pmm order by dte_sta_pmm_rcp_pmm
    This occurs in the stack:
    ksesec0
    ksucin
    srsmr1
    srsrel
    sorrelqb
    qersoRelease
    rwsrld
    qecrlssub
    opifch
    opiall0
    kpoal8
    Note: The local error is only occurring as the DB link signals an
    ORA-3113. This implies the remote end of the DB link is failing.
    You should find out if that is due to an unexpected process
    death and if so follow that up as a separate issue. The issue
    here is that in 8i an ORA-3113 from a DB link at a particular
    time can cause a local ORA-600 [17182] error.

    Please indicate if you are likely to need an 8i fix for this
    OERI:17182 problem so I know what action to take next. Thanks
    cust has checked the object in question and it is a local table:
    New info : select * from dba_objects where object_name = 'PMM_RCP_PMM';
    OWNER  OBJECT_NAME  SUBOBJECT_NAME                  OBJECT_ID DATA_OBJECT_ID
    ------ ------------ ------------------------------ ---------- --------------
    OBJECT_TYPE        CREATED   LAST_DDL_ TIMESTAMP           STATUS  T G S
    ------------------ --------- --------- ------------------- ------- - - -
    PUBLIC PMM_RCP_PMM                                      39282
    SYNONYM            06-MAR-00 06-MAR-00 2000-03-06:19:30:41 VALID   N N N
    ISR    PMM_RCP_PMM                                      39277          39277
    TABLE              06-MAR-00 23-OCT-01 2000-03-06:19:29:53 VALID   N N N

    Please (re-)check the tracefile for the database link.
    Ooops – diagnosis is the same – it is just that the ORA-3113 is
    from a dead client connection not a dead DB link.
    ie: The client going away at an inappropriate time exposes the same
    hole in the code.
    Please provide a backport for this problem for 8.1.7.3.0

    Can a timeframe be given in which a fix can be expected??
    Thanks
    Rediscovery Information :
    "If you get ORA-600[17182] after a ORA-3113, and cause for 3113 indicates
    following pattern in the error stack, then it could be this bug.
    ...opifch()->qecrlssub()->....ksesec0()"

    ]] ORA-600[17182] occurred followed by ORA-3113 when the heap dump
    ]] indicated that 17182 encountered while freeing the chunk marked with
    ]] “define-info”.

  2. admin

    Hdr: 2491757 8.1.7.4 RDBMS 8.1.7.4 PRG INTERFACE PRODID-5 PORTID-23 ORA-600 2281320
    Abstract: ORA-600 [17182] [32227064], [], [], [], [], [], [] AND ORA-3113 ON 8.1.7.4

    TAR:
    ----
    SMS TAR 2396840.995

    PROBLEM:
    --------
    Customer is getting this problem and problematic session is disconnected. They
    are not able to reproduce this at will.

    DIAGNOSTIC ANALYSIS:
    --------------------
    NA

    WORKAROUND:
    -----------
    None

    RELATED BUGS:
    -------------

    REPRODUCIBILITY:
    ----------------
    Customer could not reproduce this at will. But we believe it will be reproduced
    in future

    TEST CASE:
    ----------
    NA

    STACK TRACE:
    ------------
    ksedmp
    kgeriv
    kgesiv
    kgesic1
    kghfrf
    kkscls
    opicca
    opiclo
    opifcs
    ksuxds
    ksudel
    opidcl
    opidrv
    sou2o
    main
    _start

    SUPPORTING INFORMATION:
    -----------------------

    24 HOUR CONTACT INFORMATION FOR P1 BUGS:
    ----------------------------------------
    NA

    DIAL-IN INFORMATION:
    --------------------
    NA

    IMPACT DATE:
    ------------

    Alert.log, init.ora and trace file of problem is on machine ess30 in directory
    /bug/bug2491757 in file bug2491757.zip

    The current SQL statement is
    SELECT "OSN_POD_OSEBA"."SIFRA"
    FROM "OSN_POD"."OSEBA" "OSN_POD_OSEBA" ORDER BY "PRIIMEK"

    If this is over a DB link this is probably a duplicate of
    Bug.2281320 . Please confirm what "OSN_POD"."OSEBA" is.
    No, this is no db_link. I've checked that.
    Sorry - I should not have mentioned DB links. Bug 2281320 has
    nothing to do with DB links - I've corrected its title.

    From your trace the dump is when freeing kxscdfn in the
    current instantiation. This is not pointing at a KGH chunk
    hence the OERI:17182.

    This is almost certainly a duplicate of bug:2281320
    If the customer needs a fix in 8174 please request a PSE to 8174
    referencing this bug as evidence. It is likely this is from
    a dead client or a client interrupt so unless this is happening
    a lot you would need a good business case for a PSE.

    There is no actual corruption here – just a cleanup error.

  3. admin

    Hdr: 3421829 8.1.7.4 RDBMS 8.1.7.4 PRODID-5 PORTID-59 ORA-600
    Abstract: ORA-600 15203 AND ORA-600 17182

    PROBLEM:
    --------
    Customer is getting intermittent ora-600 errors. The TAR was originally
    opened for the ora-600 17182 errors. However, since asking the customer to
    set event 10235 but before setting it, he has encountered ora-600 15203 which
    seems to have spawned more ora-600 17182 errors.

    He cannot reproduce the errors at will. However, they are quite frequent
    (occur on a daily basis)

    DIAGNOSTIC ANALYSIS:
    --------------------
    I have asked the customer to set event 10235 level 2. However, he is
    concerned about a performance hit. He wanted to know if we could give him a %
    of performance degradation that he would encounter. I told him I would ask
    development. I had also asked him to set event 10501. However, I understand
    that this event causes more of a performance hit than the 10235. So, I don’t
    think I can get him to set that event.

    The customer also wants to be sure that we will get all the information needed
    from setting the 10235 event. He does not want to have to set further events
    - causing downtime on production.

    WORKAROUND:
    -----------
    none known

    RELATED BUGS:
    -------------
    bug 2765055 OERI:15203 / Memory corruption if partitioned table cursor is
    reloaded ----- this bug has to do with partitioned tables, my ct does not use
    partitioned tables.

    REPRODUCIBILITY:
    ----------------
    intermittently on a daily basis

    TEST CASE:
    ----------
    none available

    STACK TRACE:
    ------------
    /opt/oracle/admin/MOVE/udump/ora_7255_move.trc
    ===============================================
    ORA-600: internal error code, arguments: [17182], [1075419096], [], [], [], [], [], []
    Current SQL statement for this session:
    SELECT 'x' FROM task_master WHERE TASK_ID=2953359 FOR UPDATE NOWAIT

    STACK: kgherror kghfrf kxscln kkscls

    Chunk 401997d8 sz= 56 ERROR, BAD MAGIC NUMBER (3b)

    /opt/oracle/admin/MOVE/udump/ora_3761_move.trc
    ================================================
    ORA-600: internal error code, arguments: [15203], [9], [5], [], [], [], [], []
    Current SQL statement for this session:
    select inventory_id ,product_id ,uom_family ,uom_type_code ,product_key
    ,location_is_lp_ind ,physical_location_no ,onhand_quantity ,inbound_quantity
    ,outbound_quantity ,material_status_code ,material_keepers_ref ,inventory_type
    ,inventory_status from inventory where (location_no=:b0 and
    onhand_quantity>0) order by product_id asc

    STACK:ksesic2 kksfal
