ORA-600 [kjbrref:pkey] and ORA-600 [kjbmprlst:shadow] errors

Both errors are caused by the same bug.
The environment is an 11.2.0.2 RAC database on Solaris SPARC. The alert log shows the following:

2012-01-29 06:15:10.168000 +08:00
Errors in file /app/diag/rdbms/orcl/orcl1/trace/orcl1_lms3_81.trc (incident=384590):
ORA-00600: internal error code, arguments: [kjbrref:pkey], [332269], [202], [137064], [0], [], [], [], [], [], [], []
Incident details in: /app/diag/rdbms/orcl/orcl1/incident/incdir_384590/orcl1_lms3_81_i384590.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2012-01-29 06:15:11.923000 +08:00
Dumping diagnostic data in directory=[cdmp_20120129061511], requested by (instance=1, osid=81 (LMS3)), summary=[incident=384590].
Sweep [inc][384590]: completed
Sweep [inc2][384590]: completed
2012-01-29 06:15:17.289000 +08:00
Errors in file /app/diag/rdbms/orcl/orcl1/trace/orcl1_lms3_81.trc:
ORA-00600: internal error code, arguments: [kjbrref:pkey], [332269], [202], [137064], [0], [], [], [], [], [], [], []
LMS3 (ospid: 81): terminating the instance due to error 484
2012-01-29 06:15:20.910000 +08:00
ORA-1092 : opitsk aborting process
2012-01-29 06:15:22.384000 +08:00
.
.
.
2012-04-17 04:26:44.373000 +08:00
Errors in file /app/diag/rdbms/orcl/orcl1/trace/orcl1_lms1_8678.trc (incident=432578):
ORA-00600: internal error code, arguments: [kjbmprlst:shadow], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /app/diag/rdbms/orcl/orcl1/incident/incdir_432578/orcl1_lms1_8678_i432578.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2012-04-17 04:26:45.864000 +08:00
Dumping diagnostic data in directory=[cdmp_20120417042645], requested by (instance=1, osid=8678 (LMS1)), summary=[incident=432578].
Errors in file /app/diag/rdbms/orcl/orcl1/trace/orcl1_lms1_8678.trc:
ORA-00600: internal error code, arguments: [kjbmprlst:shadow], [], [], [], [], [], [], [], [], [], [], []
2012-04-17 04:26:47.359000 +08:00
Sweep [inc][432578]: completed
Sweep [inc2][432578]: completed
2012-04-17 04:26:53.095000 +08:00
Errors in file /app/diag/rdbms/orcl/orcl1/trace/orcl1_lms1_8678.trc:
ORA-00600: internal error code, arguments: [kjbmprlst:shadow], [], [], [], [], [], [], [], [], [], [], []
LMS1 (ospid: 8678): terminating the instance due to error 484
2012-04-17 04:26:56.593000 +08:00
ORA-1092 : opitsk aborting process
2012-04-17 04:26:58.088000 +08:00
Instance terminated by LMS1, pid = 8678

As the log shows, both the kjbrref:pkey error and the kjbmprlst:shadow error brought the instance down directly, so both are very serious problems. Both were also raised by LMSn processes.
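
The pattern described above (an ORA-00600 followed by an LMSn process terminating the instance) can be pulled out of an alert log mechanically. The following is a minimal Python sketch, not an Oracle tool: the function name `summarize_crashes` and the regular expressions are illustrative assumptions based only on the message formats visible in the excerpt above.

```python
import re

# First bracketed argument of an ORA-00600 line, e.g. [kjbrref:pkey].
ORA600_RE = re.compile(r"ORA-00600: internal error code, arguments: \[([^\]]*)\]")
# e.g. "LMS3 (ospid: 81): terminating the instance due to error 484".
# Matched case-insensitively so differently cased copies of the log still match.
TERMINATE_RE = re.compile(
    r"^(\w+) \(ospid: (\d+)\): terminating the instance due to error (\d+)",
    re.IGNORECASE,
)


def summarize_crashes(alert_text):
    """Return one (process, ospid, error, ora600_arg) tuple per instance termination."""
    crashes = []
    last_arg = None  # first argument of the most recent ORA-00600 seen
    for raw_line in alert_text.splitlines():
        line = raw_line.strip()
        match = ORA600_RE.search(line)
        if match:
            last_arg = match.group(1)
            continue
        match = TERMINATE_RE.match(line)
        if match:
            process, ospid, error = match.group(1), int(match.group(2)), int(match.group(3))
            crashes.append((process, ospid, error, last_arg))
            last_arg = None
    return crashes


# Abridged lines taken from the alert log excerpt above.
sample = """\
ORA-00600: internal error code, arguments: [kjbrref:pkey], [332269], [202], [137064], [0], [], [], [], [], [], [], []
LMS3 (ospid: 81): terminating the instance due to error 484
ORA-00600: internal error code, arguments: [kjbmprlst:shadow], [], [], [], [], [], [], [], [], [], [], []
LMS1 (ospid: 8678): terminating the instance due to error 484
"""

for crash in summarize_crashes(sample):
    print(crash)
```

Pairing each termination with the most recent ORA-00600 argument is a simplification. It is adequate for a log like this one, where the error is reported immediately before the instance is terminated.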

*** 2012-01-29 06:15:10.194
*** SESSION ID:(1009.1) 2012-01-29 06:15:10.194
*** CLIENT ID:() 2012-01-29 06:15:10.194
*** SERVICE NAME:(SYS$BACKGROUND) 2012-01-29 06:15:10.194
*** MODULE NAME:() 2012-01-29 06:15:10.194
*** ACTION NAME:() 2012-01-29 06:15:10.194
Dump continued from file: /app/diag/rdbms/orcl/orcl1/trace/orcl1_lms3_81.trc
ORA-00600: internal error code, arguments: [kjbrref:pkey], [332269], [202], [137064], [0], [], [], [], [], [], [], []
========= Dump for incident 384590 (ORA 600 [kjbrref:pkey]) ========
----- Beginning of Customized Incident Dump(s) -----
 GCS RESOURCE 0xb92d0cfa0 hashq [0xbb35eddc8,0xc0f9b1f60] name[0x511ed.ca] pkey 136931.0
   GRANT 0xb94a7e8f8 cvt 0x0 send 0x0@1,0 WRITE 0x0,0@65536
   flag 0x2 mdrole 0x1 mode 1 scan 0.0 ROLE LOCAL
   disk: 0x0000.00000000 WRITE: 0x0000.00000000 cnt 0x0 hist 0x0
   xid 0x0000.000.00000000 sid 3 pkwait 0s rmacks 0
   refpcnt 0 weak: 0x0000.00000000
   pkey 136931.0
   hv 91 [stat 0x0, 1->1, wm 32768, RMno 0, reminc 12, dom 0]
   kjga st 0x4, step 0.35.0, cinc 18, rmno 6345, flags 0x20
   lb 16384, hb 32767, myb 16957, drmb 16957, apifrz 1
   GCS SHADOW 0xb94a7e8f8,626 resp[0xb92d0cfa0,0x511ed.ca] pkey 136931.0
     GRANT 1 cvt 0 mdrole 0x1 st 0x100 lst 0x40 GRANTQ rl LOCAL
     master 1 owner 2 sid 3 remote[0x68fde3ef0,11] hist 0x10c30086180431f
     history 0x1f.0x6.0x1.0xc.0x6.0x1.0xc.0x6.0x1.0x0.
     cflag 0x0 sender 0 flags 0x0 replay# 0 abast 0x0.x0.1 dbmap 0x0
     disk: 0x0000.00000000 WRITE request: 0x0000.00000000
     pi scn: 0x0000.00000000 sq[0xb92d0cfd0,0xb92d0cfd0]
     msgseq 0x1 updseq 0x0 reqids[11,0,0] infop 0x0 lockseq x67d9
   GCS SHADOW END
 GCS RESOURCE END
----- End of Customized Incident Dump(s) -----
*** 2012-01-29 06:15:10.261
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst1()+96         CALL     skdstdst()           FFFFFFFF7FFF4C00 ?
                                                   100670460 ? 000000000 ?
                                                   00000000A ? 000000001 ?
                                                   10BD552E0 ?
ksedst()+60          CALL     ksedst1()            000000000 ? 000000001 ?
                                                   00010C1D1 ? 00010C000 ?
                                                   10C1CA000 ? 00010C1CA ?
dbkedDefDump()+2032  CALL     ksedst()             000000000 ? 10B21A000 ?
                                                   10B21AA90 ? 10C1D2000 ?
                                                   00010B000 ? 00010C1D2 ?
dbgexPhaseII()+1800  PTR_CALL dbkedDefDump()       000000003 ? 000000002 ?
                                                   10A6ABAA8 ? 0000014B0 ?
                                                   10C1C9000 ? 000000003 ?
dbgexExplicitEndInc  CALL     dbgexPhaseII()       10C373D30 ?
()+728                                             FFFFFFFF7A634920 ?
                                                   FFFFFFFF7FFF8FDC ?
                                                   0018E0001 ? 10A6A2D98 ?
                                                   000001C00 ?
dbgeEndDDEInvocatio  CALL     dbgexExplicitEndInc  10A6A2C50 ?
nImpl()+704                   ()                   FFFFFFFF7A634920 ?
                                                   FFFFFFFF7FFF8F28 ?
                                                   FFFFFFFF7FFFC620 ?
                                                   000000000 ?
                                                   FFFFFFFFFE4E26A0 ?
kjbrref()+1496       CALL     dbgeEndDDEInvocatio  10C373D30 ? 001B1D800 ?
                              n()                  FFFFFFFFFEC0AF31 ?
                                                   FFFFFFFF7FFFC620 ?
                                                   000002868 ? 0018E0001 ?
kjblreplay()+7380    CALL     kjbrref()            000002868 ? 10C1CA3E0 ?
                                                   000021768 ? A681AFA10 ?
                                                   B92D0CFA0 ? C0F96F920 ?
kjbldrmrpst()+4864   CALL     kjblreplay()         000000000 ? 000000001 ?
                                                   10C1CA0A0 ? BDA03C9B8 ?
                                                   000000000 ? 10C1E8890 ?
kjmprcfgsync()+1424  CALL     kjbldrmrpst()        A681AFA10 ? 000000001 ?

The other trace file:

*** 2012-04-17 04:26:44.389
*** SESSION ID:(673.1) 2012-04-17 04:26:44.389
*** CLIENT ID:() 2012-04-17 04:26:44.389
*** SERVICE NAME:(SYS$BACKGROUND) 2012-04-17 04:26:44.389
*** MODULE NAME:() 2012-04-17 04:26:44.389
*** ACTION NAME:() 2012-04-17 04:26:44.389
Dump continued from file: /app/diag/rdbms/orcl/orcl1/trace/orcl1_lms1_8678.trc
ORA-00600: internal error code, arguments: [kjbmprlst:shadow], [], [], [], [], [], [], [], [], [], [], []
========= Dump for incident 432578 (ORA 600 [kjbmprlst:shadow]) ========
----- Beginning of Customized Incident Dump(s) -----
 FUSION MSG 0xffffffff79c40b80,39 FROM 2 spnum 14 ver[38,11161] ln 144 sq[2,8]
        REPLAY 1 [0x103699.c7, 151132.0] c[0x7e7bd3240,55] [0x494e,x38]
        GRANT 2 CONVERT 0 ROLE x0
        pi [0x0.0x0] flags 0x0 state 0x100
        disk scn 0x0.0 writereq scn 0x0.0 rreqid x0
        msgRM# 11161 bkt# 18131 drmbkt# 18131
    pkey 151132.0 undo 0 stat 5 masters[32768, 2->32768] reminc 38 RM# 11152
 flg x0 TYPE x0 afftime x8517cf38
 nreplays BY lms 0 = 4046 
 nreplays BY lms 1 = 4105 
 nreplays BY lms 2 = 4176 
 nreplays BY lms 3 = 4214 
 nreplays BY lms 4 = 4158 
 nreplays BY lms 5 = 4162 
   hv 125 [stat 0x0, 1->1, wm 32768, RMno 0, reminc 36, dom 0]
   kjga st 0x4, step 0.36.0, cinc 38, rmno 11161, flags 0x20
   lb 16384, hb 32767, myb 18131, drmb 18131, apifrz 1
 FUSION MSG DUMP END
 GCS RESOURCE 0xbb93a40e8 hashq [0xba8f40298,0xc27d16700] name[0x103699.c7] pkey 151008.0
   GRANT 0xb99d64f38 cvt 0x0 send 0x0@1,0 WRITE 0x0,0@65536
   flag 0x2 mdrole 0x1 mode 1 scan 0.0 ROLE LOCAL
   disk: 0x0000.00000000 WRITE: 0x0000.00000000 cnt 0x0 hist 0x0
   xid 0x0000.000.00000000 sid 1 pkwait 0s rmacks 0
   refpcnt 0 weak: 0x0000.00000000
   pkey 151008.0
   hv 125 [stat 0x0, 1->1, wm 32768, RMno 0, reminc 36, dom 0]
   kjga st 0x4, step 0.36.0, cinc 38, rmno 11161, flags 0x20
   lb 16384, hb 32767, myb 18131, drmb 18131, apifrz 1
   GCS SHADOW 0xb99d64f38,42 resp[0xbb93a40e8,0x103699.c7] pkey 151008.0
     GRANT 1 cvt 0 mdrole 0x1 st 0x100 lst 0x40 GRANTQ rl LOCAL
     master 1 owner 2 sid 1 remote[0x85fed2220,13] hist 0xb93e302087234c9f
     history 0x1f.0x19.0xd.0x39.0x8.0x4.0xc.0x1f.0x39.0x1.
     cflag 0x0 sender 0 flags 0x0 replay# 0 abast 0x0.x0.1 dbmap 0x0
     disk: 0x0000.00000000 WRITE request: 0x0000.00000000
     pi scn: 0x0000.00000000 sq[0xbb93a4118,0xbb93a4118]
     msgseq 0x1 updseq 0x0 reqids[13,0,0] infop 0x0 lockseq xf0d1
   GCS SHADOW END
 GCS RESOURCE END
----- End of Customized Incident Dump(s) -----
*** 2012-04-17 04:26:44.478
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst1()+96         CALL     skdstdst()           FFFFFFFF7FFF4D20 ?
                                                   100670460 ? 000000000 ?
                                                   00000000A ? 000000001 ?
                                                   10BD552E0 ?
ksedst()+60          CALL     ksedst1()            000000000 ? 000000001 ?
                                                   00010C1D1 ? 00010C000 ?
                                                   10C1CA000 ? 00010C1CA ?
dbkedDefDump()+2032  CALL     ksedst()             000000000 ? 10B21A000 ?
                                                   10B21AA90 ? 10C1D2000 ?
                                                   00010B000 ? 00010C1D2 ?
dbgexPhaseII()+1800  PTR_CALL dbkedDefDump()       000000003 ? 000000002 ?
                                                   10A6ABAA8 ? 0000014B0 ?
                                                   10C1C9000 ? 000000003 ?
dbgexExplicitEndInc  CALL     dbgexPhaseII()       10C373D30 ?
()+728                                             FFFFFFFF7A634920 ?
                                                   FFFFFFFF7FFF90FC ?
                                                   0018E0001 ? 10A6A2D98 ?
                                                   000001C00 ?
dbgeEndDDEInvocatio  CALL     dbgexExplicitEndInc  10A6A2C50 ?
nImpl()+704                   ()                   FFFFFFFF7A634920 ?
                                                   FFFFFFFF7FFF9048 ?
                                                   FFFFFFFF7FFFC740 ?
                                                   000000000 ?
                                                   FFFFFFFFFE4E26A0 ?
kjbmprlst()+13504    CALL     dbgeEndDDEInvocatio  10C373D30 ? 001B1D800 ?
                              n()                  FFFFFFFFFEC0AF31 ?
                                                   FFFFFFFF7FFFC740 ?
                                                   0013F5000 ? 0018E0001 ?
kjmxmpm()+796        PTR_CALL kjbmprlst()          101782000 ? 00010C1CA ?
                                                   10C1EA000 ? 10C1CA000 ?
                                                   10A6A3000 ? 10A6A3000 ?
kjmpbmsg()+4584      CALL     kjmxmpm()            00010A400 ? 000000000 ?
                                                   0852DA2C5 ? 00010C000 ?
                                                   10A7EE000 ? BE22AF0C0 ?
kjmsm()+11308        CALL     kjmpbmsg()           00010A400 ? 00000009C ?
                                                   00010C000 ? 10A7EE000 ?
                                                   000000001 ? 000000027 ?
ksbrdp()+1236        PTR_CALL kjmsm()              000001888 ? 25916872D1 ?
                                                   000002000 ? 000000000 ?
                                                   00000024B ? 000001000 ?
opirip()+1008        CALL     ksbrdp()             10BB56000 ? BD8C0B680 ?
                                                   000000001 ? 000001400 ?
                                                   00010B800 ? 10AC212D8 ?
opidrv()+780         CALL     opirip()             10A6A3000 ? 380013D50 ?
                                                   000380002 ? 3800055C0 ?
                                                   380002000 ? 00010C000 ?
sou2o()+92           CALL     opidrv()             000000032 ? 000000004 ?
                                                   FFFFFFFF7FFFF780 ?
                                                   0001EA190 ?
                                                   FFFFFFFF7AF42F10 ?
                                                   FFFFFFFF7FFFFBB8 ?
opimai_real()+516    CALL     sou2o()              FFFFFFFF7FFFF758 ?

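The two trace files are also very similar. Even the top frames of the two call stacks are identical, from ksedst1() down to dbgeEndDDEInvocationImpl(); the stacks only diverge at the functions that raised the errors, kjbrref() in the first trace and kjbmprlst() in the second.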
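
The stack comparison can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the helper names `stack_frames` and `common_prefix` are made up for illustration, and the parser assumes the layout seen in these traces, where argument-only lines are indented and a long function name wraps onto the start of the next line.

```python
def stack_frames(stack_text):
    """Return the calling-location function names of a call stack, top frame first."""
    frames = []
    pending = ""  # first part of a name that wrapped onto the next line
    for line in stack_text.splitlines():
        # A dashed line ends the column header; discard anything collected before it.
        if line.startswith("-"):
            frames, pending = [], ""
            continue
        # Indented lines carry only argument values (or wrapped entry points).
        if not line or line[0].isspace():
            continue
        token = pending + line.split()[0]
        if "(" in token:
            frames.append(token.split("(")[0])  # drop "()+offset"
            pending = ""
        else:
            pending = token
    return frames


def common_prefix(first, second):
    """Return the leading frames that both stacks share."""
    shared = []
    for a, b in zip(first, second):
        if a != b:
            break
        shared.append(a)
    return shared


# Abridged excerpts from the two call stacks in this post.
first_stack = """\
ksedst1()+96         CALL     skdstdst()           FFFFFFFF7FFF4C00 ?
dbgexExplicitEndInc  CALL     dbgexPhaseII()       10C373D30 ?
()+728                                             FFFFFFFF7A634920 ?
dbgeEndDDEInvocatio  CALL     dbgexExplicitEndInc  10A6A2C50 ?
nImpl()+704                   ()                   FFFFFFFF7A634920 ?
kjbrref()+1496       CALL     dbgeEndDDEInvocatio  10C373D30 ? 001B1D800 ?
                              n()                  FFFFFFFFFEC0AF31 ?
kjblreplay()+7380    CALL     kjbrref()            000002868 ? 10C1CA3E0 ?
"""
second_stack = """\
ksedst1()+96         CALL     skdstdst()           FFFFFFFF7FFF4D20 ?
dbgexExplicitEndInc  CALL     dbgexPhaseII()       10C373D30 ?
()+728                                             FFFFFFFF7A634920 ?
dbgeEndDDEInvocatio  CALL     dbgexExplicitEndInc  10A6A2C50 ?
nImpl()+704                   ()                   FFFFFFFF7A634920 ?
kjbmprlst()+13504    CALL     dbgeEndDDEInvocatio  10C373D30 ? 001B1D800 ?
                              n()                  FFFFFFFFFEC0AF31 ?
kjmxmpm()+796        PTR_CALL kjbmprlst()          101782000 ? 00010C1CA ?
"""

a, b = stack_frames(first_stack), stack_frames(second_stack)
shared = common_prefix(a, b)
print("shared top frames:", shared)
print("first divergence:", a[len(shared)], "vs", b[len(shared)])
```

The shared frames belong to the incident-dump path itself, so the frames immediately after the common prefix are the ones worth searching for on MOS.
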
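A search on MOS confirms this as Bug 12834027, "ORA-600 [kjbmprlst:shadow] / ORA-600 [kjbrasr:pkey] with RAC read mostly locking". The bug is fixed in the latest PSU, 11.2.0.3.1. Besides applying the patch, another option is to set the hidden parameter "_gc_read_mostly_locking"=FALSE to disable read-mostly object locking. Disabling DRM also prevents the error.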
