ORA-600(kdsgrp1)错误

在一个客户的数据库告警日志中,频繁出现ORA-600(kdsgrp1)错误。
详细错误信息为:

/oracle/admin/tjsb/udump/tjsb1_ora_1701138.trc
Oracle DATABASE 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
WITH the Partitioning, REAL Application Clusters, DATA Mining AND REAL Application Testing options
ORACLE_HOME = /oracle/product/10.2.0/db_1
System name:    AIX
Node name:      p570c
Release:        3
Version:        5
Machine:        00C971744C00
Instance name: tjsb1
Redo thread mounted BY this instance: 1
Oracle process NUMBER: 883
Unix process pid: 1701138, image: oracle@p570c
*** 2011-07-26 09:59:41.111
*** ACTION NAME:(abc) 2011-07-26 09:59:41.105
*** MODULE NAME:(archive.exe) 2011-07-26 09:59:41.105
*** SERVICE NAME:(tjsb) 2011-07-26 09:59:41.105
*** SESSION ID:(2466.64103) 2011-07-26 09:59:41.105
            ROW 0766ce34.a continuation at
            file# 29 block# 2543156 slot 11 NOT found
**************************************************
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 11 ..... nrows: 110
**************************************************
*** 2011-07-26 09:59:41.111
ksedmp: internal OR fatal error
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
CURRENT SQL statement FOR this SESSION:
SELECT DISTINCT KC01.AAC001 AAC001 FROM AC04, AC02, KC01 WHERE KC01.AAB001 = :B2 AND KC01.AKC021 IN ('11', '12') AND AC02.AAE140 = '
3' AND AC02.AAC031 = '1' AND AC02.AAC001 = KC01.AAC001 AND AC04.AAC001 = KC01.AAC001 AND AC04.AAE030 <= TO_DATE(:B1 , 'yyyymm') AND 
(AC04.AAE031 >= TO_DATE(:B1 , 'yyyymm') OR AC04.AAE031 IS NULL) AND NVL(AC04.AKC010, 0) > 0 AND NOT EXISTS (SELECT 1 FROM AC01 WHERE
 AAB001 = :B2 AND NVL(AAC033, '0') = '3' AND AAC001 = KC01.AAC001) AND NOT EXISTS (SELECT 1 FROM DUAL WHERE :B3 <= (SELECT NVL(SUM(C
KC010), 0) FROM KC33 WHERE AAC001 = KC01.AAC001 AND AAE001 = TO_NUMBER(SUBSTR(:B1 , 1, 4)))) UNION SELECT DISTINCT KC01.AAC001 AAC00
1 FROM AC02, KC01 WHERE KC01.AAB001 = :B2 AND KC01.AKC021 IN ('21', '22', '23') AND AC02.AAE140 = '3' AND AC02.AAC031 = '1' AND AC02
.AAC001 = KC01.AAC001 AND NOT EXISTS (SELECT 1 FROM AC01 WHERE AAB001 = :B2 AND NVL(AAC033, '0') > '0' AND AAC001 = KC01.AAC001) AND
 NOT EXISTS (SELECT 1 FROM KC33 WHERE AAC001 = KC01.AAC001 AND AAE002 = :B1 ) AND (NOT EXISTS (SELECT 1 FROM DUAL WHERE :B4 <= (SELE
CT NVL(SUM(CKC010), 0) FROM KC33 WHERE AAC001 = KC01.AAC001 AND AAE001 = TO_NUMBER(SUBSTR(:B1 , 1, 4))) AND NVL(TO_CHAR(KC01.AIC162,
 'yyyymm'),'199001') < SUBSTR(:B1 , 1, 4) || '01')) AND (NOT EXISTS (SELECT 1 FROM DUAL WHERE :B3 <= (SELECT NVL(SUM(CKC010), 0) FRO
M KC33 WHERE AAC001 = KC01.AAC001 AND AAE001 = TO_NUMBER(SUBSTR(:B1 , 1, 4))) AND NVL(TO_CHAR(KC01.AIC162, 'yyyymm'),'199001') >= SU
BSTR(:B1 , 1, 4) || '01'))
----- PL/SQL Call Stack -----
  object      line  object
  handle    NUMBER  name
70000078a8a6960      8456  package body TJSI.PKG_A_JJZJ
70000078bd54dc0         1  anonymous block
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              000000000 ? 000000000 ?
ksedmp+0290          bl       ksedst               104522370 ?
ksfdmp+0018          bl       03F2C1FC             
kgerinv+00dc         bl       _ptrgl               
kgeasnmierr+004c     bl       kgerinv              000000000 ? 000000001 ?
                                                   10D00000000 ? 000000000 ?
                                                   110648E58 ?
kdsgrp+0460          bl       01F92FB8             
kdsfbr+0268          bl       kdsgrp               000000001 ? 1105E2988 ?
                                                   000000000 ?
qertbFetchByRowID+0  bl       kdsfbr               FFFFFFFFFFF1FA0 ? 000000000 ?
9d0                                                1040DC78C ? 000000000 ?
                                                   1105FC720 ? 1101FB888 ?
                                                   1105FCF40 ? 000000000 ?
rwsfcd+0054          bl       _ptrgl               
qerflFetchOutside+0  bl       03F2BB90             
09c                                                
rwsfcd+0054          bl       _ptrgl               
qeruaFetch+013c      bl       03F2BB90             
qersoFetch+0110      bl       01F92FBC             
opifch2+141c         bl       01F92FBC             
opifch+003c          bl       opifch2              000000001 ? 11062BDC0 ?
                                                   FFFFFFFFFFF36D0 ?
opipls+21a4          bl       opifch               FFFFFFFFFFF3858 ? 110173648 ?
                                                   FFFFFFFFFFF3870 ?
opiodr+0ae0          bl       _ptrgl               
rpidrus+01bc         bl       opiodr               6686B0E7F0 ? 684B93430 ?
                                                   FFFFFFFFFFF67E0 ? D86B0E750 ?
skgmstack+00c8       bl       _ptrgl               
rpidru+0088          bl       skgmstack            110173648 ? FFFFFFFFFFF5830 ?
                                                   000000002 ? 000000000 ?
                                                   FFFFFFFFFFF5FA8 ?
rpiswu2+034c         bl       _ptrgl               
rpidrv+095c          bl       rpiswu2              7000007837E2BD8 ?
                                                   3A00000000 ? 10451BB80 ?
                                                   000000000 ? 110196C78 ?
                                                   3A10196C60 ?
                                                   FFFFFFFFFFF6008 ? 000000000 ?
psddr0+02bc          bl       01F96D24             
psdnal+01d0          bl       psddr0               D013BC57C ? 6610573D78 ?
                                                   FFFFFFFFFFF67E0 ?
                                                   3000000000 ?
pevm_BFTCHC+03b4     bl       _ptrgl               
pfrinstr_FTCHC+0138  bl       pevm_BFTCHC          101479974 ? 000000000 ?
                                                   000000010 ?
pfrrun_no_tool+005c  bl       _ptrgl               
pfrrun+1014          bl       pfrrun_no_tool       FFFFFFFFFFF6B20 ?
                                                   70000078BD54DC0 ? 3000009BB ?
plsql_run+06b4       bl       pfrrun               110469588 ?
peicnt+0224          bl       plsql_run            110469588 ? 10001102381F8 ?
                                                   000000000 ?
kkxexe+0250          bl       peicnt               FFFFFFFFFFF7E38 ? 110469588 ?
opiexe+2ef8          bl       kkxexe               11064CEA0 ?
kpoal8+0edc          bl       opiexe               FFFFFFFFFFFB454 ?
                                                   FFFFFFFFFFFB180 ?
                                                   FFFFFFFFFFF9628 ?
opiodr+0ae0          bl       _ptrgl               
ttcpip+1020          bl       _ptrgl               
opitsk+1124          bl       01F95258             
opiino+0990          bl       opitsk               0FFFFD490 ? 000000000 ?
opiodr+0ae0          bl       _ptrgl               
opidrv+0484          bl       01F947C0             
sou2o+0090           bl       opidrv               3C02D9855C ? 44065F000 ?
                                                   FFFFFFFFFFFF390 ?
opimai_real+01bc     bl       01F92214             
main+0098            bl       opimai_real          000000000 ? 000000000 ?
__start+0098         bl       main                 000000000 ? 000000000 ?

查询了一下metalink,发现错误号为kdsgrp1的已知600错误有20多个之多。当前的版本是10204,而这些已知bug又有80%都可能在这个版本上发生。不过仔细看一下错误的说明,这些错误有几分相似之处,多半都和表或索引的逻辑损坏有关。
在这个trace文件中,有明确的标识损坏之处:

            ROW 0766ce34.a continuation at
            file# 29 block# 2543156 slot 11 NOT found

显然这里出现了一个逻辑坏块,那么其实不用再去关心当前的问题属于哪个已知的bug,只要解决了这个坏块,就解决了这个ORA-600的错误。
有了FILE和BLOCK的序号,可以通过DUMP的手段,也可以查询DBA_EXTENTS视图,来确认坏块的内容。
如果坏块属于索引,那么可以通过删除索引然后重建的方式,很容易的解决问题。
如果坏块属于表,那么坏块中的部分行信息可能丢失。如果找不到出现错误之前的备份,可以通过DBMS_REPAIR包进行修复,或者设置alter session set events ‘10231 trace name context forever, level 10’,然后导出或CREATE TABLE AS SELECT将坏块意外的数据进行重建。
可以参考metalink文档ID 468883.1,获取更详细的信息。

This entry was posted in BUG and tagged , , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *