ORA-600 (kcbshlc_1) and ORA-7445 (kggchk) Errors

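In the past, when I covered two ORA-600 errors in a single post, it was usually because they occurred at the same time and were different symptoms of the same failure. This time the two errors appeared separately.
On a customer's 10.2.0.4 logical standby database, these two errors showed up on several occasions: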

Thu Jun 16 13:45:05 2011
Errors in file /u01/app/oracle/admin/db/bdump/db_pmon_27660.trc:
ORA-07445: exception encountered: core dump [kggchk()+77] [SIGSEGV] [Address not mapped to object] [0x000000000] [] []
Thu Jun 16 13:45:13 2011
CKPT: terminating instance due to error 472
Instance terminated by CKPT, pid = 27670
.
.
.
Sat Jun 25 01:44:02 2011
Errors in file /u01/app/oracle/admin/db/bdump/db_pmon_18907.trc:
ORA-00600: internal error code, arguments: [kcbshlc_1], [5], [], [], [], [], [], []
Sat Jun 25 01:44:04 2011
Errors in file /u01/app/oracle/admin/db/bdump/db_pmon_18907.trc:
ORA-00600: internal error code, arguments: [kcbshlc_1], [5], [], [], [], [], [], []
Sat Jun 25 01:44:04 2011
PMON: terminating instance due to error 472
Sat Jun 25 01:44:04 2011
krvxerpt: Errors detected in process 20, role reader.
Sat Jun 25 01:44:04 2011
krvxmrs: Leaving by exception: 472
Sat Jun 25 01:44:04 2011
Errors in file /u01/app/oracle/admin/db/bdump/db_p000_19090.trc:
ORA-00472: PMON process terminated with error
LOGSTDBY status: ORA-00472: PMON process terminated with error
.
.
.
Mon Oct 31 23:23:03 2011
Errors in file /u01/app/oracle/admin/db/bdump/db_pmon_20147.trc:
ORA-07445: exception encountered: core dump [kggchk()+77] [SIGSEGV] [Address not mapped to object] [0x000000000] [] []
Mon Oct 31 23:23:07 2011
CJQ0: terminating instance due to error 472
Mon Oct 31 23:23:07 2011
krvxerpt: Errors detected in process 20, role reader.
Mon Oct 31 23:23:07 2011
krvxmrs: Leaving by exception: 472
Mon Oct 31 23:23:07 2011
Errors in file /u01/app/oracle/admin/db/bdump/db_p000_20224.trc:
ORA-00472: PMON process terminated with error
LOGSTDBY status: ORA-00472: PMON process terminated with error
Mon Oct 31 23:23:07 2011
Errors in file /u01/app/oracle/admin/db/bdump/db_psp0_20149.trc:
ORA-00472: PMON process terminated with error
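
To see at a glance which background process raised each internal error, an alert log excerpt like the one above can be scanned with a short script. This is only a minimal sketch: it assumes trace files are named <sid>_<process>_<pid>.trc, as in this excerpt, and the function name errors_by_process is made up for illustration.

```python
import re

# "Errors in file .../<sid>_<process>_<pid>.trc:" (matched case-insensitively).
TRACE_RE = re.compile(r"errors in file \S*/([^/_]+)_([^/_]+)_(\d+)\.trc",
                      re.IGNORECASE)
# An ORA- error line; the first bracketed argument identifies the failure.
ERROR_RE = re.compile(r"^(ORA-\d{5}):.*?\[([^\]]+)\]")

INTERNAL_ERRORS = ("ORA-00600", "ORA-07445")


def errors_by_process(lines):
    """Return (process, ora_code, first_argument) for each internal error.

    The process name is taken from the most recent "Errors in file" line.
    """
    found = []
    process = None
    for line in lines:
        trace = TRACE_RE.search(line)
        if trace:
            process = trace.group(2).lower()
            continue
        error = ERROR_RE.match(line.strip())
        if error and process and error.group(1) in INTERNAL_ERRORS:
            found.append((process, error.group(1), error.group(2)))
    return found


SAMPLE = [
    "Errors in file /u01/app/oracle/admin/db/bdump/db_pmon_27660.trc:",
    "ORA-07445: exception encountered: core dump [kggchk()+77] [SIGSEGV] [] []",
    "Errors in file /u01/app/oracle/admin/db/bdump/db_pmon_18907.trc:",
    "ORA-00600: internal error code, arguments: [kcbshlc_1], [5], [], []",
    "Errors in file /u01/app/oracle/admin/db/bdump/db_p000_19090.trc:",
    "ORA-00472: PMON process terminated with error",
]

if __name__ == "__main__":
    for process, code, argument in errors_by_process(SAMPLE):
        print(process, code, argument)
```

ORA-00472 lines are skipped on purpose, since they only report that PMON has already died and are a consequence rather than the original failure.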

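There are good reasons for treating the two errors together. First, both the ORA-600 (kcbshlc_1) error and the ORA-7445 (kggchk) error were raised in the PMON process, and both crashed the database directly. Second, a logical standby usually serves read-only workloads, so the processes most likely to fail are the apply processes, and the two errors behave identically in this respect: both brought the database down, but neither recurred immediately after the restart, and log apply then ran without problems. This suggests there is no necessary causal link between the errors and log apply. Third, and most important, in the detailed trace of the ORA-7445 error, the function that appears immediately before kggchk is kcbshlc: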

*** 2011-10-31 23:23:03.108
ksedmp: internal or fatal error
ORA-07445: exception encountered: core dump [kggchk()+77] [SIGSEGV] [Address not mapped to object] [0x000000000] [] []
----- Call Stack Trace -----
calling              call     entry                argument values in hex      
location             type     point                (? means dubious value)     
-------------------- -------- -------------------- ----------------------------
ksedst()+31          call     ksedst1()            000000000 ? 000000001 ?
                                                   2A97172D50 ? 2A97172DB0 ?
                                                   2A97172CF0 ? 000000000 ?
ksedmp()+610         call     ksedst()             000000000 ? 000000001 ?
                                                   2A97172D50 ? 2A97172DB0 ?
                                                   2A97172CF0 ? 000000000 ?
ssexhd()+629         call     ksedmp()             000000003 ? 000000001 ?
                                                   2A97172D50 ? 2A97172DB0 ?
                                                   2A97172CF0 ? 000000000 ?
__funlockfile()+64   call     ssexhd()             00000000B ? 2A97173D70 ?
                                                   2A97173C40 ? 2A97172DB0 ?
                                                   2A97172CF0 ? 000000000 ?
kggchk()+77          signal   __funlockfile()      0066876E0 ? 000000000 ?
                                                   000000018 ? 0010F4468 ?
                                                   000000000 ? 0052EBEA0 ?
kcbshlc()+105        call     kggchk()             0066876E0 ? 000000000 ?
                                                   000000018 ? 0010F4468 ?
                                                   000000000 ? 0052EBEA0 ?
kslilcr()+770        call     kcbshlc()            0066876E0 ? 84EC40698 ?
                                                   000000018 ? 0010F4468 ?
                                                   000000000 ? 0052EBEA0 ?
ksl_cleanup()+1567   call     kslilcr()            0010F4468 ? 000000000 ?
                                                   000000000 ? 84EC40698 ?
                                                   0066876E0 ? 0052EBEA0 ?
ksuxfl()+492         call     ksl_cleanup()        000000000 ? 000000000 ?
                                                   000000000 ? 84EC40698 ?
                                                   0066876E0 ? 0052EBEA0 ?
ksuxda()+55          call     ksuxfl()             85F3A6168 ? 000000000 ?
                                                   000000000 ? 84EC40698 ?
                                                   0066876E0 ? 0052EBEA0 ?
ksucln()+1390        call     ksuxda()             85F3A6168 ? 000000000 ?
                                                   000000000 ? 84EC40698 ?
                                                   0066876E0 ? 0052EBEA0 ?
ksbrdp()+794         call     ksucln()             060008100 ? 000000000 ?
                                                   043FC1A0B ? 84EC40698 ?
                                                   0066876E0 ? 0052EBEA0 ?
opirip()+616         call     ksbrdp()             060008100 ? 000000000 ?
                                                   000000001 ? 060008100 ?
                                                   0066876E0 ? 0052EBEA0 ?
opidrv()+582         call     opirip()             000000032 ? 000000004 ?
                                                   7FBFFFF738 ? 060008100 ?
                                                   0066876E0 ? 0052EBEA0 ?
sou2o()+114          call     opidrv()             000000032 ? 000000004 ?
                                                   7FBFFFF738 ? 060008100 ?
                                                   0066876E0 ? 0052EBEA0 ?
opimai_real()+317    call     sou2o()              7FBFFFF710 ? 000000032 ?
                                                   000000004 ? 7FBFFFF738 ?
                                                   0066876E0 ? 0052EBEA0 ?
main()+116           call     opimai_real()        000000003 ? 7FBFFFF7A0 ?
                                                   000000004 ? 7FBFFFF738 ?
                                                   0066876E0 ? 0052EBEA0 ?
__libc_start_main()  call     main()               000000003 ? 7FBFFFF7A0 ?
+219                                               000000004 ? 7FBFFFF738 ?
                                                   0066876E0 ? 0052EBEA0 ?
_start()+42          call     __libc_start_main()  000713988 ? 000000001 ?
                                                   7FBFFFF8E8 ? 005288D00 ?
                                                   000000000 ? 000000003 ?
--------------------- Binary Stack Dump ---------------------

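Judging from these three points, the two errors are most likely caused by the same bug. A search on MOS shows that the note ORA-600 [kcbshlc_1] [ID 1274837.1] is the closest match. The problem can be resolved by upgrading the database to 10.2.0.4.3 or 10.2.0.5.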
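
The third point, that kggchk was entered from kcbshlc, can also be checked mechanically when there are many trace files to go through. The following is a rough sketch rather than a general trace parser: it assumes the 10.2-style "Call Stack Trace" layout shown in this post, where each frame line starts with the calling location and continuation lines are indented. The names call_chain and caller_of are hypothetical.

```python
import re

# A frame line: "<caller>()+<offset>   call|signal   <callee>()   <args...>".
# The offset is optional because long names wrap it onto the next line.
FRAME_RE = re.compile(r"^(\w+)\(\)(?:\+\d+)?\s+(?:call|signal)\s+(\w+)\(\)",
                      re.IGNORECASE)


def call_chain(trace_lines):
    """Return (caller, callee) pairs in trace order, innermost frame first."""
    frames = []
    for line in trace_lines:
        match = FRAME_RE.match(line)
        if match:
            frames.append((match.group(1), match.group(2)))
    return frames


def caller_of(frames, func):
    """Return the function that called `func`, or None if it is not listed."""
    for caller, callee in frames:
        if callee == func:
            return caller
    return None


SAMPLE = [
    "kggchk()+77          signal   __funlockfile()      0066876E0 ? 000000000 ?",
    "                                                   000000018 ? 0010F4468 ?",
    "kcbshlc()+105        call     kggchk()             0066876E0 ? 000000000 ?",
    "                                                   000000018 ? 0010F4468 ?",
    "kslilcr()+770        call     kcbshlc()            0066876E0 ? 84EC40698 ?",
    "ksl_cleanup()+1567   call     kslilcr()            0010F4468 ? 000000000 ?",
]

if __name__ == "__main__":
    frames = call_chain(SAMPLE)
    print("kggchk was called from:", caller_of(frames, "kggchk"))
```

If the caller of kggchk in an ORA-7445 trace is kcbshlc and the call originates from PMON's cleanup path (ksl_cleanup, kslilcr), the trace matches the pattern described here. That is only a hint that the same bug is involved, not proof.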
