Two ORA-600 errors caused by a shared disk problem.
A customer's 10.2.0.4 RAC on Linux x86-64 logged a large number of errors in its alert log:
Tue Apr 24 16:15:04 2012
Errors in file /u01/admin/orcl/udump/orcl1_ora_10437.trc:
ORA-00600: internal error code, arguments: [KSFD_DECAIOPC], [0xFC213CBF0], [], [], [], [], [], []
ORA-07445: exception encountered: core dump [<0x9293a0>] [SIGSEGV] [Address not mapped to object] [0x0000007CA] [] []
ORA-07445: exception encountered: core dump [<0xb5814a>] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF9] [] []
ORA-00333: redo log read error block 2 count 8192
ORA-00202: control file: '+ASM_DISK1/orcl/controlfile/current.256.757170241'
ORA-15081: failed to submit an I/O operation to a disk
Tue Apr 24 16:15:14 2012
WARNING: kfk failed to open a disk[/dev/oracleasm/disks/DISK4]
Tue Apr 24 16:15:14 2012
Errors in file /u01/admin/orcl/udump/orcl1_ora_10437.trc:
ORA-15025: could not open disk '/dev/oracleasm/disks/DISK4'
ORA-27041: unable to open file
Linux-x86_64 Error: 24: Too many open files
Additional information: 3
ORA-00600: internal error code, arguments: [KSFD_DECAIOPC], [0xFC213CBF0], [], [], [], [], [], []
ORA-07445: exception encountered: core dump [<0x9293a0>] [SIGSEGV] [Address not mapped to object] [0x0000007CA] [] []
ORA-07445: exception encountered: core dump [<0xb5814a>] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF9] [] []
ORA-00333: redo log read error block 2 count 8192
ORA-00202: control file: '+ASM_DISK1/orcl/controlfile/current.256.757170241'
ORA-15081: failed to submit an I/O operation to a disk
WARNING: kfk failed to open a disk[/dev/oracleasm/disks/DISK2]
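Both the current version, 10.2.0.4, and the ORA-600 argument KSFD_DECAIOPC match the description of bug 8433026. However, this database has no Streams environment configured. It does run DSG replication, but that is still different from a Streams apply.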
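Judging from the error messages (ORA-15081, ORA-15025, and the kfk warnings that DISK4 and DISK2 could not be opened), the main problem is that some disks in the ASM disk group are faulty, which causes reads to fail.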
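When an alert log holds a large number of such entries, a quick tally of the ORA codes and of the disks named in the open failures makes the pattern easier to see. The following is only a minimal sketch: the function name summarize_alert_log and the shortened sample text are illustrative assumptions, not part of any Oracle tooling, and the regular expressions cover only the message shapes that appear in the excerpt above.

```python
import re
from collections import Counter

# Matches ORA-nnnnn error codes.
ORA_CODE = re.compile(r"ORA-\d{5}")
# Matches the disk path in "failed to open a disk[...]" and
# "could not open disk '...'" messages (case-insensitive).
DISK_PATH = re.compile(r"(?:open a disk\[|open disk ')([^\]']+)", re.IGNORECASE)


def summarize_alert_log(text):
    """Return (Counter of ORA codes, sorted list of disks that failed to open)."""
    codes = Counter(ORA_CODE.findall(text))
    disks = sorted(set(DISK_PATH.findall(text)))
    return codes, disks


# Shortened sample taken from the alert log excerpt in this post.
sample = """\
ORA-00600: internal error code, arguments: [KSFD_DECAIOPC], [0xFC213CBF0], [], [], [], [], [], []
ORA-00333: redo log read error block 2 count 8192
ORA-15081: failed to submit an I/O operation to a disk
WARNING: kfk failed to open a disk[/dev/oracleasm/disks/DISK4]
ORA-15025: could not open disk '/dev/oracleasm/disks/DISK4'
ORA-27041: unable to open file
WARNING: kfk failed to open a disk[/dev/oracleasm/disks/DISK2]
"""

codes, disks = summarize_alert_log(sample)
for code, count in sorted(codes.items()):
    print(code, count)
print("disks that failed to open:", disks)
```

If the I/O errors (ORA-15081, ORA-15025, ORA-27041) consistently accompany the ORA-600 entries and name specific ASM disks, that points toward the storage layer and away from a pure software bug.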
Besides the KSFD_DECAIOPC error, the underlying shared storage problem also caused another ORA-600 error:
Tue Jul 10 10:39:53 2012
Errors in file /u01/admin/orcl/udump/orcl1_ora_19546.trc:
ORA-00600: internal error code, arguments: [kfioReapIO00], [0], [52], [], [], [], [], []
ORA-00333: redo log read error block 2 count 8192
ORA-00600: internal error code, arguments: [KSFD_DECAIOPC], [0xFC7D6ADB8], [], [], [], [], [], []
ORA-07445: exception encountered: core dump [<0x9293a0>] [SIGSEGV] [Address not mapped to object] [0x0000007CA] [] []
ORA-07445: exception encountered: core dump [<0xb5814a>] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF9] [] []
ORA-00333: redo log read error block 2 count 8192
ORA-00202: control file: '+ASM_DISK1/orcl/controlfile/current.256.757170241'
ORA-15081: failed to submit an I/O operation to a disk
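Clearly, both ORA-600 errors were caused by the underlying disk errors. Once the hardware staff fixed the shared disk fault, the ASM instance returned to normal without a restart, and no similar errors have appeared since.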