A customer's 11.2 database test environment hit an ORA-7445 (kcrfw_update_blk_list) error.
The detailed error information is as follows:
Tue DEC 20 22:00:02 2011
BEGIN automatic SQL Tuning Advisor run FOR special tuning task "SYS_AUTO_SQL_TUNING_TASK"
Tue DEC 20 22:00:46 2011
Exception [TYPE: SIGBUS, Non-existent physical address] [ADDR:0x62652000] [PC:0x216E0C4, kcrfw_update_blk_list()+196] [flags: 0x0, COUNT: 1]
Errors IN file /u01/app/oracle/diag/rdbms/fhacdb/fhacdb/trace/fhacdb_lgwr_24506.trc (incident=140173):
ORA-07445: exception encountered: core dump [kcrfw_update_blk_list()+196] [SIGBUS] [ADDR:0x62652000] [PC:0x216E0C4] [Non-existent physical address] []
Incident details IN: /u01/app/oracle/diag/rdbms/fhacdb/fhacdb/incident/incdir_140173/fhacdb_lgwr_24506_i140173.trc
USE ADRCI OR Support Workbench TO package the incident.
See Note 411.1 at My Oracle Support FOR error AND packaging details.
Tue DEC 20 22:00:48 2011
Dumping diagnostic DATA IN directory=[cdmp_20111220220048], requested BY (instance=1, osid=24506 (LGWR)), summary=[incident=140173].
Tue DEC 20 22:00:49 2011
PMON (ospid: 24482): terminating the instance due TO error 470
System state dump requested BY (instance=1, osid=24482 (PMON)), summary=[abnormal instance termination].
System State dumped TO trace file /u01/app/oracle/diag/rdbms/fhacdb/fhacdb/trace/fhacdb_diag_24492.trc
Tue DEC 20 22:00:50 2011
ORA-1092 : opitsk aborting process
Tue DEC 20 22:00:50 2011
License high water mark = 45
Dumping diagnostic DATA IN directory=[cdmp_20111220220049], requested BY (instance=1, osid=24482 (PMON)), summary=[abnormal instance termination].
Instance TERMINATED BY PMON, pid = 24482
USER (ospid: 17228): terminating the instance
Instance TERMINATED BY USER, pid = 17228
Tue DEC 20 22:01:03 2011
Adjusting the DEFAULT VALUE OF parameter parallel_max_servers FROM 960 TO 685 due TO the VALUE OF parameter processes (700)
Starting ORACLE instance (normal)
WARNING: You are trying TO USE the MEMORY_TARGET feature. This feature requires the /dev/shm file system TO be mounted FOR at least 7868514304 bytes. /dev/shm IS either NOT mounted OR IS mounted WITH available SPACE less than this SIZE. Please fix this so that MEMORY_TARGET can WORK AS expected. CURRENT available IS 7857356800 AND used IS 562315264 bytes. Ensure that the mount point IS /dev/shm FOR this directory.
memory_target needs larger /dev/shm
Wed DEC 21 09:34:59 2011
Adjusting the DEFAULT VALUE OF parameter parallel_max_servers FROM 960 TO 685 due TO the VALUE OF parameter processes (700)
Starting ORACLE instance (normal)
WARNING: You are trying TO USE the MEMORY_TARGET feature. This feature requires the /dev/shm file system TO be mounted FOR at least 7784628224 bytes. /dev/shm IS either NOT mounted OR IS mounted WITH available SPACE less than this SIZE. Please fix this so that MEMORY_TARGET can WORK AS expected. CURRENT available IS 7742439424 AND used IS 677232640 bytes. Ensure that the mount point IS /dev/shm FOR this directory.
memory_target needs larger /dev/shm
The corresponding detailed trace information is:
*** 2011-12-20 22:00:46.889
*** SESSION ID:(586.1) 2011-12-20 22:00:46.889
*** CLIENT ID:() 2011-12-20 22:00:46.889
*** SERVICE NAME:(SYS$BACKGROUND) 2011-12-20 22:00:46.889
*** MODULE NAME:() 2011-12-20 22:00:46.889
*** ACTION NAME:() 2011-12-20 22:00:46.889

Dump continued FROM file: /u01/app/oracle/diag/rdbms/fhacdb/fhacdb/trace/fhacdb_lgwr_24506.trc
ORA-07445: exception encountered: core dump [kcrfw_update_blk_list()+196] [SIGBUS] [ADDR:0x62652000] [PC:0x216E0C4] [Non-existent physical address] []

========= Dump FOR incident 140173 (ORA 7445 [kcrfw_update_blk_list()+196]) ========
----- Beginning of Customized Incident Dump(s) -----
Exception [TYPE: SIGBUS, Non-existent physical address] [ADDR:0x62652000] [PC:0x216E0C4, kcrfw_update_blk_list()+196] [flags: 0x0, COUNT: 1]
Registers:
%rax: 0x0000000000015ffc %rbx: 0x0000000000000000 %rcx: 0x0000000000033908
%rdx: 0x000000006263c000 %rdi: 0x0000000000001d55 %rsi: 0x000000000000eaa8
%rsp: 0x00007fff89e18410 %rbp: 0x00007fff89e18440 %r8: 0x0000000000000002
%r9: 0x0000000000015ffc %r10: 0x000000006263c000 %r11: 0x000000000000eaa8
%r12: 0x0000000000001d55 %r13: 0x00007fc19ece10b8 %r14: 0x0000000000000002
%r15: 0x0000000000000000 %rip: 0x000000000216e0c4 %efl: 0x0000000000010212
  kcrfw_update_blk_list()+170 (0x216e0aa) mov 0x5deb4677(%rip),%r12d
  kcrfw_update_blk_list()+177 (0x216e0b1) mov 0x5deb4660(%rip),%rdx
  kcrfw_update_blk_list()+184 (0x216e0b8) lea 0x0(,%r12,8),%r11
  kcrfw_update_blk_list()+192 (0x216e0c0) lea (%r11,%r12,4),%rax
> kcrfw_update_blk_list()+196 (0x216e0c4) mov %ebx,0x4(%rax,%rdx)
  kcrfw_update_blk_list()+200 (0x216e0c8) mov 0x5deb465a(%rip),%edx
  kcrfw_update_blk_list()+206 (0x216e0ce) mov 0x78(%r13),%rcx
  kcrfw_update_blk_list()+210 (0x216e0d2) mov 0x34(%rcx,%r15),%r15d
  kcrfw_update_blk_list()+215 (0x216e0d7) lea 0x0(,%rdx,8),%rax

*** 2011-12-20 22:00:46.901
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0)
----- SQL Statement (None) -----
CURRENT SQL information unavailable - no cursor.

----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex
location             TYPE     point                (? means dubious VALUE)
-------------------- -------- -------------------- ----------------------------
skdstdst()+36        CALL     kgdsdst()            000000000 ? 000000000 ?
                                                   7FC19EEA4098 ? 000000001 ?
                                                   000000001 ? 000000003 ?
ksedst1()+98         CALL     skdstdst()           000000000 ? 000000000 ?
                                                   7FC19EEA4098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
ksedst()+34          CALL     ksedst1()            000000001 ? 000000001 ?
                                                   7FC19EEA4098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
dbkedDefDump()+2741  CALL     ksedst()             000000001 ? 000000001 ?
                                                   7FC19EEA4098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
ksedmp()+36          CALL     dbkedDefDump()       000000003 ? 000000003 ?
                                                   7FC19EEA4098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
ssexhd()+2366        CALL     ksedmp()             000000003 ? 000000003 ?
                                                   7FC19EEA4098 ? 000000001 ?
                                                   000000000 ? 000000003 ?
__sighandler()       CALL     ssexhd()             000000007 ? 7FC19EEACD70 ?
                                                   7FC19EEACC68 ? 000000001 ?
                                                   000000000 ? 000000003 ?
kcrfw_update_blk_li  signal   __sighandler()       000001D55 ? 00000EAA8 ?
st()+196                                           06263C000 ? 000033908 ?
                                                   000000002 ? 000015FFC ?
kcrfw_post()+284     CALL     kcrfw_update_blk_li  7FC19ECE10B8 ? 00000EAA8 ?
                              st()                 06263C000 ? 000033908 ?
                                                   000000002 ? 000015FFC ?
kcrfw_redo_write()+  CALL     kcrfw_post()         7FFF89E18FB8 ? 00000EAA8 ?
2528                                               06263C000 ? 000033908 ?
                                                   000000002 ? 000015FFC ?
ksbabs()+771         CALL     kcrfw_redo_write()   7FFF89E18FB8 ? 000000018 ?
                                                   06263C000 ? 000033908 ?
                                                   000000002 ? 000015FFC ?
ksbrdp()+971         CALL     ksbabs()             7FFF89E18FB8 ? 000000018 ?
                                                   06263C000 ? 000033908 ?
                                                   000000002 ? 000015FFC ?
opirip()+618         CALL     ksbrdp()             7FFF89E18FB8 ? 000000018 ?
                                                   06263C000 ? 000033908 ?
                                                   000000002 ? 000015FFC ?
opidrv()+598         CALL     opirip()             000000032 ? 000000004 ?
                                                   7FFF89E1A178 ? 000033908 ?
                                                   000000002 ? 000015FFC ?
sou2o()+98           CALL     opidrv()             000000032 ? 000000004 ?
                                                   7FFF89E1A178 ? 000033908 ?
                                                   000000002 ? 000015FFC ?
opimai_real()+261    CALL     sou2o()              7FFF89E1A150 ? 000000032 ?
                                                   000000004 ? 7FFF89E1A178 ?
                                                   000000002 ? 000015FFC ?
ssthrdmain()+252     CALL     opimai_real()        000000000 ? 7FFF89E1A340 ?
                                                   000000004 ? 7FFF89E1A178 ?
                                                   000000002 ? 000015FFC ?
main()+196           CALL     ssthrdmain()         000000003 ? 7FFF89E1A340 ?
                                                   000000001 ? 000000000 ?
                                                   000000002 ? 000015FFC ?
__libc_start_main()  CALL     main()               000000003 ? 7FFF89E1A4E0 ?
+253                                               000000001 ? 000000000 ?
                                                   000000002 ? 000015FFC ?
_start()+36          CALL     __libc_start_main()  000A07804 ? 000000001 ?
                                                   7FFF89E1A4D8 ? 000000000 ?
                                                   000000002 ? 000015FFC ?

--------------------- Binary Stack Dump ---------------------
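Although MOS has no record of this error, the error messages and the trace make it fairly easy to conclude that the problem was caused by insufficient memory space.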
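Because the configured MEMORY_TARGET was larger than the size of /dev/shm, Oracle hit an exception while accessing a memory address (the SIGBUS raised in LGWR at kcrfw_update_blk_list()+196). PMON then terminated the instance, and the database crashed.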
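As a quick cross-check of the trace, the faulting address can be recomputed from the dumped registers. The sketch below only replays the three instructions shown around kcrfw_update_blk_list()+196; the function name `faulting_store_address` is made up for illustration, and the reading of the result (a 12-byte-per-entry array whose store crosses into a new page) is an inference from the disassembly, not something Oracle documents.

```python
# Register values copied from the trace dump (hypothetical helper, not an Oracle API).
R12 = 0x1D55             # index loaded by: mov 0x5deb4677(%rip),%r12d
RDX = 0x6263C000         # base loaded by:  mov 0x5deb4660(%rip),%rdx
REPORTED_ADDR = 0x62652000  # [ADDR:0x62652000] from the SIGBUS line
PAGE_SIZE = 4096         # assumption: standard 4 KiB pages


def faulting_store_address(r12: int, rdx: int) -> int:
    """Replay the address arithmetic of the instructions at +184, +192 and +196."""
    r11 = r12 * 8            # lea 0x0(,%r12,8),%r11
    rax = r11 + r12 * 4      # lea (%r11,%r12,4),%rax   -> r12 * 12
    return 0x4 + rax + rdx   # mov %ebx,0x4(%rax,%rdx)  -> store target


addr = faulting_store_address(R12, RDX)
print(hex(addr))                  # 0x62652000
print(addr == REPORTED_ADDR)      # True
print(addr % PAGE_SIZE == 0)      # True: the store lands on the first byte of a page
```

The recomputed address matches the one reported with the SIGBUS, and it sits exactly on a page boundary. That is consistent with LGWR touching a fresh page that could not be backed by /dev/shm, although the trace alone does not prove it.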
The fix is therefore simple: reduce MEMORY_TARGET or enlarge /dev/shm so that MEMORY_TARGET is smaller than /dev/shm, and the database starts normally.
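To see how far off the configuration was, the numbers in the two startup warnings can be compared directly. This is a minimal sketch: `shm_shortfall` and `shm_available_bytes` are hypothetical helper names, and the live check assumes a Unix-like system where `os.statvfs` is available.

```python
import os


def shm_shortfall(required_bytes: int, available_bytes: int) -> int:
    """Return how many bytes are missing (0 if the file system is large enough)."""
    return max(0, required_bytes - available_bytes)


def shm_available_bytes(path: str = "/dev/shm") -> int:
    """Free space on the given mount point, as seen by an unprivileged process."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


# Values copied from the two "WARNING: You are trying TO USE the MEMORY_TARGET
# feature" messages in the alert log.
first_start = shm_shortfall(7868514304, 7857356800)
second_start = shm_shortfall(7784628224, 7742439424)

print(first_start)    # 11157504 bytes short (about 10.6 MiB)
print(second_start)   # 42188800 bytes short (about 40.2 MiB)
```

In both startup attempts /dev/shm was only a few tens of megabytes too small, so either a slightly smaller MEMORY_TARGET or a slightly larger tmpfs would satisfy the check. On a live host, `shm_available_bytes()` can be compared against the planned MEMORY_TARGET before starting the instance.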