11.2.0.1的ASM出现ORA-600[kfgFinalize_2]错误。
错误信息为:
Fri Jun 17 09:59:51 2011 NOTE: No asm libraries found IN the system * instance_number obtained FROM CSS = 2, checking FOR the existence OF node 0... * node 0 does NOT exist. instance_number = 2 Starting ORACLE instance (normal) sskgpgetexecname failed TO GET name LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 Interface TYPE 1 en1 192.168.138.0 configured FROM GPnP Profile FOR USE AS a cluster interconnect Interface TYPE 1 en0 10.121.128.0 configured FROM GPnP Profile FOR USE AS a public interface Picked latch-free SCN scheme 3 USING LOG_ARCHIVE_DEST_1 parameter DEFAULT VALUE AS /u01/app/grid/dbs/arch Autotune OF undo retention IS turned ON. LICENSE_MAX_USERS = 0 SYS auditing IS disabled Starting up: Oracle DATABASE 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production WITH the REAL Application Clusters AND Automatic Storage Management options. USING parameter settings IN server-side spfile +CRS/jhzw-cluster/asmparameterfile/registry.253.732844175 System parameters WITH non-DEFAULT VALUES: large_pool_size = 12M instance_type = "asm" remote_login_passwordfile= "EXCLUSIVE" asm_diskgroups = "DATA" asm_diskgroups = "FRA" asm_power_limit = 1 diagnostic_dest = "/u01/grid" Cluster communication IS configured TO USE the following interface(s) FOR this instance 192.168.138.187 cluster interconnect IPC version:Oracle UDP/IP (generic) IPC Vendor 1 proto 2 Fri Jun 17 09:59:59 2011 PMON started WITH pid=2, OS id=258330 Fri Jun 17 09:59:59 2011 VKTM started WITH pid=3, OS id=311588 at elevated priority VKTM running at (10)millisec PRECISION WITH DBRM quantum (100)ms Fri Jun 17 09:59:59 2011 GEN0 started WITH pid=4, OS id=319784 Fri Jun 17 10:00:00 2011 DIAG started WITH pid=5, OS id=282666 Fri Jun 17 10:00:00 2011 PING started WITH pid=6, OS id=323882 Fri Jun 17 10:00:00 2011 PSP0 started WITH pid=7, OS id=327980 Fri Jun 17 10:00:00 2011 DIA0 started WITH pid=8, OS id=286766 Fri Jun 17 10:00:00 2011 LMON started WITH pid=9, OS id=332080 Fri Jun 17 10:00:02 2011 LMD0 started WITH pid=10, OS id=278586 Fri Jun 17 10:00:03 2011 LMS0 started WITH pid=11, OS id=290872 at elevated priority Fri Jun 17 10:00:03 2011 LMHB started WITH pid=12, OS id=299072 Fri Jun 17 10:00:03 2011 MMAN started WITH pid=13, OS id=336182 Fri Jun 17 10:00:03 2011 DBW0 started WITH pid=14, OS id=303170 Fri Jun 17 10:00:03 2011 LGWR started WITH pid=15, OS id=348478 Fri Jun 17 10:00:03 2011 CKPT started WITH pid=16, OS id=352578 Fri Jun 17 10:00:03 2011 SMON started WITH pid=17, OS id=311366 Fri Jun 17 10:00:03 2011 RBAL started WITH pid=18, OS id=307272 Fri Jun 17 10:00:03 2011 GMON started WITH pid=19, OS id=315464 Fri Jun 17 10:00:04 2011 MMON started WITH pid=20, OS id=340278 Fri Jun 17 10:00:04 2011 MMNL started WITH pid=21, OS id=319562 lmon registered WITH NM - instance NUMBER 2 (internal mem no 1) Reconfiguration started (OLD inc 0, NEW inc 12) ASM instance List OF instances: 1 2 (myinst: 2) Global Resource Directory frozen * allocate DOMAIN 0, invalid = TRUE Communication channels reestablished * DOMAIN 0 valid = 1 according TO instance 1 * allocate DOMAIN 1, invalid = TRUE * DOMAIN 1 valid = 1 according TO instance 1 * allocate DOMAIN 2, invalid = TRUE * DOMAIN 2 valid = 1 according TO instance 1 * allocate DOMAIN 3, invalid = TRUE * DOMAIN 3 valid = 1 according TO instance 1 Master broadcasted resource hash VALUE bitmaps Non-LOCAL Process blocks cleaned OUT LMS 0: 0 GCS shadows cancelled, 0 closed, 0 Xw survived SET master node info Submitted ALL remote-enqueue requests Dwn-cvts replayed, VALBLKs dubious ALL grantable enqueues GRANTED Submitted ALL GCS remote-cache requests Fix WRITE IN gcs resources Reconfiguration complete Fri Jun 17 10:00:06 2011 LCK0 started WITH pid=22, OS id=327758 Fri Jun 17 10:00:09 2011 ORACLE_BASE NOT SET IN environment. It IS recommended that ORACLE_BASE be SET IN the environment Fri Jun 17 10:00:12 2011 SQL> ALTER DISKGROUP ALL MOUNT /* asm agent */ NOTE: Diskgroups listed IN ASM_DISKGROUPS are DATA FRA NOTE: Diskgroup used FOR Voting files IS: CRS Diskgroup WITH spfile:CRS Diskgroup used FOR OCR IS:CRS NOTE: cache registered GROUP CRS NUMBER=1 incarn=0xfdaebf4e NOTE: cache began mount (NOT FIRST) OF GROUP CRS NUMBER=1 incarn=0xfdaebf4e NOTE: cache registered GROUP DATA NUMBER=2 incarn=0xfe4ebf4f NOTE: cache began mount (NOT FIRST) OF GROUP DATA NUMBER=2 incarn=0xfe4ebf4f NOTE: cache registered GROUP FRA NUMBER=3 incarn=0xfe5ebf50 NOTE: cache began mount (NOT FIRST) OF GROUP FRA NUMBER=3 incarn=0xfe5ebf50 NOTE: Assigning NUMBER (1,2) TO disk (/dev/rhdisk10) NOTE: Assigning NUMBER (1,0) TO disk (/dev/rhdisk8) NOTE: Assigning NUMBER (1,1) TO disk (/dev/rhdisk9) NOTE: Assigning NUMBER (2,1) TO disk (/dev/rhdisk12) NOTE: Assigning NUMBER (2,0) TO disk (/dev/rhdisk6) NOTE: Assigning NUMBER (3,0) TO disk (/dev/rhdisk7) kfdp_query(CRS): 4 Fri Jun 17 10:00:14 2011 kfdp_queryBg(): 4 NOTE: cache opening disk 0 OF grp 1: CRS_0000 path:/dev/rhdisk8 NOTE: F1X0 found ON disk 0 au 2 fcn 0.0 NOTE: cache opening disk 1 OF grp 1: CRS_0001 path:/dev/rhdisk9 NOTE: F1X0 found ON disk 1 au 2 fcn 0.0 NOTE: cache opening disk 2 OF grp 1: CRS_0002 path:/dev/rhdisk10 NOTE: F1X0 found ON disk 2 au 2 fcn 0.0 NOTE: cache mounting (NOT FIRST) normal redundancy GROUP 1/0xFDAEBF4E (CRS) Fri Jun 17 10:00:14 2011 kjbdomatt send TO inst 1 Fri Jun 17 10:00:14 2011 NOTE: attached TO recovery DOMAIN 1 Fri Jun 17 10:00:14 2011 NOTE: LGWR attempting TO mount thread 2 FOR diskgroup 1 (CRS) NOTE: LGWR found thread 2 OPEN - signalling ORA-15133 WARNING: instance recovery required during mount kfdp_query(DATA): 5 kfdp_queryBg(): 5 NOTE: cache opening disk 0 OF grp 2: DATA_0000 path:/dev/rhdisk6 NOTE: F1X0 found ON disk 0 au 2 fcn 0.61290 NOTE: cache opening disk 1 OF grp 2: DATA_0001 path:/dev/rhdisk12 NOTE: cache mounting (NOT FIRST) external redundancy GROUP 2/0xFE4EBF4F (DATA) kjbdomatt send TO inst 1 NOTE: attached TO recovery DOMAIN 2 NOTE: LGWR attempting TO mount thread 2 FOR diskgroup 2 (DATA) NOTE: LGWR found thread 2 closed at ABA 12.1279 NOTE: LGWR mounted thread 2 FOR diskgroup 2 (DATA) NOTE: LGWR opening thread 2 at fcn 0.1600788 ABA 13.1280 NOTE: cache mounting GROUP 2/0xFE4EBF4F (DATA) succeeded NOTE: cache ending mount (success) OF GROUP DATA NUMBER=2 incarn=0xfe4ebf4f kfdp_query(FRA): 6 kfdp_queryBg(): 6 NOTE: cache opening disk 0 OF grp 3: FRA_0000 path:/dev/rhdisk7 NOTE: F1X0 found ON disk 0 au 2 fcn 0.0 NOTE: cache mounting (NOT FIRST) external redundancy GROUP 3/0xFE5EBF50 (FRA) kjbdomatt send TO inst 1 NOTE: attached TO recovery DOMAIN 3 NOTE: LGWR attempting TO mount thread 2 FOR diskgroup 3 (FRA) NOTE: LGWR found thread 2 closed at ABA 12.91 NOTE: LGWR mounted thread 2 FOR diskgroup 3 (FRA) NOTE: LGWR opening thread 2 at fcn 0.2439 ABA 13.92 NOTE: cache mounting GROUP 3/0xFE5EBF50 (FRA) succeeded NOTE: cache ending mount (success) OF GROUP FRA NUMBER=3 incarn=0xfe5ebf50 Fri Jun 17 10:00:17 2011 kfdp_query(DATA): 7 kfdp_queryBg(): 7 NOTE: Instance updated compatible.asm TO 11.2.0.0.0 FOR grp 2 SUCCESS: diskgroup DATA was mounted kfdp_query(FRA): 8 kfdp_queryBg(): 8 NOTE: Instance updated compatible.asm TO 11.2.0.0.0 FOR grp 3 SUCCESS: diskgroup FRA was mounted kfdp_query(CRS): 10 kfdp_queryBg(): 10 NOTE: cache opening disk 0 OF grp 1: CRS_0000 path:/dev/rhdisk8 NOTE: F1X0 found ON disk 0 au 2 fcn 0.0 NOTE: cache opening disk 1 OF grp 1: CRS_0001 path:/dev/rhdisk9 NOTE: F1X0 found ON disk 1 au 2 fcn 0.0 NOTE: cache opening disk 2 OF grp 1: CRS_0002 path:/dev/rhdisk10 NOTE: F1X0 found ON disk 2 au 2 fcn 0.0 NOTE: cache mounting (NOT FIRST) (retry) normal redundancy GROUP 1/0xFDAEBF4E (CRS) NOTE: LGWR attempting TO mount thread 2 FOR diskgroup 1 (CRS) NOTE: LGWR found thread 2 OPEN - signalling ORA-15133 WARNING: instance recovery required during mount . . . Fri Jun 17 10:06:55 2011 kfdp_query(CRS): 406 kfdp_queryBg(): 406 NOTE: cache opening disk 0 OF grp 1: CRS_0000 path:/dev/rhdisk8 NOTE: F1X0 found ON disk 0 au 2 fcn 0.0 NOTE: cache opening disk 1 OF grp 1: CRS_0001 path:/dev/rhdisk9 NOTE: F1X0 found ON disk 1 au 2 fcn 0.0 NOTE: cache opening disk 2 OF grp 1: CRS_0002 path:/dev/rhdisk10 NOTE: F1X0 found ON disk 2 au 2 fcn 0.0 NOTE: cache mounting (NOT FIRST) (retry) normal redundancy GROUP 1/0xFDAEBF4E (CRS) NOTE: LGWR attempting TO mount thread 2 FOR diskgroup 1 (CRS) NOTE: LGWR found thread 2 OPEN - signalling ORA-15133 WARNING: instance recovery required during mount Errors IN file /u01/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_340052.trc (incident=11385): ORA-00600: internal error code, arguments: [kfgFinalize_2], [], [], [], [], [], [], [], [], [], [], [] Incident details IN: /u01/grid/diag/asm/+asm/+ASM2/incident/incdir_11385/+ASM2_ora_340052_i11385.trc Fri Jun 17 10:07:01 2011 Trace dumping IS performing id=[cdmp_20110617100701] ORA-00600: internal error code, arguments: [kfgFinalize_2], [], [], [], [], [], [], [], [], [], [], [] ERROR: ALTER DISKGROUP ALL MOUNT /* asm agent */ NOTE: cache dismounting (clean) GROUP 1/0xFDAEBF4E (CRS) NOTE: lgwr NOT being msg'd to dismount Fri Jun 17 10:07:02 2011 kjbdomdet send to inst 1 detach from dom 1, sending detach message to inst 1 Fri Jun 17 10:07:02 2011 NOTE: detached from domain 1 NOTE: cache dismounted group 1/0xFDAEBF4E (CRS) NOTE: cache ending mount (fail) of group CRS number=1 incarn=0xfdaebf4e kfdp_dismount(): 407 Fri Jun 17 10:07:02 2011 kfdp_dismountBg(): 407 NOTE: De-assigning number (1,0) from disk (/dev/rhdisk8) NOTE: De-assigning number (1,1) from disk (/dev/rhdisk9) NOTE: De-assigning number (1,2) from disk (/dev/rhdisk10) ERROR: diskgroup CRS was not mounted NOTE: cache deleting context for group CRS 1/-38879410 Fri Jun 17 10:07:02 2011 Shutting down instance (abort) License high water mark = 2 USER (ospid: 307518): terminating the instance Instance terminated by USER, pid = 307518 Fri Jun 17 10:07:03 2011 Instance shutdown complete |
在实例异常崩溃后,尝试启动ASM实例时,ASM尝试进行实例回复,并导致了错误,Oracle在不断的进行多次尝试后,出现了这个ORA-600的错误,并最终导致了实例的异常关闭。
查询MOS,发现存在几个和当前类似的错误信息,错误函数都是kfgFinalize_2,不过绝大部分错误都明确在11.2.0.1中被FIXED。经排除和文档Bug 6771400 – ASM fails to mount a disk group with OERI[kfgfinalize_2] [ID 6771400.8]描述的最为解决,该错误确认影响的版本是10.2.0.3和10.2.0.4,在10.2.0.5和11.1.0.6中被FIXED,该bug并没有提及11.2版本,因此不确认当前问题是否在11.2.0.1版本出现,不过从其他表相上看,当前问题属于这个bug的可能性很大。
这个问题发生在ASM异常崩溃后,可能会导致ASM实例第一次重启出现异常,随后的手工启动并未发生异常,且该问题出现概率很低,可以忽略掉该错误。