An ORA-600 [2801] error was raised while shutting down a database in a 10.2.0.4 RAC environment.
The full alert log excerpt is as follows:
Mon Apr 11 22:44:24 2011
ALTER DATABASE ADD supplemental log DATA
Tue Apr 12 00:11:42 2011
Thread 1 advanced TO log SEQUENCE 3108 (LGWR switch)
CURRENT log# 1 seq# 3108 mem# 0: +DATA/orcl/onlinelog/group_1.307.727394923
CURRENT log# 1 seq# 3108 mem# 1: +DATA/orcl/onlinelog/group_1.306.727394925
Tue Apr 12 00:16:55 2011
Thread 1 advanced TO log SEQUENCE 3109 (LGWR switch)
CURRENT log# 2 seq# 3109 mem# 0: +DATA/orcl/onlinelog/group_2.305.727394927
CURRENT log# 2 seq# 3109 mem# 1: +DATA/orcl/onlinelog/group_2.304.727394929
Tue Apr 12 00:17:09 2011
ALTER SYSTEM SET service_names='orcl' SCOPE=MEMORY SID='orcl1';
Tue Apr 12 00:17:16 2011
Shutting down instance: further logons disabled
Tue Apr 12 00:17:34 2011
Stopping background process QMNC
Tue Apr 12 00:17:35 2011
Stopping background process CJQ0
Tue Apr 12 00:17:36 2011
Stopping background process MMNL
Tue Apr 12 00:17:37 2011
Stopping background process MMON
Tue Apr 12 00:17:39 2011
Shutting down instance (immediate)
License high water mark = 2047
Tue Apr 12 00:17:39 2011
Stopping Job queue slave processes, flags = 7
Tue Apr 12 00:17:39 2011
SUPLOG: Supplemental log DDL failed at scn = 5550732597
SUPLOG: minimal = ON, PRIMARY KEY = ON
SUPLOG: UNIQUE = ON, FOREIGN KEY = OFF, ALL COLUMN = OFF
ORA-1089 signalled during: ALTER DATABASE ADD supplemental log DATA...
Tue Apr 12 00:17:39 2011
SUPLOG STATE OBJECT CLEANUP: Failed DDL needs ROLLBACK
Tue Apr 12 00:17:39 2011
Process OS id : 4151 alive after KILL
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_23074.trc
Tue Apr 12 00:17:39 2011
Job queue slave processes stopped
ALL dispatchers AND shared servers shutdown
Tue Apr 12 00:17:42 2011
ALTER DATABASE CLOSE NORMAL
Tue Apr 12 00:17:52 2011
Reconfiguration started (OLD inc 8, NEW inc 10)
List OF nodes: 0
Global Resource Directory frozen
* dead instance detected - DOMAIN 0 invalid = TRUE
Communication channels reestablished
Master broadcasted resource hash VALUE bitmaps
Non-LOCAL Process blocks cleaned OUT
Tue Apr 12 00:17:52 2011
LMS 0: 0 GCS shadows cancelled, 0 closed
Tue Apr 12 00:17:52 2011
LMS 3: 0 GCS shadows cancelled, 0 closed
Tue Apr 12 00:17:52 2011
LMS 2: 0 GCS shadows cancelled, 0 closed
Tue Apr 12 00:17:52 2011
LMS 1: 1 GCS shadows cancelled, 0 closed
SET master node info
Submitted ALL remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
ALL grantable enqueues GRANTED
Post SMON TO START 1st pass IR
Tue Apr 12 00:17:52 2011
LMS 0: 19640 GCS shadows traversed, 0 replayed
Tue Apr 12 00:17:52 2011
LMS 3: 20357 GCS shadows traversed, 0 replayed
Tue Apr 12 00:17:52 2011
LMS 1: 20465 GCS shadows traversed, 0 replayed
Tue Apr 12 00:17:52 2011
LMS 2: 20361 GCS shadows traversed, 0 replayed
Tue Apr 12 00:17:52 2011
Submitted ALL GCS remote-cache requests
Fix WRITE IN gcs resources
Reconfiguration complete
Tue Apr 12 00:17:52 2011
TRANSACTION recovery: LOCK conflict caught AND ignored
TRANSACTION recovery: LOCK conflict caught AND ignored
Tue Apr 12 00:17:52 2011
SUPLOG SMON: Attempt TO ROLLBACK DDL
Tue Apr 12 00:17:52 2011
SUPLOG: Waiting TO GET supplemental DDL enqueue
Tue Apr 12 00:17:52 2011
SUPLOG: Commencing TO ROLLBACK failed DDL at scn = 5550732880
SUPLOG: minimal = ON, PRIMARY KEY = ON
SUPLOG: UNIQUE = ON, FOREIGN KEY = OFF, ALL COLUMN = OFF
Tue Apr 12 00:17:52 2011
SUPLOG: Failed TO ROLLBACK DDL at scn = 5550732880
SUPLOG: minimal = ON, PRIMARY KEY = ON
SUPLOG: UNIQUE = ON, FOREIGN KEY = OFF, ALL COLUMN = OFF
Tue Apr 12 00:17:52 2011
Errors IN file /opt/app/oracle/admin/orcl/bdump/orcl1_smon_4516.trc:
ORA-00600: internal error code, arguments: [2801], [], [], [], [], [], [], []
ORA-00601: cleanup LOCK conflict
Tue Apr 12 00:17:53 2011
Trace dumping IS performing id=[cdmp_20110412001753]
Tue Apr 12 00:17:53 2011
Non-fatal internal error happenned while SMON was doing failed supplemental log DDL cleanup.
SMON encountered 1 OUT OF maximum 100 non-fatal internal errors.
SMON: disabling tx recovery
Tue Apr 12 00:17:53 2011
Instance recovery: looking FOR dead threads
Instance recovery: LOCK DOMAIN invalid but no dead threads
SMON: disabling cache recovery
Tue Apr 12 00:18:02 2011
Shutting down archive processes
Archiving IS disabled
Tue Apr 12 00:18:07 2011
ARCH shutting down
ARC1: Archival stopped
Tue Apr 12 00:18:12 2011
ARCH shutting down
ARC0: Archival stopped
Tue Apr 12 00:18:13 2011
Thread 1 closed at log SEQUENCE 3109
Successful close OF redo thread 1
Tue Apr 12 00:18:14 2011
Completed: ALTER DATABASE CLOSE NORMAL
Tue Apr 12 00:18:14 2011
ALTER DATABASE DISMOUNT
Tue Apr 12 00:18:15 2011
SUCCESS: diskgroup DATA was dismounted
Tue Apr 12 00:18:15 2011
Completed: ALTER DATABASE DISMOUNT
ARCH: Archival disabled due TO shutdown: 1089
Shutting down archive processes
Archiving IS disabled
Archive process shutdown avoided: 0 active
ARCH: Archival disabled due TO shutdown: 1089
Shutting down archive processes
Archiving IS disabled
Archive process shutdown avoided: 0 active
Tue Apr 12 00:18:22 2011
freeing rdom 0
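The alert log shows how this ORA-600 error came about from start to finish. It began at 22:44:24 with an ALTER DATABASE ADD SUPPLEMENTAL LOG DATA statement, which never completed. Evidently some transaction remained open the whole time, and that kept the statement from finishing.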
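The open transaction is inferred from the log rather than observed directly. As a minimal sketch of how it could have been confirmed before the shutdown, assuming a session with SELECT access to the GV$ views, a query along these lines lists the transactions that were still active on every RAC instance. The column selection is illustrative and is not taken from the original investigation.

```sql
-- Sketch: list transactions still open on any RAC instance.
-- Assumption: the session has SELECT privilege on GV$TRANSACTION and GV$SESSION.
SELECT s.inst_id,
       s.sid,
       s.serial#,
       s.username,
       s.program,
       t.start_time,   -- when the transaction began (stored as text)
       t.status
  FROM gv$transaction t
  JOIN gv$session s
    ON s.inst_id = t.inst_id
   AND s.saddr   = t.ses_addr
 ORDER BY s.inst_id, s.sid;
```

Any row whose start_time predates the ALTER DATABASE ADD SUPPLEMENTAL LOG DATA statement is a candidate for the transaction that the DDL was waiting on.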
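About an hour and a half later, at 00:17, a SHUTDOWN IMMEDIATE was issued manually. The alert log also shows that the other node had already shut down its database before this. During the shutdown, Oracle tried to roll back the ALTER DATABASE ADD SUPPLEMENTAL LOG DATA DDL statement and failed. Oracle then hit an error while trying to clean up the lock information (ORA-00601: cleanup lock conflict), and that ultimately produced the ORA-600 [2801] error.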
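This error was clearly triggered by a specific set of circumstances and does no harm to the database itself: the log reports it as a non-fatal internal error, and the CLOSE NORMAL and DISMOUNT both completed afterwards. It is recorded here as a case study of how the problem arose.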