本机数据库数据库链无法访问远端数据库

在笔记本的本地数据库中创建数据库链,访问远端数据库,出现ORA-12170访问超时错误。

由于访问超时,首先检查网络,确认远端服务器是可以访问的:

C:\Users\lenovo>ping 192.168.0.56
正在 Ping 192.168.0.56 具有 32 字节的数据:
来自 192.168.0.56 的回复: 字节=32 时间<1ms TTL=64
来自 192.168.0.56 的回复: 字节=32 时间<1ms TTL=64
来自 192.168.0.56 的回复: 字节=32 时间<1ms TTL=64
来自 192.168.0.56 的回复: 字节=32 时间<1ms TTL=64
192.168.0.56 的 Ping 统计信息:
    数据包: 已发送 = 4,已接收 = 4,丢失 = 0 (0% 丢失),
往返行程的估计时间(以毫秒为单位):
    最短 = 0ms,最长 = 0ms,平均 = 0ms
显然远端节点是可以访问的,尝试通过tnsping进行测试:
C:\Users\lenovo>tnsping 192.168.0.56/enmo10
TNS Ping Utility for 32-bit Windows: Version 10.2.0.5.0 - Production on 24-10月 -
2012 15:39:59
Copyright (c) 1997,  2010, Oracle.  All rights reserved.
已使用的参数文件:
d:\oracle\product\10.2.0\network\admin\sqlnet.ora
已使用 HOSTNAME 适配器来解析别名
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=enmo10))(ADDRESS=
(PROTOCOL=TCP)(HOST=192.168.0.56)(PORT=1521)))
OK (20 毫秒)

同样tnsping连接测试正常。登录目标数据库服务器,检查监听状态同样正常。
这说明远端数据库是可以访问的,看来问题多半出在本地设置上。通过本地SQLPLUS客户端直接连接远端数据库,同样无法连接:

SQL> conn test/test@192.168.0.56/enmo10
ERROR:
ORA-12170: TNS: 连接超时
警告: 您不再连接到 ORACLE。

看来可能是本地防火墙导致了网络连接被阻止。找到防火墙设置界面,将sqlplus工具添加到允许程序访问列表中,通过sqlplus连接远端数据库成功,但是通过dblink访问问题依旧:

SQL> conn test/test@192.168.0.56/enmo10
已连接。
SQL> conn test/test
已连接。
SQL> create database link enmo10 connect to test identified by test using '192.168.0.56/enmo10';
数据库链接已创建。
SQL> select * from tab@enmo10;
select * from tab@enmo10
                  *
第 1 行出现错误:
ORA-12170: TNS: 连接超时

表面上看似乎是一个很诡异的问题,其实如果了解数据库链的工作原理就很清楚,在通过数据库链访问远端数据库时,访问远端站点的程序并不是sqlplus而是本地的oracle.exe程序。
将oracle添加到防火墙的允许访问列表中,问题解决:

SQL> select * from tab@enmo10;
TNAME                          TABTYPE  CLUSTERID
------------------------------ ------- ----------
T_CHAR                         TABLE
T                              TABLE
Posted in BUG | Tagged , , , , , | Leave a comment

ORA-600(kssadd: null parent)错误

Oracle 11.1.0.6数据库在归档到远端实例时出现这个错误。

详细错误信息为:

Wed DEC 26 12:39:20 2012
Error 1017 received logging ON TO the standby
------------------------------------------------------------
CHECK that the PRIMARY AND standby are USING a password file
AND remote_login_passwordfile IS SET TO SHARED OR EXCLUSIVE, 
AND that the SYS password IS same IN the password files.
      returning error ORA-16191
------------------------------------------------------------
Errors IN file /oracle/app/diag/rdbms/orcl/orcl1/trace/orcl1_arc2_533302.trc:
ORA-16191: 主日志传送客户机没有登录到备用数据库
PING[ARC2]: Heartbeat failed TO CONNECT TO standby 'orcl2'. Error IS 16191.
Wed DEC 26 12:43:05 2012
Errors IN file /oracle/app/diag/rdbms/orcl/orcl1/trace/orcl1_ora_455450.trc  (incident=852209):
ORA-00600: 内部错误代码, 参数: [kssadd:  NULL parent], [], [], [], [], [], [], []
Incident details IN: /oracle/app/diag/rdbms/orcl/orcl1/incident/incdir_852209/orcl1_ora_455450_i852209.trc
Wed DEC 26 12:43:07 2012
Trace dumping IS performing id=[cdmp_20121226124307]
Wed DEC 26 12:43:08 2012
Sweep Incident[852209]: completed
Wed DEC 26 12:43:12 2012
Errors IN file /oracle/app/diag/rdbms/orcl/orcl1/trace/orcl1_ora_1827082.trc  (incident=852210):
ORA-00600: 内部错误代码, 参数: [kssadd:  NULL parent], [], [], [], [], [], [], []
Incident details IN: /oracle/app/diag/rdbms/orcl/orcl1/incident/incdir_852210/orcl1_ora_1827082_i852210.trc
Trace dumping IS performing id=[cdmp_20121226124314]
Wed DEC 26 12:43:22 2012
Errors IN file /oracle/app/diag/rdbms/orcl/orcl1/trace/orcl1_ora_1426102.trc  (incident=852211):
ORA-00600: 内部错误代码, 参数: [kssadd:  NULL parent], [], [], [], [], [], [], []
Incident details IN: /oracle/app/diag/rdbms/orcl/orcl1/incident/incdir_852211/orcl1_ora_1426102_i852211.trc
Wed DEC 26 12:43:25 2012
Trace dumping IS performing id=[cdmp_20121226124325]
Wed DEC 26 12:44:09 2012
Sweep Incident[852211]: completed
Sweep Incident[852210]: completed

很遗憾,没有拿到客户的相关TRACE文件。因此无法确认出现ORA-600错误时,具体引发错误的语句以及堆栈信息。根据MOS的查询结果,这个kssadd: null parent的相关错误,只有一个和JOB相关的,虽然确认影响版本为11.1.0.7,很符合当前问题的版本,但是报错并未发生在JOB进程或CJQ进程上。
而从这个报错信息的发生来看,这个错误的出现与ORA-16191错误直接相关。这个600错误的出现都是在ORA-16191大量产生后出现的,而随着ORA-16191错误的被最终解决,该错误也没有再次出现。基本上可以确定,ORA-16191是这个ORA-600[kssadd: null parent]错误产生的一个诱因。
最后简单说明一下ORA-16191错误,导致这个问题的原因是主库修改了SYS的密码,导致主库归档到本库时密码不正确。在11g的情况下,如果开启了SEC_CASE_SENSITIVE_LOGON参数,那么有时即使复制了密码文件到目标数据库,报错依旧。将SEC_CASE_SENSITIVE_LOGON置为false,可以避免这个错误。

Posted in BUG | Tagged , , , , , | Leave a comment

ORA-600(kkoljt1)错误

又是一个ANSI标准SQL导致的BUG。
数据库版本10.2.0.1,错误信息为:

Wed Nov 09 19:15:27 2011
Errors IN file c:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_784.trc:
ORA-00600: 内部错误代码, 参数: [kkoljt1], [], [], [], [], [], [], []

根据文档Bug 5864217 – OERI [kkoljt1] from ANSI query with join elimination [ID 5864217.8],在10.2.0.3以前版本中标准SQL如果出现了连接消除会导致ORA-600[kkoljt1]的错误。
Oracle在11.1.0.6和10.2.0.4中解决了这个问题。除了升级外,避免标准SQL的写法是最好的解决问题的方法。

Posted in BUG | Tagged , , , | Leave a comment

ORA-7445(_kkdlavc)错误

客户10.2.0.1 for Windows出现ORA-7445错误。
没有打过补丁的数据库出现问题的几率果然很大:

Thu Mar 29 15:11:50 2012
Errors IN file c:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_880.trc:
ORA-07445: 出现异常错误: 核心转储 [ACCESS_VIOLATION] [_kkdlavc+233] [PC:0x7F6661] [ADDR:0x0] [UNABLE_TO_READ] []

这个问题是由于访问了一个不正常的视图所致,解决方法就是重建视图。关于这个bug的描述可以参考文档ORA-07445 [kkdlavc()+284] When Performing Query Against a View。

Posted in BUG | Tagged , , , , | Leave a comment

ORA-7445(_kkqtnloCbk)错误

客户10.2.0.1 for Windows 32数据库出现ORA-7445[_kkqtnloCbk]错误。
告警日志中错误信息为:

Tue Nov 08 14:56:39 2011
Errors IN file c:\oracle\product\10.2.0\admin\orcl\udump\orcl_ora_4720.trc:
ORA-07445: 出现异常错误: 核心转储 [ACCESS_VIOLATION] [_kkqtnloCbk+124] [PC:0x1DB818C] [ADDR:0xF5] [UNABLE_TO_READ] []

导致这个错误的原因是Oracle处理ANSI语法时出现异常,参考文档Bug 4204383 Dump [kkqtnlocbk] optimizing ANSI OUTER JOINs with subqueries。
这个错误影响10.2.0.3以下版本,Oracle在10.2.0.4和11.1.0.6中对这个问题进行了修正。
事实证明,Oracle对于ANSI的标准SQL写法的支持更容易存在问题,如果没有明确的跨数据库需求,建议使用Oracle自己的SQL语法,尤其是外联接的语法,更是容易出现问题。

Posted in BUG | Tagged , , , , , , | Leave a comment

TNSNAMES配置中空格导致的bug(二)

之前一直认为这是Oracle的一个bug,直到前一段时间通读网络服务参考手册时,才发现问题的真相。
TNSNAMES配置中空格导致的bug:http://yangtingkun.itpub.net/post/468/487506
当时测试发现,只要在SERVICE_NAME下面的一行缺少空格,就会导致错误的产生:

TEST08 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 172.25.13.229)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = test08)
)
  )

替换下划线的版本为:

TEST08 =
__(DESCRIPTION =
____(ADDRESS = (PROTOCOL = TCP)(HOST = 172.25.13.229)(PORT = 1521))
____(CONNECT_DATA =
______(SERVER = DEDICATED)
______(SERVICE_NAME = test08)
)
__)

后来发现,却是这种情况会引发TNSNAMES.ORA解析错误,但是如果这里添加了空格,而在其他位置删除空格,会引发同样的问题。不过由于一直把这个现象当做了BUG,没有再进行深入的分析。
问题的答案在看文档的时候被揭晓。原来Oracle的TNSNAMES.ORA中对格式存在要求,换行的下一行如果和上一行存在父子关系,则不能顶头,至少要保留一个空格,来提示Oracle这不是一个新的配置,而是上面配置的继续。
根据文档的描述,对于子配置项,如果要单独一行,那么至少要在换行后保留一个空格,否则就应该与主配置同属一行。
本以为发现的是Oracle的一个bug,没想到却是文档中明确说明的特性,这就是文档没有看全就自以为是乱下判断的结果,引以为戒。

Posted in ORACLE | Tagged , , , | Leave a comment

收集STATSPACK的级别

虽然AWR功能已经完全可以替代STATSPACK的功能,但是目前存在的9i的数据库还有不少,而且却是发现有不少人不了解如何更改STATSPACK的级别。
STATSPACK默认是LEVEL 5收集,而LEVEL 5收集的数据是不包括执行计划的,因此对于关注SQL性能的情况,至少需要调整到LEVEL 6,下面是检查和调整STATSPACK收集等级的方法,比较简单,就不多解释了:

SQL> EXEC statspack.snap
PL/SQL PROCEDURE successfully completed.
SQL> SELECT snap_id, dbid, snap_level FROM stats$snapshot;
   SNAP_ID       DBID SNAP_LEVEL
---------- ---------- ----------
         1 1258229964          5
1 ROW selected.
SQL> EXEC statspack.snap(i_snap_level => 6)
PL/SQL PROCEDURE successfully completed.
SQL> SELECT snap_id, dbid, snap_level FROM stats$snapshot;
   SNAP_ID       DBID SNAP_LEVEL
---------- ---------- ----------
         1 1258229964          5
         2 1258229964          6
2 ROWS selected.
SQL> EXEC statspack.snap
PL/SQL PROCEDURE successfully completed.
SQL> SELECT snap_id, dbid, snap_level FROM stats$snapshot;
   SNAP_ID       DBID SNAP_LEVEL
---------- ---------- ----------
         1 1258229964          5
         2 1258229964          6
         3 1258229964          5
3 ROWS selected.
SQL> EXEC statspack.snap(i_snap_level => 6, i_modify_parameter => 'true')
PL/SQL PROCEDURE successfully completed.
SQL> SELECT snap_id, dbid, snap_level FROM stats$snapshot;
   SNAP_ID       DBID SNAP_LEVEL
---------- ---------- ----------
         1 1258229964          5
         2 1258229964          6
         3 1258229964          5
         4 1258229964          6
4 ROWS selected.
SQL> EXEC statspack.snap
PL/SQL PROCEDURE successfully completed.
SQL> SELECT snap_id, dbid, snap_level FROM stats$snapshot;
   SNAP_ID       DBID SNAP_LEVEL
---------- ---------- ----------
         1 1258229964          5
         2 1258229964          6
         3 1258229964          5
         4 1258229964          6
         5 1258229964          6
5 ROWS selected.
SQL> EXEC statspack.modify_statspack_parameter(i_snap_level => 10)
PL/SQL PROCEDURE successfully completed.
SQL> EXEC statspack.snap
PL/SQL PROCEDURE successfully completed.
SQL> SELECT snap_id, dbid, snap_level FROM stats$snapshot;
   SNAP_ID       DBID SNAP_LEVEL
---------- ---------- ----------
         1 1258229964          5
         2 1258229964          6
         3 1258229964          5
         4 1258229964          6
         5 1258229964          6
         6 1258229964         10
6 ROWS selected.
Posted in ORACLE | Tagged , , , , | Leave a comment

ORA-7445(lnxadd)错误

一个由于低版本JDBC驱动导致的错误。
告警日志中错误信息为:

Fri Aug 12 19:00:03 2011
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl2_ora_28464.trc:
ORA-07445: exception encountered: core dump [lnxadd()+471] [SIGSEGV] [Address NOT mapped TO object] [0x000000000] [] []
Fri Aug 12 19:00:04 2011
Trace dumping IS performing id=[cdmp_20110812190004]

根据MOS文档setDouble On INSERT Corrupts Data In NUMBER Column Using Oracle10g JDBC Driver [ID 313373.1],导致问题的原因是使用了低版本的JDBC驱动插入NUMBER类型的数值。
当使用JDBC驱动版本为10.1.0.2到10.1.0.4时,使用setDouble设置绑定变量的值时会引发这个错误。解决方法时使用10.1.0.5以上版本的JDBC驱动,或者在SQL语句中使用TO_NUMBER函数,而绑定变量使用setString通过字符串进行赋值。

Posted in BUG | Tagged , , , , , , | Leave a comment

ORA-600(6033)错误

客户10.2.0.4 RAC for Linux X64环境,告警日志出现ORA-600(6033)错误。
详细错误信息为:

/opt/app/oracle/admin/orcl/udump/orcl1_ora_25075.trc
Oracle DATABASE 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
WITH the Partitioning, REAL Application Clusters, OLAP, DATA Mining
AND REAL Application Testing options
ORACLE_HOME = /opt/app/oracle/product/10.2.0/db_1
System name:	Linux
Node name:	hb1
Release:	2.6.9-78.ELlargesmp
Version:	#1 SMP Wed Jul 9 16:03:59 EDT 2008
Machine:	x86_64
Instance name: orcl1
Redo thread mounted BY this instance: 1
Oracle process NUMBER: 157
Unix process pid: 25075, image: oracle@hb1
*** ACTION NAME:() 2012-06-01 10:06:21.070
*** MODULE NAME:(JDBC Thin Client) 2012-06-01 10:06:21.070
*** SERVICE NAME:(orcl_taf) 2012-06-01 10:06:21.070
*** SESSION ID:(4475.1832) 2012-06-01 10:06:21.070
*** 2012-06-01 10:06:21.070
ksedmp: internal OR fatal error
ORA-00600: internal error code, arguments: [6033], [], [], [], [], [], [], []
CURRENT SQL statement FOR this SESSION:
SELECT * FROM access_log WHERE source_db='erpDataSource' AND target_db='cmsDataSource' ORDER BY id ASC 
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedst()+31          CALL     ksedst1()            000000000 ? 000000001 ?
                                                   7FBFFF8480 ? 7FBFFF84E0 ?
                                                   7FBFFF8420 ? 000000000 ?
ksedmp()+610         CALL     ksedst()             000000000 ? 000000001 ?
                                                   7FBFFF8480 ? 7FBFFF84E0 ?
                                                   7FBFFF8420 ? 000000000 ?
ksfdmp()+21          CALL     ksedmp()             000000003 ? 000000001 ?
                                                   7FBFFF8480 ? 7FBFFF84E0 ?
                                                   7FBFFF8420 ? 000000000 ?
kgeriv()+176         CALL     ksfdmp()             000000003 ? 000000001 ?
                                                   7FBFFF8480 ? 7FBFFF84E0 ?
                                                   7FBFFF8420 ? 000000000 ?
kgeasi()+263         CALL     kgeriv()             0067B4700 ? 2A97200040 ?
                                                   000000000 ? 7FBFFF3788 ?
                                                   7FBFFF8420 ? 000000000 ?
kdifxs1()+4656       CALL     kgeasi()             0067B4700 ? 2A97200040 ?
                                                   000001791 ? 000000002 ?
                                                   000000000 ? 0FFFFFFFF ?
kdifxs()+17          CALL     kdifxs1()            1E221D00C ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   7F00000000 ? 000000000 ?
qerixtFetch()+242    CALL     kdifxs()             1E221D00C ? 000000001 ?
                                                   000000000 ? 000000000 ?
                                                   7F00000000 ? 000000000 ?
qertbFetchByRowID()  CALL     qerixtFetch()        22B1C9480 ? 000000000 ?
+493                                               7FBFFFADF0 ? 000000001 ?
                                                   000000001 ? 000000000 ?
opifch2()+3189       CALL     qertbFetchByRowID()  22B1C9138 ? 0030E04AA ?
                                                   7FBFFFB378 ? 00000000A ?
                                                   000000001 ? 000000000 ?
opifch()+64          CALL     opifch2()            000000089 ? 000000005 ?
                                                   7FBFFFB550 ? 2A97251398 ?
                                                   000000001 ? 000000000 ?
opiodr()+984         CALL     opifch()             000000089 ? 000000005 ?
                                                   7FBFFFB550 ? 00000000A ?
                                                   000000000 ? 001330002 ?
ttcpip()+1012        CALL     opiodr()             000000005 ? 000000002 ?
                                                   7FBFFFE2B0 ? 000000009 ?
                                                   0059DE968 ? 001330002 ?
opitsk()+1322        CALL     ttcpip()             0067BC3D0 ? 0052BC900 ?
                                                   7FBFFFE2B0 ? 000000000 ?
                                                   7FBFFFDDA8 ? 7FBFFFE418 ?
opiino()+1026        CALL     opitsk()             000000003 ? 000000000 ?
                                                   7FBFFFE2B0 ? 000000001 ?
                                                   000000000 ? 683051B00000001 ?
opiodr()+984         CALL     opiino()             00000003C ? 000000004 ?
                                                   7FBFFFF478 ? 000000000 ?
                                                   000000000 ? 683051B00000001 ?
opidrv()+547         CALL     opiodr()             00000003C ? 000000004 ?
                                                   7FBFFFF478 ? 000000000 ?
                                                   0059DF200 ? 683051B00000001 ?
sou2o()+114          CALL     opidrv()             00000003C ? 000000004 ?
                                                   7FBFFFF478 ? 000000000 ?
                                                   0059DF200 ? 683051B00000001 ?
opimai_real()+163    CALL     sou2o()              7FBFFFF450 ? 00000003C ?
                                                   000000004 ? 7FBFFFF478 ?
                                                   0059DF200 ? 683051B00000001 ?
main()+116           CALL     opimai_real()        000000002 ? 7FBFFFF4E0 ?
                                                   000000004 ? 7FBFFFF478 ?
                                                   0059DF200 ? 683051B00000001 ?
__libc_start_main()  CALL     main()               000000002 ? 7FBFFFF4E0 ?
+219                                               000000004 ? 7FBFFFF478 ?
                                                   0059DF200 ? 683051B00000001 ?
_start()+42          CALL     __libc_start_main()  0007139F8 ? 000000002 ?
                                                   7FBFFFF628 ? 0052B4BD0 ?
                                                   000000000 ? 000000002 ?
--------------------- Binary Stack Dump ---------------------

根据MOS文档ORA-600 [6033] “null value retrieved from index leaf lookup” [ID 45795.1],导致问题的原因在于从索引获取页节点的键值时找到了空值。这个问题是Oracle的bug所导致的,但是根据已有的bug描述,找不到10.2上对应的已知bug。
对于这个问题,最简单的方法莫过于将索引删除重建,或直接采用ONLINE REBUILD的方式进行重建。

Posted in BUG | Tagged , , , | Leave a comment

ORA-600(2801)错误

10.2.0.4 RAC环境关闭数据库时出现ORA-600[2801]错误。
详细错误信息为:

Mon Apr 11 22:44:24 2011
ALTER DATABASE ADD supplemental log DATA
Tue Apr 12 00:11:42 2011
Thread 1 advanced TO log SEQUENCE 3108 (LGWR switch)
  CURRENT log# 1 seq# 3108 mem# 0: +DATA/orcl/onlinelog/group_1.307.727394923
  CURRENT log# 1 seq# 3108 mem# 1: +DATA/orcl/onlinelog/group_1.306.727394925
Tue Apr 12 00:16:55 2011
Thread 1 advanced TO log SEQUENCE 3109 (LGWR switch)
  CURRENT log# 2 seq# 3109 mem# 0: +DATA/orcl/onlinelog/group_2.305.727394927
  CURRENT log# 2 seq# 3109 mem# 1: +DATA/orcl/onlinelog/group_2.304.727394929
Tue Apr 12 00:17:09 2011
ALTER SYSTEM SET service_names='orcl' SCOPE=MEMORY SID='orcl1';
Tue Apr 12 00:17:16 2011
Shutting down instance: further logons disabled
Tue Apr 12 00:17:34 2011
Stopping background process QMNC
Tue Apr 12 00:17:35 2011
Stopping background process CJQ0
Tue Apr 12 00:17:36 2011
Stopping background process MMNL
Tue Apr 12 00:17:37 2011
Stopping background process MMON
Tue Apr 12 00:17:39 2011
Shutting down instance (immediate)
License high water mark = 2047
Tue Apr 12 00:17:39 2011
Stopping Job queue slave processes, flags = 7
Tue Apr 12 00:17:39 2011
SUPLOG: Supplemental log DDL failed at scn = 5550732597
SUPLOG:  minimal = ON, PRIMARY KEY = ON
SUPLOG:  UNIQUE = ON, FOREIGN KEY = OFF, ALL COLUMN = OFF
ORA-1089 signalled during: ALTER DATABASE ADD supplemental log DATA...
Tue Apr 12 00:17:39 2011
SUPLOG STATE OBJECT CLEANUP: Failed DDL needs ROLLBACK
Tue Apr 12 00:17:39 2011
Process OS id : 4151 alive after KILL
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_23074.trc
Tue Apr 12 00:17:39 2011
Job queue slave processes stopped
ALL dispatchers AND shared servers shutdown
Tue Apr 12 00:17:42 2011
ALTER DATABASE CLOSE NORMAL
Tue Apr 12 00:17:52 2011
Reconfiguration started (OLD inc 8, NEW inc 10)
List OF nodes:
 0
 Global Resource Directory frozen
 * dead instance detected - DOMAIN 0 invalid = TRUE 
 Communication channels reestablished
 Master broadcasted resource hash VALUE bitmaps
 Non-LOCAL Process blocks cleaned OUT
Tue Apr 12 00:17:52 2011
 LMS 0: 0 GCS shadows cancelled, 0 closed
Tue Apr 12 00:17:52 2011
 LMS 3: 0 GCS shadows cancelled, 0 closed
Tue Apr 12 00:17:52 2011
 LMS 2: 0 GCS shadows cancelled, 0 closed
Tue Apr 12 00:17:52 2011
 LMS 1: 1 GCS shadows cancelled, 0 closed
 SET master node info 
 Submitted ALL remote-enqueue requests
 Dwn-cvts replayed, VALBLKs dubious
 ALL grantable enqueues GRANTED
 Post SMON TO START 1st pass IR
Tue Apr 12 00:17:52 2011
 LMS 0: 19640 GCS shadows traversed, 0 replayed
Tue Apr 12 00:17:52 2011
 LMS 3: 20357 GCS shadows traversed, 0 replayed
Tue Apr 12 00:17:52 2011
 LMS 1: 20465 GCS shadows traversed, 0 replayed
Tue Apr 12 00:17:52 2011
 LMS 2: 20361 GCS shadows traversed, 0 replayed
Tue Apr 12 00:17:52 2011
 Submitted ALL GCS remote-cache requests
 Fix WRITE IN gcs resources
Reconfiguration complete
Tue Apr 12 00:17:52 2011
TRANSACTION recovery: LOCK conflict caught AND ignored
TRANSACTION recovery: LOCK conflict caught AND ignored
Tue Apr 12 00:17:52 2011
SUPLOG SMON: Attempt TO ROLLBACK DDL
Tue Apr 12 00:17:52 2011
SUPLOG: Waiting TO GET supplemental DDL enqueue
Tue Apr 12 00:17:52 2011
SUPLOG: Commencing TO ROLLBACK failed DDL at scn = 5550732880
SUPLOG:  minimal = ON, PRIMARY KEY = ON
SUPLOG:  UNIQUE = ON, FOREIGN KEY = OFF, ALL COLUMN = OFF
Tue Apr 12 00:17:52 2011
SUPLOG: Failed TO ROLLBACK DDL at scn = 5550732880
SUPLOG:  minimal = ON, PRIMARY KEY = ON
SUPLOG:  UNIQUE = ON, FOREIGN KEY = OFF, ALL COLUMN = OFF
Tue Apr 12 00:17:52 2011
Errors IN file /opt/app/oracle/admin/orcl/bdump/orcl1_smon_4516.trc:
ORA-00600: internal error code, arguments: [2801], [], [], [], [], [], [], []
ORA-00601: cleanup LOCK conflict
Tue Apr 12 00:17:53 2011
Trace dumping IS performing id=[cdmp_20110412001753]
Tue Apr 12 00:17:53 2011
Non-fatal internal error happenned while SMON was doing failed supplemental log DDL cleanup.
SMON encountered 1 OUT OF maximum 100 non-fatal internal errors.
SMON: disabling tx recovery
Tue Apr 12 00:17:53 2011
Instance recovery: looking FOR dead threads
Instance recovery: LOCK DOMAIN invalid but no dead threads
SMON: disabling cache recovery
Tue Apr 12 00:18:02 2011
Shutting down archive processes
Archiving IS disabled
Tue Apr 12 00:18:07 2011
ARCH shutting down
ARC1: Archival stopped
Tue Apr 12 00:18:12 2011
ARCH shutting down
ARC0: Archival stopped
Tue Apr 12 00:18:13 2011
Thread 1 closed at log SEQUENCE 3109
Successful close OF redo thread 1
Tue Apr 12 00:18:14 2011
Completed: ALTER DATABASE CLOSE NORMAL
Tue Apr 12 00:18:14 2011
ALTER DATABASE DISMOUNT
Tue Apr 12 00:18:15 2011
SUCCESS: diskgroup DATA was dismounted
Tue Apr 12 00:18:15 2011
Completed: ALTER DATABASE DISMOUNT
ARCH: Archival disabled due TO shutdown: 1089
Shutting down archive processes
Archiving IS disabled
Archive process shutdown avoided: 0 active
ARCH: Archival disabled due TO shutdown: 1089
Shutting down archive processes
Archiving IS disabled
Archive process shutdown avoided: 0 active
Tue Apr 12 00:18:22 2011
freeing rdom 0

从告警日志中记录的内容可以看出这个ORA-600错误产生的始末。最开始执行了一个ADD SUPPLEMENTAL LOG DATA的操作,而这个操作一直都没有完成。显然,存在事务一直没有结束,导致了这个操作无法完成。
大约一个半小时之后,手工执行了数据库的SHUTDOWN IMMEDIATE操作。从告警日志可以看到,另一个节点在之前也执行了数据库的关闭操作。而在数据库的关闭过程中,Oracle尝试回归ALTER DATABASE ADD SUPPLEMENTAL LOG DATA的DDL语句且失败。而Oracle在尝试清除锁信息的时候出现了错误并最终导致了ORA-600[2801]错误。
显然这个错误是特定环境引发的,且对于数据库本身而言并无危害,这里做为一个案例记录一下问题产生的始末。

Posted in BUG | Tagged , , , , , | Leave a comment