20130723 上海OOW第二日

今天在OOW的ACOUG专场分享了Think Different的主题。
今天一天在会场非常的充实。到了会场后先去听了今天的KeyNotes,其中Steve Miranda的云计算的主题以及Andy Mendelsohn的12c数据库对云计算的支持,都是本次大会的重点话题。Oracle 12c中的C指的就是CLOUD,而12c最大的亮点PLUGABLE DB就是Oracle在DB层对云架构的支持。
如果说去年OOW大会只是给出了12c PLUGABLE DATABASE的概念,那么今年OOW上Oracle已经想好要如何用PDB来支持云服务了,甚至我怀疑Oracle来云服务的收费细则都已经定义好了,青铜、白银和黄金三档,每档对应的架构和服务都已经定义清楚,而在各个档之间进行升级也同样进行说明。看来Oracle已经准备利用12c来打造数据库服务云了。只不过以Oracle以往收费的标准,不知道一些不想自己搭建环境、维护数据库的中小企业,能否接受正版云服务的价格。
中午OOW的午饭一如既往的不给力,这种面包、三明治的配置更适合老外,估计大部分国内参会者,这三天午饭会比较郁闷。
下午全程参加了展览大厅的ACOUG活动,首先是Eygle两个优化的主题,然后是Kamus三个关于升级的主题,最后是我的两个主题,一个是ODA,另一个是Think Different。可惜的是,每个主题的时间只有二十分钟,而我们任何一个主题正常时间都应该超过1小时。因此,最终Eygle只讲了一个主题就超过了两个主题的时间,而Kamus最后紧赶慢赶终于完成了三个主题,但是时间方面超了不少。根据他们两个的前车之鉴,我直接选择了现场绝大部分人都想听的Think Different,而舍弃了ODA的主题。其实舍弃ODA也比较可惜,在OOW的现场展示了ODA的最新版本,而我在中午的时候还把最新版的ODA的内容添加到了PPT里面。
剩下一些零碎的时间基本上都在公司的展台帮忙,除了解答问题,当背景合影外,还碰到了不少新、老朋友。

Posted in NEWS | Tagged , , | Leave a comment

20130722 上海OOW开幕

期待已久的ORACLE OPEN WORLD终于开幕了。
有了以往OOW的经验,没有一大早就跑去排队注册,而是到了下午才去签到注册地方,几分钟搞定了注册的过程。
今天一早听说马克赫德的飞机由于机械故障,导致中途返航,导致他将无法现场出席本次上海OOW大会,他的演讲将改为远程的方式。我在去年10月不但现场听过他的演讲,而且现场看过Larry的演讲,因此倒也没有什么遗憾,而对于大部分技术人员而言,在现场的时候也会盯着大屏幕,是不是现场真人演讲也没有本质的区别。估计影响最大的应该是媒体的朋友,缺少现场采访对象,可能会使本次的OOW之行大打折扣。
下午并没有去听KeyNotes,而是跑到了公司展台帮忙。公司的几个同事基本已经把展品布置好了,我们为大家准备了一些礼物,尤其是一张Oracle 12c的架构图,经过我们多次设计修改,是公司很多人集体智慧的结果,希望可以对技术爱好者有所帮助。

Posted in NEWS | Tagged , , | Leave a comment

20130721 OOW媒体发布会

今天抵达了上海,开始开启OOW模式。
这次是第三次参加OOW,和以往不同,本次公司作为参展商参加了OOW,在一层的展览大厅有专门的展台,而且本次公司的6个ACE都到达现场,准备一起参加这次Oracle的盛会。
今天下午原本想去会场先进行注册,但是听说会场还没有开门。本来今天应该没有相关的活动,结果Eygle作为微博特使有一个媒体发布会,被Eygle拉了过去,顺便把Kamus也一起叫上。
这个媒体发布会主要介绍了本次OOW的重点方向,包括软硬件集成的系统、云计算、客户体验和JAVA。另外,本次会议的规模在国内也是空前的,参会人员达到了18000人,而国内其他IT上千人的会议相比,绝大算得上是巨无霸了。

Posted in NEWS | Tagged , , | Leave a comment

密码延迟验证导致的系统HANG住

又是一个11g新特性导致的问题。
这个新特性很早之前就研究过,也在其他客户处碰到过类似的问题。从11g开始,如果一个用户使用不正确的密码尝试登录数据库,那么随着登录失败次数的增加,每次登录验证前延迟等待的时间也会增加:

SQL> SET TIME ON
18:30:54 SQL> 
18:30:58 SQL> conn test/test
Connected.
18:31:25 SQL> 
18:31:25 SQL> conn test/a
conn test/a
conn test/a
conn test/a
conn test/a
conn test/a
conn test/a
conn test/test
conn test/a
ERROR:
ORA-01017: invalid username/password; logon denied
Warning: You are no longer connected TO ORACLE.
18:31:26 SQL> ERROR:
ORA-01017: invalid username/password; logon denied
18:31:26 SQL> ERROR:
ORA-01017: invalid username/password; logon denied
18:31:26 SQL> ERROR:
ORA-01017: invalid username/password; logon denied
18:31:27 SQL> ERROR:
ORA-01017: invalid username/password; logon denied
18:31:29 SQL> ERROR:
ORA-01017: invalid username/password; logon denied
18:31:32 SQL> ERROR:
ORA-01017: invalid username/password; logon denied
18:31:36 SQL> Connected.
18:31:36 SQL> ERROR:
ORA-01017: invalid username/password; logon denied
Warning: You are no longer connected TO ORACLE.
18:31:36 SQL>

可以看到,从第三次密码错误的登录开始,每次延迟时间开始变成2秒、3秒并一次递增。既是这时提供正确的密码登录,会话也会延迟N秒,然后进行验证。不过一旦验证成功,会将失败计数清零,后续的错误登录会重新计数。
不过这只是单一会话尝试失败登录的情况,如果同时存在两个会话,则很快延迟验证时间就会达到10秒、20秒的级别。如果同时大量的连接采用错误的密码,基本上这个用户的登录就会被完全HANG住。
客户的数据库就出现了类似的情况,数据库版本为11.2.0.3 RAC,在数据库中观察,三个节点每个节点的会话数都接近SESSIONS参数设置的上线3000,而后台高级日志已经出现了ORA-20错误。由于客户系统的关键用户只有一个,因此几乎所有的会话都无法正常的登录到数据库中。而在数据库上发现,大量的会话用户名、EVENT以及PROGRAM都信息都是NULL,这说明这些会话还没有完成验证成功的登录到数据库中。而当前主机的CPU资源使用并不高,那些已经连接到数据库中的进程也可以正常的工作。尝试使用SYSTEM等其他用户发现可以迅速的登录数据库。所有这一切都已经说明,当前有一个或多个中间件服务器在使用错误的密码连接数据库,由于密码延迟验证的策略,导致所有后续的连接都被HANG住。
任何一个新特性带来性能或功能上的提高的同时,也会引入相关的bug,显然这个安全性上的考虑,有时候也会带来验证的性能问题,甚至成为用来攻击数据库的一种手段。
之前几次并没有给出彻底屏蔽密码延迟验证的手段,而Oracle最强大之处就在于几乎所有的功能和特性都有对应的开关,通过设置EVENTS 28401可以屏蔽密码延迟验证:

SQL> ALTER SYSTEM SET EVENT =28401 TRACE NAME CONTEXT FOREVER, LEVEL 1’ SCOPE = SPFILE;

设置该事件后重启数据库即可。

Posted in ORACLE | Tagged , , , , | Leave a comment

IP地址被清空导致实例重启

客户10.2.0.4 RAC for Solaris 10环境突然出现了实例重启的现象。
数据库正常运行到下午3点左右,随后两个节点分别重启,其中一个节点上的实例无法自动启动。检查两个实例的告警日志发现,在节点重启前,两个节点都出现了明显的ORA-27504错误:

Wed Apr 10 15:00:05 2013
Errors IN file /oracle/admin/orcl/udump/orcl1_ora_10997.trc:
ORA-00603: ORACLE server SESSION TERMINATED BY fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:if_not_found failed WITH STATUS: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvaddr9
ORA-27303: additional information: requested interface 192.168.168.3 NOT found. CHECK output FROM ifconfig command
Wed Apr 10 15:00:06 2013
Errors IN file /oracle/admin/orcl/udump/orcl1_ora_11007.trc:
ORA-00603: ORACLE server SESSION TERMINATED BY fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:if_not_found failed WITH STATUS: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvaddr9
ORA-27303: additional information: requested interface 192.168.168.3 NOT found. CHECK output FROM ifconfig command
Wed Apr 10 15:00:06 2013
Errors IN file /oracle/admin/orcl/udump/orcl1_ora_11009.trc:
ORA-00603: ORACLE server SESSION TERMINATED BY fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:if_not_found failed WITH STATUS: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvaddr9
ORA-27303: additional information: requested interface 192.168.168.3 NOT found. CHECK output FROM ifconfig command
Wed Apr 10 15:00:06 2013
Errors IN file /oracle/admin/orcl/udump/orcl1_ora_11011.trc:
ORA-00603: ORACLE server SESSION TERMINATED BY fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:if_not_found failed WITH STATUS: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvaddr9
ORA-27303: additional information: requested interface 192.168.168.3 NOT found. CHECK output FROM ifconfig command
.
.
.
Wed Apr 10 15:07:08 2013
IPC Send timeout detected.Sender: ospid 25688
Receiver: inst 2 binc 427282 ospid 11838
Wed Apr 10 15:07:08 2013
IPC Send timeout detected.Sender: ospid 25724
Wed Apr 10 15:07:08 2013
IPC Send timeout detected.Sender: ospid 25680
Receiver: inst 2 binc 431591 ospid 11822
Receiver: inst 2 binc 431795 ospid 11874
Wed Apr 10 15:07:08 2013
IPC Send timeout detected.Sender: ospid 25684
Receiver: inst 2 binc 428985 ospid 11826
Wed Apr 10 15:07:08 2013
IPC Send timeout detected.Sender: ospid 25708
Receiver: inst 2 binc 430048 ospid 11858
Wed Apr 10 15:07:09 2013
ospid 25678: network interface WITH IP address 192.168.168.3 no longer operational
requested interface 192.168.168.3 NOT found. CHECK output FROM ifconfig command
Wed Apr 10 15:07:35 2013
IPC Send timeout TO 1.1 inc 4 FOR msg TYPE 44 FROM opid 7
Wed Apr 10 15:07:35 2013
IPC Send timeout TO 1.12 inc 4 FOR msg TYPE 44 FROM opid 21
Wed Apr 10 15:07:35 2013
IPC Send timeout TO 1.2 inc 4 FOR msg TYPE 44 FROM opid 8
Wed Apr 10 15:07:35 2013
IPC Send timeout TO 1.3 inc 4 FOR msg TYPE 44 FROM opid 10
Wed Apr 10 15:07:35 2013
IPC Send timeout TO 1.8 inc 4 FOR msg TYPE 44 FROM opid 15
Wed Apr 10 15:08:13 2013
ospid 25678: network interface WITH IP address 192.168.168.3 no longer operational
requested interface 192.168.168.3 NOT found. CHECK output FROM ifconfig command
Wed Apr 10 15:08:16 2013
IPC Send timeout detected.Sender: ospid 25748
Receiver: inst 2 binc 430164 ospid 11890
.
.
.
Wed Apr 10 15:08:53 2013
IPC Send timeout TO 1.13 inc 4 FOR msg TYPE 36 FROM opid 176
Wed Apr 10 15:08:53 2013
IPC Send timeout TO 1.15 inc 4 FOR msg TYPE 36 FROM opid 167
Wed Apr 10 15:08:57 2013
IPC Send timeout TO 1.4 inc 4 FOR msg TYPE 32 FROM opid 180
.
.
.
Wed Apr 10 15:15:51 2013
Evicting instance 2 FROM cluster
Wed Apr 10 15:16:09 2013
ospid 25678: network interface WITH IP address 192.168.168.3 no longer operational
requested interface 192.168.168.3 NOT found. CHECK output FROM ifconfig command
Wed Apr 10 15:16:40 2013
Waiting FOR instances TO leave: 
2 
Wed Apr 10 15:17:00 2013
Waiting FOR instances TO leave: 
2 
Wed Apr 10 15:17:09 2013
ospid 25678: network interface WITH IP address 192.168.168.3 no longer operational
requested interface 192.168.168.3 NOT found. CHECK output FROM ifconfig command
Wed Apr 10 15:17:20 2013
Waiting FOR instances TO leave: 
2

节点2上的错误信息与之类似:

.
.
.
Wed Apr 10 15:19:07 2013
Errors IN file /oracle/admin/orcl/udump/orcl2_ora_14065.trc:
ORA-00603: ORACLE server SESSION TERMINATED BY fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:if_not_found failed WITH STATUS: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvaddr9
ORA-27303: additional information: requested interface 192.168.168.4 NOT found. CHECK output FROM ifconfig command
Wed Apr 10 15:19:08 2013
Errors IN file /oracle/admin/orcl/udump/orcl2_ora_14057.trc:
ORA-00603: ORACLE server SESSION TERMINATED BY fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:if_not_found failed WITH STATUS: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpvaddr9
ORA-27303: additional information: requested interface 192.168.168.4 NOT found. CHECK output FROM ifconfig command
Wed Apr 10 15:19:46 2013
ospid 11820: network interface WITH IP address 192.168.168.4 no longer operational
requested interface 192.168.168.4 NOT found. CHECK output FROM ifconfig command
Wed Apr 10 15:20:46 2013
ospid 11820: network interface WITH IP address 192.168.168.4 no longer operational
requested interface 192.168.168.4 NOT found. CHECK output FROM ifconfig command
Wed Apr 10 15:20:55 2013
Errors IN file /oracle/admin/orcl/bdump/orcl2_lmon_11818.trc:
ORA-29740: evicted BY member 0, GROUP incarnation 6
Wed Apr 10 15:20:55 2013
LMON: terminating instance due TO error 29740
Wed Apr 10 15:20:55 2013
Errors IN file /oracle/admin/orcl/bdump/orcl2_smon_11924.trc:
ORA-29740: evicted BY member , GROUP incarnation 
Wed Apr 10 15:20:55 2013
Errors IN file /oracle/admin/orcl/bdump/orcl2_lmse_11886.trc:
ORA-29740: evicted BY member , GROUP incarnation 
Wed Apr 10 16:11:37 2013
Starting ORACLE instance (normal)
Wed Apr 10 16:11:45 2013
sculkget: failed TO LOCK /oracle/products/10.2/db_1/dbs/lkinstorcl2 exclusive
Wed Apr 10 16:11:45 2013
sculkget: LOCK held BY PID: 6912
Wed Apr 10 16:11:45 2013
Oracle Instance Startup operation failed. Another process may be attempting TO startup OR shutdown this Instance.
Wed Apr 10 16:11:45 2013
Failed TO acquire instance startup/shutdown serialization primitive
Wed Apr 10 16:11:50 2013
sculkget: failed TO LOCK /oracle/products/10.2/db_1/dbs/lkinstorcl2 exclusive
Wed Apr 10 16:11:50 2013
sculkget: LOCK held BY PID: 6912
Wed Apr 10 16:11:50 2013
Oracle Instance Startup operation failed. Another process may be attempting TO startup OR shutdown this Instance.
Wed Apr 10 16:11:50 2013
Failed TO acquire instance startup/shutdown serialization primitive
Wed Apr 10 16:11:54 2013
sculkget: failed TO LOCK /oracle/products/10.2/db_1/dbs/lkinstorcl2 exclusive
Wed Apr 10 16:11:54 2013
sculkget: LOCK held BY PID: 6912
Wed Apr 10 16:11:54 2013
Oracle Instance Startup operation failed. Another process may be attempting TO startup OR shutdown this Instance.
Wed Apr 10 16:11:54 2013
Failed TO acquire instance startup/shutdown serialization primitive
Wed Apr 10 16:12:29 2013
sculkget: failed TO LOCK /oracle/products/10.2/db_1/dbs/lkinstorcl2 exclusive
Wed Apr 10 16:12:29 2013
sculkget: LOCK held BY PID: 6912
Wed Apr 10 16:12:29 2013
Oracle Instance Startup operation failed. Another process may be attempting TO startup OR shutdown this Instance.
Wed Apr 10 16:12:29 2013
Failed TO acquire instance startup/shutdown serialization primitive
Wed Apr 10 16:12:47 2013
sculkget: failed TO LOCK /oracle/products/10.2/db_1/dbs/lkinstorcl2 exclusive
Wed Apr 10 16:12:47 2013
sculkget: LOCK held BY PID: 6912
Wed Apr 10 16:12:47 2013
Oracle Instance Startup operation failed. Another process may be attempting TO startup OR shutdown this Instance.
Wed Apr 10 16:12:47 2013
Failed TO acquire instance startup/shutdown serialization primitive
Wed Apr 10 16:12:52 2013
sculkget: failed TO LOCK /oracle/products/10.2/db_1/dbs/lkinstorcl2 exclusive
Wed Apr 10 16:12:52 2013
sculkget: LOCK held BY PID: 6912
Wed Apr 10 16:12:52 2013
Oracle Instance Startup operation failed. Another process may be attempting TO startup OR shutdown this Instance.
Wed Apr 10 16:12:52 2013
Failed TO acquire instance startup/shutdown serialization primitive
Wed Apr 10 16:12:56 2013
sculkget: failed TO LOCK /oracle/products/10.2/db_1/dbs/lkinstorcl2 exclusive
Wed Apr 10 16:12:56 2013
sculkget: LOCK held BY PID: 6912
Wed Apr 10 16:12:56 2013
Oracle Instance Startup operation failed. Another process may be attempting TO startup OR shutdown this Instance.
Wed Apr 10 16:12:56 2013
Failed TO acquire instance startup/shutdown serialization primitive

导致问题的原因根据错误信息很容易分析出来,节点2上的IP地址被修改,导致心跳通信出现了异常,而节点1试图将节点2踢出集群,但是由于无法和节点2之间进行通信,因此只有等待节点2重启。
检查节点2的操作系统日志:

Apr 10 15:00:04 bj-sst-xhm-3f2-m5k-02 ip: [ID 482227 kern.notice] ip_arp_done: init failed
Apr 10 15:07:37 bj-sst-xhm-3f2-m5k-02 Had[4135]: [ID 702911 daemon.notice] VCS CRITICAL V-16-1-50086 CPU usage ON bj-sst-xhm-3f2-m5k-02 IS 92%
Apr 10 15:18:41 bj-sst-xhm-3f2-m5k-02 sshd[13485]: [ID 800047 auth.error] error: Failed TO allocate internet-DOMAIN X11 display socket.

在15点04秒时出现的ip_arp_done: init failed信息,说明设置网卡接口时使用了主机名信息,且主机的IP地址被在线修改。
最后根据HISTORY确认,发现有人通过root登录系统,执行ifconfig –a6来检查IPV6的地址,但是命令敲错,执行了ifconfig –a 6,在a和6之间多了一个空格,导致主机所有的IP地址被设置成0.0.0.0,于是导致了上面的错误。
这再次说明,对于root这种权限用户而言,任何的不小心都可能会导致非常严重的后果。

Posted in ORACLE | Tagged , , , , , , , , , , , | Leave a comment

ORA-7445(kjbcrcomplete)错误

客户10.2.0.5 RAC在验证表的逻辑结构是出现ORA-7445错误。
错误信息如下:

Sun Mar 31 03:45:16 EAT 2013
Errors IN file /oracle/app/admin/orcl/udump/orcl2_ora_4039.trc:
ORA-07445: 出现异常错误: 核心转储 [kjbcrcomplete()+5521] [SIGSEGV] [Invalid permissions FOR mapped object] [0x00000002A] [] []

详细TRACE信息为:

Ioctl ASYNC_CONFIG error, errno = 1
*** 2013-03-31 02:20:45.846
*** ACTION NAME:() 2013-03-31 02:20:45.846
*** MODULE NAME:(sqlplus@db2 (TNS V1-V3)) 2013-03-31 02:20:45.846
*** SERVICE NAME:(SYS$USERS) 2013-03-31 02:20:45.846
*** SESSION ID:(2053.926) 2013-03-31 02:20:45.846
WARNING:Could NOT increase the asynch I/O LIMIT TO 32 FOR SQL direct I/O. It IS SET TO 0
*** 2013-03-31 03:45:16.545
Exception signal: 11 (SIGSEGV), code: 2 (Invalid permissions FOR mapped object), addr: 0x2a, PC: [0x40000000053973d1, kjbcrcomplete()+5521]
  r1: 60000000000ba268       r20:                0       br5:                0
  r2: c0000030f2636d20       r21:               21       br6: c00000000042a870
  r3: c000000028da7000       r22:                0       br7: c00000000043d720
  r4:                0       r23:         5c3412c0        ip: 40000000053973d1
  r5: c000000000000408       r24: c0000030f598d9e8      iipa:                0
  r6: c0000000000443e0       r25: 60000000000ac688       cfm:             14b1
  r7: 9fffffffbf7f8de8       r26: 60000000000ca6b8        um:               1a
  r8:                0       r27:                1       rsc:               1f
  r9: c00000145dfc536c       r28: 60000000000ac650       bsp: 9fffffffbf801600
 r10: 60000000000ca6c0       r29: c0000030f6695ec8  bspstore: 9fffffffbf801600
 r11:               20       r30: 9fffffffbf372318      rnat:                0
 r12: 9ffffffffffd9360       r31:               20       ccv: 2000000000000030
 r13: 9fffffffbf3fd4b0      NaTs:                0      unat:                0
 r14: 60000000000ac650       PRs: c000000000398309      fpsr:    9804c8a76433f
 r15:         5c3412cd       br0: 40000000053964e0       pfs: c0000000000014b1
 r16:                0       br1: c000000000299260        lc:                0
 r17: 60000000000ca6c0       br2: c00000000029ba60        ec:                0
 r18:               20       br3:                0       isr: 9fffffffbf801600
 r19: 9ffffffffffd9330       br4:                0       ifa:                0
Reason code: 0053
*** 2013-03-31 03:45:16.590
ksedmp: internal OR fatal error
ORA-07445: 出现异常错误: 核心转储 [kjbcrcomplete()+5521] [SIGSEGV] [Invalid permissions FOR mapped object] [0x00000002A] [] []
CURRENT SQL statement FOR this SESSION:
analyze TABLE c_inter VALIDATE STRUCTURE CASCADE
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedst()+64          CALL     ksedst1()            000000001 ? 000000001 ?
ksedmp()+2176        CALL     ksedst()             000000001 ?
                                                   C000000000000D20 ?
                                                   4000000004032B40 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
ssexhd()+1264        CALL     ksedmp()             000000003 ?
                                                   6000000000247DA0 ?
                                                   60000000000BA268 ?
                                                   6000000000248370 ?
                                                   C000000000000B9F ?
                                                   4000000006C1DF80 ?
                                                   C00000000039B6CD ?
                                                   60000000000C7420 ?
<kernel>             CALL     ssexhd()             C0000030F54A95F0 ?
                                                   60000000000C9570 ?
                                                   C000000028DAE8C8 ?
                                                   60000000000BA268 ?
kjbcrcomplete()+552  CALL     <kernel>             600000000024C200 ?
1                                                  20000000B ?
                                                   600000000024C010 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
kclwcrs()+69392      CALL     kjbcrcomplete()      0001703A6 ? 0001F0000 ?
                                                   0000188C5 ?
                                                   C0000030F0307B48 ?
                                                   000000001 ?
                                                   C0000030F0307B48 ?
                                                   000000000 ?
                                                   9FFFFFFFFFFD9390 ?
kclgclk()+21824      CALL     kclwcrs()            C00000145DFC537A ?
                                                   000000800 ?
                                                   60000000000CAA68 ?
                                                   60000000000C6F68 ?
                                                   9FFFFFFFFFFE0A10 ?
                                                   9FFFFFFFFFFE0958 ?
                                                   9FFFFFFFFFFE0D10 ?
                                                   9FFFFFFFFFFE2460 ?
$cold_kcbzib()+8640  CALL     kclgclk()            9FFFFFFFFFFDBDD8 ?
                                                   000000800 ?
                                                   60000000000CAA68 ?
                                                   60000000000C8A38 ?
                                                   000000040 ?
                                                   9FFFFFFFFFFE0D10 ?
                                                   00000026D ?
                                                   9FFFFFFFFFFE0A10 ?
kcbgtcr()+9536       CALL     $cold_kcbzib()       C00000308A416408 ?
                                                   9FFFFFFFFFFE0D10 ?
                                                   9FFFFFFFFFFE0A10 ?
                                                   000000004 ? 000000003 ?
                                                   00000026D ? 000000000 ?
                                                   000000000 ?
ktrgtc()+1120        CALL     kcbgtcr()            9FFFFFFFFFFE0D10 ?
                                                   9FFFFFFFFFFE0A10 ?
                                                   00000026D ? 000000000 ?
                                                   60000000000BA268 ?
kdifbk()+6608        CALL     ktrgtc()             9FFFFFFFFFFE0D00 ?
                                                   9FFFFFFFFFFE2430 ?
                                                   4000000001CA99C0 ?
                                                   9FFFFFFFFFFE0C20 ?
                                                   00000026D ? 000100000 ?
                                                   4000000001CA99E0 ?
kdgvsp()+15888       CALL     kdifbk()             9FFFFFFFFFFE0C44 ?
                                                   9FFFFFFFFFFE0D2C ?
                                                   000000001 ?
                                                   9FFFFFFFFFFE0DE8 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
                                                   9FFFFFFFFFFE1320 ?
kdgvss()+432         CALL     kdgvsp()             9FFFFFFFFFFE8DB8 ?
                                                   000000000 ?
                                                   9FFFFFFFFFFE2468 ?
                                                   000000000 ? 00C100000 ?
                                                   C0000030FA5CBF20 ?
kdavls()+6624        CALL     kdgvss()             9FFFFFFFBF2F3780 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFE9C80 ?
                                                   9FFFFFFFFFFE9170 ?
                                                   4000000001CAB230 ?
                                                   C0000030FA5CBF20 ?
                                                   9FFFFFFFBF2F3780 ?
                                                   000000001 ?
kkndgd()+2960        CALL     kdavls()             9FFFFFFFFFFEB4C0 ?
                                                   4000000006557450 ?
                                                   C000000000390061 ?
                                                   9FFFFFFFFFFE8FB4 ?
                                                   000000001 ?
                                                   60000000000AE7A8 ?
                                                   000000001 ?
                                                   9FFFFFFFBF2F3BB4 ?
kknpat()+1648        CALL     kkndgd()             C000002F2B449B18 ?
                                                   0000142B9 ? 000000002 ?
                                                   00000000A ?
                                                   C000002F2B4498DE ?
                                                   000000014 ?
                                                   C000002F2B4498A6 ?
                                                   9FFFFFFFFFFEBE40 ?
kknpob()+448         CALL     kknpat()             C000002F2B449B18 ?
                                                   9FFFFFFFFFFEBE40 ?
                                                   000000001 ?
                                                   C000002F2B6BC670 ?
                                                   9FFFFFFFFFFF2F10 ?
                                                   9FFFFFFFFFFEBD60 ?
                                                   C000000000001838 ?
                                                   400000000655AF30 ?
kknls()+1872         CALL     kknpob()             C000002F2B449B18 ?
                                                   9FFFFFFFFFFEBE40 ?
                                                   9FFFFFFFFFFEC1B0 ?
                                                   FFFFFFFFFFFFFFFF ?
                                                   000100000 ? 000100000 ?
                                                   60000000000BA268 ?
                                                   C0000029B20453AF ?
kkndrv()+64          CALL     kknls()              000000000 ?
$cold_opiexe()+7600  CALL     kkndrv()             C000000000002450 ?
                                                   4000000003637C80 ?
                                                   60000000000CAA68 ?
                                                   60000000000C8A38 ?
                                                   000000040 ?
                                                   9FFFFFFFFFFE0D10 ?
                                                   00000026D ?
                                                   9FFFFFFFFFFE0A10 ?
opiosq0()+8144       CALL     $cold_opiexe()       9FFFFFFFFFFF6130 ?
                                                   4000000002F89200 ?
                                                   00002821B ?
                                                   9FFFFFFFFFFF44B0 ?
                                                   60000000000BA268 ?
                                                   C000000000001838 ?
                                                   9FFFFFFFFFFF44B4 ?
                                                   60000000000C6CA0 ?
kpooprx()+416        CALL     opiosq0()            000000003 ?
                                                   9FFFFFFFFFFF6D90 ?
                                                   4000000002AEB2A0 ?
                                                   00002F21B ?
                                                   C000000000000815 ?
kpoal8()+1152        CALL     kpooprx()            000000003 ?
                                                   9FFFFFFFFFFF9AD0 ?
                                                   000000048 ?
                                                   9FFFFFFFFFFF6DD0 ?
                                                   000000001 ? 0000000A4 ?
                                                   60000000000BA268 ?
                                                   60000000000A7E20 ?
opiodr()+2144        CALL     kpoal8()             9FFFFFFFFFFF7590 ?
                                                   C0000000000018B7 ?
                                                   9FFFFFFFFFFF9C70 ?
                                                   9FFFFFFFFFFF6EB0 ?
                                                   60000000000BA268 ?
                                                   4000000002F33E40 ?
ttcpip()+1680        CALL     opiodr()             00000005E ? 000000017 ?
                                                   4000000001BF80B0 ?
                                                   0000046C0 ?
                                                   9FFFFFFFFFFF75A0 ?
opitsk()+2368        CALL     ttcpip()             600000000003DF40 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFF9C70 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFF9DE0 ?
                                                   9FFFFFFFFFFF9BD4 ?
                                                   4000000001CE0810 ?
                                                   000000000 ?
opiino()+1664        CALL     opitsk()             000000000 ? 000000000 ?
                                                   60000000000BA268 ?
                                                   400000000293B500 ?
                                                   000028089 ?
                                                   4000000001BF80C8 ?
opiodr()+2144        CALL     opiino()             00000003C ?
                                                   9FFFFFFFFFFFC630 ?
                                                   9FFFFFFFFFFFEDD0 ?
                                                   9FFFFFFFFFFFBAF0 ?
                                                   60000000000BA268 ?
                                                   C0000000000018B7 ?
opidrv()+1248        CALL     opiodr()             00000003C ? 000000004 ?
                                                   4000000001BF7B60 ?
                                                   0000046C0 ?
                                                   9FFFFFFFFFFFC640 ?
                                                   60000000000BA268 ?
sou2o()+240          CALL     opidrv()             00000003C ?
                                                   60000000000C6C98 ?
                                                   9FFFFFFFFFFFEDD0 ?
opimai_real()+496    CALL     sou2o()              9FFFFFFFFFFFEDF0 ?
                                                   00000003C ? 000000004 ?
                                                   9FFFFFFFFFFFEDD0 ?
main()+240           CALL     opimai_real()        000000000 ?
                                                   9FFFFFFFFFFFEE20 ?
main_opd_entry()+80  CALL     main()               000000002 ?
                                                   9FFFFFFFFFFFF2D8 ?
                                                   60000000000BA268 ?
                                                   C000000000000004 ?
--------------------- Binary Stack Dump ---------------------

在MOS中找不到这个ORA-7445[kjbcrcomplete]错误的记录,但是从详细的TRACE可以判断,问题导致的原因是验证表的结构时出现了异常。显然是表或索引上存在逻辑坏块,导致Oracle在验证其逻辑结构时发现异常。
解决这个问题的方式是通过逻辑方式重建表和索引。

Posted in BUG | Tagged , , , | Leave a comment

Streams AQ: qmn coordinator waiting for slave to start等待

一个客户的10.2.0.5数据库的TOP 5等待出现了这个等待事件。

其实导致这个Streams AQ: qmn coordinator waiting for slave to start等待出现的原因除了参数设置不合理外,最主要的原因还是数据库太闲了:

Event

Waits

Time(s)

Avg Wait(ms)

% Total Call   Time

Wait Class

CPU time

372

59.0

Streams AQ: qmn   coordinator waiting for slave to start

6

34

5,667

5.4

Other

db file   scattered read

50,528

28

1

4.4

User I/O

gc cr multi   block request

66,347

24

0

3.8

Cluster

db file   sequential read

7,157

18

2

2.8

User I/O

 

可以看到,这个Streams AQ: qmn coordinator waiting for slave to start等待排在TOP 5的第二位,仅仅等待了34秒。不过这个等待的平均等待时间则达到了5秒以上,相比总的等待时间,这个对单次操作的性能影响更加明显。

导致这个问题的主要原因是初始化参数AQ_TM_PROCESSES设置为0,而Oracle推荐改参数应该至少设置为1。由于Oracle的高级队列、流、数据泵等多种内置功能都会依赖于QMN进程,当AQ_TM_PROCESSES不为0时,Oracle无法自动根据负载确定QNNN进程的数量,来满足队列相关操作的需求。而如果该参数被禁止,Oracle只能在出现队列需求的时候才被动的去启动一个SLAVE进程,从而导致了较高的平均等待时间。

 

Posted in ORACLE | Tagged , , | Leave a comment

ORA-600(13310)错误

客户10.2.0.4 RAC环境出现ORA-600[13310]错误。
错误信息如下:

Sun Aug  1 04:08:24 2010
Errors IN file /oracle/admin/ORCL/udump/orcl1_ora_14964.trc:
ORA-00603: ORACLE server SESSION TERMINATED BY fatal error
ORA-27544: Failed TO map memory region FOR export
ORA-27300: OS system dependent operation:socket failed WITH STATUS: 23
ORA-27301: OS failure message: File TABLE overflow
ORA-27302: failure occurred at: sskgxpcre1
Sun Aug  1 04:08:24 2010
Trace dumping IS performing id=[cdmp_20100801040824]
Sun Aug  1 16:08:25 2010
Trace dumping IS performing id=[cdmp_20100801160825]
Sun Aug  1 16:08:27 2010
Errors IN file /oracle/admin/ORCL/bdump/orcl1_diag_17090.trc:
ORA-27050: FUNCTION called WITH invalid FIB/IOV STRUCTURE
Additional information: 2
ORA-27041: unable TO OPEN file
HPUX-ia64 Error: 23: File TABLE overflow
Additional information: 3
Sun Aug  1 04:08:42 2010
Errors IN file /oracle/admin/ORCL/udump/orcl1_ora_15065.trc:
ORA-00603: Message 603 NOT found; No message file FOR product=RDBMS, facility=ORA
ORA-27544: Message 27544 NOT found; No message file FOR product=RDBMS, facility=ORA
ORA-27300: Message 27300 NOT found; No message file FOR product=RDBMS, facility=ORA; arguments: [socket] [23]
ORA-27301: Message 27301 NOT found; No message file FOR product=RDBMS, facility=ORA; arguments: [File TABLE overflow]
ORA-27302: Message 27302 NOT found; No message file FOR product=RDBMS, facility=ORA; arguments: [sskgxpcre1]
Sun Aug  1 04:08:48 2010
Trace dumping IS performing id=[cdmp_20100801160843]
Sun Aug  1 04:08:48 2010
Trace dumping IS performing id=[cdmp_20100801160843]
Sun Aug  1 04:08:56 2010
ospid 17096: network interface WITH IP address 192.168.1.81 no longer operational
Sun Aug  1 04:08:58 2010
Errors IN file /oracle/admin/ORCL/bdump/orcl1_ora_15148.trc:
ORA-00600: internal error code, arguments: [13310], [], [], [], [], [], [], []
Sun Aug  1 04:08:58 2010
Errors IN file /oracle/admin/ORCL/bdump/orcl1_ora_15148.trc:
ORA-00210: cannot OPEN the specified control file
ORA-00202: control file: '/dev/vgpmcs11/rr01G1001'
ORA-27041: unable TO OPEN file
HPUX-ia64 Error: 23: File TABLE overflow
Additional information: 3
ORA-00600: internal error code, arguments: [13310], [], [], [], [], [], [], []
Sun Aug  1 04:09:03 2010
Trace dumping IS performing id=[cdmp_20100801040859]

根据MOS文档ORA-600 [13310] Reported When Obtaining An OS Unique Identifier [ID 1234083.1],ORA-600[13310]错误出现的原因是无法获取操作系统上的唯一标识信息,而Oracle文档给出的解决方案是释放或者增加系统资源,比如内存或SWAP空间。
由于出现600错误之前出现了ORA-2730N的错误,说明确实出现了操作系统层面的问题。根据ORA-27544错误以及File table overflow信息进行定位,确认是HP-UX上10.2.0.4的bug:hp-ux: File handles not released after upgrade to 10.2.0.3 CRS Bundle#2 or 10.2.0.4 [ID 739557.1]。
这个问题是10.2.0.3的CRS Bundle补丁引入,由mmap引发的错误,影响版本包括10.2.0.3、10.2.0.4和11.1.0.6。Oracle在10.2.0.5、11.1.0.7中解决了这个问题,在10.2.0.4 CRS Bundle Patch #2以及10.2.0.4最新的CRS PSU中同样修正了该错误。

Posted in BUG | Tagged , , , , , , , , , , , , , | Leave a comment

ORA-600(17147)和ORA-7445(__lwp_kill)错误

客户10.2.0.4 RAC环境出现ORA-600[17147]和ORA-7445[__lwp_kill]错误。
错误信息为:

Fri DEC 14 16:05:56 2012
Errors IN file /oraclelog/admin/orcl/bdump/orcl2_diag_27263.trc:
ORA-07445: exception encountered: core dump [__lwp_kill()+48] [SIGIOT] [UNKNOWN code] [0x000006A7F] [] []
ORA-00600: internal error code, arguments: [17147], [0x9FFFFFFFFD3E6BB8], [], [], [], [], [], []
Fri DEC 14 16:06:06 2012
Restarting dead background process DIAG
DIAG started WITH pid=6, OS id=12243

报错的进程是DIAG进程,出现错误后引发了ORA-7445错误,这个__lwp_kill函数在MOS中没有记录,但是在SUN的文档中找到了相关函数说明,这是操作系统关闭轻量级进程(lightweight process)是的函数调用。因此问题应该是diag出现了ORA-600[17147]错误,随后Oracle尝试通过系统函数调用__lwp_kill结束进程,但是引发了错误。
根据MOS文档Bug 7028176 – Memory corruption / OERI:17147 in DIAG in RAC [ID 7028176.8],当前的错误是10.2.0.4上的BUG,在RAC环境中出现私有内存错误导致DIAG进程出现ORA-600[17147]错误。
Oracle在10.2.0.4.4和10.2.0.5解决了这个错误。这种内存异常导致的错误一般并不会重现,出现的几率相对较低,因此如果没有大量报错也可以选择忽略。如果确实报错比较频繁,Oracle还在多个平台上都提供了这个BUG的单独PATCH。

Posted in BUG | Tagged , , , , , , | Leave a comment

ORA-600(17059)错误

客户10.2.0.4 RAC出现大量的ORA-600[17059]错误。
错误信息如下:

Tue May 21 09:55:30 2013
Errors IN file /oraclelog/admin/orcl/bdump/orcl1_j000_307.trc:
ORA-00600: 内部错误代码, 参数: [17059], [0xC000001346657EB8], [], [], [], [], [], []
Tue May 21 09:55:32 2013
Errors IN file /oraclelog/admin/orcl/bdump/orcl1_j000_307.trc:
ORA-00600: 内部错误代码, 参数: [17059], [0xC000001346657EB8], [], [], [], [], [], []
Tue May 21 09:55:33 2013
Errors IN file /oraclelog/admin/orcl/bdump/orcl1_j000_307.trc:
ORA-00600: 内部错误代码, 参数: [17059], [0xC000001346657EB8], [], [], [], [], [], []
Tue May 21 09:55:34 2013
Errors IN file /oraclelog/admin/orcl/bdump/orcl1_j000_307.trc:
ORA-00600: 内部错误代码, 参数: [17059], [0xC000001346657EB8], [], [], [], [], [], []
Tue May 21 09:55:36 2013
Errors IN file /oraclelog/admin/orcl/bdump/orcl1_j000_307.trc:
ORA-00600: 内部错误代码, 参数: [17059], [0xC000001346657EB8], [], [], [], [], [], []
Tue May 21 09:55:37 2013
Errors IN file /oraclelog/admin/orcl/bdump/orcl1_j000_307.trc:
ORA-00600: 内部错误代码, 参数: [17059], [0xC000001346657EB8], [], [], [], [], [], []
Tue May 21 09:55:38 2013
Errors IN file /oraclelog/admin/orcl/bdump/orcl1_j000_307.trc:
ORA-00600: 内部错误代码, 参数: [17059], [0xC000001346657EB8], [], [], [], [], [], []

详细TRACE信息如下:

*** 2013-07-03 08:16:46.382
*** ACTION NAME:() 2013-07-03 08:16:46.381
*** MODULE NAME:() 2013-07-03 08:16:46.381
*** SERVICE NAME:(SYS$USERS) 2013-07-03 08:16:46.381
*** SESSION ID:(460.3380) 2013-07-03 08:16:46.380
LIBRARY OBJECT HANDLE: handle=c000001349e9a160 mtx=c000001349e9a290(0) cdp=0
name=
INSERT INTO T_R_P_OY@D_UH (S_NUM ,P_CODE ,E_CODE ,CODE_STATE , T_CATE ,O_TYPE_CODE ,P_KEY ,O_TIME , O_DEPART_ID ,O_STAFF_ID ,R_TIME ,REMARK , P_TYPE ,P_ID ,F_TYPE ,FOREGIFT ,RSV1 , RSV2 ,RSV3 ,RSV4 ,RSV5 ,RSV6 , W_CARD_TYPE ,S_CODE ,P_KEY_MODE) VALUES (:B25 ,:B24 ,:B23 ,:B22 , :B21 ,:B20 ,:B19 ,:B18 , :B17 ,:B16 ,:B15 ,:B14 , :B13 ,:B12 ,:B11 ,:B10 ,:B9 , :B8 ,:B7 ,:B6 ,:B5 ,:B4 , :B3 ,:B2 ,:B
hash=191708dff85615cba49f2013927d9b70 TIMESTAMP=05-15-2013 16:56:26
namespace=CRSR flags=RON/KGHP/TIM/OBS/PN0/LRG/DBN/MTX/[104100c1]
kkkk-dddd-llll=0000-0001-0001 LOCK=N pin=0 latch#=10 hpc=7ada hlc=7ada
lwt=c000001349e9a208[c000001349e9a208,c000001349e9a208] ltm=c000001349e9a218[c000001349e9a218,c000001349e9a218]
pwt=c000001349e9a1d0[c000001349e9a1d0,c000001349e9a1d0] ptm=c000001349e9a1e0[c000001349e9a1e0,c000001349e9a1e0]
REF=c000001349e9a238[c000001349e9a238,c000001349e9a238] lnd=c000001349e9a250[c000001355aa6bf8,c0000013374654c8]
  LOCK OWNERS:
      LOCK     USER  SESSION COUNT mode flags
  -------- -------- -------- ----- ---- ------------------------
  c00000133808f6f0 c00000103d35e2b0 c00000103d35e2b0     1 N    [00]
  LIBRARY OBJECT: object=c000001346657eb8
  TYPE=CRSR flags=EXS[0001] pflags=[0000] STATUS=VALD LOAD=0
  CHILDREN: SIZE=32768
  child#    TABLE reference   handle
  ------ -------- --------- --------
       0 c00000134fc4cf30 c00000134fc4cba0 c000001340160d80
       1 c00000134fc4cf30 c000001363a76b48 c00000135dabf960
       2 c00000134fc4cf30 c00000135a292320 c00000136988b450
       3 c00000134fc4cf30 c000001337db37a0 c00000133487dac0
       4 c00000134fc4cf30 c00000135dcf2be8 c000001346fd8448
.
.
.
*** 2013-07-03 08:16:46.734
ksedmp: internal OR fatal error
ORA-00600: 内部错误代码, 参数: [17059], [0xC000001346657EB8], [], [], [], [], [], []
No CURRENT SQL statement being executed.
----- PL/SQL Call Stack -----
  object      line  object
  handle    NUMBER  name
c000001331c31eb0        31  PROCEDURE U_T.P_D_P_O
c000001339ef8bf0         1  anonymous block
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedst()+64          CALL     ksedst1()            000000000 ? 000000001 ?
ksedmp()+2176        CALL     ksedst()             000000000 ?
                                                   C000000000000C9F ?
                                                   4000000003FF5020 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
ksfdmp()+48          CALL     ksedmp()             000000003 ?
kgeriv()+336         CALL     ksfdmp()             C000000000000695 ?
                                                   000000003 ?
                                                   400000000946EB60 ?
                                                   0000CF507 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
kgesiv()+192         CALL     kgeriv()             6000000000031370 ?
                                                   6000000000032428 ?
                                                   40000000019EDB40 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFEC3B8 ?
kgesic1()+128        CALL     kgesiv()             6000000000031370 ?
                                                   9FFFFFFFFD3114C0 ?
                                                   9FFFFFFFFD3114D0 ?
                                                   000000001 ?
                                                   9FFFFFFFFFFEC3B8 ?
$cold_kgltba()+464   CALL     kgesic1()            6000000000031370 ?
                                                   9FFFFFFFFD3114C0 ?
                                                   0000042A3 ? 000000002 ?
                                                   C000001346657EB8 ?
                                                   C000000000004F26 ?
                                                   4000000003351960 ?
                                                   0000CF54D ?
kglhdgc()+608        CALL     $cold_kgltba()       6000000000031370 ?
                                                   C000001346657EB8 ?
                                                   C0000013466581D8 ?
                                                   C00000134F0E5288 ?
                                                   60000000000314E0 ?
                                                   C0000013DE36F380 ?
                                                   C00000134F0E5288 ?
                                                   C0000013F2E679D0 ?
kglget()+5536        CALL     kglhdgc()            6000000000031370 ?
                                                   C000001346657EB8 ?
                                                   9FFFFFFFFFFEC7EE ?
                                                   000000000 ? 000000000 ?
                                                   010010000 ?
                                                   C0000013466581D8 ?
                                                   9FFFFFFFFFFEC478 ?
kxsGetLookupLock()+  CALL     kglget()             6000000000031370 ?
336                                                9FFFFFFFFFFEC7D0 ?
                                                   000000001 ? 000000003 ?
                                                   9FFFFFFFFD370460 ?
kksfbc()+15568       CALL     kxsGetLookupLock()   6000000000031370 ?
                                                   9FFFFFFFFD370438 ?
                                                   9FFFFFFFFFFEC7D0 ?
                                                   C000000000001EC5 ?
                                                   4000000002EEFB60 ?
                                                   9FFFFFFFFD370458 ?
                                                   9FFFFFFFFD370460 ?
                                                   9FFFFFFFFFFEC7F8 ?
kkspbd0()+1184       CALL     kksfbc()             9FFFFFFFFFFED910 ?
                                                   400000000324E890 ?
                                                   0000C8213 ?
                                                   9FFFFFFFFFFEC5F4 ?
                                                   000000000 ? 000000000 ?
                                                   C000001346657FD8 ?
                                                   C000001346657FD8 ?
kksParseCursor()+81  CALL     kkspbd0()            000003F50 ? 000000003 ?
6                                                  000080000 ?
                                                   60000000000C6208 ?
                                                   9FFFFFFFFD370418 ?
                                                   9FFFFFFFFD370570 ?
                                                   60000000000B5E18 ?
opiosq0()+4320       CALL     kksParseCursor()     9FFFFFFFFFFEDB80 ?
                                                   C000000000001736 ?
                                                   4000000002E99F80 ?
                                                   9FFFFFFFFD3C1AB0 ?
                                                   0FFDFFFFF ?
                                                   60000000000C4BF0 ?
                                                   60000000000C4F48 ?
                                                   60000000000C4EE0 ?
opipls()+3392        CALL     opiosq0()            000000003 ?
                                                   9FFFFFFFFFFEE620 ?
                                                   4000000003009120 ?
                                                   00000F1DF ?
                                                   9FFFFFFFFFFEDA40 ?
                                                   60000000000B5E18 ?
                                                   C00000000000224C ?
                                                   0000001ED ?
opiodr()+2128        CALL     opipls()             9FFFFFFFFFFEEF30 ?
                                                   4000000002E42BA0 ?
                                                   00000E09F ?
                                                   9FFFFFFFFFFEE680 ?
                                                   60000000000B5E18 ?
                                                   9FFFFFFFFFFEEA10 ?
rpidrus()+352        CALL     opiodr()             000000066 ? 000000006 ?
                                                   4000000001B01EC0 ?
                                                   0000046B0 ?
                                                   9FFFFFFFFFFEEF40 ?
                                                   60000000000B5E18 ?
skgmstack()+288      CALL     rpidrus()            9FFFFFFFFFFF1680 ?
                                                   9FFFFFFFFFFF10C0 ?
                                                   60000000000B5E18 ?
                                                   9FFFFFFFFFFF1640 ?
                                                   C000000000000716 ?
                                                   4000000002E819C0 ?
                                                   00000E05F ?
                                                   9FFFFFFFFFFF1120 ?
.
.
.

报错为ORA-600(17059)的BUG有很多,不过根据报错堆栈信息以及ORA-600之前出现的LIBRARY CACHE信息,可以确认Bug 8922013 – OERI [17059] / excess child cursors for SQL referencing REMOTE objects [ID 8922013.8]描述的问题和当前一致。
当语句通过数据库链进行远端访问,可能会造成子游标无法重用,从而导致大量的ORA-600错误的出现。这个问题影响10.2.0.4和10.2.0.5,Oracle仍然没有彻底解决这个问题,事实上是在10.2.0.4中解决了Bug 4581334 Cursors accessing remote tables may be repeatedly rebuilt and not used的问题,而引发了当前的BUG。Oracle目前并没有给出一个解决方案,也没有明确哪个版本会修正这个问题。这是针对一些平台提供了单独的补丁,但是安装这个补丁意味着撤销对BUG 4581334的修正。
这个BUG并不会导致任何功能上的问题,但是可能会在短时间内生成大量的TRACE从而将ORACLE_BASE目录撑满。

Posted in BUG | Tagged , , , , , | Leave a comment