20121109 Oracle Technology Carnival, Day One

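After months of preparation, the second Oracle Technology Carnival has finally opened.
I was scheduled to give a talk on ODA tomorrow, but I did not expect to pick up an extra task today at short notice.
Kaya had a session this afternoon: the one his team presented at OOW, "Oracle Performance, Dramatized: Simulating and Demonstrating Oracle Performance Problems". At OOW, five or six people took part in the session, each playing one role, and the problem was analyzed step by step through the dialogue between the roles.
With only Kaya on stage today, that format clearly could not come across. The original plan was for Kamus to partner with Kaya, but even after the roles were consolidated, two people still could not cover that many parts, so this morning I was told at short notice that I needed to help out.
Using the morning and a little time after lunch, we managed to run through the lines once. Fortunately Kaya knows this session inside out, and Kamus and I had both sat through the three scenes we were playing once at OOW, so we had at least some idea of them. In the end the session went off without a hitch, and the feedback was quite good. The credit for its success belongs without question to Kaya: Kaya wrote all of the demo code, and in this session Kaya also carried all of the walkthrough and technical analysis. Kamus, for his part, played a DBA under enormous pressure to perfection, which was another highlight of the session; he also added jokes about searching on google and baidu and tied them to current events, a real stroke of genius.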

Posted in NEWS

20121108 Oracle Developer Day

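Today I attended Oracle Developer Day by invitation.
It has been a long time since I last attended an Oracle Developer Day; the previous one was probably four years ago. That time I went to learn; this time I was there as an invited technical expert.
By coincidence, Oracle held Developer Day at the Guobin Hotel (国宾宾馆) as well, and on the day before our Oracle Technology Carnival, so going to the event in the afternoon was also a chance to warm up the venue.
All of today's sessions were about big data, and my work and experience have very little to do with big data, so I had expected to be little more than a bystander. To my surprise, nearly all of the questions from the audience were about Oracle DB and had almost nothing to do with big data. That was just as well: had they really been big data questions, I could not have answered them.
Finally, the closing expert panel of Developer Day turned into a 恩墨 showcase: the four experts on stage were Kamus, 圣文, 老熊 and me.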

Posted in NEWS

ORA-600(kcbgcur_1) Error

An ORA-600 error caused by a failure to allocate space in a temporary tablespace.
The error messages are as follows:

Wed May 16 17:26:23 2012
ORA-1652: unable TO extend temp segment BY 128 IN tablespace                 ATEMP 
Wed May 16 17:26:31 2012
ORA-1652: unable TO extend temp segment BY 128 IN tablespace                 ATEMP 
Wed May 16 17:26:37 2012
ORA-1652: unable TO extend temp segment BY 128 IN tablespace                 ATEMP 
Wed May 16 17:26:38 2012
ORA-1652: unable TO extend temp segment BY 128 IN tablespace                 ATEMP 
Errors IN file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_15925604.trc  (incident=209723):
ORA-00600: 内部错误代码, 参数: [kcbgcur_1], [], [], [], [], [], [], [], [], [], [], []
ORA-01652: 无法通过 128 (在表空间 ATEMP 中) 扩展 temp 段
Incident details IN: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_209723/orcl1_ora_15925604_i209723.trc

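MOS has no known bug description that matches this behavior, but in this 11.2.0.3 RAC for AIX environment the ORA-600[kcbgcur_1] error is reproducible: whenever ORA-1652 errors occur in quick succession, this ORA-600 error follows.
Avoiding the ORA-600 therefore comes down to eliminating the ORA-1652 errors: besides adding enough temporary space, the SQL statements that consume large amounts of temporary space can be tuned.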
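
To check whether an alert log shows this same pattern, one option is a small scanner that reports each ORA-600 [kcbgcur_1] together with the number of ORA-1652 errors logged shortly before it. The following is only a sketch in Python, not an Oracle tool: the function names and the 60-second window are assumptions made for illustration, and it assumes the timestamp format of the excerpt above with English day and month names.

```python
import re
from datetime import datetime, timedelta

# Timestamp lines in the alert log excerpt look like "Wed May 16 17:26:23 2012".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$")
ORA1652_RE = re.compile(r"ORA-0*1652\b")
ORA600_RE = re.compile(r"ORA-0*600\b")


def parse_alert_log(text):
    """Return (timestamp, message) pairs; a message inherits the last timestamp seen."""
    entries, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if TS_RE.match(line):
            current = datetime.strptime(line, "%a %b %d %H:%M:%S %Y")
        elif current is not None:
            entries.append((current, line))
    return entries


def kcbgcur_after_1652(text, window_seconds=60):
    """For each ORA-600 [kcbgcur_1], count the ORA-1652 lines logged before it
    within the given window (hypothetical default: 60 seconds)."""
    entries = parse_alert_log(text)
    hits = []
    for i, (ts, line) in enumerate(entries):
        if ORA600_RE.search(line) and "[kcbgcur_1]" in line:
            start = ts - timedelta(seconds=window_seconds)
            count = sum(1 for t, msg in entries[:i]
                        if t >= start and ORA1652_RE.search(msg))
            hits.append((ts, count))
    return hits


# Lines copied from the alert log excerpt in this post.
SAMPLE = """\
Wed May 16 17:26:23 2012
ORA-1652: unable TO extend temp segment BY 128 IN tablespace                 ATEMP
Wed May 16 17:26:31 2012
ORA-1652: unable TO extend temp segment BY 128 IN tablespace                 ATEMP
Wed May 16 17:26:37 2012
ORA-1652: unable TO extend temp segment BY 128 IN tablespace                 ATEMP
Wed May 16 17:26:38 2012
ORA-1652: unable TO extend temp segment BY 128 IN tablespace                 ATEMP
Errors IN file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_15925604.trc  (incident=209723):
ORA-00600: 内部错误代码, 参数: [kcbgcur_1], [], [], [], [], [], [], [], [], [], [], []
ORA-01652: 无法通过 128 (在表空间 ATEMP 中) 扩展 temp 段
"""

if __name__ == "__main__":
    for ts, count in kcbgcur_after_1652(SAMPLE):
        print(ts, "ORA-600 [kcbgcur_1] preceded by", count, "ORA-1652 errors")
```

A count that is consistently above zero would support the observation that the ORA-600 follows a burst of ORA-1652 errors; it does not by itself establish the cause.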

Posted in BUG

ORA-7445(ksfd_odmwat) Error

An error reported when accessing the LOGMINER view in an Oracle 9206 RAC environment.
Error message:

Wed Aug 10 17:06:37 2011
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl2_ora_20704.trc:
ORA-07445: exception encountered: core dump [000000010069AB88] [SIGSEGV] [Address NOT mapped TO object] [0x000000008] [] []
Wed Aug 10 17:07:22 2011
Trace dumping IS performing id=[cdmp_20110810170722]

The corresponding trace information is:

*** 2011-08-10 17:06:37.463
*** SESSION ID:(49.28760) 2011-08-10 17:06:37.461
Exception signal: 11 (SIGSEGV), code: 1 (Address NOT mapped TO object), addr: 0x8, PC: [0x10069ab88, 000000010069AB88]
*** 2011-08-10 17:06:37.463
ksedmp: internal OR fatal error
ORA-07445: exception encountered: core dump [000000010069AB88] [SIGSEGV] [Address NOT mapped TO object] [0x000000008] [] []
CURRENT SQL statement FOR this SESSION:
SELECT COUNT(*) FROM v$logmnr_contents
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedmp()+328         CALL     ksedst()             00000000B ? 000000000 ?
                                                   000000000 ? 00000004A ?
                                                   FFFFFFFF7FFF0658 ?
                                                   1032E18E8 ?
ssexhd()+676         CALL     ksedmp()             000103705 ? 103705000 ?
                                                   103705468 ? 10370A000 ?
                                                   000102C00 ? 000000000 ?
sigacthandler()+44   PTR_CALL 0000000000000000     00010370D ?
                                                   FFFFFFFF7FFF76F0 ?
                                                   10370D000 ? 10370A620 ?
                                                   000000000 ? 10370D578 ?
ksfd_odmwat()+968    PTR_CALL 0000000000000000     00000000B ?
                                                   FFFFFFFF7FFF76F0 ?
                                                   FFFFFFFF7FFF7410 ?
                                                   000004400 ? 000103400 ?
                                                   00000000E ?
ksfdwtio()+812       CALL     ksfd_odmwat()        FFFFFFFF7B7E63A8 ?
                                                   000000000 ? 1037077D4 ?
                                                   000000000 ? 000000000 ?
                                                   0FFFFFFFB ?
ksfdwat1()+64        CALL     ksfdwtio()           38000A800 ? 1037077D4 ?
                                                   380011054 ? 000000000 ?
                                                   0FFFFFFFB ? 01CBA91A7 ?
ksfdrwat0()+212      CALL     ksfdwat1()           0FFFFFFFB ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   10370A000 ? 00000FC00 ?
kcrfais()+752        CALL     ksfdwat()            38000A000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   00000001B ? 10370AA10 ?
kcrfdr()+292         CALL     kcrfais()            FFFFFFFF7B7E1478 ?
                                                   FFFFFFFF7B7E1400 ?
                                                   FFFFFFFF7B7E1210 ?
                                                   000000800 ? 000002002 ?
                                                   FFFFFFFF7FFF9D7C ?
kcrfrgv()+452        CALL     kcrfdr()             FFFFFFFF7B7E10F0 ?
                                                   FFFFFFFF7B7E1168 ?
                                                   000000040 ? 000103400 ?
                                                   00010370B ? 000000002 ?
krvxrgr_GetRedo()+1  CALL     kcrfrgv()            FFFFFFFF7B7E10F0 ?
04                                                 0000002D0 ? 0000002D0 ?
                                                   000000080 ? 000000000 ?
                                                   00000FC00 ?
krvxror_ReadOneReco  CALL     krvxrgr_GetRedo()    FFFFFFFF7B7E6818 ?
rd()+1284                                          FFFFFFFF7B7E10F0 ?
                                                   FFFFFFFF7FFFB018 ?
                                                   FFFFFFFF7B7E10E0 ?
                                                   1033E7E98 ? 1033E7E88 ?
krvxread()+432       CALL     krvxror_ReadOneReco  FFFFFFFF7B7E10E0 ?
                              rd()                 FFFFFFFF7B7E3118 ?
                                                   FFFFFFFF7B7E2F90 ?
                                                   000000007 ?
                                                   FFFFFFFF7FFFB910 ?
                                                   000000000 ?
krvxgtsp_GetTxnSing  CALL     krvxread()           FFFFFFFF7B7E6818 ?
leProcess()+28                                     000000000 ? 000000000 ?
                                                   0990FD824 ? 000000000 ?
                                                   FFFFFFFF7B7E2F90 ?
krvxgt()+604         CALL     krvxgtsp_GetTxnSing  FFFFFFFF7B7E6818 ?
                              leProcess()          000000000 ? 103708000 ?
                                                   FFFFFFFFFFFFFFFF ?
                                                   000000000 ? 000000000 ?
krvfcact()+1940      CALL     krvxgt()             FFFFFFFF7B7E6818 ?
                                                   000000001 ?
                                                   FFFFFFFF7FFFB2D4 ?
                                                   FFFFFFFF7FFFB2B8 ?
                                                   FFFFFFFF7FFFB2C0 ?
                                                   0000000CA ?
qerfxFetch()+1056    PTR_CALL 0000000000000000     FFFFFFFF7B7E6818 ?
                                                   000001000 ?
                                                   FFFFFFFF7B7E7768 ?
                                                   FFFFFFFF7B7E0068 ?
                                                   47CABB0D8 ?
                                                   FFFFFFFF7B7E2F90 ?
qergsFetch()+2268    PTR_CALL 0000000000000000     4C9EB4768 ? 102E54FC8 ?
                                                   000000000 ? 102DDC520 ?
                                                   102E53830 ? 000007FFF ?
opifch2()+1724       PTR_CALL 0000000000000000     000101800 ?
                                                   FFFFFFFF7B7E74E0 ?
                                                   000000000 ? 000000001 ?
                                                   100FD5400 ? 4C9EB46D0 ?
opiall0()+3860       CALL     opifch2()            101001000 ? 102E55AD8 ?
                                                   100FD5400 ? 0000000C9 ?
                                                   FFFFFFFF7FFFB910 ?
                                                   FFFFFFFF7FFFBF3C ?
kpoal8()+1040        CALL     opiall0()            000000000 ? 00000005E ?
                                                   FFFFFFFF7FFFC1C8 ?
                                                   103705808 ?
                                                   FFFFFFFF7B7E7B48 ?
                                                   FFFFFFFF7FFFC558 ?
opiodr()+1688        PTR_CALL 0000000000000000     000000000 ? 000000001 ?
                                                   FFFFFFFF7FFFEA10 ?
                                                   000000024 ? 000000000 ?
                                                   0000022B0 ?
ttcpip()+1556        PTR_CALL 0000000000000000     000103400 ? 100FBBFC0 ?
                                                   10370D808 ? 103705808 ?
                                                   103707D40 ?
                                                   FFFFFFFF7FFFCBB0 ?
opitsk()+984         CALL     ttcpip()             10370D800 ? 000000014 ?
                                                   FFFFFFFF7FFFEA10 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFF7FFFDCFC ?
opiino()+1572        CALL     opitsk()             000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   103707D28 ?
                                                   FFFFFFFF7FFFEB64 ?
opiodr()+1688        PTR_CALL 0000000000000000     000380007 ? 10370C658 ?
                                                   1037F9458 ?
                                                   FFFFFFFF7FFFF8A0 ?
                                                   000000000 ? 47DED9E20 ?
opidrv()+736         CALL     opiodr()             000103400 ? 10100C380 ?
                                                   10370D808 ? 103705808 ?
                                                   103707D40 ?
                                                   FFFFFFFF7FFFF3C0 ?
sou2o()+16           CALL     opidrv()             000000000 ? 000000004 ?
                                                   1037051EC ? 00000003C ?
                                                   1037056C8 ? 000103400 ?
main()+184           CALL     sou2o()              FFFFFFFF7FFFF8C0 ?
                                                   00000003C ? 000000004 ?
                                                   FFFFFFFF7FFFF8A0 ?
                                                   000039E70 ? 000000000 ?
_start()+380         CALL     main()               000000002 ?
                                                   FFFFFFFF7FFFFA08 ?
                                                   FFFFFFFF7FFFFA20 ?
                                                   000000000 ? 000000000 ?
                                                   100000000 ?
--------------------- Binary Stack Dump ---------------------

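The database uses ODM (VERITAS 4.1.20.00 ODM Library, Version 1.1), and the process was waiting on an ODM I/O operation when the error occurred. This error cannot be found on MOS or even on GOOGLE, but it is easy to conclude that the problem lies in the interaction between Oracle and Veritas.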
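
The function named in the title of this post is the frame listed directly after sigacthandler() in the call stack, that is, the code that was running when the signal arrived. As an illustration only, a Python sketch that picks that frame out of a "Call Stack Trace" section could look like the following; the helper names are hypothetical, and function names that the trace wraps across two lines are returned truncated.

```python
import re

# A frame line starts in column 0: "<location>  CALL|PTR_CALL  <entry point> ...".
# Wrapped argument lines start with whitespace and are skipped.
FRAME_RE = re.compile(r"^(\S+)\s+(CALL|PTR_CALL)\s+(\S+)")


def parse_frames(trace_text):
    """Return the calling-location function names, top of stack first."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME_RE.match(line)
        if m and m.group(1) != "calling":  # skip the column header line
            frames.append(m.group(1).split("(")[0])
    return frames


def faulting_function(trace_text):
    """Return the frame listed right after sigacthandler(), or None if absent."""
    frames = parse_frames(trace_text)
    for i, name in enumerate(frames):
        if name == "sigacthandler" and i + 1 < len(frames):
            return frames[i + 1]
    return None


# The first frames of the stack trace shown in this post.
SAMPLE = """\
calling              CALL     entry                argument VALUES IN hex
location             TYPE     point                (? means dubious VALUE)
-------------------- -------- -------------------- ----------------------------
ksedmp()+328         CALL     ksedst()             00000000B ? 000000000 ?
                                                   000000000 ? 00000004A ?
ssexhd()+676         CALL     ksedmp()             000103705 ? 103705000 ?
sigacthandler()+44   PTR_CALL 0000000000000000     00010370D ?
ksfd_odmwat()+968    PTR_CALL 0000000000000000     00000000B ?
ksfdwtio()+812       CALL     ksfd_odmwat()        FFFFFFFF7B7E63A8 ?
"""

if __name__ == "__main__":
    print(faulting_function(SAMPLE))
```

Applied to the excerpt, the sketch returns ksfd_odmwat, which is consistent with the process having been inside the ODM wait routine at the time of the SIGSEGV.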
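The failing statement is SELECT COUNT(*) FROM V$LOGMNR_CONTENTS, but the cursor dump of the session shows that it had already run full scans of V$LOGMNR_CONTENTS earlier without raising any exception: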

Cursor Dump:
----------------------------------------
Cursor 1 (ffffffff7ca60490): CURROW  curiob: ffffffff7ca69068
 curflg: 4e curpar: 0 curusr: 0 curses 47e8b7800
 cursor name: SELECT * FROM v$logmnr_contents
 child pin: 48f2da230, child LOCK: 490123c90, parent LOCK: 490123da0
 xscflg: 80100074, parent handle: 4a8d79a90, xscfl2: 4200409
  nxt: 15.0x00000008  nxt: 14.0x00000158  nxt: 13.0x00000fa8  nxt: 12.0x00000fa8
  nxt: 11.0x00000158  nxt: 10.0x00000fa8  nxt: 9.0x00000180  nxt: 8.0x00000fa8
  nxt: 7.0x00000fa8  nxt: 6.0x00000158  nxt: 5.0x00000fa8  nxt: 4.0x00000420
  nxt: 3.0x000007c8  nxt: 2.0x000007c8  nxt: 1.0x000007c8
Cursor frame allocation dump:
frm: -------- Comment --------  Size  Seg Off 
 bhp SIZE: 160/600
 whp SIZE: 11066784/11067768
Dump OF CURRENT WORK HEAP:
******************************************************
.
.
.
******************************************************
----------------------------------------
Cursor 2 (ffffffff7ca604e0): CURBOUND  curiob: ffffffff7b7e8130
 curflg: 4e curpar: 0 curusr: 0 curses 47e8b7800
 cursor name: SELECT DISTINCT operation FROM v$logmnr_contents
 child pin: 0, child LOCK: 48f491d08, parent LOCK: 48f492368
 xscflg: 100024, parent handle: 4aa79cb50, xscfl2: 4200409
 bhp SIZE: 160/600
----------------------------------------
Cursor 3 (ffffffff7ca60530): CURFETCH  curiob: ffffffff7b7e7b48
 curflg: 4e curpar: 0 curusr: 0 curses 47e8b7800
 cursor name: SELECT COUNT(*) FROM v$logmnr_contents
 child pin: 48f2db030, child LOCK: 490125618, parent LOCK: 490125c78
 xscflg: 80100074, parent handle: 4c9ec3038, xscfl2: 4200409
  nxt: 4.0x00000008  nxt: 3.0x00000060  nxt: 2.0x000004c0  nxt: 1.0x000007c8
Cursor frame allocation dump:
frm: -------- Comment --------  Size  Seg Off 
 bhp SIZE: 160/600
 whp SIZE: 11046016/11046720
Dump OF CURRENT WORK HEAP:
******************************************************

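Evidently the error is sporadic. At present there is no known way to trigger or to resolve it, but since it only occasionally affects access to v$logmnr_contents, it can be ignored.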
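
Reading the cursor dump by eye works for three cursors; for a longer dump, a short script can list every cursor that touched v$logmnr_contents. The Python below is a sketch under the assumption that each cursor is introduced by a "Cursor N (address): STATE" line followed by a "cursor name:" line, as in the dump above; the function name is hypothetical, and the state labels are reported as printed without interpreting them.

```python
import re

CURSOR_RE = re.compile(r"^Cursor (\d+) \([0-9a-f]+\): (\w+)")
NAME_RE = re.compile(r"^\s*cursor name: (.*)$")


def parse_cursors(dump_text):
    """Return one dict per cursor with its number, state label and SQL text."""
    cursors, current = [], None
    for line in dump_text.splitlines():
        m = CURSOR_RE.match(line)
        if m:
            current = {"number": int(m.group(1)), "state": m.group(2), "sql": None}
            cursors.append(current)
            continue
        m = NAME_RE.match(line)
        if m and current is not None:
            current["sql"] = m.group(1).strip()
    return cursors


# Header and name lines copied from the cursor dump in this post.
SAMPLE = """\
Cursor 1 (ffffffff7ca60490): CURROW  curiob: ffffffff7ca69068
 curflg: 4e curpar: 0 curusr: 0 curses 47e8b7800
 cursor name: SELECT * FROM v$logmnr_contents
Cursor frame allocation dump:
Cursor 2 (ffffffff7ca604e0): CURBOUND  curiob: ffffffff7b7e8130
 cursor name: SELECT DISTINCT operation FROM v$logmnr_contents
Cursor 3 (ffffffff7ca60530): CURFETCH  curiob: ffffffff7b7e7b48
 cursor name: SELECT COUNT(*) FROM v$logmnr_contents
"""

if __name__ == "__main__":
    for c in parse_cursors(SAMPLE):
        if c["sql"] and "v$logmnr_contents" in c["sql"].lower():
            print(c["number"], c["state"], c["sql"])
```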

Posted in BUG

ORA-600(12333) Error (Part 2)

Yet another ORA-600(12333) error.
ORA-600(12333) error: http://yangtingkun.itpub.net/post/468/526154
ORA-600(12333) error and ORA-600(ttclxx1) error: http://yangtingkun.itpub.net/post/468/526078
The error occurred in a 9206 RAC environment:
Tue Jul 12 17:26:57 2011
Errors in file /opt/app/admin/orcl/udump/orcl1_ora_28993.trc:
ORA-00600: internal error code, arguments: [12333], [19], [3], [15], [], [], [], []
Tue Jul 12 17:26:58 2011
Trace dumping is performing id=[cdmp_20110712172658]

The detailed trace is as follows:

*** SESSION ID:(116.15703) 2011-07-12 17:26:57.605
*** 2011-07-12 17:26:57.605
ksedmp: internal OR fatal error
ORA-00600: internal error code, arguments: [12333], [19], [3], [15], [], [], [], []
CURRENT SQL statement FOR this SESSION:
INSERT INTO BACK_SESSION (SESSION_SEQ, LOGIN_TIME, LOGOUT_TIME, SESSION_ID, TIMEOUT, LOGIN_ADDR, SUM_PAY, USER_SEQ, SUM_ACCESS, UNIQ_TOKEN, STATUS) VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11)
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedmp()+328         CALL     ksedst()             00000000B ? 000000000 ?
                                                   000000000 ? 00000004A ?
                                                   FFFFFFFF7FFF9A58 ?
                                                   1032E18E8 ?
kgeriv()+208         PTR_CALL 0000000000000000     000103705 ? 103705000 ?
                                                   103705468 ? 10370A000 ?
                                                   000102C00 ? 000000000 ?
kgesiv()+108         CALL     kgeriv()             1037056C8 ? 10381D0E8 ?
                                                   000000258 ? 0000013C8 ?
                                                   FFFFFFFF7FFFD408 ?
                                                   103706A98 ?
ksesic3()+92         CALL     kgesiv()             1037056C8 ? 10381D0E8 ?
                                                   00000302D ? 000000003 ?
                                                   FFFFFFFF7FFFD408 ?
                                                   FFFFFFFF7FFFE2B0 ?
opitsk()+5088        CALL     ksesic3()            00000302D ? 000000000 ?
                                                   000000013 ? 000000000 ?
                                                   000000003 ? 000000000 ?
opiino()+1572        CALL     opitsk()             000000000 ? 000003000 ?
                                                   000000000 ? 000000000 ?
                                                   103707D28 ?
                                                   FFFFFFFF7FFFEC04 ?
opiodr()+1688        PTR_CALL 0000000000000000     000380007 ? 10370C658 ?
                                                   1037EE848 ?
                                                   FFFFFFFF7FFFF940 ?
                                                   000000000 ? 5C4447BC8 ?
opidrv()+736         CALL     opiodr()             000103400 ? 10100C380 ?
                                                   10370D808 ? 103705808 ?
                                                   103707D40 ?
                                                   FFFFFFFF7FFFF460 ?
sou2o()+16           CALL     opidrv()             000000000 ? 000000004 ?
                                                   1037051EC ? 00000003C ?
                                                   1037056C8 ? 000103400 ?
main()+184           CALL     sou2o()              FFFFFFFF7FFFF960 ?
                                                   00000003C ? 000000004 ?
                                                   FFFFFFFF7FFFF940 ?
                                                   000039E70 ? 000000000 ?
_start()+380         CALL     main()               000000002 ?
                                                   FFFFFFFF7FFFFAA8 ?
                                                   FFFFFFFF7FFFFAC0 ?
                                                   000000000 ? 000000000 ?
                                                   100000000 ?
--------------------- Binary Stack Dump ---------------------

The stack frame of the opiino function turns out to contain an ORA-1403 error:

========== FRAME [6] (opiino()+1572 -> opitsk()) ==========
%l0 FFFFFFFF7FFFDD2B %l1 000000010370D808 %l2 0000000000000000 
%l3 000000010370D790 %l4 0000000000000000 %l5 000000010370D800 
%l6 0000000000000001 %l7 000000000000000A %i0 0000000000000000 
%i1 0000000000003000 %i2 0000000000000000 %i3 0000000000000000 
%i4 0000000103707D28 %i5 FFFFFFFF7FFFEC04 %fp FFFFFFFF7FFFE411 
rtn-pc 000000010100C9A4 argd FFFFFFFF7FFFD400 stret FFFFFFFF7FFFD400 
xtraarg FFFFFFFF7FFFE4C1 locals FFFFFFFF7FFFD438 
Dump OF memory FROM 0xFFFFFFFF7FFFD380 TO 0xFFFFFFFF7FFFD780
FFFFFFFF7FFFD380 FFFFFFFF 7FFFDD2B 00000001 0370D808  [.......+.....p..]
FFFFFFFF7FFFD390 00000000 00000000 00000001 0370D790  [.............p..]
FFFFFFFF7FFFD3A0 00000000 00000000 00000001 0370D800  [.............p..]
FFFFFFFF7FFFD3B0 00000000 00000001 00000000 0000000A  [................]
FFFFFFFF7FFFD3C0 00000000 00000000 00000000 00003000  [..............0.]
FFFFFFFF7FFFD3D0 00000000 00000000 00000000 00000000  [................]
FFFFFFFF7FFFD3E0 00000001 03707D28 FFFFFFFF 7FFFEC04  [.....p}(........]
FFFFFFFF7FFFD3F0 FFFFFFFF 7FFFE411 00000001 0100C9A4  [................]
FFFFFFFF7FFFD400 FFFFFFFF 7DDC1A90 00000000 00000000  [....}...........]
FFFFFFFF7FFFD410 00000000 00000013 00000000 00000000  [................]
FFFFFFFF7FFFD420 00000000 00000003 00000000 00000000  [................]
FFFFFFFF7FFFD430 00000000 0000000F 00000000 00000000  [................]
FFFFFFFF7FFFD440 FFFFFFFF 7FFFEC08 00000000 00000053  [...............S]
FFFFFFFF7FFFD450 00000001 03710D48 00000000 00000778  [.....q.H.......x]
FFFFFFFF7FFFD460 00000001 02D67BE8 00000000 00000000  [......{.........]
FFFFFFFF7FFFD470 00006FA3 7FFFE5D0 FFFFFFFF 7FFFEC0D  [..o.............]
FFFFFFFF7FFFD480 00080000 7DDBF3A6 00000001 02EB8304  [....}...........]
FFFFFFFF7FFFD490 00004000 037A7A00 FFFFFFFF 7FFFEAB0  [..@..zz.........]
FFFFFFFF7FFFD4A0 FFFFFFFF 7FFFE2B0 FFFFFFFF 7FFFE2B0  [................]
FFFFFFFF7FFFD4B0 00000001 0370D790 FFFFFFFF 7FFFE2AA  [.....p..........]
FFFFFFFF7FFFD4C0 00000001 00000000 00000001 037EE740  [.............~.@]
FFFFFFFF7FFFD4D0 00000000 FFFFFFFF FFFFFFFF FFFFEBFF  [................]
FFFFFFFF7FFFD4E0 00000000 00000000 00000000 00000000  [................]
FFFFFFFF7FFFD4F0 FFFFFFFF 7DDC1AF8 FFFFFFFF 7DDC1AA8  [....}.......}...]
FFFFFFFF7FFFD500 FFFFFFFF 7FFFEC04 FFFFFFFF 7DDC1A88  [............}...]
FFFFFFFF7FFFD510 FFFFFFFF 7DDC1A90 00000000 7FFFFFFF  [....}...........]
FFFFFFFF7FFFD520 FFFFFFFF 4F52412D 30313430 333A206E  [....ORA-01403: n]
FFFFFFFF7FFFD530 6F206461 74612066 6F756E64 0A000000  [o DATA found....]
FFFFFFFF7FFFD540 00000000 00000000 FFFFFFFF 7DDC1AA8  [............}...]
FFFFFFFF7FFFD550 FFFFFFFF 7DDC1AB0 00000000 0000000A  [....}...........]
FFFFFFFF7FFFD560 FFFFFFFF 7DDC1A90 FFFFFFFF 7FFFE523  [....}..........#]
FFFFFFFF7FFFD570 FFFFFFFF 7DDBF3A7 FFFFFFFF 7DDBF3BC  [....}.......}...]
FFFFFFFF7FFFD580 00000000 00000047 00000000 00000000  [.......G........]
FFFFFFFF7FFFD590 00000000 00000000 00000000 00000000  [................]
FFFFFFFF7FFFD5A0 00000000 00000000 00000047 00000001  [...........G....]
FFFFFFFF7FFFD5B0 00000016 00000000 00000000 00000000  [................]
FFFFFFFF7FFFD5C0 00000000 00000000 00000001 0359A570  [.............Y.p]
FFFFFFFF7FFFD5D0 00000000 00000000 00000001 03590259  [.............Y.Y]
FFFFFFFF7FFFD5E0 00000000 00000003 FFFFFFFF 7DDB6F48  [............}.oH]
FFFFFFFF7FFFD5F0 FFFFFFFF 7FFFE750 00000000 00000000  [.......P........]
FFFFFFFF7FFFD600 FFFFFFFF 7DDBF3A6 00000000 00000053  [....}..........S]
FFFFFFFF7FFFD610 00000000 00000000 FFFFFFFF FFFFFFFF  [................]
FFFFFFFF7FFFD620 00000001 00000000 FFFFFFFF 7DDC1B00  [............}...]
FFFFFFFF7FFFD630 FFFFFFFF 7DDBF3BB FFFFFFFF 7DDC1B08  [....}.......}...]
FFFFFFFF7FFFD640 00000000 00000000 FFFFFFFF 7DDC1AA0  [............}...]
FFFFFFFF7FFFD650 7FFFFFFF FFFFFFFF 00000016 FFFFEBFF  [................]
FFFFFFFF7FFFD660 FFFFFFFF 7FFFE83F 00000000 00000000  [.......?........]
FFFFFFFF7FFFD670 FFFFFFFF 7DDC1AF8 FFFFFFFF 7FFFE849  [....}..........I]
FFFFFFFF7FFFD680 FFFFFFFF 7DDC1AB0 FFFFFFFF 7DDC1A88  [....}.......}...]
FFFFFFFF7FFFD690 FFFFFFFF 7DDC1A90 00000000 7FFFFFFF  [....}...........]
FFFFFFFF7FFFD6A0 00000000 00000003 00000001 0359025B  [.............Y.[]
FFFFFFFF7FFFD6B0 00000000 00000047 00000000 00000000  [.......G........]
FFFFFFFF7FFFD6C0 0000002C 00000000 FFFFFFFF 7FFFE727  [...,...........']
FFFFFFFF7FFFD6D0 00000000 00000000 00000001 035901D8  [.............Y..]
FFFFFFFF7FFFD6E0 FFFFFFFF 7DDC1AF8 FFFFFFFF 7DDC1AA8  [....}.......}...]
FFFFFFFF7FFFD6F0 FFFFFFFF 7DDC1AB0 FFFFFFFF 7DDC1A88  [....}.......}...]
FFFFFFFF7FFFD700 FFFFFFFF 7DDC1A90 FFFFFFFF 7DDBF3BD  [....}.......}...]
FFFFFFFF7FFFD710 00000004 7DDBF3A7 00000000 0000000A  [....}...........]
FFFFFFFF7FFFD720 00000000 00000047 FFFFFFFF 7FFFDF2E  [.......G........]
FFFFFFFF7FFFD730 00000000 00000000 00000000 00000000  [................]
FFFFFFFF7FFFD740 00000000 00000000 00000000 00000001  [................]
FFFFFFFF7FFFD750 00000000 00000000 00000000 00000005  [................]
FFFFFFFF7FFFD760 00000000 00000016 0000002C 037EE7C8  [...........,.~..]
FFFFFFFF7FFFD770 00000001 037EE848 00000000 00000000  [.....~.H........]
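
The ORA-01403 text is visible in the ASCII column at addresses FFFFFFFF7FFFD520 and FFFFFFFF7FFFD530, and it can be confirmed from the hex words themselves. The Python sketch below is an illustration, not part of any Oracle tooling: it assumes dump lines of the form "address, four 8-digit hex words, ASCII column", that the lines passed in are contiguous, and that the bytes are stored in the order printed, which is what the ASCII column of this dump suggests. The function names are hypothetical.

```python
import re

# "<16 hex digit address> <word> <word> <word> <word>  [ascii]"
LINE_RE = re.compile(r"^([0-9A-F]{16})((?: [0-9A-F]{8}){4})")


def dump_bytes(dump_text):
    """Concatenate the hex words of consecutive dump lines into one bytes object."""
    chunks = []
    for line in dump_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            chunks.append(bytes.fromhex(m.group(2).replace(" ", "")))
    return b"".join(chunks)


def find_ora_messages(dump_text):
    """Return printable ORA-nnnnn messages embedded in the dumped memory."""
    data = dump_bytes(dump_text)
    return [m.group(0).decode("ascii")
            for m in re.finditer(rb"ORA-\d{5}: [\x20-\x7e]*", data)]


# Two lines copied from the memory dump in this post.
SAMPLE = """\
FFFFFFFF7FFFD520 FFFFFFFF 4F52412D 30313430 333A206E  [....ORA-01403: n]
FFFFFFFF7FFFD530 6F206461 74612066 6F756E64 0A000000  [o DATA found....]
"""

if __name__ == "__main__":
    print(find_ora_messages(SAMPLE))
```

Decoding the words directly also shows the message in its original lower case ("no data found"); the upper-case "DATA" in the ASCII column appears to come from the way the trace was formatted for the blog rather than from the dump itself.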

Checking the information for the failing cursor shows that the bind variables have no values:

Cursor 9 (ffffffff7ca60630): CURBOUND  curiob: ffffffff7c956d20
 curflg: 44 curpar: 0 curusr: 0 curses 5c43999d0
 cursor name: INSERT INTO BACK_SESSION (SESSION_SEQ, LOGIN_TIME, LOGOUT_TIME, SESSION_ID, TIMEOUT, LOGIN_ADDR, SUM_PAY, USER_SEQ, SUM_ACCESS, UNIQ_TOKEN, STATUS) VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11)
 child pin: 0, child LOCK: 5cb8e3a48, parent LOCK: 5cb8e3be0
 xscflg: 110424, parent handle: 5d1e23780, xscfl2: 5200008
Dumping cursor sharing failures: 22000
 bhp SIZE: 160/600
 bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 SIZE=584 offset=0
   No bind buffers allocated
 bind 1: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 SIZE=0 offset=24
   No bind buffers allocated
 bind 2: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 SIZE=0 offset=48
   No bind buffers allocated
 bind 3: dty=1 mxl=128(90) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 SIZE=0 offset=72
   No bind buffers allocated
 bind 4: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 SIZE=0 offset=200
   No bind buffers allocated
 bind 5: dty=1 mxl=128(39) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 SIZE=0 offset=224
   No bind buffers allocated
 bind 6: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 SIZE=0 offset=352
   No bind buffers allocated
 bind 7: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 SIZE=0 offset=376
   No bind buffers allocated
 bind 8: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 SIZE=0 offset=400
   No bind buffers allocated
 bind 9: dty=1 mxl=128(90) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 SIZE=0 offset=424
   No bind buffers allocated
 bind 10: dty=1 mxl=32(03) mal=00 scl=00 pre=00 oacflg=03 oacfl2=10 SIZE=0 offset=552
   No bind buffers allocated
END OF cursor dump
ksedmp: no CURRENT context area
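
The same check can be scripted for cursors with many binds. The Python below is a minimal sketch that assumes the layout of the dump above, where each "bind N:" line is followed by a "No bind buffers allocated" line when no value is present; the function name is hypothetical.

```python
import re

BIND_RE = re.compile(r"^\s*bind (\d+): dty=(\d+)")


def binds_without_buffers(cursor_dump):
    """Return (total bind count, positions of binds with no buffer allocated)."""
    total, missing, last = 0, [], None
    for line in cursor_dump.splitlines():
        m = BIND_RE.match(line)
        if m:
            last = int(m.group(1))
            total += 1
        elif "No bind buffers allocated" in line and last is not None:
            missing.append(last)
            last = None
    return total, missing


# The first three binds copied from the cursor dump in this post.
SAMPLE = """\
 bind 0: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 SIZE=584 offset=0
   No bind buffers allocated
 bind 1: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 SIZE=0 offset=24
   No bind buffers allocated
 bind 2: dty=2 mxl=22(22) mal=00 scl=00 pre=00 oacflg=03 oacfl2=0 SIZE=0 offset=48
   No bind buffers allocated
"""

if __name__ == "__main__":
    total, missing = binds_without_buffers(SAMPLE)
    print(f"{len(missing)} of {total} binds have no buffer allocated")
```

When every position is reported as missing, as in this trace, the server had no bind data at all for the statement at the time of the error.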

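Since the vast majority of ORA-600[12333] errors are related to communication or network-layer anomalies, the information above suggests that the problem arose while the session was processing bind variables and could not find their data. Missing bind data most likely means that something went wrong during the exchange between client and server and data was lost.
Since no known bug resembles this behavior, there is no definite fix either. Given that the error is extremely rare, having appeared only once in several years, it can reasonably be ignored.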

Posted in BUG

ORA-7445(opipls) Error

A customer's 9206 database hit an ORA-7445 error.
Error message:

Fri Apr 17 09:11:24 2009
Errors IN file /opt/app/admin/orcl/udump/orcl1_ora_15039.trc:
ORA-07445: exception encountered: core dump [000000010102BD1C] [SIGSEGV] [Address NOT mapped TO object] [0x000000004] [] []
Fri Apr 17 09:11:25 2009
Trace dumping IS performing id=[cdmp_20090417091125]

Details of the error:

*** 2009-04-17 09:11:24.746
*** SESSION ID:(107.25) 2009-04-17 09:11:24.738
Exception signal: 11 (SIGSEGV), code: 1 (Address NOT mapped TO object), addr: 0x4, PC: [0x10102bd1c, 000000010102BD1C]
*** 2009-04-17 09:11:24.746
ksedmp: internal OR fatal error
ORA-07445: exception encountered: core dump [000000010102BD1C] [SIGSEGV] [Address NOT mapped TO object] [0x000000004] [] []
CURRENT SQL statement FOR this SESSION:
SELECT /*+ USE_NL(store) USE_NL(dn) INDEX(store EI_ATTRSTORE) ORDERED */ store.eid,AttrName,NVL(AttrVal,' '),attrkind,NVL(attrstype, ' '),NVL(AttrVer,' ') FROM C_D dn, ds_attr store WHERE (dn.rdn = :szName AND dn.parentdn = :szDomain) AND store.eid = dn.eid
----- PL/SQL Call Stack -----
  object      line  object
  handle    NUMBER  name
5dbc4a990       244  package body ODS.OLADD
5df8530c0         1  anonymous block
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedmp()+328         CALL     ksedst()             00000000B ? 000000000 ?
                                                   000000000 ? 00000004A ?
                                                   FFFFFFFF7FFF00F8 ?
                                                   1032E18E8 ?
ssexhd()+676         CALL     ksedmp()             000103705 ? 103705000 ?
                                                   103705468 ? 10370A000 ?
                                                   000102C00 ? 000000000 ?
sigacthandler()+44   PTR_CALL 0000000000000000     00010370D ?
                                                   FFFFFFFF7FFF7190 ?
                                                   10370D000 ? 10370A620 ?
                                                   000000000 ? 10370D578 ?
opipls()+1180        PTR_CALL 0000000000000000     00000000B ?
                                                   FFFFFFFF7FFF7190 ?
                                                   FFFFFFFF7FFF6EB0 ?
                                                   10382B700 ? 000000000 ?
                                                   FFFFFFFFFFFFFF4A ?
opiodr()+1688        PTR_CALL 0000000000000000     000000000 ? 000000001 ?
                                                   FFFFFFFF7C94EE98 ?
                                                   000000005 ? 000000002 ?
                                                   103705808 ?
rpidrus()+144        CALL     opiodr()             000103400 ? 10102B880 ?
                                                   102EB849A ? 103705808 ?
                                                   103707D40 ?
                                                   FFFFFFFF7FFF7FA0 ?
skgmstack()+156      PTR_CALL 0000000000000000     00000000B ? 000000066 ?
                                                   103705808 ?
                                                   FFFFFFFF7CA6A7E0 ?
                                                   FFFFFFFF7FFF81B0 ?
                                                   000103400 ?
rpidru()+160         CALL     skgmstack()          FFFFFFFF7FFF83D8 ?
                                                   1037051F0 ? 00000F618 ?
                                                   10022A300 ?
                                                   FFFFFFFF7FFF8400 ?
                                                   00193EAA4 ?
rpiswu2()+384        PTR_CALL 0000000000000000     FFFFFFFF7FFF8AD8 ?
                                                   FFFFFFFF7FFF8D70 ?
                                                   00000000C ? 000000410 ?
                                                   000103705 ? 00010022A ?
rpidrv()+1432        CALL     rpiswu2()            5C42DBC48 ? 000103705 ?
                                                   103705690 ? 1037056C8 ?
                                                   000000000 ? 10329F000 ?
psddr0()+156         CALL     rpidrv()             000100000 ? 000110424 ?
                                                   FFFFFFFF7FFF89DC ?
                                                   00000003A ? 5C42DBC48 ?
                                                   000100000 ?
psdnal()+344         CALL     psddr0()             103705468 ? 102EBA518 ?
                                                   FFFFFFFF7FFF8D70 ?
                                                   1037056C8 ? 000000140 ?
                                                   103705808 ?
pevm_EXECC()+324     PTR_CALL 0000000000000000     FFFFFFFF7FFFB100 ?
                                                   FFFFFFFF7FFFB278 ?
                                                   000001B58 ?
                                                   FFFFFFFF7C94EE98 ?
                                                   5DBC4A990 ? 000000001 ?
pfrrun()+3244        CALL     pevm_EXECC()         000000000 ? 103814EC8 ?
                                                   000000000 ?
                                                   FFFFFFFF7CA6A778 ?
                                                   000000000 ?
                                                   FFFFFFFF7C94EE98 ?
peicnt()+268         CALL     pfrrun()             00000122C ?
                                                   FFFFFFFF7FFFB100 ?
                                                   FFFFFFFF7CA6A778 ?
                                                   FFFFFFFF7CA6A7E0 ?
                                                   5C42DC6A8 ? 103705808 ?
kkxexe()+524         CALL     peicnt()             FFFFFFFF7FFFB100 ?
                                                   FFFFFFFF7CA6A778 ?
                                                   000000009 ? 103829160 ?
                                                   000102C00 ?
                                                   FFFFFFFF7FFFAF78 ?
opiexe()+9256        CALL     kkxexe()             000103400 ? 000000000 ?
                                                   00000FFFB ?
                                                   FFFFFFFF7CA6A778 ?
                                                   000000000 ? 103705468 ?
opiall0()+1776       CALL     opiexe()             00000002E ? 10370D808 ?
                                                   FFFFFFFF7CA60680 ?
                                                   10370D800 ?
                                                   FFFFFFFF7C953850 ?
                                                   103705808 ?
kpoal8()+1040        CALL     opiall0()            000000000 ? 00000005E ?
                                                   FFFFFFFF7FFFC218 ?
                                                   103705808 ?
                                                   FFFFFFFF7C953850 ?
                                                   FFFFFFFF7FFFC5A8 ?
opiodr()+1688        PTR_CALL 0000000000000000     000000000 ? 000000000 ?
                                                   FFFFFFFF7FFFEA60 ?
                                                   000000024 ? 000000000 ?
                                                   0000022B0 ?
ttcpip()+1556        PTR_CALL 0000000000000000     000103400 ? 100FBBFC0 ?
                                                   10370D808 ? 103705808 ?
                                                   103707D40 ?
                                                   FFFFFFFF7FFFCC00 ?
opitsk()+984         CALL     ttcpip()             10370D800 ? 000000014 ?
                                                   FFFFFFFF7FFFEA60 ?
                                                   000000000 ? 000000000 ?
                                                   FFFFFFFF7FFFDD4C ?
opiino()+1572        CALL     opitsk()             000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   103707D28 ?
                                                   FFFFFFFF7FFFEBB4 ?
opiodr()+1688        PTR_CALL 0000000000000000     000380007 ? 10370C658 ?
                                                   1037F9458 ?
                                                   FFFFFFFF7FFFF8F0 ?
                                                   000000000 ? 5C1447BA0 ?
opidrv()+736         CALL     opiodr()             000103400 ? 10100C380 ?
                                                   10370D808 ? 103705808 ?
                                                   103707D40 ?
                                                   FFFFFFFF7FFFF410 ?
sou2o()+16           CALL     opidrv()             000000000 ? 000000004 ?
                                                   1037051EC ? 00000003C ?
                                                   1037056C8 ? 000103400 ?
main()+184           CALL     sou2o()              FFFFFFFF7FFFF910 ?
                                                   00000003C ? 000000004 ?
                                                   FFFFFFFF7FFFF8F0 ?
                                                   000039E70 ? 000000000 ?
_start()+380         CALL     main()               000000002 ?
                                                   FFFFFFFF7FFFFA58 ?
                                                   FFFFFFFF7FFFFA70 ?
                                                   000000000 ? 000000000 ?
                                                   100000000 ?
--------------------- Binary Stack Dump ---------------------

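Judging from the error information, this problem closely resembles the one described in Bug 2662683 – Heap corruption from schema name overwriting memory in PLSQL [ID 2662683.8]. The session dump in particular appears to show the name overwrite that the bug note describes: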

Argument/Register addr=0x0000000103824760.
Dump of memory from 0x0000000103824720 to 0x0000000103824860
103824720 00000000 000028D1 00000000 00000000  [......(.........]
103824730 000028D0 00000000 00000001 03705470  [..(..........pTp]
103824740 00000001 03820698 10B38F00 000028B9  [..............(.]
103824750 00000000 00000000 00000001 032FF0B0  [............./..]
103824760 0000ABAB 00000000 00000001 03824770  [..............Gp]
103824770 00000001 03708CF0 0000FF80 00000000  [.....p..........]
103824780 00000005 C42DBC48 FFFFFFFF 7C950080  [.....-.H....|...]
103824790 FFFFFFFF 7C952E68 00000000 00000000  [....|..h........]
1038247A0 00000000 00000000 00021203 00000000  [................]
1038247B0 00000000 00000000 0000FF80 73657373  [............sess]
1038247C0 696F6E20 68656170 00000000 7FFF7FFF  [ion heap........]
1038247D0 7FFF0098 00000000 00000000 00000000  [................]
1038247E0 00000000 00000038 FFFFFFFF 7C93EBD0  [.......8....|...]
1038247F0 FFFFFFFF 7CA6A368 00000000 00000058  [....|..h.......X]
103824800 FFFFFFFF 7C953648 FFFFFFFF 7C953648  [....|.6H....|.6H]
103824810 00000000 00000098 00000001 03824818  [..............H.]
103824820 00000001 03824818 00000000 000000A8  [......H.........]
103824830 00000001 03824830 00000001 03824830  [......H0......H0]
103824840 00000000 00000118 00000001 03824848  [..............HH]
103824850 00000001 03824848 00000000 000001B0  [......HH........]
Argument/Register addr=0x00000005DBC4AAB0.
Dump of memory from 0x00000005DBC4AA70 to 0x00000005DBC4ABB0
5DBC4AA70 00000001 00000000 0000000A 00000000  [................]
5DBC4AA80 00000005 DBC4AA80 00000005 DBC4AA80  [................]
5DBC4AA90 00000024 00000000 00000005 DBC4AA98  [...$............]
5DBC4AAA0 00000005 DBC4AA98 00000000 00000000  [................]
5DBC4AAB0 F61D1590 4C434E43 3839494F EC7F614B  [....LCNC89IO..aK]
5DBC4AAC0 01000000 00000000 00786806 1013312A  [.........xh...1*]
5DBC4AAD0 00000000 00000000 00000000 00000005  [................]
5DBC4AAE0 03000000 00000000 00000000 00000000  [................]
5DBC4AAF0 4F4C4144 444F4453 00000000 00000000  [OLADDODS........]
5DBC4AB00 00000000 00000000 00000000 00000000  [................]

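This bug was fixed in 9.2.0.3, but it may have been reintroduced in 9.2.0.6, and any version below 10.1.0.2 can hit it. Oracle provides one-off patches for a few platforms and versions; otherwise the only way to avoid the error is to upgrade to 10.1.0.2 or later.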
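
When checking a memory dump for overwritten names, it can help to decode the hex words back into text rather than rely only on the bracketed column. The following is a minimal sketch, assuming each dump line has the form shown above (an address, four 32-bit hex words, then the bracketed text) and that the platform is big-endian, so bytes read left to right. The function name `decode_dump_line` is made up for this illustration.

```python
def decode_dump_line(line):
    """Return (address, ascii_text) for one 'ADDR W1 W2 W3 W4  [text]' dump line."""
    # Everything before the first '[' is the address plus the hex words.
    hex_part = line.split("[", 1)[0]
    fields = hex_part.split()
    address, words = fields[0], fields[1:]
    # Join the words and turn them into raw bytes (assumes big-endian order).
    raw = bytes.fromhex("".join(words))
    # Printable ASCII is kept; everything else is shown as '.', as in the trace.
    text = "".join(chr(b) if 32 <= b < 127 else "." for b in raw)
    return address, text


if __name__ == "__main__":
    sample = "5DBC4AAF0 4F4C4144 444F4453 00000000 00000000  [OLADDODS........]"
    print(decode_dump_line(sample))
```

Run against the lines of the dump, this makes strings such as `session heap` and `OLADDODS` easy to spot and compare with the names expected in that memory region.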

Posted in BUG

ORA-600 (kghpih:ds) error

A 9.2.0.6 database hit this error while gathering statistics.
The error messages are as follows:

Fri Oct 14 05:00:52 2011
Errors in file /opt/app/admin/orcl/bdump/orcl2_p010_16736.trc:
ORA-00600: internal error code, arguments: [kghpih:ds], [0x494217CE8], [], [], [], [], [], []
Fri Oct 14 05:00:53 2011
Trace dumping is performing id=[cdmp_20111014050053]
Fri Oct 14 05:00:53 2011
Errors in file /opt/app/admin/orcl/bdump/orcl2_j000_16586.trc:
ORA-12012: error on auto execute of job 291
ORA-12801: error signaled in parallel query server P010, instance server3:orcl2 (2)
ORA-00600: internal error code, arguments: [kghpih:ds], [0x494217CE8], [], [], [], [], [], []
ORA-06512: at "SYS.DBMS_STATS", line 10070
ORA-06512: at "SYS.DBMS_STATS", line 10564
ORA-06512: at "SYS.DBMS_STATS", line 10751
ORA-06512: at "SYS.DBMS_STATS", line 10805
ORA-06512: at "SYS.DBMS_STATS", line 10782
ORA-06512: at "SYS.PROC_ANALYZE", line 4
ORA-06512: at line 1

The corresponding trace information is:

*** 2011-10-14 05:00:52.427
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kghpih:ds], [0x494217CE8], [], [], [], [], [], []
No current SQL statement being executed.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp()+328         CALL     ksedst()             00000000B ? 000000000 ?
                                                   000000000 ? 00000004A ?
                                                   FFFFFFFF7FFF8A88 ?
                                                   1032E18E8 ?
kgerinv()+184        PTR_CALL 0000000000000000     000103705 ? 103705000 ?
                                                   103705468 ? 10370A000 ?
                                                   000102C00 ? 000000000 ?
kgesinv()+20         CALL     kgerinv()            1037056C8 ? 1037F9968 ?
                                                   0000013C8 ? 000000001 ?
                                                   1037077D4 ? 103706A98 ?
kgesin()+28          CALL     kgesinv()            1037056C8 ? 1037F9968 ?
                                                   1034D9E48 ? 000000001 ?
                                                   FFFFFFFF7FFFC430 ?
                                                   00000BFA0 ?
kghpih()+340         CALL     kghnerror()          1037056C8 ? 1037F9968 ?
                                                   1034D9E48 ? 000000001 ?
                                                   000000002 ? 494217CE8 ?
kgllkal()+808        CALL     kghpih()             1037056C8 ? 498D5D470 ?
                                                   000000000 ? 000000001 ?
                                                   49052A780 ? 49052A770 ?
kglget()+720         CALL     kgllkal()            48C8BE440 ? 49052A750 ?
                                                   000000003 ? 000000001 ?
                                                   000000006 ? 103706A98 ?
kkslce()+124         CALL     kglget()             000000000 ?
                                                   FFFFFFFF7FFFD4B0 ?
                                                   000000000 ? 49052A750 ?
                                                   4864BB1C0 ? 48DC28988 ?
kkschkcsr()+108      CALL     kkslce()             1037056C8 ? 49052ADB0 ?
                                                   FFFFFFFF7FFFD4B0 ?
                                                   FFFFFFFF7CA6AAB0 ?
                                                   FFFFFFFF7FFFD5D8 ?
                                                   FFFFFFFF7FFFCA90 ?
kksscl()+988         CALL     kkschkcsr()          1037056C8 ?
                                                   FFFFFFFF7CA6AAB0 ?
                                                   4906247E0 ? 000000008 ?
                                                   49052AD98 ? 000000000 ?
kksfbc()+3020        CALL     kksscl()             49052AD98 ? 000000006 ?
                                                   000000000 ?
                                                   FFFFFFFF7FFFD4B0 ?
                                                   4906247E0 ? 000000000 ?
kkspsc0()+976        CALL     kksfbc()             000000000 ?
                                                   FFFFFFFF7FFFD4B0 ?
                                                   FFFFFFFF7FFFD64F ?
                                                   FFFFFFFF7FFFD1EC ?
                                                   FFFFFFFF7FFFD1F0 ?
                                                   002000001 ?
opiosq0()+924        CALL     kkspsc0()            FFFFFFFF7CA603B0 ?
                                                   000000005 ? 000000000 ?
                                                   4904CD9F8 ? 000000002 ?
                                                   103705808 ?
kxfxsp1()+472        CALL     opiosq()             100FF8000 ? 1037079C8 ?
                                                   000000E0D ? 1037F56E0 ?
                                                   1037077D4 ? 000000016 ?
kxfxspPO()+472       CALL     kxfxsp1()            1037F56E0 ? 000000E0D ?
                                                   000000000 ? 000000002 ?
                                                   000000000 ? 000001BF8 ?
kxfxsp()+292         CALL     kxfxspPO()           FFFFFFFF7FFFE1D8 ?
                                                   000103000 ?
                                                   FFFFFFFF7CA5FC08 ?
                                                   000000000 ? 000000000 ?
                                                   000100000 ?
kxfxmai()+960        CALL     kxfxsp()             4863FECA0 ? 48DC3E6D0 ?
                                                   0001003CC ? 103705468 ?
                                                   000000002 ? 000000000 ?
kxfprdp()+2456       PTR_CALL 0000000000000000     103803080 ? 103705808 ?
                                                   000000001 ? 000000000 ?
                                                   000000001 ? 000000000 ?
opirip()+1152        CALL     kxfprdp()            49AD362E8 ? 000003C00 ?
                                                   000102E55 ? 103705808 ?
                                                   000000010 ? 103707D40 ?
opidrv()+1012        CALL     opirip()             000000000 ? 000000010 ?
                                                   000000000 ? 103705468 ?
                                                   10370A610 ? 000000001 ?
sou2o()+16           CALL     opidrv()             000000000 ? 000000000 ?
                                                   1037051EC ? 000000032 ?
                                                   1037056C8 ? 000103400 ?
main()+304           CALL     sou2o()              FFFFFFFF7FFFF980 ?
                                                   000000032 ? 000000000 ?
                                                   000000000 ? 000039E70 ?
                                                   000000000 ?
_start()+380         CALL     main()               000000001 ?
                                                   FFFFFFFF7FFFFAC8 ?
                                                   FFFFFFFF7FFFFAD8 ?
                                                   000000000 ? 000000000 ?
                                                   100000000 ?
--------------------- Binary Stack Dump ---------------------

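According to MOS note ALERT: ORA-600 [kghpih:ds] After Applying 9.2.0.6 [ID 310939.1], this is a bug introduced by the 9.2.0.6 patch set: when two sessions lock the same cursor at the same time, one of them fails with ORA-600.
Oracle fixed the bug in 9.2.0.7; alternatively, applying the patch for Bug 415771 on 9.2.0.6 also avoids the error.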
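
Matching an ORA-600 against a MOS note usually starts from the error's first argument, so it can be handy to pull the arguments out of alert log lines automatically. Below is a minimal sketch, assuming the arguments always appear as a bracketed, comma-separated list as in the log above; the function name `ora600_arguments` is made up for this illustration.

```python
import re


def ora600_arguments(line):
    """Return the non-empty bracketed arguments of an ORA-00600 line."""
    if "ORA-00600" not in line:
        return []
    # Each argument is printed as [value]; empty slots appear as [].
    return [arg for arg in re.findall(r"\[([^\]]*)\]", line) if arg]


if __name__ == "__main__":
    sample = ("ORA-00600: internal error code, arguments: "
              "[kghpih:ds], [0x494217CE8], [], [], [], [], [], []")
    print(ora600_arguments(sample))
```

The first element (`kghpih:ds` here) is the string to search for; later arguments such as the address tend to vary between occurrences.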

Posted in BUG

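RAC instances fail to start after a restart with CRS-1019

A customer reported that after restarting the RAC environment the cluster came up normally, but the database instances did not start.
According to the customer's description over the phone, Oracle tried to start instance 2 on node 1 and instance 1 on node 2, which resulted in CRS-1019.
It was hard to get anything really meaningful out of that description, so I asked the customer to send me the detailed error output: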

oracle@orcl1:/home/oracle>crs_start -all
Attempting to start `ora.orcl.orcl1.inst` on member `orcl1`
Attempting to start `ora.orcl.orcl2.inst` on member `orcl2`
Start of `ora.orcl.orcl1.inst` on member `orcl1` failed.
orcl2 : CRS-1019: Resource ora.orcl.orcl1.inst (application) cannot run on orcl2
Start of `ora.orcl.orcl2.inst` on member `orcl2` failed.
orcl1 : CRS-1019: Resource ora.orcl.orcl2.inst (application) cannot run on orcl1
Attempting to start `ora.orcl.db` on member `orcl1`
Start of `ora.orcl.db` on member `orcl1` failed.
Attempting to start `ora.orcl.db` on member `orcl2`
Start of `ora.orcl.db` on member `orcl2` failed.
CRS-1006: No more members to consider
CRS-0215: Could not start resource 'ora.orcl.db'.
CRS-0215: Could not start resource 'ora.orcl.orcl1.inst'.
CRS-0215: Could not start resource 'ora.orcl.orcl2.inst'.

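Clearly the CRS-1019 error the customer mentioned is not the cause of the problem. The most meaningful part of the output above is: Start of `ora.orcl.orcl1.inst` on member `orcl1` failed. The message that follows, saying instance 1 cannot run on node 2, is only informational; it does not mean that Oracle tried to start instance 1 on node 2.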
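
The same reading of the output can be expressed as a small filter that keeps only the "Start of ... failed" lines and ignores the CRS-1019 notices. This is a minimal sketch, assuming the message wording shown in the output above; the name `failed_starts` is made up for this illustration, and matching is case-insensitive because the keyword case in pasted logs may vary.

```python
import re

# Matches: Start of `resource` on member `node` failed.
START_FAILED = re.compile(
    r"start of `([^`]+)` on member `([^`]+)` failed", re.IGNORECASE
)


def failed_starts(output):
    """Return (resource, member) pairs for every failed start in the output."""
    return [match.groups() for match in START_FAILED.finditer(output)]


if __name__ == "__main__":
    sample = "\n".join([
        "Attempting to start `ora.orcl.orcl1.inst` on member `orcl1`",
        "Start of `ora.orcl.orcl1.inst` on member `orcl1` failed.",
        "orcl2 : CRS-1019: Resource ora.orcl.orcl1.inst (application) cannot run on orcl2",
    ])
    print(failed_starts(sample))
```

Each pair names a resource and the node where the start was actually attempted, which is where the investigation should begin.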
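So the task is simple: find out why the instance cannot start. I asked the customer what the database alert log recorded, and was told that it contained only an instance startup message, with no errors and nothing else written.
Sometimes, when an instance is started through a tool, the error really is not written to the alert log, so I asked the customer to start the database directly with STARTUP in sqlplus. This time we got a clear error message: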

oracle@orcl2:/u01/app/oracle/admin/orcl/bdump>sqlplus / as sysdba
SQL*Plus: Release 10.2.0.5.0 - Production on Fri Nov 2 17:44:24 2012
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup mount;
ORA-02194: event specification syntax error 230 (minor error 215) near 'OFF'

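Clearly the cause is a syntax error in the EVENT setting in the SPFILE. This also explains why the alert log recorded no error: Oracle hit the error while parsing the initialization parameters, before the actual startup process had begun.
The rest was simple: have the customer manually create a PFILE, correct the EVENT syntax or comment the setting out for the time being, then regenerate the SPFILE and restart the database.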
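
The "comment it out for the time being" step can be sketched as a small text filter over the PFILE, typically run between a `CREATE PFILE FROM SPFILE` and a `CREATE SPFILE FROM PFILE`. This is only an illustration under stated assumptions: the function name `comment_out_events` and the sample parameter values are hypothetical, and each `event` setting is assumed to sit on a single line, optionally prefixed with `*.` or an instance name.

```python
import re

# Matches lines such as "*.event=..." or "orcl1.event=..." or "event=...".
EVENT_LINE = re.compile(r"^\s*(?:[\w*]+\.)?event\s*=", re.IGNORECASE)


def comment_out_events(pfile_text):
    """Return the PFILE text with every single-line event setting commented out."""
    result = []
    for line in pfile_text.splitlines():
        # '#' starts a comment in a text parameter file.
        result.append("# " + line if EVENT_LINE.match(line) else line)
    return "\n".join(result)


if __name__ == "__main__":
    # Hypothetical PFILE content, not taken from the customer's system.
    sample = "*.db_name='orcl'\n*.event='hypothetical broken value'"
    print(comment_out_events(sample))
```

Commenting the line out rather than deleting it keeps the original value in the file, so the correct EVENT syntax can be restored later.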
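I thought the problem was solved, but before long the customer called again. This time instance 2 had started normally, but instance 1 still had a problem: starting it directly in SQLPLUS raised no error, yet it could not be started through crs_start: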

2012-11-02 18:38:55.460: [  CRSRES][11628]32ora.orcl.orcl1.inst target set to OFFLINE before stop action
2012-11-02 18:38:55.460: [  CRSRES][11628]32StopResource: setting CLI values
2012-11-02 18:38:55.471: [  CRSRES][11628]32Target set to OFFLINE for `ora.orcl.orcl1.inst`
2012-11-02 18:40:07.862: [  CRSRES][11633]32startRunnable: setting CLI values
2012-11-02 18:40:07.867: [  CRSRES][11633]32Attempting to start `ora.orcl.orcl1.inst` on member `orcl1`
2012-11-02 18:40:09.194: [  CRSAPP][11633]32StartResource error for ora.orcl.orcl1.inst error code = 1
2012-11-02 18:40:09.853: [  CRSRES][11633]32Start of `ora.orcl.orcl1.inst` on member `orcl1` failed.
2012-11-02 18:40:09.865: [  CRSRES][11633]32orcl2 : CRS-1019: Resource ora.orcl.orcl1.inst (application) cannot run on orcl2

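At first I suspected that the initorcl1.ora file under ORACLE_HOME/dbs was wrong and did not point to the correct SPFILE, but the customer checked and found no problem.
Since startup from SQLPLUS worked while startup through CRS_START failed, I suspected that some configuration in the OCR was abnormal, and asked the customer to check the output of the SRVCTL CONFIG command: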

oracle@orcl1:/u01/app/oracle/product/10.2.0/crs/log/orcl1/crsd>srvctl config database -d orcl -a
orcl1 orcl1 /u01/app/oracle/product/10.2.0/db 
orcl2 orcl2 /u01/app/oracle/product/10.2.0/db 
DB_UNIQUE_NAME: orcl 
DB_NAME: orcl 
ORACLE_HOME: /u01/app/oracle/product/10.2.0/db 
SPFILE: /dev/rspfile 
DOMAIN: NULL 
DB_ROLE: NULL 
START_OPTIONS: NULL 
POLICY:  AUTOMATIC 
ENABLE FLAG: DB ENABLED, INST DISABLED ON orcl1

Clearly, instance 1 is disabled in the OCR configuration, which is why it could not be started through CRS_START.
Run the following command:

srvctl enable instance -d orcl -i orcl1

Problem solved.
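
The check that found the culprit can be scripted so a disabled instance is noticed before anyone reads CRS logs. The following is a minimal sketch, assuming the `ENABLE FLAG` line has the wording shown in the srvctl output above; the function name `disabled_instances` is made up for this illustration, and how several disabled instances would be listed is an assumption.

```python
import re


def disabled_instances(config_output):
    """Return the instance names reported as disabled on the ENABLE FLAG line."""
    names = []
    for line in config_output.splitlines():
        if line.strip().upper().startswith("ENABLE FLAG:"):
            # Assumed wording: "INST DISABLED ON <name>", possibly repeated.
            found = re.findall(r"INST DISABLED ON (\S+)", line, re.IGNORECASE)
            names.extend(name.rstrip(",") for name in found)
    return names


if __name__ == "__main__":
    sample = "POLICY:  AUTOMATIC\nENABLE FLAG: DB ENABLED, INST DISABLED ON orcl1"
    print(disabled_instances(sample))
```

An empty result means no instance is flagged as disabled, and the cause has to be looked for elsewhere.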

Posted in ORACLE

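OLAP User's Guide

This document introduces the basic concepts of OLAP.
Oracle's OLAP component has existed as a separate option since 9i, but I had never looked into it. Since some customers use it, and I recently found that migrating the OLAP component's schema is not as simple as migrating other database schemas, it is worth learning the basics.
Online documentation: http://www.oracle.com/pls/db112/to_toc?pathname=olap.112%2Fe17123%2Ftoc.htm&remark=portal+%28Books%29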

Posted in BOOKS

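ODA Getting Started Guide summary

Compared with the Oracle Database documentation set, this document is closest to the Administrator's Guide.
ODA is an engineered appliance that includes both software and hardware. The document therefore focuses mainly on software but also covers some hardware, such as the network interfaces and the warning indicators on the front and back panels.
Managing an ODA is simple, so this document covers most of its day-to-day operation and maintenance, including preparation before deployment, installation, maintenance, high availability, and troubleshooting. If you are interested in ODA or have already deployed one, this document is the best introductory manual.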

Posted in BOOKS