Oracle DBA实战攻略:运维管理、诊断优化、高可用与最佳实践——序

印象中很少帮人写序,提笔后才发现这也不是件轻松的事情。
初识周亮应该是2012年10月到杭州出差,经一个杭州的同事介绍认识的,虽然是第一次见面,但是聊得却很投机:一方面大家都是这个圈子里的人,虽然之前没有打过交道,但是有着很多共同的朋友,因此感觉比较亲切;另一方面是我们两个的相似点非常多,因此共同语言也就比较多。大家都是Oracle的DBA,都对Oracle技术感兴趣,也都在Oracle的圈子内积累了不少年头,即便是只聊技术,也能找到说不完的话题。更何况我们两个人的工作性质几乎完全一样。我们两个目前都是从事乙方运维的工作,我是2011年开始从事Oracle数据库的乙方的运维管理,而周亮的乙方运维经验则比我长得多,如果从这个角度讲,他的Oracle数据库运维经验要比我丰富得多,这无疑也使我们增加了更多的共同话题。于是当天晚上,从Oracle数据库聊到了具体的案例,从工作聊到了客户,从技术聊到了团队,估计当天如果在聊的晚一点,就差谈人生和理想了。
也就是在当晚的聊天过程中,我听说他正在写书,而且已经动笔几个月了。当时Eygle的新书刚刚出版不久,而我几个熟悉的朋友也恰好都在写书,所以这方面的话题也比较多,也就多聊了几句,当时对他的新书有了一个第一印象:这本书是根据案例和实践经验整理而成的,当时跟我的感觉应该和DBA手记系列比较相似。
之后的几个月,听说他仍然在笔耕不缀。乙方工作的辛苦自己是深有感触的,能在日常繁忙工作的基础上,把一些知识要点记录下来已经是很少有人能做到的事情了,而还能坚持不断的写作,这绝对是需要毅力才能完成的工作。因为写作和写BLOG是不同的,虽然二者可能都是需要每天挤出半个小时到一个小时的时间,前者要求的整块的时间,否则难以展开思路,而对于后者而言,如果有了什么思路或者碰到了什么案例,可以用很短的时间先记录下来,随后也可以通过零碎的时间进行实验和验证,说起来似乎是同样的一小时时间,前者的坚持和付出要远远高于后者。
到了最近他成书后邀请我写序,我才发现他的书并不是简单的案例集合,而是存在一条主线将各个知识点串联在一起,这在成书的困难程度上就要比DBA手记之类的案例集合至少上了一个台阶。
从严格意义上讲,我没有写过书,只是写过文章。虽然参与了《Oracle数据库性能优化》以及《DBA手记》等书的编写,但形式仍然是供稿方式。这几本书的最大特点在于没有一条明确的主线来穿起所有的内容,书中的各个章节独立,因此成书相对来说要容易得多。而如果要独立写一本书,就需要考虑完整的架构、背景知识、贯穿全书的主线、以及选取的案例是否合适等等,显然这要比纯案例的组合困难得多。而即便是案例类的书籍,成书过程也并不简单。先不说素材、案例的选取,知识点深入程度的把控这些对于技术含量要求很高的事情。只说整体书籍编写工作完成后,对全书的修订和审阅工作也不是一件轻松的事情。因为经历过几次这种不断的修订的工作,每次都是一个很痛苦的过程,以至于当现在为止还记忆犹新。而如果要我独立写一本书,要有明确的主线,要有详尽的基础知识介绍,要覆盖Oracle的主要体系结构点,还要有大量深入的案例作为佐证,最好还要有一些尚未公开的研究结果,光想想就已经足够了。因此,我一直很钦佩那些能独立完成一本书的人,无疑作者也是令我钦佩的大牛中的一员。
作者令我钦佩的不仅仅是技术上的积累,也不只是我提到的为了成书而付出的艰辛,而更令我钦佩的是其对技术几年如一日的持续专注的态度。最近微博上还有人讨论是否应该写书。国内的技术和出版的大环境,决定了靠写技术书籍赚钱绝对是Mission impossible。即使是Eygle这种几乎一年一本的速度,且每本都很畅销的情况,也完全不足以通过写书养活自己,那就更不用说其他人了。通过写书来出名也越来越难,其实很多人是因为作者本身的名气才买的书。因此有位网友说得很有道理,别人知道你是因为你做了什么,而不是你写了什么。因此想要靠写书来成就自己的名气也不是件容易的事情。在现阶段无论是收名还是获利,想通过写书的方式来实现,其投入产出比都非常不好的。但是也正因为如此,以名利为目的作者越来越少,而写书更多的变成了一些有技术追求的人,把自己的技术沉淀、总结和提升的过程。大浪淘沙始得金,我们欣喜的看到,近期出版的和即将要出版的几本Oracle书籍,无不是业内专业人士的呕心力作。而作者的这边《Oracle DBA实战攻略:运维管理、诊断优化、高可用与最佳实践》正是其中之一。
作者根据他多年的运维诊断经验,从数据库如何创建开始,循序渐进的介绍了数据库的启动关闭过程,如何配置监听并连接到数据库,如果对数据库空间进行管理和监控,SGA的调整和优化方法,数据库的CHECKPOINT和SCN机制与备份恢复,数据库性能优化的方法论以及Oracle Data Guard的配置和管理。仅看目录似乎都是非常基础的内容,似乎这是本针对Oracle初学者的入门级书籍,其实恰恰相反,所谓大道至简,能把最基础的东西写出新意并加入自己的理解本身就是一件非常见功底的事情。而且作者结合了大量的真实案例,把自己多年的宝贵经验融入其中,通过一些复杂案例的诊断过程来说明这些简单的原理和知识点,这正是作者高明的地方。更何况作者并没有简单的停留在案例诊断分析的层面上,而是根据大量案例的经验汇总,把问题的优化、诊断和解决提升到了方法论的层面上,这就不仅仅是通过几年经验积累就可以轻易达成的,必要要求作者不断的思考、分析、归纳和验证,才能上升到理论指导实践的层面。
最后希望周亮的《Oracle DBA实战攻略:运维管理、诊断优化、高可用与最佳实践》一书,可以帮助更多的数据库技术爱好者解决日常碰到的技术困难,给那些以日常操作入门的运维人员指出一条深入学习提高的道路。

Posted in BOOKS | Leave a comment

ORA-600(ktrgcm_3)错误

客户10.2.0.5 RAC数据库出现ORA-600[ktrgcm_3]错误。
错误信息如下:

Sun Jul 07 01:06:00 EAT 2013
Errors IN file /oracle/admin/orcl/udump/orcl1_ora_28947.trc:
ORA-00600: internal error code, arguments: [ktrgcm_3], [], [], [], [], [], [], []

详细TRACE如下:

*** ACTION NAME:() 2013-07-07 00:49:57.500
*** MODULE NAME:(g@orcl (TNS V1-V3)) 2013-07-07 00:49:57.500
*** SERVICE NAME:(orcl) 2013-07-07 00:49:57.500
*** SESSION ID:(3332.36419) 2013-07-07 00:49:57.500
*** 2013-07-07 00:49:57.500
ksedmp: internal OR fatal error
ORA-00600: internal error code, arguments: [ktrgcm_3], [], [], [], [], [], [], []
CURRENT SQL statement FOR this SESSION:
INSERT INTO T_B_T_IN (TRADE_ID,SUBSCRIBE_ID,BPM_ID,TRADE_TYPE_CODE,IN_MODE_CODE,PRIORITY,SUBSCRIBE_STATE, NEXT_DEAL_TAG,PRODUCT_ID,BRAND_CODE,USER_ID,CUST_ID,USECUST_ID,ACCT_ID,USER_DIFF_CODE, NET_TYPE_CODE,SERIAL_NUMBER,CUST_NAME,ACCEPT_DATE,ACCEPT_MONTH,TRADE_STAFF_ID, TRADE_DEPART_ID,TRADE_CITY_CODE,TRADE_EPARCHY_CODE,TERM_IP,EPARCHY_CODE,CITY_CODE, OLCOM_TAG,EXEC_TIME,FINISH_DATE,OPER_FEE,FOREGIFT,ADVANCE_PAY,INVOICE_NO,FEE_STATE, FEE_TIME,FEE_STAFF_ID,CANCEL_TAG,CANCEL_DATE,CANCEL_STAFF_ID,CANCEL_DEPART_ID, CANCEL_CITY_CODE,CANCEL_EPARCHY_CODE,CHECK_TYPE_CODE,CHK_TAG,AUDIT_TAG, AUDIT_BATCH_NO,ACTOR_NAME,ACTOR_CERTTYPEID,ACTOR_PHONE,ACTOR_CERTNUM, CONTACT,CONTACT_PHONE,CONTACT_ADDRESS,REMARK,IF_MAINTENANCE, PROVINCE_CODE,TRADE_PROVINCE_CODE,CANCEL_PROVINCE_CODE,RSRV_TAG1) SELECT A.TRADE_ID,SUBSCRIBE_ID,BPM_ID,TRADE_TYPE_CODE,IN_MODE_CODE,PRIORITY,SUBSCRIBE_STATE, NEXT_DEAL_TAG,A.PRODUCT_ID,NVL(D.BRAND_CODE,A.BRAND_CODE), A.USER_ID,A.CUST_ID,A.USECUST_ID,ACCT_ID,A.USER_DIFF_CODE, DECODE(A.PRODUCT_ID,'-1','99',NVL(D.NET_TYPE_CODE,A.NET_TYPE_CODE)),A.SERIAL_NUMBER,CUST_NAME,ACCEPT_DATE,A.ACCEPT_MONTH,TRADE_STAFF_ID, TRADE_DEPART_ID,TRADE_CITY_CODE,TRADE_EPARCHY_CODE,TERM_IP,A.EPARCHY_CODE,A.CITY_CODE, OLCOM_TAG,EXEC_TIME,FINISH_DATE,OPER_FEE,FOREGIFT,ADVANCE_PAY,INVOICE_NO,FEE_STATE, FEE_TIME,FEE_STAFF_ID, CASE WHEN B.TRADE_ID_A IS NOT NULL AND A.CANCEL_DATE IS NULL THEN DECODE(A.CANCEL_TAG,'0','4',A.CANCEL_TAG) ELSE CASE WHEN B.TRADE_ID_A IS NOT NULL AND TRUNC(A.ACCEPT_DATE,'DD')<TRUNC(A.CANCEL_DATE,'DD') THEN DECODE(A.CANCEL_TAG,'3','0',A.CANCEL_TAG) ELSE CASE WHEN TRUNC(A.ACCEPT_DATE,'DD')<TRUNC(A.CANCEL_DATE,'DD') THEN DECODE(A.CANCEL_TAG,'3','0',A.CANCEL_TAG) ELSE DECODE(A.CANCEL_TAG,'1','0','4','0',A.CANCEL_TAG) END END END CANCEL_TAG, CASE WHEN A.CANCEL_TAG IN ('1','4','3') AND TRUNC(A.ACCEPT_DATE,'DD')<TRUNC(A.CANCEL_DATE,'DD') THEN NULL ELSE A.CANCEL_DATE END CANCEL_DATE, CASE WHEN A.CANCEL_TAG IN ('1','4','3') AND TRUNC(A.ACCEPT_DATE,'DD')<TRUNC(A.CANCEL_DATE,'DD') THEN NULL ELSE A.CANCEL_STAFF_ID END CANCEL_STAFF_ID, CASE WHEN A.CANCEL_TAG IN ('1','4','3') AND TRUNC(A.ACCEPT_DATE,'DD')<TRUNC(A.CANCEL_DATE,'DD') THEN NULL ELSE A.CANCEL_DEPART_ID END CANCEL_DEPART_ID, CASE WHEN A.CANCEL_TAG IN ('1','4','3') AND TRUNC(A.ACCEPT_DATE,'DD')<TRUNC(A.CANCEL_DATE,'DD') THEN NULL ELSE A.CANCEL_CITY_CODE END CANCEL_CITY_CODE, CASE WHEN A.CANCEL_TAG IN ('1','4','3') AND TRUNC(A.ACCEPT_DATE,'DD')<TRUNC(A.CANCEL_DATE,'DD') THEN NULL ELSE A.CANCEL_EPARCHY_CODE END CANCEL_EPARCHY_CODE, CHECK_TYPE_CODE,CHK_TAG,AUDIT_TAG, AUDIT_BATCH_NO,ACTOR_NAME,ACTOR_CERTTYPEID,ACTOR_PHONE,ACTOR_CERTNUM, CONTACT,CONTACT_PHONE,CONTACT_ADDRESS,A.REMARK,IF_MAINTENANCE, F_SYS_GETVALUE('get_province_code',NULL,DECODE(LENGTH(A.EPARCHY_CODE),4,A.EPARCHY_CODE,A.TRADE_EPARCHY_CODE),NULL,NULL) PROVINCE_CODE, F_SYS_GETVALUE('get_province_code',NULL,A.TRADE_EPARCHY_CODE,NULL,NULL) TRADE_PROVINCE_CODE, F_SYS_GETVALUE('get_province_code',NULL,A.CANCEL_EPARCHY_CODE,NULL,NULL) CANCEL_PROVINCE_CODE, CASE WHEN C.NEW_STATE_CODE='H' THEN '9130' ELSE CASE WHEN C.OLD_STATE_CODE='H' THEN '9140' END END FROM TF_BH_TRADE A, (SELECT TRADE_ID_A,TRADE_ID_B FROM TF_B_TRADE_REL WHERE RELATION_TYPE_CODE='02' AND STATE IN ('0','A') )B,TF_B_TRADE_STATE_TRANS C,TD_B_PRODUCT D WHERE A.TRADE_ID=B.TRADE_ID_B(+) AND A.TRADE_ID=C.TRADE_ID(+) AND A.PRODUCT_ID = D.PRODUCT_ID(+) AND A.ACCEPT_MONTH = :B3 AND A.ACCEPT_DATE >= :B2 AND A.ACCEPT_DATE < :B1 AND A.TRADE_TYPE_CODE NOT IN ('116','616') AND A.NEXT_DEAL_TAG IN (SELECT VALUEC1 FROM TD_SD_COMMPARA WHERE PARAM_TYPE_CODE ='NEXT_DEAL_TAG') AND EXISTS (SELECT X.TRADE_ID FROM (SELECT B.TRADE_ID TRADE_ID FROM TF_B_TRADE_BATDEAL A,TF_BH_TRADE B WHERE A.TRADE_ID = B.TRADE_ID AND A.CANCEL_TAG = B.CANCEL_TAG AND A.STATE ='2' AND A.B_UPDATE_DATE >= :B2 AND A.B_UPDATE_DATE < :B1 AND A.BATCH_OPER_TYPE IN ('CREATEPREUSER','CREATEUSER','DISCNTCHG','ADDGRPMEMBER','BATREMOVEMEMBER','PRODUCTCHG','STATISTICS','BATCHARGEFEE','SCOREADJUST','CANCELPREUSER') MINUS SELECT TRADE_ID FROM T_B_T_IN ) X WHERE A.TRADE_ID = X.TRADE_ID)
----- PL/SQL Call Stack -----
  object      line  object
  handle    NUMBER  name
c00000142951a9f8       434  PROCEDURE U_S_1.P_S_D_T_IN
c0000013baacb568         1  anonymous block
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedst()+64          CALL     ksedst1()            000000000 ? 000000001 ?
ksedmp()+2176        CALL     ksedst()             000000000 ?
                                                   C000000000000D20 ?
                                                   4000000004066EE0 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ?
ksfdmp()+112         CALL     ksedmp()             000000003 ?
                                                   9FFFFFFFFFFE8EB0 ?
                                                   60000000000BA288 ?
                                                   9FFFFFFFFFFE9480 ?
                                                   C000000000000999 ?
                                                   40000000040AEF50 ?
kgerinv()+304        CALL     ksfdmp()             9FFFFFFFFFFE9A10 ?
                                                   000000003 ?
                                                   9FFFFFFFFFFE9490 ?
                                                   60000000000BA288 ?
                                                   C000000000000612 ?
                                                   4000000009903790 ?
kgeasnmierr()+144    CALL     kgerinv()            60000000000318D0 ?
                                                   4000000001AF87F0 ?
                                                   6000000000032988 ?
                                                   4000000001AF87F0 ?
                                                   9FFFFFFFFFFE9A50 ?
$cold_ktrgcm()+1188  CALL     kgeasnmierr()        60000000000318D0 ?
8                                                  60000000003139F8 ?
                                                   6000000000313A08 ?
                                                   6000000000032D00 ?
                                                   000000000 ? 000000000 ?
                                                   C0000014D3B02A54 ?
                                                   C0000014BC6E8AEC ?
ktrget()+1088        CALL     $cold_ktrgcm()       9FFFFFFFBF349F60 ?
                                                   60000000000BA288 ?
                                                   C000000000000C1D ?
                                                   4000000002FB1560 ?
                                                   00003811B ?
kdirfrs()+8160       CALL     ktrget()             9FFFFFFFBF349F60 ?
                                                   000000000 ?
                                                   C00000089B458014 ?
qerixFetchFastFullS  CALL     kdirfrs()            9FFFFFFFBF349F58 ?
can()+7504                                         60000000000BA288 ?
                                                   4000000003057DA0 ?
                                                   00003825F ?
                                                   9FFFFFFFBF34A070 ?
                                                   00003821F ? 000000000 ?
                                                   C000000000001430 ?
qergiFetch()+816     CALL     qerixFetchFastFullS  C0000013CB1A0CA0 ?
                              can()                4000000001CC4CE0 ?
                                                   9FFFFFFFFFFEB750 ?
                                                   000007FFF ? 000007FFF ?
                                                   C0000013CB1A0CA0 ?
rwsfcd()+256         CALL     qergiFetch()         C0000013CB1A08F8 ?
                                                   4000000001CC4CE0 ?
                                                   9FFFFFFFFFFEB750 ?
                                                   000007FFF ?
qerhjFetch()+912     CALL     rwsfcd()             9FFFFFFFBF34A698 ?
                                                   4000000001CC4CE0 ?
                                                   9FFFFFFFFFFEB750 ?
                                                   000007FFF ?
qerflFetch()+336     CALL     qerhjFetch()         C0000013CB1A0740 ?
                                                   4000000001CCEC70 ?
                                                   9FFFFFFFFFFEB820 ?
                                                   000007FFF ?
qersoFetch()+2080    CALL     qerflFetch()         C0000013CB1A05B0 ?
                                                   4000000001CCEED0 ?
                                                   C000001426EB7FA8 ?
                                                   000007FFF ?
                                                   60000000000BA288 ?
qerseFetch()+400     CALL     qersoFetch()         C000001426EB7FA8 ?
                                                   4000000001CCEBF0 ?
                                                   9FFFFFFFFFFEB880 ?
                                                   000007FFF ?
qervwFetch()+368     CALL     qerseFetch()         C000001426EB7AE8 ?
                                                   9FFFFFFFFFFEB8AC ?
                                                   9FFFFFFFFFFEB8AC ?
                                                   9FFFFFFFBF3DC5DA ?
rwsfcd()+256         CALL     qervwFetch()         C000001426EB79C8 ?
                                                   4000000001CC4CF0 ?
                                                   9FFFFFFFFFFEB8D0 ?
                                                   C000001426EB79C0 ?
qerhjFetch()+1152    CALL     rwsfcd()             9FFFFFFFBF3DC5F8 ?
                                                   4000000001CC4CF0 ?
                                                   9FFFFFFFFFFEB8D0 ?
                                                   000007FFF ?
rwsfcd()+256         CALL     qerhjFetch()         C000001426EB7710 ?
                                                   4000000001CC4D00 ?
                                                   9FFFFFFFFFFEB9A0 ?
                                                   000007FFF ?
qerhjFetch()+1728    CALL     rwsfcd()             9FFFFFFFBF3DC800 ?
                                                   4000000001CC4D00 ?
                                                   9FFFFFFFFFFEB9A0 ?
                                                   000007FFF ?
                                                   60000000000BA288 ?
qerjoFetch()+1168    CALL     qerhjFetch()         C0000013AB146D68 ?
                                                   000000000 ? 000000000 ?
                                                   000000001 ?
                                                   60000000000BA288 ?
                                                   C0000000000012AB ?
rwsfcd()+256         CALL     qerjoFetch()         C000001388B4B060 ?
                                                   4000000001CCEC20 ?
                                                   9FFFFFFFFFFEBA80 ?
                                                   000007FFF ?
qerhjFetch()+1728    CALL     rwsfcd()             9FFFFFFFBF3D7CA8 ?
                                                   4000000001CC4D00 ?
                                                   9FFFFFFFFFFEBAB0 ?
                                                   000007FFF ?
                                                   60000000000BA288 ?
qerjoFetch()+1168    CALL     qerhjFetch()         C0000013FF4C97D0 ?
                                                   000000000 ? 000000000 ?
                                                   000000001 ?
                                                   60000000000BA288 ?
                                                   C0000000000012AB ?
qerflFetch()+336     CALL     qerjoFetch()         C000001380167980 ?
                                                   4000000001CCEC70 ?
                                                   9FFFFFFFFFFEBBC0 ?
                                                   000000001 ?
rwsfcd()+256         CALL     qerflFetch()         C0000013801676A0 ?
                                                   000000000 ? 000000000 ?
                                                   000000001 ?
insfch()+304         CALL     rwsfcd()             9FFFFFFFBF3DAEF8 ?
                                                   000000000 ? 000000000 ?
                                                   000000001 ?
insdrv()+912         CALL     insfch()             9FFFFFFFB9F46F60 ?
                                                   C0000000000019BB ?
                                                   60000000000BA288 ?
                                                   40000000031630F0 ?
                                                   00003821D ?
                                                   60000000000C6FE0 ?
                                                   60000000000C6F18 ?
                                                   4000000001C116A0 ?
inscovexe()+1408     CALL     insdrv()             9FFFFFFFFFFEC390 ?
                                                   9FFFFFFFFFFEBBF0 ?
                                                   60000000000BA288 ?
                                                   40000000031B6160 ?
                                                   00003825B ?
                                                   9FFFFFFFFFFEBBF4 ?
insExecStmtExecIniE  CALL     inscovexe()          C0000013E1A08250 ?
ngine()+176                                        C0000013C3B02F08 ?
                                                   9FFFFFFFFFFECB40 ?
insexe()+1040        CALL     insExecStmtExecIniE  C0000013E1A08250 ?
                              ngine()              C0000013C3B02F08 ?
                                                   9FFFFFFFFFFED3D0 ?
                                                   60000000000BA288 ?
                                                   C000000000000FA5 ?
                                                   4000000003166C80 ?
opiexe()+7760        CALL     insexe()             C0000013BE5AB8C0 ?
                                                   9FFFFFFFFFFED3D0 ?
                                                   9FFFFFFFFFFECB50 ?
                                                   60000000000BA288 ?
                                                   9FFFFFFFFFFED0D0 ?
                                                   C0000000000025D3 ?
                                                   4000000002F87610 ?
                                                   00003821B ?
opipls()+3680        CALL     opiexe()             9FFFFFFFFFFEED70 ?
                                                   4000000003119420 ?
                                                   60000000000C6CE0 ?
                                                   9FFFFFFFFFFED0F0 ?
                                                   60000000000BA288 ?
                                                   00002DE7D ?
                                                   C00000000000224C ?
                                                   001121800 ?
opiodr()+2144        CALL     opipls()             9FFFFFFFFFFEF680 ?
                                                   4000000002F58A60 ?
                                                   00002E0B7 ?
                                                   9FFFFFFFFFFEEDD0 ?
                                                   60000000000BA288 ?
                                                   000000001 ?
rpidrus()+368        CALL     opiodr()             000000066 ? 000000006 ?
                                                   4000000001C14510 ?
                                                   0000046C0 ?
                                                   9FFFFFFFFFFEF690 ?
                                                   60000000000BA288 ?
skgmstack()+288      CALL     rpidrus()            9FFFFFFFFFFF1DD0 ?
                                                   9FFFFFFFFFFF1810 ?
                                                   60000000000BA288 ?
                                                   9FFFFFFFFFFF1D90 ?
                                                   C000000000000716 ?
                                                   4000000002F92F80 ?
                                                   00002E07F ?
                                                   9FFFFFFFFFFF1870 ?
rpidru()+240         CALL     skgmstack()          9FFFFFFFFFFF1DF0 ?
                                                   6000000000030F50 ?
                                                   00000F618 ?
                                                   4000000001CCBAD0 ?
                                                   9FFFFFFFFFFF1DD0 ?
rpiswu2()+976        CALL     rpidru()             00000057B ?
                                                   C000000000001026 ?
                                                   C000000000001026 ?
                                                   4000000002F8FD10 ?
                                                   9FFFFFFFFFFF1DE8 ?
                                                   9FFFFFFFFFFF2C10 ?
rpidrv()+2352        CALL     rpiswu2()            9FFFFFFFFFFF2B00 ?
                                                   4000000002F92360 ?
                                                   00002F933 ?
                                                   9FFFFFFFFFFF2590 ?
                                                   60000000000329A8 ?
                                                   C000000000001ABD ?
                                                   4000000000F32248 ?
                                                   60000000000C6CE0 ?
psddr0()+464         CALL     rpidrv()             000000008 ? 000000066 ?
                                                   9FFFFFFFFFFF3710 ?
                                                   000000038 ?
                                                   9FFFFFFFFFFF2B30 ?
                                                   60000000000BA288 ?
psdnal()+736         CALL     psddr0()             000000008 ? 000000066 ?
                                                   9FFFFFFFFFFF3710 ?
                                                   000000030 ?
pevm_EXECC()+832     CALL     psdnal()             9FFFFFFFFFFF54E0 ?
                                                   C0000014FF9104E0 ?
                                                   C0000000000011AA ?
                                                   9FFFFFFFBF3D3060 ?
                                                   C00000142951A9F8 ?
                                                   4000000003159470 ?
                                                   00002C835 ?
pfrinstr_EXECC()+16  CALL     pevm_EXECC()         9FFFFFFFBF3D9F40 ?
0                                                  9FFFFFFFBF3D30D0 ?
                                                   000000020 ?
pfrrun_no_tool()+19  CALL     pfrinstr_EXECC()     9FFFFFFFBF3D9F40 ?
2                                                  C0000013A9EE034C ?
                                                   9FFFFFFFBF3D9FA8 ?
pfrrun()+1376        CALL     pfrrun_no_tool()     9FFFFFFFBF3D9F40 ?
                                                   000002001 ?
                                                   9FFFFFFFBF3D9FA8 ?
                                                   60000000000BA288 ?
                                                   C000000000000A1C ?
                                                   4000000003180540 ?
                                                   9FFFFFFFBF3DA390 ?
                                                   9FFFFFFFBF3DA008 ?
plsql_run()+1328     CALL     pfrrun()             9FFFFFFFFFFF3798 ?
                                                   9FFFFFFFFFFF3780 ?
                                                   60000000000BA288 ?
                                                   9FFFFFFFFFFF4380 ?
                                                   9FFFFFFFFFFF4380 ?
                                                   C000000000000E23 ?
                                                   4000000002C52D50 ?

根据MOS文档,ORA-600[ktrgcm_3]存在三个已知bug,其中和Bug 7225204 – OERI [ktrgcm_3] can occur [ID 7225204.8]的描述非常相似,当前报错的SQL语句就是INSERT SELECT语句,且INSERT对应的表在SELECT中也会出现,这与BUG的描述完全一致。
观察错误语句的执行计划:

============
Plan TABLE
============
---------------------------------------------------------------------+----------------------
| Id | Operation                                | Name               | ROWS | Bytes | Cost |
---------------------------------------------------------------------+----------------------
| 0  | INSERT STATEMENT                         |                    |      |       | 618K |
| 1  |  FILTER                                  |                    |      |       |      |
| 2  |   NESTED LOOPS SEMI                      |                    |    1 |   345 | 618K |
| 3  |    HASH JOIN OUTER                       |                    |    1 |   326 | 618K |
| 4  |     NESTED LOOPS OUTER                   |                    |    1 |   301 | 618K |
| 5  |      HASH JOIN OUTER                     |                    |    1 |   282 | 618K |
| 6  |       HASH JOIN SEMI                     |                    |    1 |   263 | 618K |
| 7  |        PARTITION RANGE SINGLE            |                    |   16 |  4000 |    4 |
| 8  |         TABLE ACCESS BY LOCAL INDEX ROWID| TF_BH_TRADE        |   16 |  4000 |    4 |
| 9  |          INDEX RANGE SCAN                | IDX_TF_BH_TRADE_ACC|    1 |       |    3 |
| 10 |        VIEW                              |                    |  13K |  169K | 618K |
| 11 |         MINUS                            |                    |      |       |      |
| 12 |          SORT UNIQUE                     |                    |  13K |  534K |      |
| 13 |           FILTER                         |                    |      |       |      |
| 14 |            HASH JOIN                     |                    |  13K |  534K | 617K |
| 15 |             TABLE ACCESS FULL            | TF_B_TRADE_BATDEAL |  13K |  377K | 441K |
| 16 |             PARTITION RANGE ALL          |                    | 172M | 2063M | 174K |
| 17 |              INDEX FAST FULL SCAN        | PK_TF_BH_TRADE     | 172M | 2063M | 174K |
| 18 |          SORT UNIQUE                     |                    | 260K | 2599K |      |
| 19 |           INDEX FAST FULL SCAN           | IDX_T_B_T_ID       | 260K | 2599K |  231 |
| 20 |       TABLE ACCESS FULL                  | TF_B_T_STATE_TRANS |    1 |    19 |    2 |
| 21 |      TABLE ACCESS BY INDEX ROWID         | TD_B_PRODUCT       |    1 |    19 |    2 |
| 22 |       INDEX UNIQUE SCAN                  | PK_TD_B_PRODUCT    |    1 |       |    1 |
| 23 |     TABLE ACCESS FULL                    | TF_B_TRADE_REL     | 116K | 2888K |  169 |
| 24 |    TABLE ACCESS BY INDEX ROWID           | TD_SD_COMMPARA     |    8 |   152 |    2 |
| 25 |     INDEX RANGE SCAN                     | PK_TD_SD_COMMPARA  |    8 |       |    1 |
---------------------------------------------------------------------+----------------------

配合报错函数信息,不难确认,问题发生在对T_B_T_IN表的索引快速全扫描上。
但是这个错误影响10.2.0.4和11.1.0.7,Oracle在10.2.0.5中应该已经修正了这个错误。观察文档Bug 6157713 – Wrong result from ORA_ROWSCN [ID 6157713.8]的描述,虽然版本信息符合,但是当前并没有调用ORA_ROWSCN伪列,不过这个BUG的描述中,又指向了上面7225204的bug。
最后检查文档Bug 14076510 – ORA-600 [ktrgcm_3] in 10.2.0.5.3 – 10.2.0.5.7 [ID 14076510.8],可以看到这个bug描述在10.2.0.5.3之后出现的ktrgcm_3错误的原因,根据三篇文章的描述,以及当前的现象,感觉三篇文档描述的应该是都一个bug,Oracle在处理自关联查询时引发了这个错误。而这个bug并没有在10.2.0.5中被真正修复。
除了升级到10.2.0.5.8以及11.2外,如果可能,修改语句避免INSERT SELECT语句访问插入表。此外Oracle还提供了设置隐含参数”_row_cr”为FALSE的方式,这种方式会带来性能的影响,如果这个错误频繁发生,也可以考虑进行设置。

Posted in BUG | Tagged , , , , | Leave a comment

ORA-600(krbb2ec_stamp_mismtach)错误

备份归档日志导致的ORA-600错误。
错误信息如下:

Tue Aug 16 02:43:58 2011
ALTER SYSTEM ARCHIVE LOG
Tue Aug 16 02:44:01 2011
Thread 1 advanced TO log SEQUENCE 5940 (LGWR switch)
  CURRENT log# 7 seq# 5940 mem# 0: /dev/orcl3vg1/rdb3vg1_1_redo71
  CURRENT log# 7 seq# 5940 mem# 1: /dev/orcl3vg2/rdb3vg2_1_redo72
Tue Aug 16 03:43:44 2011
Thread 1 advanced TO log SEQUENCE 5941 (LGWR switch)
  CURRENT log# 8 seq# 5941 mem# 0: /dev/orcl3vg2/rdb3vg2_1_redo81
  CURRENT log# 8 seq# 5941 mem# 1: /dev/orcl3vg3/rdb3vg3_1_redo82
Tue Aug 16 03:47:06 2011
Errors IN file /oraclelog/admin/orcl3/udump/orcl31_ora_14006.trc:
ORA-00600: internal error code, arguments: [krbb2ec_stamp_mismtach], [13092], [758175879], [], [], [], [], []
Tue Aug 16 03:47:07 2011
Trace dumping IS performing id=[cdmp_20110816034707]

根据MOS文档Bug 6793372 – RMAN backup archivelog fails ORA-19571 / OERI[krbb2ec_stamp_mismtach] [ID 6793372.8],在使用RMAN备份归档路径时,可能会导致这个ORA-600[krbb2ec_stamp_mismtach]错误。
从错误发生之前出现的ALTER SYSTEM ARCHIVE LOG信息可以推断,问题发生之前,确实在进行数据库的备份操作。且当前数据库版本为10.2.0.4,是确认问题发生的版本之一,其他确认的版本包括10.1.0.5和11.1.0.6。
没有想到备份归档日志这种想到简单的操作也会导致ORA-600错误,而且严重怀疑Oracle的报错函数的名称出现了错误,理论上应该是krbb2ec_stamp_mismatch才对。
Oracle在11.2.0.1、11.1.0.7和10.2.0.5中解决了这个问题。不过由于这个错误并非每次重现,且出现概率很低,也可以选择忽略这个错误。

Posted in BUG | Tagged , , , | Leave a comment

ORA-600(koksadqb1)错误

使用utlrp.sql编译失效对象引发的ORA-600错误。
错误信息如下:

Sat DEC 19 05:29:59 2009
Completed: ALTER DATABASE OPEN
Sat DEC 19 05:30:20 2009
SERVER COMPONENT id=UTLRP_BGN: TIMESTAMP=2009-12-19 05:30:20
Sat DEC 19 05:31:09 2009
Errors IN file /oraclelog/admin/orcl3/bdump/orcl31_cjq0_24952.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable TO allocate 576 bytes OF shared memory ("shared pool","update seq$ set increment$=:...","sql area","ckydef : kkdlcky")
Sat DEC 19 05:31:10 2009
Errors IN file /oraclelog/admin/orcl3/bdump/orcl31_cjq0_24952.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable TO allocate 32 bytes OF shared memory ("shared pool","select job, nvl2(last_date, ...","sql area","tmp")
Sat DEC 19 05:31:14 2009
Errors IN file /oraclelog/admin/orcl3/bdump/orcl31_j001_25676.trc:
ORA-00600: internal error code, arguments: [koksadqb1], [4031], [], [], [], [], [], []
Sat DEC 19 05:31:14 2009
Errors IN file /oraclelog/admin/orcl3/bdump/orcl31_j008_25697.trc:
ORA-12012: error ON auto EXECUTE OF job 136625
ORA-04031: unable TO allocate ORA-04031: unable TO allocate 248 bytes OF shared memory ("shared pool","update seq$ set increment$=:...","Typecheck","qsmksol : qsmg_alloc_sol")
ORA-06512: at "SYS.UTL_RECOMP", line 636
ORA-06512: at line 1
 bytes OF shared memory ("","","","")
Sat DEC 19 05:31:15 2009
Errors IN file /oraclelog/admin/orcl3/bdump/orcl31_cjq0_24952.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable TO allocate 1048 bytes OF shared memory ("shared pool","select job, nvl2(last_date, ...","Typecheck","kgghteInit")
Sat DEC 19 05:31:23 2009
Trace dumping IS performing id=[cdmp_20091219053123]
Sat DEC 19 05:31:39 2009
SERVER COMPONENT id=UTLRP_END: TIMESTAMP=2009-12-19 05:31:39
Sat DEC 19 05:35:09 2009
SERVER COMPONENT id=UTLRP_BGN: TIMESTAMP=2009-12-19 05:35:09
SERVER COMPONENT id=UTLRP_END: TIMESTAMP=2009-12-19 05:35:18

这个ORA-600错误在METALINK上找不到任何记录,不过从之前的ORA-4031错误,以及这个错误的第二个参数可以判断,导致这个错误的原因就是ORA-4031错误。
根据错误之前的信息可以看到,数据库刚刚启动完毕,而马上就出现ORA-4031错误是很不正常的,检查了启动参数中共享池的配置,更是达到了3G以上,这就更没有道理出现ORA-4031错误了。
观察报错之前的上一条信息发现,原来在报错之前在运行UTLRP.SQL脚本对失效对象进行编译。这个现象和一个未确定的BUG描述Bug 8442907 : ORA-4031 RUNNING UTLRP.SQL OR ANY PX QUERY非常接近。
虽然第一次运行编译脚本失败,根据日志信息,马上进行的第二次UTLRP.SQL调用成功了,这说明无论这个ORA-600还是ORA-4031错误的出现都是比较偶然的,考虑到Oracle没有任何关于当前问题的描述,如果碰到了类似的问题,可以考虑在调用UTLRP的时候以串行方式运行。

Posted in BUG | Tagged , , , | Leave a comment

统计信息收集出现DFS等待导致实例HANG死

客户10.2.0.4 RAC环境,出现大量的library cache lock和cursor: pin S wait on X等待,经分析是由于统计信息收集僵死导致的。
数据库在8点到9点期间,数据库两个节点都存在明显的cursor: pin S wait on X和library cache lock的等待:

Event

Waits

Time(s)

Avg   Wait(ms)

%   Total Call Time

Wait   Class

cursor:   pin S wait on X

1,573,056

30,651

19

146.2

Concurrency

library   cache lock

31,757

7,009

221

33.4

Concurrency

CPU   time

6,416

30.6

DFS   lock handle

12,381

2,979

241

14.2

Other

latch:   library cache

1,646

1,974

1,199

9.4

Concurrency

Event

Waits

Time(s)

Avg   Wait(ms)

%   Total Call Time

Wait   Class

cursor:   pin S wait on X

17,586,451

342,437

19

812.1

Concurrency

library   cache lock

63,657

30,153

474

71.5

Concurrency

CPU   time

3,820

9.1

db   file sequential read

241,761

1,094

5

2.6

User   I/O

inactive   session

1,105

1,090

986

2.6

Other

两个节点的等待现象基本一致,而节点1上还存在明显的DFS lock handle等待事件。
通过分析ASH信息,发现library cache lock和cursor: pin S wait on X等待基本上都是6点之后才开始出现:

SQL&gt; SELECT trunc(sample_time, 'hh24') TIME, COUNT(*)
  2  FROM WRH$_ACTIVE_SESSION_HISTORY ash, wrh$_event_name en
  3  WHERE ash.event_id = en.event_id
  4  AND sample_time &gt;= to_timestamp('20130703', 'yyyymmdd')
  5  AND event_name IN ('cursor: pin S wait on X', 'library cache lock')
  6  GROUP BY trunc(sample_time, 'hh24')
  7  ORDER BY 1;
TIME                  COUNT(*)
------------------- ----------
2013-07-03 00:00:00          4
2013-07-03 01:00:00          8
2013-07-03 02:00:00          3
2013-07-03 03:00:00          8
2013-07-03 04:00:00          9
2013-07-03 05:00:00         14
2013-07-03 06:00:00        348
2013-07-03 07:00:00       2138
2013-07-03 08:00:00      41576
2013-07-03 09:00:00      50108
2013-07-03 10:00:00       2452

而观察出现cursor: pin S wait on X和library cache lock等待会话的BLOCKING SESSION,发现大部分处于等待的会话被会话37锁定:

SQL&gt; SELECT to_char(sample_time, 'hh24:mi:ss.ff') TIME, session_id sid, 
  2  decode(en.event_name, 'cursor: pin S wait on X', to_number(substr(ltrim(to_char(p2, '0xxxxxxxxxxxxxxx')), 1, 8), 'xxxxxxxx'), blocking_session) b_sid, 
  3  en.event_name  
  4  FROM WRH$_ACTIVE_SESSION_HISTORY ash, wrh$_event_name en
  5  WHERE ash.event_id = en.event_id(+)
  6  AND instance_number = 1
  7  AND sample_time &gt;= to_timestamp('201307030800', 'yyyymmddhh24mi')
  8  AND sample_time &lt;= to_timestamp('201307030801', 'yyyymmddhh24mi')
  9  ORDER BY 1;
TIME                      SID      B_SID EVENT_NAME
------------------ ---------- ---------- ------------------------------
08:00:09.569              457 4294967291 latch: library cache
08:00:09.569               52        981 cursor: pin S wait ON X
08:00:09.569              168         73 library cache LOCK
08:00:09.569             1763 4294967291 SQL*Net message FROM client
08:00:09.569              123 4294967295 kksfbc child completion
08:00:09.569              476 4294967291 kst: async disk IO
08:00:09.569              500 4294967292 db file sequential READ
08:00:09.569              327 4294967291 SQL*Net message TO client
08:00:09.569              233 4294967291 db file scattered READ
08:00:09.569               73         73 DFS LOCK handle
08:00:09.569             1889 4294967291 gc CURRENT block 2-way
08:00:09.569             1969        981 cursor: pin S wait ON X
08:00:09.569              857        981 cursor: pin S wait ON X
08:00:09.569              812        168 cursor: pin S wait ON X
08:00:09.569              981         73 library cache LOCK
08:00:19.669               52        981 cursor: pin S wait ON X
08:00:19.669              812        168 cursor: pin S wait ON X
08:00:19.669              857        981 cursor: pin S wait ON X
08:00:19.669             1969        981 cursor: pin S wait ON X
08:00:19.669              233 4294967291 db file scattered READ
08:00:19.669              500 4294967292 db file sequential READ
08:00:19.669             1763 4294967291 db file sequential READ
08:00:19.669              476 4294967291 kst: async disk IO
08:00:19.669              457 4294967291 kksfbc child completion
08:00:19.669               73         73 DFS LOCK handle
08:00:19.669              654 4294967291 SQL*Net message TO client
08:00:19.669              168         73 library cache LOCK
08:00:19.669              981         73 library cache LOCK
08:00:29.769             1969        981 cursor: pin S wait ON X
08:00:29.769               52        981 cursor: pin S wait ON X
08:00:29.769              457 4294967291 kksfbc child completion
08:00:29.769              981         73 library cache LOCK
08:00:29.769              168         73 library cache LOCK
08:00:29.769              500 4294967292 db file sequential READ
08:00:29.769               73         73 DFS LOCK handle
08:00:29.769              654 4294967291 SQL*Net message TO client
08:00:29.769              327 4294967291 SQL*Net message TO client
08:00:29.769             1763 4294967291 db file sequential READ
08:00:29.769              857        981 cursor: pin S wait ON X
08:00:29.769              123 4294967295 latch: library cache
08:00:29.769              196 4294967295 latch: library cache
08:00:29.769              233 4294967295 latch: library cache
08:00:29.769              812        168 cursor: pin S wait ON X
08:00:29.769              476 4294967291 kst: async disk IO
08:00:39.879              476 4294967291 kst: async disk IO
08:00:39.879              658 4294967291 db file sequential READ
08:00:39.879              500 4294967292 db file sequential READ
08:00:39.879             1969        981 cursor: pin S wait ON X
08:00:39.879              857        981 cursor: pin S wait ON X
08:00:39.879              812        168 cursor: pin S wait ON X
08:00:39.879               52        981 cursor: pin S wait ON X
08:00:39.879              981         73 library cache LOCK
08:00:39.879              168         73 library cache LOCK
08:00:39.879             1849 4294967291 SQL*Net message TO client
08:00:39.879              327 4294967291 SQL*Net message TO client
08:00:39.879              235 4294967291 rdbms ipc reply
08:00:39.879               19 4294967291 rdbms ipc reply
08:00:39.879               73         73 DFS LOCK handle
08:00:39.879              123 4294967294 enq: WF - contention
08:00:49.987              476 4294967291 kst: async disk IO
08:00:49.987              500 4294967292 db file sequential READ
08:00:49.987              382 4294967291 gc cr block busy
08:00:49.987             1969        981 cursor: pin S wait ON X
08:00:49.987              857        981 cursor: pin S wait ON X
08:00:49.987              812        168 cursor: pin S wait ON X
08:00:49.987              235 4294967295 kksfbc child completion
08:00:49.987              658 4294967295 latch: library cache
08:00:49.987               19 4294967291 latch: library cache
08:00:49.987              981         73 library cache LOCK
08:00:49.987              168         73 library cache LOCK
08:00:49.987             1889 4294967291 SQL*Net message FROM client
08:00:49.987               73         73 DFS LOCK handle
08:00:49.987               52        981 cursor: pin S wait ON X

可以判断,导致数据库实例HANG死的原因是因为SID为73的进程:

SQL&gt; SELECT to_char(sample_time, 'hh24:mi:ss'), program, action, event_name 
  2  FROM wrh$_active_session_history ash, wrh$_event_name en
  3  WHERE session_id = 73
  4  AND ash.event_id = en.event_id(+)
  5  AND sample_time &gt;= to_date('201307030550', 'yyyymmddhh24mi')
  6  AND sample_time &lt;= to_date('201307030601', 'yyyymmddhh24mi')
  7  ORDER BY 1;
TO_CHAR( PROGRAM                                  ACTION                         EVENT_NAME
-------- ---------------------------------------- ------------------------------ ---------------------------
05:50:07 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               db file sequential READ
05:50:17 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               db file sequential READ
05:50:27 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:50:37 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:50:47 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               gc cr block 2-way
05:50:57 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:51:08 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:51:18 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               db file scattered READ
05:51:28 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               gc cr multi block request
05:51:38 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               direct path WRITE temp
05:51:48 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:51:58 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:52:08 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:52:18 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               db file scattered READ
05:52:28 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               db file sequential READ
05:52:38 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:52:49 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:52:59 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               db file scattered READ
05:53:09 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:53:19 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               db file scattered READ
05:53:29 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:53:39 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:53:49 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               gc CURRENT GRANT 2-way
05:53:59 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:54:09 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:54:20 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:54:30 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:54:40 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:54:50 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:55:00 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:55:10 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               db file sequential READ
05:55:20 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:55:30 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:55:40 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:55:51 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:56:01 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:56:11 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:56:21 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:56:31 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:56:41 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:56:51 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               gc CURRENT GRANT 2-way
05:57:01 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               library cache pin
05:57:11 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               DFS LOCK handle
05:57:21 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               DFS LOCK handle
05:57:32 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:57:42 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:57:52 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               DFS LOCK handle
05:58:02 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               db file sequential READ
05:58:12 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               db file sequential READ
05:58:22 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:58:32 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:58:42 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:58:52 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:59:03 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:59:13 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:59:23 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:59:33 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
05:59:43 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               db file sequential READ
05:59:53 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
06:00:03 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               DFS LOCK handle
06:00:13 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               latch: library cache
06:00:23 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               DFS LOCK handle
06:00:34 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               DFS LOCK handle
06:00:44 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               DFS LOCK handle
06:00:54 oracle@orcldb5 (J002)                     GATHER_STATS_JOB               DFS LOCK handle

根据MOS文档,Bug 6011045 – DBMS_STATS causes deadlock between ‘cursor: pin S wait on X’ and ‘library cache lock’ [ID 6011045.8],这个问题是RAC环境中Oracle收集统计信息的BUG,在自动收集数据字典信息时,可能会出现进程HANG死,并导致大量的library cache lock和cursor: pin S wait on X等待,此外还可能出现DFS lock handle以及row cache lock的等待。
这个BUG确认影响版本包括10.2.0.4和10.2.0.5,这个BUG Oracle在10.2.0.5.5 PSU中进行了修复。

Posted in BUG | Tagged , , , , | Leave a comment

ORA-600(kposcws-2)错误

客户11.2.0.2 RAC环境出现ORA-600[kposcws-2]错误。
错误信息为:

Thu Sep 27 10:48:15 2012
Errors IN file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_34145026.trc  (incident=682292):
ORA-00600: 内部错误代码, 参数: [kposcws-2], [], [], [], [], [], [], [], [], [], [], []
Incident details IN: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_682292/orcl1_ora_34145026_i682292.trc

根据MOS文档Bug 6615409 – ORA-600 [kposcws-2] on scrollable cursor [ID 6615409.8],导致问题是由于使用了scrollable游标。
不过这个BUG应该在10.2.0.5和11.2.0.1中已经被修正,虽然问题在11.2.0.2中重现的可能性不大,但是根据已知的BUG描述,似乎BUG重现是最合理的解释。
由于没有11.2.0.2上问题出现的描述,因此也不会存在单独的补丁,唯一可以尝试的版本就是将数据库版本升级到11.2.0.3以上。

Posted in BUG | Tagged , , | Leave a comment

ORA-600(kollasg:client-side tmp lob)错误

客户11.2.0.3 RAC环境,出现ORA-600[kollasg:client-side tmp lob]错误。
错误信息为:

Thu Apr 28 20:54:53 2011
Errors IN file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_31916032.trc  (incident=148665):
ORA-00600: 内部错误代码, 参数: [kollasg:client-side tmp lob], [], [], [], [], [], [], [], [], [], [], []
Incident details IN: /u01/app/oracle/diag/rdbms/orcl/orcl1/incident/incdir_148665/orcl1_ora_31916032_i148665.trc
Thu Apr 28 20:54:59 2011
Trace dumping IS performing id=[cdmp_20110428205459]
Thu Apr 28 20:54:59 2011
Sweep [inc][148665]: completed
Sweep [inc2][148665]: completed

从错误信息上很容易判断,问题应该和客户端处理临时LOB有关。根据MOS文档ORA-00600 [kollasg:client-side tmp lob] Fetching Row With LOB Column in OCI Application [ID 1418135.1],在11.1以上版本中,如果读取一个没有被初始化过的LOB列,会引发这个600错误。
Oracle给出的解决方案是对于空的CLOB,使用EMPTY_CLOB函数对齐进行初始化。

Posted in BUG | Tagged , , , | Leave a comment

ORA-600(ksxp_rm_check0)错误

客户11.2.0.3 RAC的ASM实例出现ORA-600[ksxp_rm_check0]错误。
错误信息如下:

Tue Jul 02 19:22:59 2013
Errors IN file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_ora_6488960.trc  (incident=165804):
ORA-04031: unable TO allocate 3000 bytes OF shared memory ("shared pool","select  INSTANCE_NUMBER , IN...","sga heap(1,0)","call")
USE ADRCI OR Support Workbench TO package the incident.
See Note 411.1 at My Oracle Support FOR error AND packaging details.
Tue Jul 02 19:22:59 2013
ERROR: Unable TO normalize symbol name FOR the following short stack (at offset 174):
dbgexProcessError()+180<-dbgeExecuteForError()+72<-dbgePostErrorKGE()+2048<-dbkePostKGE_kgsf()+68<-kgeadse()+380<-kgerinv_internal()+48<-kgerinv()+48<-kgeasnmierr()+72<-IPRA.$ksxp_rm_check()+188<-ksxp_rm_check()+80<-ksliwat()+9376<-kslwaitctx()+180<-kmgs_immediate_req()+2396<-ksmasg()+144<-kghnospc()+632<-kghalo()+1964<-ksmdacnk()+276<-ksmdget()+628<-kssadpm()+324<-ksucrp()+904<-ksvrdp()+156<-opirip()+724<-opidrv()+608<-sou2o()+136<-opimai_real()+188<-ssthrdmain()+268<-main()+204<-__start()+112
Errors IN file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_m000_22151498.trc  (incident=186272):
ORA-00600: internal error code, arguments: [ksxp_rm_check0], [0x110A6FE08], [0], [0x700000059D65FD8], [], [], [], [], [], [], [], []
Incident details IN: /u01/app/grid/diag/asm/+asm/+ASM2/incident/incdir_186272/+ASM2_m000_22151498_i186272.trc
USE ADRCI OR Support Workbench TO package the incident.
See Note 411.1 at My Oracle Support FOR error AND packaging details.
Tue Jul 02 19:23:01 2013
Process m000 died, see its trace file
Tue Jul 02 19:23:01 2013
Dumping diagnostic DATA IN directory=[cdmp_20130702192301], requested BY (instance=2, osid=22151498 (M000)), summary=[incident=186272].

从错误信息上判断,这个ORA-600错误和前面出现了ORA-4031应该有直接的关系。根据MOS文档Bug 12925089 – ORA-600 [ksxp_rm_check0] can occur on process startup [ID 12925089.8],当一个进程启动时除非了ORA-4031错误,就可能会导致ORA-600[ksxp_rm_check()]的错误。而当前是m000进程启动出现了ORA-4031错误。至于当前ASM的ORA-4031错误,是由于11.2.0.3默认的MEMORY_TARGET不足导致的。
这个错误影响的版本为11.2.0.2和11.2.0.3,Oracle在12.1.0.1中以及将要发布的11.2.0.4中解决了这个问题。其实要避免这个错误的根源是给ASM实例的MEMORY_TARGET分配足够的空间,至少应分配1536M以上。

Posted in BUG | Tagged , , , , , , , , | Leave a comment

11.2.0.3 ASM实例出现ORA-4031导致数据库归档失败

客户的11.2.0.3 RAC数据库出现了归档失败的情况,导致单个实例出现HANG死的状况。
检查错误信息发现:

Tue Jul 02 16:49:13 2013
ARC1: Error 19504 Creating archive log file TO '+DATA02'
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance orcl1 - Archival Error
ORA-16038: log 14 SEQUENCE# 68244 cannot be archived
ORA-19504: failed TO CREATE file ""
ORA-00312: online log 14 thread 1: '+DATA02/orcl/onlinelog/group_14.264.792274883'
ORA-00312: online log 14 thread 1: '+DATA02/orcl/onlinelog/group_14.265.792274889'
Archiver process freed FROM errors. No longer stopped
Tue Jul 02 16:50:37 2013
ARC0: LGWR IS actively archiving destination LOG_ARCHIVE_DEST_3
ARCH: Archival stopped, error occurred. Will continue retrying
ORACLE Instance orcl1 - Archival Error
ORA-16014: log 14 SEQUENCE# 68244 NOT archived, no available destinations
ORA-00312: online log 14 thread 1: '+DATA02/orcl/onlinelog/group_14.264.792274883'
ORA-00312: online log 14 thread 1: '+DATA02/orcl/onlinelog/group_14.265.792274889'
ARC0: Archive log rejected (thread 1 SEQUENCE 68240) at host 'orclsh'
FAL[server, ARC0]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
ORACLE Instance orcl1 - Archival Error. Archiver continuing.

由于归档失败发生在ASM磁盘上,首先检查ASM磁盘空间以及DB_RECOVERY_FILE_DEST_SIZE,ASM磁盘空间是足够的,而且由于只有一个节点出现出现了无法归档的问题,也可以排除是空间不足造成的。确认两个节点的DB_RECOVERY_FILE_DEST_SIZE参数设置都是0,基本上可以判断问题和当前节点的ASM实例状态不正常有关。
检查ASM实例的错误信息:

Tue Jul 02 16:41:43 2013
Dumping diagnostic DATA IN directory=[cdmp_20130702164115], requested BY (instance=2, osid=2032294 (LMD0)), summary=[incident=165521].
Tue Jul 02 16:49:13 2013
Dumping diagnostic DATA IN directory=[cdmp_20130702164845], requested BY (instance=2, osid=2032294 (LMD0)), summary=[incident=165522].
Tue Jul 02 16:55:45 2013
Dumping diagnostic DATA IN directory=[cdmp_20130702165517], requested BY (instance=2, osid=2032294 (LMD0)), summary=[incident=165523].
Tue Jul 02 17:01:48 2013
Dumping diagnostic DATA IN directory=[cdmp_20130702170120], requested BY (instance=2, osid=2032294 (LMD0)), summary=[incident=165524].
Tue Jul 02 17:07:27 2013
Dumping diagnostic DATA IN directory=[cdmp_20130702170659], requested BY (instance=2, osid=2032294 (LMD0)), summary=[incident=165525].

当前节点ASM实例出现了的这个信息,说明报错发生在实例2上:

Tue Jul 02 18:29:55 2013
Errors IN file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lmd0_2032294.trc  (incident=186256):
ORA-04031: unable TO allocate 3768 bytes OF shared memory ("shared pool","unknown object","sga heap(1,0)","ges enqueues")
USE ADRCI OR Support Workbench TO package the incident.
See Note 411.1 at My Oracle Support FOR error AND packaging details.
 Insufficient shared pool TO allocate a GES object (ospid 2032294)
Tue Jul 02 18:29:55 2013
Sweep [inc][186256]: completed
Tue Jul 02 18:36:49 2013
Errors IN file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_lmd0_2032294.trc  (incident=186257):
ORA-04031: unable TO allocate 3768 bytes OF shared memory ("shared pool","unknown object","sga heap(1,0)","ges enqueues")
USE ADRCI OR Support Workbench TO package the incident.
See Note 411.1 at My Oracle Support FOR error AND packaging details.
 Insufficient shared pool TO allocate a GES object (ospid 2032294)

果然实例2上的ASM出现了大量ORA-4031错误。检查ASM启动的参数配置:

Sat Aug 25 20:06:55 2012
NOTE: No asm libraries found IN the system
ERROR: -5(Duplicate disk DATA_DG01:ASM_DISK1)
ERROR: -5(Duplicate disk DATA_DG01:ASM_DISK2)
MEMORY_TARGET defaulting TO 411041792.
* instance_number obtained FROM CSS = 2, checking FOR the existence OF node 0... 
* node 0 does NOT exist. instance_number = 2 
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Private Interface 'en1' configured FROM GPnP FOR USE AS a private interconnect.
  [name='en1', TYPE=1, ip=169.254.78.6, mac=00-1a-64-bb-50-7d, net=169.254.0.0/16, mask=255.255.0.0, USE=haip:cluster_interconnect/62]
Public Interface 'en0' configured FROM GPnP FOR USE AS a public interface.
  [name='en0', TYPE=1, ip=10.1.16.35, mac=00-1a-64-bb-50-7c, net=10.1.16.32/27, mask=255.255.255.224, USE=public/1]
Picked latch-free SCN scheme 3
USING LOG_ARCHIVE_DEST_1 parameter DEFAULT VALUE AS /u01/app/11.2.0.3/grid/dbs/arch
Autotune OF undo retention IS turned ON. 
LICENSE_MAX_USERS = 0
SYS auditing IS disabled
NOTE: Volume support  enabled
Starting up:
Oracle DATABASE 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
WITH the REAL Application Clusters AND Automatic Storage Management options.
ORACLE_HOME = /u01/app/11.2.0.3/grid
System name:	AIX
Node name:	orcldb2
Release:	1
Version:	6
Machine:	00C94E064C00
USING parameter settings IN server-side pfile /u01/app/11.2.0.3/grid/dbs/init+ASM2.ora
System parameters WITH non-DEFAULT VALUES:
  large_pool_size          = 12M
  instance_type            = "asm"
  remote_login_passwordfile= "EXCLUSIVE"
  asm_diskstring           = "/dev/ocr_*"
  asm_diskstring           = "/dev/voting_*"
  asm_diskstring           = "/dev/asm_*"
  asm_diskgroups           = "DATA"
  asm_diskgroups           = "DATA_DG01"
  asm_diskgroups           = "SPFILE_DG"
  asm_power_limit          = 1
  diagnostic_dest          = "/u01/app/grid"
Cluster communication IS configured TO USE the following interface(s) FOR this instance
  169.254.78.6
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2

当前ASM实例使用默认的MEMORY_TARGET配置,分配大小大约是400M,根据Oracle的MOS文章:ASM & Shared Pool (ORA-4031) [ID 437924.1],在11.2.0.3中,Oracle增加了ASM实例所允许的默认进程数PROCESSES,但是默认的MEMORY_TARGET参数没有增加。
根据Oracle的建议,11.2.0.3的MEMORY_TARGET至少应该设置到1536M,而MEMORY_MAX_TARGET设置为4096M。

SQL> ALTER system SET memory_max_target=4096m scope=spfile;
SQL> ALTER system SET memory_target=1536m scope=spfile;

对于当前的情况,如果短时间内无法重启DB和ASM实例,可以在问题节点配置一个第二本地归档路径,设置目标路径为本地磁盘,从而避免归档无法完成而导致的实例HANG死。

Posted in BUG | Tagged , , , , , , , , | Leave a comment

ORA-7445(ptmax)错误

一个9.2.0.8上的bug。
虽然12c已经发布,但是仍然有部分客户的核心数据库还跑在9i环境上,有碰到了一个9208上的bug。

Thu May 16 14:18:48 2013
Errors IN file /oracle/oracle/admin/orcl/udump/orcl_ora_11922.trc:
ORA-07445: 出现异常: 核心转储 [00000001023A0F08] [SIGSEGV] [Address NOT mapped TO object] [0x000000010] [] []
Thu May 16 14:19:49 2013
Errors IN file /oracle/oracle/admin/orcl/udump/orcl_ora_12389.trc:
ORA-07445: 出现异常: 核心转储 [00000001023A0F08] [SIGSEGV] [Address NOT mapped TO object] [0x000000010] [] []

详细TRACE信息:

*** SESSION ID:(1298.47235) 2013-05-16 14:18:48.666
Exception signal: 11 (SIGSEGV), code: 1 (Address NOT mapped TO object), addr: 0x10, PC: [0x1023a0f08, 00000001023A0F08]
*** 2013-05-16 14:18:48.669
ksedmp: internal OR fatal error
ORA-07445: 出现异常: 核心转储 [00000001023A0F08] [SIGSEGV] [Address NOT mapped TO object] [0x000000010] [] []
CURRENT SQL statement FOR this SESSION:
BEGIN
  -- Call the procedure
  z_r_res_pk.g_num;
END;
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedmp()+328         CALL     ksedst()             00000000B ? 000000000 ?
                                                   000000000 ? 1033CF6C8 ?
                                                   00000003E ?
                                                   FFFFFFFF7FFF1188 ?
ssexhd()+676         CALL     ksedmp()             00010380C ? 10380C000 ?
                                                   10380C718 ? 103811000 ?
                                                   000102C00 ? 000000000 ?
__sighndlr()+12      PTR_CALL 0000000000000000     000103814 ?
                                                   FFFFFFFF7FFF8400 ?
                                                   103814000 ? 1038118E8 ?
                                                   000000000 ? 103814908 ?
call_user_handler()  CALL     __sighndlr()         00000000B ?
+992                                               FFFFFFFF7FFF8400 ?
                                                   FFFFFFFF7FFF8120 ?
                                                   10023F860 ? 000000000 ?
                                                   00000000A ?
sigacthandler()+104  CALL     call_user_handler()  FFFFFFFF7DB00200 ?
                                                   FFFFFFFF7DB00200 ?
                                                   FFFFFFFF7FFF8120 ?
                                                   00000000C ? 000000000 ?
                                                   000000000 ?
ptmak()+424          PTR_CALL 0000000000000000     000000000 ?
                                                   FFFFFFFF7FFF8400 ?
                                                   FFFFFFFF7FFF8120 ?
                                                   FFFFFFFF7DB00200 ?
                                                   000000000 ?
                                                   FFFFFFFF7DB3E000 ?
pdtidc()+4716        CALL     ptmak()              FFFFFFFF7FFF9300 ?
                                                   000000000 ? 000000004 ?
                                                   000000002 ? 000000000 ?
                                                   FFFFFFFF7CE61800 ?
pdlifu()+620         CALL     pdtidc()             000000000 ? 00002000C ?
                                                   000000001 ? 000000001 ?
                                                   102FBB000 ? 000102C00 ?
phpcmp()+1016        CALL     phpcog()             000000000 ?
                                                   FFFFFFFF7CE5C150 ?
                                                   00000000C ? 000000011 ?
                                                   FFFFFFFF7FFF9300 ?
                                                   F988B6748 ?
pcicog()+184         CALL     phpcmp()             FFFFFFFF7FFF9300 ?
                                                   F988B6748 ? 000000000 ?
                                                   000000004 ? 000000004 ?
                                                   000000000 ?
kkxcog()+288         CALL     pcicog()             FFFFFFFF7FFF9300 ?
                                                   F988B6748 ? 10380C49C ?
                                                   00000AB90 ? 000000001 ?
                                                   000000225 ?
opitca()+4796        CALL     kkxcog()             F988B6748 ? 3800231C0 ?
                                                   000380000 ? 000103813 ?
                                                   103813000 ? 103813F34 ?
rpiswu2()+384        PTR_CALL 0000000000000000     FFFFFFFF7FFF9A60 ?
                                                   000000001 ? 00000000A ?
                                                   000000001 ? 00000001A ?
                                                   F988B6748 ?
kkslod()+5196        CALL     rpiswu2()            F46434118 ? 00010380C ?
                                                   10380C940 ? 10380C978 ?
                                                   000000000 ? 10338B000 ?
kglobld()+696        PTR_CALL 0000000000000000     000000000 ?
                                                   FFFFFFFF7CE64658 ?
                                                   000000000 ? 10380C978 ?
                                                   F98D09B58 ? FD7970EF0 ?
kglobpn()+1284       CALL     kglobld()            000000001 ? 000000000 ?
                                                   F98ECBA88 ? 10380CAB8 ?
                                                   000000001 ? F988B6748 ?
kglpim()+236         CALL     kglobpn()            000000001 ? F922092E8 ?
                                                   000000001 ? 10390E848 ?
                                                   10390EA08 ? 10380DA10 ?
kglpin()+764         CALL     kglpim()             0000000FF ? 10380CAB8 ?
                                                   102DC74E8 ? 010010000 ?
                                                   F988B6748 ? F922092E8 ?
kksfbc()+6956        CALL     kglpin()             000010000 ?
                                                   FFFFFFFF7FFFB278 ?
                                                   F9100F9C8 ? 000000000 ?
                                                   FFFFFFFF7FFFAE28 ?
                                                   FFFFFFFF7FFFA950 ?
kkspsc0()+988        CALL     kksfbc()             000004000 ? 000004000 ?
                                                   000103800 ? 000000000 ?
                                                   000004000 ? 000000000 ?
opiosq0()+936        CALL     kkspsc0()            FFFFFFFF7CE60400 ?
                                                   000000024 ? 102FA3B10 ?
                                                   000000003 ? 000000004 ?
                                                   FFFFFFFF7FFFC5D0 ?
kpooprx()+204        CALL     opiosq0()            10104C000 ? 10380EC78 ?
                                                   000000042 ?
                                                   FFFFFFFF7FFFCC90 ?
                                                   10380EA84 ? 000000016 ?
kpoal8()+528         CALL     kpooprx()            FFFFFFFF7FFFEC2C ?
                                                   FFFFFFFF7FFFCC90 ?
                                                   000000042 ? 103814B90 ?
                                                   000000000 ? 000000024 ?
opiodr()+1720        PTR_CALL 0000000000000000     10380E000 ? 000000000 ?
                                                   FFFFFFFF7FFFEC28 ?
                                                   000000024 ? 000000000 ?
                                                   000000000 ?
ttcpip()+1564        PTR_CALL 0000000000000000     000102DD9 ? 00010380C ?
                                                   103814B98 ? 10380CAB8 ?
                                                   10380EFF8 ?
                                                   FFFFFFFF7FFFC5D0 ?
opitsk()+1156        CALL     ttcpip()             103814B90 ? 000000001 ?
                                                   FFFFFFFF7FFFEC28 ?
                                                   000000014 ?
                                                   FFFFFFFF7FFFD718 ?
                                                   FFFFFFFF7FFFD714 ?
opiino()+1504        CALL     opitsk()             000000000 ? 103814B20 ?
                                                   000000000 ? 000000000 ?
                                                   102E484C8 ? 10380EFE0 ?
opiodr()+1720        PTR_CALL 0000000000000000     000000000 ? 000000000 ?
                                                   000000000 ?
                                                   FFFFFFFF7FFFFAA0 ?
                                                   000000001 ? 000000001 ?
opidrv()+764         CALL     opiodr()             000102DD9 ? 00010380C ?
                                                   103814B98 ? 10380CAB8 ?
                                                   10380EFF8 ?
                                                   FFFFFFFF7FFFF5C0 ?
sou2o()+16           CALL     opidrv()             00010380C ? 000000004 ?
                                                   10380CAB8 ? 00000003C ?
                                                   10380CAB8 ? 10380C718 ?
main()+184           CALL     sou2o()              FFFFFFFF7FFFFAC0 ?
                                                   00000003C ? 000000004 ?
                                                   FFFFFFFF7FFFFAA0 ?
                                                   00003B000 ? 00003B030 ?
_start()+380         CALL     main()               000000002 ?
                                                   FFFFFFFF7FFFFC08 ?
                                                   FFFFFFFF7FFFFC20 ?
                                                   000000000 ?
                                                   FFFFFFFF7E100100 ?
                                                   FFFFFFFF7DB00200 ?
--------------------- Binary Stack Dump ---------------------

进一步检查TRACE,发现当前程序为DEVELOPER在进行DEBUG操作:

PROCESS STATE
-------------
Process global information:
     process: f474d5360, CALL: f95814378, xact: 0, curses: f46434118, usrses: f46434118
  ----------------------------------------
  SO: f474d5360, TYPE: 2, owner: 0, flag: INIT/-/-/0x00
  (process) Oracle pid=848, calls cur/top: f95814378/f95814378, flag: (0) -
            INT error: 0, CALL error: 0, sess error: 0, txn error 0
  (post info) LAST post received: 0 0 0
              LAST post received-location: No post
              LAST process TO post me: NONE
              LAST post sent: 0 0 0
              LAST post sent-location: No post
              LAST process posted BY me: NONE
    (latch info) wait_event=0 bits=0
    Process GROUP: DEFAULT, pseudo proc: f443c8d50
    O/S info: USER: oracle, term: UNKNOWN, ospid: 11922
    OSD pid info: Unix process pid: 11922, image: oracle@ysun7 (TNS V1-V3)
    ----------------------------------------
    SO: f46434118, TYPE: 4, owner: f474d5360, flag: INIT/-/-/0x00
    (SESSION) trans: 0, creator: f474d5360, flag: (41) USR/- BSY/-/-/-/-/-
              DID: 0001-0350-00106A26, short-term DID: 0000-0000-00000000
              txn branch: 0
              oct: 47, prv: 0, SQL: f98ecb960, psql: fdcc324f8, USER: 65/GZMPCMIS
    O/S info: USER: Administrator, term: S-E-SERVE, ospid: 5944:4000, machine: WORKGROUP\S-E-SERVE
              program: plsqldev.exe
    application name: PL/SQL Developer, hash VALUE=1190136663
    action name: Test Window - Script FOR procedu, hash VALUE=3367543201
    LAST wait FOR 'SQL*Net message from client' blocking sess=0x0 seq=34 wait_time=8952
                driver id=54435000, #bytes=1, =0
    TEMPORARY object counter: 0
      ----------------------------------------
      SO: f922092e8, TYPE: 52, owner: f46434118, flag: INIT/-/-/0x00
      LIBRARY OBJECT PIN: pin=f922092e8 handle=f988b6748 mode=X LOCK=f9100f9c8
      USER=f46434118 SESSION=f46434118 COUNT=0 mask=00ff SAVEPOINT=195 flags=[00]
      ----------------------------------------
.
.
.
      SO: f6a2eaba8, TYPE: 51, owner: f46434118, flag: INIT/-/-/0x00
      LIBRARY OBJECT LOCK: LOCK=f6a2eaba8 handle=f8dbc9320 mode=N
      CALL pin=faf005300 SESSION pin=0
      htl=f6a2eac18[fa6f5b618,fa6f5b618] htb=fa6f5b618
      USER=f46434118 SESSION=f46434118 COUNT=1 flags=[00] SAVEPOINT=186
      LIBRARY OBJECT HANDLE: handle=f8dbc9320
      name=SYS.DBMS_DEBUG 
      hash=2c51e752 TIMESTAMP=10-04-2010 00:55:36
      namespace=BODY/TYBD flags=KGHP/TIM/SML/[02000000]
      kkkk-dddd-llll=0000-0011-0011 LOCK=N pin=S latch#=21
      lwt=f8dbc9350[f8dbc9350,f8dbc9350] ltm=f8dbc9360[f8dbc9360,f8dbc9360]
      pwt=f8dbc9380[f8dbc9380,f8dbc9380] ptm=f8dbc9410[f8dbc9410,f8dbc9410]
      REF=f8dbc9330[f8dbc9330, f8dbc9330] lnd=f8dbc9428[f8dbc9428,f8dbc9428]
        LIBRARY OBJECT: object=fb741e720
        TYPE=PKBD flags=EXS/LOC[0005] pflags=/SWR [08] STATUS=VALD LOAD=0
        DEPENDENCIES: COUNT=10 SIZE=16
        ACCESSES: COUNT=9 SIZE=16
        DATA BLOCKS:
        DATA#     heap  pointer STATUS pins CHANGE
        ----- -------- -------- ------ ---- ------
            0 f94627a40 fb741e8c8 I/P/A     0 NONE  
            4 fb741e840 fb0c81e68 I/P/A     1 NONE  
      ----------------------------------------
      SO: f85863c38, TYPE: 51, owner: f46434118, flag: INIT/-/-/0x00
      LIBRARY OBJECT LOCK: LOCK=f85863c38 handle=fcbfb58d0 mode=N
      CALL pin=f83bafa18 SESSION pin=0
      htl=f85863ca8[fa6f5b608,fa6f5b608] htb=fa6f5b608
      USER=f46434118 SESSION=f46434118 COUNT=1 flags=[00] SAVEPOINT=186
      LIBRARY OBJECT HANDLE: handle=fcbfb58d0
      name=SYS.DBMS_DEBUG 
      hash=2c51e751 TIMESTAMP=10-04-2010 00:55:32
      namespace=TABL/PRCD/TYPE flags=KGHP/TIM/SML/[02000000]
      kkkk-dddd-llll=0000-0215-0215 LOCK=N pin=S latch#=22
      lwt=fcbfb5900[fcbfb5900,fcbfb5900] ltm=fcbfb5910[fcbfb5910,fcbfb5910]
      pwt=fcbfb5930[fcbfb5930,fcbfb5930] ptm=fcbfb59c0[fcbfb59c0,fcbfb59c0]
      REF=fcbfb58e0[fcbfb58e0, fcbfb58e0] lnd=fcbfb59d8[fa81f4ca0,f9a3a9ef0]
        LIBRARY OBJECT: object=fb62f8710
        TYPE=PCKG flags=EXS/LOC[0005] pflags= [00] STATUS=VALD LOAD=0
        DEPENDENCIES: COUNT=2 SIZE=16
        ACCESSES: COUNT=1 SIZE=16
        DATA BLOCKS:
        DATA#     heap  pointer STATUS pins CHANGE
        ----- -------- -------- ------ ---- ------
            0 fafa8d3f8 fb62f88b8 I/P/A     0 NONE  
            2 fb62f8830 fb68a95f8 I/-/A     0 NONE  
            4 fb6f38c50 fd2b7d280 I/P/A     1 NONE  
            9 fb6f38ba0 fb6a0f450 I/-/A     0 NONE  
      ----------------------------------------

根据MOS文档ORA-07445 [ptmak()+107] During PL/SQL Debug [ID 726611.1],这个问题应该是DEBUG模式下编译对象导致的。
虽然BUG的信息是10.2上的,但是不排除9.2.0.8上存在相同的问题,不要使用PL/SQL的DEBUG功能,或者禁止过程的DEBUG选项,可以避免这个错误的产生。

Posted in BUG | Tagged , , , | Leave a comment