网络服务参考

看完参考手册这篇文档后,意识到即使参考手册也有精读的理由,于是准备细读一下这些参考文档。
网络服务参考中主要介绍网络相关的工具和配置文件,包括监听工具lsnrctl和连接管理控制工具cmctl的命令,以及包含listener.ora、tnsnames.ora、sqlnet.ora以及cman.ora在内的多个参数的详细配置和可配置参数。
对于需要了解参数的默认值以及设置高级参数选项,建议详读该文档。
文档在线阅读地址:http://www.oracle.com/pls/db112/to_toc?pathname=network.112%2Fe10835%2Ftoc.htm&remark=portal+%28Books%29

Posted in BOOKS | Leave a comment

参考手册总结

一直认为这篇文档作为工具书更合适,不过细读的过程中却改变了我的看法。
之所以这篇文档的优先级放得比较低,和这篇文档的厚度也有关系。不过这1200多页的内容却包含了很多Oracle中最关键的信息。
第一部分就是初始化参数的详细描述,即使这篇文档只包含了这部分内容,也值得一看了。
第二部分是静态数据字典视图,第三部分是动态数据字典视图。这两部分的内容除了作为工具书备查之外,也是补充自己知识全面性的重要文档,Oracle的提供的数据字典覆盖了非常全面的信息,而我们平时只利用了很少的一部分,如果能充分利用这些视图,那么解决问题会更加的轻松。
最后在附录部分还有重要的内容,比如所有等待事件的描述。

Posted in BOOKS | Leave a comment

ORA-600(qerrmOStart2)(1756)错误

客户数据库出现ORA-600(qerrmOStart2)(1756)错误。
ORA-600(qerrmOStart2)错误:http://yangtingkun.itpub.net/post/468/527461
之前遇到过一次类似的qerrmOStart2错误,不过上次错误的第二个参数是1740。在上一篇文章中已经提到了,这个600错误只是一个表现,实际上真正的错误是第二个函数中体现的错误。
看一下具体的错误TRACE信息:

*** SERVICE NAME:(SYS$USERS) 2012-05-24 17:23:39.684
*** SESSION ID:(598.169) 2012-05-24 17:23:39.684
*** 2012-05-24 17:23:39.684
ksedmp: internal OR fatal error
ORA-00600: internal error code, arguments: [qerrmOStart2], [1756], [ORA-01756: quoted string NOT properly TERMINATED
], [], [], [], [], []
CURRENT SQL statement FOR this SESSION:
SELECT COUNT(*) FROM (  SELECT * FROM ( 
				 SELECT i_t 交易编号,
        decode(TYPE, '00500', '退货'),
        am,
        decode(flag, 'S', '成功', '失败')
   FROM usr1.t_1@db01
  WHERE s_p = '3702491300'
    AND o_i_t = 96924977
			) t WHERE rownum <= 3000 ) tt 
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedst+001c          bl       ksedst1              000000000 ? FFFFFFFFFFF78A4 ?
ksedmp+0290          bl       ksedst               104A506E8 ?
ksfdmp+0018          bl       03F4BE78             
kgerinv+00dc         bl       _ptrgl               
kgesinv+0020         bl       kgerinv              FFFFFFFFFFF7BF0 ? 000000000 ?
                                                   10032BE9C ? 000000000 ?
                                                   000000000 ?
ksesin+006c          bl       kgesinv              000000000 ? 000000000 ?
                                                   FFFFFFFFFFF7C60 ? 1104B63E8 ?
                                                   11048EFB0 ?
OCIKSIN+0124         bl       ksesin               105383C74 ? 200000002 ?
                                                   000000000 ? 0000006DC ?
                                                   000000001 ? 000000031 ?
                                                   FFFFFFFFFFF7C64 ?
                                                   FFFFFFFFFFF7C94 ?
qerrmOStart+01d4     bl       03F49D20             
qerrmStart+0310      bl       qerrmOStart          000000001 ? 110471728 ?
                                                   110489E88 ?
selexe+09d0          bl       01FB6A3C             
opiexe+26fc          bl       selexe               110489E88 ? 700000917096A00 ?
                                                   102387EB0C8 ?
kpoal8+0edc          bl       opiexe               FFFFFFFFFFFB874 ?
                                                   FFFFFFFFFFFB4E8 ?
                                                   FFFFFFFFFFF9A48 ?
opiodr+0ae0          bl       _ptrgl               
ttcpip+1020          bl       _ptrgl               
opitsk+1124          bl       ttcpip               1100CB0D8 ? FFFFFFFFFFFB990 ?
                                                   000000001 ? 000000000 ?
                                                   FFFFFFFFFFFB910 ? 1100E0428 ?
                                                   000000000 ? 1104D0240 ?
opiino+0990          bl       opitsk               000000000 ? 000000000 ?
opiodr+0ae0          bl       _ptrgl               
opidrv+0484          bl       01FB7E64             
sou2o+0090           bl       opidrv               3C02AB90DC ? 440660000 ?
                                                   FFFFFFFFFFFF7B0 ?
opimai_real+01bc     bl       01FB5C2C             
main+0098            bl       opimai_real          000000000 ? 000000000 ?
__start+0098         bl       main                 000000000 ? 000000000 ?
--------------------- Binary Stack Dump ---------------------

从SQL语句本身分析,并不存在任何的语法错误,所有的单引号都是匹配的,而这种600错误的产生本身也说明不是简单的普通错误。
观察到这个SQL是通过数据库链访问远端表,且SQL语句本身包含了中文,那么出现这个问题就不奇怪了。显然导致问题的原因在于两个数据库的字符集不同,Oracle在进行字符集转换的时候把某个中文错误的分解出了一个单引号,导致语法分析时报错。
避免这个错误的最行之有效的方法就是去掉多字节字符,对于当前的SQL语句而言,只需要替换所有的中文字符,就可以避免这个错误的产生。

Posted in BUG | Tagged , , , , | Leave a comment

discover_server报错OVMAPI_4010E

在VM Manager中搜索VM Server时出现这个错误。
按照VM Server以及VM Manager后,通过指定IP地址,让VM Manager自动寻找VM Server,结果JOB运行报错,详细的错误信息为:

Job Construction Phase
----------------------
BEGIN()
Appended operation 'Discover Manager Server Discover' TO object 'OVM Foundry : Discover Manager'.
commit()
Completed Step: COMMIT
Objects AND Operations
----------------------
Object (IN_USE): [Server] 35:38:33:39:31:34:43:4e:47:31:33:30:53:37:33:42 (server2.zihexin.com)
Object (IN_USE): [DiscoverManager] OVM Foundry : Discover Manager
 Operation: Discover Manager Server Discover
Job Running Phase at 18:05 ON Fri, Nov 25, 2011
----------------------------------------------
Job Participants: []
Actioner
--------
Starting operation 'Discover Manager Server Discover' ON object 'OVM Foundry : Discover Manager'
Setting Context TO model ONLY IN job WITH id=1322215534120
Job Internal Error (Operation)com.oracle.ovm.mgr.api.exception.FailedOperationException: OVMAPI_4010E Attempt TO send command: discover_server TO server: 10.0.10.171 failed. OVMAPI_4004E Server Failed Command: discover_server, STATUS: 
Fri Nov 25 18:05:34 CST 2011
Fri Nov 25 18:05:34 CST 2011
 at com.oracle.ovm.mgr.action.ActionEngine.sendCommandToServer(ActionEngine.java:474)
 at com.oracle.ovm.mgr.action.ActionEngine.sendDiscoverCommand(ActionEngine.java:283)
 at com.oracle.ovm.mgr.action.ServerAction.getServerInfo(ServerAction.java:95)
 at com.oracle.ovm.mgr.discover.ovm.ServerBasicDiscoverHandler.query(ServerBasicDiscoverHandler.java:131)
 at com.oracle.ovm.mgr.discover.ovm.ServerBasicDiscoverHandler.query(ServerBasicDiscoverHandler.java:61)
 at com.oracle.ovm.mgr.discover.ovm.DiscoverHandler.execute(DiscoverHandler.java:50)
 at com.oracle.ovm.mgr.discover.DiscoverEngine.handleDiscover(DiscoverEngine.java:435)
 at com.oracle.ovm.mgr.discover.DiscoverEngine.discoverNewServer(DiscoverEngine.java:345)
 at com.oracle.ovm.mgr.discover.DiscoverEngine.discoverServer(DiscoverEngine.java:265)
 at com.oracle.ovm.mgr.op.manager.DiscoverManagerServerDiscover.action(DiscoverManagerServerDiscover.java:48)
 at com.oracle.ovm.mgr.api.job.JobEngine.operationActioner(JobEngine.java:191)
 at com.oracle.ovm.mgr.api.job.JobEngine.objectActioner(JobEngine.java:257)
 at com.oracle.ovm.mgr.api.job.InternalJobDbImpl.objectCommitter(InternalJobDbImpl.java:1019)
 at com.oracle.odof.core.AbstractVessel.invokeMethod(AbstractVessel.java:223)
 at com.oracle.odof.core.BasicWork.invokeMethod(BasicWork.java:136)
 at com.oracle.odof.command.InvokeMethodCommand.process(InvokeMethodCommand.java:100)
 at com.oracle.odof.core.BasicWork.processCommand(BasicWork.java:81)
 at com.oracle.odof.core.TransactionManager.processCommand(TransactionManager.java:751)
 at com.oracle.odof.core.WorkflowManager.processCommand(WorkflowManager.java:395)
 at com.oracle.odof.core.WorkflowManager.processWork(WorkflowManager.java:453)
 at com.oracle.odof.io.AbstractClient.run(AbstractClient.java:42)
 at java.lang.Thread.run(Thread.java:662)
Caused BY: com.oracle.ovm.mgr.api.exception.IllegalOperationException: OVMAPI_4004E Server Failed Command: discover_server, STATUS: 
Fri Nov 25 18:05:34 CST 2011
 at com.oracle.ovm.mgr.action.ActionEngine.sendAction(ActionEngine.java:752)
 at com.oracle.ovm.mgr.action.ActionEngine.sendCommandToServer(ActionEngine.java:470)
 ... 24 more
FailedOperationCleanup
----------
Starting failed operation 'Discover Manager Server Discover' cleanup ON object 'OVM Foundry : Discover Manager'
Complete ROLLBACK operation 'Discover Manager Server Discover' completed WITH direction=OVM Foundry : Discover Manager
Rollbacker
----------
Objects TO Be Rolled Back
-------------------------
Object (IN_USE): [Server] 35:38:33:39:31:34:43:4e:47:31:33:30:53:37:33:42 (server2.zihexin.com)
Object (IN_USE): [DiscoverManager] OVM Foundry : Discover Manager
Completed Step: ROLLBACK
Job failed commit (internal) due TO OVMAPI_4010E Attempt TO send command: discover_server TO server: 10.0.10.171 failed. OVMAPI_4004E Server Failed Command: discover_server, STATUS: 
Fri Nov 25 18:05:34 CST 2011
Fri Nov 25 18:05:34 CST 2011
com.oracle.ovm.mgr.api.exception.FailedOperationException: OVMAPI_4010E Attempt TO send command: discover_server TO server: 10.0.10.171 failed. OVMAPI_4004E Server Failed Command: discover_server, STATUS: 
Fri Nov 25 18:05:34 CST 2011
Fri Nov 25 18:05:34 CST 2011
 at com.oracle.ovm.mgr.action.ActionEngine.sendCommandToServer(ActionEngine.java:474)
 at com.oracle.ovm.mgr.action.ActionEngine.sendDiscoverCommand(ActionEngine.java:283)
 at com.oracle.ovm.mgr.action.ServerAction.getServerInfo(ServerAction.java:95)
 at com.oracle.ovm.mgr.discover.ovm.ServerBasicDiscoverHandler.query(ServerBasicDiscoverHandler.java:131)
 at com.oracle.ovm.mgr.discover.ovm.ServerBasicDiscoverHandler.query(ServerBasicDiscoverHandler.java:61)
 at com.oracle.ovm.mgr.discover.ovm.DiscoverHandler.execute(DiscoverHandler.java:50)
 at com.oracle.ovm.mgr.discover.DiscoverEngine.handleDiscover(DiscoverEngine.java:435)
 at com.oracle.ovm.mgr.discover.DiscoverEngine.discoverNewServer(DiscoverEngine.java:345)
 at com.oracle.ovm.mgr.discover.DiscoverEngine.discoverServer(DiscoverEngine.java:265)
 at com.oracle.ovm.mgr.op.manager.DiscoverManagerServerDiscover.action(DiscoverManagerServerDiscover.java:48)
 at com.oracle.ovm.mgr.api.job.JobEngine.operationActioner(JobEngine.java:191)
 at com.oracle.ovm.mgr.api.job.JobEngine.objectActioner(JobEngine.java:257)
 at com.oracle.ovm.mgr.api.job.InternalJobDbImpl.objectCommitter(InternalJobDbImpl.java:1019)
 at sun.reflect.GeneratedMethodAccessor1001.invoke(UNKNOWN SOURCE)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at com.oracle.odof.core.AbstractVessel.invokeMethod(AbstractVessel.java:223)
 at com.oracle.odof.core.BasicWork.invokeMethod(BasicWork.java:136)
 at com.oracle.odof.command.InvokeMethodCommand.process(InvokeMethodCommand.java:100)
 at com.oracle.odof.core.BasicWork.processCommand(BasicWork.java:81)
 at com.oracle.odof.core.TransactionManager.processCommand(TransactionManager.java:751)
 at com.oracle.odof.core.WorkflowManager.processCommand(WorkflowManager.java:395)
 at com.oracle.odof.core.WorkflowManager.processWork(WorkflowManager.java:453)
 at com.oracle.odof.io.AbstractClient.run(AbstractClient.java:42)
 at java.lang.Thread.run(Thread.java:662)
Caused BY: com.oracle.ovm.mgr.api.exception.IllegalOperationException: OVMAPI_4004E Server Failed Command: discover_server, STATUS: 
Fri Nov 25 18:05:34 CST 2011
 at com.oracle.ovm.mgr.action.ActionEngine.sendAction(ActionEngine.java:752)
 at com.oracle.ovm.mgr.action.ActionEngine.sendCommandToServer(ActionEngine.java:470)
 ... 24 more
----------
END OF Job
----------

由于关键性信息确实,所以无法判断导致错误的原因。即使是在metalink或GOOGLE中查询,也得不到任何有价值的信息。
虽然在VM Manager中得不到有意义的信息,但是在VM Server上,却可以得到更详细的信息,通过检查var/log/ovs-agent.log文件,获取到下面的信息:

[2011-04-16 13:21:46 25970] ERROR (OVSAgentServer:108) Unauthorized access attempt FROM ('10.0.10.173', 59424)!
Traceback (most recent CALL LAST):
  File "/opt/ovs-agent-3.0/OVSAgentServer.py", line 103, IN do_POST
    auth(username, password)
  File "/opt/ovs-agent-3.0/OVSAgentServer.py", line 42, IN auth
    raise Exception('Authorization failed: user does not exist or password error.')
Exception: Authorization failed: USER does NOT exist OR password error.
[2011-04-16 13:21:46 25970] INFO (OVSAgentServer:169) code 403, message Unauthorized access attempt FROM ('10.0.10.173', 59424)!

这次信息就明确多了,显然是由于VM Manager中配置的密码不正确所致,在VM Server上修改oracle用户密码:

[root@server2 ~]# ovs-agent-passwd oracle
Password: 
Again:

在搜索VM Server时使用这里修改的密码,VM Manager成功的发现了VM Server信息。

Posted in VM | Tagged , , , | Leave a comment

Oracle VM Server安装手册

简单描述一下Oracle VM Server的安装过程。
需要注意,VM 3.0以上版本才支持升级操作,因此在VM 2.2没有办法升级到当前版本,安装VM 3.0将会删除服务器上所有的数据。
将VM Server的光盘放入,并从光盘启动服务器。
在启动界面直输入Enter开始安装过程:
Oracle会提示是否监测截至,这里可以直接SKIP跳过;
键盘选择:选择us;
然后是版权声明,选择Accept后,开始正式的安装步骤;
如果服务器上没有系统,那么会直接进入后面的分区阶段,否则会提示重装系统还是在原有系统上升级;
选择ReInstall后,会显示当前系统磁盘分区信息,首先选择准备进行系统安装的分区,然后选择Remove all partitions and create a new default partition layout,Oracle在格式化分区之前会要求再次确认,并询问是否预览分区空间详细配置,可以完全按照默认推荐值安装,因此这里可以跳过,也可以进入到分区空间修改页面进行自定义的修改;
随后选择Boot Loader配置,选择Master Boot Record;
然后选择一个管理网络接口,手工输入IP和掩码,在下一个页面输入网关、DNS信息,接着是主机名信息;
配置服务器所在时区,配置中找不到北京,可以设置Asia/Shanghai代替;
分别输入Agent密码和root密码后,安装操作完成,这是会提示整个安装的日志文件的位置。
在重启界面选择REBOOT,完成整个安装过程。
启动后,进入Oracle VM Server 3.0控制台界面,可以通过Alt + F2进入linux的登录界面。至此VM Server安装完成。

Posted in VM | Leave a comment

ORA-600(qersqCloseRem-2)错误

客户的10.2.0.4 RAC for Hp-un环境碰到了这个错误。
错误信息为:

Wed Feb 29 19:42:05 2012
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_11261.trc:
ORA-00600: internal error code, arguments: [qersqCloseRem-2], [Invalid Handle], [], [], [], [], [], []
ORA-02068: following severe error FROM WEBDB.COM
ORA-03113: end-of-file ON communication channel
Wed Feb 29 19:42:05 2012
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_32036.trc:
ORA-00600: internal error code, arguments: [qersqCloseRem-2], [Invalid Handle], [], [], [], [], [], []
ORA-02068: following severe error FROM WEBDB.COM
ORA-03113: end-of-file ON communication channel
Wed Feb 29 19:42:05 2012
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_5935.trc:
ORA-00600: internal error code, arguments: [qersqCloseRem-2], [Invalid Handle], [], [], [], [], [], []
ORA-02068: following severe error FROM WEBDB.COM
ORA-03113: end-of-file ON communication channel
Wed Feb 29 19:42:05 2012
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_5026.trc:
ORA-00600: internal error code, arguments: [qersqCloseRem-2], [Invalid Handle], [], [], [], [], [], []
ORA-02068: following severe error FROM WEBDB.COM
ORA-03113: end-of-file ON communication channel
Wed Feb 29 19:42:05 2012
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_7620.trc:
ORA-00600: internal error code, arguments: [qersqCloseRem-2], [Invalid Handle], [], [], [], [], [], []
ORA-02068: following severe error FROM WEBDB.COM
ORA-03113: end-of-file ON communication channel
Wed Feb 29 19:42:08 2012
Trace dumping IS performing id=[cdmp_20120229194207]
Wed Feb 29 19:42:17 2012
Trace dumping IS performing id=[cdmp_20120229194217]

这个ORA-600[qersqCloseRem-2]错误非常罕见,在MOS上居然没有任何记载。不过从错误信息进行进一步的分析,这个错误发生在远端数据库的访问异常。
检查进一步的详细信息:

*** 2012-02-29 19:42:05.564
ksedmp: internal OR fatal error
ORA-00600: internal error code, arguments: [qersqCloseRem-2], [Invalid Handle], [], [], [], [], [], []
ORA-02068: following severe error FROM WEBDB.COM
ORA-03113: end-of-file ON communication channel
CURRENT SQL statement FOR this SESSION:
SELECT ACCESS_LOG_SEQUENCE.NEXTVAL@WEBDB.COM FROM DUAL 
----- PL/SQL Call Stack -----
  object      line  object
  handle    NUMBER  name
0x39b5c3720         5  ECOMMERCE.P_USER_AT
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedst()+31          CALL     ksedst1()            000000000 ? 000000001 ?
                                                   7FBFFF4370 ? 7FBFFF43D0 ?
                                                   7FBFFF4310 ? 000000000 ?
ksedmp()+610         CALL     ksedst()             000000000 ? 000000001 ?
                                                   7FBFFF4370 ? 7FBFFF43D0 ?
                                                   7FBFFF4310 ? 000000000 ?
ksfdmp()+21          CALL     ksedmp()             000000003 ? 000000001 ?
                                                   7FBFFF4370 ? 7FBFFF43D0 ?
                                                   7FBFFF4310 ? 000000000 ?
.
.
.
                                                   0059DF200 ? 683F6E400000001 ?
main()+116           CALL     opimai_real()        000000002 ? 7FBFFFF4E0 ?
                                                   000000004 ? 7FBFFFF478 ?
                                                   0059DF200 ? 683F6E400000001 ?
__libc_start_main()  CALL     main()               000000002 ? 7FBFFFF4E0 ?
+219                                               000000004 ? 7FBFFFF478 ?
                                                   0059DF200 ? 683F6E400000001 ?
_start()+42          CALL     __libc_start_main()  0007139F8 ? 000000002 ?
                                                   7FBFFFF628 ? 0052B4BD0 ?
                                                   000000000 ? 000000002 ?
 
--------------------- Binary Stack Dump ---------------------

从详细TRACE分析,在问题发生时刻,正在通过数据库链读取远端序列的值。而此时出现的ORA-3113通信错误,多半与远端数据库状态异常有关。
检查远端数据库的告警日志,果然发现在问题出现时刻,数据库状态异常并最终导致了实例重启:

Wed Feb 29 19:39:29 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:39:30 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:40:01 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:40:01 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:40:01 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:40:01 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:40:01 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:40:01 2012
WARNING: inbound connection timed OUT (ORA-3136)
.
.
.
Wed Feb 29 19:43:28 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:43:28 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:43:28 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:43:28 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:43:29 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:43:29 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:43:29 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:43:29 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:43:30 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:45:26 2012
PMON failed TO acquire latch, see PMON dump
Wed Feb 29 19:46:32 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:46:33 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:46:34 2012
PMON failed TO acquire latch, see PMON dump
Wed Feb 29 19:46:40 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:46:43 2012
WARNING: inbound connection timed OUT (ORA-3136)
Wed Feb 29 19:46:44 2012
Errors IN file /opt/app/oracle/admin/orcl/bdump/orcl1_asmb_14614.trc:
ORA-15064: communication failure WITH ASM instance
ORA-03113: end-of-file ON communication channel
Wed Feb 29 19:46:44 2012
ASMB: terminating instance due TO error 15064
Wed Feb 29 19:46:44 2012
System state dump IS made FOR LOCAL instance
System State dumped TO trace file /opt/app/oracle/admin/orcl/bdump/orcl1_diag_14555.trc
Wed Feb 29 19:46:47 2012
Shutting down instance (abort)
License high water mark = 1623
Wed Feb 29 19:46:49 2012
Instance TERMINATED BY ASMB, pid = 14614
Wed Feb 29 19:46:52 2012
Instance TERMINATED BY USER, pid = 3684

显然远端数据库状态异常是这个ORA-600错误的直接原因。

Posted in BUG | Tagged , , , , , | Leave a comment

ORA-1595和ORA-1594错误

Oracle 9i上使用自动管理回滚的错误,简单记录一下。
错误信息为:

Sat May 12 21:54:17 2012
Errors IN file /oracle/app/admin/prmdb/bdump/prmdb2_smon_483522.trc:
ORA-01595: error freeing extent (2) OF ROLLBACK segment (19))
ORA-01594: attempt TO wrap INTO ROLLBACK segment (19) extent (2) which IS being freed

数据库环境为9208 RAC for Aix,跟进MOS文档With AUM Enabled ORA-01594 and ORA-01595 Found in the alert.log [ID 280151.1]的描述,导致问题的原因是在自动回滚管理系统中,如果SMON在尝试收缩一个回滚段时,有新的事务导致回滚段需要扩展,那么这个回收的操作就会报错。因此,可以认为这是一个正常的信息,而非是错误提示,可以简单的忽略这个问题。
Oracle在10g中已经解决了这个问题,在9i中,可以尝试添加更多的回滚空间来解决问题。

Posted in BUG | Tagged , , , , | Leave a comment

ORA-600(1265)错误

客户的数据库出现ORA-600错误,错误函数为1265。
数据库版本为10.2.0.4 for Linux,错误信息为:

Fri Aug 26 22:00:11 2011
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_16655.trc:
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
Fri Aug 26 22:00:13 2011
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_16655.trc:
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
Fri Aug 26 22:00:13 2011
Trace dumping IS performing id=[cdmp_20110826220013]
Fri Aug 26 22:00:14 2011
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_16655.trc:
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
ORA-06512: at "USER1.P_PRO", line 5
ORA-04088: error during execution OF TRIGGER 'USER1.P_PRO'
Fri Aug 26 22:00:15 2011
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_16655.trc:
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
ORA-06512: at "USER1.P_PRO ", line 5
ORA-04088: error during execution OF TRIGGER 'USER1.P_PRO'
ORA-06512: at "USER1.U_PRO ", line 25
Fri Aug 26 22:00:17 2011
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_16655.trc:
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
ORA-02067: TRANSACTION OR SAVEPOINT ROLLBACK required
Fri Aug 26 22:00:18 2011
Errors IN file /opt/app/oracle/admin/orcl/udump/orcl1_ora_16655.trc:
ORA-00600: internal error code, arguments: [17281], [600], [0x2E134EEC0], [], [], [], [], []
ORA-00600: internal error code, arguments: [1265], [0x42180EA78], [], [], [], [], [], []
ORA-02067: TRANSACTION OR SAVEPOINT ROLLBACK required

这个错误是ORA-600[1265]错误引发的,随后还出现了ORA-600[17281]、ORA-4088和ORA-2067错误。其中ORA-2067的描述为:

$ oerr ora 2067
02067, 00000, "transaction or savepoint rollback required"
// *Cause: A failure (typically a TRIGGER OR stored PROCEDURE WITH multiple
// remote updates) occurred such that the all-or-nothing execution
// OF a previous Oracle CALL cannot be guaranteed.
// *Action: ROLLBACK TO a previous SAVEPOINT OR ROLLBACK the TRANSACTION
// AND resubmit.

从这个描述和Oracle的报错信息不难判断,Oracle在通过触发器更新远端表时引发了这个600错误。
根据Oracle的MOS文档Bug 5655419 Distributed transaction hits ORA-600:[1265] or ORA-600:[k2gget: downgrade] in 10.2的描述,这个错误和分布式事务有关,确认影响的版本就是当前环境的10.2.0.4。这个错误的产生一般与窗口维护有关,可以看到问题的发生时刻恰好是22点,从这个时刻开始,Oracle进入维护窗口,进行空间回收统计信息收集等后台工作,显然就是因为窗口的变化导致了这个错误的产生。
Oracle在11.1.0.6中FIXED了这个bug。除了版本升级外,可以考虑将包含分布式事务修改的程序放到远离时间窗口改变时间。

Posted in BUG | Tagged , , , , | Leave a comment

ORA-600(kclfusion_1)错误

10.2.0.4 RAC for HP-UX,出现600错误并导致实例崩溃。
数据库错误信息如下:

Thu Oct 29 19:49:19 2009
SMON: enabling cache recovery
Thu Oct 29 19:49:19 2009
LNS: Standby redo logfile selected FOR thread 2 SEQUENCE 48 FOR destination LOG_ARCHIVE_DEST_2
Thu Oct 29 19:49:19 2009
ARC0: Standby redo logfile selected FOR thread 2 SEQUENCE 47 FOR destination LOG_ARCHIVE_DEST_2
Thu Oct 29 19:49:20 2009
Successfully onlined Undo Tablespace 1.
Thu Oct 29 19:49:20 2009
SMON: enabling tx recovery
Thu Oct 29 19:49:20 2009
DATABASE Characterset IS ZHS16GBK
Opening WITH internal Resource Manager plan
WHERE NUMA PG = 1, CPUs = 8
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started WITH pid=37, OS id=10359
Thu Oct 29 19:49:23 2009
Completed: ALTER DATABASE OPEN
Thu Oct 29 20:00:36 2009
Errors IN file /u01/app/oracle/admin/orcl3/bdump/orcl32_dbw2_10125.trc:
ORA-00600: internal error code, arguments: [kclfusion_1], [], [], [], [], [], [], []
Thu Oct 29 20:00:37 2009
Trace dumping IS performing id=[cdmp_20091029200037]
Thu Oct 29 20:00:37 2009
Errors IN file /u01/app/oracle/admin/orcl3/bdump/orcl32_dbw2_10125.trc:
ORA-00600: internal error code, arguments: [kclfusion_1], [], [], [], [], [], [], []
Thu Oct 29 20:00:37 2009
DBW2: terminating instance due TO error 471
Thu Oct 29 20:00:37 2009
Errors IN file /u01/app/oracle/admin/orcl3/bdump/orcl32_lms0_10115.trc:
ORA-00471: DBWR process TERMINATED WITH error
Thu Oct 29 20:00:37 2009
Errors IN file /u01/app/oracle/admin/orcl3/bdump/orcl32_lms1_10117.trc:
ORA-00471: DBWR process TERMINATED WITH error
Thu Oct 29 20:00:37 2009
Errors IN file /u01/app/oracle/admin/orcl3/bdump/orcl32_lgwr_10129.trc:
ORA-00471: DBWR process TERMINATED WITH error
Thu Oct 29 20:00:38 2009
Shutting down instance (abort)
License high water mark = 102
Thu Oct 29 20:00:43 2009
Instance TERMINATED BY DBW2, pid = 10125
Thu Oct 29 20:00:43 2009
Instance TERMINATED BY USER, pid = 14882

对于这个问题,MOS上描述极为简单:Bug 4637902 OERI[kclfusion_1] in RAC。除了说明这个bug影响10.1.0.5、10.2.0.3和10.2.0.4,并在10.2.0.4.1、10.2.0.5和11.1.0.6中被fixed以外,对于错误的描述只有一句话,ORA-600[kclfusion_1]可能发生在RAC环境中。
由于缺少进一步的信息,如果碰到了这个问题,升级或打PSU可能是唯一的选择。

Posted in BUG | Tagged , , , | Leave a comment

ORA-600(KSSRMP1)错误

10.2.0.4 RAC的ASM实例出现这个错误。
错误信息为:

Mon Mar 14 11:45:00 2011
Errors IN file /home/oracle/admin/+ASM/udump/+asm2_ora_717148.trc:
ORA-00600: internal error code, arguments: [KSSRMP1], [], [], [], [], [], [], []
Mon Mar 14 11:45:01 2011
Trace dumping IS performing id=[cdmp_20110314114501]
Mon Mar 14 11:45:06 2011
Errors IN file /home/oracle/admin/+ASM/udump/+asm2_ora_700424.trc:
ORA-00600: internal error code, arguments: [KSSRMP1], [], [], [], [], [], [], []
Mon Mar 14 11:45:07 2011
Trace dumping IS performing id=[cdmp_20110314114507]

根据文档Bug 7308941 – ORA-600 [KSSRMP1] in ASM [ID 7308941.8]的描述,这个错误是由于在ASM创建文件失败导致的。
该bug影响的版本是10.2.0.4,在10.2.0.5中已经被修复。

Posted in BUG | Tagged , , , | Leave a comment