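In a customer's RAC environment, after one node was restarted, the other node began reporting IPC send timeout messages.
The detailed error messages are: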
Wed May 2 22:07:00 2012
IPC Send timeout detected.Sender: ospid 20808
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:02 2012
IPC Send timeout detected.Sender: ospid 6677
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:09 2012
IPC Send timeout detected.Sender: ospid 16758
Receiver: inst 1 binc 1718096035 ospid 16261
Wed May 2 22:07:13 2012
IPC Send timeout detected.Sender: ospid 8947
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:13 2012
IPC Send timeout detected.Sender: ospid 6583
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:31 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 132
Wed May 2 22:07:31 2012
IPC Send timeout detected.Sender: ospid 17068
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:34 2012
Communications reconfiguration: instance_number 1
Wed May 2 22:07:34 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 154
Wed May 2 22:07:45 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 64
Wed May 2 22:07:45 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 95
Wed May 2 22:07:54 2012
IPC Send timeout detected.Sender: ospid 21078
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:59 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 24
Wed May 2 22:08:04 2012
Trace dumping is performing id=[cdmp_20120502220729]
Wed May 2 22:08:24 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 146
Wed May 2 22:08:36 2012
Trace dumping is performing id=[cdmp_20120502220805]
Wed May 2 22:08:38 2012
Trace dumping is performing id=[cdmp_20120502220805]
Wed May 2 22:10:55 2012
Evicting instance 1 from cluster
Wed May 2 22:11:32 2012
Waiting for instances to leave: 1
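These messages are not normal. A search on MOS shows that they are caused by a bug; for a description of the problem, see: 'IPC Send Timeout Detected' errors between QMON Processes after RAC reconfiguration [ID 458912.1].
For the current 10.2.0.4 environment, the fix is to apply the patch for Bug 6200820, while 10.2.0.3 requires Patch 6326889.
MOS lists quite a few similar IPC timeout problems. Most of them affect 10.2.0.4 and most are fixed in 10.2.0.5, so if this problem occurs frequently, upgrading to 10.2.0.5 is also a good option.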
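To judge how frequent the problem really is, it can help to tally the "IPC Send timeout detected" messages per receiving process. The following is only a minimal sketch, assuming the message layout seen in the alert log excerpt in this post; the function name count_ipc_timeouts and the sample string are hypothetical and not part of any Oracle tool.

```python
import re
from collections import Counter

# Matches the "detected" form of the message. \s* and \s+ allow the
# Receiver part to sit on the same line or on the following line.
TIMEOUT_RE = re.compile(
    r"IPC Send timeout detected\.\s*Sender: ospid (\d+)\s+"
    r"Receiver: inst (\d+) binc \d+ ospid (\d+)"
)


def count_ipc_timeouts(alert_text):
    """Count timeouts keyed by (receiver instance, receiver ospid)."""
    counts = Counter()
    for _sender, inst, receiver in TIMEOUT_RE.findall(alert_text):
        counts[(int(inst), int(receiver))] += 1
    return counts


# Hypothetical sample: three entries copied from the excerpt above.
sample = (
    "Wed May 2 22:07:09 2012 IPC Send timeout detected.Sender: ospid 16758 "
    "Receiver: inst 1 binc 1718096035 ospid 16261 "
    "Wed May 2 22:07:13 2012 IPC Send timeout detected.Sender: ospid 8947 "
    "Receiver: inst 1 binc 1718095761 ospid 16263 "
    "Wed May 2 22:07:13 2012 IPC Send timeout detected.Sender: ospid 6583 "
    "Receiver: inst 1 binc 1718095761 ospid 16263"
)

print(count_ipc_timeouts(sample).most_common())
# prints [((1, 16263), 2), ((1, 16261), 1)]
```

In the excerpt above, six of the seven "detected" messages name receiver ospid 16263 on instance 1, so a tally like this makes it easy to see whether the timeouts concentrate on one process. The "IPC Send timeout to ..." variant is deliberately not counted here, since it carries no receiver ospid.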