RAC instances fail to start after a restart, reporting CRS-1019

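A customer reported that after restarting their RAC environment the cluster came up normally, but the database instances did not start.
According to the customer's description over the phone, Oracle was trying to start instance 2 on node 1 and instance 1 on node 2, which resulted in error CRS-1019.
It was hard to get anything truly useful out of that description, so I asked the customer to send me the detailed error output: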

oracle@orcl1:/home/oracle>crs_start -all
Attempting to start `ora.orcl.orcl1.inst` on member `orcl1`
Attempting to start `ora.orcl.orcl2.inst` on member `orcl2`
Start of `ora.orcl.orcl1.inst` on member `orcl1` failed.
orcl2 : CRS-1019: Resource ora.orcl.orcl1.inst (application) cannot run on orcl2
Start of `ora.orcl.orcl2.inst` on member `orcl2` failed.
orcl1 : CRS-1019: Resource ora.orcl.orcl2.inst (application) cannot run on orcl1
Attempting to start `ora.orcl.db` on member `orcl1`
Start of `ora.orcl.db` on member `orcl1` failed.
Attempting to start `ora.orcl.db` on member `orcl2`
Start of `ora.orcl.db` on member `orcl2` failed.
CRS-1006: No more members to consider
CRS-0215: Could not start resource 'ora.orcl.db'.
CRS-0215: Could not start resource 'ora.orcl.orcl1.inst'.
CRS-0215: Could not start resource 'ora.orcl.orcl2.inst'.

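Clearly the CRS-1019 error the customer mentioned is not the cause of the problem. The most meaningful part of the output above is: Start of `ora.orcl.orcl1.inst` on member `orcl1` failed. The message that follows, saying that instance 1 cannot run on node 2, is only informational; it does not mean that Oracle tried to start instance 1 on node 2.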
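When the output is long, it can help to strip the CRS-1019 noise and keep only the lines that report an actual start failure. The following is a minimal sketch, assuming the crs_start output has been captured as text; `failed_starts` is a hypothetical helper name, not an Oracle tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of Oracle Clusterware): keep only the
# "Start of <resource> on member <node> failed." lines and drop the
# CRS-1019 follow-up messages. Reads the files given as arguments,
# or standard input when no file is given.
failed_starts() {
    # -i tolerates differences in letter case in the captured output
    grep -i -E '^Start of .+ on member .+ failed' "$@"
}
```

For example, `crs_start -all 2>&1 | failed_starts` would list only the resources whose start really failed on their own node.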
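So the task is simple: find out why the instance cannot start. I asked the customer what the database alert log showed and was told that it contained only a single instance startup entry, with no errors and nothing else written.
Sometimes, when a startup is driven by a tool, the error really does not make it into the alert log, so I asked the customer to run STARTUP directly from sqlplus. This time there was a clear error message: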

oracle@orcl2:/u01/app/oracle/admin/orcl/bdump>sqlplus / as sysdba
SQL*Plus: Release 10.2.0.5.0 - Production on Fri Nov 2 17:44:24 2012
Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.
Connected to an idle instance.
SQL> startup mount;
ORA-02194: event specification syntax error 230 (minor error 215) near 'OFF'

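Clearly the cause is a syntax error in the EVENT setting in the SPFILE. This also explains why no error was recorded in the alert log: Oracle hit the error while parsing the initialization parameters, before the startup process had really begun.
The rest is straightforward: have the customer manually create a PFILE, correct the EVENT syntax or comment it out for the time being, then regenerate the SPFILE and restart the database.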
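One way to carry out that repair is sketched below, under stated assumptions. The helper `print_spfile_repair_sql` is hypothetical and only prints the SQL*Plus statements; it does not touch the database. The scratch path `/tmp/initorcl.ora` used in the example call is an illustrative assumption, while `/dev/rspfile` is the SPFILE location reported for this database.

```shell
#!/bin/sh
# Hypothetical helper: prints the statements for an SPFILE repair.
# It runs nothing against the database.
#   $1 = scratch PFILE path (illustrative), $2 = SPFILE location
print_spfile_repair_sql() {
    pfile=$1
    spfile=$2
    cat <<EOF
-- 1. Dump the current SPFILE to an editable text PFILE
CREATE PFILE='${pfile}' FROM SPFILE='${spfile}';
-- 2. Edit ${pfile} by hand: correct the EVENT entry or comment it out
-- 3. Rebuild the SPFILE from the edited PFILE, then restart the instance
CREATE SPFILE='${spfile}' FROM PFILE='${pfile}';
EOF
}
```

Calling `print_spfile_repair_sql /tmp/initorcl.ora /dev/rspfile` prints the statements to run one at a time in a session connected as sysdba. The manual edit in step 2 has to happen between the two statements, so the output is not meant to be run as a single script.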
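I thought the problem was solved, but before long the customer called again. This time instance 2 had started normally, but instance 1 still had a problem: starting it directly in SQLPLUS raised no error, yet it would not start through crs_start.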

2012-11-02 18:38:55.460: [  CRSRES][11628]32ora.orcl.orcl1.inst target set to OFFLINE before stop action
2012-11-02 18:38:55.460: [  CRSRES][11628]32StopResource: setting CLI values
2012-11-02 18:38:55.471: [  CRSRES][11628]32Target set to OFFLINE for `ora.orcl.orcl1.inst`
2012-11-02 18:40:07.862: [  CRSRES][11633]32startRunnable: setting CLI values
2012-11-02 18:40:07.867: [  CRSRES][11633]32Attempting to start `ora.orcl.orcl1.inst` on member `orcl1`
2012-11-02 18:40:09.194: [  CRSAPP][11633]32StartResource error for ora.orcl.orcl1.inst error code = 1
2012-11-02 18:40:09.853: [  CRSRES][11633]32Start of `ora.orcl.orcl1.inst` on member `orcl1` failed.
2012-11-02 18:40:09.865: [  CRSRES][11633]32orcl2 : CRS-1019: Resource ora.orcl.orcl1.inst (application) cannot run on orcl2

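At first I suspected that the initorcl1.ora file under ORACLE_HOME/dbs was wrong and did not point to the correct SPFILE, but the customer checked and found nothing wrong.
Since startup through SQLPLUS worked while startup through CRS_START failed, I suspected that some configuration in the OCR was abnormal, so I asked the customer to check the output of the SRVCTL CONFIG command: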

oracle@orcl1:/u01/app/oracle/product/10.2.0/crs/log/orcl1/crsd>srvctl config database -d orcl -a
orcl1 orcl1 /u01/app/oracle/product/10.2.0/db 
orcl2 orcl2 /u01/app/oracle/product/10.2.0/db 
DB_UNIQUE_NAME: orcl 
DB_NAME: orcl 
ORACLE_HOME: /u01/app/oracle/product/10.2.0/db 
SPFILE: /dev/rspfile 
DOMAIN: NULL 
DB_ROLE: NULL 
START_OPTIONS: NULL 
POLICY:  AUTOMATIC 
ENABLE FLAG: DB ENABLED, INST DISABLED ON orcl1

Obviously, instance 1 is disabled in the OCR configuration, which is why it could not be started through CRS_START.
Run the following command:

srvctl enable instance -d orcl -i orcl1

Problem solved.
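The disabled flag is easy to overlook at the end of the SRVCTL output, so a small filter can make the check explicit. This is a minimal sketch that assumes the `ENABLE FLAG:` line has the single-instance form shown above; how the line looks when several instances are disabled is not shown here, so that case is untested. `disabled_instances` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read "srvctl config database -d <db> -a" output
# (from files or standard input) and print whatever follows
# "INST DISABLED ON" on the ENABLE FLAG line.
disabled_instances() {
    awk '/^ENABLE FLAG:/ && /INST DISABLED ON/ {
        sub(/.*INST DISABLED ON[ \t]*/, "")   # drop everything up to the name
        sub(/[ \t]+$/, "")                    # trim trailing whitespace
        print
    }' "$@"
}
```

For example, `srvctl config database -d orcl -a | disabled_instances` would print `orcl1` for the configuration above, and nothing once the instance has been enabled.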
