11.2单实例ASM启动报错ORA-15186

客户安装了一个Oracle 11.2.0.2 for Linux X64环境,使用了ASM方式,重启后发现数据库没有启动。
检查操作系统进程,发现ASM已经启动,CLUSTER环境也启动完成,但是数据库并未启动,尝试手工启动数据库,报错找不到对应的SPFILE。
登录ASM实例,查询发现找不到任何的ASM磁盘组信息:

[grid@rptdb trace]$ sqlplus / AS sysasm
SQL*Plus: Release 11.2.0.2.0 Production ON Tue DEC 20 18:41:41 2011
Copyright (c) 1982, 2010, Oracle. ALL rights reserved.
 
Connected TO:
Oracle DATABASE 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
WITH the Automatic Storage Management OPTION
SQL> SELECT * FROM v$asm_diskgroup;
no ROWS selected

查询V$ASM_DISK可以看到磁盘信息,但是磁盘头的头状态是UNKNOWN:

SQL> SELECT GROUP_NUMBER, DISK_NUMBER, MOUNT_STATUS, HEADER_STATUS, PATH FROM V$ASM_DISK;
GROUP_NUMBER DISK_NUMBER MOUNT_S HEADER_STATU PATH
------------ ----------- ------- ------------ ----------------------------------
           0           1 CLOSED  UNKNOWN      ORCL:FRAVOL
           0           0 CLOSED  UNKNOWN      ORCL:DATAVOL

显然Oracle并没有去加载磁盘组,利用当前ASM的SPFILE生成PFILE后发现,只有最基本的参数:

*.asm_power_limit=1
*.diagnostic_dest='/u01/app/grid'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'

连ASM_DISKGROUPS参数都没有,难怪查询V$ASM_DISKGROUP找不到任何数据,添加下面的参数:

asm_diskgroups='DATA','FRA'
asm_diskstring='ORCL:*VOL'

重启ASM实例,在加载磁盘组时报错:

[grid@rptdb ~]$ sqlplus / AS sysasm
SQL*Plus: Release 11.2.0.2.0 Production ON Tue DEC 20 18:25:03 2011
Copyright (c) 1982, 2010, Oracle.  ALL rights reserved.
Connected TO:
Oracle DATABASE 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
WITH the Automatic Storage Management OPTION
SQL> shutdown immediate
ORA-15100: invalid OR missing diskgroup name
ASM instance shutdown
SQL> startup pfile=/home/grid/init+ASM.ora
ASM instance started
Total System Global Area  283930624 bytes
Fixed SIZE                  2225792 bytes
Variable SIZE             256539008 bytes
ASM Cache                  25165824 bytes
ORA-15032: NOT ALL alterations performed
ORA-15017: diskgroup "FRA" cannot be mounted
ORA-15063: ASM discovered an insufficient NUMBER OF disks FOR diskgroup "FRA"
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient NUMBER OF disks FOR diskgroup "DATA"

导致这个错误的原因有很多,比如权限,多路径设置,存储设置等等,检查告警日志寻找进一步的信息:

Tue DEC 20 18:25:47 2011
* instance_number obtained FROM CSS = 1, checking FOR the existence OF node 0...
* node 0 does NOT exist. instance_number = 1
Starting ORACLE instance (normal)
****************** Huge Pages Information *****************
Huge Pages memory pool detected (total: 42000 free: 42000)
DFLT Huge Pages allocation successful (allocated: 0)
***********************************************************
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 3
USING LOG_ARCHIVE_DEST_1 parameter DEFAULT VALUE AS /u01/app/grid/product/11.2.0/grid/dbs/arch
Autotune OF undo retention IS turned ON.
IMODE=BR
ILAT =0
LICENSE_MAX_USERS = 0
SYS auditing IS disabled
Starting up:
Oracle DATABASE 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
WITH the Automatic Storage Management OPTION.
USING parameter settings IN client-side pfile /home/grid/init+ASM.ora ON machine rptdb
System parameters WITH non-DEFAULT VALUES:
  large_pool_size          = 12M
  instance_type            = "asm"
  remote_login_passwordfile= "EXCLUSIVE"
  asm_diskstring           = "ORCL:*VOL"
  asm_diskgroups           = "DATA"
  asm_diskgroups           = "FRA"
  asm_power_limit          = 1
  diagnostic_dest          = "/u01/app/grid"
Tue DEC 20 18:25:48 2011
PMON started WITH pid=2, OS id=14718
Tue DEC 20 18:25:48 2011
PSP0 started WITH pid=3, OS id=14722
Tue DEC 20 18:25:49 2011
VKTM started WITH pid=4, OS id=14726 at elevated priority
VKTM running at (1)millisec PRECISION WITH DBRM quantum (100)ms
Tue DEC 20 18:25:49 2011
GEN0 started WITH pid=5, OS id=14732
Tue DEC 20 18:25:49 2011
DIAG started WITH pid=6, OS id=14736
Tue DEC 20 18:25:49 2011
DIA0 started WITH pid=7, OS id=14740
Tue DEC 20 18:25:49 2011
MMAN started WITH pid=8, OS id=14744
Tue DEC 20 18:25:49 2011
DBW0 started WITH pid=9, OS id=14748
Tue DEC 20 18:25:49 2011
LGWR started WITH pid=10, OS id=14752
Tue DEC 20 18:25:49 2011
CKPT started WITH pid=11, OS id=14756
Tue DEC 20 18:25:49 2011
SMON started WITH pid=12, OS id=14760
Tue DEC 20 18:25:49 2011
RBAL started WITH pid=13, OS id=14764
Tue DEC 20 18:25:49 2011
GMON started WITH pid=14, OS id=14768
Tue DEC 20 18:25:49 2011
MMON started WITH pid=15, OS id=14772
Tue DEC 20 18:25:49 2011
MMNL started WITH pid=16, OS id=14776
ORACLE_BASE NOT SET IN environment. It IS recommended
that ORACLE_BASE be SET IN the environment
Tue DEC 20 18:25:49 2011
SQL> ALTER DISKGROUP ALL MOUNT
NOTE: Diskgroups listed IN ASM_DISKGROUPS are
         DATA
         FRA
NOTE: cache registered GROUP DATA NUMBER=1 incarn=0x157c40a1
NOTE: cache began mount (FIRST) OF GROUP DATA NUMBER=1 incarn=0x157c40a1
NOTE: cache registered GROUP FRA NUMBER=2 incarn=0x158c40a2
NOTE: cache began mount (FIRST) OF GROUP FRA NUMBER=2 incarn=0x158c40a2
NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
ORA-15186: ASMLIB error FUNCTION = [asm_open(global)],  error = [1],  mesg = [Operation NOT permitted]
ORA-15025: could NOT OPEN disk "ORCL:DATAVOL"
ORA-15186: ASMLIB error FUNCTION = [asm_open(global)],  error = [1],  mesg = [Operation NOT permitted]
ORA-15025: could NOT OPEN disk "ORCL:FRAVOL"
ERROR: no PST quorum IN GROUP: required 2, found 0
NOTE: cache dismounting (clean) GROUP 1/0x157C40A1 (DATA)
NOTE: dbwr NOT being msg'd to dismount
NOTE: lgwr not being msg'd TO dismount
NOTE: cache dismounted GROUP 1/0x157C40A1 (DATA)
NOTE: cache ending mount (fail) OF GROUP DATA NUMBER=1 incarn=0x157c40a1
NOTE: cache deleting context FOR GROUP DATA 1/0x157c40a1
GMON dismounting GROUP 1 at 2 FOR pid 17, osid 14779
ERROR: diskgroup DATA was NOT mounted
ERROR: no PST quorum IN GROUP: required 2, found 0
NOTE: cache dismounting (clean) GROUP 2/0x158C40A2 (FRA)
NOTE: dbwr NOT being msg'd to dismount
NOTE: lgwr not being msg'd TO dismount
NOTE: cache dismounted GROUP 2/0x158C40A2 (FRA)
NOTE: cache ending mount (fail) OF GROUP FRA NUMBER=2 incarn=0x158c40a2
NOTE: cache deleting context FOR GROUP FRA 2/0x158c40a2
GMON dismounting GROUP 2 at 4 FOR pid 17, osid 14779
ERROR: diskgroup FRA was NOT mounted
ORA-15032: NOT ALL alterations performed
ORA-15017: diskgroup "FRA" cannot be mounted
ORA-15063: ASM discovered an insufficient NUMBER OF disks FOR diskgroup "FRA"
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient NUMBER OF disks FOR diskgroup "DATA"
ERROR: ALTER DISKGROUP ALL MOUNT

这里发现了ORA-15186和ORA-15025错误。而ORA-15186错误发生在ASMLIB调用上,根据这个错误更容易定位到问题的原因。
在MOS上,有专门的文章描述这个问题:Mount ASM Disk Group Fails : ORA-15186, ORA-15025, ORA-15063 [ID 1384504.1],根据问题描述,导致这个错误的原因是多路径的配置存在问题:

[grid@rptdb ~]$ /etc/init.d/oracleasm listdisks
DATAVOL
FRAVOL
[grid@rptdb ~]$ cat /proc/partitions
major minor  #blocks  name
   8     0  285155328 sda
   8     1     104391 sda1
   8     2  285049327 sda2
   8    16 1073741824 sdb
   8    17  322119283 sdb1
   8    18  429497775 sdb2
   8    19  322119315 sdb3
   8    32 1073741824 sdc
   8    33  322119283 sdc1
   8    34  429497775 sdc2
   8    35  322119315 sdc3
   8    48 1073741824 sdd
   8    49  322119283 sdd1
   8    50  429497775 sdd2
   8    51  322119315 sdd3
   8    64 1073741824 sde
   8    65  322119283 sde1
   8    66  429497775 sde2
   8    67  322119315 sde3
 253     0  150896640 dm-0
 253     1  134119424 dm-1
 253     2 1073741824 dm-2
 253     3  322119283 dm-3
 253     4  429497775 dm-4
 253     5  322119315 dm-5
[grid@rptdb ~]$ ls -l /dev/oracleasm/disks
total 0
brw-rw---- 1 grid oinstall 8, 18 Dec 20 17:42 DATAVOL
brw-rw---- 1 grid oinstall 8, 19 Dec 20 17:42 FRAVOL

可以看到,DATAVOL和FRAVOL没有对应到多路径设备dm-n上,而是对应到了sdb2和sdb3上。
修改/etc/sysconfig/oracleasm文件,将ORACLEASM_SCANORDER参数和ORACLEASM_SCANEXCLUDE修改如下:

ORACLEASM_SCANORDER="mpath dm"
ORACLEASM_SCANEXCLUDE="sd"

修改后重启服务器,由于ASM中的SPFILE还是存在问题的,所以先关闭,然后重新启动:

[grid@rptdb ~]$ sqlplus / AS sysasm
SQL*Plus: Release 11.2.0.2.0 Production ON Tue DEC 20 19:05:33 2011
Copyright (c) 1982, 2010, Oracle.  ALL rights reserved.
Connected.
SQL> shutdown abort
ASM instance shutdown
SQL> startup pfile=/home/grid/init+ASM.ora
ASM instance started
Total System Global Area  283930624 bytes
Fixed SIZE                  2225792 bytes
Variable SIZE             256539008 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted
SQL> CREATE spfile='+DATA' FROM pfile='/home/grid/init+ASM.ora';
File created.

磁盘组已经顺利启动,创建一个SPFILE文件,确保下次ASM自动启动可以加载磁盘组。切换到Oracle用户打开数据库:

[root@rptdb ~]# su - oracle
[oracle@rptdb ~]$ sqlplus / AS sysdba
SQL*Plus: Release 11.2.0.2.0 Production ON Tue DEC 20 19:06:27 2011
Copyright (c) 1982, 2010, Oracle.  ALL rights reserved.
Connected TO an idle instance.
SQL> startup
ORACLE instance started.
Total System Global Area 6.4939E+10 bytes
Fixed SIZE                  2242560 bytes
Variable SIZE            3.0467E+10 bytes
DATABASE Buffers         3.4360E+10 bytes
Redo Buffers              109195264 bytes
DATABASE mounted.
DATABASE opened.
SQL> exit
Disconnected FROM Oracle DATABASE 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
WITH the Partitioning, Automatic Storage Management, OLAP, DATA Mining
AND REAL Application Testing options

数据库顺利打开,最后检查一下ASM磁盘的设置:

[oracle@rptdb ~]$ ls -l /dev/oracleasm/disks
total 0
brw-rw---- 1 oracle oinstall 253, 4 Dec 20 19:00 DATAVOL
brw-rw---- 1 oracle oinstall 253, 5 Dec 20 19:00 FRAVOL
[oracle@rptdb ~]$ cat /proc/partitions 
major minor  #blocks  name
   8     0  285155328 sda
   8     1     104391 sda1
   8     2  285049327 sda2
   8    16 1073741824 sdb
   8    17  322119283 sdb1
   8    18  429497775 sdb2
   8    19  322119315 sdb3
   8    32 1073741824 sdc
   8    33  322119283 sdc1
   8    34  429497775 sdc2
   8    35  322119315 sdc3
   8    48 1073741824 sdd
   8    49  322119283 sdd1
   8    50  429497775 sdd2
   8    51  322119315 sdd3
   8    64 1073741824 sde
   8    65  322119283 sde1
   8    66  429497775 sde2
   8    67  322119315 sde3
 253     0  150896640 dm-0
 253     1  134119424 dm-1
 253     2 1073741824 dm-2
 253     3  322119283 dm-3
 253     4  429497775 dm-4
 253     5  322119315 dm-5

调整oracleasm的配置后,多路径的配置恢复正常。

This entry was posted in ORACLE and tagged , , , , , , , , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *