RAC全局死锁检测时间

对于单实例数据库而言,死锁的检测在秒级完成,而RAC环境则死锁的检测时间默认达到了1分钟。
对于单实例环境如果出现了死锁,那么马上其中一个进程就被中止,用户可以快速的得到错误返回。而对于RAC而言,死锁的检测并不是实时完成,而是需要60秒左右的时间。
会话1执行:

SQL> CREATE TABLE t_deadlock (id NUMBER PRIMARY KEY, name varchar2(30));
TABLE created.
Elapsed: 00:00:00.12
SQL> INSERT INTO t_deadlock VALUES (1, 'a');
1 ROW created.
Elapsed: 00:00:00.00
SQL> INSERT INTO t_deadlock VALUES (2, 'b');
1 ROW created.
Elapsed: 00:00:00.00
SQL> commit;
Commit complete.
Elapsed: 00:00:00.00
SQL> UPDATE t_deadlock SET name = 'a1' WHERE id = 1;
1 ROW updated.
Elapsed: 00:00:00.00

会话2执行:

SQL> SET timing ON
SQL> UPDATE t_deadlock SET name = 'b2' WHERE id = 2;
1 ROW updated.
Elapsed: 00:00:00.00
SQL> UPDATE t_deadlock SET name = 'a2' WHERE id = 1;

此时,会话2等待会话1的最终操作,下面会话1更新被会话2锁定的行,引发死锁:

SQL> UPDATE t_deadlock SET name = 'b1' WHERE id = 2;
UPDATE t_deadlock SET name = 'b1' WHERE id = 2
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting FOR resource
 
Elapsed: 00:01:00.12

可以看到,死锁的超时检测为1分钟。
而这个死锁的检测时间是可以调整的,Oracle通过隐含参数_lm_dd_interval控制:

SQL> conn / AS sysdba
Connected.
SQL> ALTER system SET "_lm_dd_interval" = 30 scope = spfile;
System altered.
SQL> shutdown immediate
DATABASE closed.
DATABASE dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 281018368 bytes
Fixed SIZE 2095672 bytes
Variable SIZE 104859080 bytes
DATABASE Buffers 167772160 bytes
Redo Buffers 6291456 bytes
DATABASE mounted.
DATABASE opened.

再次测试死锁的检测时间,会话1:

SQL> UPDATE t_deadlock SET name = 'a1' WHERE id = 1;
1 ROW updated.
SQL> SET timing ON

会话2执行更新:

SQL> SET timing ON
SQL> UPDATE t_deadlock SET name = 'b2' WHERE id = 2;
1 ROW updated.
Elapsed: 00:00:00.02
SQL> UPDATE t_deadlock SET name = 'a2' WHERE id = 1;

会话1执行更新引发死锁:

SQL> UPDATE t_deadlock SET name = 'b1' WHERE id = 2;
大约30秒后,会话2报错ORA-60<pre lang='SQL'>UPDATE t_deadlock SET name = 'a2' WHERE id = 1
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting FOR resource
 
Elapsed: 00:00:30.27

在10.2.0.2版本上,Oracle存在一个bug,允许这个参数设置为0,在10.2.0.3以后,这个bug被修正,如果设置为0后,则数据库无法正常启动:

[oracle@node1 ~]$ sqlplus / AS sysdba
SQL*Plus: Release 10.2.0.5.0 - Production ON Mon Jun 4 07:54:09 2012
Copyright (c) 1982, 2010, Oracle. ALL Rights Reserved.
Connected TO an idle instance.
SQL> startup
ORA-00067: invalid VALUE 0 FOR parameter _lm_dd_interval; must be at least 1

最后修改隐含参数是Oracle不推荐的,而且修改这个参数势必会影响RAC的正常工作方式导致LDM进程的繁忙度增加,而且可能影响RAC环境的稳定性和可用性。
如果确实对于前台的死锁检查时间要求较高,建议在测试环境中详细测试后再部署到产品环境中。

This entry was posted in ORACLE and tagged , , , , . Bookmark the permalink.

One Response to RAC全局死锁检测时间

  1. Pingback: 设置全局死锁优先级 | yangtingkun

Leave a Reply

Your email address will not be published. Required fields are marked *