2012年4月数据库技术大会第一天

今年是ITPUB组织数据库大会的第三年了,而我却是第一次参加。
PUB组织的架构师大会倒是一次都没有拉过,而前两次数据库大会却阴差阳错的都错过了。而且这也是我第一次在数据库大会上分享主题。当然这时废话,大会都没参与过,何来的主题分享。
虽然以前在Oracle的圆桌会议以及ACOUG上都分享过不同的主题,不过在这么大规模的会议上发表演讲,还确实是头一次。在加上最近一段时间工作上的事情比较多,因此演讲和PPT的准备时间都很紧张,甚至是今天的早上,都还在修改PPT。内容主要是一些比较有特点的案例的分享,并且希望通过案例的解决过程带给大家一些思考。
今天除了准备自己的演讲外,上午还去听了三思的演讲。小伙已经从原来的STEP BY STEP的实践阶段成长到设计符合特定应用场景的高可用架构的阶段了,而且目前身兼Oracle和MySql两大绝技,可谓博采众家、内外兼修。
下午和我同一个会场的基本上都是熟人,郑保卫、侯圣文和刘磊分别带来了精彩的演讲,使得我原本计划抽空去Mysql和DB2的会场取经的计划落空了。
晚上是IT168/ITPUB组织的晚宴,基本上演讲的嘉宾和PUB的版主都到齐了,可以说每年PUB组织的数据库大会,也是版主们每年最大的聚会,很多平时外地难得一见的版主汇聚一堂。只是可惜老虎的龙虾仍然是神“龙”首尾都不见。
最后给出我分享主题的PPT下载地址:https://yangtingkun.net/wp-content/uploads/2012/04/Think-Different.pptx

Posted in NEWS | Leave a comment

RMAN-600(8201)错误的解决

根据错误信息的判断,找到一个不重建控制文件的解决方案。
RMAN-600(8201)错误:https://yangtingkun.net/?p=690
RMAN-600(8201)错误的重现:https://yangtingkun.net/?p=716
其实解决方法很简单,由于导致错误的原因是目前的主库做过FAILOVER,导致RESETLOGS的时间比当前控制文件中的RESETLOGS时间要大,Oracle在进行RMAN同步的时候判断不严谨,导致数据同步不完整,并最终体现在一些RMAN命令上,比如SHOW ALL或BACKUP等。
这个问题只影响10.2.0.3及以下版本,因此升级是可以解决问题的。
此外,由于导致问题的根源在于控制文件中的ORPHAN记录,因此手工重建控制文件也是确实可行的方案。
其实除了这些方法外,还有一个办法。就是利用备库或之前备份的控制文件来创建CATALOG。由于只是当前的主库经历过FAILOVER,也就是RESETLOGS的操作,而原来的主库也就是现在的备库并没有经历这个操作,因此用之前的主库也就是目前的备库的控制文件来注册CATALOG就可以屏蔽这个错误。
测试将备库SWITCHOVER为主库,然后通过REGISTER DATABASE命令,可以顺利的进行CATALOG的同步。再次执行SWITCHOVER,切换当前主库为问题主库,这时通过问题主库连接刚才同步后的CATALOG,此时运行SHOW ALL和BACKUP命令都不会再出现错误。
这种方法可以临时的解决这个问题,但是一旦需要对问题主库执行完全同步,则问题可能会重现。
除了使用STANDBY控制文件外,如果可以找到数据库FAILOVER之前的备份控制文件,采用类似的方法,同样可以解决这个问题。

Posted in BUG | Tagged , , , , , , , | Leave a comment

DML_LOCKS设置为0导致SHUTDOWN IMMEDIATE失败

DML_LOCKS参数设置系统中允许DML锁的数量,如果将这个参数设置为0,会禁止ENQUEUE的产生,有可能轻微的提升系统的性能。
不过如果将这个参数设置为0,带来的后果会严重很多:

SQL> SHOW parameter dml_locks
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
dml_locks INTEGER 264
SQL> ALTER system SET dml_locks = 0 scope = spfile;
System altered.
SQL> shutdown immediate
DATABASE closed.
DATABASE dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 524288000 bytes
Fixed SIZE 2097560 bytes
Variable SIZE 150998632 bytes
DATABASE Buffers 369098752 bytes
Redo Buffers 2093056 bytes
DATABASE mounted.
DATABASE opened.
SQL> conn test/test
Connected.
SQL> CREATE TABLE t_dml (id NUMBER);
TABLE created.
SQL> INSERT INTO t_dml VALUES (1);
1 ROW created.
SQL> UPDATE t_dml SET id = 2;
1 ROW updated.
SQL> DELETE t_dml;
1 ROW deleted.
SQL> commit;
Commit complete.
SQL> CREATE INDEX i_t_dml_id ON t_dml(id);
CREATE INDEX i_t_dml_id ON t_dml(id)
*
ERROR at line 1:
ORA-00062: DML full-TABLE LOCK cannot be acquired; DML_LOCKS IS 0
 
SQL> TRUNCATE TABLE t_dml;
TRUNCATE TABLE t_dml
*
ERROR at line 1:
ORA-00062: DML full-TABLE LOCK cannot be acquired; DML_LOCKS IS 0
 
SQL> DROP TABLE t_dml purge;
DROP TABLE t_dml purge
*
ERROR at line 1:
ORA-00062: DML full-TABLE LOCK cannot be acquired; DML_LOCKS IS 0

可以看到,并非是DML_LOCKS设置为0,就不能执行DML操作了,事实上DML仍然可以执行。相反由于大部分的DDL在执行过程中需要对表加锁,因此大部分DDL语句在DML_LOCKS设置为0后不能在运行了。
收影响的操作不只是DDL而已,连SHUTDOWN IMMEDIATE都不能幸免:

SQL> conn / AS sysdba
Connected.
SQL> shutdown immediate
ORA-00062: DML full-TABLE LOCK cannot be acquired; DML_LOCKS IS 0
SQL> shutdown abort
ORACLE instance shut down.

也就是说设置DML_LOCKS后,只能使用ABORT方式关闭数据库了。

Posted in ORACLE | Tagged , , , , , , , | Leave a comment

隐含参数_DISABLE_LOGGING导致数据库的损坏

Oracle不推荐修改隐含参数,尤其是对于你不了解的隐含参数,修改的后果可能很严重。

比如如果设置了_DISABLE_LOGGING参数为TRUE:

_disable_logging = TRUE

那么启动数据库后进行任何的操作,在关闭数据库后,就会发现数据库出现了损坏:

SQL> ALTER system checkpoint;
System altered.
SQL> shutdown abort
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 6.2268E+10 bytes
Fixed SIZE 2243512 bytes
Variable SIZE 4160750664 bytes
DATABASE Buffers 5.7982E+10 bytes
Redo Buffers 122974208 bytes
DATABASE mounted.
ORA-19821: an intentionally corrupt log file was found

即使在关闭数据库之前做过CHECKPOINT操作,但是数据库关闭后,仍然无法正常打开,数据库启动后会出现ORA-19821错误。
对于这个错误,通过隐含参数_ALLOW_RESETLOGS_CORRUPTION也无法直接打开。因此,即使是在测试环境中,类似_DISABLE_LOGGING之类的参数也不要随便设置。

Posted in ORACLE | Tagged , , , , , | Leave a comment

调高SCN的方法

有的时候出于恢复或其他目的的需要,需要将现有的SCN提升。

在10g以前,一般使用event的方式:

SQL> ALTER SESSION SET events '10015 trace name adjust_scn level 1'; 
SESSION altered.

在MOUNT状态下运行上面的SQL,可以根据调整SCN的需要设置不同的LEVEL。此外还存在类似的方法:

ALTER system SET events 'immediate trace name adjust_scn level 10;

除了EVENT的设置外,还有一个隐藏参数也可以轻易的达到目标。使用_minimum_giga_scn参数,可以将SCN提高1G的倍数:

SQL> SELECT current_scn FROM v$database;
CURRENT_SCN
-----------
    5310856
SQL> ALTER system SET "_minimum_giga_scn" = 1 scope = spfile;
System altered.
SQL> shutdown immediate
DATABASE closed.
DATABASE dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area  524288000 bytes
Fixed SIZE                  2097560 bytes
Variable SIZE             150998632 bytes
DATABASE Buffers          369098752 bytes
Redo Buffers                2093056 bytes
DATABASE mounted.
DATABASE opened.
SQL> SELECT current_scn FROM v$database;
CURRENT_SCN
-----------
 1073741909
SQL> SELECT * FROM v$version;
BANNER
----------------------------------------------------------------
Oracle DATABASE 10g Enterprise Edition Release 10.2.0.5.0 - 64bi
PL/SQL Release 10.2.0.5.0 - Production
CORE    10.2.0.5.0      Production
TNS FOR Linux: Version 10.2.0.5.0 - Production
NLSRTL Version 10.2.0.5.0 - Production

使用这种方法,就可以轻易的提升SCN到指定大小。

Posted in NEWS | Tagged , , , | Leave a comment

opatch命令报错Inventory is corrupted

测试环境中执行opatch报错,简单描述一下解决过程。
错误信息如下:

[orat3@hpserver2 OPatch]$ ./opatch lsinventory
Invoking OPatch 10.2.0.5.1
Oracle Interim Patch Installer version 10.2.0.5.1
Copyright (c) 2010, Oracle Corporation.  ALL rights reserved.
Oracle Home       : /t3/orat3/product/10.2.0/db_4.10
Central Inventory : /u01/app/oracle/oraInventory
   FROM           : /etc/oraInst.loc
OPatch version    : 10.2.0.5.1
OUI version       : 10.2.0.4.0
OUI location      : /t3/orat3/product/10.2.0/db_4.10/oui
Log file location : /t3/orat3/product/10.2.0/db_4.10/cfgtoollogs/opatch/opatch2012-04-11_14-26-55PM.log
Patch history file: /t3/orat3/product/10.2.0/db_4.10/cfgtoollogs/opatch/opatch_history.txt
OPatch failed TO locate Central Inventory.
Possible causes are: 
    The Central Inventory IS corrupted
    The oraInst.loc file specified IS NOT valid.
LsInventorySession failed: OPatch failed TO locate Central Inventory.
Possible causes are: 
    The Central Inventory IS corrupted
    The oraInst.loc file specified IS NOT valid.
OPatch failed WITH error code 73

根据错误信息可以判断,Inventory的设置有误:/u01/app/oracle/oraInventory目录已经不存在了。问题出在oraInst.loc文件上,而提示信息也显示,当前操作是从/etc/oraInst.loc中获取信息。
由于这是测试环境,对数据库的ORACLE_HOME以及ORACLE_BASE都进行过迁移,因此opatch命令尝试找一个不存在的Inventory时报错。
找到/etc/oraInst.loc,将其指向一个正确的目录:

[root@hpserver2 ~]# cd /etc/
[root@hpserver2 etc]# vi oraInst.loc 
inventory_loc=/t0/orat0/app/oracle/oraInventory
inst_group=oinstall

将oraInst中inventory指向正确的位置后,再次运行opatch命令:

[orat3@hpserver2 OPatch]$ ./opatch lsinventory
Invoking OPatch 10.2.0.5.1
Oracle Interim Patch Installer version 10.2.0.5.1
Copyright (c) 2010, Oracle Corporation.  ALL rights reserved.
Oracle Home       : /t3/orat3/product/10.2.0/db_4.10
Central Inventory : /t0/orat0/app/oracle/oraInventory
   FROM           : /etc/oraInst.loc
OPatch version    : 10.2.0.5.1
OUI version       : 10.2.0.4.0
OUI location      : /t3/orat3/product/10.2.0/db_4.10/oui
Log file location : /t3/orat3/product/10.2.0/db_4.10/cfgtoollogs/opatch/opatch2012-04-11_14-28-18PM.log
Patch history file: /t3/orat3/product/10.2.0/db_4.10/cfgtoollogs/opatch/opatch_history.txt
List OF Homes ON this system:
  Home name= OraDb10g_home1, Location= "/u01/app/oracle/product/10.2.0/db_1"
  Home name= OraDb11g_home1, Location= "/u02/app/oracle/product/11.2.0/db_1"
  Home name= Ora11g_gridinfrahome1, Location= "/u02/app/grid"
  Home name= OraDb10g_home2, Location= "/t3/orat3/database/102044"
  Home name= OraDb10g_home3, Location= "/t3/orat3/database/1020410"
Inventory LOAD failed... OPatch cannot LOAD inventory FOR the given Oracle Home.
Possible causes are:
   Oracle Home dir. path does NOT exist IN Central Inventory
   Oracle Home IS a symbolic link
   Oracle Home inventory IS corrupted
LsInventorySession failed: OracleHomeInventory gets NULL oracleHomeInfo
OPatch failed WITH error code 73

虽然还有错误,但是错误信息已经发生了变化。检查列出信息中当前的ORACLE_HOME设置,发现配置有误:

[root@hpserver2 ~]# su - orat0
[orat0@hpserver2 ~]$ cd $ORACLE_BASE
[orat0@hpserver2 oracle]$ cd oraInventory/
[orat0@hpserver2 oraInventory]$ cd ContentsXML/
[orat0@hpserver2 ContentsXML]$ vi inventory.xml 
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 2005 Oracle Corporation. All rights Reserved -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<COMPOSITEHOME_LIST>
</COMPOSITEHOME_LIST>
<VERSION_INFO>
   <SAVED_WITH>10.2.0.1.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="OraDb10g_home1" LOC="/u01/app/oracle/product/10.2.0/db_1" TYPE="O" IDX="1"/>
<HOME NAME="OraDb11g_home1" LOC="/u02/app/oracle/product/11.2.0/db_1" TYPE="O" IDX="2"/>
<HOME NAME="Ora11g_gridinfrahome1" LOC="/u02/app/grid" TYPE="O" IDX="3"/>
<HOME NAME="OraDb10g_home2" LOC="/t3/orat3/database/102044" TYPE="O" IDX="4"/>
<HOME NAME="OraDb10g_home3" LOC="/t3/orat3/product/10.2.0/db_4.10" TYPE="O" IDX="5"/>
</HOME_LIST>
</INVENTORY>
"inventory.xml" 18L, 786C written

将XML文件中ORACLE_HOME的配置修改正确后,再次运行opatch,问题解决:

[orat3@hpserver2 OPatch]$ ./opatch lsinventory
Invoking OPatch 10.2.0.5.1
Oracle Interim Patch Installer version 10.2.0.5.1
Copyright (c) 2010, Oracle Corporation.  ALL rights reserved.
Oracle Home       : /t3/orat3/product/10.2.0/db_4.10
Central Inventory : /t0/orat0/app/oracle/oraInventory
   FROM           : /etc/oraInst.loc
OPatch version    : 10.2.0.5.1
OUI version       : 10.2.0.4.0
OUI location      : /t3/orat3/product/10.2.0/db_4.10/oui
Log file location : /t3/orat3/product/10.2.0/db_4.10/cfgtoollogs/opatch/opatch2012-04-11_14-31-23PM.log
Patch history file: /t3/orat3/product/10.2.0/db_4.10/cfgtoollogs/opatch/opatch_history.txt
Lsinventory Output file location : /t3/orat3/product/10.2.0/db_4.10/cfgtoollogs/opatch/lsinv/lsinventory2012-04-11_14-31-23PM.txt
--------------------------------------------------------------------------------
Installed Top-level Products (2): 
Oracle DATABASE 10g                                                  10.2.0.1.0
Oracle DATABASE 10g Release 2 Patch SET 3                            10.2.0.4.0
There are 2 products installed IN this Oracle Home.
 
Interim patches (2) :
Patch  12827778     : applied ON Sat Feb 25 03:57:18 ICT 2012
UNIQUE Patch ID:  14145955
   Created ON 29 Sep 2011, 00:45:30 hrs PST8PDT
   Bugs fixed:
     12566124, 12566141, 12566142, 12566126, 12828112, 12566143, 6631533
     9573054, 12419397, 12566129, 9714832, 11725015, 10014012, 11787762
     11787763, 10249540, 11787764, 10014015, 11787765, 11724977, 9952234
     11787766, 10325885, 10248636, 11787767, 10013975, 9173248, 9678690
     7609058, 9952272, 9678695, 9678697, 12566131, 9713537, 12566134, 9655017
     12827778, 9654991, 12566136, 12566137, 12976521, 12976522, 12566139
     12976523, 12976524, 9442335, 10014009, 10325878, 8309642, 12419249
     12565867, 12566121
   This patch overlays patches:
     9352164
   This patch needs patches:
     9352164
   AS prerequisites
Patch  9352164      : applied ON Sat Feb 25 03:03:03 ICT 2012
UNIQUE Patch ID:  12307477
   Created ON 1 Apr 2010, 11:33:14 hrs PST8PDT
   Bugs fixed:
     9377578, 6418420, 7835247, 7207654, 7592346, 6724797, 7936993, 7331867
     9093300, 7535152, 7552067, 5879114, 8866013, 5457450, 8344348, 7272297
     7136866, 7196894, 6258480, 7013124, 6512622, 6355586, 7196532, 8568395
     8309587, 7557226, 5399699, 6509115, 8568397, 8568398, 7127618, 5701695
     6052226, 7424804, 6817593, 7553884, 6741425, 7513673, 6783812, 8437213
     6452766, 6469211, 7527650, 8309592, 8479537, 5991038, 5686407, 6945157
     7639602, 9119226, 6403091, 7589862, 7552082, 6711853, 8304589, 6052169
     8199266, 6327692, 5756769, 9352191, 7460818, 6268409, 8232056, 6687381
     6972843, 8230457, 6800507, 7027551, 6778714, 6200820, 6645719, 7393804
     6775231, 3934160, 6683178, 6650256, 5085288, 7528105, 7329252, 6378112
     6151380, 6844866, 4723109, 8544896, 5126719, 5890312, 5934363, 7036453
     7610362, 8426816, 8433026, 7270434, 7172531, 6451626, 8247855, 5497338
     6324944, 6874522, 7175513, 6960489, 7341598, 8576156, 6797677, 8342923
     5895190, 6756739, 7150470, 7593835, 7356443, 7044551, 8227106, 4695511
     7298688, 5747462, 8556340, 7197445, 5348308, 7937113, 8341623, 7569205
     8409848, 6053134, 6163771, 6851669, 6181488, 6375150, 6825866, 7210921
     7295780, 6345573, 7033630, 6954722, 6824129, 7523475, 7457766, 7309458
     8324577, 6840740, 6804746, 7375611, 8268054, 6981690, 6512811, 6988017
     7375613, 8344399, 7340448, 8362683, 7375617, 8251247, 5933656, 6005347
     9145204, 6599920, 7238230, 6379441, 6452375, 6352003, 6833965, 7136489
     6610218, 7612639, 6392076, 7225204, 9119194, 5476236, 9442328, 7609057
     7609058, 6605106, 6374297, 6193945, 4693355, 8217795, 7039896, 7432514
     7330909, 6952701, 7190270, 8287155, 7587008, 7207932, 6802650, 7189447
     8481935, 4598439, 9442331, 6615740, 7155655, 6749617, 9442335, 7159505
     5868257, 5727166, 7173005, 6917874, 9442339, 7013768, 7691766, 7385253
     7291739, 7225720, 7257770, 7363767, 7244238, 6941717, 8267348, 7710551
     8354686, 7247217, 8328954, 7299153, 8909984, 6681695, 8702276, 9119284
     8217011, 7661251, 6265559, 6823287, 6991626, 6954829, 5259835, 6500033
     5923486, 7432601, 7022234, 8534387, 5147386, 7697802, 6653934, 7375644
     6490140, 7662491, 8331466, 5623467, 6070225, 6635214, 7396409, 6638558
     7038750, 6714608, 6838714, 6870937, 7219752, 7263842, 7278117, 6882739
     5404871, 8836667, 8373286, 7393292, 6678845, 6903051, 7936793, 6600051
     7155248, 4966512, 7155249, 7197637, 8836308, 8568402, 8568404, 8568405
     8431487, 5704108, 6343150, 7280764, 6923450, 7643632, 6145177, 8836671
     8310931, 6640411, 8347704, 8836675, 7155250, 7155251, 8836677, 7155252
     8836678, 7155253, 8292378, 7155254, 6219529, 7411865, 8227091, 8340379
     7276960, 6145687, 7659217, 5863926, 7022905, 6852598, 7123643, 6596564
     6413089, 6851438, 8836681, 8836683, 8836684, 8836686, 7579469, 7494333
     7315642, 8340383, 6786022, 8340387, 6926448, 7600026, 7462072, 6679303
     8815639, 7197583, 7172752, 7326645, 7008262, 9173244, 9173248, 7573151
     8490879, 7477934, 6725634, 6733655, 6799205, 6980597, 7499353, 6084232
     6014513, 7140204, 7254987, 8833280, 6647480, 6120004, 7693128, 6760697
     6051177, 8247215, 6858062, 7189645, 6844739, 6768251, 7196863, 5630796
     7378661, 7378735, 5970301, 6705822, 8290506, 6658484, 7599944, 9173253
     8309623, 7125408, 7257461, 6987790, 7568556, 6919819, 8886674, 5883691
     6955744, 7801939, 6074620, 7149004, 6857917, 8283650, 6445948, 5929055
     6110752, 7552042, 8210889, 8287504, 6506617, 7306915, 6271590, 5386204
     6976005, 8330783, 7606362, 5377242, 7043989, 8309632, 7575925, 6870047
     8309637, 5902053, 8309639, 7028176, 6827260, 7588384, 4726401, 6720712
     5910650, 6752765, 6971433, 6024730, 8315482, 6628122, 8239142, 9352164
     5695562, 4637902, 7345904, 8309642, 6994160, 8556586, 6404447, 8220734
     6919764, 7597354, 7523787, 6029179, 5231155, 6455659
 
--------------------------------------------------------------------------------
OPatch succeeded.
Posted in ORACLE | Tagged , , , , , , , , | Leave a comment

通过数据库链指定AS OF SCN语句

一直不确定闪回查询的AS OF语句是否可以通过数据库链对远端数据库生效,而由于通过数据库链查询会使两个数据库的SCN同步,使得这个问题难以测试。
通过数据库链查询会导致源和目标库SCN同步:https://yangtingkun.net/?p=722
虽然Oracle会在通过数据库链执行查询或DML操作时同步数据库的SCN,但是指定的SCN是本地的还是远端的还是有意义的。
为了说明这种情况,创建一个JOB,定期进行数据的修改,以快速提高一个数据库的SCN:

SQL> SELECT dbid, name FROM v$database;
DBID NAME
---------- ---------
 452011237 ORAT2
SQL> SELECT COUNT(*) FROM t;
COUNT(*)
----------
731791
SQL> BEGIN
2 FOR i IN 1..100000 loop
3 INSERT INTO t VALUES (i, 'y');
4 commit;
5 END loop;
6 END;
7 /
PL/SQL PROCEDURE successfully completed.

在另一个数据库中将系统时间和SCN的对应关系保存:

SQL> SELECT dbid, name FROM v$database;
      DBID NAME
---------- ---------
1299676637 ORCL
SQL> CREATE DATABASE LINK TEST CONNECT TO TEST IDENTIFIED BY TEST USING '192.168.0.19:1524/orat2';   
DATABASE link created.
SQL> DECLARE
2 v_job NUMBER;
3 BEGIN
4 dbms_job.submit(v_job, 'begin insert into t_scn values (sysdate, dbms_flashback.get_system_change_number); end;', 
5 sysdate, 'sysdate + 1/86400');
6 commit;
7 END;
8 /
PL/SQL PROCEDURE successfully completed.

检查插入的SCN信息:

SQL> SELECT * FROM t_scn;
TIME                    SCN
------------------- -------
2012-04-06 18:38:39 5098024
2012-04-06 18:38:44 5098031
2012-04-06 18:38:49 5098035
2012-04-06 18:38:54 5098040
2012-04-06 18:38:59 5098045
.
.
.
2012-04-06 18:42:49 5098266
2012-04-06 18:42:54 5098271
2012-04-06 18:42:59 5098276
53 ROWS selected.
SQL> SELECT dbid, name, current_scn FROM v$database@test;
DBID NAME      CURRENT_SCN
---------- --------- -----------
 452011237 ORAT2         5173454
SQL> SELECT * FROM t_scn;
TIME                    SCN
------------------- -------
2012-04-06 18:38:39 5098024
2012-04-06 18:38:44 5098031
2012-04-06 18:38:49 5098035
2012-04-06 18:38:54 5098040
.
.
.
2012-04-06 18:43:04 5098280
2012-04-06 18:43:09 5098285
2012-04-06 18:43:14 5173457
56 ROWS selected.<pre>
可以看到,当前数据库由于执行了远端的查询,导致当前的数据库SCN和远端SCN同步。由于远端通过PL/SQL运行大量的事务,因此SCN涨幅很快,而当前数据库没有操作进行,基本上SCN以每秒1个的速度增加。在运行了查询后,SCN则直接增加了75000。
因此对于当前数据库的而言,SCN是不连续的,对于当前库而言,50982865173456之间的值没有任何区别,下面尝试通过SCN查询远端数据库中的记录:
<pre lang='SQL'>SQL> SELECT COUNT(*) FROM t@test AS OF scn 5098286;
COUNT(*)
----------
    731976
SQL> SELECT COUNT(*) FROM t@test AS OF scn 5173457;
COUNT(*)
----------
    804664
SQL> SELECT COUNT(*) FROM t@test AS OF scn 5100000;
COUNT(*)
----------
    733675
SQL> SELECT COUNT(*) FROM t@test AS OF scn 5140000;
COUNT(*)
----------
    772940

可以看到通过指定SCN可以查询到远端T表不同时刻的记录数,而这些指定的SCN是当前数据库跳过的SCN,因此可以确定,AS OF SCN语句指定的SCN是针对目标表所在数据库而言的。而Oracle可能正是出于类似的考虑,才会将远端数据库和本地数据库的SCN进行同步。

Posted in ORACLE | Tagged , | Leave a comment

通过数据库链查询会导致源和目标库SCN同步

在通过数据库链执行跨数据库的查询时,Oracle会自动将两个数据库的SCN进行同步。

以前Eygle提到过这个问题,这次在讨论一个和远端数据库SCN有关的问题时,又碰到了这个问题,于是简单记录一下。

SQL> conn test/test@192.168.0.19:1666/ora102044
Connected.
SQL> SELECT current_scn FROM v$database;
CURRENT_SCN
-----------
2745758
SQL> conn test/test@192.168.0.20:15210/orcl
Connected.
SQL> SELECT current_scn FROM v$database;
CURRENT_SCN
-----------
5642527
SQL> CREATE DATABASE link ora102044 
2 CONNECT TO test
3 IDENTIFIED BY test
4 USING '192.168.0.19:1666/ora102044';
DATABASE link created.
SQL> SELECT current_scn FROM v$database@ora102044;
CURRENT_SCN
-----------
5642628
SQL> SELECT current_scn FROM v$database;
CURRENT_SCN
-----------
5642638
SQL> conn test/test@192.168.0.19:1666/ora102044
Connected.
SQL> SELECT current_scn FROM v$database;
CURRENT_SCN
-----------
5642702

可以看到,在通过数据库链执行查询后,Oracle会同步两个数据库的SCN,取两个数据库最大的SCN作为新的SCN,执行查询时刻,两个数据库的SCN一致。
Oracle的这个机制应该是为了分布式查询的一致性,这样一个查询只需要一个SCN就可以在分布式查询的任何一个数据库中来获取一致性。
但是这种实现也会存在一定的潜在问题,其中一个数据库的SCN可能会从一个较低的数值,一下增长到一个很高的数据。这个例子中,ORA102044数据库的SCN就增加了一倍以上。而对于实际环境中,SCN的增加可能是几倍、几十倍甚至是上百倍,从而引发一些其他的bug。

Posted in ORACLE | Tagged , , | 1 Comment

ORA-600(kcratr_scan_lastbwr)错误

客户的测试环境中出现这个错误。
信息如下:

2012-03-22 11:10:24.991000 +08:00
Beginning crash recovery OF 1 threads
parallel recovery started WITH 15 processes
Started redo scan
Hex dump OF (file 2, block 130960) IN trace file /oracle11r2/app/diag/rdbms/orcl/orcl/trace/orcl_ora_8087.trc
Reading datafile '/oradata/orcl/sysaux01.dbf' FOR corruption at rdba: 0x0081ff90 (file 2, block 130960)
Reread (file 2, block 130960) found same corrupt DATA (logically corrupt)
WRITE verification failed FOR File 2 Block 130960 (rdba 0x81ff90)
2012-03-22 11:10:27.155000 +08:00
Errors IN file /oracle11r2/app/diag/rdbms/orcl/orcl/trace/orcl_ora_8087.trc (incident=92961):
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Incident details IN: /oracle11r2/app/diag/rdbms/orcl/orcl/incident/incdir_92961/orcl_ora_8087_i92961.trc
2012-03-22 11:10:28.505000 +08:00
Dumping diagnostic DATA IN directory=[cdmp_20120322111028], requested BY (instance=1, osid=8087), summary=[incident=92961].
USE ADRCI OR Support Workbench TO package the incident.
See Note 411.1 at My Oracle Support FOR error AND packaging details.
Aborting crash recovery due TO error 600
Errors IN file /oracle11r2/app/diag/rdbms/orcl/orcl/trace/orcl_ora_8087.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
Errors IN file /oracle11r2/app/diag/rdbms/orcl/orcl/trace/orcl_ora_8087.trc:
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...

对应的详细TRACE内容为:

*** 2012-03-22 11:10:27.207
*** SESSION ID:(66.3) 2012-03-22 11:10:27.207
*** CLIENT ID:() 2012-03-22 11:10:27.207
*** SERVICE NAME:() 2012-03-22 11:10:27.207
*** MODULE NAME:(sqlplus@localhost.localdomain (TNS V1-V3)) 2012-03-22 11:10:27.207
*** ACTION NAME:() 2012-03-22 11:10:27.207
Dump continued FROM file: /oracle11r2/app/diag/rdbms/orcl/orcl/trace/orcl_ora_8087.trc
ORA-00600: internal error code, arguments: [kcratr_scan_lastbwr], [], [], [], [], [], [], [], [], [], [], []
========= Dump FOR incident 92961 (ORA 600 [kcratr_scan_lastbwr]) ========
*** 2012-03-22 11:10:27.208
dbkedDefDump(): Starting incident DEFAULT dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=1h50ks4ncswfn) -----
ALTER DATABASE OPEN
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
skdstdst()+36        CALL     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFF37DFBF48 ? 000000001 ?
                                                   000000001 ? 000000002 ?
ksedst1()+98         CALL     skdstdst()           000000000 ? 000000000 ?
                                                   7FFF37DFBF48 ? 000000001 ?
                                                   000000000 ? 000000002 ?
ksedst()+34          CALL     ksedst1()            000000000 ? 000000001 ?
                                                   7FFF37DFBF48 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbkedDefDump()+2741  CALL     ksedst()             000000000 ? 000000001 ?
                                                   7FFF37DFBF48 ? 000000001 ?
                                                   000000000 ? 000000002 ?
ksedmp()+36          CALL     dbkedDefDump()       000000003 ? 000000002 ?
                                                   7FFF37DFBF48 ? 000000001 ?
                                                   000000000 ? 000000002 ?
ksfdmp()+64          CALL     ksedmp()             000000003 ? 000000002 ?
                                                   7FFF37DFBF48 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbgexPhaseII()+1764  CALL     ksfdmp()             000000003 ? 000000002 ?
                                                   7FFF37DFBF48 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbgexProcessError()  CALL     dbgexPhaseII()       7F23BF34D710 ? 7F23BF34E410 ?
+2675                                              7FFF37E082C0 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbgeExecuteForError  CALL     dbgexProcessError()  7F23BF34D710 ? 7F23BF34E410 ?
()+83                                              000000001 ? 000000000 ?
                                                   100000000 ? 000000002 ?
dbgePostErrorKGE()+  CALL     dbgeExecuteForError  7F23BF34D710 ? 7F23BF34E410 ?
2138                          ()                   000000001 ? 000000001 ?
                                                   000000000 ? 000000002 ?
dbkePostKGE_kgsf()+  CALL     dbgePostErrorKGE()   00BAF3FA0 ? 7F23BEE20040 ?
66                                                 000000258 ? 7F23BF34E410 ?
                                                   100000000 ? 000000002 ?
kgeadse()+390        CALL     dbkePostKGE_kgsf()   00BAF3FA0 ? 7F23BEE20040 ?
                                                   000000258 ? 7F23BF34E410 ?
                                                   100000000 ? 000000002 ?
kgerinv_internal()+  CALL     kgeadse()            00BAF3FA0 ? 7F23BEE20040 ?
45                                                 000000258 ? 100000000 ?
                                                   000000000 ? 000000002 ?
kgerinv()+33         CALL     kgerinv_internal()   00BAF3FA0 ? 7F23BEE20040 ?
                                                   7FFF37DF8780 ? 000000258 ?
                                                   000000000 ? 000000002 ?
kgeasnmierr()+143    CALL     kgerinv()            00BAF3FA0 ? 7F23BEE20040 ?
                                                   7FFF37DF8780 ? 000000002 ?
                                                   000000000 ? 000000002 ?
kcratr_scan()+1732   CALL     kgeasnmierr()        00BAF3FA0 ? 7F23BEE20040 ?
                                                   7FFF37DF8780 ? 000000002 ?
                                                   000005C18 ? 7F23BD9DE020 ?
kcratr()+758         CALL     kcratr_scan()        7FFF37E0B810 ? 7FFF37E09554 ?
                                                   7FFF37DF8780 ? 000000002 ?
                                                   000005C18 ? 7F23BD9DE020 ?
kctrec()+4513        CALL     kcratr()             7FFF37E0B810 ? 7FFF37E14670 ?
                                                   000000000 ? 000000002 ?
                                                   000005C18 ? 7F23BD9DE020 ?
kcvcrv()+5693        CALL     kctrec()             7FFF37E15C70 ? 000000000 ?
                                                   7F23BF0F8598 ? 7F23BF0F8C00 ?
                                                   7F23BED95E00 ? 000000000 ?
kcfopd()+1055        CALL     kcvcrv()             7FFF37E16A10 ? 000000000 ?
                                                   000000020 ? 7F23BF0F8C00 ?
                                                   7F23BED95E00 ? 000000000 ?
adbdrv()+51829       CALL     kcfopd()             000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   7FFF37E1CE28 ? 192F25CA8 ?
opiexe()+18379       CALL     adbdrv()             000000000 ? 180FFF8F8 ?
                                                   000000000 ? 000000000 ?
                                                   7FFF37E1CE28 ? 000000000 ?
opiosq0()+3865       CALL     opiexe()             000000004 ? 000000000 ?
                                                   7FFF37E1E490 ? 000000000 ?
                                                   7FFF37E1CE28 ? 000000000 ?
kpooprx()+269        CALL     opiosq0()            000000003 ? 00000000E ?
                                                   7FFF37E1E780 ? 0000000A4 ?
                                                   000000000 ? 7FFF37E1E490 ?
kpoal8()+824         CALL     kpooprx()            7FFF37E21B74 ? 7FFF37E1FBD8 ?
                                                   000000013 ? 000000001 ?
                                                   000000000 ? 7FFF37E1E490 ?
opiodr()+916         CALL     kpoal8()             00000005E ? 00000001C ?
                                                   7FFF37E21B70 ? 000000001 ?
                                                   000000000 ? 100000001 ?
ttcpip()+2242        CALL     opiodr()             00000005E ? 00000001C ?
                                                   7FFF37E21B70 ? 000000000 ?
                                                   009B91670 ? 100000001 ?
opitsk()+1668        CALL     ttcpip()             00BB09C30 ? 0094D6E30 ?
                                                   7FFF37E21B70 ? 000000000 ?
                                                   7FFF37E215C8 ? 7FFF37E21D64 ?
opiino()+961         CALL     opitsk()             00BB09C30 ? 000000001 ?
                                                   7FFF37E21B70 ? 000000000 ?
                                                   7FFF37E215C8 ? 7FFF37E21D64 ?
opiodr()+916         CALL     opiino()             00000003C ? 000000004 ?
                                                   7FFF37E232E8 ? 000000000 ?
                                                   7FFF37E215C8 ? 7FFF37E21D64 ?
opidrv()+565         CALL     opiodr()             00000003C ? 000000004 ?
                                                   7FFF37E232E8 ? 000000000 ?
                                                   009B91120 ? 7FFF37E21D64 ?
sou2o()+98           CALL     opidrv()             00000003C ? 000000004 ?
                                                   7FFF37E232E8 ? 000000000 ?
                                                   009B91120 ? 7FFF37E21D64 ?
opimai_real()+128    CALL     sou2o()              7FFF37E232C0 ? 00000003C ?
                                                   000000004 ? 7FFF37E232E8 ?
                                                   009B91120 ? 7FFF37E21D64 ?
ssthrdmain()+252     CALL     opimai_real()        000000002 ? 7FFF37E234B0 ?
                                                   000000004 ? 7FFF37E232E8 ?
                                                   009B91120 ? 7FFF37E21D64 ?
main()+196           CALL     ssthrdmain()         000000002 ? 7FFF37E234B0 ?
                                                   000000001 ? 000000000 ?
                                                   009B91120 ? 7FFF37E21D64 ?
__libc_start_main()  CALL     main()               000000002 ? 7FFF37E23658 ?
+253                                               000000001 ? 000000000 ?
                                                   009B91120 ? 7FFF37E21D64 ?
_start()+36          CALL     __libc_start_main()  000A0AF54 ? 000000002 ?
                                                   7FFF37E23648 ? 000000000 ?
                                                   009B91120 ? 7FFF37E21D64 ?
 
--------------------- Binary Stack Dump ---------------------

根据MOS上描述,这个问题属于Bug 9584943 Crash / recovery failure due to lost write even if mirror has a good image。导致错误的原因是Oracle检查文件头的时候发现了逻辑坏块。不过这个bug应该在11.2.0.2中被解决,而当前版本是11.2.0.3,莫非这个bug在11.2.0.3中被重新引入。
如果出现这个问题,最有效的办法就是通过备份进行恢复,而且一般BLOCK级的恢复就可以了。

Posted in BUG | Tagged , , , , | Leave a comment

RMAN-600(8201)错误的重现

前两天,客户的数据库在执行CATALOG方式的备份时出现了RMAN-600(8201)错误。由于比较了解客户的环境,在加上客户本身对于系统的了解,使得成功的模拟出这个错误。
RMAN-600(8201)错误:https://yangtingkun.net/?p=690
其实重现这个错误并不算太复杂,要求数据库的版本是10.2.0.3以下。
首先搭建一套DATA GUARD环境。然后在备库启用数据库的FLASHBACK功能,创建一个恢复点,然后将备库激活打开。备库打开后就可以关闭,然后重新MOUNT数据库,并利用FLASHBACK将数据库回滚到激活之前的恢复点,然后利用ALTER DATABASE CONVERT命令再次将这个数据库转化为物理备库,DATA GUARD环境恢复后,使备库应用日志一直到和主库保持一致,然后进行一次DATA GUARD的SWITCHOVER的操作,使得备库变成主库,主库变成备库。
这时,对新的主库创建CATALOG,执行REGISTER DATABASE后,执行show all命令,就会重新错误。

[orat1@hpserver2 ~]$ rman target / catalog rcat_user/rcat_password
Recovery Manager: Release 10.2.0.3.0 - Production ON Sat Apr 4 20:51:56 2012
Copyright (c) 1982, 2005, Oracle. ALL rights reserved.
connected TO target DATABASE: TEST10G (DBID=1030910857)
connected TO recovery catalog DATABASE
RMAN> SHOW ALL;
RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # DEFAULT
CONFIGURE BACKUP OPTIMIZATION OFF; # DEFAULT
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # DEFAULT
CONFIGURE CONTROLFILE AUTOBACKUP OFF; # DEFAULT
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # DEFAULT
CONFIGURE DEVICE TYPE DISK PARALLELISM 5 BACKUP TYPE TO BACKUPSET;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # DEFAULT
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # DEFAULT
CONFIGURE MAXSETSIZE TO UNLIMITED; # DEFAULT
CONFIGURE ENCRYPTION FOR DATABASE OFF; # DEFAULT
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # DEFAULT
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure OF SHOW command at 04/04/2012 20:52:16
RMAN-00600: internal error, arguments [8201] [] [] [] []

显然导致问题产生的主要原因是由于DATA GARUD的备库曾经使用FAILOVER打开过,虽然通过FLASHBACK进行过回滚,但是FAILOVER会导致RESETLOG,当切换为主库后,RESETLOG信息保留在控制文件中,而Oracle在创建CATALOG时策略存在问题,虽然Oracle能分辨CURRENT和ORPHAN的INCARNATION,但是在进行同步的时候,Oracle会尝试读取RESETLOGS最大的记录从而引发了同步的异常。

Posted in BUG | Tagged , , , , , , , , , , , , | Leave a comment