访问ASM的ONNN进程占用大量CPU

客户的11.2 RAC环境中突然一个O002进程占用了100%的CPU,且一直不释放。
导致这个问题的原因应该是通过ASMCMD工具从ASM中拷贝了一个归档日志到本地磁盘,随后就发现一个ora_

[root@node-2 ~]# top
top - 21:14:25 up 257 days,  4:33,  2 users,  LOAD average: 1.69, 1.70, 1.56
Tasks: 994 total,   2 running, 991 sleeping,   0 stopped,   1 zombie
Cpu(s):  2.2%us,  0.1%sy,  0.0%ni, 97.5%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  132060636k total, 131006336k used,  1054300k free,   535524k buffers
Swap: 67108856k total,  1649936k used, 65458920k free, 39758892k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND 
 7622 oracle    25   0 50.6g  18m  16m R 100.0  0.0   9498:03 oracle 
29924 root      RT   0  304m  86m  54m S  2.0  0.1   5333:37 osysmond.bin
13804 root      15   0 13424 1844  828 R  1.6  0.0   0:00.16 top
29158 oracle    15   0 50.6g  10g  10g S  1.0  8.2 317:51.59 oracle
30337 root      15   0  971m  25m  13m S  1.0  0.0   1938:12 orarootagent.bi
.
.
.
   25 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/7
[root@node-2 ~]# ps -ef|grep 7622
oracle    7622     1  2  2011 ?        6-14:18:09 ora_o002_orcl2
root     13810 13645  0 21:14 pts/1    00:00:00 grep 7622

从数据库中查询,发现这个会话处于空闲等待状态:

[oracle@settle-2 ~]$ sqlplus / AS sysdba
SQL*Plus: Release 11.2.0.2.0 Production ON Mon Apr 23 21:14:45 2012
Copyright (c) 1982, 2010, Oracle.  ALL rights reserved.
Connected TO:
Oracle DATABASE 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
WITH the Partitioning, REAL Application Clusters, Automatic Storage Management, OLAP,
DATA Mining AND REAL Application Testing options
SQL> SET pages 100 LINES 140
SQL> SELECT sid, username, program, STATUS, event FROM v$session WHERE paddr IN (SELECT addr FROM v$process WHERE spid = 7622);
       SID USERNAME   PROGRAM                        STATUS   EVENT
---------- ---------- ------------------------------ -------- ------------------------------
       529            oracle@node-2 (O002)           ACTIVE   class slave wait
SQL> SELECT sql_id, prev_sql_id FROM v$session WHERE sid = 529;
SQL_ID        PREV_SQL_ID
------------- -------------

可以看到,这个会话一直处以空闲等待,而且没有执行任何SQL语句。

SQL> SELECT sid, serial# FROM v$session WHERE sid = 529;
       SID    SERIAL#
---------- ----------
       529         43
SQL> EXEC dbms_monitor.session_trace_enable(529, 43, TRUE, TRUE)
PL/SQL PROCEDURE successfully completed.
SQL> EXEC dbms_monitor.session_trace_disable(529, 43)
PL/SQL PROCEDURE successfully completed.

通过设置TRACE,发现这个异常的会话并没有产生任何的TRACE文件。
由于ONNN进程是ASM连接池进程,当数据库在ASM上创建文件时,就会通过ONNN进程来完成,这个进程在日常很少会处于工作状态。

[oracle@node-2 ~]$ ps -ef|grep ora_o
oracle    3831     1  0 Feb15 ?        00:00:03 ora_o004_orcl2
oracle    6350     1  0  2011 ?        00:00:08 ora_o000_orcl2
oracle    6358     1  0  2011 ?        00:00:08 ora_o003_orcl2
oracle    7475     1  0  2011 ?        00:00:00 ora_o001_orcl2
oracle    7622     1  2  2011 ?        6-15:38:56 ora_o002_orcl2
oracle   28211 28168  0 22:35 pts/1    00:00:00 grep ora_o
[oracle@node-2 ~]$ KILL -9 7622

由于ONNN进程并非系统核心进程,且Oracle会在需要的时候自动分配ONNN进程,因此可以通过kill -9的方法来清除会话。结束进程后,top命令显示占用100%CPU资源的进程已经消失。
根据MOS的最新更新,这个问题属于BUG:12929268对应的文档为ora_o00n Process High CPU Usage in 11.2.0.2 [ID 1459376.1]。
根据文档的描述,如果ONNN进程空闲时间超过248.5天左右,也就是按照百分之一秒计算超过2G,就会导致ONNN进程陷入死循环中。Oracle对于进程的最大等待时间的定义并不是无符号的LONG,导致进程等待时间只有2G厘秒,而不是4G。
Oracle目前没有明确的fixed计划,而给出的解决方案和我上面的方法一致,就是KILL掉对应的进程。另外重启系统可以保证250天内不会再次出现同样的错误。

Posted in BUG | Tagged , , , | Leave a comment

使用kfed出现无法初始化diag的错误

测试环境上使用kfed工具出现错误。
错误信息如下:

grid@solaris:~$ kfed
ERROR!!! could NOT initialize the diag context 
grid@solaris:~$ id
uid=502(grid) gid=501(dba)
grid@solaris:/u02/app/oracle/product/11.2.0/grid/bin$ ./kfed
Unable TO run program ed

根据错误信息搜索MOS,确认问题为KFED – ERROR!!! could not initialize the diag context [ID 1335596.1]。导致问题的原因是kfed会尝试建立自己的目录并记录日志,而当前的目录结构使得grid用户没有权限执行这个初始化的动作。在MOS的这篇文档中,是由于使用root执行国kfed,导致目录的owner变为root,而grid用户执行操作时则报错。

grid@solaris:~$ ls -l $ORACLE_HOME/log/diag
total 14
-rw-r----- 1 grid dba 16 Apr 23 16:49 adrci_dir.mif
drwxr-xr-x 3 grid dba 3 Apr 23 11:26 asm
drwxr-xr-x 3 grid dba 3 Apr 21 16:03 asmcmd
drwxrwx--- 2 grid dba 2 Jan 29 14:56 clients
drwxr-xr-x 3 grid dba 3 Jan 29 15:18 tnslsnr

当前的问题和MOS文档描述的并不完全相同,当前路径下没有root权限的目录,也没有找到目标目录asmtool。不过无论是现象还是错误信息都说明问题确实和kfed写日志有关系。查看ORACLE_BASE/diag目录,果然发现问题:

grid@solaris:~/grid$ cd $ORACLE_BASE/diag
grid@solaris:/u01/app/oracle/diag$ ls -l
total 30
drwxrwxr-x 3 oracle dba 3 Apr 21 16:20 asm
drwxr-xr-x 3 kamus staff 3 Mar 23 14:31 asmtool
drwxrwxr-x 3 oracle dba 3 Jan 30 15:20 clients
drwxrwxr-x 2 oracle dba 2 Jan 4 18:18 crs
drwxrwxr-x 2 oracle dba 2 Jan 4 18:18 diagtool
drwxrwxr-x 2 oracle dba 2 Jan 4 18:18 lsnrctl
drwxrwxr-x 2 oracle dba 2 Jan 4 18:18 netcman
drwxrwxr-x 2 oracle dba 2 Jan 4 18:18 ofm
drwxrwxr-x 4 oracle dba 4 Feb 15 15:44 rdbms
drwxrwxr-x 3 oracle dba 3 Jan 6 23:55 tnslsnr

目录asmtool目录的owner果然不是grid用户,导致grid在写日志是没有权限:

root@solaris:~# chown grid:dba $ORACLE_BASE/diag/asmtool
chown: /diag/asmtool: No such file OR directory
root@solaris:~# chown grid:dba /u01/app/oracle/diag/asmtool

重新设置owner后,问题解决:

grid@solaris:/u01/app/oracle/diag$ kfed
AS/mlib ASM Library [asmlib='lib']
aun/um AU NUMBER TO examine OR UPDATE [AUNUM=NUMBER]
aus/z Allocation Unit SIZE IN bytes [AUSZ=NUMBER]
blkn/um Block NUMBER TO examine OR UPDATE [BLKNUM=NUMBER]
blks/z Metadata block SIZE IN bytes [BLKSZ=NUMBER]
ch/ksum UPDATE checksum BEFORE each WRITE [CHKSUM=YES/NO]
cn/t COUNT OF AUs TO process [CNT=NUMBER]
de/v ASM device TO examine OR UPDATE [DEV=string]
dm/pall Don't suppress repeated lines when dumping corrupt blocks [DMPALL=YES/NO]
o/p KFED operation type [OP=READ/WRITE/MERGE/REPAIR/NEW/FORM/FIND/STRUCT]
p/rovnm Name for provisioning purposes [PROVNM=string]
s/eek AU number to seek to [SEEK=number]
te/xt File name for translated block text [TEXT=string]
ty/pe ASM metadata block type number [TYPE=number]
Posted in ORACLE | Tagged , , , | 1 Comment

DBCA添加节点出现PRKC-1055错误

很早以前写过几篇文章,描述在RAC环境下添加节点的问题,其中在使用DBCA添加节点时,碰到了PRKC-1055错误。
Oracle10203RAC环境添加新节点(六):http://yangtingkun.itpub.net/post/468/493937
运行DBCA在配置好新实例的设置后,在最后运行的时候报错:Failed to create directory “” on “racnode3”, “PRKC-1055: Directory name passed was null”. PRKC-1055: Directory name passed was null。
本来一直认为是Oracle的DBCA处理不当,导致使用OMF存在异常,并最终引发了这个问题。最近Kamus也碰到了这个问题,最终发现是由于参数设置的问题,导致新增节点的目录参数设置和已有节点的参数冲突从而导致了问题。
可以参考文档DBCA Fails With PRKC-1055 While Adding a New Instance to an Existing RAC Database [ID 394519.1]。
简单的说,DBCA会验证目录设置在所有节点上是否存在。可以通过下面的脚本找到所有修改过的参数:

SELECT name,''''||VALUE||'''' val FROM v$parameter WHERE isdefault='FALSE'
UNION
SELECT name,''''||VALUE||'''' val FROM v$system_parameter WHERE ismodified='MODIFIED'
ORDER BY 1,2;

确认这些参数对应的值和目录在新增节点上是存在的。此外根据Kamus的提醒,还应该确认db_create_file_dest、db_create_online_log_dest_n、db_recovery_file_dest以及log_archive_dest_n参数的设置是否指向了正确的位置。

Posted in ORACLE | Tagged , , | Leave a comment

利用触发器解决更新主键冲突的问题

有朋友问我能否用触发器实现更新数据时,如果发现主键已经存在,则将冲突的主键更新为当前记录之前的主键值。
简单的说,如果表中存在主键为1和2的记录,如果一条UPDATE语句将1更新为2,那么想要实现的功能是为了确保这个UPDATE可以执行成功,在后台自动将ID为2的记录更新为1。
这个功能应该可以实现,但是直接在触发器中肯定无法实现,因为这是一个标准的变异表问题。而且即使是采用自治事务的方式,也需要小心,因为要修改原有的记录就必须通过UPDATE实现,而这个UPDATE又会导致触发器的触发,如果处理不当,就会导致循环触发。

SQL> CREATE TABLE T (ID NUMBER, NAME VARCHAR2(30), CONSTRAINT PK_T PRIMARY KEY (ID));
TABLE created.
SQL> CREATE OR REPLACE PROCEDURE P_UPDATE_T (P_NEW NUMBER, P_OLD NUMBER) AS
  2  BEGIN
  3     FOR I IN (SELECT ID FROM T WHERE ID = P_NEW) LOOP
  4             UPDATE T SET ID = P_OLD WHERE ID = I.ID;
  5     END LOOP;
  6  END;
  7  /
PROCEDURE created.
SQL> INSERT INTO T SELECT ROWNUM, TNAME FROM TAB WHERE ROWNUM < 5;
4 ROWS created.
SQL> COMMIT;
Commit complete.
SQL> CREATE OR REPLACE TRIGGER TRI_B_T 
  2  BEFORE UPDATE ON T
  3  FOR EACH ROW
  4  BEGIN
  5     P_UPDATE_T(:NEW.ID, :OLD.ID);
  6  END;
  7  /
TRIGGER created.
SQL> SELECT * FROM T;
        ID NAME
---------- ------------------------------
         1 BIN$uo1flhazQQngRDzZK0pZWw==$0
         2 BRIDGE_CROSSING
         3 C_NORMAL
         4 C_SINGLE
SQL> UPDATE T SET ID = 3 WHERE ID = 2;
UPDATE T SET ID = 3 WHERE ID = 2
       *
ERROR at line 1:
ORA-04091: TABLE TEST.T IS mutating, TRIGGER/FUNCTION may NOT see it
ORA-06512: at "TEST.P_UPDATE_T", line 3
ORA-06512: at "TEST.TRI_B_T", line 2
ORA-04088: error during execution OF TRIGGER 'TEST.TRI_B_T'
SQL> CREATE OR REPLACE PROCEDURE P_UPDATE_T (P_NEW NUMBER, P_OLD NUMBER) AS
  2  PRAGMA AUTONOMOUS_TRANSACTION;
  3  BEGIN
  4     FOR I IN (SELECT ID FROM T WHERE ID = P_NEW) LOOP
  5             UPDATE T SET ID = P_OLD WHERE ID = I.ID;
  6     END LOOP;
  7     COMMIT;
  8  END;
  9  /
PROCEDURE created.
SQL> UPDATE T SET ID = 3 WHERE ID = 2;
UPDATE T SET ID = 3 WHERE ID = 2
       *
ERROR at line 1:
ORA-00060: deadlock detected while waiting FOR resource
ORA-06512: at "TEST.P_UPDATE_T", line 5
ORA-06512: at "TEST.TRI_B_T", line 2
ORA-04088: error during execution OF TRIGGER 'TEST.TRI_B_T'
ORA-06512: at "TEST.P_UPDATE_T", line 5
ORA-06512: at "TEST.TRI_B_T", line 2
ORA-04088: error during execution OF TRIGGER 'TEST.TRI_B_T'

正如前面提到的,如果直接修改则碰到变异表问题,如果想要通过自治事务解决变异表的问题,又会导致自身的死锁问题。
为了解决这个问题,需要仿照变异表的常规处理方法来解决。常规处理方法是通过一个包记录变量,加上BEFORE和AFTER两个行级触发器以及一个语句级触发器的共同配合。但是一般而言,变异表的处理并不会引入本身记录的更新,而当前还需要解决更新当前表引发的无限循环问题。

SQL> CREATE OR REPLACE PACKAGE P_VAR AS
  2  G_STATUS NUMBER := 0;
  3  END;
  4  /
Package created.
SQL> CREATE OR REPLACE FUNCTION F_GET_T (P_ID NUMBER) RETURN NUMBER AS
  2  PRAGMA AUTONOMOUS_TRANSACTION;
  3  V_RESULT NUMBER;
  4  BEGIN
  5     SELECT COUNT(*) INTO V_RESULT FROM T WHERE ID = P_ID;
  6     RETURN V_RESULT;
  7  END;
  8  /
FUNCTION created.
SQL> CREATE OR REPLACE TRIGGER TRI_B_T
  2  BEFORE UPDATE ON T
  3  FOR EACH ROW
  4  DECLARE
  5  V_RESULT NUMBER;
  6  BEGIN
  7     V_RESULT := F_GET_T(:NEW.ID);
  8     IF V_RESULT = 1 THEN
  9             :NEW.ID := -:NEW.ID;
 10             P_VAR.G_STATUS := 1;
 11     END IF;
 12  END;
 13  /
TRIGGER created.
SQL> CREATE OR REPLACE TRIGGER TRI_A_T
  2  AFTER UPDATE ON T
  3  FOR EACH ROW
  4  DECLARE
  5  V_JOB NUMBER;
  6  BEGIN
  7     IF P_VAR.G_STATUS = 1 THEN
  8             DBMS_JOB.SUBMIT(V_JOB, 'BEGIN
  9             UPDATE T SET ID = ' || :OLD.ID || ' WHERE ID = ' || (-1 * :NEW.ID) || ';
 10             COMMIT;
 11             UPDATE T SET ID = ' || (-1 * :NEW.ID) || ' WHERE ID = ' || :NEW.ID || ';
 12             COMMIT; 
 13             END;', SYSDATE);
 14             P_VAR.G_STATUS := 0;
 15     END IF;
 16  END;
 17  /
TRIGGER created.
SQL> SELECT * FROM T;
        ID NAME
---------- ------------------------------
         1 BIN$uo1flhazQQngRDzZK0pZWw==$0
         2 BRIDGE_CROSSING
         3 C_NORMAL
         4 C_SINGLE
SQL> UPDATE T SET ID = 3 WHERE ID = 2;
1 ROW updated.
SQL> SELECT * FROM T;
        ID NAME
---------- ------------------------------
         1 BIN$uo1flhazQQngRDzZK0pZWw==$0
        -3 BRIDGE_CROSSING
         3 C_NORMAL
         4 C_SINGLE
SQL> SELECT WHAT FROM USER_JOBS;
WHAT
-----------------------------------------------------------------------------------
BEGIN
                UPDATE T SET ID = 2 WHERE ID = 3;
                COMMIT;
                UPDATE T SET ID = 3 WHERE ID = -3;
                COMMIT;
                END;
SQL> COMMIT;
Commit complete.
SQL> SELECT * FROM T;
        ID NAME
---------- ------------------------------
         1 BIN$uo1flhazQQngRDzZK0pZWw==$0
         3 BRIDGE_CROSSING
         2 C_NORMAL
         4 C_SINGLE

最终问题通过利用包变量记录是否存在冲突记录,配合异步JOB的方式对冲突记录进行修改的方式避免了无限UPDATE的情况。
当然上面只是一个示例,对于需要同时更新多条记录的情况,需要在包中构建一个类似映射表的结构,从而记录每条更新的ID是否会出现NEW.ID存在的情况。
而且在提交的JOB中也应该对代码做更多的保护,比如添加异常处理等。
此外,这个问题的最大缺点在于,只有COMMIT后才能达到预期目标,而在更新之后且没有提交之前,更新的结果并不是UPDATE语句要完成的操作,而只是一个中间结果。
事实上,这个问题本来就不应该由触发器来解决,对于更新现有ID到一个已经存在的ID的操作,应该通过一段PL/SQL代码来实现,这样才能更好的考虑并发,事务一致性和事务完整性。

Posted in ORACLE | Tagged , , , , | Leave a comment

ORA-600(6002)错误

客户数据库出现ORA-600(6002)错误。
在告警文件中出现大量的错误:

Sun Apr 1 16:15:41 2012
Errors IN file /oracle/orcl/udump/orcl_ora_16555.trc:
ORA-00600: internal error code, arguments: [6002], [0], [0], [4], [0], [], [], []
Sun Apr 1 16:16:43 2012
Errors IN file /oracle/orcl/udump/orcl_ora_16555.trc:
ORA-00600: internal error code, arguments: [6002], [0], [0], [4], [0], [], [], []
Sun Apr 1 16:17:44 2012
Errors IN file /oracle/orcl/udump/orcl_ora_16961.trc:
ORA-00600: internal error code, arguments: [6002], [0], [0], [4], [0], [], [], []
Sun Apr 1 16:18:45 2012
Errors IN file /oracle/orcl/udump/orcl_ora_16961.trc:
ORA-00600: internal error code, arguments: [6002], [0], [0], [4], [0], [], [], []
Sun Apr 1 16:19:47 2012
Errors IN file /oracle/orcl/udump/orcl_ora_16961.trc:
ORA-00600: internal error code, arguments: [6002], [0], [0], [4], [0], [], [], []
Sun Apr 1 16:20:48 2012
Errors IN file /oracle/orcl/udump/orcl_ora_16555.trc:
ORA-00600: internal error code, arguments: [6002], [0], [0], [4], [0], [], [], []
Sun Apr 1 16:21:49 2012
Errors IN file /oracle/orcl/udump/orcl_ora_16961.trc:
ORA-00600: internal error code, arguments: [6002], [0], [0], [4], [0], [], [], []

详细错误信息为:

ksedmp: internal OR fatal error
ORA-00600: internal error code, arguments: [6002], [0], [0], [4], [0], [], [], []
CURRENT SQL statement FOR this SESSION:
INSERT INTO TABLE_STAT (ID ,START_TIME, END_TIME) VALUES (:1, :2, :3)         
----- Call Stack Trace -----
calling              CALL     entry                argument VALUES IN hex      
location             TYPE     point                (? means dubious VALUE)     
-------------------- -------- -------------------- ----------------------------
ksedst()+31          CALL     ksedst1()            000000000 ? 000000001 ?
                                                   7FFF7063E240 ? 7FFF7063E2A0 ?
                                                   7FFF7063E1E0 ? 000000000 ?
ksedmp()+610         CALL     ksedst()             000000000 ? 000000001 ?
                                                   7FFF7063E240 ? 7FFF7063E2A0 ?
                                                   7FFF7063E1E0 ? 000000000 ?
ksfdmp()+21          CALL     ksedmp()             000000003 ? 000000001 ?
                                                   7FFF7063E240 ? 7FFF7063E2A0 ?
                                                   7FFF7063E1E0 ? 000000000 ?
kgeriv()+176         CALL     ksfdmp()             000000003 ? 000000001 ?
                                                   7FFF7063E240 ? 7FFF7063E2A0 ?
                                                   7FFF7063E1E0 ? 000000000 ?
kgeasi()+263         CALL     kgeriv()             0068966E0 ? 2AAACED21168 ?
                                                   000000000 ? 000000000 ?
                                                   7FFF7063E1E0 ? 000000000 ?
kdiins0()+4578       CALL     kgeasi()             0068966E0 ? 2AAACED21168 ?
                                                   000001772 ? 000000002 ?
                                                   000000004 ? 000000000 ?
kauxsin()+4261       CALL     kdiins0()            11BCC8DD0 ? 000000000 ?
                                                   000000000 ? 000000000 ?
                                                   000000000 ? 000000000 ?
insidx()+2461        CALL     kauxsin()            11BCC8E38 ? 7FFF70651640 ?

这个错误一般都是索引块出现了逻辑损坏造成的,解决方法其实很简单,将索引删除重建一般就可以解决问题。如果想要使用REBUILD的方式重建索引,记得加上ONLINE选项,确保在索引重建的时候选择读表,而不是从本来就存在数据错误的索引来获取数据。
关于ORA-600[6002]错误的更多问题可以参考文档ID 47449.1,而对于当前版本可能的bug,则应该是Bug 6621940 – Corruption after MOVE of an empty partition with local indexes [ID 6621940.8],这个问题在10.2.0.5和11.2.0.1中被FIXED。

Posted in BUG | Tagged , , , , | Leave a comment

四人过桥问题的SQL解法

看了开发版的帖子,又没有忍住,简单写了一个SQL语句。关于问题的详细描述可以参考http://www.itpub.net/thread-1595264-1-1.html
传说中的微软面试题:
有一群人A,B,C,D (人数>=2)要在夜里走过独木桥过河,他们只有一把手电筒。四个人的速度不同,过河分别需要1,2,5,10分钟,桥上最多走两个人,两个人一起走时按速度慢的计算。过河一定要用手电筒。请问最快的方法是如何安排,需要几分钟?
例子输出:
A B,A,A C,A,A D 19
直接给出最终结果:

SQL> WITH C AS
  2  (SELECT NAME, TIME, POWER(2, ROWNUM - 1) POS FROM BRIDGE_CROSSING), 
  3  A AS
  4  (SELECT 0 RN, A.NAME || ' ' || B.NAME NAME, GREATEST(A.TIME, B.TIME) TIME, A.POS + B.POS POS
  5  FROM C A, C B
  6  WHERE A.NAME < B.NAME
  7  UNION ALL
  8  SELECT 1, NAME, TIME, POS FROM C),
  9  B (RN, NAME, TIME, POS) AS
 10  (SELECT 0 RN, CAST ('' AS VARCHAR2(4000)) NAME, 0, 0 POS FROM DUAL
 11  UNION ALL
 12  SELECT B.RN + 1, 
 13     B.NAME || ',' || A.NAME, 
 14     B.TIME + A.TIME, 
 15     DECODE(MOD(B.RN, 2), 0, B.POS + A.POS, B.POS - A.POS)
 16  FROM B, A
 17  WHERE MOD(B.RN, 2) = A.RN
 18  AND DECODE(MOD(B.RN, 2), 0, BITAND(B.POS, A.POS), BITAND(B.POS - A.POS, A.POS)) = 0)
 19  SELECT LTRIM(NAME, ',') NAME, TIME FROM 
 20  (SELECT NAME, TIME, RANK() OVER(ORDER BY TIME) RN 
 21  FROM B
 22  WHERE POS = POWER(2, 4) - 1)
 23  WHERE RN = 1; 
NAME                                 TIME
------------------------------ ----------
A B,A,C D,B,A B                        17
A B,B,C D,A,A B                        17

简单描述一下思路,通过构造一个POS列标识每个人的位置,比如1101表明ABD三个人都已经过桥了,如果POS的值达到15,则说明所有人都过桥。
这个递归WITH和以往的区别在于,两个人一起过桥和一个人回来送手电的处理规则是不同的,甚至连接数据都是不同的。这里采用将两个人过桥和一个人回来作为不同的记录插入到同一个表中,然后用不同的ID进行区别。随后的递归调用,无论是查询列,还是进行关联的时候都要进行条件的判断,判断当前是两个人过桥,还是一个人回来,从而进行不同的操作。

Posted in ORACLE | Tagged , , | Leave a comment

恢复导致ORA-600(kccfhb_1)错误

这篇文章仍然是RMAN-600错误的后续系列。
RMAN-600(8201)错误:https://yangtingkun.net/?p=690
RMAN-600(8201)错误的重现:https://yangtingkun.net/?p=716
恢复数据库出现ORA-38727:https://yangtingkun.net/?p=759
当控制文件里存在更大RESETLOGS的ORPHAN时,Oracle在10.2.0.3及以前版本是存在bug的,开始仅仅认为这个问题影响CATALOG模式的备份。但是测试发现,对于数据库的恢复同样存在影响。
如果数据库的FLASHBACK出于打开状态,那么恢复操作就会报错ORA-38727错误,而如果FLASHBACK处于关闭状态,恢复操作就可能碰到ORA-600(kccfhb_1)错误。

[orat1@hpserver2 ~]$ sqlplus / AS sysdba
SQL*Plus: Release 10.2.0.3.0 - Production ON Mon Apr 16 10:58:58 2012
Copyright (c) 1982, 2006, Oracle.  ALL Rights Reserved.
Connected TO:
Oracle DATABASE 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
WITH the Partitioning, OLAP AND DATA Mining options
SQL> SET pages 100 LINES 140
SQL> ALTER DATABASE recover tablespace tbs011;
ALTER DATABASE recover tablespace tbs011
*
ERROR at line 1:
ORA-00283: recovery SESSION canceled due TO errors
ORA-00600: internal error code, arguments: [kccfhb_1], [4], [], [], [], [], [], []
SQL> shutdown immediate
DATABASE closed.
DATABASE dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
Total System Global Area  314572800 bytes
Fixed SIZE                  2072384 bytes
Variable SIZE             150995136 bytes
DATABASE Buffers          155189248 bytes
Redo Buffers                6316032 bytes
DATABASE mounted.
SQL> recover tablespace tbs010;
ORA-00283: recovery SESSION canceled due TO errors
ORA-01110: DATA file 7: '/t1/orat1/oradata/test10g_s/tbs010.dbf'
ORA-01157: cannot identify/LOCK DATA file 7 - see DBWR trace file
ORA-01110: DATA file 7: '/t1/orat1/oradata/test10g_s/tbs010.dbf'
SQL> host 
[orat1@hpserver2 ~]$ rman target /
Recovery Manager: Release 10.2.0.3.0 - Production ON Mon Apr 16 11:14:16 2012
Copyright (c) 1982, 2005, Oracle.  ALL rights reserved.
connected TO target DATABASE: TEST10G (DBID=1030910857, NOT OPEN)
RMAN> restore tablespace tbs010;
Starting restore at 16-APR-12
USING target DATABASE control file instead OF recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=155 devtype=DISK
skipping datafile 7; already restored TO file /t1/orat1/oradata/test10g_s/tbs010.dbf
restore NOT done; ALL files readonly, offline, OR already restored
Finished restore at 16-APR-12
RMAN> recover tablespace tbs010;
Starting recover at 16-APR-12
USING channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure OF recover command at 04/16/2012 11:14:38
RMAN-06067: RECOVER DATABASE required WITH a backup OR created control file
RMAN> restore tablespace tbs010 force;
Starting restore at 16-APR-12
USING channel ORA_DISK_1
channel ORA_DISK_1: starting datafile backupset restore
channel ORA_DISK_1: specifying datafile(s) TO restore FROM backup SET
restoring datafile 00007 TO /t1/orat1/oradata/test10g_s/tbs010.dbf
channel ORA_DISK_1: reading FROM backup piece /t1/orat1/flash_recovery_area/BOSTON/backupset/2012_04_16/o1_mf_nnndf_TAG20120416T101406_7rq03jy8_.bkp
channel ORA_DISK_1: restored backup piece 1
piece handle=/t1/orat1/flash_recovery_area/BOSTON/backupset/2012_04_16/o1_mf_nnndf_TAG20120416T101406_7rq03jy8_.bkp tag=TAG20120416T101406
channel ORA_DISK_1: restore complete, elapsed TIME: 00:00:02
Finished restore at 16-APR-12
RMAN> recover tablespace tbs010;
Starting recover at 16-APR-12
USING channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure OF recover command at 04/16/2012 11:16:25
RMAN-06067: RECOVER DATABASE required WITH a backup OR created control file
RMAN> recover DATABASE;
Starting recover at 16-APR-12
USING channel ORA_DISK_1
starting media recovery
archive log thread 1 SEQUENCE 67 IS already ON disk AS file /t1/orat1/oradata/test10g_s/redo03.log
archive log thread 1 SEQUENCE 68 IS already ON disk AS file /t1/orat1/oradata/test10g_s/redo01.log
archive log thread 1 SEQUENCE 69 IS already ON disk AS file /t1/orat1/oradata/test10g_s/redo02.log
archive log filename=/t1/orat1/oradata/test10g_s/redo03.log thread=1 SEQUENCE=67
archive log filename=/t1/orat1/oradata/test10g_s/redo01.log thread=1 SEQUENCE=68
archive log filename=/t1/orat1/oradata/test10g_s/redo02.log thread=1 SEQUENCE=69
media recovery complete, elapsed TIME: 00:00:02
Finished recover at 16-APR-12
RMAN> exit
 
Recovery Manager complete.
[orat1@hpserver2 ~]$ exit
exit
SQL> ALTER DATABASE OPEN;
ALTER DATABASE OPEN
*
ERROR at line 1:
ORA-01589: must USE RESETLOGS OR NORESETLOGS OPTION FOR DATABASE OPEN
 
SQL> ALTER DATABASE OPEN resetlogs;
DATABASE altered.
SQL> ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD HH24:MI:SS';                                                 
SESSION altered.
SQL> SELECT INCARNATION#, RESETLOGS_CHANGE#, RESETLOGS_TIME, STATUS, RESETLOGS_ID FROM V$DATABASE_INCARNATION;
INCARNATION# RESETLOGS_CHANGE# RESETLOGS_TIME      STATUS  RESETLOGS_ID
------------ ----------------- ------------------- ------- ------------
           1                 1 2012-03-31 16:09:45 PARENT     779386185
           2            251422 2012-04-01 14:32:15 PARENT     779466735
           3            252146 2012-04-01 15:09:56 PARENT     779468996
           4            263286 2012-04-05 16:16:26 ORPHAN     779818586
           5            306405 2012-04-16 11:18:50 CURRENT    780751130
SQL> EXIT
Disconnected FROM Oracle DATABASE 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
WITH the Partitioning, OLAP AND DATA Mining options
[orat1@hpserver2 ~]$ rman target / catalog RCAT_USER/rcat_password
Recovery Manager: Release 10.2.0.3.0 - Production ON Mon Apr 16 11:25:10 2012
Copyright (c) 1982, 2005, Oracle.  ALL rights reserved.
connected TO target DATABASE: TEST10G (DBID=1030910857)
connected TO recovery catalog DATABASE
RMAN> unregister DATABASE;
DATABASE name IS "TEST10G" AND DBID IS 1030910857
Do you really want TO unregister the DATABASE (enter YES OR NO)? yes
DATABASE unregistered FROM the recovery catalog
RMAN> register DATABASE;
DATABASE registered IN recovery catalog
starting FULL resync OF recovery catalog
FULL resync complete
RMAN> SHOW ALL;
RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # DEFAULT
CONFIGURE BACKUP OPTIMIZATION OFF; # DEFAULT
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # DEFAULT
CONFIGURE CONTROLFILE AUTOBACKUP OFF; # DEFAULT
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # DEFAULT
CONFIGURE DEVICE TYPE DISK PARALLELISM 5 BACKUP TYPE TO BACKUPSET;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # DEFAULT
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # DEFAULT
CONFIGURE MAXSETSIZE TO UNLIMITED; # DEFAULT
CONFIGURE ENCRYPTION FOR DATABASE OFF; # DEFAULT
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # DEFAULT
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # DEFAULT
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/t1/orat1/product/10.2.0/db_1/dbs/snapcf_test10g_s.f'; # DEFAULT
RMAN>

可以看到,这个步骤其实和上一篇ORA-38727错误完全一样,虽然只是一个表空间的恢复,Oracle也必须强制关闭数据库执行全数据库的恢复,而且最终只能RESETLOGS方式打开数据库。
随着OPEN RESETLOGS操作的完成,RMAN-600(8201)错误产生的条件也不再存在,因此CATALOG的创建得以成功,控制文件中隐藏的问题同时得以解决。

Posted in BUG | Tagged , , , , , , , , , , | Leave a comment

恢复数据库出现ORA-38727

一个测试数据库在恢复时出现ORA-38727错误。
错误信息如下:

[orat1@hpserver2 ~]$ rman target /
Recovery Manager: Release 10.2.0.3.0 - Production ON Sat Apr 14 09:56:01 2012
Copyright (c) 1982, 2005, Oracle. ALL rights reserved.
connected TO target DATABASE: TEST10G (DBID=1030910857)
RMAN> recover tablespace tbs013;
Starting recover at 14-APR-12
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=138 devtype=DISK
starting media recovery
media recovery failed
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure OF recover command at 04/14/2012 09:57:45
ORA-00283: recovery SESSION canceled due TO errors
RMAN-11003: failure during parse/execution OF SQL statement: ALTER DATABASE recover IF needed
tablespace TBS013
ORA-00283: recovery SESSION canceled due TO errors
ORA-38727: FLASHBACK DATABASE requires a CURRENT control file.
RMAN> exit
Recovery Manager complete.

显然导致这个错误的原因和FLASHBACK特性有关,检查数据库,果然发现FLASHBACK属性出于ON的状态:

[orat1@hpserver2 ~]$ sqlplus / AS sysdba
SQL*Plus: Release 10.2.0.3.0 - Production ON Sat Apr 14 09:57:57 2012
Copyright (c) 1982, 2006, Oracle. ALL Rights Reserved.
 
Connected TO:
Oracle DATABASE 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
WITH the Partitioning, OLAP AND DATA Mining options
SQL> SET pages 100 LINES 140
SQL> SELECT dbid, name, flashback_on FROM v$database;
DBID NAME      FLASHBACK_ON
---------- --------- ------------------
1030910857 TEST10G   YES
SQL> shutdown immediate
DATABASE closed.
DATABASE dismounted.
ORACLE instance shut down.
SQL> startup mount
ORACLE instance started.
Total System Global Area 314572800 bytes
Fixed SIZE 2072384 bytes
Variable SIZE 171966656 bytes
DATABASE Buffers 134217728 bytes
Redo Buffers 6316032 bytes
DATABASE mounted.
SQL> ALTER DATABASE flashback off; 
DATABASE altered.
SQL> exit 
Disconnected FROM Oracle DATABASE 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
WITH the Partitioning, OLAP AND DATA Mining options

再次执行恢复操作:

[orat1@hpserver2 ~]$ rman target /
Recovery Manager: Release 10.2.0.3.0 - Production ON Sat Apr 14 10:01:03 2012
Copyright (c) 1982, 2005, Oracle. ALL rights reserved.
connected TO target DATABASE: TEST10G (DBID=1030910857, NOT OPEN)
RMAN> recover tablespace tbs013;
Starting recover at 14-APR-12
USING target DATABASE control file instead OF recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=156 devtype=DISK
starting media recovery
media recovery failed
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure OF recover command at 04/14/2012 10:02:06
ORA-00283: recovery SESSION canceled due TO errors
RMAN-11003: failure during parse/execution OF SQL statement: ALTER DATABASE recover IF needed
tablespace TBS013
ORA-00283: recovery SESSION canceled due TO errors
ORA-01110: DATA file 11: '/t1/orat1/oradata/test10g_s/tbs013.dbf'
ORA-01157: cannot identify/LOCK DATA file 11 - see DBWR trace file
ORA-01110: DATA file 11: '/t1/orat1/oradata/test10g_s/tbs013.dbf'
RMAN> restore tablespace tbs013;
Starting restore at 14-APR-12
USING channel ORA_DISK_1
skipping datafile 11; already restored TO file /t1/orat1/oradata/test10g_s/tbs013.dbf
restore NOT done; ALL files readonly, offline, OR already restored
Finished restore at 14-APR-12
RMAN> recover tablespace tbs013;
Starting recover at 14-APR-12
USING channel ORA_DISK_1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure OF recover command at 04/14/2012 10:03:05
RMAN-06067: RECOVER DATABASE required WITH a backup OR created control file
RMAN> recover DATABASE;
Starting recover at 14-APR-12
USING channel ORA_DISK_1
starting media recovery
archive log thread 1 SEQUENCE 356 IS already ON disk AS file /t1/orat1/arch_s/1_356_779468996.dbf
archive log thread 1 SEQUENCE 357 IS already ON disk AS file /t1/orat1/arch_s/1_357_779468996.dbf
archive log thread 1 SEQUENCE 358 IS already ON disk AS file /t1/orat1/arch_s/1_358_779468996.dbf
archive log thread 1 SEQUENCE 359 IS already ON disk AS file /t1/orat1/arch_s/1_359_779468996.dbf
archive log thread 1 SEQUENCE 360 IS already ON disk AS file /t1/orat1/arch_s/1_360_779468996.dbf
archive log thread 1 SEQUENCE 361 IS already ON disk AS file /t1/orat1/arch_s/1_361_779468996.dbf
archive log thread 1 SEQUENCE 362 IS already ON disk AS file /t1/orat1/arch_s/1_362_779468996.dbf
archive log thread 1 SEQUENCE 363 IS already ON disk AS file /t1/orat1/arch_s/1_363_779468996.dbf
archive log thread 1 SEQUENCE 364 IS already ON disk AS file /t1/orat1/arch_s/1_364_779468996.dbf
archive log thread 1 SEQUENCE 365 IS already ON disk AS file /t1/orat1/arch_s/1_365_779468996.dbf
archive log thread 1 SEQUENCE 366 IS already ON disk AS file /t1/orat1/arch_s/1_366_779468996.dbf
archive log thread 1 SEQUENCE 367 IS already ON disk AS file /t1/orat1/arch_s/1_367_779468996.dbf
archive log thread 1 SEQUENCE 368 IS already ON disk AS file /t1/orat1/arch_s/1_368_779468996.dbf
archive log thread 1 SEQUENCE 369 IS already ON disk AS file /t1/orat1/arch_s/1_369_779468996.dbf
archive log thread 1 SEQUENCE 370 IS already ON disk AS file /t1/orat1/arch_s/1_370_779468996.dbf
archive log thread 1 SEQUENCE 371 IS already ON disk AS file /t1/orat1/arch_s/1_371_779468996.dbf
archive log thread 1 SEQUENCE 372 IS already ON disk AS file /t1/orat1/arch_s/1_372_779468996.dbf
archive log thread 1 SEQUENCE 373 IS already ON disk AS file /t1/orat1/arch_s/1_373_779468996.dbf
archive log thread 1 SEQUENCE 374 IS already ON disk AS file /t1/orat1/arch_s/1_374_779468996.dbf
archive log thread 1 SEQUENCE 375 IS already ON disk AS file /t1/orat1/arch_s/1_375_779468996.dbf
archive log thread 1 SEQUENCE 376 IS already ON disk AS file /t1/orat1/arch_s/1_376_779468996.dbf
archive log thread 1 SEQUENCE 377 IS already ON disk AS file /t1/orat1/arch_s/1_377_779468996.dbf
archive log thread 1 SEQUENCE 378 IS already ON disk AS file /t1/orat1/arch_s/1_378_779468996.dbf
archive log thread 1 SEQUENCE 379 IS already ON disk AS file /t1/orat1/arch_s/1_379_779468996.dbf
archive log thread 1 SEQUENCE 380 IS already ON disk AS file /t1/orat1/arch_s/1_380_779468996.dbf
archive log thread 1 SEQUENCE 381 IS already ON disk AS file /t1/orat1/arch_s/1_381_779468996.dbf
archive log thread 1 SEQUENCE 382 IS already ON disk AS file /t1/orat1/arch_s/1_382_779468996.dbf
archive log thread 1 SEQUENCE 383 IS already ON disk AS file /t1/orat1/arch_s/1_383_779468996.dbf
archive log thread 1 SEQUENCE 384 IS already ON disk AS file /t1/orat1/arch_s/1_384_779468996.dbf
archive log thread 1 SEQUENCE 385 IS already ON disk AS file /t1/orat1/arch_s/1_385_779468996.dbf
archive log thread 1 SEQUENCE 386 IS already ON disk AS file /t1/orat1/arch_s/1_386_779468996.dbf
archive log thread 1 SEQUENCE 387 IS already ON disk AS file /t1/orat1/arch_s/1_387_779468996.dbf
archive log thread 1 SEQUENCE 388 IS already ON disk AS file /t1/orat1/arch_s/1_388_779468996.dbf
archive log thread 1 SEQUENCE 389 IS already ON disk AS file /t1/orat1/oradata/test10g_s/redo02.log
archive log thread 1 SEQUENCE 390 IS already ON disk AS file /t1/orat1/oradata/test10g_s/redo03.log
archive log thread 1 SEQUENCE 391 IS already ON disk AS file /t1/orat1/oradata/test10g_s/redo01.log
archive log filename=/t1/orat1/arch_s/1_356_779468996.dbf thread=1 SEQUENCE=356
archive log filename=/t1/orat1/arch_s/1_357_779468996.dbf thread=1 SEQUENCE=357
archive log filename=/t1/orat1/arch_s/1_358_779468996.dbf thread=1 SEQUENCE=358
archive log filename=/t1/orat1/arch_s/1_359_779468996.dbf thread=1 SEQUENCE=359
archive log filename=/t1/orat1/arch_s/1_360_779468996.dbf thread=1 SEQUENCE=360
archive log filename=/t1/orat1/arch_s/1_361_779468996.dbf thread=1 SEQUENCE=361
archive log filename=/t1/orat1/arch_s/1_362_779468996.dbf thread=1 SEQUENCE=362
archive log filename=/t1/orat1/arch_s/1_363_779468996.dbf thread=1 SEQUENCE=363
archive log filename=/t1/orat1/arch_s/1_364_779468996.dbf thread=1 SEQUENCE=364
archive log filename=/t1/orat1/arch_s/1_365_779468996.dbf thread=1 SEQUENCE=365
archive log filename=/t1/orat1/arch_s/1_366_779468996.dbf thread=1 SEQUENCE=366
archive log filename=/t1/orat1/arch_s/1_367_779468996.dbf thread=1 SEQUENCE=367
archive log filename=/t1/orat1/arch_s/1_368_779468996.dbf thread=1 SEQUENCE=368
archive log filename=/t1/orat1/arch_s/1_369_779468996.dbf thread=1 SEQUENCE=369
archive log filename=/t1/orat1/arch_s/1_370_779468996.dbf thread=1 SEQUENCE=370
archive log filename=/t1/orat1/arch_s/1_371_779468996.dbf thread=1 SEQUENCE=371
archive log filename=/t1/orat1/arch_s/1_372_779468996.dbf thread=1 SEQUENCE=372
archive log filename=/t1/orat1/arch_s/1_373_779468996.dbf thread=1 SEQUENCE=373
archive log filename=/t1/orat1/arch_s/1_374_779468996.dbf thread=1 SEQUENCE=374
archive log filename=/t1/orat1/arch_s/1_375_779468996.dbf thread=1 SEQUENCE=375
archive log filename=/t1/orat1/arch_s/1_376_779468996.dbf thread=1 SEQUENCE=376
archive log filename=/t1/orat1/arch_s/1_377_779468996.dbf thread=1 SEQUENCE=377
archive log filename=/t1/orat1/arch_s/1_378_779468996.dbf thread=1 SEQUENCE=378
archive log filename=/t1/orat1/arch_s/1_379_779468996.dbf thread=1 SEQUENCE=379
archive log filename=/t1/orat1/arch_s/1_380_779468996.dbf thread=1 SEQUENCE=380
archive log filename=/t1/orat1/arch_s/1_381_779468996.dbf thread=1 SEQUENCE=381
archive log filename=/t1/orat1/arch_s/1_382_779468996.dbf thread=1 SEQUENCE=382
archive log filename=/t1/orat1/arch_s/1_383_779468996.dbf thread=1 SEQUENCE=383
archive log filename=/t1/orat1/arch_s/1_384_779468996.dbf thread=1 SEQUENCE=384
archive log filename=/t1/orat1/arch_s/1_385_779468996.dbf thread=1 SEQUENCE=385
archive log filename=/t1/orat1/arch_s/1_386_779468996.dbf thread=1 SEQUENCE=386
archive log filename=/t1/orat1/arch_s/1_387_779468996.dbf thread=1 SEQUENCE=387
archive log filename=/t1/orat1/arch_s/1_388_779468996.dbf thread=1 SEQUENCE=388
archive log filename=/t1/orat1/oradata/test10g_s/redo02.log thread=1 SEQUENCE=389
archive log filename=/t1/orat1/oradata/test10g_s/redo03.log thread=1 SEQUENCE=390
archive log filename=/t1/orat1/oradata/test10g_s/redo01.log thread=1 SEQUENCE=391
media recovery complete, elapsed TIME: 00:00:15
Finished recover at 14-APR-12
RMAN> ALTER DATABASE OPEN;
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure OF ALTER db command at 04/14/2012 10:06:34
ORA-01589: must USE RESETLOGS OR NORESETLOGS OPTION FOR DATABASE OPEN
RMAN> ALTER DATABASE OPEN resetlogs;
DATABASE opened

虽然恢复成功,但是可以看到恢复过程并不顺利,最明显的一点,本来一个简单的TABLESPACE的恢复,变成了DATABASE的恢复,而且本来可以直接打开,现在变成了RESETLOGS方式的打开。这也说明由于启用了FLASHBACK,导致Oracle在判断控制文件状态出现了异常,导致最终以RESETLOGS方式打开数据库。
其实这个错误和不久前碰到的RMAN-600错误有直接的关联:
RMAN-600(8201)错误:https://yangtingkun.net/?p=690
RMAN-600(8201)错误的重现:https://yangtingkun.net/?p=716
而且正如我们所料,当前数据库再次RESETLOGS后,目前控制文件中最大的RESETLOGS时间对应的正式当前控制文件中记录的RESETLOGS时间,这就使得CATALOG的同步得以实现:

[orat1@hpserver2 ~]$ sqlplus / AS sysdba
SQL*Plus: Release 10.2.0.3.0 - Production ON Sun Apr 15 00:09:12 2012
Copyright (c) 1982, 2006, Oracle. ALL Rights Reserved.
Connected TO:
Oracle DATABASE 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
WITH the Partitioning, OLAP AND DATA Mining options
SQL> exit
Disconnected FROM Oracle DATABASE 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
WITH the Partitioning, OLAP AND DATA Mining options
[orat1@hpserver2 ~]$ rman target / catalog rcat_user/rcat_password
Recovery Manager: Release 10.2.0.3.0 - Production ON Sun Apr 15 00:09:31 2012
Copyright (c) 1982, 2005, Oracle. ALL rights reserved.
connected TO target DATABASE: TEST10G (DBID=1030910857)
connected TO recovery catalog DATABASE
RMAN> SHOW ALL;
NEW incarnation OF DATABASE registered IN recovery catalog
starting FULL resync OF recovery catalog
FULL resync complete
RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # DEFAULT
CONFIGURE BACKUP OPTIMIZATION OFF; # DEFAULT
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # DEFAULT
CONFIGURE CONTROLFILE AUTOBACKUP OFF; # DEFAULT
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # DEFAULT
CONFIGURE DEVICE TYPE DISK PARALLELISM 5 BACKUP TYPE TO BACKUPSET;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # DEFAULT
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # DEFAULT
CONFIGURE MAXSETSIZE TO UNLIMITED; # DEFAULT
CONFIGURE ENCRYPTION FOR DATABASE OFF; # DEFAULT
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # DEFAULT
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # DEFAULT
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/t1/orat1/product/10.2.0/db_1/dbs/snapcf_test10g_s.f'; # DEFAULT
RMAN>
Posted in BUG | Tagged , , , , , , , | Leave a comment

2012年4月数据库技术大会第三天

由于今天有事,只听了金官丁和BITI的演讲。
发现我确实和ITPUB的数据库大会八字不合,前两次都和重要的事情发生了冲突,导致无法参加,这次总算没有发生正面的冲突,不过第三天的演讲还是受到了影响,只能听两个小时左右的演讲,连上午的重头戏之一,biti_rainy、eygle和piner三巨头的圆桌会议都错过了。
好在听了的两场演讲让我不需此行。Biti和金官丁都已经华丽的转型为管理人员,他们的一些思考可以给现在还在技术职位上奋斗的DBA们指出方向。个人认为,技术人员容易犯一个毛病,只顾低头研究技术,忘了抬头看看方向。如果方向的选择存在问题,不但容易绕弯路,也容易对自己的前途产生迷茫。如果说技术的获取会有多种渠道,那么方向的思考却是数据库大会给我们带来的最大收获之一。让我们听听那些通过技术走向成功的人的经验,对于我们思考自己的方向和前途有着非常大的借鉴意义。
大会今天圆满结束了,三届大会人数一届高过一届,这说明数据库大会受到越来越多国内DBA们的认同,最后希望明年的数据库大会能够碰到更多的老朋友,也结识更多的新朋友。

Posted in NEWS | Leave a comment

2012年4月数据库技术大会第二天

今天的主题中王珊教授、Eygle和老虎是亮点。
上午首先是《数据库系统概论》的作者王珊带来的《Big data时代数据库的机遇、调整和应战》。说实话,我就属于半路出家,没有接触过数据库的基础理论。虽然对于基本的三范式有一些了解,不过像《数据库系统概论》这种专业的理论基础确实没有看过。虽然没有拜读过王珊教授的数据库理论,但是现场听到王教授的关于Big Data的演讲,感触还是非常深的。王教授的理论和技术方面,不是我这种后学之辈有资格进行评价的。不过以王教授这种国内数据库的先行者,现在尚且不断的学习新知识,研究最新的数据库应用方向,足以说明在数据库是学无止境的,因此革命尚未成功,同志们仍需努力。
Eygle的演讲内容已经很熟悉了,一方面是对他最近的一些案例比较了解,另一方面是公司内部有过类似的演讲。Eygle的场面掌控能力非常强,而且安全性方面的内容也非常吸引人。
无独有偶,Eygle后面一个主题,杨宁给我们带来的也是安全相关的内容《数据库攻防实践与SOX安全审计》。两个安全方案的演讲放在一起很有意思,一个是从DBA角度看如何进行数据库的安全管理,另一个是从攻击的角度看Oracle存在哪些漏洞。而大会的两场安全方面的演讲也说明安全性相关问题收到越来越多人的关注。
下午主要听了老虎的演讲,演讲前半部分是一个RAC加载的性能案例分析,而后半部分变成了数据建模的新的算法和数据分析方法。这两部分内容跨度之大,整个大会上无出其右。如果说前半部分是一个很容易理解的CASE,那么后半部分就基本上不知所云了。

Posted in NEWS | Leave a comment