ORA-600(krfg_first_barrier1)错误

客户在打开FLASHBACK的时候碰到了这个错误。

详细的错误如下:

Tue Oct 18 10:55:02 2011
ALTER DATABASE MOUNT
Tue Oct 18 10:55:06 2011
Setting recovery target incarnation TO 1
Tue Oct 18 10:55:06 2011
Successful mount OF redo thread 1, WITH mount id 830459142
Tue Oct 18 10:55:06 2011
DATABASE mounted IN Exclusive Mode
Completed: ALTER DATABASE MOUNT
Tue Oct 18 13:05:08 2011
ALTER SYSTEM SET db_recovery_file_dest_size='200G' SCOPE=BOTH;
Tue Oct 18 13:05:13 2011
ALTER SYSTEM SET db_recovery_file_dest='/flashback' SCOPE=BOTH;
Tue Oct 18 13:05:17 2011
ALTER SYSTEM SET db_flashback_retention_target=4320 SCOPE=BOTH;
Tue Oct 18 13:06:00 2011
Shutting down instance: further logons disabled
Tue Oct 18 13:06:00 2011
Stopping background process CJQ0
Tue Oct 18 13:06:00 2011
Stopping background process MMNL
Tue Oct 18 13:06:01 2011
Stopping background process MMON
Tue Oct 18 13:06:02 2011
Shutting down instance (immediate)
License high water mark = 1
Tue Oct 18 13:06:02 2011
Stopping Job queue slave processes, flags = 7
Tue Oct 18 13:06:02 2011
Job queue slave processes stopped
Tue Oct 18 13:06:02 2011
ALTER DATABASE CLOSE NORMAL
ORA-1109 signalled during: ALTER DATABASE CLOSE NORMAL...
Tue Oct 18 13:06:02 2011
ALTER DATABASE DISMOUNT
Completed: ALTER DATABASE DISMOUNT
ARCH: Archival disabled due TO shutdown: 1089
Shutting down archive processes
Archiving IS disabled
Archive process shutdown avoided: 0 active
ARCH: Archival disabled due TO shutdown: 1089
Shutting down archive processes
Archiving IS disabled
Archive process shutdown avoided: 0 active
Tue Oct 18 13:07:29 2011
Starting ORACLE instance (normal)
sskgpgetexecname failed TO GET name
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 3
Autotune OF undo retention IS turned ON. 
IMODE=BR
ILAT =616
LICENSE_MAX_USERS = 0
SYS auditing IS disabled
ksdpec: called FOR event 13740 prior TO event GROUP initialization
Starting up ORACLE RDBMS Version: 10.2.0.4.0.
System parameters WITH non-DEFAULT VALUES:
_trace_files_public = TRUE
processes = 2800
sessions = 5600
timed_statistics = TRUE
__shared_pool_size = 905969664
shared_pool_size = 838860800
__large_pool_size = 16777216
__java_pool_size = 16777216
__streams_pool_size = 0
shared_pool_reserved_size= 83886080
nls_language = american
nls_territory = america
nls_sort = BINARY
nls_date_format = DD-MON-RR
nls_numeric_characters = .,
nls_comp = BINARY
nls_length_semantics = CHAR
sga_target = 8589934592
control_files = /oradata/accdb/cntrl01.dbf, /oradata/accdb/cntrl02.dbf, /oradata/accdb/cntrl03.dbf
db_block_checksum = TRUE
db_block_size = 8192
__db_cache_size = 7633633280
compatible = 10.2.0
log_archive_config = DG_CONFIG=(ACCDB,ACCPRI)
log_archive_dest_1 = LOCATION=/archlog
log_archive_dest_2 = SERVICE=ACCDB LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=ACCDB
log_archive_dest_state_1 = enable
log_archive_dest_state_2 = enable
log_archive_max_processes= 3
fal_client = ACCPRI
fal_server = ACCDB
log_buffer = 14230528
log_checkpoint_interval = 100000
log_checkpoint_timeout = 1200
db_files = 512
db_recovery_file_dest = /flashback
db_recovery_file_dest_size= 214748364800
standby_file_management = AUTO
log_checkpoints_to_alert = FALSE
db_flashback_retention_target= 4320
dml_locks = 10000
undo_management = AUTO
undo_tablespace = APPS_UNDOTS1
db_block_checking = FALSE
db_domain = ccic
global_names = TRUE
session_cached_cursors = 500
utl_file_dir = /usr/tmp, /usr/tmp, /oracle/accdb/10.2.0/appsutil/outbound/ACCDB_accountdb, /usr/tmp
plsql_native_library_dir = /oracle/accdb/10.2.0/plsql/nativelib
plsql_native_library_subdir_count= 149
plsql_code_type = INTERPRETED
plsql_optimize_level = 2
job_queue_processes = 2
_system_trig_enabled = TRUE
cursor_sharing = EXACT
parallel_min_servers = 0
parallel_max_servers = 8
background_dump_dest = /oracle/admin/ACCDB/bdump
user_dump_dest = /oracle/admin/ACCDB/udump
max_dump_file_size = 20480
core_dump_dest = /oracle/admin/ACCDB/cdump
db_name = ACCDB
db_unique_name = ACCPRI
open_cursors = 1000
_sort_elimination_cost_ratio= 5
_b_tree_bitmap_plans = FALSE
_fast_full_scan_enabled = FALSE
_sqlexec_progression_cost= 2147483647
_like_with_bind_as_equality= TRUE
pga_aggregate_target = 4294967296
workarea_size_policy = AUTO
optimizer_secure_view_merging= FALSE
aq_tm_processes = 1
olap_page_pool_size = 41943040
PMON started WITH pid=2, OS id=3736548
PSP0 started WITH pid=3, OS id=5767232
MMAN started WITH pid=4, OS id=2556302
DBW0 started WITH pid=5, OS id=5112274
DBW1 started WITH pid=6, OS id=4457358
DBW2 started WITH pid=7, OS id=3604948
DBW3 started WITH pid=8, OS id=4915548
LGWR started WITH pid=9, OS id=3867168
CKPT started WITH pid=10, OS id=2818092
SMON started WITH pid=11, OS id=2949892
RECO started WITH pid=12, OS id=4063580
CJQ0 started WITH pid=13, OS id=4785010
MMON started WITH pid=14, OS id=4588312
MMNL started WITH pid=15, OS id=3015526
Tue Oct 18 13:07:31 2011
ALTER DATABASE MOUNT
Tue Oct 18 13:07:35 2011
Setting recovery target incarnation TO 1
Tue Oct 18 13:07:35 2011
Successful mount OF redo thread 1, WITH mount id 830491155
Tue Oct 18 13:07:35 2011
DATABASE mounted IN Exclusive Mode
Completed: ALTER DATABASE MOUNT
Tue Oct 18 13:07:55 2011
ALTER DATABASE flashback ON
Tue Oct 18 13:07:55 2011
db_recovery_file_dest_size OF 204800 MB IS 0.01% used. This IS a
user-specified LIMIT ON the amount OF SPACE that will be used BY this
DATABASE FOR recovery-related files, AND does NOT reflect the amount OF
SPACE available IN the underlying filesystem OR ASM diskgroup.
Allocated 15265600 bytes IN shared pool FOR flashback generation buffer
Starting background process RVWR
RVWR started WITH pid=17, OS id=4260856
Tue Oct 18 13:07:55 2011
Flashback DATABASE Enabled 
Tue Oct 18 13:07:55 2011
Errors IN file /oracle/admin/ACCDB/udump/accdb_ora_4129208.trc:
ORA-00600: internal error code, arguments: [krfg_first_barrier1], [1], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE flashback ON...

之所以要列出这么多的记录是有意义的,对于一般的ORA-600而言,只需要后面4行而已,但是对于当前的问题是不够的,事实上如果不是篇幅,我打算把整个alert文件中的内容都贴上。
简单介绍一下情况,客户打算在STANDBY数据库上打开FLASHBACK,这样备库可以打开进行读写修改,在修改完毕后,可以恢复到打开之前的状态,再次转换成STANDBY数据库的角色,而不需要由于备库的打开,重建整个STANDBY数据库。
本来这个操作在客户这边已经进行过几次,从来没有碰到过什么问题,不过这次这个数据库的STANDBY环境是新建立的,而FLASHBACK这种新技术,碰到什么bug也不奇怪。
发现这个错误后,首先去metalink上查询,结果发现这个错误在metalink上根本没有记录,看来这时一个出现几率很低的问题。虽然FLASHBACK用的人比较少,但是从10g出现,到11.2,还是经历了一段时间了,如果是常规的问题,不太可能没有任何记录。
详细的错误日志文件:
Dump file /oracle/admin/ACCDB/udump/accdb_ora_4129208.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle/product/db10gr2
System name: AIX
Node name: Paccdb2
Release: 1
Version: 6
Machine: 00F67AD94C00
Instance name: ACCDB
Redo thread mounted by this instance: 1
Oracle process number: 16
Unix process pid: 4129208, image: oracle@Paccdb2 (TNS V1-V3)
*** ACTION NAME:() 2011-10-18 13:07:55.437
*** MODULE NAME:(sqlplus@Paccdb2 (TNS V1-V3)) 2011-10-18 13:07:55.437
*** SERVICE NAME:() 2011-10-18 13:07:55.437
*** SESSION ID:(5586.3) 2011-10-18 13:07:55.437
*** 2011-10-18 13:07:55.437
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [krfg_first_barrier1], [1], [], [], [], [], [], []
Current SQL statement for this session:
alter database flashback on
—– Call Stack Trace —–
calling call entry argument values in hex
location type point (? means dubious value)
——————– ——– ——————– —————————-
ksedst+001c bl ksedst1 000000000 ? 000000000 ?
ksedmp+0290 bl ksedst 104A52AA8 ?
ksfdmp+0018 bl 03F4E238
kgerinv+00dc bl _ptrgl
kgeasnmierr+004c bl kgerinv 000000000 ? 000000000 ?
000009732 ? 00147AE14 ?
700000010018078 ?
krfg_first_barrier_ bl kgeasnmierr 110196398 ? 110490040 ?
w_fba+00e8 104C03194 ? 100000001 ?
000000000 ? 000000001 ?
104C03D84 ? 1056A1010 ?
krfg_first_barrier+ bl krfg_first_barrier_ 000000000 ? 000000000 ?
001c w_fba 000000000 ? 004D25FF8 ?
krf_enable_flashbac bl krfg_first_barrier
k_db+0098
adbdrv+d9b0 bl krf_enable_flashbac 000000000 ?
k_db
opiexe+2c18 bl adbdrv
opiosq0+19f0 bl opiexe 000000000 ? 000000000 ?
FFFFFFFFFFF8EB0 ?
kpooprx+0168 bl opiosq0 3EC779438 ? 700000010003520 ?
7000001EC7793B8 ?
A4000110196398 ?
kpoal8+0400 bl kpooprx FFFFFFFFFFFB704 ?
FFFFFFFFFFFB470 ?
1B0000001B ? 100000001 ?
000000000 ? A40000000000A4 ?
000000000 ? 1103A80D8 ?
opiodr+0ae0 bl _ptrgl
ttcpip+1020 bl _ptrgl
opitsk+1124 bl ttcpip 1100CB7A0 ? FFFFFFFFFFFBA00 ?
000000000 ? 11044F7B0 ?
FFFFFFFFFFFB970 ?
442442226B736800 ?
9000000000D9B74 ? 110451E00 ?
opiino+0990 bl opitsk 1E00000000 ? 000000000 ?
opiodr+0ae0 bl _ptrgl
opidrv+0484 bl 01FBA1C8
sou2o+0090 bl opidrv 3C02ABB45C ? 4A0071EB0 ?
FFFFFFFFFFFF640 ?
opimai_real+01bc bl 01FB7F8C
main+0098 bl opimai_real 000000000 ? 000000000 ?
__start+0070 bl main 000000000 ? 000000000 ?
——————— Binary Stack Dump ———————
简略的看了一下TRACE文件,并没有发现可疑的地方,查询了一下数据库的状态:

SQL> SELECT name, log_mode, controlfile_type, open_mode, database_role, flashback_on FROM v$database;
NAME       LOG_MODE        CONTROLFILE_TY OPEN_MODE  DATABASE_ROLE   FLASHBACK_ON
---------- --------------- -------------- ---------- --------------- ---------------
ACCDB      ARCHIVELOG      BACKUP         MOUNTED    PRIMARY         YES

虽然报了ORA-600错误,但是FLASHBACK已经打开了,而且在对应的未知也发现FLASHBACK日志文件成功创建。此外还有一个疑点,就是控制文件和数据库的角色不正确,按道理来说,控制文件的状态应该是STANDBY,而数据库的角色也应该是STANDBY。于是和具体操作的DBA沟通了一下,发现了问题的所在。
原来这个STANDBY环境还根本没有建立起来,现在数据库只是利用BACKUP的控制文件恢复了控制文件而已,数据库的数据文件还根本就没有恢复,也就是说不但没有建立成STANDBY环境,数据库本身还不存在,这时候开启FLASHBACK当然会有问题。
正常应该先创建STANDBY环境,确认主库和备库之间关系建立且日志应用正常,然后在备库上开启FLASHBACK。这样看来,这个ORA-600错误在metalink上找不到也是正常的,毕竟这个错误太低级了。
不过Oracle应该将错误信息更明确一些,而不是报一个一般人都看不懂的ORA-600错误就完事了,从这个角度讲,这个问题也算是一个小bug。

This entry was posted in BUG and tagged , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *