客户在打开FLASHBACK的时候碰到了这个错误。
详细的错误如下:
Tue Oct 18 10:55:02 2011 ALTER DATABASE MOUNT Tue Oct 18 10:55:06 2011 Setting recovery target incarnation TO 1 Tue Oct 18 10:55:06 2011 Successful mount OF redo thread 1, WITH mount id 830459142 Tue Oct 18 10:55:06 2011 DATABASE mounted IN Exclusive Mode Completed: ALTER DATABASE MOUNT Tue Oct 18 13:05:08 2011 ALTER SYSTEM SET db_recovery_file_dest_size='200G' SCOPE=BOTH; Tue Oct 18 13:05:13 2011 ALTER SYSTEM SET db_recovery_file_dest='/flashback' SCOPE=BOTH; Tue Oct 18 13:05:17 2011 ALTER SYSTEM SET db_flashback_retention_target=4320 SCOPE=BOTH; Tue Oct 18 13:06:00 2011 Shutting down instance: further logons disabled Tue Oct 18 13:06:00 2011 Stopping background process CJQ0 Tue Oct 18 13:06:00 2011 Stopping background process MMNL Tue Oct 18 13:06:01 2011 Stopping background process MMON Tue Oct 18 13:06:02 2011 Shutting down instance (immediate) License high water mark = 1 Tue Oct 18 13:06:02 2011 Stopping Job queue slave processes, flags = 7 Tue Oct 18 13:06:02 2011 Job queue slave processes stopped Tue Oct 18 13:06:02 2011 ALTER DATABASE CLOSE NORMAL ORA-1109 signalled during: ALTER DATABASE CLOSE NORMAL... Tue Oct 18 13:06:02 2011 ALTER DATABASE DISMOUNT Completed: ALTER DATABASE DISMOUNT ARCH: Archival disabled due TO shutdown: 1089 Shutting down archive processes Archiving IS disabled Archive process shutdown avoided: 0 active ARCH: Archival disabled due TO shutdown: 1089 Shutting down archive processes Archiving IS disabled Archive process shutdown avoided: 0 active Tue Oct 18 13:07:29 2011 Starting ORACLE instance (normal) sskgpgetexecname failed TO GET name LICENSE_MAX_SESSION = 0 LICENSE_SESSIONS_WARNING = 0 Picked latch-free SCN scheme 3 Autotune OF undo retention IS turned ON. IMODE=BR ILAT =616 LICENSE_MAX_USERS = 0 SYS auditing IS disabled ksdpec: called FOR event 13740 prior TO event GROUP initialization Starting up ORACLE RDBMS Version: 10.2.0.4.0. System parameters WITH non-DEFAULT VALUES: _trace_files_public = TRUE processes = 2800 sessions = 5600 timed_statistics = TRUE __shared_pool_size = 905969664 shared_pool_size = 838860800 __large_pool_size = 16777216 __java_pool_size = 16777216 __streams_pool_size = 0 shared_pool_reserved_size= 83886080 nls_language = american nls_territory = america nls_sort = BINARY nls_date_format = DD-MON-RR nls_numeric_characters = ., nls_comp = BINARY nls_length_semantics = CHAR sga_target = 8589934592 control_files = /oradata/accdb/cntrl01.dbf, /oradata/accdb/cntrl02.dbf, /oradata/accdb/cntrl03.dbf db_block_checksum = TRUE db_block_size = 8192 __db_cache_size = 7633633280 compatible = 10.2.0 log_archive_config = DG_CONFIG=(ACCDB,ACCPRI) log_archive_dest_1 = LOCATION=/archlog log_archive_dest_2 = SERVICE=ACCDB LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=ACCDB log_archive_dest_state_1 = enable log_archive_dest_state_2 = enable log_archive_max_processes= 3 fal_client = ACCPRI fal_server = ACCDB log_buffer = 14230528 log_checkpoint_interval = 100000 log_checkpoint_timeout = 1200 db_files = 512 db_recovery_file_dest = /flashback db_recovery_file_dest_size= 214748364800 standby_file_management = AUTO log_checkpoints_to_alert = FALSE db_flashback_retention_target= 4320 dml_locks = 10000 undo_management = AUTO undo_tablespace = APPS_UNDOTS1 db_block_checking = FALSE db_domain = ccic global_names = TRUE session_cached_cursors = 500 utl_file_dir = /usr/tmp, /usr/tmp, /oracle/accdb/10.2.0/appsutil/outbound/ACCDB_accountdb, /usr/tmp plsql_native_library_dir = /oracle/accdb/10.2.0/plsql/nativelib plsql_native_library_subdir_count= 149 plsql_code_type = INTERPRETED plsql_optimize_level = 2 job_queue_processes = 2 _system_trig_enabled = TRUE cursor_sharing = EXACT parallel_min_servers = 0 parallel_max_servers = 8 background_dump_dest = /oracle/admin/ACCDB/bdump user_dump_dest = /oracle/admin/ACCDB/udump max_dump_file_size = 20480 core_dump_dest = /oracle/admin/ACCDB/cdump db_name = ACCDB db_unique_name = ACCPRI open_cursors = 1000 _sort_elimination_cost_ratio= 5 _b_tree_bitmap_plans = FALSE _fast_full_scan_enabled = FALSE _sqlexec_progression_cost= 2147483647 _like_with_bind_as_equality= TRUE pga_aggregate_target = 4294967296 workarea_size_policy = AUTO optimizer_secure_view_merging= FALSE aq_tm_processes = 1 olap_page_pool_size = 41943040 PMON started WITH pid=2, OS id=3736548 PSP0 started WITH pid=3, OS id=5767232 MMAN started WITH pid=4, OS id=2556302 DBW0 started WITH pid=5, OS id=5112274 DBW1 started WITH pid=6, OS id=4457358 DBW2 started WITH pid=7, OS id=3604948 DBW3 started WITH pid=8, OS id=4915548 LGWR started WITH pid=9, OS id=3867168 CKPT started WITH pid=10, OS id=2818092 SMON started WITH pid=11, OS id=2949892 RECO started WITH pid=12, OS id=4063580 CJQ0 started WITH pid=13, OS id=4785010 MMON started WITH pid=14, OS id=4588312 MMNL started WITH pid=15, OS id=3015526 Tue Oct 18 13:07:31 2011 ALTER DATABASE MOUNT Tue Oct 18 13:07:35 2011 Setting recovery target incarnation TO 1 Tue Oct 18 13:07:35 2011 Successful mount OF redo thread 1, WITH mount id 830491155 Tue Oct 18 13:07:35 2011 DATABASE mounted IN Exclusive Mode Completed: ALTER DATABASE MOUNT Tue Oct 18 13:07:55 2011 ALTER DATABASE flashback ON Tue Oct 18 13:07:55 2011 db_recovery_file_dest_size OF 204800 MB IS 0.01% used. This IS a user-specified LIMIT ON the amount OF SPACE that will be used BY this DATABASE FOR recovery-related files, AND does NOT reflect the amount OF SPACE available IN the underlying filesystem OR ASM diskgroup. Allocated 15265600 bytes IN shared pool FOR flashback generation buffer Starting background process RVWR RVWR started WITH pid=17, OS id=4260856 Tue Oct 18 13:07:55 2011 Flashback DATABASE Enabled Tue Oct 18 13:07:55 2011 Errors IN file /oracle/admin/ACCDB/udump/accdb_ora_4129208.trc: ORA-00600: internal error code, arguments: [krfg_first_barrier1], [1], [], [], [], [], [], [] ORA-600 signalled during: ALTER DATABASE flashback ON... |
之所以要列出这么多的记录是有意义的,对于一般的ORA-600而言,只需要后面4行而已,但是对于当前的问题是不够的,事实上如果不是篇幅,我打算把整个alert文件中的内容都贴上。
简单介绍一下情况,客户打算在STANDBY数据库上打开FLASHBACK,这样备库可以打开进行读写修改,在修改完毕后,可以恢复到打开之前的状态,再次转换成STANDBY数据库的角色,而不需要由于备库的打开,重建整个STANDBY数据库。
本来这个操作在客户这边已经进行过几次,从来没有碰到过什么问题,不过这次这个数据库的STANDBY环境是新建立的,而FLASHBACK这种新技术,碰到什么bug也不奇怪。
发现这个错误后,首先去metalink上查询,结果发现这个错误在metalink上根本没有记录,看来这时一个出现几率很低的问题。虽然FLASHBACK用的人比较少,但是从10g出现,到11.2,还是经历了一段时间了,如果是常规的问题,不太可能没有任何记录。
详细的错误日志文件:
Dump file /oracle/admin/ACCDB/udump/accdb_ora_4129208.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /oracle/product/db10gr2
System name: AIX
Node name: Paccdb2
Release: 1
Version: 6
Machine: 00F67AD94C00
Instance name: ACCDB
Redo thread mounted by this instance: 1
Oracle process number: 16
Unix process pid: 4129208, image: oracle@Paccdb2 (TNS V1-V3)
*** ACTION NAME:() 2011-10-18 13:07:55.437
*** MODULE NAME:(sqlplus@Paccdb2 (TNS V1-V3)) 2011-10-18 13:07:55.437
*** SERVICE NAME:() 2011-10-18 13:07:55.437
*** SESSION ID:(5586.3) 2011-10-18 13:07:55.437
*** 2011-10-18 13:07:55.437
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [krfg_first_barrier1], [1], [], [], [], [], [], []
Current SQL statement for this session:
alter database flashback on
—– Call Stack Trace —–
calling call entry argument values in hex
location type point (? means dubious value)
——————– ——– ——————– —————————-
ksedst+001c bl ksedst1 000000000 ? 000000000 ?
ksedmp+0290 bl ksedst 104A52AA8 ?
ksfdmp+0018 bl 03F4E238
kgerinv+00dc bl _ptrgl
kgeasnmierr+004c bl kgerinv 000000000 ? 000000000 ?
000009732 ? 00147AE14 ?
700000010018078 ?
krfg_first_barrier_ bl kgeasnmierr 110196398 ? 110490040 ?
w_fba+00e8 104C03194 ? 100000001 ?
000000000 ? 000000001 ?
104C03D84 ? 1056A1010 ?
krfg_first_barrier+ bl krfg_first_barrier_ 000000000 ? 000000000 ?
001c w_fba 000000000 ? 004D25FF8 ?
krf_enable_flashbac bl krfg_first_barrier
k_db+0098
adbdrv+d9b0 bl krf_enable_flashbac 000000000 ?
k_db
opiexe+2c18 bl adbdrv
opiosq0+19f0 bl opiexe 000000000 ? 000000000 ?
FFFFFFFFFFF8EB0 ?
kpooprx+0168 bl opiosq0 3EC779438 ? 700000010003520 ?
7000001EC7793B8 ?
A4000110196398 ?
kpoal8+0400 bl kpooprx FFFFFFFFFFFB704 ?
FFFFFFFFFFFB470 ?
1B0000001B ? 100000001 ?
000000000 ? A40000000000A4 ?
000000000 ? 1103A80D8 ?
opiodr+0ae0 bl _ptrgl
ttcpip+1020 bl _ptrgl
opitsk+1124 bl ttcpip 1100CB7A0 ? FFFFFFFFFFFBA00 ?
000000000 ? 11044F7B0 ?
FFFFFFFFFFFB970 ?
442442226B736800 ?
9000000000D9B74 ? 110451E00 ?
opiino+0990 bl opitsk 1E00000000 ? 000000000 ?
opiodr+0ae0 bl _ptrgl
opidrv+0484 bl 01FBA1C8
sou2o+0090 bl opidrv 3C02ABB45C ? 4A0071EB0 ?
FFFFFFFFFFFF640 ?
opimai_real+01bc bl 01FB7F8C
main+0098 bl opimai_real 000000000 ? 000000000 ?
__start+0070 bl main 000000000 ? 000000000 ?
——————— Binary Stack Dump ———————
简略的看了一下TRACE文件,并没有发现可疑的地方,查询了一下数据库的状态:
SQL> SELECT name, log_mode, controlfile_type, open_mode, database_role, flashback_on FROM v$database; NAME LOG_MODE CONTROLFILE_TY OPEN_MODE DATABASE_ROLE FLASHBACK_ON ---------- --------------- -------------- ---------- --------------- --------------- ACCDB ARCHIVELOG BACKUP MOUNTED PRIMARY YES |
虽然报了ORA-600错误,但是FLASHBACK已经打开了,而且在对应的未知也发现FLASHBACK日志文件成功创建。此外还有一个疑点,就是控制文件和数据库的角色不正确,按道理来说,控制文件的状态应该是STANDBY,而数据库的角色也应该是STANDBY。于是和具体操作的DBA沟通了一下,发现了问题的所在。
原来这个STANDBY环境还根本没有建立起来,现在数据库只是利用BACKUP的控制文件恢复了控制文件而已,数据库的数据文件还根本就没有恢复,也就是说不但没有建立成STANDBY环境,数据库本身还不存在,这时候开启FLASHBACK当然会有问题。
正常应该先创建STANDBY环境,确认主库和备库之间关系建立且日志应用正常,然后在备库上开启FLASHBACK。这样看来,这个ORA-600错误在metalink上找不到也是正常的,毕竟这个错误太低级了。
不过Oracle应该将错误信息更明确一些,而不是报一个一般人都看不懂的ORA-600错误就完事了,从这个角度讲,这个问题也算是一个小bug。