ORA-600(1616)和ORA-569错误

出现错误的数据库为Oracle 10.2.0.4 RAC for HP-UX。
在RAC的一个节点上出现了ORA-600[1616]错误和ORA-569错误:

Sat Apr 7 03:03:09 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed TO acquire global enqueue.
Sat Apr 7 03:04:14 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed TO acquire global enqueue.
Sat Apr 7 03:04:15 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed TO acquire global enqueue.
.
.
.
Sat Apr 7 03:06:35 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed TO acquire global enqueue.
Sat Apr 7 03:07:42 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl2_smon_27471.trc:
ORA-00569: Failed TO acquire global enqueue.
Sat Apr 7 03:07:42 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl2_mmon_27483.trc:
ORA-00600: internal error code, arguments: [1616], [30], [1], [], [], [], [], []
Sat Apr 7 03:07:44 2012
Trace dumping IS performing id=[cdmp_20120407030744]
Sat Apr 7 03:07:44 2012
ORA-600 encountered WHEN generating server alert SMG-3503
Sat Apr 7 03:07:57 2012
Reconfiguration started (OLD inc 24, NEW inc 26)
List OF nodes:
1
Global Resource Directory frozen
* dead instance detected - DOMAIN 0 invalid = TRUE 
Communication channels reestablished
Master broadcasted resource hash VALUE bitmaps
Non-LOCAL Process blocks cleaned OUT
Sat Apr 7 03:08:03 2012
LMS 0: 1 GCS shadows cancelled, 0 closed
Sat Apr 7 03:08:03 2012
LMS 1: 1 GCS shadows cancelled, 0 closed
SET master node info 
Submitted ALL remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
ALL grantable enqueues GRANTED
Sat Apr 7 03:08:13 2012
Reconfiguration started (OLD inc 26, NEW inc 28)
List OF nodes:
0 1
Nested reconfiguration detected. 
Global Resource Directory frozen
Communication channels reestablished
* DOMAIN 0 valid = 0 according TO instance 0 
Sat Apr 7 03:08:13 2012
Master broadcasted resource hash VALUE bitmaps
Non-LOCAL Process blocks cleaned OUT
Sat Apr 7 03:08:17 2012
LMS 1: 0 GCS shadows cancelled, 0 closed
Sat Apr 7 03:08:17 2012
LMS 0: 0 GCS shadows cancelled, 0 closed
SET master node info 
Sat Apr 7 03:08:39 2012
Submitted ALL remote-enqueue requests
Dwn-cvts replayed, VALBLKs dubious
ALL grantable enqueues GRANTED
Sat Apr 7 03:08:51 2012
LMS 0: 7542 GCS shadows traversed, 4001 replayed
Sat Apr 7 03:08:51 2012
LMS 1: 7472 GCS shadows traversed, 4001 replayed
Sat Apr 7 03:08:52 2012
LMS 0: 7481 GCS shadows traversed, 4001 replayed
Sat Apr 7 03:08:52 2012
LMS 1: 7522 GCS shadows traversed, 4001 replayed
Sat Apr 7 03:08:52 2012
LMS 0: 7450 GCS shadows traversed, 4001 replayed
.
.
.
Sat Apr 7 03:08:52 2012
LMS 1: 7550 GCS shadows traversed, 4001 replayed
Sat Apr 7 03:08:52 2012
LMS 0: 6396 GCS shadows traversed, 3431 replayed
Sat Apr 7 03:08:52 2012
LMS 1: 5339 GCS shadows traversed, 2878 replayed
Sat Apr 7 03:08:52 2012
Submitted ALL GCS remote-cache requests
Sat Apr 7 03:08:52 2012
Instance recovery: looking FOR dead threads
Fix WRITE IN gcs resources
Sat Apr 7 03:08:52 2012
Beginning instance recovery OF 1 threads
Reconfiguration complete
Sat Apr 7 03:08:53 2012
parallel recovery started WITH 7 processes
Sat Apr 7 03:08:53 2012
Started redo scan
Sat Apr 7 03:08:53 2012
Completed redo scan
2550 redo blocks READ, 152 DATA blocks need recovery
Sat Apr 7 03:08:54 2012
Started redo application at
Thread 1: logseq 75561, block 256542
Sat Apr 7 03:08:54 2012
Recovery OF Online Redo Log: Thread 1 GROUP 1 Seq 75561 Reading mem 0
Mem# 0: +DATA/orcl/onlinelog/group_1.261.611597975
Sat Apr 7 03:08:54 2012
Completed redo application
Sat Apr 7 03:08:54 2012
Completed instance recovery at
Thread 1: logseq 75561, block 259092, scn 52674829295
149 DATA blocks READ, 137 DATA blocks written, 2550 redo blocks READ
Switch log FOR thread 1 TO SEQUENCE 75562

虽然ORA-600[1616]错误在MOS中完全没有记载,但是根据错误信息分析,这个ORA-600错误应该是ORA-569错误导致的,而ORA-569的错误显然和另外一个实例的状态有关。可以看到在问题发生不久之后,另外一个节点发生了关闭和启动的操作。
那么检查一下另外一个节点对应时刻的日志:

Thu Apr 5 23:06:35 2012
Thread 1 advanced TO log SEQUENCE 75470
CURRENT log# 6 seq# 75470 mem# 0: +DATA/orcl/onlinelog/group_6.258.611597995
Thu Apr 5 23:59:38 2012
Thread 1 advanced TO log SEQUENCE 75471
CURRENT log# 2 seq# 75471 mem# 0: +DATA/orcl/onlinelog/group_2.260.611597981
Fri Apr 6 00:20:12 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:15 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:15 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:16 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:16 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:17 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:17 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:18 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:18 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:19 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:19 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:20 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:20 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:20 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:21 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:21 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:22 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:22 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
Fri Apr 6 00:20:23 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLLJ.OLD_000 have been marked unusable
.
.
.
Fri Apr 6 10:55:20 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLTJ.OLD_000 have been marked unusable
Fri Apr 6 10:55:22 2012
SOME indexes OR INDEX [sub]partitions OF TABLE ORCLTJ.OLD_000 have been marked unusable
Fri Apr 6 10:58:54 2012
Thread 1 advanced TO log SEQUENCE 75530
CURRENT log# 6 seq# 75530 mem# 0: +DATA/orcl/onlinelog/group_6.258.611597995
Fri Apr 6 11:20:34 2012
Thread 1 advanced TO log SEQUENCE 75531
CURRENT log# 2 seq# 75531 mem# 0: +DATA/orcl/onlinelog/group_2.260.611597981
.
.
.
Fri Apr 6 16:53:40 2012
Thread 1 advanced TO log SEQUENCE 75543
CURRENT log# 2 seq# 75543 mem# 0: +DATA/orcl/onlinelog/group_2.260.611597981
Fri Apr 6 17:07:15 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl1_smon_15533.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable TO allocate 4120 bytes OF shared memory ("shared pool","select increment$,minvalue,m...","Typecheck","kgghteInit")
Fri Apr 6 17:07:22 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl1_smon_15533.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable TO allocate 4120 bytes OF shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit")
Fri Apr 6 17:07:23 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl1_smon_15533.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable TO allocate 4120 bytes OF shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit")
Fri Apr 6 17:07:40 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl1_smon_15533.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable TO allocate 4120 bytes OF shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit")
Fri Apr 6 17:07:41 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl1_smon_15533.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable TO allocate 4120 bytes OF shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit")
Fri Apr 6 17:15:53 2012
Thread 1 advanced TO log SEQUENCE 75544
CURRENT log# 5 seq# 75544 mem# 0: +DATA/orcl/onlinelog/group_5.259.611597989
Fri Apr 6 17:34:28 2012
Thread 1 advanced TO log SEQUENCE 75545
CURRENT log# 1 seq# 75545 mem# 0: +DATA/orcl/onlinelog/group_1.261.611597975
.
.
.
Sat Apr 7 03:00:22 2012
Thread 1 advanced TO log SEQUENCE 75561
CURRENT log# 1 seq# 75561 mem# 0: +DATA/orcl/onlinelog/group_1.261.611597975
Sat Apr 7 03:03:18 2012
WARNING: ran OUT OF shared pool FOR GES resource object.
Sat Apr 7 03:03:49 2012
WARNING: ran OUT OF shared pool FOR GES resource object.
Sat Apr 7 03:04:20 2012
WARNING: ran OUT OF shared pool FOR GES resource object.
Sat Apr 7 03:04:51 2012
WARNING: ran OUT OF shared pool FOR GES resource object.
Sat Apr 7 03:05:22 2012
WARNING: ran OUT OF shared pool FOR GES resource object.
Sat Apr 7 03:05:54 2012
WARNING: ran OUT OF shared pool FOR GES resource object.
Sat Apr 7 03:06:24 2012
WARNING: ran OUT OF shared pool FOR GES resource object.
Sat Apr 7 03:07:42 2012
WARNING: ran OUT OF shared pool FOR GES resource object.
Sat Apr 7 03:07:44 2012
Trace dumping IS performing id=[cdmp_20120407030744]
Sat Apr 7 03:07:45 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl1_lmd0_15494.trc:
ORA-00600: internal error code, arguments: [kjrpar:noalloc], [1], [385], [0x60000000000C09B8], [0xC0000002137539D8], [0xC000000213752748], [], []
Sat Apr 7 03:07:46 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl1_lmd0_15494.trc:
ORA-00600: internal error code, arguments: [kjrpar:noalloc], [1], [385], [0x60000000000C09B8], [0xC0000002137539D8], [0xC000000213752748], [], []
Sat Apr 7 03:07:54 2012
Errors IN file /u01/app/oracle/admin/ORCL/bdump/orcl1_lmd0_15494.trc:
ORA-00600: internal error code, arguments: [kjrpar:noalloc], [1], [385], [0x60000000000C09B8], [0xC0000002137539D8], [0xC000000213752748], [], []
Sat Apr 7 03:07:54 2012
LMD0: terminating instance due TO error 482
Sat Apr 7 03:07:54 2012
Shutting down instance (abort)
License high water mark = 56
Sat Apr 7 03:07:58 2012
Instance TERMINATED BY LMD0, pid = 15494
Sat Apr 7 03:07:59 2012
Instance TERMINATED BY USER, pid = 8118
Sat Apr 7 03:08:05 2012
Starting ORACLE instance (normal)
Sat Apr 7 03:08:05 2012
Specified VALUE OF sga_max_size IS too small, bumping TO 6627000320
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Interface TYPE 1 lan901 192.168.0.0 configured FROM OCR FOR USE AS a cluster interconnect
Interface TYPE 1 lan900 198.5.1.0 configured FROM OCR FOR USE AS a public interface
Picked latch-free SCN scheme 3
Autotune OF undo retention IS turned off. 
LICENSE_MAX_USERS = 0
SYS auditing IS disabled
My pid IS 8415, priority IS -179, running WITH SCHED_NOAGE policy 8. 
ksdpec: called FOR event 13740 prior TO event GROUP initialization
Starting up ORACLE RDBMS Version: 10.2.0.3.0.
System parameters WITH non-DEFAULT VALUES:
processes = 300
sessions = 335
timed_statistics = FALSE
sga_max_size = 6627000320
lock_sga = FALSE
shared_pool_size = 4294967296
large_pool_size = 67108864
java_pool_size = 33554432
streams_pool_size = 16777216
shared_pool_reserved_size= 263402290
spfile = +DATA/orcl/spfileorcl.ora
sga_target = 0
control_files = +DATA/orcl/controlfile/current1.ora, +ORCL_DATA/orcl/controlfile/current2.ora
db_block_size = 8192
db_cache_size = 2147483648
db_writer_processes = 2
compatible = 10.2.0.2.0
log_archive_dest_1 = LOCATION=/u01/app/oradata/orcl_1_arch1/arch2tape
log_archive_dest_2 = LOCATION=/u01/app/oradata/orcl_1_arch2/arch2tape
log_archive_format = ORCL_%t_%s_%r.ARC
log_buffer = 52428800
log_checkpoint_timeout = 1800
db_file_multiblock_read_count= 8
cluster_database = TRUE
cluster_database_instances= 2
db_create_file_dest = +ORCL_DATA
db_create_online_log_dest_1= +DATA
thread = 1
fast_start_mttr_target = 600
instance_number = 1
undo_management = AUTO
undo_tablespace = UNDOTBS1
undo_retention = 10800
remote_login_passwordfile= EXCLUSIVE
db_domain = 
dispatchers = (PROTOCOL=TCP)(DISPATCHERS=1)
max_shared_servers = 32
max_dispatchers = 2
local_listener = 
remote_listener = LISTENERS_ORCL
job_queue_processes = 10
background_dump_dest = /u01/app/oracle/admin/ORCL/bdump
user_dump_dest = /u01/app/oracle/admin/ORCL/udump
max_dump_file_size = 52428800
core_dump_dest = /u01/app/oracle/admin/ORCL/cdump
audit_file_dest = /u01/app/oracle/admin/ORCL/adump
hpux_sched_noage = 178
open_links = 30
db_name = ORCL
open_cursors = 300
star_transformation_enabled= TRUE
pga_aggregate_target = 1073741824
Cluster communication IS configured TO USE the following interface(s) FOR this instance
192.168.0.111
Sat Apr 7 03:08:10 2012
cluster interconnect IPC version:Oracle UDP/IP (generic)
IPC Vendor 1 proto 2
PMON started WITH pid=2, OS id=8596
DIAG started WITH pid=4, OS id=8606
PSP0 started WITH pid=6, OS id=8608
LMON started WITH pid=8, OS id=8614
LMD0 started WITH pid=10, OS id=8616
LMS0 started WITH pid=12, OS id=8618
LMS1 started WITH pid=3, OS id=8620
MMAN started WITH pid=14, OS id=8623
DBW0 started WITH pid=16, OS id=8625
DBW1 started WITH pid=5, OS id=8644
LGWR started WITH pid=7, OS id=8657
CKPT started WITH pid=18, OS id=8659
SMON started WITH pid=20, OS id=8661
RECO started WITH pid=22, OS id=8663
CJQ0 started WITH pid=24, OS id=8665
MMON started WITH pid=26, OS id=8667
Sat Apr 7 03:08:12 2012
starting up 1 dispatcher(s) FOR network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
MMNL started WITH pid=28, OS id=8669

从节点1上的告警日志基本上就可以判断出问题的原因了。显然在问题发生的前一天夜里,执行了大量的分区维护操作,致使大量的分区索引失效。这个分区维护的操作导致大量的分区源数据被加载到共享池中,随后不久,Oracle就出现了共享池不足的ORA-4031错误。在节点2出现ORA-569错误的同时,节点1出现了WARNING: ran out of shared pool for GES resource object信息,说明这时节点1上共享池不足导致了节点2上全局队列失败。
而后节点1由于资源不足导致了ORA-600[kjrpar:noalloc]错误的出现,并导致了实例的崩溃,关于这个错误的更详细描述可以参考:https://yangtingkun.net/?p=620
这两个错误实际上还是由于RAC另外节点上共享池资源不足导致的,要避免这两个错误,还是要解决ORA-4031的错误。

This entry was posted in BUG and tagged , , , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *