A customer's database alert log showed skgpspawn failed messages.
The detailed error messages are as follows:
Mon Nov 28 07:59:59 2011
skgpspawn failed:category = 27142, depinfo = 11, op = fork, loc = skgpspawn5
skgpspawn failed:category = 27142, depinfo = 11, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 11, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 11, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 11, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 11, op = fork, loc = skgpspawn3
Mon Nov 28 08:00:20 2011
skgpspawn failed:category = 27142, depinfo = 11, op = fork, loc = skgpspawn5
skgpspawn failed:category = 27142, depinfo = 11, op = fork, loc = skgpspawn3
skgpspawn failed:category = 27142, depinfo = 11, op = fork, loc = skgpspawn3
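Judging from these messages, the problem occurred during the fork operation, that is, while spawning a new process.
The category value 27142 actually corresponds to the ORA-27142 error.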
ORA-27142 could not create new process
Cause: Operating system call error.
Action: Check errno and if possible increase the number of processes.
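As the description shows, the error comes from a failed operating system call, and the suggested action indicates that a limit on the number of processes is the likely cause.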
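A search on MOS turned up two documents that resemble the current situation. One is Skgpspawn Errors In Alert Log, New Connections to Database Fail [ID 435787.1]. The errors recorded in that note are very similar to the ones here; the only difference is that depinfo is 12. The cause in that case was insufficient swap space.
The other, Bug 5141429 – “skgspawn 27142” errors and defunct Oracle processes [ID 5141429.8], describes a problem caused by zombie processes. Again, the error messages in that note differ from the current ones only in depinfo, which is 0 there.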
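Since the three cases differ only in depinfo (11 here, 12 and 0 in the two MOS notes), it can help to pull that field out of the alert log mechanically. The following is a minimal sketch, not an Oracle tool. It assumes that depinfo carries the errno of the failed system call, which the ORA-27142 action text ("Check errno") suggests but does not state outright, and that errno values are numbered as on Linux, Solaris and HP-UX. The names parse_skgpspawn and ERRNO_HINTS are made up for this illustration.

```python
import re

# Assumption: depinfo is the errno of the failed call, numbered as on
# Linux, Solaris and HP-UX. Only the two values discussed here are listed.
ERRNO_HINTS = {
    11: "EAGAIN (resource temporarily unavailable, e.g. a process limit was hit)",
    12: "ENOMEM (not enough memory or swap)",
}

# Matches one "skgpspawn failed" entry, even when several entries
# were pasted onto a single line.
ENTRY_RE = re.compile(
    r"skgpspawn failed:\s*category = (\d+), depinfo = (\d+), "
    r"op = (\w+), loc = (\w+)"
)


def parse_skgpspawn(text):
    """Return one dict per 'skgpspawn failed' entry found in text."""
    entries = []
    for match in ENTRY_RE.finditer(text):
        depinfo = int(match.group(2))
        entries.append({
            "category": int(match.group(1)),
            "depinfo": depinfo,
            "op": match.group(3),
            "loc": match.group(4),
            # Values outside the small table above are left unexplained.
            "hint": ERRNO_HINTS.get(depinfo, "not mapped in this sketch"),
        })
    return entries
```

Under these assumptions, depinfo = 11 on a fork points to a resource limit rather than to memory, which fits the process-limit reading of the ORA-27142 text, while depinfo = 12 fits the swap shortage described in the first MOS note.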
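Checking the operating system finally showed that, besides the presence of zombie processes on the system, an operating system limit was also involved. It was not a shortage of swap space, however: the kernel parameter maxuprc was set too low.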
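A check of this kind can be sketched as follows. The code only processes text that has already been collected, so it makes no claim about which command was used in the original investigation. It assumes a two-column listing of user and process state, one process per line, with a state beginning with "Z" marking a zombie. On Linux, `ps -e -o user,stat` produces such a listing, but the option names vary by platform. The limit passed in stands for the maxuprc value, which has to be read separately with the platform's kernel tuning tool. The function names are hypothetical.

```python
from collections import Counter


def summarize_processes(ps_output):
    """Count processes and zombies per user from 'USER STATE' lines."""
    totals, zombies = Counter(), Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        # Skip blank or short lines and the header row.
        if len(fields) < 2 or fields[0].upper() == "USER":
            continue
        user, state = fields[0], fields[1]
        totals[user] += 1
        # Zombies still occupy process table entries, so they are
        # included in the total as well as counted separately.
        if state.startswith("Z"):
            zombies[user] += 1
    return totals, zombies


def users_near_limit(totals, limit, threshold=0.9):
    """Return users whose process count is at or above threshold * limit."""
    return sorted(u for u, n in totals.items() if n >= limit * threshold)
```

If the database owner's account shows up in users_near_limit, new fork calls for that user are likely to fail, and either the limit needs raising or the zombie processes need to be cleaned up.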