Oracle,The controlfile might be corrupted.[VMware/CEntOS]

举报
杰森志智 发表于 2019/01/28 21:21:52 2019/01/28
【摘要】 对于初次遇到此类错误的情况来说会非常棘手,因为控制文件是Oracle数据库正常启动并工作的基础。The controlfile header block returned by the OS has a sequence number that is too old. The controlfile might be corrupted.[VMware/CEntOS]环境:Oracle 11...

对于初次遇到此类错误的情况来说会非常棘手,因为控制文件是Oracle数据库正常启动并工作的基础。

The controlfile header block returned by the OS has a sequence number that is too old. The controlfile might be corrupted.[VMware/CEntOS]


环境:Oracle 11.2, CEntOS 6.6, VMware

原因:磁盘I/O性能低效,VMware虚拟机备份占用大量磁盘I/O吞吐能力

现象:最后的alert日志内容

Fri Feb 03 17:38:57 2017
********************* ATTENTION: ********************
 The controlfile header block returned by the OS
 has a sequence number that is too old.
 The controlfile might be corrupted.
 PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE
 without following the steps below.
 RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE
 TO THE DATABASE, if the controlfile is truly corrupted.
 In order to re-start the instance safely,
 please do the following:
 (1) Save all copies of the controlfile for later
     analysis and contact your OS vendor and Oracle support.
 (2) Mount the instance and issue:
     ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
 (3) Unmount the instance.
 (4) Use the script in the trace file to
     RE-CREATE THE CONTROLFILE and open the database.
*****************************************************
USER (ospid: 30341): terminating the instance
Fri Feb 03 17:38:58 2017
System state dump requested by (instance=1, osid=30341 (PR00)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/wmsdst/WMSDI/trace/WMSDI_diag_30193_20170203173858.trc
Dumping diagnostic data in directory=[cdmp_20170203173858], requested by (instance=1, osid=30341 (PR00)), summary=[abnormal instance termination].
Instance terminated by USER, pid = 30341

问题处理:定位并消除磁盘性能影响因素,查询并按照MOS上的最新解决方案处理。

简单处理:请先备份每一个在用的控制文件、因为我们并不知道哪一个是最新的。

然后可以startup mount;看看哪一个控制文件的sequence是最新的,复制覆盖其他未同步的控制文件。

至于MOS上的BUG 14281768 - CONTROL FILE GETS CORRUPTEDOLUTION尽可能不要这么处理,除非在限定短时间内找不到磁盘I/O性能为何低效时临时处理。


Error is typically raised when the Controlfile is overwritten by an older copy of the Controlfile. Most likely this happened due to Storage OR I/o error.
All copies of the control file must have the same internal sequence number for Oracle to start up the database or shut it down in normal or immediate mode.

To make a sanity check in the future , please set the following parameter :-

SQL> alter system set "_controlfile_update_check"='HIGH' scope=spfile; -- then bounce the database.

Please check with your OS System/Storage admin regarding the issue.

The precautions is to relocate the control file on a fast and direct I/O enabled disk , the main target is not letting the OS to write an old copy (cached copy of the controlfile to it).
To reverse the parameter setting :-

SQL> alter system set "_controlfile_update_check"='OFF' scope=spfile; -- then bounce the database.

 

更多问题触发场景:

VMware上的Oracle数据库最佳实践指南 54页

12. Backup and Recovery
12.1 Oracle Backup and Recovery Overview
The main purpose of a database backup and recovery strategy is to protect the database against data loss and reconstruct the database after data loss. Typical backup tasks performed by an Oracle DBA would include setting up the database environment for backup and recovery, setting up a backup schedule, monitoring the backup and recovery environment, and troubleshooting backup problems
A backup can be either a physical or a logical backup. Physical backups are physical copies of the database files which include data files, control files, and archive log files. Logical backups contain a logical copy of the data, such as tables, indexes, procedures, functions, and so on. You can use Oracle Data Pump to export logical data to binary files, which you can later import into the database.
There are levels of triggering Oracle database backups within the VMware environment:
* In guest Oracle backup using Oracle Recovery Manager (RMAN)
* VMware level backup using VMware vSphere Data Protection™ / VMware vSphere Data Protection Advanced
* Storage based backup tools
* vSphere Virtual Volumes using vSphere 6.0
vSphere recommends either using Oracle Recovery Manager (RMAN), storage-based backup tools, or vSphere Virtual Volumes using vSphere 6.0.
12.2 Oracle Recovery Manager (RMAN)
For implementing an effective Oracle database backup and recovery strategy, Oracle Recovery Manager (RMAN) is typically the preferred Oracle solution.
RMAN provides a common interface for backup tasks across different host operating systems, and offers several backup techniques not available through user-managed methods.
The method of deploying and using RMAN to backup an Oracle database does not change when virtualizing an Oracle database. It is same across both physical and virtualized environments.
For more information on Oracle Recovery Manager, see the Oracle documentation at https://docs.oracle.com/database/121/BRADV/toc.htm.
12.3 vSphere Data Protection
Any virtual machine VMDK can be backed up with VMware snapshot technology as long as it is not set to Independent-Persistent mode.
A virtual machine housing an Oracle database has two types of VMDKs—guest OS VMDK and the VMDKs housing the Oracle data files.
VMware does not recommend that you back up a high transactional, heavy I/O-centric Oracle database using VMware snapshot technology because, during the snapshot removal (consolidation), there is a brief stun moment. No activity is permitted against the virtual machine, which might result in performance issue and service disruptions.
For more information, see A snapshot removal can stop a virtual machine for long time (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002836).
You can, however, back up Oracle non-production databases (development, test, QA, pre-production, and so on) using VMware snapshot technology.

此文地址:http://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/solutions/vmware-oracle-databases-on-vmware-best-practices-guide.pdf

当启用更改块跟踪 (CBT) 后,备份虚拟机失败 (2119254)

Symptoms

免责声明: 本文为 Backing up the virtual machine fails when CBT is enabled (2114076) 的翻译版本。 尽管我们会不断努力为本文提供最佳翻译版本,但本地化的内容可能会过时。 有关最新内容,请参见英文版本。


  • 在主机上升级至 VMware ESXi 6.0.x 后备份启用了更改块跟踪 (CBT) 的虚拟机失败

  • 在主机上安装 VMware ESXi 6.0.x 后备份启用了更改块跟踪 (CBT) 的虚拟机失败

  • 打开虚拟机电源失败。

  • 扩展虚拟磁盘大小失败。

  • 生成虚拟机静止快照失败。

  • vSphere Client 显示类似以下内容的错误:

    An error occurred while taking a snapshot: msg.snapshot.error-QUIESCINGERROR

    注意: 此错误可能存在,也可能不存在

  • 在正在运行受影响虚拟机的 ESXi 主机上的 /var/log/vmkernel.log 文件中,您会看到类似以下内容的错误:

    <YYYY-MM-DD>T<TIME>.623Z cpu5:809536)WARNING: CBT: 191: No memory available! Called from 0x4180219af50e
    <YYYY-MM-DD>T<TIME>.637Z cpu5:809536)WARNING: CBT: 191: No memory available! Called from 0x4180219af50e
    <YYYY-MM-DD>T<TIME>.648Z cpu5:809536)WARNING: CBT: 191: No memory available! Called from 0x4180219af50e

  • 在受影响虚拟机的 vmware.log 文件中,您会看到类似以下内容的条目:

    vcpu-0| I120: DISKLIB-CBT : Creating cbt node 92b78c-cbt failed with error Cannot allocate memory (0xbad0014, Out of memory)

Purpose

要解决在启用 CBT 后备份虚拟机发生故障的问题,升级到 ESXi 6.0 Build 2715440。

Cause

出现此问题的原因是堆耗尽。 尝试启用更改块跟踪 (CBT) 时可能会出现此问题。 如果虚拟机中的大量虚拟磁盘达到阈值上限,则会因为堆耗尽而导致 CBT 启用失败。 多个虚拟机启用了 CBT 时也会出现此问题。 对于启用了 VSS 的 Windows 虚拟机,生成静止快照会产生双倍的内存开销。 最后,如果堆即将耗尽,则执行 vMotion 也会引起此问题,因为该过程也涉及生成快照。

注意: 虚拟磁盘可跨虚拟机,也可位于一个虚拟机内。

Resolution

该问题在 VMware ESXi 6.0 Build 2715440 中已得到解决,后者可从 VMware Downloads 获取。 有关详细信息,请参见 VMware ESXi 6.0, Patch Release ESXi600-201505001 (2116125)

Additional Information

有关启用和禁用 CBT 的详细信息,请参见在虚拟机上启用块修改跟踪 (CBT) (2078214)

此文地址:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2119254


http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2114076



原发:

https://www.cnblogs.com/smartree/p/6369486.html


【版权声明】本文为华为云社区用户原创内容,转载时必须标注文章的来源(华为云社区)、文章链接、文章作者等基本信息, 否则作者和本社区有权追究责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容,举报邮箱: cloudbbs@huaweicloud.com
  • 点赞
  • 收藏
  • 关注作者

评论(0

0/1000
抱歉,系统识别当前为高风险访问,暂不支持该操作

全部回复

上滑加载中

设置昵称

在此一键设置昵称,即可参与社区互动!

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。