【数据恢复】详解ORA-1410错误

ORA-1410 invalid rows错误是与ORA-8103相似的Oracle数据库逻辑层面的讹误。

 

如果自己搞不定可以找诗檀软件专业ORACLE数据库修复团队成员帮您恢复!

诗檀软件专业数据库修复团队

服务热线 : 13764045638   QQ号:47079569    邮箱:[email protected]

 

了解ORA-1410逻辑坏块问题的成因,以及有效的解决手段十分重要。

解决方案之一:

可以通过如下PL/SQL过程将健康数据复制到新建表中,对于问题数据块中的数据将被跳过,对于能够容忍数据丢失的场景可以考虑这样恢复,之后truncate 原表/分区并将健康数据加载进去。 具体的脚本见下面的链接:

【数据恢复】利用构造ROWID实现无备份情况下绕过ORA-1578、ORA-8103、ORA-1410等逻辑/物理坏块问题

 

oerr ora 1410
01410, 00000, “invalid ROWID”
// *Cause:
// *Action:

如果对ORA-1410做errorstack 一般会看到下面的LOG:

OBJD MISMATCH typ=6, seg.obj=%d, diskobj=%d, dsflg=%d, dsobj=%d, tid=%d, cls=%d

 

触发ORA-1410错误的stack call一般都是:  kcbgtcr=>kcbzib=>kcbz_check_objd_typ,即在对数据块做逻辑读时运行到kcbz_check_objd_typ函数时,检测到OBJD 不一致的问题。由于seg.obj和diskobj不一致,而10g以后的kcbz_check_objd_typ函数负责验证块上的objd是否mistmatch,若不一致则触发ORA-1410错误。

造成objd mimatch的主要可能有几种:

1、 写丢失 Lost Write, 写丢失造成相关数据块没有为现有对象正常格式化,导致虽然该数据块的checksum是正确的,但对应数据字典却是不一致的。 写丢失也可能由磁盘或卷组镜像同步软件的不完整复制造成。

 

If the on-disk objd is < kcbdsobj, then there is possibility of Oracle messing up or IO layer (OS Cache, Volume mgr etc) missing writes.

 

对于Lost Write在10g版本中没有太好的预防方案,隐藏”_db_lost_write_checking”控制在DBWR写数据文件后立即去读被写的块以便检测出Lost Write,但是该参数对性能的损耗较大,不建议设置。

11g中引入了DB_LOST_WRITE_PROTECT参数配合Data Guard使用可以有效检测出Lost Write问题。

 

DB_LOST_WRITE_PROTECT enables or disables lost write detection. A data block lost write occurs when an I/O subsystem acknowledges the completion of the block write, while in fact the write did not occur in the persistent storage.When the parameter is set to TYPICAL on the primary database, the instance logs buffer cache reads for read-write tablespaces in the redo log, which is necessary for detection of lost writes.

 

When the parameter is set to FULL on the primary database, the instance logs reads for read-only tablespaces as well as read-write tablespaces.

 

When the parameter is set to TYPICAL or FULL on the standby database or on the primary database during media recovery, the instance performs lost write detection.

 

When the parameter is set to NONE on either the primary database or the standby database, no lost write detection functionality is enabled.

 

 

2、 一些DDL操作例如Exchange Partition 造成block级别的不一致,同一个数据块被2个数据对象所使用,而当这2个对象被使用时都可能覆盖问题数据块。 实际上这种情况也可能是Lost Write所引起的。

 

3、 文档Summary Of Bugs Containing ORA 1410 (Doc ID 422771.1)介绍了引起ORA-1410的主要BUG,其中BUG 4592596(Corruption (ORA-1410) from multi-table insert with direct load) 和 BUG 3868753 (Concurrent export / INSERT of ASSM segment can fail with ORA-1410 / ORA-8103)均为对表的Direct path/Parallel INSERT引起后续对表的SELECT操作报ORA-1410错误。

这说明了Direct Path/Parallel Insert操作有小概率引发ORA-1410错误发生的可能,而常规的conventional insert则不会引发ORA-1410。

 

4、 objd mimatch也可能仅仅是Oracle Buffer Cache内存中的block存在不一致,而Disk磁盘上的block仍是完好的。这一般是Oracle Buffer层的BUG引起的,对于该种现象一般flush buffer_cache即可解决问题。虽然在本例中flush buffer_cache未能解决问题,但是若问题仅仅发生在Memory层,则仍建议先考虑flush buffer_cache。

 

针对该由于OBJD MISMATCH所引起的ORA-1410问题可以采取如下措施:

1、 尝试刷新buffer cache:

alter system flush buffer_cache;   ==>如果是RAC建议2个实例都要flush

刷新后再次运行触发ORA-1410错误的语句,若不再报错则说明刷新BUFFER_CACHE有效

 

 

2、 若flush buffer_cache解决不了问题,那么做analyze validate structure 和收集errorstack操作,分析原因:

alter session set events ‘1410 trace  name errorstack  level 3’;

运行会触发ORA-1410的语句,收集生成的trace文件

analyze table XXX   validate structure online;  ==>在线validate structure

@?/rdbms/admin/utlvaild   ==>对于分区对象需要运行utlvaild脚本

analyze table XXX partition (partition_name) validate structure online;

 

 

 

 

3、对于已经在磁盘上形成OBJD MISMATCH现象的数据对象:

a. 考虑通过move table、partition、subpartition来尝试解决该问题

alter table xxx move tablespace;

or

alter table move partition xxx tablespace;

or

alter table move sunpartition xxx tablespace;

 

ORA-1410问题相关的一些BUG罗列如下:
Bug 5637976
Abstract: ORA-8103/ORA-1410 from concurrent INSERT / export on ASSM tables
This occurs in 10gR2 when there are concurrent inserts and direct path exports. The newly created/updated blocks are not being flushed to disk, so the export is getting a stale version of the block from disk.
Fixed in 10.2.0.4 and 11.1.0.6

Unpublished Bug 4592596
Abstract: Corruption (ORA-1410) from multi-table insert with direct load
This error occurs if a SQL plan is compiled for a parallel run with a Degree of Parallelism (DOP) > 1, but at the time of running, due to lack of resources, it runs serial. Then the problem of invalid rowid will happen.
Fixed in 10.2.0.4 and 11.1.0.6.

Bug 5596325
Abstract: Text query gives wrong results or fails with ORA-1410 ORA-29903
If CONTAINS queries return ORA-1410: invalid rowid errors, and there are more than 200,000,000 documents in the index, then you may have encountered this bug.
Fixed in 10.2.0.4 and 11.1.0.6

Unpublished Bug 6444339
Abstract: TRUNCATE/PURGE DOES NOT CLEAN DEPENDENCIES PROPERLY.
DDL statements to an object were not invalidating all dependencies, so a stale rowid could remain in cache and produce a ORA-1410 if used.
Fixed in 11.2 and 10.2.0.5

Bug 8740993
Abstract: ORA-1410 OCCURRED ON ADG STANDBY DATABASE DURING TABLE SCAN.
This bug applies to standby databases and occurs when the standby is re-applying DDL for table drops/truncates/shrinks. The buffer cache is not being updated for the new object numbers.
Fixed in 12.1, 11.2.0.2


Posted

in

by

Tags:

Comments

2 responses to “【数据恢复】详解ORA-1410错误”

  1. macleanliu Avatar
    macleanliu

    Alert Log报ORA-01410异常相关trace文件内容如下:*** SESSION ID:(1045.59915) 2011-07-22 23:39:14.101 ORA-01410: invalid ROWID *** 2011-07-22 23:39:14.101 GATHER_STATS_JOB: GATHER_TABLE_STATS(‘”orcl”‘,’”TAB1″‘,’””‘, …) ORA-01410: invalid ROWID *** 2011-07-23 06:00:00.507 *** CLIENT ID:() 2011-07-23 06:00:00.507 GATHER_STATS_JOB: Stopped by Scheduler.问题分析常见的引起Oracle报ORA-01410的几种可能:1 正在访问的表或者分区发生truncate操作2 正在访问的分区发生exchange操作3 表或者相关索引存在脏数据–需要诊断是否相关的表和索引是否存在数据异常请进行以下操作:1 执行以下语句,观察是否报错select /*+ full(a) */ count(*)from orcl.TAB1 a;2 校验表和相关索引数据是否存在异常–如果需要分析的对象是分区表,需要在准备执行ANALYZE操作的用户(比如system)下,创建表INVALID_ROWS,这个是一次性操作,以后校验不用再执行@?rdbmsadminutlvalid.sql–验证相关的表和索引(非分区表)ANALYZE TABLE PVMS.DW_TERM_KPI_BAL_M VALIDATE STRUCTURE CASCADE ONLINE;–验证相关的表和索引(分区表)ANALYZE TABLE PVMS.DW_TERM_KPI_BAL_M VALIDATE STRUCTURE CASCADE ONLINE INTO INVALID_ROWS;

  2. zengmuansha Avatar
    zengmuansha

    上周我的备库某个表的索引出现了无效ROWID ,主库是可以的。备库架构在虚拟机上! 应该你说的写丢失。

Leave a Reply

Your email address will not be published. Required fields are marked *