Search Results for: cost

Mysql存储引擎特性比较

下表列出了Mysql 主要数据引擎架构(如myisam、innodb Storage Engine )的特性比较:

 

Feature MyISAM NDB Archive InnoDB Memory
Storage Limits No Yes No 64TB Yes
Transactions No Yes No Yes No
Locking granularity Table Row Row Row Table
MVCC snapshot read No No No Yes No
Geospatial support Yes No Yes Yes No
Data caches No Yes No Yes NA
Index Caches Yes Yes No Yes NA
Compressed Data Yes No Yes No No
Storage cost (relative to other engines) Small Med Small Med NA
Memory cost (relative to other engines) Low High Low High High
Bulk insert speed High High Highest Med High
Replication Support Yes Yes Yes Yes Yes
Foreign Key support No No No Yes No
Built-in Cluster/High-availability support No Yes No No No

 

Dynamically add and remove storage engines.
Change the storage engine on a table with “ALTER TABLE …”

Oracle中比对2张表之间数据是否一致的几种方法

大约是2个星期前做一个夜班的时候,开发人员需要比对shareplex 数据同步复制软件在 源端和目标端的2张表上的数据是否一致,实际上后来想了下shareplex 本身应当具有这种数据校验功能, 但是还是希望从数据库的角度得出几种可用的同表结构下的数据比对方法。

 

注意以下几种数据比对方式适用的前提条件:

 

1. 所要比对的表的结构是一致的
2. 比对过程中源端和 目标端 表上的数据都是静态的,没有任何DML修改

 

方式1:

假设你所要进行数据比对的数据库其中有一个版本为11g且该表上有相应的主键索引(primary key index)或者唯一非空索引(unique key &not null)的话,那么恭喜你! 你可以借助11g 新引入的专门做数据对比的PL/SQL Package dbms_comparison来实现数据校验的目的,如以下演示:

 

 

Source 源端版本为11gR2 :

conn maclean/maclean
SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmaclean.com  & www.askmaclean.com

 drop table test1;
 create table test1 tablespace users as select object_id t1,object_name t2 from dba_objects where object_id is not null;
 alter table test1 add primary key(t1);
 exec dbms_stats.gather_table_stats('MACLEAN','TEST1',cascade=>TRUE);

create database link maclean connect to maclean identified by maclean using 'G10R21';
Database link created.

 

以上源端数据库版本为11.2.0.3 , 源表结构为test1(t1 number primary key,t2 varchar2(128),透过dblink链接到版本为10.2.0.1的目标端

 

conn maclean/maclean

SQL> select * from v$version

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

create table test2 tablespace users as select object_id t1,object_name t2
from dba_objects where object_id is not null;
alter table test2 add primary key(t1);
exec dbms_stats.gather_table_stats('MACLEAN','TEST2',cascade=>TRUE);

 

目标端版本为10.2.0.1 , 表结构为test2(t1 number primary key,t2 varchar2(128))。

注意这里2张表上均必须有相同的主键索引或者伪主键索引(pseudoprimary key伪主键要求是唯一键且所有的成员列均是非空NOT NULL)。

实际创建comparison对象,并实施校验:

 

begin
  dbms_comparison.create_comparison(comparison_name    => 'MACLEAN_TEST_COM',
                                    schema_name        => 'MACLEAN',
                                    object_name        => 'TEST1',
                                    dblink_name        => 'MACLEAN',
                                    remote_schema_name => 'MACLEAN',
                                    remote_object_name => 'TEST2',
                                    scan_mode          => dbms_comparison.CMP_SCAN_MODE_FULL);
end;

PL/SQL procedure successfully completed.

SQL> set linesize 80 pagesize 1400

SQL> select * from user_comparison where comparison_name='MACLEAN_TEST_COM';

COMPARISON_NAME                COMPA SCHEMA_NAME
------------------------------ ----- ------------------------------
OBJECT_NAME                    OBJECT_TYPE       REMOTE_SCHEMA_NAME
------------------------------ ----------------- ------------------------------
REMOTE_OBJECT_NAME             REMOTE_OBJECT_TYP
------------------------------ -----------------
DBLINK_NAME
--------------------------------------------------------------------------------
SCAN_MODE SCAN_PERCENT
--------- ------------
CYCLIC_INDEX_VALUE
--------------------------------------------------------------------------------
NULL_VALUE
--------------------------------------------------------------------------------
LOCAL_CONVERGE_TAG
--------------------------------------------------------------------------------
REMOTE_CONVERGE_TAG
--------------------------------------------------------------------------------
MAX_NUM_BUCKETS MIN_ROWS_IN_BUCKET
--------------- ------------------
LAST_UPDATE_TIME
---------------------------------------------------------------------------
MACLEAN_TEST_COM               TABLE MACLEAN
TEST1                          TABLE             MACLEAN
TEST2                          TABLE
MACLEAN
FULL

ORA$STREAMS$NV

           1000              10000
20-DEC-11 01.08.44.562092 PM

 

利用dbms_comparison.create_comparison创建comparison后,新建的comparison会出现在user_comparison视图中;

以上我们完成了comparison的创建,但实际的校验仍未发生我们利用10046事件监控这个数据对比过程:

 

conn maclean/maclean
set timing on;
alter system flush shared_pool;

alter session set events '10046 trace name context forever,level 8';

set serveroutput on

DECLARE
  retval dbms_comparison.comparison_type;
BEGIN
  IF dbms_comparison.compare('MACLEAN_TEST_COM', retval, perform_row_dif => TRUE) THEN
    dbms_output.put_line('No Differences');
  ELSE
    dbms_output.put_line('Differences Found');
  END IF;
END;
/

Differences Found           =====> 返回结果为Differences Found,说明数据存在差异并不一致

PL/SQL procedure successfully completed.

Elapsed: 00:00:10.87

===========================10046 tkprof result =========================

SELECT MIN("T1"), MAX("T1")
FROM
 "MACLEAN"."TEST1"

SELECT MIN("T1"), MAX("T1")
FROM
 "MACLEAN"."TEST2"@MACLEAN

SELECT COUNT(1)
FROM
 "MACLEAN"."TEST1" s WHERE ("T1" >= :scan_min AND "T1" <= :scan_max )

SELECT COUNT(1)
FROM
 "MACLEAN"."TEST2"@MACLEAN s WHERE ("T1" >= :scan_min AND "T1" <= :scan_max )

SELECT q.wb1, min(q."T1") min_range1, max(q."T1") max_range1, count(*)
  num_rows, sum(q.s_hash) sum_range_hash
FROM
 (SELECT /*+ FULL(s) */  width_bucket(s."T1", :scan_min1, :scan_max_inc1,
  :num_buckets) wb1, s."T1", ora_hash(NVL(to_char(s."T1"), 'ORA$STREAMS$NV'),
  4294967295, ora_hash(NVL((s."T2"), 'ORA$STREAMS$NV'), 4294967295, 0))
  s_hash FROM "MACLEAN"."TEST1" s WHERE (s."T1">=:scan_min1 AND s."T1"<=
  :scan_max1) ) q GROUP BY q.wb1 ORDER BY q.wb1

SELECT /*+ REMOTE_MAPPED */ q.wb1, min(q."T1") min_range1, max(q."T1")
  max_range1, count(*) num_rows, sum(q.s_hash) sum_range_hash
FROM
 (SELECT /*+ FULL(s) REMOTE_MAPPED */  width_bucket(s."T1", :scan_min1,
  :scan_max_inc1, :num_buckets) wb1, s."T1", ora_hash(NVL(to_char(s."T1"),
  'ORA$STREAMS$NV'), 4294967295, ora_hash(NVL((s."T2"), 'ORA$STREAMS$NV'),
  4294967295, 0)) s_hash FROM "MACLEAN"."TEST2"@MACLEAN s WHERE (s."T1">=
  :scan_min1 AND s."T1"<=:scan_max1) ) q GROUP BY q.wb1 ORDER BY q.wb1

SELECT /*+ FULL(P) +*/ * FROM "MACLEAN"."TEST2" P

SELECT /*+ FULL ("A1") */
 WIDTH_BUCKET("A1"."T1", :SCAN_MIN1, :SCAN_MAX_INC1, :NUM_BUCKETS),
 MIN("A1"."T1"),
 MAX("A1"."T1"),
 COUNT(*),
 SUM(ORA_HASH(NVL(TO_CHAR("A1"."T1"), 'ORA$STREAMS$NV'),
              4294967295,
              ORA_HASH(NVL("A1"."T2", 'ORA$STREAMS$NV'), 4294967295, 0)))
  FROM "MACLEAN"."TEST2" "A1"
 WHERE "A1"."T1" >= :SCAN_MIN1
   AND "A1"."T1" <= :SCAN_MAX1
 GROUP BY WIDTH_BUCKET("A1"."T1", :SCAN_MIN1, :SCAN_MAX_INC1, :NUM_BUCKETS)
 ORDER BY WIDTH_BUCKET("A1"."T1", :SCAN_MIN1, :SCAN_MAX_INC1, :NUM_BUCKETS)

SELECT ROWID, "T1", "T2"
  FROM "MACLEAN"."TEST2" "R"
 WHERE "T1" >= :1
   AND "T1" <= :2

--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |   126 |  3528 |     4   (0)| 00:00:01 |
|*  1 |  FILTER                      |             |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| TEST2       |   126 |  3528 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | SYS_C006255 |   227 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_NUMBER(:1)<=TO_NUMBER(:2))
   3 - access("T1">=TO_NUMBER(:1) AND "T1"<=TO_NUMBER(:2))

SELECT ll.l_rowid, rr.r_rowid, NVL(ll."T1", rr."T1") idx_val
FROM
 (SELECT l.rowid l_rowid, l."T1", ora_hash(NVL(to_char(l."T1"),
  'ORA$STREAMS$NV'), 4294967295, ora_hash(NVL((l."T2"), 'ORA$STREAMS$NV'),
  4294967295, 0)) l_hash  FROM "MACLEAN"."TEST1" l WHERE l."T1">=:scan_min1
  AND l."T1"<=:scan_max1 ) ll FULL OUTER JOIN (SELECT /*+ NO_MERGE
  REMOTE_MAPPED */ r.rowid r_rowid, r."T1", ora_hash(NVL(to_char(r."T1"),
  'ORA$STREAMS$NV'), 4294967295, ora_hash(NVL((r."T2"), 'ORA$STREAMS$NV'),
  4294967295, 0)) r_hash FROM "MACLEAN"."TEST2"@MACLEAN r WHERE r."T1">=
  :scan_min1  AND r."T1"<=:scan_max1 ) rr ON  ll."T1"=rr."T1" WHERE ll.l_hash
  IS NULL OR rr.r_hash IS NULL OR ll.l_hash <> rr.r_hash

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name         | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |              |   190 |   754K|     9  (12)| 00:00:01 |        |      |
|*  1 |  VIEW                           | VW_FOJ_0     |   190 |   754K|     9  (12)| 00:00:01 |        |      |
|*  2 |   HASH JOIN FULL OUTER          |              |   190 |   754K|     9  (12)| 00:00:01 |        |      |
|   3 |    VIEW                         |              |   190 |  7220 |     4   (0)| 00:00:01 |        |      |
|*  4 |     FILTER                      |              |       |       |            |          |        |      |
|   5 |      TABLE ACCESS BY INDEX ROWID| TEST1        |   190 |  5510 |     4   (0)| 00:00:01 |        |      |
|*  6 |       INDEX RANGE SCAN          | SYS_C0013098 |   341 |       |     2   (0)| 00:00:01 |        |      |
|   7 |    VIEW                         |              |   126 |   495K|     4   (0)| 00:00:01 |        |      |
|   8 |     REMOTE                      | TEST2        |   126 |  3528 |     4   (0)| 00:00:01 | MACLE~ | R->S |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("LL"."L_HASH" IS NULL OR "RR"."R_HASH" IS NULL OR "LL"."L_HASH"<>"RR"."R_HASH")
   2 - access("LL"."T1"="RR"."T1")
   4 - filter(TO_NUMBER(:SCAN_MIN1)<=TO_NUMBER(:SCAN_MAX1))
   6 - access("L"."T1">=TO_NUMBER(:SCAN_MIN1) AND "L"."T1"<=TO_NUMBER(:SCAN_MAX1))

Remote SQL Information (identified by operation id):
----------------------------------------------------

   8 - SELECT ROWID,"T1","T2" FROM "MACLEAN"."TEST2" "R" WHERE "T1">=:1 AND "T1"<=:2 (accessing
       'MACLEAN' )

 

可以看到以上过程中虽然没有避免对TEST1、TEST2表的全表扫描(FULL TABLE SCAN), 但是好在实际参与HASH JOIN FULL OUTER 的仅是访问索引后获得的少量数据,所以效率还是挺高的。

 

此外可以通过user_comparison_row_dif了解实际那些row存在差异,如:

 

SQL> set linesize 80 pagesize 1400
SQL> select *
  2    from user_comparison_row_dif
  3   where comparison_name = 'MACLEAN_TEST_COM'
  4     and rownum < 2;

COMPARISON_NAME                   SCAN_ID LOCAL_ROWID        REMOTE_ROWID
------------------------------ ---------- ------------------ ------------------
INDEX_VALUE
--------------------------------------------------------------------------------
STA LAST_UPDATE_TIME
--- ---------------------------------------------------------------------------
MACLEAN_TEST_COM                       42 AAATWGAAEAAANBrAAB AAANJrAAEAAB8AMAAd
46
DIF 20-DEC-11 01.18.08.917257 PM

 

以上利用dbms_comparison包完成了一次简单的数据比对,该方法适用于11g以上版本且要求表上有主键索引或非空唯一索引, 且不支持以下数据类型字段的比对

  •     LONG
  •     LONG RAW
  •     ROWID
  •     UROWID
  •     CLOB
  •     NCLOB
  •     BLOB
  •     BFILE
  •     User-defined types (including object types, REFs, varrays, and nested tables)
  •     Oracle-supplied types (including any types, XML types, spatial types, and media types)

 

 

 

若要比对存有以上类型字段的表,那么需要在create_comparison时指定column_list参数排除掉这些类型的字段。

方法1 dbms_comparison的优势在于可以提供详细的比较信息,且在有适当索引的前提下效率较高。
缺点在于有数据库版本的要求(at least 11gR1), 且也不支持LONG 、CLOB等字段的比较。

 

方式2:

利用minus Query 对比数据

这可以说是操作上最简单的一种方法,如:

 

select * from test1 minus select * from test2@maclean;

-----------------------------------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Inst   |IN-OUT|
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       | 75816 |  3527K|       |  1163  (40)| 00:00:14 |        |      |
|   1 |  MINUS              |       |       |       |       |            |          |        |      |
|   2 |   SORT UNIQUE       |       | 75816 |  2147K|  2984K|   710   (1)| 00:00:09 |        |      |
|   3 |    TABLE ACCESS FULL| TEST1 | 75816 |  2147K|       |   104   (1)| 00:00:02 |        |      |
|   4 |   SORT UNIQUE       |       | 50467 |  1379K|  1800K|   453   (1)| 00:00:06 |        |      |
|   5 |    REMOTE           | TEST2 | 50467 |  1379K|       |    56   (0)| 00:00:01 | MACLE~ | R->S |
-----------------------------------------------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------

   5 - SELECT "T1","T2" FROM "TEST2" "TEST2" (accessing 'MACLEAN' )

Select *
  from (select 'MACLEAN.TEST1' "Row Source", a.*
          from (select /*+ FULL(Tbl1)  */
                 T1, T2
                  from MACLEAN.TEST1 Tbl1
                minus
                select /*+ FULL(Tbl2)  */
                 T1, T2
                  from MACLEAN.TEST2@"MACLEAN" Tbl2) A
        union all
        select 'MACLEAN.TEST2@"MACLEAN"', b.*
          from (select /*+ FULL(Tbl2)  */
                 T1, T2
                  from MACLEAN.TEST2@"MACLEAN" Tbl2
                minus
                select /*+ FULL(Tbl1)  */
                 T1, T2
                  from MACLEAN.TEST1 Tbl1) B) Order by 1;

 

MINUS Clause会导致2张表均在本地被全表扫描(TABLE FULL SCAN),且要求发生SORT排序。 若所对比的表上有大量的数据,那么排序的代价将会是非常大的, 因此这种方法的效率不高。

方式2 MINUS的优点在于操作简便,特别适合于小表之间的数据检验。
缺点在于 由于SORT排序可能导致在大数据量的情况下效率很低, 且同样不支持LOB 和 LONG 这样的大对象。

 

方式3:

使用not exists子句,如:

 

select *
  from test1 a
 where not exists (select 1
          from test2 b
         where a.t1 = b.t1
           and a.t2 = b.t2);

no rows selected

Elapsed: 00:00:00.06

------------------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       | 75816 |  7996K|       |   691   (1)| 00:00:09 |
|*  1 |  HASH JOIN ANTI    |       | 75816 |  7996K|  3040K|   691   (1)| 00:00:09 |
|   2 |   TABLE ACCESS FULL| TEST1 | 75816 |  2147K|       |   104   (1)| 00:00:02 |
|   3 |   TABLE ACCESS FULL| TEST2 | 77512 |  5979K|       |   104   (1)| 00:00:02 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."T1"="B"."T1" AND "A"."T2"="B"."T2")

 

 

照理说在数据量较大的情况下not exists使用的HASH JOIN ANTI是在性能上是优于MINUS操作的, 但是当所要比较的表身处不同的2个数据库(distributed query)时将无法使用HASH JOIN ANTI,而会使用FILTER OPERATION这种效率极低的操作:

 

 

select *
  from test1 a
 where not exists (select 1
          from test2@maclean b
         where a.t1 = b.t1
           and a.t2 = b.t2)
no rows selected

Elapsed: 00:01:05.76

 --------------------------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       | 75816 |  2147K|   147K  (1)| 00:29:31 |        |      |
|*  1 |  FILTER            |       |       |       |            |          |        |      |
|   2 |   TABLE ACCESS FULL| TEST1 | 75816 |  2147K|   104   (1)| 00:00:02 |        |      |
|   3 |   REMOTE           | TEST2 |     1 |    29 |     2   (0)| 00:00:01 | MACLE~ | R->S |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( NOT EXISTS (SELECT 0 FROM  "B" WHERE "B"."T1"=:B1 AND "B"."T2"=:B2))

Remote SQL Information (identified by operation id):
----------------------------------------------------

   3 - SELECT "T1","T2" FROM "TEST2" "B" WHERE "T1"=:1 AND "T2"=:2 (accessing
       'MACLEAN' )

 

可以从以上执行计划看到FILTER 操作是十分昂贵的。

 

补充:

有网友反映可以通过增加 unnest hint 让CBO优化器在远程子查询有效的情况下整体考虑整个查询块,这样可以让执行计划用上HASH JOIN RIGHT ANTI, 这是我一开始没有考虑到的。

 

 

select *
  from test1 a
 where not exists (select /*+ unnset */
         1
          from test2@maclean b
         where a.t1 = b.t1
           and a.t2 = b.t2);

           
           

PLAN_TABLE_OUTPUT
------------------------------------------
Plan hash value: 1776635653

------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Inst   |IN-OUT|
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       | 79815 |    12M|       |   594   (1)| 00:00:08 |        |      |
|*  1 |  HASH JOIN RIGHT ANTI|       | 79815 |    12M|  1816K|   594   (1)| 00:00:08 |        |      |
|   2 |   REMOTE             | TEST2 | 20420 |  1575K|       |    56   (0)| 00:00:01 | MACLE~ | R->S |
|   3 |   TABLE ACCESS FULL  | TEST1 | 79815 |  6157K|       |   104   (1)| 00:00:02 |        |      |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."T1"="B"."T1" AND "A"."T2"="B"."T2")

Remote SQL Information (identified by operation id):
----------------------------------------------------

   2 - SELECT "T1","T2" FROM "TEST2" "B" (accessing 'MACLEAN' )

 

 

在此基础上加入ordered hint 可以让执行计划使用HASH JOIN ANTI

 

   
 select /*+ ordered */ *
  from test1 a
 where not exists (select /*+ unnset */
         1
          from test2@maclean b
         where a.t1 = b.t1
           and a.t2 = b.t2);  

PLAN_TABLE_OUTPUT
--------------------------------------------------
Plan hash value: 3089912131

----------------------------------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Inst   |IN-OUT|
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       | 79815 |    12M|       |   594   (1)| 00:00:08 |        |      |
|*  1 |  HASH JOIN ANTI    |       | 79815 |    12M|  7096K|   594   (1)| 00:00:08 |        |      |
|   2 |   TABLE ACCESS FULL| TEST1 | 79815 |  6157K|       |   104   (1)| 00:00:02 |        |      |
|   3 |   REMOTE           | TEST2 | 20420 |  1575K|       |    56   (0)| 00:00:01 | MACLE~ | R->S |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."T1"="B"."T1" AND "A"."T2"="B"."T2")

Remote SQL Information (identified by operation id):
----------------------------------------------------

   3 - SELECT "T1","T2" FROM "TEST2" "B" (accessing 'MACLEAN' )

 

方式3 的优点在于操作简便, 且当需要对比的表位于同一数据库时效率要比MINUS方式高,但如果是distributed query分布式查询则效率可能会因FILTER操作而急剧下降,这时候需要我们手动添加unnest这样的SQL提示,以保证执行计划使用HASH JOIN ANTI操作,这样能够保证not exists方式的性能。not exists同样不支持CLOB等大对象。

 

方式4:

Toad、PL/SQL Developer等图形化工具都提供了compare table data的功能, 这里我们以Toad工具为例,介绍如何使用该工具校验数据:

 

打开Toad 链接数据库-> 点击Database-> Compare -> Data

 

 

分别在Source 1和Source 2对话框中输入源表和目标表的信息

因为Toad的底层实际上使用了MINUS操作,所以提高SORT_AREA_SIZE有助于提高compare的性能,若使用AUTO PGA则可以不设置。

 

选择所要比较的列

 

 

 

首先可以比较2张表的行数,点击Execute计算count

 

 

使用MINUS 找出其中一张表上有,而另一张没有的行

 

使用MINUS 找出所有的差别

 

 

Toad的compare data功能是基于MINUS实现的,所以效率上并没有优势。但是通过图形界面省去了写SQL语句的麻烦。这种方法同样不支持LOB、LONG等对象。

 

方式5:

这是一种别出心裁的做法。 将一行数据的上所有字段合并起来,并使用dbms_utility.get_hash_value对合并后的中间值取hash value,再将所有这些从各行所获得的hash值sum累加, 若2表的hash累加值相等则判定2表的数据一致。

 

简单来说,如下面这样:

 

create table hash_one as select object_id t1,object_name t2 from dba_objects;

select dbms_utility.get_hash_value(t1||t2,0,power(2,30)) from hash_one where rownum <3;

DBMS_UTILITY.GET_HASH_VALUE(T1||T2,0,POWER(2,30))
-------------------------------------------------
                                         89209477
                                        757190129

select sum(dbms_utility.get_hash_value(t1||t2,0,power(2,30))) from hash_one;

SUM(DBMS_UTILITY.GET_HASH_VALU
------------------------------
                40683165992756

select sum(dbms_utility.get_hash_value(object_id||object_name,0,power(2,30))) from dba_objects;

SUM(DBMS_UTILITY.GET_HASH_VALU
------------------------------
                40683165992756

 

 

对于列较多的表,手动去构造所有字段合并可能会比较麻烦,利用以下SQL可以快速构造出我们所需要的语句:

 

放到PL/SQL Developer等工具中运行,在sqlplus 中可能因ORA-00923: FROM keyword not found where expected出错

select 'select sum(dbms_utility.get_hash_value('||column_name_path||',0,power(2,30)) ) from '||owner||'.'||table_name||';'  from (select owner,table_name,column_name_path,row_number() over(partition by table_name order by table_name,curr_level desc) column_name_path_rank from (select owner,table_name,column_name,rank,level as curr_level,ltrim(sys_connect_by_path(column_name,'||''|''||'),'||''|''||') column_name_path from (select owner,table_name,column_name,row_number() over(partition by table_name order by table_name,column_name) rank from dba_tab_columns where owner=UPPER('&OWNER')  and table_name=UPPER('&TABNAME')  order by table_name,column_name) connect by table_name = prior table_name and rank-1 = prior rank)) where column_name_path_rank=1;

 

使用示范:

 

SQL> @get_hash_col
Enter value for owner: SYS
Enter value for tabname: TAB$

'SELECTSUM(DBMS_UTILITY.GET_HASH_VALUE('||COLUMN_NAME_PATH||',0,POWER(2,30)))FROM
--------------------------------------------------------------------------------
select sum(dbms_utility.get_hash_value(ANALYZETIME||'|'||AUDIT$||'|'||AVGRLN||'|
'||AVGSPC||'|'||AVGSPC_FLB||'|'||BLKCNT||'|'||BLOCK#||'|'||BOBJ#||'|'||CHNCNT||'
|'||CLUCOLS||'|'||COLS||'|'||DATAOBJ#||'|'||DEGREE||'|'||EMPCNT||'|'||FILE#||'|'
||FLAGS||'|'||FLBCNT||'|'||INITRANS||'|'||INSTANCES||'|'||INTCOLS||'|'||KERNELCO
LS||'|'||MAXTRANS||'|'||OBJ#||'|'||PCTFREE$||'|'||PCTUSED$||'|'||PROPERTY||'|'||
ROWCNT||'|'||SAMPLESIZE||'|'||SPARE1||'|'||SPARE2||'|'||SPARE3||'|'||SPARE4||'|'
||SPARE5||'|'||SPARE6||'|'||TAB#||'|'||TRIGFLAG||'|'||TS#,0,1073741824) ) from S
YS.TAB$;

利用以上生成的SQL 计算表的sum(hash)值

select sum(dbms_utility.get_hash_value(ANALYZETIME || '|' || AUDIT$ || '|' ||
                                       AVGRLN || '|' || AVGSPC || '|' ||
                                       AVGSPC_FLB || '|' || BLKCNT || '|' ||
                                       BLOCK# || '|' || BOBJ# || '|' ||
                                       CHNCNT || '|' || CLUCOLS || '|' || COLS || '|' ||
                                       DATAOBJ# || '|' || DEGREE || '|' ||
                                       EMPCNT || '|' || FILE# || '|' ||
                                       FLAGS || '|' || FLBCNT || '|' ||
                                       INITRANS || '|' || INSTANCES || '|' ||
                                       INTCOLS || '|' || KERNELCOLS || '|' ||
                                       MAXTRANS || '|' || OBJ# || '|' ||
                                       PCTFREE$ || '|' || PCTUSED$ || '|' ||
                                       PROPERTY || '|' || ROWCNT || '|' ||
                                       SAMPLESIZE || '|' || SPARE1 || '|' ||
                                       SPARE2 || '|' || SPARE3 || '|' ||
                                       SPARE4 || '|' || SPARE5 || '|' ||
                                       SPARE6 || '|' || TAB# || '|' ||
                                       TRIGFLAG || '|' || TS#,
                                       0,
                                       1073741824))
  from SYS.TAB$;

SUM(DBMS_UTILITY.GET_HASH_VALU
------------------------------
                 1646389632463

 

方式5 利用累加整行数据的hash来判定表上数据是否一致, 仅需要对2张表做全表扫描,效率上是这几种方法中最高的, 且能保证较高的准确率。

 

但是该hash方式存在以下几点不足:

1. 所有字段合并的整行数据可能超过4000字节,这时会出现ORA-1498错误。换而言之使用这种方式的前提是表中任一行的行长不能超过4000 bytes,当然常规情况下很少会有一行数据超过4000 bytes,也可以通过dba_tables.avg_row_len平均行长的统计信息来判定,若avg_row_len<<4000 那么一般不会有溢出的问题。

2. 该hash 方式仅能帮助判断 数据是否一致, 而无法提供更多有用的,例如是哪些行不一致等细节信息

3. 同样的该hash方式对于lob、long字段也无能为力

ORA-00600:[1112]内部错误&ROW CACHE ENQUEUE LOCK一例

一套AIX 上的9.2.0.6 2节点RAC系统出现了ORA-00600: internal error code, arguments: [1112], [], [], [], [], [], [], []内部错误伴随有ROW CACHE ENQUEUE LOCK并引发clusterware split-brain resolution,详细的日志及ass.awk输出如下:

 

ALERT LOG
=============
Sun Jun 19 09:06:24 2011
>>> WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK! pid=24
Sun Jun 19 09:06:29 2011
Errors in file /s01/admin/prod/udump/prod2_ora_1061088.trc:
ORA-00600: internal error code, arguments: [1112], [], [], [], [], [], [], []
Sun Jun 19 09:06:29 2011
Errors in file /s01/admin/prod/udump/prod2_ora_1061088.trc:
ORA-00600: internal error code, arguments: [1112], [], [], [], [], [], [], []
Sun Jun 19 09:06:30 2011
Errors in file /s01/admin/prod/udump/prod2_ora_1061088.trc:
ORA-00600: internal error code, arguments: [1112], [], [], [], [], [], [], []
Sun Jun 19 09:06:30 2011
Errors in file /s01/admin/prod/udump/prod2_ora_1061088.trc:
ORA-00600: internal error code, arguments: [1112], [], [], [], [], [], [], []
Sun Jun 19 09:06:31 2011
Errors in file /s01/admin/prod/udump/prod2_ora_1061088.trc:
ORA-00600: internal error code, arguments: [1112], [], [], [], [], [], [], []
Sun Jun 19 09:06:31 2011
Errors in file /s01/admin/prod/udump/prod2_ora_1061088.trc:
ORA-00600: internal error code, arguments: [1112], [], [], [], [], [], [], []
Sun Jun 19 09:08:06 2011
Waiting for clusterware split-brain resolution
Sun Jun 19 09:13:17 2011
ALTER SYSTEM SET event='10511 trace name context forever, level 1' SCOPE=SPFILE SID='*';
Sun Jun 19 09:14:44 2011
Trace dumping is performing id=[cdmp_20110619091444]
Sun Jun 19 09:18:05 2011
Errors in file /s01/admin/prod/bdump/prod2_lmon_422072.trc:
ORA-29740: evicted by member 1, group incarnation 9
Sun Jun 19 09:18:05 2011
LMON: terminating instance due to error 29740
Sun Jun 19 09:18:05 2011
Errors in file /s01/admin/prod/bdump/prod2_lms2_725312.trc:
ORA-29740: evicted by member , group incarnation
Sun Jun 19 09:18:05 2011
Errors in file /s01/admin/prod/bdump/prod2_lms7_1008288.trc:
ORA-29740: evicted by member , group incarnation
Instance terminated by LMON, pid = 422072
Sun Jun 19 09:21:16 2011
Starting ORACLE instance (normal)

TRACE FILE
==============
prod2_ora_1061088.trc
Oracle9i Enterprise Edition Release 9.2.0.6.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.6.0 - Production
ORACLE_HOME = /oracle/app/oracle/product/9.2
System name: AIX
Node name: tprod2
Release: 3
Version: 5
Machine: 00CE5E834C00
Instance name: prod2

*** 2011-06-19 09:06:28.931
================================
PROCESS DUMP FROM HANG ANALYZER:
================================
Current SQL statement for this session:
SELECT formatid, globalid, branchid FROM SYS.DBA_PENDING_TRANSACTIONS ORDER BY formatid, globalid, branchid
*** 2011-06-19 09:06:28.931
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedms+00dc bl ksedst 102905E64 ?
ksdxfdmp+0200 bl _ptrgl
ksdxcb+02d8 bl _ptrgl
sspuser+0084 bl 01FD7CA8
000044C0 ? 00000000
snttread+0028 bl 00009CFC
nttrd+0118 bl snttread FFFFFFFFFFFBBB3 ?
FFFFFFFFFFFBBA8 ?
FFFFFFFFFFFB2C0 ?
nsprecv+0984 bl _ptrgl
nsrdr+01d0 bl nsprecv 000000000 ? 110299C00 ?
000000000 ?
nsdo+1818 bl nsrdr 000000000 ? 000000000 ?
nioqrc+05c4 bl nsdo 1102A8098 ? 5500000055 ?
1102DFD20 ? 1102A8200 ?
FFFFFFFFFFFC4E0 ? 000000000 ?
300000003 ?
opikndf2+06a8 bl _ptrgl
opitsk+05fc bl _ptrgl
opiino+0798 bl opitsk 000000000 ? 000000000 ?
opiodr+08e8 bl _ptrgl
opidrv+032c bl opiodr 3C00000018 ? 4101F62A0 ?
FFFFFFFFFFFF8C0 ? 0A057DC60 ?
sou2o+0028 bl opidrv 3C0C000000 ? 4A0644B50 ?
FFFFFFFFFFFF8C0 ?
main+0138 bl 01FD7B5C
__start+0098 bl main 000000000 ? 000000000 ?

Repeat 2 times
----- End of Call Stack Trace -----
*** 2011-06-19 09:06:29.111
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedms+00dc bl ksedst 102905E64 ?
ksdxfdmp+0200 bl _ptrgl
ksdxcb+02d8 bl _ptrgl
sspuser+0084 bl 01FD7CA8
000044C0 ? 00000000
snttread+0028 bl 00009CFC
nttrd+0118 bl snttread FFFFFFFFFFFBBB3 ?
FFFFFFFFFFFBBA8 ?
FFFFFFFFFFFB2C0 ?
nsprecv+0984 bl _ptrgl
nsrdr+01d0 bl nsprecv 000000000 ? 110299C00 ?
000000000 ?
nsdo+1818 bl nsrdr 000000000 ? 000000000 ?
nioqrc+05c4 bl nsdo 1102A8098 ? 5500000055 ?
1102DFD20 ? 1102A8200 ?
FFFFFFFFFFFC4E0 ? 000000000 ?
300000003 ?
opikndf2+06a8 bl _ptrgl
opitsk+05fc bl _ptrgl
opiino+0798 bl opitsk 000000000 ? 000000000 ?
opiodr+08e8 bl _ptrgl
opidrv+032c bl opiodr 3C00000018 ? 4101F62A0 ?
FFFFFFFFFFFF8C0 ? 0A057DC60 ?
sou2o+0028 bl opidrv 3C0C000000 ? 4A0644B50 ?
FFFFFFFFFFFF8C0 ?
main+0138 bl 01FD7B5C
__start+0098 bl main 000000000 ? 000000000 ?
----- End of Call Stack Trace -----
*** 2011-06-19 09:06:29.133
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedms+00dc bl ksedst 102905E64 ?
ksdxfdmp+0200 bl _ptrgl
ksdxcb+02d8 bl _ptrgl
sspuser+0084 bl 01FD7CA8
000044C0 ? 00000000
snttread+0028 bl 00009CFC
nttrd+0118 bl snttread FFFFFFFFFFFBBB3 ?
FFFFFFFFFFFBBA8 ?
FFFFFFFFFFFB2C0 ?
nsprecv+0984 bl _ptrgl
nsrdr+01d0 bl nsprecv 000000000 ? 110299C00 ?
000000000 ?
nsdo+1818 bl nsrdr 000000000 ? 000000000 ?
nioqrc+05c4 bl nsdo 1102A8098 ? 5500000055 ?
1102DFD20 ? 1102A8200 ?
FFFFFFFFFFFC4E0 ? 000000000 ?
300000003 ?
opikndf2+06a8 bl _ptrgl
opitsk+05fc bl _ptrgl
opiino+0798 bl opitsk 000000000 ? 000000000 ?
opiodr+08e8 bl _ptrgl
opidrv+032c bl opiodr 3C00000018 ? 4101F62A0 ?
FFFFFFFFFFFF8C0 ? 0A057DC60 ?
sou2o+0028 bl opidrv 3C0C000000 ? 4A0644B50 ?
FFFFFFFFFFFF8C0 ?
main+0138 bl 01FD7B5C
__start+0098 bl main 000000000 ? 000000000 ?
----- End of Call Stack Trace -----
*** 2011-06-19 09:06:29.162
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedms+00dc bl ksedst 102905E64 ?
ksdxfdmp+0200 bl _ptrgl
ksdxcb+02d8 bl _ptrgl
sspuser+0084 bl 01FD7CA8
000044C0 ? 00000000
snttread+0028 bl 00009CFC
nttrd+0118 bl snttread FFFFFFFFFFFBBB3 ?
FFFFFFFFFFFBBA8 ?
FFFFFFFFFFFB2C0 ?
nsprecv+0984 bl _ptrgl
nsrdr+01d0 bl nsprecv 000000000 ? 110299C00 ?
000000000 ?
nsdo+1818 bl nsrdr 000000000 ? 000000000 ?
nioqrc+05c4 bl nsdo 1102A8098 ? 5500000055 ?
1102DFD20 ? 1102A8200 ?
FFFFFFFFFFFC4E0 ? 000000000 ?
300000003 ?
opikndf2+06a8 bl _ptrgl
opitsk+05fc bl _ptrgl
opiino+0798 bl opitsk 000000000 ? 000000000 ?
opiodr+08e8 bl _ptrgl
opidrv+032c bl opiodr 3C00000018 ? 4101F62A0 ?
FFFFFFFFFFFF8C0 ? 0A057DC60 ?
sou2o+0028 bl opidrv 3C0C000000 ? 4A0644B50 ?
FFFFFFFFFFFF8C0 ?
main+0138 bl 01FD7B5C
__start+0098 bl main 000000000 ? 000000000 ?
----- End of Call Stack Trace -----
*** 2011-06-19 09:06:29.175
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedms+00dc bl ksedst 102905E64 ?
ksdxfdmp+0200 bl _ptrgl
ksdxcb+02d8 bl _ptrgl
sspuser+0084 bl 01FD7CA8
000044C0 ? 00000000
snttread+0028 bl 00009CFC
nttrd+0118 bl snttread FFFFFFFFFFFBBB3 ?
FFFFFFFFFFFBBA8 ?
FFFFFFFFFFFB2C0 ?
nsprecv+0984 bl _ptrgl
nsrdr+01d0 bl nsprecv 000000000 ? 110299C00 ?
000000000 ?
nsdo+1818 bl nsrdr 000000000 ? 000000000 ?
nioqrc+05c4 bl nsdo 1102A8098 ? 5500000055 ?
1102DFD20 ? 1102A8200 ?
FFFFFFFFFFFC4E0 ? 000000000 ?
300000003 ?
opikndf2+06a8 bl _ptrgl
opitsk+05fc bl _ptrgl
opiino+0798 bl opitsk 000000000 ? 000000000 ?
opiodr+08e8 bl _ptrgl
opidrv+032c bl opiodr 3C00000018 ? 4101F62A0 ?
FFFFFFFFFFFF8C0 ? 0A057DC60 ?
sou2o+0028 bl opidrv 3C0C000000 ? 4A0644B50 ?
FFFFFFFFFFFF8C0 ?
main+0138 bl 01FD7B5C
__start+0098 bl main 000000000 ? 000000000 ?
----- End of Call Stack Trace -----
*** 2011-06-19 09:06:29.192
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedms+00dc bl ksedst 102905E64 ?
ksdxfdmp+0200 bl _ptrgl
ksdxcb+02d8 bl _ptrgl
sspuser+0084 bl 01FD7CA8
000044C0 ? 00000000
snttread+0028 bl 00009CFC
nttrd+0118 bl snttread FFFFFFFFFFFBBB3 ?
FFFFFFFFFFFBBA8 ?
FFFFFFFFFFFB2C0 ?
nsprecv+0984 bl _ptrgl
nsrdr+01d0 bl nsprecv 000000000 ? 110299C00 ?
000000000 ?
nsdo+1818 bl nsrdr 000000000 ? 000000000 ?
nioqrc+05c4 bl nsdo 1102A8098 ? 5500000055 ?
1102DFD20 ? 1102A8200 ?
FFFFFFFFFFFC4E0 ? 000000000 ?
300000003 ?
opikndf2+06a8 bl _ptrgl
opitsk+05fc bl _ptrgl
opiino+0798 bl opitsk 000000000 ? 000000000 ?
opiodr+08e8 bl _ptrgl
opidrv+032c bl opiodr 3C00000018 ? 4101F62A0 ?
FFFFFFFFFFFF8C0 ? 0A057DC60 ?
sou2o+0028 bl opidrv 3C0C000000 ? 4A0644B50 ?
FFFFFFFFFFFF8C0 ?
main+0138 bl 01FD7B5C
__start+0098 bl main 000000000 ? 000000000 ?
----- End of Call Stack Trace -----
Files currently opened by this process:
===================================================
PROCESS STATE
-------------
Process global information:
process: 700000676099520, call: 0, xact: 0, curses: 0, usrses: 700000673decd98
----------------------------------------
SO: 700000676099520, type: 2, owner: 0, flag: INIT/-/-/0x00
(process) Oracle pid=224, calls cur/top: 0/7000006c2ca3df8, flag: (0) -
int error: 0, call error: 0, sess error: 0, txn error 0
(post info) last post received: 0 0 50
last post received-location: kcbzww
last process to post me: 700000676119f00 7 0
last post sent: 0 0 21
last post sent-location: ksqrcl
last process posted by me: 700000676428258 1 0
(latch info) wait_event=0 bits=0
Process Group: DEFAULT, pseudo proc: 700000676cc19b0
O/S info: user: oracle, term: UNKNOWN, ospid: 1061088
OSD pid info: Unix process pid: 1061088, image: oracle@tprod2 (TNS V1-V3)
----------------------------------------

END OF PROCESS STATE
******************** Cursor Dump ************************
Current cursor: 2, pgadep: 0
pgactx: 7000006f8bc2d40 ctxcbk: 0 ctxqbc: 0 ctxrws: 700000716aecfd0
Explain plan:
Plan Table
--------
-------------------------------------------------------------------------------------------------------------------------
| Operation | Name | Rows | Bytes | Cost | TQ |IN-OUT| PQ Distrib |Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
| SELECT STATEMENT | | 0 | 0 | 0 | | | | | |
| SORT ORDER BY | | 0 | 0 | 0 | | | | | |
| VIEW | | 0 | 0 | 0 | | | | | |
| SORT UNIQUE | | 0 | 0 | 0 | | | | | |
| UNION-ALL | | 0 | 0 | 0 | | | | | |
| MINUS | | 0 | 0 | 0 | | | | | |
| SORT UNIQUE | | 0 | 0 | 0 | | | | | |
| VIEW | | 0 | 0 | 0 | | | | | |
| FIXED TABLE FULL | X$K2GTE2 | 0 | 0 | 0 | | | | | |
| SORT UNIQUE | | 0 | 0 | 0 | | | | | |
| NESTED LOOPS | | 0 | 0 | 0 | | | | | |
*** 2011-06-19 09:06:29.376
ksedmp: internal or fatal error

ORA-00600: internal error code, arguments: [1112], [], [], [], [], [], [], []
Current SQL statement for this session:
SELECT formatid, globalid, branchid FROM SYS.DBA_PENDING_TRANSACTIONS ORDER BY formatid, globalid, branchid
----- Call Stack Trace -----
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp+0148 bl ksedst 102905C84 ?
ksfdmp+0018 bl 01FD8148
kgeriv+0118 bl _ptrgl
kgesiv+0080 bl kgeriv 07FFFFFFC ? 800000000000000 ?
1000000000000000 ?
1800000000000000 ?
028828228 ?
ksesic0+005c bl kgesiv 7000006BE3BB328 ? 000010550 ?
7000006BE3AADD8 ? 10297D7E8 ?
FFFFFFFFFFF3A20 ?
kssadf_stage+0084 bl ksesic0 45800000458 ? 11007A2F8 ?
000000000 ? 000000000 ?
000000000 ? 70000000001DB80 ?
000000000 ? 700000703BBF040 ?
kqreqa+008c bl kssadf_stage 7000006BE3AADD8 ? 10297D7E8 ?
068A31055 ? 000006BB0 ?
000000001 ?
kqrpre1+06e4 bl kqreqa 000000001 ?
kqrpre+001c bl kqrpre1 BAC3F8E66 ? 000000001 ?
FFFFFFFFFFF4008 ? 1101F9A14 ?
1101F9A14 ? FFFFFFFFFFF4000 ?
07FFFFFFF ? 000000000 ?
kkdlobni+0058 bl kqrpre 100F29A04 ?
4222442400000000 ?
14DFD4B95 ?
166CCD19101F62A0 ?
000000002 ? 000000000 ?
FFFFFFFFFFF40C0 ?
xplObjnToName+0150 bl kkdlobni 9A0000009A ?
FFFFFFFFFFF4444 ? 000000000 ?
000000000 ?
xplPatchName+00a4 bl xplObjnToName 9AFFFF46F0 ?
FFFFFFFFFFF4444 ?
xplMakeRow+0190 bl xplPatchName 000000000 ? 000000000 ?
000000000 ?
xplFetchRow+00b4 bl _ptrgl
xplDumpRws+0604 bl xplFetchRow 1029CFB48 ? FFFFFFFFFFF4770 ?
1101F9A14 ?
curdmp+0164 bl xplDumpRws 102AE2A20 ?
ksedms+012c bl curdmp
ksdxfdmp+0200 bl _ptrgl
ksdxcb+02d8 bl _ptrgl
sspuser+0084 bl 01FD7CA8
000044C0 ? 00000000
snttread+0028 bl 00009CFC
nttrd+0118 bl snttread FFFFFFFFFFFBBB3 ?
FFFFFFFFFFFBBA8 ?
FFFFFFFFFFFB2C0 ?
nsprecv+0984 bl _ptrgl
nsrdr+01d0 bl nsprecv 000000000 ? 110299C00 ?
000000000 ?
nsdo+1818 bl nsrdr 000000000 ? 000000000 ?
nioqrc+05c4 bl

Blockers
~~~~~~~~

Above is a list of all the processes. If they are waiting for a resource
then it will be given in square brackets. Below is a summary of the
waited upon resources, together with the holder of that resource.
Notes:
~~~~~
o A process id of '???' implies that the holder was not found in the
systemstate. (The holder may have released the resource before we
dumped the state object tree of the blocking process).
o Lines with 'Enqueue conversion' below can be ignored *unless*
other sessions are waiting on that resource too. For more, see
http://dlsunuk11.uk.oracle.com/Public/TOOLS/Ass.html#enqcnv)

Resource Holder State
Latch 70000000000a4b8 115: Blocker
Latch 70000000000a4b8 210: Blocker
Latch 70000000000a4b8 270: Blocker
Latch 70000000000a4b8 406: Blocker
Latch 70000000000a4b8 614: Blocker
Latch 70000000000a4b8 626: Blocker
Latch 70000000000a4b8 882: Blocker
Latch 70000000000a4b8 1489: Blocker
Latch 70000000000a4b8 1617: Blocker
Latch 70000000000a4b8 1878: Blocker
Latch 70000000000a4b8 1916: Blocker
Latch 70000000000a4b8 1947: Blocker
Latch 70000000000a4b8 1963: Blocker
Latch 70000000000a4b8 2121: 2121: is waiting for Latch 700000675dae330
Latch 70000000000a4b8 2245: Blocker
Latch 70000000000a4b8 2351: Blocker
Latch 70000000000a4b8 2566: Blocker
Latch 70000000000a4b8 2585: Blocker
Latch 70000000000a4b8 2643: Blocker
Latch 70000000000a4b8 2773: 2773: is waiting for Latch 700000675daf3a8
Latch 70000000000a4b8 2791: Blocker
Latch 70000000000a4b8 2795: Blocker
Latch 70000000000a4b8 2966: Blocker
Latch 70000000000a4b8 2969: Blocker
Latch 700000675dadf50 ??? Blocker
Latch 700000675dadc68 ??? Blocker
Latch 700000675dadb70 ??? Blocker
Latch 7000006be3a6530 ??? Blocker
Latch 700000675dae808 ??? Blocker
Latch 700000675db0040 ??? Blocker
Latch 7000006d1d71138 ??? Blocker
Latch 700000675dad3b0 ??? Blocker
Latch 700000675dae330 ??? Blocker
Latch 7000006b2d4fd28 2211: Blocker
Latch 7000006b2d4fd28 2220: Blocker
Latch 7000006b2e5df68 2660: Blocker
Latch 7000006b2e5e3e8 2752: Blocker
Latch 7000006b2e5e3e8 2876: Blocker
Latch 7000006b2d06b28 ??? Blocker
Latch 7000006b2f9f928 ??? Blocker
Latch 7000006b2d4db68 ??? Blocker
Latch 7000006b2e5e868 ??? Blocker
Latch 7000006b2d4e6a8 ??? Blocker
Latch 7000006b2d4eb28 2434: Blocker
Latch 7000006b2d4eb28 2437: 2437: is waiting for 2434: 2437:
Latch 7000006b2d4f428 2925: Blocker
Latch 7000006b2d4f428 2948: Blocker
Latch 7000006b2d07428 ??? Blocker
Latch 7000006b2d4e588 ??? Blocker
Latch 7000006b2e5ece8 ??? Blocker
Latch 7000006b2d4efa8 ??? Blocker
Latch 7000006b2d07c08 ??? Blocker
Latch 7000006b2f9e968 ??? Blocker
Latch 700000675daf3a8 ??? Blocker
Latch 7000006b2a49f68 3198: Blocker
Latch 70000000001a968 ??? Blocker

Some of the above latches may be child latches. Please check the section
named 'Child Latch Report' below for further notes.

Blockers According to Tracefile Wait Info:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. This may not work for 64bit platforms. See bug 2902997 for details.
2. If the blocking process is shown as 0 then that session may no longer be
present.
3. If resources are held across code layers then sometimes the tracefile wait
info will not recognise the problem.

No blockers seen.

Object Names
~~~~~~~~~~~~
Latch 70000000000a4b8 enqueues
Latch 700000675dadf50 Child enqueue hash chains
Latch 700000675dadc68 Child enqueue hash chains
Latch 700000675dadb70 Child enqueue hash chains
Latch 7000006be3a6530 Child row cache objects
Latch 700000675dae808 Child enqueue hash chains
Latch 700000675db0040 Child enqueue hash chains
Latch 7000006d1d71138 Child library cache pin
Latch 700000675dad3b0 Child enqueue hash chains
Latch 700000675dae330 Child enqueue hash chains
Latch 7000006b2d4fd28 Child cache buffers chains
Latch 7000006b2e5df68 Child cache buffers chains
Latch 7000006b2e5e3e8 Child cache buffers chains
Latch 7000006b2d06b28 Child cache buffers chains
Latch 7000006b2f9f928 Child cache buffers chains
Latch 7000006b2d4db68 Child cache buffers chains
Latch 7000006b2e5e868 Child cache buffers chains
Latch 7000006b2d4e6a8 Child cache buffers chains
Latch 7000006b2d4eb28 Child cache buffers chains
Latch 7000006b2d4f428 Child cache buffers chains
Latch 7000006b2d07428 Child cache buffers chains
Latch 7000006b2d4e588 Child cache buffers chains
Latch 7000006b2e5ece8 Child cache buffers chains
Latch 7000006b2d4efa8 Child cache buffers chains
Latch 7000006b2d07c08 Child cache buffers chains
Latch 7000006b2f9e968 Child cache buffers chains
Latch 700000675daf3a8 Child enqueue hash chains
Latch 7000006b2a49f68 Child cache buffers chains
Latch 70000000001a968 Parent transaction allocation

Child Latch Report
~~~~~~~~~~~~~~~~~~
Some processes are being blocked waiting for child latches.

At the moment this script does not detect the blocker because the
child latch address differs to the parent latch address. To manually
detect the blocker please take the following steps :
1. Determine the TYPE of latch (Eg library cache) that is involved.
2. Search the source trace file for a target of :
holding.*Parent.*library cache
(Assuming we have a child library cache and have vi-like regular expressions)

If this shows nothing then the blocker may have released the resource
before we got to dump the state object tree of the blocked process.

A list of processes that hold parent latches is given below :

No processes found.

Summary of Wait Events Seen (count>10)
~~~~~~~~~~~~~~~~~~~~~~~~~~~
No wait events seen more than 10 times

 

 

ORA-00600:[1112]内部错误的相关知识如下:

 

ERROR:
ORA-600 [1112] [a] [b] [c] [d] [e]

VERSIONS:
versions 7.3 to 9.2

DESCRIPTION:

ORA-600 [1112] is getting raised while trying to add a
row cache enqueue to a transaction state object during
lookup of the default tablespace number during table
creation.

FUNCTIONALITY:
STATE OBJECT MANAGEMENT

IMPACT:
PROCESS FAILURE
NON CORRUPTIVE – No underlying data corruption.

Bug 2489130 – OERI:1112 can occur while dumping PROCESSSTATE informatio (Doc ID 2489130.8)
Bug 4126973: ORA-600[504] AND ORA-600[1112] OCCURED WHEN GETTING “ERRORSTACK”
Base Bug 2489130
Bug 3954753: ORA-600 [1112] AND SESSION CRASH

 

经过诊断发现该ORA-00600:[1112]内部错误是由Bug 2489130所引起的,而触发该Bug的直接原因是WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!:

 

The cause for the ORA-00600 [1112] appears due to Bug 2489130
This error can occur on dumping of process state which is what occurred here.
The primary issue is the WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!
This then triggers a system state and process state to be dumped due to nature of the problem.
The ORA-00600 [1112] gets dumped out when process state is done.

Stack for trace very similar to Bug 2489130 and this is only known bug on 9.2 like this with a fix.

A fix for bug 2489130 is included in the 9.2.0.7 patchset.
Recommend applying 9.2.0.8 patchset to have this and other bug fixes.
This would only prevent the ORA-00600 [1112] from occurring on state dumps.

 

解决方案是 优化SQL性能以避免出现WAITED TOO LONG FOR A ROW CACHE ENQUEUE LOCK!, 或者至少升级数据库版本到9.2.0.8 这个推荐的patchset。

SQL调优:带函数的谓词导致CBO Cardinality计算误差

今天处理了这样一问题,where条件中存在函数fun(date)<to_date(‘9999-01-01′,’YYYY-MM-DD’)这样的无实际意义谓词,导致CBO计算基数时cardinality远小于实际情况,导致优化器认为2个源数据集的基数都不大,从而选择了HASH JOIN Right SEMI+SORT ORDER BY的执行计划,但是由于实际基数远大于computed 计算值所以变成了大的数据集做HASH JOIN并全数据排序,而实际该SQL只要求返回几十行数据而已,使用NESTED LOOP SEMI JOIN可以立即返回排序的前20行数据。

这里就需要解释带函数的谓词时CBO如何计算基数,我们通过下面的例子来说明:

 

 

create or replace function check_date( RDATE in date)  return date is 
begin
IF rdate< to_date('2099-01-01','YYYY-MM-DD') then   return rdate;   ELSIF  rdate >=to_date('2099-01-01','YYYY-MM-DD') then 
 return to_date('2000-01-01');
 end if;
 end check_date;
 /

 SQL> select check_date (sysdate) from dual;

CHECK_DAT
---------
06-DEC-12

drop table tab1;

SQL> create table tab1 tablespace users as select * from dba_objects where rownum create view vtab1 as select object_id as id , object_name as name, object_type as type , check_date(created) cdata from tab1;

View created.

SQL> select count(distinct cdata) from vtab1;

COUNT(DISTINCTCDATA)
--------------------
                 130

SQL> exec dbms_stats.gather_table_stats('','TAB1', method_opt=>'FOR ALL COLUMNS SIZE 254');

PL/SQL procedure successfully completed.

 

 

因为我们指定收集了直方图所以若直接以”created”为条件查询时可以获得较好的计算基数

   SQL> select count(*) from tab1 where  created  >= to_date('0001-10-10','YYYY-MM-DD');

  COUNT(*)
----------
     10000

Execution Plan
----------------------------------------------------------
Plan hash value: 1117438016

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     8 |    40   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| TAB1 | 10000 | 80000 |    40   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CREATED">=TO_DATE(' 0001-10-10 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        133  consistent gets
          0  physical reads
          0  redo size
        526  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

 

 

在以上查询中>= to_date(‘0001-10-10′,’YYYY-MM-DD’); 这样的过滤条件实际无意义,在直接使用 “created”列作为谓词的情况下,CBO可以获得很好的基数10000。

 

 

SQL> select * from TABLE(dbms_xplan.display_cursor(NULL,NULL,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  6zy2k9dy4cv73, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(*) from vtab1 where  cdata
>= to_date('0001-10-10','YYYY-MM-DD')

Plan hash value: 1117438016

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.25 |     154 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.25 |     154 |
|*  2 |   TABLE ACCESS FULL| TAB1 |      1 |    500 |  10000 |00:00:00.31 |     154 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("CHECK_DATE"("CREATED")>=TO_DATE(' 0001-10-10 00:00:00',
              'syyyy-mm-dd hh24:mi:ss'))

21 rows selected.

 

 

通过gather_plan_statistics HINT,我们得到E-Rows 即CBO评估的基数,和 A-Rows实际的基数,可以看到这里E-Rows=500, 即在谓词左边存在使用内部函数或隐身装换的情况下,CBO无法通过现有统计信息的DISTINCT、DENSITY和HISTOGRAM获得较好的Cardinality,其基数总是统计信息中表的总行数/20,如上例中的 10000/20=500。

这就会引入不少的麻烦,因为开发人员有时候为了方便会在视图字段中嵌入自定义的函数,之后若在查询中使用该字段作为谓词条件,则可能导致CBO为相应表计算的基数偏少,是本身应当成本非常高的执行计划的COST变低,而容易被优化器选择。

对于上述问题可选的常见方案是若有这样问题的SQL较少则考虑加HINT或者SQL PROFILE,若较多还是需要考虑减少这种谓词左边有函数的现象。

implicit data_type conversion functions in Filter Predicates. Review Execution Plans.
If Filter Predicatesinclude unexpected INTERNAL_FUNCTION to perform an implicit data_type conversion,
be sure it is not preventing a column from being used as an Access Predicate.

有哪些功能是DBMS_STATS无法实现而Analyze命令可以做到的?

今天有同行问我DBMS_STATS与Analyze的区别, 我简单地介绍了一下DBMS_STATS提供了更丰富的功能,从8i开始推荐使用DBMS_STATS来完全替代Analayz Table/Index/Cluster 命令收集数据库中对象的统计信息。

 

之后他又问起我有那些功能是Analyze 命令可以做到,反而DBMS_STATS做不到的?

 

这个问题我倒是没有很系统地去考虑过, 闪过脑子的2个念头就是Analyze Table/Index validate structure 和  Analyze Table list chained rows into chained_rows 这2个命令。

 

Google了一下相关的信息,发现Asktom对这个已经讨论过了,从8i以后analyze被提名为具备”validate”验证功能的命令,主要负责验证表和索引的结构以及链式行(chained and migrated rows)信息; 而DBMS_STATS包负责统计信息的管理。

 

analyze is slated to do this and this alone at some time in the future.  analyze is destined to be
the “validate” thing — analyze validate structure, analyze list chained rows, etc.

dbms_stats will stick with stats and stats only.
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:735625536552

 

因为Oracle Development 开发部门已经给予 Analyze 命令较为精确地功能定位了,  所以其最主要的2个功能validate structure和list chained rows目前和将来都不会被植入到DBMS_STATS包当中。

 

Analyze validate structure 的主要语法包括:

 

@?/rdbms/admin/utlvalid.sql                  /* 创建存放验证信息的表 */
desc invalid_rows

ANALYZE INDEX <index_name>
[PARTITION <partition_name>]
[SUBPARTITION <subpartition_name>]
VALIDATE STRUCTURE CASCADE
INTO <table_name> <OFFLINE | ONLINE>

select * from index_stats;

ANALYZE TABLE <table_name>
[PARTITION <partition_name>]
[SUBPARTITION <subpartition_name>]
VALIDATE STRUCTURE CASCADE
[INTO <table_name>] <OFFLINE | ONLINE>

ANALYZE CLUSTER <cluster_name> VALIDATE STRUCTURE CASCADE
INTO <table_name> <OFFLINE | ONLINE>

 

 

补充:analyze index validate structure会填充index_stats视图,该视图包含了丰富的索引空间信息:

 

SQL> desc index_stats;

 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 HEIGHT                                             NUMBER
 BLOCKS                                             NUMBER
 NAME                                               VARCHAR2(30)
 PARTITION_NAME                                     VARCHAR2(30)
 LF_ROWS                                            NUMBER
 LF_BLKS                                            NUMBER
 LF_ROWS_LEN                                        NUMBER
 LF_BLK_LEN                                         NUMBER
 BR_ROWS                                            NUMBER
 BR_BLKS                                            NUMBER
 BR_ROWS_LEN                                        NUMBER
 BR_BLK_LEN                                         NUMBER
 DEL_LF_ROWS                                        NUMBER
 DEL_LF_ROWS_LEN                                    NUMBER
 DISTINCT_KEYS                                      NUMBER
 MOST_REPEATED_KEY                                  NUMBER
 BTREE_SPACE                                        NUMBER
 USED_SPACE                                         NUMBER
 PCT_USED                                           NUMBER
 ROWS_PER_KEY                                       NUMBER
 BLKS_GETS_PER_ACCESS                               NUMBER
 PRE_ROWS                                           NUMBER
 PRE_ROWS_LEN                                       NUMBER
 OPT_CMPR_COUNT                                     NUMBER
 OPT_CMPR_PCTSAVE                                   NUMBER

 

但是请注意validate structure会锁住table/index/cluster等对象,对于在线业务可以考虑使用validate structure online在线验证方法,但是validate strucutre online也有它的缺点,那就是在线模式下结构验证命令将不填充索引的状态信息到index_stats视图。

 

Analyze list chained rows的主要语法包括:

 

@?/rdbms/admin/utlchain.sql

desc chained_rows

ANALYZE TABLE <table_name> LIST CHAINED ROWS
INTO <table_name>;

ANALYZE CLUSTER <cluster_name> LIST CHAINED ROWS INTO <table_name>;

 

注意因为DBMS_STATS仅仅收集对CBO有用的统计信息,所以一些空间使用情况信息和使用FreeList管理的信息都不会被收集,这些信息包括:

 

If statistics unrelated to the cost based optimizer are required, then these
must still be collected using the the ANALYZE command. These statistics include:

Space Usage information :

EMPTY_BLOCKS,
AVG_SPACE,
CHAIN_CNT

Information on freelist blocks

AVG_SPACE_FREELIST_BLOCKS,
NUM_FREELIST_BLOCKS

 

因为以上信息对于CBO计算成本并没有帮助,所以DBMS_STATS也就无意也无法收集它们,但是Analyze命令还是可以做到收集以上这些信息。

 

此外因为CBO其实并不会参考Cluster类型对象的统计信息来计算Cost成本,而是使用cluster中各个表的统计信息(DBMS_STATS does not gather cluster statistics, but you can use DBMS_STATS to gather statistics on the individual tables instead of the whole cluster. )
所以DBMS_STATS也不支持收集Cluster的统计信息。

 

这么总结一下,发现Analyze命令的独有功能还真不少,因为Oracle公司已经明确了Analyze作为”validate”验证命令的功能定位,且很多内部的工具和脚本仍在使用Analyze Table/Cluster/Index的特有功能,所以可以预期Analyze命令在未来的一段时间内也不会被废弃。

 

总结以上DBMS_STATS无法实现,而传统的Analyze命令可以做到的功能:

  1. Analyze validate structure 验证表、簇、索引的结构的完整性,使用cascade选项可以交叉验证表和索引的数据完整,online选项可以做到在线验证
  2. Analyze list chained rows 收集表、簇上的Migrated and Chained Rows链式或迁移行信息
  3. Analyze table compute statistics 收集表上的 empty_blocks、avg_space 空间使用信息
  4. Analyze Cluster 收集簇的信息,其实cluster上唯一可统计的信息是DBA_CLUSTERS.AVG_BLOCKS_PER_KEY(Number of blocks in the table divided by number of cluster keys) , 所以收集cluster的statistics意义不大

 

附Analyze命令的详细语法如下:

 

Syntax:

   ANALYZE TABLE tablename COMPUTE|ESTIMATE|DELETE STATISTICS ptnOption options
   ANALYZE INDEX indexname COMPUTE|ESTIMATE|DELETE STATISTICS ptnOption options
   ANALYZE CLUSTER clustername COMPUTE|ESTIMATE|DELETE STATISTICS options

ptnOption
   PARTITION (partion)
   SUBPARTITION (subpartition)

options
   VALIDATE STRUCTURE [CASCADE] [INTO tablename]
   LIST CHAINED ROWS [INTO tablename]
   COMPUTE|ESTIMATE STATISTICS FOR TABLE
   COMPUTE|ESTIMATE STATISTICS FOR ALL COLUMNS
   COMPUTE|ESTIMATE STATISTICS FOR ALL INDEXED COLUMNS
   COMPUTE|ESTIMATE STATISTICS FOR COLUMNS [SIZE int] column [SIZE int]

   When Estimating statistics you can optionally
    specify
    ... ESTIMATE STATISTICS SAMPLE n ROWS
    ... ESTIMATE STATISTICS SAMPLE n PERCENT

Oracle Database常见存储用语Terms

Oracle Database常见存储用语Terms,如下:

ACD = Active Change Directory
ACFS = ASM Cluster File System
ADDM = Automatic Database Diagnostic Monitor
ADR = Automatic Diagnostic Repository
ADVM = ASM Dynamic Volume Manager
AIO = Asynchronous I/O
AMDU = ASM Metadata Dump Utility
AMM = Automatic Memory Management
ARC = Archive process
ASH = Active Session History
ASM = Automatic Storage Management
ASMB = ASM Background process
ASMCA = ASM Configuration Assistant
ASMCMD = ASM CoMmanD line utility
ASMLIB = ASM LIBrary tool
ASMM = Automatic Shared Memory Management
ASMSNMP = ASM Simple Network Management Protocol
AT = Allocation Table
ATA = Advanced Technology Attachment
AU = Allocation Unit
AWR = Automatic Workload Repository
BH = Block Header
BMC = Baseboard Management Controller
BS = Block Size
CBO = Cost-Based Optimizer
CF = Control File
CFS = Cluster FileSystem
CHM = Cluster Health Monitor
CIO = Concurrent I/O
CKPT = ChecKPoinT process
CLUVFY = CLUster VeriFy utility
COD = Continuing Operation Directory
CPU = Central Processing Unit or Critical Patch Update
CRM = Customer Relationship Management
CRS = Cluster Ready Services
CSS = Cluster Synchronization Services
CSV = Comma-Separated Values
CVM = Cluster Volume Manager
DB = DataBase
DBCA = DataBase Configuration Assistant
DBFS = DataBase File System
DBM = DataBase Machine
DBMS = DataBase Management Systems
DBPERF = DataBase PERFormance
DBV = DataBase Verification tool
DBW = DataBase Writer process
DD = Disk Directory or Data Description tool
DES = Database Excelleration Systems Inc.
DG = DiskGroup
DH = Disk Header
DIO = Direct I/O
DISM = Dynamic Intimate Shared Memory
DM = Device Mapper
DNFS = Direct Network File System
DRAM = Dynamic Random Access Memory
DSS = Decision Support System
DUL = Data UnLoader
DW = Data Warehouse
EIDE = Enhanced Integrated Drive Electronics
ERP = Enterprise Resource Planning
ETA = Estimated Time of Arrival
ETL = Extract Transform Load
FD = File Directory
FG = FailGroup
FRA = Flash Recovery Area or Fast Recovery Area
FS = FileSystem
FST = Free Segments Table
FTS = Full Table Scan
GC = Grid Control
GI = Grid Infrastructure
GUI = Graphical User Interface
HA = High Availability
HARD = Hardware Assisted Resilient Data
HB = Heart Beat
HBA = Host Bus Adapter
IDE = Integrated Drive Electronics
IIS = Internet Information Services
INST = INSTance
IO = Input/Output
IOPS = IO Per Second
IOT = Index Organized Table
IP = Internet Protocol
IPMI = Intelligent Platform Management Interface
JET = Joint Escalation Team
JFS = Journaled FileSystem
KB = KiloByte
KFED = Kernel Files metadata EDitor
KFOD = Kernel Files Osm disk[group] Discovery
LGWR = redo LoG WRiter process
LLT = Veritas Low Latency Transport protocol
LSNR = LiSteNeR
LUN = Logical Unit Number
LVM = Logical Volume Manager
MAA = Maximum Availability Architecture
MB = MegaByte
NAS = Network Attached Storage
NetApp = Network Appliance
NETCA = NETwork Configuration Assistant
NFS = Network FileSystem
NIC = Network Interface Controller
OCFS = Oracle Cluster FileSystem
OCR = Oracle Cluster Registry
ODM = Oracle Disk Manager
ODS = Operational Data Store
OEL = Oracle Enterprise Linux
OEM = Oracle Enterprise Manager
OID = Oracle Internet Directory
OLAP = OnLine Analytical Processing
OLTP = OnLine Transaction Processing
OMS = Oracle Management Service
OPATCH = Oracle PATCHing utility
OS = Operating System
OSCP = Oracle Storage Compatibility Program
OSW = OS Watcher
OSWFW = OS Watcher For Windows
OTN = Oracle Technology Network
OUI = Oracle Universal Installer
PB = PetaByte
PFILE = Parameter FILE
PGA = Program Global Area
PID = Proces ID
PL/SQL = Procedural Language/Structured Query Language
POC = Proof Of Concept
PROCWATCHER = PROCess WATCHER
PST = Partnership Status Table
PSU = Patch Set Update
PX = Parrallel eXecution
RAC = Real Application Cluster
RAID = Redundant Array of Independent Disks
RAM = Random Access Memory
RBAL = ReBALance process
RCA = Root Cause Analysis
RDA = Remote Diagnostic Agent
RDBMS = Relational DataBase Management System
RHEL = RedHat Enterprise Linux
RM = Resource Manager
RMAN = Recovery MANager
RPM = Resource Package Manager
SAS = Serial Attached SCSI
SATA = Serial Advanced Technology Attachment
SCAN = Single Client Access Name
SCSI = Small Computer System Interface
SGA = System Global Area
SLA = Service Level Agreement
SMF = Service Management Facility
SPFILE = Server Parameter FILE
SQL = Structured Query Language
SRDF = EMC Symmetrix Remote Data Facility
SSD = Solid State Disk
SVCTM = average SerViCe TiMe
SVM = Solaris Volume Manager
TB = TeraByte
TCP = Transmission Control Protocol
TDE = Transparent Data Encryption
TKPROF = Transient Kernel PROFile
TNS = Transparent Network Substrate
UDEV = Unix DEVice manager
UDP = User Datagram Protocol
UFG = Umbilicus ForeGround process
UFS = User FileSystem
VBG = Volume BackGround process
VCS = Veritas Cluster Server
VD = Voting Disk
VDBG = Volume Driver BackGround process
VIP = Virtual Internet Portocol
VLDB = Very Large DataBase
VM = Virtual Machine
VMB = Volume Membership BackGround process
VxFS = Veritas File System
VxVM = Veritas Volume Manager
XDB = XML DataBase

Exadata:Smart Scan(二) FAST FULL SCAN

实测了以下Exadata smart scan对于INDEX STORAGE FAST FULL SCAN似乎实际并不生效,详见以下测试。不仅普通的B*tree index也包括compressed index、reverse key index等类型。

Why is my Exadata smart scan not offloading?
Exadata Smart Scan and Index Access 

上面2篇文章介绍了了类似的XD offload 对index fast full scan不生效的问题, 相关的BUG有:

Bug 8257122 – Exadata smart scan caching does not work for INDEX FAST FULL scan (Doc ID 8257122.8)

 

以下测试了对NORMAL INDEX和bitmap index fast full scan的OFFLOAD情况:


SQL> select blocks,bytes/1024/1024/1024 from dba_segments where segment_name='LARGE_TABLE';

    BLOCKS BYTES/1024/1024/1024
---------- --------------------
   7127040               54.375

Elapsed: 00:00:00.19


SQL> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.19


SQL>  select count(*) from  LARGE_TABLE where object_id between 1000 and 20000;

  COUNT(*)
----------
 486735872

Elapsed: 00:00:23.29


SQL> alter system flush buffer_cache;

System altered.

Elapsed: 00:00:00.19

SQL> select  /*+ OPT_PARAM('cell_offload_processing' 'false') */ count(*) from  LARGE_TABLE where object_id between 1000 and 20000;

  COUNT(*)
----------
 486735872

Elapsed: 00:03:24.22



SQL> create index pk_lt on large_table (object_id,data_object_id) tablespace larget parallel nologging;

Index created.

Elapsed: 00:01:14.18

SQL> alter index pk_lt noparallel;

Index altered.

SQL> exec dbms_stats.gather_table_stats('SYS','LARGE_TABLE',cascade=>TRUE, estimate_percent=>100, degree=>8);

 PL/SQL procedure successfully completed.

Elapsed: 00:13:12.61






select a.name,b.value 
from v$sysstat a , v$mystat b
where
a.statistic#=b.statistic#
and (a.name in ('physical read total bytes','physical write total bytes',
'cell IO uncompressed bytes') or a.name like 'cell phy%' );

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total bytes                                                 0
physical write total bytes                                                0
cell physical IO interconnect bytes                                       0
cell physical IO bytes saved during optimized file creation               0
cell physical IO bytes saved during optimized RMAN file restore           0
cell physical IO bytes eligible for predicate offload                     0
cell physical IO bytes saved by storage index                             0
cell physical IO bytes sent directly to DB node to balance CPU            0
cell physical IO interconnect bytes returned by smart scan                0
cell IO uncompressed bytes                                                0

10 rows selected.

alter system flush buffer_cache;
set timing on;
set linesize 200 pagesize 2000 
select /*+ FULL(LARGE_TABLE) */ count(*) from  LARGE_TABLE where object_id between 1000 and 20000;


  COUNT(*)
----------
 486735872

Elapsed: 00:00:23.30
TABLE ACCESS STORAGE FULL| LARGE_TABLE |   403M|  1925M|  1935K  (1)|

SQL> select a.name,b.value 
  2  from v$sysstat a , v$mystat b
  3  where
a.statistic#=b.statistic#
and (a.name in ('physical read total bytes','physical write total bytes',
'cell IO uncompressed bytes') or a.name like 'cell phy%' );  4    5    6  

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total bytes                                        5.8303E+10
physical write total bytes                                                0
cell physical IO interconnect bytes                              6055421032
cell physical IO bytes saved during optimized file creation               0
cell physical IO bytes saved during optimized RMAN file restore           0
cell physical IO bytes eligible for predicate offload            5.8303E+10
cell physical IO bytes saved by storage index                             0
cell physical IO bytes sent directly to DB node to balance CPU            0
cell physical IO interconnect bytes returned by smart scan       6055396456
cell IO uncompressed bytes                                       5.8321E+10

10 rows selected.

Elapsed: 00:00:00.01
SQL> 


SQL> alter system flush buffer_cache;
set timing on;
set linesize 200 pagesize 2000 
System altered.

SQL> SQL> 
SQL> 
SQL> explain plan for select count(*) from  LARGE_TABLE where object_id between 1000 and 20000;

Explained.

Elapsed: 00:00:00.02
SQL> @?/rdbms/admin/utlxplp

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 800139279

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     1 |     5 |   356K  (1)| 00:00:14 |
|   1 |  SORT AGGREGATE               |       |     1 |     5 |            |          |
|*  2 |   INDEX STORAGE FAST FULL SCAN| PK_LT |   403M|  1925M|   356K  (1)| 00:00:14 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - storage("OBJECT_ID"<=20000 AND "OBJECT_ID">=1000)
       filter("OBJECT_ID"<=20000 AND "OBJECT_ID">=1000)

15 rows selected.

Elapsed: 00:00:00.02






SQL> select count(*) from  LARGE_TABLE where object_id between 1000 and 20000;

  COUNT(*)
----------
 486735872

Elapsed: 00:02:01.66
SQL> oradebug tracefile_name
/u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_ora_39539.trc
SQL> select a.name,b.value 
  2  from v$sysstat a , v$mystat b
  3  where
a.statistic#=b.statistic#
and (a.name in ('physical read total bytes','physical write total bytes',
'cell IO uncompressed bytes') or a.name like 'cell phy%' );  4    5    6  

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total bytes                                        1.3300E+10
physical write total bytes                                                0
cell physical IO interconnect bytes                              1.3300E+10
cell physical IO bytes saved during optimized file creation               0
cell physical IO bytes saved during optimized RMAN file restore           0
cell physical IO bytes eligible for predicate offload                     0
cell physical IO bytes saved by storage index                             0
cell physical IO bytes sent directly to DB node to balance CPU            0
cell physical IO interconnect bytes returned by smart scan                0
cell IO uncompressed bytes                                                0

10 rows selected.

Elapsed: 00:00:00.01


END OF STMT
PARSE #47310019587768:c=2000,e=2137,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=800139279,tim=1353385966411213
EXEC #47310019587768:c=1000,e=67,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=800139279,tim=1353385966411365
WAIT #47310019587768: nam='SQL*Net message to client' ela= 3 driver id=1650815232 #bytes=1 p3=0 obj#=-1 tim=1353385966411401
WAIT #47310019587768: nam='cell single block physical read' ela= 511 cellhash#=451279719 diskhash#=3519799300 bytes=8192 obj#=25183 tim=1353385966414839
WAIT #47310019587768: nam='cell multiblock physical read' ela= 16720 cellhash#=451279719 diskhash#=3519799300 bytes=1048576 obj#=25183 tim=1353385966433058
WAIT #47310019587768: nam='cell multiblock physical read' ela= 2965 cellhash#=451279719 diskhash#=3519799300 bytes=1048576 obj#=25183 tim=1353385966440986
...........................


select count(*) from  LARGE_TABLE where owner like '%SY%';


QL> explain plan for select count(*) from  LARGE_TABLE where owner like '%SY%';

Explained.

Elapsed: 00:00:00.00
SQL> @?/rdbms/admin/utlxplp

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3706014413

------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |        |     1 |     5 | 19017   (1)| 00:00:01 |
|   1 |  SORT AGGREGATE                       |        |     1 |     5 |            |          |
|   2 |   BITMAP CONVERSION COUNT             |        |    26M|   127M| 19017   (1)| 00:00:01 |
|*  3 |    BITMAP INDEX STORAGE FAST FULL SCAN| BIT_LT |       |       |            |          |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - storage("OWNER" LIKE '%SY%' AND "OWNER" IS NOT NULL)
       filter("OWNER" LIKE '%SY%' AND "OWNER" IS NOT NULL)

16 rows selected.

Elapsed: 00:00:00.01

SQL> set linesize 200 pagesize 2000 


SQL> select count(*) from  LARGE_TABLE where owner like '%SY%';

  COUNT(*)
----------
 362643456

Elapsed: 00:00:01.07






SQL> select a.name,b.value 
  2  from v$sysstat a , v$mystat b
  3  where
a.statistic#=b.statistic#
and (a.name in ('physical read total bytes','physical write total bytes',
'cell IO uncompressed bytes') or a.name like 'cell phy%' );  4    5    6  

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
physical read total bytes                                         173424640
physical write total bytes                                                0
cell physical IO interconnect bytes                               173424640
cell physical IO bytes saved during optimized file creation               0
cell physical IO bytes saved during optimized RMAN file restore           0
cell physical IO bytes eligible for predicate offload                     0
cell physical IO bytes saved by storage index                             0
cell physical IO bytes sent directly to DB node to balance CPU            0
cell physical IO interconnect bytes returned by smart scan                0
cell IO uncompressed bytes                                                0

10 rows selected.

Elapsed: 00:00:00.02
SQL> 


解决Oracle中Split Partition缓慢的问题

有这样一个case , 用户的10g产品数据库中有一张按照月份划分的RANGE分区表 , 其范围为Less than (maxvalue)的最后一个Partition分区总是为空Empty的, 用户每隔半年会对这个MaxValue Partition 执行Split Partition 操作, 以便存放后续月份的数据, 同时这个表上还有一个非分区索引Nonpartitioned indexes。

 

满以为Split 这样一个Empry Partition会是一个回车就结束的任务, 但是Performance issue就在这平淡无奇的分区维护工作中出现了, 实际在执行”Alter Table Split partition Pn at (value) into …” 的DDL语句时,发现需要花费超过十多分钟才能完成一次Split。问题就在于,如果是有大量数据的Partition分区 , Split 操作慢一些也是正常的(expected预期内的) , 但是实际这里的Max Partition总是为空的,没有任何一条数据, 为什么Split 一个空的分区也会是这种蜗牛的速度呢?

 

我们来模拟这个现象, 首先创建一张分区表,Maxvalue的Partition是Empty的,且有一个普通索引:

 

 

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for Linux: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

SQL> select * from global_name;

GLOBAL_NAME
--------------------------------------------------------------------------------
www.askmaclean.com & www.askmaclean.com

SQL>   CREATE TABLE Maclean
  2     (    "OWNER" VARCHAR2(30),
  3          "OBJECT_NAME" VARCHAR2(128),
  4          "SUBOBJECT_NAME" VARCHAR2(30),
  5          "OBJECT_ID" NUMBER,
  6          "DATA_OBJECT_ID" NUMBER,
  7          "OBJECT_TYPE" VARCHAR2(19),
  8          "CREATED" DATE,
  9          "LAST_DDL_TIME" DATE,
 10          "TIMESTAMP" VARCHAR2(19),
 11          "STATUS" VARCHAR2(7),
 12          "TEMPORARY" VARCHAR2(1),
 13          "GENERATED" VARCHAR2(1),
 14          "SECONDARY" VARCHAR2(1)
 15     ) nologging
 16     partition by range(object_id)
 17     (partition p1 values less than (99999) tablespace users,
 18     partition p2 values less than (maxvalue) tablespace users);

Table created.

SQL> insert /*+ append */ into maclean select * from maclean1;

38736384 rows created.

SQL> commit; 

Commit complete.

SQL> create index ind_obj on maclean(DATA_OBJECT_ID,OBJECT_ID,LAST_DDL_TIME,TIMESTAMP,object_type,owner,status)
nologging parallel
  2  ;

Index created.

SQL> alter index ind_obj noparallel;

Index altered.

SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN',cascade=>true,degree=>2);

PL/SQL procedure successfully completed.

SQL> select num_rows from dba_tables where table_name='MACLEAN';

  NUM_ROWS
----------
  38818438

SQL>  select * from maclean partition (p2);

no rows selected

/* Maclean表有大量的数据 ,但是都在p1分区中, p2分区没有任何数据 */

 

 

我们执行Split partition 的DDL 语句,并使用10046 level 12 event监控该过程:

 

 oradebug setmypid;
 oradebug event 10046 trace name context forever,level 12;

SQL>  alter table maclean split partition p2 at (100001)
  2   into (partition p3, partition p4);

Table altered.

[oracle@vrh8 ~]$ tkprof /s01/admin/G10R21/udump/g10r21_ora_4896.trc g10r21_ora_4896.tkf

TKPROF: Release 10.2.0.1.0 - Production on Thu Nov 17 23:42:48 2011

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

从tkf 文件中可以找出以下内容:

 alter table maclean split partition p2 at (100001)
 into (partition p3, partition p4)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.13       0.30         20       1139          0           0
Execute      1      0.01       0.18          3          6         33           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.14       0.48         23       1145         33           0

select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1
from
 "SYS"."MACLEAN" PARTITION ("P2")  where ( (  (  ( "OBJECT_ID" < 100001 )  )
  ) ) and rownum < 2

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1     24.85      39.15     371276     385828          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3     24.85      39.15     371276     385828          0           0

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: SYS   (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  COUNT STOPKEY (cr=385828 pr=371276 pw=0 time=39153836 us)
      0   TABLE ACCESS BY GLOBAL INDEX ROWID MACLEAN PARTITION: 2 2 (cr=385828 pr=371276 pw=0 time=39153817 us)
38736384    INDEX FULL SCAN IND_OBJ (cr=385828 pr=371276 pw=0 time=309891137 us)(object id 52832)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                    371276        0.08         19.46
  latch: cache buffers lru chain                  1        0.00          0.00

select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1
from
 "SYS"."MACLEAN" PARTITION ("P2")  where ( (  (  ( "OBJECT_ID" >= 100001 OR
  "OBJECT_ID" IS NULL  )  )  ) ) and rownum < 2

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.00          0          3          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.00          0          3          0           0

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: SYS   (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  COUNT STOPKEY (cr=3 pr=0 pw=0 time=83 us)
      0   PARTITION RANGE SINGLE PARTITION: 2 2 (cr=3 pr=0 pw=0 time=63 us)
      0    TABLE ACCESS FULL MACLEAN PARTITION: 2 2 (cr=3 pr=0 pw=0 time=36 us)

 

 

可以看到在执行”Alter table Split partition”的时候该DDL 语句产生了另外2条递归SQL(recursive sql)被调用,即上例中红色标注的SQL语句,这2条递归SQL分别以 “OBJECT_ID” >= 100001 OR “OBJECT_ID” IS NULL  和  “OBJECT_ID” < 100001 作为条件查询P2分区中的数据, Split Partition的DDL需要使用这2条SQL来找出是否存在满足分隔条件过滤的数据(注意语句中有rownum<2 , 所以实际最多也只返回1条数据,Oracle这样来判定分隔条件的左端或右端是否有数据)。

 

但是这里可以看到,其中 以”OBJECT_ID” < 100001 作为条件的语句运行耗时39.15s, 产生了大量的逻辑和物理读, 究其原因是该SQL的执行计划采用了Index FULL SCAN ,而另外一条以 “OBJECT_ID” >= 100001 OR “OBJECT_ID” IS NULL 作为条件的递归SQL语句则采用了TABLE ACCESS FULL MACLEAN PARTITION, 因为实际P2分区中是没有任何数据的,所以后者运行时间是us级别的, 而前者所要扫描的是一整个没有分区的索引, 这产生了大量的”db file sequential read”等待事件, 我们再来看一下其执行计划:

 

 

SQL> explain plan for
  2  select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
  3   1
  4    from "SYS"."MACLEAN" PARTITION("P2")
  5   where (((("OBJECT_ID" < 100001))))
  6     and rownum < 2; 

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
-------------------------------------------------
Plan hash value: 985826631

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     1 |    13 |   380K  (1)| 01:16:01 |       |       |
|*  1 |  COUNT STOPKEY                      |         |       |       |            |          |       |       |
|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| MACLEAN |     1 |    13 |   380K  (1)| 01:16:01 |     2 |     2 |
|*  3 |    INDEX FULL SCAN                  | IND_OBJ |    38M|       |   380K  (1)| 01:16:01 |       |       |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

      1 - filter(ROWNUM<2)
   3 - access("OBJECT_ID"<100001)
       filter("OBJECT_ID"<100001)

 

注意以上递归SQL中的Hint “FIRST_ROWS PARALLEL(“MACLEAN”, 1)”是该Recursive SQL所固有的,换句话说是写死在Split Partition的Oracle代码层里的。

我们可以分析该Recursive SQL 采用INDEX FULL SCAN的原因可能有2种:
1. Split 所指定的分区的分区统计信息不准确,或者已经被清除。 注意一旦我们Split 某个分区后该分区原有统计信息将失效,且被清除。

如下面的例子:

 

SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN',cascade=>true,degree=>2);

PL/SQL procedure successfully completed.

SQL> col high_value for a20

SQL> select partition_name,high_value,num_rows,blocks from dba_tab_partitions where table_name='MACLEAN';

PARTITION_NAME                 HIGH_VALUE             NUM_ROWS     BLOCKS
------------------------------ -------------------- ---------- ----------
P1                             99999                  38789142     533240
P3                             100001                        0          0
P4                             MAXVALUE                      0          0

SQL>  alter table maclean split partition p4 at (100010)
  2   into (partition p5, partition p4);

这里我们再次Split 当前的MAXVALUE分区p4

SQL> select partition_name,high_value,num_rows,blocks from dba_tab_partitions where table_name='MACLEAN';

PARTITION_NAME                 HIGH_VALUE             NUM_ROWS     BLOCKS
------------------------------ -------------------- ---------- ----------
P1                             99999                  38789142     533240
P3                             100001                        0          0
P4                             MAXVALUE
P5                             100010

 

 

可以发现Split Partiton 会导致原Partiton的统计信息失效,即便使用dbms_stats.lock_table_stats锁住统计信息也无法避免这种失效。
且单个partiton的统计信息失效并不会导致动态采用的发生(dynamic sampling):

 

SQL> show parameter dyna 

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling           integer     2

SQL> select * from maclean partition (p4);

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3900731449

--------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |  9699K|   860M| 29715   (2)| 00:05:57 |       |       |
|   1 |  PARTITION RANGE SINGLE|         |  9699K|   860M| 29715   (2)| 00:05:57 |     4 |     4 |
|   2 |   TABLE ACCESS FULL    | MACLEAN |  9699K|   860M| 29715   (2)| 00:05:57 |     4 |     4 |
--------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
        124  recursive calls
          0  db block gets
         23  consistent gets
          0  physical reads
          0  redo size
       1139  bytes sent via SQL*Net to client
        458  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

SQL> alter system flush shared_pool;

System altered.

SQL> alter session set optimizer_dynamic_sampling=10;

Session altered.

SQL> select * from maclean partition (p4);

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3900731449

--------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |  9699K|   860M| 29715   (2)| 00:05:57 |       |       |
| 1 | PARTITION RANGE SINGLE | | 9699K| 860M| 29715 (2)| 00:05:57 | 4 | 4 |
|   2 |   TABLE ACCESS FULL    | MACLEAN |  9699K|   860M| 29715   (2)| 00:05:57 |     4 |     4 |
--------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
       5812  recursive calls
          0  db block gets
       1141  consistent gets
         22  physical reads
          0  redo size
       1139  bytes sent via SQL*Net to client
        458  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
        139  sorts (memory)
          0  sorts (disk)
          0  rows processed

SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN',partname=>'P4');

PL/SQL procedure successfully completed.

SQL> select * from maclean partition (p4);

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3900731449

--------------------------------------------------------------------------------------------------
| Id  | Operation              | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |         |     1 |   177 |     2   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|         |     1 |   177 |     2   (0)| 00:00:01 |     4 |     4 |
|   2 |   TABLE ACCESS FULL    | MACLEAN |     1 |   177 |     2   (0)| 00:00:01 |     4 |     4 |
--------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
       1139  bytes sent via SQL*Net to client
        458  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

 

 

由于Split Partition 本身会导致分区的统计信息失效, 这叫造成由该Split DDL所生成的递归查询SQL语句在解析时CBO Optimizer优化器无法了解该分区的必要统计信息,所以优化器会根据整张Table的统计信息来估算(根据算法来估,而没有实际的统计操作), 实际统计信息中整张表有38818438 行 , 且共有4个分区, 所以P4分区就顺利成长的拥有38818438 /4 = 9704k Rows了 , 实际上例中红色标注的估算值时9699k rows, 因为CBO 优化器得到的统计信息是该分区中有大量的数据 ,这导致其最终选择了 FULL INDEX SCAN的执行计划 ,而不是去扫描其实是空空如也的分区。

 

这是我们在 10g 中执行Split Partition 操作时需要密切注意的一个问题, 解决方法是没执行一次Split Partition DDL语句之前都收集一遍MaxValue Partiton 的统计信息, 因为该分区是空的 ,所以这种统计是十分迅速的:

 

exec dbms_stats.gather_table_stats('&OWNER','&TABNAME',partname=>'&PARNAME');

 

 

另一种手段则是在每次Split 之前都 手动修改Maxvalue分区的统计信息, 这样做会更快一些:

 

exec dbms_stats.set_table_stats(ownname => '&OWNER',tabname => '&TABNAME',partname => '&PARNAME',
numrows => 0,numblks => 0,force => TRUE);

 

 

2. 另一个原因是 相关的递归SQL语句 被嵌入了”First Rows”的hint , 该提示会让CBO更倾向于使用索引以便快速返回前几行的结果,注意因为这些递归SQL实际只要求返回一行结果,所以First Rows 可以说是恰当且明智的; 另外在分区表+本地分区的情景中, 即便这个分区是非空的且存有大量的数据, 那么使用索引都可以说是正确的选择。

 

但是在这里选择INDEX FULL SCAN 恰恰是不明智的, 很显然Oracle开发部门没有为分区表+ 非分区索引(Non-partitioned Indexes) 或 全局索引(global partitioned indexes)的场景考虑该First Rows提示可能带来的后果,  已知在版本10.2.0.2 和 10.2.0.3 上都存在该不恰当的递归SQL hint造成的Split Partiton性能问题,Bug Note<Bug 6435544: SPLIT PARTITION SLOW BECAUSE OF HINTED RECURSIVE SQL>说明了该问题:

 

Hdr: 6435544 10.2.0.2.0 RDBMS 10.2.0.2.0 QRY OPTIMIZER PRODID-5 PORTID-226
Abstract: SPLIT PARTITION SLOW BECAUSE OF HINTED RECURSIVE SQL

PROBLEM:
--------
Split partition operation took more than 45 minutes to complete. Almost all
the time is taken up by the following SQL -
select /*+ FIRST_ROWS PARALLEL("D_CUSTOMER_ORDER_ITEM_CHANNELS", 8) */ 1
from
 "BOOKER"."D_CUSTOMER_ORDER_ITEM_CHANNELS" PARTITION ("COIC101_MAX")
where ( ( ( ( "LEGAL_ENTITY_ID" < 101 )  )  OR  ( "LEGAL_ENTITY_ID" = 101 AND
 (
 "ORDER_DAY" < TO_DATE(' 2007-11-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS',
 'NLS_CALENDAR=GREGORIAN') )  )  ) ) and rownum < 2

DIAGNOSTIC ANALYSIS:
--------------------
a. Table is partitioned on - (LEGAL_ENTITY_ID, ORDER_DAY).
   Index PK_D_CUST_ORDER_ITEM_CHANNELS is on -
       (CUSTOMER_ORDER_ITEM_ID, MARKETPLACE_ID, LEGAL_ENTITY_ID, ORDER_DAY)

b. Table has 555 partitions. Index is a global index. 

c. From the tkprof output -

call   count       cpu    elapsed       disk      query  current  rows
----- ------  -------- ---------- ---------- ---------- --------  ----
Parse      1      0.00       0.00          0          0        0     0
Execute    1      0.00       0.00          0          0        0     0
Fetch      1    405.50    2796.28    1408207    1444973        0     0
----- ------  -------- ---------- ---------- ---------- --------  ----
total      3    405.50    2796.28    1408207    1444973        0     0

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: 108     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
time=6 us)(object id 1160811)

d. Although we have partition_name in the query, we go for INDEX FULL scan
because of hard coded hint FIRST_ROWS. A FTS of the concerned partition would
be much faster in this case.

WORKAROUND:
-----------
Drop the global index which would force an FTS on a single partition.

RELEASE NOTES:
]] Poor performance was possible for ALTER TABLE SPLIT PARTITION if there
]] was an index on the partition key.
REDISCOVERY INFORMATION:
If you get poor performance for an ALTER TABLE SPLIT PARTITION operation
and there is an index on the partition key, you may be hitting this bug.
WORKAROUND:
None

 

 

Metalink给出的Workaround方案是将分区表上的global index 全局索引drop 掉, 这样可以让CBO只能选择对single partition的FULL TABLE SCAN。

 

实际上这个Solution并很不能让人满意,在产品环境中我们不可能随意drop掉某张关键表上的索引,所以这个solution的可操作性很低。

 

补充:我们来看一下First Rows Hint 在CBO计算成本时如何起作用的:

 

SQL> oradebug setmypid;
Statement processed.
SQL> oradebug event 10053 trace name context forever,level 1;
Statement processed.
SQL> explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
  2   1
  3    from "SYS"."MACLEAN" PARTITION("P4")
  4   where (((("OBJECT_ID" < 100010))))
  5     and rownum < 2; Explained. SQL> oradebug tracefile_name;

10053 trace content ================================================

  *************************************
  PARAMETERS WITH ALTERED VALUES
  ******************************
  optimizer_mode_hinted               = true
  optimizer_mode                      = first_rows

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: MACLEAN  Alias: MACLEAN  Partition [3]
    #Rows: 0  #Blks:  1  AvgRowLen:  0.00
    #Rows: 0  #Blks:  1  AvgRowLen:  0.00
Index Stats::
  Index: IND_OBJ  Col#: 5 4 8 9 6 1 10
    LVLS: 3  #LB: 380544  #DK: 50447  LB/K: 7.00  DB/K: 777.00  CLUF: 39208542.00
***************************************
SINGLE TABLE ACCESS PATH
  Column (#4): OBJECT_ID(NUMBER)  Part#: 3
    AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0
  Column (#4): OBJECT_ID(NUMBER)
    AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0
  Table: MACLEAN  Alias: MACLEAN     
    Card: Original: 0  Rounded: 1  Computed: 0.00  Non Adjusted: 0.00
  Access Path: TableScan
    Cost:  2.00  Resp: 2.00  Degree: 0
      Cost_io: 2.00  Cost_cpu: 7121
      Resp_io: 2.00  Resp_cpu: 7121
kkofmx: index filter:"MACLEAN"."OBJECT_ID"<100010 AND ROWNUM<2
  Access Path: index (skip-scan)
    SS sel: 0.0000e+00  ANDV (#skips): 4073
    SS io: 32584.00 vs. table scan io: 2.00
    Skip Scan rejected
  Access Path: index (FullScan)
    Index: IND_OBJ
    resc_io: 380547.00  resc_cpu: 10551751028
    ix_sel: 1  ix_sel_with_filters: 1
    Cost: 382007.38  Resp: 382007.38  Degree: 1
  Best:: AccessPath: IndexRange  Index: IND_OBJ
         Cost: 382007.38  Degree: 1  Resp: 382007.38  Card: 0.00  Bytes: 0
***************************************
OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
***********************
Join order[1]:  MACLEAN[MACLEAN]#0
***********************
Best so far: Table#: 0  cost: 382007.3822  card: 0.0000  bytes: 13
(newjo-stop-1) k:0, spcnt:0, perm:1, maxperm:80000
*********************************
Number of join permutations tried: 1
*********************************
prefetching is on for IND_OBJ
Final - First Rows Plan:  Best join order: 1
  Cost: 382007.3822  Degree: 1  Card: 1.0000  Bytes: 13
  Resc: 382007.3822  Resc_io: 380547.0000  Resc_cpu: 12512178128
  Resp: 382007.3822  Resp_io: 380547.0000  Resc_cpu: 12512178128
kkoipt: Query block SEL$1 (#0)
******* UNPARSED QUERY IS *******
SELECT /*+ FIRST_ROWS NO_PARALLEL ("MACLEAN") */ 1 "1" FROM "SYS"."MACLEAN" PARTITION ("P4") 
"MACLEAN" WHERE ROWNUM<2 AND "MACLEAN"."OBJECT_ID"<100010
kkoqbc-end
          : call(in-use=46464, alloc=49080), compile(in-use=39288, alloc=40552)
apadrv-end: call(in-use=46464, alloc=49080), compile(in-use=40072, alloc=40552)

sql_id=ff1ft3uxsq105.
Current SQL statement for this session:
explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
 1
  from "SYS"."MACLEAN" PARTITION("P4")
 where (((("OBJECT_ID" < 100010))))
   and rownum < 2

============
Plan Table
============
-------------------------------------------------------+-----------------------------------+---------------+
| Id  | Operation                            | Name    | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |
-------------------------------------------------------+-----------------------------------+---------------+
| 0   | SELECT STATEMENT                     |         |       |       |  373K |           |       |       |
| 1   |  COUNT STOPKEY                       |         |       |       |       |           |       |       |
| 2   |   TABLE ACCESS BY GLOBAL INDEX ROWID | MACLEAN |     1 |    13 |  373K |  01:16:25 | 4     | 4     |
| 3   |    INDEX FULL SCAN                   | IND_OBJ |   37M |       |  373K |  01:16:25 |       |       |
-------------------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
1 - filter(ROWNUM<2)
3 - access("OBJECT_ID"<100010)
3 - filter("OBJECT_ID"<100010)

Content of other_xml column
===========================
  db_version     : 10.2.0.1
  parse_schema   : SYS
  plan_hash      : 985826631
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.1')
      FIRST_ROWS
      OUTLINE_LEAF(@"SEL$1")
      INDEX(@"SEL$1" "MACLEAN"@"SEL$1" ("MACLEAN"."DATA_OBJECT_ID" "MACLEAN"."OBJECT_ID"
"MACLEAN"."LAST_DDL_TIME" "MACLEAN"."TIMESTAMP" "MACLEAN"."OBJECT_TYPE" "MACLE
AN"."OWNER" "MACLEAN"."STATUS"))
    END_OUTLINE_DATA
  */

 

 

可以看到虽然INDEX FULL SCAN的成本(cost:382007)大于 Access Path: TableScan (cost : 2.00) 很多 ,但是optimizer 最终仍然选择了Index Full Scan , 因为其是满足First Rows 要求的执行计划(红色标注部分)。

 

于是我开始自己寻找workaround的路径,目标是让优化器忽略”First Rows”的影响。我一开始寄望于能够通过设置一些影响CBO计算cost的优化器参数来让optimizer 迷途知返, 包括设置optimizer_index_cost_adj和”_db_file_optimizer_read_count”的值到一个很大水平,但发现并不起作用:

 

SQL> alter session set "_db_file_optimizer_read_count"=65535;

Session altered.

SQL>  alter session set optimizer_index_cost_adj=10000;

Session altered.

SQL> alter system flush shared_pool;

System altered.

SQL> select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
  2   1
  3    from "SYS"."MACLEAN" PARTITION("P4")
  4   where (((("OBJECT_ID" < 100010))))
  5     and rownum < 2;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 985826631

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     1 |    13 |    38M  (1)|127:20:09 |       |       |
|*  1 |  COUNT STOPKEY                      |         |       |       |            |          |       |       |
|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| MACLEAN |     1 |    13 |    38M  (1)|127:20:09 |     4 |     4 |
|*  3 |    INDEX FULL SCAN                  | IND_OBJ |    39M|       |    38M  (1)|127:20:09 |       |       |
---------------------------------------------------------------------------------------------------------------

 

得益于好奇心,我以’%optimizer%ignore’的Like语句去查了Oracle的隐式参数表,果然有志者事竟成,最终有所斩获:

 

SQL> col name for a40
SQL> col value for a20
SQL> col describ for a60
SQL> set linesize 200 pagesize 1400
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ
  2   FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3   WHERE x.inst_id = USERENV ('Instance')
  4   AND y.inst_id = USERENV ('Instance')
  5   AND x.indx = y.indx
  6  AND x.ksppinm LIKE '%optimizer%ignore%';

NAME                                     VALUE                DESCRIB
---------------------------------------- -------------------- ---------------------
_optimizer_ignore_hints                  TRUE                 enables the embedded hints to be ignored

 

 

在metalink上查了下没有关于该”_optimizer_ignore_hints” 隐式参数的任何有用信息, 就注释来看是可以启用是否忽略SQL中嵌入的HINT提示信息, 我们来具体看以下是否其作用:

 

 

SQL> alter system flush shared_pool;

System altered.

SQL> explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
  2   1
  3    from "SYS"."MACLEAN" PARTITION("P4")
  4   where (((("OBJECT_ID" < 100010))))
  5     and rownum < 2; 

Explained. 

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
---------------------------------------------
Plan hash value: 985826631

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |         |     1 |    13 |    38M  (1)|127:20:09 |       |       |
|*  1 |  COUNT STOPKEY                      |         |       |       |            |          |       |       |
|   2 |   TABLE ACCESS BY GLOBAL INDEX ROWID| MACLEAN |     1 |    13 |    38M  (1)|127:20:09 |     4 |     4 |
|*  3 |    INDEX FULL SCAN                  | IND_OBJ |    39M|       |    38M  (1)|127:20:09 |       |       |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<2)
   3 - access("OBJECT_ID"<100010)
       filter("OBJECT_ID"<100010)

SQL> alter session set "_optimizer_ignore_hints"=true;

Session altered.

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug event 10053 trace name context forever , level 1;
Statement processed.

SQL> explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
  2   1
  3    from "SYS"."MACLEAN" PARTITION("P4")
  4   where (((("OBJECT_ID" < 100010))))
  5     and rownum < 2; Explained. SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
-----------------------------------------------
Plan hash value: 4280157877

---------------------------------------------------------------------------------------------------
| Id  | Operation               | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |         |     1 |    13 |     2   (0)| 00:00:01 |       |       |
|*  1 |  COUNT STOPKEY          |         |       |       |            |          |       |       |
|   2 |   PARTITION RANGE SINGLE|         |     1 |    13 |     2   (0)| 00:00:01 |     4 |     4 |
|*  3 |    TABLE ACCESS FULL    | MACLEAN |     1 |    13 |     2   (0)| 00:00:01 |     4 |     4 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<2)
   3 - filter("OBJECT_ID"<100010)

 

惊喜! 该 “_optimizer_ignore_hints”参数确实起到无视HINT提示的作用, 且可以直接在session级别修改,十分方便, 我们透过10053 event来观察该参数是如何其作用的:

 

***************************************
PARAMETERS USED BY THE OPTIMIZER
********************************
  *************************************
  PARAMETERS WITH ALTERED VALUES
  ******************************
  _optimizer_ignore_hints             = true

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: MACLEAN  Alias: MACLEAN  Partition [3]
    #Rows: 0  #Blks:  1  AvgRowLen:  0.00
    #Rows: 0  #Blks:  1  AvgRowLen:  0.00
Index Stats::
  Index: IND_OBJ  Col#: 5 4 8 9 6 1 10
    LVLS: 3  #LB: 380544  #DK: 50447  LB/K: 7.00  DB/K: 777.00  CLUF: 39208542.00
***************************************
SINGLE TABLE ACCESS PATH
  Column (#4): OBJECT_ID(NUMBER)  Part#: 3
    AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0
  Column (#4): OBJECT_ID(NUMBER)
    AvgLen: 22.00 NDV: 0 Nulls: 0 Density: 0.0000e+00 Min: 0 Max: 0
  Table: MACLEAN  Alias: MACLEAN
    Card: Original: 0  Rounded: 1  Computed: 0.00  Non Adjusted: 0.00
  Access Path: TableScan
    Cost:  2.00  Resp: 2.00  Degree: 0
      Cost_io: 2.00  Cost_cpu: 7121
      Resp_io: 2.00  Resp_cpu: 7121
kkofmx: index filter:"MACLEAN"."OBJECT_ID"<100010 AND ROWNUM<2
  Access Path: index (skip-scan)
    SS sel: 0.0000e+00  ANDV (#skips): 4073
    SS io: 32584.00 vs. table scan io: 2.00
    Skip Scan rejected
  Access Path: index (FullScan)
    Index: IND_OBJ
    resc_io: 380547.00  resc_cpu: 10551751028
    ix_sel: 1  ix_sel_with_filters: 1
    Cost: 382007.38  Resp: 382007.38  Degree: 1
  Best:: AccessPath: TableScan
***************************************
OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
***********************
Join order[1]:  MACLEAN[MACLEAN]#0
***********************
Best so far: Table#: 0  cost: 2.0008  card: 0.0000  bytes: 13
(newjo-stop-1) k:0, spcnt:0, perm:1, maxperm:80000
*********************************
Number of join permutations tried: 1
*********************************
Final - All Rows Plan:  Best join order: 1
  Cost: 2.0008  Degree: 1  Card: 1.0000  Bytes: 13
  Resc: 2.0008  Resc_io: 2.0000  Resc_cpu: 7121
  Resp: 2.0008  Resp_io: 2.0000  Resc_cpu: 7121

Current SQL statement for this session:
explain plan for select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */
 1
  from "SYS"."MACLEAN" PARTITION("P4")
 where (((("OBJECT_ID" < 100010))))
   and rownum < 2

============
Plan Table
============
-------------------------------------------+-----------------------------------+---------------+
| Id  | Operation                | Name    | Rows  | Bytes | Cost  | Time      | Pstart| Pstop |
-------------------------------------------+-----------------------------------+---------------+
| 0   | SELECT STATEMENT         |         |       |       |     2 |           |       |       |
| 1   |  COUNT STOPKEY           |         |       |       |       |           |       |       |
| 2   |   PARTITION RANGE SINGLE |         |     1 |    13 |     2 |  00:00:01 | 4     | 4     |
| 3   |    TABLE ACCESS FULL     | MACLEAN |     1 |    13 |     2 |  00:00:01 | 4     | 4     |
-------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
1 - filter(ROWNUM<2)
3 - filter("OBJECT_ID"<100010)

Content of other_xml column
===========================
  db_version     : 10.2.0.1
  parse_schema   : SYS
  plan_hash      : 4280157877
  Outline Data:
  /*+
    BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.1')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "MACLEAN"@"SEL$1")
    END_OUTLINE_DATA
  */

Dumping Hints
=============
  atom_hint=(@=0x988453c0 err=20 resol=1 used=0 token=453 org=1 lvl=2 txt=FIRST_ROWS )
  atom_hint=(@=0x988451f8 err=0 resol=1 used=1 token=177 org=1 lvl=3 txt=NO_PARALLEL ("MACLEAN") )
********** WARNING: SOME HINTS HAVE ERRORS *********

 

实际 “_optimizer_ignore_hints”参数是起到的作用是使SQL 隐式地回归到默认的optimizer_mode=ALL_ROWS上来,  我们终于战胜了嵌入在SQL语句中的”First Rows” Hint 。

 

总结

在针对10g 多个版本上的Split Partition 可能因 其DDL所附带的递归SQL 使用了 固定的 而又不恰当的”First rows Hint” 提示而造成语句执行缓慢的问题, 我们可以通过 以下3个方案解决:

A.  通过每次执行Split之前都收集指定分区的统计,并设置”_optimizer_ignore_hints” =true 来规避 分区统计信息失效和 不恰当的”First Rows hint” 所可能带来的问题, 方法如下:

exec dbms_stats.gather_table_stats('&OWNER','&TABNAME',partname=>'&PARNAME');

alter session set "_optimizer_ignore_hints"=true;

推荐使用方案A

补充:

有同学反映:

”隐藏参数_optimizer_ignore_hints在分区表的split操作中并未启到作用。

从我的测试结果来看,虽然导致split操作慢的根源是FIRST_ROWS优化器模式下的分区表select语句:

select /*+ FIRST_ROWS PARALLEL(“MACLEAN”, 1) */ 1
from
 MACLEAN PARTITION (“P4”)  where ( (  (  (
  “OBJECT_ID” < ‘1000010’ )  )  ) ) and rownum < 2;

不过以上的sql语句在_optimizer_ignore_hints参数调整后仍使用FIRST_ROWS的优化器模式,即没有生效。

我觉得这可能与oracle在执行自己内部命令时,会忽略该参数的设置。(直接执行select的sql语句确实会忽略hint)”

 

实际通过10053 事件追踪该SPLIT PARTITION DDL语句所生产的递归SQL语句,发现带有”FIRST_ROWS”提示的SELECT语句甚至没有解析的过程, 很有可能是该递归SQL语句直接使用了内部硬编码的存储大纲OUTLINES所导致的。

 

换句话说之前_optimizer_ignore_hints的隐式参数针对我们手动执行的带有FIRST_ROWS HINT的SELECT语句是有效的,而对于DDL所附带的递归SQL无效。

 

注:实际在对嵌入了HINT的非recursive SQL语句做调优时,若我们无法修改该SQL的HINT,则还是可以利用到”_optimizer_ignore_hints”的。

 

想了一下可以通过设置较旧的优化器特性了解决该问题(alter session set optimizer_features_enable=’8.0.0′;),该optimizer_features_enable参数同样可以在session级别设置, 如:

 

TEST A:

SQL> set timing on;

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug event 10046 trace name context forever,level 12;
Statement processed.

SQL> oradebug tracefile_name;
/s01/admin/G10R21/udump/g10r21_ora_13646.trc

SQL> alter table maclean split partition p4 at (10000081) into (partition p14, partition p4);

Table altered.

Elapsed: 00:00:42.50

TEST B:

SQL> set timing on;

SQL> oradebug setmypid;
Statement processed.

SQL> oradebug tracefile_name;
/s01/admin/G10R21/udump/g10r21_ora_13656.trc

SQL> alter session set optimizer_features_enable='8.0.0';

Session altered.

Elapsed: 00:00:00.01

SQL> alter table maclean split partition p4 at (10000091) into (partition p15, partition p4);

Table altered.

Elapsed: 00:00:00.05

PARSING IN CURSOR #2 len=152 dep=1 uid=0 oct=3 lid=0 tim=1291531645417871 hv=2124209225 ad='a74b41f0'

select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1 from "SYS"."MACLEAN" PARTITION ("P4") 
where ( (  (  ( "OBJECT_ID" < 10000091 )  )  ) ) and rownum < 2
END OF STMT
PARSE #2:c=1000,e=684,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=2,tim=1291531645417864
BINDS #2:
EXEC #2:c=0,e=71,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645417984
FETCH #2:c=0,e=92,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645418094
STAT #2 id=1 cnt=0 pid=0 pos=1 obj=0 op='COUNT STOPKEY (cr=3 pr=0 pw=0 time=122 us)'
STAT #2 id=2 cnt=0 pid=1 pos=1 obj=0 op='PARTITION RANGE SINGLE PARTITION: 13 13 (cr=3 pr=0 pw=0 time=95 us)'
STAT #2 id=3 cnt=0 pid=2 pos=1 obj=52809 op='TABLE ACCESS FULL MACLEAN PARTITION: 13 13 (cr=3 pr=0 pw=0 time=59 us)'
=====================

PARSING IN CURSOR #2 len=177 dep=1 uid=0 oct=3 lid=0 tim=1291531645418799 hv=339345368 ad='a74cedd8'
select /*+ FIRST_ROWS PARALLEL("MACLEAN", 1) */ 1 from "SYS"."MACLEAN" PARTITION ("P4")
 where ( (  (  ( "OBJECT_ID" >= 10000091 OR "OBJECT_ID" IS NULL  )  )  ) ) and
rownum < 2
END OF STMT
PARSE #2:c=1000,e=589,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=2,tim=1291531645418792
BINDS #2:
EXEC #2:c=0,e=67,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645418908
FETCH #2:c=0,e=66,p=0,cr=3,cu=0,mis=0,r=0,dep=1,og=2,tim=1291531645418992
STAT #2 id=1 cnt=0 pid=0 pos=1 obj=0 op='COUNT STOPKEY (cr=3 pr=0 pw=0 time=92 us)'
STAT #2 id=2 cnt=0 pid=1 pos=1 obj=0 op='PARTITION RANGE SINGLE PARTITION: 13 13 (cr=3 pr=0 pw=0 time=64 us)'
STAT #2 id=3 cnt=0 pid=2 pos=1 obj=52809 op='TABLE ACCESS FULL MACLEAN PARTITION: 13 13 (cr=3 pr=0 pw=0 time=36 us)'
WAIT #1: nam='control file sequential read' ela= 31 file#=0 block#=1 blocks=1 obj#=-1 tim=1291531645419383
WAIT #1: nam='control file sequential read' ela= 12 file#=1 block#=1 blocks=1 obj#=-1 tim=1291531645419432
WAIT #1: nam='control file sequential read' ela= 23 file#=0 block#=15 blocks=1 obj#=-1 tim=1291531645419494
WAIT #1: nam='control file sequential read' ela= 10 file#=0 block#=17 blocks=1 obj#=-1 tim=1291531645419556

 

B. 如果确实遇到了该问题 ,也可以将Index FULL SCAN 所使用的全局索引drop 掉来达到强制使用FULL single table partition SCAN的目的, 实际使用中不推荐

C. 避免使用Global Partitioned index 或 Non-partitioned Index , 而采用Local partitioned index , 这似乎更难做到

D. 也可以通过将原Maxvalue的分区drop掉之后(前提是该分区真的是空的),再添加新分区的做法来绕过该问题

TEL/電話+86 13764045638
Email service@parnassusdata.com
QQ 47079569