August 3, 2010

Differences in dbms_stats gathering modes between 9i and 10g

Author: mac

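About two months ago, someone in the industry asked me why migrating from the 9i CBO to 10g so often leads to performance problems caused by changed execution plans; he was, of course, testing me. In practice, most of the environments I have worked with did not use the CBO under 8i/9i at all. When you jump from the RBO on 8i/9i to the more mature CBO on 10g, it is only to be expected that some execution plans go wrong. Moving from the 9i CBO to the 10g CBO has its share of problems too. The first thing that came to mind was a difference in statistics gathering, but I could not say exactly what the difference was. He heard my answer and smiled, saying nothing.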

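Oracle is vast, vast enough to be called a world of its own, and much of it fades from memory when it goes unused for long. Let's review some of the changes in statistics gathering between 9i and 10g.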

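In 9i, the default METHOD_OPT when gathering statistics is 'FOR ALL COLUMNS SIZE 1'. In this mode Oracle gathers only the most basic statistics on every column, including the minimum/maximum values and the number of distinct values, but no histograms. For columns whose data is evenly distributed, and for columns that never appear as conditions in a WHERE clause, these statistics are entirely sufficient. If the data in a column is not evenly distributed, however, the CBO's cost calculations may be inaccurate, and we then need to gather histograms on those columns manually.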
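As a minimal sketch of gathering a histogram by hand, the call below keeps basic statistics on every column and adds a histogram on one skewed column. The table ORDERS and the column STATUS are hypothetical names used only for illustration, and the bucket count of 254 is an assumption rather than a recommendation from this post.

```sql
-- Hypothetical table ORDERS with a skewed STATUS column (both names are assumptions).
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'ORDERS',
    -- basic statistics for every column, plus a histogram on STATUS only
    method_opt => 'FOR ALL COLUMNS SIZE 1 FOR COLUMNS SIZE 254 STATUS',
    cascade    => TRUE);  -- also gather statistics on the table's indexes
END;
/
```

Naming the skewed columns explicitly keeps the decision about where histograms exist in the DBA's hands rather than leaving it to an automatic choice.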

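10g changed the default METHOD_OPT in the dbms_stats package. This is clearly an important factor behind the execution plan changes that so readily occur after migrating from the 9i CBO to the 10g CBO, and it was the crux of the question he was asking.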

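The new default METHOD_OPT value is "FOR ALL COLUMNS SIZE AUTO", which means Oracle uses an internal algorithm to decide automatically which columns need histograms and which do not. Whether a histogram is gathered depends on the data distribution in the column and on the workload related to the table; this workload can be understood as the SQL statements in the database that need detailed information about these columns to compute execution costs.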

This sounds ideal: it seems Oracle can quietly gather all the statistics we urgently need.

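The problem is that in many environments Oracle does not make the right decision about whether a column needs a histogram. Practice shows that Oracle may gather a great many unnecessary histograms while skipping many that are needed.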
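One way to see what SIZE AUTO actually decided is to list the columns that currently carry a histogram. This is a sketch for 10g and later, where the HISTOGRAM column is available in the dictionary view; ORDERS is a hypothetical table name.

```sql
-- List the columns of one table that currently have a histogram (10g and later).
-- ORDERS is a hypothetical table name.
SELECT column_name, num_distinct, num_buckets, histogram
  FROM user_tab_col_statistics
 WHERE table_name = 'ORDERS'
   AND histogram <> 'NONE'
 ORDER BY column_name;
```

Comparing this list before and after the first automatic gathering on 10g shows which histograms are new since the migration.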

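In lightweight application environments, the effects of such misjudged histogram gathering go unnoticed most of the time; in performance-critical environments, or ones that already have a performance bottleneck, it can be a fair amount of trouble.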

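In addition, Oracle changed the way column density is calculated. Oracle often uses this value to determine predicate selectivity, so the sudden appearance of extra, unnecessary histograms can have a broad and significant performance impact (a positive impact is possible too, of course; it is just that the odds ......).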
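For a column without a histogram, the stored density is 1/NUM_DISTINCT, so columns where the two values differ are the ones whose selectivity estimates are shaped by a histogram. The query below is a rough sketch for spotting them on 10g and later; ORDERS is a hypothetical table name.

```sql
-- Compare the stored density with 1/NDV for each column of a hypothetical table.
-- Rows where the two differ are columns whose density comes from a histogram.
SELECT column_name,
       num_distinct,
       density,
       ROUND(1 / NULLIF(num_distinct, 0), 6) AS one_over_ndv,
       histogram
  FROM user_tab_col_statistics
 WHERE table_name = 'ORDERS'
 ORDER BY column_name;
```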

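Obviously these uninvited guests also affect the shared pool: the latches related to the library cache and row cache may see heavy traffic for a short time, and if your application's tables have hundreds or thousands of columns, things may be even worse. (That is why development should follow normalization; without rules the end result is often an unusable application and a failed project. Don't tell me your application is barely staying alive; that still means the project failed!)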

admin 2010-08-03

Applies to:
Oracle Server - Enterprise Edition - Version: 9.0.1.0 to 10.2.0.4
Information in this document applies to any platform.

Goal

This document outlines how to determine the default parameter settings when gathering statistics on Table on 9i and 10g.

Solution

On 9i, Gather procedures have a number of hard coded default values.
On 10g, All procedures that gather optimizer statistics no longer have hardcoded default values. The defaults can be viewed using:

select dbms_stats.get_param('cascade') from dual;
select dbms_stats.get_param('degree') from dual;
select dbms_stats.get_param('estimate_percent') from dual;
select dbms_stats.get_param('method_opt') from dual;
select dbms_stats.get_param('no_invalidate') from dual;
select dbms_stats.get_param('granularity') from dual;

Parameters can be set using DBMS_STATS.SET_PARAM.

EXAMPLES

Get Parameter Example

SQL> select dbms_stats.get_param('method_opt') from dual;

DBMS_STATS.GET_PARAM('METHOD_OPT')
----------------------------------
FOR ALL COLUMNS SIZE AUTO

Set Parameter Example

SQL> exec dbms_stats.set_param('METHOD_OPT', 'FOR ALL COLUMNS SIZE 1')

PL/SQL procedure successfully completed.

SQL> select dbms_stats.get_param('method_opt') from dual;

DBMS_STATS.GET_PARAM('METHOD_OPT')
----------------------------------
FOR ALL COLUMNS SIZE 1

The default values on 9i are hard coded and are as follows:

DBMS_STATS.GATHER_TABLE_STATS (
   ownname          VARCHAR2,
   tabname          VARCHAR2,
   partname         VARCHAR2 DEFAULT NULL,      --> ALL partitions
   estimate_percent NUMBER   DEFAULT NULL,      --> 100% sample
   block_sample     BOOLEAN  DEFAULT FALSE,
   method_opt       VARCHAR2 DEFAULT 'FOR ALL COLUMNS SIZE 1',
   degree           NUMBER   DEFAULT NULL,      --> parallel degree 1
   granularity      VARCHAR2 DEFAULT 'DEFAULT', --> level (PARTITION + GLOBAL)
   cascade          BOOLEAN  DEFAULT FALSE,     --> does not cascade to indexes by default
   no_invalidate    BOOLEAN  DEFAULT FALSE);
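To make a single gather on 10g behave like the hard coded 9i defaults listed above, without touching the instance-wide defaults, the parameters can be passed explicitly. This is only a sketch; ORDERS is a hypothetical table name.

```sql
-- Gather statistics on 10g with the 9i hard coded defaults spelled out explicitly.
-- ORDERS is a hypothetical table name.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,
    tabname          => 'ORDERS',
    estimate_percent => NULL,                      -- compute (100% sample)
    method_opt       => 'FOR ALL COLUMNS SIZE 1',  -- no histograms
    granularity      => 'DEFAULT',                 -- partition + global
    cascade          => FALSE,                     -- indexes are not gathered
    no_invalidate    => FALSE);                    -- invalidate dependent cursors
END;
/
```

Passing every parameter explicitly means the call does not depend on whatever values SET_PARAM has left in place.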

admin 2010-08-03

Show Optimizer Statistics for CBO

set echo off
set scan on
set lines 150
set pages 66
set verify off
set feedback off
set termout off
column uservar new_value Table_Owner noprint
select user uservar from dual;
set termout on
column TABLE_NAME heading "Tables owned by &Table_Owner" format a30
select table_name from dba_tables where owner=upper('&Table_Owner') order by 1
/
undefine table_name
undefine owner
prompt
accept owner prompt 'Please enter Name of Table Owner (Null = &Table_Owner): '
accept table_name prompt 'Please enter Table Name to show Statistics for: '
column TABLE_NAME heading "Table|Name" format a15
column PARTITION_NAME heading "Partition|Name" format a15
column SUBPARTITION_NAME heading "SubPartition|Name" format a15
column NUM_ROWS heading "Number|of Rows" format 9,999,999,999,990
column BLOCKS heading "Blocks" format 999,990
column EMPTY_BLOCKS heading "Empty|Blocks" format 999,999,990
column AVG_SPACE heading "Average|Space" format 9,990
column CHAIN_CNT heading "Chain|Count" format 999,990
column AVG_ROW_LEN heading "Average|Row Len" format 990
column COLUMN_NAME heading "Column|Name" format a25
column NULLABLE heading Null|able format a4
column NUM_DISTINCT heading "Distinct|Values" format 999,999,990
column NUM_NULLS heading "Number|Nulls" format 9,999,990
column NUM_BUCKETS heading "Number|Buckets" format 990
column DENSITY heading "Density" format 990
column INDEX_NAME heading "Index|Name" format a15
column UNIQUENESS heading "Unique" format a9
column BLEV heading "B|Tree|Level" format 90
column LEAF_BLOCKS heading "Leaf|Blks" format 990
column DISTINCT_KEYS heading "Distinct|Keys" format 9,999,999,990
column AVG_LEAF_BLOCKS_PER_KEY heading "Average|Leaf Blocks|Per Key" format 99,990
column AVG_DATA_BLOCKS_PER_KEY heading "Average|Data Blocks|Per Key" format 99,990
column CLUSTERING_FACTOR heading "Cluster|Factor" format 999,999,990
column COLUMN_POSITION heading "Col|Pos" format 990
column col heading "Column|Details" format a24
column COLUMN_LENGTH heading "Col|Len" format 9,990
column GLOBAL_STATS heading "Global|Stats" format a6
column USER_STATS heading "User|Stats" format a6
column SAMPLE_SIZE heading "Sample|Size" format 9,999,999,999,990
column to_char(t.last_analyzed,'MM-DD-YYYY') heading "Date|MM-DD-YYYY" format a10

prompt
prompt ***********
prompt Table Level
prompt ***********
prompt
select TABLE_NAME,
       NUM_ROWS,
       BLOCKS,
       EMPTY_BLOCKS,
       AVG_SPACE,
       CHAIN_CNT,
       AVG_ROW_LEN,
       GLOBAL_STATS,
       USER_STATS,
       SAMPLE_SIZE,
       to_char(t.last_analyzed,'MM-DD-YYYY')
from dba_tables t
where owner = upper(nvl('&&Owner',user))
and table_name = upper('&&Table_name')
/
select COLUMN_NAME,
       decode(t.DATA_TYPE,
              'NUMBER',t.DATA_TYPE||'('||
              decode(t.DATA_PRECISION,
                     null,t.DATA_LENGTH||')',
                     t.DATA_PRECISION||','||t.DATA_SCALE||')'),
              'DATE',t.DATA_TYPE,
              'LONG',t.DATA_TYPE,
              'LONG RAW',t.DATA_TYPE,
              'ROWID',t.DATA_TYPE,
              'MLSLABEL',t.DATA_TYPE,
              t.DATA_TYPE||'('||t.DATA_LENGTH||')') ||' '||
       decode(t.nullable,
              'N','NOT NULL',
              'n','NOT NULL',
              NULL) col,
       NUM_DISTINCT,
       DENSITY,
       NUM_BUCKETS,
       NUM_NULLS,
       GLOBAL_STATS,
       USER_STATS,
       SAMPLE_SIZE,
       to_char(t.last_analyzed,'MM-DD-YYYY')
from dba_tab_columns t
where table_name = upper('&Table_name')
and owner = upper(nvl('&Owner',user))
/
select INDEX_NAME,
       UNIQUENESS,
       BLEVEL BLev,
       LEAF_BLOCKS,
       DISTINCT_KEYS,
       NUM_ROWS,
       AVG_LEAF_BLOCKS_PER_KEY,
       AVG_DATA_BLOCKS_PER_KEY,
       CLUSTERING_FACTOR,
       GLOBAL_STATS,
       USER_STATS,
       SAMPLE_SIZE,
       to_char(t.last_analyzed,'MM-DD-YYYY')
from dba_indexes t
where table_name = upper('&Table_name')
and table_owner = upper(nvl('&Owner',user))
/
break on index_name
select i.INDEX_NAME,
       i.COLUMN_NAME,
       i.COLUMN_POSITION,
       decode(t.DATA_TYPE,
              'NUMBER',t.DATA_TYPE||'('||
              decode(t.DATA_PRECISION,
                     null,t.DATA_LENGTH||')',
                     t.DATA_PRECISION||','||t.DATA_SCALE||')'),
              'DATE',t.DATA_TYPE,
              'LONG',t.DATA_TYPE,
              'LONG RAW',t.DATA_TYPE,
              'ROWID',t.DATA_TYPE,
              'MLSLABEL',t.DATA_TYPE,
              t.DATA_TYPE||'('||t.DATA_LENGTH||')') ||' '||
       decode(t.nullable,
              'N','NOT NULL',
              'n','NOT NULL',
              NULL) col
from dba_ind_columns i,
     dba_tab_columns t
where i.table_name = upper('&Table_name')
and owner = upper(nvl('&Owner',user))
and i.table_name = t.table_name
and i.column_name = t.column_name
order by index_name,column_position
/

prompt
prompt ***************
prompt Partition Level
prompt ***************
select PARTITION_NAME,
       NUM_ROWS,
       BLOCKS,
       EMPTY_BLOCKS,
       AVG_SPACE,
       CHAIN_CNT,
       AVG_ROW_LEN,
       GLOBAL_STATS,
       USER_STATS,
       SAMPLE_SIZE,
       to_char(t.last_analyzed,'MM-DD-YYYY')
from dba_tab_partitions t
where table_owner = upper(nvl('&&Owner',user))
and table_name = upper('&&Table_name')
order by partition_position
/
break on partition_name
select PARTITION_NAME,
       COLUMN_NAME,
       NUM_DISTINCT,
       DENSITY,
       NUM_BUCKETS,
       NUM_NULLS,
       GLOBAL_STATS,
       USER_STATS,
       SAMPLE_SIZE,
       to_char(t.last_analyzed,'MM-DD-YYYY')
from dba_PART_COL_STATISTICS t
where table_name = upper('&Table_name')
and owner = upper(nvl('&Owner',user))
/
break on partition_name
select t.INDEX_NAME,
       t.PARTITION_NAME,
       t.BLEVEL BLev,
       t.LEAF_BLOCKS,
       t.DISTINCT_KEYS,
       t.NUM_ROWS,
       t.AVG_LEAF_BLOCKS_PER_KEY,
       t.AVG_DATA_BLOCKS_PER_KEY,
       t.CLUSTERING_FACTOR,
       t.GLOBAL_STATS,
       t.USER_STATS,
       t.SAMPLE_SIZE,
       to_char(t.last_analyzed,'MM-DD-YYYY')
from dba_ind_partitions t,
     dba_indexes i
where i.table_name = upper('&Table_name')
and i.table_owner = upper(nvl('&Owner',user))
and i.owner = t.index_owner
and i.index_name=t.index_name
/

prompt
prompt ***************
prompt SubPartition Level
prompt ***************
select PARTITION_NAME,
       SUBPARTITION_NAME,
       NUM_ROWS,
       BLOCKS,
       EMPTY_BLOCKS,
       AVG_SPACE,
       CHAIN_CNT,
       AVG_ROW_LEN,
       GLOBAL_STATS,
       USER_STATS,
       SAMPLE_SIZE,
       to_char(t.last_analyzed,'MM-DD-YYYY')
from dba_tab_subpartitions t
where table_owner = upper(nvl('&&Owner',user))
and table_name = upper('&&Table_name')
order by SUBPARTITION_POSITION
/
break on partition_name
select p.PARTITION_NAME,
       t.SUBPARTITION_NAME,
       t.COLUMN_NAME,
       t.NUM_DISTINCT,
       t.DENSITY,
       t.NUM_BUCKETS,
       t.NUM_NULLS,
       t.GLOBAL_STATS,
       t.USER_STATS,
       t.SAMPLE_SIZE,
       to_char(t.last_analyzed,'MM-DD-YYYY')
from dba_SUBPART_COL_STATISTICS t,
     dba_tab_subpartitions p
where t.table_name = upper('&Table_name')
and t.owner = upper(nvl('&Owner',user))
and t.subpartition_name = p.subpartition_name
and t.owner = p.table_owner
and t.table_name=p.table_name
/
break on partition_name
select t.INDEX_NAME,
       t.PARTITION_NAME,
       t.SUBPARTITION_NAME,
       t.BLEVEL BLev,
       t.LEAF_BLOCKS,
       t.DISTINCT_KEYS,
       t.NUM_ROWS,
       t.AVG_LEAF_BLOCKS_PER_KEY,
       t.AVG_DATA_BLOCKS_PER_KEY,
       t.CLUSTERING_FACTOR,
       t.GLOBAL_STATS,
       t.USER_STATS,
       t.SAMPLE_SIZE,
       to_char(t.last_analyzed,'MM-DD-YYYY')
from dba_ind_subpartitions t,
     dba_indexes i
where i.table_name = upper('&Table_name')
and i.table_owner = upper(nvl('&Owner',user))
and i.owner = t.index_owner
and i.index_name=t.index_name
/
clear breaks
set echo on

antiper 2010-08-04

This blog is pretty good! I learned something!

maclean 2010-08-11

Managing CBO Stats during an upgrade to 10g or 11g

Applies to:
Oracle Server - Enterprise Edition - Version: 10.1.0.2 to 11.1.0.6
Information in this document applies to any platform.
This note applies to Oracle Applications, Siebel, PSFT as well as databases with custom applications.

Goal

Cost-based Optimizer (CBO) uses statistics when generating explain plans. CBO statistics can be classified into four types: schema objects, data dictionary, fixed objects, and system performance. CBO statistics for schema objects have been used for several releases now. CBO statistics for data dictionary were introduced on 9i and considered optional. CBO system statistics were also introduced on 9i but rarely implemented. CBO statistics for fixed objects were introduced on 10g.

This note provides guidance on managing existing and new CBO stats during an upgrade from 9i to 10g or 11g; or from 10g to 11g. The terms pre-upgrade release and post-upgrade release are used accordingly. Pre-upgrade refers then to 9i or 10g, and post-upgrade to 10g or 11g.

The core idea presented by this note is to continue gathering CBO statistics for application schemas as usual, but create a clean baseline for non-application objects (data dictionary, fixed objects and system performance).

Strategy:

1. For application schema objects, use the specific procedure required by the application vendor if they provide one. For example, if using Oracle eBiz, continue using FND_STATS. If there is no procedure provided by the application vendor, or this is a home-grown application or customization, use initially the same defaults or settings you were using in your pre-upgrade release. Be aware that some of the defaults in the DBMS_STATS package have changed from 9i to 10g/11g, so you may need to use the SET_PARAM api to alter them back to the pre-upgrade release levels if your upgrade is from 9i.
2. For data dictionary objects, gather full statistics once, without histograms. Do not re-gather until a new major upgrade requires so, or the workload changes. Re-gather if you make massive schema changes to the environment (e.g., add 100's of new database users or drop and create a large number of objects, plug-in new tablespaces into the database, etc).
3. For fixed objects, gather once during normal system load. Do not re-gather until a new major upgrade requires so, or the workload changes.
4. For system performance stats, gather once with normal system load. Repeat only if system configuration or load changes significantly.

Solution

Before upgrading from 9i to 10g/11g; or from 10g to 11g:

1. Continue gathering CBO statistics as per your current procedures.
2. Make a full backup of your CBO statistics. Use scripts coe_create_user_coecbostats.sql and coe_backup_cbo_stats.sql connected as SYS.
3. If the pre-upgrade instance will be destroyed soon after the upgrade, make an export of schema owner COECBOSTATS using the EXP utility.

After upgrading from 9i to 10g/11g; or from 10g to 11g:

1. If your application provides its own procedure to gather CBO statistics, discontinue immediately the execution of the job that performs an automatic gathering of CBO statistics:
   * On 10g, connect as SYS and EXEC DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB');
   * On 11g, connect as SYS and EXEC DBMS_AUTO_TASK_ADMIN.DISABLE('auto optimizer stats collection', NULL, NULL);
2. Create a baseline of Data Dictionary CBO statistics by using script coe_gather_dictionary_stats.sql connected as SYS.
3. Create during normal system load a baseline of Fixed Objects CBO statistics by using script coe_gather_fixed_objects_stats.sql connected as SYS.
4. Create a baseline of System statistics for the CBO by using scripts coe_gather_system_stats_nw.sql, coe_gather_system_stats_start.sql and coe_gather_system_stats_stop.sql connected as SYS.
   4.1 Execute coe_gather_system_stats_nw.sql once in order to generate system statistics that are independent of the system workload (no workload "nw").
   4.2 During normal system load, execute first coe_gather_system_stats_start.sql, then wait for at least two hours and execute second coe_gather_system_stats_stop.sql. This set will generate system statistics that depend on the system workload during the start and stop times. If your workload or hardware configuration changes over time, you will have to execute this set in the same manner after the system load/configuration has been implemented and during normal system utilization.

Scripts: (attached to this note)

These scripts are provided as a mechanism to perform the actions described in this note. They include a backup of the CBO statistics to be refreshed by the script.

1. coe_create_user_coecbostats.sql creates a new schema owner COECBOSTATS with one object (COE$_STATTAB). Table COE$_STATTAB is a repository to store persistent versions of CBO statistics for any of the four types: schema objects, data dictionary, fixed objects, and system performance. Execute connected as SYS.
2. coe_backup_cbo_stats.sql can be used on 9i, 10g or 11g. It creates a backup of all four types of CBO statistics when executed on 10g/11g, and all three valid types on 9i (skipping fixed objects). Execute connected as SYS.
3. coe_gather_dictionary_stats.sql generates corresponding CBO statistics using an estimated percentage of 100% (compute), no histograms, and cascades into all related indexes (gathers statistics on indexes). It makes a backup of these statistics before and after gathering. Execute connected as SYS. It can be used on 9i, 10g or 11g.
4. coe_gather_fixed_objects_stats.sql generates corresponding CBO statistics. It makes a backup of these statistics before and after gathering. Execute connected as SYS. It can be used on 10g or 11g.
5. coe_gather_system_stats_nw.sql gathers system statistics that are independent of the workload. Execute connected as SYS. It can be used on 10g or 11g.
6. coe_gather_system_stats_start.sql is used in combination with coe_gather_system_stats_stop.sql. They start and stop the gathering of system statistics that depend on the system workload and configuration. They must be executed during a normal system utilization window. Execute connected as SYS. It can be used on 10g or 11g.
7. coe_gather_system_stats_stop.sql is used in combination with coe_gather_system_stats_start.sql. They start and stop the gathering of system statistics that depend on the system workload and configuration. They must be executed during a normal system utilization window. Execute connected as SYS. It can be used on 10g or 11g.

Words of Caution:

1. If your application does not provide specific instructions regarding CBO statistics gathering and you were using default functionality for DBMS_STATS on your pre-upgrade release (9i/10g), you may want to take the conservative approach of preserving the "pre-upgrade release" functionality initially, and gradually incorporate the new defaults for DBMS_STATS on the post-upgrade release (10g/11g). This is especially important from 9i to 10g or 11g.
2. ESTIMATE_PERCENT had a default of 100% for sample size on 9i, while 10g defaults this parameter to DBMS_STATS.AUTO_SAMPLE_SIZE, which derives a very small estimate percentage (sample size). Small sample sizes are known to produce a poor number of distinct values (NDV) on columns with skewed data (which are common), and thus generate sub-optimal plans. Use then an estimate sample size of 100% on 10g if your maintenance window can afford it, even if that means gathering statistics less often. If 100% were not feasible, try using at least an estimate of 30%. On 11g the default value of DBMS_STATS.AUTO_SAMPLE_SIZE gathers stats with a large sample size, so using the default value is a better approach.
3. METHOD_OPT has a default of "FOR ALL COLUMNS SIZE 1" on 9i, which basically meant NO HISTOGRAMS. 10g and 11g default to AUTO, which means DBMS_STATS decides in which columns a histogram may help to produce a better plan. It is known that in some cases, the effect of a histogram is adverse to the generation of a better plan. Again, you may want to initially set this parameter to its pre-upgrade release used value, and later adjust to your post-upgrade release default value.
4. In summary, to avoid causing changes to the execution plan, try to keep the statistics gathering process as close as possible to your pre-upgrade release, at least until after the upgrade is complete and stable, then adjust gradually to the features provided by the new release.
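The coe_*.sql scripts themselves are not reproduced in this comment. As a rough sketch of the kind of DBMS_STATS calls such a baseline relies on (an assumption about the approach, not the contents of those scripts), the non-application statistics could be gathered on 10g or 11g like this, connected as SYS:

```sql
-- Sketch only: NOT the contents of the coe_*.sql scripts. Run connected as SYS.

-- Data dictionary: full compute, no histograms, cascade to indexes.
BEGIN
  DBMS_STATS.GATHER_DICTIONARY_STATS(
    estimate_percent => 100,
    method_opt       => 'FOR ALL COLUMNS SIZE 1',
    cascade          => TRUE);
END;
/

-- Fixed objects (X$ tables): gather once during normal system load.
EXEC DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;

-- System statistics that are independent of the workload.
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('NOWORKLOAD');

-- Workload system statistics: start, let normal load run, then stop.
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('START');
-- ... wait at least two hours under normal load before running the next line ...
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('STOP');
```

Unlike the scripts described in the note, this sketch does not back up the existing statistics first, so an export to a statistics table beforehand is advisable.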

maclean 2010-08-11

SIZE Clause in METHOD_OPT Parameter of DBMS_STATS Package

Applies to:
Oracle Server - Enterprise Edition - Version: 9.0.1.0 to 11.2.0.2
Information in this document applies to any platform.

Purpose

This note clarifies use of the SIZE clause in the METHOD_OPT parameter of the DBMS_STATS package and its default value.

Scope and Application

For DBAs of all levels.

SIZE Clause in METHOD_OPT Parameter of DBMS_STATS Package

The SIZE clause is an optional clause in the METHOD_OPT parameter, which itself is an optional parameter in the DBMS_STATS package's procedures GATHER_DATABASE_STATS, GATHER_DICTIONARY_STATS, GATHER_SCHEMA_STATS, GATHER_TABLE_STATS. This parameter determines whether to collect histograms on the columns of the tables, and if so, on which columns and how.

Description of the METHOD_OPT parameter in the Oracle Documentation 'Database PL/SQL Packages and Types Reference' (Chapter DBMS_STATS) is as follows:

METHOD_OPT accepts:
* FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
* FOR COLUMNS [size clause] column|attribute [size_clause] [,column|attribute [size_clause]...]

size_clause is defined as size_clause := SIZE {integer | REPEAT | AUTO | SKEWONLY}
- integer : Number of histogram buckets. Must be in the range [1,254].
- REPEAT : Collects histograms only on the columns that already have histograms.
- AUTO : Oracle determines the columns to collect histograms based on data distribution and the workload of the columns.
- SKEWONLY : Oracle determines the columns to collect histograms based on the data distribution of the columns.

The [size_clause] in the METHOD_OPT parameter determines the number of histogram buckets for a column, which must be in the range [1,254]. 'SIZE 1' means no histograms.

In Oracle 9i METHOD_OPT, if not specified, defaults to 'FOR ALL COLUMNS SIZE 1', which means no histograms are collected for all columns in the table.

In Oracle 10g and 11g the default value of METHOD_OPT was changed to 'FOR ALL COLUMNS SIZE AUTO', which means Oracle automatically determines on which columns to collect histograms and how. This often leads to many more columns with histograms than there would have been in Oracle 9i. That in turn may negatively affect execution plans of some SQLs after upgrade from 9i. See Note 465787.1 for more details.

If you do specify METHOD_OPT but do not specify the [size_clause], then in all Oracle versions 9i, 10g, 11g [size_clause] defaults to 'SIZE 75'. In cases where the number of distinct values (NDV) for a column is less than 75, the number of histogram buckets for this column becomes = [text lost in the source]

The current default value of METHOD_OPT can be checked with:

select DBMS_STATS.GET_PARAM('METHOD_OPT') from dual;

In Oracle 11g you can also use the new function GET_PREFS for this purpose.

The default value of the METHOD_OPT parameter in Oracle 10g and 11g can be changed using the SET_PARAM procedure. For example:

EXEC DBMS_STATS.SET_PARAM ('method_opt','for all columns size 1');

In Oracle 11g you can also use the new DBMS_STATS package procedures SET_GLOBAL_PREFS, SET_DATABASE_PREFS, SET_SCHEMA_PREFS, SET_TABLE_PREFS, which add more flexibility when setting default values at different database levels.
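As a small illustration of the 11g preference procedures mentioned above, the sketch below pins METHOD_OPT for a single table and reads the effective value back. ORDERS is a hypothetical table name; the procedures require 11g or later.

```sql
-- 11g and later: set METHOD_OPT for one table only, leaving the global default alone.
-- ORDERS is a hypothetical table name.
BEGIN
  DBMS_STATS.SET_TABLE_PREFS(
    ownname => USER,
    tabname => 'ORDERS',
    pname   => 'METHOD_OPT',
    pvalue  => 'FOR ALL COLUMNS SIZE 1');
END;
/

-- Read back the value that gather procedures will use for this table.
SELECT DBMS_STATS.GET_PREFS('METHOD_OPT', USER, 'ORDERS') AS method_opt
  FROM dual;
```

A table-level preference lets the problem tables keep their 9i behaviour while the rest of the database uses the new default.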

maclean 2010-08-11

How to Change Default Parameters for Gathering Statistics

Applies to:
Oracle Server - Enterprise Edition - Version: 10.1.0.2 to 11.1.0.7
Information in this document applies to any platform.

Goal

This document outlines how to change the default parameters in use for gathering optimizer statistics in Oracle 10g and Oracle 11g and provides an outline of the values possible for these parameters. This is useful when using automatic statistics gathering within a maintenance window, as these default parameter settings define how these statistics will be collected.

Solution

Syntax for Changing Default Parameters for Gathering Statistics

In 10g and 11g the default values for parameters used to gather statistics may be changed by using the DBMS_STATS.SET_PARAM procedure. The procedure needs to be run individually for each parameter that one wishes to change. The syntax for this is as follows:

DBMS_STATS.SET_PARAM (
   pname IN VARCHAR2,
   pval  IN VARCHAR2);

Pname is the parameter name.
Pval is the value to set for the associated 'Pname' parameter.

Which Parameters can be Set?

The DBMS_STATS.SET_PARAM can be used to set the following parameters:

CASCADE
Controls whether indexes are analyzed at the same time
Default: TRUE
Possible Values:
* TRUE
* FALSE
Note: The default value for CASCADE set by SET_PARAM is not used by export/import procedures. It is used only by gather procedures.
Example: exec DBMS_STATS.SET_PARAM('CASCADE','FALSE');

DEGREE
Degree of parallelism.
Default: NULL
Possible Values:
* NULL - Use the table default value specified by the DEGREE clause in the CREATE TABLE or ALTER TABLE statement.
* integer - the integer will be used as degree for all objects
Example: exec DBMS_STATS.SET_PARAM('DEGREE','5');

ESTIMATE_PERCENT
Percentage of rows to estimate
Default: DBMS_STATS.AUTO_SAMPLE_SIZE
Possible Values:
* Valid range is [0.000001,100]
* NULL - compute will be used (100%)
* DBMS_STATS.AUTO_SAMPLE_SIZE - sample sizes may vary for different versions, for example, this tends to default to a smaller sample size in 10g than in 11g.
Example: exec DBMS_STATS.SET_PARAM('ESTIMATE_PERCENT','NULL');
Note: When NULL is unquoted, this sets the parameter to the value Oracle recommends. In the case of the quoted 'NULL', this sets the value of the parameter itself to NULL, so that the above example indicates that estimate_percent=null (i.e. compute), as opposed to NULL without quotes, which would imply using the default for this parameter for the specific Oracle version.

METHOD_OPT
Used to gather column statistics
Default: FOR ALL COLUMNS SIZE AUTO.
Possible Values:
* FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
* FOR COLUMNS [size clause] column|attribute [size_clause] [,column|attribute [size_clause]...]
size_clause is defined as size_clause := SIZE {integer | REPEAT | AUTO | SKEWONLY}
o integer : Number of histogram buckets. Must be in the range [1,254].
o REPEAT : Collects histograms only on the columns that already have histograms.
o AUTO : Oracle determines the columns to collect histograms based on data distribution and the workload of the columns.
o SKEWONLY : Oracle determines the columns to collect histograms based on the data distribution of the columns.
Example: exec DBMS_STATS.SET_PARAM('METHOD_OPT', 'FOR ALL COLUMNS SIZE 1');

NO_INVALIDATE
Determines whether to invalidate dependent cursors or not
Default: DBMS_STATS.AUTO_INVALIDATE
Possible Values:
* DBMS_STATS.AUTO_INVALIDATE - Oracle decides when to invalidate dependent cursors.
* TRUE - Does not invalidate the dependent cursors
* FALSE - Invalidates dependent cursors
Example: exec DBMS_STATS.SET_PARAM('NO_INVALIDATE','FALSE');

GRANULARITY
Determines granularity of statistics to collect (only pertinent if the table is partitioned).
Default: 'AUTO'
Possible Values:
* 'AUTO' - determines the granularity based on the partitioning type
* 'ALL' - gathers all (subpartition, partition, and global) statistics
* 'GLOBAL' - gathers global statistics
* 'GLOBAL AND PARTITION' - gathers the global and partition level statistics. No subpartition level statistics are gathered even if it is a composite partitioned object.
* 'PARTITION' - gathers partition-level statistics
* 'SUBPARTITION' - gathers subpartition-level statistics.
Note: 'DEFAULT' is obsolete. This option gathers global and partition-level statistics. It is currently supported, but included in the documentation for legacy reasons only. Use 'GLOBAL AND PARTITION' for this functionality.
Example: exec DBMS_STATS.SET_PARAM('GRANULARITY','GLOBAL AND PARTITION');

AUTOSTATS_TARGET
This parameter is applicable only for auto statistics collection. The value of this parameter controls the objects considered for statistics collection
Default: 'AUTO'
Possible Values:
* 'AUTO' - Oracle decides for which objects to collect statistics
* 'ALL' - Statistics are collected for all objects in the system
* 'ORACLE' - Statistics are collected for all Oracle owned objects. This option restricts the list of schemas for which the automatic stats gathering job will gather statistics to a list of Oracle component system schemas, e.g. SYS, SYSMAN, WMSYS and EXFSYS in a sample database

Usage Notes

* To run this procedure, the user must have the SYSDBA or both the ANALYZE ANY DICTIONARY and ANALYZE ANY system privileges.
* Note that both arguments are of type VARCHAR2 and the values need to be enclosed in quotes even when they represent numbers.
* Note also the difference between NULL and 'NULL':
  o When NULL is unquoted, this sets the parameter to the value Oracle recommends.
  o In the case of the quoted 'NULL', this sets the value of the parameter to NULL.

How to Check Present Values for a Parameter

In order to check the present value for a certain parameter do:

select dbms_stats.get_param(pname) from dual;
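Because SET_PARAM changes the default for every later gather, including the automatic job, a temporary change is worth undoing explicitly. The block below is a minimal sketch of that pattern using only GET_PARAM and SET_PARAM; gathering the current user's schema is an arbitrary choice for illustration.

```sql
-- Temporarily switch the METHOD_OPT default, gather, then put the old value back.
-- Requires the privileges listed in the usage notes above.
DECLARE
  l_saved VARCHAR2(4000);
BEGIN
  l_saved := DBMS_STATS.GET_PARAM('METHOD_OPT');           -- remember current default
  DBMS_STATS.SET_PARAM('METHOD_OPT', 'FOR ALL COLUMNS SIZE 1');

  DBMS_STATS.GATHER_SCHEMA_STATS(ownname => USER);          -- picks up the new default

  DBMS_STATS.SET_PARAM('METHOD_OPT', l_saved);              -- restore
EXCEPTION
  WHEN OTHERS THEN
    DBMS_STATS.SET_PARAM('METHOD_OPT', l_saved);            -- restore on failure too
    RAISE;
END;
/
```

While the block runs, other sessions that gather statistics without an explicit METHOD_OPT also see the temporary default, so passing the parameter directly to the gather call is the less intrusive option when only one job needs it.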

admin 2010-08-11

Statistics Gathering: Frequency and Strategy Guidelines

Purpose
-------
Provide guidelines for Gathering CBO statistics.

Audience
--------
DBAs

Recommendations for Gathering CBO Statistics
--------------------------------------------

Summary
-------
o Use individual statistic gathering commands for more control
o Gather statistics on tables with a 5% sample
o Gather statistics on indexes with compute
o Add histograms where column data is known to be skewed

Explanation of summary:
-----------------------
The level to which object statistics should be collected is very much data dependent. The goal is to read as little data as possible to achieve an accurate sample. Different sample sizes may be required to generate accurate enough figures to produce acceptable plans.

Research has indicated that a 5% sample is generally sufficient for most tables. Gathering statistics on tables requires sorting to be done. Gathering statistics on indexes does not because the data is already sorted. Often this means that a compute on an index will perform acceptably whereas a compute on a table will not.

Column statistics in the form of histograms are only appropriate for columns whose distribution deviates from the expected uniform distribution.

Gathering statistics detail
---------------------------
The following article is a collection of opinions. Different systems need different levels of statistical analysis due to differences in data. However, if these recommendations are used sensibly, they give good basic guidelines for gathering statistics of objects.

- The reason for gathering statistics is to provide the CBO with the best information possible to help it choose a 'good' execution plan.
- The accuracy of the stats depends on the sample size.
- Even given COMPUTED stats, it is possible that the CBO will not arrive at the BEST plan for a given SQL statement. This is because the optimizer inherently makes assumptions and has only limited information available.
- Given a production system with predictable, known queries, the 'best' execution plan for each statement is not likely to vary over time, unless the application is unusual and uses data with wildly different characteristics from day to day.
- Given the 'best' plan is unlikely to change, frequent gathering of statistics has no benefit. It does incur costs though.
- To determine the best sample size it is best to gather statistics using different sample sizes and look at the results. The statistics should be fairly consistent once a reasonable sample size has been used. Increasing the sample size beyond a given size is unlikely to improve the accuracy. You can see this easily by analyzing 10 rows, 100 rows, 1000 rows etc. At some point the results should start to look consistent.
- To determine the best statistic gathering interval one should keep a history of the statistics from BEFORE and AFTER each collect. By keeping a history the user can check for varying statistics and adjust the sampling accordingly. If the statistics remain reasonably constant then the statistic gathering activity may not be adding any value. Unfortunately, it is not possible to determine that there will be a problem without collecting the statistics.
- If the before / after stats do vary often then either the data profile is not predictable or the sample size is too small to be accurately reflecting the true nature of the data, or (unlikely) the data is too random, in which case the stats are of limited use anyway and one should think of ways of ensuring queries using the data will get at least a reasonable response.
- As the CBO uses the stats as the basis of its cost calculations, gathering new statistics on a table MAY result in a different execution plan for some statements. This is expected behaviour and allows the CBO to adjust access paths if the data profile changes.
- It is possible that a different execution plan may be worse than the original plan. The difference may be small or quite large. This is a non-negotiable fact. Given a set of base information the CBO will choose a plan. Given slightly different base information it may choose a different plan. It is unlikely the change of plan will coincide exactly with the real point at which a change of plan would be beneficial.
- For most production systems predictability is more important than absolute best performance. Hopefully from the above it is clear that gathering statistics can have a destabilizing effect. This is not to say it should not be used, but to be aware of what may happen.
- Most applications have several queries that form the heart of most transactions. These queries are critical in that any adverse change in execution plan could incur a high cost due to the number and frequency of usage of these statements. It is a good idea to isolate such statements into a test-suite where sample queries can be used to gauge if performance of the key statements has deteriorated badly due to a statistics collect. IE: after gathering statistics check these statements still respond in acceptable times.
- It is recommended that users should store critical statistics to allow them to revert back to a working configuration in the event of a statistic change that significantly affects application performance. The following article explains how to store statistics: Note:117203.1
- If gathering statistics causes critical statements to perform badly you can revert to the pre-analyze stats by:
    - importing previously exported statistics
    - using database point-in-time recovery.
    - re-analyze with a larger sample size in an attempt to generate more accurate statistics
- Look at the bad plan/plans to see where the 'bad' chunk of cost has been introduced.
- Use any available tuning options to correct the problem. For example: add hints to statements or views to correct the performance of problem queries, or selectively delete statistics.
- For any statement that does suffer a change between good and very bad plans there is usually some element of the cost which is finely balanced and the re-analyze tips you between the plans. Any such statements are risky in a production system but there is no easy way to identify them. Generally, it is best to hint the SQL wherever there could be a fine balance between 2 options.
- In a DSS / warehouse environment, queries are generally NOT predictable so there is no stable environment / query set to upset by gathering statistics.
- The above may sound quite dramatic, but it is actually unusual for a plan to change wildly or for stats to be finely balanced. Unusual does NOT mean it never happens. It can happen. The likelihood is very small.
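The advice to store statistics before re-gathering, and to import them again if plans regress, can be sketched with DBMS_STATS as follows. The table ORDERS, the statistics table STATS_BACKUP and the identifier BEFORE_GATHER are all hypothetical names chosen for illustration; the 5% sample follows the summary above.

```sql
-- Save the current statistics of a hypothetical table, then re-gather with a 5% sample.
-- CREATE_STAT_TABLE fails if STATS_BACKUP already exists, so run it only once.
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(ownname => USER, stattab => 'STATS_BACKUP');

  DBMS_STATS.EXPORT_TABLE_STATS(
    ownname => USER,
    tabname => 'ORDERS',
    stattab => 'STATS_BACKUP',
    statid  => 'BEFORE_GATHER',
    cascade => TRUE);              -- include column and index statistics

  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,
    tabname          => 'ORDERS',
    estimate_percent => 5,
    cascade          => TRUE);
END;
/

-- If critical statements regress, put the saved statistics back.
BEGIN
  DBMS_STATS.IMPORT_TABLE_STATS(
    ownname => USER,
    tabname => 'ORDERS',
    stattab => 'STATS_BACKUP',
    statid  => 'BEFORE_GATHER',
    cascade => TRUE);
END;
/
```

Keeping one statid per gathering run also gives the before/after history that the guidelines suggest for judging whether re-gathering adds any value.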

wen4270407 2013-01-22

So if a 10g MV log is migrated to 9i, will that cause problems?

Comments (8)