Dr. Statslove or: How I Learned to Stop Guessing and Love the 10053 Trace
This is the long-overdue follow-up post about my recent problems with large partitioned tables and statistics. First and foremost, a huge “Thank You!” to Wolfgang Breitling for his help on this. Wolfgang took pity on me after a pathetic post on oracle-l and held my hand, guiding me into the 10053 trace file and looking over my shoulder as I gained a small, but valuable, bit of knowledge into interpreting how and why the CBO makes some of the decisions it does. And so, without further ado, let me layout the evidence in this case…
THE FACTS
We have a large, partitioned table in an instance running on Oracle 10gR2 (10.2.0.2). This table holds data for each of the last 4 months, in a “rolling window” scenario. A full month will contain 25-30 million rows. This table, which we’ll cleverly call FOO, has two indexes on it: FOO_PK, which is a compound primary key index involving 11 fields, and FOO_IDX, which is a compound non-unique index involving 5 fields. Both indexes make use of the fields YEAR, MONTH, and LOCATION in their left-most fields. FOO_PK also indexes CUST_NO in the 4th slot. The other fields do not play a role in this matter, but I initially guessed that they did, as I’ll tell you later.
As part of the “rolling window” scenario, at the end of a month, the oldest partition would be dropped, and a new partition for the next month would be created. This would usually occur on, say, the 28th of a given month. After various billing cycles throughout the month, data would be bulk loaded (via INSERT+APPEND from external tables) into the new partition. The first such load would likely occur around the 5th of the month. Each biling cycle would be for one of 10 or so LOCATIONs, and some LOCATIONs have multiple billing cycles (different end days to split up the customer base).
THE CRIME
After the first bulk load of the cycle, and during some later ones early on in the month (but not all), when the application would query based on YEAR, MONTH, LOCATION, and CUST_NO, the CBO was choosing FOO_IDX rather than FOO_PK. This was bad because FOO_IDX does not index CUST_NO, and so was resulting in basically a huge range scan.
I immediately knew the problem must be with out-of-date statistics. Our instance was using the default GATHER_STATS_JOB that collects stale database statistics every night and weekends. After this job or a manual call to gather stats on that table, things would work well again. However, gathering statistics on even just that partition took nearly an hour. That’s a long time for the users to be frustrated when what should be a 5-second query is now a 3-minute query and the help desk gets flooded with mis-diagnosed “MY COMPUTER IS FROZEN” tickets. A lot of the users were actually hard-rebooting their workstations because of this.
BAAG
In an homage to the BAAG Party, here is the Rube Goldberg-esque conspiracy theory that I had laid out in my head to explain what was happening.
- Knowing that this partition’s statistics were now stale, the CBO decides to discard them entirely.
- The CBO looks at what it knows about the two indexes. It sees that FOO_PK is an 11-field monstrosity, and that FOO_IDX is only 5-fields, a relative kitten of an index.
- The CBO decides that, all-things being equal, FOO_PK would take a lot more I/O to load into memory, and so defaults to FOO_IDX.
Then I sit back and wait for the kudos to roll in, a la C. Montgomery Burns. However when I actually tested this theory, replacing FOO_PK with a smaller, non-unique index on just the “main four” fields, the CBO still chose FOO_IDX. I was crestfallen, to say the least.
Wolfgang to the Rescue
This is about the time that Wolfgang Breitling answered my email off-line and helped through some things. First, he demonstrated how to run a 10053 trace on just the “EXPLAIN PLAN” call for a statement, which avoids having to actually run the statement. From the 10053 trace file after a query against data that had just been loaded into a previously empty partition:
Access Path: index (RangeScan)
Index: FOO_IDX
resc_io: 0.00 resc_cpu: 200
ix_sel: 0.0000e+00 ix_sel_with_filters: 0.0000e+00
Cost: 1.00 Resp: 1.00 Degree: 1
Access Path: index (RangeScan)
Index: FOO_PK
resc_io: 0.00 resc_cpu: 200
ix_sel: 0.0000e+00 ix_sel_with_filters: 0.0000e+00
Cost: 1.00 Resp: 1.00 Degree: 1
****** trying bitmap/domain indexes ******
Best:: AccessPath: IndexRange Index: FOO_IDX
Cost: 1.00 Degree: 1 Resp: 1.00 Card: 0.00 Bytes: 0
Here you can see that the CBO costed both indexes equally at 1.00. This is where I learned that, in the case of ties, the CBO chooses the first index that it saw, and it looks at them in alphabetical order. So the chances are that if my FOO_PK index had been named FOO_AWESOME, it would have always been chosen in this case. But then we wouldn’t have had such fun solving the problem.
But what about those seemingly random cases where data was already present? It turns out to have been happening when the LOCATION being queried was outside of the range of LOCATION values in the statistics. For example, if I have locations 001 and 003 loaded with statistics, then I load location 005 and query for it, I’ll get a trace like this:
Access Path: index (RangeScan)
Index: FOO_IDX
resc_io: 5.00 resc_cpu: 36640
ix_sel: 5.9706e-08 ix_sel_with_filters: 5.9706e-08
Cost: 1.00 Resp: 1.00 Degree: 1
Using prorated density: 5.9706e-08 of col #1 as selectivity of out-of-range value pred
Using prorated density: 5.9706e-08 of col #1 as selectivity of out-of-range value pred
Access Path: index (RangeScan)
Index: FOO_PK
resc_io: 5.00 resc_cpu: 36616
ix_sel: 6.6995e-12 ix_sel_with_filters: 3.9386e-15
Cost: 1.00 Resp: 1.00 Degree: 1
****** trying bitmap/domain indexes ******
Best:: AccessPath: IndexRange Index: FOO_IDX
Cost: 1.00 Degree: 1 Resp: 1.00 Card: 0.00 Bytes: 0
You can see the messages about “prorated density” and “out-of-range” values. The CBO tries to make an educated guess about the density of the specified LOCATION (col #1) value, which it does very poorly. Due to some rounding on Oracle’s part, the costs of the two indexes once again tie, and the tie goes to the alphabetical first: FOO_IDX again.
If, however, in that same scenario with locations 001 and 003, I load location 002 and query for it, you’ll see much different results, since 002 is within the range of 001-to-003:
Access Path: index (RangeScan)
Index: FOO_IDX
resc_io: 951956.00 resc_cpu: 10796254518
ix_sel: 0.5 ix_sel_with_filters: 0.5
Cost: 96360.10 Resp: 96360.10 Degree: 1
Access Path: index (RangeScan)
Index: FOO_PK
resc_io: 5.00 resc_cpu: 59016
ix_sel: 3.5448e-05 ix_sel_with_filters: 2.3983e-08
Cost: 1.01 Resp: 1.01 Degree: 1
****** trying bitmap/domain indexes ******
Best:: AccessPath: IndexRange Index: FOO_PK
Cost: 1.01 Degree: 1 Resp: 1.01 Card: 0.08 Bytes: 0
Here you can see that the cost associated with FOO_IDX now makes it an obviously poor choice, and FOO_PK is the winner.
POSSIBLE SOLUTIONS
So now that we’ve identified the problem, we drew up a list of possible solutions.
- Gather statistics after loading: Again, not ideal due to time involved in the middle of the business day.
- Stage table loading with partition exchange: This would be ideal. We would load the data into a stage table with the same structure as a single partition in the big table. We’d then gather statistics and build indexes on that stage table and then do a partition exchange to swap that table with an empty partition in the big table, taking with it the up-to-date indexes and statistics. However this would require sub-partitioning the table by market and billing cycle, and then making the necessary changes to the loading scripts. A longer term solution, to be sure.
- “Fix” the statistics: This would involve setting the table, index and column level statistics to values based on previous full months, except for the YEAR and MONTH fields, and without histograms, per Wolfgang’s suggestion. We also lock statistics on that table to prevent GATHER_STATS_JOB from overwriting our settings. This would all be done with the DBMS_STATS package, of course.
Option #3 made the most sense. Wolfgang graciously whipped up a script to set the column statistics to what we needed. As soon as a new partition is created, we set the statistics to our own values with the full range accounted for. We also then gather real statistics on the current partition, which isn’t really necessary but I felt better having real statistics when it was possible.
So now all is well on that front. I’ve had to follow-up with Wolfgang a couple of times to make sure I understood what his script was doing, and he followed-up with some minor errata. Thanks again to Wolfgang, and I’ll lobby my wife to name a future son after him.
Edit: FOO_IDX is a non-unique index. Sorry for the quasi-typo.




Hi,
Could you please tell me on which columns was this table partitioned.
The final solution you used was to set the statistics after load using dbms_stats. Was this setting partition statistics or global statistics ?
thanks
amit
poddar
August 17, 2007 at 3:00 pm
I set statistics after the partition is created, prior to any loading. And the statistics are set only for that new partition.
Don
August 17, 2007 at 3:31 pm
When you say “However when I actually tested this theory, replacing FOO_PK with a smaller, non-unique index on just the “main four” fields, the CBO still chose FOO_IDX”", I presume that the Cost was the same (if you ran a 10053) ? In that case, did you go back and test what happens if the “smaller” index was recreated as FOO_AWESOME ?
Hemant K Chitale
August 17, 2007 at 9:59 pm
Hemant, no I did not test with an alphabetically superior index, such as FOO_AWESOME.
When I have time next month, I’ll revisit this issue and post a 10053 trace.
Don
August 20, 2007 at 9:28 am
Don,
When you set the stats for the partition (mainly the high-low values of columns), do not forget to set them also for the table (i.e. on table level, not the partition one). If your query uses bind values, CBO cannot do partition pruning when establishing the plan, hence it uses the table-level stats, and you’ll have the same problem.
rgds
Ghassan
August 20, 2007 at 12:51 pm
Hi Don,
We are facing a similar scenario as what has been explained by you where queries against
some of our partitioned tables (range partitioned by date with each partition holding current month’s data and we have local paritioned indexes on all of the partitioned tables) suddenly start hanging on the first day of a new month after the data is loaded…
We see this issue only for the first day of the month where we have to access a new partition …
We used to analyze the partition immediately after load…
We used to analyze the partitioned tables every evening as well.(Our loads happen in the early morning hours every business day).
But, this did not help the scenario of first day of new month…
It is no longer an option to analyze the currently loaded partition as the time window is very less…
So, we decided to set the stats on the new parition on the last day of the current month based on the stats of the current month partition.
Though this has come as a welcome rescue to some extent, we still sporadically face the issue of hanging queries whenever they are accessed after the load…
All these problems vanish in thin air once the current problem table parition is analyzed.
But, I suspect the way we set stats on the new partition…
You had mentioned about a script …Is it possible to kindly publish it to other earthlings as well…?
“Wolfgang graciously whipped up a script to set the column statistics to what we needed.”
Also, If I understand correctly,Local partition stats are used only when the partition is known at parse time …
Otherwise, table statistics(global) is what is considered…
Since data access is predominantly via PL/SQL (where queries hang on the first day access) where all variables are automatically bound,
global stats is what is going to be more relevant …
Or am I incorrect in assuming so…
So, please also comment on setting/updating the corresponding global stats as well…and how to check it…
Vaidy
August 20, 2007 at 3:42 pm
In response to ghassan and vaidy: ghassan makes a good point about keeping the globals stats updated as well when setting partition statistics. However, his point about the optimizer using global statistics when bind variables are involved is incorrect – unless you disabled bind value peeking. With bind peeking the cbo is well aware of what partition is being targetted and uses that partition’s statistics.
Of course, if the next sql uses different bind values, the plan based on the partition statistics at the time of the original parse is still being used. In general that should not cause major problems as long as your partitions are fairly homogenous.
As for vaidy’s request for making the script public, that is up to Don as it contains potentially sensitive column names and values. But it is not difficult to whip up such a script. The most important part is to pay attention to low and high values of the partitioning column(s) and any other columns used in predicates which have different ranges in different partitiones. I suppose that is at the root of your occasional problems (as it was for Don when he tried to simply clone the statistics from an “older” partition). Look at the dbms_stats.prepare_column_values procedure.
WolfgangB
August 20, 2007 at 6:05 pm
One other comment. I had to smile when I read Don’s admission “I felt better having real statistics”. He is by no means alone. People like the “blessing” of the analyze or gather_xxx_stats procedure on their statistics as if they somehow became holier this way.
WolfgangB
August 20, 2007 at 6:21 pm
Thanks Don/Wolfgang for sharing this!
Karl
Karl Reitschsuter
August 20, 2007 at 11:53 pm
Wolfgang,
Your quote:
“The most important part is to pay attention to low and high values of the partitioning column(s) and any other columns used in predicates which have different ranges in different partitiones.”
Would you kind enough to elaborate on this please.
All of our partitioned tables are partitioned date with each partition holding a month’s data.
Are global stats automatically updated when prior partition is copied to the current partition?
The reason for this question is that we set the partition on the first day of a new month before the loads based on the prior month’s stats.
In this case,
bind variable peeking should have a negative effect(as it peeks into the current partition) on the first day access as we give an incorrect information to the optimiser by fully copying stats from month partition whereas only one day worth of dat has actually been loaded into the new partition…! Please correct me if I am wrong here…
That is why I tend to believe that once we analyze the current partition, the problem goes away only to re-occur the next month.
Also, is my assumption correct that by analyzing it again, all we are actually doing is to flush the current bad plan in the shared pool that has not aged out and put a good plan thereby masking the actual problem.
kvn
August 21, 2007 at 1:51 pm
Would you kind enough to elaborate on this please.
All of our partitioned tables are partitioned date with each partition holding a month’s data.
Are global stats automatically updated when prior partition is copied to the current partition?
The reason for this question is that we set the partition on the first day of a new month before the loads based on the prior month’s stats.
In this case,
bind variable peeking should have a negative effect(as it peeks into the current partition) on the first day access as we give an incorrect information to the optimiser by fully copying stats from month partition whereas only one day worth of dat has actually been loaded into the new partition…! Please correct me if I am wrong here…
That is why I tend to believe that once we analyze the current partition, the problem goes away only to re-occur the next month.
Also, is my assumption correct that by analyzing it again, all we are actually doing is to flush the current bad plan in the shared pool that has not aged out and put a good plan thereby masking the actual problem.
“Are global stats automatically updated when prior partition is copied to the current partition?”
I assume you mean “Are global stats automatically updated when [the statistics of a] prior partition is copied to the [the statistics of the ] current partition?”
Actually in either case – if you copy the data(!) of one partition to another partition or “just” the statistics – the answer is No. Global statistics are only updated if you explicitly update them through either of a gather_table_stats , an import_table_stats or a set_table_stats.
If you simply clone the statistics from one partition to another without adjustments you run into the problems Don did. I’ll use numbers instead of dates because the high and low boundaries are easier to display, but the principle and problem is the same. Assume the following table definition:
create table tbl ( yyyymm number, filler varchar2(100))
partition by range (yyyymm)
( partition D200704 values less than (200705)
, partition D200705 values less than (200706)
, partition D200706 values less than (200707)
);
and then load data into it:
insert into tbl
select 200704+mod(rownum-1,3), rpad(‘*’,32,’*')
from dual connect by level < = 999;
Each of the partitions has a single value for yyyymm with the low and high value the same. The global statistics will have a num_distinct of 3 with a low of 200704 and a high of 200706.
Now you create a new partition: D200707 values less than (200708)
If you simply clone the statistics of partition D200706 into D200707 then the column statistics for yyyymm will still have a low and high of 200706.
If you now issue a query against this new partition, i.e. with a predicate “where yyyymm = 200707″ the optimizer will correctly do the partition pruning (whether you use literals or binds – altering statistics invalidates plans causing a re-parse and re-peek in the latter case) but when it comes to estimating the cardinality of the predicate the CBO compares the predicate value to the value range. It finds that the value is outside the range (200707 is not >= 200706 and < = 200706) and prorates the selectivity, depending on the size of the range and the distance of the predicate value from the range. In this case that will yield a cardinality estimate of 1 instead of num_rows (of the partition). With a cardinality estimate that far off the mark all sorts of bad decisions can happen.
So it is not so much the fact that by cloning the statistics you describe a “fully grown” partition rather than an “adolescent” one that leads to bad access plans but the incorrect value range.
“That is why I tend to believe that once we analyze the current partition, the problem goes away only to re-occur the next month.”
Analyzing the current partition will correct the low and high values, causing the problem to go away. Unless you properly took care of that during the cloning in which case you don’t have that problem to begin with.
“Also, is my assumption correct that by analyzing it again, all we are actually doing is to flush the current bad plan in the shared pool that has not aged out and put a good plan thereby masking the actual problem.”
I am not sure I understand what you mean by “masking the actual problem”. As I tried to explain, the biggest potential problem in cloning partition statistics is not to adjust value ranges where necessary. That goes away when you analyze because the actual high and low values are gathered. Any affected plan is flushed when cloning the statistics (unless you override no_invalidate to TRUE).
I hope that clarifies it a bit.
WolfgangB
August 21, 2007 at 3:02 pm
My apologies for repeating kvn’s comment. That was an accident. I had saved his comment into notepad and then written my response. At the end I copied and pasted the entire content into the (tiny) comment box, forgetting that the saved comment was still at the beginning.
WolfgangB
August 21, 2007 at 3:06 pm
Many Thanks to Wolfgang for his neat response!
I have another clarifications though…
From your explanation, it appears that just copying partition stats without partition column stats adjustments is a bad thing for the optimiser.
But, is the following sequence say on the last day of the current month to set the stats on the new partition a good alternative that pleases the optimiser:
All our indexes on partitioned tables are local partitioned.
step 1:
dbms_stats.get_table_stats(for current partition)
step 2:
dbms_stats.set_table_stats(for new partition based on current partition)
step 3:
dbms_stats.get_index_stats(for current partition)
step 4:
dbms_stats.set_index_stats(for new partition based on current partition)
We could avoid:
1) Bad performance issues on the first day of month new partition access.
2) Analysing the current partition because 1 is avoided.
Or
Is it necessary in addition to look at set_column_stats or prepare_column_values?
Your thoughts please…
kvn
August 21, 2007 at 9:23 pm
I hate to say this, but
it depends.
It depends on how ingenious (if not to say devious) your developers and users are. I can think of scenarios where that simple approach may not work.
You could make it even simpler and just guarantee that you do not have statistics on the new partition – i.e. prevent any automatic stats gathering from gathering statistics on the empty partition – and let the cbo use defaults or dynamic sampling. Of course that immediately raises the question if the cbo will do that for a partition without statistics or if it tries to infer partition statistics from the global statistics.
Easy enough to test.
WolfgangB
August 22, 2007 at 6:39 am
Wolfgang,
What will CBO do if you have a query that has a predicate like part_col between :a and :b, and the bind value of b is beyond the max (the wrong max, if you did not update the global stats)? a worse case would be if both a and b had values beyond this max.
Ghassan
August 22, 2007 at 4:00 pm
We had a similar problem. We simply gathered statistics for only one full partition and exported them.
Everytime we add new partitions we simply import them.
Dim
Anonymous
August 23, 2007 at 12:12 pm
Hi guys, having the same problem but without luck of generating proper histograms :-(
can anyone though , send me a script to populate column stats?
Thanks a lot
Shay
Shay
February 26, 2008 at 3:47 am
[...] with partitioned tables that do not have representative stats. Don Seiler’s write-up of his real-world case is a poster child for this exception. If you are bulk loading data into a partitioned table, it is [...]
Choosing An Optimal Stats Gathering Strategy | Structured Data
March 26, 2008 at 12:52 am
[...] How I Learned to Stop Guesssing and Love the 10053 Trace [...]
Oracle Point, Oracle Life. » Posting Aggregation of CBO Issues with Oracle
March 28, 2008 at 10:13 pm