Insert Performance analysis part 2

I promised I would have more today.  Here is the raw data from my analysis.  As you can see as rejection (dup_val_on_index) increased, the execution time for select + insert decreased and execution time increased for insert with exception.

This specific use case is a transactional system requiring  a unique index on ins_demo.column1 and we expected 20% of the time the unique index would be violated.

Here the two insert functions we are testing for performance.  They are really quite simple.  The first one is doing an insert into ins_demo and if there is a dup_val_on_index then return false. 

The second function does a select against ins_demo on the unique key.  If the key is not found then insert the value and return true, if the key is found then return false.

1 Begin 2 insert into ins_demo values (p_value); 3 return true; 4 exception when dup_val_on_index then 5 return false; 6 end;

1 begin 2 select count(*) 3 into dummy 4 from ins_demo 5 where column1 = p_value; 6 7 if dummy = 0 then 8 insert into ins_demo values (p_value); 9 return true; 10 else 11 return false; 12 end if; 13 end;

Here are the performance number for the two functions. I wrote a driver function that called the two functions 100,000 times and each once for a specific reject rate.


This bubble chart shows execution time on the y axis, rejection rate on the x axis and bubble size is the insert cpu time. 


We should all strive to analyze changes prior to making recommendations.  If I had gone with my knee jerk reaction to the select+insert and insisted on insert with an exception, system performance would have suffered.

This weekend I will run the same experiment using the merge command.  I expect the merge command to be the fastest option, but as always, we require hard data before we make any recommendations.

Insert performance on a unique constraint.

In the past, I have read about the performance of exception handlers and all that is said is, There is performance overhead. Okay, the generic answer of “performance overhead” tells us nothing.

One of the great pleasures of what I do is code reviews.  I get to learn techniques from other developers and I get to pass on some of my knowledge.  Code reviews are a win-win all around.

Recently, I reviewed some code that had the following code block to enforce a unique constraint. (tables and variables have been changed to protect intellectual property)  There is a unique constraint on ins_demo.column1, so the select will do a index unique scan for every select.  The use case for this example is a transactional system where different users will be imputing data.

select count(*)
into dummy
from ins_demo
where column1 = p_value;

if dummy = 0 then
insert into ins_demo values (p_value);
return true;
return false;
end if;

My knee jerk reaction is to change this from a select+insert to insert with an exception handler to improve performance.

insert into ins_demo values (p_value);
return true;
exception when dup_val_on_index then
return false;

Before making recommendations, make sure you first do no harm. So, in this case I decided to run some performance test against the select+insert and the insert with an exception.  The test will attempt to do 100,000 inserts starting will 100% success to 100% failures.

The results of the test have a few surprises.

Select+Insert 100% success:

execution time = 21.93 seconds

Insert CPU= 9.6

Select CPU = 2.55

Insert with exception 100% success:

execution time = 14.29 seconds

Insert CPU = 9.95

Recursive CPU = 8.96

 Select+Insert 100% reject due to duplicate rows:

execution time = 5.86 seconds

Insert CPU = 0

Select CPU = 1.73

Insert with exception 100% rejected due to duplicate rows:

execution time = 135.03 seconds

Insert CPU = 16.85

Exception CPU = 20.7

We can learn a few things from this.

1) In the select + insert method, as the number of rejections increased execution time decreased, insert CPU decreased and select CPU decreased.  This is due to a couple of things.  A unique index scan is faster when the data is found and as the number of rejected rows increase due to unique index violation, the fewer times an insert is executed.

2) In the insert with exception handler, execution time increases as the number of rows are rejected due to unique index violations increases.  This is because Oracle attempts to do the insert and then must roll the transaction back.

I will post a more detailed analysis with the raw performance data tomorrow.