I promised I would have more today. Here is the raw data from my analysis. As you can see as rejection (dup_val_on_index) increased, the execution time for select + insert decreased and execution time increased for insert with exception.
This specific use case is a transactional system requiring a unique index on ins_demo.column1 and we expected 20% of the time the unique index would be violated.
Here the two insert functions we are testing for performance. They are really quite simple. The first one is doing an insert into ins_demo and if there is a dup_val_on_index then return false.
The second function does a select against ins_demo on the unique key. If the key is not found then insert the value and return true, if the key is found then return false.
1 Begin 2 insert into ins_demo values (p_value); 3 return true; 4 exception when dup_val_on_index then 5 return false; 6 end;
1 begin 2 select count(*) 3 into dummy 4 from ins_demo 5 where column1 = p_value; 6 7 if dummy = 0 then 8 insert into ins_demo values (p_value); 9 return true; 10 else 11 return false; 12 end if; 13 end;
Here are the performance number for the two functions. I wrote a driver function that called the two functions 100,000 times and each once for a specific reject rate.
This bubble chart shows execution time on the y axis, rejection rate on the x axis and bubble size is the insert cpu time.
We should all strive to analyze changes prior to making recommendations. If I had gone with my knee jerk reaction to the select+insert and insisted on insert with an exception, system performance would have suffered.
This weekend I will run the same experiment using the merge command. I expect the merge command to be the fastest option, but as always, we require hard data before we make any recommendations.